Paper Group AWR 90
Real-Time User-Guided Image Colorization with Learned Deep Priors
Title | Real-Time User-Guided Image Colorization with Learned Deep Priors |
Authors | Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S. Lin, Tianhe Yu, Alexei A. Efros |
Abstract | We propose a deep learning approach for user-guided image colorization. The system directly maps a grayscale image, along with sparse, local user “hints” to an output colorization with a Convolutional Neural Network (CNN). Rather than using hand-defined rules, the network propagates user edits by fusing low-level cues along with high-level semantic information, learned from large-scale data. We train on a million images, with simulated user inputs. To guide the user towards efficient input selection, the system recommends likely colors based on the input image and current user inputs. The colorization is performed in a single feed-forward pass, enabling real-time use. Even with randomly simulated user inputs, we show that the proposed system helps novice users quickly create realistic colorizations, and offers large improvements in colorization quality with just a minute of use. In addition, we demonstrate that the framework can incorporate other user “hints” to the desired colorization, showing an application to color histogram transfer. Our code and models are available at https://richzhang.github.io/ideepcolor. |
Tasks | Colorization |
Published | 2017-05-08 |
URL | http://arxiv.org/abs/1705.02999v1, http://arxiv.org/pdf/1705.02999v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-user-guided-image-colorization-with |
Repo | https://github.com/junyanz/interactive-deep-colorization |
Framework | pytorch |
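The colorization abstract above describes a network that takes a grayscale image together with sparse, local user hints. A minimal sketch of how such an input could be assembled is given here. The four-plane layout (lightness, two color-hint planes, and a binary mask marking where hints exist) and the name `build_network_input` are assumptions made for illustration; they are not taken from the listing.

```python
def build_network_input(gray, hints):
    """Stack a grayscale image with sparse user color hints.

    gray  : H x W nested list of lightness values.
    hints : dict mapping (row, col) -> (a, b) color values chosen by the user.
    Returns [gray, a_plane, b_plane, mask]; this layout is an assumption.
    """
    height, width = len(gray), len(gray[0])
    a_plane = [[0.0] * width for _ in range(height)]
    b_plane = [[0.0] * width for _ in range(height)]
    mask = [[0.0] * width for _ in range(height)]  # 1.0 where a hint was given
    for (row, col), (a, b) in hints.items():
        a_plane[row][col] = a
        b_plane[row][col] = b
        mask[row][col] = 1.0
    return [gray, a_plane, b_plane, mask]


# Hypothetical 2 x 3 image with a single user hint at row 0, column 1.
example_gray = [[50.0, 60.0, 70.0], [10.0, 20.0, 30.0]]
example_input = build_network_input(example_gray, {(0, 1): (20.0, -30.0)})
```

The mask plane lets a model tell "no hint here" apart from "the user picked a neutral color", which a zero-filled color plane alone could not express.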
Setting an attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels
Title | Setting an attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels |
Authors | Sagi Eppel |
Abstract | Convolutional neural networks have emerged as the leading method for the classification and segmentation of images. In some cases, it is desirable to focus the attention of the net on a specific region in the image; one such case is the recognition of the contents of transparent vessels, where the vessel region in the image is already known. This work presents a valve filter approach for focusing the attention of the net on a region of interest (ROI). In this approach, the ROI is inserted into the net as a binary map. The net uses a different set of convolution filters for the ROI and background image regions, resulting in a different set of features being extracted from each region. More accurately, for each filter used on the image, a corresponding valve filter exists that acts on the ROI map and determines the regions in which the corresponding image filter will be used. This valve filter effectively acts as a valve that inhibits specific features in different image regions according to the ROI map. In addition, a new data set for images of materials in glassware vessels in a chemistry laboratory setting is presented. This data set contains a thousand images with pixel-wise annotation according to categories ranging from filled and empty to the exact phase of the material inside the vessel. The results of the valve filter approach and fully convolutional neural nets (FCN) with no ROI input are compared based on this data set. |
Tasks | |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08711v3, http://arxiv.org/pdf/1708.08711v3.pdf |
PWC | https://paperswithcode.com/paper/setting-an-attention-region-for-convolutional |
Repo | https://github.com/sagieppel/Focusing-attention-of-Fully-convolutional-neural-networks-on-Region-of-interest-ROI-input-map- |
Framework | tf |
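The valve filter idea in the abstract above (one filter acting on the image, a paired filter acting on the binary ROI map and gating it) can be sketched for a single channel as follows. Combining the two responses by element-wise multiplication is an assumption made here for illustration, and the helper names are hypothetical.

```python
def conv2d_same(plane, kernel):
    """Zero-padded 2D cross-correlation that keeps the input size."""
    height, width = len(plane), len(plane[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for i in range(k_h):
                for j in range(k_w):
                    rr, cc = r + i - pad_h, c + j - pad_w
                    if 0 <= rr < height and 0 <= cc < width:
                        total += plane[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def valve_filtered_feature(image, roi_map, image_kernel, valve_kernel):
    """Gate an image feature map with the response of a valve filter on the ROI map.

    Assumption: the gate is applied by element-wise multiplication.
    """
    feature = conv2d_same(image, image_kernel)
    valve = conv2d_same(roi_map, valve_kernel)
    return [[f * v for f, v in zip(f_row, v_row)]
            for f_row, v_row in zip(feature, valve)]
```

With a 1 x 1 valve kernel of weight one, the feature survives only inside the ROI; a learned valve kernel could instead suppress or pass a feature near the ROI boundary.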
On the Complexity of Learning Neural Networks
Title | On the Complexity of Learning Neural Networks |
Authors | Le Song, Santosh Vempala, John Wilmes, Bo Xie |
Abstract | The stunning empirical successes of neural networks currently lack rigorous theoretical explanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class of activation functions (including all currently used), and inputs drawn from any logconcave distribution, there is a family of one-hidden-layer functions whose output is a sum gate, that are hard to learn in a precise sense: any statistical query algorithm (which includes all known variants of stochastic gradient descent with any loss function) needs an exponential number of queries even using tolerance inversely proportional to the input dimensionality. Moreover, this hard family of functions is realizable with a small (sublinear in dimension) number of activation units in the single hidden layer. The lower bound is also robust to small perturbations of the true weights. Systematic experiments illustrate a phase transition in the training error as predicted by the analysis. |
Tasks | |
Published | 2017-07-14 |
URL | http://arxiv.org/abs/1707.04615v1, http://arxiv.org/pdf/1707.04615v1.pdf |
PWC | https://paperswithcode.com/paper/on-the-complexity-of-learning-neural-networks |
Repo | https://github.com/timbrgr/yellow-brick-road-to-MrLd-city |
Framework | tf |
Towards a quality metric for dense light fields
Title | Towards a quality metric for dense light fields |
Authors | Vamsi Kiran Adhikarla, Marek Vinkler, Denis Sumin, Rafał K. Mantiuk, Karol Myszkowski, Hans-Peter Seidel, Piotr Didyk |
Abstract | Light fields have become a popular representation of three-dimensional scenes, and there is interest in their processing, resampling, and compression. As those operations often result in loss of quality, there is a need to quantify it. In this work, we collect a new dataset of dense reference and distorted light fields as well as the corresponding quality scores, which are scaled in perceptual units. The scores were acquired in a subjective experiment using an interactive light-field viewing setup. The dataset contains typical artifacts that occur in the light-field processing chain due to light-field reconstruction, multi-view compression, and limitations of automultiscopic displays. We test a number of existing objective quality metrics to determine how well they can predict the quality of light fields. We find that the existing image quality metrics provide good measures of light-field quality, but require dense reference light fields for optimal performance. For the more complex task of comparing two distorted light fields, their performance drops significantly, which reveals the need for new, light-field-specific metrics. |
Tasks | |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07576v1, http://arxiv.org/pdf/1704.07576v1.pdf |
PWC | https://paperswithcode.com/paper/towards-a-quality-metric-for-dense-light |
Repo | https://github.com/mantiuk/pwcmp |
Framework | none |
Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking
Title | Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking |
Authors | Di Kang, Zheng Ma, Antoni B. Chan |
Abstract | For crowded scenes, the accuracy of object-based computer vision methods declines when the images are low-resolution and objects have severe occlusions. Taking counting methods for example, almost all the recent state-of-the-art counting methods bypass explicit detection and adopt regression-based methods to directly count the objects of interest. Among regression-based methods, density map estimation, where the number of objects inside a subregion is the integral of the density map over that subregion, is especially promising because it preserves spatial information, which makes it useful for both counting and localization (detection and tracking). With the power of deep convolutional neural networks (CNNs) the counting performance has improved steadily. The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking. Most existing CNN methods produce density maps with resolution that is smaller than the original images, due to the downsample strides in the convolution/pooling operations. To produce an original-resolution density map, we also evaluate a classical CNN that uses a sliding window regressor to predict the density for every pixel in the image. We also consider a fully convolutional (FCNN) adaptation, with skip connections from lower convolutional layers to compensate for loss in spatial information during upsampling. In our experiments, we found that the lower-resolution density maps sometimes have better counting performance. In contrast, the original-resolution density maps improved localization tasks, such as detection and tracking, compared to bilinear upsampling the lower-resolution density maps. Finally, we also propose several metrics for measuring the quality of a density map, and relate them to experiment results on counting and localization. |
Tasks | Density Estimation |
Published | 2017-05-29 |
URL | http://arxiv.org/abs/1705.10118v2, http://arxiv.org/pdf/1705.10118v2.pdf |
PWC | https://paperswithcode.com/paper/beyond-counting-comparisons-of-density-maps |
Repo | https://github.com/krutikabapat/Crowd_Counting |
Framework | none |
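The density-map abstract above states that the number of objects inside a subregion is the integral of the density map over that subregion. On a pixel grid that integral is a plain sum, as this small sketch shows. The function name and the toy map are hypothetical.

```python
def count_in_region(density, top, left, bottom, right):
    """Sum a density map over rows [top, bottom) and columns [left, right)."""
    return sum(sum(row[left:right]) for row in density[top:bottom])


# Toy 4 x 4 density map: one object spread over the top-left 2 x 2 block,
# another spread over two pixels in the bottom row.
toy_density = [
    [0.25, 0.25, 0.0, 0.0],
    [0.25, 0.25, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

total_count = count_in_region(toy_density, 0, 0, 4, 4)     # whole image
top_left_count = count_in_region(toy_density, 0, 0, 2, 2)  # one subregion
```

Because the map keeps the mass near each object, the same array supports both a global count and a rough localization, which is the property the abstract highlights.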
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
Title | EX2: Exploration with Exemplar Models for Deep Reinforcement Learning |
Authors | Justin Fu, John D. Co-Reyes, Sergey Levine |
Abstract | Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the observations are very high-dimensional and complex, as in the case of raw images. We propose a novelty detection algorithm for exploration that is based entirely on discriminatively trained exemplar models, where classifiers are trained to discriminate each visited state against all others. Intuitively, novel states are easier to distinguish against other states seen during training. We show that this kind of discriminative modeling corresponds to implicit density estimation, and that it can be combined with count-based exploration to produce competitive results on a range of popular benchmark tasks, including state-of-the-art results on challenging egocentric observations in the vizDoom benchmark. |
Tasks | Density Estimation |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01260v2, http://arxiv.org/pdf/1703.01260v2.pdf |
PWC | https://paperswithcode.com/paper/ex2-exploration-with-exemplar-models-for-deep |
Repo | https://github.com/jcoreyes/ex2 |
Framework | none |
Finding Streams in Knowledge Graphs to Support Fact Checking
Title | Finding Streams in Knowledge Graphs to Support Fact Checking |
Authors | Prashant Shiralkar, Alessandro Flammini, Filippo Menczer, Giovanni Luca Ciampaglia |
Abstract | The volume and velocity of information generated online outpace the capacity of current journalistic practices to fact-check claims at the same rate. Computational approaches for fact checking may be the key to help mitigate the risks of massive misinformation spread. Such approaches can be designed to not only be scalable and effective at assessing veracity of dubious claims, but also to boost a human fact checker’s productivity by surfacing relevant facts and patterns to aid their analysis. To this end, we present a novel, unsupervised network-flow based approach to determine the truthfulness of a statement of fact expressed in the form of a (subject, predicate, object) triple. We view a knowledge graph of background information about real-world entities as a flow network, and knowledge as a fluid, abstract commodity. We show that computational fact checking of such a triple then amounts to finding a “knowledge stream” that emanates from the subject node and flows toward the object node through paths connecting them. Evaluation on a range of real-world and hand-crafted datasets of facts related to entertainment, business, sports, geography and more reveals that this network-flow model can be very effective in discerning true statements from false ones, outperforming existing algorithms on many test cases. Moreover, the model is expressive in its ability to automatically discover several useful path patterns and surface relevant facts that may help a human fact checker corroborate or refute a claim. |
Tasks | Knowledge Graphs |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07239v1, http://arxiv.org/pdf/1708.07239v1.pdf |
PWC | https://paperswithcode.com/paper/finding-streams-in-knowledge-graphs-to |
Repo | https://github.com/shiralkarprashant/knowledgestream |
Framework | tf |
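The fact-checking abstract above treats a knowledge graph as a flow network and looks for a stream from the subject node to the object node. As a rough illustration of that view, the sketch below computes a plain maximum flow with the Edmonds-Karp algorithm on a tiny graph. This is a simplification chosen for illustration and not necessarily the paper's formulation; the node names and edge capacities are made up.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow; `capacity` maps node -> {neighbor: capacity}."""
    if source == sink:
        raise ValueError("source and sink must differ")
    # Residual graph: a copy of the capacities plus zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical graph: "subject" reaches "object" through two intermediate entities.
toy_graph = {
    "subject": {"a": 3, "b": 2},
    "a": {"object": 2, "b": 1},
    "b": {"object": 3},
}
stream_strength = max_flow(toy_graph, "subject", "object")
```

In this reading, a larger flow between subject and object would count as stronger support for the claimed triple; how capacities are derived from the graph is left open here.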
Softmax GAN
Title | Softmax GAN |
Authors | Min Lin |
Abstract | Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The key idea of Softmax GAN is to replace the classification loss in the original GAN with a softmax cross-entropy loss in the sample space of one single batch. In the adversarial learning of $N$ real training samples and $M$ generated samples, the target of discriminator training is to distribute all the probability mass to the real samples, each with probability $\frac{1}{N}$, and distribute zero probability to generated data. In the generator training phase, the target is to assign equal probability to all data points in the batch, each with probability $\frac{1}{M+N}$. While the original GAN is closely related to Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance Sampling version of GAN. We further demonstrate with experiments that this simple change stabilizes GAN training. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06191v1, http://arxiv.org/pdf/1704.06191v1.pdf |
PWC | https://paperswithcode.com/paper/softmax-gan |
Repo | https://github.com/eriklindernoren/PyTorch-GAN |
Framework | pytorch |
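The two targets described in the Softmax GAN abstract can be written as cross-entropies against a softmax taken over the whole batch. The sketch below is one possible reading, not the author's code: it assumes a higher score means "more real" (the sign convention is an assumption), and it normalizes the discriminator target by the number of real samples so that the target sums to one.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    peak = max(scores)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_z for s in scores]


def cross_entropy(target, scores):
    """Cross-entropy between a target distribution and softmax(scores)."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(scores)) if t > 0)


def softmax_gan_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for one batch."""
    n_real, n_fake = len(real_scores), len(fake_scores)
    scores = list(real_scores) + list(fake_scores)
    # Discriminator target: all mass spread uniformly over the real samples.
    d_target = [1.0 / n_real] * n_real + [0.0] * n_fake
    # Generator target: uniform over every sample in the batch.
    g_target = [1.0 / (n_real + n_fake)] * (n_real + n_fake)
    return cross_entropy(d_target, scores), cross_entropy(g_target, scores)
```

The generator loss is smallest when the batch softmax is uniform, that is, when the discriminator scores cannot tell real from generated samples.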
A Neural Representation of Sketch Drawings
Title | A Neural Representation of Sketch Drawings |
Authors | David Ha, Douglas Eck |
Abstract | We present sketch-rnn, a recurrent neural network (RNN) able to construct stroke-based drawings of common objects. The model is trained on thousands of crude human-drawn images representing hundreds of classes. We outline a framework for conditional and unconditional sketch generation, and describe new robust training methods for generating coherent sketch drawings in a vector format. |
Tasks | |
Published | 2017-04-11 |
URL | http://arxiv.org/abs/1704.03477v4, http://arxiv.org/pdf/1704.03477v4.pdf |
PWC | https://paperswithcode.com/paper/a-neural-representation-of-sketch-drawings |
Repo | https://github.com/thinkingmachines/christmAIs |
Framework | tf |
A* CCG Parsing with a Supertag and Dependency Factored Model
Title | A* CCG Parsing with a Supertag and Dependency Factored Model |
Authors | Masashi Yoshikawa, Hiroshi Noji, Yuji Matsumoto |
Abstract | We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors of CCG categories and its syntactic dependencies, both defined on bi-directional LSTMs. Our factored model allows the precomputation of all probabilities and runs very efficiently, while modeling sentence structures explicitly via dependencies. Our model achieves state-of-the-art results on English and Japanese CCG parsing. |
Tasks | |
Published | 2017-04-23 |
URL | http://arxiv.org/abs/1704.06936v1, http://arxiv.org/pdf/1704.06936v1.pdf |
PWC | https://paperswithcode.com/paper/a-ccg-parsing-with-a-supertag-and-dependency |
Repo | https://github.com/masashi-y/depccg |
Framework | none |
Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey
Title | Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey |
Authors | Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao |
Abstract | Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles on topic modeling and applied it in various fields such as software engineering, political science, medicine, and linguistics. There are various methods for topic modeling, of which Latent Dirichlet Allocation (LDA) is one of the most popular. Researchers have proposed various models based on LDA. In light of previous work, this paper can be very useful and valuable for introducing LDA approaches in topic modeling. In this paper, we investigate scholarly articles (from 2003 to 2016) highly related to topic modeling based on LDA to discover the research development, current trends, and intellectual structure of topic modeling. We also summarize challenges and introduce well-known tools and datasets for topic modeling based on LDA. |
Tasks | |
Published | 2017-11-12 |
URL | http://arxiv.org/abs/1711.04305v2, http://arxiv.org/pdf/1711.04305v2.pdf |
PWC | https://paperswithcode.com/paper/171104305 |
Repo | https://github.com/thevisheshone/whatsthatsong |
Framework | none |
Learning to Estimate Pose by Watching Videos
Title | Learning to Estimate Pose by Watching Videos |
Authors | Prabuddha Chakraborty, Vinay P. Namboodiri |
Abstract | In this paper we propose a technique for obtaining coarse pose estimation of humans in an image that does not require any manual supervision. While a general unsupervised technique would fail to estimate human pose, we suggest that sufficient information about coarse pose can be obtained by observing human motion in multiple frames. Specifically, we consider obtaining surrogate supervision through videos as a means for obtaining motion-based grouping cues. We supplement the method using a basic object detector that detects persons. With just these components we obtain a rough estimate of the human pose. With these samples for training, we train a fully convolutional neural network (FCNN) [20] to obtain accurate dense blob-based pose estimation. We show that the results obtained are close to the ground truth and to the results obtained using a fully supervised convolutional pose estimation method [31] as evaluated on a challenging dataset [15]. This is further validated by evaluating the obtained poses using a pose-based action recognition method [5]. In this setting we outperform the baseline method that uses a fully supervised pose estimation algorithm, and we are competitive with a new baseline created using convolutional pose estimation with full supervision. |
Tasks | Pose Estimation, Temporal Action Localization |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.04081v1, http://arxiv.org/pdf/1704.04081v1.pdf |
PWC | https://paperswithcode.com/paper/learning-to-estimate-pose-by-watching-videos |
Repo | https://github.com/prabuddha1/acpe |
Framework | none |
Comicolorization: Semi-Automatic Manga Colorization
Title | Comicolorization: Semi-Automatic Manga Colorization |
Authors | Chie Furusawa, Kazuyuki Hiroshiba, Keisuke Ogaki, Yuri Odagiri |
Abstract | We developed “Comicolorization”, a semi-automatic colorization system for manga images. Given a monochrome manga and reference images as inputs, our system generates a plausible color version of the manga. This is the first work to address the colorization of an entire manga title (a set of manga pages). Our method colorizes a whole page (not a single panel) semi-automatically, with the same color for the same character across multiple panels. To colorize the target character by the color from the reference image, we extract a color feature from the reference and feed it to the colorization network to help the colorization. Our approach employs adversarial loss to encourage the effect of the color features. Optionally, our tool allows users to revise the colorization result interactively. By feeding the color features to our deep colorization network, we accomplish colorization of the entire manga using the desired colors for each panel. |
Tasks | Colorization |
Published | 2017-06-21 |
URL | http://arxiv.org/abs/1706.06759v4, http://arxiv.org/pdf/1706.06759v4.pdf |
PWC | https://paperswithcode.com/paper/comicolorization-semi-automatic-manga |
Repo | https://github.com/DwangoMediaVillage/Comicolorization |
Framework | none |
Domain-adaptive deep network compression
Title | Domain-adaptive deep network compression |
Authors | Marc Masana, Joost van de Weijer, Luis Herranz, Andrew D. Bagdanov, Jose M Alvarez |
Abstract | Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer. We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing. We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes into account the target domain, it can more optimally remove the redundancy in the weights. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x more than using truncated SVD alone – with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance. |
Tasks | |
Published | 2017-09-04 |
URL | http://arxiv.org/abs/1709.01041v2, http://arxiv.org/pdf/1709.01041v2.pdf |
PWC | https://paperswithcode.com/paper/domain-adaptive-deep-network-compression |
Repo | https://github.com/mmasana/DALR |
Framework | tf |
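The network-compression abstract above mentions a rank-constrained regression problem with a closed-form solution. One standard way to write such a problem is sketched below. The notation is assumed here for illustration and bias terms are ignored: $W$ is a layer's weight matrix, $X$ stacks that layer's input activations on the target domain as columns, and $k$ is the target rank. The paper's exact formulation may differ.

```latex
\hat{W} \;=\; \arg\min_{\operatorname{rank}(\tilde{W}) \le k}
  \bigl\lVert W X - \tilde{W} X \bigr\rVert_F^2 ,
\qquad
W X = U \Sigma V^{\top}
\;\Longrightarrow\;
\hat{W} = U_k U_k^{\top} W
```

Here $U_k$ holds the $k$ leading left singular vectors of $WX$. The product $\hat{W}X = U_k U_k^{\top} W X$ is the best rank-$k$ approximation of the layer's responses by the Eckart-Young theorem, and no $\tilde{W}$ of rank at most $k$ can do better, since $\tilde{W}X$ then also has rank at most $k$. Storing the factors $U_k$ and $U_k^{\top} W$ replaces $d_{\text{out}} d_{\text{in}}$ parameters with $k\,(d_{\text{out}} + d_{\text{in}})$. Truncated SVD of $W$ alone corresponds to the special case where the activation statistics are ignored.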
Efficient Feature Screening for Lasso-Type Problems via Hybrid Safe-Strong Rules
Title | Efficient Feature Screening for Lasso-Type Problems via Hybrid Safe-Strong Rules |
Authors | Yaohui Zeng, Tianbao Yang, Patrick Breheny |
Abstract | The lasso model has been widely used for model selection in data mining, machine learning, and high-dimensional statistical analysis. However, due to the ultrahigh-dimensional, large-scale data sets collected in many real-world applications, it remains challenging to solve the lasso problems even with state-of-the-art algorithms. Feature screening is a powerful technique for addressing the Big Data challenge by discarding inactive features from the lasso optimization. In this paper, we propose a family of hybrid safe-strong rules (HSSR) which incorporate safe screening rules into the sequential strong rule (SSR) to remove unnecessary computational burden. In particular, we present two instances of HSSR, namely SSR-Dome and SSR-BEDPP, for the standard lasso problem. We further extend SSR-BEDPP to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. Extensive numerical experiments with synthetic and real data sets are conducted for both the standard lasso and the group lasso problems. Results show that our proposed hybrid rules substantially outperform existing state-of-the-art rules. |
Tasks | Model Selection |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08742v2, http://arxiv.org/pdf/1704.08742v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-feature-screening-for-lasso-type |
Repo | https://github.com/YaohuiZeng/biglasso |
Framework | none |
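The sequential strong rule (SSR) named in the lasso screening abstract above can be sketched in a few lines. The sketch assumes the lasso objective is scaled as $\frac{1}{2n}\lVert y - X\beta\rVert_2^2 + \lambda\lVert\beta\rVert_1$ and that the residual at the previous regularization value is already available. Under that convention, feature $j$ is discarded at the next value when $|x_j^\top r| / n < 2\lambda_{\text{next}} - \lambda_{\text{prev}}$. The function name and the toy data are hypothetical, and the hybrid safe-strong rules from the abstract are not implemented here.

```python
def strong_rule_survivors(columns, residual, lam_prev, lam_next):
    """Indices of features kept by the sequential strong rule.

    columns  : list of feature columns, each a list of n values.
    residual : y - X @ beta_hat(lam_prev), a list of n values.
    Assumes the 1/(2n) scaling of the squared-error term.
    """
    n = len(residual)
    threshold = 2.0 * lam_next - lam_prev
    survivors = []
    for j, column in enumerate(columns):
        correlation = abs(sum(x * r for x, r in zip(column, residual))) / n
        if correlation >= threshold:
            survivors.append(j)
    return survivors


# Toy problem at the start of the path: beta is zero, so the residual equals y
# and lam_prev is the largest correlation, max_j |x_j . y| / n.
toy_columns = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.0],
]
toy_residual = [4.0, 1.0, 2.0, 0.0]
kept = strong_rule_survivors(toy_columns, toy_residual, lam_prev=1.0, lam_next=0.8)
```

The strong rule is a heuristic and can occasionally discard an active feature, so solvers follow it with a check of the optimality conditions on the discarded set. Reducing the cost of that check is the motivation the abstract gives for combining it with safe rules.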