July 29, 2019

Paper Group AWR 90

Real-Time User-Guided Image Colorization with Learned Deep Priors


Title	Real-Time User-Guided Image Colorization with Learned Deep Priors
Authors	Richard Zhang, Jun-Yan Zhu, Phillip Isola, Xinyang Geng, Angela S. Lin, Tianhe Yu, Alexei A. Efros
Abstract	We propose a deep learning approach for user-guided image colorization. The system directly maps a grayscale image, along with sparse, local user “hints” to an output colorization with a Convolutional Neural Network (CNN). Rather than using hand-defined rules, the network propagates user edits by fusing low-level cues along with high-level semantic information, learned from large-scale data. We train on a million images, with simulated user inputs. To guide the user towards efficient input selection, the system recommends likely colors based on the input image and current user inputs. The colorization is performed in a single feed-forward pass, enabling real-time use. Even with randomly simulated user inputs, we show that the proposed system helps novice users quickly create realistic colorizations, and offers large improvements in colorization quality with just a minute of use. In addition, we demonstrate that the framework can incorporate other user “hints” to the desired colorization, showing an application to color histogram transfer. Our code and models are available at https://richzhang.github.io/ideepcolor.
Tasks	Colorization
Published	2017-05-08
URL	http://arxiv.org/abs/1705.02999v1
PDF	http://arxiv.org/pdf/1705.02999v1.pdf
PWC	https://paperswithcode.com/paper/real-time-user-guided-image-colorization-with
Repo	https://github.com/junyanz/interactive-deep-colorization
Framework	pytorch

Setting an attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels


Title	Setting an attention region for convolutional neural networks using region selective features, for recognition of materials within glass vessels
Authors	Sagi Eppel
Abstract	Convolutional neural networks have emerged as the leading method for the classification and segmentation of images. In some cases, it is desirable to focus the attention of the net on a specific region in the image; one such case is the recognition of the contents of transparent vessels, where the vessel region in the image is already known. This work presents a valve filter approach for focusing the attention of the net on a region of interest (ROI). In this approach, the ROI is inserted into the net as a binary map. The net uses a different set of convolution filters for the ROI and background image regions, resulting in a different set of features being extracted from each region. More accurately, for each filter used on the image, a corresponding valve filter exists that acts on the ROI map and determines the regions in which the corresponding image filter will be used. This valve filter effectively acts as a valve that inhibits specific features in different image regions according to the ROI map. In addition, a new data set for images of materials in glassware vessels in a chemistry laboratory setting is presented. This data set contains a thousand images with pixel-wise annotation according to categories ranging from filled and empty to the exact phase of the material inside the vessel. The results of the valve filter approach and fully convolutional neural nets (FCN) with no ROI input are compared based on this data set.
Tasks
Published	2017-08-29
URL	http://arxiv.org/abs/1708.08711v3
PDF	http://arxiv.org/pdf/1708.08711v3.pdf
PWC	https://paperswithcode.com/paper/setting-an-attention-region-for-convolutional
Repo	https://github.com/sagieppel/Focusing-attention-of-Fully-convolutional-neural-networks-on-Region-of-interest-ROI-input-map-
Framework	tf
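
The valve-filter idea in the abstract above can be illustrated with a minimal plain-Python sketch. It assumes, since the abstract does not say so explicitly, that the valve filter's response to the binary ROI map is multiplied element-wise with the image filter's response, so that a zero valve response suppresses the feature at that pixel. The 1x1 "filters" and the names `conv1x1` and `valve_gated_feature` are hypothetical and chosen only for brevity.

```python
def conv1x1(grid, weight, bias=0.0):
    """Apply a 1x1 convolution (a per-pixel affine map) to a 2D grid."""
    return [[weight * v + bias for v in row] for row in grid]


def valve_gated_feature(image, roi, image_weight, valve_weight, valve_bias):
    """Gate one image feature map by the response of its valve filter.

    Assumption: gating is an element-wise product of the two responses.
    """
    feature = conv1x1(image, image_weight)          # image filter response
    valve = conv1x1(roi, valve_weight, valve_bias)  # valve filter response on the ROI map
    return [[f * g for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(feature, valve)]


image = [[1.0, 2.0], [3.0, 4.0]]
roi = [[1, 0], [0, 1]]  # binary ROI map: 1 inside the vessel region, 0 outside

# A valve that passes the feature only inside the ROI.
inside = valve_gated_feature(image, roi, 2.0, valve_weight=1.0, valve_bias=0.0)
# A valve that passes the same feature only in the background.
outside = valve_gated_feature(image, roi, 2.0, valve_weight=-1.0, valve_bias=1.0)

print(inside)   # [[2.0, 0.0], [0.0, 8.0]]
print(outside)  # [[0.0, 4.0], [6.0, 0.0]]
```

With different valve weights per filter, the same image filter bank can yield one set of active features inside the ROI and another in the background, which is the behaviour the abstract describes.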

On the Complexity of Learning Neural Networks


Title	On the Complexity of Learning Neural Networks
Authors	Le Song, Santosh Vempala, John Wilmes, Bo Xie
Abstract	The stunning empirical successes of neural networks currently lack rigorous theoretical explanation. What form would such an explanation take, in the face of existing complexity-theoretic lower bounds? A first step might be to show that data generated by neural networks with a single hidden layer, smooth activation functions and benign input distributions can be learned efficiently. We demonstrate here a comprehensive lower bound ruling out this possibility: for a wide class of activation functions (including all currently used), and inputs drawn from any logconcave distribution, there is a family of one-hidden-layer functions whose output is a sum gate, that are hard to learn in a precise sense: any statistical query algorithm (which includes all known variants of stochastic gradient descent with any loss function) needs an exponential number of queries even using tolerance inversely proportional to the input dimensionality. Moreover, this hard family of functions is realizable with a small (sublinear in dimension) number of activation units in the single hidden layer. The lower bound is also robust to small perturbations of the true weights. Systematic experiments illustrate a phase transition in the training error as predicted by the analysis.
Tasks
Published	2017-07-14
URL	http://arxiv.org/abs/1707.04615v1
PDF	http://arxiv.org/pdf/1707.04615v1.pdf
PWC	https://paperswithcode.com/paper/on-the-complexity-of-learning-neural-networks
Repo	https://github.com/timbrgr/yellow-brick-road-to-MrLd-city
Framework	tf

Towards a quality metric for dense light fields


Title	Towards a quality metric for dense light fields
Authors	Vamsi Kiran Adhikarla, Marek Vinkler, Denis Sumin, Rafał K. Mantiuk, Karol Myszkowski, Hans-Peter Seidel, Piotr Didyk
Abstract	Light fields have become a popular representation of three-dimensional scenes, and there is interest in their processing, resampling, and compression. As those operations often result in loss of quality, there is a need to quantify it. In this work, we collect a new dataset of dense reference and distorted light fields as well as the corresponding quality scores, which are scaled in perceptual units. The scores were acquired in a subjective experiment using an interactive light-field viewing setup. The dataset contains typical artifacts that occur in the light-field processing chain due to light-field reconstruction, multi-view compression, and limitations of automultiscopic displays. We test a number of existing objective quality metrics to determine how well they can predict the quality of light fields. We find that the existing image quality metrics provide good measures of light-field quality, but require dense reference light fields for optimal performance. For more complex tasks of comparing two distorted light fields, their performance drops significantly, which reveals the need for new, light-field-specific metrics.
Tasks
Published	2017-04-25
URL	http://arxiv.org/abs/1704.07576v1
PDF	http://arxiv.org/pdf/1704.07576v1.pdf
PWC	https://paperswithcode.com/paper/towards-a-quality-metric-for-dense-light
Repo	https://github.com/mantiuk/pwcmp
Framework	none

Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking


Title	Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking
Authors	Di Kang, Zheng Ma, Antoni B. Chan
Abstract	For crowded scenes, the accuracy of object-based computer vision methods declines when the images are low-resolution and objects have severe occlusions. Taking counting methods for example, almost all the recent state-of-the-art counting methods bypass explicit detection and adopt regression-based methods to directly count the objects of interest. Among regression-based methods, density map estimation, where the number of objects inside a subregion is the integral of the density map over that subregion, is especially promising because it preserves spatial information, which makes it useful for both counting and localization (detection and tracking). With the power of deep convolutional neural networks (CNNs) the counting performance has improved steadily. The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking. Most existing CNN methods produce density maps with resolution that is smaller than the original images, due to the downsample strides in the convolution/pooling operations. To produce an original-resolution density map, we also evaluate a classical CNN that uses a sliding window regressor to predict the density for every pixel in the image. We also consider a fully convolutional (FCNN) adaptation, with skip connections from lower convolutional layers to compensate for loss in spatial information during upsampling. In our experiments, we found that the lower-resolution density maps sometimes have better counting performance. In contrast, the original-resolution density maps improved localization tasks, such as detection and tracking, compared to bilinear upsampling the lower-resolution density maps. Finally, we also propose several metrics for measuring the quality of a density map, and relate them to experiment results on counting and localization.
Tasks	Density Estimation
Published	2017-05-29
URL	http://arxiv.org/abs/1705.10118v2
PDF	http://arxiv.org/pdf/1705.10118v2.pdf
PWC	https://paperswithcode.com/paper/beyond-counting-comparisons-of-density-maps
Repo	https://github.com/krutikabapat/Crowd_Counting
Framework	none
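
The abstract above defines the object count in a subregion as the integral of the density map over that subregion. On a pixel grid the integral reduces to a sum, which the following sketch shows. The toy density values and the helper name `region_count` are hypothetical.

```python
def region_count(density, top, left, bottom, right):
    """Sum the density over rows top..bottom-1 and columns left..right-1."""
    return sum(sum(row[left:right]) for row in density[top:bottom])


# Toy 4x4 density map: one object spread over the upper-left 2x2 block,
# a second object spread over two pixels in the third row.
density = [
    [0.25, 0.25, 0.0, 0.0],
    [0.25, 0.25, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0],
]

total = region_count(density, 0, 0, 4, 4)       # whole image
upper_left = region_count(density, 0, 0, 2, 2)  # upper-left quadrant

print(total)       # 2.0
print(upper_left)  # 1.0
```

Because the map keeps spatial layout, the same array answers both the global counting question and region-level queries, which is why the abstract treats it as useful for localization as well.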

EX2: Exploration with Exemplar Models for Deep Reinforcement Learning


Title	EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
Authors	Justin Fu, John D. Co-Reyes, Sergey Levine
Abstract	Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes. However, sparse reward problems remain a significant challenge. Exploration methods based on novelty detection have been particularly successful in such settings but typically require generative or predictive models of the observations, which can be difficult to train when the observations are very high-dimensional and complex, as in the case of raw images. We propose a novelty detection algorithm for exploration that is based entirely on discriminatively trained exemplar models, where classifiers are trained to discriminate each visited state against all others. Intuitively, novel states are easier to distinguish against other states seen during training. We show that this kind of discriminative modeling corresponds to implicit density estimation, and that it can be combined with count-based exploration to produce competitive results on a range of popular benchmark tasks, including state-of-the-art results on challenging egocentric observations in the vizDoom benchmark.
Tasks	Density Estimation
Published	2017-03-03
URL	http://arxiv.org/abs/1703.01260v2
PDF	http://arxiv.org/pdf/1703.01260v2.pdf
PWC	https://paperswithcode.com/paper/ex2-exploration-with-exemplar-models-for-deep
Repo	https://github.com/jcoreyes/ex2
Framework	none

Finding Streams in Knowledge Graphs to Support Fact Checking


Title	Finding Streams in Knowledge Graphs to Support Fact Checking
Authors	Prashant Shiralkar, Alessandro Flammini, Filippo Menczer, Giovanni Luca Ciampaglia
Abstract	The volume and velocity of information that gets generated online limits current journalistic practices to fact-check claims at the same rate. Computational approaches for fact checking may be the key to help mitigate the risks of massive misinformation spread. Such approaches can be designed to not only be scalable and effective at assessing veracity of dubious claims, but also to boost a human fact checker’s productivity by surfacing relevant facts and patterns to aid their analysis. To this end, we present a novel, unsupervised network-flow based approach to determine the truthfulness of a statement of fact expressed in the form of a (subject, predicate, object) triple. We view a knowledge graph of background information about real-world entities as a flow network, and knowledge as a fluid, abstract commodity. We show that computational fact checking of such a triple then amounts to finding a “knowledge stream” that emanates from the subject node and flows toward the object node through paths connecting them. Evaluation on a range of real-world and hand-crafted datasets of facts related to entertainment, business, sports, geography and more reveals that this network-flow model can be very effective in discerning true statements from false ones, outperforming existing algorithms on many test cases. Moreover, the model is expressive in its ability to automatically discover several useful path patterns and surface relevant facts that may help a human fact checker corroborate or refute a claim.
Tasks	Knowledge Graphs
Published	2017-08-24
URL	http://arxiv.org/abs/1708.07239v1
PDF	http://arxiv.org/pdf/1708.07239v1.pdf
PWC	https://paperswithcode.com/paper/finding-streams-in-knowledge-graphs-to
Repo	https://github.com/shiralkarprashant/knowledgestream
Framework	tf

Softmax GAN


Title	Softmax GAN
Authors	Min Lin
Abstract	Softmax GAN is a novel variant of Generative Adversarial Network (GAN). The key idea of Softmax GAN is to replace the classification loss in the original GAN with a softmax cross-entropy loss in the sample space of one single batch. In the adversarial learning of $N$ real training samples and $M$ generated samples, the target of discriminator training is to distribute all the probability mass to the real samples, each with probability $\frac{1}{M}$, and distribute zero probability to generated data. In the generator training phase, the target is to assign equal probability to all data points in the batch, each with probability $\frac{1}{M+N}$. While the original GAN is closely related to Noise Contrastive Estimation (NCE), we show that Softmax GAN is the Importance Sampling version of GAN. We further demonstrate with experiments that this simple change stabilizes GAN training.
Tasks
Published	2017-04-20
URL	http://arxiv.org/abs/1704.06191v1
PDF	http://arxiv.org/pdf/1704.06191v1.pdf
PWC	https://paperswithcode.com/paper/softmax-gan
Repo	https://github.com/eriklindernoren/PyTorch-GAN
Framework	pytorch
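
As a rough illustration of the batch-level softmax cross-entropy described in the abstract above, the sketch below computes both training targets in plain Python. It is not the author's implementation. It assumes a convention in which a higher score gives a sample more probability mass (the paper may use the opposite sign), and the example uses equal numbers of real and generated samples. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over all samples in one batch."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def softmax_gan_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for one batch."""
    n_real, n_fake = len(real_scores), len(fake_scores)
    probs = softmax(real_scores + fake_scores)
    # Discriminator target: all mass spread uniformly over the real samples.
    d_target = [1.0 / n_real] * n_real + [0.0] * n_fake
    # Generator target: equal mass on every sample in the batch.
    g_target = [1.0 / (n_real + n_fake)] * (n_real + n_fake)
    return cross_entropy(d_target, probs), cross_entropy(g_target, probs)


# When the scores cannot tell real from generated, both losses equal log(4).
d_loss, g_loss = softmax_gan_losses([0.0, 0.0], [0.0, 0.0])
# When real samples score much higher, the discriminator loss drops
# and the generator loss rises.
d_sep, g_sep = softmax_gan_losses([5.0, 5.0], [-5.0, -5.0])
```

The generator loss is minimized exactly when the softmax is uniform over the batch, that is, when the scores no longer separate real from generated samples.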

A Neural Representation of Sketch Drawings


Title	A Neural Representation of Sketch Drawings
Authors	David Ha, Douglas Eck
Abstract	We present sketch-rnn, a recurrent neural network (RNN) able to construct stroke-based drawings of common objects. The model is trained on thousands of crude human-drawn images representing hundreds of classes. We outline a framework for conditional and unconditional sketch generation, and describe new robust training methods for generating coherent sketch drawings in a vector format.
Tasks
Published	2017-04-11
URL	http://arxiv.org/abs/1704.03477v4
PDF	http://arxiv.org/pdf/1704.03477v4.pdf
PWC	https://paperswithcode.com/paper/a-neural-representation-of-sketch-drawings
Repo	https://github.com/thinkingmachines/christmAIs
Framework	tf

A* CCG Parsing with a Supertag and Dependency Factored Model


Title	A* CCG Parsing with a Supertag and Dependency Factored Model
Authors	Masashi Yoshikawa, Hiroshi Noji, Yuji Matsumoto
Abstract	We propose a new A* CCG parsing model in which the probability of a tree is decomposed into factors of CCG categories and its syntactic dependencies both defined on bi-directional LSTMs. Our factored model allows the precomputation of all probabilities and runs very efficiently, while modeling sentence structures explicitly via dependencies. Our model achieves the state-of-the-art results on English and Japanese CCG parsing.
Tasks
Published	2017-04-23
URL	http://arxiv.org/abs/1704.06936v1
PDF	http://arxiv.org/pdf/1704.06936v1.pdf
PWC	https://paperswithcode.com/paper/a-ccg-parsing-with-a-supertag-and-dependency
Repo	https://github.com/masashi-y/depccg
Framework	none

Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey


Title	Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey
Authors	Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao
Abstract	Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data and text documents. Researchers have published many articles in the field of topic modeling and applied it in various fields such as software engineering, political science, medical and linguistic science, etc. There are various methods for topic modeling, of which Latent Dirichlet Allocation (LDA) is one of the most popular. Researchers have proposed various models based on LDA in topic modeling. Building on this previous work, this paper can be useful and valuable for introducing LDA approaches in topic modeling. In this paper, we investigated scholarly articles (from 2003 to 2016) closely related to topic modeling based on LDA to discover the research development, current trends and intellectual structure of topic modeling. We also summarize challenges and introduce well-known tools and datasets in topic modeling based on LDA.
Tasks
Published	2017-11-12
URL	http://arxiv.org/abs/1711.04305v2
PDF	http://arxiv.org/pdf/1711.04305v2.pdf
PWC	https://paperswithcode.com/paper/171104305
Repo	https://github.com/thevisheshone/whatsthatsong
Framework	none

Learning to Estimate Pose by Watching Videos


Title	Learning to Estimate Pose by Watching Videos
Authors	Prabuddha Chakraborty, Vinay P. Namboodiri
Abstract	In this paper we propose a technique for obtaining coarse pose estimation of humans in an image that does not require any manual supervision. While a general unsupervised technique would fail to estimate human pose, we suggest that sufficient information about coarse pose can be obtained by observing human motion in multiple frames. Specifically, we consider obtaining surrogate supervision through videos as a means for obtaining motion-based grouping cues. We supplement the method using a basic object detector that detects persons. With just these components we obtain a rough estimate of the human pose. With these samples for training, we train a fully convolutional neural network (FCNN) [20] to obtain accurate dense blob-based pose estimation. We show that the results obtained are close to the ground truth and to the results obtained using a fully supervised convolutional pose estimation method [31] as evaluated on a challenging dataset [15]. This is further validated by evaluating the obtained poses using a pose-based action recognition method [5]. In this setting we outperform the results obtained using the baseline method that uses a fully supervised pose estimation algorithm, and are competitive with a new baseline created using convolutional pose estimation with full supervision.
Tasks	Pose Estimation, Temporal Action Localization
Published	2017-04-13
URL	http://arxiv.org/abs/1704.04081v1
PDF	http://arxiv.org/pdf/1704.04081v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-estimate-pose-by-watching-videos
Repo	https://github.com/prabuddha1/acpe
Framework	none

Comicolorization: Semi-Automatic Manga Colorization


Title	Comicolorization: Semi-Automatic Manga Colorization
Authors	Chie Furusawa, Kazuyuki Hiroshiba, Keisuke Ogaki, Yuri Odagiri
Abstract	We developed “Comicolorization”, a semi-automatic colorization system for manga images. Given a monochrome manga and reference images as inputs, our system generates a plausible color version of the manga. This is the first work to address the colorization of an entire manga title (a set of manga pages). Our method colorizes a whole page (not a single panel) semi-automatically, with the same color for the same character across multiple panels. To colorize the target character by the color from the reference image, we extract a color feature from the reference and feed it to the colorization network to help the colorization. Our approach employs adversarial loss to encourage the effect of the color features. Optionally, our tool allows users to revise the colorization result interactively. By feeding the color features to our deep colorization network, we accomplish colorization of the entire manga using the desired colors for each panel.
Tasks	Colorization
Published	2017-06-21
URL	http://arxiv.org/abs/1706.06759v4
PDF	http://arxiv.org/pdf/1706.06759v4.pdf
PWC	https://paperswithcode.com/paper/comicolorization-semi-automatic-manga
Repo	https://github.com/DwangoMediaVillage/Comicolorization
Framework	none

Domain-adaptive deep network compression


Title	Domain-adaptive deep network compression
Authors	Marc Masana, Joost van de Weijer, Luis Herranz, Andrew D. Bagdanov, Jose M Alvarez
Abstract	Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer. We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing. We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes into account the target domain, it can more optimally remove the redundancy in the weights. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x more than using truncated SVD alone – with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance.
Tasks
Published	2017-09-04
URL	http://arxiv.org/abs/1709.01041v2
PDF	http://arxiv.org/pdf/1709.01041v2.pdf
PWC	https://paperswithcode.com/paper/domain-adaptive-deep-network-compression
Repo	https://github.com/mmasana/DALR
Framework	tf

Efficient Feature Screening for Lasso-Type Problems via Hybrid Safe-Strong Rules


Title	Efficient Feature Screening for Lasso-Type Problems via Hybrid Safe-Strong Rules
Authors	Yaohui Zeng, Tianbao Yang, Patrick Breheny
Abstract	The lasso model has been widely used for model selection in data mining, machine learning, and high-dimensional statistical analysis. However, due to the ultrahigh-dimensional, large-scale data sets collected in many real-world applications, it remains challenging to solve the lasso problems even with state-of-the-art algorithms. Feature screening is a powerful technique for addressing the Big Data challenge by discarding inactive features from the lasso optimization. In this paper, we propose a family of hybrid safe-strong rules (HSSR) which incorporate safe screening rules into the sequential strong rule (SSR) to remove unnecessary computational burden. In particular, we present two instances of HSSR, namely SSR-Dome and SSR-BEDPP, for the standard lasso problem. We further extend SSR-BEDPP to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. Extensive numerical experiments with synthetic and real data sets are conducted for both the standard lasso and the group lasso problems. Results show that our proposed hybrid rules substantially outperform existing state-of-the-art rules.
Tasks	Model Selection
Published	2017-04-27
URL	http://arxiv.org/abs/1704.08742v2
PDF	http://arxiv.org/pdf/1704.08742v2.pdf
PWC	https://paperswithcode.com/paper/efficient-feature-screening-for-lasso-type
Repo	https://github.com/YaohuiZeng/biglasso
Framework	none
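
The sequential strong rule (SSR) that the abstract above builds on can be sketched in a few lines of plain Python. The sketch assumes the lasso objective (1/2)||y - Xb||^2 + lambda * ||b||_1 and the standard SSR form: moving from lambda_prev to lambda_next, feature j is discarded when |x_j^T r| < 2 * lambda_next - lambda_prev, where r is the residual at lambda_prev. The `candidates` argument is a hypothetical stand-in for the set of features that survived a prior safe rule. It is not the SSR-Dome or SSR-BEDPP rule from the paper, whose details are not given here.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sequential_strong_rule(columns, residual, lam_prev, lam_next, candidates=None):
    """Return the indices of features kept by the sequential strong rule.

    columns    : list of feature columns x_j
    residual   : y - X b evaluated at the solution for lam_prev
    candidates : indices left after a prior safe rule (hypothetical input);
                 None means every feature is tested
    """
    if candidates is None:
        candidates = range(len(columns))
    threshold = 2 * lam_next - lam_prev
    return [j for j in candidates if abs(dot(columns[j], residual)) >= threshold]


columns = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.1]]
y = [3.0, 2.0, 1.0]

# At lam_max the lasso solution is all zeros, so the residual equals y.
lam_max = max(abs(dot(x, y)) for x in columns)

kept = sequential_strong_rule(columns, y, lam_prev=lam_max, lam_next=1.8)
kept_after_safe = sequential_strong_rule(columns, y, lam_max, 1.8, candidates=[1, 2])

print(kept)             # [0, 1]
print(kept_after_safe)  # [1]
```

The strong rule is a heuristic and can occasionally discard an active feature, so in practice the result is checked against the KKT conditions. Restricting the test to features that a safe rule has not already eliminated shrinks both the screening and the checking work, which is the kind of saving the abstract attributes to the hybrid rules.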