May 7, 2019


Paper Group AWR 45

Recurrent Highway Networks. Compression Artifacts Removal Using Convolutional Neural Networks. Convolutional Oriented Boundaries. Fast Fourier Color Constancy. Pixel Recurrent Neural Networks. Sequence-Level Knowledge Distillation. Simultaneous Surface Reflectance and Fluorescence Spectra Estimation. On Regularization Parameter Estimation under Cov …

Recurrent Highway Networks


Title	Recurrent Highway Networks
Authors	Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber
Abstract	Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with ‘deep’ transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Gersgorin’s circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character.
Tasks	Language Modelling
Published	2016-07-12
URL	http://arxiv.org/abs/1607.03474v5
PDF	http://arxiv.org/pdf/1607.03474v5.pdf
PWC	https://paperswithcode.com/paper/recurrent-highway-networks
Repo	https://github.com/davidsvaughn/dts-tf
Framework	tf
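
The idea of a transition depth larger than one can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a highway-style update in which a transform gate `t` mixes a candidate `h` with the carried state, uses the coupled carry gate `1 - t`, and feeds the input `x` only into the first micro-layer. All names (`rhn_step`, `W_h`, `R_h`, and so on) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rhn_step(x, s, layers):
    """Advance the state s by one time step through a stack of micro-layers.

    Each layer is a dict with recurrent weights R_h, R_t and biases b_h, b_t.
    The first layer also holds input weights W_h, W_t (assumption: the input
    is only seen by the first micro-layer).
    """
    for depth, p in enumerate(layers):
        pre_h = [a + b for a, b in zip(matvec(p["R_h"], s), p["b_h"])]
        pre_t = [a + b for a, b in zip(matvec(p["R_t"], s), p["b_t"])]
        if depth == 0:
            pre_h = [a + b for a, b in zip(pre_h, matvec(p["W_h"], x))]
            pre_t = [a + b for a, b in zip(pre_t, matvec(p["W_t"], x))]
        h = [math.tanh(z) for z in pre_h]   # candidate state
        t = [sigmoid(z) for z in pre_t]     # transform gate
        # Coupled carry gate (assumed): keep (1 - t) of the old state.
        s = [hi * ti + si * (1.0 - ti) for hi, ti, si in zip(h, t, s)]
    return s
```

The number of entries in `layers` plays the role of the transition depth: one entry gives a single gated update per time step, ten entries give ten.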

Compression Artifacts Removal Using Convolutional Neural Networks


Title	Compression Artifacts Removal Using Convolutional Neural Networks
Authors	Pavel Svoboda, Michal Hradis, David Barina, Pavel Zemcik
Abstract	This paper shows that it is possible to train large and deep convolutional neural networks (CNN) for JPEG compression artifacts reduction, and that such networks can provide significantly better reconstruction quality compared to previously used smaller networks as well as to any other state-of-the-art methods. We were able to train networks with 8 layers in a single step and in relatively short time by combining residual learning, skip architecture, and symmetric weight initialization. We provide further insights into convolution networks for JPEG artifact reduction by evaluating three different objectives, generalization with respect to training dataset size, and generalization with respect to JPEG quality level.
Tasks
Published	2016-05-02
URL	http://arxiv.org/abs/1605.00366v1
PDF	http://arxiv.org/pdf/1605.00366v1.pdf
PWC	https://paperswithcode.com/paper/compression-artifacts-removal-using
Repo	https://github.com/ShakedDovrat/JpegArtifactRemoval
Framework	none

Convolutional Oriented Boundaries


Title	Convolutional Oriented Boundaries
Authors	Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc Van Gool
Abstract	We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments on BSDS, PASCAL Context, PASCAL Segmentation, and MS-COCO, showing that COB provides state-of-the-art contours, region hierarchies, and object proposals in all datasets.
Tasks	Contour Detection, Image Classification
Published	2016-08-09
URL	http://arxiv.org/abs/1608.02755v1
PDF	http://arxiv.org/pdf/1608.02755v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-oriented-boundaries
Repo	https://github.com/kmaninis/COB
Framework	none

Fast Fourier Color Constancy


Title	Fast Fourier Color Constancy
Authors	Jonathan T. Barron, Yun-Ta Tsai
Abstract	We present Fast Fourier Color Constancy (FFCC), a color constancy algorithm which solves illuminant estimation by reducing it to a spatial localization task on a torus. By operating in the frequency domain, FFCC produces lower error rates than the previous state-of-the-art by 13-20% while being 250-3000 times faster. This unconventional approach introduces challenges regarding aliasing, directional statistics, and preconditioning, which we address. By producing a complete posterior distribution over illuminants instead of a single illuminant estimate, FFCC enables better training techniques, an effective temporal smoothing technique, and richer methods for error analysis. Our implementation of FFCC runs at ~700 frames per second on a mobile device, allowing it to be used as an accurate, real-time, temporally-coherent automatic white balance algorithm.
Tasks	Color Constancy
Published	2016-11-23
URL	http://arxiv.org/abs/1611.07596v2
PDF	http://arxiv.org/pdf/1611.07596v2.pdf
PWC	https://paperswithcode.com/paper/fast-fourier-color-constancy
Repo	https://github.com/google/ffcc
Framework	none

Pixel Recurrent Neural Networks


Title	Pixel Recurrent Neural Networks
Authors	Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu
Abstract	Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.
Tasks	Image Generation
Published	2016-01-25
URL	http://arxiv.org/abs/1601.06759v3
PDF	http://arxiv.org/pdf/1601.06759v3.pdf
PWC	https://paperswithcode.com/paper/pixel-recurrent-neural-networks
Repo	https://github.com/ardapekis/pixel-rnn
Framework	pytorch

Sequence-Level Knowledge Distillation


Title	Sequence-Level Knowledge Distillation
Authors	Yoon Kim, Alexander M. Rush
Abstract	Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neural models in other domains to the problem of NMT. We demonstrate that standard knowledge distillation applied to word-level prediction can be effective for NMT, and also introduce two novel sequence-level versions of knowledge distillation that further improve performance, and somewhat surprisingly, seem to eliminate the need for beam search (even when applied on the original teacher model). Our best student model runs 10 times faster than its state-of-the-art teacher with little loss in performance. It is also significantly better than a baseline model trained without knowledge distillation: by 4.2/1.7 BLEU with greedy decoding/beam search. Applying weight pruning on top of knowledge distillation results in a student model that has 13 times fewer parameters than the original teacher model, with a decrease of 0.4 BLEU.
Tasks	Machine Translation
Published	2016-06-25
URL	http://arxiv.org/abs/1606.07947v4
PDF	http://arxiv.org/pdf/1606.07947v4.pdf
PWC	https://paperswithcode.com/paper/sequence-level-knowledge-distillation
Repo	https://github.com/harvardnlp/nmt-android
Framework	torch
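
As a rough illustration of the word-level variant mentioned in the abstract, the sketch below computes a distillation loss for a single target position: the cross-entropy between the teacher's distribution and the student's, mixed with the usual negative log-likelihood of the gold word. The mixing weight `alpha` and all function names are assumptions made for this example, not details taken from the paper's code.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_level_kd_loss(student_logits, teacher_probs, gold_index, alpha=0.5):
    """Mix gold-label NLL with cross-entropy against the teacher distribution."""
    p = softmax(student_logits)
    nll = -math.log(p[gold_index])
    kd = -sum(q * math.log(pk) for q, pk in zip(teacher_probs, p) if q > 0.0)
    return (1.0 - alpha) * nll + alpha * kd
```

With `alpha=1.0` the loss depends only on the teacher, and it is smallest when the student reproduces the teacher's distribution exactly.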

Simultaneous Surface Reflectance and Fluorescence Spectra Estimation


Title	Simultaneous Surface Reflectance and Fluorescence Spectra Estimation
Authors	Henryk Blasinski, Joyce Farrell, Brian Wandell
Abstract	There is widespread interest in estimating the fluorescence properties of natural materials in an image. However, the separation between reflected and fluoresced components is difficult, because it is impossible to distinguish reflected and fluoresced photons without controlling the illuminant spectrum. We show how to jointly estimate the reflectance and fluorescence from a single set of images acquired under multiple illuminants. We present a framework based on a linear approximation to the physical equations describing image formation in terms of surface spectral reflectance and fluorescence due to multiple fluorophores. We relax the non-convex, inverse estimation problem in order to jointly estimate the reflectance and fluorescence properties in a single optimization step and we use the Alternating Direction Method of Multipliers (ADMM) approach to efficiently find a solution. We provide a software implementation of the solver for our method and prior methods. We evaluate the accuracy and reliability of the method using both simulations and experimental data. To acquire data to test the methods, we built a custom imaging system using a monochrome camera, a filter wheel with bandpass transmissive filters and a small number of light emitting diodes. We compared the system and algorithm performance with the ground truth as well as with prior methods. Our approach produces lower errors compared to earlier algorithms.
Tasks
Published	2016-05-13
URL	http://arxiv.org/abs/1605.04243v1
PDF	http://arxiv.org/pdf/1605.04243v1.pdf
PWC	https://paperswithcode.com/paper/simultaneous-surface-reflectance-and
Repo	https://github.com/hblasins/fiToolbox
Framework	none

On Regularization Parameter Estimation under Covariate Shift


Title	On Regularization Parameter Estimation under Covariate Shift
Authors	Wouter M. Kouw, Marco Loog
Abstract	This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which cannot be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter.
Tasks	Domain Adaptation, L2 Regularization
Published	2016-07-31
URL	http://arxiv.org/abs/1608.00250v1
PDF	http://arxiv.org/pdf/1608.00250v1.pdf
PWC	https://paperswithcode.com/paper/on-regularization-parameter-estimation-under
Repo	https://github.com/wmkouw/covshift-l2reg
Framework	none
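
The importance-weighting correction discussed in the abstract can be sketched as follows. The example assumes one-dimensional Gaussian source and target densities that are known exactly, which is a simplification for illustration only; in practice the weights must be estimated, and the abstract states that this correction is not sufficient on its own. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, source, target):
    """Density ratio target(x) / source(x); source and target are (mean, std)."""
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_validation_risk(losses, weights):
    """Average of source validation losses, each scaled by its importance weight."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

A candidate regularization parameter would then be scored by `weighted_validation_risk` on held-out source data instead of by the plain average loss.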

Probabilistic Data Analysis with Probabilistic Programming


Title	Probabilistic Data Analysis with Probabilistic Programming
Authors	Feras Saad, Vikash Mansinghka
Abstract	Probabilistic techniques are central to data analysis, but different approaches can be difficult to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include hierarchical Bayesian models, multivariate kernel methods, discriminative machine learning, clustering algorithms, dimensionality reduction, and arbitrary probabilistic programs. We also demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling language and a structured query language. The practical value is illustrated in two ways. First, CGPMs are used in an analysis that identifies satellite data records which probably violate Kepler’s Third Law, by composing causal probabilistic programs with non-parametric Bayes in under 50 lines of probabilistic code. Second, for several representative data analysis tasks, we report on lines of code and accuracy measurements of various CGPMs, plus comparisons with standard baseline solutions from Python and MATLAB libraries.
Tasks	Dimensionality Reduction, Probabilistic Programming
Published	2016-08-18
URL	http://arxiv.org/abs/1608.05347v1
PDF	http://arxiv.org/pdf/1608.05347v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-data-analysis-with
Repo	https://github.com/probcomp/cgpm
Framework	none

Revisiting Classifier Two-Sample Tests


Title	Revisiting Classifier Two-Sample Tests
Authors	David Lopez-Paz, Maxime Oquab
Abstract	The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis “$P = Q$” is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
Tasks	Causal Discovery
Published	2016-10-20
URL	http://arxiv.org/abs/1610.06545v4
PDF	http://arxiv.org/pdf/1610.06545v4.pdf
PWC	https://paperswithcode.com/paper/revisiting-classifier-two-sample-tests
Repo	https://github.com/lopezpaz/classifier_tests
Framework	none
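
The procedure described in the abstract (label one sample positive, the other negative, train a classifier, check held-out accuracy against chance) can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes one-dimensional samples, uses a nearest-class-mean rule as a stand-in for any binary classifier, and relies on a normal approximation to the null distribution of the accuracy. The function name `c2st` is hypothetical.

```python
import math
import random


def c2st(sample_p, sample_q, seed=0):
    """Return (held-out accuracy, one-sided p-value) for H0: P = Q."""
    rng = random.Random(seed)
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    if not pos or not neg:
        raise ValueError("training split needs examples from both samples")
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)

    # Nearest-class-mean classifier, evaluated on the held-out half.
    correct = 0
    for x, y in test:
        predicted = 1 if abs(x - mean_pos) < abs(x - mean_neg) else 0
        correct += predicted == y
    accuracy = correct / len(test)

    # Assumed null: accuracy is roughly Normal(0.5, 1 / (4 * n_test)).
    z = (accuracy - 0.5) / math.sqrt(0.25 / len(test))
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return accuracy, p_value
```

The returned accuracy is the test statistic in interpretable units: values near 0.5 are consistent with the null hypothesis, while values well above 0.5 suggest the two samples differ.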

Person Re-identification: Past, Present and Future


Title	Person Re-identification: Past, Present and Future
Authors	Liang Zheng, Yi Yang, Alexander G. Hauptmann
Abstract	Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) briefly discusses some important yet under-developed issues.
Tasks	Image Classification, Person Re-Identification
Published	2016-10-10
URL	http://arxiv.org/abs/1610.02984v1
PDF	http://arxiv.org/pdf/1610.02984v1.pdf
PWC	https://paperswithcode.com/paper/person-re-identification-past-present-and
Repo	https://github.com/shumming/Person_ReID
Framework	none

Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline


Title	Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline
Authors	Zhiguang Wang, Weizhong Yan, Tim Oates
Abstract	We propose a simple but strong baseline for time series classification from scratch with deep neural networks. Our proposed baseline models are pure end-to-end without any heavy preprocessing on the raw data or feature crafting. The proposed Fully Convolutional Network (FCN) achieves premium performance compared to other state-of-the-art approaches, and our exploration of very deep neural networks with the ResNet structure is also competitive. The global average pooling in our convolutional model enables the exploitation of the Class Activation Map (CAM) to find the contributing region in the raw data for specific labels. Our models provide a simple choice for real-world applications and a good starting point for future research. An overall analysis is provided to discuss the generalization capability of our models, learned features, network structures and the classification semantics.
Tasks	Time Series, Time Series Classification
Published	2016-11-20
URL	http://arxiv.org/abs/1611.06455v4
PDF	http://arxiv.org/pdf/1611.06455v4.pdf
PWC	https://paperswithcode.com/paper/time-series-classification-from-scratch-with
Repo	https://github.com/cauchyturing/UCR_Time_Series_Classification_Deep_Learning_Baseline
Framework	tf
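
The link between global average pooling and the Class Activation Map can be made concrete with a small sketch. It assumes the final convolutional layer outputs one feature sequence per channel and that the class score is a bias-free linear function of the pooled channels; under those assumptions the CAM at each time step is the weighted sum of channel activations, and its time average equals the class score. Function names are hypothetical.

```python
def global_average_pool(feature_maps):
    """Average each channel over time. feature_maps: list of per-channel sequences."""
    return [sum(channel) / len(channel) for channel in feature_maps]


def class_score(feature_maps, class_weights):
    """Linear class score on top of the pooled channels (no bias, by assumption)."""
    pooled = global_average_pool(feature_maps)
    return sum(w * p for w, p in zip(class_weights, pooled))


def class_activation_map(feature_maps, class_weights):
    """Per-time-step contribution to the class score."""
    length = len(feature_maps[0])
    return [
        sum(w * channel[t] for w, channel in zip(class_weights, feature_maps))
        for t in range(length)
    ]
```

Time steps with the largest CAM values are the regions of the raw series that contribute most to the chosen label.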

Conditional Similarity Networks


Title	Conditional Similarity Networks
Authors	Andreas Veit, Serge Belongie, Theofanis Karaletsos
Abstract	What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings the simplifying assumption is commonly made that images are only compared to one unique measure of similarity. A main reason for this is that contradicting notions of similarities cannot be captured in a single space. To address this shortcoming, we propose Conditional Similarity Networks (CSNs) that learn embeddings differentiated into semantically distinct subspaces that capture the different notions of similarities. CSNs jointly learn a disentangled embedding where features for different similarities are encoded in separate dimensions as well as masks that select and reweight relevant dimensions to induce a subspace that encodes a specific similarity notion. We show that our approach learns interpretable image representations with visually relevant semantic subspaces. Further, when evaluating on triplet questions from multiple similarity notions our model even outperforms the accuracy obtained by training individual specialized networks for each notion separately.
Tasks
Published	2016-03-25
URL	http://arxiv.org/abs/1603.07810v3
PDF	http://arxiv.org/pdf/1603.07810v3.pdf
PWC	https://paperswithcode.com/paper/conditional-similarity-networks
Repo	https://github.com/mvasil/fashion-compatibility
Framework	pytorch
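
The role of the masks can be illustrated with a short sketch. It assumes embeddings are plain vectors, a mask is a vector of non-negative per-dimension weights for one similarity notion, and training uses a standard triplet hinge loss on masked distances. The margin value and function names are assumptions for this example rather than details from the paper.

```python
import math


def masked_distance(emb_a, emb_b, mask):
    """Euclidean distance after reweighting each dimension by the mask."""
    return math.sqrt(sum((m * (a - b)) ** 2 for a, b, m in zip(emb_a, emb_b, mask)))


def conditional_triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss asking the positive to be closer than the negative under this mask."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)
```

The same pair of embeddings can be close under one mask and far apart under another, which is how a single shared embedding can serve several notions of similarity.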

Template Matching with Deformable Diversity Similarity


Title	Template Matching with Deformable Diversity Similarity
Authors	Itamar Talmi, Roey Mechrez, Lihi Zelnik-Manor
Abstract	We propose a novel measure for template matching named Deformable Diversity Similarity – based on the diversity of feature matches between a target image window and the template. We rely on both local appearance and geometric information that jointly lead to a powerful approach for matching. Our key contribution is a similarity measure that is robust to complex deformations, significant background clutter, and occlusions. Empirical evaluation on the most up-to-date benchmark shows that our method outperforms the current state-of-the-art in its detection accuracy while improving computational complexity.
Tasks
Published	2016-12-07
URL	http://arxiv.org/abs/1612.02190v2
PDF	http://arxiv.org/pdf/1612.02190v2.pdf
PWC	https://paperswithcode.com/paper/template-matching-with-deformable-diversity
Repo	https://github.com/roimehrez/DDIS
Framework	none

3D Shape Segmentation with Projective Convolutional Networks


Title	3D Shape Segmentation with Projective Convolutional Networks
Authors	Evangelos Kalogerakis, Melinos Averkiou, Subhransu Maji, Siddhartha Chaudhuri
Abstract	This paper introduces a deep architecture for segmenting 3D objects into their labeled semantic parts. Our architecture combines image-based Fully Convolutional Networks (FCNs) and surface-based Conditional Random Fields (CRFs) to yield coherent segmentations of 3D shapes. The image-based FCNs are used for efficient view-based reasoning about 3D object parts. Through a special projection layer, FCN outputs are effectively aggregated across multiple views and scales, then are projected onto the 3D object surfaces. Finally, a surface-based CRF combines the projected outputs with geometric consistency cues to yield coherent segmentations. The whole architecture (multi-view FCNs and CRF) is trained end-to-end. Our approach significantly outperforms the existing state-of-the-art methods in the currently largest segmentation benchmark (ShapeNet). Finally, we demonstrate promising segmentation results on noisy 3D shapes acquired from consumer-grade depth cameras.
Tasks
Published	2016-12-08
URL	http://arxiv.org/abs/1612.02808v3
PDF	http://arxiv.org/pdf/1612.02808v3.pdf
PWC	https://paperswithcode.com/paper/3d-shape-segmentation-with-projective
Repo	https://github.com/kalov/ShapePFCN
Framework	none