Paper Group AWR 45
Recurrent Highway Networks. Compression Artifacts Removal Using Convolutional Neural Networks. Convolutional Oriented Boundaries. Fast Fourier Color Constancy. Pixel Recurrent Neural Networks. Sequence-Level Knowledge Distillation. Simultaneous Surface Reflectance and Fluorescence Spectra Estimation. On Regularization Parameter Estimation under Cov …
Recurrent Highway Networks
Title | Recurrent Highway Networks |
Authors | Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber |
Abstract | Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with ‘deep’ transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Gersgorin’s circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character. |
Tasks | Language Modelling |
Published | 2016-07-12 |
URL | http://arxiv.org/abs/1607.03474v5 |
http://arxiv.org/pdf/1607.03474v5.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-highway-networks |
Repo | https://github.com/davidsvaughn/dts-tf |
Framework | tf |
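The analysis named in the abstract rests on Geršgorin's circle theorem: every eigenvalue of a square matrix lies in at least one disc centered at a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. The following is a minimal sketch of that theorem only, not the authors' code. The function names and the example matrix (standing in for a recurrent Jacobian) are illustrative assumptions.

```python
def gershgorin_discs(matrix):
    """Return one (center, radius) pair per row of a square matrix."""
    discs = []
    for i, row in enumerate(matrix):
        center = row[i]
        # Radius: sum of absolute off-diagonal entries in row i.
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        discs.append((center, radius))
    return discs


def spectral_radius_bound(matrix):
    """Upper bound on |eigenvalue| implied by the Gershgorin discs."""
    return max(abs(center) + radius for center, radius in gershgorin_discs(matrix))


# Hypothetical 3x3 matrix standing in for a recurrent Jacobian.
jacobian = [
    [0.5, 0.1, 0.0],
    [0.2, 0.4, 0.1],
    [0.0, 0.3, 0.6],
]
discs = gershgorin_discs(jacobian)
bound = spectral_radius_bound(jacobian)
```

A bound below 1 means no eigenvalue can have magnitude above 1. That is the kind of condition used when reasoning about vanishing and exploding gradients.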
Compression Artifacts Removal Using Convolutional Neural Networks
Title | Compression Artifacts Removal Using Convolutional Neural Networks |
Authors | Pavel Svoboda, Michal Hradis, David Barina, Pavel Zemcik |
Abstract | This paper shows that it is possible to train large and deep convolutional neural networks (CNN) for JPEG compression artifacts reduction, and that such networks can provide significantly better reconstruction quality compared to previously used smaller networks as well as to any other state-of-the-art methods. We were able to train networks with 8 layers in a single step and in relatively short time by combining residual learning, skip architecture, and symmetric weight initialization. We provide further insights into convolution networks for JPEG artifact reduction by evaluating three different objectives, generalization with respect to training dataset size, and generalization with respect to JPEG quality level. |
Tasks | |
Published | 2016-05-02 |
URL | http://arxiv.org/abs/1605.00366v1 |
http://arxiv.org/pdf/1605.00366v1.pdf | |
PWC | https://paperswithcode.com/paper/compression-artifacts-removal-using |
Repo | https://github.com/ShakedDovrat/JpegArtifactRemoval |
Framework | none |
Convolutional Oriented Boundaries
Title | Convolutional Oriented Boundaries |
Authors | Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc Van Gool |
Abstract | We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments on BSDS, PASCAL Context, PASCAL Segmentation, and MS-COCO, showing that COB provides state-of-the-art contours, region hierarchies, and object proposals in all datasets. |
Tasks | Contour Detection, Image Classification |
Published | 2016-08-09 |
URL | http://arxiv.org/abs/1608.02755v1 |
http://arxiv.org/pdf/1608.02755v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-oriented-boundaries |
Repo | https://github.com/kmaninis/COB |
Framework | none |
Fast Fourier Color Constancy
Title | Fast Fourier Color Constancy |
Authors | Jonathan T. Barron, Yun-Ta Tsai |
Abstract | We present Fast Fourier Color Constancy (FFCC), a color constancy algorithm which solves illuminant estimation by reducing it to a spatial localization task on a torus. By operating in the frequency domain, FFCC produces lower error rates than the previous state-of-the-art by 13-20% while being 250-3000 times faster. This unconventional approach introduces challenges regarding aliasing, directional statistics, and preconditioning, which we address. By producing a complete posterior distribution over illuminants instead of a single illuminant estimate, FFCC enables better training techniques, an effective temporal smoothing technique, and richer methods for error analysis. Our implementation of FFCC runs at ~700 frames per second on a mobile device, allowing it to be used as an accurate, real-time, temporally-coherent automatic white balance algorithm. |
Tasks | Color Constancy |
Published | 2016-11-23 |
URL | http://arxiv.org/abs/1611.07596v2 |
http://arxiv.org/pdf/1611.07596v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-fourier-color-constancy |
Repo | https://github.com/google/ffcc |
Framework | none |
Pixel Recurrent Neural Networks
Title | Pixel Recurrent Neural Networks |
Authors | Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu |
Abstract | Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent. |
Tasks | Image Generation |
Published | 2016-01-25 |
URL | http://arxiv.org/abs/1601.06759v3 |
http://arxiv.org/pdf/1601.06759v3.pdf | |
PWC | https://paperswithcode.com/paper/pixel-recurrent-neural-networks |
Repo | https://github.com/ardapekis/pixel-rnn |
Framework | pytorch |
Sequence-Level Knowledge Distillation
Title | Sequence-Level Knowledge Distillation |
Authors | Yoon Kim, Alexander M. Rush |
Abstract | Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neural models in other domains to the problem of NMT. We demonstrate that standard knowledge distillation applied to word-level prediction can be effective for NMT, and also introduce two novel sequence-level versions of knowledge distillation that further improve performance, and somewhat surprisingly, seem to eliminate the need for beam search (even when applied on the original teacher model). Our best student model runs 10 times faster than its state-of-the-art teacher with little loss in performance. It is also significantly better than a baseline model trained without knowledge distillation: by 4.2/1.7 BLEU with greedy decoding/beam search. Applying weight pruning on top of knowledge distillation results in a student model that has 13 times fewer parameters than the original teacher model, with a decrease of 0.4 BLEU. |
Tasks | Machine Translation |
Published | 2016-06-25 |
URL | http://arxiv.org/abs/1606.07947v4 |
http://arxiv.org/pdf/1606.07947v4.pdf | |
PWC | https://paperswithcode.com/paper/sequence-level-knowledge-distillation |
Repo | https://github.com/harvardnlp/nmt-android |
Framework | torch |
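The word-level distillation mentioned in the abstract can be illustrated as a cross-entropy between the teacher's and the student's per-position word distributions. This is a hedged sketch in plain Python, not the code in the linked repository. The function names and the toy logits are assumptions, and the sequence-level variants are not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_level_kd_loss(teacher_logits, student_logits):
    """Mean over positions of the cross-entropy H(teacher, student)."""
    loss = 0.0
    for t_logits, s_logits in zip(teacher_logits, student_logits):
        q = softmax(t_logits)  # teacher distribution over the vocabulary
        p = softmax(s_logits)  # student distribution over the vocabulary
        loss -= sum(qi * math.log(pi) for qi, pi in zip(q, p))
    return loss / len(teacher_logits)


# Toy example: 2 target positions, vocabulary of 3 words (made-up numbers).
teacher = [[2.0, 0.5, -1.0], [0.0, 1.0, 0.0]]
student = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
loss_mismatch = word_level_kd_loss(teacher, student)
loss_match = word_level_kd_loss(teacher, teacher)
```

The loss is smallest when the student reproduces the teacher's distribution exactly. At that point it equals the teacher's entropy.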
Simultaneous Surface Reflectance and Fluorescence Spectra Estimation
Title | Simultaneous Surface Reflectance and Fluorescence Spectra Estimation |
Authors | Henryk Blasinski, Joyce Farrell, Brian Wandell |
Abstract | There is widespread interest in estimating the fluorescence properties of natural materials in an image. However, the separation between reflected and fluoresced components is difficult, because it is impossible to distinguish reflected and fluoresced photons without controlling the illuminant spectrum. We show how to jointly estimate the reflectance and fluorescence from a single set of images acquired under multiple illuminants. We present a framework based on a linear approximation to the physical equations describing image formation in terms of surface spectral reflectance and fluorescence due to multiple fluorophores. We relax the non-convex, inverse estimation problem in order to jointly estimate the reflectance and fluorescence properties in a single optimization step and we use the Alternating Direction Method of Multipliers (ADMM) approach to efficiently find a solution. We provide a software implementation of the solver for our method and prior methods. We evaluate the accuracy and reliability of the method using both simulations and experimental data. To acquire data to test the methods, we built a custom imaging system using a monochrome camera, a filter wheel with bandpass transmissive filters and a small number of light emitting diodes. We compared the system and algorithm performance with the ground truth as well as with prior methods. Our approach produces lower errors compared to earlier algorithms. |
Tasks | |
Published | 2016-05-13 |
URL | http://arxiv.org/abs/1605.04243v1 |
http://arxiv.org/pdf/1605.04243v1.pdf | |
PWC | https://paperswithcode.com/paper/simultaneous-surface-reflectance-and |
Repo | https://github.com/hblasins/fiToolbox |
Framework | none |
On Regularization Parameter Estimation under Covariate Shift
Title | On Regularization Parameter Estimation under Covariate Shift |
Authors | Wouter M. Kouw, Marco Loog |
Abstract | This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which cannot be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter. |
Tasks | Domain Adaptation, L2 Regularization |
Published | 2016-07-31 |
URL | http://arxiv.org/abs/1608.00250v1 |
http://arxiv.org/pdf/1608.00250v1.pdf | |
PWC | https://paperswithcode.com/paper/on-regularization-parameter-estimation-under |
Repo | https://github.com/wmkouw/covshift-l2reg |
Framework | none |
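The importance-weighting correction that the abstract discusses (and finds insufficient) reweights each source validation loss by the density ratio between target and source. The sketch below assumes both densities are known one-dimensional Gaussians. In practice the weights must be estimated, which is the part the paper studies. All names and numbers are illustrative.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weighted_risk(xs, losses, source, target):
    """Average of per-sample losses weighted by p_target(x) / p_source(x).

    `source` and `target` are (mean, std) pairs; assuming they are known
    is a simplification made for this sketch only.
    """
    weights = [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]
    return sum(w * l for w, l in zip(weights, losses)) / len(xs)


# Made-up source validation inputs and their per-sample losses.
xs = [0.0, 1.0, 2.0]
losses = [1.0, 2.0, 3.0]
risk_no_shift = importance_weighted_risk(xs, losses, (0.0, 1.0), (0.0, 1.0))
risk_shifted = importance_weighted_risk(xs, losses, (0.0, 1.0), (2.0, 1.0))
```

With identical source and target densities every weight is 1 and the estimate reduces to the ordinary validation mean.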
Probabilistic Data Analysis with Probabilistic Programming
Title | Probabilistic Data Analysis with Probabilistic Programming |
Authors | Feras Saad, Vikash Mansinghka |
Abstract | Probabilistic techniques are central to data analysis, but different approaches can be difficult to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include hierarchical Bayesian models, multivariate kernel methods, discriminative machine learning, clustering algorithms, dimensionality reduction, and arbitrary probabilistic programs. We also demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling language and a structured query language. The practical value is illustrated in two ways. First, CGPMs are used in an analysis that identifies satellite data records which probably violate Kepler’s Third Law, by composing causal probabilistic programs with non-parametric Bayes in under 50 lines of probabilistic code. Second, for several representative data analysis tasks, we report on lines of code and accuracy measurements of various CGPMs, plus comparisons with standard baseline solutions from Python and MATLAB libraries. |
Tasks | Dimensionality Reduction, Probabilistic Programming |
Published | 2016-08-18 |
URL | http://arxiv.org/abs/1608.05347v1 |
http://arxiv.org/pdf/1608.05347v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-data-analysis-with |
Repo | https://github.com/probcomp/cgpm |
Framework | none |
Revisiting Classifier Two-Sample Tests
Title | Revisiting Classifier Two-Sample Tests |
Authors | David Lopez-Paz, Maxime Oquab |
Abstract | The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis “$P = Q$” is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery. |
Tasks | Causal Discovery |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06545v4 |
http://arxiv.org/pdf/1610.06545v4.pdf | |
PWC | https://paperswithcode.com/paper/revisiting-classifier-two-sample-tests |
Repo | https://github.com/lopezpaz/classifier_tests |
Framework | none |
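The procedure described in the abstract can be sketched end to end: label the two samples, hold out part of the data, train a classifier, and compare held-out accuracy with chance. In this minimal illustration a 1-D nearest-centroid rule stands in for the classifier, and the p-value uses a normal approximation to the binomial null with equal-sized samples. Both choices are assumptions of this sketch, and the code is not taken from the linked repository.

```python
import math
import random


def c2st(sample_p, sample_q, seed=0):
    """Return (held-out accuracy, one-sided p-value) for 1-D samples."""
    rng = random.Random(seed)
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    # Stand-in classifier: assign each point to the nearer class mean.
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)

    correct = 0
    for x, y in test:
        pred = 1 if abs(x - mean_pos) < abs(x - mean_neg) else 0
        correct += int(pred == y)
    accuracy = correct / len(test)

    # Under P = Q, accuracy is approximately Normal(0.5, 0.25 / n_test).
    z = (accuracy - 0.5) / math.sqrt(0.25 / len(test))
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return accuracy, p_value


gen = random.Random(1)
same_a = [gen.gauss(0.0, 1.0) for _ in range(500)]
same_b = [gen.gauss(0.0, 1.0) for _ in range(500)]
shifted = [gen.gauss(3.0, 1.0) for _ in range(500)]

acc_null, p_null = c2st(same_a, same_b)
acc_alt, p_alt = c2st(same_a, shifted)
```

The accuracy is in interpretable units: a value near 0.5 means the classifier cannot tell the samples apart, while a value well above 0.5 is evidence against the null hypothesis.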
Person Re-identification: Past, Present and Future
Title | Person Re-identification: Past, Present and Future |
Authors | Liang Zheng, Yi Yang, Alexander G. Hauptmann |
Abstract | Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) finally briefs some important yet under-developed issues. |
Tasks | Image Classification, Person Re-Identification |
Published | 2016-10-10 |
URL | http://arxiv.org/abs/1610.02984v1 |
http://arxiv.org/pdf/1610.02984v1.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-past-present-and |
Repo | https://github.com/shumming/Person_ReID |
Framework | none |
Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline
Title | Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline |
Authors | Zhiguang Wang, Weizhong Yan, Tim Oates |
Abstract | We propose a simple but strong baseline for time series classification from scratch with deep neural networks. Our proposed baseline models are pure end-to-end without any heavy preprocessing on the raw data or feature crafting. The proposed Fully Convolutional Network (FCN) achieves premium performance compared to other state-of-the-art approaches, and our exploration of very deep neural networks with the ResNet structure is also competitive. The global average pooling in our convolutional model enables the exploitation of the Class Activation Map (CAM) to find out the contributing region in the raw data for the specific labels. Our models provide a simple choice for real-world applications and a good starting point for future research. An overall analysis is provided to discuss the generalization capability of our models, learned features, network structures and the classification semantics. |
Tasks | Time Series, Time Series Classification |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06455v4 |
http://arxiv.org/pdf/1611.06455v4.pdf | |
PWC | https://paperswithcode.com/paper/time-series-classification-from-scratch-with |
Repo | https://github.com/cauchyturing/UCR_Time_Series_Classification_Deep_Learning_Baseline |
Framework | tf |
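The link between global average pooling and the Class Activation Map can be shown in a few lines. For one class, the CAM at time step t is the weighted sum of the last convolutional feature maps at t, using that class's output weights. The sketch below uses plain lists and made-up numbers. The function names are assumptions and the bias term is omitted.

```python
def global_average_pool(feature_maps):
    """Average each channel over time: one value per channel."""
    return [sum(channel) / len(channel) for channel in feature_maps]


def class_activation_map(feature_maps, class_weights):
    """CAM over time for one class: sum_k w_k * f_k(t)."""
    length = len(feature_maps[0])
    return [
        sum(w * channel[t] for w, channel in zip(class_weights, feature_maps))
        for t in range(length)
    ]


# Two channels over four time steps, plus one class's output weights.
feature_maps = [
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 0.0, 0.0, 3.0],
]
class_weights = [0.5, -1.0]

cam = class_activation_map(feature_maps, class_weights)
pooled = global_average_pool(feature_maps)
logit = sum(w * p for w, p in zip(class_weights, pooled))
```

The time-average of the CAM equals the class logit (without bias). This is why the map can be read as each time step's contribution to the prediction.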
Conditional Similarity Networks
Title | Conditional Similarity Networks |
Authors | Andreas Veit, Serge Belongie, Theofanis Karaletsos |
Abstract | What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings the simplifying assumption is commonly made that images are only compared to one unique measure of similarity. A main reason for this is that contradicting notions of similarities cannot be captured in a single space. To address this shortcoming, we propose Conditional Similarity Networks (CSNs) that learn embeddings differentiated into semantically distinct subspaces that capture the different notions of similarities. CSNs jointly learn a disentangled embedding where features for different similarities are encoded in separate dimensions as well as masks that select and reweight relevant dimensions to induce a subspace that encodes a specific similarity notion. We show that our approach learns interpretable image representations with visually relevant semantic subspaces. Further, when evaluating on triplet questions from multiple similarity notions our model even outperforms the accuracy obtained by training individual specialized networks for each notion separately. |
Tasks | |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07810v3 |
http://arxiv.org/pdf/1603.07810v3.pdf | |
PWC | https://paperswithcode.com/paper/conditional-similarity-networks |
Repo | https://github.com/mvasil/fashion-compatibility |
Framework | pytorch |
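The masking idea in the abstract can be sketched as a distance computed after an element-wise, per-condition mask is applied to a shared embedding, combined with a standard triplet hinge loss. This is an illustrative sketch, not the linked repository's code. The fixed binary masks, the margin value, and the "color" and "shape" condition names are assumptions, and in the paper the masks are learned.

```python
import math


def masked_distance(emb_a, emb_b, mask):
    """Euclidean distance between two embeddings after masking dimensions."""
    return math.sqrt(sum((m * (a - b)) ** 2 for a, b, m in zip(emb_a, emb_b, mask)))


def conditional_triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss pulling the positive closer than the negative under one condition."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 4-D embeddings; the first two dimensions encode "color",
# the last two encode "shape".
mask_color = [1.0, 1.0, 0.0, 0.0]
mask_shape = [0.0, 0.0, 1.0, 1.0]
item_a = [1.0, 0.0, 0.0, 0.0]
item_b = [1.0, 0.0, 1.0, 1.0]  # same color as item_a, different shape
item_c = [0.0, 1.0, 0.0, 0.0]  # different color, same shape as item_a

dist_color = masked_distance(item_a, item_b, mask_color)
dist_shape = masked_distance(item_a, item_b, mask_shape)
loss_color = conditional_triplet_loss(item_a, item_b, item_c, mask_color)
```

The same pair of items is at distance zero under one condition and far apart under the other. A single unmasked embedding space cannot represent that.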
Template Matching with Deformable Diversity Similarity
Title | Template Matching with Deformable Diversity Similarity |
Authors | Itamar Talmi, Roey Mechrez, Lihi Zelnik-Manor |
Abstract | We propose a novel measure for template matching named Deformable Diversity Similarity – based on the diversity of feature matches between a target image window and the template. We rely on both local appearance and geometric information that jointly lead to a powerful approach for matching. Our key contribution is a similarity measure, that is robust to complex deformations, significant background clutter, and occlusions. Empirical evaluation on the most up-to-date benchmark shows that our method outperforms the current state-of-the-art in its detection accuracy while improving computational complexity. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02190v2 |
http://arxiv.org/pdf/1612.02190v2.pdf | |
PWC | https://paperswithcode.com/paper/template-matching-with-deformable-diversity |
Repo | https://github.com/roimehrez/DDIS |
Framework | none |
3D Shape Segmentation with Projective Convolutional Networks
Title | 3D Shape Segmentation with Projective Convolutional Networks |
Authors | Evangelos Kalogerakis, Melinos Averkiou, Subhransu Maji, Siddhartha Chaudhuri |
Abstract | This paper introduces a deep architecture for segmenting 3D objects into their labeled semantic parts. Our architecture combines image-based Fully Convolutional Networks (FCNs) and surface-based Conditional Random Fields (CRFs) to yield coherent segmentations of 3D shapes. The image-based FCNs are used for efficient view-based reasoning about 3D object parts. Through a special projection layer, FCN outputs are effectively aggregated across multiple views and scales, then are projected onto the 3D object surfaces. Finally, a surface-based CRF combines the projected outputs with geometric consistency cues to yield coherent segmentations. The whole architecture (multi-view FCNs and CRF) is trained end-to-end. Our approach significantly outperforms the existing state-of-the-art methods in the currently largest segmentation benchmark (ShapeNet). Finally, we demonstrate promising segmentation results on noisy 3D shapes acquired from consumer-grade depth cameras. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02808v3 |
http://arxiv.org/pdf/1612.02808v3.pdf | |
PWC | https://paperswithcode.com/paper/3d-shape-segmentation-with-projective |
Repo | https://github.com/kalov/ShapePFCN |
Framework | none |