May 7, 2019


Paper Group AWR 45


Recurrent Highway Networks. Compression Artifacts Removal Using Convolutional Neural Networks. Convolutional Oriented Boundaries. Fast Fourier Color Constancy. Pixel Recurrent Neural Networks. Sequence-Level Knowledge Distillation. Simultaneous Surface Reflectance and Fluorescence Spectra Estimation. On Regularization Parameter Estimation under Cov …

Recurrent Highway Networks

Title Recurrent Highway Networks
Authors Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber
Abstract Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with ‘deep’ transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks. We introduce a novel theoretical analysis of recurrent networks based on Gersgorin’s circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell. Based on this analysis we propose Recurrent Highway Networks, which extend the LSTM architecture to allow step-to-step transition depths larger than one. Several language modeling experiments demonstrate that the proposed architecture results in powerful and efficient models. On the Penn Treebank corpus, solely increasing the transition depth from 1 to 10 improves word-level perplexity from 90.6 to 65.4 using the same number of parameters. On the larger Wikipedia datasets for character prediction (text8 and enwik8), RHNs outperform all previous results and achieve an entropy of 1.27 bits per character.
Tasks Language Modelling
Published 2016-07-12
URL http://arxiv.org/abs/1607.03474v5
PDF http://arxiv.org/pdf/1607.03474v5.pdf
PWC https://paperswithcode.com/paper/recurrent-highway-networks
Repo https://github.com/davidsvaughn/dts-tf
Framework tf
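
The transition depth described in the abstract above can be illustrated with a minimal sketch of one time step of a recurrent highway layer. It assumes the coupled-gate variant (carry gate equal to one minus the transform gate) and uses plain Python lists; the names `rhn_step`, `params`, `w_h`, `w_t`, and `layers` are hypothetical and are not taken from the linked repository.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rhn_step(x, state, params, depth):
    """One time step through `depth` stacked highway micro-layers.

    params["w_h"], params["w_t"]: input weights (used at the first micro-layer only).
    params["layers"][l]: tuple (r_h, r_t, b_h, b_t) of recurrent weights and biases.
    All parameter names are illustrative assumptions.
    """
    for layer in range(depth):
        r_h, r_t, b_h, b_t = params["layers"][layer]
        pre_h = [a + b for a, b in zip(matvec(r_h, state), b_h)]
        pre_t = [a + b for a, b in zip(matvec(r_t, state), b_t)]
        if layer == 0:  # the input enters only at the first micro-layer
            pre_h = [a + b for a, b in zip(pre_h, matvec(params["w_h"], x))]
            pre_t = [a + b for a, b in zip(pre_t, matvec(params["w_t"], x))]
        h = [math.tanh(v) for v in pre_h]   # candidate update
        t = [sigmoid(v) for v in pre_t]     # transform gate
        # Coupled carry gate: c = 1 - t, so the state is a convex mix.
        state = [hi * ti + si * (1.0 - ti) for hi, ti, si in zip(h, t, state)]
    return state
```

With all weights and biases set to zero, the candidate is 0 and the gate is 0.5, so each micro-layer halves the state. This makes the effect of `depth` easy to check by hand.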

Compression Artifacts Removal Using Convolutional Neural Networks

Title Compression Artifacts Removal Using Convolutional Neural Networks
Authors Pavel Svoboda, Michal Hradis, David Barina, Pavel Zemcik
Abstract This paper shows that it is possible to train large and deep convolutional neural networks (CNN) for JPEG compression artifacts reduction, and that such networks can provide significantly better reconstruction quality compared to previously used smaller networks as well as to any other state-of-the-art methods. We were able to train networks with 8 layers in a single step and in relatively short time by combining residual learning, skip architecture, and symmetric weight initialization. We provide further insights into convolution networks for JPEG artifact reduction by evaluating three different objectives, generalization with respect to training dataset size, and generalization with respect to JPEG quality level.
Tasks
Published 2016-05-02
URL http://arxiv.org/abs/1605.00366v1
PDF http://arxiv.org/pdf/1605.00366v1.pdf
PWC https://paperswithcode.com/paper/compression-artifacts-removal-using
Repo https://github.com/ShakedDovrat/JpegArtifactRemoval
Framework none

Convolutional Oriented Boundaries

Title Convolutional Oriented Boundaries
Authors Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez, Luc Van Gool
Abstract We present Convolutional Oriented Boundaries (COB), which produces multiscale oriented contours and region hierarchies starting from generic image classification Convolutional Neural Networks (CNNs). COB is computationally efficient, because it requires a single CNN forward pass for contour detection and it uses a novel sparse boundary representation for hierarchical segmentation; it gives a significant leap in performance over the state-of-the-art, and it generalizes very well to unseen categories and datasets. Particularly, we show that learning to estimate not only contour strength but also orientation provides more accurate results. We perform extensive experiments on BSDS, PASCAL Context, PASCAL Segmentation, and MS-COCO, showing that COB provides state-of-the-art contours, region hierarchies, and object proposals in all datasets.
Tasks Contour Detection, Image Classification
Published 2016-08-09
URL http://arxiv.org/abs/1608.02755v1
PDF http://arxiv.org/pdf/1608.02755v1.pdf
PWC https://paperswithcode.com/paper/convolutional-oriented-boundaries
Repo https://github.com/kmaninis/COB
Framework none

Fast Fourier Color Constancy

Title Fast Fourier Color Constancy
Authors Jonathan T. Barron, Yun-Ta Tsai
Abstract We present Fast Fourier Color Constancy (FFCC), a color constancy algorithm which solves illuminant estimation by reducing it to a spatial localization task on a torus. By operating in the frequency domain, FFCC produces lower error rates than the previous state-of-the-art by 13-20% while being 250-3000 times faster. This unconventional approach introduces challenges regarding aliasing, directional statistics, and preconditioning, which we address. By producing a complete posterior distribution over illuminants instead of a single illuminant estimate, FFCC enables better training techniques, an effective temporal smoothing technique, and richer methods for error analysis. Our implementation of FFCC runs at ~700 frames per second on a mobile device, allowing it to be used as an accurate, real-time, temporally-coherent automatic white balance algorithm.
Tasks Color Constancy
Published 2016-11-23
URL http://arxiv.org/abs/1611.07596v2
PDF http://arxiv.org/pdf/1611.07596v2.pdf
PWC https://paperswithcode.com/paper/fast-fourier-color-constancy
Repo https://github.com/google/ffcc
Framework none

Pixel Recurrent Neural Networks

Title Pixel Recurrent Neural Networks
Authors Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu
Abstract Modeling the distribution of natural images is a landmark problem in unsupervised learning. This task requires an image model that is at once expressive, tractable and scalable. We present a deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions. Our method models the discrete probability of the raw pixel values and encodes the complete set of dependencies in the image. Architectural novelties include fast two-dimensional recurrent layers and an effective use of residual connections in deep recurrent networks. We achieve log-likelihood scores on natural images that are considerably better than the previous state of the art. Our main results also provide benchmarks on the diverse ImageNet dataset. Samples generated from the model appear crisp, varied and globally coherent.
Tasks Image Generation
Published 2016-01-25
URL http://arxiv.org/abs/1601.06759v3
PDF http://arxiv.org/pdf/1601.06759v3.pdf
PWC https://paperswithcode.com/paper/pixel-recurrent-neural-networks
Repo https://github.com/ardapekis/pixel-rnn
Framework pytorch

Sequence-Level Knowledge Distillation

Title Sequence-Level Knowledge Distillation
Authors Yoon Kim, Alexander M. Rush
Abstract Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neural models in other domains to the problem of NMT. We demonstrate that standard knowledge distillation applied to word-level prediction can be effective for NMT, and also introduce two novel sequence-level versions of knowledge distillation that further improve performance, and somewhat surprisingly, seem to eliminate the need for beam search (even when applied on the original teacher model). Our best student model runs 10 times faster than its state-of-the-art teacher with little loss in performance. It is also significantly better than a baseline model trained without knowledge distillation: by 4.2/1.7 BLEU with greedy decoding/beam search. Applying weight pruning on top of knowledge distillation results in a student model that has 13 times fewer parameters than the original teacher model, with a decrease of 0.4 BLEU.
Tasks Machine Translation
Published 2016-06-25
URL http://arxiv.org/abs/1606.07947v4
PDF http://arxiv.org/pdf/1606.07947v4.pdf
PWC https://paperswithcode.com/paper/sequence-level-knowledge-distillation
Repo https://github.com/harvardnlp/nmt-android
Framework torch
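
The word-level distillation mentioned in the abstract above can be sketched for a single target position as a mix of the usual negative log-likelihood and a cross-entropy against the teacher's distribution. This is a minimal illustration only: the function names and the mixing weight `alpha` are assumptions, and the sequence-level variants (which train the student on teacher-generated translations) are not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def word_level_kd_loss(student_logits, teacher_logits, target, alpha=0.5):
    """Loss for one word position.

    target: index of the reference word in the vocabulary.
    alpha: weight on the distillation term (hypothetical default).
    """
    s_logp = log_softmax(student_logits)
    t_prob = [math.exp(v) for v in log_softmax(teacher_logits)]
    nll = -s_logp[target]                                  # fit the reference word
    kd = -sum(q * lp for q, lp in zip(t_prob, s_logp))     # fit the teacher distribution
    return (1.0 - alpha) * nll + alpha * kd
```

Setting `alpha=0` recovers ordinary training on the reference, and `alpha=1` trains the student purely to match the teacher's per-word distribution.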

Simultaneous Surface Reflectance and Fluorescence Spectra Estimation

Title Simultaneous Surface Reflectance and Fluorescence Spectra Estimation
Authors Henryk Blasinski, Joyce Farrell, Brian Wandell
Abstract There is widespread interest in estimating the fluorescence properties of natural materials in an image. However, the separation between reflected and fluoresced components is difficult, because it is impossible to distinguish reflected and fluoresced photons without controlling the illuminant spectrum. We show how to jointly estimate the reflectance and fluorescence from a single set of images acquired under multiple illuminants. We present a framework based on a linear approximation to the physical equations describing image formation in terms of surface spectral reflectance and fluorescence due to multiple fluorophores. We relax the non-convex, inverse estimation problem in order to jointly estimate the reflectance and fluorescence properties in a single optimization step and we use the Alternating Direction Method of Multipliers (ADMM) approach to efficiently find a solution. We provide a software implementation of the solver for our method and prior methods. We evaluate the accuracy and reliability of the method using both simulations and experimental data. To acquire data to test the methods, we built a custom imaging system using a monochrome camera, a filter wheel with bandpass transmissive filters and a small number of light emitting diodes. We compared the system and algorithm performance with the ground truth as well as with prior methods. Our approach produces lower errors compared to earlier algorithms.
Tasks
Published 2016-05-13
URL http://arxiv.org/abs/1605.04243v1
PDF http://arxiv.org/pdf/1605.04243v1.pdf
PWC https://paperswithcode.com/paper/simultaneous-surface-reflectance-and
Repo https://github.com/hblasins/fiToolbox
Framework none

On Regularization Parameter Estimation under Covariate Shift

Title On Regularization Parameter Estimation under Covariate Shift
Authors Wouter M. Kouw, Marco Loog
Abstract This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which cannot be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter.
Tasks Domain Adaptation, L2 Regularization
Published 2016-07-31
URL http://arxiv.org/abs/1608.00250v1
PDF http://arxiv.org/pdf/1608.00250v1.pdf
PWC https://paperswithcode.com/paper/on-regularization-parameter-estimation-under
Repo https://github.com/wmkouw/covshift-l2reg
Framework none
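
To make the importance-weighting correction from the abstract above concrete, here is a minimal sketch of selecting an L2 parameter by an importance-weighted validation risk on source data. Note that the abstract argues this correction is not sufficient; the sketch only shows what is being corrected. The 1-D ridge model without intercept, the Gaussian density helper, and all function names are illustrative assumptions, not the authors' code.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) at x; a weight could be target_pdf / source_pdf."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def ridge_slope(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def weighted_risk(xs, ys, slope, weights):
    """Importance-weighted mean squared error on source validation data."""
    total = sum(w * (y - slope * x) ** 2 for x, y, w in zip(xs, ys, weights))
    return total / len(xs)


def select_lambda(train, valid, lambdas, weight_fn):
    """Return the lambda with the lowest weighted validation risk, plus all risks."""
    xs_tr, ys_tr = train
    xs_va, ys_va = valid
    weights = [weight_fn(x) for x in xs_va]
    risks = {
        lam: weighted_risk(xs_va, ys_va, ridge_slope(xs_tr, ys_tr, lam), weights)
        for lam in lambdas
    }
    return min(risks, key=risks.get), risks
```

A weight function such as `lambda x: gaussian_pdf(x, 1.0, 1.0) / gaussian_pdf(x, 0.0, 1.0)` would correspond to known Gaussian source and target densities; in practice the weights must themselves be estimated.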

Probabilistic Data Analysis with Probabilistic Programming

Title Probabilistic Data Analysis with Probabilistic Programming
Authors Feras Saad, Vikash Mansinghka
Abstract Probabilistic techniques are central to data analysis, but different approaches can be difficult to apply, combine, and compare. This paper introduces composable generative population models (CGPMs), a computational abstraction that extends directed graphical models and can be used to describe and compose a broad class of probabilistic data analysis techniques. Examples include hierarchical Bayesian models, multivariate kernel methods, discriminative machine learning, clustering algorithms, dimensionality reduction, and arbitrary probabilistic programs. We also demonstrate the integration of CGPMs into BayesDB, a probabilistic programming platform that can express data analysis tasks using a modeling language and a structured query language. The practical value is illustrated in two ways. First, CGPMs are used in an analysis that identifies satellite data records which probably violate Kepler’s Third Law, by composing causal probabilistic programs with non-parametric Bayes in under 50 lines of probabilistic code. Second, for several representative data analysis tasks, we report on lines of code and accuracy measurements of various CGPMs, plus comparisons with standard baseline solutions from Python and MATLAB libraries.
Tasks Dimensionality Reduction, Probabilistic Programming
Published 2016-08-18
URL http://arxiv.org/abs/1608.05347v1
PDF http://arxiv.org/pdf/1608.05347v1.pdf
PWC https://paperswithcode.com/paper/probabilistic-data-analysis-with
Repo https://github.com/probcomp/cgpm
Framework none

Revisiting Classifier Two-Sample Tests

Title Revisiting Classifier Two-Sample Tests
Authors David Lopez-Paz, Maxime Oquab
Abstract The goal of two-sample tests is to assess whether two samples, $S_P \sim P^n$ and $S_Q \sim Q^m$, are drawn from the same distribution. Perhaps intriguingly, one relatively unexplored method to build two-sample tests is the use of binary classifiers. In particular, construct a dataset by pairing the $n$ examples in $S_P$ with a positive label, and by pairing the $m$ examples in $S_Q$ with a negative label. If the null hypothesis “$P = Q$” is true, then the classification accuracy of a binary classifier on a held-out subset of this dataset should remain near chance-level. As we will show, such Classifier Two-Sample Tests (C2ST) learn a suitable representation of the data on the fly, return test statistics in interpretable units, have a simple null distribution, and their predictive uncertainty allows one to interpret where $P$ and $Q$ differ. The goal of this paper is to establish the properties, performance, and uses of C2ST. First, we analyze their main theoretical properties. Second, we compare their performance against a variety of state-of-the-art alternatives. Third, we propose their use to evaluate the sample quality of generative models with intractable likelihoods, such as Generative Adversarial Networks (GANs). Fourth, we showcase the novel application of GANs together with C2ST for causal discovery.
Tasks Causal Discovery
Published 2016-10-20
URL http://arxiv.org/abs/1610.06545v4
PDF http://arxiv.org/pdf/1610.06545v4.pdf
PWC https://paperswithcode.com/paper/revisiting-classifier-two-sample-tests
Repo https://github.com/lopezpaz/classifier_tests
Framework none
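
The procedure in the abstract above (label the two samples, train a classifier, check held-out accuracy against chance) can be sketched as follows. This is a minimal illustration on 1-D data that assumes equal sample sizes, so that chance level is 1/2. The nearest-class-mean classifier is a stand-in chosen for brevity, and the p-value uses the normal approximation to the binomial null; the function name `c2st` is hypothetical.

```python
import math
import random


def c2st(sample_p, sample_q, seed=0):
    """Classifier two-sample test on 1-D samples; returns (accuracy, p_value)."""
    rng = random.Random(seed)
    data = [(x, 1) for x in sample_p] + [(x, 0) for x in sample_q]
    rng.shuffle(data)
    half = len(data) // 2
    train, test = data[:half], data[half:]

    # Stand-in classifier (an assumption): nearest class mean on the training split.
    mean = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        mean[label] = sum(values) / len(values)

    def predict(x):
        return 1 if abs(x - mean[1]) < abs(x - mean[0]) else 0

    accuracy = sum(predict(x) == y for x, y in test) / len(test)
    # Under H0 (P = Q, balanced labels) accuracy is roughly N(1/2, 1/(4 * n_test)).
    z = (accuracy - 0.5) * 2.0 * math.sqrt(len(test))
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # one-sided upper tail
    return accuracy, p_value
```

The returned accuracy is the interpretable test statistic: a value well above 0.5 on held-out data is evidence that the two samples differ.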

Person Re-identification: Past, Present and Future

Title Person Re-identification: Past, Present and Future
Authors Liang Zheng, Yi Yang, Alexander G. Hauptmann
Abstract Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) finally briefly discusses some important yet under-developed issues.
Tasks Image Classification, Person Re-Identification
Published 2016-10-10
URL http://arxiv.org/abs/1610.02984v1
PDF http://arxiv.org/pdf/1610.02984v1.pdf
PWC https://paperswithcode.com/paper/person-re-identification-past-present-and
Repo https://github.com/shumming/Person_ReID
Framework none

Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline

Title Time Series Classification from Scratch with Deep Neural Networks: A Strong Baseline
Authors Zhiguang Wang, Weizhong Yan, Tim Oates
Abstract We propose a simple but strong baseline for time series classification from scratch with deep neural networks. Our proposed baseline models are pure end-to-end without any heavy preprocessing on the raw data or feature crafting. The proposed Fully Convolutional Network (FCN) achieves premium performance compared to other state-of-the-art approaches, and our exploration of very deep neural networks with the ResNet structure is also competitive. The global average pooling in our convolutional model enables the exploitation of the Class Activation Map (CAM) to find the contributing region in the raw data for specific labels. Our models provide a simple choice for real-world applications and a good starting point for future research. An overall analysis is provided to discuss the generalization capability of our models, learned features, network structures and the classification semantics.
Tasks Time Series, Time Series Classification
Published 2016-11-20
URL http://arxiv.org/abs/1611.06455v4
PDF http://arxiv.org/pdf/1611.06455v4.pdf
PWC https://paperswithcode.com/paper/time-series-classification-from-scratch-with
Repo https://github.com/cauchyturing/UCR_Time_Series_Classification_Deep_Learning_Baseline
Framework tf
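
The link between global average pooling and the Class Activation Map in the abstract above can be sketched for a 1-D time series. The sketch assumes the final convolutional layer outputs one feature map per filter and that a single linear layer follows the pooling; the function and argument names are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Per-time-step contribution to one class score.

    feature_maps[k][t]: activation of filter k at time t (last conv layer).
    class_weights[k]: weight from pooled filter k to the class of interest.
    """
    length = len(feature_maps[0])
    return [
        sum(w * fmap[t] for w, fmap in zip(class_weights, feature_maps))
        for t in range(length)
    ]


def class_logit(feature_maps, class_weights, bias=0.0):
    """Class score computed the usual way: global average pooling, then linear."""
    pooled = [sum(fmap) / len(fmap) for fmap in feature_maps]
    return sum(w * p for w, p in zip(class_weights, pooled)) + bias
```

Because pooling and the linear layer are both linear, the mean of the map over time equals the class score without bias. This is why large values in the map can be read as the time steps that contribute most to that class.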

Conditional Similarity Networks

Title Conditional Similarity Networks
Authors Andreas Veit, Serge Belongie, Theofanis Karaletsos
Abstract What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings the simplifying assumption is commonly made that images are only compared to one unique measure of similarity. A main reason for this is that contradicting notions of similarities cannot be captured in a single space. To address this shortcoming, we propose Conditional Similarity Networks (CSNs) that learn embeddings differentiated into semantically distinct subspaces that capture the different notions of similarities. CSNs jointly learn a disentangled embedding where features for different similarities are encoded in separate dimensions as well as masks that select and reweight relevant dimensions to induce a subspace that encodes a specific similarity notion. We show that our approach learns interpretable image representations with visually relevant semantic subspaces. Further, when evaluating on triplet questions from multiple similarity notions our model even outperforms the accuracy obtained by training individual specialized networks for each notion separately.
Tasks
Published 2016-03-25
URL http://arxiv.org/abs/1603.07810v3
PDF http://arxiv.org/pdf/1603.07810v3.pdf
PWC https://paperswithcode.com/paper/conditional-similarity-networks
Repo https://github.com/mvasil/fashion-compatibility
Framework pytorch
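
The role of the masks described in the abstract above can be sketched as a masked distance inside a triplet loss: the same embeddings yield different distances depending on which similarity notion's mask is applied. The embeddings and masks are given here as plain lists rather than learned, and the function names and `margin` default are illustrative assumptions.

```python
import math


def masked_distance(emb_a, emb_b, mask):
    """Euclidean distance after reweighting each dimension by a non-negative mask."""
    return math.sqrt(sum((m * (a - b)) ** 2 for a, b, m in zip(emb_a, emb_b, mask)))


def conditional_triplet_loss(anchor, positive, negative, mask, margin=0.2):
    """Hinge loss asking the positive to be closer than the negative under `mask`."""
    d_pos = masked_distance(anchor, positive, mask)
    d_neg = masked_distance(anchor, negative, mask)
    return max(0.0, d_pos - d_neg + margin)
```

For example, with a two-dimensional embedding, a mask of `[1.0, 0.0]` compares items only along the first dimension and `[0.0, 1.0]` only along the second, so one triplet can be satisfied under one notion and violated under the other.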

Template Matching with Deformable Diversity Similarity

Title Template Matching with Deformable Diversity Similarity
Authors Itamar Talmi, Roey Mechrez, Lihi Zelnik-Manor
Abstract We propose a novel measure for template matching named Deformable Diversity Similarity – based on the diversity of feature matches between a target image window and the template. We rely on both local appearance and geometric information that jointly lead to a powerful approach for matching. Our key contribution is a similarity measure that is robust to complex deformations, significant background clutter, and occlusions. Empirical evaluation on the most up-to-date benchmark shows that our method outperforms the current state-of-the-art in its detection accuracy while improving computational complexity.
Tasks
Published 2016-12-07
URL http://arxiv.org/abs/1612.02190v2
PDF http://arxiv.org/pdf/1612.02190v2.pdf
PWC https://paperswithcode.com/paper/template-matching-with-deformable-diversity
Repo https://github.com/roimehrez/DDIS
Framework none

3D Shape Segmentation with Projective Convolutional Networks

Title 3D Shape Segmentation with Projective Convolutional Networks
Authors Evangelos Kalogerakis, Melinos Averkiou, Subhransu Maji, Siddhartha Chaudhuri
Abstract This paper introduces a deep architecture for segmenting 3D objects into their labeled semantic parts. Our architecture combines image-based Fully Convolutional Networks (FCNs) and surface-based Conditional Random Fields (CRFs) to yield coherent segmentations of 3D shapes. The image-based FCNs are used for efficient view-based reasoning about 3D object parts. Through a special projection layer, FCN outputs are effectively aggregated across multiple views and scales, then are projected onto the 3D object surfaces. Finally, a surface-based CRF combines the projected outputs with geometric consistency cues to yield coherent segmentations. The whole architecture (multi-view FCNs and CRF) is trained end-to-end. Our approach significantly outperforms the existing state-of-the-art methods in the currently largest segmentation benchmark (ShapeNet). Finally, we demonstrate promising segmentation results on noisy 3D shapes acquired from consumer-grade depth cameras.
Tasks
Published 2016-12-08
URL http://arxiv.org/abs/1612.02808v3
PDF http://arxiv.org/pdf/1612.02808v3.pdf
PWC https://paperswithcode.com/paper/3d-shape-segmentation-with-projective
Repo https://github.com/kalov/ShapePFCN
Framework none