May 7, 2019

2869 words 14 mins read

Paper Group AWR 28

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification. CYCLADES: Conflict-free Asynchronous Machine Learning. Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering. End-to-end Optimized Image Compression. Astronomical image reconstruction with convolutional neural networks. D …

From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification

Title From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification
Authors André F. T. Martins, Ramón Fernandez Astudillo
Abstract We propose sparsemax, a new activation function similar to the traditional softmax, but able to output sparse probabilities. After deriving its properties, we show how its Jacobian can be efficiently computed, enabling its use in a network trained with backpropagation. Then, we propose a new smooth and convex loss function which is the sparsemax analogue of the logistic loss. We reveal an unexpected connection between this new loss and the Huber classification loss. We obtain promising empirical results in multi-label classification problems and in attention-based neural networks for natural language inference. For the latter, we achieve performance similar to the traditional softmax, but with a selective, more compact attention focus.
Tasks Multi-Label Classification, Natural Language Inference
Published 2016-02-05
URL http://arxiv.org/abs/1602.02068v2
PDF http://arxiv.org/pdf/1602.02068v2.pdf
PWC https://paperswithcode.com/paper/from-softmax-to-sparsemax-a-sparse-model-of
Repo https://github.com/qrfaction/keras-sparsemax
Framework tf
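
For readers who want to try sparsemax outside the linked Keras repo, the forward pass is short enough to sketch in NumPy. This follows the closed-form simplex projection from the paper; the function name and the 1-D interface are my own choices.

```python
import numpy as np

def sparsemax(z):
    """Project a score vector z onto the probability simplex (Martins &
    Astudillo, 2016). Unlike softmax, the output can contain exact zeros."""
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    k = np.arange(1, len(z) + 1)
    cssv = np.cumsum(z_sorted)               # cumulative sums of sorted scores
    # k(z): largest k such that 1 + k * z_(k) > sum of the top-k scores
    support = k * z_sorted > cssv - 1
    k_z = k[support][-1]
    tau = (cssv[support][-1] - 1) / k_z      # threshold tau(z)
    return np.maximum(z - tau, 0.0)

print(sparsemax(np.array([2.0, 1.0, 0.1])))  # [1. 0. 0.]: exactly sparse
print(sparsemax(np.array([0.1, 0.1, 0.1])))  # uniform, like softmax on ties
```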

CYCLADES: Conflict-free Asynchronous Machine Learning

Title CYCLADES: Conflict-free Asynchronous Machine Learning
Authors Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Chris Re, Benjamin Recht
Abstract We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting. CYCLADES is asynchronous during shared model updates, and requires no memory locking mechanisms, similar to HOGWILD!-type algorithms. Unlike HOGWILD!, CYCLADES introduces no conflicts during the parallel execution, and offers a black-box analysis for provable speedups across a large family of algorithms. Due to its inherent conflict-free nature and cache locality, our multi-core implementation of CYCLADES consistently outperforms HOGWILD!-type algorithms on sufficiently sparse datasets, leading to up to 40% speedup gains compared to the HOGWILD! implementation of SGD, and up to 5x gains over asynchronous implementations of variance reduction algorithms.
Tasks Stochastic Optimization
Published 2016-05-31
URL http://arxiv.org/abs/1605.09721v1
PDF http://arxiv.org/pdf/1605.09721v1.pdf
PWC https://paperswithcode.com/paper/cyclades-conflict-free-asynchronous-machine
Repo https://github.com/amplab/cyclades
Framework none
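
The partitioning idea at the heart of CYCLADES is easy to prototype: two sampled updates conflict if they touch a shared model variable, and the connected components of that conflict graph can be assigned to different cores with no locking, since components share no variables. A toy sketch of the batching step using union-find (data layout and names are mine, not taken from the amplab repo):

```python
from collections import defaultdict

def conflict_free_batches(updates):
    """Group updates into connected components of the conflict graph.
    `updates` is a list of sets of variable indices touched by each update.
    Updates in different components share no variables, so each component
    can be processed serially on its own core without locks."""
    parent = list(range(len(updates)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # union together all updates that touch a common variable
    var_to_first = {}
    for u, vars_touched in enumerate(updates):
        for v in vars_touched:
            if v in var_to_first:
                union(u, var_to_first[v])
            else:
                var_to_first[v] = u

    batches = defaultdict(list)
    for u in range(len(updates)):
        batches[find(u)].append(u)
    return list(batches.values())

# updates 0 and 2 share variable 7; update 1 is independent
print(conflict_free_batches([{1, 7}, {2, 3}, {7, 5}]))  # [[0, 2], [1]]
```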

Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering

Title Dataset and Neural Recurrent Sequence Labeling Model for Open-Domain Factoid Question Answering
Authors Peng Li, Wei Li, Zhengyan He, Xuguang Wang, Ying Cao, Jie Zhou, Wei Xu
Abstract While question answering (QA) with neural networks, i.e. neural QA, has achieved promising results in recent years, the lack of large-scale real-world QA datasets is still a challenge for developing and evaluating neural QA systems. To alleviate this problem, we propose WebQA, a large-scale human-annotated real-world QA dataset with more than 42k questions and 556k evidences. As existing neural QA methods treat QA either as a sequence generation problem or as a classification/ranking problem, they face challenges such as expensive softmax computation, handling of unseen answers, or the need for a separate candidate-answer generation component. In this work, we cast neural QA as a sequence labeling problem and propose an end-to-end sequence labeling model, which overcomes all of the above challenges. Experimental results on WebQA show that our model significantly outperforms the baselines with an F1 score of 74.69% on word-based input, and performance drops by only 3.72 F1 points on the more challenging character-based input.
Tasks Question Answering
Published 2016-07-21
URL http://arxiv.org/abs/1607.06275v2
PDF http://arxiv.org/pdf/1607.06275v2.pdf
PWC https://paperswithcode.com/paper/dataset-and-neural-recurrent-sequence
Repo https://github.com/WangJiuniu/SRQA
Framework pytorch
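
Casting QA as sequence labeling means each evidence token gets an answer/non-answer tag, so extraction becomes per-token classification instead of generation over a large output vocabulary. A toy illustration of building such labels with the common B/I/O convention (the tagging scheme here is my illustration; the paper's actual model is a recurrent network trained end-to-end on labels of this kind):

```python
def bio_labels(evidence_tokens, answer_tokens):
    """Tag each evidence token B (answer start), I (answer continuation)
    or O (outside the answer)."""
    labels = ["O"] * len(evidence_tokens)
    n, m = len(evidence_tokens), len(answer_tokens)
    for start in range(n - m + 1):
        if evidence_tokens[start:start + m] == answer_tokens:
            labels[start] = "B"
            for i in range(start + 1, start + m):
                labels[i] = "I"
    return labels

evidence = "the capital of france is paris .".split()
print(bio_labels(evidence, ["paris"]))
# ['O', 'O', 'O', 'O', 'O', 'B', 'O']
```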

End-to-end Optimized Image Compression

Title End-to-end Optimized Image Compression
Authors Johannes Ballé, Valero Laparra, Eero P. Simoncelli
Abstract We describe an image compression method, consisting of a nonlinear analysis transformation, a uniform quantizer, and a nonlinear synthesis transformation. The transforms are constructed in three successive stages of convolutional linear filters and nonlinear activation functions. Unlike most convolutional neural networks, the joint nonlinearity is chosen to implement a form of local gain control, inspired by those used to model biological neurons. Using a variant of stochastic gradient descent, we jointly optimize the entire model for rate-distortion performance over a database of training images, introducing a continuous proxy for the discontinuous loss function arising from the quantizer. Under certain conditions, the relaxed loss function may be interpreted as the log likelihood of a generative model, as implemented by a variational autoencoder. Unlike these models, however, the compression model must operate at any given point along the rate-distortion curve, as specified by a trade-off parameter. Across an independent set of test images, we find that the optimized method generally exhibits better rate-distortion performance than the standard JPEG and JPEG 2000 compression methods. More importantly, we observe a dramatic improvement in visual quality for all images at all bit rates, which is supported by objective quality estimates using MS-SSIM.
Tasks Image Compression
Published 2016-11-05
URL http://arxiv.org/abs/1611.01704v3
PDF http://arxiv.org/pdf/1611.01704v3.pdf
PWC https://paperswithcode.com/paper/end-to-end-optimized-image-compression
Repo https://github.com/treammm/Compression
Framework tf
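
The "continuous proxy" for the quantizer is the part that is easy to miss: during training, rounding is replaced by additive uniform noise so gradients can flow, while at test time one actually rounds. A minimal PyTorch sketch of that relaxation (my own wrapper; the paper's analysis/synthesis transforms and entropy model are considerably more elaborate):

```python
import torch
import torch.nn as nn

class NoisyQuantizer(nn.Module):
    """Train-time relaxation of uniform scalar quantization.
    Training:  y + u, with u ~ Uniform(-0.5, 0.5)  (differentiable proxy)
    Inference: round(y)                            (actual quantizer)"""
    def forward(self, y):
        if self.training:
            return y + torch.empty_like(y).uniform_(-0.5, 0.5)
        return torch.round(y)

q = NoisyQuantizer()
y = torch.randn(4) * 3
q.train()
print(q(y))   # noisy but differentiable surrogate
q.eval()
print(q(y))   # hard-rounded integer code
```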

Astronomical image reconstruction with convolutional neural networks

Title Astronomical image reconstruction with convolutional neural networks
Authors Rémi Flamary
Abstract State-of-the-art methods in astronomical image reconstruction rely on solving a regularized or constrained optimization problem. Solving this problem can be computationally intensive and usually leads to a quadratic, or at least superlinear, complexity w.r.t. the number of pixels in the image. In this work we investigate the use of convolutional neural networks for image reconstruction in astronomy. With neural networks, the computationally intensive task is the training step, but the prediction step has a fixed complexity per pixel, i.e. linear complexity overall. Numerical experiments show that our approach is both computationally efficient and competitive with other state-of-the-art methods, in addition to being interpretable.
Tasks Image Reconstruction
Published 2016-12-14
URL http://arxiv.org/abs/1612.04526v2
PDF http://arxiv.org/pdf/1612.04526v2.pdf
PWC https://paperswithcode.com/paper/astronomical-image-reconstruction-with
Repo https://github.com/mirapy-org/mirapy
Framework tf
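
The linear-complexity claim follows from the network being fully convolutional: prediction costs a fixed number of operations per pixel, independent of image size. A toy fully convolutional reconstruction net in PyTorch (layer sizes are my placeholders, not the paper's architecture):

```python
import torch
import torch.nn as nn

# No dense layers: the net accepts any image size and its prediction
# cost grows linearly with the number of pixels.
net = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=5, padding=2),
)

blurred = torch.randn(1, 1, 128, 128)            # degraded observation
print(net(blurred).shape)                        # torch.Size([1, 1, 128, 128])
print(net(torch.randn(1, 1, 512, 512)).shape)    # same net, bigger image
```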

Deep Image Homography Estimation

Title Deep Image Homography Estimation
Authors Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich
Abstract We present a deep convolutional neural network for estimating the relative homography between a pair of images. Our feed-forward network has 10 layers, takes two stacked grayscale images as input, and produces an 8-degree-of-freedom homography which can be used to map the pixels from the first image to the second. We present two convolutional neural network architectures for HomographyNet: a regression network which directly estimates the real-valued homography parameters, and a classification network which produces a distribution over quantized homographies. We use a 4-point homography parameterization which maps the four corners from one image into the second image. Our networks are trained in an end-to-end fashion using warped MS-COCO images. Our approach works without the need for separate local feature detection and transformation estimation stages. Our deep models are compared to a traditional homography estimator based on ORB features and we highlight the scenarios where HomographyNet outperforms the traditional technique. We also describe a variety of applications powered by deep homography estimation, thus showcasing the flexibility of a deep learning approach.
Tasks Homography Estimation
Published 2016-06-13
URL http://arxiv.org/abs/1606.03798v1
PDF http://arxiv.org/pdf/1606.03798v1.pdf
PWC https://paperswithcode.com/paper/deep-image-homography-estimation
Repo https://github.com/mazenmel/Deep-homography-estimation-Pytorch
Framework pytorch
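
The 4-point parameterization is the detail worth internalizing: rather than regressing the 8 entries of the 3x3 matrix (which mix rotation and translation at very different scales), the network predicts displacements of four corners, from which the matrix is recovered. A sketch using OpenCV (assumes `opencv-python` is installed; the corner positions and offsets below are made up for illustration):

```python
import numpy as np
import cv2

# Four corners of a patch in the first image ...
corners = np.float32([[0, 0], [128, 0], [128, 128], [0, 128]])
# ... and the 8 values a HomographyNet-style model would regress:
# a (dx, dy) displacement per corner.
offsets = np.float32([[3, -2], [-1, 4], [2, 2], [-4, 1]])

# Recover the full 3x3 homography from the 4 point correspondences.
H = cv2.getPerspectiveTransform(corners, corners + offsets)
print(H)

# Sanity check: H maps the first corner onto its displaced position.
p = H @ np.array([0.0, 0.0, 1.0])
print(p[:2] / p[2])   # approximately [3, -2]
```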

Neural Language Correction with Character-Based Attention

Title Neural Language Correction with Character-Based Attention
Authors Ziang Xie, Anand Avati, Naveen Arivazhagan, Dan Jurafsky, Andrew Y. Ng
Abstract Natural language correction has the potential to help language learners improve their writing skills. While approaches with separate classifiers for different error types have high precision, they do not flexibly handle errors such as redundancy or non-idiomatic phrasing. On the other hand, word- and phrase-based machine translation methods are not designed to cope with orthographic errors, and have recently been outpaced by neural models. Motivated by these issues, we present a neural network-based approach to language correction. The core component of our method is an encoder-decoder recurrent neural network with an attention mechanism. By operating at the character level, the network avoids the problem of out-of-vocabulary words. We illustrate the flexibility of our approach on a dataset of noisy, user-generated text collected from an English learner forum. When combined with a language model, our method achieves a state-of-the-art $F_{0.5}$-score on the CoNLL 2014 Shared Task. We further demonstrate that training the network on additional data with synthesized errors can improve performance.
Tasks Language Modelling, Machine Translation
Published 2016-03-31
URL http://arxiv.org/abs/1603.09727v1
PDF http://arxiv.org/pdf/1603.09727v1.pdf
PWC https://paperswithcode.com/paper/neural-language-correction-with-character
Repo https://github.com/hoangtuanvu/OCR_Correction
Framework tf
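
The character-level choice is what sidesteps out-of-vocabulary words: any misspelling is still a sequence of known characters. A minimal illustration of the input encoding (the vocabulary construction is mine; the paper's model is a full encoder-decoder with attention on top of such IDs):

```python
# Character-level encoding: even misspelled or novel words map to known
# IDs, so there is no out-of-vocabulary problem at the input layer.
PAD, UNK = 0, 1
charset = sorted(set("abcdefghijklmnopqrstuvwxyz .,'"))
char_to_id = {c: i + 2 for i, c in enumerate(charset)}

def encode(text):
    return [char_to_id.get(c, UNK) for c in text.lower()]

print(encode("teh cat sat"))   # 'teh' is a typo, yet fully representable
```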

New word analogy corpus for exploring embeddings of Czech words

Title New word analogy corpus for exploring embeddings of Czech words
Authors Lukáš Svoboda, Tomáš Brychcín
Abstract Word embedding methods have been proven to be very useful in many NLP (Natural Language Processing) tasks. Much has been investigated about word embeddings of English words and phrases, but little attention has been dedicated to other languages. Our goal in this paper is to explore the behavior of state-of-the-art word embedding methods on Czech, a language characterized by very rich morphology. We introduce a new corpus for the word analogy task that inspects syntactic, morphosyntactic and semantic properties of Czech words and phrases. We experiment with the Word2Vec and GloVe algorithms and discuss the results on this corpus. The corpus is available to the research community.
Tasks Word Embeddings
Published 2016-08-02
URL http://arxiv.org/abs/1608.00789v1
PDF http://arxiv.org/pdf/1608.00789v1.pdf
PWC https://paperswithcode.com/paper/new-word-analogy-corpus-for-exploring
Repo https://github.com/Svobikl/cz_corpus
Framework none
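
The analogy task itself is the standard 3CosAdd evaluation: for a question "a is to b as c is to ?", rank all other words by cosine similarity to vec(b) - vec(a) + vec(c). A NumPy sketch with a toy embedding table (the random vectors are placeholders, not the paper's Czech embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["praha", "cesko", "pariz", "francie", "pes"]
emb = {w: rng.standard_normal(50) for w in vocab}   # toy vectors

def analogy(a, b, c, emb):
    """Answer 'a : b :: c : ?' by cosine similarity (3CosAdd),
    excluding the three query words from the candidates."""
    target = emb[b] - emb[a] + emb[c]
    target /= np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = v @ target / np.linalg.norm(v)
        if sim > best_sim:
            best, best_sim = w, sim
    return best

# praha : cesko :: pariz : ? (would be 'francie' with real embeddings)
print(analogy("praha", "cesko", "pariz", emb))
```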

Concordance and the Smallest Covering Set of Preference Orderings

Title Concordance and the Smallest Covering Set of Preference Orderings
Authors Zhiwei Lin, Hui Wang, Cees H. Elzinga
Abstract Preference orderings are orderings of a set of items according to the preferences (of judges). Such orderings arise in a variety of domains, including group decision making, consumer marketing, voting and machine learning. Measuring the mutual information and extracting the common patterns in a set of preference orderings are key to these areas. In this paper we deal with the representation of sets of preference orderings, the quantification of the degree to which judges agree on their ordering of the items (i.e. the concordance), and the efficient, meaningful description of such sets. We propose to represent the orderings in a subsequence-based feature space and present a new algorithm to calculate the size of the set of all common subsequences - the basis of a quantification of concordance, not only for pairs of orderings but also for sets of orderings. The new algorithm is fast and storage efficient with a time complexity of only $O(Nn^2)$ for the orderings of $n$ items by $N$ judges and a space complexity of only $O(\min\{Nn, n^2\})$. Also, we propose to represent the set of all $N$ orderings through a smallest set of covering preferences and present an algorithm to construct this smallest covering set. The source code for the algorithms is available at https://github.com/zhiweiuu/secs
Tasks Decision Making
Published 2016-09-15
URL http://arxiv.org/abs/1609.04722v3
PDF http://arxiv.org/pdf/1609.04722v3.pdf
PWC https://paperswithcode.com/paper/concordance-and-the-smallest-covering-set-of
Repo https://github.com/zhiweiuu/secs
Framework none
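
For a pair of orderings the counting problem already has a compact dynamic program. Since preference orderings contain no repeated items, the standard recurrence for counting distinct common subsequences is exact; the paper's contribution generalizes this to $N$ orderings at $O(Nn^2)$ total cost, but the pairwise special case below conveys the idea:

```python
def count_common_subsequences(a, b):
    """Number of common subsequences (including the empty one) of two
    orderings a, b with distinct items, in O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    # f[i][j] = count for prefixes a[:i], b[:j]; borders = empty subseq only
    f = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                # every common subsequence of the prefixes can be extended
                # by the shared item, exactly doubling the count
                f[i][j] = f[i - 1][j] + f[i][j - 1]
            else:
                f[i][j] = f[i - 1][j] + f[i][j - 1] - f[i - 1][j - 1]
    return f[n][m]

print(count_common_subsequences("abc", "abc"))  # 8: all subsets of {a,b,c}
print(count_common_subsequences("abc", "cba"))  # 4: {}, a, b, c
```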

Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing

Title Word2Vec is a special case of Kernel Correspondence Analysis and Kernels for Natural Language Processing
Authors Hirotaka Niitsuma, Minho Lee
Abstract We show that correspondence analysis (CA) is equivalent to defining a Gini index with appropriately scaled one-hot encoding. Using this relation, we introduce a nonlinear kernel extension to CA. With specialized kernels over an appropriate contingency table, this extended CA recovers known analyses for natural language. We propose a semi-supervised CA, which is a special case of the kernel extension to CA. Because CA requires excessive memory if applied to numerous categories, CA has not been used for natural language processing. We address this problem by introducing delayed evaluation to randomized singular value decomposition. The memory-efficient CA is then applied to a word-vector representation task. We propose a tail-cut kernel, which is an extension of the skip-gram within the kernel extension to CA. Our tail-cut kernel outperforms existing word-vector representation methods.
Tasks
Published 2016-05-17
URL http://arxiv.org/abs/1605.05087v3
PDF http://arxiv.org/pdf/1605.05087v3.pdf
PWC https://paperswithcode.com/paper/word2vec-is-a-special-case-of-kernel
Repo https://github.com/niitsuma/wordca
Framework none
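
The memory trick aside, plain correspondence analysis on a contingency table is itself short to write down: form the matrix of standardized residuals and take a truncated SVD. A sketch using scikit-learn's randomized SVD (variable names are mine; the paper additionally delays evaluation so the table never has to be materialized):

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

def correspondence_analysis(N, n_components=2):
    """CA of a contingency table N: SVD of the standardized residuals
    S = D_r^{-1/2} (P - r c^T) D_c^{-1/2}, where P = N / N.sum()."""
    P = N / N.sum()
    r = P.sum(axis=1)            # row masses
    c = P.sum(axis=0)            # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sigma, Vt = randomized_svd(S, n_components=n_components,
                                  random_state=0)
    rows = U * sigma / np.sqrt(r)[:, None]      # principal row coordinates
    cols = Vt.T * sigma / np.sqrt(c)[:, None]   # principal column coordinates
    return rows, cols

N = np.array([[30., 5., 2.], [3., 40., 6.], [1., 4., 25.]])
rows, cols = correspondence_analysis(N)
print(rows)
print(cols)
```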

Exponential expressivity in deep neural networks through transient chaos

Title Exponential expressivity in deep neural networks through transient chaos
Authors Ben Poole, Subhaneil Lahiri, Maithra Raghu, Jascha Sohl-Dickstein, Surya Ganguli
Abstract We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights. Our results reveal an order-to-chaos expressivity phase transition, with networks in the chaotic phase computing nonlinear functions whose global curvature grows exponentially with depth but not width. We prove this generic class of deep random functions cannot be efficiently computed by any shallow network, going beyond prior work restricted to the analysis of single functions. Moreover, we formalize and quantitatively demonstrate the long conjectured idea that deep networks can disentangle highly curved manifolds in input space into flat manifolds in hidden space. Our theoretical analysis of the expressive power of deep networks broadly applies to arbitrary nonlinearities, and provides a quantitative underpinning for previously abstract notions about the geometry of deep functions.
Tasks
Published 2016-06-16
URL http://arxiv.org/abs/1606.05340v2
PDF http://arxiv.org/pdf/1606.05340v2.pdf
PWC https://paperswithcode.com/paper/exponential-expressivity-in-deep-neural
Repo https://github.com/ganguli-lab/deepchaos
Framework none
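
The mean-field analysis reduces signal propagation to iterating one-dimensional maps over depth. A NumPy sketch of the paper's length map, q_{l+1} = sigma_w^2 E[phi(sqrt(q_l) z)^2] + sigma_b^2 with z ~ N(0,1) and phi = tanh, using Gauss-Hermite quadrature for the expectation (the chaos transition proper is diagnosed from the companion correlation map; the parameter values below are my choices):

```python
import numpy as np

# Gauss-Hermite nodes/weights: for z ~ N(0,1),
# E[f(z)] = (1/sqrt(pi)) * sum_i w_i * f(sqrt(2) * x_i)
x, w = np.polynomial.hermite.hermgauss(60)

def gauss_mean(f):
    return (w * f(np.sqrt(2.0) * x)).sum() / np.sqrt(np.pi)

def length_map(q, sigma_w2, sigma_b2):
    """One step of the mean-field recursion for the squared length q_l."""
    return sigma_w2 * gauss_mean(lambda z: np.tanh(np.sqrt(q) * z) ** 2) \
        + sigma_b2

for sigma_w2 in (0.5, 4.0):        # weak vs strong weight variance
    q = 1.0
    for _ in range(50):            # propagate through 50 layers
        q = length_map(q, sigma_w2, sigma_b2=0.05)
    print(sigma_w2, q)             # small fixed point vs O(1) fixed point
```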

High-dimensional regression over disease subgroups

Title High-dimensional regression over disease subgroups
Authors Frank Dondelinger, Sach Mukherjee, The Alzheimer’s Disease Neuroimaging Initiative
Abstract We consider high-dimensional regression over subgroups of observations. Our work is motivated by biomedical problems, where disease subtypes, for example, may differ with respect to underlying regression models, but sample sizes at the subgroup-level may be limited. We focus on the case in which subgroup-specific models may be expected to be similar but not necessarily identical. Our approach is to treat subgroups as related problem instances and jointly estimate subgroup-specific regression coefficients. This is done in a penalized framework, combining an $\ell_1$ term with an additional term that penalizes differences between subgroup-specific coefficients. This gives solutions that are globally sparse but that allow information-sharing between the subgroups. We present algorithms for estimation and empirical results on simulated data and using Alzheimer’s disease, amyotrophic lateral sclerosis and cancer datasets. These examples demonstrate the gains our approach can offer in terms of prediction and the ability to estimate subgroup-specific sparsity patterns.
Tasks
Published 2016-11-03
URL http://arxiv.org/abs/1611.00953v2
PDF http://arxiv.org/pdf/1611.00953v2.pdf
PWC https://paperswithcode.com/paper/high-dimensional-regression-over-disease
Repo https://github.com/FrankD/fuser
Framework none
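
The penalty structure is simple enough to prototype with proximal gradient descent: a smooth squared-error fit plus a fusion term tying subgroup coefficients together, with soft-thresholding handling the $\ell_1$ part. A sketch using a squared-$\ell_2$ fusion penalty (the linked `fuser` package also supports other fusion variants; function names, step size and iteration count here are my assumptions):

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fused_subgroup_lasso(Xs, ys, lam1=0.1, lam2=1.0, lr=1e-3, iters=5000):
    """Jointly estimate per-subgroup coefficients B[k] minimizing
    sum_k 0.5*||y_k - X_k B_k||^2 + lam1 * sum_k ||B_k||_1
      + lam2 * sum_{k<k'} ||B_k - B_k'||_2^2   (information sharing)."""
    K, p = len(Xs), Xs[0].shape[1]
    B = np.zeros((K, p))
    for _ in range(iters):
        grad = np.zeros_like(B)
        for k in range(K):
            grad[k] = Xs[k].T @ (Xs[k] @ B[k] - ys[k])         # fit term
            grad[k] += 2 * lam2 * (K * B[k] - B.sum(axis=0))   # fusion term
        B = soft_threshold(B - lr * grad, lr * lam1)           # prox of l1
    return B

rng = np.random.default_rng(1)
beta = np.array([1.0, 0.0, -1.0])
Xs = [rng.standard_normal((40, 3)) for _ in range(2)]
ys = [X @ (beta + 0.1 * rng.standard_normal(3)) for X in Xs]  # similar groups
print(fused_subgroup_lasso(Xs, ys))   # sparse, near-shared coefficients
```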

Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis

Title Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis
Authors Angela Dai, Charles Ruizhongtai Qi, Matthias Nießner
Abstract We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution – but complete – output. To this end, we introduce a 3D-Encoder-Predictor Network (3D-EPN) which is composed of 3D convolutional layers. The network is trained to predict and fill in missing data, and operates on an implicit surface representation that encodes both known and unknown space. This allows us to predict global structure in unknown areas with high accuracy. We then correlate these intermediary results with 3D geometry from a shape database at test time. In a final pass, we propose a patch-based 3D shape synthesis method that imposes the 3D geometry from these retrieved shapes as constraints on the coarsely-completed mesh. This synthesis process enables us to reconstruct fine-scale detail and generate high-resolution output while respecting the global mesh structure obtained by the 3D-EPN. Although our 3D-EPN outperforms state-of-the-art completion methods, the main contribution in our work lies in the combination of a data-driven shape predictor and analytic 3D shape synthesis. In our results, we show extensive evaluations on a newly-introduced shape completion benchmark for both real-world and synthetic data.
Tasks
Published 2016-12-01
URL http://arxiv.org/abs/1612.00101v2
PDF http://arxiv.org/pdf/1612.00101v2.pdf
PWC https://paperswithcode.com/paper/shape-completion-using-3d-encoder-predictor
Repo https://github.com/angeladai/cnncomplete
Framework torch
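
The 3D-EPN half of the pipeline is an encoder-predictor over voxel grids. A toy PyTorch sketch of that shape (channel counts and the 32^3 grid size are my placeholders, not the paper's architecture):

```python
import torch
import torch.nn as nn

# Encoder-predictor over a 32^3 distance-field grid: compress the partial
# scan to a bottleneck, then predict a complete (coarse) volume.
epn = nn.Sequential(
    nn.Conv3d(2, 16, 4, stride=2, padding=1), nn.ReLU(),    # 32^3 -> 16^3
    nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.ReLU(),   # 16^3 -> 8^3
    nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1),      # back to 32^3
)

# 2 input channels: known-space distance values and a known/unknown mask.
partial = torch.randn(1, 2, 32, 32, 32)
print(epn(partial).shape)        # torch.Size([1, 1, 32, 32, 32])
```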

Trainable Frontend For Robust and Far-Field Keyword Spotting

Title Trainable Frontend For Robust and Far-Field Keyword Spotting
Authors Yuxuan Wang, Pascal Getreuer, Thad Hughes, Richard F. Lyon, Rif A. Saurous
Abstract Robust and far-field speech recognition is critical to enable true hands-free communication. In far-field conditions, signals are attenuated due to distance. To improve robustness to loudness variation, we introduce a novel frontend called per-channel energy normalization (PCEN). The key ingredient of PCEN is the use of an automatic gain control based dynamic compression to replace the widely used static (such as log or root) compression. We evaluate PCEN on the keyword spotting task. On our large rerecorded noisy and far-field eval sets, we show that PCEN significantly improves recognition performance. Furthermore, we model PCEN as neural network layers and optimize high-dimensional PCEN parameters jointly with the keyword spotting acoustic model. The trained PCEN frontend demonstrates significant further improvements without increasing model complexity or inference-time cost.
Tasks Keyword Spotting, Speech Recognition
Published 2016-07-19
URL http://arxiv.org/abs/1607.05666v1
PDF http://arxiv.org/pdf/1607.05666v1.pdf
PWC https://paperswithcode.com/paper/trainable-frontend-for-robust-and-far-field
Repo https://github.com/simongrest/kaggle-freesound-audio-tagging-2019
Framework none
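
PCEN has since become a standard frontend, and the formula is compact: a first-order IIR smoother M over time provides the automatic-gain-control denominator, followed by offset-root compression. A NumPy reference (the hyperparameter defaults follow common later usage and are assumptions, not values taken from this paper):

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization of a (time, freq) energy
    spectrogram E.
    AGC: divide by a temporally smoothed energy M raised to alpha.
    DRC: offset-root compression (.)**r - delta**r."""
    M = np.empty_like(E)
    M[0] = E[0]
    for t in range(1, len(E)):                  # first-order IIR smoother
        M[t] = (1 - s) * M[t - 1] + s * E[t]
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

E = np.abs(np.random.randn(100, 40)) ** 2       # fake mel energies
print(pcen(E).shape)                            # (100, 40)
```

Because s, alpha, delta and r all appear in the formula as ordinary parameters, the paper's key move is to treat them as trainable and learn them jointly with the acoustic model.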

Exchangeable Random Measures for Sparse and Modular Graphs with Overlapping Communities

Title Exchangeable Random Measures for Sparse and Modular Graphs with Overlapping Communities
Authors Adrien Todeschini, Xenia Miscouridou, François Caron
Abstract We propose a novel statistical model for sparse networks with overlapping community structure. The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block-structure to the sparse regime. Our construction builds on vectors of completely random measures, and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities. We develop methods for simulating this class of random graphs, as well as for performing posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
Tasks
Published 2016-02-05
URL http://arxiv.org/abs/1602.02114v2
PDF http://arxiv.org/pdf/1602.02114v2.pdf
PWC https://paperswithcode.com/paper/exchangeable-random-measures-for-sparse-and
Repo https://github.com/misxenia/SNetOC
Framework none
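
A sketch of the generative idea, reduced to its likelihood layer: each node i gets a nonnegative affiliation vector w_i over p latent communities, and an edge appears with probability 1 - exp(-2 w_i . w_j), so nodes connect when they are strongly affiliated to shared communities. The toy simulator below uses i.i.d. gamma weights as a stand-in; the paper's actual construction draws the weights from compound completely random measures, which is what yields sparsity and supports posterior inference.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 200, 3                                        # nodes, communities
w = rng.gamma(shape=0.3, scale=1.0, size=(n, p))     # affiliation weights

# Edge probability from overlapping communities: 1 - exp(-2 * w_i . w_j)
prob = 1.0 - np.exp(-2.0 * (w @ w.T))
adj = np.triu(rng.random((n, n)) < prob, k=1)        # simple undirected graph
adj = adj | adj.T

print("edges:", adj.sum() // 2)
print("density:", adj.sum() / (n * (n - 1)))
```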