May 7, 2019

Paper Group AWR 87

Pointer Sentinel Mixture Models. RHOG: A Refinement-Operator Library for Directed Labeled Graphs. Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment. Learning to Protect Communications with Adversarial Neural Cryptography. Bag of Tricks for Efficient Text Classification. Numerical Coding of Nominal Data. Layer Normali …

Pointer Sentinel Mixture Models

Title Pointer Sentinel Mixture Models
Authors Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher
Abstract Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus.
Tasks Language Modelling
Published 2016-09-26
URL http://arxiv.org/abs/1609.07843v1
PDF http://arxiv.org/pdf/1609.07843v1.pdf
PWC https://paperswithcode.com/paper/pointer-sentinel-mixture-models
Repo https://github.com/elanmart/psmm
Framework pytorch
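
The mixture described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary representation of the vocabulary distribution, and the toy scores in the usage note are all hypothetical. A single softmax is taken over the pointer scores for the context positions plus one sentinel score; the sentinel's share acts as the gate on the vocabulary softmax, and the remaining mass goes to words copied from the recent context.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_probs, context_words, pointer_scores, sentinel_score):
    """Mix a vocabulary distribution with a pointer over recent context.

    vocab_probs:    dict word -> probability from the softmax classifier
    context_words:  words in the recent context (one per pointer score)
    pointer_scores: unnormalized pointer scores, one per context word
    sentinel_score: unnormalized score of the sentinel (hypothetical input)
    """
    # One softmax over context positions plus the sentinel.
    attention = softmax(list(pointer_scores) + [sentinel_score])
    gate = attention[-1]  # mass assigned to the vocabulary softmax

    # Vocabulary part, scaled by the gate.
    mixed = {word: gate * p for word, p in vocab_probs.items()}

    # Pointer part: each context position adds its attention mass to its word.
    for word, mass in zip(context_words, attention[:-1]):
        mixed[word] = mixed.get(word, 0.0) + mass
    return mixed, gate
```

For example, calling `pointer_sentinel_mixture({"the": 0.5, "cat": 0.3, "<unk>": 0.2}, ["Merity", "the"], [2.0, 0.0], 0.0)` gives a nonzero probability to "Merity" even though it is absent from the vocabulary, which is the rare-word behavior the abstract describes.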

RHOG: A Refinement-Operator Library for Directed Labeled Graphs

Title RHOG: A Refinement-Operator Library for Directed Labeled Graphs
Authors Santiago Ontañón
Abstract This document provides the foundations behind the functionality provided by the $\rho$G library (https://github.com/santiontanon/RHOG), focusing on the basic operations the library provides: subsumption, refinement of directed labeled graphs, and distance/similarity assessment between directed labeled graphs. $\rho$G development was initially supported by the National Science Foundation, by the EAGER grant IIS-1551338.
Tasks
Published 2016-04-23
URL http://arxiv.org/abs/1604.06954v1
PDF http://arxiv.org/pdf/1604.06954v1.pdf
PWC https://paperswithcode.com/paper/rhog-a-refinement-operator-library-for
Repo https://github.com/santiontanon/RHOG
Framework none

Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment

Title Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
Authors Sebastian Bosse, Dominique Maniry, Klaus-Robert Müller, Thomas Wiegand, Wojciech Samek
Abstract We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting and 2) it allows for joint learning of local quality and local weights, i.e., relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge Database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features.
Tasks Image Quality Assessment
Published 2016-12-06
URL http://arxiv.org/abs/1612.01697v2
PDF http://arxiv.org/pdf/1612.01697v2.pdf
PWC https://paperswithcode.com/paper/deep-neural-networks-for-no-reference-and
Repo https://github.com/dmaniry/deepIQA
Framework none

Learning to Protect Communications with Adversarial Neural Cryptography

Title Learning to Protect Communications with Adversarial Neural Cryptography
Authors Martín Abadi, David G. Andersen
Abstract We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms of an adversary. Thus, a system may consist of neural networks named Alice and Bob, and we aim to limit what a third neural network named Eve learns from eavesdropping on the communication between Alice and Bob. We do not prescribe specific cryptographic algorithms to these neural networks; instead, we train end-to-end, adversarially. We demonstrate that the neural networks can learn how to perform forms of encryption and decryption, and also how to apply these operations selectively in order to meet confidentiality goals.
Tasks
Published 2016-10-21
URL http://arxiv.org/abs/1610.06918v1
PDF http://arxiv.org/pdf/1610.06918v1.pdf
PWC https://paperswithcode.com/paper/learning-to-protect-communications-with
Repo https://github.com/IBM/MAX-Adversarial-Cryptography
Framework tf

Bag of Tricks for Efficient Text Classification

Title Bag of Tricks for Efficient Text Classification
Authors Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Abstract This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.
Tasks Sentiment Analysis, Text Classification
Published 2016-07-06
URL http://arxiv.org/abs/1607.01759v3
PDF http://arxiv.org/pdf/1607.01759v3.pdf
PWC https://paperswithcode.com/paper/bag-of-tricks-for-efficient-text
Repo https://github.com/DW-yejing/fasttext4j-jdk6
Framework none

Numerical Coding of Nominal Data

Title Numerical Coding of Nominal Data
Authors Zenon Gniazdowski, Michal Grabowski
Abstract In this paper, a novel approach for coding nominal data is proposed. For the given nominal data, a rank in the form of a complex number is assigned. The proposed method does not lose any information about the attribute and brings other properties previously unknown. The approach based on these new properties can be used for classification. The analyzed example shows that classification with the use of coded nominal data, or of both numerical and coded nominal data, is more effective than classification that uses only numerical data.
Tasks
Published 2016-01-08
URL http://arxiv.org/abs/1601.01966v1
PDF http://arxiv.org/pdf/1601.01966v1.pdf
PWC https://paperswithcode.com/paper/numerical-coding-of-nominal-data
Repo https://github.com/IrisStark/ComplexEncoder
Framework none

Layer Normalization

Title Layer Normalization
Authors Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
Abstract Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.
Tasks
Published 2016-07-21
URL http://arxiv.org/abs/1607.06450v1
PDF http://arxiv.org/pdf/1607.06450v1.pdf
PWC https://paperswithcode.com/paper/layer-normalization
Repo https://github.com/jiamings/fast-weights
Framework tf
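
The normalization described in the abstract can be sketched for a single layer and a single training case as follows. This is a minimal pure-Python illustration, not the authors' code: the function name and the small `eps` constant added for numerical stability are assumptions that do not come from the abstract.

```python
import math


def layer_norm(summed_inputs, gain, bias, eps=1e-5):
    """Layer-normalize the summed inputs of one layer for one training case.

    summed_inputs: pre-activation values of all neurons in the layer
    gain, bias:    per-neuron adaptive parameters, applied after normalizing
    eps:           stability constant (an assumption, not from the abstract)
    """
    n = len(summed_inputs)
    # Statistics are computed across the neurons of the layer,
    # not across a mini-batch as in batch normalization.
    mean = sum(summed_inputs) / n
    variance = sum((a - mean) ** 2 for a in summed_inputs) / n
    std = math.sqrt(variance + eps)
    # The result would then be passed through the non-linearity.
    return [g * (a - mean) / std + b
            for a, g, b in zip(summed_inputs, gain, bias)]
```

Because the statistics depend only on the current training case, the same function can be used unchanged at training and test time, and it can be called once per time step in a recurrent network.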

Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision

Title Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Authors Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
Abstract Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects is still an open problem due to the high dimensionality of the data and many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent’s perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the perspective transformation. More importantly, the projection loss enables the unsupervised learning using 2D observation without explicit 3D supervision. We demonstrate the ability of the model in generating 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved.
Tasks 3D Object Reconstruction, Object Reconstruction
Published 2016-12-01
URL http://arxiv.org/abs/1612.00814v3
PDF http://arxiv.org/pdf/1612.00814v3.pdf
PWC https://paperswithcode.com/paper/perspective-transformer-nets-learning-single
Repo https://github.com/tensorflow/models/tree/master/research/ptn
Framework tf

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

Title Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
Authors Justin Salamon, Juan Pablo Bello
Abstract The ability of deep convolutional neural networks (CNN) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: first, we propose a deep convolutional neural network architecture for environmental sound classification. Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture. Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification. We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation. Finally, we examine the influence of each augmentation on the model’s classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation.
Tasks Data Augmentation, Dictionary Learning, Environmental Sound Classification
Published 2016-08-15
URL http://arxiv.org/abs/1608.04363v2
PDF http://arxiv.org/pdf/1608.04363v2.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-neural-networks-and-data-1
Repo https://github.com/justinsalamon/UrbanSound8K-JAMS
Framework none

Fast Low-rank Shared Dictionary Learning for Image Classification

Title Fast Low-rank Shared Dictionary Learning for Image Classification
Authors Tiep Vu, Vishal Monga
Abstract Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image datasets establish the advantages of our method over state-of-the-art dictionary learning methods.
Tasks Dictionary Learning, Image Classification
Published 2016-10-27
URL http://arxiv.org/abs/1610.08606v3
PDF http://arxiv.org/pdf/1610.08606v3.pdf
PWC https://paperswithcode.com/paper/fast-low-rank-shared-dictionary-learning-for
Repo https://github.com/tiepvupsu/DICTOL
Framework none

Learning a low-rank shared dictionary for object classification

Title Learning a low-rank shared dictionary for object classification
Authors Tiep H. Vu, Vishal Monga
Abstract Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. Inspired by this observation, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we propose a new fast and accurate algorithm to solve the sparse coding problems in the learning step, accelerating its convergence. The said algorithm could also be applied to FDDL and its extensions. Experimental results on widely used image databases establish the advantages of our method over state-of-the-art dictionary learning methods.
Tasks Dictionary Learning, Object Classification
Published 2016-01-31
URL http://arxiv.org/abs/1602.00310v2
PDF http://arxiv.org/pdf/1602.00310v2.pdf
PWC https://paperswithcode.com/paper/learning-a-low-rank-shared-dictionary-for
Repo https://github.com/tiepvupsu/DICTOL
Framework none

One-shot Learning with Memory-Augmented Neural Networks

Title One-shot Learning with Memory-Augmented Neural Networks
Authors Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap
Abstract Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
Tasks One-Shot Learning
Published 2016-05-19
URL http://arxiv.org/abs/1605.06065v1
PDF http://arxiv.org/pdf/1605.06065v1.pdf
PWC https://paperswithcode.com/paper/one-shot-learning-with-memory-augmented
Repo https://github.com/ash3n/One-shot-Memory-Augmented-NN
Framework tf
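
Content-focused memory access, as mentioned in the abstract, can be illustrated with a generic read operation. The sketch below is an assumption-laden example of content-based addressing in general (cosine similarity between a key and each memory row, followed by a softmax); it does not reproduce the paper's specific access module, and the function names and the small constant in the denominator are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity with a small constant to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_read(key, memory):
    """Read from memory by content: rows similar to the key get more weight.

    key:    query vector produced by the controller (hypothetical input)
    memory: list of memory rows, each the same length as the key
    """
    similarities = [cosine_similarity(key, row) for row in memory]
    # Softmax over similarities gives the read weights.
    peak = max(similarities)
    exps = [math.exp(s - peak) for s in similarities]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The read vector is the weighted sum of the memory rows.
    width = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(width)]
    return read, weights
```

No memory location enters the computation: the weights depend only on how closely each stored row matches the key.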

Towards a Neural Statistician

Title Towards a Neural Statistician
Authors Harrison Edwards, Amos Storkey
Abstract An efficient learner is one who reuses what they already know to tackle a new problem. For a machine learner, this means understanding the similarities amongst datasets. In order to do this, one must take seriously the idea of working with datasets, rather than datapoints, as the key objects to model. Towards this goal, we demonstrate an extension of a variational autoencoder that can learn a method for computing representations, or statistics, of datasets in an unsupervised fashion. The network is trained to produce statistics that encapsulate a generative model for each dataset. Hence the network enables efficient learning from new datasets for both unsupervised and supervised tasks. We show that we are able to learn statistics that can be used for: clustering datasets, transferring generative models to new datasets, selecting representative samples of datasets and classifying previously unseen classes. We refer to our model as a neural statistician, and by this we mean a neural network that can learn to compute summary statistics of datasets without supervision.
Tasks Few-Shot Image Classification
Published 2016-06-07
URL http://arxiv.org/abs/1606.02185v2
PDF http://arxiv.org/pdf/1606.02185v2.pdf
PWC https://paperswithcode.com/paper/towards-a-neural-statistician
Repo https://github.com/cravingoxygen/neuralstat
Framework pytorch

SOL: A Library for Scalable Online Learning Algorithms

Title SOL: A Library for Scalable Online Learning Algorithms
Authors Yue Wu, Steven C. H. Hoi, Chenghao Liu, Jing Lu, Doyen Sahoo, Nenghai Yu
Abstract SOL is an open-source library for scalable online learning algorithms, and is particularly suitable for learning with high-dimensional data. The library provides a family of regular and sparse online learning algorithms for large-scale binary and multi-class classification tasks with high efficiency, scalability, portability, and extensibility. SOL was implemented in C++, and provided with a collection of easy-to-use command-line tools, python wrappers and library calls for users and developers, as well as comprehensive documents for both beginners and advanced users. SOL is not only a practical machine learning toolbox, but also a comprehensive experimental platform for online learning research. Experiments demonstrate that SOL is highly efficient and scalable for large-scale machine learning with high-dimensional data.
Tasks
Published 2016-10-28
URL http://arxiv.org/abs/1610.09083v1
PDF http://arxiv.org/pdf/1610.09083v1.pdf
PWC https://paperswithcode.com/paper/sol-a-library-for-scalable-online-learning
Repo https://github.com/LIBOL/SOL
Framework none

Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning

Title Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning
Authors Suyoun Kim, Takaaki Hori, Shinji Watanabe
Abstract Recently, there has been an increasing interest in end-to-end speech recognition that directly transcribes speech to text without any predefined alignments. One approach is the attention-based encoder-decoder framework that learns a mapping between variable-length input and output sequences in one step using a purely data-driven method. The attention model has often been shown to improve the performance over another end-to-end approach, the Connectionist Temporal Classification (CTC), mainly because it explicitly uses the history of the target character without any conditional independence assumptions. However, we observed that the attention model performs poorly in noisy conditions and is hard to train in the initial training stage with long input sequences. This is because the attention model is too flexible to predict proper alignments in such cases due to the lack of left-to-right constraints as used in CTC. This paper presents a novel method for end-to-end speech recognition to improve robustness and achieve fast convergence by using a joint CTC-attention model within the multi-task learning framework, thereby mitigating the alignment issue. An experiment on the WSJ and CHiME-4 tasks demonstrates its advantages over both the CTC and attention-based encoder-decoder baselines, showing 5.4-14.6% relative improvements in Character Error Rate (CER).
Tasks End-To-End Speech Recognition, Multi-Task Learning, Speech Recognition
Published 2016-09-21
URL http://arxiv.org/abs/1609.06773v2
PDF http://arxiv.org/pdf/1609.06773v2.pdf
PWC https://paperswithcode.com/paper/joint-ctc-attention-based-end-to-end-speech
Repo https://github.com/Alexander-H-Liu/End-to-end-ASR-Pytorch
Framework pytorch
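
A joint objective of the kind the abstract describes is commonly written as a weighted sum of the two losses. The sketch below assumes that form; the function name, the `ctc_weight` parameter, and its default value are hypothetical placeholders, and the two loss values are assumed to be computed elsewhere by a CTC branch and an attention decoder that share one encoder.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, ctc_weight=0.5):
    """Interpolate between a CTC loss and an attention-decoder loss.

    ctc_loss:       scalar loss from the CTC branch
    attention_loss: scalar loss from the attention-based decoder
    ctc_weight:     interpolation weight in [0, 1]; the default here is an
                    arbitrary placeholder, not a value taken from the paper
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    # ctc_weight = 0 recovers pure attention training,
    # ctc_weight = 1 recovers pure CTC training.
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss
```

Under this assumption, the CTC term contributes the left-to-right alignment constraint that the abstract says the attention model lacks, while the attention term keeps the dependence on the history of target characters.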