May 7, 2019

Paper Group AWR 87

Pointer Sentinel Mixture Models


Title	Pointer Sentinel Mixture Models
Authors	Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher
Abstract	Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus.
Tasks	Language Modelling
Published	2016-09-26
URL	http://arxiv.org/abs/1609.07843v1
PDF	http://arxiv.org/pdf/1609.07843v1.pdf
PWC	https://paperswithcode.com/paper/pointer-sentinel-mixture-models
Repo	https://github.com/elanmart/psmm
Framework	pytorch
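
The abstract describes a model that can either copy a word from the recent context or fall back to a standard softmax classifier. The following is a minimal sketch of one way such a mixture could be computed, assuming the pointer scores and a single sentinel score are normalized together and that the sentinel's share acts as the gate. The function names, the toy vocabulary, and the scores are hypothetical and are not taken from the linked repository.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_probs, context_words, pointer_scores, sentinel_score):
    """Mix a softmax distribution with a pointer distribution over recent context.

    vocab_probs:    dict mapping word -> probability from the softmax classifier
    context_words:  recent words the pointer is allowed to copy
    pointer_scores: one unnormalized score per context word
    sentinel_score: unnormalized score for "defer to the softmax" (assumed form)
    """
    # Normalize the pointer scores and the sentinel score jointly.
    attn = softmax(list(pointer_scores) + [sentinel_score])
    gate = attn[-1]  # probability mass handed to the softmax classifier
    mixed = {word: gate * p for word, p in vocab_probs.items()}
    # The remaining mass goes to context positions; repeated words accumulate.
    for word, a in zip(context_words, attn[:-1]):
        mixed[word] = mixed.get(word, 0.0) + a
    return mixed, gate


# Toy example: "Fed" is absent from the softmax vocabulary but present in context.
vocab_probs = {"the": 0.6, "bank": 0.3, "<unk>": 0.1}
context = ["Fed", "Chair", "Yellen", "Fed"]
mixed, gate = pointer_sentinel_mixture(vocab_probs, context, [2.0, 0.5, 1.0, 2.0], 1.0)
```

In this sketch a word that the softmax cannot produce still receives probability through the pointer, which is the behaviour the abstract highlights for rare or unseen words.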


RHOG: A Refinement-Operator Library for Directed Labeled Graphs


Title	RHOG: A Refinement-Operator Library for Directed Labeled Graphs
Authors	Santiago Ontañón
Abstract	This document provides the foundations behind the functionality provided by the $\rho$G library (https://github.com/santiontanon/RHOG), focusing on the basic operations the library provides: subsumption, refinement of directed labeled graphs, and distance/similarity assessment between directed labeled graphs. $\rho$G development was initially supported by the National Science Foundation, by the EAGER grant IIS-1551338.
Tasks
Published	2016-04-23
URL	http://arxiv.org/abs/1604.06954v1
PDF	http://arxiv.org/pdf/1604.06954v1.pdf
PWC	https://paperswithcode.com/paper/rhog-a-refinement-operator-library-for
Repo	https://github.com/santiontanon/RHOG
Framework	none

Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment


Title	Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
Authors	Sebastian Bosse, Dominique Maniry, Klaus-Robert Müller, Thomas Wiegand, Wojciech Samek
Abstract	We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting and 2) it allows for joint learning of local quality and local weights, i.e., relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge Database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features.
Tasks	Image Quality Assessment
Published	2016-12-06
URL	http://arxiv.org/abs/1612.01697v2
PDF	http://arxiv.org/pdf/1612.01697v2.pdf
PWC	https://paperswithcode.com/paper/deep-neural-networks-for-no-reference-and
Repo	https://github.com/dmaniry/deepIQA
Framework	none
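
The abstract mentions jointly learned local quality scores and local weights that express how much each local estimate contributes to the global one. A minimal sketch of that aggregation step as a weighted average follows. It assumes the weights are clamped to be non-negative and offset by a small constant; the function name, the clamping, and `eps` are illustrative choices, not details confirmed by the abstract.

```python
def pooled_quality(patch_scores, patch_weights, eps=1e-6):
    """Combine per-patch quality scores into one global estimate.

    patch_scores:  local quality estimate for each image patch
    patch_weights: raw importance of each patch (may be negative before clamping)
    eps:           small offset that keeps the normalizer strictly positive (assumed)
    """
    if len(patch_scores) != len(patch_weights):
        raise ValueError("scores and weights must have the same length")
    # Clamp weights at zero and add eps so the denominator never vanishes.
    weights = [max(w, 0.0) + eps for w in patch_weights]
    total = sum(weights)
    return sum(q * w for q, w in zip(patch_scores, weights)) / total


# Equal weights reduce to a plain average of the patch scores.
uniform = pooled_quality([1.0, 3.0], [1.0, 1.0])
# A dominant weight pulls the global score toward that patch's score.
skewed = pooled_quality([1.0, 3.0], [0.0, 5.0])
```

With equal weights this is plain average pooling, so the learned weights matter only where patches differ in importance.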

Learning to Protect Communications with Adversarial Neural Cryptography


Title	Learning to Protect Communications with Adversarial Neural Cryptography
Authors	Martín Abadi, David G. Andersen
Abstract	We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms of an adversary. Thus, a system may consist of neural networks named Alice and Bob, and we aim to limit what a third neural network named Eve learns from eavesdropping on the communication between Alice and Bob. We do not prescribe specific cryptographic algorithms to these neural networks; instead, we train end-to-end, adversarially. We demonstrate that the neural networks can learn how to perform forms of encryption and decryption, and also how to apply these operations selectively in order to meet confidentiality goals.
Tasks
Published	2016-10-21
URL	http://arxiv.org/abs/1610.06918v1
PDF	http://arxiv.org/pdf/1610.06918v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-protect-communications-with
Repo	https://github.com/IBM/MAX-Adversarial-Cryptography
Framework	tf
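
The abstract describes adversarial training in which Eve tries to reconstruct the plaintext while Alice and Bob try to communicate without helping her. A minimal sketch of one possible pair of loss functions follows, assuming bits are encoded as values in [0, 1] and that Alice and Bob aim to hold Eve at chance level (half the bits wrong) instead of maximizing her error. The exact form is an assumption for illustration and may differ from the paper and from the linked repository.

```python
def l1_distance(plaintext, guess):
    """Sum of absolute differences; for 0/1 bits this counts the bits that are wrong."""
    return sum(abs(p - g) for p, g in zip(plaintext, guess))


def eve_loss(plaintext, eve_guess):
    """Eve simply wants to reconstruct the plaintext."""
    return l1_distance(plaintext, eve_guess)


def alice_bob_loss(plaintext, bob_guess, eve_guess):
    """Bob's reconstruction error plus a penalty when Eve deviates from chance.

    The penalty is zero when Eve gets exactly half of the bits wrong and grows
    as she does better (or worse) than random guessing. Assumed form.
    """
    n = len(plaintext)
    bob_error = l1_distance(plaintext, bob_guess)
    eve_error = l1_distance(plaintext, eve_guess)
    chance = n / 2.0
    return bob_error + ((chance - eve_error) ** 2) / (chance ** 2)


plaintext = [1.0, 0.0, 1.0, 1.0]
# Bob decodes perfectly and Eve outputs 0.5 everywhere: no penalty.
ideal = alice_bob_loss(plaintext, plaintext, [0.5] * 4)
# Bob decodes perfectly but Eve also recovers everything: maximal penalty of 1.
leaked = alice_bob_loss(plaintext, plaintext, plaintext)
```

Targeting chance level is a design choice in this sketch: an Eve who is always wrong would reveal the plaintext just as well as one who is always right.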

Bag of Tricks for Efficient Text Classification


Title	Bag of Tricks for Efficient Text Classification
Authors	Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Abstract	This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.
Tasks	Sentiment Analysis, Text Classification
Published	2016-07-06
URL	http://arxiv.org/abs/1607.01759v3
PDF	http://arxiv.org/pdf/1607.01759v3.pdf
PWC	https://paperswithcode.com/paper/bag-of-tricks-for-efficient-text
Repo	https://github.com/DW-yejing/fasttext4j-jdk6
Framework	none

Numerical Coding of Nominal Data


Title	Numerical Coding of Nominal Data
Authors	Zenon Gniazdowski, Michal Grabowski
Abstract	In this paper, a novel approach for coding nominal data is proposed. For the given nominal data, a rank in the form of a complex number is assigned. The proposed method does not lose any information about the attribute and brings other properties previously unknown. The approach based on these new properties can be used for classification. The analyzed example shows that classification using coded nominal data, or both numerical and coded nominal data, is more effective than classification that uses only numerical data.
Tasks
Published	2016-01-08
URL	http://arxiv.org/abs/1601.01966v1
PDF	http://arxiv.org/pdf/1601.01966v1.pdf
PWC	https://paperswithcode.com/paper/numerical-coding-of-nominal-data
Repo	https://github.com/IrisStark/ComplexEncoder
Framework	none

Layer Normalization


Title	Layer Normalization
Authors	Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
Abstract	Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.
Tasks
Published	2016-07-21
URL	http://arxiv.org/abs/1607.06450v1
PDF	http://arxiv.org/pdf/1607.06450v1.pdf
PWC	https://paperswithcode.com/paper/layer-normalization
Repo	https://github.com/jiamings/fast-weights
Framework	tf
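
The computation described in the abstract (statistics taken over all summed inputs of one layer for a single training case, followed by a per-neuron gain and bias) can be sketched in a few lines. This is a minimal illustration in plain Python; the function name and the `eps` stabilizer are assumptions added here and are not mentioned in the abstract.

```python
import math


def layer_norm(summed_inputs, gain, bias, eps=1e-5):
    """Normalize the summed inputs of one layer for a single training case.

    summed_inputs: pre-activation values of every neuron in the layer
    gain, bias:    per-neuron parameters applied after normalization
    eps:           small constant for numerical stability (assumed)
    """
    n = len(summed_inputs)
    # Mean and variance are taken across the neurons of the layer, not the batch.
    mean = sum(summed_inputs) / n
    var = sum((a - mean) ** 2 for a in summed_inputs) / n
    std = math.sqrt(var + eps)
    # Gain and bias are applied before the non-linearity.
    return [g * (a - mean) / std + b for a, g, b in zip(summed_inputs, gain, bias)]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0], gain=[1.0] * 4, bias=[0.0] * 4)
```

Because no batch statistics are involved, the same function applies unchanged at training and test time, and it can be called separately at each time step of a recurrent network.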

Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision


Title	Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Authors	Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee
Abstract	Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects is still an open problem due to the high dimensionality of the data and many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent’s perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the perspective transformation. More importantly, the projection loss enables the unsupervised learning using 2D observation without explicit 3D supervision. We demonstrate the ability of the model in generating 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved.
Tasks	3D Object Reconstruction, Object Reconstruction
Published	2016-12-01
URL	http://arxiv.org/abs/1612.00814v3
PDF	http://arxiv.org/pdf/1612.00814v3.pdf
PWC	https://paperswithcode.com/paper/perspective-transformer-nets-learning-single
Repo	https://github.com/tensorflow/models/tree/master/research/ptn
Framework	tf

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification


Title	Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
Authors	Justin Salamon, Juan Pablo Bello
Abstract	The ability of deep convolutional neural networks (CNN) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: first, we propose a deep convolutional neural network architecture for environmental sound classification. Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture. Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification. We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation. Finally, we examine the influence of each augmentation on the model’s classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation.
Tasks	Data Augmentation, Dictionary Learning, Environmental Sound Classification
Published	2016-08-15
URL	http://arxiv.org/abs/1608.04363v2
PDF	http://arxiv.org/pdf/1608.04363v2.pdf
PWC	https://paperswithcode.com/paper/deep-convolutional-neural-networks-and-data-1
Repo	https://github.com/justinsalamon/UrbanSound8K-JAMS
Framework	none

Fast Low-rank Shared Dictionary Learning for Image Classification


Title	Fast Low-rank Shared Dictionary Learning for Image Classification
Authors	Tiep Vu, Vishal Monga
Abstract	Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image datasets establish the advantages of our method over state-of-the-art dictionary learning methods.
Tasks	Dictionary Learning, Image Classification
Published	2016-10-27
URL	http://arxiv.org/abs/1610.08606v3
PDF	http://arxiv.org/pdf/1610.08606v3.pdf
PWC	https://paperswithcode.com/paper/fast-low-rank-shared-dictionary-learning-for
Repo	https://github.com/tiepvupsu/DICTOL
Framework	none

Learning a low-rank shared dictionary for object classification


Title	Learning a low-rank shared dictionary for object classification
Authors	Tiep H. Vu, Vishal Monga
Abstract	Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. Inspired by this observation, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we propose a new fast and accurate algorithm to solve the sparse coding problems in the learning step, accelerating its convergence. The said algorithm could also be applied to FDDL and its extensions. Experimental results on widely used image databases establish the advantages of our method over state-of-the-art dictionary learning methods.
Tasks	Dictionary Learning, Object Classification
Published	2016-01-31
URL	http://arxiv.org/abs/1602.00310v2
PDF	http://arxiv.org/pdf/1602.00310v2.pdf
PWC	https://paperswithcode.com/paper/learning-a-low-rank-shared-dictionary-for
Repo	https://github.com/tiepvupsu/DICTOL
Framework	none

One-shot Learning with Memory-Augmented Neural Networks


Title	One-shot Learning with Memory-Augmented Neural Networks
Authors	Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap
Abstract	Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms.
Tasks	One-Shot Learning
Published	2016-05-19
URL	http://arxiv.org/abs/1605.06065v1
PDF	http://arxiv.org/pdf/1605.06065v1.pdf
PWC	https://paperswithcode.com/paper/one-shot-learning-with-memory-augmented
Repo	https://github.com/ash3n/One-shot-Memory-Augmented-NN
Framework	tf
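
The abstract refers to accessing external memory purely by content. A minimal sketch of a content-based read follows, assuming cosine similarity between a key and each memory row and a softmax over those similarities. It does not attempt to reproduce the specific access method the paper introduces, and all names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity with a small constant guarding against zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def content_read(memory, key):
    """Read from memory with weights determined only by similarity to the key.

    memory: list of rows, each a list of floats of the same length as key
    key:    query vector produced by the controller
    """
    sims = [cosine(key, row) for row in memory]
    # Softmax over similarities gives the read weights.
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    z = sum(exps)
    weights = [e / z for e in exps]
    # The read vector is the weighted sum of memory rows.
    dim = len(memory[0])
    read = [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(dim)]
    return read, weights


memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
read, weights = content_read(memory, key=[1.0, 0.0])
```

The row most similar to the key receives the largest weight, so information stored after a single presentation can be retrieved without any location-based addressing.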

Towards a Neural Statistician


Title	Towards a Neural Statistician
Authors	Harrison Edwards, Amos Storkey
Abstract	An efficient learner is one who reuses what they already know to tackle a new problem. For a machine learner, this means understanding the similarities amongst datasets. In order to do this, one must take seriously the idea of working with datasets, rather than datapoints, as the key objects to model. Towards this goal, we demonstrate an extension of a variational autoencoder that can learn a method for computing representations, or statistics, of datasets in an unsupervised fashion. The network is trained to produce statistics that encapsulate a generative model for each dataset. Hence the network enables efficient learning from new datasets for both unsupervised and supervised tasks. We show that we are able to learn statistics that can be used for: clustering datasets, transferring generative models to new datasets, selecting representative samples of datasets and classifying previously unseen classes. We refer to our model as a neural statistician, and by this we mean a neural network that can learn to compute summary statistics of datasets without supervision.
Tasks	Few-Shot Image Classification
Published	2016-06-07
URL	http://arxiv.org/abs/1606.02185v2
PDF	http://arxiv.org/pdf/1606.02185v2.pdf
PWC	https://paperswithcode.com/paper/towards-a-neural-statistician
Repo	https://github.com/cravingoxygen/neuralstat
Framework	pytorch

SOL: A Library for Scalable Online Learning Algorithms


Title	SOL: A Library for Scalable Online Learning Algorithms
Authors	Yue Wu, Steven C. H. Hoi, Chenghao Liu, Jing Lu, Doyen Sahoo, Nenghai Yu
Abstract	SOL is an open-source library for scalable online learning algorithms, and is particularly suitable for learning with high-dimensional data. The library provides a family of regular and sparse online learning algorithms for large-scale binary and multi-class classification tasks with high efficiency, scalability, portability, and extensibility. SOL was implemented in C++, and provided with a collection of easy-to-use command-line tools, python wrappers and library calls for users and developers, as well as comprehensive documents for both beginners and advanced users. SOL is not only a practical machine learning toolbox, but also a comprehensive experimental platform for online learning research. Experiments demonstrate that SOL is highly efficient and scalable for large-scale machine learning with high-dimensional data.
Tasks
Published	2016-10-28
URL	http://arxiv.org/abs/1610.09083v1
PDF	http://arxiv.org/pdf/1610.09083v1.pdf
PWC	https://paperswithcode.com/paper/sol-a-library-for-scalable-online-learning
Repo	https://github.com/LIBOL/SOL
Framework	none

Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning


Title	Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning
Authors	Suyoun Kim, Takaaki Hori, Shinji Watanabe
Abstract	Recently, there has been an increasing interest in end-to-end speech recognition that directly transcribes speech to text without any predefined alignments. One approach is the attention-based encoder-decoder framework that learns a mapping between variable-length input and output sequences in one step using a purely data-driven method. The attention model has often been shown to improve the performance over another end-to-end approach, the Connectionist Temporal Classification (CTC), mainly because it explicitly uses the history of the target character without any conditional independence assumptions. However, we observed that the attention model performs poorly in noisy conditions and is hard to train in the initial training stage with long input sequences. This is because the attention model is too flexible to predict proper alignments in such cases due to the lack of left-to-right constraints as used in CTC. This paper presents a novel method for end-to-end speech recognition to improve robustness and achieve fast convergence by using a joint CTC-attention model within the multi-task learning framework, thereby mitigating the alignment issue. An experiment on the WSJ and CHiME-4 tasks demonstrates its advantages over both the CTC and attention-based encoder-decoder baselines, showing 5.4-14.6% relative improvements in Character Error Rate (CER).
Tasks	End-To-End Speech Recognition, Multi-Task Learning, Speech Recognition
Published	2016-09-21
URL	http://arxiv.org/abs/1609.06773v2
PDF	http://arxiv.org/pdf/1609.06773v2.pdf
PWC	https://paperswithcode.com/paper/joint-ctc-attention-based-end-to-end-speech
Repo	https://github.com/Alexander-H-Liu/End-to-end-ASR-Pytorch
Framework	pytorch
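
The multi-task objective mentioned in the abstract can be illustrated as a weighted combination of the two losses. The sketch below assumes a simple linear interpolation controlled by a single weight; the function name, the parameter name `ctc_weight`, and its default value are illustrative and are not taken from the abstract or the linked repository.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, ctc_weight=0.2):
    """Interpolate between a CTC loss and an attention decoder loss.

    ctc_loss:       loss from the CTC branch for the current batch
    attention_loss: loss from the attention-based decoder for the same batch
    ctc_weight:     share given to CTC, in [0, 1] (default is an arbitrary example)
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


# Setting ctc_weight to 0 or 1 recovers the attention-only or CTC-only objective.
combined = joint_ctc_attention_loss(10.0, 20.0, ctc_weight=0.2)
```

In a real training loop both losses would be computed from a shared encoder, so the CTC term can supply the left-to-right alignment constraint that the abstract says the attention model lacks.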