Paper Group AWR 87
Pointer Sentinel Mixture Models
Title | Pointer Sentinel Mixture Models |
Authors | Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher |
Abstract | Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies. Even then they struggle to predict rare or unseen words even if the context makes the prediction unambiguous. We introduce the pointer sentinel mixture architecture for neural sequence models which has the ability to either reproduce a word from the recent context or produce a word from a standard softmax classifier. Our pointer sentinel-LSTM model achieves state of the art language modeling performance on the Penn Treebank (70.9 perplexity) while using far fewer parameters than a standard softmax LSTM. In order to evaluate how well language models can exploit longer contexts and deal with more realistic vocabularies and larger corpora we also introduce the freely available WikiText corpus. |
Tasks | Language Modelling |
Published | 2016-09-26 |
URL | http://arxiv.org/abs/1609.07843v1 |
http://arxiv.org/pdf/1609.07843v1.pdf | |
PWC | https://paperswithcode.com/paper/pointer-sentinel-mixture-models |
Repo | https://github.com/elanmart/psmm |
Framework | pytorch |
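The mixing step described in the abstract above, choosing between copying a word from the recent context and producing one from the softmax classifier, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the gate is the softmax mass assigned to an extra "sentinel" slot appended to the pointer scores, and the function and argument names (`pointer_sentinel_mixture`, `pointer_scores`, `sentinel_score`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_sentinel_mixture(vocab_logits, context_ids, pointer_scores, sentinel_score):
    """Mix a vocabulary softmax with a pointer distribution over context words.

    Assumption (illustrative): the gate is the probability mass that a joint
    softmax over [pointer_scores..., sentinel_score] gives to the sentinel.
    """
    p_vocab = softmax(vocab_logits)
    attn = softmax(list(pointer_scores) + [sentinel_score])
    gate = attn[-1]  # mass handed to the softmax classifier

    # Start from the gated vocabulary distribution.
    mixed = [gate * p for p in p_vocab]
    # Add pointer mass to every word that occurs in the recent context.
    for word_id, weight in zip(context_ids, attn[:-1]):
        mixed[word_id] += weight
    return mixed, gate


# Toy example: vocabulary of 4 words, context contains word 2 twice and word 0 once.
probs, gate = pointer_sentinel_mixture(
    vocab_logits=[0.1, 0.2, -1.0, 0.3],
    context_ids=[2, 2, 0],
    pointer_scores=[1.5, 1.0, 0.2],
    sentinel_score=0.5,
)
```

Because the pointer weights and the gate come from one softmax, the mixed output is a valid probability distribution without extra normalisation. A word that is unlikely under the classifier (word 2 here) can still receive high probability if it appears in the context.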
RHOG: A Refinement-Operator Library for Directed Labeled Graphs
Title | RHOG: A Refinement-Operator Library for Directed Labeled Graphs |
Authors | Santiago Ontañón |
Abstract | This document provides the foundations behind the functionality provided by the $\rho$G library (https://github.com/santiontanon/RHOG), focusing on the basic operations the library provides: subsumption, refinement of directed labeled graphs, and distance/similarity assessment between directed labeled graphs. $\rho$G development was initially supported by the National Science Foundation, by the EAGER grant IIS-1551338. |
Tasks | |
Published | 2016-04-23 |
URL | http://arxiv.org/abs/1604.06954v1 |
http://arxiv.org/pdf/1604.06954v1.pdf | |
PWC | https://paperswithcode.com/paper/rhog-a-refinement-operator-library-for |
Repo | https://github.com/santiontanon/RHOG |
Framework | none |
Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment
Title | Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment |
Authors | Sebastian Bosse, Dominique Maniry, Klaus-Robert Müller, Thomas Wiegand, Wojciech Samek |
Abstract | We present a deep neural network-based approach to image quality assessment (IQA). The network is trained end-to-end and comprises ten convolutional layers and five pooling layers for feature extraction, and two fully connected layers for regression, which makes it significantly deeper than related IQA models. Unique features of the proposed architecture are that: 1) with slight adaptations it can be used in a no-reference (NR) as well as in a full-reference (FR) IQA setting and 2) it allows for joint learning of local quality and local weights, i.e., relative importance of local quality to the global quality estimate, in a unified framework. Our approach is purely data-driven and does not rely on hand-crafted features or other types of prior domain knowledge about the human visual system or image statistics. We evaluate the proposed approach on the LIVE, CSIQ, and TID2013 databases as well as the LIVE In the Wild Image Quality Challenge Database and show superior performance to state-of-the-art NR and FR IQA methods. Finally, cross-database evaluation shows a high ability to generalize between different databases, indicating a high robustness of the learned features. |
Tasks | Image Quality Assessment |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01697v2 |
http://arxiv.org/pdf/1612.01697v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-for-no-reference-and |
Repo | https://github.com/dmaniry/deepIQA |
Framework | none |
Learning to Protect Communications with Adversarial Neural Cryptography
Title | Learning to Protect Communications with Adversarial Neural Cryptography |
Authors | Martín Abadi, David G. Andersen |
Abstract | We ask whether neural networks can learn to use secret keys to protect information from other neural networks. Specifically, we focus on ensuring confidentiality properties in a multiagent system, and we specify those properties in terms of an adversary. Thus, a system may consist of neural networks named Alice and Bob, and we aim to limit what a third neural network named Eve learns from eavesdropping on the communication between Alice and Bob. We do not prescribe specific cryptographic algorithms to these neural networks; instead, we train end-to-end, adversarially. We demonstrate that the neural networks can learn how to perform forms of encryption and decryption, and also how to apply these operations selectively in order to meet confidentiality goals. |
Tasks | |
Published | 2016-10-21 |
URL | http://arxiv.org/abs/1610.06918v1 |
http://arxiv.org/pdf/1610.06918v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-protect-communications-with |
Repo | https://github.com/IBM/MAX-Adversarial-Cryptography |
Framework | tf |
Bag of Tricks for Efficient Text Classification
Title | Bag of Tricks for Efficient Text Classification |
Authors | Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov |
Abstract | This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute. |
Tasks | Sentiment Analysis, Text Classification |
Published | 2016-07-06 |
URL | http://arxiv.org/abs/1607.01759v3 |
http://arxiv.org/pdf/1607.01759v3.pdf | |
PWC | https://paperswithcode.com/paper/bag-of-tricks-for-efficient-text |
Repo | https://github.com/DW-yejing/fasttext4j-jdk6 |
Framework | none |
Numerical Coding of Nominal Data
Title | Numerical Coding of Nominal Data |
Authors | Zenon Gniazdowski, Michal Grabowski |
Abstract | In this paper, a novel approach for coding nominal data is proposed. For the given nominal data, a rank in the form of a complex number is assigned. The proposed method does not lose any information about the attribute and brings other properties previously unknown. The approach based on these new properties can be used for classification. The analyzed example shows that classification with the use of coded nominal data, or of both numerical and coded nominal data, is more effective than classification that uses only numerical data. |
Tasks | |
Published | 2016-01-08 |
URL | http://arxiv.org/abs/1601.01966v1 |
http://arxiv.org/pdf/1601.01966v1.pdf | |
PWC | https://paperswithcode.com/paper/numerical-coding-of-nominal-data |
Repo | https://github.com/IrisStark/ComplexEncoder |
Framework | none |
Layer Normalization
Title | Layer Normalization |
Authors | Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton |
Abstract | Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques. |
Tasks | |
Published | 2016-07-21 |
URL | http://arxiv.org/abs/1607.06450v1 |
http://arxiv.org/pdf/1607.06450v1.pdf | |
PWC | https://paperswithcode.com/paper/layer-normalization |
Repo | https://github.com/jiamings/fast-weights |
Framework | tf |
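The computation described in the abstract above (mean and variance taken over all summed inputs of one layer for a single training case, followed by a per-neuron gain and bias) can be sketched in plain Python. This is a minimal illustration rather than the authors' code; the small constant `eps` added for numerical stability is an assumption not mentioned in the abstract, and the function name is hypothetical.

```python
import math


def layer_norm(summed_inputs, gain, bias, eps=1e-5):
    """Normalize one layer's summed inputs for a single training case.

    summed_inputs: pre-activation values, one per neuron in the layer.
    gain, bias:    per-neuron adaptive parameters applied after normalization.
    eps:           stability constant (an assumption, not from the abstract).
    """
    n = len(summed_inputs)
    mean = sum(summed_inputs) / n
    var = sum((a - mean) ** 2 for a in summed_inputs) / n
    std = math.sqrt(var + eps)
    # The result is what would be fed to the non-linearity.
    return [g * (a - mean) / std + b
            for a, g, b in zip(summed_inputs, gain, bias)]


# One training case, a layer of four neurons, identity gain and zero bias.
normalized = layer_norm([1.0, 2.0, 3.0, 4.0], gain=[1.0] * 4, bias=[0.0] * 4)
```

The statistics depend only on the current case, so the same function applies unchanged at training and test time, and in a recurrent network it can be called separately at each time step.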
Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Title | Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision |
Authors | Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee |
Abstract | Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects is still an open problem due to the high dimensionality of the data and many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent’s perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the perspective transformation. More importantly, the projection loss enables the unsupervised learning using 2D observation without explicit 3D supervision. We demonstrate the ability of the model in generating 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved. |
Tasks | 3D Object Reconstruction, Object Reconstruction |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00814v3 |
http://arxiv.org/pdf/1612.00814v3.pdf | |
PWC | https://paperswithcode.com/paper/perspective-transformer-nets-learning-single |
Repo | https://github.com/tensorflow/models/tree/master/research/ptn |
Framework | tf |
Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
Title | Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification |
Authors | Justin Salamon, Juan Pablo Bello |
Abstract | The ability of deep convolutional neural networks (CNN) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: first, we propose a deep convolutional neural network architecture for environmental sound classification. Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture. Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification. We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation. Finally, we examine the influence of each augmentation on the model’s classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation. |
Tasks | Data Augmentation, Dictionary Learning, Environmental Sound Classification |
Published | 2016-08-15 |
URL | http://arxiv.org/abs/1608.04363v2 |
http://arxiv.org/pdf/1608.04363v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-convolutional-neural-networks-and-data-1 |
Repo | https://github.com/justinsalamon/UrbanSound8K-JAMS |
Framework | none |
Fast Low-rank Shared Dictionary Learning for Image Classification
Title | Fast Low-rank Shared Dictionary Learning for Image Classification |
Authors | Tiep Vu, Vishal Monga |
Abstract | Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. This observation has been exploited partially in a recently proposed dictionary learning framework by separating the particularity and the commonality (COPAR). Inspired by this, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification with more intuitive constraints. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we develop new fast and accurate algorithms to solve the subproblems in the learning step, accelerating its convergence. The said algorithms could also be applied to FDDL and its extensions. The efficiencies of these algorithms are theoretically and experimentally verified by comparing their complexities and running time with those of other well-known dictionary learning methods. Experimental results on widely used image datasets establish the advantages of our method over state-of-the-art dictionary learning methods. |
Tasks | Dictionary Learning, Image Classification |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.08606v3 |
http://arxiv.org/pdf/1610.08606v3.pdf | |
PWC | https://paperswithcode.com/paper/fast-low-rank-shared-dictionary-learning-for |
Repo | https://github.com/tiepvupsu/DICTOL |
Framework | none |
Learning a low-rank shared dictionary for object classification
Title | Learning a low-rank shared dictionary for object classification |
Authors | Tiep H. Vu, Vishal Monga |
Abstract | Despite the fact that different objects possess distinct class-specific features, they also usually share common patterns. Inspired by this observation, we propose a novel method to explicitly and simultaneously learn a set of common patterns as well as class-specific features for classification. Our dictionary learning framework is hence characterized by both a shared dictionary and particular (class-specific) dictionaries. For the shared dictionary, we enforce a low-rank constraint, i.e. claim that its spanning subspace should have low dimension and the coefficients corresponding to this dictionary should be similar. For the particular dictionaries, we impose on them the well-known constraints stated in the Fisher discrimination dictionary learning (FDDL). Further, we propose a new fast and accurate algorithm to solve the sparse coding problems in the learning step, accelerating its convergence. The said algorithm could also be applied to FDDL and its extensions. Experimental results on widely used image databases establish the advantages of our method over state-of-the-art dictionary learning methods. |
Tasks | Dictionary Learning, Object Classification |
Published | 2016-01-31 |
URL | http://arxiv.org/abs/1602.00310v2 |
http://arxiv.org/pdf/1602.00310v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-low-rank-shared-dictionary-for |
Repo | https://github.com/tiepvupsu/DICTOL |
Framework | none |
One-shot Learning with Memory-Augmented Neural Networks
Title | One-shot Learning with Memory-Augmented Neural Networks |
Authors | Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap |
Abstract | Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of “one-shot learning.” Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information without catastrophic interference. Architectures with augmented memory capacities, such as Neural Turing Machines (NTMs), offer the ability to quickly encode and retrieve new information, and hence can potentially obviate the downsides of conventional models. Here, we demonstrate the ability of a memory-augmented neural network to rapidly assimilate new data, and leverage this data to make accurate predictions after only a few samples. We also introduce a new method for accessing an external memory that focuses on memory content, unlike previous methods that additionally use memory location-based focusing mechanisms. |
Tasks | One-Shot Learning |
Published | 2016-05-19 |
URL | http://arxiv.org/abs/1605.06065v1 |
http://arxiv.org/pdf/1605.06065v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-learning-with-memory-augmented |
Repo | https://github.com/ash3n/One-shot-Memory-Augmented-NN |
Framework | tf |
Towards a Neural Statistician
Title | Towards a Neural Statistician |
Authors | Harrison Edwards, Amos Storkey |
Abstract | An efficient learner is one who reuses what they already know to tackle a new problem. For a machine learner, this means understanding the similarities amongst datasets. In order to do this, one must take seriously the idea of working with datasets, rather than datapoints, as the key objects to model. Towards this goal, we demonstrate an extension of a variational autoencoder that can learn a method for computing representations, or statistics, of datasets in an unsupervised fashion. The network is trained to produce statistics that encapsulate a generative model for each dataset. Hence the network enables efficient learning from new datasets for both unsupervised and supervised tasks. We show that we are able to learn statistics that can be used for: clustering datasets, transferring generative models to new datasets, selecting representative samples of datasets and classifying previously unseen classes. We refer to our model as a neural statistician, and by this we mean a neural network that can learn to compute summary statistics of datasets without supervision. |
Tasks | Few-Shot Image Classification |
Published | 2016-06-07 |
URL | http://arxiv.org/abs/1606.02185v2 |
http://arxiv.org/pdf/1606.02185v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-a-neural-statistician |
Repo | https://github.com/cravingoxygen/neuralstat |
Framework | pytorch |
SOL: A Library for Scalable Online Learning Algorithms
Title | SOL: A Library for Scalable Online Learning Algorithms |
Authors | Yue Wu, Steven C. H. Hoi, Chenghao Liu, Jing Lu, Doyen Sahoo, Nenghai Yu |
Abstract | SOL is an open-source library for scalable online learning algorithms, and is particularly suitable for learning with high-dimensional data. The library provides a family of regular and sparse online learning algorithms for large-scale binary and multi-class classification tasks with high efficiency, scalability, portability, and extensibility. SOL was implemented in C++, and provided with a collection of easy-to-use command-line tools, python wrappers and library calls for users and developers, as well as comprehensive documents for both beginners and advanced users. SOL is not only a practical machine learning toolbox, but also a comprehensive experimental platform for online learning research. Experiments demonstrate that SOL is highly efficient and scalable for large-scale machine learning with high-dimensional data. |
Tasks | |
Published | 2016-10-28 |
URL | http://arxiv.org/abs/1610.09083v1 |
http://arxiv.org/pdf/1610.09083v1.pdf | |
PWC | https://paperswithcode.com/paper/sol-a-library-for-scalable-online-learning |
Repo | https://github.com/LIBOL/SOL |
Framework | none |
Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning
Title | Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning |
Authors | Suyoun Kim, Takaaki Hori, Shinji Watanabe |
Abstract | Recently, there has been an increasing interest in end-to-end speech recognition that directly transcribes speech to text without any predefined alignments. One approach is the attention-based encoder-decoder framework that learns a mapping between variable-length input and output sequences in one step using a purely data-driven method. The attention model has often been shown to improve the performance over another end-to-end approach, the Connectionist Temporal Classification (CTC), mainly because it explicitly uses the history of the target character without any conditional independence assumptions. However, we observed that the attention model performs poorly in noisy conditions and is hard to train in the initial training stage with long input sequences. This is because the attention model is too flexible to predict proper alignments in such cases due to the lack of left-to-right constraints as used in CTC. This paper presents a novel method for end-to-end speech recognition to improve robustness and achieve fast convergence by using a joint CTC-attention model within the multi-task learning framework, thereby mitigating the alignment issue. Experiments on the WSJ and CHiME-4 tasks demonstrate its advantages over both the CTC and attention-based encoder-decoder baselines, showing 5.4-14.6% relative improvements in Character Error Rate (CER). |
Tasks | End-To-End Speech Recognition, Multi-Task Learning, Speech Recognition |
Published | 2016-09-21 |
URL | http://arxiv.org/abs/1609.06773v2 |
http://arxiv.org/pdf/1609.06773v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-ctc-attention-based-end-to-end-speech |
Repo | https://github.com/Alexander-H-Liu/End-to-end-ASR-Pytorch |
Framework | pytorch |
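The multi-task objective mentioned in the abstract above can be illustrated as a weighted combination of the two losses. This is a minimal sketch under the assumption that the objective is a linear interpolation between the CTC loss and the attention loss; the function name, argument names, and the default weight of 0.2 are hypothetical, and the two input losses are assumed to be computed elsewhere.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, ctc_weight=0.2):
    """Interpolate between a CTC loss and an attention decoder loss.

    ctc_weight = 0 gives a pure attention model, ctc_weight = 1 a pure CTC
    model. The default value is an illustrative assumption.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


# Example with made-up per-utterance loss values.
loss = joint_ctc_attention_loss(ctc_loss=10.0, attention_loss=20.0, ctc_weight=0.5)
```

Keeping the weight in the closed interval [0, 1] makes both baselines named in the abstract special cases of the same function.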