Paper Group AWR 102
Convolutional Recurrent Neural Networks for Music Classification. Recurrent Dropout without Memory Loss. Geometric deep learning on graphs and manifolds using mixture model CNNs. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. An Efficient Character-Level Neural Machine Translation. NIPS 2016 Tutorial: Generative Adversarial Networks …
Convolutional Recurrent Neural Networks for Music Classification
Title | Convolutional Recurrent Neural Networks for Music Classification |
Authors | Keunwoo Choi, George Fazekas, Mark Sandler, Kyunghyun Cho |
Abstract | We introduce a convolutional recurrent neural network (CRNN) for music tagging. CRNNs take advantage of convolutional neural networks (CNNs) for local feature extraction and recurrent neural networks for temporal summarisation of the extracted features. We compare CRNN with three CNN structures that have been used for music tagging while controlling the number of parameters with respect to their performance and training time per sample. Overall, we found that CRNNs show a strong performance with respect to the number of parameters and training time, indicating the effectiveness of their hybrid structure in music feature extraction and feature summarisation. |
Tasks | Music Classification |
Published | 2016-09-14 |
URL | http://arxiv.org/abs/1609.04243v3 |
PDF | http://arxiv.org/pdf/1609.04243v3.pdf |
PWC | https://paperswithcode.com/paper/convolutional-recurrent-neural-networks-for-1 |
Repo | https://github.com/keunwoochoi/MSD_split_for_tagging |
Framework | none |
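
A minimal PyTorch sketch of the hybrid idea: a small CNN stack extracts local spectro-temporal features from a mel-spectrogram, and a GRU summarises them over time into tag probabilities. Layer sizes and the 50-tag output are illustrative placeholders, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """CNN front-end for local feature extraction, GRU for temporal summarisation."""
    def __init__(self, n_mels=96, n_tags=50):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((2, 2)),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d((2, 2)),
        )
        # After two (2, 2) poolings the mel axis shrinks by a factor of 4.
        self.gru = nn.GRU(input_size=64 * (n_mels // 4), hidden_size=64,
                          batch_first=True)
        self.out = nn.Linear(64, n_tags)

    def forward(self, spec):                   # spec: (batch, 1, n_mels, time)
        h = self.conv(spec)                    # (batch, 64, n_mels//4, time//4)
        h = h.permute(0, 3, 1, 2).flatten(2)   # frame sequence for the GRU
        _, last = self.gru(h)                  # temporal summarisation
        return torch.sigmoid(self.out(last.squeeze(0)))  # multi-label tags

tags = CRNN()(torch.randn(2, 1, 96, 128))  # two 128-frame mel-spectrograms
print(tags.shape)                          # torch.Size([2, 50])
```
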
Recurrent Dropout without Memory Loss
Title | Recurrent Dropout without Memory Loss |
Authors | Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth |
Abstract | This paper presents a novel approach to recurrent neural network (RNN) regularization. Differently from the widely adopted dropout method, which is applied to *forward* connections of feed-forward architectures or RNNs, we propose to drop neurons directly in *recurrent* connections in a way that does not cause loss of long-term memory. Our approach is as easy to implement and apply as regular feed-forward dropout, and we demonstrate its effectiveness for the Long Short-Term Memory (LSTM) network, the most popular type of RNN cell. Our experiments on NLP benchmarks show consistent improvements even when combined with conventional feed-forward dropout. |
Tasks | |
Published | 2016-03-16 |
URL | http://arxiv.org/abs/1603.05118v2 |
PDF | http://arxiv.org/pdf/1603.05118v2.pdf |
PWC | https://paperswithcode.com/paper/recurrent-dropout-without-memory-loss |
Repo | https://github.com/stas-semeniuta/drop-rnn |
Framework | none |
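
The key move is where the dropout mask lands: on the candidate update g_t rather than on the hidden or cell state, so the forget-gated memory f ⊙ c_{t-1} is never erased. A single-step sketch below; the fused-gate weight layout is a conventional choice, not the paper's code.

```python
import torch
import torch.nn.functional as F

def lstm_step_recurrent_dropout(x, h, c, W, U, b, p=0.25, training=True):
    """One LSTM step with dropout applied to the candidate update g_t only.

    Dropping g_t (rather than h_t or c_t) means f * c_prev is never zeroed,
    so long-term memory survives regularisation.
    Shapes: x (B, D_in), h/c (B, D), W (D_in, 4D), U (D, 4D), b (4D,).
    """
    gates = x @ W + h @ U + b
    i, f, o, g = gates.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    g = F.dropout(g, p=p, training=training)   # the paper's proposal
    c_new = f * c + i * g                      # memory itself is untouched
    h_new = o * torch.tanh(c_new)
    return h_new, c_new

B, D_in, D = 4, 8, 16
W, U, b = torch.randn(D_in, 4 * D), torch.randn(D, 4 * D), torch.zeros(4 * D)
h, c = torch.zeros(B, D), torch.zeros(B, D)
h, c = lstm_step_recurrent_dropout(torch.randn(B, D_in), h, c, W, U, b)
```
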
Geometric deep learning on graphs and manifolds using mixture model CNNs
Title | Geometric deep learning on graphs and manifolds using mixture model CNNs |
Authors | Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, Michael M. Bronstein |
Abstract | Deep learning has achieved a remarkable performance breakthrough in several fields, most notably in speech recognition, natural language processing, and computer vision. In particular, convolutional neural network (CNN) architectures currently produce state-of-the-art performance on a variety of image analysis tasks such as object detection and recognition. Most deep learning research has so far focused on dealing with 1D, 2D, or 3D Euclidean-structured data such as acoustic signals, images, or videos. Recently, there has been an increasing interest in geometric deep learning, which attempts to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications in network analysis, computational social science, and computer graphics. In this paper, we propose a unified framework for generalizing CNN architectures to non-Euclidean domains (graphs and manifolds) and learning local, stationary, and compositional task-specific features. We show that various non-Euclidean CNN methods previously proposed in the literature can be considered as particular instances of our framework. We test the proposed method on standard tasks from the realms of image, graph, and 3D shape analysis and show that it consistently outperforms previous approaches. |
Tasks | 3D Shape Analysis, Document Classification, Object Detection, Speech Recognition |
Published | 2016-11-25 |
URL | http://arxiv.org/abs/1611.08402v3 |
PDF | http://arxiv.org/pdf/1611.08402v3.pdf |
PWC | https://paperswithcode.com/paper/geometric-deep-learning-on-graphs-and |
Repo | https://github.com/HeapHop30/graph-attention-nets |
Framework | tf |
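
A sketch of one mixture-model CNN (MoNet) layer under simplifying assumptions (diagonal Gaussian kernels, sum aggregation): each of J learnable Gaussians, evaluated on an edge's pseudo-coordinates u(x, y), weights how much a neighbour contributes to the patch operator.

```python
import torch
import torch.nn as nn

class MoNetLayer(nn.Module):
    """Sketch of a MoNet layer on a graph with diagonal Gaussian kernels."""
    def __init__(self, in_dim, out_dim, n_kernels=8, coord_dim=2):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(n_kernels, coord_dim))
        self.log_sigma = nn.Parameter(torch.zeros(n_kernels, coord_dim))
        self.lin = nn.Linear(in_dim * n_kernels, out_dim)

    def forward(self, x, edge_index, pseudo):
        # x: (N, in_dim); edge_index: (2, E) src->dst; pseudo: (E, coord_dim)
        src, dst = edge_index
        diff = pseudo.unsqueeze(1) - self.mu                   # (E, J, coord_dim)
        w = torch.exp(-0.5 * (diff / self.log_sigma.exp()).pow(2).sum(-1))
        msg = w.unsqueeze(-1) * x[src].unsqueeze(1)            # (E, J, in_dim)
        agg = torch.zeros(x.size(0), *msg.shape[1:])
        agg.index_add_(0, dst, msg)                            # sum over neighbours
        return self.lin(agg.flatten(1))

x = torch.randn(5, 3)
edge_index = torch.tensor([[0, 1, 2, 3, 4], [1, 2, 3, 4, 0]])
pseudo = torch.rand(5, 2)                                      # e.g. degree-based coords
print(MoNetLayer(3, 16)(x, edge_index, pseudo).shape)          # torch.Size([5, 16])
```
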
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
Title | Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization |
Authors | Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar |
Abstract | Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters. While recent approaches use Bayesian optimization to adaptively select configurations, we focus on speeding up random search through adaptive resource allocation and early-stopping. We formulate hyperparameter optimization as a pure-exploration non-stochastic infinite-armed bandit problem where a predefined resource like iterations, data samples, or features is allocated to randomly sampled configurations. We introduce a novel algorithm, Hyperband, for this framework and analyze its theoretical properties, providing several desirable guarantees. Furthermore, we compare Hyperband with popular Bayesian optimization methods on a suite of hyperparameter optimization problems. We observe that Hyperband can provide over an order-of-magnitude speedup over our competitor set on a variety of deep-learning and kernel-based learning problems. |
Tasks | Hyperparameter Optimization |
Published | 2016-03-21 |
URL | http://arxiv.org/abs/1603.06560v4 |
PDF | http://arxiv.org/pdf/1603.06560v4.pdf |
PWC | https://paperswithcode.com/paper/hyperband-a-novel-bandit-based-approach-to |
Repo | https://github.com/zygmuntz/hyperband |
Framework | none |
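
The algorithm itself is compact. Below is a sketch that follows the paper's published pseudocode (eta = 3, max_iter = 81): several successive-halving brackets that trade off the number of configurations n against the budget r given to each. `sample_config` and `run_config` are user-supplied stand-ins, and the toy objective at the bottom is invented for illustration.

```python
import math
import random

def hyperband(sample_config, run_config, max_iter=81, eta=3):
    """sample_config() -> one random hyperparameter setting;
    run_config(cfg, budget) -> validation loss after `budget` resource units."""
    s_max = int(math.log(max_iter) / math.log(eta))
    B = (s_max + 1) * max_iter
    best = (float("inf"), None)
    for s in reversed(range(s_max + 1)):
        n = int(math.ceil(B / max_iter / (s + 1) * eta ** s))
        r = max_iter * eta ** (-s)
        configs = [sample_config() for _ in range(n)]
        for i in range(s + 1):                        # successive halving
            r_i = int(r * eta ** i)
            losses = [run_config(c, r_i) for c in configs]
            top = min(zip(losses, configs), key=lambda p: p[0])
            if top[0] < best[0]:
                best = top
            k = max(1, int(n * eta ** (-i)) // eta)   # keep the best 1/eta
            ranked = sorted(zip(losses, configs), key=lambda p: p[0])
            configs = [c for _, c in ranked[:k]]
    return best

# Toy objective: loss falls with budget toward a config-dependent floor.
loss, cfg = hyperband(lambda: random.random(), lambda c, b: c + 1.0 / b)
print(round(loss, 3), round(cfg, 3))
```
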
An Efficient Character-Level Neural Machine Translation
Title | An Efficient Character-Level Neural Machine Translation |
Authors | Shenjian Zhao, Zhihua Zhang |
Abstract | Neural machine translation aims at building a single large neural network that can be trained to maximize translation performance. The encoder-decoder architecture with an attention mechanism achieves a translation performance comparable to the existing state-of-the-art phrase-based systems on the task of English-to-French translation. However, the use of a large vocabulary becomes the bottleneck in both training and improving the performance. In this paper, we propose an efficient architecture to train a deep character-level neural machine translation model by introducing a decimator and an interpolator. The decimator is used to sample the source sequence before encoding, while the interpolator is used to resample after decoding. Such a deep model has two major advantages. It radically avoids the large-vocabulary issue; at the same time, it is much faster and more memory-efficient in training than conventional character-based models. More interestingly, our model is able to translate misspelled words, much as humans do. |
Tasks | Machine Translation |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04738v2 |
PDF | http://arxiv.org/pdf/1608.04738v2.pdf |
PWC | https://paperswithcode.com/paper/an-efficient-character-level-neural-machine |
Repo | https://github.com/SwordYork/DCNMT |
Framework | none |
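
As a loose sketch of the decimation idea only (not the paper's actual decimator/interpolator design), a strided 1-D convolution can shorten the character sequence before the recurrent encoder, so the RNN runs over far fewer steps than raw characters. All sizes here are made up.

```python
import torch
import torch.nn as nn

class DecimatingEncoder(nn.Module):
    """Character embeddings -> strided conv (decimation) -> GRU encoder."""
    def __init__(self, n_chars=128, emb=64, hidden=128, stride=4):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.decimate = nn.Conv1d(emb, hidden, kernel_size=stride * 2,
                                  stride=stride, padding=stride)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, chars):                   # (batch, seq_len) char ids
        e = self.embed(chars).transpose(1, 2)   # (batch, emb, seq_len)
        d = torch.relu(self.decimate(e))        # roughly seq_len/stride steps
        out, _ = self.gru(d.transpose(1, 2))
        return out

enc = DecimatingEncoder()
print(enc(torch.randint(0, 128, (2, 100))).shape)  # torch.Size([2, 26, 128])
```
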
NIPS 2016 Tutorial: Generative Adversarial Networks
Title | NIPS 2016 Tutorial: Generative Adversarial Networks |
Authors | Ian Goodfellow |
Abstract | This report summarizes the tutorial presented by the author at NIPS 2016 on generative adversarial networks (GANs). The tutorial describes: (1) Why generative modeling is a topic worth studying, (2) how generative models work, and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods. Finally, the tutorial contains three exercises for readers to complete, and the solutions to these exercises. |
Tasks | |
Published | 2016-12-31 |
URL | http://arxiv.org/abs/1701.00160v4 |
PDF | http://arxiv.org/pdf/1701.00160v4.pdf |
PWC | https://paperswithcode.com/paper/nips-2016-tutorial-generative-adversarial |
Repo | https://github.com/Natsu6767/DCGAN-PyTorch |
Framework | pytorch |
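
To make the tutorial's core recipe concrete, here is a minimal GAN training loop in PyTorch on toy 1-D data, using the non-saturating generator loss the tutorial discusses. The tiny MLPs and the N(2, 0.5) target distribution are illustrative stand-ins, not anything from the report.

```python
import torch
import torch.nn as nn

# Discriminator learns data vs. samples; generator is trained to fool it.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0         # data: N(2, 0.5)
    fake = G(torch.randn(64, 8))

    # Discriminator step: push real -> 1, fake -> 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step (non-saturating): make D output 1 on fakes.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(256, 8)).mean().item())       # should drift toward 2.0
```
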
Actionable and Political Text Classification using Word Embeddings and LSTM
Title | Actionable and Political Text Classification using Word Embeddings and LSTM |
Authors | Adithya Rao, Nemanja Spasojevic |
Abstract | In this work, we apply word embeddings and neural networks with Long Short-Term Memory (LSTM) to text classification problems, where the classification criteria are decided by the context of the application. We examine two applications in particular. The first is that of Actionability, where we build models to classify social media messages from customers of service providers as Actionable or Non-Actionable. We build models for over 30 different languages for actionability, and most of the models achieve accuracy around 85%, with some reaching over 90% accuracy. We also show that using LSTM neural networks with word embeddings vastly outperforms traditional techniques. Second, we explore classification of messages with respect to political leaning, where social media messages are classified as Democratic or Republican. The model is able to classify messages with a high accuracy of 87.57%. As part of our experiments, we vary different hyperparameters of the neural networks, and report the effect of such variation on the accuracy. These actionability models have been deployed to production and help company agents provide customer support by prioritizing which messages to respond to. The model for political leaning has been open-sourced for wider use. |
Tasks | Text Classification, Word Embeddings |
Published | 2016-07-08 |
URL | http://arxiv.org/abs/1607.02501v2 |
PDF | http://arxiv.org/pdf/1607.02501v2.pdf |
PWC | https://paperswithcode.com/paper/actionable-and-political-text-classification |
Repo | https://github.com/klout/opendata |
Framework | none |
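
The model family is standard; a minimal PyTorch rendering of the embeddings-plus-LSTM classifier follows. Vocabulary size, dimensions, and the 2-way output are placeholders, not the paper's settings.

```python
import torch
import torch.nn as nn

class LSTMTextClassifier(nn.Module):
    """Word embeddings -> LSTM -> class logits."""
    def __init__(self, vocab_size=10000, emb=100, hidden=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, token_ids):              # (batch, seq_len)
        _, (h, _) = self.lstm(self.embed(token_ids))
        return self.out(h.squeeze(0))          # logits over classes

model = LSTMTextClassifier()
logits = model(torch.randint(1, 10000, (4, 20)))   # 4 messages, 20 tokens each
print(logits.argmax(dim=1))                        # predicted class ids
```
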
Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models
Title | Jointly Learning to Align and Convert Graphemes to Phonemes with Neural Attention Models |
Authors | Shubham Toshniwal, Karen Livescu |
Abstract | We propose an attention-enabled encoder-decoder model for the problem of grapheme-to-phoneme conversion. Most previous work has tackled the problem via joint sequence models that require explicit alignments for training. In contrast, the attention-enabled encoder-decoder model allows for jointly learning to align and convert characters to phonemes. We explore different types of attention models, including global and local attention, and our best models achieve state-of-the-art results on three standard data sets (CMUDict, Pronlex, and NetTalk). |
Tasks | |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06540v1 |
PDF | http://arxiv.org/pdf/1610.06540v1.pdf |
PWC | https://paperswithcode.com/paper/jointly-learning-to-align-and-convert |
Repo | https://github.com/shtoshni92/g2p |
Framework | tf |
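
Global attention, one of the variants the paper compares, reduces to a few lines: score every encoder position against the current decoder state, normalise into alignment weights, and mix the encoder states into a context vector. Dot-product scoring below is one common choice, not necessarily the paper's best configuration.

```python
import torch

def global_attention(dec_state, enc_states):
    """dec_state: (batch, dim) decoder hidden state;
    enc_states: (batch, src_len, dim) grapheme encoder outputs."""
    scores = torch.bmm(enc_states, dec_state.unsqueeze(2)).squeeze(2)
    align = torch.softmax(scores, dim=1)             # soft source alignment
    context = torch.bmm(align.unsqueeze(1), enc_states).squeeze(1)
    return context, align                            # context feeds phoneme prediction

ctx, align = global_attention(torch.randn(2, 64), torch.randn(2, 7, 64))
print(ctx.shape, align.shape)   # torch.Size([2, 64]) torch.Size([2, 7])
```
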
Source Localization on Graphs via l1 Recovery and Spectral Graph Theory
Title | Source Localization on Graphs via l1 Recovery and Spectral Graph Theory |
Authors | Rodrigo Pena, Xavier Bresson, Pierre Vandergheynst |
Abstract | We cast the problem of source localization on graphs as the simultaneous problem of sparse recovery and diffusion kernel learning. An l1 regularization term enforces the sparsity constraint while we recover the sources of diffusion from a single snapshot of the diffusion process. The diffusion kernel is estimated by assuming the process to be as generic as the standard heat diffusion. We show with synthetic data that we can concomitantly learn the diffusion kernel and the sources, given an estimated initialization. We validate our model with cholera mortality and atmospheric tracer diffusion data, showing also that the accuracy of the solution depends on the construction of the graph from the data points. |
Tasks | |
Published | 2016-03-24 |
URL | http://arxiv.org/abs/1603.07584v2 |
PDF | http://arxiv.org/pdf/1603.07584v2.pdf |
PWC | https://paperswithcode.com/paper/source-localization-on-graphs-via-l1-recovery |
Repo | https://github.com/rodrigo-pena/src-localization-graphs |
Framework | none |
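
A sketch of the recovery step with the kernel parameters held fixed: given the heat kernel K = expm(-τL), ISTA solves the l1-regularized least-squares problem for the sparse source vector. The ring graph and the fixed τ are illustrative; the paper additionally learns the diffusion kernel.

```python
import numpy as np
from scipy.linalg import expm

def localize_sources(L, y, tau=1.0, lam=0.1, n_iter=500):
    """ISTA for  min_x 0.5*||K x - y||^2 + lam*||x||_1,  K = expm(-tau*L)."""
    K = expm(-tau * L)
    step = 1.0 / np.linalg.norm(K, 2) ** 2            # 1 / Lipschitz constant
    x = np.zeros(L.shape[0])
    for _ in range(n_iter):
        grad = K.T @ (K @ x - y)
        z = x - step * grad
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return x

# Ring graph with 6 nodes, a single diffusion source at node 2.
A = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
L = np.diag(A.sum(1)) - A
x_true = np.zeros(6); x_true[2] = 1.0
y = expm(-1.0 * L) @ x_true                           # one diffusion snapshot
print(np.round(localize_sources(L, y), 2))            # peak at index 2
```
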
Indoor occupancy estimation from carbon dioxide concentration
Title | Indoor occupancy estimation from carbon dioxide concentration |
Authors | Chaoyang Jiang, Mustafa K. Masood, Yeng Chai Soh, Hua Li |
Abstract | This paper presents an indoor occupancy estimator with which we can estimate the number of indoor occupants in real time from carbon dioxide (CO2) measurements. The estimator is a dynamic model of the occupancy level. To identify the dynamic model, we propose the Feature Scaled Extreme Learning Machine (FS-ELM) algorithm, a variation of the standard Extreme Learning Machine (ELM) that is shown to perform better for the occupancy estimation problem. The measured CO2 concentration suffers from serious spikes. We find that pre-smoothing the CO2 data can greatly improve the estimation accuracy. In real applications, however, we cannot obtain globally smoothed CO2 data in real time. We provide a way to use locally smoothed CO2 data instead, which is available in real time. We introduce a new criterion, the $x$-tolerance accuracy, to assess the occupancy estimator. The proposed occupancy estimator was tested in an office room with 24 cubicles and 11 open seats. The accuracy is up to 94 percent with a tolerance of four occupants. |
Tasks | |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.05962v1 |
PDF | http://arxiv.org/pdf/1607.05962v1.pdf |
PWC | https://paperswithcode.com/paper/indoor-occupancy-estimation-from-carbon |
Repo | https://github.com/wdmdev/valhacks2019 |
Framework | none |
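
For reference, the ELM core that FS-ELM builds on fits in a few lines: a random, untrained hidden layer plus a least-squares solve for the output weights. The feature-scaling variation and the CO2 pre-smoothing from the paper are not reproduced here, and the toy data below is invented.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, seed=0):
    """Standard ELM: random hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))    # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                         # hidden activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # least-squares solve
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy stand-in: map a CO2-like feature vector to an occupant count.
X = np.random.rand(200, 5)
y = X.sum(axis=1) * 3                              # fake occupancy signal
W, b, beta = train_elm(X, y)
print(predict_elm(X[:3], W, b, beta).round(1), y[:3].round(1))
```
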
Content-based image retrieval tutorial
Title | Content-based image retrieval tutorial |
Authors | Joani Mitro |
Abstract | This paper functions as a tutorial for individuals interested in entering the field of information retrieval who do not know where to begin. It describes two fundamental yet efficient image retrieval techniques, the first being k-nearest neighbors (kNN) and the second support vector machines (SVM). The goal is to provide the reader with both the theoretical and practical aspects in order to acquire a better understanding. Along with this tutorial we have also developed the equivalent software using the MATLAB environment in order to illustrate the techniques, so that the reader can have a hands-on experience. |
Tasks | Content-Based Image Retrieval, Image Retrieval, Information Retrieval |
Published | 2016-08-12 |
URL | http://arxiv.org/abs/1608.03811v1 |
PDF | http://arxiv.org/pdf/1608.03811v1.pdf |
PWC | https://paperswithcode.com/paper/content-based-image-retrieval-tutorial |
Repo | https://github.com/tanyakoh/CBIR |
Framework | none |
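
The kNN half of the tutorial reduces to distance ranking over feature vectors. A NumPy sketch follows; the 64-bin histograms are a stand-in for whatever features the accompanying MATLAB code extracts.

```python
import numpy as np

def knn_retrieve(query_feat, db_feats, k=5):
    """Rank database images by Euclidean distance between feature vectors
    (e.g. colour histograms) and return the indices of the k closest."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.argsort(dists)[:k]

db = np.random.rand(1000, 64)                 # 1000 images, 64-bin histograms
query = db[42] + np.random.rand(64) * 0.01    # near-duplicate of image 42
print(knn_retrieve(query, db))                # index 42 should rank first
```
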
Counter-fitting Word Vectors to Linguistic Constraints
Title | Counter-fitting Word Vectors to Linguistic Constraints |
Authors | Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Lina Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, Steve Young |
Abstract | In this work, we present a novel counter-fitting method which injects antonymy and synonymy constraints into vector space representations in order to improve the vectors’ capability for judging semantic similarity. Applying this method to publicly available pre-trained word vectors leads to a new state of the art performance on the SimLex-999 dataset. We also show how the method can be used to tailor the word vector space for the downstream task of dialogue state tracking, resulting in robust improvements across different dialogue domains. |
Tasks | Dialogue State Tracking, Semantic Similarity, Semantic Textual Similarity |
Published | 2016-03-02 |
URL | http://arxiv.org/abs/1603.00892v1 |
PDF | http://arxiv.org/pdf/1603.00892v1.pdf |
PWC | https://paperswithcode.com/paper/counter-fitting-word-vectors-to-linguistic |
Repo | https://github.com/nmrksic/counter-fitting |
Framework | none |
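
A simplified sketch of the attract/repel portion of counter-fitting: updates pull synonym pairs together and push antonym pairs apart until they sit beyond a margin. Euclidean distance and the margin value are simplifications; the paper uses cosine distance and adds a third term that preserves the original vector-space neighbourhoods.

```python
import numpy as np

def counter_fit(vecs, synonyms, antonyms, lr=0.1, n_epochs=40, ant_margin=1.0):
    """Attract synonym pairs, repel antonym pairs past ant_margin."""
    V = {w: v.copy() for w, v in vecs.items()}
    for _ in range(n_epochs):
        for u, w in antonyms:                  # repel while d < ant_margin
            diff = V[u] - V[w]
            d = np.linalg.norm(diff)
            if 1e-9 < d < ant_margin:
                V[u] += lr * diff / d          # step along the unit direction
                V[w] -= lr * diff / d
        for u, w in synonyms:                  # attract: shrink the gap
            diff = V[u] - V[w]
            V[u] -= lr * diff
            V[w] += lr * diff
    return V

vecs = {w: np.random.randn(10) for w in ["cheap", "inexpensive", "expensive"]}
fit = counter_fit(vecs, [("cheap", "inexpensive")], [("cheap", "expensive")])
print(np.linalg.norm(fit["cheap"] - fit["inexpensive"]))  # near zero
print(np.linalg.norm(fit["cheap"] - fit["expensive"]))    # roughly >= 1
```
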
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
Title | Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling |
Authors | Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, Joshua B. Tenenbaum |
Abstract | We study the problem of 3D object generation. We propose a novel framework, namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets. The benefits of our model are three-fold: first, the use of an adversarial criterion, instead of traditional heuristic criteria, enables the generator to capture object structure implicitly and to synthesize high-quality 3D objects; second, the generator establishes a mapping from a low-dimensional probabilistic space to the space of 3D objects, so that we can sample objects without a reference image or CAD models, and explore the 3D object manifold; third, the adversarial discriminator provides a powerful 3D shape descriptor which, learned without supervision, has wide applications in 3D object recognition. Experiments demonstrate that our method generates high-quality 3D objects, and that the features it learns without supervision achieve impressive performance on 3D object recognition, comparable with those of supervised learning methods. |
Tasks | 3D Object Recognition, Object Recognition |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07584v2 |
PDF | http://arxiv.org/pdf/1610.07584v2.pdf |
PWC | https://paperswithcode.com/paper/learning-a-probabilistic-latent-space-of |
Repo | https://github.com/zck119/3dgan-release |
Framework | torch |
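
A sketch of the generator side in PyTorch: transposed 3-D convolutions map a latent vector to a voxel occupancy grid. Channel counts and the 32³ output are illustrative; the paper's generator produces 64³ shapes with a deeper stack.

```python
import torch
import torch.nn as nn

# Latent vector -> voxel grid via transposed 3-D convolutions.
G = nn.Sequential(
    nn.ConvTranspose3d(200, 128, kernel_size=4, stride=1),            # 1 -> 4
    nn.BatchNorm3d(128), nn.ReLU(),
    nn.ConvTranspose3d(128, 64, kernel_size=4, stride=2, padding=1),  # 4 -> 8
    nn.BatchNorm3d(64), nn.ReLU(),
    nn.ConvTranspose3d(64, 32, kernel_size=4, stride=2, padding=1),   # 8 -> 16
    nn.BatchNorm3d(32), nn.ReLU(),
    nn.ConvTranspose3d(32, 1, kernel_size=4, stride=2, padding=1),    # 16 -> 32
    nn.Sigmoid(),                          # voxel occupancy probabilities
)

z = torch.randn(2, 200, 1, 1, 1)           # latent samples from the prior
voxels = G(z)
print(voxels.shape)                        # torch.Size([2, 1, 32, 32, 32])
```
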
Large-Margin Softmax Loss for Convolutional Neural Networks
Title | Large-Margin Softmax Loss for Convolutional Neural Networks |
Authors | Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang |
Abstract | Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax not only can adjust the desired margin but also can avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply-learned features with the L-Softmax loss become more discriminative, hence significantly boosting the performance on a variety of visual classification and verification tasks. |
Tasks | |
Published | 2016-12-07 |
URL | http://arxiv.org/abs/1612.02295v4 |
PDF | http://arxiv.org/pdf/1612.02295v4.pdf |
PWC | https://paperswithcode.com/paper/large-margin-softmax-loss-for-convolutional |
Repo | https://github.com/wy1iu/LargeMargin_Softmax_Loss |
Framework | pytorch |
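
A hedged sketch of the loss's core substitution: the target-class logit ||W_y||·||x||·cos(θ) is replaced by ||W_y||·||x||·ψ(θ), where ψ(θ) = (-1)^k cos(mθ) - 2k keeps the function monotonically decreasing while demanding an m-fold angular margin. The paper's training heuristics (e.g. annealing the margin in) are omitted here.

```python
import torch
import torch.nn.functional as F

def l_softmax_logits(feats, weight, target, m=2):
    """feats: (B, D); weight: (C, D), bias-free; target: (B,) class ids."""
    w_norm = weight.norm(dim=1)                     # ||W_j|| per class
    x_norm = feats.norm(dim=1)                      # ||x_i|| per sample
    logits = feats @ weight.t()                     # ||W||*||x||*cos(theta)
    rows = torch.arange(feats.size(0))
    cos_t = logits[rows, target] / (w_norm[target] * x_norm + 1e-8)
    theta = torch.acos(cos_t.clamp(-1 + 1e-7, 1 - 1e-7))
    k = torch.floor(m * theta / torch.pi)           # piecewise index
    sign = 1.0 - 2.0 * (k % 2)                      # (-1)^k
    psi = sign * torch.cos(m * theta) - 2.0 * k     # monotonic extension
    logits = logits.clone()
    logits[rows, target] = w_norm[target] * x_norm * psi
    return logits

feats, weight = torch.randn(4, 16), torch.randn(10, 16)
target = torch.randint(0, 10, (4,))
print(F.cross_entropy(l_softmax_logits(feats, weight, target), target).item())
```
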
Greedy Step Averaging: A parameter-free stochastic optimization method
Title | Greedy Step Averaging: A parameter-free stochastic optimization method |
Authors | Xiatian Zhang, Fan Yao, Yongjun Tian |
Abstract | In this paper we present the greedy step averaging (GSA) method, a parameter-free stochastic optimization algorithm for a variety of machine learning problems. As a gradient-based optimization method, GSA makes use of the information from the minimizer of a single sample's loss function, and uses an averaging strategy to compute a reasonable learning-rate sequence. While most existing gradient-based algorithms introduce an increasing number of hyperparameters or try to make a trade-off between computational cost and convergence rate, GSA avoids the manual tuning of learning rates and brings in no additional hyperparameters or extra cost. We perform exhaustive numerical experiments for logistic and softmax regression to compare our method with other state-of-the-art methods on 16 datasets. Results show that GSA is robust across various scenarios. |
Tasks | Stochastic Optimization |
Published | 2016-11-11 |
URL | http://arxiv.org/abs/1611.03608v1 |
PDF | http://arxiv.org/pdf/1611.03608v1.pdf |
PWC | https://paperswithcode.com/paper/greedy-step-averaging-a-parameter-free |
Repo | https://github.com/TalkingData/Fregata |
Framework | none |
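
The paper derives greedy steps for logistic and softmax losses; as a loose illustration of the recipe only (exact per-sample line search, then a running average of step sizes used as the learning rate), here is a least-squares toy where the greedy step has the closed form 1/||x||². This is not the paper's algorithm, just its shape.

```python
import numpy as np

def gsa_least_squares(X, y, n_epochs=5):
    """Per-sample loss 0.5*(w.x - t)^2; the step minimising it along its own
    gradient is exactly 1/(x.x), and a running average of these greedy steps
    serves as the parameter-free learning rate."""
    w = np.zeros(X.shape[1])
    avg_step, seen = 0.0, 0
    for _ in range(n_epochs):
        for x, t in zip(X, y):
            g = (w @ x - t) * x                      # per-sample gradient
            greedy = 1.0 / (x @ x)                   # exact line-search step
            seen += 1
            avg_step += (greedy - avg_step) / seen   # running average
            w -= avg_step * g
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
w = gsa_least_squares(X, X @ w_true)
print(w.round(2))                                    # close to [1., -2., 0.5]
```
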