May 7, 2019

Paper Group AWR 14

Unsupervised Representation Learning of Structured Radio Communication Signals


Title	Unsupervised Representation Learning of Structured Radio Communication Signals
Authors	Timothy J. O’Shea, Johnathan Corgan, T. Charles Clancy
Abstract	We explore unsupervised representation learning of radio communication signals in raw sampled time series representation. We demonstrate that we can learn modulation basis functions using convolutional autoencoders and visually recognize their relationship to the analytic bases used in digital communications. We also propose and evaluate quantitative metrics for quality of encoding using domain relevant performance metrics.
Tasks	Representation Learning, Time Series, Unsupervised Representation Learning
Published	2016-04-24
URL	http://arxiv.org/abs/1604.07078v1
PDF	http://arxiv.org/pdf/1604.07078v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-representation-learning-of-1
Repo	https://github.com/mistic-lab/IPSW-RFI
Framework	pytorch

Professor Forcing: A New Algorithm for Training Recurrent Networks


Title	Professor Forcing: A New Algorithm for Training Recurrent Networks
Authors	Alex Lamb, Anirudh Goyal, Ying Zhang, Saizheng Zhang, Aaron Courville, Yoshua Bengio
Abstract	The Teacher Forcing algorithm trains recurrent networks by supplying observed sequence values as inputs during training and using the network’s own one-step-ahead predictions to do multi-step sampling. We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps. We apply Professor Forcing to language modeling, vocal synthesis on raw waveforms, handwriting generation, and image generation. Empirically we find that Professor Forcing acts as a regularizer, improving test likelihood on character level Penn Treebank and sequential MNIST. We also find that the model qualitatively improves samples, especially when sampling for a large number of time steps. This is supported by human evaluation of sample quality. Trade-offs between Professor Forcing and Scheduled Sampling are discussed. We produce T-SNEs showing that Professor Forcing successfully makes the dynamics of the network during training and sampling more similar.
Tasks	Domain Adaptation, Image Generation, Language Modelling
Published	2016-10-27
URL	http://arxiv.org/abs/1610.09038v1
PDF	http://arxiv.org/pdf/1610.09038v1.pdf
PWC	https://paperswithcode.com/paper/professor-forcing-a-new-algorithm-for
Repo	https://github.com/mojesty/professor_forcing
Framework	pytorch
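
The gap between training-time and sampling-time dynamics that the abstract describes can be illustrated with a toy example. This is a minimal sketch, not the authors' code: `step` is a hypothetical one-step predictor standing in for a trained recurrent network, the numbers are made up, and the adversarial discriminator that Professor Forcing adds is not shown.

```python
def step(prev_value, hidden):
    """Hypothetical one-step model: returns (prediction, new_hidden)."""
    new_hidden = 0.5 * hidden + prev_value
    prediction = 0.9 * new_hidden
    return prediction, new_hidden


def teacher_forced(observed):
    """Feed the observed value at every step (training-time behaviour)."""
    hidden, hiddens = 0.0, []
    for value in observed[:-1]:
        _, hidden = step(value, hidden)
        hiddens.append(hidden)
    return hiddens


def free_running(first_value, num_steps):
    """Feed the model's own prediction back in (sampling-time behaviour)."""
    hidden, hiddens = 0.0, []
    value = first_value
    for _ in range(num_steps):
        value, hidden = step(value, hidden)
        hiddens.append(hidden)
    return hiddens


observed = [1.0, 0.5, 0.25, 0.125, 0.0625]
tf_hiddens = teacher_forced(observed)
fr_hiddens = free_running(observed[0], len(observed) - 1)
# Gap between the two hidden-state trajectories at each step.
gaps = [abs(a - b) for a, b in zip(tf_hiddens, fr_hiddens)]
```

In this toy setting the two trajectories start identical and drift apart as the model consumes its own predictions. That divergence is the kind of mismatch the abstract says Professor Forcing is meant to reduce.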

Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation


Title	Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation
Authors	Flavian Vasile, Elena Smirnova, Alexis Conneau
Abstract	We propose Meta-Prod2vec, a novel method to compute item similarities for recommendation that leverages existing item metadata. Such scenarios are frequently encountered in applications such as content recommendation, ad targeting and web search. Our method leverages past user interactions with items and their attributes to compute low-dimensional embeddings of items. Specifically, the item metadata is injected into the model as side information to regularize the item embeddings. We show that the new item representations lead to better performance on recommendation tasks on an open music dataset.
Tasks
Published	2016-07-25
URL	http://arxiv.org/abs/1607.07326v1
PDF	http://arxiv.org/pdf/1607.07326v1.pdf
PWC	https://paperswithcode.com/paper/meta-prod2vec-product-embeddings-using-side
Repo	https://github.com/YIZHE12/music_recommend
Framework	tf

WAHRSIS: A Low-cost, High-resolution Whole Sky Imager With Near-Infrared Capabilities


Title	WAHRSIS: A Low-cost, High-resolution Whole Sky Imager With Near-Infrared Capabilities
Authors	Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler
Abstract	Cloud imaging using ground-based whole sky imagers is essential for a fine-grained understanding of the effects of cloud formations, which can be useful in many applications. Some such imagers are available commercially, but their cost is relatively high, and their flexibility is limited. Therefore, we built a new daytime Whole Sky Imager (WSI) called Wide Angle High-Resolution Sky Imaging System. The strengths of our new design are its simplicity, low manufacturing cost and high resolution. Our imager captures the entire hemisphere in a single high-resolution picture via a digital camera using a fish-eye lens. The camera was modified to capture light across the visible as well as the near-infrared spectral ranges. This paper describes the design of the device as well as the geometric and radiometric calibration of the imaging system.
Tasks	Calibration
Published	2016-05-21
URL	http://arxiv.org/abs/1605.06595v2
PDF	http://arxiv.org/pdf/1605.06595v2.pdf
PWC	https://paperswithcode.com/paper/wahrsis-a-low-cost-high-resolution-whole-sky
Repo	https://github.com/Soumyabrata/WAHRSIS
Framework	none

Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems


Title	Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems
Authors	Yiping Song, Rui Yan, Xiang Li, Dongyan Zhao, Ming Zhang
Abstract	Open-domain human-computer conversation has attracted much attention in the field of NLP. Contrary to rule- or template-based domain-specific dialog systems, open-domain conversation usually requires data-driven approaches, which can be roughly divided into two categories: retrieval-based and generation-based systems. Retrieval systems search a user-issued utterance (called a query) in a large database, and return a reply that best matches the query. Generative approaches, typically based on recurrent neural networks (RNNs), can synthesize new replies, but they suffer from the problem of generating short, meaningless utterances. In this paper, we propose a novel ensemble of retrieval-based and generation-based dialog systems in the open domain. In our approach, the retrieved candidate, in addition to the original query, is fed to an RNN-based reply generator, so that the neural model is aware of more information. The generated reply is then fed back as a new candidate for post-reranking. Experimental results show that such ensemble outperforms each single part of it by a large margin.
Tasks
Published	2016-10-23
URL	http://arxiv.org/abs/1610.07149v1
PDF	http://arxiv.org/pdf/1610.07149v1.pdf
PWC	https://paperswithcode.com/paper/two-are-better-than-one-an-ensemble-of
Repo	https://github.com/jimth001/Bi-Seq2Seq
Framework	tf
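
The pipeline in the abstract (retrieve a candidate, feed it with the query to a generator, then rerank both replies) can be sketched as follows. Everything here is an illustrative assumption: word overlap stands in for the retrieval and reranking scores, and `generate` is a hypothetical placeholder for the RNN-based reply generator, not the paper's model.

```python
def overlap(a, b):
    """Jaccard overlap between the word sets of two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def retrieve(query, database):
    """Return the stored reply whose stored query best matches the input."""
    best_query = max(database, key=lambda q: overlap(query, q))
    return database[best_query]


def generate(query, retrieved):
    """Placeholder for the RNN generator: it sees both query and candidate."""
    return f"{retrieved} (about: {query})"


def rerank(query, candidates):
    """Pick the candidate that scores highest against the query."""
    return max(candidates, key=lambda c: overlap(query, c))


def ensemble_reply(query, database):
    """Retrieve, generate from query plus retrieved reply, then rerank both."""
    retrieved = retrieve(query, database)
    generated = generate(query, retrieved)
    return rerank(query, [retrieved, generated])


# Hypothetical query-reply database.
database = {
    "how is the weather today": "it is sunny",
    "what is your name": "i am a bot",
}
reply = ensemble_reply("how is the weather", database)
```

The point of the structure is that the generated reply competes with the retrieved one in the final reranking step instead of replacing it.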

A Fully Convolutional Neural Network for Speech Enhancement


Title	A Fully Convolutional Neural Network for Speech Enhancement
Authors	Se Rim Park, Jinwon Lee
Abstract	In hearing aids, the presence of babble noise degrades hearing intelligibility of human speech greatly. However, removing the babble without creating artifacts in human speech is a challenging task in a low SNR environment. Here, we sought to solve the problem by finding a ‘mapping’ between noisy speech spectra and clean speech spectra via supervised learning. Specifically, we propose using fully Convolutional Neural Networks, which consist of lesser number of parameters than fully connected networks. The proposed network, Redundant Convolutional Encoder Decoder (R-CED), demonstrates that a convolutional network can be 12 times smaller than a recurrent network and yet achieves better performance, which shows its applicability for an embedded system: the hearing aids.
Tasks	Speech Enhancement
Published	2016-09-22
URL	http://arxiv.org/abs/1609.07132v1
PDF	http://arxiv.org/pdf/1609.07132v1.pdf
PWC	https://paperswithcode.com/paper/a-fully-convolutional-neural-network-for
Repo	https://github.com/zhr1201/CNN-for-single-channel-speech-enhancement
Framework	tf
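
The abstract's claim that convolutional layers need fewer parameters than fully connected ones comes down to simple arithmetic. The sketch below counts parameters for one layer of each kind; the layer sizes are hypothetical and are not taken from the paper.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 1-D convolution; independent of input length."""
    return in_channels * out_channels * kernel_size + out_channels


# Hypothetical sizes: a 129-bin spectrum frame mapped to 129 outputs.
dense = dense_params(129, 129)      # 129*129 + 129 = 16770
conv = conv1d_params(1, 16, 9)      # 1*16*9 + 16 = 160
```

Because the convolution kernel is shared across all positions of the input, its parameter count does not grow with the number of frequency bins, whereas the dense layer's count grows with the product of input and output sizes.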

Learning a Predictable and Generative Vector Representation for Objects


Title	Learning a Predictable and Generative Vector Representation for Objects
Authors	Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta
Abstract	What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding.
Tasks
Published	2016-03-29
URL	http://arxiv.org/abs/1603.08637v2
PDF	http://arxiv.org/pdf/1603.08637v2.pdf
PWC	https://paperswithcode.com/paper/learning-a-predictable-and-generative-vector
Repo	https://github.com/JeremyFisher/deep_level_sets
Framework	pytorch

Nonlinear Systems Identification Using Deep Dynamic Neural Networks


Title	Nonlinear Systems Identification Using Deep Dynamic Neural Networks
Authors	Olalekan Ogunmolu, Xuejun Gu, Steve Jiang, Nicholas Gans
Abstract	Neural networks are known to be effective function approximators. Recently, deep neural networks have proven to be very effective in pattern recognition, classification tasks and human-level control to model highly nonlinear real-world systems. This paper investigates the effectiveness of deep neural networks in the modeling of dynamical systems with complex behavior. Three deep neural network structures are trained on sequential data, and we investigate the effectiveness of these networks in modeling associated characteristics of the underlying dynamical systems. We carry out similar evaluations on select publicly available system identification datasets. We demonstrate that deep neural networks are effective model estimators from input-output data.
Tasks
Published	2016-10-05
URL	http://arxiv.org/abs/1610.01439v1
PDF	http://arxiv.org/pdf/1610.01439v1.pdf
PWC	https://paperswithcode.com/paper/nonlinear-systems-identification-using-deep
Repo	https://github.com/lakehanne/FARNN
Framework	torch

Incremental Sequence Learning


Title	Incremental Sequence Learning
Authors	Edwin D. de Jong
Abstract	Deep learning research over the past years has shown that by increasing the scope or difficulty of the learning problem over time, increasingly complex learning problems can be addressed. We study incremental learning in the context of sequence learning, using generative RNNs in the form of multi-layer recurrent Mixture Density Networks. While the potential of incremental or curriculum learning to enhance learning is known, indiscriminate application of the principle does not necessarily lead to improvement, and it is essential therefore to know which forms of incremental or curriculum learning have a positive effect. This research contributes to that aim by comparing three instantiations of incremental or curriculum learning. We introduce Incremental Sequence Learning, a simple incremental approach to sequence learning. Incremental Sequence Learning starts out by using only the first few steps of each sequence as training data. Each time a performance criterion has been reached, the length of the parts of the sequences used for training is increased. We introduce and make available a novel sequence learning task and data set: predicting and classifying MNIST pen stroke sequences. We find that Incremental Sequence Learning greatly speeds up sequence learning and reaches the best test performance level of regular sequence learning 20 times faster, reduces the test error by 74%, and in general performs more robustly; it displays lower variance and achieves sustained progress after all three comparison methods have stopped improving. The other instantiations of curriculum learning do not result in any noticeable improvement. A trained sequence prediction model is also used in transfer learning to the task of sequence classification, where it is found that transfer learning realizes improved classification performance compared to methods that learn to classify from scratch.
Tasks	Transfer Learning
Published	2016-11-09
URL	http://arxiv.org/abs/1611.03068v2
PDF	http://arxiv.org/pdf/1611.03068v2.pdf
PWC	https://paperswithcode.com/paper/incremental-sequence-learning
Repo	https://github.com/edwin-de-jong/incremental-sequence-learning
Framework	tf
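
The schedule described in the abstract (train on the first few steps of each sequence, then lengthen the training prefix each time a performance criterion is reached) can be sketched as below. This is an illustration under stated assumptions, not the author's implementation: `train_epoch` is a hypothetical callback, the loss threshold and increment are made-up values, and the toy trainer only mimics a loss that falls with training.

```python
def incremental_schedule(sequences, train_epoch, initial_length=2,
                         increment=2, loss_threshold=0.1, max_epochs=100):
    """Train on sequence prefixes, lengthening them when the loss is low enough.

    `train_epoch(batch)` is a hypothetical callback that trains the model for
    one epoch on `batch` and returns the resulting loss.
    """
    full_length = max(len(s) for s in sequences)
    length = min(initial_length, full_length)
    history = []
    for _ in range(max_epochs):
        batch = [s[:length] for s in sequences]
        loss = train_epoch(batch)
        history.append((length, loss))
        if loss <= loss_threshold:
            if length >= full_length:
                break  # criterion met on the full sequences
            length = min(length + increment, full_length)
    return history


def make_toy_trainer():
    """Stand-in trainer: loss halves each epoch and resets when the prefix grows."""
    state = {"length": None, "loss": 0.4}

    def train_epoch(batch):
        length = max(len(s) for s in batch)
        if length != state["length"]:
            state["length"], state["loss"] = length, 0.4
        state["loss"] /= 2
        return state["loss"]

    return train_epoch


sequences = [list(range(6)), list(range(5))]
history = incremental_schedule(sequences, make_toy_trainer())
prefix_lengths = [length for length, _ in history]
```

With the toy trainer, the prefix length stays fixed until the loss reaches the threshold and then grows by `increment`, until the full sequences are in use.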

The CMA Evolution Strategy: A Tutorial


Title	The CMA Evolution Strategy: A Tutorial
Authors	Nikolaus Hansen
Abstract	This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized, method for real-parameter (continuous domain) optimization of non-linear, non-convex functions. We try to motivate and derive the algorithm from intuitive concepts and from requirements of non-linear, non-convex search in continuous domain.
Tasks
Published	2016-04-04
URL	http://arxiv.org/abs/1604.00772v1
PDF	http://arxiv.org/pdf/1604.00772v1.pdf
PWC	https://paperswithcode.com/paper/the-cma-evolution-strategy-a-tutorial
Repo	https://github.com/ppocma/ppocma
Framework	tf

Wider or Deeper: Revisiting the ResNet Model for Visual Recognition


Title	Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Authors	Zifeng Wu, Chunhua Shen, Anton van den Hengel
Abstract	The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network. Recently, however, evidence has been amassing that simply increasing depth may not be the best way to increase performance, particularly given other limitations. Investigations into deep residual networks have also suggested that they may not in fact be operating as a single deep network, but rather as an ensemble of many relatively shallow networks. We examine these issues, and in doing so arrive at a new interpretation of the unravelled view of deep residual networks which explains some of the behaviours that have been observed experimentally. As a result, we are able to derive a new, shallower, architecture of residual networks which significantly outperforms much deeper models such as ResNet-200 on the ImageNet classification dataset. We also show that this performance is transferable to other problem domains by developing a semantic segmentation approach which outperforms the state-of-the-art by a remarkable margin on datasets including PASCAL VOC, PASCAL Context, and Cityscapes. The architecture that we propose thus outperforms its comparators, including very deep ResNets, and yet is more efficient in memory use and sometimes also in training time. The code and models are available at https://github.com/itijyou/ademxapp
Tasks	Semantic Segmentation
Published	2016-11-30
URL	http://arxiv.org/abs/1611.10080v1
PDF	http://arxiv.org/pdf/1611.10080v1.pdf
PWC	https://paperswithcode.com/paper/wider-or-deeper-revisiting-the-resnet-model
Repo	https://github.com/itijyou/ademxapp
Framework	mxnet

COCO-Stuff: Thing and Stuff Classes in Context


Title	COCO-Stuff: Thing and Stuff Classes in Context
Authors	Holger Caesar, Jasper Uijlings, Vittorio Ferrari
Abstract	Semantic classes can be either things (objects with a well-defined shape, e.g. car, person) or stuff (amorphous background regions, e.g. grass, sky). While lots of classification and detection works focus on thing classes, less attention has been given to stuff classes. Nonetheless, stuff classes are important as they allow to explain important aspects of an image, including (1) scene type; (2) which thing classes are likely to be present and their location (through contextual reasoning); (3) physical attributes, material types and geometric properties of the scene. To understand stuff and things in context we introduce COCO-Stuff, which augments all 164K images of the COCO 2017 dataset with pixel-wise annotations for 91 stuff classes. We introduce an efficient stuff annotation protocol based on superpixels, which leverages the original thing annotations. We quantify the speed versus quality trade-off of our protocol and explore the relation between annotation time and boundary complexity. Furthermore, we use COCO-Stuff to analyze: (a) the importance of stuff and thing classes in terms of their surface cover and how frequently they are mentioned in image captions; (b) the spatial relations between stuff and things, highlighting the rich contextual relations that make our dataset unique; (c) the performance of a modern semantic segmentation method on stuff and thing classes, and whether stuff is easier to segment than things.
Tasks	Image Captioning, Semantic Segmentation
Published	2016-12-12
URL	http://arxiv.org/abs/1612.03716v4
PDF	http://arxiv.org/pdf/1612.03716v4.pdf
PWC	https://paperswithcode.com/paper/coco-stuff-thing-and-stuff-classes-in-context
Repo	https://github.com/nightrome/cocostuff10k
Framework	none

Modelling Context with User Embeddings for Sarcasm Detection in Social Media


Title	Modelling Context with User Embeddings for Sarcasm Detection in Social Media
Authors	Silvio Amir, Byron C. Wallace, Hao Lyu, Paula Carvalho, Mário J. Silva
Abstract	We introduce a deep neural network for automated sarcasm detection. Recent work has emphasized the need for models to capitalize on contextual features, beyond lexical and syntactic cues present in utterances. For example, different speakers will tend to employ sarcasm regarding different subjects and, thus, sarcasm detection models ought to encode such speaker information. Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert with lexical signals to recognize sarcasm. Our approach does not require elaborate feature engineering (and concomitant data scraping); fitting user embeddings requires only the text from their previous posts. The experimental results show that our model outperforms a state-of-the-art approach leveraging an extensive set of carefully crafted features.
Tasks	Feature Engineering, Sarcasm Detection
Published	2016-07-04
URL	http://arxiv.org/abs/1607.00976v2
PDF	http://arxiv.org/pdf/1607.00976v2.pdf
PWC	https://paperswithcode.com/paper/modelling-context-with-user-embeddings-for
Repo	https://github.com/samiroid/CUE-CNN
Framework	none

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation


Title	RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation
Authors	Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid
Abstract	Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation. However, repeated subsampling operations like pooling or convolution striding in deep CNNs lead to a significant decrease in the initial image resolution. Here, we present RefineNet, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections. In this way, the deeper layers that capture high-level semantic features can be directly refined using fine-grained features from earlier convolutions. The individual components of RefineNet employ residual connections following the identity mapping mindset, which allows for effective end-to-end training. Further, we introduce chained residual pooling, which captures rich background context in an efficient manner. We carry out comprehensive experiments and set new state-of-the-art results on seven public datasets. In particular, we achieve an intersection-over-union score of 83.4 on the challenging PASCAL VOC 2012 dataset, which is the best reported result to date.
Tasks	Object Recognition, Semantic Segmentation
Published	2016-11-20
URL	http://arxiv.org/abs/1611.06612v3
PDF	http://arxiv.org/pdf/1611.06612v3.pdf
PWC	https://paperswithcode.com/paper/refinenet-multi-path-refinement-networks-for
Repo	https://github.com/oravus/lostX
Framework	none
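
The intersection-over-union score quoted in the abstract is a standard segmentation metric: for each class, the number of pixels labelled with that class in both prediction and ground truth, divided by the number labelled with it in either. A minimal sketch follows; the tiny label maps are made up for illustration, and benchmark scores such as the one in the abstract are normally averaged over classes and reported as a percentage.

```python
def intersection_over_union(predicted, target, class_id):
    """IoU for one class over two equally sized label maps (lists of rows)."""
    intersection = union = 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            in_pred, in_target = p == class_id, t == class_id
            intersection += in_pred and in_target
            union += in_pred or in_target
    return intersection / union if union else float("nan")


def mean_iou(predicted, target, class_ids):
    """Average the per-class IoU over the given classes."""
    scores = [intersection_over_union(predicted, target, c) for c in class_ids]
    return sum(scores) / len(scores)


# Hypothetical 2x3 label maps with classes 0 and 1.
predicted = [[1, 1, 0],
             [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou_class_1 = intersection_over_union(predicted, target, 1)  # 2 / 4 = 0.5
miou = mean_iou(predicted, target, [0, 1])                   # (0.5 + 0.5) / 2
```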

Single-shot Adaptive Measurement for Quantum-enhanced Metrology


Title	Single-shot Adaptive Measurement for Quantum-enhanced Metrology
Authors	Pantita Palittapongarnpim, Peter Wittek, Barry C. Sanders
Abstract	Quantum-enhanced metrology aims to estimate an unknown parameter such that the precision scales better than the shot-noise bound. Single-shot adaptive quantum-enhanced metrology (AQEM) is a promising approach that uses feedback to tweak the quantum process according to previous measurement outcomes. Techniques and formalism for the adaptive case are quite different from the usual non-adaptive quantum metrology approach due to the causal relationship between measurements and outcomes. We construct a formal framework for AQEM by modeling the procedure as a decision-making process, and we derive the imprecision and the Cramér-Rao lower bound with explicit dependence on the feedback policy. We also explain the reinforcement learning approach for generating quantum control policies, which is adopted due to the optimal policy being non-trivial to devise. Applying a learning algorithm based on differential evolution enables us to attain imprecision for adaptive interferometric phase estimation, which turns out to be SQL when non-entangled particles are used in the scheme.
Tasks	Decision Making
Published	2016-08-22
URL	http://arxiv.org/abs/1608.06238v1
PDF	http://arxiv.org/pdf/1608.06238v1.pdf
PWC	https://paperswithcode.com/paper/single-shot-adaptive-measurement-for-quantum
Repo	https://github.com/PanPalitta/phase_estimation
Framework	none