Paper Group AWR 14
Unsupervised Representation Learning of Structured Radio Communication Signals
Title | Unsupervised Representation Learning of Structured Radio Communication Signals |
Authors | Timothy J. O’Shea, Johnathan Corgan, T. Charles Clancy |
Abstract | We explore unsupervised representation learning of radio communication signals in raw sampled time series representation. We demonstrate that we can learn modulation basis functions using convolutional autoencoders and visually recognize their relationship to the analytic bases used in digital communications. We also propose and evaluate quantitative metrics for quality of encoding using domain relevant performance metrics. |
Tasks | Representation Learning, Time Series, Unsupervised Representation Learning |
Published | 2016-04-24 |
URL | http://arxiv.org/abs/1604.07078v1 |
http://arxiv.org/pdf/1604.07078v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-representation-learning-of-1 |
Repo | https://github.com/mistic-lab/IPSW-RFI |
Framework | pytorch |
Professor Forcing: A New Algorithm for Training Recurrent Networks
Title | Professor Forcing: A New Algorithm for Training Recurrent Networks |
Authors | Alex Lamb, Anirudh Goyal, Ying Zhang, Saizheng Zhang, Aaron Courville, Yoshua Bengio |
Abstract | The Teacher Forcing algorithm trains recurrent networks by supplying observed sequence values as inputs during training and using the network’s own one-step-ahead predictions to do multi-step sampling. We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps. We apply Professor Forcing to language modeling, vocal synthesis on raw waveforms, handwriting generation, and image generation. Empirically we find that Professor Forcing acts as a regularizer, improving test likelihood on character level Penn Treebank and sequential MNIST. We also find that the model qualitatively improves samples, especially when sampling for a large number of time steps. This is supported by human evaluation of sample quality. Trade-offs between Professor Forcing and Scheduled Sampling are discussed. We produce T-SNEs showing that Professor Forcing successfully makes the dynamics of the network during training and sampling more similar. |
Tasks | Domain Adaptation, Image Generation, Language Modelling |
Published | 2016-10-27 |
URL | http://arxiv.org/abs/1610.09038v1 |
http://arxiv.org/pdf/1610.09038v1.pdf | |
PWC | https://paperswithcode.com/paper/professor-forcing-a-new-algorithm-for |
Repo | https://github.com/mojesty/professor_forcing |
Framework | pytorch |
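The Professor Forcing abstract above contrasts two input regimes: teacher forcing, where each step is fed the observed value, and multi-step sampling, where each step is fed the model's own previous output. The following is a minimal sketch of that contrast only. It does not implement Professor Forcing itself (there is no adversarial component), and `toy_step` is a hypothetical stand-in for a trained recurrent network.

```python
from typing import Callable, List


def teacher_forced_predictions(step: Callable[[float], float],
                               observed: List[float]) -> List[float]:
    """One-step-ahead predictions; every input is an observed value."""
    return [step(x) for x in observed[:-1]]


def free_running_samples(step: Callable[[float], float],
                         first: float, n_steps: int) -> List[float]:
    """Multi-step rollout; each input is the model's own previous output."""
    outputs = []
    x = first
    for _ in range(n_steps):
        x = step(x)
        outputs.append(x)
    return outputs


def toy_step(x: float) -> float:
    # Hypothetical model (assumption): it overestimates the next value by 10%.
    return 1.1 * x


# A constant observed sequence, so the correct next value is always 1.0.
observed = [1.0, 1.0, 1.0, 1.0]
tf = teacher_forced_predictions(toy_step, observed)
fr = free_running_samples(toy_step, observed[0], len(observed) - 1)

# Under teacher forcing the error stays the same at every step;
# in the free-running rollout the same per-step error compounds.
print("teacher forced:", tf)
print("free running:  ", fr)
```

In this toy setting the two regimes agree on the first step and then drift apart, which is the train/sample mismatch the abstract says Professor Forcing aims to reduce.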
Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation
Title | Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation |
Authors | Flavian Vasile, Elena Smirnova, Alexis Conneau |
Abstract | We propose Meta-Prod2vec, a novel method to compute item similarities for recommendation that leverages existing item metadata. Such scenarios are frequently encountered in applications such as content recommendation, ad targeting and web search. Our method leverages past user interactions with items and their attributes to compute low-dimensional embeddings of items. Specifically, the item metadata is injected into the model as side information to regularize the item embeddings. We show that the new item representations lead to better performance on recommendation tasks on an open music dataset. |
Tasks | |
Published | 2016-07-25 |
URL | http://arxiv.org/abs/1607.07326v1 |
http://arxiv.org/pdf/1607.07326v1.pdf | |
PWC | https://paperswithcode.com/paper/meta-prod2vec-product-embeddings-using-side |
Repo | https://github.com/YIZHE12/music_recommend |
Framework | tf |
WAHRSIS: A Low-cost, High-resolution Whole Sky Imager With Near-Infrared Capabilities
Title | WAHRSIS: A Low-cost, High-resolution Whole Sky Imager With Near-Infrared Capabilities |
Authors | Soumyabrata Dev, Florian M. Savoy, Yee Hui Lee, Stefan Winkler |
Abstract | Cloud imaging using ground-based whole sky imagers is essential for a fine-grained understanding of the effects of cloud formations, which can be useful in many applications. Some such imagers are available commercially, but their cost is relatively high, and their flexibility is limited. Therefore, we built a new daytime Whole Sky Imager (WSI) called Wide Angle High-Resolution Sky Imaging System. The strengths of our new design are its simplicity, low manufacturing cost and high resolution. Our imager captures the entire hemisphere in a single high-resolution picture via a digital camera using a fish-eye lens. The camera was modified to capture light across the visible as well as the near-infrared spectral ranges. This paper describes the design of the device as well as the geometric and radiometric calibration of the imaging system. |
Tasks | Calibration |
Published | 2016-05-21 |
URL | http://arxiv.org/abs/1605.06595v2 |
http://arxiv.org/pdf/1605.06595v2.pdf | |
PWC | https://paperswithcode.com/paper/wahrsis-a-low-cost-high-resolution-whole-sky |
Repo | https://github.com/Soumyabrata/WAHRSIS |
Framework | none |
Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems
Title | Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems |
Authors | Yiping Song, Rui Yan, Xiang Li, Dongyan Zhao, Ming Zhang |
Abstract | Open-domain human-computer conversation has attracted much attention in the field of NLP. Contrary to rule- or template-based domain-specific dialog systems, open-domain conversation usually requires data-driven approaches, which can be roughly divided into two categories: retrieval-based and generation-based systems. Retrieval systems search a user-issued utterance (called a query) in a large database, and return a reply that best matches the query. Generative approaches, typically based on recurrent neural networks (RNNs), can synthesize new replies, but they suffer from the problem of generating short, meaningless utterances. In this paper, we propose a novel ensemble of retrieval-based and generation-based dialog systems in the open domain. In our approach, the retrieved candidate, in addition to the original query, is fed to an RNN-based reply generator, so that the neural model is aware of more information. The generated reply is then fed back as a new candidate for post-reranking. Experimental results show that such ensemble outperforms each single part of it by a large margin. |
Tasks | |
Published | 2016-10-23 |
URL | http://arxiv.org/abs/1610.07149v1 |
http://arxiv.org/pdf/1610.07149v1.pdf | |
PWC | https://paperswithcode.com/paper/two-are-better-than-one-an-ensemble-of |
Repo | https://github.com/jimth001/Bi-Seq2Seq |
Framework | tf |
A Fully Convolutional Neural Network for Speech Enhancement
Title | A Fully Convolutional Neural Network for Speech Enhancement |
Authors | Se Rim Park, Jinwon Lee |
Abstract | In hearing aids, the presence of babble noise degrades hearing intelligibility of human speech greatly. However, removing the babble without creating artifacts in human speech is a challenging task in a low SNR environment. Here, we sought to solve the problem by finding a ‘mapping’ between noisy speech spectra and clean speech spectra via supervised learning. Specifically, we propose using fully Convolutional Neural Networks, which consist of lesser number of parameters than fully connected networks. The proposed network, Redundant Convolutional Encoder Decoder (R-CED), demonstrates that a convolutional network can be 12 times smaller than a recurrent network and yet achieves better performance, which shows its applicability for an embedded system: the hearing aids. |
Tasks | Speech Enhancement |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.07132v1 |
http://arxiv.org/pdf/1609.07132v1.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-convolutional-neural-network-for |
Repo | https://github.com/zhr1201/CNN-for-single-channel-speech-enhancement |
Framework | tf |
Learning a Predictable and Generative Vector Representation for Objects
Title | Learning a Predictable and Generative Vector Representation for Objects |
Authors | Rohit Girdhar, David F. Fouhey, Mikel Rodriguez, Abhinav Gupta |
Abstract | What is a good vector representation of an object? We believe that it should be generative in 3D, in the sense that it can produce new 3D objects; as well as be predictable from 2D, in the sense that it can be perceived from 2D images. We propose a novel architecture, called the TL-embedding network, to learn an embedding space with these properties. The network consists of two components: (a) an autoencoder that ensures the representation is generative; and (b) a convolutional network that ensures the representation is predictable. This enables tackling a number of tasks including voxel prediction from 2D images and 3D model retrieval. Extensive experimental analysis demonstrates the usefulness and versatility of this embedding. |
Tasks | |
Published | 2016-03-29 |
URL | http://arxiv.org/abs/1603.08637v2 |
http://arxiv.org/pdf/1603.08637v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-a-predictable-and-generative-vector |
Repo | https://github.com/JeremyFisher/deep_level_sets |
Framework | pytorch |
Nonlinear Systems Identification Using Deep Dynamic Neural Networks
Title | Nonlinear Systems Identification Using Deep Dynamic Neural Networks |
Authors | Olalekan Ogunmolu, Xuejun Gu, Steve Jiang, Nicholas Gans |
Abstract | Neural networks are known to be effective function approximators. Recently, deep neural networks have proven to be very effective in pattern recognition, classification tasks and human-level control to model highly nonlinear real-world systems. This paper investigates the effectiveness of deep neural networks in the modeling of dynamical systems with complex behavior. Three deep neural network structures are trained on sequential data, and we investigate the effectiveness of these networks in modeling associated characteristics of the underlying dynamical systems. We carry out similar evaluations on select publicly available system identification datasets. We demonstrate that deep neural networks are effective model estimators from input-output data. |
Tasks | |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01439v1 |
http://arxiv.org/pdf/1610.01439v1.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-systems-identification-using-deep |
Repo | https://github.com/lakehanne/FARNN |
Framework | torch |
Incremental Sequence Learning
Title | Incremental Sequence Learning |
Authors | Edwin D. de Jong |
Abstract | Deep learning research over the past years has shown that by increasing the scope or difficulty of the learning problem over time, increasingly complex learning problems can be addressed. We study incremental learning in the context of sequence learning, using generative RNNs in the form of multi-layer recurrent Mixture Density Networks. While the potential of incremental or curriculum learning to enhance learning is known, indiscriminate application of the principle does not necessarily lead to improvement, and it is essential therefore to know which forms of incremental or curriculum learning have a positive effect. This research contributes to that aim by comparing three instantiations of incremental or curriculum learning. We introduce Incremental Sequence Learning, a simple incremental approach to sequence learning. Incremental Sequence Learning starts out by using only the first few steps of each sequence as training data. Each time a performance criterion has been reached, the length of the parts of the sequences used for training is increased. We introduce and make available a novel sequence learning task and data set: predicting and classifying MNIST pen stroke sequences. We find that Incremental Sequence Learning greatly speeds up sequence learning and reaches the best test performance level of regular sequence learning 20 times faster, reduces the test error by 74%, and in general performs more robustly; it displays lower variance and achieves sustained progress after all three comparison methods have stopped improving. The other instantiations of curriculum learning do not result in any noticeable improvement. A trained sequence prediction model is also used in transfer learning to the task of sequence classification, where it is found that transfer learning realizes improved classification performance compared to methods that learn to classify from scratch. |
Tasks | Transfer Learning |
Published | 2016-11-09 |
URL | http://arxiv.org/abs/1611.03068v2 |
http://arxiv.org/pdf/1611.03068v2.pdf | |
PWC | https://paperswithcode.com/paper/incremental-sequence-learning |
Repo | https://github.com/edwin-de-jong/incremental-sequence-learning |
Framework | tf |
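The Incremental Sequence Learning abstract above describes a schedule: train on only the first few steps of each sequence, then lengthen the training prefix each time a performance criterion is reached. The following is a minimal sketch of that schedule, not the author's implementation. The loss threshold, the fixed increment, and the `train_one_epoch` callback are illustrative assumptions; the abstract does not state the actual criterion or step sizes.

```python
from typing import Callable, List, Sequence


def truncate(sequences: Sequence[Sequence[float]],
             length: int) -> List[Sequence[float]]:
    """Keep only the first `length` steps of every sequence."""
    return [seq[:length] for seq in sequences]


def incremental_schedule(train_one_epoch: Callable[[int], float],
                         initial_length: int,
                         increment: int,
                         max_length: int,
                         loss_threshold: float,
                         max_epochs: int) -> List[int]:
    """Return the prefix length used in each epoch.

    `train_one_epoch(length)` is assumed to train on prefixes of the
    given length and return the resulting loss (hypothetical callback).
    """
    length = initial_length
    history = []
    for _ in range(max_epochs):
        history.append(length)
        loss = train_one_epoch(length)
        # Assumed criterion: lengthen the prefix once the loss is low enough.
        if loss <= loss_threshold and length < max_length:
            length = min(length + increment, max_length)
    return history


def make_toy_trainer() -> Callable[[int], float]:
    """Toy stand-in: each length needs two epochs before its loss reaches 0."""
    seen = set()

    def train_one_epoch(length: int) -> float:
        if length in seen:
            return 0.0
        seen.add(length)
        return 1.0

    return train_one_epoch


history = incremental_schedule(make_toy_trainer(), initial_length=2,
                               increment=2, max_length=6,
                               loss_threshold=0.5, max_epochs=6)
print(history)
print(truncate([[1, 2, 3, 4], [5, 6, 7, 8]], 2))
```

In a real training loop, `truncate` would be applied to the training set with the current length before each epoch; here the toy trainer only simulates the loss so the schedule itself is visible.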
The CMA Evolution Strategy: A Tutorial
Title | The CMA Evolution Strategy: A Tutorial |
Authors | Nikolaus Hansen |
Abstract | This tutorial introduces the CMA Evolution Strategy (ES), where CMA stands for Covariance Matrix Adaptation. The CMA-ES is a stochastic, or randomized, method for real-parameter (continuous domain) optimization of non-linear, non-convex functions. We try to motivate and derive the algorithm from intuitive concepts and from requirements of non-linear, non-convex search in continuous domain. |
Tasks | |
Published | 2016-04-04 |
URL | http://arxiv.org/abs/1604.00772v1 |
http://arxiv.org/pdf/1604.00772v1.pdf | |
PWC | https://paperswithcode.com/paper/the-cma-evolution-strategy-a-tutorial |
Repo | https://github.com/ppocma/ppocma |
Framework | tf |
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Title | Wider or Deeper: Revisiting the ResNet Model for Visual Recognition |
Authors | Zifeng Wu, Chunhua Shen, Anton van den Hengel |
Abstract | The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network. Recently, however, evidence has been amassing that simply increasing depth may not be the best way to increase performance, particularly given other limitations. Investigations into deep residual networks have also suggested that they may not in fact be operating as a single deep network, but rather as an ensemble of many relatively shallow networks. We examine these issues, and in doing so arrive at a new interpretation of the unravelled view of deep residual networks which explains some of the behaviours that have been observed experimentally. As a result, we are able to derive a new, shallower, architecture of residual networks which significantly outperforms much deeper models such as ResNet-200 on the ImageNet classification dataset. We also show that this performance is transferable to other problem domains by developing a semantic segmentation approach which outperforms the state-of-the-art by a remarkable margin on datasets including PASCAL VOC, PASCAL Context, and Cityscapes. The architecture that we propose thus outperforms its comparators, including very deep ResNets, and yet is more efficient in memory use and sometimes also in training time. The code and models are available at https://github.com/itijyou/ademxapp |
Tasks | Semantic Segmentation |
Published | 2016-11-30 |
URL | http://arxiv.org/abs/1611.10080v1 |
http://arxiv.org/pdf/1611.10080v1.pdf | |
PWC | https://paperswithcode.com/paper/wider-or-deeper-revisiting-the-resnet-model |
Repo | https://github.com/itijyou/ademxapp |
Framework | mxnet |
COCO-Stuff: Thing and Stuff Classes in Context
Title | COCO-Stuff: Thing and Stuff Classes in Context |
Authors | Holger Caesar, Jasper Uijlings, Vittorio Ferrari |
Abstract | Semantic classes can be either things (objects with a well-defined shape, e.g. car, person) or stuff (amorphous background regions, e.g. grass, sky). While lots of classification and detection works focus on thing classes, less attention has been given to stuff classes. Nonetheless, stuff classes are important as they allow to explain important aspects of an image, including (1) scene type; (2) which thing classes are likely to be present and their location (through contextual reasoning); (3) physical attributes, material types and geometric properties of the scene. To understand stuff and things in context we introduce COCO-Stuff, which augments all 164K images of the COCO 2017 dataset with pixel-wise annotations for 91 stuff classes. We introduce an efficient stuff annotation protocol based on superpixels, which leverages the original thing annotations. We quantify the speed versus quality trade-off of our protocol and explore the relation between annotation time and boundary complexity. Furthermore, we use COCO-Stuff to analyze: (a) the importance of stuff and thing classes in terms of their surface cover and how frequently they are mentioned in image captions; (b) the spatial relations between stuff and things, highlighting the rich contextual relations that make our dataset unique; (c) the performance of a modern semantic segmentation method on stuff and thing classes, and whether stuff is easier to segment than things. |
Tasks | Image Captioning, Semantic Segmentation |
Published | 2016-12-12 |
URL | http://arxiv.org/abs/1612.03716v4 |
http://arxiv.org/pdf/1612.03716v4.pdf | |
PWC | https://paperswithcode.com/paper/coco-stuff-thing-and-stuff-classes-in-context |
Repo | https://github.com/nightrome/cocostuff10k |
Framework | none |
Modelling Context with User Embeddings for Sarcasm Detection in Social Media
Title | Modelling Context with User Embeddings for Sarcasm Detection in Social Media |
Authors | Silvio Amir, Byron C. Wallace, Hao Lyu, Paula Carvalho, Mário J. Silva |
Abstract | We introduce a deep neural network for automated sarcasm detection. Recent work has emphasized the need for models to capitalize on contextual features, beyond lexical and syntactic cues present in utterances. For example, different speakers will tend to employ sarcasm regarding different subjects and, thus, sarcasm detection models ought to encode such speaker information. Current methods have achieved this by way of laborious feature engineering. By contrast, we propose to automatically learn and then exploit user embeddings, to be used in concert with lexical signals to recognize sarcasm. Our approach does not require elaborate feature engineering (and concomitant data scraping); fitting user embeddings requires only the text from their previous posts. The experimental results show that our model outperforms a state-of-the-art approach leveraging an extensive set of carefully crafted features. |
Tasks | Feature Engineering, Sarcasm Detection |
Published | 2016-07-04 |
URL | http://arxiv.org/abs/1607.00976v2 |
http://arxiv.org/pdf/1607.00976v2.pdf | |
PWC | https://paperswithcode.com/paper/modelling-context-with-user-embeddings-for |
Repo | https://github.com/samiroid/CUE-CNN |
Framework | none |
RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation
Title | RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation |
Authors | Guosheng Lin, Anton Milan, Chunhua Shen, Ian Reid |
Abstract | Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation. However, repeated subsampling operations like pooling or convolution striding in deep CNNs lead to a significant decrease in the initial image resolution. Here, we present RefineNet, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections. In this way, the deeper layers that capture high-level semantic features can be directly refined using fine-grained features from earlier convolutions. The individual components of RefineNet employ residual connections following the identity mapping mindset, which allows for effective end-to-end training. Further, we introduce chained residual pooling, which captures rich background context in an efficient manner. We carry out comprehensive experiments and set new state-of-the-art results on seven public datasets. In particular, we achieve an intersection-over-union score of 83.4 on the challenging PASCAL VOC 2012 dataset, which is the best reported result to date. |
Tasks | Object Recognition, Semantic Segmentation |
Published | 2016-11-20 |
URL | http://arxiv.org/abs/1611.06612v3 |
http://arxiv.org/pdf/1611.06612v3.pdf | |
PWC | https://paperswithcode.com/paper/refinenet-multi-path-refinement-networks-for |
Repo | https://github.com/oravus/lostX |
Framework | none |
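The RefineNet abstract above reports its headline result as an intersection-over-union (IoU) score. As a reminder of what that metric measures, here is a minimal sketch that computes per-class IoU and its mean over flattened label maps. The tiny label lists are made up for illustration, and the exact evaluation protocol of the benchmark (for example, how ignored pixels are handled) is not described in the abstract and is not reproduced here.

```python
from typing import List, Sequence


def class_iou(pred: Sequence[int], target: Sequence[int], cls: int) -> float:
    """IoU for one class: |pred AND target| / |pred OR target|."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else float("nan")


def mean_iou(pred: Sequence[int], target: Sequence[int],
             classes: Sequence[int]) -> float:
    """Average IoU over classes that occur in the prediction or the target."""
    scores: List[float] = []
    for cls in classes:
        score = class_iou(pred, target, cls)
        if score == score:  # skip NaN (class absent from both maps)
            scores.append(score)
    return sum(scores) / len(scores)


# Flattened 2x2 label maps (illustrative data only).
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]

print(class_iou(pred, target, 0))
print(class_iou(pred, target, 1))
print(mean_iou(pred, target, classes=[0, 1]))
```

Scores computed this way fall in [0, 1]; benchmark tables commonly report them multiplied by 100.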
Single-shot Adaptive Measurement for Quantum-enhanced Metrology
Title | Single-shot Adaptive Measurement for Quantum-enhanced Metrology |
Authors | Pantita Palittapongarnpim, Peter Wittek, Barry C. Sanders |
Abstract | Quantum-enhanced metrology aims to estimate an unknown parameter such that the precision scales better than the shot-noise bound. Single-shot adaptive quantum-enhanced metrology (AQEM) is a promising approach that uses feedback to tweak the quantum process according to previous measurement outcomes. Techniques and formalism for the adaptive case are quite different from the usual non-adaptive quantum metrology approach due to the causal relationship between measurements and outcomes. We construct a formal framework for AQEM by modeling the procedure as a decision-making process, and we derive the imprecision and the Cramér-Rao lower bound with explicit dependence on the feedback policy. We also explain the reinforcement learning approach for generating quantum control policies, which is adopted due to the optimal policy being non-trivial to devise. Applying a learning algorithm based on differential evolution enables us to attain imprecision for adaptive interferometric phase estimation, which turns out to be SQL when non-entangled particles are used in the scheme. |
Tasks | Decision Making |
Published | 2016-08-22 |
URL | http://arxiv.org/abs/1608.06238v1 |
http://arxiv.org/pdf/1608.06238v1.pdf | |
PWC | https://paperswithcode.com/paper/single-shot-adaptive-measurement-for-quantum |
Repo | https://github.com/PanPalitta/phase_estimation |
Framework | none |