Paper Group ANR 65
Learning Sensor Multiplexing Design through Back-propagation
Title | Learning Sensor Multiplexing Design through Back-propagation |
Authors | Ayan Chakrabarti |
Abstract | Recent progress on many imaging and vision tasks has been driven by the use of deep feed-forward neural networks, which are trained by propagating gradients of a loss defined on the final output, back through the network up to the first layer that operates directly on the image. We propose back-propagating one step further, to learn camera sensor designs jointly with networks that carry out inference on the images they capture. In this paper, we specifically consider the design and inference problems in a typical color camera, where the sensor is able to measure only one color channel at each pixel location, and computational inference is required to reconstruct a full color image. We learn the camera sensor’s color multiplexing pattern by encoding it as a layer whose learnable weights determine which color channel, from among a fixed set, will be measured at each location. These weights are jointly trained with those of a reconstruction network that operates on the corresponding sensor measurements to produce a full color image. Our network achieves significant improvements in accuracy over the traditional Bayer pattern used in most color cameras. It automatically learns to employ a sparse color measurement approach similar to that of a recent design, and moreover, improves upon that design by learning an optimal layout for these measurements. |
Tasks | |
Published | 2016-05-23 |
URL | http://arxiv.org/abs/1605.07078v2 |
http://arxiv.org/pdf/1605.07078v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-sensor-multiplexing-design-through |
Repo | |
Framework | |
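The abstract above describes the multiplexing pattern as a layer whose learnable weights decide which color channel is measured at each location. A minimal sketch of how such a selection could be made differentiable, assuming a softmax over per-pixel channel scores sharpened by a scale factor `alpha`. The softmax relaxation, `alpha`, and all function names are assumptions for illustration, not details confirmed by the listing:

```python
import math

def softmax(scores, alpha=1.0):
    """Numerically stable softmax of alpha * scores."""
    scaled = [alpha * s for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def measure_pixel(channel_values, scores, alpha=1.0):
    """Soft measurement: softmax-weighted sum of the channel intensities at one pixel."""
    weights = softmax(scores, alpha)
    return sum(w * c for w, c in zip(weights, channel_values))

def hard_choice(scores):
    """Index of the single channel a physical sensor would measure at this pixel."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Hypothetical pixel: intensities of three candidate channels and learned scores.
rgb = [0.2, 0.7, 0.4]
scores = [0.1, 2.0, -1.0]

soft = measure_pixel(rgb, scores, alpha=1.0)          # blend of all channels
nearly_hard = measure_pixel(rgb, scores, alpha=50.0)  # close to channel 1 only
```

With a small `alpha` the measurement mixes channels, which keeps gradients flowing to every score. As `alpha` grows, the measurement approaches the single channel picked by `hard_choice`, which is what a real sensor can implement.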
A Natural Language Query Interface for Searching Personal Information on Smartwatches
Title | A Natural Language Query Interface for Searching Personal Information on Smartwatches |
Authors | Reza Rawassizadeh, Chelsea Dobbins, Manouchehr Nourizadeh, Zahra Ghamchili, Michael Pazzani |
Abstract | Currently, personal assistant systems run on smartphones and use natural language interfaces. However, these systems rely mostly on the web for finding information. Mobile and wearable devices can collect an enormous amount of contextual personal data such as sleep and physical activities. These information objects and their applications are known as quantified-self, mobile health or personal informatics, and they can be used to provide a deeper insight into our behavior. To our knowledge, existing personal assistant systems do not support all types of quantified-self queries. In response to this, we have undertaken a user study to analyze a set of “textual questions/queries” that users have used to search their quantified-self or mobile health data. Through analyzing these questions, we have constructed a light-weight natural language based query interface, including a text parser algorithm and a user interface, to process the users’ queries that have been used for searching quantified-self information. This query interface has been designed to operate on small devices, i.e. smartwatches, and to augment personal assistant systems by allowing them to process end users’ natural language queries about their quantified-self data. |
Tasks | |
Published | 2016-11-22 |
URL | http://arxiv.org/abs/1611.07139v1 |
http://arxiv.org/pdf/1611.07139v1.pdf | |
PWC | https://paperswithcode.com/paper/a-natural-language-query-interface-for |
Repo | |
Framework | |
Semi-supervised Learning with Explicit Relationship Regularization
Title | Semi-supervised Learning with Explicit Relationship Regularization |
Authors | Kwang In Kim, James Tompkin, Hanspeter Pfister, Christian Theobalt |
Abstract | In many learning tasks, the structure of the target space of a function holds rich information about the relationships between evaluations of functions on different data points. Existing approaches attempt to exploit this relationship information implicitly by enforcing smoothness on function evaluations only. However, what happens if we explicitly regularize the relationships between function evaluations? Inspired by homophily, we regularize based on a smooth relationship function, either defined from the data or with labels. In experiments, we demonstrate that this significantly improves the performance of state-of-the-art algorithms in semi-supervised classification and in spectral data embedding for constrained clustering and dimensionality reduction. |
Tasks | Dimensionality Reduction |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03808v1 |
http://arxiv.org/pdf/1602.03808v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-with-explicit |
Repo | |
Framework | |
Abstractive Headline Generation for Spoken Content by Attentive Recurrent Neural Networks with ASR Error Modeling
Title | Abstractive Headline Generation for Spoken Content by Attentive Recurrent Neural Networks with ASR Error Modeling |
Authors | Lang-Chi Yu, Hung-yi Lee, Lin-shan Lee |
Abstract | Headline generation for spoken content is important since spoken content is difficult to show on a screen and for the user to browse. It is a special type of abstractive summarization, for which the summaries are generated word by word from scratch without using any part of the original content. Many deep learning approaches for headline generation from text documents have been proposed recently, all requiring huge quantities of training data, which are difficult to obtain for spoken document summarization. In this paper, we propose an ASR error modeling approach to learn the underlying structure of ASR error patterns and incorporate this model in an Attentive Recurrent Neural Network (ARNN) architecture. In this way, the model for abstractive headline generation for spoken content can be learned from abundant text data and the ASR data for some recognizers. Experiments showed very encouraging results and verified that the proposed ASR error model works well even when the input spoken content is recognized by a recognizer very different from the one the model learned from. |
Tasks | Abstractive Text Summarization, Document Summarization |
Published | 2016-12-26 |
URL | http://arxiv.org/abs/1612.08375v1 |
http://arxiv.org/pdf/1612.08375v1.pdf | |
PWC | https://paperswithcode.com/paper/abstractive-headline-generation-for-spoken |
Repo | |
Framework | |
Training Echo State Networks with Regularization through Dimensionality Reduction
Title | Training Echo State Networks with Regularization through Dimensionality Reduction |
Authors | Sigurd Løkse, Filippo Maria Bianchi, Robert Jenssen |
Abstract | In this paper we introduce a new framework to train an Echo State Network to predict real-valued time series. The method consists of projecting the output of the internal layer of the network onto a space with lower dimensionality, before training the output layer to learn the target task. Notably, we enforce a regularization constraint that leads to better generalization capabilities. We evaluate the performance of our approach on several benchmark tests, using different techniques to train the readout of the network, achieving superior predictive performance when using the proposed framework. Finally, we provide insight into the effectiveness of the implemented mechanics through a visualization of the trajectory in the phase space, relying on the methodologies of nonlinear time-series analysis. By applying our method to well-known chaotic systems, we provide evidence that the lower dimensional embedding retains the dynamical properties of the underlying system better than the full-dimensional internal states of the network. |
Tasks | Dimensionality Reduction, Time Series, Time Series Analysis |
Published | 2016-08-16 |
URL | http://arxiv.org/abs/1608.04622v1 |
http://arxiv.org/pdf/1608.04622v1.pdf | |
PWC | https://paperswithcode.com/paper/training-echo-state-networks-with |
Repo | |
Framework | |
Bidirectional Multirate Reconstruction for Temporal Modeling in Videos
Title | Bidirectional Multirate Reconstruction for Temporal Modeling in Videos |
Authors | Linchao Zhu, Zhongwen Xu, Yi Yang |
Abstract | Despite the recent success of neural networks in image feature learning, a major problem in the video domain is the lack of sufficient labeled data for learning to model temporal information. In this paper, we propose an unsupervised temporal modeling method that learns from untrimmed videos. The speed of motion varies constantly, e.g., a man may run quickly or slowly. We therefore train a Multirate Visual Recurrent Model (MVRM) by encoding frames of a clip with different intervals. This learning process makes the learned model more capable of dealing with motion speed variance. Given a clip sampled from a video, we use its past and future neighboring clips as the temporal context, and reconstruct the two temporal transitions, i.e., present$\rightarrow$past transition and present$\rightarrow$future transition, reflecting the temporal information in different views. The proposed method exploits the two transitions simultaneously by incorporating a bidirectional reconstruction which consists of a backward reconstruction and a forward reconstruction. We apply the proposed method to two challenging video tasks, i.e., complex event detection and video captioning, in which it achieves state-of-the-art performance. Notably, our method generates the best single feature for event detection with a relative improvement of 10.4% on the MEDTest-13 dataset and achieves the best performance in video captioning across all evaluation metrics on the YouTube2Text dataset. |
Tasks | Video Captioning |
Published | 2016-11-28 |
URL | http://arxiv.org/abs/1611.09053v1 |
http://arxiv.org/pdf/1611.09053v1.pdf | |
PWC | https://paperswithcode.com/paper/bidirectional-multirate-reconstruction-for |
Repo | |
Framework | |
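The abstract above mentions encoding the frames of a clip at different intervals so that the model copes with varying motion speed. A minimal sketch of that sampling step alone, assuming clips of fixed length centred on one frame. The function name, the centring rule, the clamping at video boundaries, and the interval values are all hypothetical choices, not taken from the paper:

```python
def multirate_clips(num_frames, center, clip_len, intervals):
    """Return a dict mapping each sampling interval to a list of frame indices.

    Each clip has clip_len frames spaced `interval` apart and centred on
    `center`; indices are clamped to the valid range [0, num_frames - 1].
    """
    clips = {}
    for interval in intervals:
        start = center - (clip_len // 2) * interval
        idx = [start + i * interval for i in range(clip_len)]
        clips[interval] = [min(max(j, 0), num_frames - 1) for j in idx]
    return clips

# Hypothetical 100-frame video, clips of 5 frames sampled at three rates.
clips = multirate_clips(num_frames=100, center=50, clip_len=5, intervals=[1, 2, 4])
```

A larger interval covers a longer time span with the same number of frames, so the same action looks "faster" to a recurrent encoder that consumes the clip.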
Biologically Inspired Radio Signal Feature Extraction with Sparse Denoising Autoencoders
Title | Biologically Inspired Radio Signal Feature Extraction with Sparse Denoising Autoencoders |
Authors | Benjamin Migliori, Riley Zeller-Townson, Daniel Grady, Daniel Gebhardt |
Abstract | Automatic modulation classification (AMC) is an important task for modern communication systems; however, it is a challenging problem when signal features and precise models for generating each modulation may be unknown. We present a new biologically-inspired AMC method without the need for models or manually specified features — thus removing the requirement for expert prior knowledge. We accomplish this task using regularized stacked sparse denoising autoencoders (SSDAs). Our method selects efficient classification features directly from raw in-phase/quadrature (I/Q) radio signals in an unsupervised manner. These features are then used to construct higher-complexity abstract features which can be used for automatic modulation classification. We demonstrate this process using a dataset generated with a software defined radio, consisting of random input bits encoded in 100-sample segments of various common digital radio modulations. Our results show correct classification rates of > 99% at 7.5 dB signal-to-noise ratio (SNR) and > 92% at 0 dB SNR in a 6-way classification test. Our experiments demonstrate a dramatically new and broadly applicable mechanism for performing AMC and related tasks without the need for expert-defined or modulation-specific signal information. |
Tasks | Denoising |
Published | 2016-05-17 |
URL | http://arxiv.org/abs/1605.05239v1 |
http://arxiv.org/pdf/1605.05239v1.pdf | |
PWC | https://paperswithcode.com/paper/biologically-inspired-radio-signal-feature |
Repo | |
Framework | |
FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks
Title | FPGA-Based Low-Power Speech Recognition with Recurrent Neural Networks |
Authors | Minjae Lee, Kyuyeon Hwang, Jinhwan Park, Sungwook Choi, Sungho Shin, Wonyong Sung |
Abstract | In this paper, a neural network based real-time speech recognition (SR) system is developed using an FPGA for very low-power operation. The implemented system employs two recurrent neural networks (RNNs); one is a speech-to-character RNN for acoustic modeling (AM) and the other is for character-level language modeling (LM). The system also employs a statistical word-level LM to improve the recognition accuracy. The results of the AM, the character-level LM, and the word-level LM are combined using a fairly simple N-best search algorithm instead of the hidden Markov model (HMM) based network. The RNNs are implemented using massively parallel processing elements (PEs) for low latency and high throughput. The weights are quantized to 6 bits to store all of them in the on-chip memory of an FPGA. The proposed algorithm is implemented on a Xilinx XC7Z045, and the system can operate much faster than real-time. |
Tasks | Language Modelling, Speech Recognition |
Published | 2016-09-30 |
URL | http://arxiv.org/abs/1610.00552v1 |
http://arxiv.org/pdf/1610.00552v1.pdf | |
PWC | https://paperswithcode.com/paper/fpga-based-low-power-speech-recognition-with |
Repo | |
Framework | |
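The abstract above states that the weights are quantized to 6 bits so that they fit in on-chip memory. A minimal sketch of one common way to do this, uniform symmetric quantization with a single scale per weight group. The abstract does not say which scheme the authors use, so this scheme, the function names, and the example weights are assumptions:

```python
def quantize_weights(weights, bits=6):
    """Uniform symmetric quantization to signed integer codes of the given bit width."""
    levels = 2 ** (bits - 1) - 1              # 31 for 6 bits
    max_abs = max(abs(w) for w in weights)
    step = max_abs / levels if max_abs > 0 else 1.0
    # Round to the nearest code and clip to the representable range.
    codes = [max(-levels, min(levels, round(w / step))) for w in weights]
    return codes, step

def dequantize(codes, step):
    """Map integer codes back to approximate floating-point weights."""
    return [c * step for c in codes]

# Hypothetical weight group.
weights = [0.62, -0.31, 0.05, -0.62, 0.0]
codes, step = quantize_weights(weights, bits=6)
restored = dequantize(codes, step)
```

Only the integer codes and one `step` value per group need to be stored. The reconstruction error of each weight is bounded by half a step.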
Enhanced Twitter Sentiment Classification Using Contextual Information
Title | Enhanced Twitter Sentiment Classification Using Contextual Information |
Authors | Soroush Vosoughi, Helen Zhou, Deb Roy |
Abstract | The rise in popularity and ubiquity of Twitter has made sentiment analysis of tweets an important and well-covered area of research. However, the 140 character limit imposed on tweets makes it hard to use standard linguistic methods for sentiment classification. On the other hand, what tweets lack in structure they make up with sheer volume and rich metadata. This metadata includes geolocation, temporal and author information. We hypothesize that sentiment is dependent on all these contextual factors. Different locations, times and authors have different emotional valences. In this paper, we explored this hypothesis by utilizing distant supervision to collect millions of labelled tweets from different locations, times and authors. We used this data to analyse the variation of tweet sentiments across different authors, times and locations. Once we explored and understood the relationship between these variables and sentiment, we used a Bayesian approach to combine these variables with more standard linguistic features such as n-grams to create a Twitter sentiment classifier. This combined classifier outperforms the purely linguistic classifier, showing that integrating the rich contextual information available on Twitter into sentiment classification is a promising direction of research. |
Tasks | Sentiment Analysis |
Published | 2016-05-17 |
URL | http://arxiv.org/abs/1605.05195v1 |
http://arxiv.org/pdf/1605.05195v1.pdf | |
PWC | https://paperswithcode.com/paper/enhanced-twitter-sentiment-classification |
Repo | |
Framework | |
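The abstract above describes a Bayesian combination of contextual variables (location, time, author) with linguistic features. A minimal sketch of one such combination, assuming the evidence sources are conditionally independent given the sentiment label (a naive Bayes assumption). The independence assumption, the function name, and every probability below are illustrative, not values or details from the paper:

```python
import math

def combine(prior, likelihoods):
    """Naive-Bayes posterior over sentiment labels.

    prior:       {label: P(label)}
    likelihoods: list of {label: P(evidence_i | label)}, assumed
                 conditionally independent given the label.
    """
    log_scores = {}
    for label, p in prior.items():
        log_scores[label] = math.log(p) + sum(math.log(l[label]) for l in likelihoods)
    # Normalize in a numerically stable way.
    m = max(log_scores.values())
    unnorm = {k: math.exp(v - m) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

# Made-up numbers for a single tweet.
prior = {"pos": 0.5, "neg": 0.5}
text_lik = {"pos": 0.4, "neg": 0.6}    # linguistic (n-gram) evidence
time_lik = {"pos": 0.7, "neg": 0.3}    # e.g. posted on a weekend
place_lik = {"pos": 0.6, "neg": 0.4}   # e.g. posted from a given region

text_only = combine(prior, [text_lik])
with_context = combine(prior, [text_lik, time_lik, place_lik])
```

In this toy case the text alone leans negative, while adding the two contextual likelihoods shifts the posterior toward positive. It shows how context can change a purely linguistic decision.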
Review of state-of-the-arts in artificial intelligence with application to AI safety problem
Title | Review of state-of-the-arts in artificial intelligence with application to AI safety problem |
Authors | Vladimir Shakirov |
Abstract | Here, I review the current state of the art in many areas of AI to estimate when it is reasonable to expect human-level AI development. Predictions of prominent AI researchers vary broadly, from the very pessimistic predictions of Andrew Ng to the much more moderate predictions of Geoffrey Hinton and the optimistic predictions of Shane Legg, DeepMind cofounder. Given the huge rate of progress in recent years and this broad range of predictions by AI experts, AI safety questions are also discussed. |
Tasks | |
Published | 2016-05-11 |
URL | http://arxiv.org/abs/1605.04232v2 |
http://arxiv.org/pdf/1605.04232v2.pdf | |
PWC | https://paperswithcode.com/paper/review-of-state-of-the-arts-in-artificial |
Repo | |
Framework | |
Geometric Decomposition of Feed Forward Neural Networks
Title | Geometric Decomposition of Feed Forward Neural Networks |
Authors | Sven Cattell |
Abstract | There have been several attempts to mathematically understand neural networks and many more from biological and computational perspectives. The field has exploded in the last decade, yet neural networks are still treated much like a black box. In this work we describe a structure that is inherent to a feed forward neural network. This will provide a framework for future work on neural networks to improve training algorithms, compute the homology of the network, and other applications. Our approach takes a more geometric point of view and is unlike other attempts to mathematically understand neural networks that rely on a functional perspective. |
Tasks | |
Published | 2016-12-08 |
URL | http://arxiv.org/abs/1612.02522v1 |
http://arxiv.org/pdf/1612.02522v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-decomposition-of-feed-forward |
Repo | |
Framework | |
Knowledge Representation Analysis of Graph Mining
Title | Knowledge Representation Analysis of Graph Mining |
Authors | Matthias van der Hallen, Sergey Paramonov, Michael Leuschel, Gerda Janssens |
Abstract | Many problems, especially those with a composite structure, can naturally be expressed in higher order logic. From a KR perspective, modeling these problems in an intuitive way is a challenging task. In this paper we study the graph mining problem as an example of a higher order problem. In short, this problem asks us to find a graph that frequently occurs as a subgraph among a set of example graphs. We start from the problem’s mathematical definition to solve it in three state-of-the-art specification systems. For IDP and ASP, which have no native support for higher order logic, we propose the use of encoding techniques such as the disjoint union technique and the saturation technique. ProB benefits from the higher order support for sets. We compare the performance of the three approaches to get an idea of the overhead of the higher order support. We propose higher-order language extensions for IDP-like specification languages and discuss what kind of solver support is needed. Native higher order support shifts the burden of rewriting specifications using encoding techniques from the user to the solver itself. |
Tasks | |
Published | 2016-08-31 |
URL | http://arxiv.org/abs/1608.08956v1 |
http://arxiv.org/pdf/1608.08956v1.pdf | |
PWC | https://paperswithcode.com/paper/knowledge-representation-analysis-of-graph |
Repo | |
Framework | |
Classifier ensemble creation via false labelling
Title | Classifier ensemble creation via false labelling |
Authors | Bálint Antal |
Abstract | In this paper, a novel approach to classifier ensemble creation is presented. While other ensemble creation techniques are based on careful selection of existing classifiers or preprocessing of the data, the presented approach automatically creates an optimal labelling for a number of classifiers, which are then assigned to the original data instances and fed to classifiers. The approach has been evaluated on high-dimensional biomedical datasets. The results show that the approach outperformed individual approaches in all cases. |
Tasks | |
Published | 2016-03-05 |
URL | http://arxiv.org/abs/1603.01716v1 |
http://arxiv.org/pdf/1603.01716v1.pdf | |
PWC | https://paperswithcode.com/paper/classifier-ensemble-creation-via-false |
Repo | |
Framework | |
Character-Level Language Modeling with Hierarchical Recurrent Neural Networks
Title | Character-Level Language Modeling with Hierarchical Recurrent Neural Networks |
Authors | Kyuyeon Hwang, Wonyong Sung |
Abstract | Recurrent neural network (RNN) based character-level language models (CLMs) are extremely useful for modeling out-of-vocabulary words by nature. However, their performance is generally much worse than that of word-level language models (WLMs), since CLMs need to consider a longer history of tokens to properly predict the next one. We address this problem by proposing hierarchical RNN architectures, which consist of multiple modules with different timescales. Despite the multi-timescale structures, the input and output layers operate with the character-level clock, which allows the existing RNN CLM training approaches to be directly applicable without any modifications. Our CLM models show better perplexity than Kneser-Ney (KN) 5-gram WLMs on the One Billion Word Benchmark with only 2% of parameters. Also, we present real-time character-level end-to-end speech recognition examples on the Wall Street Journal (WSJ) corpus, where replacing traditional mono-clock RNN CLMs with the proposed models results in better recognition accuracies even though the number of parameters is reduced to 30%. |
Tasks | End-To-End Speech Recognition, Language Modelling, Speech Recognition |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03777v2 |
http://arxiv.org/pdf/1609.03777v2.pdf | |
PWC | https://paperswithcode.com/paper/character-level-language-modeling-with |
Repo | |
Framework | |
Record Counting in Historical Handwritten Documents with Convolutional Neural Networks
Title | Record Counting in Historical Handwritten Documents with Convolutional Neural Networks |
Authors | Samuele Capobianco, Simone Marinai |
Abstract | In this paper, we investigate the use of Convolutional Neural Networks for counting the number of records in historical handwritten documents. With this work we demonstrate that training the networks only with synthetic images allows us to perform a near perfect evaluation of the number of records printed on historical documents. The experiments have been performed on a benchmark dataset composed of marriage records and outperform previous results on this dataset. |
Tasks | |
Published | 2016-10-24 |
URL | http://arxiv.org/abs/1610.07393v2 |
http://arxiv.org/pdf/1610.07393v2.pdf | |
PWC | https://paperswithcode.com/paper/record-counting-in-historical-handwritten |
Repo | |
Framework | |