Paper Group ANR 389
Topic Modeling Using Distributed Word Embeddings
Title | Topic Modeling Using Distributed Word Embeddings |
Authors | Ramandeep S Randhawa, Parag Jain, Gagan Madan |
Abstract | We propose a new algorithm for topic modeling, Vec2Topic, that identifies the main topics in a corpus using semantic information captured via high-dimensional distributed word embeddings. Our technique is unsupervised and generates a list of topics ranked with respect to importance. We find that it works better than existing topic modeling techniques such as Latent Dirichlet Allocation for identifying key topics in user-generated content, such as emails, chats, etc., where topics are diffused across the corpus. We also find that Vec2Topic works equally well for non-user-generated content, such as papers, reports, etc., and for small corpora such as a single document. |
Tasks | Word Embeddings |
Published | 2016-03-15 |
URL | http://arxiv.org/abs/1603.04747v1, http://arxiv.org/pdf/1603.04747v1.pdf |
PWC | https://paperswithcode.com/paper/topic-modeling-using-distributed-word |
Repo | |
Framework | |
Making brain-machine interfaces robust to future neural variability
Title | Making brain-machine interfaces robust to future neural variability |
Authors | David Sussillo, Sergey D. Stavisky, Jonathan C. Kao, Stephen I. Ryu, Krishna V. Shenoy |
Abstract | A major hurdle to clinical translation of brain-machine interfaces (BMIs) is that current decoders, which are trained from a small quantity of recent data, become ineffective when neural recording conditions subsequently change. We tested whether a decoder could be made more robust to future neural variability by training it to handle a variety of recording conditions sampled from months of previously collected data as well as synthetic training data perturbations. We developed a new multiplicative recurrent neural network BMI decoder that successfully learned a large variety of neural-to-kinematic mappings and became more robust with larger training datasets. When tested with a non-human primate preclinical BMI model, this decoder was robust under conditions that disabled a state-of-the-art Kalman filter based decoder. These results validate a new BMI strategy in which accumulated data history is effectively harnessed, and may facilitate reliable daily BMI use by reducing decoder retraining downtime. |
Tasks | |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.05872v1, http://arxiv.org/pdf/1610.05872v1.pdf |
PWC | https://paperswithcode.com/paper/making-brain-machine-interfaces-robust-to |
Repo | |
Framework | |
Stochastic Portfolio Theory: A Machine Learning Perspective
Title | Stochastic Portfolio Theory: A Machine Learning Perspective |
Authors | Yves-Laurent Kom Samo, Alexander Vervuurt |
Abstract | In this paper we propose a novel application of Gaussian processes (GPs) to financial asset allocation. Our approach is deeply rooted in Stochastic Portfolio Theory (SPT), a stochastic analysis framework introduced by Robert Fernholz that aims at flexibly analysing the performance of certain investment strategies in stock markets relative to benchmark indices. In particular, SPT has exhibited some investment strategies based on company sizes that, under realistic assumptions, outperform benchmark indices with probability 1 over certain time horizons. Galvanised by this result, we consider the inverse problem that consists of learning (from historical data) an optimal investment strategy based on any given set of trading characteristics, and using a user-specified optimality criterion that may go beyond outperforming a benchmark index. Although this inverse problem is of the utmost interest to investment management practitioners, it can hardly be tackled using the SPT framework. We show that our machine learning approach learns investment strategies that considerably outperform existing SPT strategies in the US stock market. |
Tasks | Gaussian Processes |
Published | 2016-05-09 |
URL | http://arxiv.org/abs/1605.02654v1, http://arxiv.org/pdf/1605.02654v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-portfolio-theory-a-machine |
Repo | |
Framework | |
Post Selection Inference with Kernels
Title | Post Selection Inference with Kernels |
Authors | Makoto Yamada, Yuta Umezu, Kenji Fukumizu, Ichiro Takeuchi |
Abstract | We propose a novel kernel-based post selection inference (PSI) algorithm, which can handle not only non-linearity in data but also structured output such as multi-dimensional and multi-label outputs. Specifically, we develop a PSI algorithm for independence measures, and propose the Hilbert-Schmidt Independence Criterion (HSIC) based PSI algorithm (hsicInf). The novelty of the proposed algorithm is that it can handle non-linearity and/or structured data through kernels. Namely, the proposed algorithm can be used for a wider range of applications, including nonlinear multi-class classification and multivariate regression, which existing PSI algorithms cannot handle. Through synthetic experiments, we show that the proposed approach can find a set of statistically significant features for both regression and classification problems. Moreover, we apply the hsicInf algorithm to real-world data, and show that hsicInf can successfully identify important features. |
Tasks | |
Published | 2016-10-12 |
URL | http://arxiv.org/abs/1610.03725v2, http://arxiv.org/pdf/1610.03725v2.pdf |
PWC | https://paperswithcode.com/paper/post-selection-inference-with-kernels |
Repo | |
Framework | |
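The abstract above builds on the Hilbert-Schmidt Independence Criterion. As a rough illustration only, the following sketch computes the standard biased empirical HSIC estimate, trace(K H L H) / (n - 1)^2, for scalar samples. It is not the hsicInf algorithm and performs no post selection inference; the Gaussian kernel, the bandwidths, and all function names are assumptions chosen for the example.

```python
import math


def gaussian_gram(values, sigma=1.0):
    """Gram matrix K[i][j] = exp(-(v_i - v_j)^2 / (2 sigma^2)) for scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between two equally long lists of scalars."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma_x))  # H K H
    l_gram = gaussian_gram(ys, sigma_y)              # L
    # trace(K H L H) = trace((H K H) L) = sum_ij (H K H)_ij * L_ji
    trace = sum(k_centered[i][j] * l_gram[j][i]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
dependent = [x * x for x in xs]   # nonlinear function of xs
constant = [3.0] * len(xs)        # carries no information about xs

print(hsic(xs, dependent))  # strictly positive
print(hsic(xs, constant))   # zero up to floating-point error
```

A feature whose HSIC with the output is near zero would be a weak candidate for selection; deciding whether a selected feature is statistically significant is the part the paper addresses and this sketch does not.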
Learning Lexical Entries for Robotic Commands using Crowdsourcing
Title | Learning Lexical Entries for Robotic Commands using Crowdsourcing |
Authors | Junjie Hu, Jean Oh, Anatole Gershman |
Abstract | Robotic commands in natural language usually contain various spatial descriptions that are semantically similar but syntactically different. Mapping such syntactic variants into semantic concepts that can be understood by robots is challenging due to the high flexibility of natural language expressions. To tackle this problem, we collect robotic commands for navigation and manipulation tasks using crowdsourcing. We further define a robot language and use a generative machine translation model to translate robotic commands from natural language to robot language. The main purpose of this paper is to simulate the interaction process between humans and robots using crowdsourcing platforms, and investigate the possibility of translating natural language to robot language with paraphrases. |
Tasks | Machine Translation |
Published | 2016-09-08 |
URL | http://arxiv.org/abs/1609.02549v3, http://arxiv.org/pdf/1609.02549v3.pdf |
PWC | https://paperswithcode.com/paper/learning-lexical-entries-for-robotic-commands |
Repo | |
Framework | |
Connectionist Temporal Modeling for Weakly Supervised Action Labeling
Title | Connectionist Temporal Modeling for Weakly Supervised Action Labeling |
Authors | De-An Huang, Li Fei-Fei, Juan Carlos Niebles |
Abstract | We propose a weakly-supervised framework for action labeling in video, where only the order of occurring actions is required during training time. The key challenge is that the per-frame alignments between the input (video) and label (action) sequences are unknown during training. We address this by introducing the Extended Connectionist Temporal Classification (ECTC) framework to efficiently evaluate all possible alignments via dynamic programming and explicitly enforce their consistency with frame-to-frame visual similarities. This protects the model from distractions of visually inconsistent or degenerate alignments without the need for temporal supervision. We further extend our framework to the semi-supervised case when a few frames are sparsely annotated in a video. With less than 1% of labeled frames per video, our method is able to outperform existing semi-supervised approaches and achieve comparable performance to that of fully supervised approaches. |
Tasks | |
Published | 2016-07-28 |
URL | http://arxiv.org/abs/1607.08584v1, http://arxiv.org/pdf/1607.08584v1.pdf |
PWC | https://paperswithcode.com/paper/connectionist-temporal-modeling-for-weakly |
Repo | |
Framework | |
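To make the idea of evaluating all alignments by dynamic programming concrete, here is a minimal sketch of a forward recursion that sums the probability of every monotonic assignment of frames to an ordered action list. It is a simplification and not the paper's ECTC: it omits the blank symbol of standard CTC and the visual-similarity term, and `frame_probs`, `actions`, and the function name are hypothetical.

```python
def ordered_alignment_probability(frame_probs, actions):
    """Sum the probability of every monotonic frame-to-action alignment.

    frame_probs[t][a] is an assumed per-frame probability of action a at frame t.
    actions is the known order of actions; each action must cover at least
    one frame, and frames of the same action are consecutive.
    """
    num_frames, num_actions = len(frame_probs), len(actions)
    if num_actions == 0 or num_frames < num_actions:
        return 0.0
    # alpha[k]: total probability of all partial alignments of the frames
    # seen so far whose latest frame is assigned to actions[k].
    alpha = [0.0] * num_actions
    alpha[0] = frame_probs[0][actions[0]]
    for t in range(1, num_frames):
        new_alpha = [0.0] * num_actions
        for k in range(num_actions):
            stay = alpha[k]                           # remain in the same action
            advance = alpha[k - 1] if k > 0 else 0.0  # move on to the next action
            new_alpha[k] = frame_probs[t][actions[k]] * (stay + advance)
        alpha = new_alpha
    return alpha[-1]


# Three frames, two actions, uniform per-frame probabilities.
uniform = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
print(ordered_alignment_probability(uniform, [0, 1]))  # 0.25
```

With three frames and two ordered actions there are two valid alignments (0 0 1 and 0 1 1), each with probability 0.125, so the recursion returns 0.25 while doing work proportional to frames times actions rather than enumerating alignments.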
A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text
Title | A New Data Representation Based on Training Data Characteristics to Extract Drug Named-Entity in Medical Text |
Authors | Sadikin Mujiono, Mohamad Ivan Fanany, Chan Basaruddin |
Abstract | One essential task in information extraction from the medical corpus is drug name recognition. Compared with text from other domains, medical text is special and has unique characteristics. In addition, medical text mining poses more challenges, e.g., more unstructured text, the rapid addition of new terms, and a wide range of name variations for the same drug. The mining is even more challenging due to the lack of labeled dataset sources and external knowledge, as well as multiple token representations for a single drug name, which are more common in real application settings. Although many approaches have been proposed to tackle the task, some problems remain, with poor F-score performance (less than 0.75). This paper presents a new treatment in data representation techniques to overcome some of those challenges. We propose three data representation techniques based on the characteristics of word distribution and word similarities as a result of word embedding training. The first technique is evaluated with the standard NN model, i.e., MLP (Multi-Layer Perceptrons). The second technique involves two deep network classifiers, i.e., DBN (Deep Belief Networks), and SAE (Stacked Denoising Encoders). The third technique represents the sentence as a sequence that is evaluated with a recurrent NN model, i.e., LSTM (Long Short Term Memory). In extracting the drug name entities, the third technique gives the best F-score performance compared to the state of the art, with its average F-score being 0.8645. |
Tasks | Denoising |
Published | 2016-10-06 |
URL | http://arxiv.org/abs/1610.01891v1, http://arxiv.org/pdf/1610.01891v1.pdf |
PWC | https://paperswithcode.com/paper/a-new-data-representation-based-on-training |
Repo | |
Framework | |
Wind ramp event prediction with parallelized Gradient Boosted Regression Trees
Title | Wind ramp event prediction with parallelized Gradient Boosted Regression Trees |
Authors | Saurav Gupta, Nitin Anand Shrivastava, Abbas Khosravi, Bijaya Ketan Panigrahi |
Abstract | Accurate prediction of wind ramp events is critical for ensuring the reliability and stability of power systems with high penetration of wind energy. This paper proposes a classification-based approach for estimating the future class of a wind ramp event based on certain thresholds. A parallelized gradient boosted regression tree based technique is proposed to accurately classify the normal as well as rare extreme wind power ramp events. The model has been validated using wind power data obtained from the National Renewable Energy Laboratory database. Performance comparison with several benchmark techniques indicates the superiority of the proposed technique in terms of classification accuracy. |
Tasks | |
Published | 2016-10-17 |
URL | http://arxiv.org/abs/1610.05009v1, http://arxiv.org/pdf/1610.05009v1.pdf |
PWC | https://paperswithcode.com/paper/wind-ramp-event-prediction-with-parallelized |
Repo | |
Framework | |
Compositional Sequence Labeling Models for Error Detection in Learner Writing
Title | Compositional Sequence Labeling Models for Error Detection in Learner Writing |
Authors | Marek Rei, Helen Yannakoudakis |
Abstract | In this paper, we present the first experiments using neural network models for the task of error detection in learner writing. We perform a systematic comparison of alternative compositional architectures and propose a framework for error detection based on bidirectional LSTMs. Experiments on the CoNLL-14 shared task dataset show the model is able to outperform other participants on detecting errors in learner writing. Finally, the model is integrated with a publicly deployed self-assessment system, leading to performance comparable to human annotators. |
Tasks | Grammatical Error Detection |
Published | 2016-07-20 |
URL | http://arxiv.org/abs/1607.06153v1, http://arxiv.org/pdf/1607.06153v1.pdf |
PWC | https://paperswithcode.com/paper/compositional-sequence-labeling-models-for |
Repo | |
Framework | |
R-FUSE: Robust Fast Fusion of Multi-Band Images Based on Solving a Sylvester Equation
Title | R-FUSE: Robust Fast Fusion of Multi-Band Images Based on Solving a Sylvester Equation |
Authors | Qi Wei, Nicolas Dobigeon, Jean-Yves Tourneret, Jose Bioucas-Dias, Simon Godsill |
Abstract | This paper proposes a robust fast multi-band image fusion method to merge a high-spatial low-spectral resolution image and a low-spatial high-spectral resolution image. Following the method recently developed in [1], the generalized Sylvester matrix equation associated with the multi-band image fusion problem is solved in a more robust and efficient way by exploiting the Woodbury formula, avoiding any permutation operation in the frequency domain as well as the blurring kernel invertibility assumption required in [1]. Thanks to this improvement, the proposed algorithm requires fewer computational operations and is also more robust with respect to the blurring kernel compared with the one in [1]. The proposed new algorithm is tested with different priors considered in [1]. Our conclusion is that the proposed fusion algorithm is more robust than the one in [1] with a reduced computational cost. |
Tasks | |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01818v1, http://arxiv.org/pdf/1604.01818v1.pdf |
PWC | https://paperswithcode.com/paper/r-fuse-robust-fast-fusion-of-multi-band |
Repo | |
Framework | |
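For readers unfamiliar with the two ingredients named in the abstract above, the textbook forms are given below. The symbols are generic and are not the paper's notation; the specific generalized Sylvester equation arising in the fusion problem is not reproduced here.

```latex
% Standard Sylvester equation in the unknown matrix X
A X + X B = C

% Woodbury matrix identity (A and C invertible, dimensions conformable)
(A + U C V)^{-1} = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
```

The identity is useful when A is cheap to invert and U C V is a low-rank correction, since the only new inverse on the right-hand side has the (smaller) size of C.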
Autonomous Racing using Learning Model Predictive Control
Title | Autonomous Racing using Learning Model Predictive Control |
Authors | Ugo Rosolia, Ashwin Carvalho, Francesco Borrelli |
Abstract | A novel learning Model Predictive Control technique is applied to the autonomous racing problem. The goal of the controller is to minimize the time to complete a lap. The proposed control strategy uses the data from previous laps to improve its performance while satisfying safety requirements. Moreover, a system identification technique is proposed to estimate the vehicle dynamics. Simulation results with the high fidelity simulator software CarSim show the effectiveness of the proposed control scheme. |
Tasks | |
Published | 2016-10-20 |
URL | http://arxiv.org/abs/1610.06534v6, http://arxiv.org/pdf/1610.06534v6.pdf |
PWC | https://paperswithcode.com/paper/autonomous-racing-using-learning-model |
Repo | |
Framework | |
Recurrent Neural Network Language Model Adaptation Derived Document Vector
Title | Recurrent Neural Network Language Model Adaptation Derived Document Vector |
Authors | Wei Li, Brian Kan Wing Mak |
Abstract | In many natural language processing (NLP) tasks, a document is commonly modeled as a bag of words using the term frequency-inverse document frequency (TF-IDF) vector. One major shortcoming of the frequency-based TF-IDF feature vector is that it ignores word orders that carry syntactic and semantic relationships among the words in a document, and they can be important in some NLP tasks such as genre classification. This paper proposes a novel distributed vector representation of a document: a simple recurrent-neural-network language model (RNN-LM) or a long short-term memory RNN language model (LSTM-LM) is first created from all documents in a task; some of the LM parameters are then adapted by each document, and the adapted parameters are vectorized to represent the document. The new document vectors are labeled as DV-RNN and DV-LSTM respectively. We believe that our new document vectors can capture some high-level sequential information in the documents, which other current document representations fail to capture. The new document vectors were evaluated in the genre classification of documents in three corpora: the Brown Corpus, the BNC Baby Corpus and an artificially created Penn Treebank dataset. Their classification performances are compared with the performance of TF-IDF vector and the state-of-the-art distributed memory model of paragraph vector (PV-DM). The results show that DV-LSTM significantly outperforms TF-IDF and PV-DM in most cases, and combinations of the proposed document vectors with TF-IDF or PV-DM may further improve performance. |
Tasks | Language Modelling |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00196v1, http://arxiv.org/pdf/1611.00196v1.pdf |
PWC | https://paperswithcode.com/paper/recurrent-neural-network-language-model |
Repo | |
Framework | |
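The shortcoming the abstract attributes to TF-IDF, that it ignores word order, can be seen in a few lines. The sketch below uses one common TF-IDF variant (relative term frequency times log(N / df)); the weighting scheme, the whitespace tokenizer, and the example sentences are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    num_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(num_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = [
    "the dog bit the man",
    "the man bit the dog",   # same words, different order and meaning
    "a quiet evening",
]
vectors = tfidf_vectors(docs)
print(vectors[0] == vectors[1])  # True: word order is lost
```

The first two sentences describe opposite events but receive identical vectors, which is the kind of sequential information a language-model-derived document vector is meant to retain.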
Coactive Critiquing: Elicitation of Preferences and Features
Title | Coactive Critiquing: Elicitation of Preferences and Features |
Authors | Stefano Teso, Paolo Dragone, Andrea Passerini |
Abstract | When faced with complex choices, users refine their own preference criteria as they explore the catalogue of options. In this paper we propose an approach to preference elicitation suited for this scenario. We extend Coactive Learning, which iteratively collects manipulative feedback, to optionally query example critiques. User critiques are integrated into the learning model by dynamically extending the feature space. Our formulation natively supports constructive learning tasks, where the option catalogue is generated on-the-fly. We present an upper bound on the average regret suffered by the learner. Our empirical analysis highlights the promise of our approach. |
Tasks | |
Published | 2016-12-06 |
URL | http://arxiv.org/abs/1612.01941v1, http://arxiv.org/pdf/1612.01941v1.pdf |
PWC | https://paperswithcode.com/paper/coactive-critiquing-elicitation-of |
Repo | |
Framework | |
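As background for the Coactive Learning loop mentioned in the abstract, the sketch below shows the basic preference-perceptron update used in that setting: the learner presents its current best option, the user returns an improved one, and the weights move toward the improvement. It does not include the paper's critique queries or feature-space extension; the option features, the simulated user, and all names are hypothetical.

```python
def dot(weights, features):
    """Linear utility of an option under the given weights."""
    return sum(w * f for w, f in zip(weights, features))


def best_option(weights, options):
    """Option with the highest estimated utility (first one wins ties)."""
    return max(options, key=lambda features: dot(weights, features))


def coactive_update(weights, presented, improved):
    """Perceptron-style step: w <- w + phi(improved) - phi(presented)."""
    return [w + i - p for w, p, i in zip(weights, presented, improved)]


# Toy catalogue of options described by two features.
options = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
true_utility = [0.2, 0.8]   # hidden from the learner
weights = [0.0, 0.0]

for _ in range(5):
    presented = best_option(weights, options)
    # Simulated user: returns the option they like best as the improvement.
    improved = best_option(true_utility, options)
    if improved != presented:
        weights = coactive_update(weights, presented, improved)

print(best_option(weights, options))  # (0.0, 1.0)
```

In this toy run a single update is enough: the learner first presents (1.0, 0.0), the user manipulates it to (0.0, 1.0), and the updated weights [-1.0, 1.0] rank the user's preferred option first from then on.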
Addressing Limited Data for Textual Entailment Across Domains
Title | Addressing Limited Data for Textual Entailment Across Domains |
Authors | Chaitanya Shivade, Preethi Raghavan, Siddharth Patwardhan |
Abstract | We seek to address the lack of labeled data (and high cost of annotation) for textual entailment in some domains. To that end, we first create (for experimental purposes) an entailment dataset for the clinical domain, and a highly competitive supervised entailment system, ENT, that is effective (out of the box) on two domains. We then explore self-training and active learning strategies to address the lack of labeled data. With self-training, we successfully exploit unlabeled data to improve over ENT by 15% F-score on the newswire domain, and 13% F-score on clinical data. On the other hand, our active learning experiments demonstrate that we can match (and even beat) ENT using only 6.6% of the training data in the clinical domain, and only 5.8% of the training data in the newswire domain. |
Tasks | Active Learning, Natural Language Inference |
Published | 2016-06-08 |
URL | http://arxiv.org/abs/1606.02638v1, http://arxiv.org/pdf/1606.02638v1.pdf |
PWC | https://paperswithcode.com/paper/addressing-limited-data-for-textual |
Repo | |
Framework | |
Attention Tree: Learning Hierarchies of Visual Features for Large-Scale Image Recognition
Title | Attention Tree: Learning Hierarchies of Visual Features for Large-Scale Image Recognition |
Authors | Priyadarshini Panda, Kaushik Roy |
Abstract | One of the key challenges in machine learning is to design a computationally efficient multi-class classifier while maintaining the output accuracy and performance. In this paper, we present a tree-based classifier, Attention Tree (ATree), for large-scale image classification that uses recursive Adaboost training to construct a visual attention hierarchy. The proposed attention model is inspired by the biological ‘selective tuning mechanism for cortical visual processing’. We exploit the inherent feature similarity across images in datasets to identify the input variability and use a recursive optimization procedure to determine data partitioning at each node, thereby learning the attention hierarchy. A set of binary classifiers is organized on top of the learnt hierarchy to minimize the overall test-time complexity. The attention model maximizes the margins for the binary classifiers for optimal decision boundary modelling, leading to better performance at minimal complexity. The proposed framework has been evaluated on both Caltech-256 and SUN datasets and achieves accuracy improvement over state-of-the-art tree-based methods at significantly lower computational cost. |
Tasks | Image Classification |
Published | 2016-08-01 |
URL | http://arxiv.org/abs/1608.00611v1, http://arxiv.org/pdf/1608.00611v1.pdf |
PWC | https://paperswithcode.com/paper/attention-tree-learning-hierarchies-of-visual |
Repo | |
Framework | |