July 29, 2019

Paper Group ANR 145

Geometric Methods for Robust Data Analysis in High Dimension. Sparse Coding and Autoencoders. Encoder Based Lifelong Learning. Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition. Driver Identification Using Automobile Sensor Data from a Single Turn. Sparse Algorithm for Robust LSSVM in Primal Space. Basic …

Geometric Methods for Robust Data Analysis in High Dimension

Title Geometric Methods for Robust Data Analysis in High Dimension
Authors Joseph Anderson
Abstract Machine learning and data analysis now find both scientific and industrial application in biology, chemistry, geology, medicine, and physics. These applications rely on large quantities of data gathered from automated sensors and user input. Furthermore, the dimensionality of many datasets is extreme: more details are being gathered about single user interactions or sensor readings. All of these applications encounter problems with a common theme: use observed data to make inferences about the world. Our work obtains the first provably efficient algorithms for Independent Component Analysis (ICA) in the presence of heavy-tailed data. The main tool in this result is the centroid body (a well-known topic in convex geometry), along with optimization and random walks for sampling from a convex body. This is the first algorithmic use of the centroid body and it is of independent theoretical interest, since it effectively replaces the estimation of covariance from samples, and is more generally accessible. This reduction relies on a non-linear transformation of samples from such an intersection of halfspaces (i.e. a simplex) to samples which are approximately from a linearly transformed product distribution. Through this transformation of samples, which can be done efficiently, one can then use an ICA algorithm to recover the vertices of the intersection of halfspaces. Finally, we again use ICA as an algorithmic primitive to construct an efficient solution to the widely-studied problem of learning the parameters of a Gaussian mixture model. Our algorithm again transforms samples from a Gaussian mixture model into samples which fit into the ICA model and, when processed by an ICA algorithm, result in recovery of the mixture parameters. Our algorithm is effective even when the number of Gaussians in the mixture grows polynomially with the ambient dimension.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09269v1
PDF http://arxiv.org/pdf/1705.09269v1.pdf
PWC https://paperswithcode.com/paper/geometric-methods-for-robust-data-analysis-in
Repo
Framework

Sparse Coding and Autoencoders

Title Sparse Coding and Autoencoders
Authors Akshay Rangamani, Anirbit Mukherjee, Amitabh Basu, Tejaswini Ganapathy, Ashish Arora, Sang Chin, Trac D. Tran
Abstract In “Dictionary Learning” one tries to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of size $h^p$ for some $0 < p < 1$ while having access to observations $y \in \mathbb{R}^n$ where $y = A^*x^*$. In this work we undertake a rigorous analysis of whether gradient descent on the squared loss of an autoencoder can solve the dictionary learning problem. The “Autoencoder” architecture we consider is a $\mathbb{R}^n \rightarrow \mathbb{R}^n$ mapping with a single ReLU activation layer of size $h$. Under very mild distributional assumptions on $x^*$, we prove that the norm of the expected gradient of the standard squared loss function is asymptotically (in sparse code dimension) negligible for all points in a small neighborhood of $A^*$. This is supported with experimental evidence using synthetic data. We also conduct experiments to suggest that $A^*$ is a local minimum. Along the way we prove that a layer of ReLU gates can be set up to automatically recover the support of the sparse codes. This property holds independent of the loss function. We believe that it could be of independent interest.
Tasks Dictionary Learning
Published 2017-08-12
URL http://arxiv.org/abs/1708.03735v2
PDF http://arxiv.org/pdf/1708.03735v2.pdf
PWC https://paperswithcode.com/paper/sparse-coding-and-autoencoders
Repo
Framework
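
The support-recovery remark in the abstract above can be illustrated with a small sketch. This is not the paper's construction: the dictionary (an identity block next to a scaled Hadamard block), the threshold of 0.5, and the names `build_dictionary` and `relu_support` are all assumptions chosen so that the example is deterministic and uses only the Python standard library.

```python
import math


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def build_dictionary(n):
    """Return the 2n columns of [I | H / sqrt(n)] (an assumed toy dictionary).

    Every column has unit norm, and any two distinct columns have an inner
    product of magnitude at most 1 / sqrt(n), so the dictionary is
    overcomplete and incoherent.
    """
    H = hadamard(n)
    scale = 1.0 / math.sqrt(n)
    identity_cols = [[1.0 if r == c else 0.0 for r in range(n)] for c in range(n)]
    hadamard_cols = [[H[r][c] * scale for r in range(n)] for c in range(n)]
    return identity_cols + hadamard_cols


def relu_support(cols, y, threshold):
    """Apply ReLU(<a_i, y> - threshold) per column; return the indices that fire."""
    fired = set()
    for i, col in enumerate(cols):
        pre_activation = sum(a * b for a, b in zip(col, y)) - threshold
        if max(0.0, pre_activation) > 0.0:
            fired.add(i)
    return fired


n = 64
cols = build_dictionary(n)
x_star = {3: 1.0, 100: 1.2}  # hypothetical sparse code: index -> nonzero value
y = [sum(v * cols[i][r] for i, v in x_star.items()) for r in range(n)]  # y = A x
recovered = relu_support(cols, y, threshold=0.5)
print(sorted(recovered))
```

With coherence 1/8, each on-support correlation is at least 1.0 - 0.15 = 0.85 and each off-support correlation is at most 0.15 in magnitude, so the thresholded ReLU layer fires exactly on indices 3 and 100.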

Encoder Based Lifelong Learning

Title Encoder Based Lifelong Learning
Authors Amal Rannen Triki, Rahaf Aljundi, Mathew B. Blaschko, Tinne Tuytelaars
Abstract This paper introduces a new lifelong learning solution where a single model is trained for a sequence of tasks. The main challenge that vision systems face in this context is catastrophic forgetting: as they tend to adapt to the most recently seen task, they lose performance on the tasks that were learned previously. Our method aims at preserving the knowledge of the previous tasks while learning a new one by using autoencoders. For each task, an under-complete autoencoder is learned, capturing the features that are crucial for its achievement. When a new task is presented to the system, we prevent the reconstructions of the features with these autoencoders from changing, which has the effect of preserving the information on which the previous tasks are mainly relying. At the same time, the features are given space to adjust to the most recent environment as only their projection into a low-dimensional submanifold is controlled. The proposed system is evaluated on image classification tasks and shows a reduction of forgetting over the state of the art.
Tasks Image Classification
Published 2017-04-06
URL http://arxiv.org/abs/1704.01920v1
PDF http://arxiv.org/pdf/1704.01920v1.pdf
PWC https://paperswithcode.com/paper/encoder-based-lifelong-learning
Repo
Framework

Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition

Title Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition
Authors Hailin Chen, Shengping Cui, Sebastian Li
Abstract Through this project, we researched transfer learning methods and their application to real-world problems. By implementing and modifying various transfer learning methods for our problem, we gained insight into the advantages and disadvantages of these methods, as well as experience in developing neural network models for knowledge transfer. Due to time constraints, we applied only one representative method for each major approach in transfer learning. As pointed out in the literature review, each method has its own assumptions, strengths, and shortcomings. We therefore believe that an ensemble-learning approach combining the different methods should yield better performance, which can be a focus of our future research.
Tasks Activity Recognition, Human Activity Recognition, Transfer Learning
Published 2017-07-08
URL http://arxiv.org/abs/1707.02412v1
PDF http://arxiv.org/pdf/1707.02412v1.pdf
PWC https://paperswithcode.com/paper/application-of-transfer-learning-approaches
Repo
Framework

Driver Identification Using Automobile Sensor Data from a Single Turn

Title Driver Identification Using Automobile Sensor Data from a Single Turn
Authors David Hallac, Abhijit Sharang, Rainer Stahlmann, Andreas Lamprecht, Markus Huber, Martin Roehder, Rok Sosic, Jure Leskovec
Abstract As automotive electronics continue to advance, cars are becoming more and more reliant on sensors to perform everyday driving operations. These sensors are omnipresent and help the car navigate, reduce accidents, and provide comfortable rides. However, they can also be used to learn about the drivers themselves. In this paper, we propose a method to predict, from sensor data collected at a single turn, the identity of a driver out of a given set of individuals. We cast the problem in terms of time series classification, where our dataset contains sensor readings at one turn, repeated several times by multiple drivers. We build a classifier to find unique patterns in each individual’s driving style, which are visible in the data even on such a short road segment. To test our approach, we analyze a new dataset collected by AUDI AG and Audi Electronics Venture, where a fleet of test vehicles was equipped with automotive data loggers storing all sensor readings on real roads. We show that turns are particularly well-suited for detecting variations across drivers, especially when compared to straightaways. We then focus on the 12 most frequently made turns in the dataset, which include rural, urban, highway on-ramps, and more, obtaining accurate identification results and learning useful insights about driver behavior in a variety of settings.
Tasks Time Series, Time Series Classification
Published 2017-06-09
URL http://arxiv.org/abs/1708.04636v1
PDF http://arxiv.org/pdf/1708.04636v1.pdf
PWC https://paperswithcode.com/paper/driver-identification-using-automobile-sensor
Repo
Framework

Sparse Algorithm for Robust LSSVM in Primal Space

Title Sparse Algorithm for Robust LSSVM in Primal Space
Authors Li Chen, Shuisheng Zhou
Abstract Because it admits a closed-form solution, the least squares support vector machine (LSSVM) has been widely used for classification and regression problems, with performance comparable to other types of SVMs. However, LSSVM has two drawbacks: it is sensitive to outliers and lacks sparseness. Robust LSSVM (R-LSSVM) partly overcomes the first drawback via a nonconvex truncated loss function, but current algorithms for R-LSSVM produce dense solutions, so they still suffer from the second drawback and are inefficient for training large-scale problems. In this paper, we interpret the robustness of R-LSSVM from a re-weighted viewpoint and give a primal R-LSSVM by the representer theorem. The new model may have a sparse solution if the corresponding kernel matrix has low rank. Then, by approximating the kernel matrix with a low-rank matrix and smoothing the loss function with an entropy penalty function, we propose a convergent sparse R-LSSVM (SR-LSSVM) algorithm to achieve the sparse solution of primal R-LSSVM, which overcomes both drawbacks of LSSVM simultaneously. The proposed algorithm has lower complexity than the existing algorithms and is very efficient for training large-scale problems. Many experimental results illustrate that SR-LSSVM can achieve better or comparable performance with less training time than related algorithms, especially for training large-scale problems.
Tasks
Published 2017-02-07
URL http://arxiv.org/abs/1702.01935v1
PDF http://arxiv.org/pdf/1702.01935v1.pdf
PWC https://paperswithcode.com/paper/sparse-algorithm-for-robust-lssvm-in-primal
Repo
Framework

Basic Formal Properties of A Relational Model of The Mathematical Theory of Evidence

Title Basic Formal Properties of A Relational Model of The Mathematical Theory of Evidence
Authors Mieczysław A. Kłopotek, Sławomir T. Wierzchoń
Abstract The paper presents a novel view of the Dempster-Shafer belief function as a measure of diversity in relational databases. It is demonstrated that, under this interpretation, the Dempster rule of evidence combination corresponds to the join operator of relational database theory. This rough-set-based interpretation is qualitative in nature and can represent a number of belief function operators. The interpretation has the following property: given a definition of the belief measure of objects in the interpretation domain, we can perform operations in this domain, and the measure of the resulting object is derivable from the measures of the component objects via a belief operator. We demonstrate this property for the Dempster rule of combination, marginalization, Shafer’s conditioning, independent variables, and Shenoy’s notion of conditional independence of variables. The interpretation is based on rough sets (in connection with decision tables), but differs from previous interpretations of this type in that it counts the diversity rather than frequencies in a decision table.
Tasks
Published 2017-04-08
URL http://arxiv.org/abs/1704.02468v1
PDF http://arxiv.org/pdf/1704.02468v1.pdf
PWC https://paperswithcode.com/paper/basic-formal-properties-of-a-relational-model
Repo
Framework
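
For readers unfamiliar with the Dempster rule of combination mentioned in the abstract above, the following is a minimal sketch of the standard rule on a two-element frame. It does not implement the paper's relational (join-based) interpretation; the mass functions `m1` and `m2` and the function name `dempster_combine` are illustrative assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule.

    Masses of pairs with a non-empty intersection are multiplied and assigned
    to that intersection; mass falling on the empty set is the conflict K and
    is removed by renormalizing with 1 / (1 - K).
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("total conflict: the rule is undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Hypothetical evidence over the frame {a, b}.
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.5, frozenset("ab"): 0.5}
m12 = dempster_combine(m1, m2)
for focal, mass in sorted(m12.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), round(mass, 4))
```

Here the conflict is 0.6 × 0.5 = 0.3, so the combined masses are 0.3/0.7 for {a}, 0.2/0.7 for {b}, and 0.2/0.7 for {a, b}.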

Random Features for Compositional Kernels

Title Random Features for Compositional Kernels
Authors Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer
Abstract We describe and analyze a simple random feature scheme (RFS) from prescribed compositional kernels. The compositional kernels we use are inspired by the structure of convolutional neural networks and kernels. The resulting scheme yields sparse and efficiently computable features. Each random feature can be represented as an algebraic expression over a small number of (random) paths in a composition tree. Thus, compositional random features can be stored compactly. The discrete nature of the generation process enables de-duplication of repeated features, further compacting the representation and increasing the diversity of the embeddings. Our approach complements and can be combined with previous random feature schemes.
Tasks
Published 2017-03-22
URL http://arxiv.org/abs/1703.07872v1
PDF http://arxiv.org/pdf/1703.07872v1.pdf
PWC https://paperswithcode.com/paper/random-features-for-compositional-kernels
Repo
Framework

Logical Parsing from Natural Language Based on a Neural Translation Model

Title Logical Parsing from Natural Language Based on a Neural Translation Model
Authors Liang Li, Pengyu Li, Yifan Liu, Tao Wan, Zengchang Qin
Abstract Semantic parsing has emerged as a significant and powerful paradigm for natural language interface and question answering systems. Traditional methods of building a semantic parser rely on high-quality lexicons, hand-crafted grammars and linguistic features, which are limited by the applied domain or representation. In this paper, we propose a general approach to learning from denotations based on a Seq2Seq model augmented with an attention mechanism. We encode the input sequence into vectors and use dynamic programming to infer candidate logical forms. We utilize the fact that similar utterances should have similar logical forms to help reduce the search space. Under our learning policy, the Seq2Seq model can learn mappings gradually in the presence of noise. Curriculum learning is adopted to make the learning smoother. We test our method on the arithmetic domain, which shows that our model can successfully infer the correct logical forms and learn the word meanings, compositionality and operation orders simultaneously.
Tasks Question Answering, Semantic Parsing
Published 2017-05-09
URL http://arxiv.org/abs/1705.03389v1
PDF http://arxiv.org/pdf/1705.03389v1.pdf
PWC https://paperswithcode.com/paper/logical-parsing-from-natural-language-based
Repo
Framework

Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks

Title Who Will Share My Image? Predicting the Content Diffusion Path in Online Social Networks
Authors Wenjian Hu, Krishna Kumar Singh, Fanyi Xiao, Jinyoung Han, Chen-Nee Chuah, Yong Jae Lee
Abstract Content popularity prediction has been extensively studied due to its importance and interest for both users and hosts of social media sites like Facebook, Instagram, Twitter, and Pinterest. However, existing work mainly focuses on modeling popularity using a single metric such as the total number of likes or shares. In this work, we propose Diffusion-LSTM, a memory-based deep recurrent network that learns to recursively predict the entire diffusion path of an image through a social network. By combining user social features and image features, and encoding the diffusion path taken thus far with an explicit memory cell, our model predicts the diffusion path of an image more accurately compared to alternate baselines that either encode only image or social features, or lack memory. By mapping individual users to user prototypes, our model can generalize to new users not seen during training. Finally, we demonstrate our model’s capability of generating diffusion trees, and show that the generated trees closely resemble ground-truth trees.
Tasks
Published 2017-05-25
URL http://arxiv.org/abs/1705.09275v4
PDF http://arxiv.org/pdf/1705.09275v4.pdf
PWC https://paperswithcode.com/paper/who-will-share-my-image-predicting-the
Repo
Framework

Character-Word LSTM Language Models

Title Character-Word LSTM Language Models
Authors Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq
Abstract We present a Character-Word Long Short-Term Memory Language Model which both reduces the perplexity with respect to a baseline word-level language model and reduces the number of parameters of the model. Character information can reveal structural (dis)similarities between words and can even be used when a word is out-of-vocabulary, thus improving the modeling of infrequent and unknown words. By concatenating word and character embeddings, we achieve up to 2.77% relative improvement on English compared to a baseline model with a similar amount of parameters and 4.57% on Dutch. Moreover, we also outperform baseline word-level models with a larger number of parameters.
Tasks Language Modelling
Published 2017-04-10
URL http://arxiv.org/abs/1704.02813v1
PDF http://arxiv.org/pdf/1704.02813v1.pdf
PWC https://paperswithcode.com/paper/character-word-lstm-language-models
Repo
Framework
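
The idea of concatenating word and character embeddings, described in the abstract above, can be sketched as follows. The details are assumptions made for illustration and are not taken from the paper: a fixed number of leading characters, a `<pad>` symbol for short words, an `<unk>` word embedding for out-of-vocabulary words, random embedding tables, and the helper names `make_table` and `char_word_input`.

```python
import random


def make_table(symbols, dim, rng):
    """Random embedding table: symbol -> list of `dim` floats."""
    return {s: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for s in symbols}


def char_word_input(word, word_emb, char_emb, num_chars):
    """Concatenate the word embedding with embeddings of its first characters."""
    word_vec = word_emb.get(word, word_emb["<unk>"])  # OOV words share <unk>
    chars = list(word[:num_chars])
    chars += ["<pad>"] * (num_chars - len(chars))  # pad short words
    vec = list(word_vec)
    for ch in chars:
        vec.extend(char_emb.get(ch, char_emb["<pad>"]))
    return vec


rng = random.Random(0)
WORD_DIM, CHAR_DIM, NUM_CHARS = 8, 3, 4
word_emb = make_table(["<unk>", "walk", "walked"], WORD_DIM, rng)
char_emb = make_table(["<pad>"] + list("abcdefghijklmnopqrstuvwxyz"), CHAR_DIM, rng)

known = char_word_input("walk", word_emb, char_emb, NUM_CHARS)
oov_a = char_word_input("walks", word_emb, char_emb, NUM_CHARS)
oov_b = char_word_input("talks", word_emb, char_emb, NUM_CHARS)
print(len(known))  # WORD_DIM + NUM_CHARS * CHAR_DIM
```

In this sketch the two out-of-vocabulary words share the same `<unk>` word part but differ in their character part, while "walks" shares its character part with the in-vocabulary "walk". This is the kind of structural similarity the abstract says character information can reveal.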

Exploration in NetHack With Secret Discovery

Title Exploration in NetHack With Secret Discovery
Authors Jonathan C. Campbell, Clark Verbrugge
Abstract Roguelike games generally feature exploration problems as a critical, yet often repetitive element of gameplay. Automated approaches, however, face challenges in terms of optimality, as well as due to incomplete information, such as from the presence of secret doors. This paper presents an algorithmic approach to exploration of roguelike dungeon environments. Our design aims to minimize exploration time, balancing coverage and discovery of secret areas with resource cost. Our algorithm is based on the concept of occupancy maps popular in robotics, adapted to encourage efficient discovery of secret access points. Through extensive experimentation on NetHack maps we show that this technique is significantly more efficient than simpler greedy approaches and an existing automated player. We further investigate optimized parameterization for the algorithm through a comprehensive data analysis. These results point towards better automation for players as well as heuristics applicable to fully automated gameplay.
Tasks
Published 2017-11-08
URL http://arxiv.org/abs/1711.03087v2
PDF http://arxiv.org/pdf/1711.03087v2.pdf
PWC https://paperswithcode.com/paper/exploration-in-nethack-with-secret-discovery
Repo
Framework

Trimming and Improving Skip-thought Vectors

Title Trimming and Improving Skip-thought Vectors
Authors Shuai Tang, Hailin Jin, Chen Fang, Zhaowen Wang, Virginia R. de Sa
Abstract The skip-thought model has been proven to be effective at learning sentence representations and capturing sentence semantics. In this paper, we propose a suite of techniques to trim and improve it. First, we validate a hypothesis that, given a current sentence, inferring the previous and inferring the next sentence provide similar supervision power; therefore, only one decoder for predicting the next sentence is preserved in our trimmed skip-thought model. Second, we present a connection layer between encoder and decoder to help the model to generalize better on semantic relatedness tasks. Third, we find that a good word embedding initialization is also essential for learning better sentence representations. We train our model unsupervised on a large corpus with contiguous sentences, and then evaluate the trained model on 7 supervised tasks, which include semantic relatedness, paraphrase detection, and text classification benchmarks. We empirically show that our proposed model is a faster, lighter-weight, and equally powerful alternative to the original skip-thought model.
Tasks Text Classification
Published 2017-06-09
URL http://arxiv.org/abs/1706.03148v1
PDF http://arxiv.org/pdf/1706.03148v1.pdf
PWC https://paperswithcode.com/paper/trimming-and-improving-skip-thought-vectors
Repo
Framework

A supervised approach to time scale detection in dynamic networks

Title A supervised approach to time scale detection in dynamic networks
Authors Benjamin Fish, Rajmonda S. Caceres
Abstract For any stream of time-stamped edges that form a dynamic network, an important choice is the aggregation granularity that an analyst uses to bin the data. Picking such a windowing of the data is often done by hand, or left up to the technology that is collecting the data. However, the choice can make a big difference in the properties of the dynamic network. This is the time scale detection problem. In previous work, this problem is often solved with a heuristic as an unsupervised task. As an unsupervised problem, it is difficult to measure how well a given algorithm performs. In addition, we show that the quality of the windowing is dependent on which task an analyst wants to perform on the network after windowing. Therefore the time scale detection problem should not be handled independently from the rest of the analysis of the network. We introduce a framework that tackles both of these issues: By measuring the performance of the time scale detection algorithm based on how well a given task is accomplished on the resulting network, we are for the first time able to directly compare different time scale detection algorithms to each other. Using this framework, we introduce time scale detection algorithms that take a supervised approach: they leverage ground truth on training data to find a good windowing of the test data. We compare the supervised approach to previous approaches and several baselines on real data.
Tasks
Published 2017-02-24
URL http://arxiv.org/abs/1702.07752v1
PDF http://arxiv.org/pdf/1702.07752v1.pdf
PWC https://paperswithcode.com/paper/a-supervised-approach-to-time-scale-detection
Repo
Framework
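
The windowing choice discussed in the abstract above can be made concrete with a small sketch. It only shows how a window width turns a stream of time-stamped edges into snapshots; it is not the paper's supervised algorithm, and the edge list and the function name `window_edges` are hypothetical.

```python
from collections import defaultdict


def window_edges(edges, width):
    """Bin (u, v, t) edges into snapshots of `width` time units.

    Returns the non-empty snapshots in time order; each snapshot is a set of
    undirected edges. Windows that contain no edges are skipped.
    """
    snapshots = defaultdict(set)
    for u, v, t in edges:
        snapshots[int(t // width)].add(frozenset((u, v)))
    return [snapshots[k] for k in sorted(snapshots)]


# Hypothetical stream of time-stamped edges.
edges = [("a", "b", 0), ("b", "c", 1), ("a", "b", 2), ("c", "d", 5), ("a", "d", 9)]

fine = window_edges(edges, width=2)
coarse = window_edges(edges, width=5)
print([len(s) for s in fine])    # snapshot sizes with narrow windows
print([len(s) for s in coarse])  # snapshot sizes with wide windows
```

On this toy stream, a width of 2 gives four snapshots with 2, 1, 1, and 1 edges, while a width of 5 gives two snapshots with 2 edges each, so even basic properties of the resulting dynamic network depend on the chosen time scale.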

Detecting and Explaining Crisis

Title Detecting and Explaining Crisis
Authors Rohan Kshirsagar, Robert Morris, Sam Bowman
Abstract Individuals on social media may reveal themselves to be in various states of crisis (e.g. suicide, self-harm, abuse, or eating disorders). Detecting crisis from social media text automatically and accurately can have profound consequences. However, detecting a general state of crisis without explaining why has limited applications. An explanation in this context is a coherent, concise subset of the text that rationalizes the crisis detection. We explore several methods to detect and explain crisis using a combination of neural and non-neural techniques. We evaluate these techniques on a unique data set obtained from Koko, an anonymous emotional support network available through various messaging applications. We annotate a small subset of the samples labeled with crisis with corresponding explanations. Our best technique significantly outperforms the baseline for detection and explanation.
Tasks
Published 2017-05-26
URL http://arxiv.org/abs/1705.09585v1
PDF http://arxiv.org/pdf/1705.09585v1.pdf
PWC https://paperswithcode.com/paper/detecting-and-explaining-crisis
Repo
Framework