October 20, 2019

2846 words 14 mins read

Paper Group AWR 257

Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise. Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention. Scoring Lexical Entailment with a Supervised Directional Similarity Network. Modality Distillation with Multiple Stream Networks for Action Recognition. D …

Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise

Title Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise
Authors Albert Akhriev, Jakub Marecek, Andrea Simonetto
Abstract In tracking low-rank models of time-varying matrices, we present a method robust to both uniformly-distributed measurement noise and arbitrarily-distributed “sparse” noise. In theory, we bound the tracking error. In practice, our use of randomised coordinate descent is scalable and allows for encouraging results on changedetection.net, a benchmark.
Tasks
Published 2018-09-10
URL https://arxiv.org/abs/1809.03550v3
PDF https://arxiv.org/pdf/1809.03550v3.pdf
PWC https://paperswithcode.com/paper/pursuit-of-low-rank-models-of-time-varying
Repo https://github.com/jmarecek/OnlineLowRank
Framework none
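
To make the idea concrete, below is a minimal NumPy sketch of robust low-rank tracking via randomised coordinate descent on a Huber-type loss. It is a schematic of the general approach only, not the authors' exact algorithm (see the repo above for that); learning rate, rank and the toy stream are illustrative assumptions.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Derivative of the Huber loss: linear in the residual for small
    residuals (measurement noise), constant for large ones (sparse noise)."""
    return np.clip(r, -delta, delta)

def rcd_pass(M, U, V, lr=0.05, n_coords=200, delta=1.0, rng=None):
    """One pass of randomised coordinate descent on a Huber-robust
    factorisation loss for M ~ U @ V.T; the residual is held fixed
    within the pass for simplicity."""
    rng = rng or np.random.default_rng()
    G = huber_grad(M - U @ V.T, delta)       # robustified residual
    for _ in range(n_coords):
        i, k = rng.integers(U.shape[0]), rng.integers(U.shape[1])
        U[i, k] += lr * (G[i] @ V[:, k])     # descent on one coordinate of U
        j = rng.integers(V.shape[0])
        V[j, k] += lr * (G[:, j] @ U[:, k])  # descent on one coordinate of V
    return U, V

# Tracking loop: warm-start the factors from the previous time step.
rng = np.random.default_rng(0)
U, V = rng.standard_normal((50, 3)), rng.standard_normal((40, 3))
for t in range(100):
    M_t = np.outer(np.sin(np.arange(50) + 0.01 * t), np.ones(40))  # toy stream
    U, V = rcd_pass(M_t, U, V, rng=rng)
```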

Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention

Title Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention
Authors Shiliang Zheng, Rui Xia
Abstract Deep learning techniques have achieved success in aspect-based sentiment analysis in recent years. However, two important issues remain to be further studied: 1) how to efficiently represent the target, especially when the target contains multiple words; and 2) how to utilize the interaction between the target and the left/right contexts to capture the most important words in them. In this paper, we propose an approach called left-center-right separated neural network with rotatory attention (LCR-Rot) to better address these two problems. Our approach has two characteristics: 1) it has three separated LSTMs, i.e., left, center and right LSTMs, corresponding to the three parts of a review (left context, target phrase and right context); 2) it has a rotatory attention mechanism which models the relation between the target and the left/right contexts. The target2context attention is used to capture the most indicative sentiment words in the left/right contexts. Subsequently, the context2target attention is used to capture the most important word in the target. This leads to a two-sided representation of the target: the left-aware target and the right-aware target. We compare our approach with ten recently proposed methods on three benchmark datasets. The results show that our approach significantly outperforms the state-of-the-art techniques.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2018-02-03
URL http://arxiv.org/abs/1802.00892v1
PDF http://arxiv.org/pdf/1802.00892v1.pdf
PWC https://paperswithcode.com/paper/left-center-right-separated-neural-network
Repo https://github.com/NUSTM/ABSC
Framework tf
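
The rotatory attention mechanism is easy to sketch. Below is a simplified PyTorch version that uses plain dot-product attention in place of the paper's learned bilinear scoring; tensor shapes and the pooling choice are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attend(query, keys):
    """Dot-product attention: pool `keys` (seq, dim) into one vector
    weighted by similarity to `query` (dim,)."""
    scores = F.softmax(keys @ query, dim=0)   # (seq,)
    return scores @ keys                      # (dim,)

def rotatory_attention(left_h, target_h, right_h):
    """Minimal sketch of LCR-Rot's rotatory attention over the hidden
    states of the three LSTMs, each of shape (seq_len, dim)."""
    t_pool = target_h.mean(dim=0)             # initial target summary
    # target2context: the target summary attends over each context
    left_rep = attend(t_pool, left_h)
    right_rep = attend(t_pool, right_h)
    # context2target: each context representation attends back over the
    # target, giving left-aware and right-aware target representations
    t_left = attend(left_rep, target_h)
    t_right = attend(right_rep, target_h)
    return torch.cat([left_rep, t_left, t_right, right_rep])

# e.g. rotatory_attention(torch.randn(5, 8), torch.randn(2, 8), torch.randn(4, 8))
```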

Scoring Lexical Entailment with a Supervised Directional Similarity Network

Title Scoring Lexical Entailment with a Supervised Directional Similarity Network
Authors Marek Rei, Daniela Gerz, Ivan Vulić
Abstract We present the Supervised Directional Similarity Network (SDSN), a novel neural architecture for learning task-specific transformation functions on top of general-purpose word embeddings. Relying on only a limited amount of supervision from task-specific scores on a subset of the vocabulary, our architecture is able to generalise and transform a general-purpose distributional vector space to model the relation of lexical entailment. Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.
Tasks Word Embeddings
Published 2018-05-23
URL http://arxiv.org/abs/1805.09355v1
PDF http://arxiv.org/pdf/1805.09355v1.pdf
PWC https://paperswithcode.com/paper/scoring-lexical-entailment-with-a-supervised
Repo https://github.com/marekrei/sdsn
Framework none
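
A minimal sketch of the architecture's shape, assuming simple sigmoid gating on top of pretrained embeddings and an MLP scorer; the paper's exact gating and feature set differ, so treat the layer choices below as assumptions.

```python
import torch
import torch.nn as nn

class SDSNSketch(nn.Module):
    """Sketch: transform general-purpose embeddings with word-specific
    gates, then score the directional pair for graded entailment."""
    def __init__(self, dim=300, hidden=100):
        super().__init__()
        self.gate_l = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.gate_r = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.scorer = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                                    nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, w1, w2):
        h1 = w1 * self.gate_l(w1)   # task-specific transform, left word
        h2 = w2 * self.gate_r(w2)   # task-specific transform, right word
        return self.scorer(torch.cat([h1, h2], dim=-1))
```

Training would regress the output against gold graded-entailment scores (e.g. HyperLex) on the supervised subset of the vocabulary.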

Modality Distillation with Multiple Stream Networks for Action Recognition

Title Modality Distillation with Multiple Stream Networks for Action Recognition
Authors Nuno Garcia, Pietro Morerio, Vittorio Murino
Abstract Diverse input data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while a (training) dataset could be accurately designed to include a variety of sensory inputs, it is often the case that not all modalities are available in real-life (testing) scenarios, where a model has to be deployed. This raises the challenge of how to learn robust representations leveraging multimodal data in the training stage, while considering limitations at test time, such as noisy or missing modalities. This paper presents a new approach for multimodal video action recognition, developed within the unified frameworks of distillation and privileged information, named generalized distillation. Particularly, we consider the case of learning representations from depth and RGB videos, while relying on RGB data only at test time. We propose a new approach to train a hallucination network that learns to distill depth features through multiplicative connections of spatiotemporal representations, leveraging soft labels and hard labels, as well as the distance between feature maps. We report state-of-the-art results on video action classification on the largest multimodal dataset available for this task, NTU RGB+D. Code available at https://github.com/ncgarcia/modality-distillation .
Tasks Action Classification, Temporal Action Localization
Published 2018-06-19
URL http://arxiv.org/abs/1806.07110v2
PDF http://arxiv.org/pdf/1806.07110v2.pdf
PWC https://paperswithcode.com/paper/modality-distillation-with-multiple-stream
Repo https://github.com/ncgarcia/modality-distillation
Framework tf
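
The training objective combines hard labels, temperature-softened teacher predictions, and a feature-map distance. Here is a hedged PyTorch sketch of such a generalized-distillation loss; the coefficient names and values are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def generalized_distillation_loss(logits_s, feats_s, logits_t, feats_t,
                                  labels, T=2.0, alpha=0.5, beta=0.1):
    """Sketch of a generalized-distillation objective. The teacher sees
    depth at training time; the student (hallucination stream) must
    reproduce its predictions and features from RGB alone."""
    hard = F.cross_entropy(logits_s, labels)            # ground-truth labels
    soft = F.kl_div(F.log_softmax(logits_s / T, dim=1), # softened teacher labels
                    F.softmax(logits_t.detach() / T, dim=1),
                    reduction='batchmean') * T * T
    feat = F.mse_loss(feats_s, feats_t.detach())        # feature-map distance
    return (1 - alpha) * hard + alpha * soft + beta * feat
```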

Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features

Title Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features
Authors Naveen Sai Madiraju, Seid M. Sadat, Dimitry Fisher, Homa Karimabadi
Abstract Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. Here we propose a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction and temporal clustering into a single, fully unsupervised, end-to-end learning framework. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. It then jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements and the application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into the temporal features that the network has learned for its clustering, we apply a visualization method that generates a region-of-interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, we show that the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion.
Tasks Dimensionality Reduction, Time Series
Published 2018-02-04
URL http://arxiv.org/abs/1802.01059v1
PDF http://arxiv.org/pdf/1802.01059v1.pdf
PWC https://paperswithcode.com/paper/deep-temporal-clustering-fully-unsupervised
Repo https://github.com/saeeeeru/dtc-tensorflow
Framework tf
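
The clustering layer can be sketched in the style of DEC-type soft assignments. Note the paper customizes the similarity metric for time series, whereas this stand-in uses squared Euclidean distance, so treat it as a schematic.

```python
import torch

def soft_assignments(z, centroids, alpha=1.0):
    """Clustering layer: Student's-t similarity between latent codes
    z (batch, dim) and cluster centroids (k, dim), row-normalised."""
    d2 = torch.cdist(z, centroids).pow(2)
    q = (1.0 + d2 / alpha).pow(-(alpha + 1) / 2)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    """Sharpened targets that emphasise high-confidence assignments."""
    p = q.pow(2) / q.sum(dim=0)
    return p / p.sum(dim=1, keepdim=True)
```

Joint training then minimizes the autoencoder reconstruction loss plus the KL divergence between `target_distribution(q)` and `q`.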

Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present

Title Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
Authors Xinpeng Chen, Lin Ma, Wenhao Jiang, Jian Yao, Wei Liu
Abstract Recently, caption generation with an encoder-decoder framework has been extensively studied and applied in different domains, such as image captioning, code captioning, and so on. In this paper, we propose a novel architecture, namely Auto-Reconstructor Network (ARNet), which, coupled with the conventional encoder-decoder framework, works in an end-to-end fashion to generate captions. ARNet aims at reconstructing the previous hidden state with the present one, in addition to serving as an input-dependent transition operator. Therefore, ARNet encourages the current hidden state to embed more information from the previous one, which can help regularize the transition dynamics of recurrent neural networks (RNNs). Extensive experimental results show that our proposed ARNet boosts the performance over the existing encoder-decoder models on both image captioning and source code captioning tasks. Additionally, ARNet remarkably reduces the discrepancy between training and inference processes for caption generation. Furthermore, the performance on permuted sequential MNIST demonstrates that ARNet can effectively regularize RNNs, especially when modeling long-term dependencies. Our code is available at: https://github.com/chenxinpeng/ARNet
Tasks Image Captioning
Published 2018-03-30
URL http://arxiv.org/abs/1803.11439v2
PDF http://arxiv.org/pdf/1803.11439v2.pdf
PWC https://paperswithcode.com/paper/regularizing-rnns-for-caption-generation-by
Repo https://github.com/chenxinpeng/ARNet
Framework tf
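
The auto-reconstructor itself is compact: an LSTM cell that tries to rebuild the decoder's previous hidden state from the current one, with the reconstruction error added to the usual captioning loss. A minimal PyTorch sketch (layer sizes are assumptions):

```python
import torch
import torch.nn as nn

class ARNetSketch(nn.Module):
    """Sketch of the auto-reconstructor regularizer."""
    def __init__(self, hidden):
        super().__init__()
        self.cell = nn.LSTMCell(hidden, hidden)

    def reconstruction_loss(self, hidden_states):
        """hidden_states: list of decoder states h_1..h_T, each (batch, hidden).
        Returns the summed error of rebuilding h_{t-1} from h_t."""
        hx = torch.zeros_like(hidden_states[0])
        cx = torch.zeros_like(hx)
        loss = 0.0
        for t in range(1, len(hidden_states)):
            hx, cx = self.cell(hidden_states[t], (hx, cx))
            loss = loss + (hx - hidden_states[t - 1]).pow(2).mean()
        return loss  # added, with a weight, to the captioning loss
```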

code2vec: Learning Distributed Representations of Code

Title code2vec: Learning Distributed Representations of Code
Authors Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
Abstract We present a neural model for representing snippets of code as continuous distributed vectors (“code embeddings”). The main idea is to represent a code snippet as a single fixed-length $\textit{code vector}$, which can be used to predict semantic properties of the snippet. This is performed by decomposing code to a collection of paths in its abstract syntax tree, and learning the atomic representation of each path $\textit{simultaneously}$ with learning how to aggregate a set of them. We demonstrate the effectiveness of our approach by using it to predict a method’s name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 14M methods. We show that code vectors trained on this dataset can predict method names from files that were completely unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies. Compared to previous techniques over the same dataset, our approach obtains a relative improvement of over 75% and is the first to successfully predict method names based on a large, cross-project corpus. Our trained model, visualizations and vector similarities are available as an interactive online demo at http://code2vec.org. The code, data, and trained models are available at https://github.com/tech-srl/code2vec.
Tasks
Published 2018-03-26
URL http://arxiv.org/abs/1803.09473v5
PDF http://arxiv.org/pdf/1803.09473v5.pdf
PWC https://paperswithcode.com/paper/code2vec-learning-distributed-representations
Repo https://github.com/kano1021/my-internship
Framework none
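
The aggregation step is the heart of the model: embed each (source token, path, target token) triple, then attention-pool the set into one code vector. A simplified PyTorch sketch (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class Code2VecSketch(nn.Module):
    """Sketch of code2vec aggregation over AST path-contexts."""
    def __init__(self, n_tokens, n_paths, dim=128):
        super().__init__()
        self.tok = nn.Embedding(n_tokens, dim)
        self.path = nn.Embedding(n_paths, dim)
        self.combine = nn.Linear(3 * dim, dim)
        self.attn = nn.Linear(dim, 1, bias=False)

    def forward(self, src, path, dst):
        """src/path/dst: (n_contexts,) index tensors for one method body."""
        c = torch.tanh(self.combine(torch.cat(
            [self.tok(src), self.path(path), self.tok(dst)], dim=-1)))
        a = torch.softmax(self.attn(c), dim=0)   # attention weight per context
        return (a * c).sum(dim=0)                # the fixed-length code vector
```

The resulting code vector is then scored against candidate method-name embeddings to predict the name.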

Celer: a Fast Solver for the Lasso with Dual Extrapolation

Title Celer: a Fast Solver for the Lasso with Dual Extrapolation
Authors Mathurin Massias, Alexandre Gramfort, Joseph Salmon
Abstract Convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but solving the resulting optimization problems can be slow. To accelerate solvers, state-of-the-art approaches reduce the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of improved dual points. This enables tighter control of optimality as used in stopping criteria, as well as better screening performance of Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe screening rules. Thanks to our new dual point construction, we show significant computational speedups on multiple real-world problems.
Tasks
Published 2018-02-21
URL http://arxiv.org/abs/1802.07481v3
PDF http://arxiv.org/pdf/1802.07481v3.pdf
PWC https://paperswithcode.com/paper/celer-a-fast-solver-for-the-lasso-with-dual
Repo https://github.com/mathurinm/celer
Framework none
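
The dual point construction can be sketched in a few lines of NumPy: combine the most recent residuals with norm-minimising coefficients that sum to one, then rescale the result into the dual feasible set. This is a schematic of the idea, not the package's tuned implementation.

```python
import numpy as np

def dual_extrapolation(residuals, X, lam):
    """Sketch of Celer-style dual extrapolation for the Lasso.
    residuals: list of the K most recent residual vectors r_t = y - X @ w_t."""
    R = np.column_stack(residuals)                     # (n_samples, K)
    K = R.shape[1]
    # coefficients c with sum(c) = 1 minimising ||R @ c|| (tiny ridge for safety)
    z = np.linalg.solve(R.T @ R + 1e-12 * np.eye(K), np.ones(K))
    c = z / z.sum()
    r_acc = R @ c                                      # extrapolated residual
    # rescale so that ||X.T @ theta||_inf <= 1: a feasible dual point,
    # hence a valid duality-gap certificate for screening and stopping
    theta = r_acc / max(lam, np.linalg.norm(X.T @ r_acc, np.inf))
    return theta
```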

FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling

Title FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling
Authors Jie Chen, Tengfei Ma, Cao Xiao
Abstract The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning. This model, however, was originally designed to be learned in the presence of both training and test data. Moreover, the recursive neighborhood expansion across layers poses time and memory challenges for training with large, dense graphs. To relax the requirement of simultaneous availability of test data, we interpret graph convolutions as integral transforms of embedding functions under probability measures. Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work—FastGCN. Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference. We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models. In particular, training is orders of magnitude more efficient while predictions remain comparably accurate.
Tasks Node Classification
Published 2018-01-30
URL http://arxiv.org/abs/1801.10247v1
PDF http://arxiv.org/pdf/1801.10247v1.pdf
PWC https://paperswithcode.com/paper/fastgcn-fast-learning-with-graph
Repo https://github.com/jiechenjiechen/FastGCN-matlab
Framework tf
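
The layer-wise sampling is straightforward to sketch: draw a handful of nodes from an importance distribution proportional to the squared column norms of the (normalised) adjacency, and rescale so the Monte Carlo estimate of A @ H @ W stays unbiased. A NumPy sketch under those assumptions:

```python
import numpy as np

def fastgcn_layer_sample(A, H, W, n_samples, rng=None):
    """Sketch of one FastGCN layer: a sampled, unbiased estimate of
    relu(A @ H @ W). A: (n, n) normalised adjacency, H: (n, d) features."""
    rng = rng or np.random.default_rng()
    q = np.linalg.norm(A, axis=0) ** 2
    q = q / q.sum()                              # importance distribution
    idx = rng.choice(A.shape[1], size=n_samples, p=q)
    # rescale each sampled term by 1/(t * q_u) to keep the estimate unbiased
    H_hat = (A[:, idx] / (n_samples * q[idx])) @ H[idx]
    return np.maximum(H_hat @ W, 0.0)            # ReLU
```

Because each layer touches only `n_samples` nodes instead of the full neighborhood expansion, training cost no longer grows with the recursive fan-out.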

Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks

Title Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks
Authors Binxuan Huang, Yanglan Ou, Kathleen M. Carley
Abstract Aspect-Based Sentiment Analysis, PyTorch implementations.
Tasks Aspect-Based Sentiment Analysis
Published 2018-04-18
URL http://arxiv.org/abs/1804.06536v1
PDF http://arxiv.org/pdf/1804.06536v1.pdf
PWC https://paperswithcode.com/paper/aspect-level-sentiment-classification-with
Repo https://github.com/songyouwei/ABSA-PyTorch
Framework pytorch
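
The attention-over-attention computation itself is a few lines: build the pairwise interaction matrix between sentence and target states, take column-wise and row-wise softmaxes, and combine them. A PyTorch sketch following the paper's description:

```python
import torch

def attention_over_attention(h_s, h_t):
    """AOA sketch. h_s: (n, d) sentence hidden states,
    h_t: (m, d) aspect-target hidden states."""
    I = h_s @ h_t.T                      # (n, m) pairwise interactions
    alpha = torch.softmax(I, dim=0)      # column-wise: sentence attn per target word
    beta = torch.softmax(I, dim=1)       # row-wise: target attn per sentence word
    beta_bar = beta.mean(dim=0)          # (m,) averaged target-level attention
    gamma = alpha @ beta_bar             # (n,) final sentence attention
    return h_s.T @ gamma                 # weighted sentence representation, (d,)
```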

Asynchronous Training of Word Embeddings for Large Text Corpora

Title Asynchronous Training of Word Embeddings for Large Text Corpora
Authors Avishek Anand, Megha Khosla, Jaspreet Singh, Jan-Hendrik Zab, Zijian Zhang
Abstract Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is typically processed sequentially and parameters are updated synchronously. Distributed architectures for asynchronous training that have been proposed either focus on scaling vocabulary sizes and dimensionality or suffer from expensive synchronization latencies. In this paper, we propose a scalable approach that instead partitions the input space, in order to scale to massive text corpora without sacrificing the performance of the embeddings. Our training procedure does not involve any parameter synchronization except a final sub-model merge phase that typically executes in a few minutes. Our distributed training scales seamlessly to large corpus sizes, and models trained by our distributed procedure achieve comparable, and sometimes up to 45% better, performance on a variety of NLP benchmarks while requiring $1/10$ of the time taken by the baseline approach. Finally, we show that the approach is robust to missing words in sub-models and can effectively reconstruct word representations.
Tasks Information Retrieval, Word Embeddings
Published 2018-12-07
URL http://arxiv.org/abs/1812.03825v1
PDF http://arxiv.org/pdf/1812.03825v1.pdf
PWC https://paperswithcode.com/paper/asynchronous-training-of-word-embeddings-for
Repo https://github.com/jhzab/dist_w2v
Framework none
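
The scheme is embarrassingly parallel until the final merge. A toy sketch of the two phases, with simple vector averaging standing in for the paper's merge procedure (the trainer callback and its name are hypothetical):

```python
import numpy as np

def train_partitioned(partitions, train_fn):
    """Train one sub-model per corpus partition with no synchronization.
    `train_fn` is any embedding trainer returning {word: vector}."""
    return [train_fn(part) for part in partitions]  # embarrassingly parallel

def merge(sub_models):
    """Final merge phase: average each word's vectors across the
    sub-models that contain it."""
    merged = {}
    for model in sub_models:
        for word, vec in model.items():
            merged.setdefault(word, []).append(vec)
    return {w: np.mean(vs, axis=0) for w, vs in merged.items()}

# e.g. vectors = merge(train_partitioned([corpus_a, corpus_b], my_w2v_trainer))
```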

Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

Title Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection
Authors Pranav A
Abstract Ranking functions in information retrieval are often used in search engines to recommend relevant answers to a query. This paper takes this notion from information retrieval and applies it to the problem domain of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; and (2) graphical error modelling, which deduces the transformations. Current research focuses on the classification problem of distinguishing whether a pair of words are cognates. This paper addresses a harder problem: predicting a possible cognate from a given input. Our study shows that applying language modelling smoothing methods as the retrieval functions, in conjunction with positional segmentation and error modelling, gives better results than competing baselines in both classification and prediction of cognates. Source code is at: https://github.com/pranav-ust/cognates
Tasks Information Retrieval, Language Modelling
Published 2018-11-20
URL http://arxiv.org/abs/1811.08129v1
PDF http://arxiv.org/pdf/1811.08129v1.pdf
PWC https://paperswithcode.com/paper/alignment-analysis-of-sequential-segmentation
Repo https://github.com/pranav-ust/cognates
Framework none
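
As an illustration of the retrieval view, here is a toy Jelinek-Mercer smoothed query-likelihood scorer over character bigrams; the paper's positional segmentation and graphical error modelling are not reproduced here, so this is only a baseline-style sketch.

```python
import math
from collections import Counter

def bigrams(word):
    return [word[i:i + 2] for i in range(len(word) - 1)]

def lm_score(query, candidate, lexicon, lam=0.7):
    """Toy Jelinek-Mercer smoothed query likelihood: the candidate word
    is the 'document', the whole lexicon is the background collection."""
    d = Counter(bigrams(candidate))
    c = Counter(g for w in lexicon for g in bigrams(w))
    dn, cn = max(sum(d.values()), 1), max(sum(c.values()), 1)
    score = 0.0
    for g in bigrams(query):
        p = lam * d[g] / dn + (1 - lam) * c[g] / cn
        score += math.log(p) if p > 0 else math.log(1e-12)
    return score

# Rank lexicon entries as candidate cognates for a query word:
# sorted(lexicon, key=lambda w: lm_score("nacht", w, lexicon), reverse=True)
```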

Improving Information Retrieval Results for Persian Documents using FarsNet

Title Improving Information Retrieval Results for Persian Documents using FarsNet
Authors Adel Rahimi, Mohammad Bahrani
Abstract In this paper, we propose a new method for query expansion which uses FarsNet (the Persian WordNet) to find tokens related to the query and expand its semantic meaning. For this purpose, we use synonymy relations in FarsNet and extract the synonyms of the query words. This algorithm is used to enhance information retrieval systems and improve search results. The overall evaluation of this system in comparison to the baseline method (without query expansion) shows an improvement of about 9 percent in Mean Average Precision (MAP).
Tasks Information Retrieval
Published 2018-11-01
URL http://arxiv.org/abs/1811.00854v1
PDF http://arxiv.org/pdf/1811.00854v1.pdf
PWC https://paperswithcode.com/paper/improving-information-retrieval-results-for
Repo https://github.com/adelra/query-expansion
Framework none
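
The expansion step itself is simple. A toy sketch with a plain dictionary standing in for FarsNet synset lookup (the example words and the `synonyms` map are placeholders):

```python
def expand_query(query_tokens, synonyms):
    """Synonym-based query expansion: append each token's synset members
    (here a dict stand-in for FarsNet) without duplicates."""
    expanded = list(query_tokens)
    for tok in query_tokens:
        expanded.extend(s for s in synonyms.get(tok, []) if s not in expanded)
    return expanded

# e.g. expand_query(["book"], {"book": ["volume", "tome"]})
# -> ["book", "volume", "tome"], which is then sent to the retrieval engine
```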

MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval

Title MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval
Authors Helena Peic Tukuljac, Antoine Deleforge, Rémi Gribonval
Abstract This paper addresses the general problem of blind echo retrieval, i.e., given M sensors measuring in the discrete-time domain M mixtures of K delayed and attenuated copies of an unknown source signal, can the echo locations and weights be recovered? This problem has broad applications in fields such as sonar, seismology, ultrasound or room acoustics. It belongs to the broader class of blind channel identification problems, which have been intensively studied in signal processing. Existing methods in the literature proceed in two steps: (i) blind estimation of sparse discrete-time filters and (ii) echo information retrieval by peak-picking on filters. The precision of these methods is fundamentally limited by the rate at which the signals are sampled: estimated echo locations are necessarily on-grid, and since true locations never match the sampling grid, the weight estimation precision is impacted. This is the so-called basis-mismatch problem in compressed sensing. We propose a radically different approach to the problem, building on the framework of finite-rate-of-innovation sampling. The approach operates directly in the parameter space of echo locations and weights, and enables near-exact blind and off-grid echo retrieval from discrete-time measurements. It is shown to outperform conventional methods by several orders of magnitude in precision.
Tasks Information Retrieval
Published 2018-10-31
URL http://arxiv.org/abs/1810.13338v1
PDF http://arxiv.org/pdf/1810.13338v1.pdf
PWC https://paperswithcode.com/paper/mulan-a-blind-and-off-grid-method-for
Repo https://github.com/epfl-lts2/mulan
Framework none
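
The measurement model being inverted is worth writing down. Below is a NumPy sketch that simulates M channels of K delayed, attenuated copies of a source; linear interpolation stands in for band-limited fractional delay, and the point of the off-grid setting is that the true delays need not be integer multiples of 1/fs.

```python
import numpy as np

def multichannel_echoes(s, delays, weights, fs, n_samples):
    """Simulate x_m[n] = sum_k a_{m,k} * s(n/fs - tau_{m,k}).
    delays/weights: per-channel lists of echo delays (seconds) and gains."""
    t = np.arange(n_samples) / fs
    out = np.zeros((len(delays), n_samples))
    for m, (taus, amps) in enumerate(zip(delays, weights)):
        for tau, a in zip(taus, amps):
            # fractional delay via linear interpolation (illustrative only)
            out[m] += a * np.interp(t - tau, t, s, left=0.0, right=0.0)
    return out

fs, n = 16000, 1024
s = np.random.default_rng(1).standard_normal(n)
x = multichannel_echoes(s, delays=[[0.001, 0.0037]], weights=[[1.0, 0.6]],
                        fs=fs, n_samples=n)
```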

Clinical Concept Extraction with Contextual Word Embedding

Title Clinical Concept Extraction with Contextual Word Embedding
Authors Henghui Zhu, Ioannis Ch. Paschalidis, Amir Tahmasebi
Abstract Automatic extraction of clinical concepts is an essential step for turning the unstructured data within a clinical note into structured and actionable information. In this work, we propose a clinical concept extraction model for automatic annotation of clinical problems, treatments, and tests in clinical notes utilizing domain-specific contextual word embeddings. A contextual word embedding model is first trained on a corpus with a mixture of clinical reports and relevant Wikipedia pages in the clinical domain. Next, a bidirectional LSTM-CRF model is trained for clinical concept extraction using the contextual word embedding model. We tested our proposed model on the I2B2 2010 challenge dataset. Our proposed model achieved the best performance among reported baseline models and outperformed the state-of-the-art models by 3.4% in terms of F1-score.
Tasks Clinical Concept Extraction
Published 2018-10-24
URL http://arxiv.org/abs/1810.10566v2
PDF http://arxiv.org/pdf/1810.10566v2.pdf
PWC https://paperswithcode.com/paper/clinical-concept-extraction-with-contextual
Repo https://github.com/noc-lab/clinical_concept_extraction
Framework tf
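
The tagging architecture is the familiar BiLSTM-CRF over precomputed contextual embeddings. A minimal PyTorch sketch, assuming the third-party pytorch-crf package for the CRF layer; layer sizes are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF   # pip install pytorch-crf (assumed available)

class BiLSTMCRF(nn.Module):
    """Sketch: contextual embeddings -> BiLSTM -> linear emissions -> CRF."""
    def __init__(self, emb_dim, hidden, n_tags):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True,
                            batch_first=True)
        self.proj = nn.Linear(2 * hidden, n_tags)
        self.crf = CRF(n_tags, batch_first=True)

    def loss(self, emb, tags, mask):
        """emb: precomputed contextual embeddings (batch, seq, emb_dim);
        tags: (batch, seq) tag indices; mask: (batch, seq) bool tensor."""
        feats, _ = self.lstm(emb)
        return -self.crf(self.proj(feats), tags, mask=mask)  # negative log-lik

    def decode(self, emb, mask):
        feats, _ = self.lstm(emb)
        return self.crf.decode(self.proj(feats), mask=mask)  # best tag paths
```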