October 20, 2019

2846 words 14 mins read

Paper Group AWR 257

Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise. Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention. Scoring Lexical Entailment with a Supervised Directional Similarity Network. Modality Distillation with Multiple Stream Networks for Action Recognition. D …

Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise

Title Pursuit of Low-Rank Models of Time-Varying Matrices Robust to Sparse and Measurement Noise
Authors Albert Akhriev, Jakub Marecek, Andrea Simonetto
Abstract In tracking low-rank models of time-varying matrices, we present a method robust to both uniformly-distributed measurement noise and arbitrarily-distributed “sparse” noise. In theory, we bound the tracking error. In practice, our use of randomised coordinate descent is scalable and allows for encouraging results on changedetection.net, a benchmark.
Tasks
Published 2018-09-10
URL https://arxiv.org/abs/1809.03550v3
PDF https://arxiv.org/pdf/1809.03550v3.pdf
PWC https://paperswithcode.com/paper/pursuit-of-low-rank-models-of-time-varying
Repo https://github.com/jmarecek/OnlineLowRank
Framework none
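
To make the idea concrete, below is a minimal NumPy sketch of robust low-rank tracking via randomised coordinate descent on a Huber-type loss. It is a schematic of the general approach only, not the authors' exact algorithm (see the repo above for that); learning rate, rank and the toy stream are illustrative assumptions.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Derivative of the Huber loss: linear in the residual for small
    residuals (measurement noise), constant for large ones (sparse noise)."""
    return np.clip(r, -delta, delta)

def rcd_pass(M, U, V, lr=0.05, n_coords=200, delta=1.0, rng=None):
    """One pass of randomised coordinate descent on a Huber-robust
    factorisation loss for M ~ U @ V.T; the residual is held fixed
    within the pass for simplicity."""
    rng = rng or np.random.default_rng()
    G = huber_grad(M - U @ V.T, delta)       # robustified residual
    for _ in range(n_coords):
        i, k = rng.integers(U.shape[0]), rng.integers(U.shape[1])
        U[i, k] += lr * (G[i] @ V[:, k])     # descent on one coordinate of U
        j = rng.integers(V.shape[0])
        V[j, k] += lr * (G[:, j] @ U[:, k])  # descent on one coordinate of V
    return U, V

# Tracking loop: warm-start the factors from the previous time step.
rng = np.random.default_rng(0)
U, V = rng.standard_normal((50, 3)), rng.standard_normal((40, 3))
for t in range(100):
    M_t = np.outer(np.sin(np.arange(50) + 0.01 * t), np.ones(40))  # toy stream
    U, V = rcd_pass(M_t, U, V, rng=rng)
```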

Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention

Title Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention
Authors Shiliang Zheng, Rui Xia
Abstract Deep learning techniques have achieved success in aspect-based sentiment analysis in recent years. However, two important issues remain to be further studied: 1) how to efficiently represent the target, especially when the target contains multiple words; and 2) how to utilize the interaction between the target and the left/right contexts to capture the most important words in them. In this paper, we propose an approach called left-center-right separated neural network with rotatory attention (LCR-Rot) to better address these two problems. Our approach has two characteristics: 1) it has three separated LSTMs, i.e., left, center and right LSTMs, corresponding to the three parts of a review (left context, target phrase and right context); 2) it has a rotatory attention mechanism which models the relation between the target and the left/right contexts. The target2context attention is used to capture the most indicative sentiment words in the left/right contexts. Subsequently, the context2target attention is used to capture the most important word in the target. This leads to a two-sided representation of the target: the left-aware target and the right-aware target. We compare our approach with ten recently proposed methods on three benchmark datasets. The results show that our approach significantly outperforms the state-of-the-art techniques.
Tasks Aspect-Based Sentiment Analysis, Sentiment Analysis
Published 2018-02-03
URL http://arxiv.org/abs/1802.00892v1
PDF http://arxiv.org/pdf/1802.00892v1.pdf
PWC https://paperswithcode.com/paper/left-center-right-separated-neural-network
Repo https://github.com/NUSTM/ABSC
Framework tf
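
The rotatory attention mechanism is easy to sketch. Below is a simplified PyTorch version that uses plain dot-product attention in place of the paper's learned bilinear scoring; tensor shapes and the pooling choice are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attend(query, keys):
    """Dot-product attention: pool `keys` (seq, dim) into one vector
    weighted by similarity to `query` (dim,)."""
    scores = F.softmax(keys @ query, dim=0)   # (seq,)
    return scores @ keys                      # (dim,)

def rotatory_attention(left_h, target_h, right_h):
    """Minimal sketch of LCR-Rot's rotatory attention over the hidden
    states of the three LSTMs, each of shape (seq_len, dim)."""
    t_pool = target_h.mean(dim=0)             # initial target summary
    # target2context: the target summary attends over each context
    left_rep = attend(t_pool, left_h)
    right_rep = attend(t_pool, right_h)
    # context2target: each context representation attends back over the
    # target, giving left-aware and right-aware target representations
    t_left = attend(left_rep, target_h)
    t_right = attend(right_rep, target_h)
    return torch.cat([left_rep, t_left, t_right, right_rep])

# e.g. rotatory_attention(torch.randn(5, 8), torch.randn(2, 8), torch.randn(4, 8))
```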

Scoring Lexical Entailment with a Supervised Directional Similarity Network

Title Scoring Lexical Entailment with a Supervised Directional Similarity Network
Authors Marek Rei, Daniela Gerz, Ivan Vulić
Abstract We present the Supervised Directional Similarity Network (SDSN), a novel neural architecture for learning task-specific transformation functions on top of general-purpose word embeddings. Relying on only a limited amount of supervision from task-specific scores on a subset of the vocabulary, our architecture is able to generalise and transform a general-purpose distributional vector space to model the relation of lexical entailment. Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.
Tasks Word Embeddings
Published 2018-05-23
URL http://arxiv.org/abs/1805.09355v1
PDF http://arxiv.org/pdf/1805.09355v1.pdf
PWC https://paperswithcode.com/paper/scoring-lexical-entailment-with-a-supervised
Repo https://github.com/marekrei/sdsn
Framework none
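
A minimal sketch of the architecture's shape, assuming simple sigmoid gating on top of pretrained embeddings and an MLP scorer; the paper's exact gating and feature set differ, so treat the layer choices below as assumptions.

```python
import torch
import torch.nn as nn

class SDSNSketch(nn.Module):
    """Sketch: transform general-purpose embeddings with word-specific
    gates, then score the directional pair for graded entailment."""
    def __init__(self, dim=300, hidden=100):
        super().__init__()
        self.gate_l = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.gate_r = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.scorer = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                                    nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, w1, w2):
        h1 = w1 * self.gate_l(w1)   # task-specific transform, left word
        h2 = w2 * self.gate_r(w2)   # task-specific transform, right word
        return self.scorer(torch.cat([h1, h2], dim=-1))
```

Training would regress the output against gold graded-entailment scores (e.g. HyperLex) on the supervised subset of the vocabulary.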

Modality Distillation with Multiple Stream Networks for Action Recognition

Title Modality Distillation with Multiple Stream Networks for Action Recognition
Authors Nuno Garcia, Pietro Morerio, Vittorio Murino
Abstract Diverse input data modalities can provide complementary cues for several tasks, usually leading to more robust algorithms and better performance. However, while a (training) dataset could be accurately designed to include a variety of sensory inputs, it is often the case that not all modalities are available in real-life (testing) scenarios, where a model has to be deployed. This raises the challenge of how to learn robust representations leveraging multimodal data in the training stage, while considering limitations at test time, such as noisy or missing modalities. This paper presents a new approach for multimodal video action recognition, developed within the unified frameworks of distillation and privileged information, named generalized distillation. Particularly, we consider the case of learning representations from depth and RGB videos, while relying on RGB data only at test time. We propose a new approach to train a hallucination network that learns to distill depth features through multiplicative connections of spatiotemporal representations, leveraging soft labels and hard labels, as well as the distance between feature maps. We report state-of-the-art results on video action classification on the largest multimodal dataset available for this task, NTU RGB+D. Code available at https://github.com/ncgarcia/modality-distillation .
Tasks Action Classification, Temporal Action Localization
Published 2018-06-19
URL http://arxiv.org/abs/1806.07110v2
PDF http://arxiv.org/pdf/1806.07110v2.pdf
PWC https://paperswithcode.com/paper/modality-distillation-with-multiple-stream
Repo https://github.com/ncgarcia/modality-distillation
Framework tf
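
The training objective combines hard labels, temperature-softened teacher predictions, and a feature-map distance. Here is a hedged PyTorch sketch of such a generalized-distillation loss; the coefficient names and values are illustrative, not the paper's.

```python
import torch
import torch.nn.functional as F

def generalized_distillation_loss(logits_s, feats_s, logits_t, feats_t,
                                  labels, T=2.0, alpha=0.5, beta=0.1):
    """Sketch of a generalized-distillation objective. The teacher sees
    depth at training time; the student (hallucination stream) must
    reproduce its predictions and features from RGB alone."""
    hard = F.cross_entropy(logits_s, labels)            # ground-truth labels
    soft = F.kl_div(F.log_softmax(logits_s / T, dim=1), # softened teacher labels
                    F.softmax(logits_t.detach() / T, dim=1),
                    reduction='batchmean') * T * T
    feat = F.mse_loss(feats_s, feats_t.detach())        # feature-map distance
    return (1 - alpha) * hard + alpha * soft + beta * feat
```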

Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features

Title Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features
Authors Naveen Sai Madiraju, Seid M. Sadat, Dimitry Fisher, Homa Karimabadi
Abstract Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. Here we propose a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction and temporal clustering into a single, fully unsupervised, end-to-end learning framework. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. It then jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements and the application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into the temporal features that the network has learned for its clustering, we apply a visualization method that generates a region-of-interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, we show that the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion.
Tasks Dimensionality Reduction, Time Series
Published 2018-02-04
URL http://arxiv.org/abs/1802.01059v1
PDF http://arxiv.org/pdf/1802.01059v1.pdf
PWC https://paperswithcode.com/paper/deep-temporal-clustering-fully-unsupervised
Repo https://github.com/saeeeeru/dtc-tensorflow
Framework tf
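
The clustering layer can be sketched in the style of DEC-type soft assignments. Note the paper customizes the similarity metric for time series, whereas this stand-in uses squared Euclidean distance, so treat it as a schematic.

```python
import torch

def soft_assignments(z, centroids, alpha=1.0):
    """Clustering layer: Student's-t similarity between latent codes
    z (batch, dim) and cluster centroids (k, dim), row-normalised."""
    d2 = torch.cdist(z, centroids).pow(2)
    q = (1.0 + d2 / alpha).pow(-(alpha + 1) / 2)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    """Sharpened targets that emphasise high-confidence assignments."""
    p = q.pow(2) / q.sum(dim=0)
    return p / p.sum(dim=1, keepdim=True)
```

Joint training then minimizes the autoencoder reconstruction loss plus the KL divergence between `target_distribution(q)` and `q`.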

Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present

Title Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
Authors Xinpeng Chen, Lin Ma, Wenhao Jiang, Jian Yao, Wei Liu
Abstract Recently, caption generation with an encoder-decoder framework has been extensively studied and applied in different domains, such as image captioning, code captioning, and so on. In this paper, we propose a novel architecture, namely Auto-Reconstructor Network (ARNet), which, coupled with the conventional encoder-decoder framework, works in an end-to-end fashion to generate captions. ARNet aims at reconstructing the previous hidden state with the present one, in addition to serving as an input-dependent transition operator. Therefore, ARNet encourages the current hidden state to embed more information from the previous one, which can help regularize the transition dynamics of recurrent neural networks (RNNs). Extensive experimental results show that our proposed ARNet boosts the performance over the existing encoder-decoder models on both image captioning and source code captioning tasks. Additionally, ARNet remarkably reduces the discrepancy between training and inference processes for caption generation. Furthermore, the performance on permuted sequential MNIST demonstrates that ARNet can effectively regularize RNNs, especially when modeling long-term dependencies. Our code is available at: https://github.com/chenxinpeng/ARNet
Tasks Image Captioning
Published 2018-03-30
URL http://arxiv.org/abs/1803.11439v2
PDF http://arxiv.org/pdf/1803.11439v2.pdf
PWC https://paperswithcode.com/paper/regularizing-rnns-for-caption-generation-by
Repo https://github.com/chenxinpeng/ARNet
Framework tf
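
The auto-reconstructor itself is compact: an LSTM cell that tries to rebuild the decoder's previous hidden state from the current one, with the reconstruction error added to the usual captioning loss. A minimal PyTorch sketch (layer sizes are assumptions):

```python
import torch
import torch.nn as nn

class ARNetSketch(nn.Module):
    """Sketch of the auto-reconstructor regularizer."""
    def __init__(self, hidden):
        super().__init__()
        self.cell = nn.LSTMCell(hidden, hidden)

    def reconstruction_loss(self, hidden_states):
        """hidden_states: list of decoder states h_1..h_T, each (batch, hidden).
        Returns the summed error of rebuilding h_{t-1} from h_t."""
        hx = torch.zeros_like(hidden_states[0])
        cx = torch.zeros_like(hx)
        loss = 0.0
        for t in range(1, len(hidden_states)):
            hx, cx = self.cell(hidden_states[t], (hx, cx))
            loss = loss + (hx - hidden_states[t - 1]).pow(2).mean()
        return loss  # added, with a weight, to the captioning loss
```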

code2vec: Learning Distributed Representations of Code

Title code2vec: Learning Distributed Representations of Code
Authors Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
Abstract We present a neural model for representing snippets of code as continuous distributed vectors (“code embeddings”). The main idea is to represent a code snippet as a single fixed-length $\textit{code vector}$, which can be used to predict semantic properties of the snippet. This is performed by decomposing code to a collection of paths in its abstract syntax tree, and learning the atomic representation of each path $\textit{simultaneously}$ with learning how to aggregate a set of them. We demonstrate the effectiveness of our approach by using it to predict a method’s name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 14M methods. We show that code vectors trained on this dataset can predict method names from files that were completely unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies. Compared to previous techniques over the same dataset, our approach obtains a relative improvement of over 75% and is the first to successfully predict method names based on a large, cross-project corpus. Our trained model, visualizations and vector similarities are available as an interactive online demo at http://code2vec.org. The code, data, and trained models are available at https://github.com/tech-srl/code2vec.
Tasks
Published 2018-03-26
URL http://arxiv.org/abs/1803.09473v5
PDF http://arxiv.org/pdf/1803.09473v5.pdf
PWC https://paperswithcode.com/paper/code2vec-learning-distributed-representations
Repo https://github.com/kano1021/my-internship
Framework none
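
The aggregation step is the heart of the model: embed each (source token, path, target token) triple, then attention-pool the set into one code vector. A simplified PyTorch sketch (dimensions are illustrative):

```python
import torch
import torch.nn as nn

class Code2VecSketch(nn.Module):
    """Sketch of code2vec aggregation over AST path-contexts."""
    def __init__(self, n_tokens, n_paths, dim=128):
        super().__init__()
        self.tok = nn.Embedding(n_tokens, dim)
        self.path = nn.Embedding(n_paths, dim)
        self.combine = nn.Linear(3 * dim, dim)
        self.attn = nn.Linear(dim, 1, bias=False)

    def forward(self, src, path, dst):
        """src/path/dst: (n_contexts,) index tensors for one method body."""
        c = torch.tanh(self.combine(torch.cat(
            [self.tok(src), self.path(path), self.tok(dst)], dim=-1)))
        a = torch.softmax(self.attn(c), dim=0)   # attention weight per context
        return (a * c).sum(dim=0)                # the fixed-length code vector
```

The resulting code vector is then scored against candidate method-name embeddings to predict the name.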

Celer: a Fast Solver for the Lasso with Dual Extrapolation

Title Celer: a Fast Solver for the Lasso with Dual Extrapolation
Authors Mathurin Massias, Alexandre Gramfort, Joseph Salmon
Abstract Convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but solving the resulting optimization problems can be slow. To accelerate solvers, state-of-the-art approaches reduce the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of improved dual points. This enables tighter control of optimality as used in stopping criteria, as well as better screening performance of Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe screening rules. Thanks to our new dual point construction, we show significant computational speedups on multiple real-world problems.
Tasks
Published 2018-02-21
URL http://arxiv.org/abs/1802.07481v3
PDF http://arxiv.org/pdf/1802.07481v3.pdf
PWC https://paperswithcode.com/paper/celer-a-fast-solver-for-the-lasso-with-dual
Repo https://github.com/mathurinm/celer
Framework none
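
The dual point construction can be sketched in a few lines of NumPy: combine the most recent residuals with norm-minimising coefficients that sum to one, then rescale the result into the dual feasible set. This is a schematic of the idea, not the package's tuned implementation.

```python
import numpy as np

def dual_extrapolation(residuals, X, lam):
    """Sketch of Celer-style dual extrapolation for the Lasso.
    residuals: list of the K most recent residual vectors r_t = y - X @ w_t."""
    R = np.column_stack(residuals)                     # (n_samples, K)
    K = R.shape[1]
    # coefficients c with sum(c) = 1 minimising ||R @ c|| (tiny ridge for safety)
    z = np.linalg.solve(R.T @ R + 1e-12 * np.eye(K), np.ones(K))
    c = z / z.sum()
    r_acc = R @ c                                      # extrapolated residual
    # rescale so that ||X.T @ theta||_inf <= 1: a feasible dual point,
    # hence a valid duality-gap certificate for screening and stopping
    theta = r_acc / max(lam, np.linalg.norm(X.T @ r_acc, np.inf))
    return theta
```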

FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling

Title FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling
Authors Jie Chen, Tengfei Ma, Cao Xiao
Abstract The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning. This model, however, was originally designed to be learned in the presence of both training and test data. Moreover, the recursive neighborhood expansion across layers poses time and memory challenges for training with large, dense graphs. To relax the requirement of simultaneous availability of test data, we interpret graph convolutions as integral transforms of embedding functions under probability measures. Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work—FastGCN. Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference. We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models. In particular, training is orders of magnitude more efficient while predictions remain comparably accurate.
Tasks Node Classification
Published 2018-01-30
URL http://arxiv.org/abs/1801.10247v1
PDF http://arxiv.org/pdf/1801.10247v1.pdf
PWC https://paperswithcode.com/paper/fastgcn-fast-learning-with-graph
Repo https://github.com/jiechenjiechen/FastGCN-matlab
Framework tf
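
The layer-wise sampling is straightforward to sketch: draw a handful of nodes from an importance distribution proportional to the squared column norms of the (normalised) adjacency, and rescale so the Monte Carlo estimate of A @ H @ W stays unbiased. A NumPy sketch under those assumptions:

```python
import numpy as np

def fastgcn_layer_sample(A, H, W, n_samples, rng=None):
    """Sketch of one FastGCN layer: a sampled, unbiased estimate of
    relu(A @ H @ W). A: (n, n) normalised adjacency, H: (n, d) features."""
    rng = rng or np.random.default_rng()
    q = np.linalg.norm(A, axis=0) ** 2
    q = q / q.sum()                              # importance distribution
    idx = rng.choice(A.shape[1], size=n_samples, p=q)
    # rescale each sampled term by 1/(t * q_u) to keep the estimate unbiased
    H_hat = (A[:, idx] / (n_samples * q[idx])) @ H[idx]
    return np.maximum(H_hat @ W, 0.0)            # ReLU
```

Because each layer touches only `n_samples` nodes instead of the full neighborhood expansion, training cost no longer grows with the recursive fan-out.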

Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks

Title Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks
Authors Binxuan Huang, Yanglan Ou, Kathleen M. Carley
Abstract Aspect-Based Sentiment Analysis, PyTorch implementations.
Tasks Aspect-Based Sentiment Analysis
Published 2018-04-18
URL http://arxiv.org/abs/1804.06536v1
PDF http://arxiv.org/pdf/1804.06536v1.pdf
PWC https://paperswithcode.com/paper/aspect-level-sentiment-classification-with
Repo https://github.com/songyouwei/ABSA-PyTorch
Framework pytorch
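
The attention-over-attention computation itself is a few lines: build the pairwise interaction matrix between sentence and target states, take column-wise and row-wise softmaxes, and combine them. A PyTorch sketch following the paper's description:

```python
import torch

def attention_over_attention(h_s, h_t):
    """AOA sketch. h_s: (n, d) sentence hidden states,
    h_t: (m, d) aspect-target hidden states."""
    I = h_s @ h_t.T                      # (n, m) pairwise interactions
    alpha = torch.softmax(I, dim=0)      # column-wise: sentence attn per target word
    beta = torch.softmax(I, dim=1)       # row-wise: target attn per sentence word
    beta_bar = beta.mean(dim=0)          # (m,) averaged target-level attention
    gamma = alpha @ beta_bar             # (n,) final sentence attention
    return h_s.T @ gamma                 # weighted sentence representation, (d,)
```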

Asynchronous Training of Word Embeddings for Large Text Corpora

Title Asynchronous Training of Word Embeddings for Large Text Corpora
Authors Avishek Anand, Megha Khosla, Jaspreet Singh, Jan-Hendrik Zab, Zijian Zhang
Abstract Word embeddings are a powerful approach for analyzing language and have been widely popular in numerous tasks in information retrieval and text mining. Training embeddings over huge corpora is computationally expensive because the input is typically processed sequentially and parameters are updated synchronously. Distributed architectures for asynchronous training that have been proposed either focus on scaling vocabulary sizes and dimensionality or suffer from expensive synchronization latencies. In this paper, we propose a scalable approach that instead partitions the input space, in order to scale to massive text corpora without sacrificing the performance of the embeddings. Our training procedure does not involve any parameter synchronization except a final sub-model merge phase that typically executes in a few minutes. Our distributed training scales seamlessly to large corpus sizes, and models trained by our distributed procedure achieve comparable, and sometimes up to 45% better, performance on a variety of NLP benchmarks while requiring $1/10$ of the time taken by the baseline approach. Finally, we show that the approach is robust to missing words in sub-models and can effectively reconstruct word representations.
Tasks Information Retrieval, Word Embeddings
Published 2018-12-07
URL http://arxiv.org/abs/1812.03825v1
PDF http://arxiv.org/pdf/1812.03825v1.pdf
PWC https://paperswithcode.com/paper/asynchronous-training-of-word-embeddings-for
Repo https://github.com/jhzab/dist_w2v
Framework none
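
The scheme is embarrassingly parallel until the final merge. A toy sketch of the two phases, with simple vector averaging standing in for the paper's merge procedure (the trainer callback and its name are hypothetical):

```python
import numpy as np

def train_partitioned(partitions, train_fn):
    """Train one sub-model per corpus partition with no synchronization.
    `train_fn` is any embedding trainer returning {word: vector}."""
    return [train_fn(part) for part in partitions]  # embarrassingly parallel

def merge(sub_models):
    """Final merge phase: average each word's vectors across the
    sub-models that contain it."""
    merged = {}
    for model in sub_models:
        for word, vec in model.items():
            merged.setdefault(word, []).append(vec)
    return {w: np.mean(vs, axis=0) for w, vs in merged.items()}

# e.g. vectors = merge(train_partitioned([corpus_a, corpus_b], my_w2v_trainer))
```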

Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

Title Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection
Authors Pranav A
Abstract Ranking functions in information retrieval are often used in search engines to recommend relevant answers to a query. This paper takes this notion from information retrieval and applies it to the problem domain of cognate detection. The main contributions of this paper are: (1) positional segmentation, which incorporates the sequential notion; and (2) graphical error modelling, which deduces the transformations. Current research focuses on the classification problem of distinguishing whether a pair of words are cognates. This paper addresses a harder problem: predicting a possible cognate from a given input. Our study shows that applying language modelling smoothing methods as the retrieval functions, in conjunction with positional segmentation and error modelling, gives better results than competing baselines in both classification and prediction of cognates. Source code is at: https://github.com/pranav-ust/cognates
Tasks Information Retrieval, Language Modelling
Published 2018-11-20
URL http://arxiv.org/abs/1811.08129v1
PDF http://arxiv.org/pdf/1811.08129v1.pdf
PWC https://paperswithcode.com/paper/alignment-analysis-of-sequential-segmentation
Repo https://github.com/pranav-ust/cognates
Framework none
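
As an illustration of the retrieval view, here is a toy Jelinek-Mercer smoothed query-likelihood scorer over character bigrams; the paper's positional segmentation and graphical error modelling are not reproduced here, so this is only a baseline-style sketch.

```python
import math
from collections import Counter

def bigrams(word):
    return [word[i:i + 2] for i in range(len(word) - 1)]

def lm_score(query, candidate, lexicon, lam=0.7):
    """Toy Jelinek-Mercer smoothed query likelihood: the candidate word
    is the 'document', the whole lexicon is the background collection."""
    d = Counter(bigrams(candidate))
    c = Counter(g for w in lexicon for g in bigrams(w))
    dn, cn = max(sum(d.values()), 1), max(sum(c.values()), 1)
    score = 0.0
    for g in bigrams(query):
        p = lam * d[g] / dn + (1 - lam) * c[g] / cn
        score += math.log(p) if p > 0 else math.log(1e-12)
    return score

# Rank lexicon entries as candidate cognates for a query word:
# sorted(lexicon, key=lambda w: lm_score("nacht", w, lexicon), reverse=True)
```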

Improving Information Retrieval Results for Persian Documents using FarsNet

Title Improving Information Retrieval Results for Persian Documents using FarsNet
Authors Adel Rahimi, Mohammad Bahrani
Abstract In this paper, we propose a new method for query expansion which uses FarsNet (the Persian WordNet) to find tokens related to the query and expand its semantic meaning. For this purpose, we use synonymy relations in FarsNet and extract the synonyms of the query words. This algorithm is used to enhance information retrieval systems and improve search results. The overall evaluation of this system in comparison to the baseline method (without query expansion) shows an improvement of about 9 percent in Mean Average Precision (MAP).
Tasks Information Retrieval
Published 2018-11-01
URL http://arxiv.org/abs/1811.00854v1
PDF http://arxiv.org/pdf/1811.00854v1.pdf
PWC https://paperswithcode.com/paper/improving-information-retrieval-results-for
Repo https://github.com/adelra/query-expansion
Framework none
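
The expansion step itself is simple. A toy sketch with a plain dictionary standing in for FarsNet synset lookup (the example words and the `synonyms` map are placeholders):

```python
def expand_query(query_tokens, synonyms):
    """Synonym-based query expansion: append each token's synset members
    (here a dict stand-in for FarsNet) without duplicates."""
    expanded = list(query_tokens)
    for tok in query_tokens:
        expanded.extend(s for s in synonyms.get(tok, []) if s not in expanded)
    return expanded

# e.g. expand_query(["book"], {"book": ["volume", "tome"]})
# -> ["book", "volume", "tome"], which is then sent to the retrieval engine
```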

MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval

Title MULAN: A Blind and Off-Grid Method for Multichannel Echo Retrieval
Authors Helena Peic Tukuljac, Antoine Deleforge, Rémi Gribonval
Abstract This paper addresses the general problem of blind echo retrieval, i.e., given M sensors measuring in the discrete-time domain M mixtures of K delayed and attenuated copies of an unknown source signal, can the echo locations and weights be recovered? This problem has broad applications in fields such as sonar, seismology, ultrasound or room acoustics. It belongs to the broader class of blind channel identification problems, which have been intensively studied in signal processing. Existing methods in the literature proceed in two steps: (i) blind estimation of sparse discrete-time filters and (ii) echo information retrieval by peak-picking on filters. The precision of these methods is fundamentally limited by the rate at which the signals are sampled: estimated echo locations are necessarily on-grid, and since true locations never match the sampling grid, the weight estimation precision is impacted. This is the so-called basis-mismatch problem in compressed sensing. We propose a radically different approach to the problem, building on the framework of finite-rate-of-innovation sampling. The approach operates directly in the parameter space of echo locations and weights, and enables near-exact blind and off-grid echo retrieval from discrete-time measurements. It is shown to outperform conventional methods by several orders of magnitude in precision.
Tasks Information Retrieval
Published 2018-10-31
URL http://arxiv.org/abs/1810.13338v1
PDF http://arxiv.org/pdf/1810.13338v1.pdf
PWC https://paperswithcode.com/paper/mulan-a-blind-and-off-grid-method-for
Repo https://github.com/epfl-lts2/mulan
Framework none
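
The measurement model being inverted is worth writing down. Below is a NumPy sketch that simulates M channels of K delayed, attenuated copies of a source; linear interpolation stands in for band-limited fractional delay, and the point of the off-grid setting is that the true delays need not be integer multiples of 1/fs.

```python
import numpy as np

def multichannel_echoes(s, delays, weights, fs, n_samples):
    """Simulate x_m[n] = sum_k a_{m,k} * s(n/fs - tau_{m,k}).
    delays/weights: per-channel lists of echo delays (seconds) and gains."""
    t = np.arange(n_samples) / fs
    out = np.zeros((len(delays), n_samples))
    for m, (taus, amps) in enumerate(zip(delays, weights)):
        for tau, a in zip(taus, amps):
            # fractional delay via linear interpolation (illustrative only)
            out[m] += a * np.interp(t - tau, t, s, left=0.0, right=0.0)
    return out

fs, n = 16000, 1024
s = np.random.default_rng(1).standard_normal(n)
x = multichannel_echoes(s, delays=[[0.001, 0.0037]], weights=[[1.0, 0.6]],
                        fs=fs, n_samples=n)
```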

Clinical Concept Extraction with Contextual Word Embedding

Title Clinical Concept Extraction with Contextual Word Embedding
Authors Henghui Zhu, Ioannis Ch. Paschalidis, Amir Tahmasebi
Abstract Automatic extraction of clinical concepts is an essential step for turning the unstructured data within a clinical note into structured and actionable information. In this work, we propose a clinical concept extraction model for automatic annotation of clinical problems, treatments, and tests in clinical notes utilizing domain-specific contextual word embeddings. A contextual word embedding model is first trained on a corpus with a mixture of clinical reports and relevant Wikipedia pages in the clinical domain. Next, a bidirectional LSTM-CRF model is trained for clinical concept extraction using the contextual word embedding model. We tested our proposed model on the I2B2 2010 challenge dataset. Our proposed model achieved the best performance among reported baseline models and outperformed the state-of-the-art models by 3.4% in terms of F1-score.
Tasks Clinical Concept Extraction
Published 2018-10-24
URL http://arxiv.org/abs/1810.10566v2
PDF http://arxiv.org/pdf/1810.10566v2.pdf
PWC https://paperswithcode.com/paper/clinical-concept-extraction-with-contextual
Repo https://github.com/noc-lab/clinical_concept_extraction
Framework tf
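
The tagging architecture is the familiar BiLSTM-CRF over precomputed contextual embeddings. A minimal PyTorch sketch, assuming the third-party pytorch-crf package for the CRF layer; layer sizes are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF   # pip install pytorch-crf (assumed available)

class BiLSTMCRF(nn.Module):
    """Sketch: contextual embeddings -> BiLSTM -> linear emissions -> CRF."""
    def __init__(self, emb_dim, hidden, n_tags):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True,
                            batch_first=True)
        self.proj = nn.Linear(2 * hidden, n_tags)
        self.crf = CRF(n_tags, batch_first=True)

    def loss(self, emb, tags, mask):
        """emb: precomputed contextual embeddings (batch, seq, emb_dim);
        tags: (batch, seq) tag indices; mask: (batch, seq) bool tensor."""
        feats, _ = self.lstm(emb)
        return -self.crf(self.proj(feats), tags, mask=mask)  # negative log-lik

    def decode(self, emb, mask):
        feats, _ = self.lstm(emb)
        return self.crf.decode(self.proj(feats), mask=mask)  # best tag paths
```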