October 18, 2019

Paper Group ANR 489

Emotions are Universal: Learning Sentiment Based Representations of Resource-Poor Languages using Siamese Networks


Title	Emotions are Universal: Learning Sentiment Based Representations of Resource-Poor Languages using Siamese Networks
Authors	Nurendra Choudhary, Rajat Singh, Ishita Bindlish, Manish Shrivastava
Abstract	Machine learning approaches in sentiment analysis principally rely on the abundance of resources. To limit this dependence, we propose a novel method called Siamese Network Architecture for Sentiment Analysis (SNASA) to learn representations of resource-poor languages by jointly training them with resource-rich languages using a siamese network. The SNASA model consists of twin Bi-directional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM RNN) with shared parameters, joined by a contrastive loss function based on a similarity metric. The model learns the sentence representations of resource-poor and resource-rich languages in a common sentiment space by using a similarity metric based on their individual sentiments. The model hence projects sentences with similar sentiment closer to each other and sentences with different sentiment farther from each other. Experiments on large-scale datasets of resource-rich languages (English and Spanish) and resource-poor languages (Hindi and Telugu) reveal that SNASA outperforms the state-of-the-art sentiment analysis approaches based on distributional semantics, semantic rules, lexicon lists and deep neural network representations without sh
Tasks	Sentiment Analysis
Published	2018-04-03
URL	http://arxiv.org/abs/1804.00805v1
PDF	http://arxiv.org/pdf/1804.00805v1.pdf
PWC	https://paperswithcode.com/paper/emotions-are-universal-learning-sentiment
Repo
Framework
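
The abstract above describes a contrastive loss that pulls same-sentiment sentence pairs together and pushes different-sentiment pairs apart. A minimal sketch of that idea follows. It assumes the common margin-based form of the contrastive loss with Euclidean distance; the abstract does not state the exact loss or similarity metric used by SNASA, and the 2-dimensional "embeddings", the `margin` value, and the function names are hypothetical stand-ins for Bi-LSTM outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_sentiment, margin=1.0):
    """Margin-based contrastive loss for one pair of sentence embeddings.

    Assumption: this is the widely used margin form, not necessarily the
    exact loss from the paper. Same-sentiment pairs are penalized by their
    squared distance; different-sentiment pairs are penalized only while
    they are closer than `margin`.
    """
    d = euclidean(u, v)
    if same_sentiment:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-d embeddings standing in for twin-network outputs.
hindi_positive = [0.9, 0.1]
english_positive = [0.8, 0.2]
english_negative = [0.1, 0.9]

print(contrastive_loss(hindi_positive, english_positive, same_sentiment=True))
print(contrastive_loss(hindi_positive, english_negative, same_sentiment=False))
```

In this toy case the cross-lingual positive pair is already close, so its loss is small, and the positive/negative pair is farther apart than the margin, so it contributes no loss.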

Discriminative Label Consistent Domain Adaptation


Title	Discriminative Label Consistent Domain Adaptation
Authors	Lingkun Luo, Liming Chen, Ying Lu, Shiqiang Hu
Abstract	Domain adaptation (DA) is transfer learning which aims to learn an effective predictor on target data from source data despite data distribution mismatch between source and target. We present in this paper a novel unsupervised DA method for cross-domain visual recognition which simultaneously optimizes the three terms of a theoretically established error bound. Specifically, the proposed DA method iteratively searches a latent shared feature subspace where not only the divergence of data distributions between the source domain and the target domain is decreased as most state-of-the-art DA methods do, but also the inter-class distances are increased to facilitate discriminative learning. Moreover, the proposed DA method sparsely regresses class labels from the features achieved in the shared subspace while minimizing the prediction errors on the source data and ensuring label consistency between source and target. Data outliers are also accounted for to further avoid negative knowledge transfer. Comprehensive experiments and in-depth analysis verify the effectiveness of the proposed DA method which consistently outperforms the state-of-the-art DA methods on standard DA benchmarks, i.e., 12 cross-domain image classification tasks.
Tasks	Domain Adaptation, Image Classification, Transfer Learning
Published	2018-02-21
URL	http://arxiv.org/abs/1802.08077v1
PDF	http://arxiv.org/pdf/1802.08077v1.pdf
PWC	https://paperswithcode.com/paper/discriminative-label-consistent-domain
Repo
Framework

Unsupervised Detection and Explanation of Latent-class Contextual Anomalies


Title	Unsupervised Detection and Explanation of Latent-class Contextual Anomalies
Authors	Jacob Kauffmann, Grégoire Montavon, Luiz Alberto Lima, Shinichi Nakajima, Klaus-Robert Müller, Nico Görnitz
Abstract	Detecting and explaining anomalies is a challenging effort. This holds especially true when data exhibits strong dependencies and single measurements need to be assessed and analyzed in their respective context. In this work, we consider scenarios where measurements are non-i.i.d., i.e. where samples are dependent on corresponding discrete latent variables which are connected through some given dependency structure, the contextual information. Our contribution is twofold: (i) Building on support vector data description (SVDD), we derive a method able to cope with latent-class dependency structure that can still be optimized efficiently. We further show that our approach neatly generalizes vanilla SVDD as well as k-means and conditional random fields (CRF) and provide a corresponding probabilistic interpretation. (ii) In unsupervised scenarios where it is not possible to quantify the accuracy of an anomaly detector, having a human-interpretable solution is the key to success. Based on deep Taylor decomposition and a reformulation of our trained anomaly detector as a neural network, we are able to backpropagate predictions to the pixel domain and thus identify features and regions of high relevance. We demonstrate the usefulness of our novel approach on toy data with known spatio-temporal structure and successfully validate it on synthetic as well as real-world off-shore data from the oil industry.
Tasks
Published	2018-06-29
URL	http://arxiv.org/abs/1806.11326v1
PDF	http://arxiv.org/pdf/1806.11326v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-detection-and-explanation-of
Repo
Framework

On the information in spike timing: neural codes derived from polychronous groups


Title	On the information in spike timing: neural codes derived from polychronous groups
Authors	Zhinus Marzi, Joao Hespanha, Upamanyu Madhow
Abstract	There is growing evidence regarding the importance of spike timing in neural information processing, with even a small number of spikes carrying information, but computational models lag significantly behind those for rate coding. Experimental evidence on neuronal behavior is consistent with the dynamical and state dependent behavior provided by recurrent connections. This motivates the minimalistic abstraction investigated in this paper, aimed at providing insight into information encoding in spike timing via recurrent connections. We employ information-theoretic techniques for a simple reservoir model which encodes input spatiotemporal patterns into a sparse neural code, translating the polychronous groups introduced by Izhikevich into codewords on which we can perform standard vector operations. We show that the distance properties of the code are similar to those for (optimal) random codes. In particular, the code meets benchmarks associated with both linear classification and capacity, with the latter scaling exponentially with reservoir size.
Tasks
Published	2018-03-09
URL	http://arxiv.org/abs/1803.03692v1
PDF	http://arxiv.org/pdf/1803.03692v1.pdf
PWC	https://paperswithcode.com/paper/on-the-information-in-spike-timing-neural
Repo
Framework

Using Normalized Cross Correlation in Least Squares Optimizations


Title	Using Normalized Cross Correlation in Least Squares Optimizations
Authors	Oliver J. Woodford
Abstract	Direct methods for vision have widely used photometric least squares minimizations since the seminal 1981 work of Lucas & Kanade, and have leveraged normalized cross correlation since at least 1972. However, no work to our knowledge has successfully combined photometric least squares minimizations and normalized cross correlation: despite obvious complementary benefits of efficiency and accuracy on the one hand, and robustness to lighting changes on the other. This work shows that combining the two methods is not only possible, but also straightforward and efficient. The resulting minimization is shown to be superior to competing approaches, both in terms of convergence rate and computation time. Furthermore, a new, robust, sparse formulation is introduced to mitigate local intensity variations and partial occlusions.
Tasks
Published	2018-10-10
URL	http://arxiv.org/abs/1810.04320v1
PDF	http://arxiv.org/pdf/1810.04320v1.pdf
PWC	https://paperswithcode.com/paper/using-normalized-cross-correlation-in-least
Repo
Framework
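
The abstract above contrasts photometric least squares with normalized cross correlation (NCC), citing NCC's robustness to lighting changes. The sketch below illustrates only that robustness property on flattened 1-d patches; it is not the paper's least-squares formulation, and the patch values and function names are made up for illustration.

```python
import math


def ncc(a, b):
    """Normalized cross correlation of two equal-length intensity patches.

    Each patch is mean-centered and the result is scaled by both norms, so
    the score is unchanged by a positive gain and by any bias applied to
    either patch.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        raise ValueError("NCC is undefined for a constant patch")
    return sum(x * y for x, y in zip(da, db)) / denom


def ssd(a, b):
    """Sum of squared differences, the plain photometric error."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


patch = [10.0, 20.0, 30.0, 40.0, 25.0]
# Same patch under a hypothetical lighting change: gain 2, bias +15.
relit = [2.0 * x + 15.0 for x in patch]

print("SSD:", ssd(patch, relit))   # large, although the content is identical
print("NCC:", ncc(patch, relit))   # close to 1, up to floating-point rounding
```

The plain squared error reports a large mismatch for the relit patch, while NCC treats the two patches as matching, which is the complementary benefit the abstract refers to.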

On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks


Title	On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks
Authors	Adepu Ravi Sankar, Vishwak Srinivasan, Vineeth N Balasubramanian
Abstract	Theoretical analysis of the error landscape of deep neural networks has garnered significant interest in recent years. In this work, we theoretically study the importance of noise in the trajectories of gradient descent towards optimal solutions in multi-layer neural networks. We show that adding noise (in different ways) to a neural network while training increases the rank of the product of weight matrices of a multi-layer linear neural network. We thus study how adding noise can assist reaching a global optimum when the product matrix is full-rank (under certain conditions). We establish theoretical foundations between the noise induced into the neural network - either to the gradient, to the architecture, or to the input/output to a neural network - and the rank of product of weight matrices. We corroborate our theoretical findings with empirical results.
Tasks
Published	2018-07-21
URL	http://arxiv.org/abs/1807.08140v1
PDF	http://arxiv.org/pdf/1807.08140v1.pdf
PWC	https://paperswithcode.com/paper/on-the-analysis-of-trajectories-of-gradient
Repo
Framework

Towards Task Understanding in Visual Settings


Title	Towards Task Understanding in Visual Settings
Authors	Sebastin Santy, Wazeer Zulfikar, Rishabh Mehrotra, Emine Yilmaz
Abstract	We consider the problem of understanding real-world tasks depicted in visual images. While most existing image captioning methods excel at producing natural language descriptions of visual scenes involving human tasks, there is often the need for an understanding of the exact task being undertaken rather than a literal description of the scene. We leverage insights from real-world task understanding systems and propose a framework composed of convolutional neural networks and an external hierarchical task ontology to produce task descriptions from input images. Detailed experiments highlight the efficacy of the extracted descriptions, which could potentially find their way into many applications, including image alt text generation.
Tasks	Image Captioning, Text Generation
Published	2018-11-28
URL	http://arxiv.org/abs/1811.11833v1
PDF	http://arxiv.org/pdf/1811.11833v1.pdf
PWC	https://paperswithcode.com/paper/towards-task-understanding-in-visual-settings
Repo
Framework

A Content-Based Late Fusion Approach Applied to Pedestrian Detection


Title	A Content-Based Late Fusion Approach Applied to Pedestrian Detection
Authors	Jessica Sena, Artur Jordao, William Robson Schwartz
Abstract	The variety of pedestrian detectors proposed in recent years has encouraged some works to fuse pedestrian detectors to achieve more accurate detection. The intuition is to combine the detectors based on their spatial consensus. We propose a novel method called Content-Based Spatial Consensus (CSBC), which, in addition to relying on spatial consensus, considers the content of the detection windows to learn a weighted fusion of pedestrian detectors. The result is a reduction in false alarms and an improvement in detection. In this work, we also demonstrate that the feature used to learn the contents of the windows of each detector has little influence, which enables our method to be efficient even when employing simple features. CSBC outperforms state-of-the-art fusion methods on the ETH dataset and on the Caltech dataset. In particular, our method is more efficient since fewer detectors are necessary to achieve expressive results.
Tasks	Pedestrian Detection
Published	2018-06-08
URL	http://arxiv.org/abs/1806.03361v1
PDF	http://arxiv.org/pdf/1806.03361v1.pdf
PWC	https://paperswithcode.com/paper/a-content-based-late-fusion-approach-applied
Repo
Framework

Matrix Linear Discriminant Analysis


Title	Matrix Linear Discriminant Analysis
Authors	Wei Hu, Weining Shen, Hua Zhou, Dehan Kong
Abstract	We propose a novel linear discriminant analysis approach for the classification of high-dimensional matrix-valued data that commonly arises from imaging studies. Motivated by the equivalence of the conventional linear discriminant analysis and the ordinary least squares, we consider an efficient nuclear norm penalized regression that encourages a low-rank structure. Theoretical properties including a non-asymptotic risk bound and a rank consistency result are established. Simulation studies and an application to electroencephalography data show the superior performance of the proposed method over the existing approaches.
Tasks
Published	2018-09-24
URL	https://arxiv.org/abs/1809.08746v2
PDF	https://arxiv.org/pdf/1809.08746v2.pdf
PWC	https://paperswithcode.com/paper/matrix-linear-discriminant-analysis
Repo
Framework

Cluster validity index based on Jeffrey divergence


Title	Cluster validity index based on Jeffrey divergence
Authors	Ahmed Ben Said, Rachid Hadjidj, Sebti Foufou
Abstract	Cluster validity indexes are very important tools designed for two purposes: comparing the performance of clustering algorithms and determining the number of clusters that best fits the data. These indexes are in general constructed by combining a measure of compactness and a measure of separation. A classical measure of compactness is the variance. As for separation, the distance between cluster centers is used. However, such a distance does not always reflect the quality of the partition between clusters and sometimes gives misleading results. In this paper, we propose a new cluster validity index in which Jeffrey divergence is used to measure separation between clusters. Experiments are conducted using different types of data, and comparison with widely used cluster validity indexes shows that the proposed index performs better.
Tasks
Published	2018-12-20
URL	http://arxiv.org/abs/1812.08891v1
PDF	http://arxiv.org/pdf/1812.08891v1.pdf
PWC	https://paperswithcode.com/paper/cluster-validity-index-based-on-jeffrey
Repo
Framework
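
The abstract above uses Jeffrey divergence as the separation measure between clusters. Jeffrey divergence is the symmetrized Kullback-Leibler divergence, J(p, q) = KL(p || q) + KL(q || p). The sketch below computes it for discrete distributions; representing each cluster as a histogram is an assumption made here for illustration, since the abstract does not say how clusters are modeled, and the histograms are invented.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def jeffrey_divergence(p, q):
    """Symmetrized KL divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical normalized histograms summarizing three clusters.
cluster_a = [0.70, 0.20, 0.10]
cluster_b = [0.65, 0.25, 0.10]   # overlaps heavily with cluster_a
cluster_c = [0.05, 0.15, 0.80]   # well separated from cluster_a

print(jeffrey_divergence(cluster_a, cluster_b))  # small: poor separation
print(jeffrey_divergence(cluster_a, cluster_c))  # larger: good separation
```

Unlike a distance between cluster centers, this measure depends on the whole distribution of each cluster, which is the motivation given in the abstract.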

Revisiting Perspective Information for Efficient Crowd Counting


Title	Revisiting Perspective Information for Efficient Crowd Counting
Authors	Miaojing Shi, Zhaohui Yang, Chao Xu, Qijun Chen
Abstract	Crowd counting is the task of estimating the number of people in crowd images. Modern crowd counting methods employ deep neural networks to estimate crowd counts via crowd density regression. A major challenge of this task lies in perspective distortion, which results in drastic changes in person scale within an image. Density regression on small person areas is in general very hard. In this work, we propose a perspective-aware convolutional neural network (PACNN) for efficient crowd counting, which integrates the perspective information into density regression to provide additional knowledge of the person scale change in an image. Ground truth perspective maps are first generated for training; PACNN is then specifically designed to predict multi-scale perspective maps, and encode them as perspective-aware weighting layers in the network to adaptively combine the outputs of multi-scale density maps. The weights are learned at every pixel of the maps such that the final density combination is robust to the perspective distortion. We conduct extensive experiments on the ShanghaiTech, WorldExpo’10, UCF_CC_50, and UCSD datasets, and demonstrate the effectiveness and efficiency of PACNN over the state-of-the-art.
Tasks	Crowd Counting
Published	2018-07-05
URL	http://arxiv.org/abs/1807.01989v3
PDF	http://arxiv.org/pdf/1807.01989v3.pdf
PWC	https://paperswithcode.com/paper/revisiting-perspective-information-for
Repo
Framework

Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition


Title	Learning Class Prototypes via Structure Alignment for Zero-Shot Recognition
Authors	Huajie Jiang, Ruiping Wang, Shiguang Shan, Xilin Chen
Abstract	Zero-shot learning (ZSL) aims to recognize objects of novel classes without any training samples of those classes, which is achieved by exploiting semantic information and auxiliary datasets. Recently, most ZSL approaches focus on learning visual-semantic embeddings to transfer knowledge from the auxiliary datasets to the novel classes. However, few works study whether the semantic information is discriminative or not for the recognition task. To tackle this problem, we propose a coupled dictionary learning approach to align the visual-semantic structures using the class prototypes, where the discriminative information lying in the visual space is utilized to improve the less discriminative semantic space. Then, zero-shot recognition can be performed in different spaces by the simple nearest neighbor approach using the learned class prototypes. Extensive experiments on four benchmark datasets show the effectiveness of the proposed approach.
Tasks	Dictionary Learning, Zero-Shot Learning
Published	2018-07-24
URL	http://arxiv.org/abs/1807.09123v1
PDF	http://arxiv.org/pdf/1807.09123v1.pdf
PWC	https://paperswithcode.com/paper/learning-class-prototypes-via-structure
Repo
Framework
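
The abstract above states that, once class prototypes are learned, zero-shot recognition reduces to a nearest-neighbor lookup. The sketch below shows only that final lookup step. The prototypes, class names, and feature vector are hypothetical, and the coupled dictionary learning that would produce the prototypes is not reproduced here.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_prototype(feature, prototypes):
    """Return the label whose class prototype is closest to `feature`.

    `prototypes` maps a class label to its prototype vector, expressed in
    the same space as `feature`.
    """
    return min(prototypes,
               key=lambda label: squared_distance(feature, prototypes[label]))


# Hypothetical learned prototypes for two unseen classes.
prototypes = {
    "zebra": [1.0, 0.0, 0.5],
    "whale": [0.0, 1.0, 0.2],
}
test_feature = [0.9, 0.2, 0.4]

print(nearest_prototype(test_feature, prototypes))
```

Because the classifier is a plain nearest-neighbor rule, recognition quality depends entirely on how discriminative the learned prototypes are, which is the issue the abstract sets out to address.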

Correlated Anomaly Detection from Large Streaming Data


Title	Correlated Anomaly Detection from Large Streaming Data
Authors	Zheng Chen, Xinli Yu, Yuan Ling, Bo Song, Wei Quan, Xiaohua Hu, Erjia Yan
Abstract	Correlated anomaly detection (CAD) from streaming data is a type of group anomaly detection and an essential task in useful real-time data mining applications like botnet detection, financial event detection, industrial process monitoring, etc. The primary approach for this type of detection in previous research is based on the principal score (PS) of divided batches or sliding windows, obtained by computing the top eigenvalues of the correlation matrix, e.g. with the Lanczos algorithm. However, this paper identifies the phenomenon of principal score degeneration for large data sets, and then proves mathematically and shows in practice that current PS-based methods are likely to fail for CAD on large-scale streaming data even if the number of correlated anomalies grows with the data size at a reasonable rate; in reality, anomalies tend to be the minority of the data, and this issue can be more serious. We propose a framework with two novel randomized algorithms, rPS and gPS, for better detection of correlated anomalies from large streaming data of various correlation strengths. The experiments show high and balanced recall and estimated accuracy of our framework for anomaly detection on a large server log data set and a U.S. stock daily price data set in comparison to direct principal score evaluation and some other recent group anomaly detection algorithms. Moreover, our techniques significantly improve the computational efficiency and scalability of principal score calculation.
Tasks	Anomaly Detection, Group Anomaly Detection
Published	2018-12-19
URL	http://arxiv.org/abs/1812.09387v2
PDF	http://arxiv.org/pdf/1812.09387v2.pdf
PWC	https://paperswithcode.com/paper/correlated-anomaly-detection-from-large
Repo
Framework
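
The abstract above refers to the principal score of a batch, computed from the top eigenvalues of the correlation matrix. The sketch below computes the single largest eigenvalue of a Pearson correlation matrix for one window. It uses plain power iteration as a simple stand-in for the Lanczos algorithm named in the abstract, and it does not implement the paper's rPS or gPS algorithms; the function names and the data are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def top_eigenvalue(series, iterations=200):
    """Largest eigenvalue of the correlation matrix of `series`.

    Power iteration is used for brevity. It assumes the start vector is not
    orthogonal to the leading eigenvector; if it is, the estimate is wrong.
    """
    n = len(series)
    corr = [[pearson(series[i], series[j]) for j in range(n)] for i in range(n)]

    def matvec(v):
        return [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]

    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(a * a for a in v))
    v = [a / norm for a in v]
    for _ in range(iterations):
        w = matvec(v)
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            break
        v = [a / norm for a in w]
    # Rayleigh quotient of the unit vector v.
    return sum(a * b for a, b in zip(v, matvec(v)))


# Hypothetical window of three event-count series.
window = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.0, 9.8],   # moves together with the first series
    [3.0, 1.0, 4.0, 1.0, 5.0],
]
print(top_eigenvalue(window))
```

For n series the largest eigenvalue lies between 1 (no correlation) and n (all series perfectly correlated), so a value near the upper end flags a strongly correlated group within the window.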

A Symbolic Approach to Explaining Bayesian Network Classifiers


Title	A Symbolic Approach to Explaining Bayesian Network Classifiers
Authors	Andy Shih, Arthur Choi, Adnan Darwiche
Abstract	We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two types of explanations for why a classifier may have classified an instance positively or negatively and suggest algorithms for computing these explanations. The first type of explanation identifies a minimal set of the currently active features that is responsible for the current classification, while the second type of explanation identifies a minimal set of features whose current state (active or not) is sufficient for the classification. We consider in particular the compilation of Naive and Latent-Tree Bayesian network classifiers into Ordered Decision Diagrams (ODDs), providing a context for evaluating our proposal using case studies and experiments based on classifiers from the literature.
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03364v1
PDF	http://arxiv.org/pdf/1805.03364v1.pdf
PWC	https://paperswithcode.com/paper/a-symbolic-approach-to-explaining-bayesian
Repo
Framework

Connecting the Dots Between MLE and RL for Sequence Prediction


Title	Connecting the Dots Between MLE and RL for Sequence Prediction
Authors	Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric Xing
Abstract	Sequence prediction models can be learned from example sequences with a variety of training algorithms. Maximum likelihood learning is simple and efficient, yet can suffer from compounding error at test time. Reinforcement learning such as policy gradient addresses the issue but can have prohibitively poor exploration efficiency. A rich set of other algorithms, such as RAML, SPG, and data noising, have also been developed from different perspectives. This paper establishes a formal connection between these algorithms. We present a generalized entropy regularized policy optimization formulation, and show that the apparently distinct algorithms can all be reformulated as special instances of the framework, with the only difference being the configurations of a reward function and a couple of hyperparameters. The unified interpretation offers a systematic view of the varying properties of exploration and learning efficiency. In addition, inspired by the framework, we present a new algorithm that dynamically interpolates among the family of algorithms for scheduled sequence model learning. Experiments on machine translation, text summarization, and game imitation learning demonstrate the superiority of the proposed algorithm.
Tasks	Imitation Learning, Machine Translation, Text Summarization
Published	2018-11-24
URL	https://arxiv.org/abs/1811.09740v2
PDF	https://arxiv.org/pdf/1811.09740v2.pdf
PWC	https://paperswithcode.com/paper/connecting-the-dots-between-mle-and-rl-for
Repo
Framework