July 26, 2019

Paper Group NANR 159

Distributed Document and Phrase Co-embeddings for Descriptive Clustering


Title	Distributed Document and Phrase Co-embeddings for Descriptive Clustering
Authors	Motoki Sato, Austin J. Brockmeier, Georgios Kontonatsios, Tingting Mu, John Y. Goulermas, Jun'ichi Tsujii, Sophia Ananiadou
Abstract	Descriptive document clustering aims to automatically discover groups of semantically related documents and to assign a meaningful label to characterise the content of each cluster. In this paper, we present a descriptive clustering approach that employs a distributed representation model, namely the paragraph vector model, to capture semantic similarities between documents and phrases. The proposed method uses a joint representation of phrases and documents (i.e., a co-embedding) to automatically select a descriptive phrase that best represents each document cluster. We evaluate our method by comparing its performance to an existing state-of-the-art descriptive clustering method that also uses co-embedding but relies on a bag-of-words representation. Results obtained on benchmark datasets demonstrate that the paragraph vector-based method obtains superior performance over the existing approach in both identifying clusters and assigning appropriate descriptive labels to them.
Tasks	Information Retrieval, Semantic Textual Similarity
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1093/
PDF	https://www.aclweb.org/anthology/E17-1093
PWC	https://paperswithcode.com/paper/distributed-document-and-phrase-co-embeddings
Repo
Framework

Study on Visual Word Recognition in Bangla across Different Reader Groups


Title	Study on Visual Word Recognition in Bangla across Different Reader Groups
Authors	Manjira Sinha, Tirthankar Dasgupta
Abstract
Tasks
Published	2017-12-01
URL	https://www.aclweb.org/anthology/W17-7553/
PDF	https://www.aclweb.org/anthology/W17-7553
PWC	https://paperswithcode.com/paper/study-on-visual-word-recognition-in-bangla
Repo
Framework

Uncorrelation and Evenness: a New Diversity-Promoting Regularizer


Title	Uncorrelation and Evenness: a New Diversity-Promoting Regularizer
Authors	Pengtao Xie, Aarti Singh, Eric P. Xing
Abstract	Latent space models (LSMs) provide a principled and effective way to extract hidden patterns from observed data. To cope with two challenges in LSMs: (1) how to capture infrequent patterns when pattern frequency is imbalanced and (2) how to reduce model size without sacrificing their expressiveness, several studies have been proposed to “diversify” LSMs, which design regularizers to encourage the components therein to be “diverse”. In light of the limitations of existing approaches, we design a new diversity-promoting regularizer by considering two factors: uncorrelation and evenness, which encourage the components to be uncorrelated and to play equally important roles in modeling data. Formally, this amounts to encouraging the covariance matrix of the components to have more uniform eigenvalues. We apply the regularizer to two LSMs and develop an efficient optimization algorithm. Experiments on healthcare, image and text data demonstrate the effectiveness of the regularizer.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=491
PDF	http://proceedings.mlr.press/v70/xie17b/xie17b.pdf
PWC	https://paperswithcode.com/paper/uncorrelation-and-evenness-a-new-diversity
Repo
Framework

Selective Inference for Sparse High-Order Interaction Models


Title	Selective Inference for Sparse High-Order Interaction Models
Authors	Shinya Suzumura, Kazuya Nakagawa, Yuta Umezu, Koji Tsuda, Ichiro Takeuchi
Abstract	Finding statistically significant high-order interactions in predictive modeling is an important but challenging task because the possible number of high-order interactions is extremely large (e.g., $> 10^{17}$). In this paper we study feature selection and statistical inference for sparse high-order interaction models. Our main contribution is to extend the recently developed selective inference framework for linear models to high-order interaction models by developing a novel algorithm for efficiently characterizing the selection event for the selective inference of high-order interactions. We demonstrate the effectiveness of the proposed algorithm by applying it to an HIV drug response prediction problem.
Tasks	Feature Selection
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=801
PDF	http://proceedings.mlr.press/v70/suzumura17a/suzumura17a.pdf
PWC	https://paperswithcode.com/paper/selective-inference-for-sparse-high-order
Repo
Framework

基於i-vector與PLDA並使用GMM-HMM強制對位之自動語者分段標記系統 (Speaker Diarization based on I-vector PLDA Scoring and using GMM-HMM Forced Alignment) [In Chinese]


Title	基於i-vector與PLDA並使用GMM-HMM強制對位之自動語者分段標記系統 (Speaker Diarization based on I-vector PLDA Scoring and using GMM-HMM Forced Alignment) [In Chinese]
Authors	Cheng-Jo Ray Chang, Hung-Shin Lee, Hsin-Min Wang, Jyh-Shing Roger Jang
Abstract
Tasks	Speaker Diarization
Published	2017-11-01
URL	https://www.aclweb.org/anthology/O17-1012/
PDF	https://www.aclweb.org/anthology/O17-1012
PWC	https://paperswithcode.com/paper/ao14i-vectorepldaa-a12c-gmm
Repo
Framework

Coarse Semantic Classification of Rare Nouns Using Cross-Lingual Data and Recurrent Neural Networks


Title	Coarse Semantic Classification of Rare Nouns Using Cross-Lingual Data and Recurrent Neural Networks
Authors	Oliver Hellwig
Abstract
Tasks	Sentence Classification
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-6811/
PDF	https://www.aclweb.org/anthology/W17-6811
PWC	https://paperswithcode.com/paper/coarse-semantic-classification-of-rare-nouns
Repo
Framework

Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data


Title	Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data
Authors	Manzil Zaheer, Amr Ahmed, Alexander J. Smola
Abstract	Recurrent neural networks, such as long short-term memory (LSTM) networks, are powerful tools for modeling sequential data like user browsing history (Tan et al., 2016; Korpusik et al., 2016) or natural language text (Mikolov et al., 2010). However, to generalize across different user types, LSTMs require a large number of parameters, notwithstanding the simplicity of the underlying dynamics, rendering them uninterpretable, which is highly undesirable in user modeling. The increase in complexity and parameters arises due to a large action space in which many of the actions have similar intent or topic. In this paper, we introduce Latent LSTM Allocation (LLA) for user modeling, combining hierarchical Bayesian models with LSTMs. In LLA, each user is modeled as a sequence of actions, and the model jointly groups actions into topics and learns the temporal dynamics over the topic sequence, instead of over the action space directly. This leads to a model that is highly interpretable, concise, and can capture intricate dynamics. We present an efficient Stochastic EM inference algorithm for our model that scales to millions of users/documents. Our experimental evaluations show that the proposed model compares favorably with several state-of-the-art baselines.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=817
PDF	http://proceedings.mlr.press/v70/zaheer17a/zaheer17a.pdf
PWC	https://paperswithcode.com/paper/latent-lstm-allocation-joint-clustering-and
Repo
Framework

Exploring Convolutional Neural Networks for Sentiment Analysis of Spanish tweets


Title	Exploring Convolutional Neural Networks for Sentiment Analysis of Spanish tweets
Authors	Isabel Segura-Bedmar, Antonio Quirós, Paloma Martínez
Abstract	Spanish is the third-most used language on the internet, after English and Chinese, with a total of 7.7% (more than 277 million users) and a huge internet growth of more than 1,400%. However, most work on sentiment analysis has been focused on English. This paper describes a deep learning system for Spanish sentiment analysis. To the best of our knowledge, this is the first work that explores the use of a convolutional neural network for polarity classification of Spanish tweets.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1095/
PDF	https://www.aclweb.org/anthology/E17-1095
PWC	https://paperswithcode.com/paper/exploring-convolutional-neural-networks-for
Repo
Framework

Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network


Title	Large-scale Opinion Relation Extraction with Distantly Supervised Neural Network
Authors	Changzhi Sun, Yuanbin Wu, Man Lan, Shiliang Sun, Qi Zhang
Abstract	We investigate the task of open domain opinion relation extraction. Unlike prior work that relies on manually labeled corpora, we propose an efficient distantly supervised framework based on pattern matching and neural network classifiers. The patterns are designed to automatically generate training data, and the deep learning model is designed to capture various lexical and syntactic features. The resulting algorithm is fast and scales to large corpora. We test the system on the Amazon online review dataset. The results show that our model is able to achieve promising performance without any human annotations.
Tasks	Opinion Mining, Relation Extraction
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-1097/
PDF	https://www.aclweb.org/anthology/E17-1097
PWC	https://paperswithcode.com/paper/large-scale-opinion-relation-extraction-with
Repo
Framework

Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery


Title	Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery
Authors	Ashkan Panahi, Devdatt Dubhashi, Fredrik D. Johansson, Chiranjib Bhattacharyya
Abstract	Standard clustering methods such as K-means, Gaussian mixture models, and hierarchical clustering are beset by local minima, which are sometimes drastically suboptimal. Moreover, the number of clusters K must be known in advance. The recently introduced sum-of-norms (SON) or Clusterpath convex relaxation of k-means and hierarchical clustering shrinks cluster centroids toward one another and ensures a unique global minimizer. We give a scalable stochastic incremental algorithm based on proximal iterations to solve the SON problem with convergence guarantees. We also show that the algorithm recovers clusters under quite general conditions which have a similar form to the unifying proximity condition introduced in the approximation algorithms community (that covers paradigm cases such as Gaussian mixtures and planted partition models). We give experimental results to confirm that our algorithm scales much better than previous methods while producing clusters of comparable quality.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=699
PDF	http://proceedings.mlr.press/v70/panahi17a/panahi17a.pdf
PWC	https://paperswithcode.com/paper/clustering-by-sum-of-norms-stochastic
Repo
Framework
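
The abstract above does not spell out the SON objective. As a rough sketch, assuming the commonly used Clusterpath form with uniform pairwise weights (an assumption, not taken from this listing), each point x_i gets its own centroid u_i and the objective is 0.5 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||. The function names and the `lam` parameter below are hypothetical, and the code only evaluates the objective; it is not the paper's stochastic incremental algorithm.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def son_objective(points, centroids, lam):
    """Evaluate an assumed sum-of-norms clustering objective.

    points[i] is a data point and centroids[i] is the centroid assigned
    to it. The first term rewards fitting the data; the second term
    (uniform weights assumed) pulls centroids toward one another.
    """
    fit = 0.5 * sum(euclidean(x, u) ** 2 for x, u in zip(points, centroids))
    n = len(centroids)
    fusion = sum(
        euclidean(centroids[i], centroids[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return fit + lam * fusion
```

When two centroids coincide, their fusion term vanishes, which is how the penalty can merge points into one cluster as `lam` grows.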

Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex


Title	Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex
Authors	Chaobing Song, Shaobo Cui, Yong Jiang, Shu-Tao Xia
Abstract	In this paper we study the well-known greedy coordinate descent (GCD) algorithm to solve $\ell_1$-regularized problems and improve GCD with two popular strategies: Nesterov’s acceleration and stochastic optimization. Firstly, we propose a new rule for greedy selection based on an $\ell_1$-norm square approximation which is nontrivial to solve but convex; then an efficient algorithm called “SOft ThreshOlding PrOjection (SOTOPO)” is proposed to exactly solve the $\ell_1$-regularized $\ell_1$-norm square approximation problem, which is induced by the new rule. Based on the new rule and the SOTOPO algorithm, Nesterov’s acceleration and stochastic optimization strategies are then successfully applied to the GCD algorithm. The resulting algorithm, called accelerated stochastic greedy coordinate descent (ASGCD), has the optimal convergence rate $O(\sqrt{1/\epsilon})$; meanwhile, it reduces the iteration complexity of greedy selection up to a factor of sample size. Both theoretically and empirically, we show that ASGCD has better performance for high-dimensional and dense problems with sparse solutions.
Tasks	Stochastic Optimization
Published	2017-12-01
URL	http://papers.nips.cc/paper/7069-accelerated-stochastic-greedy-coordinate-descent-by-soft-thresholding-projection-onto-simplex
PDF	http://papers.nips.cc/paper/7069-accelerated-stochastic-greedy-coordinate-descent-by-soft-thresholding-projection-onto-simplex.pdf
PWC	https://paperswithcode.com/paper/accelerated-stochastic-greedy-coordinate
Repo
Framework
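
For readers unfamiliar with the term, the sketch below shows the standard elementwise soft-thresholding operator, which is the proximal operator of the $\ell_1$ norm. It is not the SOTOPO algorithm from the abstract above, whose details are not given in this listing; the function name and the `tau` parameter are hypothetical.

```python
def soft_threshold(values, tau):
    """Shrink each entry toward zero by tau, clipping at zero.

    Entries with magnitude at most tau become exactly 0.0, which is
    why this operator produces sparse vectors.
    """
    shrunk = []
    for v in values:
        if v > tau:
            shrunk.append(v - tau)
        elif v < -tau:
            shrunk.append(v + tau)
        else:
            shrunk.append(0.0)
    return shrunk
```

For instance, `soft_threshold([3.0, -0.5, -2.0], 1.0)` shrinks the large entries by 1.0 and zeroes out the small one.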

Continuous multilinguality with language vectors


Title	Continuous multilinguality with language vectors
Authors	Robert Östling, Jörg Tiedemann
Abstract	Most existing models for multilingual natural language processing (NLP) treat language as a discrete category, and make predictions for either one language or the other. In contrast, we propose using continuous vector representations of language. We show that these can be learned efficiently with a character-based neural language model, and used to improve inference about language varieties not seen during training. In experiments with 1303 Bible translations into 990 different languages, we empirically explore the capacity of multilingual language models, and also show that the language vectors capture genetic relationships between languages.
Tasks	Image Captioning, Language Modelling, Machine Translation, Speech Recognition
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-2102/
PDF	https://www.aclweb.org/anthology/E17-2102
PWC	https://paperswithcode.com/paper/continuous-multilinguality-with-language
Repo
Framework

Convolutional Neural Networks for Authorship Attribution of Short Texts


Title	Convolutional Neural Networks for Authorship Attribution of Short Texts
Authors	Prasha Shrestha, Sebastian Sierra, Fabio González, Manuel Montes, Paolo Rosso, Thamar Solorio
Abstract	We present a model to perform authorship attribution of tweets using Convolutional Neural Networks (CNNs) over character n-grams. We also present a strategy that improves model interpretability by estimating the importance of input text fragments in the predicted classification. The experimental evaluation shows that text CNNs perform competitively and are able to outperform previous methods.
Tasks	Language Modelling
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-2106/
PDF	https://www.aclweb.org/anthology/E17-2106
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-authorship
Repo
Framework
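
The abstract above feeds character n-grams to a CNN. As a minimal illustration of that input unit only, the sketch below extracts overlapping character n-grams from a string; the function name is hypothetical, and the paper's actual preprocessing and network are not described in this listing.

```python
def char_ngrams(text, n):
    """Return all overlapping character n-grams of text, in order.

    A string shorter than n yields an empty list.
    """
    return [text[i:i + n] for i in range(len(text) - n + 1)]
```

In a pipeline of this kind, each distinct n-gram would typically be mapped to an integer index before being embedded, though that step is an assumption here.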

Linguistic analysis of differences in portrayal of movie characters


Title	Linguistic analysis of differences in portrayal of movie characters
Authors	Anil Ramakrishna, Victor R. Martínez, Nikolaos Malandrakis, Karan Singla, Shrikanth Narayanan
Abstract	We examine differences in portrayal of characters in movies using psycholinguistic and graph theoretic measures computed directly from screenplays. Differences are examined with respect to characters' gender, race, age and other metadata. Psycholinguistic metrics are extrapolated to dialogues in movies using a linear regression model built on a set of manually annotated seed words. Interesting patterns are revealed about relationships between genders of production team and the gender ratio of characters. Several correlations are noted between gender, race, age of characters and the linguistic metrics.
Tasks
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1153/
PDF	https://www.aclweb.org/anthology/P17-1153
PWC	https://paperswithcode.com/paper/linguistic-analysis-of-differences-in
Repo
Framework

Deep Spectral Clustering Learning


Title	Deep Spectral Clustering Learning
Authors	Marc T. Law, Raquel Urtasun, Richard S. Zemel
Abstract	Clustering is the task of grouping a set of examples so that similar examples are grouped into the same cluster while dissimilar examples are in different clusters. The quality of a clustering depends on two problem-dependent factors: i) the chosen similarity metric and ii) the data representation. Supervised clustering approaches, which exploit labeled partitioned datasets, have thus been proposed, for instance to learn a metric optimized to perform clustering. However, most of these approaches assume that the representation of the data is fixed and then learn an appropriate linear transformation. Some deep supervised clustering learning approaches have also been proposed. However, they rely on iterative methods to compute gradients, resulting in high algorithmic complexity. In this paper, we propose a deep supervised clustering metric learning method that formulates a novel loss function. We derive a closed-form expression for the gradient that is efficient to compute: the complexity to compute the gradient is linear in the size of the training mini-batch and quadratic in the representation dimensionality. We further reveal how our approach can be seen as learning spectral clustering. Experiments on standard real-world datasets confirm state-of-the-art Recall@K performance.
Tasks	Metric Learning
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=494
PDF	http://proceedings.mlr.press/v70/law17a/law17a.pdf
PWC	https://paperswithcode.com/paper/deep-spectral-clustering-learning
Repo
Framework
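
The abstract above reports Recall@K without defining it. A minimal sketch follows, assuming the definition commonly used in metric learning: a query counts as a hit if at least one of its K nearest neighbours (excluding itself) shares its label. The function names are hypothetical, squared Euclidean distance is an assumed choice, and the paper's evaluation protocol may differ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def recall_at_k(embeddings, labels, k):
    """Fraction of queries with a same-label item among their k nearest.

    Every item is used as a query against all other items.
    """
    hits = 0
    for i, (query, query_label) in enumerate(zip(embeddings, labels)):
        # Rank all other items by distance to the query.
        ranked = sorted(
            (squared_distance(query, other), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        )
        if any(label == query_label for _, label in ranked[:k]):
            hits += 1
    return hits / len(embeddings)
```

Recall@K is non-decreasing in K, so results are usually reported for several values of K side by side.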