Paper Group NANR 161
Canopy — Fast Sampling with Cover Trees. Context-Aware Smoothing for Neural Machine Translation. SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis. Bayesian Optimization with Tree-structured Dependencies. High-Dimensional Structured Quantile Regression. Learning User Embeddings from Emails. Projection-free Dist …
Canopy — Fast Sampling with Cover Trees
Title | Canopy — Fast Sampling with Cover Trees |
Authors | Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola |
Abstract | Hierarchical Bayesian models often capture distributions over a very large number of distinct atoms. The need for these models arises when organizing huge amount of unsupervised data, for instance, features extracted using deep convnets that can be exploited to organize abundant unlabeled images. Inference for hierarchical Bayesian models in such cases can be rather nontrivial, leading to approximate approaches. In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space. In other words, the algorithm is as fast as search over a hierarchical data structure. We provide theory for Canopy and demonstrate its effectiveness on both synthetic and real datasets, consisting of over 100 million images. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=507
PDF | http://proceedings.mlr.press/v70/zaheer17b/zaheer17b.pdf
PWC | https://paperswithcode.com/paper/canopy-fast-sampling-with-cover-trees |
Repo | |
Framework | |
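The abstract's central claim is that sampling can be made as fast as search over a hierarchical data structure. The sketch below is not the Canopy sampler; it only illustrates that idea by replacing a linear scan over all atoms with a metric-tree query, using scikit-learn's BallTree as a stand-in for a cover tree (an assumption of this illustration) and toy problem sizes.

```python
# Illustration only: assign points to their nearest atoms via a hierarchical
# metric tree instead of a linear scan. BallTree stands in for a cover tree.
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
atoms = rng.normal(size=(2000, 16))     # cluster centers ("atoms")
points = rng.normal(size=(200, 16))     # data to be assigned

tree = BallTree(atoms)                  # build once; queries are then roughly logarithmic
_, nearest = tree.query(points, k=1)    # index of the closest atom per point
assignments = nearest.ravel()

# Naive alternative for comparison: one distance per (point, atom) pair.
naive = np.argmin(((points[:, None, :] - atoms[None, :, :]) ** 2).sum(-1), axis=1)
assert np.array_equal(assignments, naive)
```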
Context-Aware Smoothing for Neural Machine Translation
Title | Context-Aware Smoothing for Neural Machine Translation |
Authors | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao |
Abstract | In Neural Machine Translation (NMT), each word is represented as a low-dimension, real-value vector for encoding its syntax and semantic information. This means that even if the word is in a different sentence context, it is represented as the fixed vector to learn source representation. Moreover, a large number of Out-Of-Vocabulary (OOV) words, which have different syntax and semantic information, are represented as the same vector representation of "unk". To alleviate this problem, we propose a novel context-aware smoothing method to dynamically learn a sentence-specific vector for each word (including OOV words) depending on its local context words in a sentence. The learned context-aware representation is integrated into the NMT to improve the translation performance. Empirical results on NIST Chinese-to-English translation task show that the proposed approach achieves 1.78 BLEU improvements on average over a strong attentional NMT, and outperforms some existing systems. |
Tasks | Machine Translation, Representation Learning |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1002/ |
PWC | https://paperswithcode.com/paper/context-aware-smoothing-for-neural-machine |
Repo | |
Framework | |
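A minimal, self-contained sketch of the idea described above: each word's static vector is blended with an average of its local context through a learned gate, yielding a sentence-specific vector. The gating form, window size, dimensions, and random parameters are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: blend a word's static embedding with an average of its local context,
# using a learned sigmoid gate. All parameters here are randomly initialized.
import numpy as np

rng = np.random.default_rng(1)
vocab = {"<unk>": 0, "the": 1, "bank": 2, "river": 3, "money": 4}
dim, window = 8, 2
E = rng.normal(scale=0.1, size=(len(vocab), dim))   # static word embeddings
w_gate = rng.normal(scale=0.1, size=2 * dim)        # gate parameters (illustrative)

def smooth(sentence_ids):
    """Return one context-aware vector per token in the sentence."""
    out = []
    for i, tok in enumerate(sentence_ids):
        lo, hi = max(0, i - window), min(len(sentence_ids), i + window + 1)
        ctx_ids = [t for j, t in enumerate(sentence_ids[lo:hi], start=lo) if j != i]
        ctx = E[ctx_ids].mean(axis=0) if ctx_ids else np.zeros(dim)
        gate = 1.0 / (1.0 + np.exp(-w_gate @ np.concatenate([E[tok], ctx])))
        out.append(gate * E[tok] + (1.0 - gate) * ctx)   # sentence-specific vector
    return np.stack(out)

print(smooth([vocab["the"], vocab["bank"], vocab["river"]]).shape)  # (3, 8)
```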
SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis
Title | SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis |
Authors | Hossein Zeinali, Hossein Sameti, Noushin Maghsoodi |
Abstract | |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1026/ |
PWC | https://paperswithcode.com/paper/sut-submission-for-nist-2016-speaker |
Repo | |
Framework | |
Bayesian Optimization with Tree-structured Dependencies
Title | Bayesian Optimization with Tree-structured Dependencies |
Authors | Rodolphe Jenatton, Cedric Archambeau, Javier González, Matthias Seeger |
Abstract | Bayesian optimization has been successfully used to optimize complex black-box functions whose evaluations are expensive. In many applications, like in deep learning and predictive analytics, the optimization domain is itself complex and structured. In this work, we focus on use cases where this domain exhibits a known dependency structure. The benefit of leveraging this structure is twofold: we explore the search space more efficiently and posterior inference scales more favorably with the number of observations than Gaussian Process-based approaches published in the literature. We introduce a novel surrogate model for Bayesian optimization which combines independent Gaussian Processes with a linear model that encodes a tree-based dependency structure and can transfer information between overlapping decision sequences. We also design a specialized two-step acquisition function that explores the search space more effectively. Our experiments on synthetic tree-structured functions and the tuning of feedforward neural networks trained on a range of binary classification datasets show that our method compares favorably with competing approaches. |
Tasks | Gaussian Processes |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=666
PDF | http://proceedings.mlr.press/v70/jenatton17a/jenatton17a.pdf
PWC | https://paperswithcode.com/paper/bayesian-optimization-with-tree-structured |
Repo | |
Framework | |
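The surrogate above combines independent Gaussian Processes with a tree-structured linear model and a two-step acquisition. The sketch below covers only the simpler "one independent GP per branch" structure on a toy two-branch space with standard expected improvement; the linear tree model and the paper's acquisition function are not reproduced, and every setting is illustrative.

```python
# Sketch: one independent GP surrogate per branch of a tree-structured space.
# Branch "a": f_a(x) = (x - 0.3)^2; branch "b": f_b(x) = (x - 0.7)^2 + 0.05.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)
objective = {"a": lambda x: (x - 0.3) ** 2, "b": lambda x: (x - 0.7) ** 2 + 0.05}
X = {b: rng.uniform(0, 1, size=(3, 1)) for b in objective}          # initial designs
y = {b: objective[b](X[b]).ravel() for b in objective}

def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

for _ in range(10):
    best = min(y[b].min() for b in objective)                       # global incumbent
    cand = np.linspace(0, 1, 201).reshape(-1, 1)
    scores = {}
    for b in objective:                                             # fit a GP per branch
        gp = GaussianProcessRegressor(normalize_y=True).fit(X[b], y[b])
        mu, sigma = gp.predict(cand, return_std=True)
        scores[b] = expected_improvement(mu, sigma, best)
    b = max(scores, key=lambda k: scores[k].max())                  # pick branch, then point
    x_next = cand[int(scores[b].argmax())]
    X[b] = np.vstack([X[b], x_next[None, :]])
    y[b] = np.append(y[b], objective[b](x_next[0]))

print({b: round(float(y[b].min()), 4) for b in objective})          # best value per branch
```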
High-Dimensional Structured Quantile Regression
Title | High-Dimensional Structured Quantile Regression |
Authors | Vidyashankar Sivakumar, Arindam Banerjee |
Abstract | Quantile regression aims at modeling the conditional median and quantiles of a response variable given certain predictor variables. In this work we consider the problem of linear quantile regression in high dimensions where the number of predictor variables is much higher than the number of samples available for parameter estimation. We assume the true parameter to have some structure characterized as having a small value according to some atomic norm R(.) and consider the norm regularized quantile regression estimator. We characterize the sample complexity for consistent recovery and give non-asymptotic bounds on the estimation error. While this problem has been previously considered, our analysis reveals geometric and statistical characteristics of the problem not available in prior literature. We perform experiments on synthetic data which support the theoretical results. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=741
PDF | http://proceedings.mlr.press/v70/sivakumar17a/sivakumar17a.pdf
PWC | https://paperswithcode.com/paper/high-dimensional-structured-quantile |
Repo | |
Framework | |
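As a concrete, simplified instance of the estimator studied above, the snippet below fits norm-regularized quantile regression with the L1 norm as R(.) by subgradient descent on the pinball loss; the penalty choice, step sizes, and data sizes are illustrative assumptions rather than the paper's setup.

```python
# Sketch: L1-regularized quantile regression via subgradient descent on the pinball loss.
import numpy as np

rng = np.random.default_rng(3)
n, p, tau, lam = 200, 500, 0.5, 0.1          # n << p: high-dimensional regime
beta_true = np.zeros(p)
beta_true[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]  # sparse true parameter
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed noise

beta = np.zeros(p)
for t in range(1, 2001):
    r = y - X @ beta
    # Subgradient of the pinball loss rho_tau(r) = r * (tau - 1[r < 0]) w.r.t. beta,
    # plus lam * sign(beta) for the L1 penalty.
    grad = -X.T @ (tau - (r < 0)) / n + lam * np.sign(beta)
    beta -= (0.5 / np.sqrt(t)) * grad

support = np.argsort(-np.abs(beta))[:5]      # largest coefficients typically land on 0..4
print(sorted(support), np.round(beta[sorted(support)], 2))
```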
Learning User Embeddings from Emails
Title | Learning User Embeddings from Emails |
Authors | Yan Song, Chia-Jung Lee |
Abstract | Many important email-related tasks, such as email classification or search, highly rely on building quality document representations (e.g., bag-of-words or key phrases) to assist matching and understanding. Despite prior success on representing textual messages, creating quality user representations from emails was overlooked. In this paper, we propose to represent users using embeddings that are trained to reflect the email communication network. Our experiments on Enron dataset suggest that the resulting embeddings capture the semantic distance between users. To assess the quality of embeddings in a real-world application, we carry out auto-foldering task where the lexical representation of an email is enriched with user embedding features. Our results show that folder prediction accuracy is improved when embedding features are present across multiple settings. |
Tasks | Recommendation Systems, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2116/ |
PWC | https://paperswithcode.com/paper/learning-user-embeddings-from-emails |
Repo | |
Framework | |
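The paper learns user embeddings that reflect the email communication network. The snippet below is only a generic stand-in for that idea: it factorizes a positive PMI matrix of toy who-emails-whom counts with an SVD, a classical way to embed such a graph, and is not the paper's training procedure.

```python
# Stand-in sketch: user embeddings from a who-emails-whom count matrix via PPMI + SVD.
import numpy as np

users = ["alice", "bob", "carol", "dave"]
# counts[i, j] = number of emails user i sent to user j (toy numbers).
counts = np.array([[0, 9, 1, 0],
                   [8, 0, 2, 0],
                   [1, 1, 0, 7],
                   [0, 0, 6, 0]], dtype=float)

co = counts + counts.T                       # symmetrize: communication strength
total = co.sum()
pmi = np.log((co / total + 1e-12) /
             (co.sum(1, keepdims=True) / total * co.sum(0, keepdims=True) / total + 1e-12))
ppmi = np.maximum(pmi, 0.0)                  # positive PMI

U, S, _ = np.linalg.svd(ppmi)
dim = 2
emb = U[:, :dim] * np.sqrt(S[:dim])          # low-dimensional user embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(cosine(emb[0], emb[1]))                # alice vs bob (heavy direct traffic)
print(cosine(emb[0], emb[3]))                # alice vs dave (no direct traffic)
```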
Projection-free Distributed Online Learning in Networks
Title | Projection-free Distributed Online Learning in Networks |
Authors | Wenpeng Zhang, Peilin Zhao, Wenwu Zhu, Steven C. H. Hoi, Tong Zhang |
Abstract | The conditional gradient algorithm has regained a surge of research interest in recent years due to its high efficiency in handling large-scale machine learning problems. However, none of existing studies has explored it in the distributed online learning setting, where locally light computation is assumed. In this paper, we fill this gap by proposing the distributed online conditional gradient algorithm, which eschews the expensive projection operation needed in its counterpart algorithms by exploiting much simpler linear optimization steps. We give a regret bound for the proposed algorithm as a function of the network size and topology, which will be smaller on smaller graphs or “well-connected” graphs. Experiments on two large-scale real-world datasets for a multiclass classification task confirm the computational benefit of the proposed algorithm and also verify the theoretical regret bound. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=700
PDF | http://proceedings.mlr.press/v70/zhang17g/zhang17g.pdf
PWC | https://paperswithcode.com/paper/projection-free-distributed-online-learning |
Repo | |
Framework | |
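The update at the heart of the abstract is the conditional gradient (Frank-Wolfe) step, which swaps the projection for a linear minimization. The snippet shows the single-learner online version over an L1 ball, where the linear step returns a signed, scaled basis vector; the distributed, network-aware variant analyzed in the paper is not reproduced, and the losses and step sizes are illustrative.

```python
# Sketch: online conditional gradient (Frank-Wolfe) over an L1 ball of radius R.
# The linear minimization oracle over the L1 ball returns a signed scaled basis
# vector, so no projection is ever needed.
import numpy as np

rng = np.random.default_rng(4)
d, R, T = 20, 1.0, 500
w_star = np.zeros(d); w_star[:3] = [0.5, -0.3, 0.2]       # target inside the L1 ball
w = np.zeros(d)
loss_sum = 0.0

for t in range(1, T + 1):
    x_t = rng.normal(size=d)                               # online example
    y_t = x_t @ w_star + 0.1 * rng.normal()
    grad = 2 * (w @ x_t - y_t) * x_t                       # gradient of squared loss at w
    loss_sum += (w @ x_t - y_t) ** 2

    i = int(np.argmax(np.abs(grad)))                       # linear minimization oracle:
    v = np.zeros(d); v[i] = -R * np.sign(grad[i])          #   argmin_{|v|_1 <= R} <grad, v>
    gamma = 2.0 / (t + 2)                                  # standard FW step size
    w = (1 - gamma) * w + gamma * v                        # stays in the L1 ball by convexity

print("average loss:", loss_sum / T, " |w|_1 =", float(np.abs(w).sum()))
```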
BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning
Title | BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning |
Authors | Yitong Li, Trevor Cohn, Timothy Baldwin |
Abstract | This paper describes our submission to the sentiment analysis sub-task of "Build It, Break It: The Language Edition (BIBI)", on both the builder and breaker sides. As a builder, we use convolutional neural nets, trained on both phrase and sentence data. As a breaker, we use Q-learning to learn minimal change pairs, and apply a token substitution method automatically. We analyse the results to gauge the robustness of NLP systems. |
Tasks | Q-Learning, Sentiment Analysis, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5404/ |
PWC | https://paperswithcode.com/paper/bibi-system-description-building-with-cnns |
Repo | |
Framework | |
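On the builder side the system is a convolutional sentence classifier. Below is a generic PyTorch sketch of that kind of model with illustrative filter widths and dimensions (the paper does not specify its exact architecture here, so treat every hyperparameter as an assumption); the Q-learning breaker is not shown.

```python
# Sketch: a small convolutional sentence classifier (builder side), PyTorch.
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, n_filters=64,
                 widths=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, kernel_size=w, padding=w - 1) for w in widths])
        self.out = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)    # (batch, emb_dim, seq_len)
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(torch.cat(feats, dim=1))   # (batch, n_classes) logits

model = SentenceCNN()
logits = model(torch.randint(1, 5000, (8, 20)))    # batch of 8 sentences, length 20
print(logits.shape)                                # torch.Size([8, 2])
```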
Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data
Title | Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data |
Authors | Matthieu Constant, Héctor Martinez Alonso |
Abstract | This article evaluates the extension of a dependency parser that performs joint syntactic analysis and multiword expression identification. We show that, given sufficient training data, the parser benefits from explicit multiword information and improves overall labeled accuracy score in eight of the ten evaluation cases. |
Tasks | Dependency Parsing, Lexical Analysis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1725/ |
PWC | https://paperswithcode.com/paper/benchmarking-joint-lexical-and-syntactic |
Repo | |
Framework | |
Breaking NLP: Using Morphosyntax, Semantics, Pragmatics and World Knowledge to Fool Sentiment Analysis Systems
Title | Breaking NLP: Using Morphosyntax, Semantics, Pragmatics and World Knowledge to Fool Sentiment Analysis Systems |
Authors | Taylor Mahler, Willy Cheung, Micha Elsner, David King, Marie-Catherine de Marneffe, Cory Shain, Symon Stevens-Guille, Michael White |
Abstract | This paper describes our "breaker" submission to the 2017 EMNLP "Build It Break It" shared task on sentiment analysis. In order to cause the "builder" systems to make incorrect predictions, we edited items in the blind test data according to linguistically interpretable strategies that allow us to assess the ease with which the builder systems learn various components of linguistic structure. On the whole, our submitted pairs break all systems at a high rate (72.6%), indicating that sentiment analysis as an NLP task may still have a lot of ground to cover. Of the breaker strategies that we consider, we find our semantic and pragmatic manipulations to pose the most substantial difficulties for the builder systems. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5405/ |
PWC | https://paperswithcode.com/paper/consistent-classification-of-translation |
Repo | |
Framework | |
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
Title | High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation |
Authors | Zhuoran Yang, Krishnakumar Balasubramanian, Han Liu |
Abstract | We consider estimating the parametric component of single index models in high dimensions. Compared with existing work, we do not require the covariate to be normally distributed. Utilizing Stein’s Lemma, we propose estimators based on the score function of the covariate. Moreover, to handle score function and response variables that are heavy-tailed, our estimators are constructed via carefully thresholding their empirical counterparts. Under a bounded fourth moment condition, we establish optimal statistical rates of convergence for the proposed estimators. Extensive numerical experiments are provided to back up our theory. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=737
PDF | http://proceedings.mlr.press/v70/yang17a/yang17a.pdf
PWC | https://paperswithcode.com/paper/high-dimensional-non-gaussian-single-index |
Repo | |
Framework | |
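The estimator rests on a Stein-type identity: for a single index model y = f(&lt;x, beta&gt;) + noise, the expectation of y times the covariate score S(x) is proportional to beta. The sketch implements that first-order estimate with hard thresholding for i.i.d. Laplace covariates, whose score is sign(x); the distribution, link, and threshold are illustrative assumptions, and the paper's truncation of heavy-tailed scores and responses is omitted.

```python
# Sketch: first-order score-based estimation of a sparse single index model,
# with i.i.d. Laplace covariates, so the score function is S(x) = sign(x).
import numpy as np

rng = np.random.default_rng(5)
n, p, s = 2000, 200, 5
beta = np.zeros(p); beta[:s] = 1.0 / np.sqrt(s)            # unit-norm sparse direction
X = rng.laplace(size=(n, p))                               # non-Gaussian covariates
y = np.tanh(X @ beta) + 0.1 * rng.normal(size=n)           # unknown link f = tanh

score = np.sign(X)                                         # S(x) = -grad log p(x)
v = (y[:, None] * score).mean(axis=0)                      # empirical E[y S(x)], prop. to beta

thresh = 2.0 * np.sqrt(np.log(p) / n)                      # keep only large coordinates
v_hat = np.where(np.abs(v) > thresh, v, 0.0)
beta_hat = v_hat / (np.linalg.norm(v_hat) + 1e-12)         # the direction is identifiable

print("support:", np.flatnonzero(beta_hat)[:10])
print("cosine with truth:", float(beta_hat @ beta))
```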
Differentially Private Learning of Graphical Models using CGMs
Title | Differentially Private Learning of Graphical Models using CGMs |
Authors | Garrett Bernstein, Ryan McKenna, Tao Sun, Daniel Sheldon, Michael Hay, Gerome Miklau |
Abstract | We investigate the problem of learning discrete graphical models in a differentially private way. Approaches to this problem range from privileged algorithms that conduct learning completely behind the privacy barrier to schemes that release private summary statistics paired with algorithms to learn parameters from those statistics. We show that the approach of releasing noisy sufficient statistics using the Laplace mechanism achieves a good trade-off between privacy, utility, and practicality. A naive learning algorithm that uses the noisy sufficient statistics “as is” outperforms general-purpose differentially private learning algorithms. However, it has three limitations: it ignores knowledge about the data generating process, rests on uncertain theoretical foundations, and exhibits certain pathologies. We develop a more principled approach that applies the formalism of collective graphical models to perform inference over the true sufficient statistics within an expectation-maximization framework. We show that this learns better models than competing approaches on both synthetic data and on real human mobility data used as a case study. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=612
PDF | http://proceedings.mlr.press/v70/bernstein17a/bernstein17a.pdf
PWC | https://paperswithcode.com/paper/differentially-private-learning-of-graphical |
Repo | |
Framework | |
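The "naive" baseline the abstract mentions releases noisy sufficient statistics through the Laplace mechanism and learns from them as is. The sketch below does exactly that for a single pair of binary variables, whose sufficient statistics are the 2x2 contingency counts, assuming add/remove-one-record adjacency so the L1 sensitivity of the count table is 1; the collective-graphical-model EM approach from the paper is not shown.

```python
# Sketch: the naive "noisy sufficient statistics" baseline with the Laplace mechanism.
# Model: joint distribution of two binary variables, sufficient stats = 2x2 counts.
import numpy as np

rng = np.random.default_rng(6)
n, epsilon = 5000, 0.5
p_true = np.array([[0.4, 0.1],
                   [0.2, 0.3]])                       # true joint P(A, B)
data = rng.choice(4, size=n, p=p_true.ravel())
counts = np.bincount(data, minlength=4).reshape(2, 2).astype(float)

# Laplace mechanism: one record changes exactly one cell by 1, so the L1 sensitivity
# is 1 under add/remove adjacency and the noise scale is 1 / epsilon.
noisy = counts + rng.laplace(scale=1.0 / epsilon, size=(2, 2))

# Naive learning "as is": clip negatives and renormalize into a probability table.
noisy = np.clip(noisy, 0.0, None)
p_hat = noisy / noisy.sum()

print("true:\n", p_true)
print("estimate:\n", np.round(p_hat, 3))
print("total variation distance:", round(0.5 * np.abs(p_hat - p_true).sum(), 4))
```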
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Title | Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue |
Authors | |
Abstract | |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-5500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-18th-annual-sigdial |
Repo | |
Framework | |
Identity Deception Detection
Title | Identity Deception Detection |
Authors | Verónica Pérez-Rosas, Quincy Davenport, Anna Mengdan Dai, Mohamed Abouelenien, Rada Mihalcea |
Abstract | This paper addresses the task of detecting identity deception in language. Using a novel identity deception dataset, consisting of real and portrayed identities from 600 individuals, we show that we can build accurate identity detectors targeting both age and gender, with accuracies of up to 88%. We also perform an analysis of the linguistic patterns used in identity deception, which lead to interesting insights into identity portrayers. |
Tasks | Deception Detection |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1089/ |
PWC | https://paperswithcode.com/paper/identity-deception-detection |
Repo | |
Framework | |
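The detectors above are classifiers over the writers' language. Since the paper's features and learner are not listed here, the snippet below is only a generic stand-in: a bag-of-words logistic regression on made-up toy self-descriptions, not the paper's feature set or data.

```python
# Generic stand-in: bag-of-words classifier for toy deceptive vs. truthful self-descriptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I am a retired teacher and I love gardening on weekends",    # truthful (toy)
    "I enjoy quiet evenings reading history books at home",       # truthful (toy)
    "I am definitely a young professional, totally into sports",  # deceptive (toy)
    "honestly I am just a normal college student I swear",        # deceptive (toy)
]
labels = [0, 0, 1, 1]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["I am totally a real teacher honestly"]))      # toy prediction
```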
Minimizing Trust Leaks for Robust Sybil Detection
Title | Minimizing Trust Leaks for Robust Sybil Detection |
Authors | János Höner, Shinichi Nakajima, Alexander Bauer, Klaus-Robert Müller, Nico Görnitz |
Abstract | Sybil detection is a crucial task to protect online social networks (OSNs) against intruders who try to manipulate automatic services provided by OSNs to their customers. In this paper, we first discuss the robustness of graph-based Sybil detectors SybilRank and Integro and refine theoretically their security guarantees towards more realistic assumptions. After that, we formally introduce adversarial settings for the graph-based Sybil detection problem and derive a corresponding optimal attacking strategy by exploitation of trust leaks. Based on our analysis, we propose transductive Sybil ranking (TSR), a robust extension to SybilRank and Integro that directly minimizes trust leaks. Our empirical evaluation shows significant advantages of TSR over state-of-the-art competitors on a variety of attacking scenarios on artificially generated data and real-world datasets. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=669
PDF | http://proceedings.mlr.press/v70/honer17a/honer17a.pdf
PWC | https://paperswithcode.com/paper/minimizing-trust-leaks-for-robust-sybil |
Repo | |
Framework | |
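TSR extends graph-based detectors such as SybilRank, which propagate trust from known-honest seeds for a small number of power iterations and rank nodes by degree-normalized trust. The snippet sketches that baseline propagation on a toy graph; the adjacency matrix, seeds, and iteration count are illustrative, and the trust-leak-minimizing TSR objective itself is not implemented.

```python
# Sketch: SybilRank-style trust propagation (the baseline that TSR builds on).
# Trust starts on known-honest seeds, is spread along edges for ~log(n) steps,
# and nodes are ranked by degree-normalized trust.
import numpy as np

# Toy undirected graph: nodes 0-3 are honest, 4-5 are Sybils attached via edge (3, 4).
A = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
P = A / deg[:, None]                          # row-stochastic transition matrix

trust = np.zeros(len(A))
trust[[0, 1]] = 0.5                           # total trust 1.0 on known-honest seeds

n_iter = int(np.ceil(np.log2(len(A))))        # early termination after ~log(n) rounds
for _ in range(n_iter):
    trust = trust @ P                         # each node splits its trust among neighbors

ranking = np.argsort(-(trust / deg))          # rank by degree-normalized trust
print("most-trusted first:", ranking)         # the Sybil nodes 4 and 5 should rank low
```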