Paper Group NANR 161
Canopy — Fast Sampling with Cover Trees. Context-Aware Smoothing for Neural Machine Translation. SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis. Bayesian Optimization with Tree-structured Dependencies. High-Dimensional Structured Quantile Regression. Learning User Embeddings from Emails. Projection-free Dist …
Canopy — Fast Sampling with Cover Trees
Title | Canopy — Fast Sampling with Cover Trees |
Authors | Manzil Zaheer, Satwik Kottur, Amr Ahmed, José Moura, Alex Smola |
Abstract | Hierarchical Bayesian models often capture distributions over a very large number of distinct atoms. The need for these models arises when organizing huge amount of unsupervised data, for instance, features extracted using deep convnets that can be exploited to organize abundant unlabeled images. Inference for hierarchical Bayesian models in such cases can be rather nontrivial, leading to approximate approaches. In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space. In other words, the algorithm is as fast as search over a hierarchical data structure. We provide theory for Canopy and demonstrate its effectiveness on both synthetic and real datasets, consisting of over 100 million images. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=507
PDF | http://proceedings.mlr.press/v70/zaheer17b/zaheer17b.pdf
PWC | https://paperswithcode.com/paper/canopy-fast-sampling-with-cover-trees |
Repo | |
Framework | |
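The abstract's central claim is that sampling can be made as fast as search over a hierarchical data structure. The sketch below is not the Canopy sampler; it only illustrates that idea by replacing a linear scan over all atoms with a metric-tree query, using scikit-learn's BallTree as a stand-in for a cover tree (an assumption of this illustration) and toy problem sizes.

```python
# Illustration only: assign points to their nearest atoms via a hierarchical
# metric tree instead of a linear scan. BallTree stands in for a cover tree.
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
atoms = rng.normal(size=(2000, 16))     # cluster centers ("atoms")
points = rng.normal(size=(200, 16))     # data to be assigned

tree = BallTree(atoms)                  # build once; queries are then roughly logarithmic
_, nearest = tree.query(points, k=1)    # index of the closest atom per point
assignments = nearest.ravel()

# Naive alternative for comparison: one distance per (point, atom) pair.
naive = np.argmin(((points[:, None, :] - atoms[None, :, :]) ** 2).sum(-1), axis=1)
assert np.array_equal(assignments, naive)
```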
Context-Aware Smoothing for Neural Machine Translation
Title | Context-Aware Smoothing for Neural Machine Translation |
Authors | Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao |
Abstract | In Neural Machine Translation (NMT), each word is represented as a low-dimension, real-value vector for encoding its syntax and semantic information. This means that even if the word is in a different sentence context, it is represented as the fixed vector to learn source representation. Moreover, a large number of Out-Of-Vocabulary (OOV) words, which have different syntax and semantic information, are represented as the same vector representation of "unk". To alleviate this problem, we propose a novel context-aware smoothing method to dynamically learn a sentence-specific vector for each word (including OOV words) depending on its local context words in a sentence. The learned context-aware representation is integrated into the NMT to improve the translation performance. Empirical results on NIST Chinese-to-English translation task show that the proposed approach achieves 1.78 BLEU improvements on average over a strong attentional NMT, and outperforms some existing systems. |
Tasks | Machine Translation, Representation Learning |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1002/ |
PWC | https://paperswithcode.com/paper/context-aware-smoothing-for-neural-machine |
Repo | |
Framework | |
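A minimal, self-contained sketch of the idea described above: each word's static vector is blended with an average of its local context through a learned gate, yielding a sentence-specific vector. The gating form, window size, dimensions, and random parameters are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch: blend a word's static embedding with an average of its local context,
# using a learned sigmoid gate. All parameters here are randomly initialized.
import numpy as np

rng = np.random.default_rng(1)
vocab = {"<unk>": 0, "the": 1, "bank": 2, "river": 3, "money": 4}
dim, window = 8, 2
E = rng.normal(scale=0.1, size=(len(vocab), dim))   # static word embeddings
w_gate = rng.normal(scale=0.1, size=2 * dim)        # gate parameters (illustrative)

def smooth(sentence_ids):
    """Return one context-aware vector per token in the sentence."""
    out = []
    for i, tok in enumerate(sentence_ids):
        lo, hi = max(0, i - window), min(len(sentence_ids), i + window + 1)
        ctx_ids = [t for j, t in enumerate(sentence_ids[lo:hi], start=lo) if j != i]
        ctx = E[ctx_ids].mean(axis=0) if ctx_ids else np.zeros(dim)
        gate = 1.0 / (1.0 + np.exp(-w_gate @ np.concatenate([E[tok], ctx])))
        out.append(gate * E[tok] + (1.0 - gate) * ctx)   # sentence-specific vector
    return np.stack(out)

print(smooth([vocab["the"], vocab["bank"], vocab["river"]]).shape)  # (3, 8)
```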
SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis
Title | SUT Submission for NIST 2016 Speaker Recognition Evaluation: Description and Analysis |
Authors | Hossein Zeinali, Hossein Sameti, Noushin Maghsoodi |
Abstract | |
Tasks | Speaker Recognition, Speaker Verification |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/O17-1026/ |
PWC | https://paperswithcode.com/paper/sut-submission-for-nist-2016-speaker |
Repo | |
Framework | |
Bayesian Optimization with Tree-structured Dependencies
Title | Bayesian Optimization with Tree-structured Dependencies |
Authors | Rodolphe Jenatton, Cedric Archambeau, Javier González, Matthias Seeger |
Abstract | Bayesian optimization has been successfully used to optimize complex black-box functions whose evaluations are expensive. In many applications, like in deep learning and predictive analytics, the optimization domain is itself complex and structured. In this work, we focus on use cases where this domain exhibits a known dependency structure. The benefit of leveraging this structure is twofold: we explore the search space more efficiently and posterior inference scales more favorably with the number of observations than Gaussian Process-based approaches published in the literature. We introduce a novel surrogate model for Bayesian optimization which combines independent Gaussian Processes with a linear model that encodes a tree-based dependency structure and can transfer information between overlapping decision sequences. We also design a specialized two-step acquisition function that explores the search space more effectively. Our experiments on synthetic tree-structured functions and the tuning of feedforward neural networks trained on a range of binary classification datasets show that our method compares favorably with competing approaches. |
Tasks | Gaussian Processes |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=666
PDF | http://proceedings.mlr.press/v70/jenatton17a/jenatton17a.pdf
PWC | https://paperswithcode.com/paper/bayesian-optimization-with-tree-structured |
Repo | |
Framework | |
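The surrogate above combines independent Gaussian Processes with a tree-structured linear model and a two-step acquisition. The sketch below covers only the simpler "one independent GP per branch" structure on a toy two-branch space with standard expected improvement; the linear tree model and the paper's acquisition function are not reproduced, and every setting is illustrative.

```python
# Sketch: one independent GP surrogate per branch of a tree-structured space.
# Branch "a": f_a(x) = (x - 0.3)^2; branch "b": f_b(x) = (x - 0.7)^2 + 0.05.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)
objective = {"a": lambda x: (x - 0.3) ** 2, "b": lambda x: (x - 0.7) ** 2 + 0.05}
X = {b: rng.uniform(0, 1, size=(3, 1)) for b in objective}          # initial designs
y = {b: objective[b](X[b]).ravel() for b in objective}

def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

for _ in range(10):
    best = min(y[b].min() for b in objective)                       # global incumbent
    cand = np.linspace(0, 1, 201).reshape(-1, 1)
    scores = {}
    for b in objective:                                             # fit a GP per branch
        gp = GaussianProcessRegressor(normalize_y=True).fit(X[b], y[b])
        mu, sigma = gp.predict(cand, return_std=True)
        scores[b] = expected_improvement(mu, sigma, best)
    b = max(scores, key=lambda k: scores[k].max())                  # pick branch, then point
    x_next = cand[int(scores[b].argmax())]
    X[b] = np.vstack([X[b], x_next[None, :]])
    y[b] = np.append(y[b], objective[b](x_next[0]))

print({b: round(float(y[b].min()), 4) for b in objective})          # best value per branch
```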
High-Dimensional Structured Quantile Regression
Title | High-Dimensional Structured Quantile Regression |
Authors | Vidyashankar Sivakumar, Arindam Banerjee |
Abstract | Quantile regression aims at modeling the conditional median and quantiles of a response variable given certain predictor variables. In this work we consider the problem of linear quantile regression in high dimensions where the number of predictor variables is much higher than the number of samples available for parameter estimation. We assume the true parameter to have some structure characterized as having a small value according to some atomic norm R(.) and consider the norm regularized quantile regression estimator. We characterize the sample complexity for consistent recovery and give non-asymptotic bounds on the estimation error. While this problem has been previously considered, our analysis reveals geometric and statistical characteristics of the problem not available in prior literature. We perform experiments on synthetic data which support the theoretical results. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=741
PDF | http://proceedings.mlr.press/v70/sivakumar17a/sivakumar17a.pdf
PWC | https://paperswithcode.com/paper/high-dimensional-structured-quantile |
Repo | |
Framework | |
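As a concrete, simplified instance of the estimator studied above, the snippet below fits norm-regularized quantile regression with the L1 norm as R(.) by subgradient descent on the pinball loss; the penalty choice, step sizes, and data sizes are illustrative assumptions rather than the paper's setup.

```python
# Sketch: L1-regularized quantile regression via subgradient descent on the pinball loss.
import numpy as np

rng = np.random.default_rng(3)
n, p, tau, lam = 200, 500, 0.5, 0.1          # n << p: high-dimensional regime
beta_true = np.zeros(p)
beta_true[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]  # sparse true parameter
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.standard_t(df=3, size=n)   # heavy-tailed noise

beta = np.zeros(p)
for t in range(1, 2001):
    r = y - X @ beta
    # Subgradient of the pinball loss rho_tau(r) = r * (tau - 1[r < 0]) w.r.t. beta,
    # plus lam * sign(beta) for the L1 penalty.
    grad = -X.T @ (tau - (r < 0)) / n + lam * np.sign(beta)
    beta -= (0.5 / np.sqrt(t)) * grad

support = np.argsort(-np.abs(beta))[:5]      # largest coefficients typically land on 0..4
print(sorted(support), np.round(beta[sorted(support)], 2))
```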
Learning User Embeddings from Emails
Title | Learning User Embeddings from Emails |
Authors | Yan Song, Chia-Jung Lee |
Abstract | Many important email-related tasks, such as email classification or search, highly rely on building quality document representations (e.g., bag-of-words or key phrases) to assist matching and understanding. Despite prior success on representing textual messages, creating quality user representations from emails was overlooked. In this paper, we propose to represent users using embeddings that are trained to reflect the email communication network. Our experiments on Enron dataset suggest that the resulting embeddings capture the semantic distance between users. To assess the quality of embeddings in a real-world application, we carry out auto-foldering task where the lexical representation of an email is enriched with user embedding features. Our results show that folder prediction accuracy is improved when embedding features are present across multiple settings. |
Tasks | Recommendation Systems, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2116/ |
PWC | https://paperswithcode.com/paper/learning-user-embeddings-from-emails |
Repo | |
Framework | |
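The paper learns user embeddings that reflect the email communication network. The snippet below is only a generic stand-in for that idea: it factorizes a positive PMI matrix of toy who-emails-whom counts with an SVD, a classical way to embed such a graph, and is not the paper's training procedure.

```python
# Stand-in sketch: user embeddings from a who-emails-whom count matrix via PPMI + SVD.
import numpy as np

users = ["alice", "bob", "carol", "dave"]
# counts[i, j] = number of emails user i sent to user j (toy numbers).
counts = np.array([[0, 9, 1, 0],
                   [8, 0, 2, 0],
                   [1, 1, 0, 7],
                   [0, 0, 6, 0]], dtype=float)

co = counts + counts.T                       # symmetrize: communication strength
total = co.sum()
pmi = np.log((co / total + 1e-12) /
             (co.sum(1, keepdims=True) / total * co.sum(0, keepdims=True) / total + 1e-12))
ppmi = np.maximum(pmi, 0.0)                  # positive PMI

U, S, _ = np.linalg.svd(ppmi)
dim = 2
emb = U[:, :dim] * np.sqrt(S[:dim])          # low-dimensional user embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

print(cosine(emb[0], emb[1]))                # alice vs bob (heavy direct traffic)
print(cosine(emb[0], emb[3]))                # alice vs dave (no direct traffic)
```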
Projection-free Distributed Online Learning in Networks
Title | Projection-free Distributed Online Learning in Networks |
Authors | Wenpeng Zhang, Peilin Zhao, Wenwu Zhu, Steven C. H. Hoi, Tong Zhang |
Abstract | The conditional gradient algorithm has regained a surge of research interest in recent years due to its high efficiency in handling large-scale machine learning problems. However, none of existing studies has explored it in the distributed online learning setting, where locally light computation is assumed. In this paper, we fill this gap by proposing the distributed online conditional gradient algorithm, which eschews the expensive projection operation needed in its counterpart algorithms by exploiting much simpler linear optimization steps. We give a regret bound for the proposed algorithm as a function of the network size and topology, which will be smaller on smaller graphs or “well-connected” graphs. Experiments on two large-scale real-world datasets for a multiclass classification task confirm the computational benefit of the proposed algorithm and also verify the theoretical regret bound. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=700
PDF | http://proceedings.mlr.press/v70/zhang17g/zhang17g.pdf
PWC | https://paperswithcode.com/paper/projection-free-distributed-online-learning |
Repo | |
Framework | |
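The update at the heart of the abstract is the conditional gradient (Frank-Wolfe) step, which swaps the projection for a linear minimization. The snippet shows the single-learner online version over an L1 ball, where the linear step returns a signed, scaled basis vector; the distributed, network-aware variant analyzed in the paper is not reproduced, and the losses and step sizes are illustrative.

```python
# Sketch: online conditional gradient (Frank-Wolfe) over an L1 ball of radius R.
# The linear minimization oracle over the L1 ball returns a signed scaled basis
# vector, so no projection is ever needed.
import numpy as np

rng = np.random.default_rng(4)
d, R, T = 20, 1.0, 500
w_star = np.zeros(d); w_star[:3] = [0.5, -0.3, 0.2]       # target inside the L1 ball
w = np.zeros(d)
loss_sum = 0.0

for t in range(1, T + 1):
    x_t = rng.normal(size=d)                               # online example
    y_t = x_t @ w_star + 0.1 * rng.normal()
    grad = 2 * (w @ x_t - y_t) * x_t                       # gradient of squared loss at w
    loss_sum += (w @ x_t - y_t) ** 2

    i = int(np.argmax(np.abs(grad)))                       # linear minimization oracle:
    v = np.zeros(d); v[i] = -R * np.sign(grad[i])          #   argmin_{|v|_1 <= R} <grad, v>
    gamma = 2.0 / (t + 2)                                  # standard FW step size
    w = (1 - gamma) * w + gamma * v                        # stays in the L1 ball by convexity

print("average loss:", loss_sum / T, " |w|_1 =", float(np.abs(w).sum()))
```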
BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning
Title | BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning |
Authors | Yitong Li, Trevor Cohn, Timothy Baldwin |
Abstract | This paper describes our submission to the sentiment analysis sub-task of "Build It, Break It: The Language Edition (BIBI)", on both the builder and breaker sides. As a builder, we use convolutional neural nets, trained on both phrase and sentence data. As a breaker, we use Q-learning to learn minimal change pairs, and apply a token substitution method automatically. We analyse the results to gauge the robustness of NLP systems. |
Tasks | Q-Learning, Sentiment Analysis, Text Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5404/ |
PWC | https://paperswithcode.com/paper/bibi-system-description-building-with-cnns |
Repo | |
Framework | |
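On the builder side the system is a convolutional sentence classifier. Below is a generic PyTorch sketch of that kind of model with illustrative filter widths and dimensions (the paper does not specify its exact architecture here, so treat every hyperparameter as an assumption); the Q-learning breaker is not shown.

```python
# Sketch: a small convolutional sentence classifier (builder side), PyTorch.
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, n_filters=64,
                 widths=(3, 4, 5), n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, kernel_size=w, padding=w - 1) for w in widths])
        self.out = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        x = self.emb(token_ids).transpose(1, 2)    # (batch, emb_dim, seq_len)
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.out(torch.cat(feats, dim=1))   # (batch, n_classes) logits

model = SentenceCNN()
logits = model(torch.randint(1, 5000, (8, 20)))    # batch of 8 sentences, length 20
print(logits.shape)                                # torch.Size([8, 2])
```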
Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data
Title | Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data |
Authors | Matthieu Constant, Héctor Martinez Alonso |
Abstract | This article evaluates the extension of a dependency parser that performs joint syntactic analysis and multiword expression identification. We show that, given sufficient training data, the parser benefits from explicit multiword information and improves overall labeled accuracy score in eight of the ten evaluation cases. |
Tasks | Dependency Parsing, Lexical Analysis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1725/ |
PWC | https://paperswithcode.com/paper/benchmarking-joint-lexical-and-syntactic |
Repo | |
Framework | |
Breaking NLP: Using Morphosyntax, Semantics, Pragmatics and World Knowledge to Fool Sentiment Analysis Systems
Title | Breaking NLP: Using Morphosyntax, Semantics, Pragmatics and World Knowledge to Fool Sentiment Analysis Systems |
Authors | Taylor Mahler, Willy Cheung, Micha Elsner, David King, Marie-Catherine de Marneffe, Cory Shain, Symon Stevens-Guille, Michael White |
Abstract | This paper describes our "breaker" submission to the 2017 EMNLP "Build It Break It" shared task on sentiment analysis. In order to cause the "builder" systems to make incorrect predictions, we edited items in the blind test data according to linguistically interpretable strategies that allow us to assess the ease with which the builder systems learn various components of linguistic structure. On the whole, our submitted pairs break all systems at a high rate (72.6%), indicating that sentiment analysis as an NLP task may still have a lot of ground to cover. Of the breaker strategies that we consider, we find our semantic and pragmatic manipulations to pose the most substantial difficulties for the builder systems. |
Tasks | Sentiment Analysis |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5405/ |
PWC | https://paperswithcode.com/paper/consistent-classification-of-translation |
Repo | |
Framework | |
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
Title | High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation |
Authors | Zhuoran Yang, Krishnakumar Balasubramanian, Han Liu |
Abstract | We consider estimating the parametric component of single index models in high dimensions. Compared with existing work, we do not require the covariate to be normally distributed. Utilizing Stein’s Lemma, we propose estimators based on the score function of the covariate. Moreover, to handle score function and response variables that are heavy-tailed, our estimators are constructed via carefully thresholding their empirical counterparts. Under a bounded fourth moment condition, we establish optimal statistical rates of convergence for the proposed estimators. Extensive numerical experiments are provided to back up our theory. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=737
PDF | http://proceedings.mlr.press/v70/yang17a/yang17a.pdf
PWC | https://paperswithcode.com/paper/high-dimensional-non-gaussian-single-index |
Repo | |
Framework | |
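The estimator rests on a Stein-type identity: for a single index model y = f(&lt;x, beta&gt;) + noise, the expectation of y times the covariate score S(x) is proportional to beta. The sketch implements that first-order estimate with hard thresholding for i.i.d. Laplace covariates, whose score is sign(x); the distribution, link, and threshold are illustrative assumptions, and the paper's truncation of heavy-tailed scores and responses is omitted.

```python
# Sketch: first-order score-based estimation of a sparse single index model,
# with i.i.d. Laplace covariates, so the score function is S(x) = sign(x).
import numpy as np

rng = np.random.default_rng(5)
n, p, s = 2000, 200, 5
beta = np.zeros(p); beta[:s] = 1.0 / np.sqrt(s)            # unit-norm sparse direction
X = rng.laplace(size=(n, p))                               # non-Gaussian covariates
y = np.tanh(X @ beta) + 0.1 * rng.normal(size=n)           # unknown link f = tanh

score = np.sign(X)                                         # S(x) = -grad log p(x)
v = (y[:, None] * score).mean(axis=0)                      # empirical E[y S(x)], prop. to beta

thresh = 2.0 * np.sqrt(np.log(p) / n)                      # keep only large coordinates
v_hat = np.where(np.abs(v) > thresh, v, 0.0)
beta_hat = v_hat / (np.linalg.norm(v_hat) + 1e-12)         # the direction is identifiable

print("support:", np.flatnonzero(beta_hat)[:10])
print("cosine with truth:", float(beta_hat @ beta))
```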
Differentially Private Learning of Graphical Models using CGMs
Title | Differentially Private Learning of Graphical Models using CGMs |
Authors | Garrett Bernstein, Ryan McKenna, Tao Sun, Daniel Sheldon, Michael Hay, Gerome Miklau |
Abstract | We investigate the problem of learning discrete graphical models in a differentially private way. Approaches to this problem range from privileged algorithms that conduct learning completely behind the privacy barrier to schemes that release private summary statistics paired with algorithms to learn parameters from those statistics. We show that the approach of releasing noisy sufficient statistics using the Laplace mechanism achieves a good trade-off between privacy, utility, and practicality. A naive learning algorithm that uses the noisy sufficient statistics “as is” outperforms general-purpose differentially private learning algorithms. However, it has three limitations: it ignores knowledge about the data generating process, rests on uncertain theoretical foundations, and exhibits certain pathologies. We develop a more principled approach that applies the formalism of collective graphical models to perform inference over the true sufficient statistics within an expectation-maximization framework. We show that this learns better models than competing approaches on both synthetic data and on real human mobility data used as a case study. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=612
PDF | http://proceedings.mlr.press/v70/bernstein17a/bernstein17a.pdf
PWC | https://paperswithcode.com/paper/differentially-private-learning-of-graphical |
Repo | |
Framework | |
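The "naive" baseline the abstract mentions releases noisy sufficient statistics through the Laplace mechanism and learns from them as is. The sketch below does exactly that for a single pair of binary variables, whose sufficient statistics are the 2x2 contingency counts, assuming add/remove-one-record adjacency so the L1 sensitivity of the count table is 1; the collective-graphical-model EM approach from the paper is not shown.

```python
# Sketch: the naive "noisy sufficient statistics" baseline with the Laplace mechanism.
# Model: joint distribution of two binary variables, sufficient stats = 2x2 counts.
import numpy as np

rng = np.random.default_rng(6)
n, epsilon = 5000, 0.5
p_true = np.array([[0.4, 0.1],
                   [0.2, 0.3]])                       # true joint P(A, B)
data = rng.choice(4, size=n, p=p_true.ravel())
counts = np.bincount(data, minlength=4).reshape(2, 2).astype(float)

# Laplace mechanism: one record changes exactly one cell by 1, so the L1 sensitivity
# is 1 under add/remove adjacency and the noise scale is 1 / epsilon.
noisy = counts + rng.laplace(scale=1.0 / epsilon, size=(2, 2))

# Naive learning "as is": clip negatives and renormalize into a probability table.
noisy = np.clip(noisy, 0.0, None)
p_hat = noisy / noisy.sum()

print("true:\n", p_true)
print("estimate:\n", np.round(p_hat, 3))
print("total variation distance:", round(0.5 * np.abs(p_hat - p_true).sum(), 4))
```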
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue
Title | Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue |
Authors | |
Abstract | |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-5500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-18th-annual-sigdial |
Repo | |
Framework | |
Identity Deception Detection
Title | Identity Deception Detection |
Authors | Verónica Pérez-Rosas, Quincy Davenport, Anna Mengdan Dai, Mohamed Abouelenien, Rada Mihalcea |
Abstract | This paper addresses the task of detecting identity deception in language. Using a novel identity deception dataset, consisting of real and portrayed identities from 600 individuals, we show that we can build accurate identity detectors targeting both age and gender, with accuracies of up to 88%. We also perform an analysis of the linguistic patterns used in identity deception, which lead to interesting insights into identity portrayers. |
Tasks | Deception Detection |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1089/ |
PWC | https://paperswithcode.com/paper/identity-deception-detection |
Repo | |
Framework | |
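The detectors above are classifiers over the writers' language. Since the paper's features and learner are not listed here, the snippet below is only a generic stand-in: a bag-of-words logistic regression on made-up toy self-descriptions, not the paper's feature set or data.

```python
# Generic stand-in: bag-of-words classifier for toy deceptive vs. truthful self-descriptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I am a retired teacher and I love gardening on weekends",    # truthful (toy)
    "I enjoy quiet evenings reading history books at home",       # truthful (toy)
    "I am definitely a young professional, totally into sports",  # deceptive (toy)
    "honestly I am just a normal college student I swear",        # deceptive (toy)
]
labels = [0, 0, 1, 1]

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["I am totally a real teacher honestly"]))      # toy prediction
```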
Minimizing Trust Leaks for Robust Sybil Detection
Title | Minimizing Trust Leaks for Robust Sybil Detection |
Authors | János Höner, Shinichi Nakajima, Alexander Bauer, Klaus-Robert Müller, Nico Görnitz |
Abstract | Sybil detection is a crucial task to protect online social networks (OSNs) against intruders who try to manipulate automatic services provided by OSNs to their customers. In this paper, we first discuss the robustness of graph-based Sybil detectors SybilRank and Integro and refine theoretically their security guarantees towards more realistic assumptions. After that, we formally introduce adversarial settings for the graph-based Sybil detection problem and derive a corresponding optimal attacking strategy by exploitation of trust leaks. Based on our analysis, we propose transductive Sybil ranking (TSR), a robust extension to SybilRank and Integro that directly minimizes trust leaks. Our empirical evaluation shows significant advantages of TSR over state-of-the-art competitors on a variety of attacking scenarios on artificially generated data and real-world datasets. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=669
PDF | http://proceedings.mlr.press/v70/honer17a/honer17a.pdf
PWC | https://paperswithcode.com/paper/minimizing-trust-leaks-for-robust-sybil |
Repo | |
Framework | |
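TSR extends graph-based detectors such as SybilRank, which propagate trust from known-honest seeds for a small number of power iterations and rank nodes by degree-normalized trust. The snippet sketches that baseline propagation on a toy graph; the adjacency matrix, seeds, and iteration count are illustrative, and the trust-leak-minimizing TSR objective itself is not implemented.

```python
# Sketch: SybilRank-style trust propagation (the baseline that TSR builds on).
# Trust starts on known-honest seeds, is spread along edges for ~log(n) steps,
# and nodes are ranked by degree-normalized trust.
import numpy as np

# Toy undirected graph: nodes 0-3 are honest, 4-5 are Sybils attached via edge (3, 4).
A = np.array([[0, 1, 1, 1, 0, 0],
              [1, 0, 1, 0, 0, 0],
              [1, 1, 0, 1, 0, 0],
              [1, 0, 1, 0, 1, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 0, 1, 0]], dtype=float)
deg = A.sum(axis=1)
P = A / deg[:, None]                          # row-stochastic transition matrix

trust = np.zeros(len(A))
trust[[0, 1]] = 0.5                           # total trust 1.0 on known-honest seeds

n_iter = int(np.ceil(np.log2(len(A))))        # early termination after ~log(n) rounds
for _ in range(n_iter):
    trust = trust @ P                         # each node splits its trust among neighbors

ranking = np.argsort(-(trust / deg))          # rank by degree-normalized trust
print("most-trusted first:", ranking)         # the Sybil nodes 4 and 5 should rank low
```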