July 26, 2019

2165 words, 11 min read

Paper Group NANR 163

Universal Dependencies for Afrikaans. High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm. Differentially Private Chi-squared Test by Unit Circle Mechanism. A Multi-task Approach to Predict Likability of Books. Should Neural Network Architecture Reflect Linguistic Structure?. Sparse Approximate Conic Hulls. Coll …

Universal Dependencies for Afrikaans

Title Universal Dependencies for Afrikaans
Authors Peter Dirix, Liesbeth Augustinus, Daniel van Niekerk, Frank Van Eynde
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0405/
PDF https://www.aclweb.org/anthology/W17-0405
PWC https://paperswithcode.com/paper/universal-dependencies-for-afrikaans
Repo
Framework

High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm

Title High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm
Authors Rongda Zhu, Lingxiao Wang, Chengxiang Zhai, Quanquan Gu
Abstract We propose a generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models. At the core of our algorithm is a novel semi-stochastic variance-reduced gradient designed for the $Q$-function in the EM algorithm. Under a mild condition on the initialization, our algorithm is guaranteed to attain a linear convergence rate to the unknown parameter of the latent variable model, and achieve an optimal statistical rate up to a logarithmic factor for parameter estimation. Compared with existing high-dimensional EM algorithms, our algorithm enjoys a better computational complexity and is therefore more efficient. We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments. We believe that the proposed semi-stochastic gradient is of independent interest for general nonconvex optimization problems with bivariate structures.
Tasks Latent Variable Models
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=547
PDF http://proceedings.mlr.press/v70/zhu17a/zhu17a.pdf
PWC https://paperswithcode.com/paper/high-dimensional-variance-reduced-stochastic
Repo
Framework
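
The paper's core update builds on the standard SVRG construction: a per-sample gradient, corrected by the same sample's gradient at a reference point plus the full gradient at that point. A minimal sketch on a least-squares objective (a toy stand-in for the paper's $Q$-function; the function name and hyperparameters are illustrative, not the paper's):

```python
import numpy as np

def svrg_least_squares(X, y, eta=0.02, epochs=15, seed=0):
    """SVRG on least squares, illustrating the variance-reduced
    gradient g_i(w) - g_i(w_ref) + full_grad(w_ref)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_ref = w.copy()
        # Full gradient at the reference ("anchor") point.
        mu = X.T @ (X @ w_ref - y) / n
        for _ in range(n):
            i = rng.integers(n)
            g_i = X[i] * (X[i] @ w - y[i])        # stochastic gradient at w
            g_ref = X[i] * (X[i] @ w_ref - y[i])  # same sample at the anchor
            w = w - eta * (g_i - g_ref + mu)      # semi-stochastic VR step
    return w

# Synthetic well-specified problem: y = X @ w_true exactly.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 5))
w_true = data_rng.normal(size=5)
y = X @ w_true
w_hat = svrg_least_squares(X, y)
```

Because the correction term has expectation equal to the full gradient, each step stays unbiased while its variance shrinks as the iterates approach the anchor, which is what enables a linear convergence rate.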

Differentially Private Chi-squared Test by Unit Circle Mechanism

Title Differentially Private Chi-squared Test by Unit Circle Mechanism
Authors Kazuya Kakizaki, Kazuto Fukuchi, Jun Sakuma
Abstract This paper develops differentially private mechanisms for the $\chi^2$ test of independence. While existing works focus on properly controlling the type-I error, we additionally investigate the type-II error of differentially private mechanisms. Based on this analysis, we present the unit circle mechanism: a novel differentially private mechanism based on the geometrical properties of the test statistic. Compared to existing output perturbation mechanisms, our mechanism improves the dominant term of the type-II error from $O(1)$ to $O(\exp(-\sqrt{N}))$, where $N$ is the sample size. Furthermore, we introduce novel procedures for multiple $\chi^2$ tests by incorporating the unit circle mechanism into the sparse vector technique and the exponential mechanism. These procedures can properly control the family-wise error rate (FWER), which has never been attained by existing mechanisms.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=847
PDF http://proceedings.mlr.press/v70/kakizaki17a/kakizaki17a.pdf
PWC https://paperswithcode.com/paper/differentially-private-chi-squared-test-by
Repo
Framework
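
For contrast with the paper's geometric mechanism, the output-perturbation baseline it improves on can be sketched as follows: compute the Pearson statistic and add Laplace noise calibrated to a sensitivity bound. The sensitivity value passed in below is a placeholder assumption, not the paper's analysis:

```python
import numpy as np

def chi2_stat(table):
    """Pearson chi-squared statistic for a 2-D contingency table."""
    table = np.asarray(table, dtype=float)
    total = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / total
    return ((table - expected) ** 2 / expected).sum()

def dp_chi2_laplace(table, epsilon, sensitivity, seed=None):
    """Output perturbation: release the statistic plus Laplace noise
    with scale (assumed) sensitivity / epsilon."""
    rng = np.random.default_rng(seed)
    return chi2_stat(table) + rng.laplace(scale=sensitivity / epsilon)

table = [[10, 20], [20, 10]]
noisy = dp_chi2_laplace(table, epsilon=1.0, sensitivity=1.0, seed=0)
```

The noise floor of this baseline is what produces the $O(1)$ type-II error term that the unit circle mechanism reduces to $O(\exp(-\sqrt{N}))$.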

A Multi-task Approach to Predict Likability of Books

Title A Multi-task Approach to Predict Likability of Books
Authors Suraj Maharjan, John Arevalo, Manuel Montes, Fabio A. González, Thamar Solorio
Abstract We investigate the value of feature engineering and neural network models for predicting successful writing. Similar to previous work, we treat this as a binary classification task and explore new strategies to automatically learn representations from book contents. We evaluate our feature set on two different corpora created from Project Gutenberg books. The first presents a novel approach for generating the gold standard labels for the task and the other is based on prior research. Using a combination of hand-crafted and recurrent neural network learned representations in a dual learning setting, we obtain the best performance of 73.50% weighted F1-score.
Tasks Feature Engineering, Sentiment Analysis
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1114/
PDF https://www.aclweb.org/anthology/E17-1114
PWC https://paperswithcode.com/paper/a-multi-task-approach-to-predict-likability
Repo
Framework

Should Neural Network Architecture Reflect Linguistic Structure?

Title Should Neural Network Architecture Reflect Linguistic Structure?
Authors Chris Dyer
Abstract I explore the hypothesis that conventional neural network models (e.g., recurrent neural networks) are incorrectly biased for making linguistically sensible generalizations when learning, and that a better class of models is based on architectures that reflect hierarchical structures for which considerable behavioral evidence exists. I focus on the problem of modeling and representing the meanings of sentences. On the generation front, I introduce recurrent neural network grammars (RNNGs), a joint, generative model of phrase-structure trees and sentences. RNNGs operate via a recursive syntactic process reminiscent of probabilistic context-free grammar generation, but decisions are parameterized using RNNs that condition on the entire (top-down, left-to-right) syntactic derivation history, thus relaxing context-free independence assumptions, while retaining a bias toward explaining decisions via “syntactically local” conditioning contexts. Experiments show that RNNGs obtain better results in generating language than models that don’t exploit linguistic structure. On the representation front, I explore unsupervised learning of syntactic structures based on distant semantic supervision using a reinforcement-learning algorithm. The learner seeks a syntactic structure that provides a compositional architecture that produces a good representation for a downstream semantic task. Although the inferred structures are quite different from traditional syntactic analyses, the performance on the downstream tasks surpasses that of systems that use sequential RNNs and tree-structured RNNs based on treebank dependencies. This is joint work with Adhi Kuncoro, Dani Yogatama, Miguel Ballesteros, Phil Blunsom, Ed Grefenstette, Wang Ling, and Noah A. Smith.
Tasks
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1001/
PDF https://www.aclweb.org/anthology/K17-1001
PWC https://paperswithcode.com/paper/should-neural-network-architecture-reflect
Repo
Framework
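
The generative action sequence that RNNGs parameterize (open a nonterminal, generate a word, reduce) can be replayed deterministically to reconstruct the phrase-structure tree. A toy replay, without the RNN parameterization of action probabilities:

```python
def actions_to_tree(actions):
    """Replay RNNG-style actions NT(X), GEN(w), REDUCE into a
    bracketed phrase-structure tree string."""
    stack = []
    for act in actions:
        if act == "REDUCE":
            # Pop completed children back to the open nonterminal.
            children = []
            while not (isinstance(stack[-1], tuple) and stack[-1][1] == "open"):
                children.append(stack.pop())
            label, _ = stack.pop()
            stack.append("(" + label + " " + " ".join(reversed(children)) + ")")
        elif act.startswith("NT("):
            stack.append((act[3:-1], "open"))   # open a nonterminal
        else:
            stack.append(act[4:-1])             # GEN(word): emit a terminal
    return stack[0]

tree = actions_to_tree(["NT(S)", "NT(NP)", "GEN(the)", "GEN(cat)", "REDUCE",
                        "NT(VP)", "GEN(sleeps)", "REDUCE", "REDUCE"])
```

In the actual model, each action's probability is conditioned (via RNNs) on the full derivation history so far, which is what relaxes the context-free independence assumptions.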

Sparse Approximate Conic Hulls

Title Sparse Approximate Conic Hulls
Authors Greg Van Buskirk, Benjamin Raichel, Nicholas Ruozzi
Abstract We consider the problem of computing a restricted nonnegative matrix factorization (NMF) of an $m\times n$ matrix $X$. Specifically, we seek a factorization $X\approx BC$, where the $k$ columns of $B$ are a subset of those from $X$ and $C\in\mathbb{R}_{\geq 0}^{k\times n}$. Equivalently, given the matrix $X$, consider the problem of finding a small subset, $S$, of the columns of $X$ such that the conic hull of $S$ $\epsilon$-approximates the conic hull of the columns of $X$, i.e., the distance of every column of $X$ to the conic hull of the columns of $S$ should be at most an $\epsilon$-fraction of the angular diameter of $X$. If $k$ is the size of the smallest $\epsilon$-approximation, then we produce an $O(k/\epsilon^{2/3})$ sized $O(\epsilon^{1/3})$-approximation, yielding the first provable, polynomial time $\epsilon$-approximation for this class of NMF problems, where also desirably the approximation is independent of $n$ and $m$. Furthermore, we prove an approximate conic Carathéodory theorem, a general sparsity result, that shows that any column of $X$ can be $\epsilon$-approximated with an $O(1/\epsilon^2)$ sparse combination from $S$. Our results are facilitated by a reduction to the problem of approximating convex hulls, and we prove that both the convex and conic hull variants are $d$-SUM-hard, resolving an open problem. Finally, we provide experimental results for the convex and conic algorithms on a variety of feature selection tasks.
Tasks Feature Selection
Published 2017-12-01
URL http://papers.nips.cc/paper/6847-sparse-approximate-conic-hulls
PDF http://papers.nips.cc/paper/6847-sparse-approximate-conic-hulls.pdf
PWC https://paperswithcode.com/paper/sparse-approximate-conic-hulls
Repo
Framework
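
The column-subset view of the problem admits a simple exhaustive greedy heuristic: repeatedly add the column of $X$ whose inclusion most reduces the total distance of all columns to the current conic hull, with distances measured by nonnegative least squares. This is an illustrative sketch, not the paper's polynomial-time approximation algorithm:

```python
import numpy as np
from scipy.optimize import nnls

def cone_residual(S, b):
    """Distance of b to the conic hull of the columns of S,
    via nonnegative least squares."""
    _, rnorm = nnls(S, b)
    return rnorm

def greedy_cone_select(X, k):
    """Exhaustive greedy: add the column whose inclusion most
    reduces the total conic-hull distance over all columns."""
    n = X.shape[1]
    chosen = []
    for _ in range(k):
        best_j, best_cost = None, np.inf
        for j in range(n):
            if j in chosen:
                continue
            S = X[:, chosen + [j]]
            cost = sum(cone_residual(S, X[:, c]) for c in range(n))
            if cost < best_cost:
                best_j, best_cost = j, cost
        chosen.append(best_j)
    return chosen

# Five 2-D columns: two extreme rays plus three interior mixtures.
X = np.array([[1.0, 0.0, 0.7, 0.5, 0.3],
              [0.0, 1.0, 0.3, 0.5, 0.7]])
chosen = greedy_cone_select(X, 3)
```

Once both extreme rays are selected, every remaining column lies inside the cone and the total residual drops to zero; the paper's contribution is achieving provable guarantees for such subset selection in polynomial time.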

Collaborative Partitioning for Coreference Resolution

Title Collaborative Partitioning for Coreference Resolution
Authors Olga Uryupina, Alessandro Moschitti
Abstract This paper presents a collaborative partitioning algorithm, a novel ensemble-based approach to coreference resolution. Starting from the all-singleton partition, we search for a solution close to the ensemble’s outputs in terms of a task-specific similarity measure. Our approach assumes a loose integration of individual components of the ensemble and can therefore combine arbitrary coreference resolvers, regardless of their models. Our experiments on the CoNLL dataset show that collaborative partitioning yields results superior to those attained by the individual components, for ensembles of both strong and weak systems. Moreover, by applying the collaborative partitioning algorithm on top of three state-of-the-art resolvers, we obtain the best coreference performance reported so far in the literature (MELA v08 score of 64.47).
Tasks Coreference Resolution
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1007/
PDF https://www.aclweb.org/anthology/K17-1007
PWC https://paperswithcode.com/paper/collaborative-partitioning-for-coreference
Repo
Framework
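
One simple way to combine the pairwise decisions of an ensemble of resolvers, in the spirit of (though much cruder than) the paper's search from the all-singleton partition, is majority voting over mention pairs followed by transitive closure via union-find:

```python
from itertools import combinations

def vote_partition(partitions):
    """Merge two mentions iff a strict majority of the ensemble
    partitions place them in the same cluster, then take the
    transitive closure with union-find."""
    mentions = sorted({m for p in partitions for cluster in p for m in cluster})
    parent = {m: m for m in mentions}

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path halving
            m = parent[m]
        return m

    for a, b in combinations(mentions, 2):
        votes = sum(any(a in c and b in c for c in p) for p in partitions)
        if votes * 2 > len(partitions):    # strict majority links a and b
            parent[find(a)] = find(b)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), set()).add(m)
    return [frozenset(c) for c in clusters.values()]

ensemble = [
    [{"a", "b"}, {"c"}, {"d"}],
    [{"a", "b", "c"}, {"d"}],
    [{"a"}, {"b", "c"}, {"d"}],
]
clusters = vote_partition(ensemble)
```

The transitive closure means two mentions can end up clustered even without a direct majority link, which is one reason the paper instead searches for a partition optimizing a task-specific similarity to the ensemble outputs.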

Recognizing Counterfactual Thinking in Social Media Texts

Title Recognizing Counterfactual Thinking in Social Media Texts
Authors Youngseo Son, Anneke Buffone, Joe Raso, Allegra Larche, Anthony Janocko, Kevin Zembroski, H. Andrew Schwartz, Lyle Ungar
Abstract
Tasks
Published 2017-07-01
URL https://www.aclweb.org/anthology/papers/P17-2103/p17-2103
PDF https://www.aclweb.org/anthology/P17-2103v2
PWC https://paperswithcode.com/paper/recognizing-counterfactual-thinking-in-social
Repo
Framework

Universal Dependencies for Serbian in Comparison with Croatian and Other Slavic Languages

Title Universal Dependencies for Serbian in Comparison with Croatian and Other Slavic Languages
Authors Tanja Samardžić, Mirjana Starović, Željko Agić, Nikola Ljubešić
Abstract The paper documents the procedure of building a new Universal Dependencies (UDv2) treebank for Serbian starting from an existing Croatian UDv1 treebank and taking into account the other Slavic UD annotation guidelines. We describe the automatic and manual annotation procedures, discuss the annotation of Slavic-specific categories (case governing quantifiers, reflexive pronouns, question particles) and propose an approach to handling deverbal nouns in Slavic languages.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1407/
PDF https://www.aclweb.org/anthology/W17-1407
PWC https://paperswithcode.com/paper/universal-dependencies-for-serbian-in
Repo
Framework

Variable Mini-Batch Sizing and Pre-Trained Embeddings

Title Variable Mini-Batch Sizing and Pre-Trained Embeddings
Authors Mostafa Abdou, Vladan Glončák, Ondřej Bojar
Abstract
Tasks Machine Translation, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4780/
PDF https://www.aclweb.org/anthology/W17-4780
PWC https://paperswithcode.com/paper/variable-mini-batch-sizing-and-pre-trained
Repo
Framework

Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”

Title Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”
Authors Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause
Abstract How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time? This is an important challenge in online services, where the users generating the data may decide to exercise their right to restrict the service provider from using (part of) their data due to privacy concerns. Motivated by this challenge, we introduce the dynamic deletion-robust submodular maximization problem. We develop the first resilient streaming algorithm, called ROBUST-STREAMING, with a constant factor approximation guarantee to the optimum solution. We evaluate the effectiveness of our approach on several real-world applications, including summarizing (1) streams of geo-coordinates; (2) streams of images; and (3) click-stream log data, consisting of 45 million feature vectors from a news recommendation task.
Tasks Data Summarization
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=698
PDF http://proceedings.mlr.press/v70/mirzasoleiman17a/mirzasoleiman17a.pdf
PWC https://paperswithcode.com/paper/deletion-robust-submodular-maximization-data
Repo
Framework
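
The underlying summarization objective can be illustrated with greedy max-coverage. The toy below handles a deletion by simply re-running greedy on the surviving candidates, whereas the paper's ROBUST-STREAMING maintains enough redundancy to absorb deletions without reprocessing the stream:

```python
def greedy_coverage(candidates, k):
    """Plain greedy max-coverage summary of size at most k."""
    covered, summary = set(), []
    for _ in range(k):
        best = max(candidates, key=lambda s: len(set(s) - covered), default=None)
        if best is None or not set(best) - covered:
            break  # no candidate adds new coverage
        summary.append(best)
        covered |= set(best)
    return summary, covered

docs = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {6, 7}]
summary, covered = greedy_coverage(docs, k=2)

# A user deletes one of the selected elements; naively, we must
# recompute the summary from the remaining candidates.
remaining = [s for s in docs if s != {4, 5, 6}]
_, covered_after = greedy_coverage(remaining, k=2)
```

Coverage is monotone submodular, so plain greedy enjoys the classic $(1-1/e)$ guarantee offline; the hard part addressed by the paper is preserving a constant-factor guarantee in a single pass under adversarial deletions.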

Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages

Title Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages
Authors Shashikant Sharma, Anil Kumar Singh
Abstract
Tasks Machine Translation, Speech Recognition
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4007/
PDF https://www.aclweb.org/anthology/W17-4007
PWC https://paperswithcode.com/paper/word-transduction-for-addressing-the-oov
Repo
Framework

Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification

Title Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification
Authors Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Marco A. Valenzuela-Escárcega, Peter Clark, Michael Hammond
Abstract For many applications of question answering (QA), being able to explain why a given model chose an answer is critical. However, the lack of labeled data for answer justifications makes learning this difficult and expensive. Here we propose an approach that uses answer ranking as distant supervision for learning how to select informative justifications, where justifications serve as inferential connections between the question and the correct answer while often containing little lexical overlap with either. We propose a neural network architecture for QA that reranks answer justifications as an intermediate (and human-interpretable) step in answer selection. Our approach is informed by a set of features designed to combine both learned representations and explicit features to capture the connection between questions, answers, and answer justifications. We show that with this end-to-end approach we are able to significantly improve upon a strong IR baseline in both justification ranking (+9% rated highly relevant) and answer selection (+6% P@1).
Tasks Answer Selection, Interpretable Machine Learning, Question Answering
Published 2017-08-01
URL https://www.aclweb.org/anthology/K17-1009/
PDF https://www.aclweb.org/anthology/K17-1009
PWC https://paperswithcode.com/paper/tell-me-why-using-question-answering-as
Repo
Framework
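
A crude stand-in for the paper's learned reranker is a single explicit feature: lexical overlap between the question-plus-answer and a candidate justification. (The abstract notes that good justifications often have little such overlap, which is precisely why the paper combines explicit features with learned representations rather than relying on overlap alone.)

```python
def overlap_score(question, answer, justification):
    """Toy explicit feature: fraction of question+answer terms
    that also appear in the justification."""
    q, a, j = (set(t.lower().split()) for t in (question, answer, justification))
    target = q | a
    return len(target & j) / len(target) if target else 0.0

def rerank(question, answer, justifications):
    """Sort candidate justifications by descending overlap score."""
    return sorted(justifications,
                  key=lambda j: overlap_score(question, answer, j),
                  reverse=True)

ranked = rerank("why is the sky blue", "rayleigh scattering",
                ["rayleigh scattering makes the sky blue",
                 "cats are mammals"])
```

In the paper's setup this scoring function is replaced by a neural reranker trained with answer correctness as distant supervision, so no justification-level labels are needed.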

BabelDomains: Large-Scale Domain Labeling of Lexical Resources

Title BabelDomains: Large-Scale Domain Labeling of Lexical Resources
Authors Jose Camacho-Collados, Roberto Navigli
Abstract In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge. We propose an automatic method that uses knowledge from various lexical resources, exploiting both distributional and graph-based clues, to accurately propagate domain information. We evaluate our methodology intrinsically on two lexical resources (WordNet and BabelNet), achieving a precision over 80% in both cases. Finally, we show the potential of BabelDomains in a supervised learning setting, clustering training data by domain for hypernym discovery.
Tasks Domain Adaptation, Hypernym Discovery, Sentiment Analysis, Text Categorization, Word Sense Disambiguation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2036/
PDF https://www.aclweb.org/anthology/E17-2036
PWC https://paperswithcode.com/paper/babeldomains-large-scale-domain-labeling-of
Repo
Framework
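
Graph-based propagation of domain labels, one of the two kinds of clues the method exploits, can be sketched as seeded label propagation over a resource graph. The graph, seeds, and iteration count below are all illustrative:

```python
import numpy as np

def propagate(adj, seeds, iters=10):
    """One-hot seed labels spread over a row-normalized graph;
    seed nodes are clamped back to their labels each iteration."""
    A = adj / adj.sum(axis=1, keepdims=True)
    labels = seeds.astype(float).copy()
    clamped = seeds.sum(axis=1) > 0          # nodes with known domains
    for _ in range(iters):
        labels = A @ labels                  # average neighbors' labels
        labels[clamped] = seeds[clamped]     # re-clamp the seeds
    return labels.argmax(axis=1)

# A 4-node chain (with self-loops); node 0 seeded with domain 0,
# node 3 with domain 1; the middle nodes inherit the nearer seed.
adj = np.array([[1.0, 1, 0, 0],
                [1.0, 1, 1, 0],
                [0.0, 1, 1, 1],
                [0.0, 0, 1, 1]])
seeds = np.array([[1.0, 0], [0, 0], [0, 0], [0, 1]])
result = propagate(adj, seeds)
```

Each unlabeled node converges toward a weighted average of its neighbors' domain distributions, so labels flow outward from confidently labeled items, which is the intuition behind propagating domains across a lexical resource.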

Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers

Title Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers
Authors Sanja Štajner, Victoria Yaneva, Ruslan Mitkov, Simone Paolo Ponzetto
Abstract Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read. However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what vocabulary is challenging for them. We present parallel gaze data obtained from adult readers with autism and a control group of neurotypical readers and show that the former required higher cognitive effort to comprehend the texts as evidenced by three gaze-based measures. We divide all words into four classes based on their viewing times for both groups and investigate the relationship between longer viewing times and word length, word frequency, and four cognitively-based measures (word concreteness, familiarity, age of acquisition and imageability).
Tasks Eye Tracking, Lexical Simplification
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-5030/
PDF https://www.aclweb.org/anthology/W17-5030
PWC https://paperswithcode.com/paper/effects-of-lexical-properties-on-viewing-time
Repo
Framework