Paper Group NANR 163
Universal Dependencies for Afrikaans. High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm. Differentially Private Chi-squared Test by Unit Circle Mechanism. A Multi-task Approach to Predict Likability of Books. Should Neural Network Architecture Reflect Linguistic Structure? Sparse Approximate Conic Hulls. Collaborative Partitioning for Coreference Resolution. Recognizing Counterfactual Thinking in Social Media Texts. Universal Dependencies for Serbian in Comparison with Croatian and Other Slavic Languages. Variable Mini-Batch Sizing and Pre-Trained Embeddings. Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”. Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages. Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification. BabelDomains: Large-Scale Domain Labeling of Lexical Resources. Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers
Universal Dependencies for Afrikaans
Title | Universal Dependencies for Afrikaans |
Authors | Peter Dirix, Liesbeth Augustinus, Daniel van Niekerk, Frank Van Eynde |
Abstract | |
Tasks | |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0405/ |
https://www.aclweb.org/anthology/W17-0405 | |
PWC | https://paperswithcode.com/paper/universal-dependencies-for-afrikaans |
Repo | |
Framework | |
High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm
Title | High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm |
Authors | Rongda Zhu, Lingxiao Wang, Chengxiang Zhai, Quanquan Gu |
Abstract | We propose a generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models. At the core of our algorithm is a novel semi-stochastic variance-reduced gradient designed for the $Q$-function in the EM algorithm. Under a mild condition on the initialization, our algorithm is guaranteed to attain a linear convergence rate to the unknown parameter of the latent variable model, and achieve an optimal statistical rate up to a logarithmic factor for parameter estimation. Compared with existing high-dimensional EM algorithms, our algorithm enjoys a better computational complexity and is therefore more efficient. We apply our generic algorithm to two illustrative latent variable models: Gaussian mixture model and mixture of linear regression, and demonstrate the advantages of our algorithm by both theoretical analysis and numerical experiments. We believe that the proposed semi-stochastic gradient is of independent interest for general nonconvex optimization problems with bivariate structures. |
Tasks | Latent Variable Models |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=547 |
http://proceedings.mlr.press/v70/zhu17a/zhu17a.pdf | |
PWC | https://paperswithcode.com/paper/high-dimensional-variance-reduced-stochastic |
Repo | |
Framework | |
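The semi-stochastic variance-reduced gradient at the core of this algorithm can be illustrated on a toy symmetric Gaussian mixture. The sketch below is a deliberately simplified assumption-laden stand-in (one dimension, known unit variance, no sparsity or truncation step, made-up step size and epoch counts), not the paper's algorithm:

```python
import numpy as np

# Toy model: 0.5 * N(theta, 1) + 0.5 * N(-theta, 1); only theta is unknown.
rng = np.random.default_rng(0)
theta_true = 2.0
signs = rng.choice([-1.0, 1.0], size=2000)
x = signs * theta_true + rng.normal(size=2000)

def sample_grad(theta, xi):
    # E-step posterior w = sigmoid(2 * theta * x); for this symmetric model
    # the per-sample Q-function gradient collapses to (2w - 1) * x - theta.
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * xi))
    return (2.0 * w - 1.0) * xi - theta

def svrg_em(x, theta0, eta=0.1, epochs=10):
    # SVRG-style gradient EM: a full gradient computed at a snapshot
    # anchors low-variance stochastic updates within each epoch.
    theta, n = theta0, len(x)
    idx = np.random.default_rng(1)
    for _ in range(epochs):
        snap = theta
        mu = np.mean(sample_grad(snap, x))       # full gradient at snapshot
        for _ in range(n):
            i = idx.integers(n)
            v = sample_grad(theta, x[i]) - sample_grad(snap, x[i]) + mu
            theta += eta * v                     # ascent on Q
    return theta

theta_hat = svrg_em(x, theta0=0.5)
```

With this setup the estimate lands close to the true component mean, and the variance-reduced direction shrinks to zero as the iterate approaches the snapshot.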
Differentially Private Chi-squared Test by Unit Circle Mechanism
Title | Differentially Private Chi-squared Test by Unit Circle Mechanism |
Authors | Kazuya Kakizaki, Kazuto Fukuchi, Jun Sakuma |
Abstract | This paper develops differentially private mechanisms for $\chi^2$ test of independence. While existing works put their effort into properly controlling the type-I error, in addition to that, we investigate the type-II error of differentially private mechanisms. Based on the analysis, we present unit circle mechanism: a novel differentially private mechanism based on the geometrical property of the test statistics. Compared to existing output perturbation mechanisms, our mechanism improves the dominated term of the type-II error from $O(1)$ to $O(\exp(-\sqrt{N}))$ where $N$ is the sample size. Furthermore, we introduce novel procedures for multiple $\chi^2$ tests by incorporating the unit circle mechanism into the sparse vector technique and the exponential mechanism. These procedures can control the family-wise error rate (FWER) properly, which has never been attained by existing mechanisms. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=847 |
http://proceedings.mlr.press/v70/kakizaki17a/kakizaki17a.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-chi-squared-test-by |
Repo | |
Framework | |
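For context, here is a minimal sketch of the output-perturbation baseline the abstract compares against: compute the $\chi^2$ statistic and add Laplace noise calibrated to a sensitivity bound. The `sensitivity` value is a hypothetical placeholder, and the paper's unit circle mechanism itself is not reproduced here:

```python
import numpy as np

def chi2_stat(table):
    # Pearson chi-squared statistic for an r x c contingency table.
    table = np.asarray(table, dtype=float)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

def dp_chi2_output_perturbation(table, epsilon, sensitivity, seed=0):
    # Output perturbation: release the statistic plus
    # Laplace(sensitivity / epsilon) noise. `sensitivity` must be a valid
    # bound on one record's influence on the statistic (assumed here).
    rng = np.random.default_rng(seed)
    return chi2_stat(table) + rng.laplace(scale=sensitivity / epsilon)

exact = chi2_stat([[10, 10], [10, 10]])   # perfectly independent table
noisy = dp_chi2_output_perturbation([[25, 15], [10, 30]],
                                    epsilon=1.0, sensitivity=8.0)
```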
A Multi-task Approach to Predict Likability of Books
Title | A Multi-task Approach to Predict Likability of Books |
Authors | Suraj Maharjan, John Arevalo, Manuel Montes, Fabio A. González, Thamar Solorio |
Abstract | We investigate the value of feature engineering and neural network models for predicting successful writing. Similar to previous work, we treat this as a binary classification task and explore new strategies to automatically learn representations from book contents. We evaluate our feature set on two different corpora created from Project Gutenberg books. The first presents a novel approach for generating the gold standard labels for the task and the other is based on prior research. Using a combination of hand-crafted and recurrent neural network learned representations in a dual learning setting, we obtain the best performance of 73.50% weighted F1-score. |
Tasks | Feature Engineering, Sentiment Analysis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1114/ |
https://www.aclweb.org/anthology/E17-1114 | |
PWC | https://paperswithcode.com/paper/a-multi-task-approach-to-predict-likability |
Repo | |
Framework | |
Should Neural Network Architecture Reflect Linguistic Structure?
Title | Should Neural Network Architecture Reflect Linguistic Structure? |
Authors | Chris Dyer |
Abstract | I explore the hypothesis that conventional neural network models (e.g., recurrent neural networks) are incorrectly biased for making linguistically sensible generalizations when learning, and that a better class of models is based on architectures that reflect hierarchical structures for which considerable behavioral evidence exists. I focus on the problem of modeling and representing the meanings of sentences. On the generation front, I introduce recurrent neural network grammars (RNNGs), a joint, generative model of phrase-structure trees and sentences. RNNGs operate via a recursive syntactic process reminiscent of probabilistic context-free grammar generation, but decisions are parameterized using RNNs that condition on the entire (top-down, left-to-right) syntactic derivation history, thus relaxing context-free independence assumptions, while retaining a bias toward explaining decisions via “syntactically local” conditioning contexts. Experiments show that RNNGs obtain better results in generating language than models that don't exploit linguistic structure. On the representation front, I explore unsupervised learning of syntactic structures based on distant semantic supervision using a reinforcement-learning algorithm. The learner seeks a syntactic structure that provides a compositional architecture that produces a good representation for a downstream semantic task. Although the inferred structures are quite different from traditional syntactic analyses, the performance on the downstream tasks surpasses that of systems that use sequential RNNs and tree-structured RNNs based on treebank dependencies. This is joint work with Adhi Kuncoro, Dani Yogatama, Miguel Ballesteros, Phil Blunsom, Ed Grefenstette, Wang Ling, and Noah A. Smith. |
Tasks | |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1001/ |
https://www.aclweb.org/anthology/K17-1001 | |
PWC | https://paperswithcode.com/paper/should-neural-network-architecture-reflect |
Repo | |
Framework | |
Sparse Approximate Conic Hulls
Title | Sparse Approximate Conic Hulls |
Authors | Greg Van Buskirk, Benjamin Raichel, Nicholas Ruozzi |
Abstract | We consider the problem of computing a restricted nonnegative matrix factorization (NMF) of an $m \times n$ matrix $X$. Specifically, we seek a factorization $X \approx BC$, where the $k$ columns of $B$ are a subset of those from $X$ and $C \in \mathbb{R}_{\geq 0}^{k \times n}$. Equivalently, given the matrix $X$, consider the problem of finding a small subset, $S$, of the columns of $X$ such that the conic hull of $S$ $\epsilon$-approximates the conic hull of the columns of $X$, i.e., the distance of every column of $X$ to the conic hull of the columns of $S$ should be at most an $\epsilon$-fraction of the angular diameter of $X$. If $k$ is the size of the smallest $\epsilon$-approximation, then we produce an $O(k/\epsilon^{2/3})$ sized $O(\epsilon^{1/3})$-approximation, yielding the first provable, polynomial time $\epsilon$-approximation for this class of NMF problems, where also desirably the approximation is independent of $n$ and $m$. Furthermore, we prove an approximate conic Carathéodory theorem, a general sparsity result, that shows that any column of $X$ can be $\epsilon$-approximated with an $O(1/\epsilon^2)$ sparse combination from $S$. Our results are facilitated by a reduction to the problem of approximating convex hulls, and we prove that both the convex and conic hull variants are $d$-sum-hard, resolving an open problem. Finally, we provide experimental results for the convex and conic algorithms on a variety of feature selection tasks. |
Tasks | Feature Selection |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6847-sparse-approximate-conic-hulls |
http://papers.nips.cc/paper/6847-sparse-approximate-conic-hulls.pdf | |
PWC | https://paperswithcode.com/paper/sparse-approximate-conic-hulls |
Repo | |
Framework | |
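The column-subset view of the problem admits a simple greedy baseline: repeatedly add the column farthest from the cone of the columns chosen so far, measuring distance by nonnegative least squares. This is an illustrative stand-in, not the paper's $O(k/\epsilon^{2/3})$ algorithm:

```python
import numpy as np
from scipy.optimize import nnls

def cone_residual(S, v):
    # Euclidean distance from v to the conic hull of the columns of S,
    # i.e. the residual norm of a nonnegative least-squares fit.
    _, rnorm = nnls(S, v)
    return rnorm

def greedy_conic_columns(X, eps):
    # Add the worst-approximated column until every column of X is
    # within eps of the cone of the chosen columns.
    X = np.asarray(X, dtype=float)
    chosen = []
    while True:
        if chosen:
            S = X[:, chosen]
            resid = np.array([cone_residual(S, X[:, j])
                              for j in range(X.shape[1])])
        else:
            resid = np.linalg.norm(X, axis=0)   # empty cone = the origin
        worst = int(np.argmax(resid))
        if resid[worst] <= eps:
            return chosen
        chosen.append(worst)

# Columns 0 and 1 generate the cone; column 2 lies strictly inside it.
selected = greedy_conic_columns(np.array([[3.0, 0.0, 1.0],
                                          [0.0, 3.0, 1.0]]), eps=1e-6)
```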
Collaborative Partitioning for Coreference Resolution
Title | Collaborative Partitioning for Coreference Resolution |
Authors | Olga Uryupina, Alessandro Moschitti |
Abstract | This paper presents a collaborative partitioning algorithm, a novel ensemble-based approach to coreference resolution. Starting from the all-singleton partition, we search for a solution close to the ensemble's outputs in terms of a task-specific similarity measure. Our approach assumes a loose integration of individual components of the ensemble and can therefore combine arbitrary coreference resolvers, regardless of their models. Our experiments on the CoNLL dataset show that collaborative partitioning yields results superior to those attained by the individual components, for ensembles of both strong and weak systems. Moreover, by applying the collaborative partitioning algorithm on top of three state-of-the-art resolvers, we obtain the best coreference performance reported so far in the literature (MELA v08 score of 64.47). |
Tasks | Coreference Resolution |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1007/ |
https://www.aclweb.org/anthology/K17-1007 | |
PWC | https://paperswithcode.com/paper/collaborative-partitioning-for-coreference |
Repo | |
Framework | |
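The all-singleton starting point and similarity-driven search described in the abstract can be caricatured as a greedy majority-vote merge over the ensemble's pairwise links. The threshold and merge order below are assumptions; the paper's task-specific similarity measure is not reproduced:

```python
from itertools import combinations

def collaborative_partition(mentions, partitions, threshold=0.5):
    # vote[{a, b}] = fraction of ensemble members placing a and b in the
    # same entity.
    def same(p, a, b):
        return any(a in c and b in c for c in p)
    vote = {frozenset(pair): sum(same(p, *pair) for p in partitions)
            / len(partitions) for pair in combinations(mentions, 2)}
    # Start from the all-singleton partition; greedily merge cluster
    # pairs whose average link a majority of the ensemble supports.
    clusters = [{m} for m in mentions]
    merged = True
    while merged:
        merged = False
        for c1, c2 in combinations(clusters, 2):
            links = [vote[frozenset((a, b))] for a in c1 for b in c2]
            if sum(links) / len(links) > threshold:
                clusters.remove(c1)
                clusters.remove(c2)
                clusters.append(c1 | c2)
                merged = True
                break
    return clusters

# Two toy resolvers: both link a-b, only one also links c to them.
ensemble = [[{"a", "b"}, {"c"}, {"d"}], [{"a", "b", "c"}, {"d"}]]
result = collaborative_partition(["a", "b", "c", "d"], ensemble)
```

Only the unanimously supported a-b link survives the majority threshold.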
Recognizing Counterfactual Thinking in Social Media Texts
Title | Recognizing Counterfactual Thinking in Social Media Texts |
Authors | Youngseo Son, Anneke Buffone, Joe Raso, Allegra Larche, Anthony Janocko, Kevin Zembroski, H. Andrew Schwartz, Lyle Ungar |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/papers/P17-2103/p17-2103 |
https://www.aclweb.org/anthology/P17-2103v2 | |
PWC | https://paperswithcode.com/paper/recognizing-counterfactual-thinking-in-social |
Repo | |
Framework | |
Universal Dependencies for Serbian in Comparison with Croatian and Other Slavic Languages
Title | Universal Dependencies for Serbian in Comparison with Croatian and Other Slavic Languages |
Authors | Tanja Samardžić, Mirjana Starović, Željko Agić, Nikola Ljubešić |
Abstract | The paper documents the procedure of building a new Universal Dependencies (UDv2) treebank for Serbian starting from an existing Croatian UDv1 treebank and taking into account the other Slavic UD annotation guidelines. We describe the automatic and manual annotation procedures, discuss the annotation of Slavic-specific categories (case governing quantifiers, reflexive pronouns, question particles) and propose an approach to handling deverbal nouns in Slavic languages. |
Tasks | |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1407/ |
https://www.aclweb.org/anthology/W17-1407 | |
PWC | https://paperswithcode.com/paper/universal-dependencies-for-serbian-in |
Repo | |
Framework | |
Variable Mini-Batch Sizing and Pre-Trained Embeddings
Title | Variable Mini-Batch Sizing and Pre-Trained Embeddings |
Authors | Mostafa Abdou, Vladan Glončák, Ondřej Bojar |
Abstract | |
Tasks | Machine Translation, Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4780/ |
https://www.aclweb.org/anthology/W17-4780 | |
PWC | https://paperswithcode.com/paper/variable-mini-batch-sizing-and-pre-trained |
Repo | |
Framework | |
Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten”
Title | Deletion-Robust Submodular Maximization: Data Summarization with “the Right to be Forgotten” |
Authors | Baharan Mirzasoleiman, Amin Karbasi, Andreas Krause |
Abstract | How can we summarize a dynamic data stream when elements selected for the summary can be deleted at any time? This is an important challenge in online services, where the users generating the data may decide to exercise their right to restrict the service provider from using (part of) their data due to privacy concerns. Motivated by this challenge, we introduce the dynamic deletion-robust submodular maximization problem. We develop the first resilient streaming algorithm, called ROBUST-STREAMING, with a constant factor approximation guarantee to the optimum solution. We evaluate the effectiveness of our approach on several real-world applications, including summarizing (1) streams of geo-coordinates; (2) streams of images; and (3) click-stream log data, consisting of 45 million feature vectors from a news recommendation task. |
Tasks | Data Summarization |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=698 |
http://proceedings.mlr.press/v70/mirzasoleiman17a/mirzasoleiman17a.pdf | |
PWC | https://paperswithcode.com/paper/deletion-robust-submodular-maximization-data |
Repo | |
Framework | |
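A naive way to make a streaming summary robust to deletions, useful as a contrast with ROBUST-STREAMING, is to buffer more elements than the summary needs and recompute a greedy solution after each deletion. The buffer size and the coverage objective below are illustrative assumptions, not the paper's construction:

```python
def greedy_max(candidates, f, k):
    # Standard greedy for monotone submodular maximization with a
    # cardinality constraint k.
    sol = []
    for _ in range(k):
        best = max((c for c in candidates if c not in sol),
                   key=lambda c: f(sol + [c]) - f(sol), default=None)
        if best is None:
            break
        sol.append(best)
    return sol

class DeletionRobustSummary:
    # Keep a redundant buffer of streamed elements; on deletion, the
    # summary is simply recomputed from the surviving buffer.
    def __init__(self, f, k, buffer_factor=3):
        self.f, self.k = f, k
        self.cap = buffer_factor * k
        self.buffer = []

    def stream(self, e):
        self.buffer.append(e)
        if len(self.buffer) > self.cap:
            # Trim to the elements greedy currently finds most useful.
            self.buffer = greedy_max(self.buffer, self.f, self.cap)

    def delete(self, e):
        self.buffer = [b for b in self.buffer if b != e]

    def summary(self):
        return greedy_max(self.buffer, self.f, self.k)

# Coverage objective over sets of items.
coverage = lambda sol: len(set().union(*sol)) if sol else 0
ds = DeletionRobustSummary(coverage, k=2)
for item in [frozenset({1, 2, 3}), frozenset({3, 4}),
             frozenset({4, 5}), frozenset({6})]:
    ds.stream(item)
before = ds.summary()
ds.delete(frozenset({1, 2, 3}))
after = ds.summary()
```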
Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages
Title | Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages |
Authors | Shashikant Sharma, Anil Kumar Singh |
Abstract | |
Tasks | Machine Translation, Speech Recognition |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4007/ |
https://www.aclweb.org/anthology/W17-4007 | |
PWC | https://paperswithcode.com/paper/word-transduction-for-addressing-the-oov |
Repo | |
Framework | |
Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification
Title | Tell Me Why: Using Question Answering as Distant Supervision for Answer Justification |
Authors | Rebecca Sharp, Mihai Surdeanu, Peter Jansen, Marco A. Valenzuela-Escárcega, Peter Clark, Michael Hammond |
Abstract | For many applications of question answering (QA), being able to explain why a given model chose an answer is critical. However, the lack of labeled data for answer justifications makes learning this difficult and expensive. Here we propose an approach that uses answer ranking as distant supervision for learning how to select informative justifications, where justifications serve as inferential connections between the question and the correct answer while often containing little lexical overlap with either. We propose a neural network architecture for QA that reranks answer justifications as an intermediate (and human-interpretable) step in answer selection. Our approach is informed by a set of features designed to combine both learned representations and explicit features to capture the connection between questions, answers, and answer justifications. We show that with this end-to-end approach we are able to significantly improve upon a strong IR baseline in both justification ranking (+9% rated highly relevant) and answer selection (+6% P@1). |
Tasks | Answer Selection, Interpretable Machine Learning, Question Answering |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/K17-1009/ |
https://www.aclweb.org/anthology/K17-1009 | |
PWC | https://paperswithcode.com/paper/tell-me-why-using-question-answering-as |
Repo | |
Framework | |
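The kind of IR baseline the paper reports improving on can be approximated by ranking candidate justifications by unigram overlap with the question and answer. The tokenization and normalization here are illustrative assumptions, not the paper's feature set:

```python
def overlap_score(question, answer, justification):
    # Unigram overlap between a candidate justification and the
    # question + answer, normalized by justification length.
    query = set(question.lower().split()) | set(answer.lower().split())
    tokens = set(justification.lower().split())
    return len(query & tokens) / max(len(tokens), 1)

def rank_justifications(question, answer, candidates):
    # Best-first ordering of candidate justifications.
    return sorted(candidates,
                  key=lambda j: overlap_score(question, answer, j),
                  reverse=True)

ranked = rank_justifications(
    "why is the sky blue",
    "rayleigh scattering",
    ["the ocean reflects the sky",
     "rayleigh scattering makes the sky appear blue"],
)
```

As the abstract notes, good justifications often share little vocabulary with the question, which is exactly where an overlap baseline like this breaks down.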
BabelDomains: Large-Scale Domain Labeling of Lexical Resources
Title | BabelDomains: Large-Scale Domain Labeling of Lexical Resources |
Authors | Jose Camacho-Collados, Roberto Navigli |
Abstract | In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge. We propose an automatic method that uses knowledge from various lexical resources, exploiting both distributional and graph-based clues, to accurately propagate domain information. We evaluate our methodology intrinsically on two lexical resources (WordNet and BabelNet), achieving a precision over 80% in both cases. Finally, we show the potential of BabelDomains in a supervised learning setting, clustering training data by domain for hypernym discovery. |
Tasks | Domain Adaptation, Hypernym Discovery, Sentiment Analysis, Text Categorization, Word Sense Disambiguation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2036/ |
https://www.aclweb.org/anthology/E17-2036 | |
PWC | https://paperswithcode.com/paper/babeldomains-large-scale-domain-labeling-of |
Repo | |
Framework | |
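The distributional half of the propagation idea can be sketched as nearest-seed labeling in an embedding space. The vectors and seed domains below are toy assumptions; the paper combines this kind of distributional clue with graph-based evidence:

```python
import numpy as np

def propagate_domains(vectors, seeds):
    # Give every unlabeled item the domain of its most cosine-similar
    # seed item.
    def cos(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
    labels = dict(seeds)
    for item, vec in vectors.items():
        if item not in seeds:
            nearest = max(seeds, key=lambda s: cos(vec, vectors[s]))
            labels[item] = seeds[nearest]
    return labels

# Toy 2-D "embeddings" and two seed labels.
vecs = {name: np.array(v) for name, v in {
    "guitar": [1.0, 0.1], "violin": [0.9, 0.2],
    "tennis": [0.1, 1.0], "racket": [0.2, 0.9]}.items()}
domains = propagate_domains(vecs, {"guitar": "Music", "tennis": "Sport"})
```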
Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers
Title | Effects of Lexical Properties on Viewing Time per Word in Autistic and Neurotypical Readers |
Authors | Sanja Štajner, Victoria Yaneva, Ruslan Mitkov, Simone Paolo Ponzetto |
Abstract | Eye tracking studies from the past few decades have shaped the way we think of word complexity and cognitive load: words that are long, rare and ambiguous are more difficult to read. However, online processing techniques have been scarcely applied to investigating the reading difficulties of people with autism and what vocabulary is challenging for them. We present parallel gaze data obtained from adult readers with autism and a control group of neurotypical readers and show that the former required higher cognitive effort to comprehend the texts as evidenced by three gaze-based measures. We divide all words into four classes based on their viewing times for both groups and investigate the relationship between longer viewing times and word length, word frequency, and four cognitively-based measures (word concreteness, familiarity, age of acquisition and imagability). |
Tasks | Eye Tracking, Lexical Simplification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-5030/ |
https://www.aclweb.org/anthology/W17-5030 | |
PWC | https://paperswithcode.com/paper/effects-of-lexical-properties-on-viewing-time |
Repo | |
Framework | |