Paper Group NANR 155
Referring Expression Generation under Uncertainty: Algorithm and Evaluation Framework
Title | Referring Expression Generation under Uncertainty: Algorithm and Evaluation Framework |
Authors | Tom Williams, Matthias Scheutz |
Abstract | For situated agents to effectively engage in natural-language interactions with humans, they must be able to refer to entities such as people, locations, and objects. While classic referring expression generation (REG) algorithms like the Incremental Algorithm (IA) assume perfect, complete, and accessible knowledge of all referents, this is not always possible. In this work, we show how a previously presented consultant framework (which facilitates reference resolution when knowledge is uncertain, heterogeneous and distributed) can be used to extend the IA to produce DIST-PIA, a domain-independent algorithm for REG under uncertain, heterogeneous, and distributed knowledge. We also present a novel framework that can be used to evaluate such REG algorithms without conflating the performance of the algorithm with the performance of classifiers it employs. |
Tasks | Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-3511/ |
https://www.aclweb.org/anthology/W17-3511 | |
PWC | https://paperswithcode.com/paper/referring-expression-generation-under |
Repo | |
Framework | |
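For orientation, the classic Incremental Algorithm that the abstract builds on can be sketched as follows. This is a minimal illustration of the full-knowledge IA, not of DIST-PIA; the function name, the attribute preference order, and the toy scene are assumptions made for this example.

```python
def incremental_algorithm(target, distractors, preferred_attributes):
    """Classic IA sketch: walk attributes in a fixed preference order and
    keep each one that rules out at least one remaining distractor."""
    description = {}
    remaining = list(distractors)
    for attr in preferred_attributes:
        if not remaining:
            break  # the target is already uniquely identified
        value = target.get(attr)
        if value is None:
            continue
        # Distractors sharing this value are not ruled out by it.
        still_confusable = [d for d in remaining if d.get(attr) == value]
        if len(still_confusable) < len(remaining):
            description[attr] = value
            remaining = still_confusable
    # Second element reports whether the description is distinguishing.
    return description, not remaining


# Hypothetical toy scene: one target and two distractors.
mug = {"type": "mug", "color": "red", "size": "small"}
others = [
    {"type": "mug", "color": "blue", "size": "small"},
    {"type": "plate", "color": "red", "size": "large"},
]
description, is_unique = incremental_algorithm(
    mug, others, ["type", "color", "size"]
)
```

Here the algorithm stops after `type` and `color`, because those two attributes already exclude both distractors and `size` is never consulted.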
“Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions
Title | “Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions |
Authors | Yair Carmon, John C. Duchi, Oliver Hinder, Aaron Sidford |
Abstract | We develop and analyze a variant of Nesterov’s accelerated gradient descent (AGD) for minimization of smooth non-convex functions. We prove that one of two cases occurs: either our AGD variant converges quickly, as if the function was convex, or we produce a certificate that the function is “guilty” of being non-convex. This non-convexity certificate allows us to exploit negative curvature and obtain deterministic, dimension-free acceleration of convergence for non-convex functions. For a function $f$ with Lipschitz continuous gradient and Hessian, we compute a point $x$ with $\|\nabla f(x)\| \le \epsilon$ in $O(\epsilon^{-7/4} \log(1/\epsilon))$ gradient and function evaluations. Assuming additionally that the third derivative is Lipschitz, we require only $O(\epsilon^{-5/3} \log(1/\epsilon))$ evaluations. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=831 |
http://proceedings.mlr.press/v70/carmon17a/carmon17a.pdf | |
PWC | https://paperswithcode.com/paper/convex-until-proven-guilty-dimension-free |
Repo | |
Framework | |
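The idea of a non-convexity certificate can be illustrated with the first-order convexity inequality: a differentiable convex function satisfies f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ for all x and y, so any pair that violates it proves the function is not convex. The sketch below only checks that inequality; it is not the authors' AGD variant, and the function names and test functions are assumptions chosen for illustration.

```python
def violates_convexity(f, grad_f, x, y, tol=1e-12):
    """Return True if (x, y) witnesses non-convexity of f, i.e.
    f(y) < f(x) + <grad_f(x), y - x>, which no convex function allows."""
    inner = sum(g * (yi - xi) for g, xi, yi in zip(grad_f(x), x, y))
    return f(y) < f(x) + inner - tol


# Convex example: f(x) = x0^2 + x1^2.
def bowl(x):
    return x[0] ** 2 + x[1] ** 2


def bowl_grad(x):
    return [2 * x[0], 2 * x[1]]


# Non-convex example: f(x) = x0^2 - x1^2 (a saddle).
def saddle(x):
    return x[0] ** 2 - x[1] ** 2


def saddle_grad(x):
    return [2 * x[0], -2 * x[1]]


bowl_flag = violates_convexity(bowl, bowl_grad, [1.0, 1.0], [-2.0, 0.5])
saddle_flag = violates_convexity(saddle, saddle_grad, [0.0, 0.0], [0.0, 1.0])
```

For the saddle, moving along the second coordinate lowers the function below its tangent plane at the origin, so the pair is flagged; no pair can be flagged for the bowl.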
Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective
Title | Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective |
Authors | Qing Zhang, Houfeng Wang |
Abstract | For the task of relation extraction, distant supervision is an efficient approach to generate labeled data by aligning a knowledge base with free text. In essence, it is a challenging incomplete multi-label classification problem with sparse and noisy features. To address the challenge, this work presents a novel nonparametric Bayesian formulation for the task. Experimental results show substantial improvements in top precision over traditional state-of-the-art approaches. |
Tasks | Matrix Completion, Multi-Label Classification, Relation Extraction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1192/ |
https://www.aclweb.org/anthology/D17-1192 | |
PWC | https://paperswithcode.com/paper/noise-clustered-distant-supervision-for |
Repo | |
Framework | |
ES-LDA: Entity Summarization using Knowledge-based Topic Modeling
Title | ES-LDA: Entity Summarization using Knowledge-based Topic Modeling |
Authors | Seyedamin Pouriyeh, Mehdi Allahyari, Krzysztof Kochut, Gong Cheng, Hamid Reza Arabnia |
Abstract | With the advent of the Internet, the number of Semantic Web documents that describe real-world entities and their inter-links as a set of statements has grown considerably. These descriptions are usually lengthy, which makes the utilization of the underlying entities a difficult task. Entity summarization, which aims to create summaries for real-world entities, has gained increasing attention in recent years. In this paper, we propose a probabilistic topic model, ES-LDA, that combines prior knowledge with statistical learning techniques within a single framework to create more reliable and representative summaries for entities. We demonstrate the effectiveness of our approach by conducting extensive experiments and show that our model outperforms the state-of-the-art techniques and enhances the quality of the entity summaries. |
Tasks | Document Summarization |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1032/ |
https://www.aclweb.org/anthology/I17-1032 | |
PWC | https://paperswithcode.com/paper/es-lda-entity-summarization-using-knowledge |
Repo | |
Framework | |
Distributed and Provably Good Seedings for k-Means in Constant Rounds
Title | Distributed and Provably Good Seedings for k-Means in Constant Rounds |
Authors | Olivier Bachem, Mario Lucic, Andreas Krause |
Abstract | The k-Means++ algorithm is the state-of-the-art algorithm to solve k-Means clustering problems as the computed clusterings are O(log k) competitive in expectation. However, its seeding step requires k inherently sequential passes through the full data set, making it hard to scale to massive data sets. The standard remedy is to use the k-Means\|\| algorithm, which reduces the number of sequential rounds and is thus suitable for a distributed setting. In this paper, we provide a novel analysis of the k-Means\|\| algorithm that bounds the expected solution quality for any number of rounds and oversampling factors greater than k, the two parameters one needs to choose in practice. In particular, we show that k-Means\|\| provides provably good clusterings even for a small, constant number of iterations. This theoretical finding explains the common observation that k-Means\|\| performs extremely well in practice even if the number of rounds is low. We further provide a hard instance that shows that an additive error term as encountered in our analysis is inevitable if fewer than k-1 rounds are employed. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=472 |
http://proceedings.mlr.press/v70/bachem17b/bachem17b.pdf | |
PWC | https://paperswithcode.com/paper/distributed-and-provably-good-seedings-for-k |
Repo | |
Framework | |
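The two parameters discussed in the abstract, the number of rounds and the oversampling factor, show up directly in the oversampling-based seeding scheme usually written k-means|| (Bahmani et al.). The following is a rough pure-Python sketch of the sampling rounds only, under the assumption that the final step (weighting the candidates and reclustering them down to k centers) is handled elsewhere; the function names and the toy data are illustrative.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def oversampling_seeding(points, oversampling, rounds, seed=0):
    """Sampling rounds of a k-means||-style seeding (reclustering omitted)."""
    rng = random.Random(seed)
    candidates = [rng.choice(points)]  # first center: uniform at random
    for _ in range(rounds):
        # Squared distance from every point to its closest candidate.
        d2 = [min(squared_distance(p, c) for c in candidates) for p in points]
        cost = sum(d2)
        if cost == 0:
            break  # every point already coincides with a candidate
        # Points are sampled independently of each other, so a single
        # round can be split across machines.
        sampled = [
            p for p, d in zip(points, d2)
            if rng.random() < min(1.0, oversampling * d / cost)
        ]
        candidates.extend(sampled)
    return candidates


# Hypothetical toy data: three well-separated groups of points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0), (10.1, 0.0), (10.0, 0.1),
]
candidates = oversampling_seeding(points, oversampling=6, rounds=3)
```

Each round adds roughly `oversampling` points in expectation, so the candidate set is usually larger than k and has to be reduced afterwards, for example by running weighted k-Means++ on the candidates.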
Exact Inference for Integer Latent-Variable Models
Title | Exact Inference for Integer Latent-Variable Models |
Authors | Kevin Winner, Debora Sujono, Dan Sheldon |
Abstract | Graphical models with latent count variables arise in a number of areas. However, standard inference algorithms do not apply to these models due to the infinite support of the latent variables. Winner and Sheldon (2016) recently developed a new technique using probability generating functions (PGFs) to perform efficient, exact inference for certain Poisson latent variable models. However, the method relies on symbolic manipulation of PGFs, and it is unclear whether this can be extended to more general models. In this paper we introduce a new approach for inference with PGFs: instead of manipulating PGFs symbolically, we adapt techniques from the autodiff literature to compute the higher-order derivatives necessary for inference. This substantially generalizes the class of models for which efficient, exact inference algorithms are available. Specifically, our results apply to a class of models that includes branching processes, which are widely used in applied mathematics and population ecology, and autoregressive models for integer data. Experiments show that our techniques are more scalable than existing approximate methods and enable new applications. |
Tasks | Latent Variable Models |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=771 |
http://proceedings.mlr.press/v70/winner17a/winner17a.pdf | |
PWC | https://paperswithcode.com/paper/exact-inference-for-integer-latent-variable |
Repo | |
Framework | |
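The link between PGFs and higher-order derivatives is that a count variable Y with PGF G satisfies P(Y = k) = G^(k)(0) / k!. One standard way to obtain such derivatives without symbolic manipulation is truncated Taylor-series arithmetic. The sketch below applies it to an assumed toy model (a Poisson count thinned by a detection probability); it illustrates the general idea only and is not the algorithm from the paper.

```python
import math


def series_exp(f, order):
    """Taylor coefficients of exp(f(s)) up to s**order, given those of f.
    Uses g' = f' * g, i.e. k * g[k] = sum_{j=1..k} j * f[j] * g[k - j]."""
    f = list(f) + [0.0] * (order + 1 - len(f))
    g = [math.exp(f[0])] + [0.0] * order
    for k in range(1, order + 1):
        g[k] = sum(j * f[j] * g[k - j] for j in range(1, k + 1)) / k
    return g


# Assumed toy model: N ~ Poisson(lam); each individual is observed
# independently with probability rho, giving the observed count Y.
lam, rho, order = 4.0, 0.25, 6

# PGF of Y: G(s) = exp(lam * ((1 - rho + rho * s) - 1)).
# The exponent is a polynomial in s with these coefficients:
exponent = [-lam * rho, lam * rho]

# pmf[k] is the k-th Taylor coefficient of G at 0, i.e. P(Y = k).
pmf = series_exp(exponent, order)
```

In this toy case the result can be checked by hand, since thinning a Poisson(lam) count with probability rho gives a Poisson(lam * rho) count.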
Metaheuristic Approaches to Lexical Substitution and Simplification
Title | Metaheuristic Approaches to Lexical Substitution and Simplification |
Authors | Sallam Abualhaija, Tristan Miller, Judith Eckle-Kohler, Iryna Gurevych, Karl-Heinz Zimmermann |
Abstract | In this paper, we propose using metaheuristics, in particular simulated annealing and the new D-Bees algorithm, to solve word sense disambiguation as an optimization problem within a knowledge-based lexical substitution system. We are the first to perform such an extrinsic evaluation of metaheuristics, for which we use two standard lexical substitution datasets, one English and one German. We find that D-Bees has robust performance for both languages, and performs better than simulated annealing, though both achieve good results. Moreover, the D-Bees-based lexical substitution system outperforms state-of-the-art systems on several evaluation metrics. We also show that D-Bees achieves competitive performance in lexical simplification, a variant of lexical substitution. |
Tasks | Lexical Simplification, Machine Translation, Question Answering, Sentence Compression, Text Generation, Text Summarization, Word Sense Disambiguation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1082/ |
https://www.aclweb.org/anthology/E17-1082 | |
PWC | https://paperswithcode.com/paper/metaheuristic-approaches-to-lexical |
Repo | |
Framework | |
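As a rough illustration of treating word sense disambiguation as an optimization problem, the sketch below runs generic simulated annealing over sense assignments. The sense inventory, the relatedness scores, and all function names are hypothetical, and the scoring function is a stand-in for whatever knowledge-based measure a real system would use; D-Bees is not shown.

```python
import math
import random


def simulated_annealing(initial, neighbor, score, steps=2000,
                        t0=1.0, cooling=0.995, seed=0):
    """Maximize score() with a geometric cooling schedule."""
    rng = random.Random(seed)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = score(candidate) - current_score
        # Accept improvements always, worse moves with prob. exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling
    return best, best_score


# Hypothetical sense inventory and pairwise relatedness scores.
WORDS = ["bank", "deposit", "interest"]
SENSES = {
    "bank": ["bank_river", "bank_money"],
    "deposit": ["deposit_sediment", "deposit_payment"],
    "interest": ["interest_curiosity", "interest_rate"],
}
RELATED = {
    frozenset(("bank_money", "deposit_payment")): 1.0,
    frozenset(("bank_money", "interest_rate")): 1.0,
    frozenset(("deposit_payment", "interest_rate")): 1.0,
    frozenset(("bank_river", "deposit_sediment")): 1.0,
}


def score(assignment):
    # Sum of relatedness over all pairs of chosen senses.
    return sum(RELATED.get(frozenset((a, b)), 0.0)
               for i, a in enumerate(assignment)
               for b in assignment[i + 1:])


def neighbor(assignment, rng):
    # Re-draw the sense of one randomly chosen word.
    i = rng.randrange(len(assignment))
    changed = list(assignment)
    changed[i] = rng.choice(SENSES[WORDS[i]])
    return tuple(changed)


initial = ("bank_river", "deposit_sediment", "interest_curiosity")
best, best_score = simulated_annealing(initial, neighbor, score)
```

Because the best assignment seen so far is tracked separately from the current one, the returned score is never lower than the score of the starting assignment.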
WebChild 2.0 : Fine-Grained Commonsense Knowledge Distillation
Title | WebChild 2.0 : Fine-Grained Commonsense Knowledge Distillation |
Authors | Niket Tandon, Gerard de Melo, Gerhard Weikum |
Abstract | |
Tasks | Semantic Parsing |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-4020/ |
https://www.aclweb.org/anthology/P17-4020 | |
PWC | https://paperswithcode.com/paper/webchild-20-fine-grained-commonsense |
Repo | |
Framework | |
Multilingual Training of Crosslingual Word Embeddings
Title | Multilingual Training of Crosslingual Word Embeddings |
Authors | Long Duong, Hiroshi Kanayama, Tengfei Ma, Steven Bird, Trevor Cohn |
Abstract | Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine the strengths of many languages. We obtained high performance on bilingual lexicon induction, monolingual similarity and crosslingual document classification tasks. |
Tasks | Dependency Parsing, Document Classification, Machine Translation, Multilingual Word Embeddings, Named Entity Recognition, Sentiment Analysis, Transfer Learning, Word Embeddings |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-1084/ |
https://www.aclweb.org/anthology/E17-1084 | |
PWC | https://paperswithcode.com/paper/multilingual-training-of-crosslingual-word |
Repo | |
Framework | |
Extracting Tags from Large Raw Texts Using End-to-End Memory Networks
Title | Extracting Tags from Large Raw Texts Using End-to-End Memory Networks |
Authors | Feras Al Kassar, Frédéric Armetta |
Abstract | |
Tasks | Word Embeddings |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-7305/ |
https://www.aclweb.org/anthology/W17-7305 | |
PWC | https://paperswithcode.com/paper/extracting-tags-from-large-raw-texts-using |
Repo | |
Framework | |
Character-based recurrent neural networks for morphological relational reasoning
Title | Character-based recurrent neural networks for morphological relational reasoning |
Authors | Olof Mogren, Richard Johansson |
Abstract | We present a model for predicting word forms based on *morphological relational reasoning* with analogies. While previous work has explored tasks such as morphological inflection and reinflection, these models rely on an explicit enumeration of morphological features, which may not be available in all cases. To address the task of predicting a word form given a *demo relation* (a pair of word forms) and a *query word*, we devise a character-based recurrent neural network architecture using three separate encoders and a decoder. We also investigate a multiclass learning setup, where the prediction of the relation type label is used as an auxiliary task. Our results show that the exact form can be predicted for English with an accuracy of 94.7%. For Swedish, which has a more complex morphology with more inflectional patterns for nouns and verbs, the accuracy is 89.3%. We also show that using the auxiliary task of learning the relation type speeds up convergence and improves the prediction accuracy for the word generation task. |
Tasks | Language Modelling, Morphological Analysis, Morphological Inflection, Relational Reasoning, Text Categorization, Tokenization |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4108/ |
https://www.aclweb.org/anthology/W17-4108 | |
PWC | https://paperswithcode.com/paper/character-based-recurrent-neural-networks-for |
Repo | |
Framework | |
Spectral Learning from a Single Trajectory under Finite-State Policies
Title | Spectral Learning from a Single Trajectory under Finite-State Policies |
Authors | Borja Balle, Odalric-Ambrym Maillard |
Abstract | We present spectral methods of moments for learning sequential models from a single trajectory, in stark contrast with the classical literature that assumes the availability of multiple i.i.d. trajectories. Our approach leverages an efficient SVD-based learning algorithm for weighted automata and provides the first rigorous analysis for learning many important models using dependent data. We state and analyze the algorithm under three increasingly difficult scenarios: probabilistic automata, stochastic weighted automata, and reactive predictive state representations controlled by a finite-state policy. Our proofs include novel tools for studying mixing properties of stochastic weighted automata. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=820 |
http://proceedings.mlr.press/v70/balle17a/balle17a.pdf | |
PWC | https://paperswithcode.com/paper/spectral-learning-from-a-single-trajectory |
Repo | |
Framework | |
Redundancy Localization for the Conversationalization of Unstructured Responses
Title | Redundancy Localization for the Conversationalization of Unstructured Responses |
Authors | Sebastian Krause, Mikhail Kozhevnikov, Eric Malmi, Daniele Pighin |
Abstract | Conversational agents offer users a natural-language interface to accomplish tasks, entertain themselves, or access information. Informational dialogue is particularly challenging in that the agent has to hold a conversation on an open topic, and to achieve a reasonable coverage it generally needs to digest and present unstructured information from textual sources. Making responses based on such sources sound natural and fit appropriately into the conversation context is a topic of ongoing research, one of the key issues of which is preventing the agent's responses from sounding repetitive. Targeting this issue, we propose a new task, known as redundancy localization, which aims to pinpoint semantic overlap between text passages. To help address it systematically, we formalize the task, prepare a public dataset with fine-grained redundancy labels, and propose a model utilizing a weak training signal defined over the results of a passage-retrieval system on web texts. The proposed model demonstrates superior performance compared to a state-of-the-art entailment model and yields encouraging results when applied to a real-world dialogue. |
Tasks | Question Answering |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-5515/ |
https://www.aclweb.org/anthology/W17-5515 | |
PWC | https://paperswithcode.com/paper/redundancy-localization-for-the |
Repo | |
Framework | |
Recognizing Reputation Defence Strategies in Critical Political Exchanges
Title | Recognizing Reputation Defence Strategies in Critical Political Exchanges |
Authors | Nona Naderi, Graeme Hirst |
Abstract | We propose a new task of automatically detecting reputation defence strategies in the field of computational argumentation. We cast the problem as relation classification, where given a pair of reputation threat and reputation defence, we determine the reputation defence strategy. We annotate a dataset of parliamentary questions and answers with reputation defence strategies. We then propose a model based on supervised learning to address the detection of these strategies, and report promising experimental results. |
Tasks | Relation Classification |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1069/ |
https://doi.org/10.26615/978-954-452-049-6_069 | |
PWC | https://paperswithcode.com/paper/recognizing-reputation-defence-strategies-in |
Repo | |
Framework | |
HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity
Title | HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity |
Authors | Junqing He, Long Wu, Xuemin Zhao, Yonghong Yan |
Abstract | In this paper, we introduce an approach to combining word embeddings and machine translation for multilingual semantic word similarity, Task 2 of SemEval-2017. Thanks to the unsupervised transliteration model, our cross-lingual word embeddings encounter fewer out-of-vocabulary words (OOVs). Our results are produced using only monolingual Wikipedia corpora and a limited amount of sentence-aligned data. Although relatively few resources are used, our system ranked 3rd in the monolingual subtask and would rank 6th in the cross-lingual subtask. |
Tasks | Machine Translation, Multilingual Word Embeddings, Semantic Similarity, Semantic Textual Similarity, Stock Price Prediction, Text Classification, Transliteration, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2033/ |
https://www.aclweb.org/anthology/S17-2033 | |
PWC | https://paperswithcode.com/paper/hccl-at-semeval-2017-task-2-combining |
Repo | |
Framework | |