October 15, 2019


Paper Group NANR 149

Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018


Title	Overview of the Third Social Media Mining for Health (SMM4H) Shared Tasks at EMNLP 2018
Authors	Davy Weissenbacher, Abeed Sarker, Michael J. Paul, Graciela Gonzalez-Hernandez
Abstract	The goals of the SMM4H shared tasks are to release annotated social media based health related datasets to the research community, and to compare the performances of natural language processing and machine learning systems on tasks involving these datasets. The third execution of the SMM4H shared tasks, co-hosted with EMNLP-2018, comprised four subtasks. These subtasks involve annotated user posts from Twitter (tweets) and focus on the (i) automatic classification of tweets mentioning a drug name, (ii) automatic classification of tweets containing reports of first-person medication intake, (iii) automatic classification of tweets presenting self-reports of adverse drug reaction (ADR) detection, and (iv) automatic classification of vaccine behavior mentions in tweets. A total of 14 teams participated and 78 system runs were submitted (23 for task 1, 20 for task 2, 18 for task 3, 17 for task 4).
Tasks	Text Classification
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5904/
PDF	https://www.aclweb.org/anthology/W18-5904
PWC	https://paperswithcode.com/paper/overview-of-the-third-social-media-mining-for
Repo
Framework

CONDUCT: An Expressive Conducting Gesture Dataset for Sound Control


Title	CONDUCT: An Expressive Conducting Gesture Dataset for Sound Control
Authors	Lei Chen, Sylvie Gibet, Camille Marteau
Abstract
Tasks	Gesture Recognition, Motion Capture
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1272/
PDF	https://www.aclweb.org/anthology/L18-1272
PWC	https://paperswithcode.com/paper/conduct-an-expressive-conducting-gesture
Repo
Framework

Synthesize Policies for Transfer and Adaptation across Tasks and Environments


Title	Synthesize Policies for Transfer and Adaptation across Tasks and Environments
Authors	Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha
Abstract	The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence. In this paper, we consider the problem of learning to simultaneously transfer across both environments and tasks, probably more importantly, by learning from only sparse (environment, task) pairs out of all the possible combinations. We propose a novel compositional neural network architecture which depicts a meta rule for composing policies from environment and task embeddings. Notably, one of the main challenges is to learn the embeddings jointly with the meta rule. We further propose new training methods to disentangle the embeddings, making them both distinctive signatures of the environments and tasks and effective building blocks for composing the policies. Experiments on GridWorld and THOR, of which the agent takes as input an egocentric view, show that our approach gives rise to high success rates on all the (environment, task) pairs after learning from only 40% of them.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7393-synthesize-policies-for-transfer-and-adaptation-across-tasks-and-environments
PDF	http://papers.nips.cc/paper/7393-synthesize-policies-for-transfer-and-adaptation-across-tasks-and-environments.pdf
PWC	https://paperswithcode.com/paper/synthesize-policies-for-transfer-and
Repo
Framework

ALB at SemEval-2018 Task 10: A System for Capturing Discriminative Attributes


Title	ALB at SemEval-2018 Task 10: A System for Capturing Discriminative Attributes
Authors	Bogdan Dumitru, Alina Maria Ciobanu, Liviu P. Dinu
Abstract	Semantic difference detection attempts to capture whether a word is a discriminative attribute between two other words. For example, the discriminative feature red characterizes the first word from the (apple, banana) pair, but not the second. Modeling semantic difference is essential for language understanding systems, as it provides useful information for identifying particular aspects of word senses. This paper describes our system implementation (the ALB system of the NLP@Unibuc team) for the 10th task of the SemEval 2018 workshop, "Capturing Discriminative Attributes". We propose a method for semantic difference detection that uses an SVM classifier with features based on co-occurrence counts and shallow semantic parsing, achieving 0.63 F1 score in the competition.
Tasks	Dependency Parsing, Feature Selection, Machine Translation, Semantic Parsing, Semantic Textual Similarity
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1158/
PDF	https://www.aclweb.org/anthology/S18-1158
PWC	https://paperswithcode.com/paper/alb-at-semeval-2018-task-10-a-system-for
Repo
Framework

Distributed k-Clustering for Data with Heavy Noise


Title	Distributed k-Clustering for Data with Heavy Noise
Authors	Shi Li, Xiangyu Guo
Abstract	In this paper, we consider the $k$-center/median/means clustering with outliers problems (or the $(k, z)$-center/median/means problems) in the distributed setting. Most previous distributed algorithms have their communication costs linearly depending on $z$, the number of outliers. Recently Guha et al. [10] overcame this dependence issue by considering bi-criteria approximation algorithms that output solutions with $2z$ outliers. For the case where $z$ is large, the extra $z$ outliers discarded by the algorithms might be too large, considering that the data gathering process might be costly. In this paper, we improve the number of outliers to the best possible $(1+\epsilon)z$, while maintaining the $O(1)$-approximation ratio and independence of communication cost on $z$. The problems we consider include the $(k, z)$-center problem, and $(k, z)$-median/means problems in Euclidean metrics. Implementation of our algorithm for $(k, z)$-center shows that it outperforms many previous algorithms, both in terms of the communication cost and quality of the output solution.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/8009-distributed-k-clustering-for-data-with-heavy-noise
PDF	http://papers.nips.cc/paper/8009-distributed-k-clustering-for-data-with-heavy-noise.pdf
PWC	https://paperswithcode.com/paper/distributed-k-clustering-for-data-with-heavy-1
Repo
Framework
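
To make the $(k, z)$-center objective in the abstract above concrete, here is a minimal Python sketch of how a candidate solution could be scored: the cost is the largest distance from a point to its nearest center once the $z$ farthest points are discarded as outliers. The function name `kz_center_cost` and the toy data are assumptions made for illustration. This only evaluates the objective; it is not the paper's distributed algorithm.

```python
import math

def kz_center_cost(points, centers, z):
    """Score a (k, z)-center solution: the largest point-to-nearest-center
    distance after the z farthest points are discarded as outliers."""
    # Distance from every point to its closest center, in ascending order.
    nearest = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Drop the z largest distances (the outliers); never keep a negative count.
    kept = nearest[:max(len(nearest) - z, 0)]
    return kept[-1] if kept else 0.0

# Toy data (hypothetical): two tight groups plus one far-away noisy point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (100.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]

cost_no_outliers = kz_center_cost(points, centers, z=0)
cost_one_outlier = kz_center_cost(points, centers, z=1)
print(cost_no_outliers, cost_one_outlier)
```

On this toy input, allowing a single outlier removes the noisy point from the objective, which is why the number of discarded outliers matters when comparing solutions.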

From Stochastic Planning to Marginal MAP


Title	From Stochastic Planning to Marginal MAP
Authors	Hao Cui, Radu Marinescu, Roni Khardon
Abstract	It is well known that the problems of stochastic planning and probabilistic inference are closely related. This paper makes two contributions in this context. The first is to provide an analysis of the recently developed SOGBOFA heuristic planning algorithm that was shown to be effective for problems with large factored state and action spaces. It is shown that SOGBOFA can be seen as a specialized inference algorithm that computes its solutions through a combination of a symbolic variant of belief propagation and gradient ascent. The second contribution is a new solver for Marginal MAP (MMAP) inference. We introduce a new reduction from MMAP to maximum expected utility problems which are suitable for the symbolic computation in SOGBOFA. This yields a novel algebraic gradient-based solver (AGS) for MMAP. An experimental evaluation illustrates the potential of AGS in solving difficult MMAP problems.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7571-from-stochastic-planning-to-marginal-map
PDF	http://papers.nips.cc/paper/7571-from-stochastic-planning-to-marginal-map.pdf
PWC	https://paperswithcode.com/paper/from-stochastic-planning-to-marginal-map
Repo
Framework

Extended HowNet 2.0 – An Entity-Relation Common-Sense Representation Model


Title	Extended HowNet 2.0 – An Entity-Relation Common-Sense Representation Model
Authors	Wei-Yun Ma, Yueh-Yin Shih
Abstract
Tasks	Common Sense Reasoning, Information Retrieval, Machine Translation, Semantic Composition
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1724/
PDF	https://www.aclweb.org/anthology/L18-1724
PWC	https://paperswithcode.com/paper/extended-hownet-20-a-an-entity-relation
Repo
Framework


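Changes in Psycholinguistic Attributes of Social Media Users Before, During, and After Self-Reported Influenza Symptoms


Title	Changes in Psycholinguistic Attributes of Social Media Users Before, During, and After Self-Reported Influenza Symptoms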
Authors	Lucie Flekova, Vasileios Lampos, Ingemar Cox
Abstract	Previous research has linked psychological and social variables to physical health. At the same time, psychological and social variables have been successfully predicted from the language used by individuals in social media. In this paper, we conduct an initial exploratory study linking these two areas. Using the social media platform of Twitter, we identify users self-reporting symptoms that are descriptive of influenza-like illness (ILI). We analyze the tweets of those users in the periods before, during, and after the reported symptoms, exploring emotional, cognitive, and structural components of language. We observe a post-ILI increase in social activity and cognitive processes, possibly supporting previous offline findings linking more active social activities and stronger cognitive coping skills to a better immune status.
Tasks
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5905/
PDF	https://www.aclweb.org/anthology/W18-5905
PWC	https://paperswithcode.com/paper/changes-in-psycholinguistic-attributes-of
Repo
Framework

Drug-Use Identification from Tweets with Word and Character N-Grams


Title	Drug-Use Identification from Tweets with Word and Character N-Grams
Authors	Çağrı Çöltekin, Taraka Rama
Abstract	This paper describes our systems in the Social Media Mining for Health Applications (SMM4H) shared task. We participated in all four tracks of the shared task using linear models with a combination of character and word n-gram features. We did not use any external data or domain specific information. The resulting systems achieved above-average scores among other participating systems, with F1-scores of 91.22, 46.8, 42.4, and 85.53 on tasks 1, 2, 3, and 4 respectively.
Tasks	Text Classification, Tokenization
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-5914/
PDF	https://www.aclweb.org/anthology/W18-5914
PWC	https://paperswithcode.com/paper/drug-use-identification-from-tweets-with-word
Repo
Framework
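
As a rough illustration of the "character and word n-gram features" mentioned in the abstract above, the following Python sketch extracts both kinds of n-grams from a short text and counts them. The function names, the `c:` / `w:` prefixes, the n-gram orders, and the example text are all assumptions for illustration. The authors' actual tokenization, feature weighting, and classifier are not described in this listing and are not reproduced here.

```python
from collections import Counter

def char_ngrams(text, n):
    """All contiguous character sequences of length n (spaces included)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n):
    """All contiguous sequences of n whitespace-separated, lowercased tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(text, max_char_n=3, max_word_n=2):
    """Count character n-grams (prefix 'c:') and word n-grams (prefix 'w:')."""
    feats = Counter()
    for n in range(1, max_char_n + 1):
        feats.update("c:" + g for g in char_ngrams(text.lower(), n))
    for n in range(1, max_word_n + 1):
        feats.update("w:" + g for g in word_ngrams(text, n))
    return feats

# Hypothetical example text.
features = ngram_features("took my meds")
print(features["c:m"], features["w:took my"])
```

Sparse count vectors of this kind are a common input to a linear classifier. Character n-grams can help with the misspellings and creative spellings typical of tweets.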

Sublinear Time Low-Rank Approximation of Distance Matrices


Title	Sublinear Time Low-Rank Approximation of Distance Matrices
Authors	Ainesh Bakshi, David Woodruff
Abstract	Let $P = \{ p_1, p_2, \ldots, p_n \}$ and $Q = \{ q_1, q_2, \ldots, q_m \}$ be two point sets in an arbitrary metric space. Let $A$ represent the $m \times n$ pairwise distance matrix with $A_{i,j} = d(p_i, q_j)$. Such distance matrices are commonly computed in software packages and have applications to learning image manifolds, handwriting recognition, and multi-dimensional unfolding, among other things. In an attempt to reduce their description size, we study low rank approximation of such matrices. Our main result is to show that for any underlying distance metric $d$, it is possible to achieve an additive error low rank approximation in sublinear time. We note that it is provably impossible to achieve such a guarantee in sublinear time for arbitrary matrices $A$, and our proof exploits special properties of distance matrices. We develop a recursive algorithm based on additive projection-cost preserving sampling. We then show that in general, relative error approximation in sublinear time is impossible for distance matrices, even if one allows for bicriteria solutions. Additionally, we show that if $P = Q$ and $d$ is the squared Euclidean distance, which is not a metric but rather the square of a metric, then a relative error bicriteria solution can be found in sublinear time. Finally, we empirically compare our algorithm with the SVD and input sparsity time algorithms. Our algorithm is several hundred times faster than the SVD, and about $8$-$20$ times faster than input sparsity methods on real-world and synthetic datasets of size $10^8$. Accuracy-wise, our algorithm is only slightly worse than that of the SVD (optimal) and input-sparsity time algorithms.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/7635-sublinear-time-low-rank-approximation-of-distance-matrices
PDF	http://papers.nips.cc/paper/7635-sublinear-time-low-rank-approximation-of-distance-matrices.pdf
PWC	https://paperswithcode.com/paper/sublinear-time-low-rank-approximation-of
Repo
Framework
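
The abstract above notes that the squared Euclidean distance "is not a metric but rather the square of a metric". A small Python sketch can show why: on three collinear points it violates the triangle inequality. The helper names and the toy points are assumptions for illustration. The matrix is built in full here, with rows and columns indexed as in $A_{i,j} = d(p_i, q_j)$, whereas a sublinear-time algorithm cannot afford to read every entry.

```python
def sq_euclidean(p, q):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def distance_matrix(P, Q, d):
    """Full pairwise matrix with entry [i][j] = d(P[i], Q[j])."""
    return [[d(p, q) for q in Q] for p in P]

# Three collinear points (hypothetical toy data), with P = Q.
P = [(0.0,), (1.0,), (2.0,)]
A = distance_matrix(P, P, sq_euclidean)

# Triangle inequality would require A[0][2] <= A[0][1] + A[1][2].
violates_triangle = A[0][2] > A[0][1] + A[1][2]
print(A, violates_triangle)
```

Here the direct squared distance between the two end points is 4, while the detour through the middle point costs 1 + 1 = 2, so the triangle inequality fails.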

Guiding Generation for Abstractive Text Summarization Based on Key Information Guide Network


Title	Guiding Generation for Abstractive Text Summarization Based on Key Information Guide Network
Authors	Chenliang Li, Weiran Xu, Si Li, Sheng Gao
Abstract	Neural network models, based on the attentional encoder-decoder model, have good capability in abstractive text summarization. However, these models are hard to control in the process of generation, which leads to a lack of key information. We propose a guiding generation model that combines the extractive method and the abstractive method. Firstly, we obtain keywords from the text by an extractive model. Then, we introduce a Key Information Guide Network (KIGN), which encodes the keywords to the key information representation, to guide the process of generation. In addition, we use a prediction-guide mechanism, which can obtain the long-term value for future decoding, to further guide the summary generation. We evaluate our model on the CNN/Daily Mail dataset. The experimental results show that our model leads to significant improvements.
Tasks	Abstractive Text Summarization, Text Summarization
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2009/
PDF	https://www.aclweb.org/anthology/N18-2009
PWC	https://paperswithcode.com/paper/guiding-generation-for-abstractive-text
Repo
Framework

Knowledge Representation with Conceptual Spaces


Title	Knowledge Representation with Conceptual Spaces
Authors	Steven Schockaert
Abstract	Entity embeddings are vector space representations of a given domain of interest. They are typically learned from text corpora (possibly in combination with any available structured knowledge), based on the intuition that similar entities should be represented by similar vectors. The usefulness of such entity embeddings largely stems from the fact that they implicitly encode a rich amount of knowledge about the considered domain, beyond mere similarity. In an embedding of movies, for instance, we may expect all movies from a given genre to be located in some low-dimensional manifold. This is particularly useful in supervised learning settings, where it may e.g. allow neural movie recommenders to base predictions on the genre of a movie, without that genre having to be specified explicitly for each movie, or without even the need to specify that the genre of a movie is a property that may have predictive value for the considered task. In unsupervised settings, however, such implicitly encoded knowledge cannot be leveraged. Conceptual spaces, as proposed by Gärdenfors, are similar to entity embeddings, but provide more structure. In conceptual spaces, among others, dimensions are interpretable and grouped into facets, and properties and concepts are explicitly modelled as (vague) regions. Thanks to this additional structure, conceptual spaces can be used as a knowledge representation framework, which can also be effectively exploited in unsupervised settings. Given a conceptual space of movies, for instance, we are able to answer queries that ask about similarity w.r.t. a particular facet (e.g. movies which are cinematographically similar to Jurassic Park), that refer to a given feature (e.g. movies which are scarier than Jurassic Park but otherwise similar), or that refer to particular properties or concepts (e.g. thriller from the 1990s with a dinosaur theme).
Compared to standard entity embeddings, however, conceptual spaces are more challenging to learn in a purely data-driven fashion. In this talk, I will give an overview of some approaches for learning such representations that have recently been developed within the context of the FLEXILOG project.
Tasks	Entity Embeddings, Information Retrieval, Interpretable Machine Learning, Representation Learning
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4006/
PDF	https://www.aclweb.org/anthology/W18-4006
PWC	https://paperswithcode.com/paper/knowledge-representation-with-conceptual
Repo
Framework

Proceedings of the Workshop Events and Stories in the News 2018


Title	Proceedings of the Workshop Events and Stories in the News 2018
Authors
Abstract
Tasks
Published	2018-08-01
URL	https://www.aclweb.org/anthology/W18-4300/
PDF	https://www.aclweb.org/anthology/W18-4300
PWC	https://paperswithcode.com/paper/proceedings-of-the-workshop-events-and
Repo
Framework

SetExpander: End-to-end Term Set Expansion Based on Multi-Context Term Embeddings


Title	SetExpander: End-to-end Term Set Expansion Based on Multi-Context Term Embeddings
Authors	Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat
Abstract	We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class. SetExpander implements an iterative end-to-end workflow for term set expansion. It enables users to easily select a seed set of terms, expand it, view the expanded set, validate it, re-expand the validated set and store it, thus simplifying the extraction of domain-specific fine-grained semantic classes. SetExpander has been used for solving real-life use cases including integration in an automated recruitment system and an issues and defects resolution system. A video demo of SetExpander is available at https://drive.google.com/open?id=1e545bB87Autsch36DjnJHmq3HWfSd1Rv .
Tasks	Relation Extraction, Semantic Textual Similarity
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-2013/
PDF	https://www.aclweb.org/anthology/C18-2013
PWC	https://paperswithcode.com/paper/setexpander-end-to-end-term-set-expansion
Repo
Framework

Learning to Solve SMT Formulas


Title	Learning to Solve SMT Formulas
Authors	Mislav Balunovic, Pavol Bielik, Martin Vechev
Abstract	We present a new approach for learning to solve SMT formulas. We phrase the challenge of solving SMT formulas as a tree search problem where at each step a transformation is applied to the input formula until the formula is solved. Our approach works in two phases: first, given a dataset of unsolved formulas we learn a policy that for each formula selects a suitable transformation to apply at each step in order to solve the formula, and second, we synthesize a strategy in the form of a loop-free program with branches. This strategy is an interpretable representation of the policy decisions and is used to guide the SMT solver to decide formulas more efficiently, without requiring any modification to the solver itself and without needing to evaluate the learned policy at inference time. We show that our approach is effective in practice: it solves 17% more formulas over a range of benchmarks and achieves up to 100x runtime improvement over a state-of-the-art SMT solver.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/8233-learning-to-solve-smt-formulas
PDF	http://papers.nips.cc/paper/8233-learning-to-solve-smt-formulas.pdf
PWC	https://paperswithcode.com/paper/learning-to-solve-smt-formulas
Repo
Framework