Paper Group NANR 263
Characterizing and Learning Equivalence Classes of Causal DAGs under Interventions
Title | Characterizing and Learning Equivalence Classes of Causal DAGs under Interventions |
Authors | Karren Yang, Abigail Katcoff, Caroline Uhler |
Abstract | We consider the problem of learning causal DAGs in the setting where both observational and interventional data is available. This setting is common in biology, where gene regulatory networks can be intervened on using chemical reagents or gene deletions. Hauser & Buhlmann (2012) previously characterized the identifiability of causal DAGs under perfect interventions, which eliminate dependencies between targeted variables and their direct causes. In this paper, we extend these identifiability results to general interventions, which may modify the dependencies between targeted variables and their causes without eliminating them. We define and characterize the interventional Markov equivalence class that can be identified from general (not necessarily perfect) intervention experiments. We also propose the first provably consistent algorithm for learning DAGs in this setting and evaluate our algorithm on simulated and biological datasets. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2097 |
http://proceedings.mlr.press/v80/yang18a/yang18a.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-and-learning-equivalence |
Repo | |
Framework | |
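To illustrate the distinction the abstract above draws between perfect and general interventions, here is a minimal, hedged sketch on a toy 3-node linear Gaussian DAG. The structure, coefficients, and noise scales are all illustrative assumptions, not the paper's model or algorithm.

```python
# Toy illustration (not the paper's algorithm): sample a 3-node linear Gaussian
# DAG X1 -> X2 -> X3 under no intervention, a perfect intervention on X2
# (dependence on X1 eliminated), and a general intervention on X2
# (dependence on X1 modified but not eliminated).
import numpy as np

rng = np.random.default_rng(0)

def sample(n, mode="observational"):
    x1 = rng.normal(size=n)
    if mode == "observational":
        x2 = 0.8 * x1 + rng.normal(size=n)           # usual mechanism
    elif mode == "perfect":
        x2 = rng.normal(size=n)                       # parents cut off
    else:  # "general"
        x2 = 0.2 * x1 + rng.normal(size=n, scale=2)   # mechanism changed, edge kept
    x3 = -0.5 * x2 + rng.normal(size=n)
    return np.column_stack([x1, x2, x3])

for mode in ["observational", "perfect", "general"]:
    data = sample(10_000, mode)
    corr = np.corrcoef(data[:, 0], data[:, 1])[0, 1]
    print(f"{mode:14s} corr(X1, X2) = {corr:+.2f}")
```

Under the perfect intervention the correlation between X1 and X2 vanishes, while under the general intervention it is weakened but still present, which is exactly the regime the paper's equivalence-class characterization covers.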
Supervised Clustering of Questions into Intents for Dialog System Applications
Title | Supervised Clustering of Questions into Intents for Dialog System Applications |
Authors | Iryna Haponchyk, Antonio Uva, Seunghak Yu, Olga Uryupina, Alessandro Moschitti |
Abstract | Modern automated dialog systems require complex dialog managers able to deal with user intent triggered by high-level semantic questions. In this paper, we propose a model for automatically clustering questions into user intents to help with such design tasks. Since questions are short texts, uncovering their semantics to group them together can be very challenging. We approach the problem by using powerful semantic classifiers from question duplicate/matching research along with a novel supervised clustering method based on structured output. We test our approach on two intent-clustering corpora, showing an impressive improvement over previous methods for two languages/domains. |
Tasks | Chatbot, Intent Detection, Semantic Parsing, Semantic Textual Similarity |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1254/ |
https://www.aclweb.org/anthology/D18-1254 | |
PWC | https://paperswithcode.com/paper/supervised-clustering-of-questions-into |
Repo | |
Framework | |
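As a rough stand-in for the pipeline sketched in the abstract above (and not the paper's structured-output clusterer), one can score each question pair with a duplicate/matching classifier, keep edges above a threshold, and read off connected components as intent clusters. `pair_score` below is a hypothetical placeholder for a trained question-matching model; here it is a trivial word-overlap heuristic.

```python
# Simplified stand-in, not the paper's method: pairwise duplicate scores plus
# union-find grouping of questions into intent clusters.
from itertools import combinations

def pair_score(q1: str, q2: str) -> float:
    # Hypothetical placeholder: in practice, a trained duplicate-question classifier.
    shared = set(q1.lower().split()) & set(q2.lower().split())
    return len(shared) / min(len(q1.split()), len(q2.split()))

def cluster_questions(questions, threshold=0.5):
    parent = list(range(len(questions)))        # union-find over questions

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(questions)), 2):
        if pair_score(questions[i], questions[j]) >= threshold:
            parent[find(i)] = find(j)           # merge clusters

    clusters = {}
    for i, q in enumerate(questions):
        clusters.setdefault(find(i), []).append(q)
    return list(clusters.values())

print(cluster_questions([
    "how do I reset my password",
    "reset password help",
    "where is my order",
]))
```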
Cross-Lingual Learning-to-Rank with Shared Representations
Title | Cross-Lingual Learning-to-Rank with Shared Representations |
Authors | Shota Sasaki, Shuo Sun, Shigehiko Schamoni, Kevin Duh, Kentaro Inui |
Abstract | Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR. |
Tasks | Information Retrieval, Learning-To-Rank, Machine Translation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2073/ |
https://www.aclweb.org/anthology/N18-2073 | |
PWC | https://paperswithcode.com/paper/cross-lingual-learning-to-rank-with-shared |
Repo | |
Framework | |
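A minimal sketch of the shared-representation idea described above: a single scoring network is used for query-document pairs in every language and is trained with a pairwise margin ranking loss. The dimensions, feed-forward architecture, and use of pre-computed vectors are assumptions for illustration, not the paper's exact model.

```python
# Illustrative pairwise learning-to-rank with one scorer shared across languages.
import torch
import torch.nn as nn

class SharedRanker(nn.Module):
    def __init__(self, emb_dim=300, hidden=128):
        super().__init__()
        # The same parameters score query-document pairs in every language,
        # so e.g. Japanese-English data can help Swahili-English ranking.
        self.scorer = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, query_vec, doc_vec):
        return self.scorer(torch.cat([query_vec, doc_vec], dim=-1)).squeeze(-1)

model = SharedRanker()
loss_fn = nn.MarginRankingLoss(margin=1.0)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: pre-computed (e.g. averaged word-embedding) vectors.
q = torch.randn(8, 300)
pos_doc, neg_doc = torch.randn(8, 300), torch.randn(8, 300)

pos_score, neg_score = model(q, pos_doc), model(q, neg_doc)
loss = loss_fn(pos_score, neg_score, torch.ones(8))  # relevant docs should outscore irrelevant ones
loss.backward()
opt.step()
print(float(loss))
```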
Subword-level Composition Functions for Learning Word Embeddings
Title | Subword-level Composition Functions for Learning Word Embeddings |
Authors | Bofang Li, Aleksandr Drozd, Tao Liu, Xiaoyong Du |
Abstract | Subword-level information is crucial for capturing the meaning and morphology of words, especially for out-of-vocabulary entries. We propose CNN- and RNN-based subword-level composition functions for learning word embeddings, and systematically compare them with popular word-level and subword-level models (Skip-Gram and FastText). Additionally, we propose a hybrid training scheme in which a pure subword-level model is trained jointly with a conventional word-level embedding model based on lookup tables. This increases the fitness of all types of subword-level word embeddings; the word-level embeddings can be discarded after training, leaving only a compact subword-level representation with much smaller data volume. We evaluate these embeddings on a set of intrinsic and extrinsic tasks, showing that subword-level models have an advantage on tasks related to morphology and on datasets with a high OOV rate, and can be combined with other types of embeddings. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1205/ |
https://www.aclweb.org/anthology/W18-1205 | |
PWC | https://paperswithcode.com/paper/subword-level-composition-functions-for |
Repo | |
Framework | |
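A minimal sketch of a CNN-based subword composition function of the kind the abstract describes: embed characters, convolve over the character sequence, and max-pool to obtain a single word vector. All hyperparameters (character vocabulary size, dimensions, kernel width) are illustrative assumptions.

```python
# Character-CNN composition: characters in -> one word embedding out.
import torch
import torch.nn as nn

class CharCNNComposer(nn.Module):
    def __init__(self, n_chars=128, char_dim=32, word_dim=300, kernel=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=kernel, padding=1)

    def forward(self, char_ids):                     # (batch, word_len)
        x = self.char_emb(char_ids).transpose(1, 2)  # (batch, char_dim, word_len)
        x = torch.relu(self.conv(x))                 # (batch, word_dim, word_len)
        return x.max(dim=2).values                   # max-pool -> (batch, word_dim)

composer = CharCNNComposer()
word = torch.tensor([[ord(c) for c in "subword"]])   # naive ASCII char ids for one word
print(composer(word).shape)                          # torch.Size([1, 300])
```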
An Adversarial Approach to Hard Triplet Generation
Title | An Adversarial Approach to Hard Triplet Generation |
Authors | Yiru Zhao, Zhongming Jin, Guo-jun Qi, Hongtao Lu, Xian-sheng Hua |
Abstract | While deep neural networks have demonstrated competitive results for many visual recognition and image retrieval tasks, the major challenge lies in distinguishing similar images from different categories (i.e., hard negative examples) while clustering images with large variations from the same category (i.e., hard positive examples). The current state of the art is to mine the hardest triplet examples from the mini-batch to train the network. However, mining-based methods tend to focus on triplets that are hard with respect to the currently estimated network, rather than deliberately generating the hard triplets that really matter for globally optimizing the network. For this purpose, we propose an adversarial network for Hard Triplet Generation (HTG) to optimize the network's ability to distinguish similar examples of different categories as well as to group varied examples of the same category. We evaluate our method on challenging real-world datasets such as CUB200-2011, CARS196, DeepFashion, and VehicleID, and show that it outperforms state-of-the-art methods significantly. |
Tasks | Image Retrieval |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Yiru_Zhao_A_Principled_Approach_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Yiru_Zhao_A_Principled_Approach_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/an-adversarial-approach-to-hard-triplet |
Repo | |
Framework | |
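For context, here is a sketch of the mining-based baseline the abstract contrasts with (not HTG itself): for each anchor in a mini-batch, take the farthest same-class example and the closest different-class example, and apply a triplet margin loss. The margin and embedding sizes are illustrative.

```python
# Batch-hard triplet mining baseline (not the paper's adversarial generator).
import torch

def batch_hard_triplet_loss(emb, labels, margin=0.2):
    dist = torch.cdist(emb, emb)                       # pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask

    # Hardest positive: max distance to a same-class example (self contributes 0).
    pos_dist = dist.clone()
    pos_dist[~same] = 0.0
    hardest_pos = pos_dist.max(dim=1).values

    # Hardest negative: min distance to a different-class example.
    neg_dist = dist.clone()
    neg_dist[same] = float("inf")
    hardest_neg = neg_dist.min(dim=1).values

    return torch.clamp(hardest_pos - hardest_neg + margin, min=0.0).mean()

emb = torch.nn.functional.normalize(torch.randn(16, 64), dim=1)
labels = torch.randint(0, 4, (16,))
print(float(batch_hard_triplet_loss(emb, labels)))
```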
A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Title | A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation |
Authors | Kirk Roberts, Yuqi Si, Anshul Gandhi, Elmer Bernstam |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1041/ |
https://www.aclweb.org/anthology/L18-1041 | |
PWC | https://paperswithcode.com/paper/a-framenet-for-cancer-information-in-clinical |
Repo | |
Framework | |
Improving Neural Abstractive Document Summarization with Explicit Information Selection Modeling
Title | Improving Neural Abstractive Document Summarization with Explicit Information Selection Modeling |
Authors | Wei Li, Xinyan Xiao, Yajuan Lyu, Yuanzhuo Wang |
Abstract | Information selection is the most important component in the document summarization task. In this paper, we propose to extend the basic neural encoder-decoder framework with an information selection layer to explicitly model and optimize the information selection process in abstractive document summarization. Specifically, our information selection layer consists of two parts: gated global information filtering and local sentence selection. Unnecessary information in the original document is first globally filtered, and then salient sentences are selected locally while generating each summary sentence sequentially. To optimize the information selection process directly, distantly supervised training guided by the gold summary is also introduced. Experimental results demonstrate that explicitly modeling and optimizing the information selection process significantly improves document summarization performance, enabling our model to generate more informative and concise summaries and thus to significantly outperform state-of-the-art neural abstractive methods. |
Tasks | Document Summarization, Machine Translation, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1205/ |
https://www.aclweb.org/anthology/D18-1205 | |
PWC | https://paperswithcode.com/paper/improving-neural-abstractive-document |
Repo | |
Framework | |
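A hedged sketch of the gated global filtering idea from the abstract above: a sigmoid gate, conditioned on a document-level vector, scales each sentence representation before decoding. The dimensions and the way the document vector is formed are assumptions, not the paper's exact layer.

```python
# Gated global information filtering over encoded sentences.
import torch
import torch.nn as nn

class GlobalInfoFilter(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, sent_states, doc_vec):
        # sent_states: (batch, n_sents, dim); doc_vec: (batch, dim)
        doc = doc_vec.unsqueeze(1).expand_as(sent_states)
        g = torch.sigmoid(self.gate(torch.cat([sent_states, doc], dim=-1)))
        return g * sent_states        # unnecessary information is gated down

filt = GlobalInfoFilter()
sents = torch.randn(2, 5, 256)        # five encoded sentences per document
doc = sents.mean(dim=1)               # crude document vector for illustration
print(filt(sents, doc).shape)         # torch.Size([2, 5, 256])
```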
Imdlawn Tashlhiyt Berber Syllabification is Quantifier-Free
Title | Imdlawn Tashlhiyt Berber Syllabification is Quantifier-Free |
Authors | Kristina Strother-Garcia |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0315/ |
https://www.aclweb.org/anthology/W18-0315 | |
PWC | https://paperswithcode.com/paper/imdlawn-tashlhiyt-berber-syllabification-is |
Repo | |
Framework | |
Modeling discourse cohesion for discourse parsing via memory network
Title | Modeling discourse cohesion for discourse parsing via memory network |
Authors | Yanyan Jia, Yuan Ye, Yansong Feng, Yuxuan Lai, Rui Yan, Dongyan Zhao |
Abstract | Identifying long-span dependencies between discourse units is crucial for improving discourse parsing performance. Most existing approaches design sophisticated features or exploit various off-the-shelf tools, but achieve little success. In this paper, we propose a new transition-based discourse parser that makes use of memory networks to take discourse cohesion into account. The automatically captured discourse cohesion benefits discourse parsing, especially in long-span scenarios. Experiments on the RST discourse treebank show that our method outperforms traditional feature-based methods, and that the memory-based discourse cohesion significantly improves overall parsing performance. |
Tasks | Question Answering, Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2070/ |
https://www.aclweb.org/anthology/P18-2070 | |
PWC | https://paperswithcode.com/paper/modeling-discourse-cohesion-for-discourse |
Repo | |
Framework | |
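A minimal sketch of the memory-network reading step implied by the abstract (not the full transition-based parser): the representation of the current discourse unit attends over earlier units, giving the parser a cohesion-aware summary of the long-range context. Dimensions are illustrative.

```python
# Memory-network style attention over earlier discourse units.
import torch
import torch.nn.functional as F

def cohesion_read(query, memory):
    # query: (dim,) current EDU state; memory: (n_units, dim) earlier EDUs.
    scores = memory @ query                 # dot-product relevance
    weights = F.softmax(scores, dim=0)      # attention over the memory
    read = weights @ memory                 # cohesion-aware summary vector
    return torch.cat([query, read])         # augment the parser state

memory = torch.randn(10, 128)               # ten earlier discourse units
query = torch.randn(128)
print(cohesion_read(query, memory).shape)   # torch.Size([256])
```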
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
Title | The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference |
Authors | Hao Lu, Yuan Cao, Zhuoran Yang, Junwei Lu, Han Liu, Zhaoran Wang |
Abstract | We study the hypothesis testing problem of inferring the existence of combinatorial structures in undirected graphical models. Although there exist extensive studies on the information-theoretic limits of this problem, it remains largely unexplored whether such limits can be attained by efficient algorithms. In this paper, we quantify the minimum computational complexity required to attain the information-theoretic limits based on an oracle computational model. We prove that, for testing common combinatorial structures, such as clique, nearest neighbor graph and perfect matching, against an empty graph, or large clique against small clique, the information-theoretic limits are provably unachievable by tractable algorithms in general. More importantly, we define structural quantities called the weak and strong edge densities, which offer deep insight into the existence of such computational-statistical tradeoffs. To the best of our knowledge, our characterization is the first to identify and explain the fundamental tradeoffs between statistics and computation for combinatorial inference problems in undirected graphical models. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2088 |
http://proceedings.mlr.press/v80/lu18a/lu18a.pdf | |
PWC | https://paperswithcode.com/paper/the-edge-density-barrier-computational |
Repo | |
Framework | |
Determining Code Words in Euphemistic Hate Speech Using Word Embedding Networks
Title | Determining Code Words in Euphemistic Hate Speech Using Word Embedding Networks |
Authors | Rijul Magu, Jiebo Luo |
Abstract | While explicit abusive language detection online has lately seen ever-increasing focus, implicit abuse detection remains a largely unexplored space. We carry out a study on a subcategory of implicit hate: euphemistic hate speech. We propose a method to assist in identifying unknown euphemisms (or code words) given a set of hateful tweets containing a known code word. Our approach leverages word embeddings and network analysis (through centrality measures and community detection) in a manner that can be generalized to identify euphemisms across contexts, not just hate speech. |
Tasks | Abuse Detection, Community Detection, Hate Speech Detection, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5112/ |
https://www.aclweb.org/anthology/W18-5112 | |
PWC | https://paperswithcode.com/paper/determining-code-words-in-euphemistic-hate |
Repo | |
Framework | |
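A hedged sketch of the general idea in the abstract above (not the authors' exact pipeline): build a similarity network over candidate words using their embeddings and rank candidates by a centrality measure, here a simple weighted degree. The candidate words and random embeddings are hypothetical stand-ins.

```python
# Embedding-similarity network plus a centrality ranking of candidate code words.
import numpy as np

rng = np.random.default_rng(0)
candidates = ["skittle", "candy", "sugar", "snack", "pie"]   # toy candidate words
vectors = {w: rng.normal(size=50) for w in candidates}       # stand-in embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Adjacency matrix of the word similarity network (negative similarities dropped).
n = len(candidates)
adj = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            adj[i, j] = max(cosine(vectors[candidates[i]], vectors[candidates[j]]), 0.0)

# Weighted-degree centrality: words central to the network are code-word candidates.
centrality = adj.sum(axis=1)
print(sorted(zip(candidates, centrality), key=lambda t: -t[1]))
```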
Learning to Explore via Meta-Policy Gradient
Title | Learning to Explore via Meta-Policy Gradient |
Authors | Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng |
Abstract | The performance of off-policy learning, including deep Q-learning and deep deterministic policy gradient (DDPG), critically depends on the choice of the exploration policy. Existing exploration methods are mostly based on adding noise to the on-going actor policy and can only explore local regions close to what the actor policy dictates. In this work, we develop a simple meta-policy gradient algorithm that allows us to adaptively learn the exploration policy in DDPG. Our algorithm allows us to train flexible exploration behaviors that are independent of the actor policy, yielding a global exploration that significantly speeds up the learning process. With an extensive study, we show that our method significantly improves the sample-efficiency of DDPG on a variety of reinforcement learning continuous control tasks. |
Tasks | Continuous Control, Q-Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1939 |
http://proceedings.mlr.press/v80/xu18d/xu18d.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-explore-via-meta-policy-gradient |
Repo | |
Framework | |
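A heavily simplified sketch of the meta-gradient idea from the abstract above (not the paper's full algorithm): the exploration policy is a Gaussian over a noise scale with a learnable parameter, updated by REINFORCE using the actor's improvement as the meta-reward. `measure_actor_improvement` is a hypothetical placeholder; in DDPG it would be the performance gain of the actor after training on the data collected with that exploration noise.

```python
# REINFORCE-style meta-update of an exploration-noise policy.
import torch

log_sigma = torch.zeros(1, requires_grad=True)      # exploration-policy parameter
opt = torch.optim.Adam([log_sigma], lr=1e-2)
baseline = 0.0

def measure_actor_improvement(noise_scale: float) -> float:
    # Hypothetical stand-in for the actor's post-update performance gain.
    return float(-(noise_scale - 0.3) ** 2)          # pretend 0.3 is a good scale

for step in range(200):
    dist = torch.distributions.Normal(0.0, log_sigma.exp())
    noise = dist.sample()                            # sample an exploration noise
    meta_reward = measure_actor_improvement(abs(float(noise)))
    advantage = meta_reward - baseline
    baseline = 0.9 * baseline + 0.1 * meta_reward    # running baseline
    # REINFORCE: raise the log-probability of noise that helped the actor.
    loss = (-dist.log_prob(noise) * advantage).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned exploration scale:", float(log_sigma.exp()))
```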
Language-Based Automatic Assessment of Cognitive and Communicative Functions Related to Parkinson’s Disease
Title | Language-Based Automatic Assessment of Cognitive and Communicative Functions Related to Parkinson’s Disease |
Authors | Lesley Jessiman, Gabriel Murray, McKenzie Braley |
Abstract | We explore the use of natural language processing and machine learning for detecting evidence of Parkinson's disease from transcribed speech of subjects who are describing everyday tasks. Experiments reveal the difficulty of treating this as a binary classification task, and a multi-class approach yields superior results. We also show that these models can be used to predict cognitive abilities across all subjects. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4107/ |
https://www.aclweb.org/anthology/W18-4107 | |
PWC | https://paperswithcode.com/paper/language-based-automatic-assessment-of |
Repo | |
Framework | |
Mean-Variance Loss for Deep Age Estimation From a Face
Title | Mean-Variance Loss for Deep Age Estimation From a Face |
Authors | Hongyu Pan, Hu Han, Shiguang Shan, Xilin Chen |
Abstract | Age estimation has broad application prospects in many fields, such as video surveillance, social networking, and human-computer interaction. However, many published age estimation approaches simply treat age estimation as an exact age regression problem, and thus do not leverage a distribution's robustness in representing labels with ambiguity, such as ages. In this paper, we propose a new loss function, called mean-variance loss, for robust age estimation via distribution learning. Specifically, the mean-variance loss consists of a mean loss, which penalizes the difference between the mean of the estimated age distribution and the ground-truth age, and a variance loss, which penalizes the variance of the estimated age distribution to ensure a concentrated distribution. The proposed mean-variance loss and softmax loss are embedded jointly into Convolutional Neural Networks (CNNs) for age estimation, and the network weights are optimized via stochastic gradient descent (SGD) in an end-to-end manner. Experimental results on a number of challenging face aging databases (FG-NET, MORPH Album II, and CLAP2016) show that the proposed approach outperforms state-of-the-art methods by a large margin using a single model. |
Tasks | Age Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Pan_Mean-Variance_Loss_for_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Pan_Mean-Variance_Loss_for_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/mean-variance-loss-for-deep-age-estimation |
Repo | |
Framework | |
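A sketch of the mean-variance loss as described in the abstract above: the softmax output over age bins is treated as a distribution whose mean should match the true age and whose variance should be small, combined with a standard softmax loss. The weighting between the terms is an assumed hyperparameter.

```python
# Mean-variance loss over an age distribution, combined with cross-entropy.
import torch
import torch.nn.functional as F

def mean_variance_loss(logits, true_ages, lambda_var=0.05):
    # logits: (batch, n_ages); true_ages: (batch,) float ground-truth ages
    probs = F.softmax(logits, dim=1)
    ages = torch.arange(logits.size(1), dtype=probs.dtype)      # age bin values
    mean = (probs * ages).sum(dim=1)                             # distribution mean
    variance = (probs * (ages - mean.unsqueeze(1)) ** 2).sum(dim=1)
    mean_loss = 0.5 * (mean - true_ages) ** 2                    # penalize mean error
    return (mean_loss + lambda_var * variance).mean()            # penalize spread

logits = torch.randn(4, 101)                 # ages 0..100
true_ages = torch.tensor([23.0, 41.0, 7.0, 65.0])
loss = mean_variance_loss(logits, true_ages) + F.cross_entropy(
    logits, true_ages.long())                # jointly with the softmax loss
print(float(loss))
```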
Exploring Semantic Properties of Sentence Embeddings
Title | Exploring Semantic Properties of Sentence Embeddings |
Authors | Xunjie Zhu, Tingfeng Li, Gerard de Melo |
Abstract | Neural vector representations are ubiquitous throughout all subfields of NLP. While word vectors have been studied in much detail, thus far only little light has been shed on the properties of sentence embeddings. In this paper, we assess to what extent prominent sentence embedding methods exhibit select semantic properties. We propose a framework that generates triplets of sentences to explore how changes in the syntactic structure or semantics of a given sentence affect the similarities obtained between their sentence embeddings. |
Tasks | Machine Translation, Reading Comprehension, Semantic Textual Similarity, Sentence Embedding, Sentence Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2100/ |
https://www.aclweb.org/anthology/P18-2100 | |
PWC | https://paperswithcode.com/paper/exploring-semantic-properties-of-sentence |
Repo | |
Framework | |
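A toy sketch of the probing setup described above: for a triplet consisting of an original sentence, a meaning-preserving variant, and a meaning-changing variant, compare cosine similarities of their embeddings. `embed` is a hypothetical placeholder for any sentence-embedding model; with a real encoder, a semantically faithful embedding should rank the meaning-preserving variant closer.

```python
# Triplet-based probing of sentence embeddings via cosine similarity.
import numpy as np

rng = np.random.default_rng(0)

def embed(sentence: str) -> np.ndarray:
    # Hypothetical stand-in; in practice a real sentence encoder would be used.
    return rng.normal(size=300)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

original     = "the cat chased the mouse"
same_meaning = "the mouse was chased by the cat"      # passivized variant
new_meaning  = "the mouse chased the cat"             # arguments swapped

e0, e1, e2 = embed(original), embed(same_meaning), embed(new_meaning)
# A semantically faithful embedding should give sim(e0, e1) > sim(e0, e2).
print(cosine(e0, e1), cosine(e0, e2))
```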