Paper Group NANR 263
Characterizing and Learning Equivalence Classes of Causal DAGs under Interventions
Title | Characterizing and Learning Equivalence Classes of Causal DAGs under Interventions |
Authors | Karren Yang, Abigail Katcoff, Caroline Uhler |
Abstract | We consider the problem of learning causal DAGs in the setting where both observational and interventional data is available. This setting is common in biology, where gene regulatory networks can be intervened on using chemical reagents or gene deletions. Hauser & Buhlmann (2012) previously characterized the identifiability of causal DAGs under perfect interventions, which eliminate dependencies between targeted variables and their direct causes. In this paper, we extend these identifiability results to general interventions, which may modify the dependencies between targeted variables and their causes without eliminating them. We define and characterize the interventional Markov equivalence class that can be identified from general (not necessarily perfect) intervention experiments. We also propose the first provably consistent algorithm for learning DAGs in this setting and evaluate our algorithm on simulated and biological datasets. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2097 |
http://proceedings.mlr.press/v80/yang18a/yang18a.pdf | |
PWC | https://paperswithcode.com/paper/characterizing-and-learning-equivalence |
Repo | |
Framework | |
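To illustrate the distinction the abstract above draws between perfect and general interventions, here is a minimal, hedged sketch on a toy 3-node linear Gaussian DAG. The structure, coefficients, and noise scales are all illustrative assumptions, not the paper's model or algorithm.

```python
# Toy illustration (not the paper's algorithm): sample a 3-node linear Gaussian
# DAG X1 -> X2 -> X3 under no intervention, a perfect intervention on X2
# (dependence on X1 eliminated), and a general intervention on X2
# (dependence on X1 modified but not eliminated).
import numpy as np

rng = np.random.default_rng(0)

def sample(n, mode="observational"):
    x1 = rng.normal(size=n)
    if mode == "observational":
        x2 = 0.8 * x1 + rng.normal(size=n)           # usual mechanism
    elif mode == "perfect":
        x2 = rng.normal(size=n)                       # parents cut off
    else:  # "general"
        x2 = 0.2 * x1 + rng.normal(size=n, scale=2)   # mechanism changed, edge kept
    x3 = -0.5 * x2 + rng.normal(size=n)
    return np.column_stack([x1, x2, x3])

for mode in ["observational", "perfect", "general"]:
    data = sample(10_000, mode)
    corr = np.corrcoef(data[:, 0], data[:, 1])[0, 1]
    print(f"{mode:14s} corr(X1, X2) = {corr:+.2f}")
```

Under the perfect intervention the correlation between X1 and X2 vanishes, while under the general intervention it is weakened but still present, which is exactly the regime the paper's equivalence-class characterization covers.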
Supervised Clustering of Questions into Intents for Dialog System Applications
Title | Supervised Clustering of Questions into Intents for Dialog System Applications |
Authors | Iryna Haponchyk, Antonio Uva, Seunghak Yu, Olga Uryupina, Alessandro Moschitti |
Abstract | Modern automated dialog systems require complex dialog managers able to deal with user intent triggered by high-level semantic questions. In this paper, we propose a model for automatically clustering questions into user intents to help with such design tasks. Since questions are short texts, uncovering their semantics to group them together can be very challenging. We approach the problem by using powerful semantic classifiers from question duplicate/matching research along with a novel supervised clustering method based on structured output. We test our approach on two intent-clustering corpora, showing an impressive improvement over previous methods for two languages/domains. |
Tasks | Chatbot, Intent Detection, Semantic Parsing, Semantic Textual Similarity |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1254/ |
https://www.aclweb.org/anthology/D18-1254 | |
PWC | https://paperswithcode.com/paper/supervised-clustering-of-questions-into |
Repo | |
Framework | |
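As a rough stand-in for the pipeline sketched in the abstract above (and not the paper's structured-output clusterer), one can score each question pair with a duplicate/matching classifier, keep edges above a threshold, and read off connected components as intent clusters. `pair_score` below is a hypothetical placeholder for a trained question-matching model; here it is a trivial word-overlap heuristic.

```python
# Simplified stand-in, not the paper's method: pairwise duplicate scores plus
# union-find grouping of questions into intent clusters.
from itertools import combinations

def pair_score(q1: str, q2: str) -> float:
    # Hypothetical placeholder: in practice, a trained duplicate-question classifier.
    shared = set(q1.lower().split()) & set(q2.lower().split())
    return len(shared) / min(len(q1.split()), len(q2.split()))

def cluster_questions(questions, threshold=0.5):
    parent = list(range(len(questions)))        # union-find over questions

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(questions)), 2):
        if pair_score(questions[i], questions[j]) >= threshold:
            parent[find(i)] = find(j)           # merge clusters

    clusters = {}
    for i, q in enumerate(questions):
        clusters.setdefault(find(i), []).append(q)
    return list(clusters.values())

print(cluster_questions([
    "how do I reset my password",
    "reset password help",
    "where is my order",
]))
```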
Cross-Lingual Learning-to-Rank with Shared Representations
Title | Cross-Lingual Learning-to-Rank with Shared Representations |
Authors | Shota Sasaki, Shuo Sun, Shigehiko Schamoni, Kevin Duh, Kentaro Inui |
Abstract | Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR. |
Tasks | Information Retrieval, Learning-To-Rank, Machine Translation |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-2073/ |
https://www.aclweb.org/anthology/N18-2073 | |
PWC | https://paperswithcode.com/paper/cross-lingual-learning-to-rank-with-shared |
Repo | |
Framework | |
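A minimal sketch of the shared-representation idea described above: a single scoring network is used for query-document pairs in every language and is trained with a pairwise margin ranking loss. The dimensions, feed-forward architecture, and use of pre-computed vectors are assumptions for illustration, not the paper's exact model.

```python
# Illustrative pairwise learning-to-rank with one scorer shared across languages.
import torch
import torch.nn as nn

class SharedRanker(nn.Module):
    def __init__(self, emb_dim=300, hidden=128):
        super().__init__()
        # The same parameters score query-document pairs in every language,
        # so e.g. Japanese-English data can help Swahili-English ranking.
        self.scorer = nn.Sequential(
            nn.Linear(2 * emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, query_vec, doc_vec):
        return self.scorer(torch.cat([query_vec, doc_vec], dim=-1)).squeeze(-1)

model = SharedRanker()
loss_fn = nn.MarginRankingLoss(margin=1.0)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch: pre-computed (e.g. averaged word-embedding) vectors.
q = torch.randn(8, 300)
pos_doc, neg_doc = torch.randn(8, 300), torch.randn(8, 300)

pos_score, neg_score = model(q, pos_doc), model(q, neg_doc)
loss = loss_fn(pos_score, neg_score, torch.ones(8))  # relevant docs should outscore irrelevant ones
loss.backward()
opt.step()
print(float(loss))
```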
Subword-level Composition Functions for Learning Word Embeddings
Title | Subword-level Composition Functions for Learning Word Embeddings |
Authors | Bofang Li, Aleksandr Drozd, Tao Liu, Xiaoyong Du |
Abstract | Subword-level information is crucial for capturing the meaning and morphology of words, especially for out-of-vocabulary entries. We propose CNN- and RNN-based subword-level composition functions for learning word embeddings, and systematically compare them with popular word-level and subword-level models (Skip-Gram and FastText). Additionally, we propose a hybrid training scheme in which a pure subword-level model is trained jointly with a conventional word-level embedding model based on lookup tables. This increases the fitness of all types of subword-level word embeddings; the word-level embeddings can be discarded after training, leaving only a compact subword-level representation with much smaller data volume. We evaluate these embeddings on a set of intrinsic and extrinsic tasks, showing that subword-level models have an advantage on tasks related to morphology and on datasets with a high OOV rate, and can be combined with other types of embeddings. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1205/ |
https://www.aclweb.org/anthology/W18-1205 | |
PWC | https://paperswithcode.com/paper/subword-level-composition-functions-for |
Repo | |
Framework | |
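A minimal sketch of a CNN-based subword composition function of the kind the abstract describes: embed characters, convolve over the character sequence, and max-pool to obtain a single word vector. All hyperparameters (character vocabulary size, dimensions, kernel width) are illustrative assumptions.

```python
# Character-CNN composition: characters in -> one word embedding out.
import torch
import torch.nn as nn

class CharCNNComposer(nn.Module):
    def __init__(self, n_chars=128, char_dim=32, word_dim=300, kernel=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, word_dim, kernel_size=kernel, padding=1)

    def forward(self, char_ids):                     # (batch, word_len)
        x = self.char_emb(char_ids).transpose(1, 2)  # (batch, char_dim, word_len)
        x = torch.relu(self.conv(x))                 # (batch, word_dim, word_len)
        return x.max(dim=2).values                   # max-pool -> (batch, word_dim)

composer = CharCNNComposer()
word = torch.tensor([[ord(c) for c in "subword"]])   # naive ASCII char ids for one word
print(composer(word).shape)                          # torch.Size([1, 300])
```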
An Adversarial Approach to Hard Triplet Generation
Title | An Adversarial Approach to Hard Triplet Generation |
Authors | Yiru Zhao, Zhongming Jin, Guo-jun Qi, Hongtao Lu, Xian-sheng Hua |
Abstract | While deep neural networks have demonstrated competitive results for many visual recognition and image retrieval tasks, the major challenge lies in distinguishing similar images from different categories (i.e., hard negative examples) while clustering images with large variations from the same category (i.e., hard positive examples). The current state of the art is to mine the hardest triplet examples from the mini-batch to train the network. However, mining-based methods tend to focus on triplets that are hard with respect to the currently estimated network, rather than deliberately generating the hard triplets that really matter for globally optimizing the network. For this purpose, we propose an adversarial network for Hard Triplet Generation (HTG) to optimize the network's ability to distinguish similar examples of different categories as well as to group varied examples of the same category. We evaluate our method on challenging real-world datasets such as CUB200-2011, CARS196, DeepFashion, and VehicleID, and show that it outperforms state-of-the-art methods significantly. |
Tasks | Image Retrieval |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Yiru_Zhao_A_Principled_Approach_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Yiru_Zhao_A_Principled_Approach_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/an-adversarial-approach-to-hard-triplet |
Repo | |
Framework | |
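For context, here is a sketch of the mining-based baseline the abstract contrasts with (not HTG itself): for each anchor in a mini-batch, take the farthest same-class example and the closest different-class example, and apply a triplet margin loss. The margin and embedding sizes are illustrative.

```python
# Batch-hard triplet mining baseline (not the paper's adversarial generator).
import torch

def batch_hard_triplet_loss(emb, labels, margin=0.2):
    dist = torch.cdist(emb, emb)                       # pairwise distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask

    # Hardest positive: max distance to a same-class example (self contributes 0).
    pos_dist = dist.clone()
    pos_dist[~same] = 0.0
    hardest_pos = pos_dist.max(dim=1).values

    # Hardest negative: min distance to a different-class example.
    neg_dist = dist.clone()
    neg_dist[same] = float("inf")
    hardest_neg = neg_dist.min(dim=1).values

    return torch.clamp(hardest_pos - hardest_neg + margin, min=0.0).mean()

emb = torch.nn.functional.normalize(torch.randn(16, 64), dim=1)
labels = torch.randint(0, 4, (16,))
print(float(batch_hard_triplet_loss(emb, labels)))
```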
A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation
Title | A FrameNet for Cancer Information in Clinical Narratives: Schema and Annotation |
Authors | Kirk Roberts, Yuqi Si, Anshul Gandhi, Elmer Bernstam |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1041/ |
https://www.aclweb.org/anthology/L18-1041 | |
PWC | https://paperswithcode.com/paper/a-framenet-for-cancer-information-in-clinical |
Repo | |
Framework | |
Improving Neural Abstractive Document Summarization with Explicit Information Selection Modeling
Title | Improving Neural Abstractive Document Summarization with Explicit Information Selection Modeling |
Authors | Wei Li, Xinyan Xiao, Yajuan Lyu, Yuanzhuo Wang |
Abstract | Information selection is the most important component in the document summarization task. In this paper, we propose to extend the basic neural encoder-decoder framework with an information selection layer to explicitly model and optimize the information selection process in abstractive document summarization. Specifically, our information selection layer consists of two parts: gated global information filtering and local sentence selection. Unnecessary information in the original document is first globally filtered, and then salient sentences are selected locally while generating each summary sentence sequentially. To optimize the information selection process directly, distantly supervised training guided by the gold summary is also introduced. Experimental results demonstrate that explicitly modeling and optimizing the information selection process significantly improves document summarization performance, enabling our model to generate more informative and concise summaries and thus to significantly outperform state-of-the-art neural abstractive methods. |
Tasks | Document Summarization, Machine Translation, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1205/ |
https://www.aclweb.org/anthology/D18-1205 | |
PWC | https://paperswithcode.com/paper/improving-neural-abstractive-document |
Repo | |
Framework | |
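A hedged sketch of the gated global filtering idea from the abstract above: a sigmoid gate, conditioned on a document-level vector, scales each sentence representation before decoding. The dimensions and the way the document vector is formed are assumptions, not the paper's exact layer.

```python
# Gated global information filtering over encoded sentences.
import torch
import torch.nn as nn

class GlobalInfoFilter(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, sent_states, doc_vec):
        # sent_states: (batch, n_sents, dim); doc_vec: (batch, dim)
        doc = doc_vec.unsqueeze(1).expand_as(sent_states)
        g = torch.sigmoid(self.gate(torch.cat([sent_states, doc], dim=-1)))
        return g * sent_states        # unnecessary information is gated down

filt = GlobalInfoFilter()
sents = torch.randn(2, 5, 256)        # five encoded sentences per document
doc = sents.mean(dim=1)               # crude document vector for illustration
print(filt(sents, doc).shape)         # torch.Size([2, 5, 256])
```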
Imdlawn Tashlhiyt Berber Syllabification is Quantifier-Free
Title | Imdlawn Tashlhiyt Berber Syllabification is Quantifier-Free |
Authors | Kristina Strother-Garcia |
Abstract | |
Tasks | |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/W18-0315/ |
https://www.aclweb.org/anthology/W18-0315 | |
PWC | https://paperswithcode.com/paper/imdlawn-tashlhiyt-berber-syllabification-is |
Repo | |
Framework | |
Modeling discourse cohesion for discourse parsing via memory network
Title | Modeling discourse cohesion for discourse parsing via memory network |
Authors | Yanyan Jia, Yuan Ye, Yansong Feng, Yuxuan Lai, Rui Yan, Dongyan Zhao |
Abstract | Identifying long-span dependencies between discourse units is crucial for improving discourse parsing performance. Most existing approaches design sophisticated features or exploit various off-the-shelf tools, but achieve little success. In this paper, we propose a new transition-based discourse parser that makes use of memory networks to take discourse cohesion into account. The automatically captured discourse cohesion benefits discourse parsing, especially in long-span scenarios. Experiments on the RST discourse treebank show that our method outperforms traditional feature-based methods, and that the memory-based discourse cohesion significantly improves overall parsing performance. |
Tasks | Question Answering, Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2070/ |
https://www.aclweb.org/anthology/P18-2070 | |
PWC | https://paperswithcode.com/paper/modeling-discourse-cohesion-for-discourse |
Repo | |
Framework | |
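A minimal sketch of the memory-network reading step implied by the abstract (not the full transition-based parser): the representation of the current discourse unit attends over earlier units, giving the parser a cohesion-aware summary of the long-range context. Dimensions are illustrative.

```python
# Memory-network style attention over earlier discourse units.
import torch
import torch.nn.functional as F

def cohesion_read(query, memory):
    # query: (dim,) current EDU state; memory: (n_units, dim) earlier EDUs.
    scores = memory @ query                 # dot-product relevance
    weights = F.softmax(scores, dim=0)      # attention over the memory
    read = weights @ memory                 # cohesion-aware summary vector
    return torch.cat([query, read])         # augment the parser state

memory = torch.randn(10, 128)               # ten earlier discourse units
query = torch.randn(128)
print(cohesion_read(query, memory).shape)   # torch.Size([256])
```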
The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference
Title | The Edge Density Barrier: Computational-Statistical Tradeoffs in Combinatorial Inference |
Authors | Hao Lu, Yuan Cao, Zhuoran Yang, Junwei Lu, Han Liu, Zhaoran Wang |
Abstract | We study the hypothesis testing problem of inferring the existence of combinatorial structures in undirected graphical models. Although there exist extensive studies on the information-theoretic limits of this problem, it remains largely unexplored whether such limits can be attained by efficient algorithms. In this paper, we quantify the minimum computational complexity required to attain the information-theoretic limits based on an oracle computational model. We prove that, for testing common combinatorial structures, such as clique, nearest neighbor graph and perfect matching, against an empty graph, or large clique against small clique, the information-theoretic limits are provably unachievable by tractable algorithms in general. More importantly, we define structural quantities called the weak and strong edge densities, which offer deep insight into the existence of such computational-statistical tradeoffs. To the best of our knowledge, our characterization is the first to identify and explain the fundamental tradeoffs between statistics and computation for combinatorial inference problems in undirected graphical models. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2088 |
http://proceedings.mlr.press/v80/lu18a/lu18a.pdf | |
PWC | https://paperswithcode.com/paper/the-edge-density-barrier-computational |
Repo | |
Framework | |
Determining Code Words in Euphemistic Hate Speech Using Word Embedding Networks
Title | Determining Code Words in Euphemistic Hate Speech Using Word Embedding Networks |
Authors | Rijul Magu, Jiebo Luo |
Abstract | While explicit abusive language detection online has lately seen ever-increasing focus, implicit abuse detection remains a largely unexplored space. We carry out a study on a subcategory of implicit hate: euphemistic hate speech. We propose a method to assist in identifying unknown euphemisms (or code words) given a set of hateful tweets containing a known code word. Our approach leverages word embeddings and network analysis (through centrality measures and community detection) in a manner that can be generalized to identify euphemisms across contexts, not just hate speech. |
Tasks | Abuse Detection, Community Detection, Hate Speech Detection, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5112/ |
https://www.aclweb.org/anthology/W18-5112 | |
PWC | https://paperswithcode.com/paper/determining-code-words-in-euphemistic-hate |
Repo | |
Framework | |
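A hedged sketch of the general idea in the abstract above (not the authors' exact pipeline): build a similarity network over candidate words using their embeddings and rank candidates by a centrality measure, here a simple weighted degree. The candidate words and random embeddings are hypothetical stand-ins.

```python
# Embedding-similarity network plus a centrality ranking of candidate code words.
import numpy as np

rng = np.random.default_rng(0)
candidates = ["skittle", "candy", "sugar", "snack", "pie"]   # toy candidate words
vectors = {w: rng.normal(size=50) for w in candidates}       # stand-in embeddings

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Adjacency matrix of the word similarity network (negative similarities dropped).
n = len(candidates)
adj = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            adj[i, j] = max(cosine(vectors[candidates[i]], vectors[candidates[j]]), 0.0)

# Weighted-degree centrality: words central to the network are code-word candidates.
centrality = adj.sum(axis=1)
print(sorted(zip(candidates, centrality), key=lambda t: -t[1]))
```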
Learning to Explore via Meta-Policy Gradient
Title | Learning to Explore via Meta-Policy Gradient |
Authors | Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng |
Abstract | The performance of off-policy learning, including deep Q-learning and deep deterministic policy gradient (DDPG), critically depends on the choice of the exploration policy. Existing exploration methods are mostly based on adding noise to the on-going actor policy and can only explore local regions close to what the actor policy dictates. In this work, we develop a simple meta-policy gradient algorithm that allows us to adaptively learn the exploration policy in DDPG. Our algorithm allows us to train flexible exploration behaviors that are independent of the actor policy, yielding a global exploration that significantly speeds up the learning process. With an extensive study, we show that our method significantly improves the sample-efficiency of DDPG on a variety of reinforcement learning continuous control tasks. |
Tasks | Continuous Control, Q-Learning |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1939 |
http://proceedings.mlr.press/v80/xu18d/xu18d.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-explore-via-meta-policy-gradient |
Repo | |
Framework | |
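A heavily simplified sketch of the meta-gradient idea from the abstract above (not the paper's full algorithm): the exploration policy is a Gaussian over a noise scale with a learnable parameter, updated by REINFORCE using the actor's improvement as the meta-reward. `measure_actor_improvement` is a hypothetical placeholder; in DDPG it would be the performance gain of the actor after training on the data collected with that exploration noise.

```python
# REINFORCE-style meta-update of an exploration-noise policy.
import torch

log_sigma = torch.zeros(1, requires_grad=True)      # exploration-policy parameter
opt = torch.optim.Adam([log_sigma], lr=1e-2)
baseline = 0.0

def measure_actor_improvement(noise_scale: float) -> float:
    # Hypothetical stand-in for the actor's post-update performance gain.
    return float(-(noise_scale - 0.3) ** 2)          # pretend 0.3 is a good scale

for step in range(200):
    dist = torch.distributions.Normal(0.0, log_sigma.exp())
    noise = dist.sample()                            # sample an exploration noise
    meta_reward = measure_actor_improvement(abs(float(noise)))
    advantage = meta_reward - baseline
    baseline = 0.9 * baseline + 0.1 * meta_reward    # running baseline
    # REINFORCE: raise the log-probability of noise that helped the actor.
    loss = (-dist.log_prob(noise) * advantage).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("learned exploration scale:", float(log_sigma.exp()))
```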
Language-Based Automatic Assessment of Cognitive and Communicative Functions Related to Parkinson’s Disease
Title | Language-Based Automatic Assessment of Cognitive and Communicative Functions Related to Parkinson’s Disease |
Authors | Lesley Jessiman, Gabriel Murray, McKenzie Braley |
Abstract | We explore the use of natural language processing and machine learning for detecting evidence of Parkinson's disease from transcribed speech of subjects who are describing everyday tasks. Experiments reveal the difficulty of treating this as a binary classification task, and a multi-class approach yields superior results. We also show that these models can be used to predict cognitive abilities across all subjects. |
Tasks | |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4107/ |
https://www.aclweb.org/anthology/W18-4107 | |
PWC | https://paperswithcode.com/paper/language-based-automatic-assessment-of |
Repo | |
Framework | |
Mean-Variance Loss for Deep Age Estimation From a Face
Title | Mean-Variance Loss for Deep Age Estimation From a Face |
Authors | Hongyu Pan, Hu Han, Shiguang Shan, Xilin Chen |
Abstract | Age estimation has broad application prospects in many fields, such as video surveillance, social networking, and human-computer interaction. However, many published age estimation approaches simply treat age estimation as an exact age regression problem, and thus do not leverage a distribution's robustness in representing labels with ambiguity, such as ages. In this paper, we propose a new loss function, called mean-variance loss, for robust age estimation via distribution learning. Specifically, the mean-variance loss consists of a mean loss, which penalizes the difference between the mean of the estimated age distribution and the ground-truth age, and a variance loss, which penalizes the variance of the estimated age distribution to ensure a concentrated distribution. The proposed mean-variance loss and softmax loss are embedded jointly into Convolutional Neural Networks (CNNs) for age estimation, and the network weights are optimized via stochastic gradient descent (SGD) in an end-to-end manner. Experimental results on a number of challenging face aging databases (FG-NET, MORPH Album II, and CLAP2016) show that the proposed approach outperforms state-of-the-art methods by a large margin using a single model. |
Tasks | Age Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Pan_Mean-Variance_Loss_for_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Pan_Mean-Variance_Loss_for_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/mean-variance-loss-for-deep-age-estimation |
Repo | |
Framework | |
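A sketch of the mean-variance loss as described in the abstract above: the softmax output over age bins is treated as a distribution whose mean should match the true age and whose variance should be small, combined with a standard softmax loss. The weighting between the terms is an assumed hyperparameter.

```python
# Mean-variance loss over an age distribution, combined with cross-entropy.
import torch
import torch.nn.functional as F

def mean_variance_loss(logits, true_ages, lambda_var=0.05):
    # logits: (batch, n_ages); true_ages: (batch,) float ground-truth ages
    probs = F.softmax(logits, dim=1)
    ages = torch.arange(logits.size(1), dtype=probs.dtype)      # age bin values
    mean = (probs * ages).sum(dim=1)                             # distribution mean
    variance = (probs * (ages - mean.unsqueeze(1)) ** 2).sum(dim=1)
    mean_loss = 0.5 * (mean - true_ages) ** 2                    # penalize mean error
    return (mean_loss + lambda_var * variance).mean()            # penalize spread

logits = torch.randn(4, 101)                 # ages 0..100
true_ages = torch.tensor([23.0, 41.0, 7.0, 65.0])
loss = mean_variance_loss(logits, true_ages) + F.cross_entropy(
    logits, true_ages.long())                # jointly with the softmax loss
print(float(loss))
```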
Exploring Semantic Properties of Sentence Embeddings
Title | Exploring Semantic Properties of Sentence Embeddings |
Authors | Xunjie Zhu, Tingfeng Li, Gerard de Melo |
Abstract | Neural vector representations are ubiquitous throughout all subfields of NLP. While word vectors have been studied in much detail, thus far only little light has been shed on the properties of sentence embeddings. In this paper, we assess to what extent prominent sentence embedding methods exhibit select semantic properties. We propose a framework that generates triplets of sentences to explore how changes in the syntactic structure or semantics of a given sentence affect the similarities obtained between their sentence embeddings. |
Tasks | Machine Translation, Reading Comprehension, Semantic Textual Similarity, Sentence Embedding, Sentence Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2100/ |
https://www.aclweb.org/anthology/P18-2100 | |
PWC | https://paperswithcode.com/paper/exploring-semantic-properties-of-sentence |
Repo | |
Framework | |
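A toy sketch of the probing setup described above: for a triplet consisting of an original sentence, a meaning-preserving variant, and a meaning-changing variant, compare cosine similarities of their embeddings. `embed` is a hypothetical placeholder for any sentence-embedding model; with a real encoder, a semantically faithful embedding should rank the meaning-preserving variant closer.

```python
# Triplet-based probing of sentence embeddings via cosine similarity.
import numpy as np

rng = np.random.default_rng(0)

def embed(sentence: str) -> np.ndarray:
    # Hypothetical stand-in; in practice a real sentence encoder would be used.
    return rng.normal(size=300)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

original     = "the cat chased the mouse"
same_meaning = "the mouse was chased by the cat"      # passivized variant
new_meaning  = "the mouse chased the cat"             # arguments swapped

e0, e1, e2 = embed(original), embed(same_meaning), embed(new_meaning)
# A semantically faithful embedding should give sim(e0, e1) > sim(e0, e2).
print(cosine(e0, e1), cosine(e0, e2))
```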