Paper Group NANR 195
Proceedings of the 1st Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI 2019). Data Augmentation Based on Distributed Expressions in Text Classification Tasks. "Going on a vacation" takes longer than
"Going for a walk": A Study of Temporal Commonsense Understanding. Multilingual Unsupervised NM …
Proceedings of the 1st Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI 2019)
Title | Proceedings of the 1st Workshop on Interactive Natural Language Technology for Explainable Artificial Intelligence (NL4XAI 2019) |
Authors | |
Abstract | |
Tasks | |
Published | 2019-01-01 |
URL | https://www.aclweb.org/anthology/W19-8400/ |
https://www.aclweb.org/anthology/W19-8400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-1st-workshop-on-2 |
Repo | |
Framework | |
Data Augmentation Based on Distributed Expressions in Text Classification Tasks
Title | Data Augmentation Based on Distributed Expressions in Text Classification Tasks |
Authors | Yu Sugawara |
Abstract | |
Tasks | Data Augmentation, Text Classification |
Published | 2019-10-01 |
URL | https://www.aclweb.org/anthology/W19-8304/ |
https://www.aclweb.org/anthology/W19-8304 | |
PWC | https://paperswithcode.com/paper/data-augmentation-based-on-distributed |
Repo | |
Framework | |
"Going on a vacation" takes longer than
"Going for a walk": A Study of Temporal Commonsense Understanding
Title | "Going on a vacation" takes longer than "Going for a walk": A Study of Temporal Commonsense Understanding |
Authors | Ben Zhou, Daniel Khashabi, Qiang Ning, Dan Roth |
Abstract | Understanding time is crucial for understanding events expressed in natural language. Because people rarely say the obvious, it is often necessary to have commonsense knowledge about various temporal aspects of events, such as duration, frequency, and temporal order. However, this important problem has so far received limited attention. This paper systematically studies this temporal commonsense problem. Specifically, we define five classes of temporal commonsense, and use crowdsourcing to develop a new dataset, MCTACO, that serves as a test set for this task. We find that the best current methods used on MCTACO are still far behind human performance, by about 20%, and discuss several directions for improvement. We hope that the new dataset and our study here can foster more future research on this topic. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1332/ |
https://www.aclweb.org/anthology/D19-1332 | |
PWC | https://paperswithcode.com/paper/going-on-a-vacation-takes-longer-than-going-1 |
Repo | |
Framework | |
Multilingual Unsupervised NMT using Shared Encoder and Language-Specific Decoders
Title | Multilingual Unsupervised NMT using Shared Encoder and Language-Specific Decoders |
Authors | Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, Pushpak Bhattacharyya |
Abstract | In this paper, we propose a multilingual unsupervised NMT scheme which jointly trains multiple languages with a shared encoder and multiple decoders. Our approach is based on denoising autoencoding of each language and back-translating between English and multiple non-English languages. This results in a universal encoder which can encode any language participating in training into an inter-lingual representation, and language-specific decoders. Our experiments using only monolingual corpora show that the multilingual unsupervised model performs better than separately trained bilingual models, achieving an improvement of up to 1.48 BLEU points on WMT test sets. We also observe that even if we do not train the network for all possible translation directions, the network is still able to translate in a many-to-many fashion, leveraging the encoder's ability to generate an interlingual representation. |
Tasks | Denoising |
Published | 2019-07-01 |
URL | https://www.aclweb.org/anthology/P19-1297/ |
https://www.aclweb.org/anthology/P19-1297 | |
PWC | https://paperswithcode.com/paper/multilingual-unsupervised-nmt-using-shared |
Repo | |
Framework | |
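The training recipe above rests on denoising autoencoding: each language's encoder-decoder learns to reconstruct a sentence from a corrupted copy of itself. A minimal sketch of the word-drop-and-local-shuffle noise model commonly used for this objective in unsupervised NMT (the function name and parameter values are illustrative, not taken from the paper):

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_k=3, seed=0):
    """Corrupt a sentence for denoising autoencoding: randomly drop
    words, then locally shuffle the survivors so each token moves at
    most shuffle_k positions from its original slot."""
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() > drop_prob]
    # Sort by (index + small random offset): a bounded local shuffle.
    keys = [i + rng.uniform(0, shuffle_k) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]
```

The denoising loss is then the reconstruction loss of decoding the original `tokens` from `add_noise(tokens)` through the shared encoder and the language's own decoder.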
Efficient Rematerialization for Deep Networks
Title | Efficient Rematerialization for Deep Networks |
Authors | Ravi Kumar, Manish Purohit, Zoya Svitkina, Erik Vee, Joshua Wang |
Abstract | When training complex neural networks, memory usage can be an important bottleneck. The question of when to rematerialize, i.e., to recompute intermediate values rather than retaining them in memory, becomes critical to achieving the best time and space efficiency. In this work we consider the rematerialization problem and devise efficient algorithms that use structural characterizations of computation graphs—treewidth and pathwidth—to obtain provably efficient rematerialization schedules. Our experiments demonstrate the performance of these algorithms on many common deep learning models. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9653-efficient-rematerialization-for-deep-networks |
http://papers.nips.cc/paper/9653-efficient-rematerialization-for-deep-networks.pdf | |
PWC | https://paperswithcode.com/paper/efficient-rematerialization-for-deep-networks |
Repo | |
Framework | |
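The time/space trade-off studied above is easiest to see on a simple chain of layers: keep only every k-th activation during the forward pass and recompute the rest from the nearest checkpoint when needed. A toy sketch of that idea (not the paper's treewidth/pathwidth algorithms, which handle general computation graphs):

```python
def forward_with_checkpoints(x0, fs, every):
    """Run a chain of functions, retaining only the input and every
    `every`-th activation instead of all intermediates."""
    ckpts = {0: x0}
    x = x0
    for i, f in enumerate(fs, start=1):
        x = f(x)
        if i % every == 0:
            ckpts[i] = x
    return ckpts

def rematerialize(ckpts, fs, i):
    """Recompute activation i from the nearest stored checkpoint,
    trading extra forward computation for lower peak memory."""
    j = max(k for k in ckpts if k <= i)
    x = ckpts[j]
    for f in fs[j:i]:
        x = f(x)
    return x
```

With checkpoints every k layers on an n-layer chain, storage drops from n activations to roughly n/k, at the cost of at most k-1 extra function evaluations per rematerialized value.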
Modelling linguistic vagueness and uncertainty in historical texts
Title | Modelling linguistic vagueness and uncertainty in historical texts |
Authors | Cristina Vertan |
Abstract | Many applications in Digital Humanities (DH) rely on annotations of the raw material. These annotations (inferred automatically or done manually) assume that labelled facts are either true or false, so all inferences drawn from such annotations use Boolean logic. This contradicts the hermeneutic principles used in the humanities, in which most knowledge has a degree of truth that varies depending on the experience and world knowledge of the interpreter. In this paper we show how uncertainty and vagueness, two main features of any historical text, can be encoded in annotations and thus be considered by DH applications. |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-9007/ |
https://www.aclweb.org/anthology/W19-9007 | |
PWC | https://paperswithcode.com/paper/modelling-linguistic-vagueness-and |
Repo | |
Framework | |
Understanding the Evolution of Circular Economy through Language Change
Title | Understanding the Evolution of Circular Economy through Language Change |
Authors | Sampriti Mahanty, Frank Boons, Julia Handl, Riza Theresa Batista-Navarro |
Abstract | In this study, we propose to focus on understanding the evolution of a specific scientific concept, that of Circular Economy (CE), by analysing how the language used in academic discussions has changed semantically. It is worth noting that the meaning and central theme of this concept has remained the same; however, we hypothesise that it has undergone semantic change by way of additional layers being added to the concept. We have shown that semantic change in language is a reflection of shifts in scientific ideas, which in turn help explain the evolution of a concept. Focusing on the CE concept, our analysis demonstrated that the change over time in the language used in academic discussions of CE is indicative of the way in which the concept evolved and expanded. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4731/ |
https://www.aclweb.org/anthology/W19-4731 | |
PWC | https://paperswithcode.com/paper/understanding-the-evolution-of-circular |
Repo | |
Framework | |
Proceedings of the 13th International Conference on Computational Semantics - Short Papers
Title | Proceedings of the 13th International Conference on Computational Semantics - Short Papers |
Authors | |
Abstract | |
Tasks | |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0500/ |
https://www.aclweb.org/anthology/W19-0500 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-13th-international-5 |
Repo | |
Framework | |
Aspect-Level Sentiment Analysis Via Convolution over Dependency Tree
Title | Aspect-Level Sentiment Analysis Via Convolution over Dependency Tree |
Authors | Kai Sun, Richong Zhang, Samuel Mensah, Yongyi Mao, Xudong Liu |
Abstract | We propose a method based on neural networks to identify the sentiment polarity of opinion words expressed on a specific aspect of a sentence. Although a large majority of works typically focus on leveraging the expressive power of neural networks in handling this task, we explore the possibility of integrating dependency trees with neural networks for representation learning. To this end, we present a convolution over a dependency tree (CDT) model which exploits a Bi-directional Long Short Term Memory (Bi-LSTM) to learn representations for features of a sentence, and further enhance the embeddings with a graph convolutional network (GCN) which operates directly on the dependency tree of the sentence. Our approach propagates both contextual and dependency information from opinion words to aspect words, offering discriminative properties for supervision. Experimental results rank our approach as the new state of the art in aspect-based sentiment classification. |
Tasks | Representation Learning, Sentiment Analysis |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1569/ |
https://www.aclweb.org/anthology/D19-1569 | |
PWC | https://paperswithcode.com/paper/aspect-level-sentiment-analysis-via |
Repo | |
Framework | |
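The GCN component described above amounts to propagating each word's features along dependency arcs. A minimal sketch of one graph-convolution layer over a dependency tree's adjacency matrix (variable names are illustrative; in the paper's model the input features would be Bi-LSTM states rather than raw embeddings):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution layer over a dependency tree: each word
    aggregates its neighbours' features (plus its own, via self-loops),
    normalised by node degree, followed by a ReLU."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # degree normalisation
    return np.maximum(0.0, D_inv @ A_hat @ H @ W)
```

Here `H` is the (words x features) matrix, `A` the symmetric adjacency matrix of the dependency tree, and `W` the layer's learned weights; stacking such layers lets information flow several dependency hops, e.g. from an opinion word to its aspect word.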
Generalizable Person Re-Identification by Domain-Invariant Mapping Network
Title | Generalizable Person Re-Identification by Domain-Invariant Mapping Network |
Authors | Jifei Song, Yongxin Yang, Yi-Zhe Song, Tao Xiang, Timothy M. Hospedales |
Abstract | We aim to learn a domain generalizable person re-identification (ReID) model. When such a model is trained on a set of source domains (ReID datasets collected from different camera networks), it can be directly applied to any new unseen dataset for effective ReID without any model updating. Despite its practical value in real-world deployments, generalizable ReID has seldom been studied. In this work, a novel deep ReID model termed Domain-Invariant Mapping Network (DIMN) is proposed. DIMN is designed to learn a mapping between a person image and its identity classifier, i.e., it produces a classifier using a single shot. To make the model domain-invariant, we follow a meta-learning pipeline and sample a subset of source domain training tasks during each training episode. However, the model is significantly different from conventional meta-learning methods in that: (1) no model updating is required for the target domain, (2) different training tasks share a memory bank for maintaining both scalability and discrimination ability, and (3) it can be used to match an arbitrary number of identities in a target domain. Extensive experiments on a newly proposed large-scale ReID domain generalization benchmark show that our DIMN significantly outperforms alternative domain generalization or meta-learning methods. |
Tasks | Domain Generalization, Meta-Learning, Person Re-Identification |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Song_Generalizable_Person_Re-Identification_by_Domain-Invariant_Mapping_Network_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Song_Generalizable_Person_Re-Identification_by_Domain-Invariant_Mapping_Network_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/generalizable-person-re-identification-by |
Repo | |
Framework | |
Identifying Causal Effects via Context-specific Independence Relations
Title | Identifying Causal Effects via Context-specific Independence Relations |
Authors | Santtu Tikka, Antti Hyttinen, Juha Karvanen |
Abstract | Causal effect identification considers whether an interventional probability distribution can be uniquely determined from a passively observed distribution in a given causal structure. If the generating system induces context-specific independence (CSI) relations, the existing identification procedures and criteria based on do-calculus are inherently incomplete. We show that deciding causal effect non-identifiability is NP-hard in the presence of CSIs. Motivated by this, we design a calculus and an automated search procedure for identifying causal effects in the presence of CSIs. The approach is provably sound and it includes standard do-calculus as a special case. With the approach we can obtain identifying formulas that were unobtainable previously, and demonstrate that a small number of CSI relations may be sufficient to turn a previously non-identifiable instance into an identifiable one. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/8547-identifying-causal-effects-via-context-specific-independence-relations |
http://papers.nips.cc/paper/8547-identifying-causal-effects-via-context-specific-independence-relations.pdf | |
PWC | https://paperswithcode.com/paper/identifying-causal-effects-via-context |
Repo | |
Framework | |
Korean Morphological Analysis with Tied Sequence-to-Sequence Multi-Task Model
Title | Korean Morphological Analysis with Tied Sequence-to-Sequence Multi-Task Model |
Authors | Hyun-Je Song, Seong-Bae Park |
Abstract | Korean morphological analysis has been considered as a sequence of morpheme processing and POS tagging. Thus, a pipeline model of the tasks has been adopted widely by previous studies. However, the model has a problem that it cannot utilize interactions among the tasks. This paper formulates Korean morphological analysis as a combination of the tasks and presents a tied sequence-to-sequence multi-task model for training the two tasks simultaneously without any explicit regularization. The experiments show that the proposed model achieves state-of-the-art performance. |
Tasks | Morphological Analysis |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1150/ |
https://www.aclweb.org/anthology/D19-1150 | |
PWC | https://paperswithcode.com/paper/korean-morphological-analysis-with-tied |
Repo | |
Framework | |
Finite State Transducer Calculus for Whole Word Morphology
Title | Finite State Transducer Calculus for Whole Word Morphology |
Authors | Maciej Janicki |
Abstract | The research on machine learning of morphology often involves formulating morphological descriptions directly on surface forms of words. As the established two-level morphology paradigm requires the knowledge of the underlying structure, it is not widely used in such settings. In this paper, we propose a formalism describing structural relationships between words based on theories of morphology that reject the notions of internal word structure and morpheme. The formalism covers a wide variety of morphological phenomena (including non-concatenative ones like stem vowel alternation) without the need of workarounds and extensions. Furthermore, we show that morphological rules formulated in this way can be easily translated to FSTs, which enables us to derive performant approaches to morphological analysis, generation and automatic rule discovery. |
Tasks | Morphological Analysis |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-3107/ |
https://www.aclweb.org/anthology/W19-3107 | |
PWC | https://paperswithcode.com/paper/finite-state-transducer-calculus-for-whole |
Repo | |
Framework | |
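The whole-word idea above relates entire surface forms without segmenting them into morphemes: a rule is a pair of patterns sharing variables, so non-concatenative alternations need no special machinery. A toy regex-based sketch of such a rule, e.g. '{A}ing' -> '{A}ang' for stem-vowel alternation (sing -> sang); the paper compiles rules like this into FSTs rather than regexes:

```python
import re

def make_rule(lhs, rhs):
    """Compile a whole-word rule: lhs and rhs are surface patterns
    sharing brace-delimited variables; applying the rule matches the
    whole word against lhs and instantiates rhs with the captures."""
    regex = re.compile('^' + re.sub(r'\{(\w+)\}', r'(?P<\1>.+)', lhs) + '$')

    def apply(word):
        m = regex.match(word)
        if m is None:
            return None  # rule does not cover this word
        return re.sub(r'\{(\w+)\}', lambda g: m.group(g.group(1)), rhs)

    return apply
```

Note that the rule mentions no stem or suffix: 'sing' and 'sang' are related directly as whole words, which is exactly the stance of the morpheme-free theories the formalism builds on.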
Deep Tensor ADMM-Net for Snapshot Compressive Imaging
Title | Deep Tensor ADMM-Net for Snapshot Compressive Imaging |
Authors | Jiawei Ma, Xiao-Yang Liu, Zheng Shou, Xin Yuan |
Abstract | Snapshot compressive imaging (SCI) systems have been developed to capture high-dimensional (> 3) signals using low-dimensional off-the-shelf sensors, i.e., mapping multiple video frames into a single measurement frame. One key module of an SCI system is an accurate decoder that recovers the original video frames. However, existing model-based decoding algorithms require exhaustive parameter tuning with prior knowledge and cannot support practical applications due to the extremely long running time. In this paper, we propose a deep tensor ADMM-Net for video SCI systems that provides high-quality decoding in seconds. Firstly, we start with a standard tensor ADMM algorithm, unfold its inference iterations into a layer-wise structure, and design a deep neural network based on tensor operations. Secondly, instead of relying on a pre-specified sparse representation domain, the network learns the domain of low-rank tensor through stochastic gradient descent. It is worth noting that the proposed deep tensor ADMM-Net has potential mathematical interpretations. On public video data, the simulation results show the proposed method achieves an average 0.8~2.5 dB improvement in PSNR and 0.07~0.1 in SSIM, and 1500x~3600x speedups over the state-of-the-art methods. On real data captured by SCI cameras, the experimental results show comparable visual results with the state-of-the-art methods but in much shorter running time. |
Tasks | |
Published | 2019-10-01 |
URL | http://openaccess.thecvf.com/content_ICCV_2019/html/Ma_Deep_Tensor_ADMM-Net_for_Snapshot_Compressive_Imaging_ICCV_2019_paper.html |
http://openaccess.thecvf.com/content_ICCV_2019/papers/Ma_Deep_Tensor_ADMM-Net_for_Snapshot_Compressive_Imaging_ICCV_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/deep-tensor-admm-net-for-snapshot-compressive |
Repo | |
Framework | |
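The "unfold iterations into layers" idea above can be illustrated with plain ADMM for a generic l1-regularised recovery objective, where each unrolled layer is one iteration; in an ADMM-Net the penalty weights and the sparsifying transform would be learned by gradient descent rather than fixed (this is a scalar/vector sketch, not the paper's tensor formulation):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_admm(y, Phi, n_layers=10, rho=1.0, lam=0.1):
    """Unrolled ADMM for min_x 0.5*||y - Phi x||^2 + lam*||x||_1.
    Each 'layer' is one ADMM iteration: a quadratic x-update, a
    soft-thresholding z-update, and a dual (u) update."""
    m, n = Phi.shape
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    Q = np.linalg.inv(Phi.T @ Phi + rho * np.eye(n))  # shared across layers
    for _ in range(n_layers):
        x = Q @ (Phi.T @ y + rho * (z - u))
        z = soft_threshold(x + u, lam / rho)
        u = u + x - z
    return z
```

Fixing `n_layers` to a small constant and making `rho`, `lam` (and the thresholding domain) trainable per layer is what turns the iterative solver into a feed-forward network with fast, constant-time inference.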
Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data
Title | Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data |
Authors | Denis Peskov, Nancy Clarke, Jason Krone, Brigi Fodor, Yi Zhang, Adel Youssef, Mona Diab |
Abstract | The need for high-quality, large-scale, goal-oriented dialogue datasets continues to grow as virtual assistants become increasingly widespread. However, publicly available datasets useful for this area are limited either in their size, linguistic diversity, domain coverage, or annotation granularity. In this paper, we present strategies toward curating and annotating large scale goal oriented dialogue data. We introduce the MultiDoGO dataset to overcome these limitations. With a total of over 81K dialogues harvested across six domains, MultiDoGO is over 8 times the size of MultiWOZ, the other largest comparable dialogue dataset currently available to the public. Over 54K of these harvested conversations are annotated for intent classes and slot labels. We adopt a Wizard-of-Oz approach wherein a crowd-sourced worker (the "customer") is paired with a trained annotator (the "agent"). The data curation process was controlled via biases to ensure a diversity in dialogue flows following variable dialogue policies. We provide distinct class label tags for agent vs. customer utterances, along with applicable slot labels. We also compare and contrast our strategies on annotation granularity, i.e. turn vs. sentence level. Furthermore, we compare and contrast annotations curated by leveraging professional annotators vs. the crowd. We believe our strategies for eliciting and annotating such a dialogue dataset scale across modalities and domains, and potentially languages in the future. To demonstrate the efficacy of our devised strategies we establish neural baselines for classification on the agent and customer utterances as well as slot labeling for each domain. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1460/ |
https://www.aclweb.org/anthology/D19-1460 | |
PWC | https://paperswithcode.com/paper/multi-domain-goal-oriented-dialogues |
Repo | |
Framework | |