January 24, 2020


Paper Group NANR 215

Exploring Diachronic Changes of Biomedical Knowledge using Distributed Concept Representations. A Quantum-Like Approach to Word Sense Disambiguation. Implementation of a Chomsky-Schützenberger n-best parser for weighted multiple context-free grammars. On Learning Heteroscedastic Noise Models within Differentiable Bayes Filters. Multi-Level Sentim …

Exploring Diachronic Changes of Biomedical Knowledge using Distributed Concept Representations

Title Exploring Diachronic Changes of Biomedical Knowledge using Distributed Concept Representations
Authors Gaurav Vashisth, Jan-Niklas Voigt-Antons, Michael Mikhailov, Roland Roller
Abstract In research, best practices can change over time as new discoveries are made and novel methods are implemented. Scientific publications reporting the latest facts and the current state of the art may be outdated after some years, or even proved false. A publication usually sheds light only on the knowledge of the period in which it was published. Thus, the aspect of time can play an essential role in the reliability of the presented information. In Natural Language Processing, many methods focus on information extraction from text, such as detecting entities and their relationships to each other. Those methods mostly focus on the facts presented in the text itself and not on the aspects of knowledge that change over time. This work instead examines the evolution of biomedical knowledge over time, in terms of diachronic change, using scientific literature. In particular, the use of temporal and distributional concept representations is explored and evaluated in a proof of concept.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5037/
PDF https://www.aclweb.org/anthology/W19-5037
PWC https://paperswithcode.com/paper/exploring-diachronic-changes-of-biomedical
Repo
Framework

A Quantum-Like Approach to Word Sense Disambiguation

Title A Quantum-Like Approach to Word Sense Disambiguation
Authors Fabio Tamburini
Abstract This paper presents a novel algorithm for Word Sense Disambiguation (WSD) based on Quantum Probability Theory. The Quantum WSD algorithm requires concept representations as vectors in the complex domain, and thus we have developed a technique for computing complex word and sentence embeddings based on the Paragraph Vectors algorithm. Although the proposed method is quite simple and does not require long training phases, it exhibits state-of-the-art (SOTA) performance when evaluated on a standardized benchmark for this task.
Tasks Sentence Embeddings, Word Sense Disambiguation
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-1135/
PDF https://www.aclweb.org/anthology/R19-1135
PWC https://paperswithcode.com/paper/a-quantum-like-approach-to-word-sense
Repo
Framework

Implementation of a Chomsky-Schützenberger n-best parser for weighted multiple context-free grammars

Title Implementation of a Chomsky-Schützenberger n-best parser for weighted multiple context-free grammars
Authors Thomas Ruprecht, Tobias Denkinger
Abstract Constituent parsing has been studied extensively in the last decades. Chomsky-Schützenberger parsing as an approach to constituent parsing has so far only been investigated theoretically. It uses the decomposition of a language into a regular language, a homomorphism, and a bracket language to divide the parsing problem into simpler subproblems. We provide the first implementation of Chomsky-Schützenberger parsing. It employs multiple context-free grammars and incorporates many refinements to achieve feasibility. We compare its performance to state-of-the-art grammar-based parsers.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1016/
PDF https://www.aclweb.org/anthology/N19-1016
PWC https://paperswithcode.com/paper/implementation-of-a-chomsky-schutzenberger-n
Repo
Framework
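
The decomposition mentioned in the abstract above can be illustrated on a toy case. The following is a minimal sketch of the classical Chomsky-Schützenberger idea for a plain context-free language, not the paper's parser for weighted multiple context-free grammars. The language a^n b^n is obtained as the image under a homomorphism of the intersection of a regular language and a bracket (Dyck) language. All names here (`in_dyck`, `REGULAR`, `HOMOMORPHISM`, `generates`) are hypothetical and chosen for illustration only.

```python
import re

# Hypothetical toy setup: one bracket pair, regular language "(*)*",
# homomorphism "(" -> "a", ")" -> "b".
PAIRS = {"(": ")"}
REGULAR = re.compile(r"\(*\)*")
HOMOMORPHISM = {"(": "a", ")": "b"}


def in_dyck(word: str) -> bool:
    """Return True if every bracket in `word` is properly matched."""
    closers = set(PAIRS.values())
    stack = []
    for ch in word:
        if ch in PAIRS:
            stack.append(PAIRS[ch])      # remember the expected closer
        elif ch in closers:
            if not stack or stack.pop() != ch:
                return False             # unmatched or wrong closer
        else:
            return False                 # symbol outside the bracket alphabet
    return not stack                     # no opener may remain unmatched


def generates(bracket_word: str):
    """Map a bracket word to its image if it lies in REGULAR and the Dyck language."""
    if REGULAR.fullmatch(bracket_word) and in_dyck(bracket_word):
        return "".join(HOMOMORPHISM[ch] for ch in bracket_word)
    return None


print(generates("(())"))   # in both languages, image is a^2 b^2
print(generates("()()"))   # well-bracketed but rejected by the regular language
```

The two membership checks are independent and individually simple, which is the sense in which the decomposition splits the problem into simpler subproblems.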

On Learning Heteroscedastic Noise Models within Differentiable Bayes Filters

Title On Learning Heteroscedastic Noise Models within Differentiable Bayes Filters
Authors Alina Kloss, Jeannette Bohg
Abstract In many robotic applications, it is crucial to maintain a belief about the state of a system, like the location of a robot or the pose of an object. These state estimates serve as input for planning and decision making and provide feedback during task execution. Recursive Bayesian Filtering algorithms address the state estimation problem, but they require a model of the process dynamics and the sensory observations as well as noise estimates that quantify the accuracy of these models. Recently, multiple works have demonstrated that the process and sensor models can be learned by end-to-end training through differentiable versions of Recursive Filtering methods. However, even if the predictive models are known, finding suitable noise models remains challenging. Therefore, many practical applications rely on very simplistic noise models. Our hypothesis is that end-to-end training through differentiable Bayesian Filters enables us to learn more complex heteroscedastic noise models for the system dynamics. We evaluate learning such models with different types of filtering algorithms and on two different robotic tasks. Our experiments show that especially for sampling-based filters like the Particle Filter, learning heteroscedastic noise models can drastically improve the tracking performance in comparison to using constant noise models.
Tasks Decision Making
Published 2019-05-01
URL https://openreview.net/forum?id=BylBns0qtX
PDF https://openreview.net/pdf?id=BylBns0qtX
PWC https://paperswithcode.com/paper/on-learning-heteroscedastic-noise-models
Repo
Framework
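
To make the distinction between constant and heteroscedastic noise concrete, here is a minimal sketch of a one-dimensional Kalman filter whose process noise variance is a function of the current state estimate. It is not the authors' implementation: in the paper the noise models are learned end-to-end, whereas the function `hetero_q` below is hand-written, and all names and numbers are hypothetical.

```python
from typing import Callable, List, Tuple


def kalman_1d(
    observations: List[float],
    process_var: Callable[[float], float],
    obs_var: float,
    x0: float = 0.0,
    p0: float = 1.0,
) -> List[Tuple[float, float]]:
    """Run a 1D Kalman filter with random-walk dynamics.

    `process_var(x)` returns the process noise variance for state estimate x,
    so a constant function gives the usual homoscedastic filter.
    """
    x, p = x0, p0
    track = []
    for z in observations:
        # Predict: the mean is unchanged, uncertainty grows by the process noise.
        p = p + process_var(x)
        # Update: blend prediction and observation using the Kalman gain.
        gain = p / (p + obs_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        track.append((x, p))
    return track


def constant_q(x: float) -> float:
    """Constant (homoscedastic) process noise; value is an arbitrary assumption."""
    return 0.01


def hetero_q(x: float) -> float:
    """State-dependent (heteroscedastic) process noise; form is an arbitrary assumption."""
    return 0.01 + 0.5 * x * x


obs = [0.0, 0.1, 2.0, 2.1, 1.9]   # made-up observations with a jump
const_track = kalman_1d(obs, constant_q, obs_var=0.5)
hetero_track = kalman_1d(obs, hetero_q, obs_var=0.5)
print(const_track[-1], hetero_track[-1])
```

With the state-dependent variance, the predicted uncertainty and therefore the gain grow once the estimate moves away from zero, so the filter follows the jump in the observations more quickly than the constant-noise variant.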

Multi-Level Sentiment Analysis of PolEmo 2.0: Extended Corpus of Multi-Domain Consumer Reviews

Title Multi-Level Sentiment Analysis of PolEmo 2.0: Extended Corpus of Multi-Domain Consumer Reviews
Authors Jan Kocoń, Piotr Miłkowski, Monika Zaśko-Zielińska
Abstract In this article we present an extended version of PolEmo – a corpus of consumer reviews from 4 domains: medicine, hotels, products and school. Current version (PolEmo 2.0) contains 8,216 reviews having 57,466 sentences. Each text and sentence was manually annotated with sentiment in 2+1 scheme, which gives a total of 197,046 annotations. We obtained a high value of Positive Specific Agreement, which is 0.91 for texts and 0.88 for sentences. PolEmo 2.0 is publicly available under a Creative Commons copyright license. We explored recent deep learning approaches for the recognition of sentiment, such as Bi-directional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT).
Tasks Sentiment Analysis
Published 2019-11-01
URL https://www.aclweb.org/anthology/K19-1092/
PDF https://www.aclweb.org/anthology/K19-1092
PWC https://paperswithcode.com/paper/multi-level-sentiment-analysis-of-polemo-20
Repo
Framework
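
The agreement measure named in the abstract above can be computed as follows. This is a minimal sketch assuming the standard two-annotator definition of Positive Specific Agreement, PSA = 2a / (2a + b + c), where a counts items both annotators gave the target label and b and c count items only one of them did. The abstract does not spell out the exact computation used for PolEmo 2.0, and the counts below are hypothetical, not corpus figures.

```python
def positive_specific_agreement(both_pos: int, only_a: int, only_b: int) -> float:
    """Return PSA = 2a / (2a + b + c) for one target label and two annotators."""
    denom = 2 * both_pos + only_a + only_b
    if denom == 0:
        raise ValueError("neither annotator assigned the target label")
    return 2 * both_pos / denom


# Hypothetical counts for illustration only.
psa = positive_specific_agreement(both_pos=80, only_a=10, only_b=6)
print(round(psa, 3))
```

Unlike raw agreement, this measure ignores items on which both annotators agree that the label does not apply, which matters when one label is rare.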

D-GAN: Divergent generative adversarial network for positive unlabeled learning and counter-examples generation

Title D-GAN: Divergent generative adversarial network for positive unlabeled learning and counter-examples generation
Authors Florent Chiaroni, Mohamed-Cherif Rahal, Nicolas Hueber, Frédéric Dufaux
Abstract Positive Unlabeled (PU) learning consists in learning to distinguish samples of our class of interest, the positive class, from the counter-examples, the negative class, by using positive labeled and unlabeled samples during training. Recent approaches exploit the ability of GANs to address the PU learning problem by generating relevant counter-examples. In this paper, we propose a new GAN-based PU learning approach named Divergent-GAN (D-GAN). The key idea is to incorporate a standard Positive Unlabeled learning risk inside the GAN discriminator loss function. In this way, the discriminator can ask the generator to converge towards the unlabeled samples distribution while diverging from the positive samples distribution. This enables the generator to converge towards the distribution of the unlabeled counter-examples without using prior knowledge, while keeping the standard adversarial GAN architecture. In addition, we discuss normalization techniques in the context of the proposed framework. Experimental results show that the proposed approach overcomes the issues of previous GAN-based PU learning methods, and that it globally outperforms two-stage state-of-the-art PU learning methods in terms of stability and prediction on both simple and complex image datasets.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=SJldZ2RqFX
PDF https://openreview.net/pdf?id=SJldZ2RqFX
PWC https://paperswithcode.com/paper/d-gan-divergent-generative-adversarial
Repo
Framework

Reversing Gradients in Adversarial Domain Adaptation for Question Deduplication and Textual Entailment Tasks

Title Reversing Gradients in Adversarial Domain Adaptation for Question Deduplication and Textual Entailment Tasks
Authors Anush Kamath, Sparsh Gupta, Vitor Carvalho
Abstract Adversarial domain adaptation has been recently proposed as an effective technique for textual matching tasks, such as question deduplication. Here we investigate the use of gradient reversal on adversarial domain adaptation to explicitly learn both shared and unshared (domain specific) representations between two textual domains. In doing so, gradient reversal learns features that explicitly compensate for domain mismatch, while still distilling domain specific knowledge that can improve target domain accuracy. We evaluate reversing gradients for adversarial adaptation on multiple domains, and demonstrate that it significantly outperforms other methods on question deduplication as well as on recognizing textual entailment (RTE) tasks, achieving up to 7% absolute boost in base model accuracy on some datasets.
Tasks Domain Adaptation, Natural Language Inference
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1556/
PDF https://www.aclweb.org/anthology/P19-1556
PWC https://paperswithcode.com/paper/reversing-gradients-in-adversarial-domain
Repo
Framework
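
The core mechanism behind gradient reversal can be shown without any deep learning framework. The sketch below assumes the usual formulation of a gradient reversal layer: it is the identity in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass. The class name `GradientReversal`, the scaling factor `lam` and the example numbers are hypothetical; a real model would register this behavior with its framework's automatic differentiation rather than call `backward` by hand.

```python
from typing import List


class GradientReversal:
    """Identity on the forward pass, sign-flipped and scaled gradient on the backward pass."""

    def __init__(self, lam: float = 1.0) -> None:
        self.lam = lam

    def forward(self, features: List[float]) -> List[float]:
        # Features pass through unchanged to the domain classifier.
        return list(features)

    def backward(self, grad_output: List[float]) -> List[float]:
        # The encoder receives the negated gradient of the domain loss,
        # so it is pushed to make domains harder to tell apart.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [1.0, -2.0]                     # made-up encoder output
passed = grl.forward(features)
grad_from_domain_loss = [0.4, -0.2]        # made-up gradient from a domain classifier
grad_to_encoder = grl.backward(grad_from_domain_loss)
print(passed, grad_to_encoder)
```

Because only the gradient is altered, the domain classifier still trains normally while the shared encoder is optimized against it.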

Deep Neural Model Inspection and Comparison via Functional Neuron Pathways

Title Deep Neural Model Inspection and Comparison via Functional Neuron Pathways
Authors James Fiacco, Samridhi Choudhary, Carolyn Rose
Abstract We introduce a general method for the interpretation and comparison of neural models. The method is used to factor a complex neural model into its functional components, which are comprised of sets of co-firing neurons that cut across layers of the network architecture, and which we call neural pathways. The function of these pathways can be understood by identifying correlated task level and linguistic heuristics in such a way that this knowledge acts as a lens for approximating what the network has learned to apply to its intended task. As a case study for investigating the utility of these pathways, we present an examination of pathways identified in models trained for two standard tasks, namely Named Entity Recognition and Recognizing Textual Entailment.
Tasks Named Entity Recognition, Natural Language Inference
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1575/
PDF https://www.aclweb.org/anthology/P19-1575
PWC https://paperswithcode.com/paper/deep-neural-model-inspection-and-comparison
Repo
Framework

Spatial-Temporal Attention Res-TCN for Skeleton-Based Dynamic Hand Gesture Recognition

Title Spatial-Temporal Attention Res-TCN for Skeleton-Based Dynamic Hand Gesture Recognition
Authors Jingxuan Hou
Abstract Dynamic hand gesture recognition is a crucial yet challenging task in computer vision. The key to this task lies in an effective extraction of discriminative spatial and temporal features to model the evolutions of different gestures. In this paper, we propose an end-to-end Spatial-Temporal Attention Residual Temporal Convolutional Network (STA-Res-TCN) for skeleton-based dynamic hand gesture recognition, which learns different levels of attention and assigns them to each spatial-temporal feature extracted by the convolution filters at each time step. The proposed attention branch helps the network adaptively focus on the informative time frames and features while excluding the irrelevant ones that often bring in unnecessary noise. Moreover, our proposed STA-Res-TCN is a lightweight model that can be trained and tested in an extremely short time. Experiments on DHG-14/28 Dataset and SHREC’17 Track Dataset show that STA-Res-TCN outperforms state-of-the-art methods on both the 14 gestures setting and the more complicated 28 gestures setting.
Tasks Gesture Recognition, Hand Gesture Recognition, Hand-Gesture Recognition, Skeleton Based Action Recognition
Published 2019-01-23
URL https://link.springer.com/chapter/10.1007/978-3-030-11024-6_18
PDF https://link.springer.com/chapter/10.1007/978-3-030-11024-6_18
PWC https://paperswithcode.com/paper/spatial-temporal-attention-res-tcn-for
Repo
Framework

Mixtape: Breaking the Softmax Bottleneck Efficiently

Title Mixtape: Breaking the Softmax Bottleneck Efficiently
Authors Zhilin Yang, Thang Luong, Russ R. Salakhutdinov, Quoc V. Le
Abstract The softmax bottleneck has been shown to limit the expressiveness of neural language models. Mixture of Softmaxes (MoS) is an effective approach to address such a theoretical limitation, but is expensive compared to softmax in terms of both memory and time. We propose Mixtape, an output layer that breaks the softmax bottleneck more efficiently with three novel techniques: logit space vector gating, sigmoid tree decomposition, and gate sharing. On four benchmarks including language modeling and machine translation, the Mixtape layer substantially improves the efficiency over the MoS layer by 3.5x to 10.5x while obtaining similar or better performance. A network equipped with Mixtape is only 20% to 34% slower than a softmax-based network with 10-30K vocabulary sizes, and outperforms softmax by a large margin in perplexity and translation quality. Notably, Mixtape achieves state-of-the-art results of 29.8 BLEU on WMT’14 English-German and 43.9 BLEU on WMT’14 English-French.
Tasks Language Modelling, Machine Translation
Published 2019-12-01
URL http://papers.nips.cc/paper/9723-mixtape-breaking-the-softmax-bottleneck-efficiently
PDF http://papers.nips.cc/paper/9723-mixtape-breaking-the-softmax-bottleneck-efficiently.pdf
PWC https://paperswithcode.com/paper/mixtape-breaking-the-softmax-bottleneck
Repo
Framework
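
For readers unfamiliar with the baseline the abstract compares against, here is a minimal sketch of a Mixture of Softmaxes output: several softmax distributions over the vocabulary are combined with mixture weights that are themselves produced by a softmax. This illustrates MoS only, not the Mixtape layer or any of its three techniques. The function names and the tiny hard-coded logits are hypothetical; in a language model both the gate logits and the component logits would be computed from the context.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_softmaxes(
    gate_logits: List[float], component_logits: List[List[float]]
) -> List[float]:
    """Return sum_k pi_k * softmax(component_logits[k]) with pi = softmax(gate_logits)."""
    weights = softmax(gate_logits)
    vocab_size = len(component_logits[0])
    probs = [0.0] * vocab_size
    for weight, logits in zip(weights, component_logits):
        for i, p in enumerate(softmax(logits)):
            probs[i] += weight * p
    return probs


# Two mixture components over a made-up three-word vocabulary.
mos_probs = mixture_of_softmaxes(
    gate_logits=[0.2, -0.1],
    component_logits=[[2.0, 0.0, -1.0], [-1.0, 0.5, 1.5]],
)
print(mos_probs)
```

Each component requires its own softmax over the full vocabulary, which is the source of the extra memory and time cost the abstract refers to.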

Commonsense Inference in Natural Language Processing (COIN) - Shared Task Report

Title Commonsense Inference in Natural Language Processing (COIN) - Shared Task Report
Authors Simon Ostermann, Sheng Zhang, Michael Roth, Peter Clark
Abstract This paper reports on the results of the shared tasks of the COIN workshop at EMNLP-IJCNLP 2019. The tasks consisted of two machine comprehension evaluations, each of which tested a system's ability to answer questions/queries about a text. Both evaluations were designed such that systems need to exploit commonsense knowledge, for example, in the form of inferences over information that is available in the common ground but not necessarily mentioned in the text. A total of five participating teams submitted systems for the shared tasks, with the best submitted system achieving 90.6% accuracy and 83.7% F1-score on task 1 and task 2, respectively.
Tasks Reading Comprehension
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-6007/
PDF https://www.aclweb.org/anthology/D19-6007
PWC https://paperswithcode.com/paper/commonsense-inference-in-natural-language
Repo
Framework

IITP-MT System for Gujarati-English News Translation Task at WMT 2019

Title IITP-MT System for Gujarati-English News Translation Task at WMT 2019
Authors Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, Pushpak Bhattacharyya
Abstract We describe our submission to the WMT 2019 News translation shared task for the Gujarati-English language pair. We submit constrained systems, i.e., we rely on the data provided for this language pair and do not use any external data. We train a Transformer-based subword-level neural machine translation (NMT) system using the original parallel corpus along with a synthetic parallel corpus obtained through back-translation of monolingual data. Our primary systems achieve BLEU scores of 10.4 and 8.1 for Gujarati→English and English→Gujarati, respectively. We observe that incorporating monolingual data through back-translation improves the BLEU score significantly over baseline NMT and SMT systems for this language pair.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5346/
PDF https://www.aclweb.org/anthology/W19-5346
PWC https://paperswithcode.com/paper/iitp-mt-system-for-gujarati-english-news
Repo
Framework

Summarizing Relationships for Interactive Concept Map Browsers

Title Summarizing Relationships for Interactive Concept Map Browsers
Authors Abram Handler, Premkumar Ganeshkumar, Brendan O'Connor, Mohamed AlTantawy
Abstract Concept maps are visual summaries, structured as directed graphs: important concepts from a dataset are displayed as vertexes, and edges between vertexes show natural language descriptions of the relationships between the concepts on the map. Thus far, preliminary attempts at automatically creating concept maps have focused on building static summaries. However, in interactive settings, users will need to dynamically investigate particular relationships between pairs of concepts. For instance, a historian using a concept map browser might decide to investigate the relationship between two politicians in a news archive. We present a model which responds to such queries by returning one or more short, importance-ranked, natural language descriptions of the relationship between two requested concepts, for display in a visual interface. Our model is trained on a new public dataset, collected for this task.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5414/
PDF https://www.aclweb.org/anthology/D19-5414
PWC https://paperswithcode.com/paper/summarizing-relationships-for-interactive
Repo
Framework

Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms

Title Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms
Authors Canyi Lu, Xi Peng, Yunchao Wei
Abstract This work studies the low-rank tensor completion problem, which aims to exactly recover a low-rank tensor from partially observed entries. Our model is inspired by the recently proposed tensor-tensor product (t-product) based on any invertible linear transforms. When the linear transforms satisfy certain conditions, we deduce the new tensor tubal rank, tensor spectral norm, and tensor nuclear norm. Equipped with the tensor nuclear norm, we then solve the tensor completion problem by solving a convex program and provide the theoretical bound for the exact recovery under certain tensor incoherence conditions. The achieved sampling complexity is order-wise optimal. Our model and result greatly extend existing results in the low-rank matrix and tensor completion. Numerical experiments verify our results and the application on image recovery demonstrates the superiority of our method.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Lu_Low-Rank_Tensor_Completion_With_a_New_Tensor_Nuclear_Norm_Induced_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Lu_Low-Rank_Tensor_Completion_With_a_New_Tensor_Nuclear_Norm_Induced_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/low-rank-tensor-completion-with-a-new-tensor
Repo
Framework

Sample Complexity of Learning Mixture of Sparse Linear Regressions

Title Sample Complexity of Learning Mixture of Sparse Linear Regressions
Authors Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal
Abstract In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection. This setting is quite expressive and has been studied both in terms of practical applications and for the sake of establishing theoretical guarantees. In this paper, we consider the case where the signal vectors are sparse; this generalizes the popular compressed sensing paradigm. We improve upon the state-of-the-art results as follows: In the noisy case, we resolve an open question of Yin et al. (IEEE Transactions on Information Theory, 2019) by showing how to handle collections of more than two vectors and present the first robust reconstruction algorithm, i.e., if the signals are not perfectly sparse, we still learn a good sparse approximation of the signals. In the noiseless case, as well as in the noisy case, we show how to circumvent the need for a restrictive assumption required in the previous work. Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9239-sample-complexity-of-learning-mixture-of-sparse-linear-regressions
PDF http://papers.nips.cc/paper/9239-sample-complexity-of-learning-mixture-of-sparse-linear-regressions.pdf
PWC https://paperswithcode.com/paper/sample-complexity-of-learning-mixture-of
Repo
Framework