July 26, 2019

Paper Group NANR 181

Automated Preamble Detection in Dictated Medical Reports

Title Automated Preamble Detection in Dictated Medical Reports
Authors Wael Salloum, Greg Finley, Erik Edwards, Mark Miller, David Suendermann-Oeft
Abstract Dictated medical reports very often feature a preamble containing metainformation about the report such as patient and physician names, location and name of the clinic, date of procedure, and so on. In the medical transcription process, the preamble is usually omitted from the final report, as it contains information already available in the electronic medical record. We present a method which is able to automatically identify preambles in medical dictations. The method makes use of state-of-the-art NLP techniques including word embeddings and Bi-LSTMs and achieves preamble detection performance superior to humans.
Tasks Speech Recognition, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2336/
PDF https://www.aclweb.org/anthology/W17-2336
PWC https://paperswithcode.com/paper/automated-preamble-detection-in-dictated
Repo
Framework

ALS at IJCNLP-2017 Task 5: Answer Localization System for Multi-Choice Question Answering in Exams

Title ALS at IJCNLP-2017 Task 5: Answer Localization System for Multi-Choice Question Answering in Exams
Authors Changliang Li, Cunliang Kong
Abstract Multi-choice question answering in exams is a typical QA task. To accomplish this task, we present an answer localization method to locate answers shown in web pages, considering both structural and semantic information. Using this method as a basis, we analyze sentences and paragraphs that appear on web pages to get predictions. With this answer localization system, we get effective results on both the validation dataset and the test dataset.
Tasks Question Answering
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4033/
PDF https://www.aclweb.org/anthology/I17-4033
PWC https://paperswithcode.com/paper/als-at-ijcnlp-2017-task-5-answer-localization
Repo
Framework

Clustering of Russian Adjective-Noun Constructions using Word Embeddings

Title Clustering of Russian Adjective-Noun Constructions using Word Embeddings
Authors Andrey Kutuzov, Elizaveta Kuzmenko, Lidia Pivovarova
Abstract This paper presents a method of automatic construction extraction from a large corpus of Russian. The term 'construction' here means a multi-word expression in which a variable can be replaced with another word from the same semantic class, for example, 'a glass of [water/juice/milk]'. We deal with constructions that consist of a noun and its adjective modifier. We propose a method of grouping such constructions into semantic classes via 2-step clustering of word vectors in distributional models. We compare it with other clustering techniques and evaluate it against A Russian-English Collocational Dictionary of the Human Body that contains manually annotated groups of constructions with nouns meaning human body parts. The best performing method is used to cluster all adjective-noun bigrams in the Russian National Corpus. Results of this procedure are publicly available and can be used for building Russian construction dictionary as well as to accelerate theoretical studies of constructions.
Tasks Word Embeddings
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1402/
PDF https://www.aclweb.org/anthology/W17-1402
PWC https://paperswithcode.com/paper/clustering-of-russian-adjective-noun
Repo
Framework

Computationally efficient heart rate estimation during physical exercise using photoplethysmographic signals

Title Computationally efficient heart rate estimation during physical exercise using photoplethysmographic signals
Authors Tim Schäck, Michael Muma, Abdelhak M. Zoubir
Abstract Wearable devices that acquire photoplethysmographic (PPG) signals are becoming increasingly popular to monitor the heart rate during physical exercise. However, high accuracy and low computational complexity are conflicting requirements. We propose a method that provides highly accurate heart rate estimates at a very low computational cost in order to be implementable on wearables. To achieve the lowest possible complexity, only basic signal processing operations, i.e., correlation-based fundamental frequency estimation and spectral combination, harmonic noise damping and frequency domain tracking, are used. The proposed approach outperforms state-of-the-art methods on current benchmark data considerably in terms of computation time, while achieving a similar accuracy.
Tasks Heart rate estimation, Photoplethysmography (PPG)
Published 2017-08-28
URL https://doi.org/10.23919/EUSIPCO.2017.8081656
PDF https://www.researchgate.net/publication/319176582_Computationally_Efficient_Heart_Rate_Estimation_During_Physical_Exercise_Using_Photoplethysmographic_Signals
PWC https://paperswithcode.com/paper/computationally-efficient-heart-rate
Repo
Framework
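
The correlation-based fundamental frequency estimation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only an autocorrelation peak search on a clean synthetic signal, and it omits the spectral combination, harmonic noise damping, and frequency domain tracking steps. The function names, the 40 to 200 BPM search range, and the 50 Hz sampling rate are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def autocorrelation(signal: Sequence[float], lag: int) -> float:
    """Unnormalized autocorrelation of `signal` at a single lag."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))


def estimate_bpm(signal: Sequence[float], fs: float,
                 min_bpm: float = 40.0, max_bpm: float = 200.0) -> float:
    """Estimate heart rate from the lag that maximizes the autocorrelation.

    The search range (min_bpm, max_bpm) is an assumed plausible range,
    not a value taken from the paper.
    """
    mean = sum(signal) / len(signal)
    centered = [s - mean for s in signal]
    min_lag = int(fs * 60.0 / max_bpm)  # shortest beat period, in samples
    max_lag = int(fs * 60.0 / min_bpm)  # longest beat period, in samples
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda k: autocorrelation(centered, k))
    return 60.0 * fs / best_lag


# Synthetic "PPG" signal: a clean 2 Hz sinusoid sampled at 50 Hz for 8 s.
fs = 50.0
signal: List[float] = [math.sin(2.0 * math.pi * 2.0 * n / fs) for n in range(400)]
print(estimate_bpm(signal, fs))
```

Only multiplications and additions are needed, which is in the spirit of the low-complexity goal the abstract describes. Real PPG recordings taken during exercise contain motion artifacts that this sketch does not handle.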

Multitask Spectral Learning of Weighted Automata

Title Multitask Spectral Learning of Weighted Automata
Authors Guillaume Rabusseau, Borja Balle, Joelle Pineau
Abstract We consider the problem of estimating multiple related functions computed by weighted automata (WFA). We first present a natural notion of relatedness between WFAs by considering to which extent several WFAs can share a common underlying representation. We then introduce the model of vector-valued WFA which conveniently helps us formalize this notion of relatedness. Finally, we propose a spectral learning algorithm for vector-valued WFAs to tackle the multitask learning problem. By jointly learning multiple tasks in the form of a vector-valued WFA, our algorithm enforces the discovery of a representation space shared between tasks. The benefits of the proposed multitask approach are theoretically motivated and showcased through experiments on both synthetic and real world datasets.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6852-multitask-spectral-learning-of-weighted-automata
PDF http://papers.nips.cc/paper/6852-multitask-spectral-learning-of-weighted-automata.pdf
PWC https://paperswithcode.com/paper/multitask-spectral-learning-of-weighted
Repo
Framework

Mapping the Perfect via Translation Mining

Title Mapping the Perfect via Translation Mining
Authors Martijn van der Klis, Bert Le Bruyn, Henriëtte de Swart
Abstract Semantic analyses of the Perfect often defeat their own purpose: by restricting their attention to 'real' perfects (like the English one), they implicitly assume the Perfect has predefined meanings and usages. We turn the tables and focus on form, using data extracted from multilingual parallel corpora to automatically generate semantic maps (Haspelmath, 1997) of the sequence 'Have/Be + past participle' in five European languages (German, English, Spanish, French, Dutch). This technique, which we dub Translation Mining, has been applied before in the lexical domain (Wälchli and Cysouw, 2012) but we showcase its application at the level of the grammar.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-2080/
PDF https://www.aclweb.org/anthology/E17-2080
PWC https://paperswithcode.com/paper/mapping-the-perfect-via-translation-mining
Repo
Framework

Fake news stance detection using stacked ensemble of classifiers

Title Fake news stance detection using stacked ensemble of classifiers
Authors James Thorne, Mingjie Chen, Giorgos Myrianthous, Jiashu Pu, Xiaoxuan Wang, Andreas Vlachos
Abstract Fake news has become a hotly debated topic in journalism. In this paper, we present our entry to the 2017 Fake News Challenge which models the detection of fake news as a stance classification task that finished in 11th place on the leader board. Our entry is an ensemble system of classifiers developed by students in the context of their coursework. We show how we used the stacking ensemble method for this purpose and obtained improvements in classification accuracy exceeding each of the individual models' performance on the development data. Finally, we discuss aspects of the experimental setup of the challenge.
Tasks Fake News Detection, Natural Language Inference, Sentiment Analysis, Stance Detection
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4214/
PDF https://www.aclweb.org/anthology/W17-4214
PWC https://paperswithcode.com/paper/fake-news-stance-detection-using-stacked
Repo
Framework

The First Cross-Lingual Challenge on Recognition, Normalization, and Matching of Named Entities in Slavic Languages

Title The First Cross-Lingual Challenge on Recognition, Normalization, and Matching of Named Entities in Slavic Languages
Authors Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Abstract This paper describes the outcomes of the first challenge on multilingual named entity recognition that aimed at recognizing mentions of named entities in web documents in Slavic languages, their normalization/lemmatization, and cross-language matching. It was organised in the context of the 6th Balto-Slavic Natural Language Processing Workshop, co-located with the EACL 2017 conference. Although eleven teams signed up for the evaluation, due to the complexity of the task(s) and short time available for elaborating a solution, only two teams submitted results on time. The reported evaluation figures reflect the relatively higher level of complexity of named entity-related tasks in the context of processing texts in Slavic languages. Since the duration of the challenge goes beyond the date of the publication of this paper, an updated picture of the participating systems and their corresponding performance can be found on the web page of the challenge.
Tasks Entity Linking, Lemmatization, Named Entity Recognition
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-1412/
PDF https://www.aclweb.org/anthology/W17-1412
PWC https://paperswithcode.com/paper/the-first-cross-lingual-challenge-on
Repo
Framework

Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering

Title Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering
Authors Jianbo Ye, Yanran Li, Zhaohui Wu, James Z. Wang, Wenjie Li, Jia Li
Abstract Word embeddings have become widely-used in document analysis. While a large number of models for mapping words to vector spaces have been developed, it remains undetermined how much net gain can be achieved over traditional approaches based on bag-of-words. In this paper, we propose a new document clustering approach by combining any word embedding with a state-of-the-art algorithm for clustering empirical distributions. By using the Wasserstein distance between distributions, the word-to-word semantic relationship is taken into account in a principled way. The new clustering method is easy to use and consistently outperforms other methods on a variety of data sets. More importantly, the method provides an effective framework for determining when and how much word embeddings contribute to document analysis. Experimental results with multiple embedding models are reported.
Tasks Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1169/
PDF https://www.aclweb.org/anthology/P17-1169
PWC https://paperswithcode.com/paper/determining-gains-acquired-from-word
Repo
Framework
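
To make concrete how a Wasserstein distance can capture word-to-word semantic relationships between documents, here is a toy sketch. It is not the paper's clustering algorithm: it only computes a 1-Wasserstein (earth mover's) distance between two documents treated as uniform distributions over the same number of word vectors, using brute-force matching. The 2-D embeddings, the function names, and the choice of the order-1 distance are all assumptions made for illustration.

```python
import math
from itertools import permutations
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, float]

# Hypothetical 2-D "embeddings"; real models use hundreds of dimensions.
EMBEDDINGS: Dict[str, Vector] = {
    "doctor": (0.0, 0.0), "physician": (0.0, 1.0),
    "hospital": (1.0, 0.0), "clinic": (1.0, 1.0),
    "banana": (10.0, 0.0), "fruit": (10.0, 1.0),
}


def euclidean(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def wasserstein_uniform(doc_a: Sequence[str], doc_b: Sequence[str]) -> float:
    """1-Wasserstein distance between two equal-length documents.

    With uniform weights and equal sizes, an optimal transport plan is a
    one-to-one matching, so a brute-force search over permutations is exact
    (and only feasible for very short documents).
    """
    if len(doc_a) != len(doc_b):
        raise ValueError("this toy version needs documents of equal length")
    n = len(doc_a)
    best_total = min(
        sum(euclidean(EMBEDDINGS[doc_a[i]], EMBEDDINGS[doc_b[p[i]]])
            for i in range(n))
        for p in permutations(range(n))
    )
    return best_total / n


medical_1 = ["doctor", "hospital"]
medical_2 = ["physician", "clinic"]
food = ["banana", "fruit"]
print(wasserstein_uniform(medical_1, medical_2))
print(wasserstein_uniform(medical_1, food))
```

The two medical documents share no words, so a bag-of-words overlap would treat them as unrelated, yet their distance here is much smaller than the distance to the food document. Practical implementations handle unequal lengths and weights with a general optimal transport solver rather than enumeration.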

Evaluating discourse annotation: Some recent insights and new approaches

Title Evaluating discourse annotation: Some recent insights and new approaches
Authors Jet Hoek, Merel Scholman
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7401/
PDF https://www.aclweb.org/anthology/W17-7401
PWC https://paperswithcode.com/paper/evaluating-discourse-annotation-some-recent
Repo
Framework

Best Response Regression

Title Best Response Regression
Authors Omer Ben-Porat, Moshe Tennenholtz
Abstract In a regression task, a predictor is given a set of instances, along with a real value for each point. Subsequently, she has to identify the value of a new instance as accurately as possible. In this work, we initiate the study of strategic predictions in machine learning. We consider a regression task tackled by two players, where the payoff of each player is the proportion of the points she predicts more accurately than the other player. We first revise the probably approximately correct learning framework to deal with the case of a duel between two predictors. We then devise an algorithm which finds a linear regression predictor that is a best response to any (not necessarily linear) regression algorithm. We show that it has linearithmic sample complexity, and polynomial time complexity when the dimension of the instances domain is fixed. We also test our approach in a high-dimensional setting, and show it significantly defeats classical regression algorithms in the prediction duel. Together, our work introduces a novel machine learning task that lends itself well to current competitive online settings, provides its theoretical foundations, and illustrates its applicability.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6748-best-response-regression
PDF http://papers.nips.cc/paper/6748-best-response-regression.pdf
PWC https://paperswithcode.com/paper/best-response-regression
Repo
Framework
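
The payoff defined in the abstract above, the proportion of points a player predicts more accurately than her opponent, can be sketched in a few lines. This illustrates the payoff only, not the paper's best-response algorithm. The function names, the toy data, and the choice to award nothing on ties are assumptions of this sketch.

```python
from typing import Callable, Sequence, Tuple

Predictor = Callable[[float], float]


def duel_payoffs(points: Sequence[Tuple[float, float]],
                 player_a: Predictor,
                 player_b: Predictor) -> Tuple[float, float]:
    """Return, for each player, the fraction of points she predicts
    strictly more accurately than the other player."""
    wins_a = wins_b = 0
    for x, y in points:
        err_a = abs(player_a(x) - y)
        err_b = abs(player_b(x) - y)
        if err_a < err_b:
            wins_a += 1
        elif err_b < err_a:
            wins_b += 1
        # Ties award nothing to either player (an assumption of this sketch).
    n = len(points)
    return wins_a / n, wins_b / n


def least_squares_line(x: float) -> float:
    # Ordinary least-squares fit to the toy data below (slope 3.1, intercept -1.4).
    return 3.1 * x - 1.4


def identity_line(x: float) -> float:
    # Ignores the outlier and fits the other three points exactly.
    return x


# Three points on y = x plus one outlier.
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 10.0)]
print(duel_payoffs(points, least_squares_line, identity_line))  # (0.25, 0.75)
```

The least-squares line has the lower squared error overall but loses the duel, because the payoff counts points won rather than total error.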

Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings

Title Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings
Authors Bofang Li, Tao Liu, Zhe Zhao, Buzhou Tang, Aleksandr Drozd, Anna Rogers, Xiaoyong Du
Abstract The number of word embedding models is growing every year. Most of them are based on the co-occurrence information of words and their contexts. However, it is still an open question what is the best definition of context. We provide a systematical investigation of 4 different syntactic context types and context representations for learning word embeddings. Comprehensive experiments are conducted to evaluate their effectiveness on 6 extrinsic and intrinsic tasks. We hope that this paper, along with the published code, would be helpful for choosing the best context type and representation for a given task.
Tasks Chunking, Learning Word Embeddings, Named Entity Recognition, Part-Of-Speech Tagging, Text Classification, Word Embeddings
Published 2017-09-01
URL https://www.aclweb.org/anthology/D17-1257/
PDF https://www.aclweb.org/anthology/D17-1257
PWC https://paperswithcode.com/paper/investigating-different-syntactic-context
Repo
Framework

Realization of long sentences using chunking

Title Realization of long sentences using chunking
Authors Ewa Muszyńska, Ann Copestake
Abstract We propose sentence chunking as a way to reduce the time and memory costs of realization of long sentences. During chunking we divide the semantic representation of a sentence into smaller components which can be processed and recombined without loss of information. Our meaning representation of choice is the Dependency Minimal Recursion Semantics (DMRS). We show that realizing chunks of a sentence and combining the results of such realizations increases the coverage for long sentences, significantly reduces the resources required and does not affect the quality of the realization.
Tasks Chunking, Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3533/
PDF https://www.aclweb.org/anthology/W17-3533
PWC https://paperswithcode.com/paper/realization-of-long-sentences-using-chunking
Repo
Framework

Subspace Clustering via Tangent Cones

Title Subspace Clustering via Tangent Cones
Authors Amin Jalali, Rebecca Willett
Abstract Given samples lying on any of a number of subspaces, subspace clustering is the task of grouping the samples based on their corresponding subspaces. Many subspace clustering methods operate by assigning a measure of affinity to each pair of points and feeding these affinities into a graph clustering algorithm. This paper proposes a new paradigm for subspace clustering that computes affinities based on the corresponding conic geometry. The proposed conic subspace clustering (CSC) approach considers the convex hull of a collection of normalized data points and the corresponding tangent cones. The union of subspaces underlying the data imposes a strong association between the tangent cone at a sample $x$ and the original subspace containing $x$. In addition to describing this novel geometric perspective, this paper provides a practical algorithm for subspace clustering that leverages this perspective, where a tangent cone membership test is used to estimate the affinities. This algorithm is accompanied with deterministic and stochastic guarantees on the properties of the learned affinity matrix, on the true and false positive rates and spread, which directly translate into the overall clustering accuracy.
Tasks Graph Clustering
Published 2017-12-01
URL http://papers.nips.cc/paper/7251-subspace-clustering-via-tangent-cones
PDF http://papers.nips.cc/paper/7251-subspace-clustering-via-tangent-cones.pdf
PWC https://paperswithcode.com/paper/subspace-clustering-via-tangent-cones
Repo
Framework

Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing

Title Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing
Authors Ali Basirat, Joakim Nivre
Abstract
Tasks Dependency Parsing, Transition-Based Dependency Parsing
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0203/
PDF https://www.aclweb.org/anthology/W17-0203
PWC https://paperswithcode.com/paper/real-valued-syntactic-word-vectors-rsv-for
Repo
Framework