July 26, 2019

Paper Group NANR 181

Automated Preamble Detection in Dictated Medical Reports


Title	Automated Preamble Detection in Dictated Medical Reports
Authors	Wael Salloum, Greg Finley, Erik Edwards, Mark Miller, David Suendermann-Oeft
Abstract	Dictated medical reports very often feature a preamble containing metainformation about the report such as patient and physician names, location and name of the clinic, date of procedure, and so on. In the medical transcription process, the preamble is usually omitted from the final report, as it contains information already available in the electronic medical record. We present a method which is able to automatically identify preambles in medical dictations. The method makes use of state-of-the-art NLP techniques including word embeddings and Bi-LSTMs and achieves preamble detection performance superior to humans.
Tasks	Speech Recognition, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2336/
PDF	https://www.aclweb.org/anthology/W17-2336
PWC	https://paperswithcode.com/paper/automated-preamble-detection-in-dictated
Repo
Framework

ALS at IJCNLP-2017 Task 5: Answer Localization System for Multi-Choice Question Answering in Exams


Title	ALS at IJCNLP-2017 Task 5: Answer Localization System for Multi-Choice Question Answering in Exams
Authors	Changliang Li, Cunliang Kong
Abstract	Multi-choice question answering in exams is a typical QA task. To accomplish this task, we present an answer localization method that locates answers shown in web pages, considering both structural and semantic information. Using this method as a basis, we analyze sentences and paragraphs appearing on web pages to get predictions. With this answer localization system, we get effective results on both the validation and test datasets.
Tasks	Question Answering
Published	2017-12-01
URL	https://www.aclweb.org/anthology/I17-4033/
PDF	https://www.aclweb.org/anthology/I17-4033
PWC	https://paperswithcode.com/paper/als-at-ijcnlp-2017-task-5-answer-localization
Repo
Framework

Clustering of Russian Adjective-Noun Constructions using Word Embeddings


Title	Clustering of Russian Adjective-Noun Constructions using Word Embeddings
Authors	Andrey Kutuzov, Elizaveta Kuzmenko, Lidia Pivovarova
Abstract	This paper presents a method of automatic construction extraction from a large corpus of Russian. The term 'construction' here means a multi-word expression in which a variable can be replaced with another word from the same semantic class, for example, 'a glass of [water/juice/milk]'. We deal with constructions that consist of a noun and its adjective modifier. We propose a method of grouping such constructions into semantic classes via 2-step clustering of word vectors in distributional models. We compare it with other clustering techniques and evaluate it against A Russian-English Collocational Dictionary of the Human Body, which contains manually annotated groups of constructions with nouns meaning human body parts. The best performing method is used to cluster all adjective-noun bigrams in the Russian National Corpus. Results of this procedure are publicly available and can be used for building a Russian construction dictionary as well as to accelerate theoretical studies of constructions.
Tasks	Word Embeddings
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1402/
PDF	https://www.aclweb.org/anthology/W17-1402
PWC	https://paperswithcode.com/paper/clustering-of-russian-adjective-noun
Repo
Framework

Computationally efficient heart rate estimation during physical exercise using photoplethysmographic signals


Title	Computationally efficient heart rate estimation during physical exercise using photoplethysmographic signals
Authors	Tim Schäck, Michael Muma, Abdelhak M. Zoubir
Abstract	Wearable devices that acquire photoplethysmographic (PPG) signals are becoming increasingly popular to monitor the heart rate during physical exercise. However, high accuracy and low computational complexity are conflicting requirements. We propose a method that provides highly accurate heart rate estimates at a very low computational cost in order to be implementable on wearables. To achieve the lowest possible complexity, only basic signal processing operations, i.e., correlation-based fundamental frequency estimation and spectral combination, harmonic noise damping and frequency domain tracking, are used. The proposed approach outperforms state-of-the-art methods on current benchmark data considerably in terms of computation time, while achieving a similar accuracy.
Tasks	Heart rate estimation, Photoplethysmography (PPG)
Published	2017-08-28
URL	https://doi.org/10.23919/EUSIPCO.2017.8081656
PDF	https://www.researchgate.net/publication/319176582_Computationally_Efficient_Heart_Rate_Estimation_During_Physical_Exercise_Using_Photoplethysmographic_Signals
PWC	https://paperswithcode.com/paper/computationally-efficient-heart-rate
Repo
Framework
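
The abstract above names correlation-based fundamental frequency estimation as one of its basic operations. As a rough illustration of that general idea only (not the authors' method, which also involves spectral combination, harmonic noise damping, and frequency domain tracking), the sketch below picks the lag that maximizes the autocorrelation of a signal. The function names, the 40 to 200 bpm search range, the 50 Hz sampling rate, and the synthetic sine input are all assumptions made for this example.

```python
import math


def autocorrelation(signal, lag):
    """Unnormalized autocorrelation of `signal` at a non-negative integer lag."""
    return sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))


def estimate_heart_rate_bpm(signal, fs, min_bpm=40.0, max_bpm=200.0):
    """Return the rate (beats per minute) whose period maximizes the autocorrelation.

    The bpm search range is an assumption for this sketch, not a value from the paper.
    """
    min_lag = int(fs * 60.0 / max_bpm)  # shortest plausible beat period, in samples
    max_lag = int(fs * 60.0 / min_bpm)  # longest plausible beat period, in samples
    best_lag = max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(signal, lag),
    )
    return 60.0 * fs / best_lag


# Synthetic stand-in for a clean PPG signal: a 2 Hz sine sampled at 50 Hz for 8 s.
fs = 50.0
ppg = [math.sin(2.0 * math.pi * 2.0 * n / fs) for n in range(400)]
bpm = estimate_heart_rate_bpm(ppg, fs)
print(bpm)
```

On this noise-free input the estimate lands at about 120 beats per minute, matching the 2 Hz test tone. Real PPG recordings during exercise contain motion artifacts, which is the problem the remaining steps in the abstract address.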

Multitask Spectral Learning of Weighted Automata


Title	Multitask Spectral Learning of Weighted Automata
Authors	Guillaume Rabusseau, Borja Balle, Joelle Pineau
Abstract	We consider the problem of estimating multiple related functions computed by weighted automata (WFA). We first present a natural notion of relatedness between WFAs by considering to what extent several WFAs can share a common underlying representation. We then introduce the model of vector-valued WFA which conveniently helps us formalize this notion of relatedness. Finally, we propose a spectral learning algorithm for vector-valued WFAs to tackle the multitask learning problem. By jointly learning multiple tasks in the form of a vector-valued WFA, our algorithm enforces the discovery of a representation space shared between tasks. The benefits of the proposed multitask approach are theoretically motivated and showcased through experiments on both synthetic and real world datasets.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6852-multitask-spectral-learning-of-weighted-automata
PDF	http://papers.nips.cc/paper/6852-multitask-spectral-learning-of-weighted-automata.pdf
PWC	https://paperswithcode.com/paper/multitask-spectral-learning-of-weighted
Repo
Framework

Mapping the Perfect via Translation Mining


Title	Mapping the Perfect via Translation Mining
Authors	Martijn van der Klis, Bert Le Bruyn, Henriëtte de Swart
Abstract	Semantic analyses of the Perfect often defeat their own purpose: by restricting their attention to 'real' perfects (like the English one), they implicitly assume the Perfect has predefined meanings and usages. We turn the tables and focus on form, using data extracted from multilingual parallel corpora to automatically generate semantic maps (Haspelmath, 1997) of the sequence 'Have/Be + past participle' in five European languages (German, English, Spanish, French, Dutch). This technique, which we dub Translation Mining, has been applied before in the lexical domain (Wälchli and Cysouw, 2012) but we showcase its application at the level of the grammar.
Tasks
Published	2017-04-01
URL	https://www.aclweb.org/anthology/E17-2080/
PDF	https://www.aclweb.org/anthology/E17-2080
PWC	https://paperswithcode.com/paper/mapping-the-perfect-via-translation-mining
Repo
Framework

Fake news stance detection using stacked ensemble of classifiers


Title	Fake news stance detection using stacked ensemble of classifiers
Authors	James Thorne, Mingjie Chen, Giorgos Myrianthous, Jiashu Pu, Xiaoxuan Wang, Andreas Vlachos
Abstract	Fake news has become a hotly debated topic in journalism. In this paper, we present our entry to the 2017 Fake News Challenge, which models the detection of fake news as a stance classification task; the entry finished in 11th place on the leaderboard. Our entry is an ensemble system of classifiers developed by students in the context of their coursework. We show how we used the stacking ensemble method for this purpose and obtained improvements in classification accuracy exceeding each of the individual models' performance on the development data. Finally, we discuss aspects of the experimental setup of the challenge.
Tasks	Fake News Detection, Natural Language Inference, Sentiment Analysis, Stance Detection
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-4214/
PDF	https://www.aclweb.org/anthology/W17-4214
PWC	https://paperswithcode.com/paper/fake-news-stance-detection-using-stacked
Repo
Framework
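
The abstract above relies on the stacking ensemble method, in which a meta-classifier is trained on the predictions of base classifiers rather than on the raw inputs. The toy sketch below shows that idea with a lookup-table meta-classifier; it is not the authors' system. The helper names, the two rule-based base models, the two-label set, and the numeric inputs are all hypothetical and chosen only to keep the example self-contained.

```python
from collections import Counter, defaultdict


def fit_meta(base_models, held_out_x, held_out_y):
    """Train a lookup-table meta-classifier on base-model predictions.

    Each held-out example is mapped to the tuple of base predictions, and the
    table stores the most frequent true label seen for that tuple.
    """
    votes = defaultdict(Counter)
    for x, y in zip(held_out_x, held_out_y):
        key = tuple(model(x) for model in base_models)
        votes[key][y] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
    fallback = Counter(held_out_y).most_common(1)[0][0]  # used for unseen tuples
    return table, fallback


def predict(base_models, meta, x):
    """Run the base models on `x`, then let the meta-classifier decide."""
    table, fallback = meta
    key = tuple(model(x) for model in base_models)
    return table.get(key, fallback)


# Hypothetical base models: each looks at only one coordinate of the input.
def model_a(x):
    return "agree" if x[0] > 0 else "disagree"


def model_b(x):
    return "agree" if x[1] > 0 else "disagree"


base_models = [model_a, model_b]
held_out_x = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
held_out_y = ["agree", "disagree", "disagree", "disagree"]
meta = fit_meta(base_models, held_out_x, held_out_y)

print(predict(base_models, meta, (2, 3)))
print(predict(base_models, meta, (2, -3)))
```

In this toy data the true label is "agree" only when both coordinates are positive, so neither base model is correct on its own, while the meta-classifier learns how to combine them. In practice the meta-classifier is usually a trained model such as logistic regression, and the base predictions it sees should come from data the base models were not trained on.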

The First Cross-Lingual Challenge on Recognition, Normalization, and Matching of Named Entities in Slavic Languages


Title	The First Cross-Lingual Challenge on Recognition, Normalization, and Matching of Named Entities in Slavic Languages
Authors	Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Abstract	This paper describes the outcomes of the first challenge on multilingual named entity recognition that aimed at recognizing mentions of named entities in web documents in Slavic languages, their normalization/lemmatization, and cross-language matching. It was organised in the context of the 6th Balto-Slavic Natural Language Processing Workshop, co-located with the EACL 2017 conference. Although eleven teams signed up for the evaluation, due to the complexity of the task(s) and short time available for elaborating a solution, only two teams submitted results on time. The reported evaluation figures reflect the relatively higher level of complexity of named entity-related tasks in the context of processing texts in Slavic languages. Since the duration of the challenge goes beyond the date of the publication of this paper, an updated picture of the participating systems and their corresponding performance can be found on the web page of the challenge.
Tasks	Entity Linking, Lemmatization, Named Entity Recognition
Published	2017-04-01
URL	https://www.aclweb.org/anthology/W17-1412/
PDF	https://www.aclweb.org/anthology/W17-1412
PWC	https://paperswithcode.com/paper/the-first-cross-lingual-challenge-on
Repo
Framework

Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering


Title	Determining Gains Acquired from Word Embedding Quantitatively Using Discrete Distribution Clustering
Authors	Jianbo Ye, Yanran Li, Zhaohui Wu, James Z. Wang, Wenjie Li, Jia Li
Abstract	Word embeddings have become widely used in document analysis. While a large number of models for mapping words to vector spaces have been developed, it remains undetermined how much net gain can be achieved over traditional approaches based on bag-of-words. In this paper, we propose a new document clustering approach by combining any word embedding with a state-of-the-art algorithm for clustering empirical distributions. By using the Wasserstein distance between distributions, the word-to-word semantic relationship is taken into account in a principled way. The new clustering method is easy to use and consistently outperforms other methods on a variety of data sets. More importantly, the method provides an effective framework for determining when and how much word embeddings contribute to document analysis. Experimental results with multiple embedding models are reported.
Tasks	Word Embeddings
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1169/
PDF	https://www.aclweb.org/anthology/P17-1169
PWC	https://paperswithcode.com/paper/determining-gains-acquired-from-word
Repo
Framework
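
The abstract above compares documents through the Wasserstein distance between distributions. The paper works with word vectors in a high-dimensional embedding space; the sketch below covers only the simplest special case, the Wasserstein-1 distance between two equal-size sets of scalar samples, where the optimal transport plan pairs the sorted values. The function name and the restriction to one dimension and equal sample counts are assumptions made for illustration.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size empirical samples on the real line.

    For equally weighted samples of the same size in one dimension, the optimal
    coupling matches the i-th smallest value of `xs` to the i-th smallest of `ys`.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


print(wasserstein_1d([0, 1, 3], [5, 6, 8]))
print(wasserstein_1d([3, 1, 2], [1, 2, 3]))
```

The second call returns zero because the two samples contain the same values in a different order, which reflects that the distance depends on the distributions and not on the ordering of the samples.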

Evaluating discourse annotation: Some recent insights and new approaches


Title	Evaluating discourse annotation: Some recent insights and new approaches
Authors	Jet Hoek, Merel Scholman
Abstract
Tasks
Published	2017-01-01
URL	https://www.aclweb.org/anthology/W17-7401/
PDF	https://www.aclweb.org/anthology/W17-7401
PWC	https://paperswithcode.com/paper/evaluating-discourse-annotation-some-recent
Repo
Framework

Best Response Regression


Title	Best Response Regression
Authors	Omer Ben-Porat, Moshe Tennenholtz
Abstract	In a regression task, a predictor is given a set of instances, along with a real value for each point. Subsequently, she has to identify the value of a new instance as accurately as possible. In this work, we initiate the study of strategic predictions in machine learning. We consider a regression task tackled by two players, where the payoff of each player is the proportion of the points she predicts more accurately than the other player. We first revise the probably approximately correct learning framework to deal with the case of a duel between two predictors. We then devise an algorithm which finds a linear regression predictor that is a best response to any (not necessarily linear) regression algorithm. We show that it has linearithmic sample complexity, and polynomial time complexity when the dimension of the instances domain is fixed. We also test our approach in a high-dimensional setting, and show it significantly defeats classical regression algorithms in the prediction duel. Together, our work introduces a novel machine learning task that lends itself well to current competitive online settings, provides its theoretical foundations, and illustrates its applicability.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6748-best-response-regression
PDF	http://papers.nips.cc/paper/6748-best-response-regression.pdf
PWC	https://paperswithcode.com/paper/best-response-regression
Repo
Framework
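
The abstract above defines each player's payoff as the proportion of points she predicts more accurately than the other player. A minimal sketch of that payoff computation follows. The function name is hypothetical, and the choice to measure accuracy by absolute error and to award a tied point to neither player is an assumption, since the abstract does not say how ties are handled.

```python
def duel_payoffs(preds_a, preds_b, targets):
    """Return (payoff_a, payoff_b) for a two-player prediction duel.

    A player wins a point when her absolute error is strictly smaller than the
    opponent's. Ties go to neither player (an assumption of this sketch).
    """
    if not targets or not (len(preds_a) == len(preds_b) == len(targets)):
        raise ValueError("inputs must be non-empty and of equal length")
    wins_a = 0
    wins_b = 0
    for a, b, y in zip(preds_a, preds_b, targets):
        err_a, err_b = abs(a - y), abs(b - y)
        if err_a < err_b:
            wins_a += 1
        elif err_b < err_a:
            wins_b += 1
    n = len(targets)
    return wins_a / n, wins_b / n


targets = [1, 2, 3, 4]
player_a = [1, 2, 0, 0]  # exact on the first two points, far off on the rest
player_b = [2, 2, 3, 4]  # off by one on the first point, exact elsewhere
print(duel_payoffs(player_a, player_b, targets))
```

Here player A has the larger total error but still wins the first point, which shows why a best response in this setting differs from minimizing the usual regression loss.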

Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings


Title	Investigating Different Syntactic Context Types and Context Representations for Learning Word Embeddings
Authors	Bofang Li, Tao Liu, Zhe Zhao, Buzhou Tang, Aleksandr Drozd, Anna Rogers, Xiaoyong Du
Abstract	The number of word embedding models is growing every year. Most of them are based on the co-occurrence information of words and their contexts. However, it is still an open question what the best definition of context is. We provide a systematic investigation of 4 different syntactic context types and context representations for learning word embeddings. Comprehensive experiments are conducted to evaluate their effectiveness on 6 extrinsic and intrinsic tasks. We hope that this paper, along with the published code, would be helpful for choosing the best context type and representation for a given task.
Tasks	Chunking, Learning Word Embeddings, Named Entity Recognition, Part-Of-Speech Tagging, Text Classification, Word Embeddings
Published	2017-09-01
URL	https://www.aclweb.org/anthology/D17-1257/
PDF	https://www.aclweb.org/anthology/D17-1257
PWC	https://paperswithcode.com/paper/investigating-different-syntactic-context
Repo
Framework

Realization of long sentences using chunking


Title	Realization of long sentences using chunking
Authors	Ewa Muszyńska, Ann Copestake
Abstract	We propose sentence chunking as a way to reduce the time and memory costs of realization of long sentences. During chunking we divide the semantic representation of a sentence into smaller components which can be processed and recombined without loss of information. Our meaning representation of choice is the Dependency Minimal Recursion Semantics (DMRS). We show that realizing chunks of a sentence and combining the results of such realizations increases the coverage for long sentences, significantly reduces the resources required and does not affect the quality of the realization.
Tasks	Chunking, Text Generation
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-3533/
PDF	https://www.aclweb.org/anthology/W17-3533
PWC	https://paperswithcode.com/paper/realization-of-long-sentences-using-chunking
Repo
Framework

Subspace Clustering via Tangent Cones


Title	Subspace Clustering via Tangent Cones
Authors	Amin Jalali, Rebecca Willett
Abstract	Given samples lying on any of a number of subspaces, subspace clustering is the task of grouping the samples based on their corresponding subspaces. Many subspace clustering methods operate by assigning a measure of affinity to each pair of points and feeding these affinities into a graph clustering algorithm. This paper proposes a new paradigm for subspace clustering that computes affinities based on the corresponding conic geometry. The proposed conic subspace clustering (CSC) approach considers the convex hull of a collection of normalized data points and the corresponding tangent cones. The union of subspaces underlying the data imposes a strong association between the tangent cone at a sample $x$ and the original subspace containing $x$. In addition to describing this novel geometric perspective, this paper provides a practical algorithm for subspace clustering that leverages this perspective, where a tangent cone membership test is used to estimate the affinities. This algorithm is accompanied by deterministic and stochastic guarantees on the properties of the learned affinity matrix, on the true and false positive rates and spread, which directly translate into the overall clustering accuracy.
Tasks	Graph Clustering
Published	2017-12-01
URL	http://papers.nips.cc/paper/7251-subspace-clustering-via-tangent-cones
PDF	http://papers.nips.cc/paper/7251-subspace-clustering-via-tangent-cones.pdf
PWC	https://paperswithcode.com/paper/subspace-clustering-via-tangent-cones
Repo
Framework

Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing


Title	Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing
Authors	Ali Basirat, Joakim Nivre
Abstract
Tasks	Dependency Parsing, Transition-Based Dependency Parsing
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0203/
PDF	https://www.aclweb.org/anthology/W17-0203
PWC	https://paperswithcode.com/paper/real-valued-syntactic-word-vectors-rsv-for
Repo
Framework