May 4, 2019

Paper Group NANR 229

Learned Region Sparsity and Diversity Also Predicts Visual Attention. A Proposal for combining "general" and specialized frames. Unsupervised Abbreviation Detection in Clinical Narratives. C2D2E2: Using Call Centers to Motivate the Use of Dialog and Diarization in Entity Extraction. A Large-scale Recipe and Meal Data Collection as Infrastructure …

Learned Region Sparsity and Diversity Also Predicts Visual Attention


Title	Learned Region Sparsity and Diversity Also Predicts Visual Attention
Authors	Zijun Wei, Hossein Adeli, Minh Hoai Nguyen, Greg Zelinsky, Dimitris Samaras
Abstract	Learned region sparsity has achieved state-of-the-art performance in classification tasks by exploiting and integrating a sparse set of local information into global decisions. The underlying mechanism resembles how people sample information from an image with their eye movements when making similar decisions. In this paper we incorporate the biologically plausible mechanism of Inhibition of Return into the learned region sparsity model, thereby imposing diversity on the selected regions. We investigate how these mechanisms of sparsity and diversity relate to visual attention by testing our model on three different types of visual search tasks. We report state-of-the-art results in predicting the locations of human gaze fixations, even though our model is trained only on image-level labels without object location annotations. Notably, the classification performance of the extended model remains the same as the original. This work suggests a new computational perspective on visual attention mechanisms and shows how the inclusion of attention-based mechanisms can improve computer vision techniques.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6451-learned-region-sparsity-and-diversity-also-predicts-visual-attention
PDF	http://papers.nips.cc/paper/6451-learned-region-sparsity-and-diversity-also-predicts-visual-attention.pdf
PWC	https://paperswithcode.com/paper/learned-region-sparsity-and-diversity-also
Repo
Framework

A Proposal for combining "general" and specialized frames


Title	A Proposal for combining "general" and specialized frames
Authors	Marie-Claude L'Homme, Carlos Subirats, Benoît Robichaud
Abstract	The objectives of the work described in this paper are: 1. To list the differences between a general language resource (namely FrameNet) and a domain-specific resource; 2. To devise solutions to merge their contents in order to increase the coverage of the general resource. Both resources are based on Frame Semantics (Fillmore 1985; Fillmore and Baker 2010) and this raises specific challenges since the theoretical framework and the methodology derived from it provide for both a lexical description and a conceptual representation. We propose a series of strategies that handle both lexical and conceptual (frame) differences and implement them in the specialized resource. We also show that most differences can be handled in a straightforward manner. However, some differences (such as frames defined exclusively for the specialized domain, or relations between these frames) are likely to be much more difficult to take into account precisely because they are domain-specific.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5321/
PDF	https://www.aclweb.org/anthology/W16-5321
PWC	https://paperswithcode.com/paper/a-proposal-for-combining-general-and
Repo
Framework

Unsupervised Abbreviation Detection in Clinical Narratives


Title	Unsupervised Abbreviation Detection in Clinical Narratives
Authors	Markus Kreuzthaler, Michel Oleynik, Alexander Avian, Stefan Schulz
Abstract	Clinical narratives in electronic health record systems are a rich resource of patient-based information. They constitute an ongoing challenge for natural language processing, due to their high compactness and abundance of short forms. German medical texts exhibit numerous ad-hoc abbreviations that terminate with a period character. The disambiguation of period characters is therefore an important task for sentence and abbreviation detection. This task is addressed by a combination of co-occurrence information of word types with trailing period characters, a large domain dictionary, and a simple rule engine, thus merging statistical and dictionary-based disambiguation strategies. An F-measure of 0.95 could be reached by using the unsupervised approach presented in this paper. The results are promising for a domain-independent abbreviation detection strategy, because our approach avoids retraining of models or use case specific feature engineering efforts required for supervised machine learning approaches.
Tasks	Feature Engineering, Tokenization
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4213/
PDF	https://www.aclweb.org/anthology/W16-4213
PWC	https://paperswithcode.com/paper/unsupervised-abbreviation-detection-in
Repo
Framework
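
The abstract above combines co-occurrence statistics for period-terminated word types with a domain dictionary and a simple rule. The following is only a minimal sketch of that general idea, not the authors' system: the function names, the 0.9 threshold, the toy German tokens, and the tiny dictionary are all assumptions made for illustration.

```python
from collections import Counter


def period_ratio(tokens):
    """Share of each word type's occurrences that carry a trailing period."""
    with_period = Counter()
    total = Counter()
    for tok in tokens:
        word = tok.rstrip(".").lower()
        if not word:
            continue  # skip tokens that consist only of periods
        total[word] += 1
        if tok.endswith("."):
            with_period[word] += 1
    return {w: with_period[w] / total[w] for w in total}


def is_abbreviation(token, ratios, dictionary, threshold=0.9):
    """Hypothetical rule: a period-terminated token counts as an abbreviation
    if it is not a known full word and nearly always appears with a period."""
    if not token.endswith("."):
        return False
    word = token.rstrip(".").lower()
    if word in dictionary:
        return False  # a dictionary word followed by a sentence-final period
    return ratios.get(word, 0.0) >= threshold


# Toy data (assumed): "Pat." stands for "Patient".
tokens = ["Pat.", "hat", "Fieber.", "Pat.", "ohne", "Fieber", "entlassen."]
dictionary = {"hat", "fieber", "ohne", "entlassen"}
ratios = period_ratio(tokens)

pat_is_abbrev = is_abbreviation("Pat.", ratios, dictionary)
fieber_is_abbrev = is_abbreviation("Fieber.", ratios, dictionary)
```

Here "Pat." is flagged because it never occurs without a period and is not in the dictionary, while "Fieber." is treated as a full word at a sentence boundary.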

C2D2E2: Using Call Centers to Motivate the Use of Dialog and Diarization in Entity Extraction


Title	C2D2E2: Using Call Centers to Motivate the Use of Dialog and Diarization in Entity Extraction
Authors	Ken Church, Weizhong Zhu, Jason Pelecanos
Abstract
Tasks	Entity Extraction
Published	2016-11-01
URL	https://www.aclweb.org/anthology/W16-6008/
PDF	https://www.aclweb.org/anthology/W16-6008
PWC	https://paperswithcode.com/paper/c2d2e2-using-call-centers-to-motivate-the-use
Repo
Framework

A Large-scale Recipe and Meal Data Collection as Infrastructure for Food Research


Title	A Large-scale Recipe and Meal Data Collection as Infrastructure for Food Research
Authors	Jun Harashima, Michiaki Ariga, Kenta Murata, Masayuki Ioki
Abstract	Everyday meals are an important part of our daily lives and, currently, there are many Internet sites that help us plan these meals. Allied to the growth in the amount of food data such as recipes available on the Internet is an increase in the number of studies on these data, such as recipe analysis and recipe search. However, there are few publicly available resources for food research; those that do exist do not include a wide range of food data or any meal data (that is, likely combinations of recipes). In this study, we construct a large-scale recipe and meal data collection as the underlying infrastructure to promote food research. Our corpus consists of approximately 1.7 million recipes and 36000 meals in cookpad, one of the largest recipe sites in the world. We made the corpus available to researchers in February 2015 and as of February 2016, 82 research groups at 56 universities have made use of it to enhance their studies.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1389/
PDF	https://www.aclweb.org/anthology/L16-1389
PWC	https://paperswithcode.com/paper/a-large-scale-recipe-and-meal-data-collection
Repo
Framework

Example-based Acquisition of Fine-grained Collocation Resources


Title	Example-based Acquisition of Fine-grained Collocation Resources
Authors	Sara Rodríguez-Fernández, Roberto Carlini, Luis Espinosa Anke, Leo Wanner
Abstract	Collocations such as "heavy rain" or "make [a] decision" are combinations of two elements where one (the base) is freely chosen, while the choice of the other (collocate) is restricted, depending on the base. Collocations present difficulties even to advanced language learners, who usually struggle to find the right collocate to express a particular meaning, e.g., both "heavy" and "strong" express the meaning 'intense', but while "rain" selects "heavy", "wind" selects "strong". Lexical Functions (LFs) describe the meanings that hold between the elements of collocations, such as 'intense', 'perform', 'create', 'increase', etc. Language resources with semantically classified collocations would be of great help for students; however, they are expensive to build, since they are manually constructed, and scarce. We present an unsupervised approach to the acquisition and semantic classification of collocations according to LFs, based on word embeddings in which, given an example of a collocation for each of the target LFs and a set of bases, the system retrieves a list of collocates for each base and LF.
Tasks	Word Embeddings
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1367/
PDF	https://www.aclweb.org/anthology/L16-1367
PWC	https://paperswithcode.com/paper/example-based-acquisition-of-fine-grained
Repo
Framework
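
The abstract above describes retrieving collocates for a new base from a single example collocation per Lexical Function. One common way to sketch this kind of example-based retrieval is the vector-offset analogy over word embeddings; whether the paper uses exactly this operation is an assumption here, and the three-dimensional vectors below are made-up toy values, not trained embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_collocates(base, example_base, example_collocate, vectors, candidates):
    """Rank candidates by similarity to base + (example_collocate - example_base)."""
    target = [
        b + (c - e)
        for b, c, e in zip(
            vectors[base], vectors[example_collocate], vectors[example_base]
        )
    ]
    return sorted(candidates, key=lambda w: cosine(vectors[w], target), reverse=True)


# Toy vectors (assumed values, for illustration only).
TOY_VECTORS = {
    "rain": [1.0, 0.0, 0.0],
    "heavy": [1.0, 1.0, 0.0],
    "wind": [0.0, 0.0, 1.0],
    "strong": [0.1, 0.9, 1.0],
    "big": [0.5, 0.2, 0.1],
}

# Example collocation for the LF 'intense': ("rain", "heavy"); new base: "wind".
ranked = retrieve_collocates(
    "wind", "rain", "heavy", TOY_VECTORS, ["heavy", "strong", "big"]
)
```

With these toy vectors, "strong" ranks first for the base "wind", mirroring the "heavy rain" versus "strong wind" contrast in the abstract.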

Word Embeddings and Convolutional Neural Network for Arabic Sentiment Classification


Title	Word Embeddings and Convolutional Neural Network for Arabic Sentiment Classification
Authors	Abdelghani Dahou, Shengwu Xiong, Junwei Zhou, Mohamed Houcine Haddoud, Pengfei Duan
Abstract	With the development and advancement of social networks, forums, blogs and online sales, a growing number of Arabs are expressing their opinions on the web. In this paper, a scheme of Arabic sentiment classification, which evaluates and detects the sentiment polarity from Arabic reviews and Arabic social media, is studied. We investigated several architectures to build quality neural word embeddings using a 3.4-billion-word corpus drawn from a collected 10-billion-word web-crawled corpus. Moreover, a convolutional neural network trained on top of pre-trained Arabic word embeddings is used for sentiment classification to evaluate the quality of these word embeddings. The simulation results show that the proposed scheme outperforms the existing methods on 4 out of 5 balanced and unbalanced datasets.
Tasks	Sentiment Analysis, Word Embeddings
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1228/
PDF	https://www.aclweb.org/anthology/C16-1228
PWC	https://paperswithcode.com/paper/word-embeddings-and-convolutional-neural
Repo
Framework

POS Tagging Experts via Topic Modeling


Title	POS Tagging Experts via Topic Modeling
Authors	Atreyee Mukherjee, Sandra Kübler, Matthias Scheutz
Abstract
Tasks	Domain Adaptation, Part-Of-Speech Tagging, Topic Models
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-6316/
PDF	https://www.aclweb.org/anthology/W16-6316
PWC	https://paperswithcode.com/paper/pos-tagging-experts-via-topic-modeling
Repo
Framework

Revisiting the Evaluation for Cross Document Event Coreference


Title	Revisiting the Evaluation for Cross Document Event Coreference
Authors	Shyam Upadhyay, Nitish Gupta, Christos Christodoulopoulos, Dan Roth
Abstract	Cross document event coreference (CDEC) is an important task that aims at aggregating event-related information across multiple documents. We revisit the evaluation for CDEC, and discover that past works have adopted different, often inconsistent, evaluation settings, which either overlook certain mistakes in coreference decisions, or make assumptions that simplify the coreference task considerably. We suggest a new evaluation methodology which overcomes these limitations, and allows for an accurate assessment of CDEC systems. Our new evaluation setting better reflects the corpus-wide information aggregation ability of CDEC systems by separating event-coreference decisions made across documents from those made within a document. In addition, we suggest a better baseline for the task and semi-automatically identify several inconsistent annotations in the evaluation dataset.
Tasks	Document Summarization, Multi-Document Summarization, Question Answering
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1183/
PDF	https://www.aclweb.org/anthology/C16-1183
PWC	https://paperswithcode.com/paper/revisiting-the-evaluation-for-cross-document
Repo
Framework

NRC Russian-English Machine Translation System for WMT 2016


Title	NRC Russian-English Machine Translation System for WMT 2016
Authors	Chi-kiu Lo, Colin Cherry, George Foster, Darlene Stewart, Rabib Islam, Anna Kazantseva, Roland Kuhn
Abstract
Tasks	Lemmatization, Machine Translation, Transliteration, Word Alignment
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2317/
PDF	https://www.aclweb.org/anthology/W16-2317
PWC	https://paperswithcode.com/paper/nrc-russian-english-machine-translation
Repo
Framework

On the Robustness of Standalone Referring Expression Generation Algorithms Using RDF Data


Title	On the Robustness of Standalone Referring Expression Generation Algorithms Using RDF Data
Authors	Pablo Duboue, Martin Ariel Domínguez, Paula Estrella
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-3504/
PDF	https://www.aclweb.org/anthology/W16-3504
PWC	https://paperswithcode.com/paper/on-the-robustness-of-standalone-referring
Repo
Framework

Word Embeddings as Metric Recovery in Semantic Spaces


Title	Word Embeddings as Metric Recovery in Semantic Spaces
Authors	Tatsunori B. Hashimoto, David Alvarez-Melis, Tommi S. Jaakkola
Abstract	Continuous word representations have been remarkably useful across NLP tasks but remain poorly understood. We ground word embeddings in semantic spaces studied in the cognitive-psychometric literature, taking these spaces as the primary objects to recover. To this end, we relate log co-occurrences of words in large corpora to semantic similarity assessments and show that co-occurrences are indeed consistent with a Euclidean semantic space hypothesis. Framing word embedding as metric recovery of a semantic space unifies existing word embedding algorithms, ties them to manifold learning, and demonstrates that existing algorithms are consistent metric recovery methods given co-occurrence counts from random walks. Furthermore, we propose a simple, principled, direct metric recovery algorithm that performs on par with the state-of-the-art word embedding and manifold learning methods. Finally, we complement recent focus on analogies by constructing two new inductive reasoning datasets (series completion and classification) and demonstrate that word embeddings can be used to solve them as well.
Tasks	Named Entity Recognition, Semantic Similarity, Semantic Textual Similarity, Word Embeddings
Published	2016-01-01
URL	https://www.aclweb.org/anthology/Q16-1020/
PDF	https://www.aclweb.org/anthology/Q16-1020
PWC	https://paperswithcode.com/paper/word-embeddings-as-metric-recovery-in
Repo
Framework

Is all that Glitters in Machine Translation Quality Estimation really Gold?


Title	Is all that Glitters in Machine Translation Quality Estimation really Gold?
Authors	Yvette Graham, Timothy Baldwin, Meghan Dowling, Maria Eskevich, Teresa Lynn, Lamia Tounsi
Abstract	Human-targeted metrics provide a compromise between human evaluation of machine translation, where high inter-annotator agreement is difficult to achieve, and fully automatic metrics, such as BLEU or TER, that lack the validity of human assessment. Human-targeted translation edit rate (HTER) is by far the most widely employed human-targeted metric in machine translation, commonly employed, for example, as a gold standard in evaluation of quality estimation. Original experiments justifying the design of HTER, as opposed to other possible formulations, were limited to a small sample of translations and a single language pair, however, and this motivates our re-evaluation of a range of human-targeted metrics on a substantially larger scale. Results show significantly stronger correlation with human judgment for HBLEU over HTER for two of the nine language pairs we include and no significant difference between correlations achieved by HTER and HBLEU for the remaining language pairs. Finally, we evaluate a range of quality estimation systems employing HTER and direct assessment (DA) of translation adequacy as gold labels, resulting in a divergence in system rankings, and propose employment of DA for future quality estimation evaluations.
Tasks	Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1294/
PDF	https://www.aclweb.org/anthology/C16-1294
PWC	https://paperswithcode.com/paper/is-all-that-glitters-in-machine-translation
Repo
Framework
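
The abstract above centers on HTER, which scores a machine translation by the number of edits needed to turn it into its human post-edited version, divided by the length of that post-edit. The sketch below is a simplification made for illustration: it counts only word-level insertions, deletions and substitutions and omits the block shifts that full TER also allows, so it is not the official metric. The function names and example sentences are assumptions.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over word lists (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def hter_no_shifts(mt_output, post_edit):
    """Edits per post-edit word; a shift-free approximation of HTER."""
    hyp = mt_output.split()
    ref = post_edit.split()
    if not ref:
        raise ValueError("post-edited reference must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


# One missing word out of six reference words.
score = hter_no_shifts("the cat sat on mat", "the cat sat on the mat")
```

Lower scores mean less post-editing effort; an unchanged translation scores 0.0.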

AMISCO: The Austrian German Multi-Sensor Corpus


Title	AMISCO: The Austrian German Multi-Sensor Corpus
Authors	Hannes Pessentheiner, Thomas Pichler, Martin Hagmüller
Abstract	We introduce a unique, comprehensive Austrian German multi-sensor corpus with moving and non-moving speakers to facilitate the evaluation of estimators and detectors that jointly detect a speaker's spatial and temporal parameters. The corpus is suitable for various machine learning and signal processing tasks, linguistic studies, and studies related to a speaker's fundamental frequency (due to recorded glottograms). Available corpora are limited to (synthetically generated/spatialized) speech data or recordings of musical instruments that lack moving speakers, glottograms, and/or multi-channel distant speech recordings. That is why we recorded 24 spatially non-moving and moving speakers, balanced male and female, to set up a two-room and 43-channel Austrian German multi-sensor speech corpus. It contains 8.2 hours of read speech based on phonetically balanced sentences, commands, and digits. The orthographic transcriptions include around 53,000 word tokens and 2,070 word types. Special features of this corpus are the laryngograph recordings (representing glottograms required to detect a speaker's instantaneous fundamental frequency and pitch), corresponding clean-speech recordings, and spatial information and video data provided by four Kinects and a camera.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1121/
PDF	https://www.aclweb.org/anthology/L16-1121
PWC	https://paperswithcode.com/paper/amisco-the-austrian-german-multi-sensor
Repo
Framework

A Comparative Study of Post-editing Guidelines


Title	A Comparative Study of Post-editing Guidelines
Authors	Ke Hu, Patrick Cadwell
Abstract
Tasks	Machine Translation
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3420/
PDF	https://www.aclweb.org/anthology/W16-3420
PWC	https://paperswithcode.com/paper/a-comparative-study-of-post-editing
Repo
Framework