May 5, 2019

Paper Group NANR 86

Examining the Relationship between Preordering and Word Order Freedom in Machine Translation. Phrase Table Pruning via Submodular Function Maximization. mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing. Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment …

Examining the Relationship between Preordering and Word Order Freedom in Machine Translation


Title	Examining the Relationship between Preordering and Word Order Freedom in Machine Translation
Authors	Joachim Daiber, Miloš Stanojević, Wilker Aziz, Khalil Sima'an
Abstract
Tasks	Machine Translation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2213/
PDF	https://www.aclweb.org/anthology/W16-2213
PWC	https://paperswithcode.com/paper/examining-the-relationship-between
Repo
Framework

Phrase Table Pruning via Submodular Function Maximization


Title	Phrase Table Pruning via Submodular Function Maximization
Authors	Masaaki Nishino, Jun Suzuki, Masaaki Nagata
Abstract
Tasks	Machine Translation
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-2066/
PDF	https://www.aclweb.org/anthology/P16-2066
PWC	https://paperswithcode.com/paper/phrase-table-pruning-via-submodular-function
Repo
Framework

mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing


Title	mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing
Authors	Silvio Cordeiro, Carlos Ramisch, Aline Villavicencio
Abstract	This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates semantic compositionality scores for multiword expressions (MWEs) based on word embeddings. First, we describe our implementation of vector-space operations working on distributional vectors. The compositionality score is based on the cosine distance between the MWE vector and the composition of the vectors of its member words. Our generic system can handle several types of word embeddings and MWE lists, and may combine individual word representations using several composition techniques. We evaluate our implementation on a dataset of 1042 English noun compounds, comparing different configurations of the underlying word embeddings and word-composition models. We show that our vector-based scores model non-compositionality better than standard association measures such as log-likelihood.
Tasks	Word Embeddings
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1194/
PDF	https://www.aclweb.org/anthology/L16-1194
PWC	https://paperswithcode.com/paper/mwetoolkitsem-integrating-word-embeddings-in
Repo
Framework
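
The abstract above describes scoring an MWE by comparing its own vector with a composition of its member word vectors. A minimal sketch of that idea follows, assuming additive composition and cosine similarity as the score; the function names and the toy three-dimensional embeddings are hypothetical illustrations, not the mwetoolkit+sem API or its data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_additive(vectors):
    """Compose member word vectors by component-wise sum (one of several possible choices)."""
    return [sum(components) for components in zip(*vectors)]


def compositionality_score(mwe_vector, member_vectors):
    """Similarity between the MWE's own vector and the composed member vectors."""
    return cosine(mwe_vector, compose_additive(member_vectors))


# Toy, made-up embeddings purely for illustration.
embeddings = {
    "olive": [1.0, 0.0, 1.0],
    "oil": [0.0, 1.0, 1.0],
    "olive_oil": [1.0, 1.0, 2.0],  # lies along olive + oil
    "hot": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "hot_dog": [0.0, 0.0, 1.0],    # orthogonal to hot + dog
}

for mwe, parts in [("olive_oil", ["olive", "oil"]), ("hot_dog", ["hot", "dog"])]:
    score = compositionality_score(embeddings[mwe], [embeddings[p] for p in parts])
    print(mwe, round(score, 3))
```

In this toy setup a score near 1 suggests a compositional expression and a score near 0 suggests an idiomatic one; other composition functions (for example weighted sums) could be swapped into `compose_additive`.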

Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs


Title	Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs
Authors	Ingrid Falk, Fabienne Martin
Abstract	We present an experimental study making use of a machine learning approach to identify the factors that affect the aspectual value that characterizes verbs under each of their readings. The study is based on various morpho-syntactic and semantic features collected from a French lexical resource and on a gold standard aspectual classification of verb readings designed by an expert. Our results support the tested hypothesis, namely that agentivity and abstractness influence lexical aspect.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1193/
PDF	https://www.aclweb.org/anthology/L16-1193
PWC	https://paperswithcode.com/paper/aspectual-flexibility-increases-with
Repo
Framework

Multi-region two-stream R-CNN for action detection


Title	Multi-region two-stream R-CNN for action detection
Authors	Xiaojiang Peng, Cordelia Schmid
Abstract	We propose a multi-region two-stream R-CNN model for action detection in realistic videos. We start from frame-level action detection based on faster R-CNN [1], and make three contributions: (1) we show that a motion region proposal network generates high-quality proposals, which are complementary to those of an appearance region proposal network; (2) we show that stacking optical flow over several frames significantly improves frame-level action detection; and (3) we embed a multi-region scheme in the faster R-CNN model, which adds complementary information on body parts. We then link frame-level detections with the Viterbi algorithm, and temporally localize an action with the maximum subarray method. Experimental results on the UCF-Sports, J-HMDB and UCF101 action detection datasets show that our approach outperforms the state of the art by a significant margin in both frame-mAP and video-mAP.
Tasks	Action Detection, Action Recognition In Videos, Optical Flow Estimation, Skeleton Based Action Recognition
Published	2016-09-17
URL	https://doi.org/10.1007/978-3-319-46493-0_45
PDF	https://hal.inria.fr/hal-01349107v1/document
PWC	https://paperswithcode.com/paper/multi-region-two-stream-r-cnn-for-action
Repo
Framework
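
The abstract above mentions temporally localizing an action with the maximum subarray method. The sketch below shows the generic maximum subarray scan (Kadane's algorithm) applied to a list of per-frame scores; the score values, and the assumption that they have already been shifted so that non-action frames are negative, are hypothetical and not taken from the paper.

```python
def max_subarray(scores):
    """Return (best_sum, start, end) of the contiguous run with the largest sum.

    `start` and `end` are inclusive frame indices.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    best_sum = current_sum = scores[0]
    best_start = best_end = current_start = 0

    for i in range(1, len(scores)):
        if current_sum < 0:
            # A negative running sum can only hurt; restart the run here.
            current_sum = scores[i]
            current_start = i
        else:
            current_sum += scores[i]

        if current_sum > best_sum:
            best_sum = current_sum
            best_start, best_end = current_start, i

    return best_sum, best_start, best_end


# Hypothetical per-frame scores along one linked track, already offset so that
# frames unlikely to contain the action are negative.
frame_scores = [-0.5, -0.2, 0.4, 0.6, -0.1, 0.7, -0.9, -0.3]
total, first_frame, last_frame = max_subarray(frame_scores)
print(first_frame, last_frame)
```

The scan runs in a single pass over the frames, which is why it is a natural fit for picking the temporal extent of a detection once frame-level scores are available.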

Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics


Title	Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Authors	Douwe Kiela, Anita Lilla Verő, Stephen Clark
Abstract
Tasks	Image Retrieval, Information Retrieval, Object Recognition, Representation Learning, Semantic Textual Similarity
Published	2016-11-01
URL	https://www.aclweb.org/anthology/D16-1043/
PDF	https://www.aclweb.org/anthology/D16-1043
PWC	https://paperswithcode.com/paper/comparing-data-sources-and-architectures-for
Repo
Framework

A Multi-Layered Annotated Corpus of Scientific Papers


Title	A Multi-Layered Annotated Corpus of Scientific Papers
Authors	Beatriz Fisas, Francesco Ronzano, Horacio Saggion
Abstract	Scientific literature records the research process with a standardized structure and provides the clues to track the progress in a scientific field. Understanding its internal structure and content is of paramount importance for natural language processing (NLP) technologies. To meet this requirement, we have developed a multi-layered annotated corpus of scientific papers in the domain of Computer Graphics. Sentences are annotated with respect to their role in the argumentative structure of the discourse. The purpose of each citation is specified. Special features of the scientific discourse such as advantages and disadvantages are identified. In addition, a grade is allocated to each sentence according to its relevance for being included in a summary. To the best of our knowledge, this complex, multi-layered collection of annotations and metadata characterizing a set of research papers had never been grouped together before in one corpus and therefore constitutes a newer, richer resource with respect to those currently available in the field.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1492/
PDF	https://www.aclweb.org/anthology/L16-1492
PWC	https://paperswithcode.com/paper/a-multi-layered-annotated-corpus-of
Repo
Framework

Text Analysis and Automatic Triage of Posts in a Mental Health Forum


Title	Text Analysis and Automatic Triage of Posts in a Mental Health Forum
Authors	Ehsaneddin Asgari, Soroush Nasiriany, Mohammad R.K. Mofrad
Abstract
Tasks	Feature Importance
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0318/
PDF	https://www.aclweb.org/anthology/W16-0318
PWC	https://paperswithcode.com/paper/text-analysis-and-automatic-triage-of-posts
Repo
Framework

Towards Building Semantic Role Labeler for Indian Languages


Title	Towards Building Semantic Role Labeler for Indian Languages
Authors	Maaz Anwar, Dipti Sharma
Abstract	We present a statistical system for identifying the semantic relationships or semantic roles for two major Indian languages, Hindi and Urdu. Given an input sentence and a predicate/verb, the system first identifies the arguments pertaining to that verb and then classifies each of them into one of the semantic labels, such as DOER, THEME, LOCATIVE, CAUSE, PURPOSE, etc. The system is based on 2 statistical classifiers trained on roughly 130,000 words for Urdu and 100,000 words for Hindi that were hand-annotated with semantic roles under the PropBank project for these two languages. Our system achieves an accuracy of 86% in identifying the arguments of a verb for Hindi and 75% for Urdu. At the subsequent task of classifying the constituents into their semantic roles, the Hindi system achieved 58% precision and 42% recall, whereas the Urdu system performed better and achieved 83% precision and 80% recall. Our study also allowed us to compare the usefulness of different linguistic features and feature combinations in the semantic role labeling task. We also examine the use of statistical syntactic parsing as a feature in the role labeling task.
Tasks	Semantic Role Labeling
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1727/
PDF	https://www.aclweb.org/anthology/L16-1727
PWC	https://paperswithcode.com/paper/towards-building-semantic-role-labeler-for
Repo
Framework

Textual complexity as a predictor of difficulty of listening items in language proficiency tests


Title	Textual complexity as a predictor of difficulty of listening items in language proficiency tests
Authors	Anastassia Loukina, Su-Youn Yoon, Jennifer Sakano, Youhua Wei, Kathy Sheehan
Abstract	In this paper we explore to what extent the difficulty of listening items in an English language proficiency test can be predicted by the textual properties of the prompt. We show that a system based on multiple text complexity features can predict item difficulty for several different item types and for some items achieves higher accuracy than human estimates of item difficulty.
Tasks	Reading Comprehension
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1306/
PDF	https://www.aclweb.org/anthology/C16-1306
PWC	https://paperswithcode.com/paper/textual-complexity-as-a-predictor-of
Repo
Framework

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning


Title	Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Authors	Jean-Bastien Grill, Michal Valko, Remi Munos
Abstract	We study the sampling-based planning problem in Markov decision processes (MDPs) that we can access only through a generative model, usually referred to as Monte-Carlo planning. Our objective is to return a good estimate of the optimal value function at any state while minimizing the number of calls to the generative model, i.e. the sample complexity. We propose a new algorithm, TrailBlazer, able to handle MDPs with a finite or an infinite number of transitions from state-action to next states. TrailBlazer is an adaptive algorithm that exploits possible structures of the MDP by exploring only a subset of states reachable by following near-optimal policies. We provide bounds on its sample complexity that depend on a measure of the quantity of near-optimal states. The algorithm behavior can be considered as an extension of Monte-Carlo sampling (for estimating an expectation) to problems that alternate maximization (over actions) and expectation (over next states). Finally, another appealing feature of TrailBlazer is that it is simple to implement and computationally efficient.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6253-blazing-the-trails-before-beating-the-path-sample-efficient-monte-carlo-planning
PDF	http://papers.nips.cc/paper/6253-blazing-the-trails-before-beating-the-path-sample-efficient-monte-carlo-planning.pdf
PWC	https://paperswithcode.com/paper/blazing-the-trails-before-beating-the-path
Repo
Framework
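
The abstract above frames Monte-Carlo planning as alternating maximization over actions with expectation over next states, using only calls to a generative model. The sketch below illustrates that alternation with plain uniform sampling to a fixed depth. It is not TrailBlazer, which samples adaptively; the toy model, the function names, and all parameter values are hypothetical.

```python
import random


def estimate_value(state, depth, generative_model, actions, n_samples, gamma, rng):
    """Estimate the optimal value of `state` by nested max (actions) / mean (samples).

    `generative_model(state, action, rng)` must return (next_state, reward).
    Uses (len(actions) * n_samples) ** depth leaf evaluations, so keep depth small.
    """
    if depth == 0:
        return 0.0

    best = float("-inf")
    for action in actions:
        total = 0.0
        for _ in range(n_samples):
            # One call to the generative model = one unit of sample complexity.
            next_state, reward = generative_model(state, action, rng)
            total += reward + gamma * estimate_value(
                next_state, depth - 1, generative_model, actions, n_samples, gamma, rng
            )
        # Expectation over next states is approximated by the sample mean;
        # maximization over actions is taken exactly.
        best = max(best, total / n_samples)
    return best


def toy_model(state, action, rng):
    """Hypothetical single-state MDP: 'safe' pays 1.0, 'risky' pays 3.0 half the time."""
    if action == "safe":
        return state, 1.0
    return state, (3.0 if rng.random() < 0.5 else 0.0)


rng = random.Random(0)
value = estimate_value("s0", 3, toy_model, ["safe", "risky"], 8, 0.9, rng)
print(value)
```

The exponential number of model calls in this uniform scheme is the cost that adaptive methods aim to reduce by concentrating samples on states reachable under near-optimal policies.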

Dependency Forest based Word Alignment


Title	Dependency Forest based Word Alignment
Authors	Hitoshi Otsuki, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Abstract
Tasks	Machine Translation, Word Alignment
Published	2016-08-01
URL	https://www.aclweb.org/anthology/P16-3002/
PDF	https://www.aclweb.org/anthology/P16-3002
PWC	https://paperswithcode.com/paper/dependency-forest-based-word-alignment
Repo
Framework

MADAD: A Readability Annotation Tool for Arabic Text


Title	MADAD: A Readability Annotation Tool for Arabic Text
Authors	Nora Al-Twairesh, Abeer Al-Dayel, Hend Al-Khalifa, Maha Al-Yahya, Sinaa Alageel, Nora Abanmy, Nouf Al-Shenaifi
Abstract	This paper introduces MADAD, a general-purpose annotation tool for Arabic text with a focus on readability annotation. This tool will help in overcoming the lack of Arabic readability training data by providing an online environment to collect readability assessments on various kinds of corpora. The tool also supports a broad range of annotation tasks for various linguistic and semantic phenomena by allowing users to create their customized annotation schemes. MADAD is a web-based tool, accessible through any web browser; the main features that distinguish MADAD are its flexibility, portability, customizability and its bilingual interface (Arabic/English).
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1646/
PDF	https://www.aclweb.org/anthology/L16-1646
PWC	https://paperswithcode.com/paper/madad-a-readability-annotation-tool-for
Repo
Framework

Towards robust cross-linguistic comparisons of phonological networks


Title	Towards robust cross-linguistic comparisons of phonological networks
Authors	Philippa Shoemark, Sharon Goldwater, James Kirby, Rik Sarkar
Abstract
Tasks	Language Acquisition
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2018/
PDF	https://www.aclweb.org/anthology/W16-2018
PWC	https://paperswithcode.com/paper/towards-robust-cross-linguistic-comparisons
Repo
Framework

ACE: Automatic Colloquialism, Typographical and Orthographic Errors Detection for Chinese Language


Title	ACE: Automatic Colloquialism, Typographical and Orthographic Errors Detection for Chinese Language
Authors	Shichao Dong, Gabriel Pui Cheong Fung, Binyang Li, Baolin Peng, Ming Liao, Jia Zhu, Kam-fai Wong
Abstract	We present a system called ACE for Automatic Colloquialism and Errors detection for written Chinese. ACE is based on the combination of an N-gram model and a rule-based model. Although it focuses on detecting colloquial Cantonese (a dialect of Chinese) at the current stage, it can be extended to detect other dialects. We chose Cantonese because it has many interesting properties, such as a unique grammar system and a large number of colloquial terms, that make the detection task extremely challenging. We conducted experiments using real data and synthetic data. The results indicated that ACE is highly reliable and effective.
Tasks	Language Modelling
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2041/
PDF	https://www.aclweb.org/anthology/C16-2041
PWC	https://paperswithcode.com/paper/ace-automatic-colloquialism-typographical-and
Repo
Framework