May 5, 2019

Paper Group NANR 86

Examining the Relationship between Preordering and Word Order Freedom in Machine Translation

Title Examining the Relationship between Preordering and Word Order Freedom in Machine Translation
Authors Joachim Daiber, Miloš Stanojević, Wilker Aziz, Khalil Sima'an
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2213/
PDF https://www.aclweb.org/anthology/W16-2213
PWC https://paperswithcode.com/paper/examining-the-relationship-between
Repo
Framework

Phrase Table Pruning via Submodular Function Maximization

Title Phrase Table Pruning via Submodular Function Maximization
Authors Masaaki Nishino, Jun Suzuki, Masaaki Nagata
Abstract
Tasks Machine Translation
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-2066/
PDF https://www.aclweb.org/anthology/P16-2066
PWC https://paperswithcode.com/paper/phrase-table-pruning-via-submodular-function
Repo
Framework

mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing

Title mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing
Authors Silvio Cordeiro, Carlos Ramisch, Aline Villavicencio
Abstract This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates semantic compositionality scores for multiword expressions (MWEs) based on word embeddings. First, we describe our implementation of vector-space operations working on distributional vectors. The compositionality score is based on the cosine distance between the MWE vector and the composition of the vectors of its member words. Our generic system can handle several types of word embeddings and MWE lists, and may combine individual word representations using several composition techniques. We evaluate our implementation on a dataset of 1042 English noun compounds, comparing different configurations of the underlying word embeddings and word-composition models. We show that our vector-based scores model non-compositionality better than standard association measures such as log-likelihood.
Tasks Word Embeddings
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1194/
PDF https://www.aclweb.org/anthology/L16-1194
PWC https://paperswithcode.com/paper/mwetoolkitsem-integrating-word-embeddings-in
Repo
Framework
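
The scoring idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the mwetoolkit+sem implementation: it assumes cosine similarity as the comparison, additive composition as one possible composition technique, and toy two-dimensional vectors. The function names `cosine_similarity`, `compose_additive`, and `compositionality_score` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compose_additive(vectors):
    """Assumed composition technique: element-wise sum of the member word vectors."""
    return [sum(components) for components in zip(*vectors)]


def compositionality_score(mwe_vector, member_vectors):
    """Compare the vector of the whole MWE with the composition of its member words."""
    return cosine_similarity(mwe_vector, compose_additive(member_vectors))


# Toy vectors (made up for illustration, not real embeddings).
compositional = compositionality_score([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
idiomatic = compositionality_score([1.0, 0.0], [[0.0, 1.0], [0.0, 1.0]])
```

Under these assumptions, a score near 1 suggests the expression's meaning is close to the combined meaning of its words, and a score near 0 suggests a non-compositional (idiomatic) expression.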

Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs

Title Aspectual Flexibility Increases with Agentivity and Concreteness: A Computational Classification Experiment on Polysemous Verbs
Authors Ingrid Falk, Fabienne Martin
Abstract We present an experimental study making use of a machine learning approach to identify the factors that affect the aspectual value that characterizes verbs under each of their readings. The study is based on various morpho-syntactic and semantic features collected from a French lexical resource and on a gold standard aspectual classification of verb readings designed by an expert. Our results support the tested hypothesis, namely that agentivity and abstractness influence lexical aspect.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1193/
PDF https://www.aclweb.org/anthology/L16-1193
PWC https://paperswithcode.com/paper/aspectual-flexibility-increases-with
Repo
Framework

Multi-region two-stream R-CNN for action detection

Title Multi-region two-stream R-CNN for action detection
Authors Xiaojiang Peng, Cordelia Schmid
Abstract We propose a multi-region two-stream R-CNN model for action detection in realistic videos. We start from frame-level action detection based on faster R-CNN [1], and make three contributions: (1) we show that a motion region proposal network generates high-quality proposals, which are complementary to those of an appearance region proposal network; (2) we show that stacking optical flow over several frames significantly improves frame-level action detection; and (3) we embed a multi-region scheme in the faster R-CNN model, which adds complementary information on body parts. We then link frame-level detections with the Viterbi algorithm, and temporally localize an action with the maximum subarray method. Experimental results on the UCF-Sports, J-HMDB and UCF101 action detection datasets show that our approach outperforms the state of the art by a significant margin in both frame-mAP and video-mAP.
Tasks Action Detection, Action Recognition In Videos, Optical Flow Estimation, Skeleton Based Action Recognition
Published 2016-09-17
URL https://doi.org/10.1007/978-3-319-46493-0_45
PDF https://hal.inria.fr/hal-01349107v1/document
PWC https://paperswithcode.com/paper/multi-region-two-stream-r-cnn-for-action
Repo
Framework
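
The temporal localization step mentioned in the abstract above relies on the maximum subarray method. The sketch below shows the generic technique (Kadane's algorithm) applied to per-frame scores. It is an illustration under stated assumptions, not the authors' code: the abstract does not say how frame scores are prepared, so subtracting a fixed `offset` to make low-scoring frames negative is an assumption, and `max_subarray` and `localize_action` are hypothetical names.

```python
def max_subarray(scores):
    """Kadane's algorithm: return (start, end, total) of the contiguous
    span with the largest sum; start and end are inclusive indices."""
    if not scores:
        raise ValueError("scores must not be empty")
    best_sum = cur_sum = scores[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(scores)):
        if cur_sum < 0:
            # A negative running sum can only hurt, so restart the span here.
            cur_sum = scores[i]
            cur_start = i
        else:
            cur_sum += scores[i]
        if cur_sum > best_sum:
            best_sum = cur_sum
            best_start, best_end = cur_start, i
    return best_start, best_end, best_sum


def localize_action(frame_scores, offset):
    """Assumed preprocessing: shift scores by `offset` so weak frames become
    negative, then pick the best contiguous span of frames."""
    shifted = [s - offset for s in frame_scores]
    return max_subarray(shifted)


# Toy per-frame detection scores along one linked track (made up for illustration).
span = localize_action([0.0, 0.75, 1.25, 0.25, 1.0, -0.5], offset=0.5)
```

With these toy scores the selected span covers frames 1 through 4. The algorithm runs in time linear in the number of frames.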

Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics

Title Comparing Data Sources and Architectures for Deep Visual Representation Learning in Semantics
Authors Douwe Kiela, Anita Lilla Verő, Stephen Clark
Abstract
Tasks Image Retrieval, Information Retrieval, Object Recognition, Representation Learning, Semantic Textual Similarity
Published 2016-11-01
URL https://www.aclweb.org/anthology/D16-1043/
PDF https://www.aclweb.org/anthology/D16-1043
PWC https://paperswithcode.com/paper/comparing-data-sources-and-architectures-for
Repo
Framework

A Multi-Layered Annotated Corpus of Scientific Papers

Title A Multi-Layered Annotated Corpus of Scientific Papers
Authors Beatriz Fisas, Francesco Ronzano, Horacio Saggion
Abstract Scientific literature records the research process with a standardized structure and provides the clues to track the progress in a scientific field. Understanding its internal structure and content is of paramount importance for natural language processing (NLP) technologies. To meet this requirement, we have developed a multi-layered annotated corpus of scientific papers in the domain of Computer Graphics. Sentences are annotated with respect to their role in the argumentative structure of the discourse. The purpose of each citation is specified. Special features of the scientific discourse such as advantages and disadvantages are identified. In addition, a grade is allocated to each sentence according to its relevance for being included in a summary. To the best of our knowledge, this complex, multi-layered collection of annotations and metadata characterizing a set of research papers had never been grouped together before in one corpus and therefore constitutes a newer, richer resource with respect to those currently available in the field.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1492/
PDF https://www.aclweb.org/anthology/L16-1492
PWC https://paperswithcode.com/paper/a-multi-layered-annotated-corpus-of
Repo
Framework

Text Analysis and Automatic Triage of Posts in a Mental Health Forum

Title Text Analysis and Automatic Triage of Posts in a Mental Health Forum
Authors Ehsaneddin Asgari, Soroush Nasiriany, Mohammad R.K. Mofrad
Abstract
Tasks Feature Importance
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0318/
PDF https://www.aclweb.org/anthology/W16-0318
PWC https://paperswithcode.com/paper/text-analysis-and-automatic-triage-of-posts
Repo
Framework

Towards Building Semantic Role Labeler for Indian Languages

Title Towards Building Semantic Role Labeler for Indian Languages
Authors Maaz Anwar, Dipti Sharma
Abstract We present a statistical system for identifying the semantic relationships or semantic roles for two major Indian Languages, Hindi and Urdu. Given an input sentence and a predicate/verb, the system first identifies the arguments pertaining to that verb and then classifies it into one of the semantic labels which can either be a DOER, THEME, LOCATIVE, CAUSE, PURPOSE etc. The system is based on 2 statistical classifiers trained on roughly 130,000 words for Urdu and 100,000 words for Hindi that were hand-annotated with semantic roles under the PropBank project for these two languages. Our system achieves an accuracy of 86% in identifying the arguments of a verb for Hindi and 75% for Urdu. At the subsequent task of classifying the constituents into their semantic roles, the Hindi system achieved 58% precision and 42% recall whereas Urdu system performed better and achieved 83% precision and 80% recall. Our study also allowed us to compare the usefulness of different linguistic features and feature combinations in the semantic role labeling task. We also examine the use of statistical syntactic parsing as feature in the role labeling task.
Tasks Semantic Role Labeling
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1727/
PDF https://www.aclweb.org/anthology/L16-1727
PWC https://paperswithcode.com/paper/towards-building-semantic-role-labeler-for
Repo
Framework

Textual complexity as a predictor of difficulty of listening items in language proficiency tests

Title Textual complexity as a predictor of difficulty of listening items in language proficiency tests
Authors Anastassia Loukina, Su-Youn Yoon, Jennifer Sakano, Youhua Wei, Kathy Sheehan
Abstract In this paper we explore to what extent the difficulty of listening items in an English language proficiency test can be predicted by the textual properties of the prompt. We show that a system based on multiple text complexity features can predict item difficulty for several different item types and for some items achieves higher accuracy than human estimates of item difficulty.
Tasks Reading Comprehension
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1306/
PDF https://www.aclweb.org/anthology/C16-1306
PWC https://paperswithcode.com/paper/textual-complexity-as-a-predictor-of
Repo
Framework

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning

Title Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Authors Jean-Bastien Grill, Michal Valko, Remi Munos
Abstract We study the sampling-based planning problem in Markov decision processes (MDPs) that we can access only through a generative model, usually referred to as Monte-Carlo planning. Our objective is to return a good estimate of the optimal value function at any state while minimizing the number of calls to the generative model, i.e. the sample complexity. We propose a new algorithm, TrailBlazer, able to handle MDPs with a finite or an infinite number of transitions from state-action to next states. TrailBlazer is an adaptive algorithm that exploits possible structures of the MDP by exploring only a subset of states reachable by following near-optimal policies. We provide bounds on its sample complexity that depend on a measure of the quantity of near-optimal states. The algorithm behavior can be considered as an extension of Monte-Carlo sampling (for estimating an expectation) to problems that alternate maximization (over actions) and expectation (over next states). Finally, another appealing feature of TrailBlazer is that it is simple to implement and computationally efficient.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6253-blazing-the-trails-before-beating-the-path-sample-efficient-monte-carlo-planning
PDF http://papers.nips.cc/paper/6253-blazing-the-trails-before-beating-the-path-sample-efficient-monte-carlo-planning.pdf
PWC https://paperswithcode.com/paper/blazing-the-trails-before-beating-the-path
Repo
Framework

Dependency Forest based Word Alignment

Title Dependency Forest based Word Alignment
Authors Hitoshi Otsuki, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Abstract
Tasks Machine Translation, Word Alignment
Published 2016-08-01
URL https://www.aclweb.org/anthology/P16-3002/
PDF https://www.aclweb.org/anthology/P16-3002
PWC https://paperswithcode.com/paper/dependency-forest-based-word-alignment
Repo
Framework

MADAD: A Readability Annotation Tool for Arabic Text

Title MADAD: A Readability Annotation Tool for Arabic Text
Authors Nora Al-Twairesh, Abeer Al-Dayel, Hend Al-Khalifa, Maha Al-Yahya, Sinaa Alageel, Nora Abanmy, Nouf Al-Shenaifi
Abstract This paper introduces MADAD, a general-purpose annotation tool for Arabic text with focus on readability annotation. This tool will help in overcoming the problem of lack of Arabic readability training data by providing an online environment to collect readability assessments on various kinds of corpora. Also the tool supports a broad range of annotation tasks for various linguistic and semantic phenomena by allowing users to create their customized annotation schemes. MADAD is a web-based tool, accessible through any web browser; the main features that distinguish MADAD are its flexibility, portability, customizability and its bilingual interface (Arabic/English).
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1646/
PDF https://www.aclweb.org/anthology/L16-1646
PWC https://paperswithcode.com/paper/madad-a-readability-annotation-tool-for
Repo
Framework

Towards robust cross-linguistic comparisons of phonological networks

Title Towards robust cross-linguistic comparisons of phonological networks
Authors Philippa Shoemark, Sharon Goldwater, James Kirby, Rik Sarkar
Abstract
Tasks Language Acquisition
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2018/
PDF https://www.aclweb.org/anthology/W16-2018
PWC https://paperswithcode.com/paper/towards-robust-cross-linguistic-comparisons
Repo
Framework

ACE: Automatic Colloquialism, Typographical and Orthographic Errors Detection for Chinese Language

Title ACE: Automatic Colloquialism, Typographical and Orthographic Errors Detection for Chinese Language
Authors Shichao Dong, Gabriel Pui Cheong Fung, Binyang Li, Baolin Peng, Ming Liao, Jia Zhu, Kam-fai Wong
Abstract We present a system called ACE for Automatic Colloquialism and Errors detection for written Chinese. ACE is based on the combination of an N-gram model and a rule-based model. Although it focuses on detecting colloquial Cantonese (a dialect of Chinese) at the current stage, it can be extended to detect other dialects. We chose Cantonese because it has many interesting properties, such as a unique grammar system and a large number of colloquial terms, that make the detection task extremely challenging. We conducted experiments using real data and synthetic data. The results indicated that ACE is highly reliable and effective.
Tasks Language Modelling
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2041/
PDF https://www.aclweb.org/anthology/C16-2041
PWC https://paperswithcode.com/paper/ace-automatic-colloquialism-typographical-and
Repo
Framework