Paper Group NANR 71
ParFDA for Instance Selection for Statistical Machine Translation
Title | ParFDA for Instance Selection for Statistical Machine Translation |
Authors | Ergun Bicici |
Abstract | |
Tasks | Language Modelling, Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/papers/W16-2306/w16-2306 |
https://www.aclweb.org/anthology/W16-2306v2 | |
PWC | https://paperswithcode.com/paper/parfda-for-instance-selection-for-statistical |
Repo | |
Framework | |
The Effect of Multiple Grammatical Errors on Processing Non-Native Writing
Title | The Effect of Multiple Grammatical Errors on Processing Non-Native Writing |
Authors | Courtney Napoles, Aoife Cahill, Nitin Madnani |
Abstract | |
Tasks | Dependency Parsing |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0501/ |
https://www.aclweb.org/anthology/W16-0501 | |
PWC | https://paperswithcode.com/paper/the-effect-of-multiple-grammatical-errors-on |
Repo | |
Framework | |
Unsupervised Modeling of Topical Relevance in L2 Learner Text
Title | Unsupervised Modeling of Topical Relevance in L2 Learner Text |
Authors | Ronan Cummins, Helen Yannakoudakis, Ted Briscoe |
Abstract | |
Tasks | Information Retrieval |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0510/ |
https://www.aclweb.org/anthology/W16-0510 | |
PWC | https://paperswithcode.com/paper/unsupervised-modeling-of-topical-relevance-in |
Repo | |
Framework | |
Collective Entity Resolution with Multi-Focal Attention
Title | Collective Entity Resolution with Multi-Focal Attention |
Authors | Amir Globerson, Nevena Lazic, Soumen Chakrabarti, Amarnag Subramanya, Michael Ringgaard, Fernando Pereira |
Abstract | |
Tasks | Entity Resolution, Topic Models |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1059/ |
https://www.aclweb.org/anthology/P16-1059 | |
PWC | https://paperswithcode.com/paper/collective-entity-resolution-with-multi-focal |
Repo | |
Framework | |
A Japanese Chess Commentary Corpus
Title | A Japanese Chess Commentary Corpus |
Authors | Shinsuke Mori, John Richardson, Atsushi Ushiku, Tetsuro Sasada, Hirotaka Kameko, Yoshimasa Tsuruoka |
Abstract | In recent years there has been a surge of interest in natural language processing related to the real world, such as symbol grounding, language generation, and nonlinguistic data search by natural language queries. In order to concentrate on language ambiguities, we propose to use a well-defined "real world," that is, game states. We built a corpus consisting of pairs of sentences and a game state. The game we focus on is shogi (Japanese chess). We collected 742,286 commentary sentences in Japanese. They are spontaneously generated, in contrast to the natural language annotations in many image datasets, which are provided by human workers on Amazon Mechanical Turk. We defined domain-specific named entities, manually segmented 2,508 sentences into words, and annotated each word with a named entity tag. We describe a detailed definition of named entities and show some statistics of our game commentary corpus. We also show the results of word segmentation and named entity recognition experiments. The accuracies are as high as those on general domain texts, indicating that we are ready to tackle various new problems related to the real world. | |
Tasks | Named Entity Recognition, Text Generation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1225/ |
https://www.aclweb.org/anthology/L16-1225 | |
PWC | https://paperswithcode.com/paper/a-japanese-chess-commentary-corpus |
Repo | |
Framework | |
Account Deletion Prediction on RuNet: A Case Study of Suspicious Twitter Accounts Active During the Russian-Ukrainian Crisis
Title | Account Deletion Prediction on RuNet: A Case Study of Suspicious Twitter Accounts Active During the Russian-Ukrainian Crisis |
Authors | Svitlana Volkova, Eric Bell |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0801/ |
https://www.aclweb.org/anthology/W16-0801 | |
PWC | https://paperswithcode.com/paper/account-deletion-prediction-on-runet-a-case |
Repo | |
Framework | |
Bitext Name Tagging for Cross-lingual Entity Annotation Projection
Title | Bitext Name Tagging for Cross-lingual Entity Annotation Projection |
Authors | Dongxu Zhang, Boliang Zhang, Xiaoman Pan, Xiaocheng Feng, Heng Ji, Weiran Xu |
Abstract | Annotation projection is a practical method to deal with the low-resource problem in incident language (IL) processing. Previous annotation projection methods mainly relied on word alignment results without any training process, which led to noise propagation caused by word alignment errors. In this paper, we focus on the named entity recognition (NER) task and propose a weakly-supervised framework to project entity annotations from English to IL through bitexts. Instead of directly relying on word alignment results, this framework combines the advantages of rule-based methods and deep learning methods in two steps: first, it generates a high-confidence entity annotation set on the IL side with strict searching methods; second, it uses this high-confidence set to weakly supervise model training. The trained model is then used to carry out the projection. Experimental results on two low-resource ILs show that the proposed method can generate better annotations projected from English-IL parallel corpora. The performance of an IL name tagger can also be improved significantly by training on the newly projected IL annotation set. |
Tasks | Named Entity Recognition, Word Alignment |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1045/ |
https://www.aclweb.org/anthology/C16-1045 | |
PWC | https://paperswithcode.com/paper/bitext-name-tagging-for-cross-lingual-entity |
Repo | |
Framework | |
Brundlefly at SemEval-2016 Task 12: Recurrent Neural Networks vs. Joint Inference for Clinical Temporal Information Extraction
Title | Brundlefly at SemEval-2016 Task 12: Recurrent Neural Networks vs. Joint Inference for Clinical Temporal Information Extraction |
Authors | Jason Fries |
Abstract | |
Tasks | Entity Extraction, Structured Prediction, Temporal Information Extraction, Word Embeddings |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1198/ |
https://www.aclweb.org/anthology/S16-1198 | |
PWC | https://paperswithcode.com/paper/brundlefly-at-semeval-2016-task-12-recurrent-1 |
Repo | |
Framework | |
A Tag-based English Math Word Problem Solver with Understanding, Reasoning and Explanation
Title | A Tag-based English Math Word Problem Solver with Understanding, Reasoning and Explanation |
Authors | Chao-Chun Liang, Kuang-Yi Hsu, Chien-Tsung Huang, Chung-Min Li, Shen-Yu Miao, Keh-Yih Su |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/N16-3014/ |
https://www.aclweb.org/anthology/N16-3014 | |
PWC | https://paperswithcode.com/paper/a-tag-based-english-math-word-problem-solver |
Repo | |
Framework | |
Tools and Guidelines for Principled Machine Translation Development
Title | Tools and Guidelines for Principled Machine Translation Development |
Authors | Nora Aranberri, Eleftherios Avramidis, Aljoscha Burchardt, Ondřej Klejch, Martin Popel, Maja Popović |
Abstract | This work addresses the need to aid Machine Translation (MT) development cycles with a complete workflow of MT evaluation methods. Our aim is to assess, compare and improve MT system variants. We report on novel tools and practices that support various measures, developed in order to support a principled and informed approach to MT development. Our toolkit for automatic evaluation showcases quick and detailed comparison of MT system variants through automatic metrics and n-gram feedback, along with manual evaluation via edit-distance, error annotation and task-based feedback. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1296/ |
https://www.aclweb.org/anthology/L16-1296 | |
PWC | https://paperswithcode.com/paper/tools-and-guidelines-for-principled-machine |
Repo | |
Framework | |
Token-Level Metaphor Detection using Neural Networks
Title | Token-Level Metaphor Detection using Neural Networks |
Authors | Erik-Lân Do Dinh, Iryna Gurevych |
Abstract | |
Tasks | Feature Engineering, Machine Translation, Topic Models, Word Embeddings |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-1104/ |
https://www.aclweb.org/anthology/W16-1104 | |
PWC | https://paperswithcode.com/paper/token-level-metaphor-detection-using-neural |
Repo | |
Framework | |
Paraphrasing Out-of-Vocabulary Words with Word Embeddings and Semantic Lexicons for Low Resource Statistical Machine Translation
Title | Paraphrasing Out-of-Vocabulary Words with Word Embeddings and Semantic Lexicons for Low Resource Statistical Machine Translation |
Authors | Chenhui Chu, Sadao Kurohashi |
Abstract | Out-of-vocabulary (OOV) words are a crucial problem in statistical machine translation (SMT) with low resources. OOV paraphrasing, which augments the translation model for OOV words by using the translation knowledge of their paraphrases, has been proposed to address the OOV problem. In this paper, we propose using word embeddings and semantic lexicons for OOV paraphrasing. Experiments conducted in a low-resource setting of the OLYMPICS task of IWSLT 2012 verify the effectiveness of our proposed method. |
Tasks | Machine Translation, Word Embeddings |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1101/ |
https://www.aclweb.org/anthology/L16-1101 | |
PWC | https://paperswithcode.com/paper/paraphrasing-out-of-vocabulary-words-with |
Repo | |
Framework | |
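The general idea of paraphrasing an OOV word through word embeddings can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy embedding table, covers only the embedding part (no semantic lexicon), and simply picks the in-vocabulary word whose vector has the highest cosine similarity to the OOV word. The names `cosine`, `paraphrase_oov`, `embeddings`, and `known_vocab` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def paraphrase_oov(oov, embeddings, known_vocab, top_k=1):
    """Return up to top_k in-vocabulary words closest to the OOV word.

    embeddings  : dict mapping word -> vector (hypothetical toy table)
    known_vocab : words the translation model already covers
    """
    if oov not in embeddings:
        return []  # no vector for the OOV word, so nothing to compare
    scored = [
        (cosine(embeddings[oov], embeddings[w]), w)
        for w in known_vocab
        if w in embeddings and w != oov
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [w for _, w in scored[:top_k]]


# Toy data, invented for illustration only.
embeddings = {
    "sprint": [0.9, 0.1],
    "run": [0.8, 0.2],
    "eat": [0.1, 0.9],
}
known_vocab = {"run", "eat"}

print(paraphrase_oov("sprint", embeddings, known_vocab))
```

In a real system the selected paraphrase would then be looked up in the translation model so that its translations can stand in for the OOV word; how candidates are filtered and scored is left open here.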
Evaluation of acoustic word embeddings
Title | Evaluation of acoustic word embeddings |
Authors | Sahar Ghannay, Yannick Estève, Nathalie Camelin, Paul Deleglise |
Abstract | |
Tasks | Speech Recognition, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2511/ |
https://www.aclweb.org/anthology/W16-2511 | |
PWC | https://paperswithcode.com/paper/evaluation-of-acoustic-word-embeddings |
Repo | |
Framework | |
Subsumption Preservation as a Comparative Measure for Evaluating Sense-Directed Embeddings
Title | Subsumption Preservation as a Comparative Measure for Evaluating Sense-Directed Embeddings |
Authors | Ali Seyed |
Abstract | |
Tasks | Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2516/ |
https://www.aclweb.org/anthology/W16-2516 | |
PWC | https://paperswithcode.com/paper/subsumption-preservation-as-a-comparative |
Repo | |
Framework | |
Extracting Interlinear Glossed Text from LaTeX Documents
Title | Extracting Interlinear Glossed Text from LaTeX Documents |
Authors | Mathias Schenner, Sebastian Nordhoff |
Abstract | We present texigt, a command-line tool for the extraction of structured linguistic data from LaTeX source documents, and a language resource that has been generated using this tool: a corpus of interlinear glossed text (IGT) extracted from open access books published by Language Science Press. Extracted examples are represented in a simple XML format that is easy to process and can be used to validate certain aspects of interlinear glossed text. The main challenge involved is the parsing of TeX and LaTeX documents. We review why this task is impossible in general and how the texhs Haskell library uses a layered architecture and selective early evaluation (expansion) during lexing and parsing in order to provide access to structured representations of LaTeX documents at several levels. In particular, its parsing modules generate an abstract syntax tree for LaTeX documents after expansion of all user-defined macros and lexer-level commands that serves as an ideal interface for the extraction of interlinear glossed text by texigt. This architecture can easily be adapted to extract other types of linguistic data structures from LaTeX source documents. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1638/ |
https://www.aclweb.org/anthology/L16-1638 | |
PWC | https://paperswithcode.com/paper/extracting-interlinear-glossed-text-from |
Repo | |
Framework | |