Paper Group NANR 2
Recurrent Residual Learning for Sequence Classification
Title | Recurrent Residual Learning for Sequence Classification |
Authors | Yiren Wang, Fei Tian |
Abstract | |
Tasks | Text Classification |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/D16-1093/ |
https://www.aclweb.org/anthology/D16-1093 | |
PWC | https://paperswithcode.com/paper/recurrent-residual-learning-for-sequence |
Repo | |
Framework | |
Towards a Language Service Infrastructure for Mobile Environments
Title | Towards a Language Service Infrastructure for Mobile Environments |
Authors | Ngoc Nguyen, Donghui Lin, Takao Nakaguchi, Toru Ishida |
Abstract | Since mobile devices have feature-rich configurations and provide diverse functions, the use of mobile devices combined with the language resources of cloud environments is highly promising for achieving wide-ranging communication that goes beyond the current language barrier. However, there are mismatches between using resources of mobile devices and services in the cloud, such as different communication protocols and different input and output methods. In this paper, we propose a language service infrastructure for mobile environments to combine these services. The proposed language service infrastructure allows users to use and mash up existing language resources on both cloud environments and their mobile devices. Furthermore, it allows users to flexibly use services in the cloud or services on mobile devices in their composite service without implementing several different composite services that have the same functionality. A case study of a Mobile Shopping Translation System using both a service in the cloud (translation service) and services on mobile devices (Bluetooth low energy (BLE) service and text-to-speech service) is introduced. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1708/ |
https://www.aclweb.org/anthology/L16-1708 | |
PWC | https://paperswithcode.com/paper/towards-a-language-service-infrastructure-for |
Repo | |
Framework | |
Identifying Temporal Orientation of Word Senses
Title | Identifying Temporal Orientation of Word Senses |
Authors | Mohammed Hasanuzzaman, Gaël Dias, Stéphane Ferrari, Yann Mathet, Andy Way |
Abstract | |
Tasks | Information Retrieval |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/K16-1003/ |
https://www.aclweb.org/anthology/K16-1003 | |
PWC | https://paperswithcode.com/paper/identifying-temporal-orientation-of-word |
Repo | |
Framework | |
Deterministic natural language generation from meaning representations for machine translation
Title | Deterministic natural language generation from meaning representations for machine translation |
Authors | Alastair Butler |
Abstract | |
Tasks | Machine Translation, Semantic Parsing, Text Generation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0601/ |
https://www.aclweb.org/anthology/W16-0601 | |
PWC | https://paperswithcode.com/paper/deterministic-natural-language-generation |
Repo | |
Framework | |
Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories
Title | Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories |
Authors | Robert Reynolds |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0534/ |
https://www.aclweb.org/anthology/W16-0534 | |
PWC | https://paperswithcode.com/paper/insights-from-russian-second-language |
Repo | |
Framework | |
CItA: an L1 Italian Learners Corpus to Study the Development of Writing Competence
Title | CItA: an L1 Italian Learners Corpus to Study the Development of Writing Competence |
Authors | Alessia Barbagli, Pietro Lucisano, Felice Dell'Orletta, Simonetta Montemagni, Giulia Venturi |
Abstract | In this paper, we present the CItA corpus (Corpus Italiano di Apprendenti L1), a collection of essays written by Italian L1 learners collected during the first and second year of lower secondary school. The corpus was built in the framework of an interdisciplinary study jointly carried out by computational linguists and experimental pedagogists and aimed at tracking the development of written language competence over the years and students' background information. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1014/ |
https://www.aclweb.org/anthology/L16-1014 | |
PWC | https://paperswithcode.com/paper/cita-an-l1-italian-learners-corpus-to-study |
Repo | |
Framework | |
Assessing the Corpus Size vs. Similarity Trade-off for Word Embeddings in Clinical NLP
Title | Assessing the Corpus Size vs. Similarity Trade-off for Word Embeddings in Clinical NLP |
Authors | Kirk Roberts |
Abstract | The proliferation of deep learning methods in natural language processing (NLP) and the large amounts of data they often require stand in stark contrast to the relatively data-poor clinical NLP domain. In particular, large text corpora are necessary to build high-quality word embeddings, yet large corpora that are suitably representative of the target clinical data are often unavailable. This forces a choice between building embeddings from small clinical corpora and building them from less representative, larger corpora. This paper explores this trade-off, as well as intermediate compromise solutions. Two standard clinical NLP tasks (the i2b2 2010 concept and assertion tasks) are evaluated with commonly used deep learning models (recurrent neural networks and convolutional neural networks) using a set of six corpora ranging from the target i2b2 data to large open-domain datasets. While combinations of corpora are generally found to work best, the single-best corpus is generally task-dependent. |
Tasks | Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4208/ |
https://www.aclweb.org/anthology/W16-4208 | |
PWC | https://paperswithcode.com/paper/assessing-the-corpus-size-vs-similarity-trade |
Repo | |
Framework | |
Coreference Resolution for the Basque Language with BART
Title | Coreference Resolution for the Basque Language with BART |
Authors | Ander Soraluze, Olatz Arregi, Xabier Arregi, Arantza Díaz de Ilarraza, Mijail Kabadjov, Massimo Poesio |
Abstract | |
Tasks | Chunking, Coreference Resolution, Morphological Analysis, Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0710/ |
https://www.aclweb.org/anthology/W16-0710 | |
PWC | https://paperswithcode.com/paper/coreference-resolution-for-the-basque |
Repo | |
Framework | |
Sense Anaphoric Pronouns: Am I One?
Title | Sense Anaphoric Pronouns: Am I One? |
Authors | Marta Recasens, Zhichao Hu, Olivia Rhinehart |
Abstract | |
Tasks | Coreference Resolution |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0701/ |
https://www.aclweb.org/anthology/W16-0701 | |
PWC | https://paperswithcode.com/paper/sense-anaphoric-pronouns-am-i-one |
Repo | |
Framework | |
Residual Stacking of RNNs for Neural Machine Translation
Title | Residual Stacking of RNNs for Neural Machine Translation |
Authors | Raphael Shu, Akiva Miura |
Abstract | To enhance Neural Machine Translation models, several obvious ways such as enlarging the hidden size of recurrent layers and stacking multiple layers of RNN can be considered. Surprisingly, we observe that using naively stacked RNNs in the decoder slows down the training and leads to degradation in performance. In this paper, we demonstrate that applying residual connections in the depth of stacked RNNs can help the optimization, which is referred to as residual stacking. In empirical evaluation, residual stacking of decoder RNNs gives superior results compared to other methods of enhancing the model with a fixed parameter budget. Our submitted systems in WAT2016 are based on an NMT model ensemble with residual stacking in the decoder. To further improve the performance, we also attempt various methods of system combination in our experiments. |
Tasks | Machine Translation, Word Embeddings |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4623/ |
https://www.aclweb.org/anthology/W16-4623 | |
PWC | https://paperswithcode.com/paper/residual-stacking-of-rnns-for-neural-machine |
Repo | |
Framework | |
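The residual stacking idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the recurrent unit or exactly which states are summed, so the code below assumes plain tanh RNN layers of equal width and adds each layer's input to its output at every time step. The names `rnn_step`, `init_layer`, and `run_stack` are hypothetical.

```python
import math
import random

def rnn_step(W, U, b, x, h):
    """One tanh RNN step: h' = tanh(W x + U h + b), all vectors of size d."""
    d = len(b)
    return [
        math.tanh(
            sum(W[i][j] * x[j] for j in range(d))
            + sum(U[i][j] * h[j] for j in range(d))
            + b[i]
        )
        for i in range(d)
    ]

def init_layer(d, rng):
    """Small random weights for one layer (hypothetical initialisation)."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]
    return {"W": mat(), "U": mat(), "b": [0.0] * d}

def run_stack(layers, inputs, residual=True):
    """Run a stack of RNN layers over a sequence of input vectors.

    With residual=True, each layer's output at every time step is the sum of
    the layer's new hidden state and the layer's input (assumed formulation).
    """
    d = len(inputs[0])
    seq = inputs
    for layer in layers:
        h = [0.0] * d
        out = []
        for x in seq:
            h = rnn_step(layer["W"], layer["U"], layer["b"], x, h)
            out.append([hi + xi for hi, xi in zip(h, x)] if residual else h)
        seq = out
    return seq

if __name__ == "__main__":
    rng = random.Random(0)
    layers = [init_layer(3, rng) for _ in range(4)]
    xs = [[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]]
    print(run_stack(layers, xs, residual=True))
```

With the skip connection, the input signal reaches the top of the stack unchanged even when the layers contribute little, which is one common intuition for why residual connections ease the optimization of deep stacks.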
Jointly Learning to Embed and Predict with Multiple Languages
Title | Jointly Learning to Embed and Predict with Multiple Languages |
Authors | Daniel C. Ferreira, André F. T. Martins, Mariana S. C. Almeida |
Abstract | |
Tasks | Cross-Lingual Transfer, Language Modelling, Machine Translation, Sentiment Analysis, Text Classification, Word Embeddings |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1190/ |
https://www.aclweb.org/anthology/P16-1190 | |
PWC | https://paperswithcode.com/paper/jointly-learning-to-embed-and-predict-with |
Repo | |
Framework | |
TÜBİTAK SMT System Submission for WMT2016
Title | TÜBİTAK SMT System Submission for WMT2016 |
Authors | Emre Bektaş, Ertuğrul Yilmaz, Coşkun Mermer, İlknur Durgar El-Kahlout |
Abstract | |
Tasks | Language Modelling, Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/W16-2305/ |
https://www.aclweb.org/anthology/W16-2305 | |
PWC | https://paperswithcode.com/paper/tabatak-smt-system-submission-for-wmt2016 |
Repo | |
Framework | |
LILI: A Simple Language Independent Approach for Language Identification
Title | LILI: A Simple Language Independent Approach for Language Identification |
Authors | Mohamed Al-Badrashiny, Mona Diab |
Abstract | We introduce a generic Language Independent Framework for Linguistic Code Switch Point Detection. The system uses character-level 5-gram and word-level unigram language models to train a conditional random fields (CRF) model for classifying input words into various languages. We test our proposed framework and compare it to the state-of-the-art published systems on standard data sets from several language pairs: English-Spanish, Nepali-English, English-Hindi, Arabizi (refers to Arabic written using the Latin/Roman script)-English, Arabic-Engari (refers to English written using Arabic script), Modern Standard Arabic (MSA)-Egyptian, Levantine-MSA, Gulf-MSA, one more English-Spanish, and one more MSA-EGY. The overall weighted average F-scores for the language pairs are 96.4%, 97.3%, 98.0%, 97.0%, 98.9%, 86.3%, 88.2%, 90.6%, 95.2%, and 85.0%, respectively. The results show that our approach, despite its simplicity, either outperforms or performs at comparable levels to state-of-the-art published systems. |
Tasks | Language Identification |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-1115/ |
https://www.aclweb.org/anthology/C16-1115 | |
PWC | https://paperswithcode.com/paper/lili-a-simple-language-independent-approach |
Repo | |
Framework | |
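The abstract above mentions character-level 5-gram language models as one input to the CRF. As a rough illustration only, the sketch below extracts character n-grams from a word and scores the word against per-language n-gram counts; such scores could then serve as features for a sequence classifier. The padding scheme, the add-one smoothing, and the names `char_ngrams`, `train_counts`, and `log_score` are assumptions made for this example, not details taken from the paper, and the CRF itself is not implemented here.

```python
import math
from collections import Counter

def char_ngrams(word, n=5):
    """Return the character n-grams of a word, padded with '_' on both sides."""
    padded = "_" * (n - 1) + word.lower() + "_" * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train_counts(words, n=5):
    """Count character n-grams over a list of words from one language."""
    counts = Counter()
    for w in words:
        counts.update(char_ngrams(w, n))
    return counts

def log_score(word, counts, n=5):
    """Average add-one smoothed log-probability of the word's n-grams."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen n-grams
    grams = char_ngrams(word, n)
    return sum(
        math.log((counts[g] + 1) / (total + vocab)) for g in grams
    ) / len(grams)

if __name__ == "__main__":
    # Toy word lists, invented purely for illustration.
    en = train_counts(["the", "there", "then"], n=3)
    es = train_counts(["que", "quien", "queso"], n=3)
    for w in ["them", "quiero"]:
        print(w, log_score(w, en, 3), log_score(w, es, 3))
```

In a setup like the one the abstract describes, one such score per candidate language would be computed for each word and passed, together with word-level unigram scores, to the classifier that labels each token.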
Eliciting Categorical Data for Optimal Aggregation
Title | Eliciting Categorical Data for Optimal Aggregation |
Authors | Chien-Ju Ho, Rafael Frongillo, Yiling Chen |
Abstract | Models for collecting and aggregating categorical data on crowdsourcing platforms typically fall into two broad categories: those assuming agents are honest and consistent but have heterogeneous error rates, and those assuming agents are strategic and seek to maximize their expected reward. The former often leads to tractable aggregation of elicited data, while the latter usually focuses on optimal elicitation and does not consider aggregation. In this paper, we develop a Bayesian model, wherein agents have differing quality of information, but also respond to incentives. Our model generalizes both categories and enables the joint exploration of optimal elicitation and aggregation. This model enables our exploration, both analytically and experimentally, of optimal aggregation of categorical data and optimal multiple-choice interface design. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6237-eliciting-categorical-data-for-optimal-aggregation |
http://papers.nips.cc/paper/6237-eliciting-categorical-data-for-optimal-aggregation.pdf | |
PWC | https://paperswithcode.com/paper/eliciting-categorical-data-for-optimal |
Repo | |
Framework | |
Idiom Token Classification using Sentential Distributed Semantics
Title | Idiom Token Classification using Sentential Distributed Semantics |
Authors | Giancarlo Salton, Robert Ross, John Kelleher |
Abstract | |
Tasks | Machine Translation |
Published | 2016-08-01 |
URL | https://www.aclweb.org/anthology/P16-1019/ |
https://www.aclweb.org/anthology/P16-1019 | |
PWC | https://paperswithcode.com/paper/idiom-token-classification-using-sentential |
Repo | |
Framework | |