Paper Group NANR 141
TencentFmRD Neural Machine Translation for WMT18
Title | TencentFmRD Neural Machine Translation for WMT18 |
Authors | Bojie Hu, Ambyer Han, Shen Huang |
Abstract | This paper describes the Neural Machine Translation (NMT) system of TencentFmRD for the Chinese↔English news translation tasks of WMT 2018. Our systems are neural machine translation systems trained with our original system, TenTrans. TenTrans is an improved NMT system based on the Transformer self-attention mechanism. In addition to the basic settings of Transformer training, TenTrans uses multi-model fusion techniques, multiple features reranking, different segmentation models and joint learning. Finally, we adopt some data selection strategies to fine-tune the trained system and achieve a stable performance improvement. Our Chinese→English system achieved the second best BLEU scores and fourth best cased BLEU scores among all WMT18 submitted systems. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6413/ |
https://www.aclweb.org/anthology/W18-6413 | |
PWC | https://paperswithcode.com/paper/tencentfmrd-neural-machine-translation-for |
Repo | |
Framework | |
CUNI Submissions in WMT18
Title | CUNI Submissions in WMT18 |
Authors | Tom Kocmi, Roman Sudarikov, Ondřej Bojar |
Abstract | We participated in the WMT 2018 shared news translation task in three language pairs: English-Estonian, English-Finnish, and English-Czech. Our main focus was the low-resource language pair of Estonian and English, for which we utilized Finnish parallel data in a simple method. We first train a "parent model" for the high-resource language pair, followed by adaptation on the related low-resource language pair. This approach brings a substantial performance boost over the baseline system trained only on Estonian-English parallel data. Our systems are based on the Transformer architecture. For the English to Czech translation, we have evaluated last year's models of the hybrid phrase-based approach and neural machine translation, mainly for comparison purposes. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6416/ |
https://www.aclweb.org/anthology/W18-6416 | |
PWC | https://paperswithcode.com/paper/cuni-submissions-in-wmt18 |
Repo | |
Framework | |
Joint Part-of-Speech and Language ID Tagging for Code-Switched Data
Title | Joint Part-of-Speech and Language ID Tagging for Code-Switched Data |
Authors | Victor Soto, Julia Hirschberg |
Abstract | Code-switching is the fluent alternation between two or more languages in conversation between bilinguals. Large populations of speakers code-switch during communication, but little effort has been made to develop tools for code-switching, including part-of-speech taggers. In this paper, we propose an approach to POS tagging of code-switched English-Spanish data based on recurrent neural networks. We test our model on known monolingual benchmarks to demonstrate that our neural POS tagging model is on par with state-of-the-art methods. We next test our code-switched methods on the Miami Bangor corpus of English-Spanish conversation, focusing on two types of experiments: POS tagging alone, for which we achieve 96.34% accuracy, and joint part-of-speech and language ID tagging, which achieves similar POS tagging accuracy (96.39%) and very high language ID accuracy (98.78%). Finally, we show that our proposed models outperform other state-of-the-art code-switched taggers. |
Tasks | Language Modelling, Machine Translation, Part-Of-Speech Tagging, Speech Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3201/ |
https://www.aclweb.org/anthology/W18-3201 | |
PWC | https://paperswithcode.com/paper/joint-part-of-speech-and-language-id-tagging |
Repo | |
Framework | |
UNCC QA: Biomedical Question Answering system
Title | UNCC QA: Biomedical Question Answering system |
Authors | Abhishek Bhandwaldar, Wlodek Zadrozny |
Abstract | In this paper, we detail our submission to the BioASQ competition's Biomedical Semantic Question and Answering task. Our system uses extractive summarization techniques to generate answers and scored the highest ROUGE-2 and ROUGE-SU4 in all test batch sets. Our contributions are a named-entity based method for answering factoid and list questions, and an extractive summarization technique for building paragraph-sized summaries, based on lexical chains. Our system got the highest ROUGE-2 and ROUGE-SU4 scores for ideal-type answers in all test batch sets. We also discuss the limitations of the described system, such as the lack of evaluation on other criteria (e.g. manual). Also, for factoid- and list-type questions our system got low accuracy (which suggests that our algorithm needs to improve in the ranking of entities). |
Tasks | Question Answering |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5308/ |
https://www.aclweb.org/anthology/W18-5308 | |
PWC | https://paperswithcode.com/paper/uncc-qa-biomedical-question-answering-system |
Repo | |
Framework | |
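The abstract above reports ROUGE-2 scores, which measure bigram overlap between a generated answer and a reference answer. As a rough illustration only, the recall variant can be sketched as follows. This is a simplified sketch that assumes whitespace tokenization and lowercasing; it is not the BioASQ evaluation script or the official ROUGE toolkit (which offers options such as stemming), and the function names are hypothetical.

```python
from collections import Counter


def bigrams(tokens):
    """Count adjacent token pairs in a token list."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_recall(candidate, reference):
    """Simplified ROUGE-2 recall: the share of reference bigrams
    that also occur in the candidate (with clipped counts).

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    total = sum(ref.values())
    if total == 0:
        # A reference with fewer than two tokens has no bigrams.
        return 0.0
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    print(rouge2_recall("the cat sat on the mat", "the cat lay on the mat"))
```

In the usage line, three of the five reference bigrams ("the cat", "on the", "the mat") appear in the candidate, so the recall is 3/5.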
What represents "style" in authorship attribution?
Title | What represents "style" in authorship attribution? |
Authors | Kalaivani Sundararajan, Damon Woodard |
Abstract | Authorship attribution typically uses all information representing both content and style, whereas attribution based only on stylistic aspects may be robust in cross-domain settings. This paper analyzes different linguistic aspects that may help represent style. Specifically, we study the role of syntax and lexical words (nouns, verbs, adjectives and adverbs) in representing style. We use a purely syntactic language model to study the significance of sentence structures in both single-domain and cross-domain attribution, i.e. cross-topic and cross-genre attribution. We show that syntax may be helpful for cross-genre attribution, while cross-topic and single-domain attribution may benefit from additional lexical information. Further, pure syntactic models may not be effective by themselves and need to be used in combination with other robust models. To study the role of word choice, we perform attribution by masking all words or specific topic words corresponding to nouns, verbs, adjectives and adverbs. Using a single-domain dataset, IMDB1M reviews, we demonstrate the heavy influence of common nouns and proper nouns in attribution, thereby highlighting topic interference. Using the cross-domain Guardian10 dataset, we show that some common nouns, verbs, adjectives and adverbs may help with stylometric attribution, as demonstrated by masking topic words corresponding to these parts of speech. As expected, it was observed that proper nouns are heavily influenced by content and that cross-domain attribution will benefit from completely masking them. |
Tasks | Language Modelling |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1238/ |
https://www.aclweb.org/anthology/C18-1238 | |
PWC | https://paperswithcode.com/paper/what-represents-style-in-authorship |
Repo | |
Framework | |
Policy Optimization with Demonstrations
Title | Policy Optimization with Demonstrations |
Authors | Bingyi Kang, Zequn Jie, Jiashi Feng |
Abstract | Exploration remains a significant challenge to reinforcement learning methods, especially in environments where reward signals are sparse. Recent methods of learning from demonstrations have been shown to be promising in overcoming exploration difficulties, but they typically require a considerable number of high-quality demonstrations that are difficult to collect. We propose to effectively leverage available demonstrations to guide exploration through enforcing occupancy measure matching between the learned policy and current demonstrations, and develop a novel Policy Optimization from Demonstration (POfD) method. We show that POfD induces implicit dynamic reward shaping and brings provable benefits for policy improvement. Furthermore, it can be combined with policy gradient methods to produce state-of-the-art results, as demonstrated experimentally on a range of popular benchmark sparse-reward tasks, even when the demonstrations are few and imperfect. |
Tasks | Policy Gradient Methods |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1956 |
http://proceedings.mlr.press/v80/kang18a/kang18a.pdf | |
PWC | https://paperswithcode.com/paper/policy-optimization-with-demonstrations |
Repo | |
Framework | |
Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data
Title | Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data |
Authors | Koel Dutta Chowdhury, Mohammed Hasanuzzaman, Qun Liu |
Abstract | In this paper, we investigate the effectiveness of training a multimodal neural machine translation (MNMT) system with image features for a low-resource language pair, Hindi and English, using synthetic data. A three-way parallel corpus which contains bilingual texts and corresponding images is required to train an MNMT system with image features. However, such a corpus is not available for low-resource language pairs. To address this, we developed both a synthetic training dataset and a manually curated development/test dataset for Hindi based on an existing English-image parallel corpus. We used these datasets to build our image description translation system by adopting state-of-the-art MNMT models. Our results show that it is possible to train an MNMT system for low-resource language pairs through the use of synthetic data and that such a system can benefit from image features. |
Tasks | Machine Translation, Question Answering, Video Description, Visual Question Answering |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3405/ |
https://www.aclweb.org/anthology/W18-3405 | |
PWC | https://paperswithcode.com/paper/multimodal-neural-machine-translation-for-low |
Repo | |
Framework | |
Predicting the presence of a Matrix Language in code-switching
Title | Predicting the presence of a Matrix Language in code-switching |
Authors | Barbara Bullock, Wally Guzmán, Jacqueline Serigos, Vivek Sharath, Almeida Jacqueline Toribio |
Abstract | One language is often assumed to be dominant in code-switching but this assumption has not been empirically tested. We operationalize the matrix language (ML) at the level of the sentence, using three common definitions from linguistics. We test whether these converge and then model this convergence via a set of metrics that together quantify the nature of C-S. We conduct our experiment on four Spanish-English corpora. Our results demonstrate that our model can separate some corpora according to whether they have a dominant ML or not but that the corpora span a range of mixing types that cannot be sorted neatly into an insertional vs. alternational dichotomy. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3208/ |
https://www.aclweb.org/anthology/W18-3208 | |
PWC | https://paperswithcode.com/paper/predicting-the-presence-of-a-matrix-language |
Repo | |
Framework | |
Cross-lingual Terminology Extraction for Translation Quality Estimation
Title | Cross-lingual Terminology Extraction for Translation Quality Estimation |
Authors | Yu Yuan, Yuze Gao, Yue Zhang, Serge Sharoff |
Abstract | |
Tasks | Machine Translation |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1596/ |
https://www.aclweb.org/anthology/L18-1596 | |
PWC | https://paperswithcode.com/paper/cross-lingual-terminology-extraction-for |
Repo | |
Framework | |
The Role of Diacritics in Increasing the Difficulty of Arabic Lexical Recognition Tests
Title | The Role of Diacritics in Increasing the Difficulty of Arabic Lexical Recognition Tests |
Authors | Osama Hamed, Torsten Zesch |
Abstract | |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-7103/ |
https://www.aclweb.org/anthology/W18-7103 | |
PWC | https://paperswithcode.com/paper/the-role-of-diacritics-in-increasing-the |
Repo | |
Framework | |
Being data-driven is not enough: Revisiting interactive instruction giving as a challenge for NLG
Title | Being data-driven is not enough: Revisiting interactive instruction giving as a challenge for NLG |
Authors | Sina Zarrieß, David Schlangen |
Abstract | Modeling traditional NLG tasks with data-driven techniques has been a major focus of research in NLG in the past decade. We argue that existing modeling techniques are mostly tailored to textual data and are not sufficient to make NLG technology meet the requirements of agents which target fluid interaction and collaboration in the real world. We revisit interactive instruction giving as a challenge for data-driven NLG and, based on insights from previous GIVE challenges, propose that instruction giving should be addressed in a setting that involves visual grounding and spoken language. These basic design decisions will require NLG frameworks that are capable of monitoring their environment as well as timing and revising their verbal output. We believe that these are core capabilities for making NLG technology transferable to interactive systems. |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6906/ |
https://www.aclweb.org/anthology/W18-6906 | |
PWC | https://paperswithcode.com/paper/being-data-driven-is-not-enough-revisiting |
Repo | |
Framework | |
Unsupervised Token-wise Alignment to Improve Interpretation of Encoder-Decoder Models
Title | Unsupervised Token-wise Alignment to Improve Interpretation of Encoder-Decoder Models |
Authors | Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata |
Abstract | Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor. Conventionally, many studies have used an attention matrix to interpret how Encoder-Decoder-based models translate a given source sentence to the corresponding target sentence. However, recent studies have empirically revealed that an attention matrix is not optimal for token-wise translation analyses. We propose a method that explicitly models the token-wise alignment between the source and target sequences to provide a better analysis. Experiments show that our method can acquire token-wise alignments that are superior to those of an attention mechanism. |
Tasks | Machine Translation, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5410/ |
https://www.aclweb.org/anthology/W18-5410 | |
PWC | https://paperswithcode.com/paper/unsupervised-token-wise-alignment-to-improve |
Repo | |
Framework | |
Feedback Strategies for Form and Meaning in a Real-life Language Tutoring System
Title | Feedback Strategies for Form and Meaning in a Real-life Language Tutoring System |
Authors | Ramon Ziai, Bjoern Rudzewitz, Kordula De Kuthy, Florian Nuxoll, Detmar Meurers |
Abstract | |
Tasks | Language Acquisition |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-7110/ |
https://www.aclweb.org/anthology/W18-7110 | |
PWC | https://paperswithcode.com/paper/feedback-strategies-for-form-and-meaning-in-a |
Repo | |
Framework | |
Generalized Loss-Sensitive Adversarial Learning with Manifold Margins
Title | Generalized Loss-Sensitive Adversarial Learning with Manifold Margins |
Authors | Marzieh Edraki, Guo-Jun Qi |
Abstract | The classic Generative Adversarial Net and its variants can be roughly categorized into two large families: the unregularized versus regularized GANs. By relaxing the non-parametric assumption on the discriminator in the classic GAN, the regularized GANs have better generalization ability to produce new samples drawn from the real distribution. It is well known that real data such as natural images are not uniformly distributed over the whole data space. Instead, they are often restricted to a low-dimensional manifold of the ambient space. Such a manifold assumption suggests that the distance over the manifold should be a better measure to characterize the distinction between real and fake samples. Thus, we define a pullback operator to map samples back to their data manifold, and a manifold margin is defined as the distance between the pullback representations to distinguish between real and fake samples and learn the optimal generators. We justify the effectiveness of the proposed model both theoretically and empirically. |
Tasks | |
Published | 2018-09-01 |
URL | http://openaccess.thecvf.com/content_ECCV_2018/html/Marzieh_Edraki_Generalized_Loss-Sensitive_Adversarial_ECCV_2018_paper.html |
http://openaccess.thecvf.com/content_ECCV_2018/papers/Marzieh_Edraki_Generalized_Loss-Sensitive_Adversarial_ECCV_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/generalized-loss-sensitive-adversarial |
Repo | |
Framework | |
Aggression Identification and Multi Lingual Word Embeddings
Title | Aggression Identification and Multi Lingual Word Embeddings |
Authors | Thiago Galery, Efstathios Charitos, Ye Tian |
Abstract | The system presented here took part in the 2018 Trolling, Aggression and Cyberbullying shared task (Forest and Trees team) and uses a Gated Recurrent Neural Network architecture (Cho et al., 2014) in an attempt to assess whether combining pre-trained English and Hindi fastText (Mikolov et al., 2018) word embeddings as a representation of the sequence input would improve classification performance. The motivation for this comes from the fact that the shared task data for English contained many Hindi tokens, and therefore some users might be code-switching: alternating between two or more languages in communication. To test this hypothesis, we also aligned Hindi and English vectors using pre-computed SVD matrices that pull representations from different languages into a common space (Smith et al., 2017). Two conditions were tested: (i) one with standard pre-trained fastText word embeddings where each Hindi word is treated as an OOV token, and (ii) another where word embeddings for Hindi and English are loaded in a common vector space, so Hindi tokens can be assigned a meaningful representation. We submitted the second (i.e., multilingual) system and obtained scores of 0.531 weighted F1 for the EN-FB dataset and 0.438 weighted F1 for the EN-TW dataset. |
Tasks | Multi-Label Text Classification, Text Classification, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4409/ |
https://www.aclweb.org/anthology/W18-4409 | |
PWC | https://paperswithcode.com/paper/aggression-identification-and-multi-lingual |
Repo | |
Framework | |
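The scores in the abstract above are reported as weighted F1, which averages the per-class F1 scores using each class's share of the gold labels as its weight. The sketch below illustrates the arithmetic in plain Python. It is an assumption-laden illustration rather than the shared task's official scorer, and the function name and the label names in the usage example are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Each class's F1 is weighted by the fraction of gold labels
    that belong to that class. Classes that never occur in
    y_true contribute nothing to the average.
    """
    support = Counter(y_true)
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when undefined.
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


if __name__ == "__main__":
    # Hypothetical three-class example with made-up labels.
    gold = ["none", "none", "overt", "covert"]
    pred = ["none", "overt", "overt", "covert"]
    print(weighted_f1(gold, pred))
```

In this example the per-class F1 values are 2/3 ("none"), 2/3 ("overt") and 1 ("covert"), with weights 0.5, 0.25 and 0.25, giving 0.75 overall. Weighting by support means frequent classes dominate the score, which matters when the class distribution is skewed.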