Paper Group NANR 68
The JHU Machine Translation Systems for WMT 2018. A Chinese Writing Correction System for Learning Chinese as a Foreign Language. Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions. A Case Study on Learning a Unified Encoder of Relations. Three Strategies to Improve One-to-Many Multilingual Translation …
The JHU Machine Translation Systems for WMT 2018
Title | The JHU Machine Translation Systems for WMT 2018 |
Authors | Philipp Koehn, Kevin Duh, Brian Thompson |
Abstract | We report on the efforts of the Johns Hopkins University to develop neural machine translation systems for the shared task for news translation organized around the Conference for Machine Translation (WMT) 2018. We developed systems for German–English, English–German, and Russian–English. Our novel contributions are iterative back-translation and fine-tuning on test sets from prior years. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6417/ |
https://www.aclweb.org/anthology/W18-6417 | |
PWC | https://paperswithcode.com/paper/the-jhu-machine-translation-systems-for-wmt |
Repo | |
Framework | |
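The iterative back-translation mentioned in the abstract can be sketched as a loop that alternates the two translation directions, each round enlarging the training data with synthetic pairs. This is a minimal illustration, not the authors' implementation; `train_model` and `translate` are hypothetical stand-ins for real NMT training and decoding.

```python
def iterative_back_translation(parallel, mono_tgt, mono_src,
                               train_model, translate, rounds=2):
    """Alternately train tgt->src and src->tgt models; each round,
    back-translate monolingual text with the latest model in the other
    direction and add the synthetic pairs to the training data."""
    data_fwd = list(parallel)                    # (src, tgt) pairs
    data_bwd = [(t, s) for s, t in parallel]     # (tgt, src) pairs
    fwd = bwd = None
    for _ in range(rounds):
        bwd = train_model(data_bwd)              # tgt -> src model
        # synthetic sources for target-side monolingual sentences
        synthetic_fwd = [(translate(bwd, t), t) for t in mono_tgt]
        fwd = train_model(data_fwd + synthetic_fwd)   # src -> tgt model
        # synthetic targets for source-side monolingual sentences
        synthetic_bwd = [(translate(fwd, s), s) for s in mono_src]
        data_bwd = [(t, s) for s, t in parallel] + synthetic_bwd
    return fwd, bwd
```

Each round's synthetic data is regenerated with the freshly trained model, which is what distinguishes the iterative variant from one-shot back-translation.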
A Chinese Writing Correction System for Learning Chinese as a Foreign Language
Title | A Chinese Writing Correction System for Learning Chinese as a Foreign Language |
Authors | Yow-Ting Shiue, Hen-Hsen Huang, Hsin-Hsi Chen |
Abstract | We present a Chinese writing correction system for learning Chinese as a foreign language. The system takes an input sentence containing errors and generates several correction suggestions. It also retrieves example Chinese sentences with English translations, helping users understand the correct usages of certain grammar patterns. This is the first available Chinese writing error correction system based on the neural machine translation framework. We discuss several design choices and show empirical results to support our decisions. |
Tasks | Grammatical Error Correction, Machine Translation, Word Embeddings |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-2030/ |
https://www.aclweb.org/anthology/C18-2030 | |
PWC | https://paperswithcode.com/paper/a-chinese-writing-correction-system-for |
Repo | |
Framework | |
Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions
Title | Construction of a Chinese Corpus for the Analysis of the Emotionality of Metaphorical Expressions |
Authors | Dongyu Zhang, Hongfei Lin, Liang Yang, Shaowu Zhang, Bo Xu |
Abstract | Metaphors are frequently used to convey emotions. However, there is little research on the construction of metaphor corpora annotated with emotion for the analysis of the emotionality of metaphorical expressions. Furthermore, most studies focus on English; few address emotion analysis of metaphorical text in other languages, particularly Sino-Tibetan languages such as Chinese, although emotional expression through metaphor likely differs considerably across languages. We therefore construct a significant new corpus on metaphor, with 5,605 manually annotated sentences in Chinese. We present an annotation scheme that contains annotations of linguistic metaphors, emotional categories (joy, anger, sadness, fear, love, disgust and surprise), and intensity. The annotation agreement analyses for multiple annotators are described. We also use the corpus to explore and analyze the emotionality of metaphors. To the best of our knowledge, this is the first relatively large metaphor corpus with an annotation of emotions in Chinese. |
Tasks | Emotion Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2024/ |
https://www.aclweb.org/anthology/P18-2024 | |
PWC | https://paperswithcode.com/paper/construction-of-a-chinese-corpus-for-the |
Repo | |
Framework | |
A Case Study on Learning a Unified Encoder of Relations
Title | A Case Study on Learning a Unified Encoder of Relations |
Authors | Lisheng Fu, Bonan Min, Thien Huu Nguyen, Ralph Grishman |
Abstract | Typical relation extraction models are trained on a single corpus annotated with a pre-defined relation schema. An individual corpus is often small, and the models may often be biased or overfitted to the corpus. We hypothesize that we can learn a better representation by combining multiple relation datasets. We attempt to use a shared encoder to learn the unified feature representation and to augment it with regularization by adversarial training. The additional corpora feeding the encoder can help to learn a better feature representation layer even though the relation schemas are different. We use ACE05 and ERE datasets as our case study for experiments. The multi-task model obtains significant improvement on both datasets. |
Tasks | Knowledge Base Population, Relation Extraction |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6126/ |
https://www.aclweb.org/anthology/W18-6126 | |
PWC | https://paperswithcode.com/paper/a-case-study-on-learning-a-unified-encoder-of |
Repo | |
Framework | |
Three Strategies to Improve One-to-Many Multilingual Translation
Title | Three Strategies to Improve One-to-Many Multilingual Translation |
Authors | Yining Wang, Jiajun Zhang, Feifei Zhai, Jingfang Xu, Chengqing Zong |
Abstract | Due to the benefits of model compactness, multilingual translation (including many-to-one, many-to-many and one-to-many) based on a universal encoder-decoder architecture has attracted increasing attention. However, previous studies show that one-to-many translation based on this framework cannot perform on par with the individually trained models. In this work, we introduce three strategies to improve one-to-many multilingual translation by balancing the shared and unique features. Within the architecture of one decoder for all target languages, we first exploit the use of unique initial states for different target languages. Then, we employ language-dependent positional embeddings. Finally, and most importantly, we propose to divide the hidden cells of the decoder into shared and language-dependent ones. The extensive experiments demonstrate that our proposed methods can obtain remarkable improvements over the strong baselines. Moreover, our strategies can achieve comparable or even better performance than the individually trained translation models. |
Tasks | Machine Translation, Multi-Task Learning |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1326/ |
https://www.aclweb.org/anthology/D18-1326 | |
PWC | https://paperswithcode.com/paper/three-strategies-to-improve-one-to-many |
Repo | |
Framework | |
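The third strategy, dividing the decoder's hidden cells into shared and language-dependent ones, can be pictured as concatenating a block of units reused by all target languages with a block reserved for the current one. A minimal sketch, with toy dimensions and names that are assumptions, not the paper's configuration:

```python
import numpy as np

def partitioned_hidden(shared_part, lang_parts, lang):
    """Build the decoder hidden state as [shared cells | cells owned
    by the current target language]."""
    return np.concatenate([shared_part, lang_parts[lang]])

shared = np.zeros(6)                                   # shared across languages
lang_parts = {"de": np.ones(2), "fr": np.full(2, 2.0)} # per-language cells
h_de = partitioned_hidden(shared, lang_parts, "de")    # 8-dim state for German
```

Only the per-language block changes when the target language switches; the shared block lets all languages benefit from common training signal.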
Gradients explode - Deep Networks are shallow - ResNet explained
Title | Gradients explode - Deep Networks are shallow - ResNet explained |
Authors | George Philipp, Dawn Song, Jaime G. Carbonell |
Abstract | Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SELU nonlinearities "solve" the exploding gradient problem, we show that this is not the case and that in a range of popular MLP architectures, exploding gradients exist and limit the depth to which networks can be effectively trained, both in theory and in practice. We explain why exploding gradients occur and highlight the collapsing domain problem, which can arise in architectures that avoid exploding gradients. ResNets have significantly lower gradients and thus can circumvent the exploding gradient problem, enabling the effective training of much deeper networks, which we show is a consequence of a surprising mathematical property. By noticing that any neural network is a residual network, we devise the residual trick, which reveals that introducing skip connections simplifies the network mathematically, and that this simplicity may be the major cause for their success. |
Tasks | |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HkpYwMZRb |
https://openreview.net/pdf?id=HkpYwMZRb | |
PWC | https://paperswithcode.com/paper/gradients-explode-deep-networks-are-shallow |
Repo | |
Framework | |
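The "residual trick" the abstract refers to is an algebraic identity: any layer f can be rewritten as an identity connection plus a residual function r(x) = f(x) - x, so formally every network is a residual network. A one-function sketch of the identity (for scalar or vector inputs supporting subtraction):

```python
def as_residual(f):
    """Rewrite layer f as identity-plus-residual: x + (f(x) - x).
    The rewritten layer computes exactly the same function as f."""
    residual = lambda x: f(x) - x
    return lambda x: x + residual(x)
```

The rewriting changes nothing numerically; its value in the paper is that it exposes how close a layer is to the identity, which is where the analysis of skip connections starts.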
Adversarial Evaluation of Multimodal Machine Translation
Title | Adversarial Evaluation of Multimodal Machine Translation |
Authors | Desmond Elliott |
Abstract | The promise of combining language and vision in multimodal machine translation is that systems will produce better translations by leveraging the image data. However, the evidence surrounding whether the images are useful is unconvincing due to inconsistencies between text-similarity metrics and human judgements. We present an adversarial evaluation to directly examine the utility of the image data in this task. Our evaluation tests whether systems perform better when paired with congruent images or incongruent images. This evaluation shows that only one out of three publicly available systems is sensitive to this perturbation of the data. We recommend that multimodal translation systems should be able to pass this sanity check in the future. |
Tasks | Machine Translation, Multimodal Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1329/ |
https://www.aclweb.org/anthology/D18-1329 | |
PWC | https://paperswithcode.com/paper/adversarial-evaluation-of-multimodal-machine |
Repo | |
Framework | |
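The congruent/incongruent evaluation idea can be sketched as scoring a system's outputs once with the matching images and once with shuffled ones; an image-aware system should score higher in the congruent condition. This is an illustrative stand-in, not the paper's exact protocol; `system` and `metric` are hypothetical callables.

```python
import random

def awareness_gap(system, metric, sentences, images, references, seed=0):
    """Return metric(congruent outputs) - metric(incongruent outputs).
    A positive gap suggests the system actually uses the image."""
    congruent = [system(s, img) for s, img in zip(sentences, images)]
    shuffled = images[:]
    random.Random(seed).shuffle(shuffled)          # incongruent pairing
    incongruent = [system(s, img) for s, img in zip(sentences, shuffled)]
    return metric(congruent, references) - metric(incongruent, references)
```

A system that ignores its image input produces identical outputs in both conditions, so its gap is exactly zero, which is the sanity check the paper recommends.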
Somm: Into the Model
Title | Somm: Into the Model |
Authors | Shengli Hu |
Abstract | To what extent could the sommelier profession, or wine stewardship, be displaced by machine learning algorithms? There are at least three essential skills that make a qualified sommelier: wine theory, blind tasting, and beverage service, as exemplified in the rigorous certification processes of certified sommeliers and above (advanced and master) with the most authoritative body in the industry, the Court of Master Sommeliers (hereafter CMS). We propose and train corresponding machine learning models that match these skills, and compare algorithmic results with real data collected from a large group of wine professionals. We find that our machine learning models outperform human sommeliers on most tasks, most notably in the section of blind tasting, where hierarchically supervised Latent Dirichlet Allocation outperforms sommeliers' judgment calls by over 6% in terms of F1-score; and in the section of beverage service, especially wine and food pairing, where a modified Siamese neural network based on BiLSTM achieves better results than sommeliers by 2%. This demonstrates, contrary to popular opinion in the industry, that the sommelier profession is at least to some extent automatable, barring economic (Kleinberg et al., 2017) and psychological (Dietvorst et al., 2015) complications. |
Tasks | Information Retrieval, Open-Domain Question Answering, Question Answering, Reading Comprehension |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1146/ |
https://www.aclweb.org/anthology/D18-1146 | |
PWC | https://paperswithcode.com/paper/somm-into-the-model |
Repo | |
Framework | |
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Title | End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification |
Authors | Jindřich Libovický, Jindřich Helcl |
Abstract | Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from massive parallelization at inference time. Non-autoregressive models enable the decoder to generate all output symbols independently in parallel. We present a novel non-autoregressive architecture based on connectionist temporal classification and evaluate it on the task of neural machine translation. Unlike other non-autoregressive methods which operate in several steps, our model can be trained end-to-end. We conduct experiments on the WMT English-Romanian and English-German datasets. Our models achieve a significant speedup over the autoregressive models, keeping the translation quality comparable to other non-autoregressive models. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1336/ |
https://www.aclweb.org/anthology/D18-1336 | |
PWC | https://paperswithcode.com/paper/end-to-end-non-autoregressive-neural-machine-1 |
Repo | |
Framework | |
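The connectionist temporal classification underlying this model rests on a simple decoding rule: merge repeated symbols, then delete blanks. A sketch of greedy best-path collapsing (the blank token name is an assumption):

```python
from itertools import groupby

def ctc_collapse(symbols, blank="<b>"):
    """Collapse a frame-level CTC output into an output sequence:
    first merge consecutive repeats, then drop blank tokens."""
    deduped = [s for s, _ in groupby(symbols)]   # merge repeats
    return [s for s in deduped if s != blank]    # drop blanks

ctc_collapse(["h", "h", "<b>", "e", "e", "<b>", "<b>", "l", "l", "<b>", "l", "o"])
# → ['h', 'e', 'l', 'l', 'o']
```

Because a blank separates the two runs of "l", they survive as distinct symbols; this is how CTC emits the same character twice in a row, and it is what lets the decoder predict all positions in parallel.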
Candidate Ranking for Maintenance of an Online Dictionary
Title | Candidate Ranking for Maintenance of an Online Dictionary |
Authors | Claire Broad, Helen Langone, David Guy Brizan |
Abstract | |
Tasks | |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1134/ |
https://www.aclweb.org/anthology/L18-1134 | |
PWC | https://paperswithcode.com/paper/candidate-ranking-for-maintenance-of-an |
Repo | |
Framework | |
Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora
Title | Bootstrap Domain-Specific Sentiment Classifiers from Unlabeled Corpora |
Authors | Andrius Mudinas, Dell Zhang, Mark Levene |
Abstract | There is often the need to perform sentiment classification in a particular domain where no labeled document is available. Although we could make use of a general-purpose off-the-shelf sentiment classifier or a pre-built one for a different domain, the effectiveness would be inferior. In this paper, we explore the possibility of building domain-specific sentiment classifiers with unlabeled documents only. Our investigation indicates that in the word embeddings learned from the unlabeled corpus of a given domain, the distributed word representations (vectors) for opposite sentiments form distinct clusters, though those clusters are not transferable across domains. Exploiting such a clustering structure, we are able to utilize machine learning algorithms to induce a quality domain-specific sentiment lexicon from just a few typical sentiment words ("seeds"). An important finding is that simple linear model based supervised learning algorithms (such as linear SVM) can actually work better than more sophisticated semi-supervised/transductive learning algorithms which represent the state-of-the-art technique for sentiment lexicon induction. The induced lexicon could be applied directly in a lexicon-based method for sentiment classification, but a higher performance could be achieved through a two-phase bootstrapping method which uses the induced lexicon to assign positive/negative sentiment scores to unlabeled documents first, and then uses those documents found to have clear sentiment signals as pseudo-labeled examples to train a document sentiment classifier via supervised learning algorithms (such as LSTM). On several benchmark datasets for document sentiment classification, our end-to-end pipelined approach which is overall unsupervised (except for a tiny set of seed words) outperforms existing unsupervised approaches and achieves an accuracy comparable to that of fully supervised approaches. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/Q18-1020/ |
https://www.aclweb.org/anthology/Q18-1020 | |
PWC | https://paperswithcode.com/paper/bootstrap-domain-specific-sentiment |
Repo | |
Framework | |
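The seed-based lexicon induction step can be sketched as scoring every word by its similarity to the positive versus negative seed regions of the domain's embedding space. The paper trains a linear SVM for this; the nearest-centroid rule and the toy 2-d vectors below are simplifying assumptions for illustration.

```python
import numpy as np

def induce_lexicon(embeddings, pos_seeds, neg_seeds):
    """Score each word as cosine(word, positive-seed centroid) minus
    cosine(word, negative-seed centroid); sign gives polarity."""
    pos_c = np.mean([embeddings[w] for w in pos_seeds], axis=0)
    neg_c = np.mean([embeddings[w] for w in neg_seeds], axis=0)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return {w: cos(v, pos_c) - cos(v, neg_c) for w, v in embeddings.items()}

emb = {"good": np.array([1.0, 0.1]), "great": np.array([0.9, 0.2]),
       "bad": np.array([-1.0, 0.1]), "awful": np.array([-0.8, 0.3]),
       "superb": np.array([0.95, 0.0])}   # toy in-domain embeddings
scores = induce_lexicon(emb, ["good", "great"], ["bad", "awful"])
```

In the two-phase bootstrap, these scores would then label the clearest-signal documents, which in turn train the document-level classifier.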
An Interface for Annotating Science Questions
Title | An Interface for Annotating Science Questions |
Authors | Michael Boratko, Harshit Padigela, Divyendra Mikkilineni, Pritish Yuvraj, Rajarshi Das, Andrew McCallum, Maria Chang, Achille Fokoue, Pavan Kapanipathi, Nicholas Mattei, Ryan Musa, Kartik Talamadupula, Michael Witbrock |
Abstract | Recent work introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset that partitions open domain, complex science questions into an Easy Set and a Challenge Set. That work includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them. However, it does not include clear definitions of these types, nor does it offer information about the quality of the labels or the annotation process used. In this paper, we introduce a novel interface for human annotation of science question-answer pairs with their respective knowledge and reasoning types, in order that the classification of new questions may be improved. We build on the classification schema proposed by prior work on the ARC dataset, and evaluate the effectiveness of our interface with a preliminary study involving 10 participants. |
Tasks | Information Retrieval |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/D18-2018/ |
https://www.aclweb.org/anthology/D18-2018 | |
PWC | https://paperswithcode.com/paper/an-interface-for-annotating-science-questions |
Repo | |
Framework | |
Deep Exhaustive Model for Nested Named Entity Recognition
Title | Deep Exhaustive Model for Nested Named Entity Recognition |
Authors | Mohammad Golam Sohrab, Makoto Miwa |
Abstract | We propose a simple deep neural model for nested named entity recognition (NER). Most NER models focused on flat entities and ignored nested entities, which failed to fully capture underlying semantic information in texts. The key idea of our model is to enumerate all possible regions or spans as potential entity mentions and classify them with deep neural networks. To reduce the computational costs and capture the information of the contexts around the regions, the model represents the regions using the outputs of shared underlying bidirectional long short-term memory. We evaluate our exhaustive model on the GENIA and JNLPBA corpora in the biomedical domain, and the results show that our model outperforms state-of-the-art models on nested and flat NER, achieving 77.1% and 78.4% respectively in terms of F-score, without any external knowledge resources. |
Tasks | Entity Linking, Feature Engineering, Named Entity Recognition, Nested Named Entity Recognition, Relation Extraction |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1309/ |
https://www.aclweb.org/anthology/D18-1309 | |
PWC | https://paperswithcode.com/paper/deep-exhaustive-model-for-nested-named-entity |
Repo | |
Framework | |
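The exhaustive enumeration at the heart of the model can be sketched as listing every span up to a maximum width as a candidate mention; in the paper, each span's representation is then built from shared BiLSTM outputs before classification. The `max_width` cap is the standard trick for keeping the candidate set linear in sentence length.

```python
def enumerate_spans(tokens, max_width=3):
    """Return all (start, end) half-open spans of width <= max_width.
    Overlapping and nested spans are all kept, which is what allows
    nested mentions to be recognized."""
    n = len(tokens)
    return [(i, j) for i in range(n)
                   for j in range(i + 1, min(i + max_width, n) + 1)]

enumerate_spans(["EGF", "receptor", "gene"], max_width=2)
# → [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
```

Note that (0, 2) and (1, 2) coexist in the output: a nested mention ("receptor") inside a longer one ("EGF receptor") is exactly the case flat NER models cannot represent.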
Leveraging Data Resources for Cross-Linguistic Information Retrieval Using Statistical Machine Translation
Title | Leveraging Data Resources for Cross-Linguistic Information Retrieval Using Statistical Machine Translation |
Authors | Steve Sloto, Ann Clifton, Greg Hanneman, Patrick Porter, Donna Gates, Almut Hildebrand, Anish Kumar |
Abstract | |
Tasks | Information Retrieval, Machine Translation |
Published | 2018-01-01 |
URL | https://www.aclweb.org/anthology/papers/W18-1917/w18-1917 |
https://www.aclweb.org/anthology/W18-1917 | |
PWC | https://paperswithcode.com/paper/leveraging-data-resources-for-cross |
Repo | |
Framework | |
Retrieve and Re-rank: A Simple and Effective IR Approach to Simple Question Answering over Knowledge Graphs
Title | Retrieve and Re-rank: A Simple and Effective IR Approach to Simple Question Answering over Knowledge Graphs |
Authors | Vishal Gupta, Manoj Chinnakotla, Manish Shrivastava |
Abstract | SimpleQuestions is a commonly used benchmark for single-factoid question answering (QA) over Knowledge Graphs (KG). Existing QA systems rely on various components to solve different sub-tasks of the problem (such as entity detection, entity linking, relation prediction and evidence integration). In this work, we propose a different approach to the problem and present an information retrieval style solution for it. We adopt a two-phase approach: candidate generation and candidate re-ranking to answer questions. We propose a Triplet-Siamese-Hybrid CNN (TSHCNN) to re-rank candidate answers. Our approach achieves an accuracy of 80%, which sets a new state-of-the-art on the SimpleQuestions dataset. |
Tasks | Entity Linking, Information Retrieval, Knowledge Graphs, Learning-To-Rank, Question Answering |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-5504/ |
https://www.aclweb.org/anthology/W18-5504 | |
PWC | https://paperswithcode.com/paper/retrieve-and-re-rank-a-simple-and-effective |
Repo | |
Framework | |
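The two-phase scheme can be sketched as a cheap retrieval pass over KG triples followed by a stronger re-ranker over the survivors. The word-overlap retrieval and the `rerank_score` callable below are illustrative stand-ins; in the paper the re-ranker is the proposed TSHCNN.

```python
def retrieve_and_rerank(question, triples, rerank_score, k=10):
    """Phase 1: keep the k triples whose subject/relation words overlap
    most with the question. Phase 2: return the candidate the stronger
    re-ranker scores highest."""
    q = set(question.lower().split())
    def overlap(triple):
        subj, rel, _ = triple
        return len(q & set((subj + " " + rel).lower().split()))
    candidates = sorted(triples, key=overlap, reverse=True)[:k]
    return max(candidates, key=lambda t: rerank_score(question, t))
```

The split buys efficiency: the expensive model only ever scores k candidates, regardless of how large the knowledge graph is.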