Paper Group NANR 190
The RWTH Aachen University English-German and German-English Machine Translation System for WMT 2017
Title | The RWTH Aachen University English-German and German-English Machine Translation System for WMT 2017 |
Authors | Jan-Thorsten Peter, Andreas Guta, Tamer Alkhouli, Parnia Bahar, Jan Rosendahl, Nick Rossenbach, Miguel Graça, Hermann Ney |
Abstract | |
Tasks | Machine Translation, Tokenization |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4735/ |
https://www.aclweb.org/anthology/W17-4735 | |
PWC | https://paperswithcode.com/paper/the-rwth-aachen-university-english-german-and-1 |
Repo | |
Framework | |
Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario
Title | Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario |
Authors | M. Amin Farajian, Marco Turchi, Matteo Negri, Nicola Bertoldi, Marcello Federico |
Abstract | State-of-the-art neural machine translation (NMT) systems are generally trained on specific domains by carefully selecting the training sets and applying proper domain adaptation techniques. In this paper we consider the real world scenario in which the target domain is not predefined, hence the system should be able to translate text from multiple domains. We compare the performance of a generic NMT system and phrase-based statistical machine translation (PBMT) system by training them on a generic parallel corpus composed of data from different domains. Our results on multi-domain English-French data show that, in these realistic conditions, PBMT outperforms its neural counterpart. This raises the question: is NMT ready for deployment as a generic/multi-purpose MT backbone in real-world settings? |
Tasks | Domain Adaptation, Machine Translation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/E17-2045/ |
https://www.aclweb.org/anthology/E17-2045 | |
PWC | https://paperswithcode.com/paper/neural-vs-phrase-based-machine-translation-in |
Repo | |
Framework | |
ResSim at SemEval-2017 Task 1: Multilingual Word Representations for Semantic Textual Similarity
Title | ResSim at SemEval-2017 Task 1: Multilingual Word Representations for Semantic Textual Similarity |
Authors | Johannes Bjerva, Robert Östling |
Abstract | Shared Task 1 at SemEval-2017 deals with assessing the semantic similarity between sentences, either in the same or in different languages. In our system submission, we employ multilingual word representations, in which similar words in different languages are close to one another. Using such representations is advantageous, since the increasing amount of available parallel data allows for the application of such methods to many of the languages in the world. Hence, semantic similarity can be inferred even for languages for which no annotated data exists. Our system is trained and evaluated on all language pairs included in the shared task (English, Spanish, Arabic, and Turkish). Although development results are promising, our system does not yield high performance on the shared task test sets. |
Tasks | Machine Translation, Semantic Similarity, Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2021/ |
https://www.aclweb.org/anthology/S17-2021 | |
PWC | https://paperswithcode.com/paper/ressim-at-semeval-2017-task-1-multilingual |
Repo | |
Framework | |
SEF@UHH at SemEval-2017 Task 1: Unsupervised Knowledge-Free Semantic Textual Similarity via Paragraph Vector
Title | SEF@UHH at SemEval-2017 Task 1: Unsupervised Knowledge-Free Semantic Textual Similarity via Paragraph Vector |
Authors | Mirela-Stefania Duma, Wolfgang Menzel |
Abstract | This paper describes our unsupervised knowledge-free approach to the SemEval-2017 Task 1 Competition. The proposed method makes use of Paragraph Vector for assessing the semantic similarity between pairs of sentences. We experimented with various dimensions of the vector and three state-of-the-art similarity metrics. Given a cross-lingual task, we trained models corresponding to its two languages and combined the models by averaging the similarity scores. The results of our submitted runs are above the median scores for five out of seven test sets by means of Pearson Correlation. Moreover, one of our system runs performed best on the Spanish-English-WMT test set ranking first out of 53 runs submitted in total by all participants. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2024/ |
https://www.aclweb.org/anthology/S17-2024 | |
PWC | https://paperswithcode.com/paper/sefuhh-at-semeval-2017-task-1-unsupervised |
Repo | |
Framework | |
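The SEF@UHH abstract above describes scoring a cross-lingual sentence pair with one model per language and averaging the resulting similarity scores. A minimal sketch of that combination step follows. It assumes the sentence vectors have already been inferred by some Paragraph Vector model, and it uses cosine similarity as the metric, which is an assumption: the abstract mentions three similarity metrics without naming them. The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def averaged_similarity(vector_pairs):
    """Average the per-model similarity scores for one sentence pair.

    vector_pairs holds one (vector_a, vector_b) tuple per model, e.g. one
    tuple from the model for each of the two languages (hypothetical setup).
    """
    scores = [cosine_similarity(u, v) for u, v in vector_pairs]
    return sum(scores) / len(scores)
```

Averaging treats both language models as equally reliable; a weighted mean would be a natural variation, but nothing in the abstract suggests the authors used one.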
UMDeep at SemEval-2017 Task 1: End-to-End Shared Weight LSTM Model for Semantic Textual Similarity
Title | UMDeep at SemEval-2017 Task 1: End-to-End Shared Weight LSTM Model for Semantic Textual Similarity |
Authors | Joe Barrow, Denis Peskov |
Abstract | We describe a modified shared-LSTM network for the Semantic Textual Similarity (STS) task at SemEval-2017. The network builds on previously explored Siamese network architectures. We treat max sentence length as an additional hyperparameter to be tuned (beyond learning rate, regularization, and dropout). Our results demonstrate that hand-tuning max sentence training length significantly improves final accuracy. After optimizing hyperparameters, we train the network on the multilingual semantic similarity task using pre-translated sentences. We achieved a correlation of 0.4792 for all the subtasks. We achieved the fourth highest team correlation for Task 4b, which was our best relative placement. |
Tasks | Feature Engineering, Machine Translation, Representation Learning, Semantic Similarity, Semantic Textual Similarity |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2026/ |
https://www.aclweb.org/anthology/S17-2026 | |
PWC | https://paperswithcode.com/paper/umdeep-at-semeval-2017-task-1-end-to-end |
Repo | |
Framework | |
LIPN-IIMAS at SemEval-2017 Task 1: Subword Embeddings, Attention Recurrent Neural Networks and Cross Word Alignment for Semantic Textual Similarity
Title | LIPN-IIMAS at SemEval-2017 Task 1: Subword Embeddings, Attention Recurrent Neural Networks and Cross Word Alignment for Semantic Textual Similarity |
Authors | Ignacio Arroyo-Fernández, Ivan Vladimir Meza Ruiz |
Abstract | In this paper we report our attempt to use, on the one hand, state-of-the-art neural approaches that are proposed to measure Semantic Textual Similarity (STS). On the other hand, we propose an unsupervised cross-word alignment approach, which is linguistically motivated. The neural approaches proposed herein are divided into two main stages. The first stage deals with constructing neural word embeddings, the components of sentence embeddings. The second stage deals with constructing a semantic similarity function relating pairs of sentence embeddings. Unfortunately our competition results were poor in all tracks, therefore we concentrated our research to improve them for Track 5 (EN-EN). |
Tasks | Semantic Similarity, Semantic Textual Similarity, Sentence Embeddings, Word Alignment, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2031/ |
https://www.aclweb.org/anthology/S17-2031 | |
PWC | https://paperswithcode.com/paper/lipn-iimas-at-semeval-2017-task-1-subword |
Repo | |
Framework | |
A News Chain Evaluation Methodology along with a Lattice-based Approach for News Chain Construction
Title | A News Chain Evaluation Methodology along with a Lattice-based Approach for News Chain Construction |
Authors | Mustafa Toprak, Özer Özkahraman, Selma Tekir |
Abstract | Chain construction is an important requirement for understanding news and establishing the context. A news chain can be defined as a coherent set of articles that explains an event or a story. There's a lack of well-established methods in this area. In this work, we propose a methodology to evaluate the "goodness" of a given news chain and implement a concept lattice-based news chain construction method by Hossain et al. The methodology part is vital as it directly affects the growth of research in this area. Our proposed methodology consists of collected news chains from different studies and two "goodness" metrics, minedge and dispersion coefficient respectively. We assess the utility of the lattice-based news chain construction method by our proposed methodology. |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4217/ |
https://www.aclweb.org/anthology/W17-4217 | |
PWC | https://paperswithcode.com/paper/a-news-chain-evaluation-methodology-along |
Repo | |
Framework | |
The JHU Machine Translation Systems for WMT 2017
Title | The JHU Machine Translation Systems for WMT 2017 |
Authors | Shuoyang Ding, Huda Khayrallah, Philipp Koehn, Matt Post, Gaurav Kumar, Kevin Duh |
Abstract | |
Tasks | Language Modelling, Machine Translation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4724/ |
https://www.aclweb.org/anthology/W17-4724 | |
PWC | https://paperswithcode.com/paper/the-jhu-machine-translation-systems-for-wmt-1 |
Repo | |
Framework | |
metapath2vec: Scalable Representation Learning for Heterogeneous Networks
Title | metapath2vec: Scalable Representation Learning for Heterogeneous Networks |
Authors | Yuxiao Dong, Nitesh Vijay Chawla, Ananthram Swami |
Abstract | We study the problem of representation learning in heterogeneous networks. Its unique challenges come from the existence of multiple types of nodes and links, which limit the feasibility of the conventional network embedding techniques. We develop two scalable representation learning models, namely metapath2vec and metapath2vec++. The metapath2vec model formalizes meta-path-based random walks to construct the heterogeneous neighborhood of a node and then leverages a heterogeneous skip-gram model to perform node embeddings. The metapath2vec++ model further enables the simultaneous modeling of structural and semantic correlations in heterogeneous networks. Extensive experiments show that metapath2vec and metapath2vec++ are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, such as node classification, clustering, and similarity search, but also discern the structural and semantic correlations between diverse network objects. |
Tasks | Network Embedding, Node Classification, Representation Learning |
Published | 2017-08-01 |
URL | https://dl.acm.org/doi/10.1145/3097983.3098036 |
https://dl.acm.org/doi/pdf/10.1145/3097983.3098036?download=true | |
PWC | https://paperswithcode.com/paper/metapath2vec-scalable-representation-learning |
Repo | |
Framework | |
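The metapath2vec abstract above refers to meta-path-based random walks over a heterogeneous network. As a rough illustration of the idea, the sketch below walks a typed graph and only moves to neighbors whose type matches the next position in the meta-path, picking uniformly among them. The graph representation (plain dictionaries), the function name, and its parameters are assumptions made for this example; the skip-gram training that consumes the walks is not shown.

```python
import random


def metapath_walk(neighbors, node_type, start, metapath, walk_length, rng):
    """Generate one random walk that follows a symmetric meta-path.

    neighbors:  dict mapping node -> list of adjacent nodes
    node_type:  dict mapping node -> type label
    metapath:   list of type labels whose first and last entries match,
                e.g. ["A", "P", "A"] for author-paper-author
    rng:        a random.Random instance, passed in for reproducibility
    """
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")

    cycle = metapath[:-1]  # the path repeats, so drop the duplicated end type
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        step += 1
        wanted = cycle[step % len(cycle)]
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk
```

For example, on an author-paper graph with the meta-path `["A", "P", "A"]`, every walk alternates between author and paper nodes, so authors that share papers end up in each other's context windows.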
Towards Compact and Fast Neural Machine Translation Using a Combined Method
Title | Towards Compact and Fast Neural Machine Translation Using a Combined Method |
Authors | Xiaowei Zhang, Wei Chen, Feng Wang, Shuang Xu, Bo Xu |
Abstract | Neural Machine Translation (NMT) lays intensive burden on computation and memory cost. It is a challenge to deploy NMT models on the devices with limited computation and memory budgets. This paper presents a four stage pipeline to compress model and speed up the decoding for NMT. Our method first introduces a compact architecture based on convolutional encoder and weight shared embeddings. Then weight pruning is applied to obtain a sparse model. Next, we propose a fast sequence interpolation approach which enables the greedy decoding to achieve performance on par with the beam search. Hence, the time-consuming beam search can be replaced by simple greedy decoding. Finally, vocabulary selection is used to reduce the computation of softmax layer. Our final model achieves 10 times speedup, 17 times parameters reduction, less than 35MB storage size and comparable performance compared to the baseline model. |
Tasks | Language Modelling, Machine Translation, Quantization |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/D17-1154/ |
https://www.aclweb.org/anthology/D17-1154 | |
PWC | https://paperswithcode.com/paper/towards-compact-and-fast-neural-machine |
Repo | |
Framework | |
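The abstract above lists weight pruning as one stage of the compression pipeline without saying which criterion is used. The sketch below shows magnitude-based pruning, a common choice, purely as an assumed stand-in: it zeroes the smallest-magnitude fraction of a weight matrix. The function name and the list-of-lists matrix representation are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    weights:  matrix as a list of rows (lists of floats)
    sparsity: target fraction of entries to zero, between 0.0 and 1.0
    Entries tied with the threshold are also zeroed, so the achieved
    sparsity can slightly exceed the target.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0.0 and 1.0")
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)
    if k == 0:
        return [list(row) for row in weights]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]
```

In practice pruning is usually followed by retraining the surviving weights so the sparse model can recover accuracy; that step is outside this sketch.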
Deliberation Networks: Sequence Generation Beyond One-Pass Decoding
Title | Deliberation Networks: Sequence Generation Beyond One-Pass Decoding |
Authors | Yingce Xia, Fei Tian, Lijun Wu, Jianxin Lin, Tao Qin, Nenghai Yu, Tie-Yan Liu |
Abstract | The encoder-decoder framework has achieved promising progress for many sequence generation tasks, including machine translation, text summarization, dialog system, image captioning, etc. Such a framework adopts an one-pass forward process while decoding and generating a sequence, but lacks the deliberation process: A generated sequence is directly used as final output without further polishing. However, deliberation is a common behavior in human’s daily life like reading news and writing papers/articles/books. In this work, we introduce the deliberation process into the encoder-decoder framework and propose deliberation networks for sequence generation. A deliberation network has two levels of decoders, where the first-pass decoder generates a raw sequence and the second-pass decoder polishes and refines the raw sentence with deliberation. Since the second-pass deliberation decoder has global information about what the sequence to be generated might be, it has the potential to generate a better sequence by looking into future words in the raw sentence. Experiments on neural machine translation and text summarization demonstrate the effectiveness of the proposed deliberation networks. On the WMT 2014 English-to-French translation task, our model establishes a new state-of-the-art BLEU score of 41.5. |
Tasks | Image Captioning, Machine Translation, Text Summarization |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6775-deliberation-networks-sequence-generation-beyond-one-pass-decoding |
http://papers.nips.cc/paper/6775-deliberation-networks-sequence-generation-beyond-one-pass-decoding.pdf | |
PWC | https://paperswithcode.com/paper/deliberation-networks-sequence-generation |
Repo | |
Framework | |
Combining Lightly-Supervised Text Classification Models for Accurate Contextual Advertising
Title | Combining Lightly-Supervised Text Classification Models for Accurate Contextual Advertising |
Authors | Yiping Jin, Dittaya Wanvarie, Phu Le |
Abstract | In this paper we propose a lightly-supervised framework to rapidly build text classifiers for contextual advertising. Traditionally text classification techniques require labeled training documents for each predefined class. In the scenario of contextual advertising, advertisers often want to target to a specific class of webpages most relevant to their product or service, which may not be covered by a pre-trained classifier. Moreover, the advertisers are interested in whether a webpage is "relevant" or "irrelevant". It is time-consuming to solicit the advertisers for reliable training signals for the negative class. Therefore, it is more suitable to model the problem as a one-class classification problem, in contrast to traditional classification problems where disjoint classes are defined a priori. We first apply two state-of-the-art lightly-supervised classification models, generalized expectation (GE) criteria (Druck et al., 2008) and multinomial naive Bayes (MNB) with priors (Settles, 2011) to one-class classification where the user only needs to provide a small list of labeled words for the target class. To combine the strengths of the two models, we fuse them together by using MNB to automatically enrich the constraints for GE training. We also explore ensemble method to combine classifiers. On a corpus of webpages from real-time bidding requests, the proposed model achieves the highest average F1 of 0.69 and closes more than half of the gap between previous state-of-the-art lightly-supervised models to a fully-supervised MaxEnt model. |
Tasks | Text Classification |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-1055/ |
https://www.aclweb.org/anthology/I17-1055 | |
PWC | https://paperswithcode.com/paper/combining-lightly-supervised-text |
Repo | |
Framework | |
Syntax Aware LSTM model for Semantic Role Labeling
Title | Syntax Aware LSTM model for Semantic Role Labeling |
Authors | Feng Qian, Lei Sha, Baobao Chang, Lu-chen Liu, Ming Zhang |
Abstract | In Semantic Role Labeling (SRL) task, the tree structured dependency relation is rich in syntax information, but it is not well handled by existing models. In this paper, we propose Syntax Aware Long Short Time Memory (SA-LSTM). The structure of SA-LSTM changes according to dependency structure of each sentence, so that SA-LSTM can model the whole tree structure of dependency relation in an architecture engineering way. Experiments demonstrate that on Chinese Proposition Bank (CPB) 1.0, SA-LSTM improves F1 by 2.06% than ordinary bi-LSTM with feature engineered dependency relation information, and gives state-of-the-art F1 of 79.92%. On English CoNLL 2005 dataset, SA-LSTM brings improvement (2.1%) to bi-LSTM model and also brings slight improvement (0.3%) when added to the state-of-the-art model. |
Tasks | Feature Engineering, Machine Translation, Semantic Role Labeling, Structured Prediction |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4305/ |
https://www.aclweb.org/anthology/W17-4305 | |
PWC | https://paperswithcode.com/paper/syntax-aware-lstm-model-for-semantic-role |
Repo | |
Framework | |
Cross-Lingual SRL Based upon Universal Dependencies
Title | Cross-Lingual SRL Based upon Universal Dependencies |
Authors | Ondřej Pražák, Miloslav Konopík |
Abstract | In this paper, we introduce a cross-lingual Semantic Role Labeling (SRL) system with language independent features based upon Universal Dependencies. We propose two methods to convert SRL annotations from monolingual dependency trees into universal dependency trees. Our SRL system is based upon cross-lingual features derived from universal dependency trees and a supervised learning that utilizes a maximum entropy classifier. We design experiments to verify whether the Universal Dependencies are suitable for the cross-lingual SRL. The results are very promising and they open new interesting research paths for the future. |
Tasks | Semantic Parsing, Semantic Role Labeling |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1077/ |
https://doi.org/10.26615/978-954-452-049-6_077 | |
PWC | https://paperswithcode.com/paper/cross-lingual-srl-based-upon-universal |
Repo | |
Framework | |
Correcting prepositional phrase attachments using multimodal corpora
Title | Correcting prepositional phrase attachments using multimodal corpora |
Authors | Sebastien Delecraz, Alexis Nasr, Frederic Bechet, Benoit Favre |
Abstract | PP-attachments are an important source of errors in parsing natural language. We propose in this article to use data coming from a multimodal corpus, combining textual, visual and conceptual information, as well as a correction strategy, to propose alternative attachments in the output of a parser. |
Tasks | Prepositional Phrase Attachment, Text Generation |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6311/ |
https://www.aclweb.org/anthology/W17-6311 | |
PWC | https://paperswithcode.com/paper/correcting-prepositional-phrase-attachments |
Repo | |
Framework | |