Paper Group NANR 242
The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English. ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing. Non-Adversarial Mapping with VAEs. Hunter NMT System for WMT18 Biomedical Translation Task: Transfer Learning in Neural Machine Translation. Rest-Katyusha: Exploiting the Solu …
The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English
Title | The WMT’18 Morpheval test suites for English-Czech, English-German, English-Finnish and Turkish-English |
Authors | Franck Burlot, Yves Scherrer, Vinit Ravishankar, Ondřej Bojar, Stig-Arne Grönroos, Maarit Koponen, Tommi Nieminen, François Yvon |
Abstract | Progress in the quality of machine translation output calls for new automatic evaluation procedures and metrics. In this paper, we extend the Morpheval protocol introduced by Burlot and Yvon (2017) for the English-to-Czech and English-to-Latvian translation directions to three additional language pairs, and report its use to analyze the results of WMT 2018's participants for these language pairs. Considering additional, typologically varied source and target languages also enables us to draw some generalizations regarding this morphology-oriented evaluation procedure. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6433/ |
PWC | https://paperswithcode.com/paper/the-wmt18-morpheval-test-suites-for-english |
Repo | |
Framework | |
ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing
Title | ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing |
Authors | Ganesh Jawahar, Benjamin Muller, Amal Fethi, Louis Martin, Éric Villemonte de la Clergerie, Benoît Sagot, Djamé Seddah |
Abstract | In this paper, we present the details of the neural dependency parser and the neural tagger submitted by our team 'ParisNLP' to the CoNLL 2018 Shared Task on parsing from raw text to Universal Dependencies. We augment the deep Biaffine (BiAF) parser (Dozat and Manning, 2016) with novel features to perform competitively: we utilize an in-domain version of ELMo features (Peters et al., 2018), which provide context-dependent word representations, and disambiguated, embedded, morphosyntactic features from lexicons (Sagot, 2018), which complement the existing feature set. Henceforth, we call our system 'ELMoLex'. In addition to incorporating character embeddings, ELMoLex benefits from pre-trained word vectors, ELMo and morphosyntactic features (whenever available) to correctly handle rare or unknown words, which are prevalent in languages with complex morphology. ELMoLex ranked 11th by the Labeled Attachment Score (70.64%) and Morphology-aware LAS (55.74%) metrics, and 9th by the Bilexical dependency metric (60.70%). |
Tasks | Dependency Parsing, Language Modelling |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/K18-2023/ |
PWC | https://paperswithcode.com/paper/elmolex-connecting-elmo-and-lexicon-features |
Repo | |
Framework | |
Non-Adversarial Mapping with VAEs
Title | Non-Adversarial Mapping with VAEs |
Authors | Yedid Hoshen |
Abstract | The study of cross-domain mapping without supervision has recently attracted much attention. Much of the recent progress was enabled by the use of adversarial training as well as cycle constraints. The practical difficulty of adversarial training motivates research into non-adversarial methods. In a recent paper, it was shown that cross-domain mapping is possible without the use of cycles or GANs. Although promising, this approach suffers from several drawbacks including costly inference and an optimization variable for every training example preventing the method from using large training sets. We present an alternative approach which is able to achieve non-adversarial mapping using a novel form of Variational Auto-Encoder. Our method is much faster at inference time, is able to leverage large datasets and has a simple interpretation. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7981-non-adversarial-mapping-with-vaes |
PDF | http://papers.nips.cc/paper/7981-non-adversarial-mapping-with-vaes.pdf |
PWC | https://paperswithcode.com/paper/non-adversarial-mapping-with-vaes |
Repo | |
Framework | |
Hunter NMT System for WMT18 Biomedical Translation Task: Transfer Learning in Neural Machine Translation
Title | Hunter NMT System for WMT18 Biomedical Translation Task: Transfer Learning in Neural Machine Translation |
Authors | Abdul Khan, Subhadarshi Panda, Jia Xu, Lampros Flokas |
Abstract | This paper describes the submission of Hunter Neural Machine Translation (NMT) to the WMT'18 Biomedical translation task from English to French. The discrepancy between training and test data distributions makes it challenging to translate text in new domains. Going beyond previous work that combines in-domain with out-of-domain models, we found accuracy and efficiency gains in combining different in-domain models. We conduct extensive experiments on NMT with transfer learning: we train on different in-domain biomedical datasets one after another, so that the parameters of each training stage serve as the initialization of the next. Together with a pre-trained out-of-domain News model, we enhanced translation quality by 3.73 BLEU points over the baseline. Furthermore, we applied ensemble learning on models from intermediate training epochs and achieved an improvement of 4.02 BLEU points over the baseline. Overall, our system is 11.29 BLEU points above the best system of last year on the EDP 2017 test set. |
Tasks | Domain Adaptation, Machine Translation, Transfer Learning |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6447/ |
PWC | https://paperswithcode.com/paper/hunter-nmt-system-for-wmt18-biomedical |
Repo | |
Framework | |
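The stage-wise transfer recipe described in the Hunter NMT abstract (each training stage initializes from the previous one) can be illustrated with a minimal, hedged sketch. The sketch below uses full-batch gradient descent on a toy least-squares "model" as a stand-in for NMT training; all function names and hyperparameters are illustrative, not from the paper.

```python
import numpy as np

def gd_train(X, y, w_init=None, lr=0.05, epochs=200):
    # Toy "training": full-batch gradient descent on least squares.
    # w_init warm-starts from the previous stage, mirroring the
    # paper's recipe of initializing each stage from the last.
    w = np.zeros(X.shape[1]) if w_init is None else w_init.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def transfer_pipeline(stages):
    # stages: list of (X, y) corpora, ordered out-of-domain -> in-domain;
    # parameters flow forward from one stage to the next
    w = None
    for X, y in stages:
        w = gd_train(X, y, w_init=w)
    return w
```

The key design point is simply that the trained parameters, not a fresh initialization, enter each successive in-domain stage.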
Rest-Katyusha: Exploiting the Solution’s Structure via Scheduled Restart Schemes
Title | Rest-Katyusha: Exploiting the Solution’s Structure via Scheduled Restart Schemes |
Authors | Junqi Tang, Mohammad Golbabaee, Francis Bach, Mike E. Davies |
Abstract | We propose a structure-adaptive variant of the state-of-the-art stochastic variance-reduced gradient algorithm Katyusha for regularized empirical risk minimization. The proposed method is able to exploit the intrinsic low-dimensional structure of the solution, such as sparsity or low rank which is enforced by a non-smooth regularization, to achieve even faster convergence rate. This provable algorithmic improvement is done by restarting the Katyusha algorithm according to restricted strong-convexity constants. We demonstrate the effectiveness of our approach via numerical experiments. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7325-rest-katyusha-exploiting-the-solutions-structure-via-scheduled-restart-schemes |
PDF | http://papers.nips.cc/paper/7325-rest-katyusha-exploiting-the-solutions-structure-via-scheduled-restart-schemes.pdf |
PWC | https://paperswithcode.com/paper/rest-katyusha-exploiting-the-solutions |
Repo | |
Framework | |
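The scheduled-restart idea in the Rest-Katyusha abstract can be illustrated with a minimal sketch. Since Katyusha itself is involved, the sketch below uses FISTA (accelerated proximal gradient) on a lasso problem as a stand-in accelerated method: momentum is reset at each scheduled restart while the iterate is kept as a warm start. All names and the restart schedule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def soft_threshold(x, t):
    # proximal operator of t * ||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista(A, b, lam, x0, n_iters, step):
    # accelerated proximal gradient for 0.5||Ax - b||^2 + lam*||x||_1
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iters):
        grad = A.T @ (A @ y - b)
        x_new = soft_threshold(y - step * grad, step * lam)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_new + ((t - 1.0) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x

def restarted_fista(A, b, lam, n_restarts, inner_iters):
    # scheduled restarts: momentum is reset every inner_iters iterations,
    # while the current iterate warm-starts the next epoch
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_restarts):
        x = fista(A, b, lam, x, inner_iters, step)
    return x
```

When the solution is sparse, restarting lets the accelerated method exploit the restricted strong convexity around it, which is the effect the paper makes provable for Katyusha.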
A Fast Resection-Intersection Method for the Known Rotation Problem
Title | A Fast Resection-Intersection Method for the Known Rotation Problem |
Authors | Qianggong Zhang, Tat-Jun Chin, Huu Minh Le |
Abstract | The known rotation problem refers to a special case of structure-from-motion where the absolute orientations of the cameras are known. When formulated as a minimax ($\ell_\infty$) problem on reprojection errors, the problem is an instance of pseudo-convex programming. Though theoretically tractable, solving the known rotation problem on large-scale data (1,000s of views, 10,000s of scene points) using existing methods can be very time-consuming. In this paper, we devise a fast algorithm for the known rotation problem. Our approach alternates between pose estimation and triangulation (i.e., resection-intersection) to break the problem into multiple simpler instances of pseudo-convex programming. The key to the vastly superior performance of our method lies in using a novel minimum enclosing ball (MEB) technique for the calculation of updating steps, which obviates the need for convex optimisation routines and greatly reduces memory footprint. We demonstrate the practicality of our method on large-scale problem instances which easily overwhelm current state-of-the-art algorithms (demo program available in supplementary). |
Tasks | Pose Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Zhang_A_Fast_Resection-Intersection_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_A_Fast_Resection-Intersection_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/a-fast-resection-intersection-method-for-the |
Repo | |
Framework | |
Using authentic texts for grammar exercises for a minority language
Title | Using authentic texts for grammar exercises for a minority language |
Authors | Lene Antonsen, Chiara Argese |
Abstract | |
Tasks | |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-7101/ |
PWC | https://paperswithcode.com/paper/using-authentic-texts-for-grammar-exercises |
Repo | |
Framework | |
Training Deep Models Faster with Robust, Approximate Importance Sampling
Title | Training Deep Models Faster with Robust, Approximate Importance Sampling |
Authors | Tyler B. Johnson, Carlos Guestrin |
Abstract | In theory, importance sampling speeds up stochastic gradient algorithms for supervised learning by prioritizing training examples. In practice, the cost of computing importances greatly limits the impact of importance sampling. We propose a robust, approximate importance sampling procedure (RAIS) for stochastic gradient descent. By approximating the ideal sampling distribution using robust optimization, RAIS provides much of the benefit of exact importance sampling with drastically reduced overhead. Empirically, we find RAIS-SGD and standard SGD follow similar learning curves, but RAIS moves faster through these paths, achieving speed-ups of at least 20% and sometimes much more. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7957-training-deep-models-faster-with-robust-approximate-importance-sampling |
PDF | http://papers.nips.cc/paper/7957-training-deep-models-faster-with-robust-approximate-importance-sampling.pdf |
PWC | https://paperswithcode.com/paper/training-deep-models-faster-with-robust |
Repo | |
Framework | |
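The "exact importance sampling" that RAIS approximates can be illustrated with a minimal sketch: draw example i with probability p_i and reweight its gradient by 1/(n·p_i) so the stochastic gradient stays unbiased. The sketch below applies this to toy least squares; the sampling distribution (proportional to example norms) and all names are illustrative assumptions, not the paper's RAIS procedure.

```python
import numpy as np

def importance_sgd(X, y, probs, lr=0.01, steps=3000, seed=0):
    # SGD for least squares where example i is drawn with probability probs[i]
    # (probs must sum to 1); the 1/(n * probs[i]) factor keeps the
    # gradient estimate unbiased under non-uniform sampling.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for i in rng.choice(n, size=steps, p=probs):
        grad = (X[i] @ w - y[i]) * X[i]   # per-example gradient
        w -= lr * grad / (n * probs[i])   # importance-weighted update
    return w
```

A common heuristic is to take probs proportional to per-example gradient-norm bounds (here, row norms of X); RAIS's contribution is making such a distribution cheap to maintain via robust optimization.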
Sanskrit n-Retroflexion is Input-Output Tier-Based Strictly Local
Title | Sanskrit n-Retroflexion is Input-Output Tier-Based Strictly Local |
Authors | Thomas Graf, Connor Mayer |
Abstract | Sanskrit /n/-retroflexion is one of the most complex segmental processes in phonology. While it is still star-free, it does not fit in any of the subregular classes that are commonly entertained in the literature. We show that when construed as a phonotactic dependency, the process fits into a class we call input-output tier-based strictly local (IO-TSL), a natural extension of the familiar class TSL. IO-TSL increases the power of TSL's tier projection function by making it an input-output strictly local transduction. Assuming that /n/-retroflexion represents the upper bound on the complexity of segmental phonology, this shows that all of segmental phonology can be captured by combining the intuitive notion of tiers with the independently motivated machinery of strictly local mappings. |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-5817/ |
PWC | https://paperswithcode.com/paper/sanskrit-n-retroflexion-is-input-output-tier |
Repo | |
Framework | |
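The familiar class TSL that the abstract above extends can be illustrated with a minimal checker: project the string onto a tier and forbid certain adjacent pairs on that tier. The toy retroflexion-style constraint below (tier and forbidden bigram chosen for illustration) is only a TSL-2 approximation; per the paper, the real Sanskrit process needs the richer IO-TSL projection.

```python
def tier_projection(segments, tier):
    # erase every segment not on the tier
    return [s for s in segments if s in tier]

def tsl2_accepts(segments, tier, forbidden_bigrams):
    # TSL-2 grammar: a string is well-formed iff its tier projection
    # contains no forbidden adjacent pair; segments separated by
    # non-tier material thus become adjacent on the tier.
    proj = tier_projection(segments, tier)
    return all((a, b) not in forbidden_bigrams
               for a, b in zip(proj, proj[1:]))
```

Because non-tier segments are erased before the bigram check, the constraint applies at unbounded distance in the surface string, which is the hallmark of tier-based locality.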
Classification-Driven Dynamic Image Enhancement
Title | Classification-Driven Dynamic Image Enhancement |
Authors | Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc Van Gool, Rainer Stiefelhagen |
Abstract | Convolutional neural networks rely on image texture and structure to serve as discriminative features to classify the image content. Image enhancement techniques can be used as preprocessing steps to help improve the overall image quality and in turn improve the overall effectiveness of a CNN. Existing image enhancement methods, however, are designed to improve the perceptual quality of an image for a human observer. In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception. To this end, we present a unified CNN architecture that uses a range of enhancement filters that can enhance image-specific details via end-to-end dynamic filter learning. We demonstrate the effectiveness of this strategy on four challenging benchmark datasets for fine-grained, object, scene and texture classification: CUB-200-2011, PASCAL-VOC2007, MIT-Indoor, and DTD. Experiments using our proposed enhancement show promising results on all the datasets. In addition, our approach is capable of improving the performance of all generic CNN architectures. |
Tasks | Image Classification, Image Enhancement, Texture Classification |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Sharma_Classification-Driven_Dynamic_Image_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Sharma_Classification-Driven_Dynamic_Image_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/classification-driven-dynamic-image-1 |
Repo | |
Framework | |
Coded Sparse Matrix Multiplication
Title | Coded Sparse Matrix Multiplication |
Authors | Sinong Wang, Jiashang Liu, Ness Shroff |
Abstract | In a large-scale and distributed matrix multiplication problem $C=A^{\intercal}B$, where $C\in\mathbb{R}^{r\times t}$, the coded computation plays an important role to effectively deal with “stragglers” (distributed computations that may get delayed due to few slow or faulty processors). However, existing coded schemes could destroy the significant sparsity that exists in large-scale machine learning problems, and could result in much higher computation overhead, i.e., $O(rt)$ decoding time. In this paper, we develop a new coded computation strategy, we call sparse code, which achieves near optimal recovery threshold, low computation overhead, and linear decoding time $O(nnz(C))$. We implement our scheme and demonstrate the advantage of the approach over both uncoded and current fastest coded strategies. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1924 |
PDF | http://proceedings.mlr.press/v80/wang18e/wang18e.pdf |
PWC | https://paperswithcode.com/paper/coded-sparse-matrix-multiplication |
Repo | |
Framework | |
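The straggler-coding idea in the abstract above can be illustrated with the textbook (3, 2) scheme rather than the paper's sparse code: split the columns of A into two blocks, add one parity block, and recover C = AᵀB from the two fastest of three workers. This is a hedged illustration of coded computation generally; the block sizes and names are assumptions.

```python
import numpy as np

def encode(A):
    # Split the columns of A in half and add a parity block;
    # any 2 of the 3 resulting worker products suffice to recover
    # C = A.T @ B. Assumes an even number of columns for simplicity.
    r = A.shape[1] // 2
    A1, A2 = A[:, :r], A[:, r:]
    return [A1, A2, A1 + A2]  # one encoded task per worker

def decode(results):
    # results: dict worker_id -> partial product, from the 2 fastest workers
    if 0 in results and 1 in results:       # no straggler among systematics
        C1, C2 = results[0], results[1]
    elif 0 in results:                      # worker 1 straggled
        C1 = results[0]
        C2 = results[2] - results[0]
    else:                                   # worker 0 straggled
        C2 = results[1]
        C1 = results[2] - results[1]
    return np.vstack([C1, C2])              # stack row blocks of C
```

The paper's point is that dense codes like the parity sum above destroy sparsity in A and B and cost O(rt) to decode, whereas its sparse code keeps decoding linear in nnz(C).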
Augment and Reduce: Stochastic Inference for Large Categorical Distributions
Title | Augment and Reduce: Stochastic Inference for Large Categorical Distributions |
Authors | Francisco Ruiz, Michalis Titsias, Adji Bousso Dieng, David Blei |
Abstract | Categorical distributions are ubiquitous in machine learning, e.g., in classification, language models, and recommendation systems. However, when the number of possible outcomes is very large, using categorical distributions becomes computationally expensive, as the complexity scales linearly with the number of outcomes. To address this problem, we propose augment and reduce (A&R), a method to alleviate the computational complexity. A&R uses two ideas: latent variable augmentation and stochastic variational inference. It maximizes a lower bound on the marginal likelihood of the data. Unlike existing methods which are specific to softmax, A&R is more general and is amenable to other categorical models, such as multinomial probit. On several large-scale classification problems, we show that A&R provides a tighter bound on the marginal likelihood and has better predictive performance than existing approaches. |
Tasks | Recommendation Systems |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1936 |
PDF | http://proceedings.mlr.press/v80/ruiz18a/ruiz18a.pdf |
PWC | https://paperswithcode.com/paper/augment-and-reduce-stochastic-inference-for |
Repo | |
Framework | |
Simplification Using Paraphrases and Context-Based Lexical Substitution
Title | Simplification Using Paraphrases and Context-Based Lexical Substitution |
Authors | Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, Chris Callison-Burch |
Abstract | Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood. We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. We compare our CWI and lexical simplification models to several baselines, and evaluate the performance of our simplification system against human judgments. The results show that our models are able to detect complex words with higher accuracy than other commonly used methods, and propose good simplification substitutes in context. They also highlight the limited contribution of context features for CWI, which nonetheless improve simplification compared to context-unaware models. |
Tasks | Complex Word Identification, Lexical Simplification, Text Simplification |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-1019/ |
PWC | https://paperswithcode.com/paper/simplification-using-paraphrases-and-context |
Repo | |
Framework | |
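The word-embedding lexical substitution step described in the abstract above can be illustrated with a minimal sketch: score each candidate substitute by its embedding similarity to the target word plus its similarity to the averaged context, a common baseline. The scoring formula, names, and toy vectors are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def rank_substitutes(target, candidates, context, vectors):
    # vectors: dict word -> embedding (e.g. from pre-trained GloVe; assumed).
    # Score = cos(candidate, target) + cos(candidate, mean context vector),
    # so substitutes must both preserve meaning and fit the context.
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    ctx = np.mean([vectors[w] for w in context if w in vectors], axis=0)
    t = vectors[target]
    scored = [(c, cos(vectors[c], t) + cos(vectors[c], ctx))
              for c in candidates if c in vectors]
    return sorted(scored, key=lambda pair: -pair[1])
```

A complex word identification model would first flag the target; the ranker above then proposes the simpler paraphrase whose combined score is highest.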
Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods
Title | Delta vs. N-Gram Tracing: Evaluating the Robustness of Authorship Attribution Methods |
Authors | Thomas Proisl, Stefan Evert, Fotis Jannidis, Christof Schöch, Leonard Konle, Steffen Pielström |
Abstract | |
Tasks | Optical Character Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1523/ |
PWC | https://paperswithcode.com/paper/delta-vs-n-gram-tracing-evaluating-the |
Repo | |
Framework | |
AI Clerk: 會賣東西的機器人 (AI Clerk: Shopping Assistant) [In Chinese]
Title | AI Clerk: 會賣東西的機器人 (AI Clerk: Shopping Assistant) [In Chinese] |
Authors | Ru-Yng Chang, Huan-Yi Pan, Bo-Lin Lin, Wei-Lun Chen, Jia En Hsieh, Wen-Yu Huang, Lu-Hsuan Li |
Abstract | |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1022/ |
PWC | https://paperswithcode.com/paper/ai-clerk-e3eca-aoo-ai-clerk-shopping |
Repo | |
Framework | |