Paper Group AWR 151
pioNER: Datasets and Baselines for Armenian Named Entity Recognition
Title | pioNER: Datasets and Baselines for Armenian Named Entity Recognition |
Authors | Tsolak Ghukasyan, Garnik Davtyan, Karen Avetisyan, Ivan Andrianov |
Abstract | In this work, we tackle the problem of Armenian named entity recognition, providing silver- and gold-standard datasets as well as establishing baseline results on popular models. We present a 163000-token named entity corpus automatically generated and annotated from Wikipedia, and another 53400-token corpus of news sentences with manual annotation of people, organization and location named entities. The corpora were used to train and evaluate several popular named entity recognition models. Alongside the datasets, we release 50-, 100-, 200-, 300-dimensional GloVe word embeddings trained on a collection of Armenian texts from Wikipedia, news, blogs, and encyclopedia. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08699v1 |
http://arxiv.org/pdf/1810.08699v1.pdf | |
PWC | https://paperswithcode.com/paper/pioner-datasets-and-baselines-for-armenian |
Repo | https://github.com/Hrant-Khachatrian/Machine-Learning-in-Armenia |
Framework | none |
Dialog-based Interactive Image Retrieval
Title | Dialog-based Interactive Image Retrieval |
Authors | Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris |
Abstract | Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search that enables users to provide feedback via natural language, allowing for more natural and effective interaction. We formulate the task of dialog-based interactive image retrieval as a reinforcement learning problem, and reward the dialog system for improving the rank of the target image during each dialog turn. To mitigate the cumbersome and costly process of collecting human-machine conversations as the dialog system learns, we train our system with a user simulator, which is itself trained to describe the differences between target and candidate images. The efficacy of our approach is demonstrated in a footwear retrieval application. Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface. |
Tasks | Image Retrieval, Visual Dialog |
Published | 2018-05-01 |
URL | http://arxiv.org/abs/1805.00145v3 |
http://arxiv.org/pdf/1805.00145v3.pdf | |
PWC | https://paperswithcode.com/paper/dialog-based-interactive-image-retrieval |
Repo | https://github.com/XiaoxiaoGuo/fashion-retrieval |
Framework | pytorch |
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Title | Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition |
Authors | Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio |
Abstract | Recently, the connectionist temporal classification (CTC) model, coupled with recurrent (RNN) or convolutional neural networks (CNN), has made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency in processing multidimensional inputs as entities, encoding internal dependencies, and solving many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs. |
Tasks | Speech Recognition |
Published | 2018-06-20 |
URL | http://arxiv.org/abs/1806.07789v1 |
http://arxiv.org/pdf/1806.07789v1.pdf | |
PWC | https://paperswithcode.com/paper/quaternion-convolutional-neural-networks-for-1 |
Repo | https://github.com/Riccardo-Vecchi/Pytorch-Quaternion-Neural-Networks |
Framework | pytorch |
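The quaternion algebra that the abstract above refers to is built on the Hamilton product. The following is a minimal sketch of that product in plain Python, not the authors' implementation. The function name `hamilton_product` and the feature packing in the usage lines (real part zero, then an energy value and its first and second derivatives) are assumptions made for illustration.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Hypothetical packing of one time-frequency bin: the energy and its first
# and second derivatives go into the three imaginary components.
feature = (0.0, 0.7, 0.1, -0.05)
weight = (0.5, 0.5, 0.5, 0.5)  # illustrative quaternion weight
mixed = hamilton_product(weight, feature)
```

Because every output component depends on all four input components, a single quaternion weight mixes the grouped features jointly, which is the sense in which they are processed as one entity rather than as independent real values.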
A Spectral Approach to Gradient Estimation for Implicit Distributions
Title | A Spectral Approach to Gradient Estimation for Implicit Distributions |
Authors | Jiaxin Shi, Shengyang Sun, Jun Zhu |
Abstract | Recently there have been increasing interests in learning and inference with implicit distributions (i.e., distributions without tractable densities). To this end, we develop a gradient estimator for implicit distributions based on Stein’s identity and a spectral decomposition of kernel operators, where the eigenfunctions are approximated by the Nyström method. Unlike the previous works that only provide estimates at the sample points, our approach directly estimates the gradient function, thus allows for a simple and principled out-of-sample extension. We provide theoretical results on the error bound of the estimator and discuss the bias-variance tradeoff in practice. The effectiveness of our method is demonstrated by applications to gradient-free Hamiltonian Monte Carlo and variational inference with implicit distributions. Finally, we discuss the intuition behind the estimator by drawing connections between the Nyström method and kernel PCA, which indicates that the estimator can automatically adapt to the geometry of the underlying distribution. |
Tasks | |
Published | 2018-06-07 |
URL | http://arxiv.org/abs/1806.02925v1 |
http://arxiv.org/pdf/1806.02925v1.pdf | |
PWC | https://paperswithcode.com/paper/a-spectral-approach-to-gradient-estimation |
Repo | https://github.com/thjashin/spectral-stein-grad |
Framework | tf |
Zero-shot Neural Transfer for Cross-lingual Entity Linking
Title | Zero-shot Neural Transfer for Cross-lingual Entity Linking |
Authors | Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell |
Abstract | Cross-lingual entity linking maps an entity mention in a source language to its corresponding entry in a structured knowledge base that is in a different (target) language. While previous work relies heavily on bilingual lexical resources to bridge the gap between the source and the target languages, these resources are scarce or unavailable for many low-resource languages. To address this problem, we investigate zero-shot cross-lingual entity linking, in which we assume no bilingual lexical resources are available in the source low-resource language. Specifically, we propose pivot-based entity linking, which leverages information from a high-resource “pivot” language to train character-level neural entity linking models that are transferred to the source low-resource language in a zero-shot manner. With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario. Further, we also investigate the use of language-universal phonological representations which improves average accuracy (absolute) by 36% when transferring between languages that use different scripts. |
Tasks | Cross-Lingual Entity Linking, Entity Linking |
Published | 2018-11-09 |
URL | http://arxiv.org/abs/1811.04154v1 |
http://arxiv.org/pdf/1811.04154v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-neural-transfer-for-cross-lingual |
Repo | https://github.com/neulab/pivot-based-entity-linking |
Framework | none |
Modeling Camera Effects to Improve Visual Learning from Synthetic Data
Title | Modeling Camera Effects to Improve Visual Learning from Synthetic Data |
Authors | Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson |
Abstract | Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network performance on visual tasks trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color cast) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes. |
Tasks | Object Detection |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07721v6 |
http://arxiv.org/pdf/1803.07721v6.pdf | |
PWC | https://paperswithcode.com/paper/modeling-camera-effects-to-improve-visual |
Repo | https://github.com/alexacarlson/SensorEffectAugmentation |
Framework | tf |
SMIT: Stochastic Multi-Label Image-to-Image Translation
Title | SMIT: Stochastic Multi-Label Image-to-Image Translation |
Authors | Andrés Romero, Pablo Arbeláez, Luc Van Gool, Radu Timofte |
Abstract | Cross-domain mapping has been a very active topic in recent years. Given one image, its main purpose is to translate it to the desired target domain, or multiple domains in the case of multiple labels. This problem is highly challenging due to three main reasons: (i) unpaired datasets, (ii) multiple attributes, and (iii) the multimodality (e.g., style) associated with the translation. Most of the existing state-of-the-art has focused only on two reasons, i.e. either on (i) and (ii), or (i) and (iii). In this work, we propose a joint framework (i, ii, iii) of diversity and multi-mapping image-to-image translations, using a single generator to conditionally produce countless and unique fake images that hold the underlying characteristics of the source image. Our system does not use style regularization, instead, it uses an embedding representation that we call domain embedding for both domain and style. Extensive experiments over different datasets demonstrate the effectiveness of our proposed approach in comparison with the state-of-the-art in both multi-label and multimodal problems. Additionally, our method is able to generalize under different scenarios: continuous style interpolation, continuous label interpolation, and fine-grained mapping. Code and pretrained models are available at https://github.com/BCV-Uniandes/SMIT. |
Tasks | Image-to-Image Translation |
Published | 2018-12-10 |
URL | https://arxiv.org/abs/1812.03704v3 |
https://arxiv.org/pdf/1812.03704v3.pdf | |
PWC | https://paperswithcode.com/paper/smit-stochastic-multi-label-image-to-image |
Repo | https://github.com/BCV-Uniandes/SMIT |
Framework | pytorch |
Optimization of Molecules via Deep Reinforcement Learning
Title | Optimization of Molecules via Deep Reinforcement Learning |
Authors | Zhenpeng Zhou, Steven Kearnes, Li Li, Richard N. Zare, Patrick Riley |
Abstract | We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double $Q$-learning and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. We further show the path through chemical space to achieve optimization for a molecule to understand how the model works. |
Tasks | Q-Learning |
Published | 2018-10-19 |
URL | http://arxiv.org/abs/1810.08678v3 |
http://arxiv.org/pdf/1810.08678v3.pdf | |
PWC | https://paperswithcode.com/paper/optimization-of-molecules-via-deep |
Repo | https://github.com/junyoung0131/Mol-DQN |
Framework | none |
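The abstract above names double Q-learning as one of its building blocks. As a generic illustration of that technique, and not of MolDQN's actual code, the sketch below computes a double Q-learning bootstrap target: the online network chooses the next action and the target network evaluates it. The function name, argument names, and the example numbers are hypothetical.

```python
def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Bootstrap target for double Q-learning.

    q_online_next and q_target_next are lists of action values for the next
    state, produced by the online and target networks respectively.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online estimates...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...while action evaluation uses the target estimates.
    return reward + gamma * q_target_next[best_action]


# Illustrative values only: two candidate actions in the next state.
target = double_q_target(1.0, 0.9, [0.2, 0.8], [0.5, 0.3], done=False)
```

Separating selection from evaluation is what distinguishes this from the standard Q-learning target, which would take the maximum of a single set of estimates.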
Document-Level Neural Machine Translation with Hierarchical Attention Networks
Title | Document-Level Neural Machine Translation with Hierarchical Attention Networks |
Authors | Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson |
Abstract | Neural Machine Translation (NMT) can be improved by including document-level contextual information. For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner. The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model’s own previous hidden states. Experiments show that hierarchical attention significantly improves the BLEU score over a strong NMT baseline with the state-of-the-art in context-aware methods, and that both the encoder and decoder benefit from context in complementary ways. |
Tasks | Machine Translation |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01576v2 |
http://arxiv.org/pdf/1809.01576v2.pdf | |
PWC | https://paperswithcode.com/paper/document-level-neural-machine-translation |
Repo | https://github.com/idiap/HAN_NMT |
Framework | pytorch |
NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning
Title | NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning |
Authors | Álvaro Peris, Francisco Casacuberta |
Abstract | We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning. NMT-Keras is based on an extended version of the popular Keras library, and it runs on Theano and Tensorflow. State-of-the-art neural machine translation models are deployed and used following the high-level framework provided by Keras. Given its high modularity and flexibility, it also has been extended to tackle different problems, such as image and video captioning, sentence classification and visual question answering. |
Tasks | Machine Translation, Question Answering, Sentence Classification, Video Captioning, Visual Question Answering |
Published | 2018-07-09 |
URL | http://arxiv.org/abs/1807.03096v3 |
http://arxiv.org/pdf/1807.03096v3.pdf | |
PWC | https://paperswithcode.com/paper/nmt-keras-a-very-flexible-toolkit-with-a |
Repo | https://github.com/lvapeab/nmt-keras |
Framework | tf |
Online normalizer calculation for softmax
Title | Online normalizer calculation for softmax |
Authors | Maxim Milakov, Natalia Gimelshein |
Abstract | The Softmax function is ubiquitous in machine learning, and multiple previous works have suggested faster alternatives for it. In this paper we propose a way to compute classical Softmax with fewer memory accesses and hypothesize that this reduction in memory accesses should improve Softmax performance on actual hardware. The benchmarks confirm this hypothesis: Softmax accelerates by up to 1.3x and Softmax+TopK combined and fused by up to 5x. |
Tasks | |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02867v2 |
http://arxiv.org/pdf/1805.02867v2.pdf | |
PWC | https://paperswithcode.com/paper/online-normalizer-calculation-for-softmax |
Repo | https://github.com/NVIDIA/online-softmax |
Framework | none |
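The idea of computing the softmax normalizer online can be sketched in a few lines. The version below is a plain-Python illustration under stated assumptions, not the paper's CUDA implementation: it keeps a running maximum and a running sum, rescaling the sum whenever the maximum changes, so the input is read once for the normalizer instead of twice. The function names are hypothetical, and inputs are assumed to be finite floats.

```python
import math


def safe_softmax(xs):
    """Reference version: one pass for the max, one for the sum, one to divide."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def online_softmax(xs):
    """Max and normalizer computed together in a single pass over xs."""
    running_max = float("-inf")
    normalizer = 0.0
    for x in xs:
        new_max = max(running_max, x)
        # Rescale the accumulated sum to the new maximum, then add this term.
        normalizer = normalizer * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    return [math.exp(x - running_max) / normalizer for x in xs]


probs = online_softmax([1.0, 2.0, 3.0, 1000.0])
```

Both functions should agree up to floating-point rounding; the online version simply fuses the first two passes of the reference version into one loop.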
Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization
Title | Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization |
Authors | Xiangxiang Chu |
Abstract | As the most successful variant and improvement of Trust Region Policy Optimization (TRPO), proximal policy optimization (PPO) has been widely applied across various domains with several advantages: efficient data utilization, easy implementation, and good parallelism. In this paper, we propose another powerful variant: a first-order gradient reinforcement learning algorithm called Policy Optimization with Penalized Point Probability Distance (POP3D), where the point probability distance is a lower bound to the square of the total variation divergence. First, we discuss the shortcomings of several commonly used algorithms, which partly motivate our method. Second, we show how POP3D overcomes these shortcomings. Third, we examine its mechanism from the perspective of the solution manifold. Finally, we make quantitative comparisons among several state-of-the-art algorithms on common benchmarks. Simulation results show that POP3D is highly competitive with PPO. Our code is released at https://github.com/paperwithcode/pop3d. |
Tasks | Atari Games |
Published | 2018-07-02 |
URL | http://arxiv.org/abs/1807.00442v4 |
http://arxiv.org/pdf/1807.00442v4.pdf | |
PWC | https://paperswithcode.com/paper/policy-optimization-with-penalized-point |
Repo | https://github.com/cxxgtxy/POP3D |
Framework | tf |
Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks
Title | Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks |
Authors | Aurelia Bustos, Antonio Pertusa |
Abstract | Interventional cancer clinical trials are generally too restrictive, and some patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age. The efficacy and safety of new treatments for patients with these characteristics are, therefore, not defined. In this work, we built a model to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. We used protocols from cancer clinical trials that were available in public registries from the last 18 years to train word-embeddings, and we constructed a dataset of 6M short free-texts labeled as eligible or not eligible. A text classifier was trained using deep neural networks, with pre-trained word-embeddings as inputs, to predict whether or not short free-text statements describing clinical information were considered eligible. We additionally analyzed the semantic reasoning of the word-embedding representations obtained and were able to identify equivalent treatments for a type of tumor analogous with the drugs used to treat other tumors. We show that representation learning using deep neural networks can be successfully leveraged to extract the medical knowledge from clinical trial protocols for potentially assisting practitioners when prescribing treatments. |
Tasks | Representation Learning, Word Embeddings |
Published | 2018-03-22 |
URL | http://arxiv.org/abs/1803.08312v3 |
http://arxiv.org/pdf/1803.08312v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-eligibility-in-cancer-clinical |
Repo | https://github.com/auriml/capstone |
Framework | tf |
Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning
Title | Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning |
Authors | Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han |
Abstract | Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases. Although recent studies explored using neural network models for BioNER to free experts from manual feature engineering, the performance remains limited by the available training data for each entity type. Results: We propose a multi-task learning framework for BioNER to collectively use the training data of different types of entities and improve the performance on each of them. In experiments on 15 benchmark BioNER datasets, our multi-task model achieves substantially better performance compared with state-of-the-art BioNER systems and baseline neural sequence labeling models. Further analysis shows that the large performance gains come from sharing character- and word-level information among relevant biomedical entities across differently labeled corpora. |
Tasks | Feature Engineering, Multi-Task Learning, Named Entity Recognition |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1801.09851v4 |
http://arxiv.org/pdf/1801.09851v4.pdf | |
PWC | https://paperswithcode.com/paper/cross-type-biomedical-named-entity |
Repo | https://github.com/yuzhimanhua/lm-lstm-crf |
Framework | pytorch |
Towards Better Interpretability in Deep Q-Networks
Title | Towards Better Interpretability in Deep Q-Networks |
Authors | Raghuram Mandyam Annasamy, Katia Sycara |
Abstract | Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these networks seem to learn are far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training. |
Tasks | Q-Learning |
Published | 2018-09-15 |
URL | http://arxiv.org/abs/1809.05630v2 |
http://arxiv.org/pdf/1809.05630v2.pdf | |
PWC | https://paperswithcode.com/paper/towards-better-interpretability-in-deep-q |
Repo | https://github.com/maraghuram/I-DQN |
Framework | pytorch |