October 21, 2019

Paper Group AWR 151

pioNER: Datasets and Baselines for Armenian Named Entity Recognition

Title pioNER: Datasets and Baselines for Armenian Named Entity Recognition
Authors Tsolak Ghukasyan, Garnik Davtyan, Karen Avetisyan, Ivan Andrianov
Abstract In this work, we tackle the problem of Armenian named entity recognition, providing silver- and gold-standard datasets as well as establishing baseline results on popular models. We present a 163000-token named entity corpus automatically generated and annotated from Wikipedia, and another 53400-token corpus of news sentences with manual annotation of people, organization and location named entities. The corpora were used to train and evaluate several popular named entity recognition models. Alongside the datasets, we release 50-, 100-, 200-, 300-dimensional GloVe word embeddings trained on a collection of Armenian texts from Wikipedia, news, blogs, and encyclopedia.
Tasks Named Entity Recognition, Word Embeddings
Published 2018-10-19
URL http://arxiv.org/abs/1810.08699v1
PDF http://arxiv.org/pdf/1810.08699v1.pdf
PWC https://paperswithcode.com/paper/pioner-datasets-and-baselines-for-armenian
Repo https://github.com/Hrant-Khachatrian/Machine-Learning-in-Armenia
Framework none

Dialog-based Interactive Image Retrieval

Title Dialog-based Interactive Image Retrieval
Authors Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris
Abstract Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search that enables users to provide feedback via natural language, allowing for more natural and effective interaction. We formulate the task of dialog-based interactive image retrieval as a reinforcement learning problem, and reward the dialog system for improving the rank of the target image during each dialog turn. To mitigate the cumbersome and costly process of collecting human-machine conversations as the dialog system learns, we train our system with a user simulator, which is itself trained to describe the differences between target and candidate images. The efficacy of our approach is demonstrated in a footwear retrieval application. Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface.
Tasks Image Retrieval, Visual Dialog
Published 2018-05-01
URL http://arxiv.org/abs/1805.00145v3
PDF http://arxiv.org/pdf/1805.00145v3.pdf
PWC https://paperswithcode.com/paper/dialog-based-interactive-image-retrieval
Repo https://github.com/XiaoxiaoGuo/fashion-retrieval
Framework pytorch

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

Title Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Authors Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
Abstract Recently, the connectionist temporal classification (CTC) model, coupled with recurrent (RNN) or convolutional neural networks (CNN), has made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency in processing multidimensional inputs as entities, encoding internal dependencies, and solving many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs.
Tasks Speech Recognition
Published 2018-06-20
URL http://arxiv.org/abs/1806.07789v1
PDF http://arxiv.org/pdf/1806.07789v1.pdf
PWC https://paperswithcode.com/paper/quaternion-convolutional-neural-networks-for-1
Repo https://github.com/Riccardo-Vecchi/Pytorch-Quaternion-Neural-Networks
Framework pytorch
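
The quaternion algebra the abstract refers to centers on the Hamilton product, which mixes all four components of two quaternions in one operation. The sketch below is illustrative only and is not taken from the linked repository: `hamilton_product` implements the standard product, while `pack_frame` shows one assumed way of packing an energy value and its first and second derivatives into a single quaternion with a zero real part. Both function names are hypothetical.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def pack_frame(energy, delta, delta2):
    """Assumed packing: one feature and its two derivatives as imaginary parts."""
    return (0.0, energy, delta, delta2)


# A single quaternion weight (4 numbers) transforms a 4-component input,
# where a real-valued dense map over the same 4 components would need 16.
weight = (0.5, 0.1, -0.2, 0.3)
frame = pack_frame(1.0, 0.2, -0.1)
output = hamilton_product(weight, frame)
```

The product is not commutative, so the order of weight and input matters; a real quaternion layer would also add a bias and a nonlinearity, which are omitted here.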

A Spectral Approach to Gradient Estimation for Implicit Distributions

Title A Spectral Approach to Gradient Estimation for Implicit Distributions
Authors Jiaxin Shi, Shengyang Sun, Jun Zhu
Abstract Recently there has been increasing interest in learning and inference with implicit distributions (i.e., distributions without tractable densities). To this end, we develop a gradient estimator for implicit distributions based on Stein’s identity and a spectral decomposition of kernel operators, where the eigenfunctions are approximated by the Nyström method. Unlike previous works that only provide estimates at the sample points, our approach directly estimates the gradient function, thus allowing for a simple and principled out-of-sample extension. We provide theoretical results on the error bound of the estimator and discuss the bias-variance tradeoff in practice. The effectiveness of our method is demonstrated by applications to gradient-free Hamiltonian Monte Carlo and variational inference with implicit distributions. Finally, we discuss the intuition behind the estimator by drawing connections between the Nyström method and kernel PCA, which indicates that the estimator can automatically adapt to the geometry of the underlying distribution.
Tasks
Published 2018-06-07
URL http://arxiv.org/abs/1806.02925v1
PDF http://arxiv.org/pdf/1806.02925v1.pdf
PWC https://paperswithcode.com/paper/a-spectral-approach-to-gradient-estimation
Repo https://github.com/thjashin/spectral-stein-grad
Framework tf
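
For reference, Stein's identity, which the abstract builds on, can be written in its standard form as follows. The notation is an assumption made here rather than taken from the paper: q is a smooth density, h is a smooth vector-valued test function, and the product h(x) q(x) is assumed to vanish at the boundary of the support.

```latex
\mathbb{E}_{x \sim q}\left[ h(x)\, \nabla_x \log q(x)^{\top} + \nabla_x h(x) \right] = 0
```

The identity relates the score function of q to expectations that can be estimated from samples alone, which is what makes it usable when the density itself is intractable.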

Zero-shot Neural Transfer for Cross-lingual Entity Linking

Title Zero-shot Neural Transfer for Cross-lingual Entity Linking
Authors Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell
Abstract Cross-lingual entity linking maps an entity mention in a source language to its corresponding entry in a structured knowledge base that is in a different (target) language. While previous work relies heavily on bilingual lexical resources to bridge the gap between the source and the target languages, these resources are scarce or unavailable for many low-resource languages. To address this problem, we investigate zero-shot cross-lingual entity linking, in which we assume no bilingual lexical resources are available in the source low-resource language. Specifically, we propose pivot-based entity linking, which leverages information from a high-resource “pivot” language to train character-level neural entity linking models that are transferred to the source low-resource language in a zero-shot manner. With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario. Further, we also investigate the use of language-universal phonological representations which improves average accuracy (absolute) by 36% when transferring between languages that use different scripts.
Tasks Cross-Lingual Entity Linking, Entity Linking
Published 2018-11-09
URL http://arxiv.org/abs/1811.04154v1
PDF http://arxiv.org/pdf/1811.04154v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-neural-transfer-for-cross-lingual
Repo https://github.com/neulab/pivot-based-entity-linking
Framework none

Modeling Camera Effects to Improve Visual Learning from Synthetic Data

Title Modeling Camera Effects to Improve Visual Learning from Synthetic Data
Authors Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson
Abstract Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network performance on visual tasks trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color cast) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes.
Tasks Object Detection
Published 2018-03-21
URL http://arxiv.org/abs/1803.07721v6
PDF http://arxiv.org/pdf/1803.07721v6.pdf
PWC https://paperswithcode.com/paper/modeling-camera-effects-to-improve-visual
Repo https://github.com/alexacarlson/SensorEffectAugmentation
Framework tf

SMIT: Stochastic Multi-Label Image-to-Image Translation

Title SMIT: Stochastic Multi-Label Image-to-Image Translation
Authors Andrés Romero, Pablo Arbeláez, Luc Van Gool, Radu Timofte
Abstract Cross-domain mapping has been a very active topic in recent years. Given one image, its main purpose is to translate it to the desired target domain, or multiple domains in the case of multiple labels. This problem is highly challenging due to three main reasons: (i) unpaired datasets, (ii) multiple attributes, and (iii) the multimodality (e.g., style) associated with the translation. Most of the existing state-of-the-art has focused only on two reasons, i.e. either on (i) and (ii), or (i) and (iii). In this work, we propose a joint framework (i, ii, iii) of diversity and multi-mapping image-to-image translations, using a single generator to conditionally produce countless and unique fake images that hold the underlying characteristics of the source image. Our system does not use style regularization, instead, it uses an embedding representation that we call domain embedding for both domain and style. Extensive experiments over different datasets demonstrate the effectiveness of our proposed approach in comparison with the state-of-the-art in both multi-label and multimodal problems. Additionally, our method is able to generalize under different scenarios: continuous style interpolation, continuous label interpolation, and fine-grained mapping. Code and pretrained models are available at https://github.com/BCV-Uniandes/SMIT.
Tasks Image-to-Image Translation
Published 2018-12-10
URL https://arxiv.org/abs/1812.03704v3
PDF https://arxiv.org/pdf/1812.03704v3.pdf
PWC https://paperswithcode.com/paper/smit-stochastic-multi-label-image-to-image
Repo https://github.com/BCV-Uniandes/SMIT
Framework pytorch

Optimization of Molecules via Deep Reinforcement Learning

Title Optimization of Molecules via Deep Reinforcement Learning
Authors Zhenpeng Zhou, Steven Kearnes, Li Li, Richard N. Zare, Patrick Riley
Abstract We present a framework, which we call Molecule Deep Q-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double Q-learning and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. We further show the path through chemical space to achieve optimization for a molecule to understand how the model works.
Tasks Q-Learning
Published 2018-10-19
URL http://arxiv.org/abs/1810.08678v3
PDF http://arxiv.org/pdf/1810.08678v3.pdf
PWC https://paperswithcode.com/paper/optimization-of-molecules-via-deep
Repo https://github.com/junyoung0131/Mol-DQN
Framework none
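
Double Q-learning, one of the techniques named in the abstract, decouples action selection from action evaluation when forming the bootstrap target. The sketch below shows that generic target computation on plain lists of Q-values. It is not code from the linked repository, and `scalarized_reward` is only an assumed linear weighting of the two objectives mentioned in the abstract; the function names, the weight `w` and the discount `gamma` are all hypothetical.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double Q-learning target for a single transition.

    next_q_online / next_q_target hold Q-values for each candidate action
    in the next state, from the online and target networks respectively.
    """
    if done or not next_q_online:
        return reward
    # The online network picks the action...
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...and the target network evaluates it.
    return reward + gamma * next_q_target[best]


def scalarized_reward(similarity, drug_likeness, w=0.5):
    """Assumed linear trade-off between the two objectives (illustrative only)."""
    return w * similarity + (1.0 - w) * drug_likeness


r = scalarized_reward(similarity=0.8, drug_likeness=0.6, w=0.5)
target = double_q_target(r, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5)
```

Using the target network's value for the online network's choice (0.3 here, not the target network's own maximum of 0.8) is what reduces the overestimation bias of plain Q-learning.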

Document-Level Neural Machine Translation with Hierarchical Attention Networks

Title Document-Level Neural Machine Translation with Hierarchical Attention Networks
Authors Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson
Abstract Neural Machine Translation (NMT) can be improved by including document-level contextual information. For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner. The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model’s own previous hidden states. Experiments show that hierarchical attention significantly improves the BLEU score over a strong NMT baseline with the state-of-the-art in context-aware methods, and that both the encoder and decoder benefit from context in complementary ways.
Tasks Machine Translation
Published 2018-09-05
URL http://arxiv.org/abs/1809.01576v2
PDF http://arxiv.org/pdf/1809.01576v2.pdf
PWC https://paperswithcode.com/paper/document-level-neural-machine-translation
Repo https://github.com/idiap/HAN_NMT
Framework pytorch
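
The two-level idea behind hierarchical attention can be sketched in a few lines: a query first attends over the hidden states within each previous sentence, then attends over the resulting sentence summaries. The code below is a heavily simplified illustration using plain dot-product attention; it omits the learned projections and the integration with the NMT model that a real implementation would need, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, vectors):
    """Dot-product attention: weighted average of `vectors` given `query`."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]


def hierarchical_context(query, previous_sentences):
    """previous_sentences: list of sentences, each a list of hidden-state vectors."""
    # Level 1: summarize each previous sentence with word-level attention.
    summaries = [attend(query, sentence) for sentence in previous_sentences]
    # Level 2: combine the sentence summaries with sentence-level attention.
    return attend(query, summaries)


context = hierarchical_context(
    [1.0, 0.0],
    [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]],
)
```

Because both levels are conditioned on the same query, the context vector changes with the current decoding or encoding state, which is the "dynamic" aspect the abstract mentions.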

NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning

Title NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning
Authors Álvaro Peris, Francisco Casacuberta
Abstract We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning. NMT-Keras is based on an extended version of the popular Keras library, and it runs on Theano and Tensorflow. State-of-the-art neural machine translation models are deployed and used following the high-level framework provided by Keras. Given its high modularity and flexibility, it also has been extended to tackle different problems, such as image and video captioning, sentence classification and visual question answering.
Tasks Machine Translation, Question Answering, Sentence Classification, Video Captioning, Visual Question Answering
Published 2018-07-09
URL http://arxiv.org/abs/1807.03096v3
PDF http://arxiv.org/pdf/1807.03096v3.pdf
PWC https://paperswithcode.com/paper/nmt-keras-a-very-flexible-toolkit-with-a
Repo https://github.com/lvapeab/nmt-keras
Framework tf

Online normalizer calculation for softmax

Title Online normalizer calculation for softmax
Authors Maxim Milakov, Natalia Gimelshein
Abstract The Softmax function is ubiquitous in machine learning, and multiple previous works have suggested faster alternatives to it. In this paper we propose a way to compute classical Softmax with fewer memory accesses and hypothesize that this reduction in memory accesses should improve Softmax performance on actual hardware. The benchmarks confirm this hypothesis: Softmax accelerates by up to 1.3x and Softmax+TopK combined and fused by up to 5x.
Tasks
Published 2018-05-08
URL http://arxiv.org/abs/1805.02867v2
PDF http://arxiv.org/pdf/1805.02867v2.pdf
PWC https://paperswithcode.com/paper/online-normalizer-calculation-for-softmax
Repo https://github.com/NVIDIA/online-softmax
Framework none
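
The saving in memory accesses comes from computing the running maximum and the normalizer in the same pass over the input, rather than in two separate passes. The following is a minimal pure-Python sketch of that idea, assuming finite inputs; it is not the CUDA code from the linked repository, and the function name is hypothetical.

```python
import math


def online_softmax(xs):
    """Softmax whose max and normalizer are accumulated in a single pass."""
    m = float("-inf")  # running maximum seen so far
    d = 0.0            # running normalizer, expressed relative to m
    for x in xs:
        m_new = max(m, x)
        # Rescale the old normalizer to the new maximum, then add this term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    # Second pass: normalize using the final maximum and normalizer.
    return [math.exp(x - m) / d for x in xs]


probs = online_softmax([1.0, 2.0, 3.0])
```

A conventional numerically safe softmax reads the input three times (maximum, sum, normalization); this version reads it twice while staying safe against overflow for large inputs.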

Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization

Title Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization
Authors Xiangxiang Chu
Abstract As the most successful variant and improvement for Trust Region Policy Optimization (TRPO), proximal policy optimization (PPO) has been widely applied across various domains with several advantages: efficient data utilization, easy implementation, and good parallelism. In this paper, a first-order gradient reinforcement learning algorithm called Policy Optimization with Penalized Point Probability Distance (POP3D) is proposed as another powerful variant; its penalty term is a lower bound to the square of the total variation divergence. Firstly, we discuss the shortcomings of several commonly used algorithms, by which our method is partly motivated. Secondly, we overcome these shortcomings by applying POP3D. Thirdly, we examine its mechanism from the perspective of the solution manifold. Finally, we make quantitative comparisons among several state-of-the-art algorithms on common benchmarks. Simulation results show that POP3D is highly competitive compared with PPO. Our code is released at https://github.com/paperwithcode/pop3d.
Tasks Atari Games
Published 2018-07-02
URL http://arxiv.org/abs/1807.00442v4
PDF http://arxiv.org/pdf/1807.00442v4.pdf
PWC https://paperswithcode.com/paper/policy-optimization-with-penalized-point
Repo https://github.com/cxxgtxy/POP3D
Framework tf
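
The lower-bound claim in the abstract can be checked numerically for a discrete action space. The sketch below assumes the point probability distance is the squared difference between the old and new policy probabilities of the sampled action; that definition, the penalty coefficient `beta`, and all function names are assumptions made for illustration, not code from the linked repository.

```python
def point_probability_distance(pi_old, pi_new, action):
    """Assumed definition: squared probability gap for the sampled action."""
    return (pi_old[action] - pi_new[action]) ** 2


def total_variation_squared(pi_old, pi_new):
    """Square of the total variation distance between two discrete policies."""
    tv = 0.5 * sum(abs(p - q) for p, q in zip(pi_old, pi_new))
    return tv ** 2


def penalized_surrogate(pi_old, pi_new, action, advantage, beta=5.0):
    """Importance-weighted advantage minus the distance penalty (per sample)."""
    ratio = pi_new[action] / pi_old[action]
    return ratio * advantage - beta * point_probability_distance(pi_old, pi_new, action)


old = [0.5, 0.3, 0.2]
new = [0.4, 0.4, 0.2]
bound = total_variation_squared(old, new)
distances = [point_probability_distance(old, new, a) for a in range(3)]
```

For any single action, the probability gap cannot exceed the total variation distance, so each entry of `distances` is at most `bound` (up to floating-point rounding).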

Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks

Title Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks
Authors Aurelia Bustos, Antonio Pertusa
Abstract Interventional cancer clinical trials are generally too restrictive, and some patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age. The efficacy and safety of new treatments for patients with these characteristics are, therefore, not defined. In this work, we built a model to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. We used protocols from cancer clinical trials that were available in public registries from the last 18 years to train word-embeddings, and we constructed a dataset of 6M short free-texts labeled as eligible or not eligible. A text classifier was trained using deep neural networks, with pre-trained word-embeddings as inputs, to predict whether or not short free-text statements describing clinical information were considered eligible. We additionally analyzed the semantic reasoning of the word-embedding representations obtained and were able to identify equivalent treatments for a type of tumor analogous with the drugs used to treat other tumors. We show that representation learning using deep neural networks can be successfully leveraged to extract the medical knowledge from clinical trial protocols for potentially assisting practitioners when prescribing treatments.
Tasks Representation Learning, Word Embeddings
Published 2018-03-22
URL http://arxiv.org/abs/1803.08312v3
PDF http://arxiv.org/pdf/1803.08312v3.pdf
PWC https://paperswithcode.com/paper/learning-eligibility-in-cancer-clinical
Repo https://github.com/auriml/capstone
Framework tf

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning

Title Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning
Authors Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han
Abstract Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases. Although recent studies explored using neural network models for BioNER to free experts from manual feature engineering, the performance remains limited by the available training data for each entity type. Results: We propose a multi-task learning framework for BioNER to collectively use the training data of different types of entities and improve the performance on each of them. In experiments on 15 benchmark BioNER datasets, our multi-task model achieves substantially better performance compared with state-of-the-art BioNER systems and baseline neural sequence labeling models. Further analysis shows that the large performance gains come from sharing character- and word-level information among relevant biomedical entities across differently labeled corpora.
Tasks Feature Engineering, Multi-Task Learning, Named Entity Recognition
Published 2018-01-30
URL http://arxiv.org/abs/1801.09851v4
PDF http://arxiv.org/pdf/1801.09851v4.pdf
PWC https://paperswithcode.com/paper/cross-type-biomedical-named-entity
Repo https://github.com/yuzhimanhua/lm-lstm-crf
Framework pytorch

Towards Better Interpretability in Deep Q-Networks

Title Towards Better Interpretability in Deep Q-Networks
Authors Raghuram Mandyam Annasamy, Katia Sycara
Abstract Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these networks seem to learn are far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
Tasks Q-Learning
Published 2018-09-15
URL http://arxiv.org/abs/1809.05630v2
PDF http://arxiv.org/pdf/1809.05630v2.pdf
PWC https://paperswithcode.com/paper/towards-better-interpretability-in-deep-q
Repo https://github.com/maraghuram/I-DQN
Framework pytorch