October 21, 2019


Paper Group AWR 151


pioNER: Datasets and Baselines for Armenian Named Entity Recognition


Title	pioNER: Datasets and Baselines for Armenian Named Entity Recognition
Authors	Tsolak Ghukasyan, Garnik Davtyan, Karen Avetisyan, Ivan Andrianov
Abstract	In this work, we tackle the problem of Armenian named entity recognition, providing silver- and gold-standard datasets as well as establishing baseline results on popular models. We present a 163000-token named entity corpus automatically generated and annotated from Wikipedia, and another 53400-token corpus of news sentences with manual annotation of people, organization and location named entities. The corpora were used to train and evaluate several popular named entity recognition models. Alongside the datasets, we release 50-, 100-, 200-, 300-dimensional GloVe word embeddings trained on a collection of Armenian texts from Wikipedia, news, blogs, and encyclopedia.
Tasks	Named Entity Recognition, Word Embeddings
Published	2018-10-19
URL	http://arxiv.org/abs/1810.08699v1
PDF	http://arxiv.org/pdf/1810.08699v1.pdf
PWC	https://paperswithcode.com/paper/pioner-datasets-and-baselines-for-armenian
Repo	https://github.com/Hrant-Khachatrian/Machine-Learning-in-Armenia
Framework	none

Dialog-based Interactive Image Retrieval


Title	Dialog-based Interactive Image Retrieval
Authors	Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris
Abstract	Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search that enables users to provide feedback via natural language, allowing for more natural and effective interaction. We formulate the task of dialog-based interactive image retrieval as a reinforcement learning problem, and reward the dialog system for improving the rank of the target image during each dialog turn. To mitigate the cumbersome and costly process of collecting human-machine conversations as the dialog system learns, we train our system with a user simulator, which is itself trained to describe the differences between target and candidate images. The efficacy of our approach is demonstrated in a footwear retrieval application. Experiments on both simulated and real-world data show that 1) our proposed learning framework achieves better accuracy than other supervised and reinforcement learning baselines and 2) user feedback based on natural language rather than pre-specified attributes leads to more effective retrieval results, and a more natural and expressive communication interface.
Tasks	Image Retrieval, Visual Dialog
Published	2018-05-01
URL	http://arxiv.org/abs/1805.00145v3
PDF	http://arxiv.org/pdf/1805.00145v3.pdf
PWC	https://paperswithcode.com/paper/dialog-based-interactive-image-retrieval
Repo	https://github.com/XiaoxiaoGuo/fashion-retrieval
Framework	pytorch
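
The abstract describes rewarding the dialog system for improving the rank of the target image at each turn. A minimal sketch of such a rank-based reward follows. The embedding vectors, the squared-distance ranking, and the function names `target_rank` and `turn_reward` are illustrative assumptions, not the authors' implementation, and the paper's exact reward may differ.

```python
def target_rank(query, candidates, target_index):
    """1-based rank of the target when candidates are ordered by squared distance to the query."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    target_distance = sqdist(query, candidates[target_index])
    # Every candidate strictly closer to the query pushes the target one place down.
    closer = sum(1 for c in candidates if sqdist(query, c) < target_distance)
    return 1 + closer


def turn_reward(previous_rank, new_rank):
    """Positive when the target moved up the ranking during this dialog turn (assumed form)."""
    return previous_rank - new_rank


# Toy 2-D embeddings (hypothetical); the target is the last candidate.
candidates = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
rank_before = target_rank((0.0, 0.0), candidates, target_index=2)
rank_after = target_rank((4.0, 4.0), candidates, target_index=2)
reward = turn_reward(rank_before, rank_after)
```

In this toy case the query moves toward the target after one round of feedback, so the target's rank improves and the reward is positive.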

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition


Title	Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Authors	Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
Abstract	Recently, the connectionist temporal classification (CTC) model, coupled with recurrent (RNN) or convolutional neural networks (CNN), has made it easier to train speech recognition systems in an end-to-end fashion. However, in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency to process multidimensional inputs as entities, to encode internal dependencies, and to solve many tasks with fewer learning parameters than real-valued models. This paper proposes to integrate multiple feature views in a quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with fewer learning parameters than a competing model based on real-valued CNNs.
Tasks	Speech Recognition
Published	2018-06-20
URL	http://arxiv.org/abs/1806.07789v1
PDF	http://arxiv.org/pdf/1806.07789v1.pdf
PWC	https://paperswithcode.com/paper/quaternion-convolutional-neural-networks-for-1
Repo	https://github.com/Riccardo-Vecchi/Pytorch-Quaternion-Neural-Networks
Framework	pytorch
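
The quaternion algebra the abstract refers to is built on the Hamilton product. The sketch below shows that product in plain Python, together with one possible way to pack an energy value and its first and second derivatives into a single quaternion. The helper `pack_features` and its zero real part are assumptions made for illustration; the abstract does not specify the packing.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def pack_features(energy, delta, delta_delta):
    """Hypothetical packing: one filter-bank energy and its two derivatives as one quaternion."""
    return (0.0, energy, delta, delta_delta)


# A quaternion "weight" applied to one packed feature; all values are made up.
weight = (0.5, -0.1, 0.3, 0.2)
feature = pack_features(1.2, 0.4, -0.05)
output = hamilton_product(weight, feature)
```

Because each output component mixes all four input components, the three views of a time frame are processed as one entity, and a single quaternion weight (four real numbers) plays the role that a 4x4 real matrix (sixteen numbers) would otherwise play.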

A Spectral Approach to Gradient Estimation for Implicit Distributions


Title	A Spectral Approach to Gradient Estimation for Implicit Distributions
Authors	Jiaxin Shi, Shengyang Sun, Jun Zhu
Abstract	Recently there has been increasing interest in learning and inference with implicit distributions (i.e., distributions without tractable densities). To this end, we develop a gradient estimator for implicit distributions based on Stein’s identity and a spectral decomposition of kernel operators, where the eigenfunctions are approximated by the Nyström method. Unlike previous works that only provide estimates at the sample points, our approach directly estimates the gradient function, thus allowing for a simple and principled out-of-sample extension. We provide theoretical results on the error bound of the estimator and discuss the bias-variance tradeoff in practice. The effectiveness of our method is demonstrated by applications to gradient-free Hamiltonian Monte Carlo and variational inference with implicit distributions. Finally, we discuss the intuition behind the estimator by drawing connections between the Nyström method and kernel PCA, which indicates that the estimator can automatically adapt to the geometry of the underlying distribution.
Tasks
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02925v1
PDF	http://arxiv.org/pdf/1806.02925v1.pdf
PWC	https://paperswithcode.com/paper/a-spectral-approach-to-gradient-estimation
Repo	https://github.com/thjashin/spectral-stein-grad
Framework	tf

Zero-shot Neural Transfer for Cross-lingual Entity Linking


Title	Zero-shot Neural Transfer for Cross-lingual Entity Linking
Authors	Shruti Rijhwani, Jiateng Xie, Graham Neubig, Jaime Carbonell
Abstract	Cross-lingual entity linking maps an entity mention in a source language to its corresponding entry in a structured knowledge base that is in a different (target) language. While previous work relies heavily on bilingual lexical resources to bridge the gap between the source and the target languages, these resources are scarce or unavailable for many low-resource languages. To address this problem, we investigate zero-shot cross-lingual entity linking, in which we assume no bilingual lexical resources are available in the source low-resource language. Specifically, we propose pivot-based entity linking, which leverages information from a high-resource “pivot” language to train character-level neural entity linking models that are transferred to the source low-resource language in a zero-shot manner. With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario. Further, we also investigate the use of language-universal phonological representations which improves average accuracy (absolute) by 36% when transferring between languages that use different scripts.
Tasks	Cross-Lingual Entity Linking, Entity Linking
Published	2018-11-09
URL	http://arxiv.org/abs/1811.04154v1
PDF	http://arxiv.org/pdf/1811.04154v1.pdf
PWC	https://paperswithcode.com/paper/zero-shot-neural-transfer-for-cross-lingual
Repo	https://github.com/neulab/pivot-based-entity-linking
Framework	none

Modeling Camera Effects to Improve Visual Learning from Synthetic Data


Title	Modeling Camera Effects to Improve Visual Learning from Synthetic Data
Authors	Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson
Abstract	Recent work has focused on generating synthetic imagery to increase the size and variability of training data for learning visual tasks in urban scenes. This includes increasing the occurrence of occlusions or varying environmental and weather effects. However, few have addressed modeling variation in the sensor domain. Sensor effects can degrade real images, limiting generalizability of network performance on visual tasks trained on synthetic data and tested in real environments. This paper proposes an efficient, automatic, physically-based augmentation pipeline to vary sensor effects (chromatic aberration, blur, exposure, noise, and color cast) for synthetic imagery. In particular, this paper illustrates that augmenting synthetic training datasets with the proposed pipeline reduces the domain gap between synthetic and real domains for the task of object detection in urban driving scenes.
Tasks	Object Detection
Published	2018-03-21
URL	http://arxiv.org/abs/1803.07721v6
PDF	http://arxiv.org/pdf/1803.07721v6.pdf
PWC	https://paperswithcode.com/paper/modeling-camera-effects-to-improve-visual
Repo	https://github.com/alexacarlson/SensorEffectAugmentation
Framework	tf

SMIT: Stochastic Multi-Label Image-to-Image Translation


Title	SMIT: Stochastic Multi-Label Image-to-Image Translation
Authors	Andrés Romero, Pablo Arbeláez, Luc Van Gool, Radu Timofte
Abstract	Cross-domain mapping has been a very active topic in recent years. Given one image, its main purpose is to translate it to the desired target domain, or multiple domains in the case of multiple labels. This problem is highly challenging due to three main reasons: (i) unpaired datasets, (ii) multiple attributes, and (iii) the multimodality (e.g., style) associated with the translation. Most of the existing state-of-the-art has focused only on two reasons, i.e. either on (i) and (ii), or (i) and (iii). In this work, we propose a joint framework (i, ii, iii) of diversity and multi-mapping image-to-image translations, using a single generator to conditionally produce countless and unique fake images that hold the underlying characteristics of the source image. Our system does not use style regularization, instead, it uses an embedding representation that we call domain embedding for both domain and style. Extensive experiments over different datasets demonstrate the effectiveness of our proposed approach in comparison with the state-of-the-art in both multi-label and multimodal problems. Additionally, our method is able to generalize under different scenarios: continuous style interpolation, continuous label interpolation, and fine-grained mapping. Code and pretrained models are available at https://github.com/BCV-Uniandes/SMIT.
Tasks	Image-to-Image Translation
Published	2018-12-10
URL	https://arxiv.org/abs/1812.03704v3
PDF	https://arxiv.org/pdf/1812.03704v3.pdf
PWC	https://paperswithcode.com/paper/smit-stochastic-multi-label-image-to-image
Repo	https://github.com/BCV-Uniandes/SMIT
Framework	pytorch

Optimization of Molecules via Deep Reinforcement Learning


Title	Optimization of Molecules via Deep Reinforcement Learning
Authors	Zhenpeng Zhou, Steven Kearnes, Li Li, Richard N. Zare, Patrick Riley
Abstract	We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double $Q$-learning and randomized value functions). We directly define modifications on molecules, thereby ensuring 100% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. Inspired by problems faced during medicinal chemistry lead optimization, we extend our model with multi-objective reinforcement learning, which maximizes drug-likeness while maintaining similarity to the original molecule. We further show the path through chemical space to achieve optimization for a molecule to understand how the model works.
Tasks	Q-Learning
Published	2018-10-19
URL	http://arxiv.org/abs/1810.08678v3
PDF	http://arxiv.org/pdf/1810.08678v3.pdf
PWC	https://paperswithcode.com/paper/optimization-of-molecules-via-deep
Repo	https://github.com/junyoung0131/Mol-DQN
Framework	none
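
The abstract names double Q-learning as one of the reinforcement learning techniques MolDQN builds on. As a rough illustration of that general technique only, the sketch below computes a double Q-learning target: the online network selects the next action and the target network evaluates it. The function name, the list-based Q-values, and the default discount are assumptions; nothing here models molecules or the authors' code.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Bootstrapped target r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done or not next_q_online:
        # Terminal step (or no actions available): nothing to bootstrap from.
        return reward
    # Action selection uses the online estimates ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while the value of that action comes from the target estimates.
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for three candidate actions in the next state.
target = double_q_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.5],
    next_q_target=[1.0, 0.3, 2.0],
    gamma=0.5,
)
```

Decoupling selection from evaluation means the largest target-network value (2.0 here) is not used automatically, which is how double Q-learning reduces overestimation.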

Document-Level Neural Machine Translation with Hierarchical Attention Networks


Title	Document-Level Neural Machine Translation with Hierarchical Attention Networks
Authors	Lesly Miculicich, Dhananjay Ram, Nikolaos Pappas, James Henderson
Abstract	Neural Machine Translation (NMT) can be improved by including document-level contextual information. For this purpose, we propose a hierarchical attention model to capture the context in a structured and dynamic manner. The model is integrated in the original NMT architecture as another level of abstraction, conditioning on the NMT model’s own previous hidden states. Experiments show that hierarchical attention significantly improves the BLEU score over a strong NMT baseline with the state-of-the-art in context-aware methods, and that both the encoder and decoder benefit from context in complementary ways.
Tasks	Machine Translation
Published	2018-09-05
URL	http://arxiv.org/abs/1809.01576v2
PDF	http://arxiv.org/pdf/1809.01576v2.pdf
PWC	https://paperswithcode.com/paper/document-level-neural-machine-translation
Repo	https://github.com/idiap/HAN_NMT
Framework	pytorch

NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning


Title	NMT-Keras: a Very Flexible Toolkit with a Focus on Interactive NMT and Online Learning
Authors	Álvaro Peris, Francisco Casacuberta
Abstract	We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning. NMT-Keras is based on an extended version of the popular Keras library, and it runs on Theano and Tensorflow. State-of-the-art neural machine translation models are deployed and used following the high-level framework provided by Keras. Given its high modularity and flexibility, it also has been extended to tackle different problems, such as image and video captioning, sentence classification and visual question answering.
Tasks	Machine Translation, Question Answering, Sentence Classification, Video Captioning, Visual Question Answering
Published	2018-07-09
URL	http://arxiv.org/abs/1807.03096v3
PDF	http://arxiv.org/pdf/1807.03096v3.pdf
PWC	https://paperswithcode.com/paper/nmt-keras-a-very-flexible-toolkit-with-a
Repo	https://github.com/lvapeab/nmt-keras
Framework	tf

Online normalizer calculation for softmax


Title	Online normalizer calculation for softmax
Authors	Maxim Milakov, Natalia Gimelshein
Abstract	The Softmax function is ubiquitous in machine learning, and multiple previous works have suggested faster alternatives for it. In this paper we propose a way to compute classical Softmax with fewer memory accesses and hypothesize that this reduction in memory accesses should improve Softmax performance on actual hardware. The benchmarks confirm this hypothesis: Softmax accelerates by up to 1.3x and Softmax+TopK combined and fused by up to 5x.
Tasks
Published	2018-05-08
URL	http://arxiv.org/abs/1805.02867v2
PDF	http://arxiv.org/pdf/1805.02867v2.pdf
PWC	https://paperswithcode.com/paper/online-normalizer-calculation-for-softmax
Repo	https://github.com/NVIDIA/online-softmax
Framework	none
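
The idea of computing the softmax normalizer with fewer passes over the input can be sketched in a few lines. The version below keeps a running maximum and a running sum in a single loop, rescaling the sum whenever the maximum changes, and compares the result with the usual three-pass "safe" softmax. It is a plain-Python illustration that assumes finite inputs, not the CUDA code from the repository, and the function names are chosen here for the example.

```python
import math


def safe_softmax(xs):
    """Reference version: one pass for the max, one for the sum, one for the outputs."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def online_softmax(xs):
    """Max and normalizer computed together in a single pass over xs (finite inputs assumed)."""
    running_max = float("-inf")
    running_sum = 0.0
    for x in xs:
        new_max = max(running_max, x)
        # Rescale what has been accumulated to the new maximum, then add the current term.
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    # A second pass produces the normalized outputs.
    return [math.exp(x - running_max) / running_sum for x in xs]


logits = [1.0, 2.0, 3.0, 1000.0]
probs = online_softmax(logits)
```

The single loop replaces the separate max and sum passes, so the input is read twice in total instead of three times while the result stays numerically stable for large logits.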

Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization


Title	Policy Optimization With Penalized Point Probability Distance: An Alternative To Proximal Policy Optimization
Authors	Xiangxiang Chu
Abstract	As the most successful variant and improvement for Trust Region Policy Optimization (TRPO), proximal policy optimization (PPO) has been widely applied across various domains with several advantages: efficient data utilization, easy implementation, and good parallelism. In this paper, a first-order gradient reinforcement learning algorithm called Policy Optimization with Penalized Point Probability Distance (POP3D), which is a lower bound to the square of total variance divergence, is proposed as another powerful variant. First, we discuss the shortcomings of several commonly used algorithms, by which our method is partly motivated. Second, we address these shortcomings by applying POP3D. Third, we examine its mechanism from the perspective of the solution manifold. Finally, we make quantitative comparisons among several state-of-the-art algorithms based on common benchmarks. Simulation results show that POP3D is highly competitive compared with PPO. Our code is released at https://github.com/paperwithcode/pop3d.
Tasks	Atari Games
Published	2018-07-02
URL	http://arxiv.org/abs/1807.00442v4
PDF	http://arxiv.org/pdf/1807.00442v4.pdf
PWC	https://paperswithcode.com/paper/policy-optimization-with-penalized-point
Repo	https://github.com/cxxgtxy/POP3D
Framework	tf
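
One possible reading of a "penalized point probability distance" is a surrogate objective in which the squared difference between the old and new probabilities of the action actually taken is subtracted as a penalty. The sketch below follows that reading. The exact form of the objective, the coefficient `beta`, and the function names are assumptions for illustration and are not taken from the paper or the linked repository.

```python
def point_probability_distance(p_old, p_new):
    """Squared gap between old and new policy probabilities of the sampled action (assumed form)."""
    return (p_new - p_old) ** 2


def penalized_surrogate(p_old, p_new, advantages, beta=5.0):
    """Mean of ratio * advantage minus beta * point probability distance over a batch."""
    total = 0.0
    for po, pn, adv in zip(p_old, p_new, advantages):
        ratio = pn / po  # importance ratio, as in TRPO/PPO-style surrogates
        total += ratio * adv - beta * point_probability_distance(po, pn)
    return total / len(advantages)


# One made-up sample: the new policy raises the action probability from 0.5 to 0.75.
objective = penalized_surrogate([0.5], [0.75], [2.0], beta=4.0)
```

Under this reading the penalty only needs the probability of the sampled action under each policy, so it uses first-order gradients and no explicit divergence over the full action distribution.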

Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks


Title	Learning Eligibility in Cancer Clinical Trials using Deep Neural Networks
Authors	Aurelia Bustos, Antonio Pertusa
Abstract	Interventional cancer clinical trials are generally too restrictive, and some patients are often excluded on the basis of comorbidity, past or concomitant treatments, or the fact that they are over a certain age. The efficacy and safety of new treatments for patients with these characteristics are, therefore, not defined. In this work, we built a model to automatically predict whether short clinical statements were considered inclusion or exclusion criteria. We used protocols from cancer clinical trials that were available in public registries from the last 18 years to train word-embeddings, and we constructed a dataset of 6M short free-texts labeled as eligible or not eligible. A text classifier was trained using deep neural networks, with pre-trained word-embeddings as inputs, to predict whether or not short free-text statements describing clinical information were considered eligible. We additionally analyzed the semantic reasoning of the word-embedding representations obtained and were able to identify equivalent treatments for a type of tumor analogous with the drugs used to treat other tumors. We show that representation learning using deep neural networks can be successfully leveraged to extract the medical knowledge from clinical trial protocols for potentially assisting practitioners when prescribing treatments.
Tasks	Representation Learning, Word Embeddings
Published	2018-03-22
URL	http://arxiv.org/abs/1803.08312v3
PDF	http://arxiv.org/pdf/1803.08312v3.pdf
PWC	https://paperswithcode.com/paper/learning-eligibility-in-cancer-clinical
Repo	https://github.com/auriml/capstone
Framework	tf

Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning


Title	Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning
Authors	Xuan Wang, Yu Zhang, Xiang Ren, Yuhao Zhang, Marinka Zitnik, Jingbo Shang, Curtis Langlotz, Jiawei Han
Abstract	Motivation: State-of-the-art biomedical named entity recognition (BioNER) systems often require handcrafted features specific to each entity type, such as genes, chemicals and diseases. Although recent studies explored using neural network models for BioNER to free experts from manual feature engineering, the performance remains limited by the available training data for each entity type. Results: We propose a multi-task learning framework for BioNER to collectively use the training data of different types of entities and improve the performance on each of them. In experiments on 15 benchmark BioNER datasets, our multi-task model achieves substantially better performance compared with state-of-the-art BioNER systems and baseline neural sequence labeling models. Further analysis shows that the large performance gains come from sharing character- and word-level information among relevant biomedical entities across differently labeled corpora.
Tasks	Feature Engineering, Multi-Task Learning, Named Entity Recognition
Published	2018-01-30
URL	http://arxiv.org/abs/1801.09851v4
PDF	http://arxiv.org/pdf/1801.09851v4.pdf
PWC	https://paperswithcode.com/paper/cross-type-biomedical-named-entity
Repo	https://github.com/yuzhimanhua/lm-lstm-crf
Framework	pytorch

Towards Better Interpretability in Deep Q-Networks


Title	Towards Better Interpretability in Deep Q-Networks
Authors	Raghuram Mandyam Annasamy, Katia Sycara
Abstract	Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these networks seem to learn are far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model’s behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
Tasks	Q-Learning
Published	2018-09-15
URL	http://arxiv.org/abs/1809.05630v2
PDF	http://arxiv.org/pdf/1809.05630v2.pdf
PWC	https://paperswithcode.com/paper/towards-better-interpretability-in-deep-q
Repo	https://github.com/maraghuram/I-DQN
Framework	pytorch