Paper Group AWR 329
Evolution-Guided Policy Gradient in Reinforcement Learning. Multi-attention Recurrent Network for Human Communication Comprehension. Amalgamating Knowledge towards Comprehensive Classification. Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings. Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks. …
Evolution-Guided Policy Gradient in Reinforcement Learning
Title | Evolution-Guided Policy Gradient in Reinforcement Learning |
Authors | Shauharda Khadka, Kagan Tumer |
Abstract | Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack of effective exploration, and brittle convergence properties that are extremely sensitive to hyperparameters. Collectively, these challenges severely limit the applicability of these approaches to real-world problems. Evolutionary Algorithms (EAs), a class of black box optimization techniques inspired by natural evolution, are well suited to address each of these three challenges. However, EAs typically suffer from high sample complexity and struggle to solve problems that require optimization of a large number of parameters. In this paper, we introduce Evolutionary Reinforcement Learning (ERL), a hybrid algorithm that leverages the population of an EA to provide diversified data to train an RL agent, and reinserts the RL agent into the EA population periodically to inject gradient information into the EA. ERL inherits EA’s ability of temporal credit assignment with a fitness metric, effective exploration with a diverse set of policies, and stability of a population-based approach and complements it with off-policy DRL’s ability to leverage gradients for higher sample efficiency and faster learning. Experiments in a range of challenging continuous control benchmarks demonstrate that ERL significantly outperforms prior DRL and EA methods. |
Tasks | Continuous Control |
Published | 2018-05-21 |
URL | http://arxiv.org/abs/1805.07917v2 |
http://arxiv.org/pdf/1805.07917v2.pdf | |
PWC | https://paperswithcode.com/paper/evolution-guided-policy-gradient-in |
Repo | https://github.com/neilsgp/RL-Algorithms |
Framework | none |
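A minimal sketch of the ERL control flow described above, on a toy one-parameter task: an EA population feeds a shared replay buffer, a gradient-based agent learns from that buffer, and the agent is periodically copied back into the population. The paper trains a DDPG agent; the crude reward-weighted update below is only a stand-in, and all names and constants are illustrative.

```python
# Minimal sketch of the Evolutionary Reinforcement Learning (ERL) loop on a
# toy 1-D problem. The real algorithm trains a DDPG agent from the shared
# replay buffer; a crude reward-weighted update stands in for that step here.
import numpy as np

rng = np.random.default_rng(0)
TARGET = 2.0                        # hypothetical optimum of the toy task

def rollout(theta):
    """One 'episode': the policy emits a single action; fitness is its return."""
    action = theta + rng.normal(scale=0.1)
    reward = -(action - TARGET) ** 2
    return action, reward

pop = list(rng.normal(size=10))     # EA population of one-parameter policies
rl_theta = 0.0                      # the gradient-based (stand-in) RL agent
buffer = []                         # shared replay buffer

for gen in range(200):
    # 1. Evaluate the population; every transition also feeds the buffer.
    fitness = []
    for theta in pop:
        a, r = rollout(theta)
        buffer.append((a, r))
        fitness.append(r)

    # 2. Selection + mutation (elitist EA).
    order = np.argsort(fitness)[::-1]
    elites = [pop[i] for i in order[: len(pop) // 2]]
    pop = elites + [e + rng.normal(scale=0.2) for e in elites]

    # 3. Off-policy update of the RL agent from the buffer
    #    (reward-weighted mean of recent actions; DDPG in the paper).
    acts, rews = zip(*buffer[-100:])
    w = np.exp(np.array(rews) - max(rews))
    rl_theta += 0.5 * (np.average(acts, weights=w) - rl_theta)

    # 4. Periodically inject the RL agent back into the population.
    if gen % 10 == 0:
        pop[-1] = rl_theta          # replace one individual with the RL policy

print("best EA individual:", max(pop, key=lambda t: rollout(t)[1]))
print("RL agent parameter:", rl_theta)
```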
Multi-attention Recurrent Network for Human Communication Comprehension
Title | Multi-attention Recurrent Network for Human Communication Comprehension |
Authors | Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency |
Abstract | Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication, however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state-of-the-art performance on all the datasets. |
Tasks | Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis |
Published | 2018-02-03 |
URL | http://arxiv.org/abs/1802.00923v1 |
http://arxiv.org/pdf/1802.00923v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-attention-recurrent-network-for-human |
Repo | https://github.com/pliang279/MFN |
Framework | pytorch |
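A sketch of the Multi-attention Block (MAB) idea from MARN: several softmax attentions over the concatenated per-modality hidden states, each extracting a different cross-modal interaction. All dimensions and the module structure are illustrative, not the authors' implementation.

```python
# Sketch of a Multi-attention Block: K separate softmax attentions over the
# concatenated language/vision/acoustic hidden states. Sizes are illustrative.
import torch
import torch.nn as nn

class MultiAttentionBlock(nn.Module):
    def __init__(self, dims=(32, 16, 16), num_attentions=4, out_dim=64):
        super().__init__()
        total = sum(dims)                          # concatenated hidden size
        self.num_attentions = num_attentions
        self.attn = nn.Linear(total, num_attentions * total)
        self.reduce = nn.Linear(num_attentions * total, out_dim)

    def forward(self, h_lang, h_vision, h_acoustic):
        h = torch.cat([h_lang, h_vision, h_acoustic], dim=-1)   # (B, total)
        scores = self.attn(h).view(h.size(0), self.num_attentions, -1)
        weights = torch.softmax(scores, dim=-1)                  # K attention maps
        attended = weights * h.unsqueeze(1)                      # (B, K, total)
        return self.reduce(attended.flatten(1))                  # cross-modal code

mab = MultiAttentionBlock()
z = mab(torch.randn(2, 32), torch.randn(2, 16), torch.randn(2, 16))
print(z.shape)   # torch.Size([2, 64])
```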
Amalgamating Knowledge towards Comprehensive Classification
Title | Amalgamating Knowledge towards Comprehensive Classification |
Authors | Chengchao Shen, Xinchao Wang, Jie Song, Li Sun, Mingli Song |
Abstract | With the rapid development of deep learning, there is an unprecedentedly large number of trained deep network models available online. Reusing such trained models can significantly reduce the cost of training new models from scratch, which may not even be feasible, as the annotations used for training the original networks are often unavailable to the public. We propose in this paper to study a new model-reusing task, which we term \emph{knowledge amalgamation}. Given multiple trained teacher networks, each of which specializes in a different classification problem, the goal of knowledge amalgamation is to learn a lightweight student model capable of handling the comprehensive classification. We assume no annotations other than the outputs from the teacher models are available, and thus focus on extracting and amalgamating knowledge from the multiple teachers. To this end, we propose a pilot two-step strategy to tackle the knowledge amalgamation task, by learning first the compact feature representations from teachers and then the network parameters in a layer-wise manner so as to build the student model. We apply this approach to four public datasets and obtain very encouraging results: even without any human annotation, the obtained student model is competent to handle the comprehensive classification task and in most cases outperforms the teachers in their individual sub-tasks. |
Tasks | |
Published | 2018-11-07 |
URL | http://arxiv.org/abs/1811.02796v2 |
http://arxiv.org/pdf/1811.02796v2.pdf | |
PWC | https://paperswithcode.com/paper/amalgamating-knowledge-towards-comprehensive |
Repo | https://github.com/zju-vipa/KamalEngine |
Framework | pytorch |
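A sketch of the knowledge amalgamation setting: two teachers with disjoint label spaces are combined into one student using only their outputs on unlabeled data. The paper's actual two-step procedure (compact feature amalgamation, then layer-wise parameter learning) is collapsed here into a single soft-label distillation objective, and all networks and data are toy stand-ins.

```python
# Sketch of amalgamating two classification teachers into one student using
# only their outputs on unlabeled data; a plain distillation loss stands in
# for the paper's two-step feature/layer-wise learning strategy.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()  # e.g. 10 animal classes
teacher_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 5)).eval()   # e.g. 5 vehicle classes
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128),
                        nn.ReLU(), nn.Linear(128, 15))                      # union of both label sets

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for step in range(100):
    x = torch.randn(16, 3, 32, 32)               # stand-in for unlabeled images
    with torch.no_grad():
        soft = torch.cat([F.softmax(teacher_a(x), -1),
                          F.softmax(teacher_b(x), -1)], dim=-1)
        soft = soft / soft.sum(dim=-1, keepdim=True)   # joint target over 15 classes
    loss = F.kl_div(F.log_softmax(student(x), -1), soft, reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```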
Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings
Title | Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings |
Authors | Ernesto Jimenez-Ruiz, Asan Agibetov, Matthias Samwald, Valerie Cross |
Abstract | Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper we present an approach that combines a lexical index, a neural embedding model and locality modules to effectively divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed methods are adequate in practice and can be integrated within the workflow of state-of-the-art systems. |
Tasks | |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1805.12402v1 |
http://arxiv.org/pdf/1805.12402v1.pdf | |
PWC | https://paperswithcode.com/paper/breaking-down-the-ontology-alignment-task |
Repo | https://github.com/ernestojimenezruiz/logmap-matcher |
Framework | none |
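A sketch of the lexical-index idea from the paper above: entities from both ontologies are indexed by the tokens of their labels, and each index entry that mixes entities from the two ontologies defines a small candidate matching subtask. The toy labels are illustrative; the real system also stems tokens, filters stopwords, applies locality modules, and scores candidates with embeddings.

```python
# Sketch of dividing an ontology matching task with a lexical (inverted) index.
from collections import defaultdict

onto1 = {"o1:Myocardial_Infarction": "myocardial infarction",
         "o1:Heart_Attack": "heart attack",
         "o1:Fracture_of_Femur": "fracture of femur"}
onto2 = {"o2:MI": "myocardial infarction",
         "o2:FemoralFracture": "femur fracture"}

index = defaultdict(set)
for ent, label in list(onto1.items()) + list(onto2.items()):
    for tok in label.lower().split():       # real systems also stem and drop stopwords
        index[tok].add(ent)

subtasks = []
for tok, ents in index.items():
    left = {e for e in ents if e.startswith("o1:")}
    right = {e for e in ents if e.startswith("o2:")}
    if left and right:                      # candidate mappings share a label token
        subtasks.append((tok, left, right))

for tok, left, right in subtasks:
    print(tok, "->", left, "vs", right)
```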
Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks
Title | Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks |
Authors | Fevziye Irem Eyiokur, Dogucan Yaman, Hazım Kemal Ekenel |
Abstract | In this paper, we have extensively investigated the unconstrained ear recognition problem. We have first shown the importance of domain adaptation when deep convolutional neural network models are used for ear recognition. To enable domain adaptation, we have collected a new ear dataset using the Multi-PIE face dataset, which we named the Multi-PIE ear dataset. To improve the performance further, we have combined different deep convolutional neural network models. We have analyzed in depth the effect of ear image quality, for example illumination and aspect ratio, on the classification performance. Finally, we have addressed the problem of dataset bias in the ear recognition field. Experiments on the UERC dataset have shown that domain adaptation leads to a significant performance improvement. For example, when the VGG-16 model is used and domain adaptation is applied, an absolute increase of around 10% has been achieved. Combining different deep convolutional neural network models has further improved the accuracy by 4%. It has also been observed that image quality has an influence on the results. In the experiments that we have conducted to examine dataset bias, given an ear image, we were able to classify which dataset it came from with 99.71% accuracy, which indicates a strong bias among ear recognition datasets. |
Tasks | Domain Adaptation |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07801v1 |
http://arxiv.org/pdf/1803.07801v1.pdf | |
PWC | https://paperswithcode.com/paper/domain-adaptation-for-ear-recognition-using |
Repo | https://github.com/iremeyiokur/multipie_ear_dataset |
Framework | none |
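A sketch of the domain-adaptation step described above: fine-tune an ImageNet-pretrained VGG-16 on an ear dataset before training on the target data. The dataset path, preprocessing, and training settings are placeholders, not values from the paper.

```python
# Sketch: fine-tuning an ImageNet-pretrained VGG-16 on ear images.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
ears = datasets.ImageFolder("multipie_ears/train", transform=tfm)   # hypothetical path
loader = torch.utils.data.DataLoader(ears, batch_size=32, shuffle=True)

model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(4096, len(ears.classes))   # new head for ear identities

opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()
model.train()
for x, y in loader:
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
```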
Predicting Aircraft Trajectories: A Deep Generative Convolutional Recurrent Neural Networks Approach
Title | Predicting Aircraft Trajectories: A Deep Generative Convolutional Recurrent Neural Networks Approach |
Authors | Yulin Liu, Mark Hansen |
Abstract | Reliable 4D aircraft trajectory prediction, whether in a real-time setting or for analysis of counterfactuals, is important to the efficiency of the aviation system. Toward this end, we first propose a highly generalizable efficient tree-based matching algorithm to construct image-like feature maps from high-fidelity meteorological datasets - wind, temperature and convective weather. We then model the track points on trajectories as conditional Gaussian mixtures with parameters to be learned from our proposed deep generative model, which is an end-to-end convolutional recurrent neural network that consists of a long short-term memory (LSTM) encoder network and a mixture density LSTM decoder network. The encoder network embeds last-filed flight plan information into fixed-size hidden state variables and feeds the decoder network, which further learns the spatiotemporal correlations from the historical flight tracks and outputs the parameters of Gaussian mixtures. Convolutional layers are integrated into the pipeline to learn representations from the high-dimension weather features. During the inference process, beam search, adaptive Kalman filter, and Rauch-Tung-Striebel smoother algorithms are used to prune the variance of generated trajectories. |
Tasks | Trajectory Prediction |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1812.11670v1 |
http://arxiv.org/pdf/1812.11670v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-aircraft-trajectories-a-deep |
Repo | https://github.com/yulinliu101/DeepTP |
Framework | tf |
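A sketch of the mixture-density decoder head described in the abstract above: an LSTM whose output parameterizes a Gaussian mixture over the next 2-D track point. The number of mixture components and all sizes are illustrative, and the convolutional weather encoder and inference-time smoothing are omitted.

```python
# Sketch of a mixture-density LSTM decoder head for trajectory prediction.
import torch
import torch.nn as nn

class MDNDecoder(nn.Module):
    def __init__(self, input_dim=2, hidden=64, n_mix=5):
        super().__init__()
        self.n_mix = n_mix
        self.lstm = nn.LSTM(input_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_mix * 5)   # pi, mu_x, mu_y, sigma_x, sigma_y

    def forward(self, prev_points, state=None):
        out, state = self.lstm(prev_points, state)
        p = self.head(out).view(*out.shape[:2], self.n_mix, 5)
        pi = torch.softmax(p[..., 0], dim=-1)       # mixture weights
        mu = p[..., 1:3]                             # component means
        sigma = torch.exp(p[..., 3:5])               # positive standard deviations
        return pi, mu, sigma, state

dec = MDNDecoder()
pi, mu, sigma, _ = dec(torch.randn(4, 10, 2))        # batch of 4, 10 time steps
print(pi.shape, mu.shape, sigma.shape)                # (4,10,5) (4,10,5,2) (4,10,5,2)
```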
OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction
Title | OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction |
Authors | Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf |
Abstract | Motivation: Ontologies are widely used in biology for data annotation, integration, and analysis. In addition to formally structured axioms, ontologies contain meta-data in the form of annotation axioms which provide valuable pieces of information that characterize ontology classes. Annotations commonly used in ontologies include class labels, descriptions, or synonyms. Despite being a rich source of semantic information, the ontology meta-data are generally unexploited by ontology-based analysis methods such as semantic similarity measures. Results: We propose a novel method, OPA2Vec, to generate vector representations of biological entities in ontologies by combining formal ontology axioms and annotation axioms from the ontology meta-data. We apply a Word2Vec model that has been pre-trained on PubMed abstracts to produce feature vectors from our collected data. We validate our method in two different ways: first, we use the obtained vector representations of proteins as a similarity measure to predict protein-protein interaction (PPI) on two different datasets. Second, we evaluate our method on predicting gene-disease associations based on phenotype similarity by generating vector representations of genes and diseases using a phenotype ontology, and applying the obtained vectors to predict gene-disease associations. These two experiments are just an illustration of the possible applications of our method. OPA2Vec can be used to produce vector representations of any biomedical entity given any type of biomedical ontology. Availability: https://github.com/bio-ontology-research-group/opa2vec Contact: robert.hoehndorf@kaust.edu.sa and xin.gao@kaust.edu.sa. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-04-29 |
URL | http://arxiv.org/abs/1804.10922v1 |
http://arxiv.org/pdf/1804.10922v1.pdf | |
PWC | https://paperswithcode.com/paper/opa2vec-combining-formal-and-informal-content |
Repo | https://github.com/bio-ontology-research-group/opa2vec |
Framework | none |
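A sketch of the OPA2Vec pipeline: formal axioms and annotation axioms are serialized into a text corpus, a Word2Vec model produces entity vectors, and cosine similarity between entity vectors serves as the prediction score. The paper initializes Word2Vec from a PubMed-pretrained model; this sketch trains from scratch on a toy corpus with made-up identifiers.

```python
# Sketch: entity vectors from serialized ontology axioms, scored by cosine similarity.
from gensim.models import Word2Vec
import numpy as np

corpus = [
    "GO:0008150 subClassOf GO:0003674".split(),            # formal axiom
    "GO:0008150 rdfs:label biological process".split(),    # annotation axiom (meta-data)
    "P12345 hasFunction GO:0008150".split(),                # entity-class association
    "P67890 hasFunction GO:0003674".split(),
]

model = Word2Vec(corpus, vector_size=50, window=5, min_count=1, epochs=200)

def cosine(a, b):
    a, b = model.wv[a], model.wv[b]
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print("P12345 ~ P67890:", cosine("P12345", "P67890"))   # similarity-based prediction score
```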
Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network
Title | Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network |
Authors | Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma |
Abstract | As more and more academic papers are being submitted to conferences and journals, evaluating all these papers by professionals is time-consuming and can cause unfairness due to the personal factors of the reviewers. In this paper, in order to assist professionals in evaluating academic papers, we propose a novel task: automatic academic paper rating (AAPR), which automatically determines whether to accept an academic paper. We build a new dataset for this task and propose a novel modularized hierarchical convolutional neural network to achieve automatic academic paper rating. Evaluation results show that the proposed model outperforms the baselines by a large margin. The dataset and code are available at \url{https://github.com/lancopku/AAPR}. |
Tasks | |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03977v1 |
http://arxiv.org/pdf/1805.03977v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-academic-paper-rating-based-on |
Repo | https://github.com/lancopku/AAPR |
Framework | pytorch |
Learning Latent Dynamics for Planning from Pixels
Title | Learning Latent Dynamics for Planning from Pixels |
Authors | Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson |
Abstract | Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space. To achieve high performance, the dynamics model must accurately predict the rewards ahead for multiple time steps. We approach this using a latent dynamics model with both deterministic and stochastic transition components. Moreover, we propose a multi-step variational inference objective that we name latent overshooting. Using only pixel observations, our agent solves continuous control tasks with contact dynamics, partial observability, and sparse rewards, which exceed the difficulty of tasks that were previously solved by planning with learned models. PlaNet uses substantially fewer episodes and reaches final performance close to and sometimes higher than strong model-free algorithms. |
Tasks | Continuous Control, Motion Planning |
Published | 2018-11-12 |
URL | https://arxiv.org/abs/1811.04551v5 |
https://arxiv.org/pdf/1811.04551v5.pdf | |
PWC | https://paperswithcode.com/paper/learning-latent-dynamics-for-planning-from |
Repo | https://github.com/google-research/planet |
Framework | tf |
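A sketch of the "fast online planning in latent space" step from the PlaNet abstract: cross-entropy-method search over action sequences under a learned latent dynamics and reward model. The `dynamics` and `reward` functions below are placeholders, not the paper's recurrent state-space model.

```python
# Sketch of cross-entropy-method planning in a learned latent space.
import numpy as np

rng = np.random.default_rng(0)
ACTION_DIM, HORIZON, CANDIDATES, ITERS, TOP_K = 2, 12, 1000, 10, 100

def dynamics(z, a):            # placeholder latent transition model
    return 0.9 * z + 0.1 * np.tanh(a.sum(axis=-1, keepdims=True))

def reward(z):                 # placeholder latent reward model
    return -np.abs(z - 1.0).sum(axis=-1)

def plan(z0):
    mean = np.zeros((HORIZON, ACTION_DIM))
    std = np.ones((HORIZON, ACTION_DIM))
    for _ in range(ITERS):
        acts = mean + std * rng.standard_normal((CANDIDATES, HORIZON, ACTION_DIM))
        z = np.repeat(z0[None], CANDIDATES, axis=0)
        ret = np.zeros(CANDIDATES)
        for t in range(HORIZON):
            z = dynamics(z, acts[:, t])
            ret += reward(z)
        elite = acts[np.argsort(ret)[-TOP_K:]]          # refit to the best sequences
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean[0]                                       # execute only the first action

print("first planned action:", plan(np.zeros(4)))
```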
Machine Learning Methods for Track Classification in the AT-TPC
Title | Machine Learning Methods for Track Classification in the AT-TPC |
Authors | Michelle P. Kuchera, Raghuram Ramanujan, Jack Z. Taylor, Ryan R. Strauss, Daniel Bazin, Joshua Bradt, Ruiming Chen |
Abstract | We evaluate machine learning methods for event classification in the Active-Target Time Projection Chamber detector at the National Superconducting Cyclotron Laboratory (NSCL) at Michigan State University. An automated method to single out the desired reaction product would result in more accurate physics results as well as a faster analysis process. Binary and multi-class classification methods were tested on data produced by the $^{46}$Ar(p,p) experiment run at the NSCL in September 2015. We found a Convolutional Neural Network to be the most successful classifier of proton scattering events for transfer learning. Results from this investigation and recommendations for event classification in future experiments are presented. |
Tasks | Transfer Learning |
Published | 2018-10-21 |
URL | http://arxiv.org/abs/1810.10350v3 |
http://arxiv.org/pdf/1810.10350v3.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-methods-for-track |
Repo | https://github.com/ATTPC/event-classification |
Framework | tf |
Few-Shot Text Classification with Pre-Trained Word Embeddings and a Human in the Loop
Title | Few-Shot Text Classification with Pre-Trained Word Embeddings and a Human in the Loop |
Authors | Katherine Bailey, Sunny Chopra |
Abstract | Most of the literature around text classification treats it as a supervised learning problem: given a corpus of labeled documents, train a classifier such that it can accurately predict the classes of unseen documents. In industry, however, it is not uncommon for a business to have entire corpora of documents where few or none have been classified, or where existing classifications have become meaningless. With web content, for example, poor taxonomy management can result in labels being applied indiscriminately, making filtering by these labels unhelpful. Our work aims to make it possible to classify an entire corpus of unlabeled documents using a human-in-the-loop approach, where the content owner manually classifies just one or two documents per category and the rest can be automatically classified. This “few-shot” learning approach requires rich representations of the documents such that those that have been manually labeled can be treated as prototypes, and automatic classification of the rest is a simple case of measuring the distance to prototypes. This approach uses pre-trained word embeddings, where documents are represented using a simple weighted average of constituent word embeddings. We have tested the accuracy of the approach on existing labeled datasets and provide the results here. We have also made code available for reproducing the results we got on the 20 Newsgroups dataset. |
Tasks | Few-Shot Learning, Text Classification, Word Embeddings |
Published | 2018-04-05 |
URL | http://arxiv.org/abs/1804.02063v1 |
http://arxiv.org/pdf/1804.02063v1.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-text-classification-with-pre-trained |
Repo | https://github.com/katbailey/few-shot-text-classification |
Framework | none |
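A sketch of the scheme described above: documents are represented as weighted averages of pre-trained word vectors, the one or two manually labeled documents per class become prototypes, and every other document receives the label of the nearest prototype by cosine similarity. The tiny embedding table below is purely illustrative; the paper uses real pre-trained embeddings.

```python
# Sketch: prototype-based few-shot text classification with averaged word vectors.
import numpy as np

emb = {w: v for w, v in zip(
    ["goal", "match", "league", "election", "senate", "vote"],
    np.random.default_rng(1).standard_normal((6, 10)))}

def doc_vector(text, weights=None):
    """Weighted average of the word vectors (uniform weights = plain average)."""
    words = [w for w in text.lower().split() if w in emb]
    vecs = np.array([emb[w] for w in words])
    w = np.ones(len(words)) if weights is None else np.array([weights[t] for t in words])
    return (vecs * w[:, None]).sum(0) / w.sum()

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# one manually labeled document per class -> prototypes
prototypes = {"sports": doc_vector("goal match league"),
              "politics": doc_vector("election senate vote")}

def classify(text):
    v = doc_vector(text)
    return max(prototypes, key=lambda c: cosine(v, prototypes[c]))

print(classify("the league match ended with a late goal"))   # -> sports
```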
Fast Perceptual Image Enhancement
Title | Fast Perceptual Image Enhancement |
Authors | Etienne de Stoutz, Andrey Ignatov, Nikolay Kobyshev, Radu Timofte, Luc Van Gool |
Abstract | The vast majority of photos taken today are taken with mobile phones. While their quality is rapidly improving, due to physical limitations and cost constraints, mobile phone cameras struggle to compare in quality with DSLR cameras. This motivates us to computationally enhance these images. We extend the results of Ignatov et al., who translate images from compact mobile cameras into images of comparable quality to high-resolution photos taken by DSLR cameras. However, the neural models employed require large amounts of computational resources and are not lightweight enough to run on mobile devices. We build upon the prior work and explore different network architectures targeting an increase in image quality and speed. With an efficient network architecture which does most of its processing at a lower spatial resolution, we achieve a significantly higher mean opinion score (MOS) than the baseline while speeding up the computation by 6.3 times on a consumer-grade CPU. This suggests a promising direction for neural-network-based photo enhancement using the phone hardware of the future. |
Tasks | Image Enhancement |
Published | 2018-12-31 |
URL | http://arxiv.org/abs/1812.11852v1 |
http://arxiv.org/pdf/1812.11852v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-perceptual-image-enhancement |
Repo | https://github.com/dojure/FPIE |
Framework | tf |
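A sketch of the architectural idea in the abstract above: downsample early, do most of the processing at the reduced spatial resolution, then upsample back to full size. Layer counts, channel widths, and activations are illustrative only, not the paper's network.

```python
# Sketch of a downsample -> process at low resolution -> upsample enhancer.
import torch
import torch.nn as nn

class FastEnhancer(nn.Module):
    def __init__(self, channels=32, blocks=4):
        super().__init__()
        self.down = nn.Conv2d(3, channels, 4, stride=2, padding=1)   # half resolution
        self.body = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(blocks)])                                  # cheap low-res processing
        self.up = nn.ConvTranspose2d(channels, 3, 4, stride=2, padding=1)

    def forward(self, x):
        h = torch.relu(self.down(x))
        h = self.body(h) + h                 # residual processing at low resolution
        return torch.sigmoid(self.up(h))     # enhanced image in [0, 1]

net = FastEnhancer()
out = net(torch.rand(1, 3, 128, 128))
print(out.shape)    # torch.Size([1, 3, 128, 128])
```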
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations
Title | Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations |
Authors | Andreas Rücklé, Steffen Eger, Maxime Peyrard, Iryna Gurevych |
Abstract | Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performances of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline. Our data and code are publicly available. |
Tasks | Sentence Embedding, Word Embeddings |
Published | 2018-03-04 |
URL | http://arxiv.org/abs/1803.01400v2 |
http://arxiv.org/pdf/1803.01400v2.pdf | |
PWC | https://paperswithcode.com/paper/concatenated-power-mean-word-embeddings-as |
Repo | https://github.com/UKPLab/arxiv2018-xling-sentence-embeddings |
Framework | tf |
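A sketch of concatenated power mean sentence embeddings as described above: for each power p, compute the element-wise power mean of the sentence's word vectors and concatenate the results (p=1 is the plain average, p=±∞ the element-wise max/min). The tiny embedding table is illustrative, and further powers used in the paper (e.g. p=3) are omitted here.

```python
# Sketch: concatenating element-wise power means of word vectors.
import numpy as np

def power_mean(vectors, p):
    """Element-wise power mean; only p = 1 and +/-inf are handled in this sketch."""
    if p == 1:
        return vectors.mean(axis=0)
    if p == float("inf"):
        return vectors.max(axis=0)
    if p == float("-inf"):
        return vectors.min(axis=0)
    raise NotImplementedError("general p omitted in this sketch")

def sentence_embedding(words, emb, powers=(1, float("-inf"), float("inf"))):
    vectors = np.array([emb[w] for w in words if w in emb])
    return np.concatenate([power_mean(vectors, p) for p in powers])

emb = {w: v for w, v in zip("the cat sat on mat".split(),
                            np.random.default_rng(0).standard_normal((5, 50)))}
vec = sentence_embedding("the cat sat on the mat".split(), emb)
print(vec.shape)    # (150,) = 3 power means x 50 dimensions
```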
Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption
Title | Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption |
Authors | Peilun Li, Xiaodan Liang, Daoyuan Jia, Eric P. Xing |
Abstract | Recent advances in vision tasks (e.g., segmentation) highly depend on the availability of large-scale real-world image annotations obtained through cumbersome human labor. Moreover, the perception performance often drops significantly for new scenarios, due to the poor generalization capability of models trained on limited and biased annotations. In this work, we resort to transferring knowledge from automatically rendered scene annotations in the virtual world to facilitate real-world visual tasks. Although virtual-world annotations can be ideally diverse and unlimited, the discrepant data distributions between the virtual and real world make knowledge transfer challenging. We thus propose a novel Semantic-aware Grad-GAN (SG-GAN) to perform virtual-to-real domain adaption with the ability to retain vital semantic information. Beyond the simple holistic color/texture transformation achieved by prior works, SG-GAN successfully personalizes the appearance adaption for each semantic region in order to preserve their key characteristics for better recognition. It presents two main contributions over traditional GANs: 1) a soft gradient-sensitive objective for keeping semantic boundaries; 2) a semantic-aware discriminator for validating the fidelity of personalized adaptions with respect to each semantic region. Qualitative and quantitative experiments demonstrate the superiority of our SG-GAN in scene adaption over state-of-the-art GANs. Further evaluations on semantic segmentation on Cityscapes show that using virtual images adapted by SG-GAN dramatically improves segmentation performance compared to the original virtual data. We release our code at https://github.com/Peilun-Li/SG-GAN. |
Tasks | Domain Adaptation, Semantic Segmentation |
Published | 2018-01-05 |
URL | http://arxiv.org/abs/1801.01726v2 |
http://arxiv.org/pdf/1801.01726v2.pdf | |
PWC | https://paperswithcode.com/paper/semantic-aware-grad-gan-for-virtual-to-real |
Repo | https://github.com/Peilun-Li/SG-GAN |
Framework | tf |
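A sketch of the soft gradient-sensitive idea named in the abstract above: compare image gradients (Sobel) of the adapted and source images, weighting mismatches more heavily near semantic boundaries derived from the label map. The exact weighting and softness in the paper differ; this is a simplified stand-in.

```python
# Sketch of a gradient-sensitive loss weighted near semantic boundaries.
import torch
import torch.nn.functional as F

sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
sobel_y = sobel_x.transpose(2, 3)

def gradients(img):
    gray = img.mean(dim=1, keepdim=True)                  # (B,1,H,W)
    gx = F.conv2d(gray, sobel_x, padding=1)
    gy = F.conv2d(gray, sobel_y, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def gradient_sensitive_loss(adapted, source, labels):
    # boundary weight: locations where the semantic label map changes cost more
    lbl = labels.float().unsqueeze(1)
    boundary = (F.conv2d(lbl, sobel_x, padding=1).abs()
                + F.conv2d(lbl, sobel_y, padding=1).abs()) > 0
    weight = 1.0 + boundary.float()
    return (weight * (gradients(adapted) - gradients(source)).abs()).mean()

adapted, source = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
labels = torch.randint(0, 5, (2, 64, 64))
print(gradient_sensitive_loss(adapted, source, labels))
```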
DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning
Title | DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning |
Authors | Alex Olsen, Dmitry A. Konovalov, Bronson Philippa, Peter Ridd, Jake C. Wood, Jamie Johns, Wesley Banks, Benjamin Girgenti, Owen Kenny, James Whinney, Brendan Calvert, Mostafa Rahimi Azghadi, Ronald D. White |
Abstract | Robotic weed control has seen increased research of late with its potential for boosting productivity in agriculture. The majority of works focus on developing robotics for croplands, ignoring the weed management problems facing rangeland stock farmers. Perhaps the greatest obstacle to widespread uptake of robotic weed control is the robust classification of weed species in their natural environment. The unparalleled successes of deep learning make it an ideal candidate for recognising various weed species in the complex rangeland environment. This work contributes the first large, public, multiclass image dataset of weed species from the Australian rangelands, allowing for the development of robust classification methods to make robotic weed control viable. The DeepWeeds dataset consists of 17,509 labelled images of eight nationally significant weed species native to eight locations across northern Australia. This paper presents a baseline for classification performance on the dataset using the benchmark deep learning models, Inception-v3 and ResNet-50. These models achieved an average classification accuracy of 95.1% and 95.7%, respectively. We also demonstrate real time performance of the ResNet-50 architecture, with an average inference time of 53.4 ms per image. These strong results bode well for future field implementation of robotic weed control methods in the Australian rangelands. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.05726v3 |
http://arxiv.org/pdf/1810.05726v3.pdf | |
PWC | https://paperswithcode.com/paper/deepweeds-a-multiclass-weed-species-image |
Repo | https://github.com/AlexOlsen/DeepWeeds |
Framework | tf |
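A sketch of the ResNet-50 classification baseline described above, in tf.keras. The dataset directory layout, class count (taken from the folders), and training settings are placeholders; the DeepWeeds images themselves come from the linked repository.

```python
# Sketch: ImageNet-pretrained ResNet-50 fine-tuned for weed species classification.
import tensorflow as tf

train = tf.keras.utils.image_dataset_from_directory(
    "deepweeds/train", image_size=(224, 224), batch_size=32)   # hypothetical layout
num_classes = len(train.class_names)

base = tf.keras.applications.ResNet50(weights="imagenet", include_top=False,
                                      input_shape=(224, 224, 3), pooling="avg")
inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.resnet50.preprocess_input(inputs)
outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(base(x))
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train, epochs=10)
```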