October 20, 2019

3300 words 16 mins read

Paper Group AWR 329

Evolution-Guided Policy Gradient in Reinforcement Learning. Multi-attention Recurrent Network for Human Communication Comprehension. Amalgamating Knowledge towards Comprehensive Classification. Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings. Domain Adaptation for Ear Recognition Using Deep Convolutional Neural …

Evolution-Guided Policy Gradient in Reinforcement Learning

Title Evolution-Guided Policy Gradient in Reinforcement Learning
Authors Shauharda Khadka, Kagan Tumer
Abstract Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack of effective exploration, and brittle convergence properties that are extremely sensitive to hyperparameters. Collectively, these challenges severely limit the applicability of these approaches to real-world problems. Evolutionary Algorithms (EAs), a class of black box optimization techniques inspired by natural evolution, are well suited to address each of these three challenges. However, EAs typically suffer from high sample complexity and struggle to solve problems that require optimization of a large number of parameters. In this paper, we introduce Evolutionary Reinforcement Learning (ERL), a hybrid algorithm that leverages the population of an EA to provide diversified data to train an RL agent, and reinserts the RL agent into the EA population periodically to inject gradient information into the EA. ERL inherits EA’s ability of temporal credit assignment with a fitness metric, effective exploration with a diverse set of policies, and stability of a population-based approach and complements it with off-policy DRL’s ability to leverage gradients for higher sample efficiency and faster learning. Experiments in a range of challenging continuous control benchmarks demonstrate that ERL significantly outperforms prior DRL and EA methods.
Tasks Continuous Control
Published 2018-05-21
URL http://arxiv.org/abs/1805.07917v2
PDF http://arxiv.org/pdf/1805.07917v2.pdf
PWC https://paperswithcode.com/paper/evolution-guided-policy-gradient-in
Repo https://github.com/neilsgp/RL-Algorithms
Framework none
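Below is a minimal, self-contained sketch of the hybrid loop described in the ERL abstract: an EA population supplies diverse experience, an off-policy learner takes gradient steps, and the RL policy is periodically copied back into the population. The quadratic "fitness" and the finite-difference gradient step are toy stand-ins, not the paper's DDPG learner or MuJoCo benchmarks.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, POP, GENS, SYNC_EVERY = 8, 10, 50, 5

def fitness(theta):
    # Toy objective standing in for an episode return.
    return -np.sum((theta - 1.0) ** 2)

population = [rng.normal(size=DIM) for _ in range(POP)]
rl_agent = rng.normal(size=DIM)

for gen in range(GENS):
    # 1) Evaluate the population; in ERL these rollouts also fill a shared replay buffer.
    scores = np.array([fitness(p) for p in population])

    # 2) Off-policy RL update; here a finite-difference gradient step stands in
    #    for gradient-based learning from the replay buffer.
    eps = 1e-4
    grad = np.array([(fitness(rl_agent + eps * e) - fitness(rl_agent - eps * e)) / (2 * eps)
                     for e in np.eye(DIM)])
    rl_agent += 0.05 * grad

    # 3) Selection + mutation: keep the top half, mutate copies of the elites.
    order = np.argsort(scores)[::-1]
    elites = [population[i] for i in order[:POP // 2]]
    population = elites + [e + 0.1 * rng.normal(size=DIM) for e in elites]

    # 4) Periodically inject the RL agent into the population, replacing the weakest.
    if gen % SYNC_EVERY == 0:
        population[-1] = rl_agent.copy()

print("best fitness:", max(fitness(p) for p in population))
```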

Multi-attention Recurrent Network for Human Communication Comprehension

Title Multi-attention Recurrent Network for Human Communication Comprehension
Authors Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency
Abstract Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state-of-the-art performance on all the datasets.
Tasks Emotion Recognition, Multimodal Sentiment Analysis, Sentiment Analysis
Published 2018-02-03
URL http://arxiv.org/abs/1802.00923v1
PDF http://arxiv.org/pdf/1802.00923v1.pdf
PWC https://paperswithcode.com/paper/multi-attention-recurrent-network-for-human
Repo https://github.com/pliang279/MFN
Framework pytorch
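Here is a simplified sketch of the Multi-attention Block idea: K softmax attention maps over the concatenated per-modality states extract K different cross-modal interactions. The dimensions and single dense layers are illustrative assumptions; the paper's LSTHM memory and exact wiring are omitted.

```python
import torch
import torch.nn as nn

class MultiAttentionBlock(nn.Module):
    def __init__(self, dims=(32, 16, 16), n_attentions=4, out_dim=64):
        super().__init__()
        total = sum(dims)                      # concatenated modality states
        self.attn = nn.Linear(total, n_attentions * total)
        self.proj = nn.Linear(n_attentions * total, out_dim)
        self.k = n_attentions

    def forward(self, h_lang, h_vision, h_acoustic):
        h = torch.cat([h_lang, h_vision, h_acoustic], dim=-1)   # (B, total)
        # K attention distributions over the concatenated dimensions.
        a = self.attn(h).view(h.size(0), self.k, -1).softmax(dim=-1)
        attended = a * h.unsqueeze(1)                            # (B, K, total)
        return self.proj(attended.flatten(1))                    # cross-modal code

mab = MultiAttentionBlock()
z = mab(torch.randn(2, 32), torch.randn(2, 16), torch.randn(2, 16))
print(z.shape)  # torch.Size([2, 64])
```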

Amalgamating Knowledge towards Comprehensive Classification

Title Amalgamating Knowledge towards Comprehensive Classification
Authors Chengchao Shen, Xinchao Wang, Jie Song, Li Sun, Mingli Song
Abstract With the rapid development of deep learning, an unprecedentedly large number of trained deep network models have become available online. Reusing such trained models can significantly reduce the cost of training new models from scratch, which may not even be feasible, as the annotations used for training the original networks are often unavailable to the public. We propose in this paper to study a new model-reusing task, which we term \emph{knowledge amalgamation}. Given multiple trained teacher networks, each of which specializes in a different classification problem, the goal of knowledge amalgamation is to learn a lightweight student model capable of handling the comprehensive classification. We assume no annotations other than the outputs from the teacher models are available, and thus focus on extracting and amalgamating knowledge from the multiple teachers. To this end, we propose a pilot two-step strategy to tackle the knowledge amalgamation task: first learning compact feature representations from the teachers, and then learning the network parameters in a layer-wise manner to build the student model. We apply this approach to four public datasets and obtain very encouraging results: even without any human annotation, the obtained student model is competent to handle the comprehensive classification task and in most cases outperforms the teachers on their individual sub-tasks.
Tasks
Published 2018-11-07
URL http://arxiv.org/abs/1811.02796v2
PDF http://arxiv.org/pdf/1811.02796v2.pdf
PWC https://paperswithcode.com/paper/amalgamating-knowledge-towards-comprehensive
Repo https://github.com/zju-vipa/KamalEngine
Framework pytorch
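A hedged sketch of the two-step idea follows: with no ground-truth labels, the student learns (a) a compact feature that can be projected onto each teacher's output space and (b) outputs matching the teachers' soft predictions on their own label subsets. All networks here are tiny stand-ins for the paper's CNNs, and the losses are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

t1 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))  # teacher A: 3 classes
t2 = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 4))  # teacher B: 4 classes

student = nn.Sequential(nn.Linear(10, 16), nn.ReLU())
head = nn.Linear(16, 3 + 4)                  # comprehensive classifier over the union
adapters = nn.ModuleList([nn.Linear(16, 3), nn.Linear(16, 4)])  # per-teacher projections

opt = torch.optim.Adam([*student.parameters(), *head.parameters(),
                        *adapters.parameters()], lr=1e-3)

for step in range(200):
    x = torch.randn(64, 10)                  # unlabeled transfer data
    with torch.no_grad():
        targets = [t1(x), t2(x)]             # teachers' outputs are the only supervision
    feat = student(x)
    logits = head(feat)
    loss = 0.0
    for i, tgt in enumerate(targets):
        # Feature amalgamation: project the compact feature onto each teacher's space.
        loss = loss + F.mse_loss(adapters[i](feat), tgt)
        # Output amalgamation: match soft labels on each teacher's class subset.
        lo, hi = (0, 3) if i == 0 else (3, 7)
        loss = loss + F.kl_div(F.log_softmax(logits[:, lo:hi], -1),
                               F.softmax(tgt, -1), reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()
```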

Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings

Title Breaking-down the Ontology Alignment Task with a Lexical Index and Neural Embeddings
Authors Ernesto Jimenez-Ruiz, Asan Agibetov, Matthias Samwald, Valerie Cross
Abstract Large ontologies still pose serious challenges to state-of-the-art ontology alignment systems. In this paper, we present an approach that combines a lexical index, a neural embedding model and locality modules to effectively divide an input ontology matching task into smaller and more tractable matching (sub)tasks. We have conducted a comprehensive evaluation using the datasets of the Ontology Alignment Evaluation Initiative. The results are encouraging and suggest that the proposed methods are adequate in practice and can be integrated within the workflow of state-of-the-art systems.
Tasks
Published 2018-05-31
URL http://arxiv.org/abs/1805.12402v1
PDF http://arxiv.org/pdf/1805.12402v1.pdf
PWC https://paperswithcode.com/paper/breaking-down-the-ontology-alignment-task
Repo https://github.com/ernestojimenezruiz/logmap-matcher
Framework none
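A minimal sketch of the divide step: a lexical (inverted) index over entity labels groups entities from both ontologies that share a token, and each group becomes a smaller matching sub-task. The labels are toy examples; the paper additionally uses neural embeddings and locality modules to refine and contextualize these sub-tasks.

```python
from collections import defaultdict

onto1 = {"o1:HeartDisease": "heart disease", "o1:LungCancer": "lung cancer"}
onto2 = {"o2:CardiacDisorder": "cardiac heart disorder",
         "o2:PulmonaryCancer": "lung carcinoma"}

index = defaultdict(lambda: (set(), set()))
for ent, label in onto1.items():
    for tok in label.split():
        index[tok][0].add(ent)
for ent, label in onto2.items():
    for tok in label.split():
        index[tok][1].add(ent)

# Keep only index entries that cover entities from both ontologies:
# each one yields a tractable matching sub-task.
subtasks = {tok: pair for tok, pair in index.items() if pair[0] and pair[1]}
for tok, (left, right) in subtasks.items():
    print(f"sub-task '{tok}': {sorted(left)} x {sorted(right)}")
```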

Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks

Title Domain Adaptation for Ear Recognition Using Deep Convolutional Neural Networks
Authors Fevziye Irem Eyiokur, Dogucan Yaman, Hazım Kemal Ekenel
Abstract In this paper, we have extensively investigated the unconstrained ear recognition problem. We have first shown the importance of domain adaptation when deep convolutional neural network models are used for ear recognition. To enable domain adaptation, we have collected a new ear dataset using the Multi-PIE face dataset, which we named the Multi-PIE ear dataset. To improve the performance further, we have combined different deep convolutional neural network models. We have analyzed in depth the effect of ear image quality, for example, illumination and aspect ratio, on the classification performance. Finally, we have addressed the problem of dataset bias in the ear recognition field. Experiments on the UERC dataset have shown that domain adaptation leads to a significant performance improvement. For example, when the VGG-16 model is used and domain adaptation is applied, an absolute increase of around 10% has been achieved. Combining different deep convolutional neural network models has further improved the accuracy by 4%. It has also been observed that image quality has an influence on the results. In the experiments that we have conducted to examine the dataset bias, given an ear image, we were able to classify the dataset it came from with 99.71% accuracy, which indicates a strong bias among the ear recognition datasets.
Tasks Domain Adaptation
Published 2018-03-21
URL http://arxiv.org/abs/1803.07801v1
PDF http://arxiv.org/pdf/1803.07801v1.pdf
PWC https://paperswithcode.com/paper/domain-adaptation-for-ear-recognition-using
Repo https://github.com/iremeyiokur/multipie_ear_dataset
Framework none
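A sketch of the domain-adaptation step as described: fine-tune an ImageNet-pretrained VGG-16 on ear images before training on the target recognition task. The class count, frozen-layer cutoff, and hyperparameters are placeholder assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_EAR_CLASSES = 100                                  # hypothetical
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Replace the 1000-way ImageNet head with an ear-identity head.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_EAR_CLASSES)

# Optionally freeze early convolutional blocks and fine-tune the rest.
for p in model.features[:10].parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD((p for p in model.parameters() if p.requires_grad),
                            lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

x = torch.randn(4, 3, 224, 224)                        # stand-in for a batch of ear crops
y = torch.randint(0, NUM_EAR_CLASSES, (4,))
loss = criterion(model(x), y)
loss.backward(); optimizer.step()
```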

Predicting Aircraft Trajectories: A Deep Generative Convolutional Recurrent Neural Networks Approach

Title Predicting Aircraft Trajectories: A Deep Generative Convolutional Recurrent Neural Networks Approach
Authors Yulin Liu, Mark Hansen
Abstract Reliable 4D aircraft trajectory prediction, whether in a real-time setting or for analysis of counterfactuals, is important to the efficiency of the aviation system. Toward this end, we first propose a highly generalizable efficient tree-based matching algorithm to construct image-like feature maps from high-fidelity meteorological datasets - wind, temperature and convective weather. We then model the track points on trajectories as conditional Gaussian mixtures with parameters to be learned from our proposed deep generative model, which is an end-to-end convolutional recurrent neural network that consists of a long short-term memory (LSTM) encoder network and a mixture density LSTM decoder network. The encoder network embeds last-filed flight plan information into fixed-size hidden state variables and feeds the decoder network, which further learns the spatiotemporal correlations from the historical flight tracks and outputs the parameters of Gaussian mixtures. Convolutional layers are integrated into the pipeline to learn representations from the high-dimension weather features. During the inference process, beam search, adaptive Kalman filter, and Rauch-Tung-Striebel smoother algorithms are used to prune the variance of generated trajectories.
Tasks Trajectory Prediction
Published 2018-12-31
URL http://arxiv.org/abs/1812.11670v1
PDF http://arxiv.org/pdf/1812.11670v1.pdf
PWC https://paperswithcode.com/paper/predicting-aircraft-trajectories-a-deep
Repo https://github.com/yulinliu101/DeepTP
Framework tf
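A hedged sketch of the mixture-density decoder head: an LSTM hidden state is mapped to the parameters of a Gaussian mixture over the next 2-D track point, trained by negative log-likelihood. The encoder, weather convolutions, and the beam-search/Kalman-smoothing inference are omitted, and the sizes are illustrative.

```python
import torch
import torch.nn as nn

class MDNDecoder(nn.Module):
    def __init__(self, in_dim=2, hidden=64, n_mix=5, out_dim=2):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.params = nn.Linear(hidden, n_mix * (1 + 2 * out_dim))  # pi, mu, sigma
        self.k, self.d = n_mix, out_dim

    def forward(self, x, state=None):
        h, state = self.lstm(x, state)
        p = self.params(h)                                  # (B, T, K*(1+2D))
        pi = p[..., :self.k].log_softmax(-1)                # mixture log-weights
        mu = p[..., self.k:self.k * (1 + self.d)].unflatten(-1, (self.k, self.d))
        sigma = p[..., self.k * (1 + self.d):].unflatten(-1, (self.k, self.d)).exp()
        return pi, mu, sigma, state

def mdn_nll(pi, mu, sigma, y):
    # Diagonal-Gaussian mixture negative log-likelihood for targets y: (B, T, D).
    comp = torch.distributions.Normal(mu, sigma)
    log_prob = comp.log_prob(y.unsqueeze(-2)).sum(-1)       # (B, T, K)
    return -torch.logsumexp(pi + log_prob, dim=-1).mean()

dec = MDNDecoder()
pi, mu, sigma, _ = dec(torch.randn(8, 20, 2))
print(mdn_nll(pi, mu, sigma, torch.randn(8, 20, 2)))
```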

OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction

Title OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction
Authors Fatima Zohra Smaili, Xin Gao, Robert Hoehndorf
Abstract Motivation: Ontologies are widely used in biology for data annotation, integration, and analysis. In addition to formally structured axioms, ontologies contain meta-data in the form of annotation axioms which provide valuable pieces of information that characterize ontology classes. Annotations commonly used in ontologies include class labels, descriptions, or synonyms. Despite being a rich source of semantic information, the ontology meta-data are generally unexploited by ontology-based analysis methods such as semantic similarity measures. Results: We propose a novel method, OPA2Vec, to generate vector representations of biological entities in ontologies by combining formal ontology axioms and annotation axioms from the ontology meta-data. We apply a Word2Vec model that has been pre-trained on PubMed abstracts to produce feature vectors from our collected data. We validate our method in two different ways: first, we use the obtained vector representations of proteins as a similarity measure to predict protein-protein interaction (PPI) on two different datasets. Second, we evaluate our method on predicting gene-disease associations based on phenotype similarity by generating vector representations of genes and diseases using a phenotype ontology, and applying the obtained vectors to predict gene-disease associations. These two experiments are just an illustration of the possible applications of our method. OPA2Vec can be used to produce vector representations of any biomedical entity given any type of biomedical ontology. Availability: https://github.com/bio-ontology-research-group/opa2vec Contact: robert.hoehndorf@kaust.edu.sa and xin.gao@kaust.edu.sa.
Tasks Semantic Similarity, Semantic Textual Similarity
Published 2018-04-29
URL http://arxiv.org/abs/1804.10922v1
PDF http://arxiv.org/pdf/1804.10922v1.pdf
PWC https://paperswithcode.com/paper/opa2vec-combining-formal-and-informal-content
Repo https://github.com/bio-ontology-research-group/opa2vec
Framework none
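A minimal sketch of the OPA2Vec recipe: serialize formal axioms and annotation axioms as sentences, train a Word2Vec model on that corpus, then read off entity vectors for similarity. The tiny corpus here is purely illustrative, and the paper initializes from a model pre-trained on PubMed abstracts rather than training from scratch.

```python
from gensim.models import Word2Vec

corpus = [
    # Formal axioms flattened into token sequences.
    ["GO_0008150", "subClassOf", "owl_Thing"],
    ["GO_0006915", "subClassOf", "GO_0008150"],
    # Annotation axioms (labels, descriptions) as sentences mentioning the class.
    ["GO_0006915", "label", "apoptotic", "process"],
    ["GO_0008150", "label", "biological", "process"],
]

model = Word2Vec(corpus, vector_size=50, window=5, min_count=1, epochs=100, seed=1)
print(model.wv.similarity("GO_0006915", "GO_0008150"))
```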

Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network

Title Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network
Authors Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma
Abstract As more and more academic papers are being submitted to conferences and journals, evaluating all these papers by professionals is time-consuming and can cause inequality due to the personal factors of the reviewers. In this paper, in order to assist professionals in evaluating academic papers, we propose a novel task: automatic academic paper rating (AAPR), which automatically determines whether to accept academic papers. We build a new dataset for this task and propose a novel modularized hierarchical convolutional neural network to achieve automatic academic paper rating. Evaluation results show that the proposed model outperforms the baselines by a large margin. The dataset and code are available at \url{https://github.com/lancopku/AAPR}.
Tasks
Published 2018-05-10
URL http://arxiv.org/abs/1805.03977v1
PDF http://arxiv.org/pdf/1805.03977v1.pdf
PWC https://paperswithcode.com/paper/automatic-academic-paper-rating-based-on
Repo https://github.com/lancopku/AAPR
Framework pytorch
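A rough sketch of the modularized hierarchical idea: a word-level CNN encodes each section (abstract, introduction, ...) into a vector with its own module, and a classifier over the concatenated section vectors predicts accept/reject. The vocabulary size, section count, and dimensions are invented for illustration.

```python
import torch
import torch.nn as nn

class SectionEncoder(nn.Module):
    def __init__(self, vocab=5000, emb=64, channels=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=3, padding=1)

    def forward(self, tokens):                       # (B, T) word ids
        h = self.conv(self.emb(tokens).transpose(1, 2)).relu()
        return h.max(dim=-1).values                  # max-over-time pooling

class AAPRModel(nn.Module):
    def __init__(self, n_sections=4, channels=32):
        super().__init__()
        # One encoder module per section, as in the "modularized" design.
        self.encoders = nn.ModuleList(SectionEncoder(channels=channels)
                                      for _ in range(n_sections))
        self.clf = nn.Linear(n_sections * channels, 2)  # accept / reject

    def forward(self, sections):                     # list of (B, T_i) tensors
        vecs = [enc(s) for enc, s in zip(self.encoders, sections)]
        return self.clf(torch.cat(vecs, dim=-1))

model = AAPRModel()
batch = [torch.randint(0, 5000, (2, 50)) for _ in range(4)]
print(model(batch).shape)  # torch.Size([2, 2])
```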

Learning Latent Dynamics for Planning from Pixels

Title Learning Latent Dynamics for Planning from Pixels
Authors Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson
Abstract Planning has been very successful for control tasks with known environment dynamics. To leverage planning in unknown environments, the agent needs to learn the dynamics from interactions with the world. However, learning dynamics models that are accurate enough for planning has been a long-standing challenge, especially in image-based domains. We propose the Deep Planning Network (PlaNet), a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space. To achieve high performance, the dynamics model must accurately predict the rewards ahead for multiple time steps. We approach this using a latent dynamics model with both deterministic and stochastic transition components. Moreover, we propose a multi-step variational inference objective that we name latent overshooting. Using only pixel observations, our agent solves continuous control tasks with contact dynamics, partial observability, and sparse rewards, which exceed the difficulty of tasks that were previously solved by planning with learned models. PlaNet uses substantially fewer episodes and reaches final performance close to and sometimes higher than strong model-free algorithms.
Tasks Continuous Control, Motion Planning
Published 2018-11-12
URL https://arxiv.org/abs/1811.04551v5
PDF https://arxiv.org/pdf/1811.04551v5.pdf
PWC https://paperswithcode.com/paper/learning-latent-dynamics-for-planning-from
Repo https://github.com/google-research/planet
Framework tf
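A compact sketch of the latent dynamics at the core of PlaNet: a recurrent deterministic path (a GRU) plus a stochastic latent sampled from a Gaussian predicted at each step, combined so the model can both remember reliably and capture multiple possible futures. Sizes are illustrative, and the observation/reward decoders and latent-overshooting objective are omitted.

```python
import torch
import torch.nn as nn

class RSSMTransition(nn.Module):
    def __init__(self, stoch=8, deter=32, action=2):
        super().__init__()
        self.cell = nn.GRUCell(stoch + action, deter)   # deterministic path
        self.prior = nn.Linear(deter, 2 * stoch)        # -> mean, log-std
        self.stoch = stoch

    def forward(self, z, h, a):
        h = self.cell(torch.cat([z, a], -1), h)         # update deterministic state
        mean, log_std = self.prior(h).chunk(2, -1)
        z = mean + log_std.exp() * torch.randn_like(mean)   # sample stochastic state
        return z, h

trans = RSSMTransition()
z, h = torch.zeros(1, 8), torch.zeros(1, 32)
for t in range(10):                                     # imagine a rollout in latent space
    z, h = trans(z, h, torch.randn(1, 2))
print(z.shape, h.shape)
```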

Machine Learning Methods for Track Classification in the AT-TPC

Title Machine Learning Methods for Track Classification in the AT-TPC
Authors Michelle P. Kuchera, Raghuram Ramanujan, Jack Z. Taylor, Ryan R. Strauss, Daniel Bazin, Joshua Bradt, Ruiming Chen
Abstract We evaluate machine learning methods for event classification in the Active-Target Time Projection Chamber detector at the National Superconducting Cyclotron Laboratory (NSCL) at Michigan State University. An automated method to single out the desired reaction product would result in more accurate physics results as well as a faster analysis process. Binary and multi-class classification methods were tested on data produced by the $^{46}$Ar(p,p) experiment run at the NSCL in September 2015. We found a Convolutional Neural Network to be the most successful classifier of proton scattering events for transfer learning. Results from this investigation and recommendations for event classification in future experiments are presented.
Tasks Transfer Learning
Published 2018-10-21
URL http://arxiv.org/abs/1810.10350v3
PDF http://arxiv.org/pdf/1810.10350v3.pdf
PWC https://paperswithcode.com/paper/machine-learning-methods-for-track
Repo https://github.com/ATTPC/event-classification
Framework tf
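A generic sketch of the transfer-learning setup the study found most successful: use a pretrained CNN as a fixed feature extractor for 2-D track images and train a lightweight classifier on top. The VGG-16 backbone and logistic-regression head are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
import torch
from torchvision import models
from sklearn.linear_model import LogisticRegression

backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

def extract(images):                        # images: (N, 3, 224, 224)
    with torch.no_grad():
        return backbone(images).flatten(1).numpy()

# Stand-ins for projected AT-TPC track images and proton / non-proton labels.
X = extract(torch.rand(16, 3, 224, 224))
y = np.random.randint(0, 2, 16)
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```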

Few-Shot Text Classification with Pre-Trained Word Embeddings and a Human in the Loop

Title Few-Shot Text Classification with Pre-Trained Word Embeddings and a Human in the Loop
Authors Katherine Bailey, Sunny Chopra
Abstract Most of the literature around text classification treats it as a supervised learning problem: given a corpus of labeled documents, train a classifier such that it can accurately predict the classes of unseen documents. In industry, however, it is not uncommon for a business to have entire corpora of documents where few or none have been classified, or where existing classifications have become meaningless. With web content, for example, poor taxonomy management can result in labels being applied indiscriminately, making filtering by these labels unhelpful. Our work aims to make it possible to classify an entire corpus of unlabeled documents using a human-in-the-loop approach, where the content owner manually classifies just one or two documents per category and the rest can be automatically classified. This “few-shot” learning approach requires rich representations of the documents such that those that have been manually labeled can be treated as prototypes, and automatic classification of the rest is a simple case of measuring the distance to prototypes. This approach uses pre-trained word embeddings, where documents are represented using a simple weighted average of constituent word embeddings. We have tested the accuracy of the approach on existing labeled datasets and provide the results here. We have also made code available for reproducing the results we got on the 20 Newsgroups dataset.
Tasks Few-Shot Learning, Text Classification, Word Embeddings
Published 2018-04-05
URL http://arxiv.org/abs/1804.02063v1
PDF http://arxiv.org/pdf/1804.02063v1.pdf
PWC https://paperswithcode.com/paper/few-shot-text-classification-with-pre-trained
Repo https://github.com/katbailey/few-shot-text-classification
Framework none
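A self-contained sketch of the prototype approach: documents become weighted averages of word vectors, each manually labeled document serves as a class prototype, and everything else is assigned to the nearest prototype by cosine similarity. The toy 4-D "embeddings" and uniform weights stand in for real pre-trained vectors and the paper's weighting scheme.

```python
import numpy as np

emb = {"refund": [1, 0, 0, 0], "payment": [.9, .1, 0, 0],
       "crash":  [0, 1, 0, 0], "error":   [0, .9, .1, 0]}
emb = {w: np.array(v, dtype=float) for w, v in emb.items()}

def doc_vector(words):
    vecs = [emb[w] for w in words if w in emb]
    return np.mean(vecs, axis=0)            # uniform weights for simplicity

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# One labeled example per class defines the prototypes.
prototypes = {"billing": doc_vector(["refund", "payment"]),
              "bugs":    doc_vector(["crash", "error"])}

def classify(words):
    v = doc_vector(words)
    return max(prototypes, key=lambda c: cosine(v, prototypes[c]))

print(classify(["payment", "refund", "refund"]))  # -> billing
print(classify(["error", "crash"]))               # -> bugs
```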

Fast Perceptual Image Enhancement

Title Fast Perceptual Image Enhancement
Authors Etienne de Stoutz, Andrey Ignatov, Nikolay Kobyshev, Radu Timofte, Luc Van Gool
Abstract The vast majority of photos taken today are taken with mobile phones. While their quality is rapidly improving, due to physical limitations and cost constraints, mobile phone cameras struggle to match the quality of DSLR cameras. This motivates us to computationally enhance these images. We extend the results of Ignatov et al., who translate images from compact mobile cameras into images of quality comparable to high-resolution photos taken by DSLR cameras. However, the neural models employed require large amounts of computational resources and are not lightweight enough to run on mobile devices. We build upon the prior work and explore different network architectures targeting an increase in image quality and speed. With an efficient network architecture which does most of its processing at a lower spatial resolution, we achieve a significantly higher mean opinion score (MOS) than the baseline while speeding up the computation by 6.3 times on a consumer-grade CPU. This suggests a promising direction for neural-network-based photo enhancement using the phone hardware of the future.
Tasks Image Enhancement
Published 2018-12-31
URL http://arxiv.org/abs/1812.11852v1
PDF http://arxiv.org/pdf/1812.11852v1.pdf
PWC https://paperswithcode.com/paper/fast-perceptual-image-enhancement
Repo https://github.com/dojure/FPIE
Framework tf
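Here is a sketch of the speed-oriented design choice described above: strided downsampling so most convolutions run at reduced spatial resolution, then upsampling and a global residual connection. Layer counts and widths are invented; the paper's exact architecture and perceptual losses are not reproduced.

```python
import torch
import torch.nn as nn

class FastEnhancer(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.down = nn.Conv2d(3, ch, 4, stride=2, padding=1)      # 1/2 resolution
        self.body = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1)

    def forward(self, x):
        # Cheap processing at low resolution, then a residual correction.
        return (x + self.up(self.body(self.down(x).relu()))).clamp(0, 1)

net = FastEnhancer()
print(net(torch.rand(1, 3, 128, 128)).shape)  # torch.Size([1, 3, 128, 128])
```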

Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations

Title Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations
Authors Andreas Rücklé, Steffen Eger, Maxime Peyrard, Iryna Gurevych
Abstract Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performances of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline. Our data and code are publicly available.
Tasks Sentence Embedding, Word Embeddings
Published 2018-03-04
URL http://arxiv.org/abs/1803.01400v2
PDF http://arxiv.org/pdf/1803.01400v2.pdf
PWC https://paperswithcode.com/paper/concatenated-power-mean-word-embeddings-as
Repo https://github.com/UKPLab/arxiv2018-xling-sentence-embeddings
Framework tf
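A small sketch of concatenated power mean sentence embeddings: for each power p, compute the element-wise power mean of the word vectors and concatenate the results (p = -inf and +inf reduce to the min and max). The toy vectors stand in for real pre-trained embeddings.

```python
import numpy as np

def power_mean(vectors, p):
    v = np.stack(vectors)
    if p == float("-inf"):
        return v.min(axis=0)
    if p == float("inf"):
        return v.max(axis=0)
    if p == 1:
        return v.mean(axis=0)
    # General case (assumes positive entries for non-integer p).
    return np.mean(v ** p, axis=0) ** (1.0 / p)

words = [np.array([0.2, 0.5]), np.array([0.4, 0.1]), np.array([0.3, 0.3])]
powers = [float("-inf"), 1.0, float("inf")]
sentence = np.concatenate([power_mean(words, p) for p in powers])
print(sentence)   # 3 powers * 2 dims = 6-dimensional sentence embedding
```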

Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption

Title Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption
Authors Peilun Li, Xiaodan Liang, Daoyuan Jia, Eric P. Xing
Abstract Recent advances in vision tasks (e.g., segmentation) highly depend on the availability of large-scale real-world image annotations obtained through cumbersome human labor. Moreover, the perception performance often drops significantly for new scenarios, due to the poor generalization capability of models trained on limited and biased annotations. In this work, we resort to transferring knowledge from automatically rendered scene annotations in the virtual world to facilitate real-world visual tasks. Although virtual-world annotations can be ideally diverse and unlimited, the discrepant data distributions between the virtual and real worlds make knowledge transfer challenging. We thus propose a novel Semantic-aware Grad-GAN (SG-GAN) to perform virtual-to-real domain adaption while retaining vital semantic information. Beyond the simple holistic color/texture transformation achieved by prior works, SG-GAN successfully personalizes the appearance adaption for each semantic region in order to preserve its key characteristics for better recognition. It presents two main contributions to traditional GANs: 1) a soft gradient-sensitive objective for keeping semantic boundaries; 2) a semantic-aware discriminator for validating the fidelity of personalized adaptions with respect to each semantic region. Qualitative and quantitative experiments demonstrate the superiority of our SG-GAN in scene adaption over state-of-the-art GANs. Further evaluations of semantic segmentation on Cityscapes show that using virtual images adapted by SG-GAN dramatically improves segmentation performance compared with the original virtual data. We release our code at https://github.com/Peilun-Li/SG-GAN.
Tasks Domain Adaptation, Semantic Segmentation
Published 2018-01-05
URL http://arxiv.org/abs/1801.01726v2
PDF http://arxiv.org/pdf/1801.01726v2.pdf
PWC https://paperswithcode.com/paper/semantic-aware-grad-gan-for-virtual-to-real
Repo https://github.com/Peilun-Li/SG-GAN
Framework tf
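A hedged sketch of the soft gradient-sensitive idea: derive a soft boundary mask from the semantic label map with a Sobel filter, and penalize differences between the input and adapted images' gradients mainly at those boundaries, so edges between semantic regions are preserved. The weights and exact formulation in the paper may differ.

```python
import torch
import torch.nn.functional as F

sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)

def grad_mag(img):                     # img: (B, 1, H, W)
    gx = F.conv2d(img, sobel_x, padding=1)
    gy = F.conv2d(img, sobel_x.transpose(2, 3), padding=1)
    return (gx ** 2 + gy ** 2 + 1e-8).sqrt()

def gradient_sensitive_loss(real, fake, labels):
    # Soft boundary mask from the semantic map's own gradients.
    boundary = grad_mag(labels.float()).clamp(0, 1)
    return (boundary * (grad_mag(real) - grad_mag(fake)).abs()).mean()

real = torch.rand(2, 1, 64, 64)        # grayscale stand-ins for images
fake = torch.rand(2, 1, 64, 64)
labels = torch.randint(0, 5, (2, 1, 64, 64))
print(gradient_sensitive_loss(real, fake, labels))
```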

DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning

Title DeepWeeds: A Multiclass Weed Species Image Dataset for Deep Learning
Authors Alex Olsen, Dmitry A. Konovalov, Bronson Philippa, Peter Ridd, Jake C. Wood, Jamie Johns, Wesley Banks, Benjamin Girgenti, Owen Kenny, James Whinney, Brendan Calvert, Mostafa Rahimi Azghadi, Ronald D. White
Abstract Robotic weed control has seen increased research of late with its potential for boosting productivity in agriculture. The majority of works focus on developing robotics for croplands, ignoring the weed management problems facing rangeland stock farmers. Perhaps the greatest obstacle to widespread uptake of robotic weed control is the robust classification of weed species in their natural environment. The unparalleled successes of deep learning make it an ideal candidate for recognising various weed species in the complex rangeland environment. This work contributes the first large, public, multiclass image dataset of weed species from the Australian rangelands, allowing for the development of robust classification methods to make robotic weed control viable. The DeepWeeds dataset consists of 17,509 labelled images of eight nationally significant weed species collected at eight locations across northern Australia. This paper presents a baseline for classification performance on the dataset using the benchmark deep learning models Inception-v3 and ResNet-50. These models achieved an average classification accuracy of 95.1% and 95.7%, respectively. We also demonstrate real-time performance of the ResNet-50 architecture, with an average inference time of 53.4 ms per image. These strong results bode well for future field implementation of robotic weed control methods in the Australian rangelands.
Tasks
Published 2018-10-09
URL http://arxiv.org/abs/1810.05726v3
PDF http://arxiv.org/pdf/1810.05726v3.pdf
PWC https://paperswithcode.com/paper/deepweeds-a-multiclass-weed-species-image
Repo https://github.com/AlexOlsen/DeepWeeds
Framework tf
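A brief sketch of the reported ResNet-50 baseline: fine-tune an ImageNet-pretrained ResNet-50 with a head for the dataset's weed classes, and time per-image inference as the paper does. The 9-way head (eight weed species plus a negative class) and the timing setup are assumptions based on the abstract.

```python
import time
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 9)     # 8 weed species + "negative" (assumed)
model.eval()

x = torch.randn(1, 3, 224, 224)                   # stand-in for one DeepWeeds image
with torch.no_grad():
    model(x)                                      # warm-up pass
    t0 = time.perf_counter()
    for _ in range(20):
        model(x)
    print(f"avg inference: {(time.perf_counter() - t0) / 20 * 1000:.1f} ms/image")
```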