July 30, 2019

Paper Group AWR 71

Learning SO(3) Equivariant Representations with Spherical CNNs

Title Learning SO(3) Equivariant Representations with Spherical CNNs
Authors Carlos Esteves, Christine Allen-Blanchette, Ameesh Makadia, Kostas Daniilidis
Abstract We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks requiring higher capacity and extended data augmentation in order to tackle it. We model 3D data with multi-valued spherical functions and we propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. Resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling on the spectral domain and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity and without requiring data augmentation can exhibit performance comparable to the state of the art in standard retrieval and classification benchmarks.
Tasks Data Augmentation
Published 2017-11-17
URL http://arxiv.org/abs/1711.06721v3
PDF http://arxiv.org/pdf/1711.06721v3.pdf
PWC https://paperswithcode.com/paper/learning-so3-equivariant-representations-with
Repo https://github.com/aidinhass/tf-sphcnn
Framework tf

Open-World Knowledge Graph Completion

Title Open-World Knowledge Graph Completion
Authors Baoxu Shi, Tim Weninger
Abstract Knowledge Graphs (KGs) have been applied to many tasks including Web search, link prediction, recommendation, natural language processing, and entity linking. However, most KGs are far from complete and are growing at a rapid pace. To address these problems, Knowledge Graph Completion (KGC) has been proposed to improve KGs by filling in its missing connections. Unlike existing methods which hold a closed-world assumption, i.e., where KGs are fixed and new entities cannot be easily added, in the present work we relax this assumption and propose a new open-world KGC task. As a first attempt to solve this task we introduce an open-world KGC model called ConMask. This model learns embeddings of the entity’s name and parts of its text-description to connect unseen entities to the KG. To mitigate the presence of noisy text descriptions, ConMask uses a relationship-dependent content masking to extract relevant snippets and then trains a fully convolutional neural network to fuse the extracted snippets with entities in the KG. Experiments on large data sets, both old and new, show that ConMask performs well in the open-world KGC task and even outperforms existing KGC models on the standard closed-world KGC task.
Tasks Entity Linking, Knowledge Graph Completion, Knowledge Graphs, Link Prediction
Published 2017-11-09
URL http://arxiv.org/abs/1711.03438v1
PDF http://arxiv.org/pdf/1711.03438v1.pdf
PWC https://paperswithcode.com/paper/open-world-knowledge-graph-completion
Repo https://github.com/bxshi/ConMask
Framework tf

Should Robots be Obedient?

Title Should Robots be Obedient?
Authors Smitha Milli, Dylan Hadfield-Menell, Anca Dragan, Stuart Russell
Abstract Intuitively, obedience – following the order that a human gives – seems like a good property for a robot to have. But, we humans are not perfect and we may give orders that are not best aligned to our preferences. We show that when a human is not perfectly rational then a robot that tries to infer and act according to the human’s underlying preferences can always perform better than a robot that simply follows the human’s literal order. Thus, there is a tradeoff between the obedience of a robot and the value it can attain for its owner. We investigate how this tradeoff is impacted by the way the robot infers the human’s preferences, showing that some methods err more on the side of obedience than others. We then analyze how performance degrades when the robot has a misspecified model of the features that the human cares about or the level of rationality of the human. Finally, we study how robots can start detecting such model misspecification. Overall, our work suggests that there might be a middle ground in which robots intelligently decide when to obey human orders, but err on the side of obedience.
Tasks
Published 2017-05-28
URL http://arxiv.org/abs/1705.09990v1
PDF http://arxiv.org/pdf/1705.09990v1.pdf
PWC https://paperswithcode.com/paper/should-robots-be-obedient
Repo https://github.com/smilli/obedience
Framework none

Learning a Generative Model for Validity in Complex Discrete Structures

Title Learning a Generative Model for Validity in Complex Discrete Structures
Authors David Janz, Jos van der Westhuizen, Brooks Paige, Matt J. Kusner, José Miguel Hernández-Lobato
Abstract Deep generative models have been successfully used to learn representations for high-dimensional discrete spaces by representing discrete objects as sequences and employing powerful sequence-based deep models. Unfortunately, these sequence-based models often produce invalid sequences: sequences which do not represent any underlying discrete structure; invalid sequences hinder the utility of such models. As a step towards solving this problem, we propose to learn a deep recurrent validator model, which can estimate whether a partial sequence can function as the beginning of a full, valid sequence. This validator provides insight as to how individual sequence elements influence the validity of the overall sequence, and can be used to constrain sequence based models to generate valid sequences – and thus faithfully model discrete objects. Our approach is inspired by reinforcement learning, where an oracle which can evaluate validity of complete sequences provides a sparse reward signal. We demonstrate its effectiveness as a generative model of Python 3 source code for mathematical expressions, and in improving the ability of a variational autoencoder trained on SMILES strings to decode valid molecular structures.
Tasks
Published 2017-12-05
URL http://arxiv.org/abs/1712.01664v4
PDF http://arxiv.org/pdf/1712.01664v4.pdf
PWC https://paperswithcode.com/paper/learning-a-generative-model-for-validity-in
Repo https://github.com/DavidJanz/molecule_grammar_rnn
Framework pytorch
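
The prefix-validity idea in the abstract above can be illustrated with a small Python sketch. This is not the paper's learned recurrent validator: it swaps in a hand-written oracle for a toy language of balanced parentheses, and the names `can_extend` and `allowed_next` are hypothetical. It only shows how a validator over partial sequences could be used to mask a generator's next-token choices.

```python
from typing import List


def can_extend(prefix: str, max_len: int) -> bool:
    """True if `prefix` can still be completed to a balanced string of length <= max_len.

    Hand-written stand-in (an assumption for illustration) for a learned validator.
    """
    depth = 0
    for ch in prefix:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            # A ")" without a matching "(" can never be repaired by later tokens.
            return False
    # Each bracket that is still open needs one more closing symbol.
    return len(prefix) + depth <= max_len


def allowed_next(prefix: str, max_len: int, vocab: str = "()") -> List[str]:
    """Tokens that keep the partial sequence completable."""
    return [tok for tok in vocab if can_extend(prefix + tok, max_len)]


if __name__ == "__main__":
    print(allowed_next("", 4))
    print(allowed_next("((", 4))
```

A sequence model constrained this way would sample only from `allowed_next(prefix, max_len)` at each step, so every finished sequence is valid by construction. In the paper's setting the oracle is only available for complete sequences, which is why the validator has to be learned.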

Trainable Greedy Decoding for Neural Machine Translation

Title Trainable Greedy Decoding for Neural Machine Translation
Authors Jiatao Gu, Kyunghyun Cho, Victor O. K. Li
Abstract Recent research in neural machine translation has largely focused on two aspects; neural network architectures and end-to-end learning algorithms. The problem of decoding, however, has received relatively little attention from the research community. In this paper, we solely focus on the problem of decoding given a trained neural machine translation model. Instead of trying to build a new decoding algorithm for any specific decoding objective, we propose the idea of trainable decoding algorithm in which we train a decoding algorithm to find a translation that maximizes an arbitrary decoding objective. More specifically, we design an actor that observes and manipulates the hidden state of the neural machine translation decoder and propose to train it using a variant of deterministic policy gradient. We extensively evaluate the proposed algorithm using four language pairs and two decoding objectives and show that we can indeed train a trainable greedy decoder that generates a better translation (in terms of a target decoding objective) with minimal computational overhead.
Tasks Machine Translation
Published 2017-02-08
URL http://arxiv.org/abs/1702.02429v1
PDF http://arxiv.org/pdf/1702.02429v1.pdf
PWC https://paperswithcode.com/paper/trainable-greedy-decoding-for-neural-machine
Repo https://github.com/kyunghyuncho/rl-pong
Framework pytorch

Wavelet Convolutional Neural Networks for Texture Classification

Title Wavelet Convolutional Neural Networks for Texture Classification
Authors Shin Fujieda, Kohei Takayama, Toshiya Hachisuka
Abstract Texture classification is an important and challenging problem in many image processing applications. While convolutional neural networks (CNNs) achieved significant successes for image classification, texture classification remains a difficult problem since textures usually do not contain enough information regarding the shape of object. In image processing, texture classification has been traditionally studied well with spectral analyses which exploit repeated structures in many textures. Since CNNs process images as-is in the spatial domain whereas spectral analyses process images in the frequency domain, these models have different characteristics in terms of performance. We propose a novel CNN architecture, wavelet CNNs, which integrates a spectral analysis into CNNs. Our insight is that the pooling layer and the convolution layer can be viewed as a limited form of a spectral analysis. Based on this insight, we generalize both layers to perform a spectral analysis with wavelet transform. Wavelet CNNs allow us to utilize spectral information which is lost in conventional CNNs but useful in texture classification. The experiments demonstrate that our model achieves better accuracy in texture classification than existing models. We also show that our model has significantly fewer parameters than CNNs, making our model easier to train with less memory.
Tasks Image Classification, Texture Classification
Published 2017-07-24
URL http://arxiv.org/abs/1707.07394v1
PDF http://arxiv.org/pdf/1707.07394v1.pdf
PWC https://paperswithcode.com/paper/wavelet-convolutional-neural-networks-for
Repo https://github.com/shinfj/WaveletCNN_for_TextureClassification
Framework caffe2
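
The claim that pooling can be viewed as a limited form of spectral analysis can be checked on a toy signal. The sketch below is an illustration only, not the authors' architecture: it uses a one-level, unnormalized 1D Haar transform (averages and half-differences rather than the orthonormal 1/sqrt(2) scaling), and the function names are hypothetical. Under those assumptions, stride-2 average pooling equals the low-pass branch, while the high-pass branch holds exactly the detail that pooling discards.

```python
from typing import List, Tuple


def haar_step(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of an unnormalized Haar transform on an even-length signal."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]    # local averages (low-pass)
    high = [(a - b) / 2 for a, b in pairs]   # local half-differences (high-pass)
    return low, high


def average_pool(signal: List[float]) -> List[float]:
    """Average pooling with window 2 and stride 2."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def inverse_haar_step(low: List[float], high: List[float]) -> List[float]:
    """Rebuild the original signal from both branches."""
    out: List[float] = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


if __name__ == "__main__":
    x = [4.0, 2.0, 5.0, 9.0]
    low, high = haar_step(x)
    print(low == average_pool(x))             # pooling keeps only the low-pass branch
    print(inverse_haar_step(low, high) == x)  # both branches together lose nothing
```

Keeping the high-pass branch instead of dropping it is the sense in which a wavelet layer retains spectral information that conventional pooling loses.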

Fast Meta-Learning for Adaptive Hierarchical Classifier Design

Title Fast Meta-Learning for Adaptive Hierarchical Classifier Design
Authors Gerrit J. J. van den Burg, Alfred O. Hero
Abstract We propose a new splitting criterion for a meta-learning approach to multiclass classifier design that adaptively merges the classes into a tree-structured hierarchy of increasingly difficult binary classification problems. The classification tree is constructed from empirical estimates of the Henze-Penrose bounds on the pairwise Bayes misclassification rates that rank the binary subproblems in terms of difficulty of classification. The proposed empirical estimates of the Bayes error rate are computed from the minimal spanning tree (MST) of the samples from each pair of classes. Moreover, a meta-learning technique is presented for quantifying the one-vs-rest Bayes error rate for each individual class from a single MST on the entire dataset. Extensive simulations on benchmark datasets show that the proposed hierarchical method can often be learned much faster than competing methods, while achieving competitive accuracy.
Tasks Meta-Learning
Published 2017-11-09
URL http://arxiv.org/abs/1711.03512v1
PDF http://arxiv.org/pdf/1711.03512v1.pdf
PWC https://paperswithcode.com/paper/fast-meta-learning-for-adaptive-hierarchical
Repo https://github.com/HeroResearchGroup/SmartSVM
Framework none
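
The MST-based estimate mentioned in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SmartSVM implementation: it builds a Euclidean minimal spanning tree over the pooled samples of two classes with Prim's algorithm, counts the edges that join points of different classes (the Friedman-Rafsky statistic R), and plugs R into the commonly cited Henze-Penrose divergence estimate 1 - R(m + n) / (2mn), clipped at zero. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def mst_edges(points: List[Point]) -> List[Tuple[int, int]]:
    """Edges of a Euclidean minimal spanning tree (Prim's algorithm, O(n^2))."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges: List[Tuple[int, int]] = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cross_edge_count(points: List[Point], labels: List[int]) -> int:
    """Number of MST edges joining points with different class labels."""
    return sum(labels[u] != labels[v] for u, v in mst_edges(points))


def hp_divergence_estimate(points: List[Point], labels: List[int]) -> float:
    """Plug-in estimate 1 - R(m + n) / (2mn) for binary labels 0/1, clipped at 0."""
    m = labels.count(0)
    n = labels.count(1)
    r = cross_edge_count(points, labels)
    return max(0.0, 1.0 - r * (m + n) / (2.0 * m * n))
```

Few cross-class edges give an estimate near 1 (an easy pair of classes); many cross-class edges push it toward 0 (a hard pair), which is the kind of ranking the abstract describes for ordering binary subproblems by difficulty.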

End-to-end optimization of goal-driven and visually grounded dialogue systems

Title End-to-end optimization of goal-driven and visually grounded dialogue systems
Authors Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
Abstract End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. Yet, most current approaches cast human-machine dialogue management as a supervised learning problem, aiming at predicting the next utterance of a participant given the full history of the dialogue. This vision is too simplistic to render the intrinsic planning problem inherent to dialogue as well as its grounded nature, making the context of a dialogue larger than the sole history. This is why only chit-chat and question answering tasks have been addressed so far using end-to-end architectures. In this paper, we introduce a Deep Reinforcement Learning method to optimize visually grounded task-oriented dialogues, based on the policy gradient algorithm. This approach is tested on a dataset of 120k dialogues collected through Mechanical Turk and provides encouraging results at solving both the problem of generating natural dialogues and the task of discovering a specific object in a complex picture.
Tasks Dialogue Management, Visual Question Answering
Published 2017-03-15
URL http://arxiv.org/abs/1703.05423v1
PDF http://arxiv.org/pdf/1703.05423v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-optimization-of-goal-driven-and
Repo https://github.com/ibrahimSouleiman/GuessWhat
Framework tf

Remote Sensing Image Scene Classification: Benchmark and State of the Art

Title Remote Sensing Image Scene Classification: Benchmark and State of the Art
Authors Gong Cheng, Junwei Han, Xiaoqiang Lu
Abstract Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various datasets or present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small scale of scene classes and the image numbers, the lack of image variations and diversity, and the saturation of accuracy. These limitations severely limit the development of new approaches especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale dataset, termed “NWPU-RESISC45”, which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale on the scene classes and the total image number, (ii) holds big variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset and the results are reported as a useful baseline for future research.
Tasks Scene Classification
Published 2017-03-01
URL http://arxiv.org/abs/1703.00121v1
PDF http://arxiv.org/pdf/1703.00121v1.pdf
PWC https://paperswithcode.com/paper/remote-sensing-image-scene-classification
Repo https://github.com/ArealTeamM2AIC/Remote-Sensing-Image
Framework pytorch

Visual Feature Attribution using Wasserstein GANs

Title Visual Feature Attribution using Wasserstein GANs
Authors Christian F. Baumgartner, Lisa M. Koch, Kerem Can Tezcan, Jia Xi Ang, Ender Konukoglu
Abstract Attributing the pixels of an input image to a certain category is an important and well-studied problem in computer vision, with applications ranging from weakly supervised localisation to understanding hidden effects in the data. In recent years, approaches based on interpreting a previously trained neural network classifier have become the de facto state-of-the-art and are commonly used on medical as well as natural image datasets. In this paper, we discuss a limitation of these approaches which may lead to only a subset of the category specific features being detected. To address this problem we develop a novel feature attribution technique based on Wasserstein Generative Adversarial Networks (WGAN), which does not suffer from this limitation. We show that our proposed method performs substantially better than the state-of-the-art for visual attribution on a synthetic dataset and on real 3D neuroimaging data from patients with mild cognitive impairment (MCI) and Alzheimer’s disease (AD). For AD patients the method produces compellingly realistic disease effect maps which are very close to the observed effects.
Tasks
Published 2017-11-24
URL http://arxiv.org/abs/1711.08998v3
PDF http://arxiv.org/pdf/1711.08998v3.pdf
PWC https://paperswithcode.com/paper/visual-feature-attribution-using-wasserstein
Repo https://github.com/orobix/Visual-Feature-Attribution-Using-Wasserstein-GANs-Pytorch
Framework pytorch

Action-depedent Control Variates for Policy Optimization via Stein’s Identity

Title Action-depedent Control Variates for Policy Optimization via Stein’s Identity
Authors Hao Liu, Yihao Feng, Yi Mao, Dengyong Zhou, Jian Peng, Qiang Liu
Abstract Policy gradient methods have achieved remarkable successes in solving challenging reinforcement learning problems. However, it still often suffers from the large variance issue on policy gradient estimation, which leads to poor sample efficiency during training. In this work, we propose a control variate method to effectively reduce variance for policy gradient methods. Motivated by the Stein’s identity, our method extends the previous control variate methods used in REINFORCE and advantage actor-critic by introducing more general action-dependent baseline functions. Empirical studies show that our method significantly improves the sample efficiency of the state-of-the-art policy gradient approaches.
Tasks Policy Gradient Methods
Published 2017-10-30
URL http://arxiv.org/abs/1710.11198v4
PDF http://arxiv.org/pdf/1710.11198v4.pdf
PWC https://paperswithcode.com/paper/action-depedent-control-variates-for-policy
Repo https://github.com/brain-research/mirage-rl
Framework tf
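
To show what a control variate does to a policy gradient estimate, here is a small Python sketch. It covers only the classical action-independent baseline used in REINFORCE, which the abstract says the paper generalizes; it does not implement the Stein-based action-dependent baseline. The setting (a one-step softmax bandit with made-up rewards) and all names are assumptions for illustration. Mean and variance are computed by exact enumeration over actions, so no sampling noise is involved.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def estimator_stats(
    logits: List[float], rewards: List[float], baseline: float
) -> Tuple[List[float], float]:
    """Exact mean and total variance of the score-function gradient estimator.

    The per-action sample is grad log pi(a) * (r_a - baseline), where
    grad log pi(a) = onehot(a) - probs for a softmax policy.
    """
    probs = softmax(logits)
    k = len(probs)
    samples = [
        [((1.0 if i == a else 0.0) - probs[i]) * (rewards[a] - baseline) for i in range(k)]
        for a in range(k)
    ]
    mean = [sum(probs[a] * samples[a][i] for a in range(k)) for i in range(k)]
    variance = sum(
        probs[a] * sum((samples[a][i] - mean[i]) ** 2 for i in range(k))
        for a in range(k)
    )
    return mean, variance


if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0]   # uniform policy over three actions
    rewards = [1.0, 2.0, 4.0]  # hypothetical rewards
    expected_reward = sum(p * r for p, r in zip(softmax(logits), rewards))
    print(estimator_stats(logits, rewards, baseline=0.0))
    print(estimator_stats(logits, rewards, baseline=expected_reward))
```

Both calls return the same mean gradient, because subtracting a constant baseline leaves the estimator unbiased, while the variance is smaller with the baseline in this example. An action-dependent baseline needs an extra correction term to stay unbiased, which is the part the paper derives from Stein's identity.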

Deep Image Prior

Title Deep Image Prior
Authors Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
Abstract Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, super-resolution, and inpainting. Furthermore, the same prior can be used to invert deep neural representations to diagnose them, and to restore images based on flash-no flash input pairs. Apart from its diverse applications, our approach highlights the inductive bias captured by standard generator network architectures. It also bridges the gap between two very popular families of image restoration methods: learning-based methods using deep convolutional networks and learning-free methods based on handcrafted image priors such as self-similarity. Code and supplementary material are available at https://dmitryulyanov.github.io/deep_image_prior .
Tasks Denoising, Image Denoising, Image Generation, Image Inpainting, Image Restoration, Jpeg Compression Artifact Reduction, Super-Resolution
Published 2017-11-29
URL http://arxiv.org/abs/1711.10925v3
PDF http://arxiv.org/pdf/1711.10925v3.pdf
PWC https://paperswithcode.com/paper/deep-image-prior
Repo https://github.com/rsin46/deep-image-prior-keras
Framework none

NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs

Title NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs
Authors Paolo Meloni, Alessandro Capotondi, Gianfranco Deriu, Michele Brian, Francesco Conti, Davide Rossi, Luigi Raffo, Luca Benini
Abstract Deep convolutional neural networks (CNNs) obtain outstanding results in tasks that require human-level understanding of data, like image or speech recognition. However, their computational load is significant, motivating the development of CNN-specialized accelerators. This work presents NEURAghe, a flexible and efficient hardware/software solution for the acceleration of CNNs on Zynq SoCs. NEURAghe leverages the synergistic usage of Zynq ARM cores and of a powerful and flexible Convolution-Specific Processor deployed on the reconfigurable logic. The Convolution-Specific Processor embeds both a convolution engine and a programmable soft core, releasing the ARM processors from most of the supervision duties and allowing the accelerator to be controlled by software at an ultra-fine granularity. This methodology opens the way for cooperative heterogeneous computing: while the accelerator takes care of the bulk of the CNN workload, the ARM cores can seamlessly execute hard-to-accelerate parts of the computational graph, taking advantage of the NEON vector engines to further speed up computation. Through the companion NeuDNN SW stack, NEURAghe supports end-to-end CNN-based classification with a peak performance of 169 Gops/s, and an energy efficiency of 17 Gops/W. Thanks to our heterogeneous computing model, our platform improves upon the state-of-the-art, achieving a frame rate of 5.5 fps on the end-to-end execution of VGG-16, and 6.6 fps on ResNet-18.
Tasks Speech Recognition
Published 2017-12-04
URL http://arxiv.org/abs/1712.00994v1
PDF http://arxiv.org/pdf/1712.00994v1.pdf
PWC https://paperswithcode.com/paper/neuraghe-exploiting-cpu-fpga-synergies-for
Repo https://github.com/Dhananjayadmd/DNN_MP
Framework none

MACA: A Modular Architecture for Conversational Agents

Title MACA: A Modular Architecture for Conversational Agents
Authors Hoai Phuoc Truong, Prasanna Parthasarathi, Joelle Pineau
Abstract We propose a software architecture designed to ease the implementation of dialogue systems. The Modular Architecture for Conversational Agents (MACA) uses a plug-n-play style that allows quick prototyping, thereby facilitating the development of new techniques and the reproduction of previous work. The architecture separates the domain of the conversation from the agent’s dialogue strategy, and as such can be easily extended to multiple domains. MACA provides tools to host dialogue agents on Amazon Mechanical Turk (mTurk) for data collection and allows processing of other sources of training data. The current version of the framework already incorporates several domains and existing dialogue strategies from the recent literature.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00673v2
PDF http://arxiv.org/pdf/1705.00673v2.pdf
PWC https://paperswithcode.com/paper/maca-a-modular-architecture-for
Repo https://github.com/ppartha03/MACA
Framework none

DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

Title DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
Authors Irina Higgins, Arka Pal, Andrei A. Rusu, Loic Matthey, Christopher P Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner
Abstract Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before learning to act. DARLA’s vision is based on learning a disentangled representation of the observed environment. Once DARLA can see, it is able to acquire source policies that are robust to many domain shifts - even with no access to the target domain. DARLA significantly outperforms conventional baselines in zero-shot domain adaptation scenarios, an effect that holds across a variety of RL environments (Jaco arm, DeepMind Lab) and base RL algorithms (DQN, A3C and EC).
Tasks Domain Adaptation, Representation Learning
Published 2017-07-26
URL http://arxiv.org/abs/1707.08475v2
PDF http://arxiv.org/pdf/1707.08475v2.pdf
PWC https://paperswithcode.com/paper/darla-improving-zero-shot-transfer-in
Repo https://github.com/BCHoagland/DARLA
Framework pytorch