July 29, 2019

Paper Group AWR 191

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection

Title Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
Authors Hongyang Li, Yu Liu, Wanli Ouyang, Xiaogang Wang
Abstract In this paper, we propose a zoom-out-and-in network for generating object proposals. A key observation is that it is difficult to classify anchors of different sizes with the same set of features. Anchors of different sizes should be placed according to depth within the network: smaller boxes on high-resolution layers with a smaller stride, and larger boxes on low-resolution counterparts with a larger stride. Inspired by the conv/deconv structure, we fully leverage the low-level local details and high-level regional semantics from two feature map streams, which are complementary to each other, to identify the objectness in an image. A map attention decision (MAD) unit is further proposed to aggressively search for neuron activations among the two streams and attend to the ones that contribute most to the feature learning of the final loss. The unit serves as a decision maker that adaptively activates maps along certain channels with the sole purpose of optimizing the overall training loss. One advantage of MAD is that the learned weights enforced on each feature channel are predicted on-the-fly based on the input context, which is more suitable than the fixed enforcement of a convolutional kernel. Experimental results on three datasets, PASCAL VOC 2007, ImageNet DET, and MS COCO, demonstrate the effectiveness of our proposed algorithm over other state-of-the-art methods, in terms of average recall (AR) for region proposal and average precision (AP) for object detection.
Tasks Object Detection
Published 2017-09-13
URL http://arxiv.org/abs/1709.04347v2
PDF http://arxiv.org/pdf/1709.04347v2.pdf
PWC https://paperswithcode.com/paper/zoom-out-and-in-network-with-map-attention
Repo https://github.com/hli2020/zoom_network
Framework none

Generalizing Hamiltonian Monte Carlo with Neural Networks

Title Generalizing Hamiltonian Monte Carlo with Neural Networks
Authors Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein
Abstract We present a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution. Our method generalizes Hamiltonian Monte Carlo and is trained to maximize expected squared jumped distance, a proxy for mixing speed. We demonstrate large empirical gains on a collection of simple but challenging distributions, for instance achieving a 106x improvement in effective sample size in one case, and mixing when standard HMC makes no measurable progress in a second. Finally, we show quantitative and qualitative gains on a real-world task: latent-variable generative modeling. We release an open source TensorFlow implementation of the algorithm.
Tasks
Published 2017-11-25
URL http://arxiv.org/abs/1711.09268v3
PDF http://arxiv.org/pdf/1711.09268v3.pdf
PWC https://paperswithcode.com/paper/generalizing-hamiltonian-monte-carlo-with
Repo https://github.com/brain-research/l2hmc
Framework tf
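
The abstract above names expected squared jumped distance as the training objective. As a rough illustration, the sketch below computes the basic quantity for a batch of proposals: the squared Euclidean distance of each proposed move, weighted by its acceptance probability and averaged. The function name and the toy numbers are assumptions made for this example, and the loss actually used in the paper and its TensorFlow code may include further terms.

```python
def expected_squared_jump(x, x_proposed, accept_prob):
    """Average of accept_prob * ||x_proposed - x||^2 over a batch.

    Hypothetical helper for illustration; not taken from the l2hmc repo.
    """
    total = 0.0
    for xi, xpi, a in zip(x, x_proposed, accept_prob):
        sq_dist = sum((p - q) ** 2 for p, q in zip(xpi, xi))
        total += a * sq_dist
    return total / len(x)


# Toy batch: two current states, two proposals, two acceptance probabilities.
x = [[0.0, 0.0], [1.0, 1.0]]
x_new = [[3.0, 4.0], [1.0, 2.0]]
accept = [0.5, 1.0]
esjd = expected_squared_jump(x, x_new, accept)
```

A sampler whose proposals move far and are accepted often scores high on this quantity, which is why it can serve as a proxy for mixing speed.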

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning

Title OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Authors Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup
Abstract Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations. Yet in reality, the corpus of demonstrations may contain trajectories arising from a diverse set of underlying reward functions rather than a single one. Thus, in inverse reinforcement learning, it is useful to consider such a decomposition. The options framework in reinforcement learning is specifically designed to decompose policies in a similar light. We therefore extend the options framework and propose a method to simultaneously recover reward options in addition to policy options. We leverage adversarial methods to learn joint reward-policy options using only observed expert states. We show that this approach works well in both simple and complex continuous control tasks and shows significant performance increases in one-shot transfer learning.
Tasks Continuous Control, Imitation Learning, Transfer Learning
Published 2017-09-20
URL http://arxiv.org/abs/1709.06683v2
PDF http://arxiv.org/pdf/1709.06683v2.pdf
PWC https://paperswithcode.com/paper/optiongan-learning-joint-reward-policy
Repo https://github.com/Breakend/OptionGAN
Framework tf

Adversarial Information Factorization

Title Adversarial Information Factorization
Authors Antonia Creswell, Yumnah Mohamied, Biswa Sengupta, Anil A Bharath
Abstract We propose a novel generative model architecture designed to learn representations for images that factor out a single attribute from the rest of the representation. A single object may have many attributes which when altered do not change the identity of the object itself. Consider the human face; the identity of a particular person is independent of whether or not they happen to be wearing glasses. The attribute of wearing glasses can be changed without changing the identity of the person. However, the ability to manipulate and alter image attributes without altering the object identity is not a trivial task. Here, we are interested in learning a representation of the image that separates the identity of an object (such as a human face) from an attribute (such as ‘wearing glasses’). We demonstrate the success of our factorization approach by using the learned representation to synthesize the same face with and without a chosen attribute. We refer to this specific synthesis process as image attribute manipulation. We further demonstrate that our model achieves competitive scores, with state of the art, on a facial attribute classification task.
Tasks Facial Attribute Classification, Image Generation
Published 2017-11-14
URL http://arxiv.org/abs/1711.05175v2
PDF http://arxiv.org/pdf/1711.05175v2.pdf
PWC https://paperswithcode.com/paper/adversarial-information-factorization
Repo https://github.com/ToniCreswell/attribute-cVAEGAN
Framework pytorch

Neural Extractive Summarization with Side Information

Title Neural Extractive Summarization with Side Information
Authors Shashi Narayan, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata
Abstract Most extractive summarization methods focus on the main body of the document from which sentences need to be extracted. However, the gist of the document may lie in side information, such as the title and image captions which are often available for newswire articles. We propose to explore side information in the context of single-document extractive summarization. We develop a framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor with attention over side information. We evaluate our model on a large scale news dataset. We show that extractive summarization with side information consistently outperforms its counterpart that does not use any side information, in terms of both informativeness and fluency.
Tasks Document Summarization, Image Captioning
Published 2017-04-14
URL http://arxiv.org/abs/1704.04530v2
PDF http://arxiv.org/pdf/1704.04530v2.pdf
PWC https://paperswithcode.com/paper/neural-extractive-summarization-with-side
Repo https://github.com/shashiongithub/sidenet
Framework tf

Weakly supervised 3D Reconstruction with Adversarial Constraint

Title Weakly supervised 3D Reconstruction with Adversarial Constraint
Authors JunYoung Gwak, Christopher B. Choy, Animesh Garg, Manmohan Chandraker, Silvio Savarese
Abstract Supervised 3D reconstruction has witnessed significant progress through the use of deep neural networks. However, this increase in performance requires large-scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative to expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enables perspective projection and backpropagation. Additionally, since 3D reconstruction from masks is an ill-posed problem, we propose to constrain the 3D reconstruction to the manifold of unlabeled realistic 3D shapes that match mask observations. We demonstrate that learning a log-barrier solution to this constrained optimization problem resembles the GAN objective, enabling the use of existing tools for training GANs. We evaluate and analyze the manifold constrained reconstruction on various datasets for single and multi-view reconstruction of both synthetic and real images.
Tasks 3D Reconstruction
Published 2017-05-31
URL http://arxiv.org/abs/1705.10904v2
PDF http://arxiv.org/pdf/1705.10904v2.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-3d-reconstruction-with
Repo https://github.com/chrischoy/3D-R2N2
Framework none

Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications

Title Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications
Authors Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik, Carola-Bibiane Schönlieb
Abstract We propose a stochastic extension of the primal-dual hybrid gradient algorithm studied by Chambolle and Pock in 2011 to solve saddle point problems that are separable in the dual variable. The analysis is carried out for general convex-concave saddle point problems and problems that are either partially smooth / strongly convex or fully smooth / strongly convex. We perform the analysis for arbitrary samplings of dual variables, and obtain known deterministic results as a special case. Several variants of our stochastic method significantly outperform the deterministic variant on a variety of imaging tasks.
Tasks
Published 2017-06-15
URL http://arxiv.org/abs/1706.04957v2
PDF http://arxiv.org/pdf/1706.04957v2.pdf
PWC https://paperswithcode.com/paper/stochastic-primal-dual-hybrid-gradient
Repo https://github.com/mehrhardt/spdhg
Framework none

TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

Title TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
Authors Danijar Hafner, James Davidson, Vincent Vanhoucke
Abstract We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow. We simulate multiple environments in parallel, and group them to perform the neural network computation on a batch rather than individual observations. This allows the TensorFlow execution engine to parallelize computation, without the need for manual synchronization. Environments are stepped in separate Python processes to progress them in parallel without interference of the global interpreter lock. As part of this project, we introduce BatchPPO, an efficient implementation of the proximal policy optimization algorithm. By open sourcing TensorFlow Agents, we hope to provide a flexible starting point for future projects that accelerates future research in the field.
Tasks
Published 2017-09-08
URL http://arxiv.org/abs/1709.02878v2
PDF http://arxiv.org/pdf/1709.02878v2.pdf
PWC https://paperswithcode.com/paper/tensorflow-agents-efficient-batched
Repo https://github.com/brain-research/batch-ppo
Framework tf

Neural SLAM: Learning to Explore with External Memory

Title Neural SLAM: Learning to Explore with External Memory
Authors Jingwei Zhang, Lei Tai, Joschka Boedecker, Wolfram Burgard, Ming Liu
Abstract We present an approach for agents to learn representations of a global map from sensor data, to aid their exploration in new environments. To achieve this, we embed procedures mimicking those of traditional Simultaneous Localization and Mapping (SLAM) into the soft-attention-based addressing of external memory architectures, in which the external memory acts as an internal representation of the environment. This structure encourages the evolution of SLAM-like behaviors inside a completely differentiable deep neural network. We show that this approach can help reinforcement learning agents to successfully explore new environments where long-term memory is essential. We validate our approach in both challenging grid-world environments and preliminary Gazebo experiments. A video of our experiments can be found at: https://goo.gl/G2Vu5y.
Tasks Simultaneous Localization and Mapping
Published 2017-06-29
URL http://arxiv.org/abs/1706.09520v6
PDF http://arxiv.org/pdf/1706.09520v6.pdf
PWC https://paperswithcode.com/paper/neural-slam-learning-to-explore-with-external
Repo https://github.com/jingweiz/pytorch-dnc
Framework pytorch

A Constrained, Weighted-L1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity Graphs

Title A Constrained, Weighted-L1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity Graphs
Authors Chandan Singh, Beilun Wang, Yanjun Qi
Abstract Determining functional brain connectivity is crucial to understanding the brain and neural differences underlying disorders such as autism. Recent studies have used Gaussian graphical models to learn brain connectivity via statistical dependencies across brain regions from neuroimaging. However, previous studies often fail to properly incorporate priors tailored to neuroscience, such as preferring shorter connections. To remedy this problem, the paper here introduces a novel, weighted-$\ell_1$, multi-task graphical model (W-SIMULE). This model elegantly incorporates a flexible prior, along with a parallelizable formulation. Additionally, W-SIMULE extends the often-used Gaussian assumption, leading to considerable performance increases. Here, applications to fMRI data show that W-SIMULE succeeds in determining functional connectivity in terms of (1) log-likelihood, (2) finding edges that differentiate groups, and (3) classifying different groups based on their connectivity, achieving 58.6% accuracy on the ABIDE dataset. Having established W-SIMULE’s effectiveness, it links four key areas to autism, all of which are consistent with the literature. Due to its elegant domain adaptivity, W-SIMULE can be readily applied to various data types to effectively estimate connectivity.
Tasks Connectivity Estimation
Published 2017-09-13
URL http://arxiv.org/abs/1709.04090v2
PDF http://arxiv.org/pdf/1709.04090v2.pdf
PWC https://paperswithcode.com/paper/a-constrained-weighted-l1-minimization
Repo https://github.com/QData/SIMULE
Framework none

Exploiting temporal information for 3D pose estimation

Title Exploiting temporal information for 3D pose estimation
Authors Mir Rayat Imtiaz Hossain, James J. Little
Abstract In this work, we address the problem of 3D human pose estimation from a sequence of 2D human poses. Although the recent success of deep networks has led many state-of-the-art methods for 3D pose estimation to train deep networks end-to-end to predict from images directly, the top-performing approaches have shown the effectiveness of dividing the task of 3D pose estimation into two steps: using a state-of-the-art 2D pose estimator to estimate the 2D pose from images and then mapping them into 3D space. They also showed that a low-dimensional representation like 2D locations of a set of joints can be discriminative enough to estimate 3D pose with high accuracy. However, estimating 3D pose for individual frames leads to temporally incoherent estimates, because independent errors in each frame cause jitter. Therefore, in this work we utilize the temporal information across a sequence of 2D joint locations to estimate a sequence of 3D poses. We designed a sequence-to-sequence network composed of layer-normalized LSTM units with shortcut connections connecting the input to the output on the decoder side, and imposed a temporal smoothness constraint during training. We found that the knowledge of temporal consistency improves the best reported result on the Human3.6M dataset by approximately 12.2% and helps our network to recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.
Tasks 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published 2017-11-23
URL http://arxiv.org/abs/1711.08585v4
PDF http://arxiv.org/pdf/1711.08585v4.pdf
PWC https://paperswithcode.com/paper/exploiting-temporal-information-for-3d-pose
Repo https://github.com/rayat137/Pose_3D
Framework tf
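
The abstract above mentions a temporal smoothness constraint imposed during training but does not give its form. One common way to express such a constraint is a penalty on the first-order difference between consecutive predicted poses, sketched below. The function name, the flat per-frame coordinate lists, and the first-order form are all assumptions for illustration; the paper's actual term may be weighted or defined differently.

```python
def temporal_smoothness_penalty(poses):
    """Mean squared difference between consecutive frames.

    `poses` is a list of frames, each a flat list of joint coordinates.
    Hypothetical helper; not taken from the Pose_3D repo.
    """
    if len(poses) < 2:
        return 0.0
    total = 0.0
    for prev, curr in zip(poses, poses[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, curr))
    return total / (len(poses) - 1)


# A jittery sequence is penalized more than a slowly moving one.
jittery = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 2.0, 0.0]]
steady = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.2, 0.0, 0.0]]
jittery_penalty = temporal_smoothness_penalty(jittery)
steady_penalty = temporal_smoothness_penalty(steady)
```

Added to a per-frame reconstruction loss, a term like this discourages frame-to-frame jumps that are not supported by the input sequence.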

Noisy Networks for Exploration

Title Noisy Networks for Exploration
Authors Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
Abstract We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent’s policy can be used to aid efficient exploration. The parameters of the noise are learned with gradient descent along with the remaining network weights. NoisyNet is straightforward to implement and adds little computational overhead. We find that replacing the conventional exploration heuristics for A3C, DQN and dueling agents (entropy reward and $\epsilon$-greedy respectively) with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub to super-human performance.
Tasks Atari Games, Efficient Exploration
Published 2017-06-30
URL https://arxiv.org/abs/1706.10295v3
PDF https://arxiv.org/pdf/1706.10295v3.pdf
PWC https://paperswithcode.com/paper/noisy-networks-for-exploration
Repo https://github.com/LilTwo/DRL-using-PyTorch
Framework pytorch
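
The core idea in the abstract above, parametric noise added to the weights, can be illustrated with the forward pass of a single noisy linear layer. The sketch below draws independent Gaussian noise for every weight and bias and scales it by a per-parameter sigma. All names are assumptions for this example; it covers only the forward pass, whereas in the paper the mu and sigma parameters are learned by gradient descent.

```python
import random


def noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng):
    """Compute y = (w_mu + w_sigma * eps_w) x + (b_mu + b_sigma * eps_b).

    Fresh standard-normal noise is drawn for every weight and bias.
    Hypothetical helper for illustration only.
    """
    out = []
    for j in range(len(b_mu)):
        acc = b_mu[j] + b_sigma[j] * rng.gauss(0.0, 1.0)
        for i, xi in enumerate(x):
            w = w_mu[j][i] + w_sigma[j][i] * rng.gauss(0.0, 1.0)
            acc += w * xi
        out.append(acc)
    return out


x = [1.0, 1.0]
w_mu = [[1.0, 2.0], [0.0, -1.0]]
b_mu = [0.5, 0.0]
zeros_w = [[0.0, 0.0], [0.0, 0.0]]
zeros_b = [0.0, 0.0]

# With sigma set to zero the layer reduces to an ordinary linear layer.
deterministic = noisy_linear(x, w_mu, zeros_w, b_mu, zeros_b, random.Random(0))

# With non-zero sigma, repeated calls give different outputs, which is the
# source of the exploration behaviour described in the abstract.
w_sigma = [[0.1, 0.1], [0.1, 0.1]]
b_sigma = [0.1, 0.1]
rng = random.Random(0)
sample_a = noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng)
sample_b = noisy_linear(x, w_mu, w_sigma, b_mu, b_sigma, rng)
```

Because the noise scale is a parameter of the layer, the amount of randomness in the policy can change during training rather than following a hand-tuned schedule.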

RankIQA: Learning from Rankings for No-reference Image Quality Assessment

Title RankIQA: Learning from Rankings for No-reference Image Quality Assessment
Authors Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov
Abstract We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform the state-of-the-art in full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.
Tasks Image Quality Assessment, No-Reference Image Quality Assessment
Published 2017-07-26
URL http://arxiv.org/abs/1707.08347v1
PDF http://arxiv.org/pdf/1707.08347v1.pdf
PWC https://paperswithcode.com/paper/rankiqa-learning-from-rankings-for-no
Repo https://github.com/xialeiliu/RankIQA
Framework none
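
The efficiency claim in the abstract above rests on scoring every image in a batch once and then forming the loss from all pairs. The sketch below shows that pattern with a margin-based hinge loss over pairs whose relative quality is known from the synthetic distortion level. The hinge form, the margin value, and all names are assumptions for illustration; the abstract does not specify the loss.

```python
def pairwise_ranking_loss(scores, distortion_levels, margin=1.0):
    """Mean hinge loss over all pairs with a known quality order.

    A lower distortion level means higher quality, so its predicted score
    should exceed the other image's score by at least `margin`.
    Hypothetical helper; not taken from the RankIQA repo.
    """
    losses = []
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if distortion_levels[i] < distortion_levels[j]:
                losses.append(max(0.0, margin - (scores[i] - scores[j])))
    return sum(losses) / len(losses) if losses else 0.0


levels = [0, 1, 2]  # increasing synthetic distortion
good = pairwise_ranking_loss([3.0, 2.0, 0.0], levels)
bad = pairwise_ranking_loss([0.0, 2.0, 3.0], levels)
```

Each score is computed once per image, yet every ordered pair in the batch contributes to the loss, which avoids running two network branches per pair.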

Efficient B-mode Ultrasound Image Reconstruction from Sub-sampled RF Data using Deep Learning

Title Efficient B-mode Ultrasound Image Reconstruction from Sub-sampled RF Data using Deep Learning
Authors Yeo Hun Yoon, Shujaat Khan, Jaeyoung Huh, Jong Chul Ye
Abstract In portable, three dimensional, and ultra-fast ultrasound imaging systems, there is an increasing demand for the reconstruction of high quality images from a limited number of radio-frequency (RF) measurements due to receiver (Rx) or transmit (Xmit) event sub-sampling. However, due to the presence of side lobe artifacts from RF sub-sampling, the standard beamformer often produces blurry images with less contrast, which are unsuitable for diagnostic purposes. Existing compressed sensing approaches often require either hardware changes or computationally expensive algorithms, but their quality improvements are limited. To address this problem, here we propose a novel deep learning approach that directly interpolates the missing RF data by utilizing redundancy in the Rx-Xmit plane. Our extensive experimental results using sub-sampled RF data from a multi-line acquisition B-mode system confirm that the proposed method can effectively reduce the data rate without sacrificing image quality.
Tasks Image Reconstruction
Published 2017-12-17
URL http://arxiv.org/abs/1712.06096v3
PDF http://arxiv.org/pdf/1712.06096v3.pdf
PWC https://paperswithcode.com/paper/efficient-b-mode-ultrasound-image
Repo https://github.com/BISPL-JYH/Ultrasound_TMI
Framework none

Community detection with spiking neural networks for neuromorphic hardware

Title Community detection with spiking neural networks for neuromorphic hardware
Authors Kathleen E. Hamilton, Neena Imam, Travis S. Humble
Abstract We present results related to the performance of an algorithm for community detection which incorporates event-driven computation. We define a mapping which takes a graph G to a system of spiking neurons. Using a fully connected spiking neuron system, with both inhibitory and excitatory synaptic connections, the firing patterns of neurons within the same community can be distinguished from firing patterns of neurons in different communities. On a random graph with 128 vertices and known community structure we show that by using binary decoding and a Hamming-distance based metric, individual communities can be identified from spike train similarities. Using bipolar decoding and finite rate thresholding, we verify that inhibitory connections prevent the spread of spiking patterns.
Tasks Community Detection
Published 2017-11-20
URL http://arxiv.org/abs/1711.07361v1
PDF http://arxiv.org/pdf/1711.07361v1.pdf
PWC https://paperswithcode.com/paper/community-detection-with-spiking-neural
Repo https://github.com/abasak24/ece594Neuromorphic
Framework none
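
The abstract above identifies communities by comparing binary-decoded spike trains with a Hamming-distance-based metric. The sketch below shows only that comparison step on toy data: neurons whose binary spike trains differ in few time bins are candidates for the same community. The spike trains, neuron names, and helper names are invented for illustration and do not come from the paper or the linked repository.

```python
def hamming_distance(a, b):
    """Number of time bins in which two binary spike trains differ."""
    if len(a) != len(b):
        raise ValueError("spike trains must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(trains):
    """Hamming distance for every unordered pair of named spike trains."""
    names = sorted(trains)
    return {
        (p, q): hamming_distance(trains[p], trains[q])
        for idx, p in enumerate(names)
        for q in names[idx + 1:]
    }


# Toy binary decoding: 1 means the neuron fired in that time bin.
trains = {
    "n0": [1, 0, 1, 0],
    "n1": [1, 0, 1, 1],
    "n2": [0, 1, 0, 1],
}
distances = pairwise_distances(trains)
```

Here `n0` and `n1` fire in nearly the same bins, while `n2` fires in the complementary ones, so a distance threshold would separate them into two groups.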