July 29, 2019

Paper Group AWR 191

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection


Title	Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
Authors	Hongyang Li, Yu Liu, Wanli Ouyang, Xiaogang Wang
Abstract	In this paper, we propose a zoom-out-and-in network for generating object proposals. A key observation is that it is difficult to classify anchors of different sizes with the same set of features. Anchors of different sizes should be placed at different depths within a network: smaller boxes on high-resolution layers with a smaller stride, and larger boxes on low-resolution layers with a larger stride. Inspired by the conv/deconv structure, we fully leverage the low-level local details and high-level regional semantics from two feature map streams, which are complementary to each other, to identify the objectness in an image. A map attention decision (MAD) unit is further proposed to aggressively search for neuron activations among the two streams and attend to the ones that contribute most to the feature learning of the final loss. The unit serves as a decision-maker to adaptively activate maps along certain channels with the sole purpose of optimizing the overall training loss. One advantage of MAD is that the learned weights enforced on each feature channel are predicted on the fly based on the input context, which is more suitable than the fixed enforcement of a convolutional kernel. Experimental results on three datasets (PASCAL VOC 2007, ImageNet DET, and MS COCO) demonstrate the effectiveness of our proposed algorithm over other state-of-the-art methods, in terms of average recall (AR) for region proposal and average precision (AP) for object detection.
Tasks	Object Detection
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04347v2
PDF	http://arxiv.org/pdf/1709.04347v2.pdf
PWC	https://paperswithcode.com/paper/zoom-out-and-in-network-with-map-attention
Repo	https://github.com/hli2020/zoom_network
Framework	none

Generalizing Hamiltonian Monte Carlo with Neural Networks


Title	Generalizing Hamiltonian Monte Carlo with Neural Networks
Authors	Daniel Levy, Matthew D. Hoffman, Jascha Sohl-Dickstein
Abstract	We present a general-purpose method to train Markov chain Monte Carlo kernels, parameterized by deep neural networks, that converge and mix quickly to their target distribution. Our method generalizes Hamiltonian Monte Carlo and is trained to maximize expected squared jumped distance, a proxy for mixing speed. We demonstrate large empirical gains on a collection of simple but challenging distributions, for instance achieving a 106x improvement in effective sample size in one case, and mixing when standard HMC makes no measurable progress in a second. Finally, we show quantitative and qualitative gains on a real-world task: latent-variable generative modeling. We release an open source TensorFlow implementation of the algorithm.
Tasks
Published	2017-11-25
URL	http://arxiv.org/abs/1711.09268v3
PDF	http://arxiv.org/pdf/1711.09268v3.pdf
PWC	https://paperswithcode.com/paper/generalizing-hamiltonian-monte-carlo-with
Repo	https://github.com/brain-research/l2hmc
Framework	tf
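
The abstract above refers to Hamiltonian Monte Carlo and to expected squared jumped distance as the training objective. As a rough illustration only, here is a minimal sketch of a standard (not learned) HMC leapfrog proposal on a 1D standard normal target, together with an acceptance-weighted squared jump for a single proposal. The function names, the target distribution, and the choice to weight the squared jump by the Metropolis acceptance probability are assumptions made for this sketch; it is not the paper's neural-network-parameterized kernel.

```python
import math


def grad_potential(x):
    # Assumed target: standard normal, so U(x) = x^2 / 2 and dU/dx = x.
    return x


def hamiltonian(x, v):
    # Potential energy plus kinetic energy (unit mass).
    return 0.5 * x * x + 0.5 * v * v


def leapfrog(x, v, step, n_steps):
    # Half step for momentum, alternating full steps, final half step.
    v = v - 0.5 * step * grad_potential(x)
    for i in range(n_steps):
        x = x + step * v
        if i < n_steps - 1:
            v = v - step * grad_potential(x)
    v = v - 0.5 * step * grad_potential(x)
    return x, v


def weighted_squared_jump(x, v, x_new, v_new):
    # Metropolis acceptance probability times the squared distance moved.
    accept = min(1.0, math.exp(hamiltonian(x, v) - hamiltonian(x_new, v_new)))
    return accept * (x_new - x) ** 2


x0, v0 = 1.0, 0.5
x1, v1 = leapfrog(x0, v0, step=0.1, n_steps=10)
print(weighted_squared_jump(x0, v0, x1, v1))
```

Averaging `weighted_squared_jump` over many starting points and momenta would give a Monte Carlo estimate of the quantity; larger values suggest the chain moves further per accepted step.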

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning


Title	OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
Authors	Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup
Abstract	Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations. Yet in reality, the corpus of demonstrations may contain trajectories arising from a diverse set of underlying reward functions rather than a single one. Thus, in inverse reinforcement learning, it is useful to consider such a decomposition. The options framework in reinforcement learning is specifically designed to decompose policies in a similar light. We therefore extend the options framework and propose a method to simultaneously recover reward options in addition to policy options. We leverage adversarial methods to learn joint reward-policy options using only observed expert states. We show that this approach works well in both simple and complex continuous control tasks and shows significant performance increases in one-shot transfer learning.
Tasks	Continuous Control, Imitation Learning, Transfer Learning
Published	2017-09-20
URL	http://arxiv.org/abs/1709.06683v2
PDF	http://arxiv.org/pdf/1709.06683v2.pdf
PWC	https://paperswithcode.com/paper/optiongan-learning-joint-reward-policy
Repo	https://github.com/Breakend/OptionGAN
Framework	tf

Adversarial Information Factorization


Title	Adversarial Information Factorization
Authors	Antonia Creswell, Yumnah Mohamied, Biswa Sengupta, Anil A Bharath
Abstract	We propose a novel generative model architecture designed to learn representations for images that factor out a single attribute from the rest of the representation. A single object may have many attributes which, when altered, do not change the identity of the object itself. Consider the human face; the identity of a particular person is independent of whether or not they happen to be wearing glasses. The attribute of wearing glasses can be changed without changing the identity of the person. However, manipulating image attributes without altering the object identity is not a trivial task. Here, we are interested in learning a representation of the image that separates the identity of an object (such as a human face) from an attribute (such as ‘wearing glasses’). We demonstrate the success of our factorization approach by using the learned representation to synthesize the same face with and without a chosen attribute. We refer to this specific synthesis process as image attribute manipulation. We further demonstrate that our model achieves scores competitive with the state of the art on a facial attribute classification task.
Tasks	Facial Attribute Classification, Image Generation
Published	2017-11-14
URL	http://arxiv.org/abs/1711.05175v2
PDF	http://arxiv.org/pdf/1711.05175v2.pdf
PWC	https://paperswithcode.com/paper/adversarial-information-factorization
Repo	https://github.com/ToniCreswell/attribute-cVAEGAN
Framework	pytorch

Neural Extractive Summarization with Side Information


Title	Neural Extractive Summarization with Side Information
Authors	Shashi Narayan, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata
Abstract	Most extractive summarization methods focus on the main body of the document from which sentences need to be extracted. However, the gist of the document may lie in side information, such as the title and image captions which are often available for newswire articles. We propose to explore side information in the context of single-document extractive summarization. We develop a framework for single-document summarization composed of a hierarchical document encoder and an attention-based extractor with attention over side information. We evaluate our model on a large scale news dataset. We show that extractive summarization with side information consistently outperforms its counterpart that does not use any side information, in terms of both informativeness and fluency.
Tasks	Document Summarization, Image Captioning
Published	2017-04-14
URL	http://arxiv.org/abs/1704.04530v2
PDF	http://arxiv.org/pdf/1704.04530v2.pdf
PWC	https://paperswithcode.com/paper/neural-extractive-summarization-with-side
Repo	https://github.com/shashiongithub/sidenet
Framework	tf

Weakly supervised 3D Reconstruction with Adversarial Constraint


Title	Weakly supervised 3D Reconstruction with Adversarial Constraint
Authors	JunYoung Gwak, Christopher B. Choy, Animesh Garg, Manmohan Chandraker, Silvio Savarese
Abstract	Supervised 3D reconstruction has witnessed significant progress through the use of deep neural networks. However, this increase in performance requires large-scale annotations of 2D/3D data. In this paper, we explore inexpensive 2D supervision as an alternative for expensive 3D CAD annotation. Specifically, we use foreground masks as weak supervision through a raytrace pooling layer that enables perspective projection and backpropagation. Additionally, since 3D reconstruction from masks is an ill-posed problem, we propose to constrain the 3D reconstruction to the manifold of unlabeled realistic 3D shapes that match mask observations. We demonstrate that learning a log-barrier solution to this constrained optimization problem resembles the GAN objective, enabling the use of existing tools for training GANs. We evaluate and analyze the manifold constrained reconstruction on various datasets for single and multi-view reconstruction of both synthetic and real images.
Tasks	3D Reconstruction
Published	2017-05-31
URL	http://arxiv.org/abs/1705.10904v2
PDF	http://arxiv.org/pdf/1705.10904v2.pdf
PWC	https://paperswithcode.com/paper/weakly-supervised-3d-reconstruction-with
Repo	https://github.com/chrischoy/3D-R2N2
Framework	none

Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications


Title	Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications
Authors	Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik, Carola-Bibiane Schönlieb
Abstract	We propose a stochastic extension of the primal-dual hybrid gradient algorithm studied by Chambolle and Pock in 2011 to solve saddle point problems that are separable in the dual variable. The analysis is carried out for general convex-concave saddle point problems and problems that are either partially smooth / strongly convex or fully smooth / strongly convex. We perform the analysis for arbitrary samplings of dual variables, and obtain known deterministic results as a special case. Several variants of our stochastic method significantly outperform the deterministic variant on a variety of imaging tasks.
Tasks
Published	2017-06-15
URL	http://arxiv.org/abs/1706.04957v2
PDF	http://arxiv.org/pdf/1706.04957v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-primal-dual-hybrid-gradient
Repo	https://github.com/mehrhardt/spdhg
Framework	none

TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow


Title	TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow
Authors	Danijar Hafner, James Davidson, Vincent Vanhoucke
Abstract	We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow. We simulate multiple environments in parallel, and group them to perform the neural network computation on a batch rather than individual observations. This allows the TensorFlow execution engine to parallelize computation, without the need for manual synchronization. Environments are stepped in separate Python processes to progress them in parallel without interference of the global interpreter lock. As part of this project, we introduce BatchPPO, an efficient implementation of the proximal policy optimization algorithm. By open sourcing TensorFlow Agents, we hope to provide a flexible starting point for future projects that accelerates future research in the field.
Tasks
Published	2017-09-08
URL	http://arxiv.org/abs/1709.02878v2
PDF	http://arxiv.org/pdf/1709.02878v2.pdf
PWC	https://paperswithcode.com/paper/tensorflow-agents-efficient-batched
Repo	https://github.com/brain-research/batch-ppo
Framework	tf

Neural SLAM: Learning to Explore with External Memory


Title	Neural SLAM: Learning to Explore with External Memory
Authors	Jingwei Zhang, Lei Tai, Joschka Boedecker, Wolfram Burgard, Ming Liu
Abstract	We present an approach for agents to learn representations of a global map from sensor data, to aid their exploration in new environments. To achieve this, we embed procedures mimicking those of traditional Simultaneous Localization and Mapping (SLAM) into the soft-attention-based addressing of external memory architectures, in which the external memory acts as an internal representation of the environment. This structure encourages the evolution of SLAM-like behaviors inside a completely differentiable deep neural network. We show that this approach can help reinforcement learning agents to successfully explore new environments where long-term memory is essential. We validate our approach in both challenging grid-world environments and preliminary Gazebo experiments. A video of our experiments can be found at: https://goo.gl/G2Vu5y.
Tasks	Simultaneous Localization and Mapping
Published	2017-06-29
URL	http://arxiv.org/abs/1706.09520v6
PDF	http://arxiv.org/pdf/1706.09520v6.pdf
PWC	https://paperswithcode.com/paper/neural-slam-learning-to-explore-with-external
Repo	https://github.com/jingweiz/pytorch-dnc
Framework	pytorch

A Constrained, Weighted-L1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity Graphs


Title	A Constrained, Weighted-L1 Minimization Approach for Joint Discovery of Heterogeneous Neural Connectivity Graphs
Authors	Chandan Singh, Beilun Wang, Yanjun Qi
Abstract	Determining functional brain connectivity is crucial to understanding the brain and neural differences underlying disorders such as autism. Recent studies have used Gaussian graphical models to learn brain connectivity via statistical dependencies across brain regions from neuroimaging. However, previous studies often fail to properly incorporate priors tailored to neuroscience, such as preferring shorter connections. To remedy this problem, this paper introduces a novel, weighted-$\ell_1$, multi-task graphical model (W-SIMULE). This model elegantly incorporates a flexible prior, along with a parallelizable formulation. Additionally, W-SIMULE extends the often-used Gaussian assumption, leading to considerable performance increases. Applications to fMRI data show that W-SIMULE succeeds in determining functional connectivity in terms of (1) log-likelihood, (2) finding edges that differentiate groups, and (3) classifying different groups based on their connectivity, achieving 58.6% accuracy on the ABIDE dataset. With its effectiveness established, W-SIMULE links four key areas to autism, all of which are consistent with the literature. Due to its elegant domain adaptivity, W-SIMULE can be readily applied to various data types to effectively estimate connectivity.
Tasks	Connectivity Estimation
Published	2017-09-13
URL	http://arxiv.org/abs/1709.04090v2
PDF	http://arxiv.org/pdf/1709.04090v2.pdf
PWC	https://paperswithcode.com/paper/a-constrained-weighted-l1-minimization
Repo	https://github.com/QData/SIMULE
Framework	none

Exploiting temporal information for 3D pose estimation


Title	Exploiting temporal information for 3D pose estimation
Authors	Mir Rayat Imtiaz Hossain, James J. Little
Abstract	In this work, we address the problem of 3D human pose estimation from a sequence of 2D human poses. Although the recent success of deep networks has led many state-of-the-art methods for 3D pose estimation to train deep networks end-to-end to predict from images directly, the top-performing approaches have shown the effectiveness of dividing the task of 3D pose estimation into two steps: using a state-of-the-art 2D pose estimator to estimate the 2D pose from images and then mapping them into 3D space. They also showed that a low-dimensional representation like 2D locations of a set of joints can be discriminative enough to estimate 3D pose with high accuracy. However, estimation of 3D pose for individual frames leads to temporally incoherent estimates due to independent error in each frame causing jitter. Therefore, in this work we utilize the temporal information across a sequence of 2D joint locations to estimate a sequence of 3D poses. We designed a sequence-to-sequence network composed of layer-normalized LSTM units with shortcut connections connecting the input to the output on the decoder side and imposed a temporal smoothness constraint during training. We found that the knowledge of temporal consistency improves the best reported result on the Human3.6M dataset by approximately 12.2% and helps our network to recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.
Tasks	3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation
Published	2017-11-23
URL	http://arxiv.org/abs/1711.08585v4
PDF	http://arxiv.org/pdf/1711.08585v4.pdf
PWC	https://paperswithcode.com/paper/exploiting-temporal-information-for-3d-pose
Repo	https://github.com/rayat137/Pose_3D
Framework	tf
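
The abstract above mentions a temporal smoothness constraint imposed during training but does not give its form. One plausible form, assumed here purely for illustration, is a penalty on the squared difference between consecutive predicted poses. The function name and the flat per-frame coordinate layout are hypothetical.

```python
def temporal_smoothness_penalty(poses):
    """Mean squared change between consecutive frames.

    poses: list of frames, each a flat list of joint coordinates
    (hypothetical layout chosen for this sketch).
    """
    if len(poses) < 2:
        return 0.0
    total = 0.0
    for prev, curr in zip(poses, poses[1:]):
        # Sum of squared coordinate changes from one frame to the next.
        total += sum((c - p) ** 2 for p, c in zip(prev, curr))
    return total / (len(poses) - 1)


still = [[0.0, 1.0, 2.0]] * 4
moving = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
print(temporal_smoothness_penalty(still), temporal_smoothness_penalty(moving))
```

A term like this would typically be added to the pose regression loss with a small weight, so that frame-to-frame jitter is discouraged without suppressing genuine motion.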

Noisy Networks for Exploration


Title	Noisy Networks for Exploration
Authors	Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
Abstract	We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent’s policy can be used to aid efficient exploration. The parameters of the noise are learned with gradient descent along with the remaining network weights. NoisyNet is straightforward to implement and adds little computational overhead. We find that replacing the conventional exploration heuristics for A3C, DQN and dueling agents (entropy reward and $\epsilon$-greedy respectively) with NoisyNet yields substantially higher scores for a wide range of Atari games, in some cases advancing the agent from sub- to super-human performance.
Tasks	Atari Games, Efficient Exploration
Published	2017-06-30
URL	https://arxiv.org/abs/1706.10295v3
PDF	https://arxiv.org/pdf/1706.10295v3.pdf
PWC	https://paperswithcode.com/paper/noisy-networks-for-exploration
Repo	https://github.com/LilTwo/DRL-using-PyTorch
Framework	pytorch
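
To make "parametric noise added to its weights" concrete, the following is a minimal sketch of a noisy linear layer in plain Python, where each weight is a learned mean plus a learned scale times a fresh Gaussian sample. The function name, the use of independent Gaussian noise per weight, and the list-of-lists layout are assumptions for illustration; the gradient-descent update of the means and scales is omitted.

```python
import random


def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """Compute y = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b).

    mu_* and sigma_* are the learnable parameters; eps_* are resampled
    from a standard normal on every call (assumed noise model).
    """
    out = []
    for row_mu, row_sigma, b_mu, b_sigma in zip(mu_w, sigma_w, mu_b, sigma_b):
        acc = b_mu + b_sigma * rng.gauss(0.0, 1.0)
        for xi, m, s in zip(x, row_mu, row_sigma):
            acc += (m + s * rng.gauss(0.0, 1.0)) * xi
        out.append(acc)
    return out


rng = random.Random(0)
mu_w = [[1.0, 2.0], [0.0, -1.0]]
sigma_w = [[0.5, 0.5], [0.5, 0.5]]
print(noisy_linear([3.0, 4.0], mu_w, sigma_w, [0.5, 0.0], [0.1, 0.1], rng))
```

Setting every scale to zero recovers an ordinary deterministic linear layer, which is one way to see that the amount of exploration is itself a learned quantity.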

RankIQA: Learning from Rankings for No-reference Image Quality Assessment


Title	RankIQA: Learning from Rankings for No-reference Image Quality Assessment
Authors	Xialei Liu, Joost van de Weijer, Andrew D. Bagdanov
Abstract	We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform the state-of-the-art in full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.
Tasks	Image Quality Assessment, No-Reference Image Quality Assessment
Published	2017-07-26
URL	http://arxiv.org/abs/1707.08347v1
PDF	http://arxiv.org/pdf/1707.08347v1.pdf
PWC	https://paperswithcode.com/paper/rankiqa-learning-from-rankings-for-no
Repo	https://github.com/xialeiliu/RankIQA
Framework	none
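
The efficiency idea in the abstract above, scoring a batch once and deriving the loss from all pairs in it, can be sketched as follows. The scores stand in for the outputs of a single network pass; the hinge form, the margin value, and the function name are assumptions for this sketch rather than details confirmed by the abstract.

```python
from itertools import combinations


def pairwise_ranking_loss(scores, quality_levels, margin=1.0):
    """Average hinge loss over all pairs with a known quality order.

    scores: predicted quality for each image in the batch.
    quality_levels: known relative quality (higher means better),
    e.g. derived from the strength of a synthetic distortion.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(scores)), 2):
        if quality_levels[i] == quality_levels[j]:
            continue  # no ordering information for this pair
        hi, lo = (i, j) if quality_levels[i] > quality_levels[j] else (j, i)
        # Penalize pairs where the better image is not ahead by the margin.
        total += max(0.0, margin - (scores[hi] - scores[lo]))
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


print(pairwise_ranking_loss([3.0, 1.5, 0.0], [2, 1, 0]))
print(pairwise_ranking_loss([0.0, 1.5, 3.0], [2, 1, 0]))
```

A batch of n images yields up to n(n-1)/2 training pairs from n forward passes, instead of two forward passes per pair as in a conventional two-branch Siamese setup.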

Efficient B-mode Ultrasound Image Reconstruction from Sub-sampled RF Data using Deep Learning


Title	Efficient B-mode Ultrasound Image Reconstruction from Sub-sampled RF Data using Deep Learning
Authors	Yeo Hun Yoon, Shujaat Khan, Jaeyoung Huh, Jong Chul Ye
Abstract	In portable, three dimensional, and ultra-fast ultrasound imaging systems, there is an increasing demand for the reconstruction of high quality images from a limited number of radio-frequency (RF) measurements due to receiver (Rx) or transmit (Xmit) event sub-sampling. However, due to the presence of side lobe artifacts from RF sub-sampling, the standard beamformer often produces blurry images with less contrast, which are unsuitable for diagnostic purposes. Existing compressed sensing approaches often require either hardware changes or computationally expensive algorithms, but their quality improvements are limited. To address this problem, here we propose a novel deep learning approach that directly interpolates the missing RF data by utilizing redundancy in the Rx-Xmit plane. Our extensive experimental results using sub-sampled RF data from a multi-line acquisition B-mode system confirm that the proposed method can effectively reduce the data rate without sacrificing image quality.
Tasks	Image Reconstruction
Published	2017-12-17
URL	http://arxiv.org/abs/1712.06096v3
PDF	http://arxiv.org/pdf/1712.06096v3.pdf
PWC	https://paperswithcode.com/paper/efficient-b-mode-ultrasound-image
Repo	https://github.com/BISPL-JYH/Ultrasound_TMI
Framework	none

Community detection with spiking neural networks for neuromorphic hardware


Title	Community detection with spiking neural networks for neuromorphic hardware
Authors	Kathleen E. Hamilton, Neena Imam, Travis S. Humble
Abstract	We present results related to the performance of an algorithm for community detection which incorporates event-driven computation. We define a mapping which takes a graph G to a system of spiking neurons. Using a fully connected spiking neuron system, with both inhibitory and excitatory synaptic connections, the firing patterns of neurons within the same community can be distinguished from firing patterns of neurons in different communities. On a random graph with 128 vertices and known community structure we show that by using binary decoding and a Hamming-distance based metric, individual communities can be identified from spike train similarities. Using bipolar decoding and finite rate thresholding, we verify that inhibitory connections prevent the spread of spiking patterns.
Tasks	Community Detection
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07361v1
PDF	http://arxiv.org/pdf/1711.07361v1.pdf
PWC	https://paperswithcode.com/paper/community-detection-with-spiking-neural
Repo	https://github.com/abasak24/ece594Neuromorphic
Framework	none
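
The Hamming-distance comparison of binary-decoded spike trains mentioned in the abstract above can be illustrated with a small sketch. The spike trains below are made-up toy data, and the greedy threshold grouping is a hypothetical stand-in for whatever decoding procedure the paper uses.

```python
def hamming_distance(train_a, train_b):
    # Number of time bins in which the two binary spike trains differ.
    if len(train_a) != len(train_b):
        raise ValueError("spike trains must have the same length")
    return sum(a != b for a, b in zip(train_a, train_b))


def group_by_similarity(trains, threshold):
    """Greedy grouping: a neuron joins the first group whose first member
    is within `threshold` Hamming distance, else it starts a new group."""
    groups = []
    for idx, train in enumerate(trains):
        for group in groups:
            if hamming_distance(trains[group[0]], train) <= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


trains = [
    [1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1, 1],
]
print(group_by_similarity(trains, threshold=1))
```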