Paper Group ANR 255
Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes
Title | Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes |
Authors | Stephanie M. Lukin, Claire Bonial, Clare R. Voss |
Abstract | We describe the task of Visual Understanding and Narration, in which a robot (or agent) generates text for the images that it collects when navigating its environment, by answering open-ended questions, such as ‘what happens, or might have happened, here?’ |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00038v2 |
https://arxiv.org/pdf/1906.00038v2.pdf | |
PWC | https://paperswithcode.com/paper/190600038 |
Repo | |
Framework | |
Real-time interactive magnetic resonance (MR) temperature imaging in both aqueous and adipose tissues using cascaded deep neural networks for MR-guided focused ultrasound surgery (MRgFUS)
Title | Real-time interactive magnetic resonance (MR) temperature imaging in both aqueous and adipose tissues using cascaded deep neural networks for MR-guided focused ultrasound surgery (MRgFUS) |
Authors | Jong-Min Kim, You-Jin Jeong, Han-Jae Chung, Chulhyun Lee, Chang-Hyun Oh |
Abstract | Purpose: To acquire real-time interactive temperature maps for aqueous and adipose tissues, the problems of long acquisition and processing times must be addressed. To overcome these major challenges, this paper proposes a cascaded convolutional neural network (CNN) framework and multi-echo gradient echo (meGRE) with a single reference variable flip angle (srVFA). Methods: To optimize the echo times for each method, MR images are acquired using a meGRE sequence; meGRE images with two flip angles (FAs) and meGRE images with a single FA are acquired during the pretreatment and treatment stages, respectively. These images are then processed and reconstructed by a cascaded CNN consisting of two CNNs. The first CNN (called DeepACCnet) performs high-resolution (HR) complex MR image reconstruction from the low-resolution (LR) MR image acquired during the treatment stage, aided by the HR magnitude MR image acquired during the pretreatment stage. The second CNN (called DeepPROCnet) handles T1 mapping. Results: Measurements of temperature and T1 changes obtained by meGRE combined with srVFA and cascaded CNNs were performed on an agarose gel phantom, ex vivo porcine muscle, and ex vivo porcine muscle with fat layers (heating tests), and on the in vivo human prostate and brain (non-heating tests). In the heating tests, the maximum differences between the fiber-optic sensor and the samples are less than 1 degree Celsius. Temperature mapping using the cascaded CNN achieved the best results in all cases. The acquisition and processing times for the proposed method are 0.8 s and 32 ms, respectively. Conclusions: Real-time interactive HR MR temperature mapping that simultaneously measures aqueous and adipose tissues is feasible by combining a cascaded CNN with meGRE and srVFA. |
Tasks | Image Reconstruction |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.10995v1 |
https://arxiv.org/pdf/1908.10995v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-interactive-magnetic-resonance-mr |
Repo | |
Framework | |
Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems
Title | Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems |
Authors | Juho Lauri, Sourav Dutta |
Abstract | We propose a simple, powerful, and flexible machine learning framework for (i) reducing the search space of computationally difficult enumeration variants of subset problems and (ii) augmenting existing state-of-the-art solvers with informative cues arising from the input distribution. We instantiate our framework for the problem of listing all maximum cliques in a graph, a central problem in network analysis, data mining, and computational biology. We demonstrate the practicality of our approach on real-world networks with millions of vertices and edges by not only retaining all optimal solutions, but also aggressively pruning the input instance size resulting in several fold speedups of state-of-the-art algorithms. Finally, we explore the limits of scalability and robustness of our proposed framework, suggesting that supervised learning is viable for tackling NP-hard problems in practice. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08455v1 |
http://arxiv.org/pdf/1902.08455v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-search-space-classification-for |
Repo | |
Framework | |
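The search-space reduction described in the abstract can be sketched in a few lines: score each vertex, drop low-scoring ones, and enumerate maximum cliques on the induced subgraph. This is a minimal sketch, not the paper's method: the paper trains a classifier on richer features, while here a normalized-degree heuristic (a hypothetical substitute) plays that role.

```python
def max_cliques(adj):
    """Enumerate all maximum cliques with a simple Bron-Kerbosch."""
    found = []
    def bk(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            bk(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    bk(set(), set(adj), set())
    best = max(len(c) for c in found)
    return {c for c in found if len(c) == best}

def prune(adj, score, threshold):
    """Keep only vertices the scorer rates at or above the threshold."""
    keep = {v for v in adj if score(v, adj) >= threshold}
    return {v: adj[v] & keep for v in keep}

def degree_score(v, adj):
    """Toy stand-in for the learned classifier: normalized degree."""
    return len(adj[v]) / (len(adj) - 1)

# A 4-clique {0,1,2,3} with a low-degree tail 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

reduced = prune(adj, degree_score, threshold=0.5)
print(sorted(reduced))                           # tail vertices pruned
print(max_cliques(reduced) == max_cliques(adj))  # optima preserved: True
```

Real instances are of course far larger; the point is only that pruning can shrink the enumerator's input while retaining every optimal solution.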
Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses
Title | Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses |
Authors | Asier Mujika, Felix Weissenberger, Angelika Steger |
Abstract | Learning long-term dependencies is a key long-standing challenge of recurrent neural networks (RNNs). Hierarchical recurrent neural networks (HRNNs) have been considered a promising approach as long-term dependencies are resolved through shortcuts up and down the hierarchy. Yet, the memory requirements of Truncated Backpropagation Through Time (TBPTT) still prevent training them on very long sequences. In this paper, we empirically show that in (deep) HRNNs, propagating gradients back from higher to lower levels can be replaced by locally computable losses, without harming the learning capability of the network, over a wide range of tasks. This decoupling by local losses reduces the memory requirements of training by a factor exponential in the depth of the hierarchy in comparison to standard TBPTT. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05245v1 |
https://arxiv.org/pdf/1910.05245v1.pdf | |
PWC | https://paperswithcode.com/paper/decoupling-hierarchical-recurrent-neural-1 |
Repo | |
Framework | |
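The memory claim can be made concrete with a back-of-envelope accounting, offered here as an illustration only and not as the paper's exact model. Assume each level of a depth-`d` hierarchy ticks `k` times slower than the level below, and that TBPTT must unroll one full top-level step:

```python
def tbptt_storage(k, depth):
    """Activations stored when gradients flow through the whole
    hierarchy: one top-level step spans k**(depth-1) bottom-level
    steps, and every level keeps its activations over that horizon."""
    horizon = k ** (depth - 1)
    return sum(horizon // k ** level for level in range(depth))

def local_loss_storage(k, depth):
    """With locally computable losses each level is trained on its
    own k-step window only, so storage grows linearly in depth."""
    return depth * k

for depth in (2, 3, 4):
    print(depth, tbptt_storage(4, depth), local_loss_storage(4, depth))
```

Under these assumptions the gap between the geometric sum and the linear term grows exponentially with depth, matching the exponential memory saving stated in the abstract.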
Learning efficient haptic shape exploration with a rigid tactile sensor array
Title | Learning efficient haptic shape exploration with a rigid tactile sensor array |
Authors | Sascha Fleer, Alexandra Moringen, Roberta L. Klatzky, Helge Ritter |
Abstract | Haptic exploration is a key skill for both robots and humans to discriminate and handle unknown objects or to recognize familiar objects. Its active nature is evident in humans, who from early on reliably acquire sophisticated sensory-motor capabilities for active exploratory touch and directed manual exploration that associates surfaces and object properties with their spatial locations. This is in stark contrast to robotics, where the relative lack of good real-world interaction models, along with very restricted sensors and a scarcity of suitable training data to leverage machine learning methods, has so far rendered haptic exploration a largely underdeveloped skill. In the present work, we connect recent advances in recurrent models of visual attention with previous insights about the organisation of human haptic search behavior, exploratory procedures, and haptic glances in a novel architecture that learns a generative model of haptic exploration in a simulated three-dimensional environment. The proposed algorithm simultaneously optimizes the main perception-action loop components: feature extraction, integration of features over time, and the control strategy, while continuously acquiring data online. We perform multi-module neural network training, including a feature extractor and a recurrent neural network module that aids pose control by storing and combining sequential sensory data. The resulting haptic meta-controller for the rigid $16 \times 16$ tactile sensor array moving in a physics-driven simulation environment, called the Haptic Attention Model, performs a sequence of haptic glances and outputs corresponding force measurements. The resulting method has been successfully tested with four different objects, achieving results close to $100\%$ while performing object contour exploration optimized for its own sensor morphology. |
Tasks | |
Published | 2019-02-20 |
URL | https://arxiv.org/abs/1902.07501v4 |
https://arxiv.org/pdf/1902.07501v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-efficient-haptic-shape-exploration |
Repo | |
Framework | |
Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation
Title | Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation |
Authors | Shizhou Zhang, Yifei Yang, Peng Wang, Xiuwei Zhang, Yanning Zhang |
Abstract | The problem of cross-modality person re-identification has been receiving increasing attention recently, due to its practical significance. Motivated by the fact that humans usually attend to the difference when they compare two similar objects, we propose a dual-path cross-modality feature learning framework which preserves intrinsic spatial structures and attends to the difference of input cross-modality image pairs. Our framework is composed of two main components: a Dual-path Spatial-structure-preserving Common Space Network (DSCSN) and a Contrastive Correlation Network (CCN). The former embeds cross-modality images into a common 3D tensor space without losing spatial structures, while the latter extracts contrastive features by dynamically comparing input image pairs. Note that the representations generated for the input RGB and infrared images are mutually dependent. We conduct extensive experiments on two publicly available RGB-IR ReID datasets, SYSU-MM01 and RegDB, and our proposed method outperforms state-of-the-art algorithms by a large margin under both full and simplified evaluation modes. |
Tasks | Person Re-Identification |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11656v1 |
https://arxiv.org/pdf/1910.11656v1.pdf | |
PWC | https://paperswithcode.com/paper/attend-to-the-difference-cross-modality |
Repo | |
Framework | |
Distortion-adaptive Salient Object Detection in 360$^\circ$ Omnidirectional Images
Title | Distortion-adaptive Salient Object Detection in 360$^\circ$ Omnidirectional Images |
Authors | Jia Li, Jinming Su, Changqun Xia, Yonghong Tian |
Abstract | Image-based salient object detection (SOD) has been extensively explored in the past decades. However, SOD on 360$^\circ$ omnidirectional images is less studied owing to the lack of datasets with pixel-level annotations. Toward this end, this paper proposes a 360$^\circ$ image-based SOD dataset that contains 500 high-resolution equirectangular images. We collect the representative equirectangular images from five mainstream 360$^\circ$ video datasets and manually annotate all objects and regions over these images with precise masks in a free-viewpoint way. To the best of our knowledge, it is the first publicly available dataset for salient object detection on 360$^\circ$ scenes. By observing this dataset, we find that distortion from projection, large-scale complex scenes, and small salient objects are the most prominent characteristics. Inspired by these findings, this paper proposes a baseline model for SOD on equirectangular images. In the proposed approach, we construct a distortion-adaptive module to deal with the distortion caused by the equirectangular projection. In addition, a multi-scale contextual integration block is introduced to perceive and distinguish the rich scenes and objects in omnidirectional scenes. The whole network is organized in a progressive manner with deep supervision. Experimental results show the proposed baseline approach outperforms the top-performing state-of-the-art methods on the 360$^\circ$ SOD dataset. Moreover, benchmarking results of the proposed baseline approach and other methods on the 360$^\circ$ SOD dataset show the proposed dataset is very challenging, which also validates the usefulness of the proposed dataset and approach for boosting the development of SOD on 360$^\circ$ omnidirectional scenes. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04913v1 |
https://arxiv.org/pdf/1909.04913v1.pdf | |
PWC | https://paperswithcode.com/paper/distortion-adaptive-salient-object-detection |
Repo | |
Framework | |
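The distortion the abstract refers to has a simple geometric core: in an equirectangular projection a pixel's true area on the sphere shrinks with the cosine of its latitude. The paper's distortion-adaptive module is learned; the sketch below shows only this geometric prior, which such a module would have to account for.

```python
import math

def equirect_row_weights(height):
    """Per-row area weights for an equirectangular image: a pixel at
    latitude phi covers sphere area proportional to cos(phi), so rows
    near the poles should contribute far less than equator rows."""
    weights = []
    for i in range(height):
        # Latitude of the row centre, from +pi/2 (top) to -pi/2 (bottom).
        phi = math.pi * (0.5 - (i + 0.5) / height)
        weights.append(math.cos(phi))
    return weights

w = equirect_row_weights(8)
print(round(w[0], 3), round(w[3], 3))  # pole row weight is small, equator row near 1
```

Weighting per-pixel losses or saliency scores by such factors is one common way to keep polar regions from being over-counted.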
Bipartite Graph Neural Networks for Efficient Node Representation Learning
Title | Bipartite Graph Neural Networks for Efficient Node Representation Learning |
Authors | Chaoyang He, Tian Xie, Yu Rong, Wenbing Huang, Yanfang Li, Junzhou Huang, Xiang Ren, Cyrus Shahabi |
Abstract | Existing Graph Neural Networks (GNNs) mainly focus on general structures, while the specific architecture on bipartite graphs—a crucial practical data form that consists of two distinct domains of nodes—is seldom studied. In this paper, we propose Bipartite Graph Neural Network (BGNN), a novel model that is domain-consistent, unsupervised, and efficient. At its core, BGNN utilizes the proposed Inter-domain Message Passing (IDMP) for message aggregation and Intra-domain Alignment (IDA) towards information fusion over domains, both of which are trained without requiring any supervision. Moreover, we formulate a multi-layer BGNN in a cascaded manner to enable multi-hop relation modeling while enjoying promising efficiency in training. Extensive experiments on several datasets of varying scales verify the effectiveness of BGNN compared to other counterparts. Particularly for the experiment on a large-scale bipartite graph dataset, the scalability of our BGNN is validated in terms of fast training speed and low memory cost. |
Tasks | Representation Learning |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11994v2 |
https://arxiv.org/pdf/1906.11994v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-representation-learning-on-large |
Repo | |
Framework | |
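The core of Inter-domain Message Passing is that messages only ever cross between the two node domains. The sketch below strips BGNN down to unweighted mean aggregation (the actual model adds learned transformations and the IDA alignment objective, which are omitted here).

```python
def idmp_step(feats_u, feats_v, biadj):
    """One IDMP-style round on a bipartite graph. biadj[i][j] = 1 if
    U-node i links to V-node j; each node's new feature is the mean
    of its cross-domain neighbours' features."""
    def mean_neighbours(adj_rows, other_feats):
        out = []
        for row in adj_rows:
            nbrs = [other_feats[j] for j, e in enumerate(row) if e]
            deg = len(nbrs) or 1
            out.append([sum(f[k] for f in nbrs) / deg
                        for k in range(len(other_feats[0]))])
        return out
    new_u = mean_neighbours(biadj, feats_v)
    new_v = mean_neighbours(list(map(list, zip(*biadj))), feats_u)
    return new_u, new_v

# Two users, three items, 1-d features for readability.
biadj = [[1, 1, 0],
         [0, 1, 1]]
items = [[1.0], [3.0], [5.0]]
users = [[10.0], [20.0]]
u, v = idmp_step(users, items, biadj)
print(u)  # each user gets the mean of its linked items' features
print(v)  # each item gets the mean of its linked users' features
```

Stacking several such rounds, as the multi-layer BGNN does, propagates information over multi-hop cross-domain paths.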
Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus
Title | Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus |
Authors | Thomas Möllenhoff, Daniel Cremers |
Abstract | Numerous tasks in imaging and vision can be formulated as variational problems over vector-valued maps. We approach the relaxation and convexification of such vectorial variational problems via a lifting to the space of currents. To that end, we recall that functionals with polyconvex Lagrangians can be reparametrized as convex one-homogeneous functionals on the graph of the function. This leads to an equivalent shape optimization problem over oriented surfaces in the product space of domain and codomain. A convex formulation is then obtained by relaxing the search space from oriented surfaces to more general currents. We propose a discretization of the resulting infinite-dimensional optimization problem using Whitney forms, which also generalizes recent “sublabel-accurate” multilabeling approaches. |
Tasks | |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00851v1 |
https://arxiv.org/pdf/1905.00851v1.pdf | |
PWC | https://paperswithcode.com/paper/lifting-vectorial-variational-problems-a |
Repo | |
Framework | |
On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Title | On Completeness-aware Concept-Based Explanations in Deep Neural Networks |
Authors | Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar |
Abstract | Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model’s prediction behavior. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable. Our concept discovery method aims to address the limitations of commonly-used methods such as PCA and TCAV. To define an importance score for each discovered concept, we adapt game-theoretic notions to aggregate over sets and propose \emph{ConceptSHAP}. On a synthetic dataset with ground-truth concept explanations, on a real-world dataset, and with a user study, we validate the effectiveness of our framework in finding concepts that are both complete in explaining the decisions and interpretable. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.07969v2 |
https://arxiv.org/pdf/1910.07969v2.pdf | |
PWC | https://paperswithcode.com/paper/on-concept-based-explanations-in-deep-neural |
Repo | |
Framework | |
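The game-theoretic aggregation behind ConceptSHAP can be illustrated with an exact Shapley computation over a toy completeness score. The score function `eta` below is entirely hypothetical; the point is only the mechanics of averaging each concept's marginal contribution over all subsets.

```python
from itertools import combinations
from math import factorial

def shapley(concepts, eta):
    """Exact Shapley values: each concept's average marginal
    contribution to the completeness score eta, taken over all
    subsets of the remaining concepts."""
    n = len(concepts)
    vals = {}
    for c in concepts:
        rest = [x for x in concepts if x != c]
        total = 0.0
        for r in range(n):
            for subset in combinations(rest, r):
                s = frozenset(subset)
                weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                total += weight * (eta(s | {c}) - eta(s))
        vals[c] = total
    return vals

# Hypothetical completeness score for two concepts: 'a' explains 0.6
# of the model's behaviour, 'b' explains 0.3, with 0.1 overlap.
def eta(s):
    score = 0.0
    if 'a' in s:
        score += 0.6
    if 'b' in s:
        score += 0.3
    if {'a', 'b'} <= s:
        score -= 0.1
    return score

vals = shapley(['a', 'b'], eta)
print(vals)  # the two importance scores sum to eta({'a','b'}) = 0.8
```

Exact enumeration is exponential in the number of concepts; ConceptSHAP-style methods rely on approximations for realistic concept counts.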
Video Person Re-Identification using Learned Clip Similarity Aggregation
Title | Video Person Re-Identification using Learned Clip Similarity Aggregation |
Authors | Neeraj Matiyali, Gaurav Sharma |
Abstract | We address the challenging task of video-based person re-identification. Recent works have shown that splitting the video sequences into clips and then aggregating clip based similarity is appropriate for the task. We show that using a learned clip similarity aggregation function allows filtering out hard clip pairs, e.g. where the person is not clearly visible, is in a challenging pose, or where the poses in the two clips are too different to be informative. This allows the method to focus on clip pairs which are more informative for the task. We also introduce the use of 3D CNNs for video-based re-identification and show their effectiveness by performing on par with previous works, which use optical flow in addition to RGB, while using RGB inputs only. We give quantitative results on three challenging public benchmarks and show better or competitive performance. We also validate our method qualitatively. |
Tasks | Optical Flow Estimation, Person Re-Identification, Video-Based Person Re-Identification |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08055v1 |
https://arxiv.org/pdf/1910.08055v1.pdf | |
PWC | https://paperswithcode.com/paper/video-person-re-identification-using-learned |
Repo | |
Framework | |
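The effect of weighting clip pairs rather than averaging them can be sketched with a simple stand-in for the learned aggregation function: a softmax over the similarities themselves, which down-weights uninformative (e.g. occluded) pairs. The temperature value and the similarity numbers are illustrative assumptions, not from the paper.

```python
import math

def aggregate(sims, temperature=0.5):
    """Softmax-weighted aggregation of clip-pair similarities: low
    similarities get small weights instead of dragging the
    sequence-level score down, mimicking learned pair filtering."""
    weights = [math.exp(s / temperature) for s in sims]
    z = sum(weights)
    return sum(w * s for w, s in zip(weights, sims)) / z

# Three informative clip pairs and one occluded outlier.
sims = [0.8, 0.75, 0.85, 0.1]
print(sum(sims) / len(sims))  # plain mean: 0.625, pulled down by the outlier
print(aggregate(sims))        # weighted score stays near the informative pairs
```

A learned aggregator would condition on clip content rather than on the similarity value alone, but the mean-versus-weighted contrast is the same.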
A Generalization Bound for Online Variational Inference
Title | A Generalization Bound for Online Variational Inference |
Authors | Badr-Eddine Chérief-Abdellatif, Pierre Alquier, Mohammad Emtiyaz Khan |
Abstract | Bayesian inference provides an attractive online-learning framework to analyze sequential data, and offers generalization guarantees which hold even with model mismatch and adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this paper, we show that this is indeed the case for some variational inference (VI) algorithms. We consider a few existing online, tempered VI algorithms, as well as a new algorithm, and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that the result should hold more generally and present empirical evidence in support of this. Our work presents theoretical justifications in favor of online algorithms relying on approximate Bayesian methods. |
Tasks | Bayesian Inference |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.03920v2 |
https://arxiv.org/pdf/1904.03920v2.pdf | |
PWC | https://paperswithcode.com/paper/a-generalization-bound-for-online-variational |
Repo | |
Framework | |
Deep Learning-based Universal Beamformer for Ultrasound Imaging
Title | Deep Learning-based Universal Beamformer for Ultrasound Imaging |
Authors | Shujaat Khan, Jaeyoung Huh, Jong Chul Ye |
Abstract | In ultrasound (US) imaging, individual channel RF measurements are back-propagated and accumulated to form an image after applying specific delays. While this time reversal is usually implemented using a hardware- or software-based delay-and-sum (DAS) beamformer, the performance of DAS decreases rapidly in situations where data acquisition is not ideal. Herein, for the first time, we demonstrate that a single data-driven adaptive beamformer designed as a deep neural network can generate high-quality images robustly for various detector channel configurations and subsampling rates. The proposed deep beamformer is evaluated for two distinct acquisition schemes: focused ultrasound imaging and planewave imaging. Experimental results showed that the proposed deep beamformer exhibits significant performance gains for both focused and planewave imaging schemes, in terms of contrast-to-noise ratio and structural similarity. |
Tasks | |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.02843v2 |
https://arxiv.org/pdf/1904.02843v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-universal-beamformer-for |
Repo | |
Framework | |
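The DAS baseline that the deep beamformer replaces is easy to state in code: for each image point, compute every channel's round-trip delay, pick the matching RF sample, and sum. The sketch below builds a synthetic single-scatterer acquisition (geometry, sampling rate, and sound speed are illustrative choices) and shows that the sum is large only at the true scatterer location.

```python
import math

def das_pixel(rf, element_x, px, pz, c=1540.0, fs=40e6):
    """Delay-and-sum for one image point (px, pz): per-channel
    round-trip delay selects an RF sample, summed across channels.
    Assumes a plane-wave transmit, so the transmit path is pz."""
    total = 0.0
    for ch, x in enumerate(element_x):
        dist = pz + math.hypot(px - x, pz)   # transmit + receive path
        sample = int(round(dist / c * fs))
        if 0 <= sample < len(rf[ch]):
            total += rf[ch][sample]
    return total

# Synthetic data: one scatterer at (0 m, 20 mm); each channel's RF
# trace holds a unit impulse at that scatterer's round-trip sample.
c, fs = 1540.0, 40e6
elems = [-0.005 + 0.0025 * i for i in range(5)]
pz_true = 0.02
rf = []
for x in elems:
    trace = [0.0] * 4096
    dist = pz_true + math.hypot(0.0 - x, pz_true)
    trace[int(round(dist / c * fs))] = 1.0
    rf.append(trace)

on_target = das_pixel(rf, elems, 0.0, pz_true)
off_target = das_pixel(rf, elems, 0.0, 0.025)
print(on_target, off_target)  # the focused sum dominates the off-focus sum
```

The deep beamformer in the paper learns this mapping from channel data to image directly, which is what lets it stay robust when the DAS assumptions (full channel counts, ideal sampling) break down.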
Neural Program Repair by Jointly Learning to Localize and Repair
Title | Neural Program Repair by Jointly Learning to Localize and Repair |
Authors | Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, Rishabh Singh |
Abstract | Due to its potential to improve programmer productivity and software quality, automated program repair has been an active topic of research. Newer techniques harness neural networks to learn directly from examples of buggy programs and their fixes. In this work, we consider a recently identified class of bugs called variable-misuse bugs. The state-of-the-art solution for variable misuse enumerates potential fixes for all possible bug locations in a program before selecting the best prediction. We show that it is beneficial to train a model that jointly and directly localizes and repairs variable-misuse bugs. We present multi-headed pointer networks for this purpose, with one head each for localization and repair. The experimental results show that the joint model significantly outperforms an enumerative solution that uses a pointer-based model for repair alone. |
Tasks | |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01720v1 |
http://arxiv.org/pdf/1904.01720v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-program-repair-by-jointly-learning-to-1 |
Repo | |
Framework | |
Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning
Title | Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning |
Authors | Ram Prasad Padhy, Shahzad Ahmad, Sachin Verma, Pankaj Kumar Sa, Sambit Bakshi |
Abstract | Vision-based pose estimation of Unmanned Aerial Vehicles (UAV) in unknown environments is a rapidly growing research area in the field of robot vision. The task becomes more complex when the only available sensor is a static single camera (monocular vision). In this regard, we propose a monocular vision assisted localization algorithm that helps a UAV navigate safely in indoor corridor environments. The aim is always to navigate the UAV forward through the corridor, keeping it at the center without orienting to either the left or the right side. The algorithm takes the RGB image captured by the UAV front camera and passes it through a trained deep neural network (DNN) to predict the position of the UAV as on the left, center, or right side of the corridor. Depending upon the divergence of the UAV with respect to the central bisector line (CBL) of the corridor, a suitable command is generated to bring the UAV to the center. When the UAV is at the center of the corridor, a new image is passed through another trained DNN to predict the orientation of the UAV with respect to the CBL. If the UAV is tilted either left or right, an appropriate command is generated to rectify the orientation. We also propose a new corridor dataset, named NITRCorrV1, which contains images captured by the UAV front camera at all possible locations of a variety of corridors. An exhaustive set of experiments in different corridors reveals the efficacy of the proposed algorithm. |
Tasks | Pose Estimation |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09021v1 |
http://arxiv.org/pdf/1903.09021v1.pdf | |
PWC | https://paperswithcode.com/paper/localization-of-unmanned-aerial-vehicles-in |
Repo | |
Framework | |
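The two-stage decision procedure described in the abstract (correct lateral position first, then orientation) reduces to a small control function. The sketch below uses stub classifiers in place of the two trained DNNs, and the command names are hypothetical labels, not from the paper.

```python
def corridor_command(position_net, orientation_net, image):
    """Two-stage corridor control: first correct lateral offset from
    the central bisector line; only when centred, correct the yaw.
    The *_net arguments stand in for the two trained DNN classifiers."""
    position = position_net(image)        # 'left' | 'center' | 'right'
    if position == 'left':
        return 'move_right'
    if position == 'right':
        return 'move_left'
    orientation = orientation_net(image)  # 'left_tilt' | 'straight' | 'right_tilt'
    if orientation == 'left_tilt':
        return 'yaw_right'
    if orientation == 'right_tilt':
        return 'yaw_left'
    return 'move_forward'

# Stub classifiers make the control loop runnable without the DNNs.
cmd_off_center = corridor_command(lambda img: 'left', lambda img: 'straight', None)
cmd_tilted = corridor_command(lambda img: 'center', lambda img: 'right_tilt', None)
print(cmd_off_center, cmd_tilted)
```

Note that the orientation network is consulted only once the position network reports 'center', mirroring the sequencing in the abstract.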