Paper Group ANR 255
Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes
Title | Visual Understanding and Narration: A Deeper Understanding and Explanation of Visual Scenes |
Authors | Stephanie M. Lukin, Claire Bonial, Clare R. Voss |
Abstract | We describe the task of Visual Understanding and Narration, in which a robot (or agent) generates text for the images that it collects when navigating its environment, by answering open-ended questions, such as ‘what happens, or might have happened, here?’ |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1906.00038v2 |
https://arxiv.org/pdf/1906.00038v2.pdf | |
PWC | https://paperswithcode.com/paper/190600038 |
Repo | |
Framework | |
Real-time interactive magnetic resonance (MR) temperature imaging in both aqueous and adipose tissues using cascaded deep neural networks for MR-guided focused ultrasound surgery (MRgFUS)
Title | Real-time interactive magnetic resonance (MR) temperature imaging in both aqueous and adipose tissues using cascaded deep neural networks for MR-guided focused ultrasound surgery (MRgFUS) |
Authors | Jong-Min Kim, You-Jin Jeong, Han-Jae Chung, Chulhyun Lee, Chang-Hyun Oh |
Abstract | Purpose: To acquire real-time interactive temperature maps for aqueous and adipose tissues, the problems of long acquisition and processing times must be addressed. To overcome these major challenges, this paper proposes a cascaded convolutional neural network (CNN) framework and multi-echo gradient echo (meGRE) with a single reference variable flip angle (srVFA). Methods: To optimize the echo times for each method, MR images are acquired using a meGRE sequence; meGRE images with two flip angles (FAs) and meGRE images with a single FA are acquired during the pretreatment and treatment stages, respectively. These images are then processed and reconstructed by a cascaded CNN consisting of two CNNs. The first CNN (called DeepACCnet) performs high-resolution (HR) complex MR image reconstruction from the low-resolution (LR) MR image acquired during the treatment stage, aided by the HR magnitude MR image acquired during the pretreatment stage. The second CNN (called DeepPROCnet) handles T1 mapping. Results: Measurements of temperature and T1 changes obtained by meGRE combined with srVFA and cascaded CNNs were performed on an agarose gel phantom, ex vivo porcine muscle, and ex vivo porcine muscle with fat layers (heating tests), and on the in vivo human prostate and brain (non-heating tests). In the heating tests, the maximum differences between the fiber-optic sensor and the samples are less than 1 degree Celsius. Temperature mapping using the cascaded CNN achieved the best results in all cases. The acquisition and processing times for the proposed method are 0.8 s and 32 ms, respectively. Conclusions: Real-time interactive HR MR temperature mapping that simultaneously measures aqueous and adipose tissues is feasible by combining a cascaded CNN with meGRE and srVFA. |
Tasks | Image Reconstruction |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.10995v1 |
https://arxiv.org/pdf/1908.10995v1.pdf | |
PWC | https://paperswithcode.com/paper/real-time-interactive-magnetic-resonance-mr |
Repo | |
Framework | |
Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems
Title | Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems |
Authors | Juho Lauri, Sourav Dutta |
Abstract | We propose a simple, powerful, and flexible machine learning framework for (i) reducing the search space of computationally difficult enumeration variants of subset problems and (ii) augmenting existing state-of-the-art solvers with informative cues arising from the input distribution. We instantiate our framework for the problem of listing all maximum cliques in a graph, a central problem in network analysis, data mining, and computational biology. We demonstrate the practicality of our approach on real-world networks with millions of vertices and edges by not only retaining all optimal solutions, but also aggressively pruning the input instance size resulting in several fold speedups of state-of-the-art algorithms. Finally, we explore the limits of scalability and robustness of our proposed framework, suggesting that supervised learning is viable for tackling NP-hard problems in practice. |
Tasks | |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08455v1 |
http://arxiv.org/pdf/1902.08455v1.pdf | |
PWC | https://paperswithcode.com/paper/fine-grained-search-space-classification-for |
Repo | |
Framework | |
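The search-space reduction described in the abstract can be sketched in a few lines: score each vertex, drop low-scoring ones, and enumerate maximum cliques on the induced subgraph. This is a minimal sketch, not the paper's method: the paper trains a classifier on richer features, while here a normalized-degree heuristic (a hypothetical substitute) plays that role.

```python
def max_cliques(adj):
    """Enumerate all maximum cliques with a simple Bron-Kerbosch."""
    found = []
    def bk(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            bk(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    bk(set(), set(adj), set())
    best = max(len(c) for c in found)
    return {c for c in found if len(c) == best}

def prune(adj, score, threshold):
    """Keep only vertices the scorer rates at or above the threshold."""
    keep = {v for v in adj if score(v, adj) >= threshold}
    return {v: adj[v] & keep for v in keep}

def degree_score(v, adj):
    """Toy stand-in for the learned classifier: normalized degree."""
    return len(adj[v]) / (len(adj) - 1)

# A 4-clique {0,1,2,3} with a low-degree tail 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = {v: set() for v in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

reduced = prune(adj, degree_score, threshold=0.5)
print(sorted(reduced))                           # tail vertices pruned
print(max_cliques(reduced) == max_cliques(adj))  # optima preserved: True
```

Real instances are of course far larger; the point is only that pruning can shrink the enumerator's input while retaining every optimal solution.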
Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses
Title | Decoupling Hierarchical Recurrent Neural Networks With Locally Computable Losses |
Authors | Asier Mujika, Felix Weissenberger, Angelika Steger |
Abstract | Learning long-term dependencies is a key long-standing challenge of recurrent neural networks (RNNs). Hierarchical recurrent neural networks (HRNNs) have been considered a promising approach as long-term dependencies are resolved through shortcuts up and down the hierarchy. Yet, the memory requirements of Truncated Backpropagation Through Time (TBPTT) still prevent training them on very long sequences. In this paper, we empirically show that in (deep) HRNNs, propagating gradients back from higher to lower levels can be replaced by locally computable losses, without harming the learning capability of the network, over a wide range of tasks. This decoupling by local losses reduces the memory requirements of training by a factor exponential in the depth of the hierarchy in comparison to standard TBPTT. |
Tasks | |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05245v1 |
https://arxiv.org/pdf/1910.05245v1.pdf | |
PWC | https://paperswithcode.com/paper/decoupling-hierarchical-recurrent-neural-1 |
Repo | |
Framework | |
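The memory claim can be made concrete with a back-of-envelope accounting, offered here as an illustration only and not as the paper's exact model. Assume each level of a depth-`d` hierarchy ticks `k` times slower than the level below, and that TBPTT must unroll one full top-level step:

```python
def tbptt_storage(k, depth):
    """Activations stored when gradients flow through the whole
    hierarchy: one top-level step spans k**(depth-1) bottom-level
    steps, and every level keeps its activations over that horizon."""
    horizon = k ** (depth - 1)
    return sum(horizon // k ** level for level in range(depth))

def local_loss_storage(k, depth):
    """With locally computable losses each level is trained on its
    own k-step window only, so storage grows linearly in depth."""
    return depth * k

for depth in (2, 3, 4):
    print(depth, tbptt_storage(4, depth), local_loss_storage(4, depth))
```

Under these assumptions the gap between the geometric sum and the linear term grows exponentially with depth, matching the exponential memory saving stated in the abstract.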
Learning efficient haptic shape exploration with a rigid tactile sensor array
Title | Learning efficient haptic shape exploration with a rigid tactile sensor array |
Authors | Sascha Fleer, Alexandra Moringen, Roberta L. Klatzky, Helge Ritter |
Abstract | Haptic exploration is a key skill for both robots and humans to discriminate and handle unknown objects or to recognize familiar objects. Its active nature is evident in humans, who from early on reliably acquire sophisticated sensory-motor capabilities for active exploratory touch and directed manual exploration that associates surfaces and object properties with their spatial locations. This is in stark contrast to robotics, where the relative lack of good real-world interaction models, along with very restricted sensors and a scarcity of suitable training data to leverage machine learning methods, has so far rendered haptic exploration a largely underdeveloped skill. In the present work, we connect recent advances in recurrent models of visual attention with previous insights about the organisation of human haptic search behavior, exploratory procedures, and haptic glances in a novel architecture that learns a generative model of haptic exploration in a simulated three-dimensional environment. The proposed algorithm simultaneously optimizes the main perception-action loop components: feature extraction, integration of features over time, and the control strategy, while continuously acquiring data online. We perform multi-module neural network training, including a feature extractor and a recurrent neural network module that aids pose control by storing and combining sequential sensory data. The resulting haptic meta-controller for the rigid $16 \times 16$ tactile sensor array moving in a physics-driven simulation environment, called the Haptic Attention Model, performs a sequence of haptic glances and outputs corresponding force measurements. The resulting method has been successfully tested with four different objects, achieving results close to $100\%$ while performing object contour exploration optimized for its own sensor morphology. |
Tasks | |
Published | 2019-02-20 |
URL | https://arxiv.org/abs/1902.07501v4 |
https://arxiv.org/pdf/1902.07501v4.pdf | |
PWC | https://paperswithcode.com/paper/learning-efficient-haptic-shape-exploration |
Repo | |
Framework | |
Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation
Title | Attend to the Difference: Cross-Modality Person Re-identification via Contrastive Correlation |
Authors | Shizhou Zhang, Yifei Yang, Peng Wang, Xiuwei Zhang, Yanning Zhang |
Abstract | The problem of cross-modality person re-identification has been receiving increasing attention recently, due to its practical significance. Motivated by the fact that humans usually attend to the difference when they compare two similar objects, we propose a dual-path cross-modality feature learning framework which preserves intrinsic spatial structures and attends to the difference of input cross-modality image pairs. Our framework is composed of two main components: a Dual-path Spatial-structure-preserving Common Space Network (DSCSN) and a Contrastive Correlation Network (CCN). The former embeds cross-modality images into a common 3D tensor space without losing spatial structures, while the latter extracts contrastive features by dynamically comparing input image pairs. Note that the representations generated for the input RGB and infrared images are mutually dependent. We conduct extensive experiments on two publicly available RGB-IR ReID datasets, SYSU-MM01 and RegDB, and our proposed method outperforms state-of-the-art algorithms by a large margin under both full and simplified evaluation modes. |
Tasks | Person Re-Identification |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11656v1 |
https://arxiv.org/pdf/1910.11656v1.pdf | |
PWC | https://paperswithcode.com/paper/attend-to-the-difference-cross-modality |
Repo | |
Framework | |
Distortion-adaptive Salient Object Detection in 360$^\circ$ Omnidirectional Images
Title | Distortion-adaptive Salient Object Detection in 360$^\circ$ Omnidirectional Images |
Authors | Jia Li, Jinming Su, Changqun Xia, Yonghong Tian |
Abstract | Image-based salient object detection (SOD) has been extensively explored in the past decades. However, SOD on 360$^\circ$ omnidirectional images is less studied owing to the lack of datasets with pixel-level annotations. Toward this end, this paper proposes a 360$^\circ$ image-based SOD dataset that contains 500 high-resolution equirectangular images. We collect the representative equirectangular images from five mainstream 360$^\circ$ video datasets and manually annotate all objects and regions over these images with precise masks in a free-viewpoint way. To the best of our knowledge, it is the first publicly available dataset for salient object detection on 360$^\circ$ scenes. By observing this dataset, we find that distortion from projection, large-scale complex scenes, and small salient objects are the most prominent characteristics. Inspired by these findings, this paper proposes a baseline model for SOD on equirectangular images. In the proposed approach, we construct a distortion-adaptive module to deal with the distortion caused by the equirectangular projection. In addition, a multi-scale contextual integration block is introduced to perceive and distinguish the rich scenes and objects in omnidirectional scenes. The whole network is organized in a progressive manner with deep supervision. Experimental results show the proposed baseline approach outperforms the top-performing state-of-the-art methods on the 360$^\circ$ SOD dataset. Moreover, benchmarking results of the proposed baseline approach and other methods on the 360$^\circ$ SOD dataset show the proposed dataset is very challenging, which also validates the usefulness of the proposed dataset and approach for boosting the development of SOD on 360$^\circ$ omnidirectional scenes. |
Tasks | Object Detection, Salient Object Detection |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.04913v1 |
https://arxiv.org/pdf/1909.04913v1.pdf | |
PWC | https://paperswithcode.com/paper/distortion-adaptive-salient-object-detection |
Repo | |
Framework | |
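The distortion the abstract refers to has a simple geometric core: in an equirectangular projection a pixel's true area on the sphere shrinks with the cosine of its latitude. The paper's distortion-adaptive module is learned; the sketch below shows only this geometric prior, which such a module would have to account for.

```python
import math

def equirect_row_weights(height):
    """Per-row area weights for an equirectangular image: a pixel at
    latitude phi covers sphere area proportional to cos(phi), so rows
    near the poles should contribute far less than equator rows."""
    weights = []
    for i in range(height):
        # Latitude of the row centre, from +pi/2 (top) to -pi/2 (bottom).
        phi = math.pi * (0.5 - (i + 0.5) / height)
        weights.append(math.cos(phi))
    return weights

w = equirect_row_weights(8)
print(round(w[0], 3), round(w[3], 3))  # pole row weight is small, equator row near 1
```

Weighting per-pixel losses or saliency scores by such factors is one common way to keep polar regions from being over-counted.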
Bipartite Graph Neural Networks for Efficient Node Representation Learning
Title | Bipartite Graph Neural Networks for Efficient Node Representation Learning |
Authors | Chaoyang He, Tian Xie, Yu Rong, Wenbing Huang, Yanfang Li, Junzhou Huang, Xiang Ren, Cyrus Shahabi |
Abstract | Existing Graph Neural Networks (GNNs) mainly focus on general structures, while the specific architecture on bipartite graphs—a crucial practical data form that consists of two distinct domains of nodes—is seldom studied. In this paper, we propose Bipartite Graph Neural Network (BGNN), a novel model that is domain-consistent, unsupervised, and efficient. At its core, BGNN utilizes the proposed Inter-domain Message Passing (IDMP) for message aggregation and Intra-domain Alignment (IDA) towards information fusion over domains, both of which are trained without requiring any supervision. Moreover, we formulate a multi-layer BGNN in a cascaded manner to enable multi-hop relation modeling while enjoying promising efficiency in training. Extensive experiments on several datasets of varying scales verify the effectiveness of BGNN compared to other counterparts. Particularly for the experiment on a large-scale bipartite graph dataset, the scalability of our BGNN is validated in terms of fast training speed and low memory cost. |
Tasks | Representation Learning |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11994v2 |
https://arxiv.org/pdf/1906.11994v2.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-representation-learning-on-large |
Repo | |
Framework | |
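The core of Inter-domain Message Passing is that messages only ever cross between the two node domains. The sketch below strips BGNN down to unweighted mean aggregation (the actual model adds learned transformations and the IDA alignment objective, which are omitted here).

```python
def idmp_step(feats_u, feats_v, biadj):
    """One IDMP-style round on a bipartite graph. biadj[i][j] = 1 if
    U-node i links to V-node j; each node's new feature is the mean
    of its cross-domain neighbours' features."""
    def mean_neighbours(adj_rows, other_feats):
        out = []
        for row in adj_rows:
            nbrs = [other_feats[j] for j, e in enumerate(row) if e]
            deg = len(nbrs) or 1
            out.append([sum(f[k] for f in nbrs) / deg
                        for k in range(len(other_feats[0]))])
        return out
    new_u = mean_neighbours(biadj, feats_v)
    new_v = mean_neighbours(list(map(list, zip(*biadj))), feats_u)
    return new_u, new_v

# Two users, three items, 1-d features for readability.
biadj = [[1, 1, 0],
         [0, 1, 1]]
items = [[1.0], [3.0], [5.0]]
users = [[10.0], [20.0]]
u, v = idmp_step(users, items, biadj)
print(u)  # each user gets the mean of its linked items' features
print(v)  # each item gets the mean of its linked users' features
```

Stacking several such rounds, as the multi-layer BGNN does, propagates information over multi-hop cross-domain paths.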
Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus
Title | Lifting Vectorial Variational Problems: A Natural Formulation based on Geometric Measure Theory and Discrete Exterior Calculus |
Authors | Thomas Möllenhoff, Daniel Cremers |
Abstract | Numerous tasks in imaging and vision can be formulated as variational problems over vector-valued maps. We approach the relaxation and convexification of such vectorial variational problems via a lifting to the space of currents. To that end, we recall that functionals with polyconvex Lagrangians can be reparametrized as convex one-homogeneous functionals on the graph of the function. This leads to an equivalent shape optimization problem over oriented surfaces in the product space of domain and codomain. A convex formulation is then obtained by relaxing the search space from oriented surfaces to more general currents. We propose a discretization of the resulting infinite-dimensional optimization problem using Whitney forms, which also generalizes recent “sublabel-accurate” multilabeling approaches. |
Tasks | |
Published | 2019-05-02 |
URL | https://arxiv.org/abs/1905.00851v1 |
https://arxiv.org/pdf/1905.00851v1.pdf | |
PWC | https://paperswithcode.com/paper/lifting-vectorial-variational-problems-a |
Repo | |
Framework | |
On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Title | On Completeness-aware Concept-Based Explanations in Deep Neural Networks |
Authors | Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar |
Abstract | Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model’s prediction behavior. Next, we propose a concept discovery method that aims to infer a complete set of concepts that are additionally encouraged to be interpretable. Our concept discovery method aims to address the limitations of commonly-used methods such as PCA and TCAV. To define an importance score for each discovered concept, we adapt game-theoretic notions to aggregate over sets and propose \emph{ConceptSHAP}. On a synthetic dataset with ground-truth concept explanations, on a real-world dataset, and with a user study, we validate the effectiveness of our framework in finding concepts that are both complete in explaining the decisions and interpretable. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.07969v2 |
https://arxiv.org/pdf/1910.07969v2.pdf | |
PWC | https://paperswithcode.com/paper/on-concept-based-explanations-in-deep-neural |
Repo | |
Framework | |
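The game-theoretic aggregation behind ConceptSHAP can be illustrated with an exact Shapley computation over a toy completeness score. The score function `eta` below is entirely hypothetical; the point is only the mechanics of averaging each concept's marginal contribution over all subsets.

```python
from itertools import combinations
from math import factorial

def shapley(concepts, eta):
    """Exact Shapley values: each concept's average marginal
    contribution to the completeness score eta, taken over all
    subsets of the remaining concepts."""
    n = len(concepts)
    vals = {}
    for c in concepts:
        rest = [x for x in concepts if x != c]
        total = 0.0
        for r in range(n):
            for subset in combinations(rest, r):
                s = frozenset(subset)
                weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                total += weight * (eta(s | {c}) - eta(s))
        vals[c] = total
    return vals

# Hypothetical completeness score for two concepts: 'a' explains 0.6
# of the model's behaviour, 'b' explains 0.3, with 0.1 overlap.
def eta(s):
    score = 0.0
    if 'a' in s:
        score += 0.6
    if 'b' in s:
        score += 0.3
    if {'a', 'b'} <= s:
        score -= 0.1
    return score

vals = shapley(['a', 'b'], eta)
print(vals)  # the two importance scores sum to eta({'a','b'}) = 0.8
```

Exact enumeration is exponential in the number of concepts; ConceptSHAP-style methods rely on approximations for realistic concept counts.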
Video Person Re-Identification using Learned Clip Similarity Aggregation
Title | Video Person Re-Identification using Learned Clip Similarity Aggregation |
Authors | Neeraj Matiyali, Gaurav Sharma |
Abstract | We address the challenging task of video-based person re-identification. Recent works have shown that splitting the video sequences into clips and then aggregating clip based similarity is appropriate for the task. We show that using a learned clip similarity aggregation function allows filtering out hard clip pairs, e.g. where the person is not clearly visible, is in a challenging pose, or where the poses in the two clips are too different to be informative. This allows the method to focus on clip pairs which are more informative for the task. We also introduce the use of 3D CNNs for video-based re-identification and show their effectiveness by performing on par with previous works, which use optical flow in addition to RGB, while using RGB inputs only. We give quantitative results on three challenging public benchmarks and show better or competitive performance. We also validate our method qualitatively. |
Tasks | Optical Flow Estimation, Person Re-Identification, Video-Based Person Re-Identification |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08055v1 |
https://arxiv.org/pdf/1910.08055v1.pdf | |
PWC | https://paperswithcode.com/paper/video-person-re-identification-using-learned |
Repo | |
Framework | |
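The effect of weighting clip pairs rather than averaging them can be sketched with a simple stand-in for the learned aggregation function: a softmax over the similarities themselves, which down-weights uninformative (e.g. occluded) pairs. The temperature value and the similarity numbers are illustrative assumptions, not from the paper.

```python
import math

def aggregate(sims, temperature=0.5):
    """Softmax-weighted aggregation of clip-pair similarities: low
    similarities get small weights instead of dragging the
    sequence-level score down, mimicking learned pair filtering."""
    weights = [math.exp(s / temperature) for s in sims]
    z = sum(weights)
    return sum(w * s for w, s in zip(weights, sims)) / z

# Three informative clip pairs and one occluded outlier.
sims = [0.8, 0.75, 0.85, 0.1]
print(sum(sims) / len(sims))  # plain mean: 0.625, pulled down by the outlier
print(aggregate(sims))        # weighted score stays near the informative pairs
```

A learned aggregator would condition on clip content rather than on the similarity value alone, but the mean-versus-weighted contrast is the same.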
A Generalization Bound for Online Variational Inference
Title | A Generalization Bound for Online Variational Inference |
Authors | Badr-Eddine Chérief-Abdellatif, Pierre Alquier, Mohammad Emtiyaz Khan |
Abstract | Bayesian inference provides an attractive online-learning framework to analyze sequential data, and offers generalization guarantees which hold even with model mismatch and adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this paper, we show that this is indeed the case for some variational inference (VI) algorithms. We consider a few existing online, tempered VI algorithms, as well as a new algorithm, and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that the result should hold more generally and present empirical evidence in support of this. Our work presents theoretical justifications in favor of online algorithms relying on approximate Bayesian methods. |
Tasks | Bayesian Inference |
Published | 2019-04-08 |
URL | https://arxiv.org/abs/1904.03920v2 |
https://arxiv.org/pdf/1904.03920v2.pdf | |
PWC | https://paperswithcode.com/paper/a-generalization-bound-for-online-variational |
Repo | |
Framework | |
Deep Learning-based Universal Beamformer for Ultrasound Imaging
Title | Deep Learning-based Universal Beamformer for Ultrasound Imaging |
Authors | Shujaat Khan, Jaeyoung Huh, Jong Chul Ye |
Abstract | In ultrasound (US) imaging, individual channel RF measurements are back-propagated and accumulated to form an image after applying specific delays. While this time reversal is usually implemented using a hardware- or software-based delay-and-sum (DAS) beamformer, the performance of DAS decreases rapidly in situations where data acquisition is not ideal. Herein, for the first time, we demonstrate that a single data-driven adaptive beamformer designed as a deep neural network can generate high-quality images robustly for various detector channel configurations and subsampling rates. The proposed deep beamformer is evaluated for two distinct acquisition schemes: focused ultrasound imaging and planewave imaging. Experimental results showed that the proposed deep beamformer exhibits significant performance gains for both focused and planewave imaging schemes, in terms of contrast-to-noise ratio and structural similarity. |
Tasks | |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.02843v2 |
https://arxiv.org/pdf/1904.02843v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-based-universal-beamformer-for |
Repo | |
Framework | |
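The DAS baseline that the deep beamformer replaces is easy to state in code: for each image point, compute every channel's round-trip delay, pick the matching RF sample, and sum. The sketch below builds a synthetic single-scatterer acquisition (geometry, sampling rate, and sound speed are illustrative choices) and shows that the sum is large only at the true scatterer location.

```python
import math

def das_pixel(rf, element_x, px, pz, c=1540.0, fs=40e6):
    """Delay-and-sum for one image point (px, pz): per-channel
    round-trip delay selects an RF sample, summed across channels.
    Assumes a plane-wave transmit, so the transmit path is pz."""
    total = 0.0
    for ch, x in enumerate(element_x):
        dist = pz + math.hypot(px - x, pz)   # transmit + receive path
        sample = int(round(dist / c * fs))
        if 0 <= sample < len(rf[ch]):
            total += rf[ch][sample]
    return total

# Synthetic data: one scatterer at (0 m, 20 mm); each channel's RF
# trace holds a unit impulse at that scatterer's round-trip sample.
c, fs = 1540.0, 40e6
elems = [-0.005 + 0.0025 * i for i in range(5)]
pz_true = 0.02
rf = []
for x in elems:
    trace = [0.0] * 4096
    dist = pz_true + math.hypot(0.0 - x, pz_true)
    trace[int(round(dist / c * fs))] = 1.0
    rf.append(trace)

on_target = das_pixel(rf, elems, 0.0, pz_true)
off_target = das_pixel(rf, elems, 0.0, 0.025)
print(on_target, off_target)  # the focused sum dominates the off-focus sum
```

The deep beamformer in the paper learns this mapping from channel data to image directly, which is what lets it stay robust when the DAS assumptions (full channel counts, ideal sampling) break down.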
Neural Program Repair by Jointly Learning to Localize and Repair
Title | Neural Program Repair by Jointly Learning to Localize and Repair |
Authors | Marko Vasic, Aditya Kanade, Petros Maniatis, David Bieber, Rishabh Singh |
Abstract | Due to its potential to improve programmer productivity and software quality, automated program repair has been an active topic of research. Newer techniques harness neural networks to learn directly from examples of buggy programs and their fixes. In this work, we consider a recently identified class of bugs called variable-misuse bugs. The state-of-the-art solution for variable misuse enumerates potential fixes for all possible bug locations in a program before selecting the best prediction. We show that it is beneficial to train a model that jointly and directly localizes and repairs variable-misuse bugs. We present multi-headed pointer networks for this purpose, with one head each for localization and repair. The experimental results show that the joint model significantly outperforms an enumerative solution that uses a pointer-based model for repair alone. |
Tasks | |
Published | 2019-04-03 |
URL | http://arxiv.org/abs/1904.01720v1 |
http://arxiv.org/pdf/1904.01720v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-program-repair-by-jointly-learning-to-1 |
Repo | |
Framework | |
Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning
Title | Localization of Unmanned Aerial Vehicles in Corridor Environments using Deep Learning |
Authors | Ram Prasad Padhy, Shahzad Ahmad, Sachin Verma, Pankaj Kumar Sa, Sambit Bakshi |
Abstract | Vision-based pose estimation of Unmanned Aerial Vehicles (UAV) in unknown environments is a rapidly growing research area in the field of robot vision. The task becomes more complex when the only available sensor is a static single camera (monocular vision). In this regard, we propose a monocular vision assisted localization algorithm that helps a UAV navigate safely in indoor corridor environments. The aim is always to navigate the UAV forward through the corridor, keeping it at the center without orienting to either the left or the right side. The algorithm takes the RGB image captured by the UAV front camera and passes it through a trained deep neural network (DNN) to predict the position of the UAV as on the left, center, or right side of the corridor. Depending upon the divergence of the UAV with respect to the central bisector line (CBL) of the corridor, a suitable command is generated to bring the UAV to the center. When the UAV is at the center of the corridor, a new image is passed through another trained DNN to predict the orientation of the UAV with respect to the CBL. If the UAV is tilted either left or right, an appropriate command is generated to rectify the orientation. We also propose a new corridor dataset, named NITRCorrV1, which contains images captured by the UAV front camera at all possible locations of a variety of corridors. An exhaustive set of experiments in different corridors reveals the efficacy of the proposed algorithm. |
Tasks | Pose Estimation |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09021v1 |
http://arxiv.org/pdf/1903.09021v1.pdf | |
PWC | https://paperswithcode.com/paper/localization-of-unmanned-aerial-vehicles-in |
Repo | |
Framework | |
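The two-stage decision procedure described in the abstract (correct lateral position first, then orientation) reduces to a small control function. The sketch below uses stub classifiers in place of the two trained DNNs, and the command names are hypothetical labels, not from the paper.

```python
def corridor_command(position_net, orientation_net, image):
    """Two-stage corridor control: first correct lateral offset from
    the central bisector line; only when centred, correct the yaw.
    The *_net arguments stand in for the two trained DNN classifiers."""
    position = position_net(image)        # 'left' | 'center' | 'right'
    if position == 'left':
        return 'move_right'
    if position == 'right':
        return 'move_left'
    orientation = orientation_net(image)  # 'left_tilt' | 'straight' | 'right_tilt'
    if orientation == 'left_tilt':
        return 'yaw_right'
    if orientation == 'right_tilt':
        return 'yaw_left'
    return 'move_forward'

# Stub classifiers make the control loop runnable without the DNNs.
cmd_off_center = corridor_command(lambda img: 'left', lambda img: 'straight', None)
cmd_tilted = corridor_command(lambda img: 'center', lambda img: 'right_tilt', None)
print(cmd_off_center, cmd_tilted)
```

Note that the orientation network is consulted only once the position network reports 'center', mirroring the sequencing in the abstract.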