October 17, 2019


Paper Group ANR 769

EV-SegNet: Semantic Segmentation for Event-based Cameras. Training Recurrent Neural Networks via Dynamical Trajectory-Based Optimization. Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning. Abdominal Aortic Aneurysm Segmentation with a Small Number of Training Subjects. A Smart Security System with Face Recognition. Near- …

EV-SegNet: Semantic Segmentation for Event-based Cameras


Title	EV-SegNet: Semantic Segmentation for Event-based Cameras
Authors	Iñigo Alonso, Ana C. Murillo
Abstract	Event cameras, or Dynamic Vision Sensor (DVS), are very promising sensors which have shown several advantages over frame based cameras. However, most recent work on real applications of these cameras is focused on 3D reconstruction and 6-DOF camera tracking. Deep learning based approaches, which are leading the state-of-the-art in visual recognition tasks, could potentially take advantage of the benefits of DVS, but some adaptations are still needed in order to effectively work on these cameras. This work introduces a first baseline for semantic segmentation with this kind of data. We build a semantic segmentation CNN based on state-of-the-art techniques which takes event information as the only input. Besides, we propose a novel representation for DVS data that outperforms previously used event representations for related tasks. Since there is no existing labeled dataset for this task, we propose how to automatically generate approximated semantic segmentation labels for some sequences of the DDD17 dataset, which we publish together with the model, and demonstrate they are valid to train a model for DVS data only. We compare our results on semantic segmentation from DVS data with results using corresponding grayscale images, demonstrating how they are complementary and worth combining.
Tasks	3D Reconstruction, Semantic Segmentation
Published	2018-11-29
URL	http://arxiv.org/abs/1811.12039v1
PDF	http://arxiv.org/pdf/1811.12039v1.pdf
PWC	https://paperswithcode.com/paper/ev-segnet-semantic-segmentation-for-event
Repo
Framework

Training Recurrent Neural Networks via Dynamical Trajectory-Based Optimization


Title	Training Recurrent Neural Networks via Dynamical Trajectory-Based Optimization
Authors	Hamid Khodabandehlou, M. Sami Fadali
Abstract	This paper introduces a new method to train recurrent neural networks using dynamical trajectory-based optimization. The optimization method utilizes a projected gradient system (PGS) and a quotient gradient system (QGS) to determine the feasible regions of an optimization problem and search the feasible regions for local minima. By exploring the feasible regions, local minima are identified and the local minimum with the lowest cost is chosen as the global minimum of the optimization problem. Lyapunov theory is used to prove the stability of the local minima and their stability in the presence of measurement errors. Numerical examples show that the new approach provides better results than genetic algorithm and error backpropagation (EBP) trained networks.
Tasks
Published	2018-05-10
URL	http://arxiv.org/abs/1805.04152v1
PDF	http://arxiv.org/pdf/1805.04152v1.pdf
PWC	https://paperswithcode.com/paper/training-recurrent-neural-networks-via
Repo
Framework

Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning


Title	Multi-modal Feedback for Affordance-driven Interactive Reinforcement Learning
Authors	Francisco Cruz, German I. Parisi, Stefan Wermter
Abstract	Interactive reinforcement learning (IRL) extends traditional reinforcement learning (RL) by allowing an agent to interact with parent-like trainers during a task. In this paper, we present an IRL approach using dynamic audio-visual input in terms of vocal commands and hand gestures as feedback. Our architecture integrates multi-modal information to provide robust commands from multiple sensory cues along with a confidence value indicating the trustworthiness of the feedback. The integration process also considers the case in which the two modalities convey incongruent information. Additionally, we modulate the influence of sensory-driven feedback in the IRL task using goal-oriented knowledge in terms of contextual affordances. We implement a neural network architecture to predict the effect of performed actions with different objects to avoid failed-states, i.e., states from which it is not possible to accomplish the task. In our experimental setup, we explore the interplay of multimodal feedback and task-specific affordances in a robot cleaning scenario. We compare the learning performance of the agent under four different conditions: traditional RL, multi-modal IRL, and each of these two setups with the use of contextual affordances. Our experiments show that the best performance is obtained by using audio-visual feedback with affordance-modulated IRL. The obtained results demonstrate the importance of multi-modal sensory processing integrated with goal-oriented knowledge in IRL tasks.
Tasks
Published	2018-07-26
URL	http://arxiv.org/abs/1807.09991v1
PDF	http://arxiv.org/pdf/1807.09991v1.pdf
PWC	https://paperswithcode.com/paper/multi-modal-feedback-for-affordance-driven
Repo
Framework

Abdominal Aortic Aneurysm Segmentation with a Small Number of Training Subjects


Title	Abdominal Aortic Aneurysm Segmentation with a Small Number of Training Subjects
Authors	Jian-Qing Zheng, Xiao-Yun Zhou, Qing-Biao Li, Celia Riga, Guang-Zhong Yang
Abstract	Pre-operative Abdominal Aortic Aneurysm (AAA) 3D shape is critical for customized stent-graft design in Fenestrated Endovascular Aortic Repair (FEVAR). Traditional segmentation approaches implement expert-designed feature extractors while recent deep neural networks extract features automatically with multiple non-linear modules. Usually, a large training dataset is essential for applying deep learning on AAA segmentation. In this paper, the AAA was segmented using U-net with a small number (two) of training subjects. Firstly, Computed Tomography Angiography (CTA) slices were augmented with gray value variation and translation to avoid the overfitting caused by the small number of training subjects. Then, U-net was trained to segment the AAA. Dice Similarity Coefficients (DSCs) over 0.8 were achieved on the testing subjects. The PLZ, DLZ and aortic branches are all reconstructed reasonably, which will facilitate stent graft customization and help shape instantiation for intra-operative surgery navigation in FEVAR.
Tasks
Published	2018-04-09
URL	http://arxiv.org/abs/1804.02943v1
PDF	http://arxiv.org/pdf/1804.02943v1.pdf
PWC	https://paperswithcode.com/paper/abdominal-aortic-aneurysm-segmentation-with-a
Repo
Framework
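
The Dice Similarity Coefficient reported in the abstract above can be illustrated with a minimal sketch. It assumes binary masks given as flat lists of 0/1 values; the function name and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks (flat sequences of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention chosen here (an assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 2 overlapping foreground pixels, 3 foreground pixels in each mask.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(pred, truth)
print(round(score, 4))
```

A score of 1.0 means the predicted and reference masks coincide exactly, so the "over 0.8" figure in the abstract indicates substantial but not perfect overlap.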

A Smart Security System with Face Recognition


Title	A Smart Security System with Face Recognition
Authors	Trung Nguyen, Barth Lakshmanan, Weihua Sheng
Abstract	Web-based technology has improved drastically in the past decade. As a result, security technology has become a major help in protecting our daily life. In this paper, we propose a robust security system based on face recognition (SoF). In particular, we develop this system to give authenticated users access to a home. The classifier is trained using a new adaptive learning method. The training data are initially collected from social networks. The accuracy of the classifier is incrementally improved as the user starts using the system. A novel method has been introduced to improve the classifier model through human interaction and social media. By using a deep learning framework, TensorFlow, it will be easy to reuse the framework and adapt it to many devices and applications.
Tasks	Face Recognition
Published	2018-12-03
URL	http://arxiv.org/abs/1812.09127v1
PDF	http://arxiv.org/pdf/1812.09127v1.pdf
PWC	https://paperswithcode.com/paper/a-smart-security-system-with-face-recognition
Repo
Framework

Near-lossless $\ell_\infty$-constrained Image Decompression via Deep Neural Network


Title	Near-lossless $\ell_\infty$-constrained Image Decompression via Deep Neural Network
Authors	Xi Zhang, Xiaolin Wu
Abstract	Recently a number of CNN-based techniques were proposed to remove image compression artifacts. As in other restoration applications, these techniques all learn a mapping from decompressed patches to the original counterparts under the ubiquitous $\ell_2$ metric. However, this approach is incapable of restoring distinctive image details which may be statistical outliers but have high semantic importance (e.g., tiny lesions in medical images). To overcome this weakness, we propose to incorporate an $\ell_\infty$ fidelity criterion in the design of neural network so that no small, distinctive structures of the original image can be dropped or distorted. Experimental results demonstrate that the proposed method outperforms the state-of-the-art methods in $\ell_\infty$ error metric and perceptual quality, while being competitive in $\ell_2$ error metric as well. It can restore subtle image details that are otherwise destroyed or missed by other algorithms. Our research suggests a new machine learning paradigm of ultra high fidelity image compression that is ideally suited for applications in medicine, space, and sciences.
Tasks	Image Compression, Image Restoration
Published	2018-01-18
URL	https://arxiv.org/abs/1801.07987v5
PDF	https://arxiv.org/pdf/1801.07987v5.pdf
PWC	https://paperswithcode.com/paper/near-lossless-l-infinity-constrained-multi
Repo
Framework
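
To make the contrast between the two error metrics in the abstract above concrete, here is a small sketch. It assumes images flattened to lists of pixel intensities; the function names and the synthetic "lesion" data are hypothetical illustrations, not the paper's method or data.

```python
import math


def l2_error(original, restored):
    """Root-mean-square error: an averaged, l2-style measure of distortion."""
    squared = sum((o - r) ** 2 for o, r in zip(original, restored))
    return math.sqrt(squared / len(original))


def linf_error(original, restored):
    """Largest absolute error at any single pixel (the l-infinity norm)."""
    return max(abs(o - r) for o, r in zip(original, restored))


# Synthetic image: a flat background with one bright outlier pixel.
original = [100] * 9999 + [200]
# Restoration A erases the outlier but is exact everywhere else.
smoothed = [100] * 10000
# Restoration B is slightly off at every pixel but keeps the outlier.
bounded = [102] * 9999 + [198]

print(l2_error(original, smoothed), linf_error(original, smoothed))
print(l2_error(original, bounded), linf_error(original, bounded))
```

In this toy case the averaged metric prefers the restoration that deletes the outlier, while the maximum-error metric prefers the one that keeps it, which is the weakness the abstract attributes to averaged losses.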

SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation


Title	SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation
Authors	Xiaolin Zhang, Yunchao Wei, Yi Yang, Thomas Huang
Abstract	One-shot semantic segmentation poses a challenging task of recognizing the object regions from unseen categories with only one annotated example as supervision. In this paper, we propose a simple yet effective Similarity Guidance network to tackle the One-shot (SG-One) segmentation problem. We aim at predicting the segmentation mask of a query image with reference to one densely labeled support image. To obtain a robust representative feature of the support image, we first propose a masked average pooling strategy for producing the guidance features using only the pixels belonging to the support image. We then leverage the cosine similarity to build the relationship between the guidance features and features of pixels from the query image. In this way, the possibilities embedded in the produced similarity maps can be adapted to guide the process of segmenting objects. Furthermore, our SG-One is a unified framework which can efficiently process both support and query images within one network and be learned in an end-to-end manner. We conduct extensive experiments on Pascal VOC 2012. In particular, our SG-One achieves the mIoU score of 46.3%, which is a new state-of-the-art.
Tasks	Semantic Segmentation
Published	2018-10-22
URL	http://arxiv.org/abs/1810.09091v3
PDF	http://arxiv.org/pdf/1810.09091v3.pdf
PWC	https://paperswithcode.com/paper/sg-one-similarity-guidance-network-for-one
Repo
Framework
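
The two operations named in the abstract above, masked average pooling and cosine similarity, can be sketched as follows. This is a toy illustration under stated assumptions: features are given as one small vector per pixel, the mask is taken to mark the annotated pixels of the support image, and all names and values are hypothetical rather than taken from the SG-One code.

```python
import math


def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels whose mask value is 1."""
    selected = [f for f, m in zip(features, mask) if m]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between two vectors; eps guards against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


# Support image: four pixels with 2-D features; the mask marks the first two.
support_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]
guidance = masked_average_pooling(support_features, support_mask)

# Query image: one pixel resembling the masked region, one that does not.
query_features = [[0.8, 0.2], [0.0, 1.0]]
similarity_map = [cosine_similarity(guidance, q) for q in query_features]
print(guidance, similarity_map)
```

The resulting similarity map is high for query pixels whose features point in the same direction as the guidance vector and low otherwise, which is the signal the abstract describes as guiding segmentation.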

Preserved Structure Across Vector Space Representations


Title	Preserved Structure Across Vector Space Representations
Authors	Andrei Amatuni, Estelle He, Elika Bergelson
Abstract	Certain concepts, words, and images are intuitively more similar than others (dog vs. cat, dog vs. spoon), though quantifying such similarity is notoriously difficult. Indeed, this kind of computation is likely a critical part of learning the category boundaries for words within a given language. Here, we use a set of 27 items (e.g. ‘dog’) that are highly common in infants’ input, and use both image- and word-based algorithms to independently compute similarity among them. We find three key results. First, the pairwise item similarities derived within image-space and word-space are correlated, suggesting preserved structure among these extremely different representational formats. Second, the closest ‘neighbors’ for each item, within each space, showed significant overlap (e.g. both found ‘egg’ as a neighbor of ‘apple’). Third, items with the most overlapping neighbors are later-learned by infants and toddlers. We conclude that this approach, which does not rely on human ratings of similarity, may nevertheless reflect stable within-class structure across these two spaces. We speculate that such invariance might aid lexical acquisition, by serving as an informative marker of category boundaries.
Tasks
Published	2018-02-02
URL	http://arxiv.org/abs/1802.00840v2
PDF	http://arxiv.org/pdf/1802.00840v2.pdf
PWC	https://paperswithcode.com/paper/preserved-structure-across-vector-space
Repo
Framework
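
The first result in the abstract above, that pairwise similarities in image-space and word-space are correlated, can be sketched as below. The item vectors are made-up toy values, cosine similarity and Pearson correlation are assumed choices for illustration, and three items stand in for the 27 used in the paper.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def pairwise_similarities(vectors):
    """Cosine similarity for every unordered pair of items, in a fixed order."""
    return [cosine(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical embeddings for (dog, cat, spoon) in two unrelated spaces.
image_space = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]]
word_space = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.0, 0.1, 0.9]]

r = pearson(pairwise_similarities(image_space), pairwise_similarities(word_space))
print(round(r, 3))
```

Because only the pairwise similarities are compared, the two spaces may have different dimensionality, which is what allows image-based and word-based representations to be related at all.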

ViewpointS: towards a Collective Brain


Title	ViewpointS: towards a Collective Brain
Authors	Philippe Lemoisson, Stefano A. Cerri
Abstract	Tracing knowledge acquisition and linking learning events to interaction between peers is a major challenge of our times. We have conceived, designed and evaluated a new paradigm for constructing and using collective knowledge by Web interactions that we called ViewpointS. By exploiting the similarity with Edelman’s Theory of Neuronal Group Selection (TNGS), we conjecture that it may be metaphorically considered a Collective Brain, especially effective in the case of trans-disciplinary representations. Far from being without doubts, in the paper we present the reasons (and the limits) of our proposal that aims to become a useful integrating tool for future quantitative explorations of individual as well as collective learning at different degrees of granularity. We are therefore challenging each of the current approaches: the logical one in the semantic Web, the statistical one in mining and deep learning, the social one in recommender systems based on authority and trust; not in each of their own preferred field of operation, rather in their integration weaknesses far from the holistic and dynamic behavior of the human brain.
Tasks	Recommendation Systems
Published	2018-09-03
URL	http://arxiv.org/abs/1809.00564v1
PDF	http://arxiv.org/pdf/1809.00564v1.pdf
PWC	https://paperswithcode.com/paper/viewpoints-towards-a-collective-brain
Repo
Framework

Data Poisoning Attack against Unsupervised Node Embedding Methods


Title	Data Poisoning Attack against Unsupervised Node Embedding Methods
Authors	Mingjie Sun, Jian Tang, Huichen Li, Bo Li, Chaowei Xiao, Yao Chen, Dawn Song
Abstract	Unsupervised node embedding methods (e.g., DeepWalk, LINE, and node2vec) have attracted growing interest given their simplicity and effectiveness. However, although these methods have been proved effective in a variety of applications, none of the existing work has analyzed their robustness. This could be very risky if these methods are attacked by an adversarial party. In this paper, we take the task of link prediction as an example, which is one of the most fundamental problems for graph analysis, and introduce a data poisoning attack to node embedding methods. We give a complete characterization of attacker’s utilities and present efficient solutions to adversarial attacks for two popular node embedding methods: DeepWalk and LINE. We evaluate our proposed attack model on multiple real-world graphs. Experimental results show that our proposed model can significantly affect the results of link prediction by slightly changing the graph structures (e.g., adding or removing a few edges). We also show that our proposed model is very general and can be transferable across different embedding methods. Finally, we conduct a case study on a coauthor network to better understand our attack method.
Tasks	data poisoning, Link Prediction
Published	2018-10-30
URL	http://arxiv.org/abs/1810.12881v2
PDF	http://arxiv.org/pdf/1810.12881v2.pdf
PWC	https://paperswithcode.com/paper/data-poisoning-attack-against-unsupervised
Repo
Framework

Episodic Memory Deep Q-Networks


Title	Episodic Memory Deep Q-Networks
Authors	Zichuan Lin, Tianqi Zhao, Guangwen Yang, Lintao Zhang
Abstract	Reinforcement learning (RL) algorithms have made huge progress in recent years by leveraging the power of deep neural networks (DNN). Despite the success, deep RL algorithms are known to be sample inefficient, often requiring many rounds of interaction with the environments to obtain satisfactory performance. Recently, episodic memory based RL has attracted attention due to its ability to latch on good actions quickly. In this paper, we present a simple yet effective biologically inspired RL algorithm called Episodic Memory Deep Q-Networks (EMDQN), which leverages episodic memory to supervise an agent during training. Experiments show that our proposed method can lead to better sample efficiency and is more likely to find good policies. It only requires 1/5 of the interactions of DQN to achieve many state-of-the-art performances on Atari games, significantly outperforming regular DQN and other episodic memory based RL algorithms.
Tasks	Atari Games
Published	2018-05-19
URL	http://arxiv.org/abs/1805.07603v1
PDF	http://arxiv.org/pdf/1805.07603v1.pdf
PWC	https://paperswithcode.com/paper/episodic-memory-deep-q-networks
Repo
Framework
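
One way to read "leverages episodic memory to supervise an agent during training" in the abstract above is sketched below. This is an assumption-laden illustration, not the authors' implementation: the memory is assumed to store the best discounted return seen for each state-action pair, and the loss is assumed to add a weighted squared penalty toward that stored value on top of the usual TD error. The class name, function name, and the weight `lam` are hypothetical.

```python
class EpisodicMemory:
    """Table of the best discounted return observed for each (state, action)."""

    def __init__(self):
        self.table = {}

    def update(self, episode, gamma):
        """episode: list of (state, action, reward) tuples in time order."""
        ret = 0.0
        # Walk backwards so each step's return includes all later rewards.
        for state, action, reward in reversed(episode):
            ret = reward + gamma * ret
            key = (state, action)
            self.table[key] = max(self.table.get(key, float("-inf")), ret)

    def lookup(self, state, action):
        """Return the stored best return, or None if the pair is unseen."""
        return self.table.get((state, action))


def combined_loss(q_value, td_target, memory_target, lam=0.1):
    """Squared TD error plus an (assumed) weighted pull toward the memory value."""
    loss = (q_value - td_target) ** 2
    if memory_target is not None:
        loss += lam * (q_value - memory_target) ** 2
    return loss


memory = EpisodicMemory()
memory.update([("s0", "a", 0.0), ("s1", "a", 1.0)], gamma=0.9)
memory.update([("s0", "a", 0.5)], gamma=0.9)  # worse return, so the entry is kept
best = memory.lookup("s0", "a")
loss = combined_loss(q_value=0.5, td_target=0.7, memory_target=best, lam=0.1)
print(best, loss)
```

Keeping the maximum return per state-action pair is what lets good actions be "latched onto" after a single successful episode, instead of waiting for many gradual value updates.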

Learning Student Networks via Feature Embedding


Title	Learning Student Networks via Feature Embedding
Authors	Hanting Chen, Yunhe Wang, Chang Xu, Chao Xu, Dacheng Tao
Abstract	Deep convolutional neural networks have been widely used in numerous applications, but their demanding storage and computational resource requirements prevent their applications on mobile devices. Knowledge distillation aims to optimize a portable student network by taking the knowledge from a well-trained heavy teacher network. Traditional teacher-student based methods used to rely on additional fully-connected layers to bridge intermediate layers of teacher and student networks, which brings in a large number of auxiliary parameters. In contrast, this paper aims to propagate information from teacher to student without introducing new variables which need to be optimized. We regard the teacher-student paradigm from a new perspective of feature embedding. By introducing the locality preserving loss, the student network is encouraged to generate low-dimensional features which inherit intrinsic properties of their corresponding high-dimensional features from the teacher network. The resulting portable network can thus naturally maintain the performance of the teacher network. Theoretical analysis is provided to justify the lower computation complexity of the proposed method. Experiments on benchmark datasets and well-trained networks suggest that the proposed algorithm is superior to state-of-the-art teacher-student learning methods in terms of computational and storage complexity.
Tasks
Published	2018-12-17
URL	http://arxiv.org/abs/1812.06597v1
PDF	http://arxiv.org/pdf/1812.06597v1.pdf
PWC	https://paperswithcode.com/paper/learning-student-networks-via-feature
Repo
Framework

A General Pipeline for 3D Detection of Vehicles


Title	A General Pipeline for 3D Detection of Vehicles
Authors	Xinxin Du, Marcelo H. Ang Jr., Sertac Karaman, Daniela Rus
Abstract	Autonomous driving requires 3D perception of vehicles and other objects in the environment. Most current methods support 2D vehicle detection. This paper proposes a flexible pipeline to adopt any 2D detection network and fuse it with a 3D point cloud to generate 3D information with minimum changes of the 2D detection networks. To identify the 3D box, an effective model fitting algorithm is developed based on generalised car models and score maps. A two-stage convolutional neural network (CNN) is proposed to refine the detected 3D box. This pipeline is tested on the KITTI dataset using two different 2D detection networks. The 3D detection results based on these two networks are similar, demonstrating the flexibility of the proposed pipeline. The results rank second among the 3D detection algorithms, indicating its competencies in 3D detection.
Tasks	3D Object Detection, Autonomous Driving
Published	2018-02-12
URL	http://arxiv.org/abs/1803.00387v1
PDF	http://arxiv.org/pdf/1803.00387v1.pdf
PWC	https://paperswithcode.com/paper/a-general-pipeline-for-3d-detection-of
Repo
Framework

Learning a Robust Society of Tracking Parts using Co-occurrence Constraints


Title	Learning a Robust Society of Tracking Parts using Co-occurrence Constraints
Authors	Elena Burceanu, Marius Leordeanu
Abstract	Object tracking is an essential problem in computer vision that has been researched for several decades. One of the main challenges in tracking is to adapt to object appearance changes over time and avoid drifting to background clutter. We address this challenge by proposing a deep neural network composed of different parts, which functions as a society of tracking parts. They work in conjunction according to a certain policy and learn from each other in a robust manner, using co-occurrence constraints that ensure robust inference and learning. From a structural point of view, our network is composed of two main pathways. One pathway is more conservative. It carefully monitors a large set of simple tracker parts learned as linear filters over deep feature activation maps. It assigns the parts different roles. It promotes the reliable ones and removes the inconsistent ones. We learn these filters simultaneously in an efficient way, with a single closed-form formulation, for which we propose novel theoretical properties. The second pathway is more progressive. It is learned completely online and thus it is able to better model object appearance changes. In order to adapt in a robust manner, it is learned only on highly confident frames, which are decided using co-occurrences with the first pathway. Thus, our system has the full benefit of two main approaches in tracking. The larger set of simpler filter parts offers robustness, while the full deep network learned online provides adaptability to change. As shown in the experimental section, our approach achieves state of the art performance on the challenging VOT17 benchmark, outperforming the published methods both on the general EAO metric and in the number of fails, by a significant margin.
Tasks	Object Tracking
Published	2018-04-05
URL	http://arxiv.org/abs/1804.01771v2
PDF	http://arxiv.org/pdf/1804.01771v2.pdf
PWC	https://paperswithcode.com/paper/learning-a-robust-society-of-tracking-parts
Repo
Framework

A Human Mixed Strategy Approach to Deep Reinforcement Learning


Title	A Human Mixed Strategy Approach to Deep Reinforcement Learning
Authors	Ngoc Duy Nguyen, Saeid Nahavandi, Thanh Nguyen
Abstract	In 2015, Google’s DeepMind announced an advancement in creating an autonomous agent based on deep reinforcement learning (DRL) that could beat a professional player in a series of 49 Atari games. However, the current manifestation of DRL is still immature, and has significant drawbacks. One of DRL’s imperfections is its lack of “exploration” during the training process, especially when working with high-dimensional problems. In this paper, we propose a mixed strategy approach that mimics human behavior when interacting with the environment, and create a “thinking” agent that allows for more efficient exploration in the DRL training process. The simulation results based on the Breakout game show that our scheme achieves a higher probability of obtaining a maximum score than does the baseline DRL algorithm, i.e., the asynchronous advantage actor-critic method. The proposed scheme therefore can be applied effectively to solving a complicated task in a real-world application.
Tasks	Atari Games, Efficient Exploration
Published	2018-04-05
URL	http://arxiv.org/abs/1804.01874v1
PDF	http://arxiv.org/pdf/1804.01874v1.pdf
PWC	https://paperswithcode.com/paper/a-human-mixed-strategy-approach-to-deep
Repo
Framework