January 26, 2020

Paper Group ANR 1423

Attention-Guided Lightweight Network for Real-Time Segmentation of Robotic Surgical Instruments


Title	Attention-Guided Lightweight Network for Real-Time Segmentation of Robotic Surgical Instruments
Authors	Zhen-Liang Ni, Gui-Bin Bian, Zeng-Guang Hou, Xiao-Hu Zhou, Xiao-Liang Xie, Zhen Li
Abstract	Real-time segmentation of surgical instruments plays a crucial role in robot-assisted surgery. However, real-time segmentation of surgical instruments using current deep learning models is still a challenging task due to their high computational costs and slow inference speed. In this paper, an attention-guided lightweight network (LWANet) is proposed to segment surgical instruments in real-time. LWANet adopts the encoder-decoder architecture, where the encoder is the lightweight network MobileNetV2 and the decoder consists of depth-wise separable convolution, attention fusion block, and transposed convolution. Depth-wise separable convolution is used as the basic unit to construct the decoder, which reduces the model size and computational costs. The attention fusion block captures global context and encodes semantic dependencies between channels to emphasize target regions, which helps locate the surgical instrument. Transposed convolution is performed to upsample the feature map for acquiring refined edges. LWANet can segment surgical instruments in real-time at low computational cost. Based on 960×544 inputs, its inference speed can reach 39 fps with only 3.39 GFLOPs. It also has a small model size, with only 2.06 M parameters. The proposed network is evaluated on two datasets. It achieves state-of-the-art performance of 94.10% mean IOU on Cata7 and sets a new record on EndoVis 2017 with a 4.10% increase in mean IOU.
Tasks
Published	2019-10-24
URL	https://arxiv.org/abs/1910.11109v1
PDF	https://arxiv.org/pdf/1910.11109v1.pdf
PWC	https://paperswithcode.com/paper/attention-guided-lightweight-network-for-real
Repo
Framework
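
The abstract credits depth-wise separable convolution with reducing model size. A minimal sketch of that arithmetic follows, ignoring biases. The layer sizes (3×3 kernel, 128 input and 128 output channels) are hypothetical values chosen for illustration and are not taken from LWANet.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
c_in, c_out, k = 128, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(standard, separable, round(standard / separable, 2))  # prints: 147456 17536 8.41
```

For this hypothetical layer the separable form needs roughly one eighth of the weights; the saving grows with the kernel size and the number of output channels.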

Deep Robust Single Image Depth Estimation Neural Network Using Scene Understanding


Title	Deep Robust Single Image Depth Estimation Neural Network Using Scene Understanding
Authors	Haoyu Ren, Mostafa El-khamy, Jungwon Lee
Abstract	Single image depth estimation (SIDE) plays a crucial role in 3D computer vision. In this paper, we propose a two-stage robust SIDE framework that can perform blind SIDE for both indoor and outdoor scenes. At the first stage, the scene understanding module categorizes the RGB image into different depth ranges. We introduce two different scene understanding modules based on scene classification and coarse depth estimation respectively. At the second stage, SIDE networks trained on images of a specific depth range are applied to obtain an accurate depth map. In order to improve the accuracy, we further design a multi-task encoding-decoding SIDE network, DS-SIDENet, based on depthwise separable convolutions. DS-SIDENet is optimized to minimize both depth classification and depth regression losses. This improves the accuracy compared to a single-task SIDE network. Experimental results demonstrate that training DS-SIDENet on an individual dataset such as NYU achieves performance competitive with state-of-the-art methods with much better efficiency. Our proposed robust SIDE framework also shows good performance for the ScanNet indoor images and KITTI outdoor images simultaneously. It achieves the top performance compared to the Robust Vision Challenge (ROB) 2018 submissions.
Tasks	Depth Estimation, Scene Classification, Scene Understanding
Published	2019-06-07
URL	https://arxiv.org/abs/1906.03279v1
PDF	https://arxiv.org/pdf/1906.03279v1.pdf
PWC	https://paperswithcode.com/paper/deep-robust-single-image-depth-estimation
Repo
Framework

Improving Generalization in Meta Reinforcement Learning using Learned Objectives


Title	Improving Generalization in Meta Reinforcement Learning using Learned Objectives
Authors	Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber
Abstract	Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans. Our novel meta reinforcement learning algorithm MetaGenRL is inspired by this process. MetaGenRL distills the experiences of many complex agents to meta-learn a low-complexity neural objective function that decides how future individuals will learn. Unlike recent meta-RL algorithms, MetaGenRL can generalize to new environments that are entirely different from those used for meta-training. In some cases, it even outperforms human-engineered RL algorithms. MetaGenRL uses off-policy second-order gradients during meta-training that greatly increase its sample efficiency.
Tasks
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04098v2
PDF	https://arxiv.org/pdf/1910.04098v2.pdf
PWC	https://paperswithcode.com/paper/improving-generalization-in-meta-1
Repo
Framework

Real-Time EEG Classification via Coresets for BCI Applications


Title	Real-Time EEG Classification via Coresets for BCI Applications
Authors	Eitan Netzer, Alex Frid, Dan Feldman
Abstract	A brain-computer interface (BCI) based on the motor imagery (MI) paradigm translates one’s motor intention into a control signal by classifying the electroencephalogram (EEG) signal of different tasks. However, most existing systems either (i) use a high-quality algorithm to train on the data off-line and run only classification in real-time, since the off-line algorithm is too slow, or (ii) use low-quality heuristics that are sufficiently fast for real-time training but introduce relatively large classification error. In this work, we propose a novel processing pipeline that allows real-time and parallel learning of EEG signals using high-quality but possibly inefficient algorithms. This is done by forging a link between BCI and coresets, a technique that originated in computational geometry for handling streaming data via data summarization. We suggest an algorithm that maintains such a coreset representation, tailored to the EEG signal, which enables: (i) real-time and continuous computation of the Common Spatial Pattern (CSP) feature extraction method on a coreset representation of the signal (instead of on the signal itself), (ii) improvement of the CSP algorithm's efficiency with provable guarantees by applying the CSP algorithm to the coreset, and (iii) real-time addition of data trials (EEG data windows) to the coreset. For simplicity, we focus on the CSP algorithm, which is a classic algorithm. Nevertheless, we expect that our coreset will be extended to other algorithms in future papers. In the experimental results we show that our system can indeed learn EEG signals in real-time, for example in a 64-channel setup with hundreds of time samples per second. Full open source code is provided to reproduce the experiment, in the hope that it will be used and extended to more coresets and BCI applications in the future.
Tasks	Data Summarization, EEG
Published	2019-01-02
URL	http://arxiv.org/abs/1901.00512v1
PDF	http://arxiv.org/pdf/1901.00512v1.pdf
PWC	https://paperswithcode.com/paper/real-time-eeg-classification-via-coresets-for
Repo
Framework

Evaluating the Robustness of Nearest Neighbor Classifiers: A Primal-Dual Perspective


Title	Evaluating the Robustness of Nearest Neighbor Classifiers: A Primal-Dual Perspective
Authors	Lu Wang, Xuanqing Liu, Jinfeng Yi, Zhi-Hua Zhou, Cho-Jui Hsieh
Abstract	We study the problem of computing the minimum adversarial perturbation of the Nearest Neighbor (NN) classifiers. Previous attempts either conduct attacks on continuous approximations of NN models or search for the perturbation by some heuristic methods. In this paper, we propose the first algorithm that is able to compute the minimum adversarial perturbation. The main idea is to formulate the problem as a list of convex quadratic programming (QP) problems that can be efficiently solved by the proposed algorithms for 1-NN models. Furthermore, we show that dual solutions for these QP problems could give us a valid lower bound of the adversarial perturbation that can be used for formal robustness verification, giving us a nice view of attack/verification for NN models. For $K$-NN models with larger $K$, we show that the same formulation can help us efficiently compute the upper and lower bounds of the minimum adversarial perturbation, which can be used for attack and verification.
Tasks
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03972v1
PDF	https://arxiv.org/pdf/1906.03972v1.pdf
PWC	https://paperswithcode.com/paper/evaluating-the-robustness-of-nearest-neighbor
Repo
Framework
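
The abstract formulates the minimum adversarial perturbation for 1-NN as a list of convex QPs. One way to see why the problems are convex: requiring the perturbed point to be at least as close to a differently labelled training point as to a same-label point is a linear constraint on the perturbation. The sketch below handles only the simplest case, a single such constraint, where the QP reduces to a closed-form projection onto a half-space. It is not the paper's algorithm, and the function name `halfspace_perturbation` is hypothetical.

```python
import math


def halfspace_perturbation(x, x_same, x_other):
    """Smallest L2 perturbation d such that x + d is at least as close
    to x_other as to x_same (one linear constraint on d)."""
    # ||x + d - x_other||^2 <= ||x + d - x_same||^2
    #   <=>  a . d <= b,  with  a = 2 (x_same - x_other)
    #                     and   b = ||x - x_same||^2 - ||x - x_other||^2
    a = [2.0 * (s - o) for s, o in zip(x_same, x_other)]
    b = (sum((xi - s) ** 2 for xi, s in zip(x, x_same))
         - sum((xi - o) ** 2 for xi, o in zip(x, x_other)))
    if b >= 0:
        return [0.0] * len(x)  # x is already at least as close to x_other
    scale = b / sum(ai * ai for ai in a)  # project the origin onto a . d = b
    return [scale * ai for ai in a]


# Toy example: the bisector between (1, 0) and (3, 0) is the line x1 = 2.
delta = halfspace_perturbation([0.0, 0.0], [1.0, 0.0], [3.0, 0.0])
print(math.hypot(*delta))
```

When there are several same-label training points, all of their constraints must hold jointly, which is where a QP solver (or the primal-dual machinery the abstract describes) is needed.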

Defending with Shared Resources on a Network


Title	Defending with Shared Resources on a Network
Authors	Minming Li, Long Tran-Thanh, Xiaowei Wu
Abstract	In this paper we consider a defending problem on a network. In the model, the defender holds a total defending resource of R, which can be distributed to the nodes of the network. The defending resource allocated to a node can be shared by its neighbors. Every edge has an associated weight that represents the efficiency with which defending resources are shared between neighboring nodes. We consider the setting in which each attack can affect not only the target node, but its neighbors as well. Assuming that nodes in the network have different treasures to defend and different defending requirements, the defender aims to allocate the defending resource to the nodes so as to minimize the loss due to attack. We give polynomial-time exact algorithms for two important special cases of the network defending problem. For the case when an attack can only affect the target node, we present an LP-based exact algorithm. For the case when defending resources cannot be shared, we present a max-flow-based exact algorithm. We show that the general problem is NP-hard, and we give a 2-approximation algorithm based on LP-rounding. Moreover, by giving a matching lower bound of 2 on the integrality gap of the LP relaxation, we show that our rounding is tight.
Tasks
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08196v1
PDF	https://arxiv.org/pdf/1911.08196v1.pdf
PWC	https://paperswithcode.com/paper/defending-with-shared-resources-on-a-network
Repo
Framework

Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness


Title	Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness
Authors	Antônio H. Ribeiro, Koen Tiels, Luis A. Aguirre, Thomas B. Schön
Abstract	The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle, while powerful, might need some refinement to explain recent developments. We refine the concept of exploding gradients by reformulating the problem in terms of the cost function smoothness, which gives insight into higher-order derivatives and the existence of regions with many close local minima. We also clarify the distinction between vanishing gradients and the need for the RNN to learn attractors to fully use its expressive power. Through the lens of these refinements, we shed new light on recent developments in the RNN field, namely stable RNN and unitary (or orthogonal) RNNs.
Tasks
Published	2019-06-20
URL	https://arxiv.org/abs/1906.08482v3
PDF	https://arxiv.org/pdf/1906.08482v3.pdf
PWC	https://paperswithcode.com/paper/the-trade-off-between-long-term-memory-and
Repo
Framework

Online Sampling from Log-Concave Distributions


Title	Online Sampling from Log-Concave Distributions
Authors	Holden Lee, Oren Mangoubi, Nisheeth K. Vishnoi
Abstract	Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^tf_k}$ for each epoch $t$ in an online manner. Interest in this problem derives from applications in machine learning, Bayesian statistics, and optimization where, rather than obtaining all the observations at once, one constantly acquires new data, and must continuously update the distribution. Our main result is an algorithm that generates roughly independent samples from $\pi_t$ for every epoch $t$ and, under mild assumptions, makes $\mathrm{polylog}(T)$ gradient evaluations per epoch. All previous results imply a bound on the number of gradient or function evaluations which is at least linear in $T$. Motivated by real-world applications, we assume that functions are smooth, their associated distributions have a bounded second moment, and their minimizer drifts in a bounded manner, but do not assume they are strongly convex. In particular, our assumptions hold for online Bayesian logistic regression, when the data satisfy natural regularity properties, giving a sampling algorithm with updates that are poly-logarithmic in $T$. In simulations, our algorithm achieves accuracy comparable to an algorithm specialized to logistic regression. Key to our algorithm is a novel stochastic gradient Langevin dynamics Markov chain with a carefully designed variance reduction step and constant batch size. Technically, lack of strong convexity is a significant barrier to analysis and, here, our main contribution is a martingale exit time argument that shows our Markov chain remains in a ball of radius roughly poly-logarithmic in $T$ for enough time to reach within $\varepsilon$ of $\pi_t$.
Tasks
Published	2019-02-21
URL	https://arxiv.org/abs/1902.08179v4
PDF	https://arxiv.org/pdf/1902.08179v4.pdf
PWC	https://paperswithcode.com/paper/online-sampling-from-log-concave
Repo
Framework
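
The abstract's algorithm builds on stochastic gradient Langevin dynamics. As a rough illustration of the online setting only, the sketch below runs plain (full-gradient, unadjusted) Langevin updates, warm-started from the previous epoch, each time a new function arrives. It omits the paper's variance reduction and batching. The quadratic functions f_k(x) = (x - c_k)^2 / 2, the step size, and the step count are all assumptions made for the example.

```python
import math
import random


def langevin_step(x, grad, step, rng):
    """One unadjusted Langevin update: x - step * grad(x) + sqrt(2 * step) * N(0, 1)."""
    return x - step * grad(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)


def online_langevin(centers, steps_per_epoch=200, step=0.01, seed=0):
    """Each epoch t adds f_t(x) = (x - c_t)^2 / 2 and runs a few Langevin steps
    targeting pi_t proportional to exp(-sum_{k<=t} f_k(x))."""
    rng = random.Random(seed)
    x, seen, samples = 0.0, [], []
    for c in centers:
        seen.append(c)

        def grad(z):
            # Gradient of sum_k f_k at z for the functions seen so far.
            return sum(z - ck for ck in seen)

        for _ in range(steps_per_epoch):
            x = langevin_step(x, grad, step, rng)
        samples.append(x)  # one approximate sample from pi_t
    return samples


samples = online_langevin([1.0, 2.0, 3.0])
print(samples)
```

With these quadratics, pi_t is a Gaussian centred at the running mean of the c_k with variance 1/t, so each returned value should land near that mean. Note that this naive version touches every past function at each step, which is the linear-in-T cost the abstract says the paper avoids.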

Adaptive Nearest Neighbor: A General Framework for Distance Metric Learning


Title	Adaptive Nearest Neighbor: A General Framework for Distance Metric Learning
Authors	Kun Song
Abstract	The $K$-NN classifier is one of the most famous classification algorithms, and its performance depends crucially on the distance metric. When we consider the distance metric as a parameter of $K$-NN, learning an appropriate distance metric for $K$-NN can be seen as minimizing the empirical risk of $K$-NN. In this paper, we design a new type of continuous decision function for the $K$-NN classification rule, which can be used to construct a continuous empirical risk function for $K$-NN. By minimizing this continuous empirical risk function, we obtain a novel distance metric learning algorithm named adaptive nearest neighbor (ANN). We prove that existing algorithms such as large margin nearest neighbor (LMNN), neighbourhood components analysis (NCA), and the pairwise constraint methods are special cases of the proposed ANN, obtained by setting its parameter to different values. Compared with the LMNN, NCA, and pairwise constraint methods, our method has a broader search space, which may contain better solutions. Finally, extensive experiments on various data sets are conducted to demonstrate the effectiveness and efficiency of the proposed method.
Tasks	Metric Learning
Published	2019-11-22
URL	https://arxiv.org/abs/1911.10674v1
PDF	https://arxiv.org/pdf/1911.10674v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-nearest-neighbor-a-general-framework
Repo
Framework

Crowding in humans is unlike that in convolutional neural networks


Title	Crowding in humans is unlike that in convolutional neural networks
Authors	Ben Lonnqvist, Alasdair D. F. Clarke, Ramakrishna Chakravarthi
Abstract	Object recognition is a primary function of the human visual system. It has recently been claimed that the highly successful ability to recognise objects in a set of emergent computer vision systems—Deep Convolutional Neural Networks (DCNNs)—can form a useful guide to recognition in humans. To test this assertion, we systematically evaluated visual crowding, a dramatic breakdown of recognition in clutter, in DCNNs and compared their performance to extant research in humans. We examined crowding in three architectures of DCNNs with the same methodology as that used among humans. We manipulated multiple stimulus factors including inter-letter spacing, letter colour, size, and flanker location to assess the extent and shape of crowding in DCNNs. We found that crowding followed a predictable pattern across architectures that was different from that in humans. Some characteristic hallmarks of human crowding, such as invariance to size, the effect of target-flanker similarity, and confusions between target and flanker identities, were completely missing, minimised or even reversed. These data show that DCNNs, while proficient in object recognition, likely achieve this competence through a set of mechanisms that are distinct from those in humans. They are not necessarily equivalent models of human or primate object recognition and caution must be exercised when inferring mechanisms derived from their operation.
Tasks	Object Recognition
Published	2019-03-01
URL	https://arxiv.org/abs/1903.00258v2
PDF	https://arxiv.org/pdf/1903.00258v2.pdf
PWC	https://paperswithcode.com/paper/object-recognition-in-deep-convolutional
Repo
Framework

Learning discriminative and robust time-frequency representations for environmental sound classification


Title	Learning discriminative and robust time-frequency representations for environmental sound classification
Authors	Helin Wang, Yuexian Zou, Dading Chong, Wenwu Wang
Abstract	Convolutional neural networks (CNNs) are one of the best-performing neural network architectures for environmental sound classification (ESC). Recently, attention mechanisms have been used in CNNs to capture the useful information from the audio signal for sound classification, especially for weakly labelled data, where only sound class labels are available and the timing information about the acoustic events is not available in the training data. In these methods, however, the inherent time-frequency characteristics and variations are not explicitly exploited when obtaining the deep features. In this paper, we propose a new method, called time-frequency enhancement block (TFBlock), in which temporal attention and frequency attention are employed to enhance the features from relevant frames and frequency bands. Compared with other attention mechanisms, our method constructs parallel branches that allow the temporal and frequency features to be attended to separately, in order to mitigate interference from sections where no sound events happen in the acoustic environment. The experiments on three benchmark ESC datasets show that our method improves the classification performance and also exhibits robustness to noise.
Tasks	Environmental Sound Classification
Published	2019-12-14
URL	https://arxiv.org/abs/1912.06808v2
PDF	https://arxiv.org/pdf/1912.06808v2.pdf
PWC	https://paperswithcode.com/paper/learning-discriminative-and-robust-time
Repo
Framework

Detection of Adversarial Attacks and Characterization of Adversarial Subspace


Title	Detection of Adversarial Attacks and Characterization of Adversarial Subspace
Authors	Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich
Abstract	Adversarial attacks have always been a serious threat to any data-driven model. In this paper, we explore subspaces of adversarial examples in the unitary vector domain, and we propose a novel detector for defending our models trained for environmental sound classification. We measure the chordal distance between legitimate and malicious representations of sounds in the unitary space of the generalized Schur decomposition and show that their manifolds lie far from each other. Our front-end detector is a regularized logistic regression which discriminates eigenvalues of legitimate and adversarial spectrograms. The experimental results on three benchmarking datasets of environmental sounds represented by spectrograms show that the proposed detector achieves a high detection rate for eight types of adversarial attacks and outperforms other detection approaches.
Tasks	Environmental Sound Classification
Published	2019-10-26
URL	https://arxiv.org/abs/1910.12084v1
PDF	https://arxiv.org/pdf/1910.12084v1.pdf
PWC	https://paperswithcode.com/paper/detection-of-adversarial-attacks-and
Repo
Framework

Robust object extraction from remote sensing data


Title	Robust object extraction from remote sensing data
Authors	Sophie Crommelinck, Mila Koeva, Michael Ying Yang, George Vosselman
Abstract	The extraction of object outlines has been a research topic during the last decades. In spite of advances in photogrammetry, remote sensing and computer vision, this task remains challenging due to object and data complexity. The development of object extraction approaches is promoted through publicly available benchmark datasets and evaluation frameworks. Many aspects of performance evaluation have already been studied. This study collects the best practices from the literature, puts the various aspects in one evaluation framework, and demonstrates its usefulness in a case study on mapping object outlines. The evaluation framework includes five dimensions: the robustness to changes in resolution, input, location, parameters, and application. Examples for investigating these dimensions are provided, as well as accuracy measures for their qualitative analysis. The measures consist of time efficiency and a procedure for line-based accuracy assessment regarding quantitative completeness and spatial correctness. The delineation approach to which the evaluation framework is applied was previously introduced and is substantially improved in this study.
Tasks
Published	2019-04-03
URL	http://arxiv.org/abs/1904.12586v1
PDF	http://arxiv.org/pdf/1904.12586v1.pdf
PWC	https://paperswithcode.com/paper/190412586
Repo
Framework

Sub-Spectrogram Segmentation for Environmental Sound Classification via Convolutional Recurrent Neural Network and Score Level Fusion


Title	Sub-Spectrogram Segmentation for Environmental Sound Classification via Convolutional Recurrent Neural Network and Score Level Fusion
Authors	Tianhao Qiao, Shunqing Zhang, Zhichao Zhang, Shan Cao, Shugong Xu
Abstract	Environmental Sound Classification (ESC) is an important and challenging problem, and feature representation is a critical and even decisive factor in ESC. Feature representation ability directly affects the accuracy of sound classification. Therefore, ESC performance depends heavily on the effectiveness of the representative features extracted from the environmental sounds. In this paper, we propose a sub-spectrogram segmentation-based ESC framework. In addition, we adopt the proposed Convolutional Recurrent Neural Network (CRNN) and score level fusion to jointly improve the classification accuracy. Extensive truncation schemes are evaluated to find the optimal number and the corresponding band ranges of sub-spectrograms. Based on the numerical experiments, the proposed framework can achieve 81.9% classification accuracy on the public dataset ESC-50, which provides a 9.1% accuracy improvement over traditional baseline schemes.
Tasks	Environmental Sound Classification
Published	2019-08-16
URL	https://arxiv.org/abs/1908.05863v1
PDF	https://arxiv.org/pdf/1908.05863v1.pdf
PWC	https://paperswithcode.com/paper/sub-spectrogram-segmentation-for
Repo
Framework
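
The abstract combines two steps: cutting the spectrogram into frequency bands and fusing per-band classifier scores. A minimal sketch of both follows. The band edges, the example scores, and the use of a weighted average as the fusion rule are all assumptions; the abstract does not state the fusion formula, and the per-band CRNN classifiers are left out entirely.

```python
def split_sub_spectrograms(spec, band_edges):
    """Split a spectrogram (list of frequency-bin rows, each a list of frames)
    into bands [lo, hi) along the frequency axis."""
    return [spec[lo:hi] for lo, hi in zip(band_edges[:-1], band_edges[1:])]


def fuse_scores(per_band_scores, weights=None):
    """Score-level fusion as a weighted average of per-band class-score vectors."""
    n_bands = len(per_band_scores)
    weights = weights or [1.0 / n_bands] * n_bands
    n_classes = len(per_band_scores[0])
    return [sum(w * scores[c] for w, scores in zip(weights, per_band_scores))
            for c in range(n_classes)]


# Toy spectrogram: 8 frequency bins x 4 frames, split at hypothetical band edges.
spec = [[float(f + t) for t in range(4)] for f in range(8)]
bands = split_sub_spectrograms(spec, [0, 2, 5, 8])
print([len(band) for band in bands])

# Hypothetical two-class scores, one vector per band, standing in for classifier outputs.
fused = fuse_scores([[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]])
predicted = max(range(len(fused)), key=fused.__getitem__)
print(fused, predicted)
```

Passing explicit `weights` lets more informative bands count for more, which is one way the band-range search mentioned in the abstract could interact with fusion.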

Learning Feature Sparse Principal Components


Title	Learning Feature Sparse Principal Components
Authors	Lai Tian, Feiping Nie, Xuelong Li
Abstract	This paper presents new algorithms to solve the feature-sparsity constrained PCA problem (FSPCA), which performs feature selection and PCA simultaneously. Existing optimization methods for FSPCA require data distribution assumptions and lack global convergence guarantees. Though the general FSPCA problem is NP-hard, we show that, for a low-rank covariance, FSPCA can be solved globally (Algorithm 1). Then, we propose another strategy (Algorithm 2) to solve FSPCA for the general covariance by iteratively building a carefully designed proxy. We prove theoretical guarantees on approximation and convergence for the new algorithms. Experimental results show the promising performance of the new algorithms compared with the state of the art on both synthetic and real-world datasets.
Tasks	Feature Selection
Published	2019-04-23
URL	https://arxiv.org/abs/1904.10155v2
PDF	https://arxiv.org/pdf/1904.10155v2.pdf
PWC	https://paperswithcode.com/paper/learning-feature-sparse-principal-components
Repo
Framework
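
To make the FSPCA objective concrete: for a fixed set of k selected features, the best m principal components capture the sum of the top-m eigenvalues of the corresponding k × k covariance submatrix. The sketch below solves tiny instances by exhaustive search over feature subsets. It is a brute-force illustration of the problem, not the paper's Algorithm 1 or Algorithm 2, and the function name and the toy covariance are hypothetical. It assumes NumPy is available.

```python
from itertools import combinations

import numpy as np


def fspca_brute_force(cov, k, m):
    """Exhaustively choose k features that maximize the variance captured by the
    top-m principal components of the selected k x k covariance submatrix."""
    d = cov.shape[0]
    best_subset, best_value = None, -np.inf
    for subset in combinations(range(d), k):
        idx = np.array(subset)
        sub = cov[np.ix_(idx, idx)]
        eigvals = np.linalg.eigvalsh(sub)  # eigenvalues in ascending order
        value = float(eigvals[-m:].sum())  # variance captured by the top-m components
        if value > best_value:
            best_subset, best_value = subset, value
    return best_subset, best_value


# Toy covariance: feature 1 has the largest variance, but features 0 and 2 are correlated.
cov = np.array([[2.0, 0.0, 1.5],
                [0.0, 2.5, 0.0],
                [1.5, 0.0, 2.0]])
print(fspca_brute_force(cov, k=2, m=1))
```

On this toy matrix the search selects features 0 and 2, whose shared direction has variance 3.5, rather than the single highest-variance feature. The number of subsets grows combinatorially with the number of features, consistent with the NP-hardness noted in the abstract.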