January 27, 2020


Paper Group ANR 1318


Diverse mini-batch Active Learning


Title	Diverse mini-batch Active Learning
Authors	Fedor Zhdanov
Abstract	We study the problem of reducing the amount of labeled training data required to train supervised classification models. We approach it by leveraging Active Learning, through sequential selection of examples which benefit the model most. Selecting examples one by one is not practical for the amount of training examples required by the modern Deep Learning models. We consider the mini-batch Active Learning setting, where several examples are selected at once. We present an approach which takes into account both informativeness of the examples for the model, as well as the diversity of the examples in a mini-batch. By using the well studied K-means clustering algorithm, this approach scales better than the previously proposed approaches, and achieves comparable or better performance.
Tasks	Active Learning
Published	2019-01-17
URL	http://arxiv.org/abs/1901.05954v1
PDF	http://arxiv.org/pdf/1901.05954v1.pdf
PWC	https://paperswithcode.com/paper/diverse-mini-batch-active-learning
Repo
Framework
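
The selection step described in the abstract above (score examples by informativeness, then use K-means to keep the mini-batch diverse) can be sketched roughly as follows. This is an illustrative reading of the abstract, not the author's implementation: the margin-based uncertainty score, the `prefilter_factor` parameter, and all function names are assumptions made for this example.

```python
import random


def margin_uncertainty(probs):
    """Return 1 - (top1 - top2): higher means the model is less certain."""
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def select_diverse_batch(features, probs, batch_size, prefilter_factor=3):
    """Pick batch_size indices that are both uncertain and spread out."""
    # Step 1 (informativeness): keep only the most uncertain candidates.
    scores = [margin_uncertainty(p) for p in probs]
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    candidates = ranked[: batch_size * prefilter_factor]
    # Step 2 (diversity): cluster the candidates, take one example per cluster.
    centroids = kmeans([features[i] for i in candidates], batch_size)
    chosen = []
    for centroid in centroids:
        remaining = [i for i in candidates if i not in chosen]
        chosen.append(min(remaining, key=lambda i: squared_distance(features[i], centroid)))
    return chosen
```

Here `features` is a list of feature vectors for the unlabeled pool and `probs` holds the model's class probabilities (at least two classes) for the same examples. The function returns the pool indices to send for labeling.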

A Radio Signal Modulation Recognition Algorithm Based on Residual Networks and Attention Mechanisms


Title	A Radio Signal Modulation Recognition Algorithm Based on Residual Networks and Attention Mechanisms
Authors	Ruisen Luo, Tao Hu, Zuodong Tang, Chen Wang, Xiaofeng Gong, Haiyan Tu
Abstract	To solve the problem of inaccurate recognition of communication signal modulation types, an RNN-based recognition algorithm combining a residual block network with an attention mechanism is proposed. In this method, 10 kinds of communication signals with Gaussian white noise are generated from standard data sets, such as MASK, MPSK, MFSK, OFDM, 16QAM, AM and FM. A residual block network is added to the original RNN to address the vanishing gradients caused by deep network layers. An attention mechanism is added to the network to accelerate the gradient descent. In the experiment, 16QAM, 2FSK and 4FSK are used as actual samples, IQ data frames of signals are used as input, and the RNN combined with the residual block network and attention mechanism is trained. The final recognition results show that the average recognition rate of real-time signals is over 93%. The network has high robustness and good use value.
Tasks
Published	2019-09-27
URL	https://arxiv.org/abs/1909.12472v1
PDF	https://arxiv.org/pdf/1909.12472v1.pdf
PWC	https://paperswithcode.com/paper/a-radio-signal-modulation-recognition
Repo
Framework

FAMNet: Joint Learning of Feature, Affinity and Multi-dimensional Assignment for Online Multiple Object Tracking


Title	FAMNet: Joint Learning of Feature, Affinity and Multi-dimensional Assignment for Online Multiple Object Tracking
Authors	Peng Chu, Haibin Ling
Abstract	Data association-based multiple object tracking (MOT) involves multiple separated modules processed or optimized differently, which results in complex method design and requires non-trivial tuning of parameters. In this paper, we present an end-to-end model, named FAMNet, where Feature extraction, Affinity estimation and Multi-dimensional assignment are refined in a single network. All layers in FAMNet are designed to be differentiable and can thus be optimized jointly to learn the discriminative features and higher-order affinity model for robust MOT, which is supervised by the loss directly from the assignment ground truth. We also integrate a single object tracking technique and a dedicated target management scheme into the FAMNet-based tracking system to further recover false negatives and inhibit noisy target candidates generated by the external detector. The proposed method is evaluated on a diverse set of benchmarks including MOT2015, MOT2017, KITTI-Car and UA-DETRAC, and achieves promising performance on all of them in comparison with state-of-the-art methods.
Tasks	Multiple Object Tracking, Object Tracking
Published	2019-04-10
URL	http://arxiv.org/abs/1904.04989v1
PDF	http://arxiv.org/pdf/1904.04989v1.pdf
PWC	https://paperswithcode.com/paper/famnet-joint-learning-of-feature-affinity-and
Repo
Framework

Joint Segmentation and Landmark Localization of Fetal Femur in Ultrasound Volumes


Title	Joint Segmentation and Landmark Localization of Fetal Femur in Ultrasound Volumes
Authors	Xu Wang, Xin Yang, Haoran Dou, Shengli Li, Pheng-Ann Heng, Dong Ni
Abstract	Volumetric ultrasound has great potential in promoting prenatal examinations. Automated solutions are highly desired to efficiently and effectively analyze the massive volumes. Segmentation and landmark localization are two key techniques in making the quantitative evaluation of prenatal ultrasound volumes available in the clinic. However, both tasks are non-trivial when considering the poor image quality, boundary ambiguity and anatomical variations in volumetric ultrasound. In this paper, we propose an effective framework for simultaneous segmentation and landmark localization in prenatal ultrasound volumes. The proposed framework has two branches where informative cues of segmentation and landmark localization can be propagated bidirectionally to benefit both tasks. As landmark localization tends to suffer from false positives, we propose a distance-based loss to suppress the noise and thus enhance the localization map and in turn the segmentation. Finally, we further leverage an adversarial module to emphasize the correspondence between segmentation and landmark localization. Extensively validated on a volumetric ultrasound dataset of fetal femur, our proposed framework proves to be a promising solution to facilitate the interpretation of prenatal ultrasound volumes.
Tasks
Published	2019-08-31
URL	https://arxiv.org/abs/1909.00186v1
PDF	https://arxiv.org/pdf/1909.00186v1.pdf
PWC	https://paperswithcode.com/paper/joint-segmentation-and-landmark-localization
Repo
Framework

Super accurate low latency object detection on a surveillance UAV


Title	Super accurate low latency object detection on a surveillance UAV
Authors	Maarten Vandersteegen, Kristof Vanbeeck, Toon Goedeme
Abstract	Drones have proven to be useful in many industry segments such as security and surveillance, where e.g. on-board real-time object tracking is a necessity for autonomous flying guards. Tracking and following suspicious objects is therefore required in real-time on limited hardware. With an object detector in the loop, low latency becomes extremely important. In this paper, we propose a solution to make object detection for UAVs both fast and super accurate. We propose a multi-dataset learning strategy yielding top eye-sky object detection accuracy. Our model generalizes well on unseen data and can cope with different flying heights, optically zoomed-in shots and different viewing angles. We apply optimization steps such that we achieve minimal latency on embedded on-board hardware by fusing layers, quantizing calculations to 16-bit floats and 8-bit integers, with negligible loss in accuracy. We validate on NVIDIA’s Jetson TX2 and Jetson Xavier platforms where we achieve a speed-wise performance boost of more than 10x.
Tasks	Object Detection, Object Tracking
Published	2019-04-03
URL	http://arxiv.org/abs/1904.02024v1
PDF	http://arxiv.org/pdf/1904.02024v1.pdf
PWC	https://paperswithcode.com/paper/super-accurate-low-latency-object-detection
Repo
Framework
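
The abstract above mentions quantizing calculations to 8-bit integers with negligible accuracy loss. As a rough illustration of the general idea only (not the authors' pipeline or any specific toolkit), symmetric per-tensor quantization maps each float to an integer in [-127, 127] using a single scale factor. The function names below are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Round-to-nearest keeps each value within about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per value is bounded by roughly `scale / 2`, which is why tensors with a small dynamic range tend to lose little precision under this scheme.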

On evaluating CNN representations for low resource medical image classification


Title	On evaluating CNN representations for low resource medical image classification
Authors	Taruna Agrawal, Rahul Gupta, Shrikanth Narayanan
Abstract	Convolutional Neural Networks (CNNs) have revolutionized performances in several machine learning tasks such as image classification, object tracking, and keyword spotting. However, given that they contain a large number of parameters, their direct applicability to low resource tasks is not straightforward. In this work, we experiment with an application of CNN models to gastrointestinal landmark classification with only a few thousand training samples through transfer learning. As in a standard transfer learning approach, we train CNNs on a large external corpus, followed by representation extraction for the medical images. Finally, a classifier is trained on these CNN representations. However, given that several variants of CNNs exist, the choice of CNN is not obvious. To address this, we develop a novel metric that can be used to predict test performances, given CNN representations on the training set. Not only do we demonstrate the superiority of the CNN based transfer learning approach against an assembly of knowledge driven features, but the proposed metric also carries an 87% correlation with the test set performances as obtained using various CNN representations.
Tasks	Image Classification, Keyword Spotting, Object Tracking, Transfer Learning
Published	2019-03-26
URL	http://arxiv.org/abs/1903.11176v1
PDF	http://arxiv.org/pdf/1903.11176v1.pdf
PWC	https://paperswithcode.com/paper/on-evaluating-cnn-representations-for-low
Repo
Framework

Interactive segmentation of medical images through fully convolutional neural networks


Title	Interactive segmentation of medical images through fully convolutional neural networks
Authors	Tomas Sakinis, Fausto Milletari, Holger Roth, Panagiotis Korfiatis, Petro Kostandy, Kenneth Philbrick, Zeynettin Akkus, Ziyue Xu, Daguang Xu, Bradley J. Erickson
Abstract	Image segmentation plays an essential role in medicine for both diagnostic and interventional tasks. Segmentation approaches are either manual, semi-automated or fully-automated. Manual segmentation offers full control over the quality of the results, but is tedious, time consuming and prone to operator bias. Fully automated methods require no human effort, but often deliver sub-optimal results without providing users with the means to make corrections. Semi-automated approaches keep users in control of the results by providing means for interaction, but the main challenge is to offer a good trade-off between precision and required interaction. In this paper we present a deep learning (DL) based semi-automated segmentation approach that aims to be a “smart” interactive tool for region of interest delineation in medical images. We demonstrate its use for segmenting multiple organs on computed tomography (CT) of the abdomen. Our approach solves some of the most pressing clinical challenges: (i) it requires only one to a few user clicks to deliver excellent 2D segmentations in a fast and reliable fashion; (ii) it can generalize to previously unseen structures and “corner cases”; (iii) it delivers results that can be corrected quickly in a smart and intuitive way up to an arbitrary degree of precision chosen by the user and (iv) ensures high accuracy. We present our approach and compare it to other techniques and previous work to show the advantages brought by our method.
Tasks	Computed Tomography (CT), Interactive Segmentation, Semantic Segmentation
Published	2019-03-19
URL	http://arxiv.org/abs/1903.08205v1
PDF	http://arxiv.org/pdf/1903.08205v1.pdf
PWC	https://paperswithcode.com/paper/interactive-segmentation-of-medical-images
Repo
Framework

Bayesian surrogate learning in dynamic simulator-based regression problems


Title	Bayesian surrogate learning in dynamic simulator-based regression problems
Authors	Xi Chen, Mike Hobson
Abstract	The estimation of unknown values of parameters (or hidden variables, control variables) that characterise a physical system often relies on the comparison of measured data with synthetic data produced by some numerical simulator of the system as the parameter values are varied. This process often encounters two major difficulties: the generation of synthetic data for each considered set of parameter values can be computationally expensive if the system model is complicated; and the exploration of the parameter space can be inefficient and/or incomplete, a typical example being when the exploration becomes trapped in a local optimum of the objective function that characterises the mismatch between the measured and synthetic data. A method to address both these issues is presented, whereby: a surrogate model (or proxy), which emulates the computationally expensive system simulator, is constructed using deep recurrent networks (DRN); and a nested sampling (NS) algorithm is employed to perform efficient and robust exploration of the parameter space. The analysis is performed in a Bayesian context, in which the samples characterise the full joint posterior distribution of the parameters, from which parameter estimates and uncertainties are easily derived. The proposed approach is compared with conventional methods in some numerical examples, for which the results demonstrate that one can accelerate the parameter estimation process by at least an order of magnitude.
Tasks
Published	2019-01-25
URL	http://arxiv.org/abs/1901.08898v1
PDF	http://arxiv.org/pdf/1901.08898v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-surrogate-learning-in-dynamic
Repo
Framework

Improved Optical Flow for Gesture-based Human-robot Interaction


Title	Improved Optical Flow for Gesture-based Human-robot Interaction
Authors	Jen-Yen Chang, Antonio Tejero-de-Pablos, Tatsuya Harada
Abstract	Gesture interaction is a natural way of communicating with a robot as an alternative to speech. Gesture recognition methods leverage optical flow in order to understand human motion. However, while accurate optical flow estimation (i.e., traditional) methods are costly in terms of runtime, fast estimation (i.e., deep learning) methods’ accuracy can be improved. In this paper, we present a pipeline for gesture-based human-robot interaction that uses a novel optical flow estimation method in order to achieve an improved speed-accuracy trade-off. Our optical flow estimation method introduces four improvements to previous deep learning-based methods: strong feature extractors, attention to contours, midway features, and a combination of these three. This results in a better understanding of motion, and a finer representation of silhouettes. In order to evaluate our pipeline, we generated our own dataset, MIBURI, which contains gestures to command a house service robot. In our experiments, we show how our method improves not only optical flow estimation, but also gesture recognition, offering a speed-accuracy trade-off more realistic for practical robot applications.
Tasks	Gesture Recognition, Optical Flow Estimation
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08685v1
PDF	https://arxiv.org/pdf/1905.08685v1.pdf
PWC	https://paperswithcode.com/paper/improved-optical-flow-for-gesture-based-human
Repo
Framework

Searching the Landscape of Flux Vacua with Genetic Algorithms


Title	Searching the Landscape of Flux Vacua with Genetic Algorithms
Authors	Alex Cole, Andreas Schachner, Gary Shiu
Abstract	In this paper, we employ genetic algorithms to explore the landscape of type IIB flux vacua. We show that genetic algorithms can efficiently scan the landscape for viable solutions satisfying various criteria. More specifically, we consider a symmetric $T^{6}$ as well as the conifold region of a Calabi-Yau hypersurface. We argue that in both cases genetic algorithms are powerful tools for finding flux vacua with interesting phenomenological properties. We also compare genetic algorithms to algorithms based on different breeding mechanisms as well as random walk approaches.
Tasks
Published	2019-07-23
URL	https://arxiv.org/abs/1907.10072v2
PDF	https://arxiv.org/pdf/1907.10072v2.pdf
PWC	https://paperswithcode.com/paper/searching-the-landscape-of-flux-vacua-with
Repo
Framework
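
For readers unfamiliar with the technique named in the abstract above, a genetic algorithm repeatedly applies selection, crossover and mutation to a population of candidate solutions. The sketch below runs that loop on integer vectors with a toy objective. The `fitness` function, its `target` value and all parameter settings are placeholders invented for illustration; they do not represent the flux-vacua conditions or the breeding mechanisms studied in the paper.

```python
import random


def fitness(genome, target=10):
    """Hypothetical objective: 0 is best, reached when the squares sum to target."""
    return -abs(sum(g * g for g in genome) - target)


def evolve(pop_size=30, genome_len=4, generations=50, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(-5, 5) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # One-point crossover: prefix from one parent, suffix from the other.
            cut = rng.randrange(1, genome_len)
            child = mother[:cut] + father[cut:]
            # Mutation: nudge some genes by +1 or -1.
            child = [g + rng.choice((-1, 1)) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Because the fitter half survives each generation unchanged, the best fitness in the population never decreases from one generation to the next.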

Identifying Unknown Instances for Autonomous Driving


Title	Identifying Unknown Instances for Autonomous Driving
Authors	Kelvin Wong, Shenlong Wang, Mengye Ren, Ming Liang, Raquel Urtasun
Abstract	In the past few years, we have seen great progress in perception algorithms, particularly through the use of deep learning. However, most existing approaches focus on a few categories of interest, which represent only a small fraction of the potential categories that robots need to handle in the real world. Thus, identifying objects from unknown classes remains a challenging yet crucial task. In this paper, we develop a novel open-set instance segmentation algorithm for point clouds which can segment objects from both known and unknown classes in a holistic way. Our method uses a deep convolutional neural network to project points into a category-agnostic embedding space in which they can be clustered into instances irrespective of their semantics. Experiments on two large-scale self-driving datasets validate the effectiveness of our proposed method.
Tasks	Autonomous Driving, Instance Segmentation, Semantic Segmentation
Published	2019-10-24
URL	https://arxiv.org/abs/1910.11296v1
PDF	https://arxiv.org/pdf/1910.11296v1.pdf
PWC	https://paperswithcode.com/paper/identifying-unknown-instances-for-autonomous
Repo
Framework

Modelling Generalized Forces with Reinforcement Learning for Sim-to-Real Transfer


Title	Modelling Generalized Forces with Reinforcement Learning for Sim-to-Real Transfer
Authors	Rae Jeong, Jackie Kay, Francesco Romano, Thomas Lampe, Tom Rothorl, Abbas Abdolmaleki, Tom Erez, Yuval Tassa, Francesco Nori
Abstract	Learning robotic control policies in the real world gives rise to challenges in data efficiency, safety, and controlling the initial condition of the system. On the other hand, simulations are a useful alternative as they provide an abundant source of data without the restrictions of the real world. Unfortunately, simulations often fail to accurately model complex real-world phenomena. Traditional system identification techniques are limited in expressiveness by the analytical model parameters, and usually are not sufficient to capture such phenomena. In this paper we propose a general framework for improving the analytical model by optimizing state dependent generalized forces. State dependent generalized forces are expressive enough to model constraints in the equations of motion, while maintaining a clear physical meaning and intuition. We use reinforcement learning to efficiently optimize the mapping from states to generalized forces over a discounted infinite horizon. We show that using only minutes of real world data improves the sim-to-real control policy transfer. We demonstrate the feasibility of our approach by validating it on a nonprehensile manipulation task on the Sawyer robot.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09471v1
PDF	https://arxiv.org/pdf/1910.09471v1.pdf
PWC	https://paperswithcode.com/paper/modelling-generalized-forces-with
Repo
Framework

An Unbiased Risk Estimator for Learning with Augmented Classes


Title	An Unbiased Risk Estimator for Learning with Augmented Classes
Authors	Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou
Abstract	In this paper, we study the problem of learning with augmented classes (LAC), where new classes that do not appear in the training dataset might emerge in the testing phase. The mixture of known classes and new classes in the testing distribution makes the LAC problem quite challenging. Our discovery is that by exploiting cheap and vast unlabeled data, the testing distribution can be estimated in the training stage, which paves the way for developing algorithms with nice statistical properties. Specifically, we propose an unbiased risk estimator over the testing distribution for the LAC problem, and further develop an efficient algorithm to perform the empirical risk minimization. Both asymptotic and non-asymptotic analyses are provided as theoretical guarantees. The efficacy of the proposed algorithm is also confirmed by experiments.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09388v1
PDF	https://arxiv.org/pdf/1910.09388v1.pdf
PWC	https://paperswithcode.com/paper/an-unbiased-risk-estimator-for-learning-with
Repo
Framework

Image Aesthetics Assessment Using Composite Features from off-the-Shelf Deep Models


Title	Image Aesthetics Assessment Using Composite Features from off-the-Shelf Deep Models
Authors	Xin Fu, Jia Yan, Cien Fan
Abstract	Deep convolutional neural networks have recently achieved great success on the image aesthetics assessment task. In this paper, we propose an efficient method which takes the global, local and scene-aware information of images into consideration and exploits the composite features extracted from corresponding pretrained deep learning models to classify the derived features with a support vector machine. Contrary to popular methods that require fine-tuning or training a new model from scratch, our training-free method directly takes the deep features generated by off-the-shelf models for image classification and scene recognition. We also analyze the factors that could influence the performance from two aspects: the architecture of the deep neural network and the contribution of local and scene-aware information. It turns out that a deep residual network can produce a more aesthetics-aware image representation and that composite features improve overall performance. Experiments on common large-scale aesthetics assessment benchmarks demonstrate that our method outperforms the state-of-the-art results in photo aesthetics assessment.
Tasks	Image Classification, Scene Recognition
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08546v1
PDF	http://arxiv.org/pdf/1902.08546v1.pdf
PWC	https://paperswithcode.com/paper/image-aesthetics-assessment-using-composite
Repo
Framework

Unsupervised AER Object Recognition Based on Multiscale Spatio-Temporal Features and Spiking Neurons


Title	Unsupervised AER Object Recognition Based on Multiscale Spatio-Temporal Features and Spiking Neurons
Authors	Qianhui Liu, Gang Pan, Haibo Ruan, Dong Xing, Qi Xu, Huajin Tang
Abstract	This paper proposes an unsupervised address event representation (AER) object recognition approach. The proposed approach consists of a novel multiscale spatio-temporal feature (MuST) representation of input AER events and a spiking neural network (SNN) using spike-timing-dependent plasticity (STDP) for object recognition with MuST. MuST extracts the features contained in both the spatial and temporal information of AER event flow, and meanwhile forms an informative and compact feature spike representation. We show not only how MuST exploits spikes to convey information more effectively, but also how it benefits the recognition using SNN. The recognition process is performed in an unsupervised manner, which does not need to specify the desired status of every single neuron of SNN, and thus can be flexibly applied in real-world recognition tasks. The experiments are performed on five AER datasets including a new one named GESTURE-DVS. Extensive experimental results show the effectiveness and advantages of this proposed approach.
Tasks	Object Recognition
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08261v1
PDF	https://arxiv.org/pdf/1911.08261v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-aer-object-recognition-based-on
Repo
Framework