January 29, 2020

Paper Group ANR 685

A General Framework for Complex Network-Based Image Segmentation


Title	A General Framework for Complex Network-Based Image Segmentation
Authors	Youssef Mourchid, Mohammed El Hassouni, Hocine Cherifi
Abstract	With the recent advances in complex network theory, graph-based techniques for image segmentation have attracted great attention. In order to segment the image into meaningful connected components, this paper proposes a general image segmentation framework using complex-network-based community detection algorithms. If we consider regions as communities, using community detection algorithms directly can lead to an over-segmented image. To address this problem, we start by splitting the image into small regions using an initial segmentation. The obtained regions are used for building the complex network. To produce meaningful connected components and detect homogeneous communities, some combinations of color- and texture-based features are employed in order to quantify the similarities between regions. To sum up, the network of regions is constructed adaptively to avoid many small regions in the image, and then community detection algorithms are applied to the resulting adaptive similarity matrix to obtain the final segmented image. Experiments are conducted on the Berkeley Segmentation Dataset, and four of the most influential community detection algorithms are tested. Experimental results show that the proposed general framework improves segmentation performance compared to some existing methods.
Tasks	Community Detection, Semantic Segmentation
Published	2019-07-04
URL	https://arxiv.org/abs/1907.05278v1
PDF	https://arxiv.org/pdf/1907.05278v1.pdf
PWC	https://paperswithcode.com/paper/a-general-framework-for-complex-network-based
Repo
Framework

Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo


Title	Laplacian Smoothing Stochastic Gradient Markov Chain Monte Carlo
Authors	Bao Wang, Difan Zou, Quanquan Gu, Stanley Osher
Abstract	As an important Markov Chain Monte Carlo (MCMC) method, the stochastic gradient Langevin dynamics (SGLD) algorithm has achieved great success in Bayesian learning and posterior sampling. However, SGLD typically suffers from a slow convergence rate due to the large variance caused by the stochastic gradient. In order to alleviate this drawback, we leverage the recently developed Laplacian Smoothing (LS) technique and propose a Laplacian smoothing stochastic gradient Langevin dynamics (LS-SGLD) algorithm. We prove that for sampling from both log-concave and non-log-concave densities, LS-SGLD achieves strictly smaller discretization error in 2-Wasserstein distance, although its mixing rate can be slightly slower. Experiments on both synthetic and real datasets verify our theoretical results, and demonstrate the superior performance of LS-SGLD on different machine learning tasks including posterior sampling, Bayesian logistic regression and training Bayesian convolutional neural networks. The code is available at https://github.com/BaoWangMath/LS-MCMC.
Tasks
Published	2019-11-02
URL	https://arxiv.org/abs/1911.00782v1
PDF	https://arxiv.org/pdf/1911.00782v1.pdf
PWC	https://paperswithcode.com/paper/laplacian-smoothing-stochastic-gradient
Repo
Framework
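
The LS-SGLD abstract above does not state the update rule. The sketch below assumes the form commonly associated with Laplacian smoothing, in which the stochastic gradient is multiplied by the inverse of a circulant matrix A_sigma = I - sigma * L (L being the 1D periodic discrete Laplacian) and the injected noise by its inverse square root. The function names, the toy vectors, and the naive DFT are illustrative assumptions, not the authors' implementation.

```python
import cmath
import math

def apply_smoothing_power(vec, sigma, power):
    """Return (A_sigma ** power) @ vec, where A_sigma = I - sigma * L and
    L is the 1D periodic discrete Laplacian (assumed form of the smoother).

    A_sigma is circulant, so it is diagonalised by the discrete Fourier
    transform; a naive O(n^2) DFT is used here for clarity.
    """
    n = len(vec)
    # Forward DFT of the input vector.
    spectrum = [
        sum(vec[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    # Eigenvalues of A_sigma: 1 + 2*sigma*(1 - cos(2*pi*k/n)), all >= 1.
    eigenvalues = [
        1.0 + 2.0 * sigma * (1.0 - math.cos(2.0 * math.pi * k / n))
        for k in range(n)
    ]
    scaled = [spectrum[k] * eigenvalues[k] ** power for k in range(n)]
    # Inverse DFT; the result is real for real input.
    return [
        (sum(scaled[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n).real
        for j in range(n)
    ]

def ls_sgld_step(x, stochastic_grad, noise, eta, sigma, beta=1.0):
    """One assumed LS-SGLD step: smoothed gradient plus smoothed Gaussian noise."""
    smoothed_grad = apply_smoothing_power(stochastic_grad, sigma, -1.0)
    smoothed_noise = apply_smoothing_power(noise, sigma, -0.5)
    scale = math.sqrt(2.0 * eta / beta)
    return [xi - eta * gi + scale * ni
            for xi, gi, ni in zip(x, smoothed_grad, smoothed_noise)]

# A maximally oscillating gradient is damped by a factor of 5 when sigma = 1.
grad = [1.0, -1.0, 1.0, -1.0]
smoothed = apply_smoothing_power(grad, sigma=1.0, power=-1.0)
x_next = ls_sgld_step([0.0] * 4, grad, noise=[0.0] * 4, eta=0.5, sigma=1.0)
```

In an actual sampler the `noise` argument would be a fresh standard Gaussian draw at every step; it is passed explicitly here only to keep the example deterministic. Setting `sigma=0` leaves both the gradient and the noise unchanged, which corresponds to plain SGLD under this assumed form.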

LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving


Title	LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving
Authors	Gregory P. Meyer, Ankit Laddha, Eric Kee, Carlos Vallespi-Gonzalez, Carl K. Wellington
Abstract	In this paper, we present LaserNet, a computationally efficient method for 3D object detection from LiDAR data for autonomous driving. The efficiency results from processing LiDAR data in the native range view of the sensor, where the input data is naturally compact. Operating in the range view involves well-known challenges for learning, including occlusion and scale variation, but it also provides contextual information based on how the sensor data was captured. Our approach uses a fully convolutional network to predict a multimodal distribution over 3D boxes for each point and then efficiently fuses these distributions to generate a prediction for each object. Experiments show that modeling each detection as a distribution rather than a single deterministic box leads to better overall detection performance. Benchmark results show that this approach has significantly lower runtime than other recent detectors and that it achieves state-of-the-art performance when compared on a large dataset that has enough data to overcome the challenges of training on the range view.
Tasks	3D Object Detection, Autonomous Driving, Object Detection
Published	2019-03-20
URL	http://arxiv.org/abs/1903.08701v1
PDF	http://arxiv.org/pdf/1903.08701v1.pdf
PWC	https://paperswithcode.com/paper/lasernet-an-efficient-probabilistic-3d-object
Repo
Framework
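
To make the "range view" mentioned in the LaserNet abstract concrete, the following sketch projects 3D points into a small range image using a generic spherical projection (azimuth for columns, elevation for rows). The abstract does not describe the exact image layout, so the function name `to_range_view`, the image size, and the field-of-view limits are assumptions chosen for illustration.

```python
import math

def to_range_view(points, height=4, width=8,
                  fov_up=math.radians(15.0), fov_down=math.radians(-15.0)):
    """Project (x, y, z) points into a height x width range image.

    Each cell stores the range of the nearest point that falls into it,
    or None when no point does. Points outside the vertical field of
    view are dropped.
    """
    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        dist = math.sqrt(x * x + y * y + z * z)
        if dist == 0.0:
            continue
        azimuth = math.atan2(y, x)        # horizontal angle in [-pi, pi]
        elevation = math.asin(z / dist)   # vertical angle
        if not (fov_down <= elevation <= fov_up):
            continue
        # Columns cover the full 360 degrees; rows run from fov_up down to fov_down.
        col = int((azimuth + math.pi) / (2.0 * math.pi) * width) % width
        row = min(height - 1,
                  int((fov_up - elevation) / (fov_up - fov_down) * height))
        # Keep the closest return, mimicking occlusion in the sensor view.
        if image[row][col] is None or dist < image[row][col]:
            image[row][col] = dist
    return image

# Two points along the same ray (the nearer one wins), one point to the side,
# and one point almost straight up that falls outside the field of view.
cloud = [(10.0, 0.0, 0.0), (5.0, 0.0, 0.0), (-1.0, 10.0, 0.0), (0.1, 0.0, 10.0)]
range_image = to_range_view(cloud)
occupied = sum(cell is not None for row in range_image for cell in row)
```

The nearest-return rule illustrates why occlusion is inherent to this representation: the point at range 10 is hidden behind the one at range 5.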

Predictability of diffusion-based recommender systems


Title	Predictability of diffusion-based recommender systems
Authors	Peng Zhang, Leyang Xue, An Zeng
Abstract	Recommendation methods based on network diffusion have been shown to perform well in both recommendation accuracy and diversity. Nowadays, numerous extensions have been made to further improve the performance of such methods. However, the extent to which items can be predicted by diffusion-based algorithms is still poorly understood. Here, we propose a method to quantify the predictability of diffusion-based algorithms. Accordingly, we conduct experiments on the Movielens and Netflix data sets. The results show that higher recommendation accuracy based on diffusion algorithms can still be achieved by optimizing the way resources are allocated on a dense network. On a sparse network, the possibility of improving accuracy is relatively low because the current accuracy of diffusion-based methods is very close to its predictability. In this case, we find that the predictability can be improved significantly by multi-step diffusion, especially for users with less historical information. In contrast to common belief, there are plausible circumstances where the higher predictability of diffusion-based methods does not correspond to those users with more historical records. Thus, we propose the diffusion coverage and item average degree to explain this phenomenon. In addition, we demonstrate that the recommendation accuracy in real online systems is overestimated by the random partition used in the literature, suggesting that recommendation in real online systems may be a harder task.
Tasks	Recommendation Systems
Published	2019-03-29
URL	http://arxiv.org/abs/1903.12388v1
PDF	http://arxiv.org/pdf/1903.12388v1.pdf
PWC	https://paperswithcode.com/paper/predictability-of-diffusion-based-recommender
Repo
Framework
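
The diffusion-based recommender abstract above does not spell out the diffusion process. As a rough illustration, the sketch below implements the standard two-step mass-diffusion resource allocation on a user-item bipartite network, one common member of this family and not necessarily the exact variant the authors analyse. The toy `ratings` data and the function name `mass_diffusion_scores` are assumptions for illustration.

```python
from collections import defaultdict

def mass_diffusion_scores(user_items, target_user):
    """Two-step mass diffusion on a user-item bipartite network.

    user_items maps each user to the set of items they collected.
    Returns item -> resource for items the target has not collected.
    """
    # Reverse index: item -> set of users who collected it.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    # Every item collected by the target starts with one unit of resource.
    # Step 1: each item splits its resource equally among its users.
    user_resource = defaultdict(float)
    for item in user_items[target_user]:
        share = 1.0 / len(item_users[item])
        for user in item_users[item]:
            user_resource[user] += share

    # Step 2: each user splits the received resource equally among their items.
    item_resource = defaultdict(float)
    for user, resource in user_resource.items():
        share = resource / len(user_items[user])
        for item in user_items[user]:
            item_resource[item] += share

    # Only uncollected items are candidates for recommendation.
    collected = user_items[target_user]
    return {i: r for i, r in item_resource.items() if i not in collected}

ratings = {
    "u1": {"a", "b"},
    "u2": {"b", "c"},
    "u3": {"a", "c", "d"},
}
scores = mass_diffusion_scores(ratings, "u1")
ranking = sorted(scores, key=scores.get, reverse=True)
```

For user `u1`, item `c` receives more resource than item `d` because it is reachable through both `u2` and `u3`. Repeating the two steps on the resulting item resources would give a multi-step variant of the kind the abstract refers to.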

Point Cloud Processing via Recurrent Set Encoding


Title	Point Cloud Processing via Recurrent Set Encoding
Authors	Pengxiang Wu, Chao Chen, Jingru Yi, Dimitris Metaxas
Abstract	We present a new permutation-invariant network for 3D point cloud processing. Our network is composed of a recurrent set encoder and a convolutional feature aggregator. Given an unordered point set, the encoder first partitions its ambient space into parallel beams. Points within each beam are then modeled as a sequence and encoded into subregional geometric features by a shared recurrent neural network (RNN). The spatial layout of the beams is regular, and this allows the beam features to be further fed into an efficient 2D convolutional neural network (CNN) for hierarchical feature aggregation. Our network is effective at spatial feature learning, and competes favorably with state-of-the-art (SOTA) methods on a number of benchmarks. Meanwhile, it is significantly more efficient than the SOTA methods.
Tasks
Published	2019-11-25
URL	https://arxiv.org/abs/1911.10729v1
PDF	https://arxiv.org/pdf/1911.10729v1.pdf
PWC	https://paperswithcode.com/paper/point-cloud-processing-via-recurrent-set
Repo
Framework

Software Based Higher Order Structural Foot Abnormality Detection Using Image Processing


Title	Software Based Higher Order Structural Foot Abnormality Detection Using Image Processing
Authors	Arnesh Sen, Kaustav Sen, Jayoti Das
Abstract	The movement of the human body follows a periodic process called the gait cycle. The structure of the human foot is the key element in completing the cycle successfully. Abnormality of this foot structure is an alarming form of congenital disorder, which can be classified based on the geometry of the human footprint image. Image processing is one of the most efficient ways to determine a number of footprint parameters and thereby detect the severity of the disorder. This paper aims to detect the Flatfoot and High Arch foot abnormalities using one of the footprint parameters, the Modified Brucken Index, through biomedical image processing.
Tasks	Anomaly Detection
Published	2019-04-11
URL	http://arxiv.org/abs/1904.05651v1
PDF	http://arxiv.org/pdf/1904.05651v1.pdf
PWC	https://paperswithcode.com/paper/software-based-higher-order-structural-foot
Repo
Framework

Improving Robustness in Real-World Neural Machine Translation Engines


Title	Improving Robustness in Real-World Neural Machine Translation Engines
Authors	Rohit Gupta, Patrik Lambert, Raj Nath Patel, John Tinsley
Abstract	As a commercial provider of machine translation, we are constantly training engines for a variety of uses, languages, and content types. In each case, there can be many variables, such as the amount of training data available, and the quality requirements of the end user. These variables can have an impact on the robustness of Neural MT engines. On the whole, Neural MT cures many ills of other MT paradigms, but at the same time, it has introduced a new set of challenges to address. In this paper, we describe some of the specific issues with practical NMT and the approaches we take to improve model robustness in real-world scenarios.
Tasks	Machine Translation
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01279v1
PDF	https://arxiv.org/pdf/1907.01279v1.pdf
PWC	https://paperswithcode.com/paper/improving-robustness-in-real-world-neural
Repo
Framework

Texture Hallucination for Large-Factor Painting Super-Resolution


Title	Texture Hallucination for Large-Factor Painting Super-Resolution
Authors	Yulun Zhang, Zhifei Zhang, Stephen DiVerdi, Zhaowen Wang, Jose Echevarria, Yun Fu
Abstract	We aim to super-resolve digital paintings, synthesizing realistic details from high-resolution reference painting materials for very large scaling factors (e.g., 8X, 16X). However, previous single image super-resolution (SISR) methods would either lose textural details or introduce unpleasing artifacts. On the other hand, reference-based SR (Ref-SR) methods can transfer textures to some extent, but it is still impractical for them to handle very large factors and keep fidelity with the original input. To solve these problems, we propose an efficient high-resolution hallucination network for very large scaling factors with an efficient network structure and feature transfer. To transfer more detailed textures, we design a wavelet texture loss, which helps to enhance more high-frequency components. At the same time, to reduce the smoothing effect brought by the image reconstruction loss, we further relax the reconstruction constraint with a degradation loss which ensures the consistency between downscaled super-resolution results and low-resolution inputs. We also collected a high-resolution (e.g., 4K resolution) painting dataset, PaintHD, by considering both physical size and image resolution. We demonstrate the effectiveness of our method with extensive experiments on PaintHD by comparing with state-of-the-art SISR and Ref-SR methods.
Tasks	Image Reconstruction, Image Super-Resolution, Super-Resolution
Published	2019-12-01
URL	https://arxiv.org/abs/1912.00515v2
PDF	https://arxiv.org/pdf/1912.00515v2.pdf
PWC	https://paperswithcode.com/paper/texture-hallucination-for-large-scale
Repo
Framework

POD: Practical Object Detection with Scale-Sensitive Network


Title	POD: Practical Object Detection with Scale-Sensitive Network
Authors	Junran Peng, Ming Sun, Zhaoxiang Zhang, Tieniu Tan, Junjie Yan
Abstract	Scale-sensitive object detection remains a challenging task: most existing methods cannot learn scale explicitly and are not robust to scale variance. In addition, most existing methods are inefficient during training or slow during inference, which makes them unsuitable for real-time applications. In this paper, we propose a practical object detection method with a scale-sensitive network. Our method first predicts a global continuous scale, shared by all positions, for each convolution filter of each network stage. To effectively learn the scale, we average the spatial features and distill the scale from channels. For fast deployment, we propose a scale decomposition method that transfers the robust fractional scale into a combination of fixed integral scales for each convolution filter, which exploits the dilated convolution. We demonstrate it on one-stage and two-stage algorithms under different configurations. For practical applications, training our method is efficient and simple, and requires no complex data sampling or optimization strategy. During testing, the proposed method requires no extra operation and is well suited to hardware acceleration such as TensorRT and TVM. On COCO test-dev, our model achieves 41.5 mAP with a one-stage detector and 42.1 mAP with a two-stage detector based on ResNet-101, outperforming baselines by 2.4 and 2.1 respectively without extra FLOPS.
Tasks	Object Detection
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02225v1
PDF	https://arxiv.org/pdf/1909.02225v1.pdf
PWC	https://paperswithcode.com/paper/pod-practical-object-detection-with-scale
Repo
Framework
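
The POD abstract above mentions decomposing a fractional scale into a combination of fixed integral scales but does not give the rule. One plausible reading, shown here purely as an assumption, is linear interpolation between the two neighbouring integer dilations; the paper's actual decomposition may differ, and `decompose_scale` is a hypothetical helper name.

```python
import math

def decompose_scale(scale):
    """Split a fractional dilation into neighbouring integer dilations.

    Returns (integer_dilation, weight) pairs whose weights sum to 1 and
    whose weighted sum of dilations equals the input scale.
    (Assumed interpolation scheme, not taken from the paper.)
    """
    if scale < 1:
        raise ValueError("dilation must be at least 1")
    lower = math.floor(scale)
    frac = scale - lower
    if frac == 0:
        # Already an integer dilation; nothing to mix.
        return [(lower, 1.0)]
    return [(lower, 1.0 - frac), (lower + 1, frac)]

parts = decompose_scale(1.75)
reconstructed = sum(dilation * weight for dilation, weight in parts)
```

Under this reading, a filter with learned scale 1.75 would be served at deployment time by standard dilated convolutions with dilation 1 and 2, mixed with weights 0.25 and 0.75, which is the kind of fixed-integer operation that inference runtimes already support.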

Autonomous Driving in the Lung using Deep Learning for Localization


Title	Autonomous Driving in the Lung using Deep Learning for Localization
Authors	Jake Sganga, David Eng, Chauncey Graetzel, David B. Camarillo
Abstract	Lung cancer is the leading cause of cancer-related death worldwide, and early diagnosis is critical to improving patient outcomes. To diagnose cancer, a highly trained pulmonologist must navigate a flexible bronchoscope deep into the branched structure of the lung for biopsy. The biopsy fails to sample the target tissue in 26-33% of cases largely because of poor registration with the preoperative CT map. To improve intraoperative registration, we develop two deep learning approaches to localize the bronchoscope in the preoperative CT map based on the bronchoscopic video in real-time, called AirwayNet and BifurcationNet. The networks are trained entirely on simulated images derived from the patient-specific CT. When evaluated on recorded bronchoscopy videos in a phantom lung, AirwayNet outperforms other deep learning localization algorithms with an area under the precision-recall curve of 0.97. Using AirwayNet, we demonstrate autonomous driving in the phantom lung based on video feedback alone. The robot reaches four targets in the left and right lungs in 95% of the trials. On recorded videos in eight human cadaver lungs, AirwayNet achieves areas under the precision-recall curve ranging from 0.82 to 0.997.
Tasks	Autonomous Driving
Published	2019-07-16
URL	https://arxiv.org/abs/1907.08136v1
PDF	https://arxiv.org/pdf/1907.08136v1.pdf
PWC	https://paperswithcode.com/paper/autonomous-driving-in-the-lung-using-deep
Repo
Framework

Investigating Generalisation in Continuous Deep Reinforcement Learning


Title	Investigating Generalisation in Continuous Deep Reinforcement Learning
Authors	Chenyang Zhao, Olivier Sigaud, Freek Stulp, Timothy M. Hospedales
Abstract	Deep Reinforcement Learning has shown great success in a variety of control tasks. However, it is unclear how close we are to the vision of putting Deep RL into practice to solve real world problems. In particular, common practice in the field is to train policies on largely deterministic simulators and to evaluate algorithms through training performance alone, without a train/test distinction to ensure models generalise and are not overfitted. Moreover, it is not standard practice to check for generalisation under domain shift, although robustness to such system change between training and testing would be necessary for real-world Deep RL control, for example, in robotics. In this paper we study these issues by first characterising the sources of uncertainty that provide generalisation challenges in Deep RL. We then provide a new benchmark and thorough empirical evaluation of generalisation challenges for state of the art Deep RL methods. In particular, we show that, if generalisation is the goal, then common practice of evaluating algorithms based on their training performance leads to the wrong conclusions about algorithm choice. Finally, we evaluate several techniques for improving generalisation and draw conclusions about the most robust techniques to date.
Tasks
Published	2019-02-19
URL	http://arxiv.org/abs/1902.07015v2
PDF	http://arxiv.org/pdf/1902.07015v2.pdf
PWC	https://paperswithcode.com/paper/investigating-generalisation-in-continuous
Repo
Framework

Anomaly Detection in Images


Title	Anomaly Detection in Images
Authors	Manpreet Singh Minhas, John Zelek
Abstract	Visual defect assessment is a form of anomaly detection. It is highly relevant to finding faults such as cracks and markings in surface inspection tasks, for example on pavement and automotive parts. The task involves detecting the deviation/divergence of anomalous samples from normal ones. Two of the major challenges in supervised anomaly detection are the lack of labelled training data and the low availability of anomaly instances. Semi-supervised methods, which learn the underlying distribution of the normal samples and then measure the deviation/divergence from the estimated model as the anomaly score, have limitations in their overall ability to detect anomalies. This paper proposes the application of network-based deep transfer learning using convolutional neural networks (CNNs) for the task of anomaly detection. Single-class SVMs have been used in the past with some success; however, we hypothesize that deeper networks for single-class classification should perform better. Results obtained on established anomaly detection benchmarks, as well as on a real-world dataset, show that the proposed method clearly outperforms the existing state-of-the-art methods, achieving a staggering average area under the receiver operating characteristic curve of 0.99 on the tested datasets, which is an average improvement of 41% on CIFAR10, 20% on MNIST and 16% on the Cement Crack dataset.
Tasks	Anomaly Detection, Transfer Learning
Published	2019-05-09
URL	https://arxiv.org/abs/1905.13147v1
PDF	https://arxiv.org/pdf/1905.13147v1.pdf
PWC	https://paperswithcode.com/paper/190513147
Repo
Framework

CVPR19 Tracking and Detection Challenge: How crowded can it get?


Title	CVPR19 Tracking and Detection Challenge: How crowded can it get?
Authors	Patrick Dendorfer, Hamid Rezatofighi, Anton Milan, Javen Shi, Daniel Cremers, Ian Reid, Stefan Roth, Konrad Schindler, Laura Leal-Taixe
Abstract	Standardized benchmarks are crucial for the majority of computer vision applications. Although leaderboards and ranking tables should not be over-claimed, benchmarks often provide the most objective measure of performance and are therefore important guides for research. The benchmark for Multiple Object Tracking, MOTChallenge, was launched with the goal of establishing a standardized evaluation of multiple object tracking methods. The challenge focuses on multiple people tracking, since pedestrians are well studied in the tracking community, and precise tracking and detection have high practical relevance. Since the first release, MOT15, MOT16 and MOT17 have tremendously contributed to the community by introducing a clean dataset and precise framework to benchmark multi-object trackers. In this paper, we present our CVPR19 benchmark, consisting of 8 new sequences depicting very crowded challenging scenes. The benchmark will be presented at the 4th BMTT MOT Challenge Workshop at the Computer Vision and Pattern Recognition Conference (CVPR) 2019, and will evaluate the state of the art in multiple object tracking when handling extremely crowded scenarios.
Tasks	Multiple Object Tracking, Multiple People Tracking, Object Tracking
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04567v1
PDF	https://arxiv.org/pdf/1906.04567v1.pdf
PWC	https://paperswithcode.com/paper/cvpr19-tracking-and-detection-challenge-how
Repo
Framework

Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models


Title	Bayesian Automatic Relevance Determination for Utility Function Specification in Discrete Choice Models
Authors	Filipe Rodrigues, Nicola Ortelli, Michel Bierlaire, Francisco Pereira
Abstract	Specifying utility functions is a key step towards applying the discrete choice framework for understanding the behaviour processes that govern user choices. However, identifying the utility function specifications that best model and explain the observed choices can be a very challenging and time-consuming task. This paper seeks to help modellers by leveraging the Bayesian framework and the concept of automatic relevance determination (ARD), in order to automatically determine an optimal utility function specification from an exponentially large set of possible specifications in a purely data-driven manner. Based on recent advances in approximate Bayesian inference, a doubly stochastic variational inference is developed, which allows the proposed DCM-ARD model to scale to very large and high-dimensional datasets. Using semi-artificial choice data, the proposed approach is shown to very accurately recover the true utility function specifications that govern the observed choices. Moreover, when applied to real choice data, DCM-ARD is shown to be able to discover high-quality specifications that can outperform previous ones from the literature according to multiple criteria, thereby demonstrating its practical applicability.
Tasks	Bayesian Inference
Published	2019-06-10
URL	https://arxiv.org/abs/1906.03855v1
PDF	https://arxiv.org/pdf/1906.03855v1.pdf
PWC	https://paperswithcode.com/paper/bayesian-automatic-relevance-determination
Repo
Framework

Semi-supervised and Population Based Training for Voice Commands Recognition


Title	Semi-supervised and Population Based Training for Voice Commands Recognition
Authors	Oguz H. Elibol, Gokce Keskin, Anil Thomas
Abstract	We present a rapid design methodology that combines automated hyper-parameter tuning with semi-supervised training to build highly accurate and robust models for voice command classification. The proposed approach allows quick evaluation of network architectures to fit the performance and power constraints of available hardware, while ensuring good hyper-parameter choices for each network in real-world scenarios. Leveraging the vast amount of unlabeled data with a student/teacher-based semi-supervised method, classification accuracy is improved from 84% to 94% on the validation set. For model optimization, we explore the hyper-parameter space through population based training and obtain an optimized model in the same time frame as it takes to train a single model.
Tasks
Published	2019-05-10
URL	https://arxiv.org/abs/1905.04230v1
PDF	https://arxiv.org/pdf/1905.04230v1.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-and-population-based-training
Repo
Framework
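
As a rough illustration of the population based training loop mentioned in the voice commands abstract above, the sketch below runs the generic exploit/explore cycle on a toy quadratic loss with the learning rate as the only hyper-parameter. The toy objective, the perturbation factors, the 0.45 cap, and the function names are all assumptions; none of this reflects the authors' speech models or settings.

```python
import random

def train_step(theta, lr):
    """One gradient-descent step on the toy loss f(theta) = theta ** 2."""
    return theta - lr * 2.0 * theta

def pbt(initial_lrs, rounds=5, steps_per_round=3, seed=0):
    """Toy population based training; expects at least two workers."""
    rng = random.Random(seed)
    # Each worker carries its own weights (theta) and hyper-parameter (lr).
    population = [{"theta": 1.0, "lr": lr} for lr in initial_lrs]
    for _round in range(rounds):
        # Train every worker for a few steps with its current learning rate.
        for worker in population:
            for _step in range(steps_per_round):
                worker["theta"] = train_step(worker["theta"], worker["lr"])
        # Rank workers by loss (lower is better).
        population.sort(key=lambda w: w["theta"] ** 2)
        half = len(population) // 2
        for loser in population[half:]:
            # Exploit: copy weights and learning rate from a better worker.
            winner = rng.choice(population[:half])
            loser["theta"] = winner["theta"]
            # Explore: perturb the copied learning rate, capped for stability.
            loser["lr"] = min(0.45, winner["lr"] * rng.choice([0.8, 1.2]))
    return min(population, key=lambda w: w["theta"] ** 2)

best = pbt([0.01, 0.05, 0.1, 0.3])
best_loss = best["theta"] ** 2
```

Because all workers train in parallel and only exchange weights and hyper-parameters between rounds, the wall-clock cost is close to that of training one model, which matches the time-frame claim in the abstract.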