April 1, 2020

Paper Group ANR 498

Convolutional Neural Networks for Image-based Corn Kernel Detection and Counting


Title	Convolutional Neural Networks for Image-based Corn Kernel Detection and Counting
Authors	Saeed Khaki, Hieu Pham, Ye Han, Andy Kuhl, Wade Kent, Lizhi Wang
Abstract	Precise in-season corn grain yield estimates enable farmers to make real-time accurate harvest and grain marketing decisions minimizing possible losses of profitability. A well-developed corn ear can have up to 800 kernels, but manually counting the kernels on an ear of corn is labor-intensive, time-consuming and prone to human error. From an algorithmic perspective, the detection of the kernels from a single corn ear image is challenging due to the large number of kernels at different angles and the very small distance among the kernels. In this paper, we propose a kernel detection and counting method based on a sliding window approach. The proposed method detects and counts all corn kernels in a single corn ear image taken in uncontrolled lighting conditions. The sliding window approach uses a convolutional neural network (CNN) for kernel detection. Then, non-maximum suppression (NMS) is applied to remove overlapping detections. Finally, windows that are classified as kernels are passed to another CNN regression model for finding the (x,y) coordinates of the center of kernel image patches. Our experiments indicate that the proposed method can successfully detect the corn kernels with a low detection error and is also able to detect kernels on a batch of corn ears positioned at different angles.
Tasks
Published	2020-03-26
URL	https://arxiv.org/abs/2003.12025v1
PDF	https://arxiv.org/pdf/2003.12025v1.pdf
PWC	https://paperswithcode.com/paper/convolutional-neural-networks-for-image-based
Repo
Framework
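
The non-maximum suppression step mentioned in the abstract can be illustrated with a minimal greedy sketch. The box format `(x1, y1, x2, y2)`, the function names and the 0.5 overlap threshold are assumptions made for illustration; the abstract does not state the paper's actual settings.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two windows overlap heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)  # indices of the surviving windows
```

In a counting pipeline of this kind, the number of surviving windows would serve as the kernel count.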

Probabilistic spike propagation for FPGA implementation of spiking neural networks


Title	Probabilistic spike propagation for FPGA implementation of spiking neural networks
Authors	Abinand Nallathambi, Nitin Chandrachoodan
Abstract	Evaluation of spiking neural networks requires fetching a large number of synaptic weights to update postsynaptic neurons. This limits parallelism and becomes a bottleneck for hardware. We present an approach for spike propagation based on a probabilistic interpretation of weights, thus reducing memory accesses and updates. We study the effects of introducing randomness into the spike processing, and show on benchmark networks that this can be done with minimal impact on the recognition accuracy. We present an architecture and the trade-offs in accuracy on fully connected and convolutional networks for the MNIST and CIFAR10 datasets on the Xilinx Zynq platform.
Tasks
Published	2020-01-07
URL	https://arxiv.org/abs/2001.09725v1
PDF	https://arxiv.org/pdf/2001.09725v1.pdf
PWC	https://paperswithcode.com/paper/probabilistic-spike-propagation-for-fpga
Repo
Framework
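
One generic way to read "a probabilistic interpretation of weights" is stochastic propagation: a spike triggers a fixed-size update with probability proportional to the weight magnitude, so the expected update equals the weight. The sketch below illustrates only that idea. The function name, the `w_max` scaling and the update rule are assumptions and are not taken from the paper.

```python
import random


def propagate_spike(weights, w_max, rng):
    """For one presynaptic spike, return a stochastic update per synapse.

    Each synapse fires with probability |w| / w_max and, if it fires,
    contributes +/- w_max. In expectation the update equals w.
    """
    updates = []
    for w in weights:
        if rng.random() < abs(w) / w_max:
            updates.append(w_max if w > 0 else -w_max)
        else:
            updates.append(0.0)  # skipped: no update is issued
    return updates


rng = random.Random(0)
weights = [0.5, -0.25, 0.0, 1.0]  # hypothetical synaptic weights
trials = 20000
totals = [0.0] * len(weights)
for _ in range(trials):
    for k, u in enumerate(propagate_spike(weights, 1.0, rng)):
        totals[k] += u
means = [t / trials for t in totals]
print(means)  # empirical means should be close to the weights
```

Under this reading, small weights rarely trigger an update, which is the kind of saving the abstract alludes to.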

Deterministic Approximate EM Algorithm; Application to the Riemann Approximation EM and the Tempered EM


Title	Deterministic Approximate EM Algorithm; Application to the Riemann Approximation EM and the Tempered EM
Authors	Thomas Lartigue, Stanley Durrleman, Stéphanie Allassonnière
Abstract	The Expectation Maximisation (EM) algorithm is widely used to optimise non-convex likelihood functions with hidden variables. Many authors modified its simple design to fit more specific situations. For instance, the Expectation (E) step has been replaced by Monte Carlo (MC) approximations, Markov Chain Monte Carlo approximations, tempered approximations… Most of the well-studied approximations belong to the stochastic class. By comparison, the literature is lacking when it comes to deterministic approximations. In this paper, we introduce a theoretical framework, with state-of-the-art convergence guarantees, for any deterministic approximation of the E step. We analyse theoretically and empirically several approximations that fit into this framework. First, for cases with intractable E steps, we introduce a deterministic alternative to the MC-EM, using Riemann sums. This method is easy to implement and does not require the tuning of hyper-parameters. Then, we consider the tempered approximation, borrowed from the Simulated Annealing optimisation technique and meant to improve the EM solution. We prove that the tempered EM verifies the convergence guarantees for a wide range of temperature profiles. We showcase empirically how it is able to escape adversarial initialisations. Finally, we combine the Riemann and tempered approximations to accomplish both their purposes.
Tasks
Published	2020-03-23
URL	https://arxiv.org/abs/2003.10126v1
PDF	https://arxiv.org/pdf/2003.10126v1.pdf
PWC	https://paperswithcode.com/paper/deterministic-approximate-em-algorithm
Repo
Framework
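
A common form of the tempered E step raises the posterior over hidden variables to the power 1/T and renormalises, so that high temperatures flatten the responsibilities. The sketch below shows this for a one-dimensional Gaussian mixture. The mixture parameters and the function name are illustrative assumptions, and the paper's exact formulation may differ.

```python
import math


def tempered_responsibilities(x, weights, means, stds, temperature):
    """Posterior over mixture components for point x, raised to 1/T and renormalised."""
    log_joint = [
        math.log(w) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in zip(weights, means, stds)
    ]
    scaled = [lj / temperature for lj in log_joint]
    top = max(scaled)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in scaled]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-component mixture; T = 1 recovers the standard E step.
params = ([0.5, 0.5], [0.0, 4.0], [1.0, 1.0])
r_plain = tempered_responsibilities(0.5, *params, temperature=1.0)
r_hot = tempered_responsibilities(0.5, *params, temperature=10.0)
print(r_plain, r_hot)
```

With T = 10 the point is assigned far less decisively to the nearer component, which is what lets early iterations move away from a poor initialisation.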

Adaptive Structural Hyper-Parameter Configuration by Q-Learning


Title	Adaptive Structural Hyper-Parameter Configuration by Q-Learning
Authors	Haotian Zhang, Jianyong Sun, Zongben Xu
Abstract	Tuning hyper-parameters for evolutionary algorithms is an important issue in computational intelligence. Performance of an evolutionary algorithm depends not only on its operation strategy design, but also on its hyper-parameters. Hyper-parameters can be categorized in two dimensions as structural/numerical and time-invariant/time-variant. Particularly, structural hyper-parameters in existing studies are usually tuned in advance for time-invariant parameters, or with hand-crafted scheduling for time-variant parameters. In this paper, we make the first attempt to model the tuning of structural hyper-parameters as a reinforcement learning problem, and propose to tune, by Q-learning, the structural hyper-parameter that controls computational resource allocation in the CEC 2018 winner algorithm. Experimental results compare favorably against the winner algorithm on the CEC 2018 test functions.
Tasks	Q-Learning
Published	2020-03-02
URL	https://arxiv.org/abs/2003.00863v1
PDF	https://arxiv.org/pdf/2003.00863v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-structural-hyper-parameter
Repo
Framework
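
For readers unfamiliar with Q-learning, the standard tabular update is Q(s, a) ← Q(s, a) + α (r + γ max Q(s', ·) − Q(s, a)). The sketch below applies it to two hypothetical resource-allocation actions. The state names, action names and rewards are placeholders, not the paper's actual design.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical actions: which sub-procedure receives the next evaluation budget.
actions = ["allocate_A", "allocate_B"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

v1 = q_update(q, "s0", "allocate_A", 1.0, "s1", actions)
v2 = q_update(q, "s1", "allocate_B", 0.0, "s0", actions)
print(v1, v2)
```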

SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans


Title	SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
Authors	Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner
Abstract	We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene, explicitly modeling inter-relationships between objects-to-objects and objects-to-layout. Since object arrangement and scene layout are intrinsically coupled, we show that treating the problem jointly significantly helps to produce globally-consistent representations of a scene. Object CAD models are aligned to the scene by establishing dense correspondences between geometry, and we introduce a hierarchical layout prediction approach to estimate layout planes from corners and edges of the scene. To this end, we propose a message-passing graph neural network to model the inter-relationships between objects and layout, guiding generation of a global object alignment in a scene. By considering the global scene layout, we achieve significantly improved CAD alignments compared to state-of-the-art methods, improving from 41.83% to 58.41% alignment accuracy on SUNCG and from 50.05% to 61.24% on ScanNet, respectively. The resulting CAD-based representations make our method well-suited for applications in content creation such as augmented or virtual reality.
Tasks
Published	2020-03-27
URL	https://arxiv.org/abs/2003.12622v1
PDF	https://arxiv.org/pdf/2003.12622v1.pdf
PWC	https://paperswithcode.com/paper/scenecad-predicting-object-alignments-and
Repo
Framework

Transferring Cross-domain Knowledge for Video Sign Language Recognition


Title	Transferring Cross-domain Knowledge for Video Sign Language Recognition
Authors	Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li
Abstract	Word-level sign language recognition (WSLR) is a fundamental task in sign language interpretation. It requires models to recognize isolated sign words from videos. However, annotating WSLR data needs expert knowledge, thus limiting WSLR dataset acquisition. In contrast, there are abundant subtitled sign news videos on the internet. Since these videos have no word-level annotation and exhibit a large domain gap from isolated signs, they cannot be directly used for training WSLR models. We observe that despite the existence of a large domain gap, isolated and news signs share the same visual concepts, such as hand gestures and body movements. Motivated by this observation, we propose a novel method that learns domain-invariant visual concepts and fertilizes WSLR models by transferring knowledge of subtitled news signs to them. To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align these two domain features. In order to learn domain-invariant features within each class and suppress domain-specific features, our method further resorts to an external memory to store the class centroids of the aligned news signs. We then design a temporal attention based on the learnt descriptor to improve recognition performance. Experimental results on standard WSLR datasets show that our method outperforms previous state-of-the-art methods significantly. We also demonstrate the effectiveness of our method on automatically localizing signs from sign news, achieving 28.1 for AP@0.5.
Tasks	Sign Language Recognition
Published	2020-03-08
URL	https://arxiv.org/abs/2003.03703v2
PDF	https://arxiv.org/pdf/2003.03703v2.pdf
PWC	https://paperswithcode.com/paper/transferring-cross-domain-knowledge-for-video
Repo
Framework

Learning-based Bias Correction for Ultra-wideband Localization of Resource-constrained Mobile Robots


Title	Learning-based Bias Correction for Ultra-wideband Localization of Resource-constrained Mobile Robots
Authors	Wenda Zhao, Abhishek Goudar, Jacopo Panerati, Angela P. Schoellig
Abstract	Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) ranging is a promising solution which is low-cost, lightweight, and computationally inexpensive compared to alternative state-of-the-art approaches such as simultaneous localization and mapping, making it especially suited for resource-constrained aerial robots. Many commercially-available ultra-wideband radios, however, provide inaccurate, biased range measurements. In this article, we propose a bias correction framework compatible with both two-way ranging (TWR) and time difference of arrival (TDoA) ultra-wideband localization. Our method comprises two steps: (i) statistical outlier rejection and (ii) a learning-based bias correction. This approach is scalable and frugal enough to be deployed on-board a nano-quadcopter’s microcontroller. Previous research mostly focused on two-way ranging bias correction and has not been implemented in closed-loop nor using resource-constrained robots. Experimental results show that, using our approach, the localization error is reduced by ~18.5% and 48% (for TWR and TDoA, respectively), and a quadcopter can accurately track trajectories with position information from UWB only.
Tasks	Simultaneous Localization and Mapping
Published	2020-03-20
URL	https://arxiv.org/abs/2003.09371v1
PDF	https://arxiv.org/pdf/2003.09371v1.pdf
PWC	https://paperswithcode.com/paper/learning-based-bias-correction-for-ultra
Repo
Framework
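
The two-step structure described in the abstract, outlier rejection followed by a learned bias correction, can be sketched as below. The 3-sigma gate, the `bias_model` callable and its linear form are all hypothetical stand-ins. The abstract does not specify the statistical test or the learned model.

```python
def correct_range(measured, predicted, sigma, bias_model, features, gate=3.0):
    """Return a bias-corrected range, or None if the measurement is rejected.

    Step (i): reject the measurement when its normalised residual against
    the predicted range exceeds the gate.
    Step (ii): subtract the bias predicted by a learned model.
    """
    if abs(measured - predicted) / sigma > gate:
        return None
    return measured - bias_model(features)


def toy_bias_model(features):
    """Hypothetical stand-in for a trained regressor: bias grows with feature 0."""
    return 0.05 * features[0]


accepted = correct_range(2.10, 2.0, 0.1, toy_bias_model, [1.0])
rejected = correct_range(3.00, 2.0, 0.1, toy_bias_model, [1.0])
print(accepted, rejected)
```

Keeping both steps this cheap is in the spirit of the abstract's claim that the method fits on a microcontroller.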

SDVTracker: Real-Time Multi-Sensor Association and Tracking for Self-Driving Vehicles


Title	SDVTracker: Real-Time Multi-Sensor Association and Tracking for Self-Driving Vehicles
Authors	Shivam Gautam, Gregory P. Meyer, Carlos Vallespi-Gonzalez, Brian C. Becker
Abstract	Accurate motion state estimation of Vulnerable Road Users (VRUs) is a critical requirement for autonomous vehicles that navigate in urban environments. Due to their computational efficiency, many traditional autonomy systems perform multi-object tracking using Kalman Filters which frequently rely on hand-engineered association. However, such methods fail to generalize to crowded scenes and multi-sensor modalities, often resulting in poor state estimates which cascade to inaccurate predictions. We present a practical and lightweight tracking system, SDVTracker, that uses a deep learned model for association and state estimation in conjunction with an Interacting Multiple Model (IMM) filter. The proposed tracking method is fast, robust and generalizes across multiple sensor modalities and different VRU classes. In this paper, we detail a model that jointly optimizes both association and state estimation with a novel loss, an algorithm for determining ground-truth supervision, and a training procedure. We show this system significantly outperforms hand-engineered methods on a real-world urban driving dataset while running in less than 2.5 ms on CPU for a scene with 100 actors, making it suitable for self-driving applications where low latency and high accuracy are critical.
Tasks	Autonomous Vehicles, Multi-Object Tracking, Object Tracking
Published	2020-03-09
URL	https://arxiv.org/abs/2003.04447v1
PDF	https://arxiv.org/pdf/2003.04447v1.pdf
PWC	https://paperswithcode.com/paper/sdvtracker-real-time-multi-sensor-association
Repo
Framework

A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling


Title	A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling
Authors	Jonah Ong, Ba Tuong Vo, Ba Ngu Vo, Du Yong Kim, Sven Nordholm
Abstract	This paper proposes an online multi-camera multi-object tracker that only requires monocular detector training, independent of the multi-camera configurations, allowing seamless extension/deletion of cameras without retraining effort. The proposed algorithm has a linear complexity in the total number of detections across the cameras, and hence scales gracefully with the number of cameras. It operates in the 3D world frame, and provides 3D trajectory estimates of the objects. The key innovation is a high fidelity yet tractable 3D occlusion model, amenable to optimal Bayesian multi-view multi-object filtering, which seamlessly integrates, into a single Bayesian recursion, the sub-tasks of track management, state estimation, clutter rejection, and occlusion/misdetection handling. The proposed algorithm is evaluated on the latest WILDTRACKS dataset, and demonstrated to work in very crowded scenes on a new dataset.
Tasks	3D Multi-Object Tracking, Multi-Object Tracking, Object Tracking
Published	2020-01-13
URL	https://arxiv.org/abs/2001.04118v2
PDF	https://arxiv.org/pdf/2001.04118v2.pdf
PWC	https://paperswithcode.com/paper/a-bayesian-3d-multi-view-multi-object
Repo
Framework

Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection


Title	Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection
Authors	Han Wang, Chen Wang, Lihua Xie
Abstract	Loop closure detection is an essential and challenging problem in simultaneous localization and mapping (SLAM). It is often tackled with a light detection and ranging (LiDAR) sensor due to its viewpoint- and illumination-invariant properties. Existing works on 3D loop closure detection often leverage the matching of local or global geometrical-only descriptors, but without considering the intensity reading. In this paper we explore the intensity property from LiDAR scans and show that it can be effective for place recognition. Concretely, we propose a novel global descriptor, intensity scan context (ISC), that explores both geometry and intensity characteristics. To improve the efficiency of loop closure detection, an efficient two-stage hierarchical re-identification process is proposed, including a binary-operation based fast geometric relation retrieval and an intensity structure re-identification. Thorough experiments, including both local experiments and tests on public datasets, have been conducted to evaluate the performance of the proposed method. Our method achieves higher recall rate and recall precision than existing geometric-only methods.
Tasks	Loop Closure Detection, Simultaneous Localization and Mapping
Published	2020-03-12
URL	https://arxiv.org/abs/2003.05656v1
PDF	https://arxiv.org/pdf/2003.05656v1.pdf
PWC	https://paperswithcode.com/paper/intensity-scan-context-coding-intensity-and
Repo
Framework
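
A global descriptor that encodes both geometry and intensity can be pictured as a polar grid around the sensor in which each ring/sector cell stores an intensity statistic. The sketch below keeps the maximum intensity per cell. The grid size, the range limit and the choice of the maximum are assumptions for illustration, not the paper's stated construction.

```python
import math


def intensity_descriptor(points, num_rings=4, num_sectors=8, max_range=20.0):
    """Build a ring-by-sector grid holding the maximum intensity seen in each cell.

    points: iterable of (x, y, intensity) tuples in the sensor frame.
    """
    grid = [[0.0] * num_sectors for _ in range(num_rings)]
    for x, y, intensity in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # ignore returns beyond the descriptor's range
        ring = int(r / max_range * num_rings)
        angle = math.atan2(y, x) % (2 * math.pi)
        sector = min(int(angle / (2 * math.pi) * num_sectors), num_sectors - 1)
        grid[ring][sector] = max(grid[ring][sector], intensity)
    return grid


# Hypothetical scan: two nearby returns share a cell, one lies out of range.
scan = [(1.0, 0.0, 0.3), (1.0, 0.1, 0.7), (0.0, 10.0, 0.5), (50.0, 0.0, 0.9)]
desc = intensity_descriptor(scan)
print(desc[0][0], desc[2][2])
```

A binary occupancy mask derived from the same grid (cell non-empty or not) would allow a cheap geometric pre-check before comparing intensities, in the spirit of the two-stage retrieval the abstract describes.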

Deep State Space Models for Nonlinear System Identification


Title	Deep State Space Models for Nonlinear System Identification
Authors	Daniel Gedon, Niklas Wahlström, Thomas B. Schön, Lennart Ljung
Abstract	An actively evolving class of generative temporal models developed in the deep learning community is deep state space models (SSMs), which have a close connection to classic SSMs. In this work, six new deep SSMs are implemented and evaluated for the identification of established nonlinear dynamic system benchmarks. The models and their parameter learning algorithms are elaborated rigorously. The usage of deep SSMs as a black-box identification model can describe a wide range of dynamics due to the flexibility of deep neural networks. Additionally, the uncertainty of the system is modelled and therefore one obtains a much richer representation and a whole class of systems to describe the underlying dynamics.
Tasks
Published	2020-03-31
URL	https://arxiv.org/abs/2003.14162v1
PDF	https://arxiv.org/pdf/2003.14162v1.pdf
PWC	https://paperswithcode.com/paper/deep-state-space-models-for-nonlinear-system
Repo
Framework

Fast Loop Closure Detection via Binary Content


Title	Fast Loop Closure Detection via Binary Content
Authors	Han Wang, Juncheng Li, Maopeng Ran, Lihua Xie
Abstract	Loop closure detection plays an important role in reducing localization drift in Simultaneous Localization And Mapping (SLAM). It aims to find repetitive scenes from historical data to reset localization. To tackle the loop closure problem, existing methods often leverage the matching of visual features, which achieves good accuracy but requires high computational resources. However, feature point based methods ignore the patterns of the image, i.e., the shape of the objects as well as the distribution of objects in an image. It is believed that this information is usually unique for a scene and can be utilized to improve the performance of traditional loop closure detection methods. In this paper we leverage and compress the information into a binary image to accelerate an existing fast loop closure detection method via binary content. The proposed method can greatly reduce the computational cost without sacrificing recall rate. It consists of three parts: binary content construction, fast image retrieval and precise loop closure detection. No offline training is required. Our method is compared with the state-of-the-art loop closure detection methods and the results show that it outperforms the traditional methods in both recall rate and speed.
Tasks	Image Retrieval, Loop Closure Detection, Simultaneous Localization and Mapping
Published	2020-02-25
URL	https://arxiv.org/abs/2002.10622v1
PDF	https://arxiv.org/pdf/2002.10622v1.pdf
PWC	https://paperswithcode.com/paper/fast-loop-closure-detection-via-binary
Repo
Framework

Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization


Title	Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization
Authors	Arman Karimian, Ziqi Yang, Roberto Tron
Abstract	This paper introduces a novel and distributed method for detecting inter-map loop closure outliers in simultaneous localization and mapping (SLAM). The proposed algorithm does not rely on a good initialization and can handle more than two maps at a time. In multi-robot SLAM applications, maps made by different agents have nonidentical spatial frames of reference, which makes initialization very difficult in the presence of outliers. This paper presents a probabilistic approach for detecting incorrect orientation measurements prior to pose graph optimization by checking the geometric consistency of rotation measurements. Expectation-Maximization is used to fine-tune the model parameters. As an ancillary contribution, a new approximate discrete inference procedure is presented which uses evidence on loops in a graph and is based on optimization (Alternating Direction Method of Multipliers). This method yields superior results compared to Belief Propagation and has convergence guarantees. Simulation and experimental results are presented that evaluate the performance of the outlier detection method and the inference algorithm on synthetic and real-world data.
Tasks	Outlier Detection, Simultaneous Localization and Mapping
Published	2020-02-07
URL	https://arxiv.org/abs/2002.02638v1
PDF	https://arxiv.org/pdf/2002.02638v1.pdf
PWC	https://paperswithcode.com/paper/statistical-outlier-identification-in-multi
Repo
Framework
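
Checking the geometric consistency of rotation measurements rests on a simple fact: relative rotations composed around a closed loop should return to the identity. The sketch below shows this in a planar simplification where rotations are single angles. The paper's setting may well be 3D, and the function names and the example loop are illustrative assumptions.

```python
import math


def wrap_angle(theta):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))


def cycle_error(relative_angles):
    """Absolute residual rotation after composing all measurements around a loop.

    For outlier-free, noise-free measurements the residual is zero.
    """
    return abs(wrap_angle(sum(relative_angles)))


# Hypothetical loop of three relative rotation measurements (radians).
consistent = cycle_error([math.pi / 2, math.pi / 2, math.pi])
inconsistent = cycle_error([math.pi / 2, math.pi / 2, math.pi + 0.5])
print(consistent, inconsistent)
```

A large residual only indicates that at least one measurement on the loop is wrong. Attributing the error to a specific edge requires combining evidence from several loops, which is where the probabilistic inference described in the abstract comes in.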

Object Detection on Single Monocular Images through Canonical Correlation Analysis


Title	Object Detection on Single Monocular Images through Canonical Correlation Analysis
Authors	Zifan Yu, Suya You
Abstract	Without using extra 3-D data like point clouds or depth images for providing 3-D information, we retrieve the 3-D object information from single monocular images. High-quality predicted depth images are recovered from single monocular images, and they are fed into the 2-D object proposal network with the corresponding monocular images. Most existing deep learning frameworks with two-stream input data fuse the separate data by concatenating or adding, which assumes that every part of a feature map contributes equally to the whole task. However, when data are noisy, and too much information is redundant, these methods no longer produce predictions or classifications efficiently. In this report, we propose a two-dimensional CCA (canonical correlation analysis) framework to fuse monocular images and corresponding predicted depth images for basic computer vision tasks like image classification and object detection. Firstly, we implemented different structures with one-dimensional CCA and AlexNet to test the performance on the image classification task. Then, we applied one of these structures with 2D-CCA for object detection. During these experiments, we found that our proposed framework behaves better when taking predicted depth images as inputs with the model trained from ground truth depth.
Tasks	Image Classification, Object Detection
Published	2020-02-13
URL	https://arxiv.org/abs/2002.05349v1
PDF	https://arxiv.org/pdf/2002.05349v1.pdf
PWC	https://paperswithcode.com/paper/object-detection-on-single-monocular-images
Repo
Framework

Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data


Title	Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data
Authors	Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo
Abstract	Machine learning applications in the legal field are numerous and diverse. In order to make a contribution to both the machine learning community and the legal community, we have made efforts to create a model compatible with the classification of text sequences, valuing the interpretability of the results. The purpose of this paper is to classify legal proceedings into three possible status classes, which are (i) archived proceedings, (ii) active proceedings and (iii) suspended proceedings. Our approach is composed of natural language processing, supervised and unsupervised deep learning models and performed remarkably well in the classification task. Furthermore, we gained some insights regarding the patterns learned by the neural network by applying tools to make the results more interpretable.
Tasks
Published	2020-03-13
URL	https://arxiv.org/abs/2003.11561v1
PDF	https://arxiv.org/pdf/2003.11561v1.pdf
PWC	https://paperswithcode.com/paper/predicting-legal-proceedings-status-an
Repo
Framework