Paper Group ANR 498
Convolutional Neural Networks for Image-based Corn Kernel Detection and Counting
Title | Convolutional Neural Networks for Image-based Corn Kernel Detection and Counting |
Authors | Saeed Khaki, Hieu Pham, Ye Han, Andy Kuhl, Wade Kent, Lizhi Wang |
Abstract | Precise in-season corn grain yield estimates enable farmers to make real-time, accurate harvest and grain marketing decisions, minimizing possible losses of profitability. A well-developed corn ear can have up to 800 kernels, but manually counting the kernels on an ear of corn is labor-intensive, time-consuming, and prone to human error. From an algorithmic perspective, detecting the kernels from a single corn ear image is challenging due to the large number of kernels at different angles and the very small distances between kernels. In this paper, we propose a kernel detection and counting method based on a sliding window approach. The proposed method detects and counts all corn kernels in a single corn ear image taken in uncontrolled lighting conditions. The sliding window approach uses a convolutional neural network (CNN) for kernel detection. Then, non-maximum suppression (NMS) is applied to remove overlapping detections. Finally, windows that are classified as kernels are passed to another CNN regression model that finds the (x, y) coordinates of the center of each kernel image patch. Our experiments indicate that the proposed method can successfully detect corn kernels with a low detection error and is also able to detect kernels on a batch of corn ears positioned at different angles. |
Tasks | |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12025v1 |
https://arxiv.org/pdf/2003.12025v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-networks-for-image-based |
Repo | |
Framework | |
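The abstract above describes a three-stage pipeline: window classification by a CNN, non-maximum suppression, and a second CNN that regresses each kernel centre. Below is a minimal sketch of how such a pipeline could be wired together; the window size, stride, thresholds, and the `classifier`/`center_regressor` callables are placeholders, not the authors' trained models.

```python
# Hedged sketch of a sliding-window detect-then-regress kernel counter.
import numpy as np

def sliding_windows(image, win=64, stride=16):
    """Yield (x, y, patch) for every window position in a grayscale/RGB image."""
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield x, y, image[y:y + win, x:x + win]

def nms(boxes, scores, iou_thr=0.3):
    """Greedy non-maximum suppression; boxes are (x1, y1, x2, y2)."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou < iou_thr]
    return keep

def count_kernels(image, classifier, center_regressor, win=64, stride=16):
    """classifier(patch) -> kernel probability; center_regressor(patch) -> (cx, cy)."""
    boxes, scores, patches = [], [], []
    for x, y, patch in sliding_windows(image, win, stride):
        p = classifier(patch)
        if p > 0.5:
            boxes.append([x, y, x + win, y + win])
            scores.append(p)
            patches.append(patch)
    if not boxes:
        return 0, []
    keep = nms(np.array(boxes, float), np.array(scores), iou_thr=0.3)
    centers = []
    for i in keep:
        cx, cy = center_regressor(patches[i])      # (x, y) offset within the patch
        centers.append((boxes[i][0] + cx, boxes[i][1] + cy))
    return len(keep), centers
```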
Probabilistic spike propagation for FPGA implementation of spiking neural networks
Title | Probabilistic spike propagation for FPGA implementation of spiking neural networks |
Authors | Abinand Nallathambi, Nitin Chandrachoodan |
Abstract | Evaluation of spiking neural networks requires fetching a large number of synaptic weights to update postsynaptic neurons. This limits parallelism and becomes a bottleneck for hardware. We present an approach for spike propagation based on a probabilistic interpretation of weights, thus reducing memory accesses and updates. We study the effects of introducing randomness into the spike processing, and show on benchmark networks that this can be done with minimal impact on the recognition accuracy. We present an architecture and the trade-offs in accuracy on fully connected and convolutional networks for the MNIST and CIFAR10 datasets on the Xilinx Zynq platform. |
Tasks | |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.09725v1 |
https://arxiv.org/pdf/2001.09725v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-spike-propagation-for-fpga |
Repo | |
Framework | |
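A minimal sketch of the probabilistic-weight idea sketched in the abstract: instead of fetching every synaptic weight when a presynaptic neuron fires, each connection is visited with probability proportional to |w| and, when visited, receives a fixed-magnitude update, so the deterministic update is recovered in expectation. The normalisation choice and all names are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def propagate_spike_exact(potentials, weights, pre_idx):
    """Conventional propagation: read the whole weight row of the spiking neuron."""
    potentials += weights[pre_idx]

def propagate_spike_probabilistic(potentials, weights, pre_idx, w_max=None):
    """Probabilistic propagation: sample connections, send fixed-magnitude spikes."""
    row = weights[pre_idx]
    if w_max is None:
        w_max = np.abs(weights).max()
    p = np.abs(row) / w_max                      # propagation probability per synapse
    mask = rng.random(row.shape) < p             # only these synapses are "fetched"
    potentials[mask] += np.sign(row[mask]) * w_max

# Toy check that the two schemes agree in expectation.
W = rng.normal(scale=0.1, size=(4, 1000))        # 4 presynaptic, 1000 postsynaptic neurons
exact = np.zeros(1000)
approx = np.zeros(1000)
for _ in range(2000):                            # average many stochastic trials
    propagate_spike_probabilistic(approx, W, pre_idx=2)
propagate_spike_exact(exact, W, pre_idx=2)
print("max deviation:", np.abs(exact - approx / 2000).max())
```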
Deterministic Approximate EM Algorithm; Application to the Riemann Approximation EM and the Tempered EM
Title | Deterministic Approximate EM Algorithm; Application to the Riemann Approximation EM and the Tempered EM |
Authors | Thomas Lartigue, Stanley Durrleman, Stéphanie Allassonnière |
Abstract | The Expectation Maximisation (EM) algorithm is widely used to optimise non-convex likelihood functions with hidden variables. Many authors have modified its simple design to fit more specific situations. For instance, the Expectation (E) step has been replaced by Monte Carlo (MC) approximations, Markov Chain Monte Carlo approximations, tempered approximations, and so on. Most of the well-studied approximations belong to the stochastic class. By comparison, the literature is lacking when it comes to deterministic approximations. In this paper, we introduce a theoretical framework, with state-of-the-art convergence guarantees, for any deterministic approximation of the E step. We analyse theoretically and empirically several approximations that fit into this framework. First, for cases with intractable E steps, we introduce a deterministic alternative to the MC-EM, using Riemann sums. This method is easy to implement and does not require the tuning of hyper-parameters. Then, we consider the tempered approximation, borrowed from the Simulated Annealing optimisation technique and meant to improve the EM solution. We prove that the tempered EM verifies the convergence guarantees for a wide range of temperature profiles. We showcase empirically how it is able to escape adversarial initialisations. Finally, we combine the Riemann and tempered approximations to accomplish both their purposes. |
Tasks | |
Published | 2020-03-23 |
URL | https://arxiv.org/abs/2003.10126v1 |
https://arxiv.org/pdf/2003.10126v1.pdf | |
PWC | https://paperswithcode.com/paper/deterministic-approximate-em-algorithm |
Repo | |
Framework | |
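To make the tempered E step concrete, here is a hedged sketch for a one-dimensional, two-component Gaussian mixture: the posterior responsibilities are raised to the power 1/T and renormalised, with the temperature decreasing towards 1 over the iterations. The temperature profile, initialisation, and mixture model are illustrative choices, not the profiles or models analysed in the paper (and the Riemann-sum variant, which replaces an intractable E-step integral by a sum over a grid of the latent variable, is not shown).

```python
import numpy as np
from scipy.stats import norm

def tempered_em(x, n_iter=50, T0=5.0):
    mu = np.array([x.min(), x.max()])            # deliberately crude initialisation
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for n in range(n_iter):
        T = 1.0 + (T0 - 1.0) * np.exp(-n / 10)   # temperature decreasing towards 1
        # Tempered E step: responsibilities proportional to (pi_k N(x | mu_k, sigma_k))^(1/T)
        dens = pi * norm.pdf(x[:, None], mu, sigma)
        r = dens ** (1.0 / T)
        r /= r.sum(axis=1, keepdims=True)
        # Standard M step with the tempered responsibilities
        Nk = r.sum(axis=0)
        pi = Nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / Nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)
    return pi, mu, sigma

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 300)])
print(tempered_em(x))
```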
Adaptive Structural Hyper-Parameter Configuration by Q-Learning
Title | Adaptive Structural Hyper-Parameter Configuration by Q-Learning |
Authors | Haotian Zhang, Jianyong Sun, Zongben Xu |
Abstract | Tuning hyper-parameters for evolutionary algorithms is an important issue in computational intelligence. The performance of an evolutionary algorithm depends not only on its operation strategy design, but also on its hyper-parameters. Hyper-parameters can be categorized along two dimensions: structural/numerical and time-invariant/time-variant. In particular, structural hyper-parameters in existing studies are usually either tuned in advance as time-invariant parameters or varied with hand-crafted scheduling as time-variant parameters. In this paper, we make the first attempt to model the tuning of structural hyper-parameters as a reinforcement learning problem, and propose to tune, via Q-learning, the structural hyper-parameter that controls computational resource allocation in the CEC 2018 winner algorithm. Experimental results compare favorably against the winner algorithm on the CEC 2018 test functions. |
Tasks | Q-Learning |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00863v1 |
https://arxiv.org/pdf/2003.00863v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-structural-hyper-parameter |
Repo | |
Framework | |
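Below is an illustrative tabular Q-learning loop for adapting a structural hyper-parameter (e.g. an index controlling how computational resource is split among sub-populations) during an evolutionary run. The state and action definitions, the reward, and `run_generation` are placeholders; the paper's actual formulation differs in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 3          # e.g. discretised search progress x candidate allocations
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.2

def run_generation(action):
    """Placeholder: run one EA generation with the chosen allocation and return
    (reward, next_state). Here the reward is synthetic noise for illustration."""
    reward = rng.normal(loc=action * 0.1)        # pretend larger allocations help slightly
    next_state = rng.integers(n_states)
    return reward, next_state

state = 0
for _ in range(1000):
    # epsilon-greedy selection of the structural hyper-parameter value
    action = rng.integers(n_actions) if rng.random() < eps else int(Q[state].argmax())
    reward, next_state = run_generation(action)
    # standard Q-learning update
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state

print(Q)
```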
SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
Title | SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans |
Authors | Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash, Angela Dai, Matthias Nießner |
Abstract | We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments and layout estimations of the scanned scene, explicitly modeling inter-relationships between objects and objects as well as between objects and layout. Since object arrangement and scene layout are intrinsically coupled, we show that treating the problem jointly significantly helps to produce globally consistent representations of a scene. Object CAD models are aligned to the scene by establishing dense correspondences between geometry, and we introduce a hierarchical layout prediction approach to estimate layout planes from corners and edges of the scene. To this end, we propose a message-passing graph neural network to model the inter-relationships between objects and layout, guiding the generation of a globally consistent object alignment in a scene. By considering the global scene layout, we achieve significantly improved CAD alignments compared to state-of-the-art methods, improving from 41.83% to 58.41% alignment accuracy on SUNCG and from 50.05% to 61.24% on ScanNet, respectively. The resulting CAD-based representations make our method well-suited for applications in content creation such as augmented or virtual reality. |
Tasks | |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12622v1 |
https://arxiv.org/pdf/2003.12622v1.pdf | |
PWC | https://paperswithcode.com/paper/scenecad-predicting-object-alignments-and |
Repo | |
Framework | |
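A toy sketch of one message-passing round over a scene graph whose nodes are object candidates and layout elements, the aggregation pattern the abstract alludes to. The node features, adjacency, and tiny weight matrices below are random placeholders, only the message/update structure is the point; the paper's network and training are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d = 6, 16                       # e.g. 4 object candidates + 2 layout planes
feat = rng.normal(size=(n_nodes, d))     # per-node geometric features
adj = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)   # fully connected scene graph

W_msg = rng.normal(size=(2 * d, d)) * 0.1
W_upd = rng.normal(size=(2 * d, d)) * 0.1

def relu(x):
    return np.maximum(x, 0)

def message_passing_step(feat, adj):
    """One round: messages along edges depend on both endpoints, then a node update."""
    msgs = np.zeros_like(feat)
    for i in range(n_nodes):
        nbrs = np.nonzero(adj[i])[0]
        m = relu(np.concatenate([np.repeat(feat[i:i + 1], len(nbrs), axis=0),
                                 feat[nbrs]], axis=1) @ W_msg)
        msgs[i] = m.mean(axis=0)          # aggregate incoming messages
    return relu(np.concatenate([feat, msgs], axis=1) @ W_upd)

feat = message_passing_step(feat, adj)
print(feat.shape)                         # (6, 16)
```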
Transferring Cross-domain Knowledge for Video Sign Language Recognition
Title | Transferring Cross-domain Knowledge for Video Sign Language Recognition |
Authors | Dongxu Li, Xin Yu, Chenchen Xu, Lars Petersson, Hongdong Li |
Abstract | Word-level sign language recognition (WSLR) is a fundamental task in sign language interpretation. It requires models to recognize isolated sign words from videos. However, annotating WSLR data needs expert knowledge, thus limiting WSLR dataset acquisition. On the contrary, there are abundant subtitled sign news videos on the internet. Since these videos have no word-level annotation and exhibit a large domain gap from isolated signs, they cannot be directly used for training WSLR models. We observe that despite the existence of a large domain gap, isolated and news signs share the same visual concepts, such as hand gestures and body movements. Motivated by this observation, we propose a novel method that learns domain-invariant visual concepts and fertilizes WSLR models by transferring knowledge of subtitled news signs to them. To this end, we extract news signs using a base WSLR model, and then design a classifier jointly trained on news and isolated signs to coarsely align the features of these two domains. In order to learn domain-invariant features within each class and suppress domain-specific features, our method further resorts to an external memory to store the class centroids of the aligned news signs. We then design a temporal attention mechanism based on the learnt descriptor to improve recognition performance. Experimental results on standard WSLR datasets show that our method outperforms previous state-of-the-art methods significantly. We also demonstrate the effectiveness of our method on automatically localizing signs from sign news, achieving 28.1 for AP@0.5. |
Tasks | Sign Language Recognition |
Published | 2020-03-08 |
URL | https://arxiv.org/abs/2003.03703v2 |
https://arxiv.org/pdf/2003.03703v2.pdf | |
PWC | https://paperswithcode.com/paper/transferring-cross-domain-knowledge-for-video |
Repo | |
Framework | |
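A hedged sketch of the memory-guided temporal attention idea described above: per-frame features of a clip are compared to a class centroid stored in an external memory, and the similarities form attention weights over time. The feature extractor, memory contents, and dimensions are placeholders, not the paper's actual network.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_guided_pooling(frame_feats, centroid, temperature=0.1):
    """frame_feats: (T, d) per-frame features; centroid: (d,) descriptor from memory."""
    # cosine similarity between each frame and the memory centroid
    f = frame_feats / np.linalg.norm(frame_feats, axis=1, keepdims=True)
    c = centroid / np.linalg.norm(centroid)
    sim = f @ c
    attn = softmax(sim / temperature)            # temporal attention weights
    return attn @ frame_feats                    # attention-pooled clip descriptor

rng = np.random.default_rng(0)
clip = rng.normal(size=(32, 256))                # 32 frames, 256-d features
mem = rng.normal(size=256)                       # centroid retrieved from memory
print(memory_guided_pooling(clip, mem).shape)    # (256,)
```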
Learning-based Bias Correction for Ultra-wideband Localization of Resource-constrained Mobile Robots
Title | Learning-based Bias Correction for Ultra-wideband Localization of Resource-constrained Mobile Robots |
Authors | Wenda Zhao, Abhishek Goudar, Jacopo Panerati, Angela P. Schoellig |
Abstract | Accurate indoor localization is a crucial enabling technology for many robotics applications, from warehouse management to monitoring tasks. Ultra-wideband (UWB) ranging is a promising solution which is low-cost, lightweight, and computationally inexpensive compared to alternative state-of-the-art approaches such as simultaneous localization and mapping, making it especially suited for resource-constrained aerial robots. Many commercially-available ultra-wideband radios, however, provide inaccurate, biased range measurements. In this article, we propose a bias correction framework compatible with both two-way ranging and time difference of arrival ultra-wideband localization. Our method comprises two steps: (i) statistical outlier rejection and (ii) a learning-based bias correction. This approach is scalable and frugal enough to be deployed on board a nano-quadcopter's microcontroller. Previous research has mostly focused on two-way ranging bias correction and has not been implemented in closed loop or on resource-constrained robots. Experimental results show that, using our approach, the localization error is reduced by ~18.5% and 48% (for two-way ranging and time difference of arrival, respectively), and a quadcopter can accurately track trajectories with position information from UWB only. |
Tasks | Simultaneous Localization and Mapping |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09371v1 |
https://arxiv.org/pdf/2003.09371v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-based-bias-correction-for-ultra |
Repo | |
Framework | |
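The two-step structure named in the abstract (statistical outlier rejection, then a learned bias correction) can be illustrated as below. The MAD-based rejection rule and the polynomial bias model are stand-ins for the paper's actual filter and neural-network regressor, and the synthetic data is purely illustrative.

```python
import numpy as np

def reject_outliers(residuals, k=3.0):
    """Drop measurements whose residual lies beyond k robust standard deviations."""
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med)) + 1e-9
    keep = np.abs(residuals - med) < k * 1.4826 * mad
    return residuals[keep], keep

def fit_bias_model(features, bias, degree=3):
    """Fit a simple 1-D polynomial bias model (bias vs. measured range); the paper
    conditions on richer features and uses a learned regressor instead."""
    return np.polynomial.Polynomial.fit(features, bias, degree)

# Toy training data: ranges with a range-dependent bias, noise, and gross outliers.
rng = np.random.default_rng(0)
true_range = rng.uniform(1.0, 10.0, 500)
measured = true_range + 0.05 * true_range + rng.normal(0.0, 0.03, 500)
measured[::50] += 2.0                                   # a few gross outliers
residuals = measured - true_range                       # ground truth available offline
clean_res, keep = reject_outliers(residuals)
bias_model = fit_bias_model(measured[keep], clean_res)

# At run time, the learned bias is simply subtracted from new measurements.
corrected = measured[keep] - bias_model(measured[keep])
print(np.abs(corrected - true_range[keep]).mean())      # residual error after correction
```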
SDVTracker: Real-Time Multi-Sensor Association and Tracking for Self-Driving Vehicles
Title | SDVTracker: Real-Time Multi-Sensor Association and Tracking for Self-Driving Vehicles |
Authors | Shivam Gautam, Gregory P. Meyer, Carlos Vallespi-Gonzalez, Brian C. Becker |
Abstract | Accurate motion state estimation of Vulnerable Road Users (VRUs) is a critical requirement for autonomous vehicles that navigate in urban environments. Due to their computational efficiency, many traditional autonomy systems perform multi-object tracking using Kalman Filters which frequently rely on hand-engineered association. However, such methods fail to generalize to crowded scenes and multi-sensor modalities, often resulting in poor state estimates which cascade to inaccurate predictions. We present a practical and lightweight tracking system, SDVTracker, that uses a deep learned model for association and state estimation in conjunction with an Interacting Multiple Model (IMM) filter. The proposed tracking method is fast, robust and generalizes across multiple sensor modalities and different VRU classes. In this paper, we detail a model that jointly optimizes both association and state estimation with a novel loss, an algorithm for determining ground-truth supervision, and a training procedure. We show this system significantly outperforms hand-engineered methods on a real-world urban driving dataset while running in less than 2.5 ms on CPU for a scene with 100 actors, making it suitable for self-driving applications where low latency and high accuracy are critical. |
Tasks | Autonomous Vehicles, Multi-Object Tracking, Object Tracking |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04447v1 |
https://arxiv.org/pdf/2003.04447v1.pdf | |
PWC | https://paperswithcode.com/paper/sdvtracker-real-time-multi-sensor-association |
Repo | |
Framework | |
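A minimal association sketch in the spirit of the abstract: a model scores track-detection pairs and a global assignment is solved over those scores. Here the "model" is a hand-written distance-based placeholder; the paper uses a learned network for scoring and an IMM filter for the state update, neither of which is shown.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def pairwise_scores(track_states, detections, score_fn):
    """Build a (num_tracks, num_detections) affinity matrix."""
    S = np.zeros((len(track_states), len(detections)))
    for i, t in enumerate(track_states):
        for j, d in enumerate(detections):
            S[i, j] = score_fn(t, d)
    return S

def associate(track_states, detections, score_fn, min_score=0.1):
    S = pairwise_scores(track_states, detections, score_fn)
    rows, cols = linear_sum_assignment(-S)        # maximise total affinity
    return [(i, j) for i, j in zip(rows, cols) if S[i, j] >= min_score]

# Placeholder affinity: exponential in the negative Euclidean gap between the
# track's predicted position and the detection centroid.
score = lambda t, d: float(np.exp(-np.linalg.norm(t - d)))

tracks = np.array([[0.0, 0.0], [5.0, 5.0]])
dets = np.array([[0.2, -0.1], [5.2, 4.9], [20.0, 20.0]])
print(associate(tracks, dets, score))             # -> [(0, 0), (1, 1)]
```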
A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling
Title | A Bayesian Filter for Multi-view 3D Multi-object Tracking with Occlusion Handling |
Authors | Jonah Ong, Ba Tuong Vo, Ba Ngu Vo, Du Yong Kim, Sven Nordholm |
Abstract | This paper proposes an online multi-camera multi-object tracker that only requires monocular detector training, independent of the multi-camera configuration, allowing seamless extension/deletion of cameras without (re)training effort. The proposed algorithm has a linear complexity in the total number of detections across the cameras, and hence scales gracefully with the number of cameras. It operates in the 3D world frame, and provides 3D trajectory estimates of the objects. The key innovation is a high-fidelity yet tractable 3D occlusion model, amenable to optimal Bayesian multi-view multi-object filtering, which seamlessly integrates, into a single Bayesian recursion, the sub-tasks of track management, state estimation, clutter rejection, and occlusion/misdetection handling. The proposed algorithm is evaluated on the latest WILDTRACKS dataset, and demonstrated to work in very crowded scenes on a new dataset. |
Tasks | 3D Multi-Object Tracking, Multi-Object Tracking, Object Tracking |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04118v2 |
https://arxiv.org/pdf/2001.04118v2.pdf | |
PWC | https://paperswithcode.com/paper/a-bayesian-3d-multi-view-multi-object |
Repo | |
Framework | |
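As a very rough geometric proxy for the kind of per-camera occlusion reasoning the abstract alludes to, the sketch below marks an object as occluded in a camera if another object is closer to that camera and lies within a small angular margin of its line of sight. This is only an illustrative test, not the paper's Bayesian occlusion model.

```python
import numpy as np

def occluded(objects_xyz, cam_xyz, i, angle_thresh=0.05):
    """Return True if object i is occluded by any other object as seen from cam_xyz."""
    di = objects_xyz[i] - cam_xyz
    for j, pj in enumerate(objects_xyz):
        if j == i:
            continue
        dj = pj - cam_xyz
        closer = np.linalg.norm(dj) < np.linalg.norm(di)
        cos_angle = dj @ di / (np.linalg.norm(dj) * np.linalg.norm(di))
        if closer and cos_angle > np.cos(angle_thresh):
            return True
    return False

objs = np.array([[0.0, 5.0, 0.0], [0.0, 2.0, 0.0], [3.0, 3.0, 0.0]])
cam = np.array([0.0, 0.0, 0.0])
print([occluded(objs, cam, i) for i in range(3)])   # object 0 is hidden behind object 1
```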
Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection
Title | Intensity Scan Context: Coding Intensity and Geometry Relations for Loop Closure Detection |
Authors | Han Wang, Chen Wang, Lihua Xie |
Abstract | Loop closure detection is an essential and challenging problem in simultaneous localization and mapping (SLAM). It is often tackled with light detection and ranging (LiDAR) sensors due to their viewpoint- and illumination-invariant properties. Existing works on 3D loop closure detection often leverage the matching of local or global geometry-only descriptors, without considering the intensity reading. In this paper we explore the intensity property of LiDAR scans and show that it can be effective for place recognition. Concretely, we propose a novel global descriptor, intensity scan context (ISC), that explores both geometry and intensity characteristics. To improve the efficiency of loop closure detection, an efficient two-stage hierarchical re-identification process is proposed, including a binary-operation-based fast geometric relation retrieval and an intensity structure re-identification. Thorough experiments, including both local experiments and tests on public datasets, have been conducted to evaluate the performance of the proposed method. Our method achieves a higher recall rate and precision than existing geometry-only methods. |
Tasks | Loop Closure Detection, Simultaneous Localization and Mapping |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.05656v1 |
https://arxiv.org/pdf/2003.05656v1.pdf | |
PWC | https://paperswithcode.com/paper/intensity-scan-context-coding-intensity-and |
Repo | |
Framework | |
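A rough sketch of building an intensity-scan-context-style descriptor: LiDAR points are binned into (ring, sector) cells in polar coordinates and each cell stores a summary of the intensity readings falling into it. The bin counts and the max-intensity summary are illustrative choices, not the paper's exact encoding or its two-stage retrieval.

```python
import numpy as np

def intensity_scan_context(points, intensities, n_rings=20, n_sectors=60, max_range=60.0):
    """points: (N, 2 or 3) in the sensor frame; intensities: (N,)."""
    r = np.linalg.norm(points[:, :2], axis=1)
    theta = np.mod(np.arctan2(points[:, 1], points[:, 0]), 2 * np.pi)
    ring = np.clip((r / max_range * n_rings).astype(int), 0, n_rings - 1)
    sector = np.clip((theta / (2 * np.pi) * n_sectors).astype(int), 0, n_sectors - 1)
    desc = np.zeros((n_rings, n_sectors))
    for k in range(len(r)):
        if r[k] <= max_range:
            # keep the strongest intensity seen in each (ring, sector) cell
            desc[ring[k], sector[k]] = max(desc[ring[k], sector[k]], intensities[k])
    return desc

rng = np.random.default_rng(0)
pts = rng.uniform(-40, 40, size=(5000, 3))
inten = rng.uniform(0, 1, size=5000)
print(intensity_scan_context(pts, inten).shape)    # (20, 60)
```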
Deep State Space Models for Nonlinear System Identification
Title | Deep State Space Models for Nonlinear System Identification |
Authors | Daniel Gedon, Niklas Wahlström, Thomas B. Schön, Lennart Ljung |
Abstract | Deep state space models (SSMs) are an actively evolving model class for generative temporal models developed in the deep learning community, with a close connection to classic SSMs. In this work, six new deep SSMs are implemented and evaluated for the identification of established nonlinear dynamic system benchmarks. The models and their parameter learning algorithms are elaborated rigorously. Used as a black-box identification model, deep SSMs can describe a wide range of dynamics due to the flexibility of deep neural networks. Additionally, the uncertainty of the system is modelled, and therefore one obtains a much richer representation and a whole class of systems to describe the underlying dynamics. |
Tasks | |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14162v1 |
https://arxiv.org/pdf/2003.14162v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-state-space-models-for-nonlinear-system |
Repo | |
Framework | |
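A minimal generative sketch of a deep state space model: the latent state evolves through a small neural transition with Gaussian noise, and observations are emitted by a neural decoder. The weights below are random and the networks untrained; in the paper the models are learned from data (e.g. by variational inference), which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
dz, du, dy = 4, 1, 1                            # latent, input, and output dimensions

W1 = rng.normal(size=(dz + du, 32)) * 0.3
W2 = rng.normal(size=(32, dz)) * 0.3
V1 = rng.normal(size=(dz, 32)) * 0.3
V2 = rng.normal(size=(32, dy)) * 0.3

def f(z, u):                                    # neural state-transition mean
    return np.tanh(np.concatenate([z, u]) @ W1) @ W2

def g(z):                                       # neural emission mean
    return np.tanh(z @ V1) @ V2

def rollout(u_seq, sigma_z=0.05, sigma_y=0.01):
    z = np.zeros(dz)
    ys = []
    for u in u_seq:
        z = f(z, u) + sigma_z * rng.normal(size=dz)      # stochastic transition
        ys.append(g(z) + sigma_y * rng.normal(size=dy))  # noisy observation
    return np.array(ys)

u = np.sin(np.linspace(0, 6 * np.pi, 200))[:, None]      # excitation signal
print(rollout(u).shape)                                   # (200, 1)
```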
Fast Loop Closure Detection via Binary Content
Title | Fast Loop Closure Detection via Binary Content |
Authors | Han Wang, Juncheng Li, Maopeng Ran, Lihua Xie |
Abstract | Loop closure detection plays an important role in reducing localization drift in Simultaneous Localization And Mapping (SLAM). It aims to find repetitive scenes in historical data to reset localization. To tackle the loop closure problem, existing methods often leverage the matching of visual features, which achieves good accuracy but requires high computational resources. Moreover, feature-point-based methods ignore the patterns of an image, i.e., the shapes of objects as well as their distribution in the image. We believe that this information is usually unique to a scene and can be utilized to improve the performance of traditional loop closure detection methods. In this paper, we leverage and compress this information into a binary image to accelerate an existing fast loop closure detection method via binary content. The proposed method can greatly reduce the computational cost without sacrificing recall rate. It consists of three parts: binary content construction, fast image retrieval and precise loop closure detection. No offline training is required. Our method is compared with state-of-the-art loop closure detection methods and the results show that it outperforms the traditional methods in both recall rate and speed. |
Tasks | Image Retrieval, Loop Closure Detection, Simultaneous Localization and Mapping |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10622v1 |
https://arxiv.org/pdf/2002.10622v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-loop-closure-detection-via-binary |
Repo | |
Framework | |
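A hedged sketch of the binary-content idea: an image is reduced to a small binary pattern (here a simple mean-thresholded downsample), candidates are retrieved by Hamming distance, and only the survivors would go on to precise geometric verification (not shown). The thresholding rule and grid size are illustrative choices, not the paper's construction.

```python
import numpy as np

def binary_content(gray, grid=16):
    """Downsample a grayscale image to grid x grid block means and threshold at the mean."""
    h, w = gray.shape
    cells = gray[:h - h % grid, :w - w % grid].reshape(grid, h // grid, grid, w // grid)
    means = cells.mean(axis=(1, 3))
    return (means > means.mean()).astype(np.uint8).ravel()   # 256-bit signature

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def retrieve(query_sig, database_sigs, top_k=3):
    d = [hamming(query_sig, s) for s in database_sigs]
    return np.argsort(d)[:top_k]                              # best loop-closure candidates

rng = np.random.default_rng(0)
db_imgs = [rng.uniform(0, 255, size=(480, 640)) for _ in range(10)]
db_sigs = [binary_content(im) for im in db_imgs]
query = db_imgs[7] + rng.normal(0, 5, size=(480, 640))        # "revisit" of frame 7
print(retrieve(binary_content(query), db_sigs))               # frame 7 should rank first
```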
Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization
Title | Statistical Outlier Identification in Multi-robot Visual SLAM using Expectation Maximization |
Authors | Arman Karimian, Ziqi Yang, Roberto Tron |
Abstract | This paper introduces a novel and distributed method for detecting inter-map loop closure outliers in simultaneous localization and mapping (SLAM). The proposed algorithm does not rely on a good initialization and can handle more than two maps at a time. In multi-robot SLAM applications, maps made by different agents have nonidentical spatial frames of reference, which makes initialization very difficult in the presence of outliers. This paper presents a probabilistic approach for detecting incorrect orientation measurements prior to pose graph optimization by checking the geometric consistency of rotation measurements. Expectation-Maximization is used to fine-tune the model parameters. As an ancillary contribution, a new approximate discrete inference procedure is presented which uses evidence on loops in a graph and is based on optimization (Alternating Direction Method of Multipliers). This method yields superior results compared to Belief Propagation and has convergence guarantees. Simulation and experimental results are presented that evaluate the performance of the outlier detection method and the inference algorithm on synthetic and real-world data. |
Tasks | Outlier Detection, Simultaneous Localization and Mapping |
Published | 2020-02-07 |
URL | https://arxiv.org/abs/2002.02638v1 |
https://arxiv.org/pdf/2002.02638v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-outlier-identification-in-multi |
Repo | |
Framework | |
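The geometric-consistency cue used to flag bad loop closures can be illustrated directly: the relative rotations around a cycle in the pose graph should compose to (near) identity, so a large composed-rotation angle implicates at least one outlier measurement on that cycle. The EM machinery that turns such cycle errors into per-measurement outlier probabilities is not reproduced here.

```python
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def cycle_error(rotations):
    """Angle (deg) of the composition of relative rotations around a closed cycle."""
    R = np.eye(3)
    for Ri in rotations:
        R = R @ Ri
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))

good_cycle = [rot_z(0.4), rot_z(-0.1), rot_z(-0.3)]            # composes to identity
bad_cycle = [rot_z(0.4), rot_z(-0.1), rot_z(1.2)]              # contains an outlier
print(cycle_error(good_cycle), cycle_error(bad_cycle))         # ~0 deg vs ~86 deg
```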
Object Detection on Single Monocular Images through Canonical Correlation Analysis
Title | Object Detection on Single Monocular Images through Canonical Correlation Analysis |
Authors | Zifan Yu, Suya You |
Abstract | Without using extra 3-D data such as point clouds or depth images to provide 3-D information, we retrieve 3-D object information from single monocular images. High-quality predicted depth images are recovered from single monocular images and fed into the 2-D object proposal network together with the corresponding monocular images. Most existing deep learning frameworks with two-stream input data fuse the separate streams by concatenation or addition, which assumes that every part of a feature map contributes equally to the whole task. However, when the data are noisy and much of the information is redundant, these methods no longer produce predictions or classifications efficiently. In this report, we propose a two-dimensional CCA (canonical correlation analysis) framework to fuse monocular images and corresponding predicted depth images for basic computer vision tasks like image classification and object detection. First, we implemented different structures with one-dimensional CCA and AlexNet to test performance on the image classification task. Then, we applied one of these structures with 2D-CCA to object detection. In these experiments, we found that our proposed framework performs better when taking predicted depth images as inputs, with the depth model trained on ground-truth depth. |
Tasks | Image Classification, Object Detection |
Published | 2020-02-13 |
URL | https://arxiv.org/abs/2002.05349v1 |
https://arxiv.org/pdf/2002.05349v1.pdf | |
PWC | https://paperswithcode.com/paper/object-detection-on-single-monocular-images |
Repo | |
Framework | |
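A small illustration of fusing two feature streams (image features and features from a predicted depth image) with canonical correlation analysis, in the spirit of the report. Random synthetic features stand in for the CNN activations, and scikit-learn's CCA stands in for the report's 1-D/2-D CCA variants.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_samples, d_rgb, d_depth, d_shared = 500, 64, 64, 8

# Synthetic two-stream features that share a low-dimensional latent factor.
latent = rng.normal(size=(n_samples, d_shared))
X_rgb = latent @ rng.normal(size=(d_shared, d_rgb)) + 0.5 * rng.normal(size=(n_samples, d_rgb))
X_depth = latent @ rng.normal(size=(d_shared, d_depth)) + 0.5 * rng.normal(size=(n_samples, d_depth))

cca = CCA(n_components=d_shared)
Z_rgb, Z_depth = cca.fit_transform(X_rgb, X_depth)

# The fused representation concatenates the correlated projections, which could
# then feed a downstream classifier or detector head.
fused = np.concatenate([Z_rgb, Z_depth], axis=1)
print(fused.shape)                      # (500, 16)
```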
Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data
Title | Predicting Legal Proceedings Status: an Approach Based on Sequential Text Data |
Authors | Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo |
Abstract | Machine learning applications in the legal field are numerous and diverse. In order to contribute to both the machine learning community and the legal community, we have made efforts to create a model suited to the classification of text sequences, valuing the interpretability of the results. The purpose of this paper is to classify legal proceedings into three possible status classes: (i) archived proceedings, (ii) active proceedings and (iii) suspended proceedings. Our approach combines natural language processing with supervised and unsupervised deep learning models, and it performed remarkably well on the classification task. Furthermore, we gained insights into the patterns learned by the neural network by applying tools that make the results more interpretable. |
Tasks | |
Published | 2020-03-13 |
URL | https://arxiv.org/abs/2003.11561v1 |
https://arxiv.org/pdf/2003.11561v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-legal-proceedings-status-an |
Repo | |
Framework | |
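A hedged sketch of a sequential-text status classifier in the spirit of the abstract: the text of each motion in a proceeding is embedded, the sequence of embeddings is pooled, and a classifier predicts one of the three status classes. TF-IDF with mean pooling and logistic regression stand in for the paper's deep models, and the toy "proceedings" below are invented for illustration only.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

STATUSES = ["archived", "active", "suspended"]

# Toy proceedings: each is a chronological list of short motion texts.
proceedings = [
    ["case filed", "hearing scheduled", "final ruling issued", "case archived"],
    ["case filed", "evidence submitted", "hearing scheduled"],
    ["case filed", "awaiting higher court decision", "proceedings suspended"],
    ["case filed", "final ruling issued", "case archived"],
]
labels = [0, 1, 2, 0]

vectorizer = TfidfVectorizer().fit([m for p in proceedings for m in p])

def embed(proceeding):
    """Mean-pool the TF-IDF vectors of a proceeding's motions into one feature vector."""
    return np.asarray(vectorizer.transform(proceeding).mean(axis=0)).ravel()

X = np.stack([embed(p) for p in proceedings])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(STATUSES[clf.predict([embed(["case filed", "case archived"])])[0]])
```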