October 16, 2019


Paper Group ANR 1050

Scene Understanding Networks for Autonomous Driving based on Around View Monitoring System. Scan2Mesh: From Unstructured Range Scans to 3D Meshes. Multi-scale metrics and self-organizing maps: a computational approach to the structure of sensory maps. Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications. Joint direct e …

Scene Understanding Networks for Autonomous Driving based on Around View Monitoring System


Title	Scene Understanding Networks for Autonomous Driving based on Around View Monitoring System
Authors	JeongYeol Baek, Ioana Veronica Chelu, Livia Iordache, Vlad Paunescu, HyunJoo Ryu, Alexandru Ghiuta, Andrei Petreanu, YunSung Soh, Andrei Leica, ByeongMoon Jeon
Abstract	Modern driver assistance systems rely on a wide range of sensors (RADAR, LIDAR, ultrasound and cameras) for scene understanding and prediction. These sensors are typically used for detecting traffic participants and scene elements required for navigation. In this paper we argue that relying on camera based systems, specifically Around View Monitoring (AVM) system has great potential to achieve these goals in both parking and driving modes with decreased costs. The contributions of this paper are as follows: we present a new end-to-end solution for delimiting the safe drivable area for each frame by means of identifying the closest obstacle in each direction from the driving vehicle, we use this approach to calculate the distance to the nearest obstacles and we incorporate it into a unified end-to-end architecture capable of joint object detection, curb detection and safe drivable area detection. Furthermore, we describe the family of networks for both a high accuracy solution and a low complexity solution. We also introduce further augmentation of the base architecture with 3D object detection.
Tasks	3D Object Detection, Autonomous Driving, Object Detection, Scene Understanding
Published	2018-05-18
URL	http://arxiv.org/abs/1805.07029v1
PDF	http://arxiv.org/pdf/1805.07029v1.pdf
PWC	https://paperswithcode.com/paper/scene-understanding-networks-for-autonomous
Repo
Framework

Scan2Mesh: From Unstructured Range Scans to 3D Meshes


Title	Scan2Mesh: From Unstructured Range Scans to 3D Meshes
Authors	Angela Dai, Matthias Nießner
Abstract	We introduce Scan2Mesh, a novel data-driven generative approach which transforms an unstructured and potentially incomplete range scan into a structured 3D mesh representation. The main contribution of this work is a generative neural network architecture whose input is a range scan of a 3D object and whose output is an indexed face set conditioned on the input scan. In order to generate a 3D mesh as a set of vertices and face indices, the generative model builds on a series of proxy losses for vertices, edges, and faces. At each stage, we realize a one-to-one discrete mapping between the predicted and ground truth data points with a combination of convolutional- and graph neural network architectures. This enables our algorithm to predict a compact mesh representation similar to those created through manual artist effort using 3D modeling software. Our generated mesh results thus produce sharper, cleaner meshes with a fundamentally different structure from those generated through implicit functions, a first step in bridging the gap towards artist-created CAD models.
Tasks
Published	2018-11-26
URL	http://arxiv.org/abs/1811.10464v2
PDF	http://arxiv.org/pdf/1811.10464v2.pdf
PWC	https://paperswithcode.com/paper/scan2mesh-from-unstructured-range-scans-to-3d
Repo
Framework

Multi-scale metrics and self-organizing maps: a computational approach to the structure of sensory maps


Title	Multi-scale metrics and self-organizing maps: a computational approach to the structure of sensory maps
Authors	William H. Wilson
Abstract	This paper introduces the concept of a bi-scale metric for use in the cooperative phase of the self-organizing map (SOM) algorithm. Use of a bi-scale metric allows segmentation of the map into a number of regions, corresponding to anticipated cluster structure in the data. Such a situation occurs, for example, in the somatotopic maps which inspired the SOM algorithm, where clusters of data may correspond to body surface regions whose general structure is known. When a bi-scale metric is appropriately applied, issues with map neurons that are not activated by any point in the training data are reduced or eliminated. The paper also presents results of simulation studies on the plasticity of bi-scale metric maps when they are retrained after loss of groups of map neurons or after changes in training data (such as would occur in a somatotopic map when a body surface region like a finger is lost/removed). The paper further considers situations where tri-scale metrics may be useful, and an alternative approach suggested by neurobiology, where some map regions adapt more slowly to stimuli because they have a lower learning rate parameter.
Tasks
Published	2018-05-09
URL	http://arxiv.org/abs/1805.03337v1
PDF	http://arxiv.org/pdf/1805.03337v1.pdf
PWC	https://paperswithcode.com/paper/multi-scale-metrics-and-self-organizing-maps
Repo
Framework

Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications


Title	Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications
Authors	M. Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko, Ebrahim M. Songhori, Thomas Schneider, Farinaz Koushanfar
Abstract	We present Chameleon, a novel hybrid (mixed-protocol) framework for secure function evaluation (SFE) which enables two parties to jointly compute a function without disclosing their private inputs. Chameleon combines the best aspects of generic SFE protocols with the ones that are based upon additive secret sharing. In particular, the framework performs linear operations in the ring $\mathbb{Z}_{2^l}$ using additively secret shared values and nonlinear operations using Yao’s Garbled Circuits or the Goldreich-Micali-Wigderson protocol. Chameleon departs from the common assumption of additive or linear secret sharing models where three or more parties need to communicate in the online phase: the framework allows two parties with private inputs to communicate in the online phase under the assumption of a third node generating correlated randomness in an offline phase. Almost all of the heavy cryptographic operations are precomputed in an offline phase which substantially reduces the communication overhead. Chameleon is both scalable and significantly more efficient than the ABY framework (NDSS’15) it is based on. Our framework supports signed fixed-point numbers. In particular, Chameleon’s vector dot product of signed fixed-point numbers improves the efficiency of mining and classification of encrypted data for algorithms based upon heavy matrix multiplications. Our evaluation of Chameleon on a 5 layer convolutional deep neural network shows 133x and 4.2x faster executions than Microsoft CryptoNets (ICML’16) and MiniONN (CCS’17), respectively.
Tasks
Published	2018-01-10
URL	http://arxiv.org/abs/1801.03239v1
PDF	http://arxiv.org/pdf/1801.03239v1.pdf
PWC	https://paperswithcode.com/paper/chameleon-a-hybrid-secure-computation
Repo
Framework
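The additive secret sharing over the ring $\mathbb{Z}_{2^l}$ mentioned in the Chameleon abstract can be illustrated with a minimal two-party sketch. This is not the authors' implementation: the bit length `L = 32` and the helper names (`share`, `reconstruct`, `add_shares`, `scale_shares`, `to_signed`) are assumptions made for illustration. It only shows why addition and multiplication by a public constant can be done locally on shares; multiplying two shared values typically requires correlated randomness, which is not shown here.

```python
import secrets

L = 32            # assumed bit length l of the ring Z_{2^l}, for illustration only
MOD = 1 << L      # all arithmetic is done modulo 2^l


def share(x):
    """Split x into two additive shares with x = (s0 + s1) mod 2^l."""
    s0 = secrets.randbelow(MOD)   # uniformly random share for party 0
    s1 = (x - s0) % MOD           # party 1's share completes the sum
    return s0, s1


def reconstruct(s0, s1):
    """Recombine both shares into the ring element they encode."""
    return (s0 + s1) % MOD


def add_shares(a, b):
    """Add two shared values; each party only adds its own shares."""
    return (a[0] + b[0]) % MOD, (a[1] + b[1]) % MOD


def scale_shares(a, c):
    """Multiply a shared value by a public constant c, again locally."""
    return (a[0] * c) % MOD, (a[1] * c) % MOD


def to_signed(v):
    """Interpret a ring element as a signed two's-complement integer."""
    return v - MOD if v >= MOD // 2 else v


if __name__ == "__main__":
    x = share(5)
    y = share(-3 % MOD)           # negative inputs are encoded modulo 2^l
    print(to_signed(reconstruct(*add_shares(x, y))))
    print(to_signed(reconstruct(*scale_shares(y, 7))))
```

Neither share alone reveals anything about the input, because each one is uniformly distributed in the ring; only their sum carries the value.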

Joint direct estimation of 3D geometry and 3D motion using spatio temporal gradients


Title	Joint direct estimation of 3D geometry and 3D motion using spatio temporal gradients
Authors	Francisco Barranco, Cornelia Fermüller, Yiannis Aloimonos, Eduardo Ros
Abstract	Conventional image motion based structure from motion methods first compute optical flow, then solve for the 3D motion parameters based on the epipolar constraint, and finally recover the 3D geometry of the scene. However, errors in optical flow due to regularization can lead to large errors in 3D motion and structure. This paper investigates whether performance and consistency can be improved by avoiding optical flow estimation in the early stages of the structure from motion pipeline, and it proposes a new direct method based on image gradients (normal flow) only. The main idea lies in a reformulation of the positive-depth constraint, which allows the use of well-known minimization techniques to solve for 3D motion. The 3D motion estimate is then refined and structure estimated adding a regularization based on depth. Experimental comparisons on standard synthetic datasets and the real-world driving benchmark dataset KITTI using three different optic flow algorithms show that the method achieves better accuracy in all but one case. Furthermore, it outperforms existing normal flow based 3D motion estimation techniques. Finally, the recovered 3D geometry is shown to be also very accurate.
Tasks	Motion Estimation, Optical Flow Estimation
Published	2018-05-17
URL	http://arxiv.org/abs/1805.06641v1
PDF	http://arxiv.org/pdf/1805.06641v1.pdf
PWC	https://paperswithcode.com/paper/joint-direct-estimation-of-3d-geometry-and-3d
Repo
Framework

A method to construct exponential families by representation theory


Title	A method to construct exponential families by representation theory
Authors	Koichi Tojo, Taro Yoshino
Abstract	In this paper, we give a method to construct “good” exponential families systematically by representation theory. More precisely, we consider a homogeneous space $G/H$ as a sample space and construct an exponential family invariant under the transformation group $G$ by using a representation of $G$. The method generates widely used exponential families such as normal, gamma, Bernoulli, categorical, Wishart, von Mises, Fisher-Bingham and hyperboloid distributions.
Tasks
Published	2018-11-04
URL	https://arxiv.org/abs/1811.01394v3
PDF	https://arxiv.org/pdf/1811.01394v3.pdf
PWC	https://paperswithcode.com/paper/a-method-to-construct-exponential-families-by
Repo
Framework

Competitive Training of Mixtures of Independent Deep Generative Models


Title	Competitive Training of Mixtures of Independent Deep Generative Models
Authors	Francesco Locatello, Damien Vincent, Ilya Tolstikhin, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf
Abstract	A common assumption in causal modeling posits that the data is generated by a set of independent mechanisms, and algorithms should aim to recover this structure. Standard unsupervised learning, however, is often concerned with training a single model to capture the overall distribution or aspects thereof. Inspired by clustering approaches, we consider mixtures of implicit generative models that “disentangle” the independent generative mechanisms underlying the data. Relying on an additional set of discriminators, we propose a competitive training procedure in which the models only need to capture the portion of the data distribution from which they can produce realistic samples. As a by-product, each model is simpler and faster to train. We empirically show that our approach splits the training distribution in a sensible way and increases the quality of the generated samples.
Tasks
Published	2018-04-30
URL	http://arxiv.org/abs/1804.11130v4
PDF	http://arxiv.org/pdf/1804.11130v4.pdf
PWC	https://paperswithcode.com/paper/competitive-training-of-mixtures-of
Repo
Framework

Propheticus: Generalizable Machine Learning Framework


Title	Propheticus: Generalizable Machine Learning Framework
Authors	João R. Campos, Marco Vieira, Ernesto Costa
Abstract	Due to recent technological developments, Machine Learning (ML), a subfield of Artificial Intelligence (AI), has been successfully used to process and extract knowledge from a variety of complex problems. However, a thorough ML approach is complex and highly dependent on the problem at hand. Additionally, implementing the logic required to execute the experiments is neither a small nor a trivial task, consequently increasing the probability of faulty code which can compromise the results. Propheticus is a data-driven framework which results from the need for a tool that abstracts some of the inherent complexity of ML, whilst being easy to understand and use, as well as to adapt and expand to assist the user’s specific needs. Propheticus systematizes and enforces various complex concepts of an ML experiment workflow, taking into account the nature of both the problem and the data. It contains functionalities to execute all the different tasks, from data preprocessing to results analysis and comparison. Nevertheless, it can be fairly easily adapted to different problems due to its flexible architecture, and customized as needed to address the user’s needs.
Tasks
Published	2018-09-06
URL	http://arxiv.org/abs/1809.01898v1
PDF	http://arxiv.org/pdf/1809.01898v1.pdf
PWC	https://paperswithcode.com/paper/propheticus-generalizable-machine-learning
Repo
Framework

A Convolutional Neural Network based Live Object Recognition System as Blind Aid


Title	A Convolutional Neural Network based Live Object Recognition System as Blind Aid
Authors	Kedar Potdar, Chinmay D. Pai, Sukrut Akolkar
Abstract	This paper introduces a live object recognition system that serves as a blind aid. Visually impaired people heavily rely on their other senses such as touch and auditory signals for understanding the environment around them. The act of knowing what object is in front of the blind person without touching it (by hand or some other tool) is very difficult. In some cases, the physical contact between the person and object can be dangerous, and even lethal. This project employs a Convolutional Neural Network for recognition of pre-trained objects on the ImageNet dataset. A camera, aligned with the system’s predetermined orientation serves as input to the computer system, which has the object recognition Neural Network deployed to carry out real-time object detection. Output from the network can then be parsed to present to the visually impaired person either in the form of audio or Braille text.
Tasks	Object Detection, Object Recognition, Real-Time Object Detection
Published	2018-11-26
URL	http://arxiv.org/abs/1811.10399v1
PDF	http://arxiv.org/pdf/1811.10399v1.pdf
PWC	https://paperswithcode.com/paper/a-convolutional-neural-network-based-live
Repo
Framework

Visual Mesh: Real-time Object Detection Using Constant Sample Density


Title	Visual Mesh: Real-time Object Detection Using Constant Sample Density
Authors	Trent Houliston, Stephan K. Chalup
Abstract	This paper proposes an enhancement of convolutional neural networks for object detection in resource-constrained robotics through a geometric input transformation called Visual Mesh. It uses object geometry to create a graph in vision space, reducing computational complexity by normalizing the pixel and feature density of objects. The experiments compare the Visual Mesh with several other fast convolutional neural networks. The results demonstrate execution times sixteen times quicker than the fastest competitor tested, while achieving outstanding accuracy.
Tasks	Object Detection, Real-Time Object Detection
Published	2018-07-23
URL	http://arxiv.org/abs/1807.08405v1
PDF	http://arxiv.org/pdf/1807.08405v1.pdf
PWC	https://paperswithcode.com/paper/visual-mesh-real-time-object-detection-using
Repo
Framework

ShuffleDet: Real-Time Vehicle Detection Network in On-board Embedded UAV Imagery


Title	ShuffleDet: Real-Time Vehicle Detection Network in On-board Embedded UAV Imagery
Authors	Seyed Majid Azimi
Abstract	On-board real-time vehicle detection is of great significance for UAVs and other embedded mobile platforms. We propose a computationally inexpensive detection network for vehicle detection in UAV imagery which we call ShuffleDet. In order to enhance the speed-wise performance, we construct our method primarily using channel shuffling and grouped convolutions. We apply inception modules and deformable modules to consider the size and geometric shape of the vehicles. ShuffleDet is evaluated on CARPK and PUCPR+ datasets and compared against the state-of-the-art real-time object detection networks. ShuffleDet achieves 3.8 GFLOPs while it provides competitive performance on test sets of both datasets. We show that our algorithm achieves real-time performance by running at the speed of 14 frames per second on NVIDIA Jetson TX2 showing high potential for this method for real-time processing in UAVs.
Tasks	Object Detection, Real-Time Object Detection
Published	2018-11-15
URL	http://arxiv.org/abs/1811.06318v1
PDF	http://arxiv.org/pdf/1811.06318v1.pdf
PWC	https://paperswithcode.com/paper/shuffledet-real-time-vehicle-detection
Repo
Framework
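The channel shuffling named in the ShuffleDet abstract is sketched below in its usual ShuffleNet-style form: the channels are viewed as a `groups x per_group` grid, transposed, and flattened again, so that the next grouped convolution sees channels from every group. Whether ShuffleDet uses exactly this variant is an assumption; the function name and the list-of-channels representation are chosen only for illustration.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (ShuffleNet-style shuffle).

    `channels` is any list with one entry per channel, for example channel
    indices or per-channel feature maps. The list is treated as a
    groups x per_group grid, transposed, and flattened.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Output position (i, g) takes the i-th channel of group g.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Two groups of three channels: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle(list(range(6)), groups=2))  # [0, 3, 1, 4, 2, 5]
```

Without such a shuffle, stacked grouped convolutions would keep each group's information isolated from the others.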

Automatic Detection of Arousals during Sleep using Multiple Physiological Signals


Title	Automatic Detection of Arousals during Sleep using Multiple Physiological Signals
Authors	Saman Parvaneh, Jonathan Rubin, Ali Samadani, Gajendra Katuwal
Abstract	The visual scoring of arousals during sleep routinely conducted by sleep experts is a challenging task warranting an automatic approach. This paper presents an algorithm for automatic detection of arousals during sleep. Using the Physionet/CinC Challenge dataset, an 80-20% subject-level split was performed to create in-house training and test sets, respectively. The data for each subject in the training set was split to 30-second epochs with no overlap. A total of 428 features from EEG, EMG, EOG, airflow, and SaO2 in each epoch were extracted and used for creating subject-specific models based on an ensemble of bagged classification trees, resulting in 943 models. For marking arousal and non-arousal regions in the test set, the data in the test set was split to 30-second epochs with 50% overlaps. The average of arousal probabilities from different patient-specific models was assigned to each 30-second epoch and then a sample-wise probability vector with the same length as test data was created for model evaluation. Using the PhysioNet/CinC Challenge 2018 scoring criteria, AUPRCs of 0.25 and 0.21 were achieved for the in-house test and blind test sets, respectively.
Tasks	EEG
Published	2018-10-05
URL	http://arxiv.org/abs/1810.02726v1
PDF	http://arxiv.org/pdf/1810.02726v1.pdf
PWC	https://paperswithcode.com/paper/automatic-detection-of-arousals-during-sleep
Repo
Framework
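The epoching described in the arousal-detection abstract (30-second epochs, without overlap for training and with 50% overlap for testing) can be sketched as follows. The sampling rate `fs`, the function names, and the rule used to merge overlapping epochs into a sample-wise vector (a plain average over all epochs covering a sample) are assumptions; the abstract does not specify them.

```python
def split_epochs(n_samples, fs, epoch_sec=30.0, overlap=0.0):
    """Return (start, end) sample index pairs for fixed-length epochs.

    overlap is the fraction shared by consecutive epochs (0.5 means 50%).
    Trailing samples that do not fill a whole epoch are dropped.
    """
    length = int(round(epoch_sec * fs))
    step = int(round(length * (1.0 - overlap)))
    if length <= 0 or step <= 0:
        raise ValueError("epoch length and step must be positive")
    return [(s, s + length) for s in range(0, n_samples - length + 1, step)]


def samplewise_probability(n_samples, epochs, epoch_probs):
    """Expand per-epoch probabilities to one value per sample.

    Assumption: a sample covered by several epochs gets their average;
    a sample covered by no epoch gets 0.0.
    """
    total = [0.0] * n_samples
    count = [0] * n_samples
    for (start, end), p in zip(epochs, epoch_probs):
        for i in range(start, end):
            total[i] += p
            count[i] += 1
    return [t / c if c else 0.0 for t, c in zip(total, count)]


if __name__ == "__main__":
    fs = 1  # hypothetical 1 Hz signal to keep the example small
    test_epochs = split_epochs(90, fs, overlap=0.5)
    print(test_epochs)
    probs = samplewise_probability(90, test_epochs, [0.0, 1.0, 0.0, 1.0, 0.0])
    print(probs[0], probs[20])
```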

URBAN-i: From urban scenes to mapping slums, transport modes, and pedestrians in cities using deep learning and computer vision


Title	URBAN-i: From urban scenes to mapping slums, transport modes, and pedestrians in cities using deep learning and computer vision
Authors	Mohamed R. Ibrahim, James Haworth, Tao Cheng
Abstract	Within the burgeoning expansion of deep learning and computer vision across the different fields of science, when it comes to urban development, deep learning and computer vision applications are still limited towards the notions of smart cities and autonomous vehicles. Indeed, a wide gap of knowledge appears when it comes to cities and urban regions in less developed countries where the chaos of informality is the dominant scheme. How can deep learning and Artificial Intelligence (AI) untangle the complexities of informality to advance urban modelling and our understanding of cities? Various questions and debates can be raised concerning the future of cities of the North and the South in the paradigm of AI and computer vision. In this paper, we introduce a new method for multipurpose realistic-dynamic urban modelling relying on deep learning and computer vision, using deep Convolutional Neural Networks (CNN), to sense and detect informality and slums in urban scenes from aerial and street view images in addition to detection of pedestrian and transport modes. The model has been trained on images of urban scenes in cities across the globe. The model shows a good validation of understanding a wide spectrum of nuances among the planned and the unplanned regions, including informal and slum areas. We attempt to advance urban modelling for better understanding the dynamics of city developments. We also aim to exemplify the significant impacts of AI in cities beyond how smart cities are discussed and perceived in the mainstream. The algorithms of the URBAN-i model are fully-coded in Python programming with the pre-trained deep learning models to be used as a tool for mapping and city modelling in the various corner of the globe, including informal settlements and slum regions.
Tasks	Autonomous Vehicles
Published	2018-09-10
URL	http://arxiv.org/abs/1809.03609v1
PDF	http://arxiv.org/pdf/1809.03609v1.pdf
PWC	https://paperswithcode.com/paper/urban-i-from-urban-scenes-to-mapping-slums
Repo
Framework

Learning to Compensate Photovoltaic Power Fluctuations from Images of the Sky by Imitating an Optimal Policy


Title	Learning to Compensate Photovoltaic Power Fluctuations from Images of the Sky by Imitating an Optimal Policy
Authors	Robin Spiess, Felix Berkenkamp, Jan Poland, Andreas Krause
Abstract	The energy output of photovoltaic (PV) power plants depends on the environment and thus fluctuates over time. As a result, PV power can cause instability in the power grid, in particular when increasingly used. Limiting the rate of change of the power output is a common way to mitigate these fluctuations, often with the help of large batteries. A reactive controller that uses these batteries to compensate ramps works in practice, but causes stress on the battery due to a high energy throughput. In this paper, we present a deep learning approach that uses images of the sky to compensate power fluctuations predictively and reduces battery stress. In particular, we show that the optimal control policy can be computed using information that is only available in hindsight. Based on this, we use imitation learning to train a neural network that approximates this hindsight-optimal policy, but uses only currently available sky images and sensor data. We evaluate our method on a large dataset of measurements and images from a real power plant and show that the trained policy reduces stress on the battery.
Tasks	Imitation Learning
Published	2018-11-13
URL	http://arxiv.org/abs/1811.05788v1
PDF	http://arxiv.org/pdf/1811.05788v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-compensate-photovoltaic-power
Repo
Framework
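The reactive ramp-rate limiting that the photovoltaic abstract uses as its baseline can be sketched as below. This is a simplified illustration, not the controller from the paper: it assumes a fixed time step, an ideal battery with no capacity or power limits, and hypothetical names (`reactive_ramp_limiter`, `max_ramp`, `energy_throughput`). The battery covers the difference between the raw PV power and a grid feed-in whose change per step is clamped, and the summed absolute battery power serves as a rough proxy for the throughput-related stress the abstract mentions.

```python
def reactive_ramp_limiter(pv_power, max_ramp):
    """Clamp the step-to-step change of the grid feed-in to +/- max_ramp.

    Returns (grid_power, battery_power). battery_power[t] > 0 means the
    battery discharges to fill a gap; < 0 means it absorbs a surplus.
    Assumes an ideal battery without capacity or power limits.
    """
    grid, battery = [], []
    prev = pv_power[0]
    for p in pv_power:
        lowest, highest = prev - max_ramp, prev + max_ramp
        out = min(max(p, lowest), highest)   # clamp to the allowed ramp
        grid.append(out)
        battery.append(out - p)              # battery makes up the difference
        prev = out
    return grid, battery


def energy_throughput(battery_power, dt=1.0):
    """Total energy moved through the battery, charging plus discharging."""
    return sum(abs(b) for b in battery_power) * dt


if __name__ == "__main__":
    pv = [10, 10, 4, 4, 4, 10]               # a cloud passes over the plant
    grid, batt = reactive_ramp_limiter(pv, max_ramp=2)
    print(grid)                              # [10, 10, 8, 6, 4, 6]
    print(batt)                              # [0, 0, 4, 2, 0, -4]
    print(energy_throughput(batt))           # 10.0
```

A predictive controller, as proposed in the abstract, would start ramping before the cloud arrives and could therefore move less energy through the battery than this purely reactive rule.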

Understanding Fashionability: What drives sales of a style?


Title	Understanding Fashionability: What drives sales of a style?
Authors	Aniket Jain, Yadunath Gupta, Pawan Kumar Singh, Aruna Rajan
Abstract	We use customer demand data for fashion articles on Myntra, and derive a fashionability or style quotient, which represents customer demand for the stylistic content of a fashion article, decoupled with its commercials (price, offers, etc.). We demonstrate learning for assortment planning in fashion that would aim to keep a healthy mix of breadth and depth across various styles, and we show the relationship between a customer’s perception of a style vs a merchandiser’s catalogue of styles. We also backtest our method to calculate prediction errors in our style quotient and customer demand, and discuss various implications and findings.
Tasks
Published	2018-06-28
URL	http://arxiv.org/abs/1806.11424v1
PDF	http://arxiv.org/pdf/1806.11424v1.pdf
PWC	https://paperswithcode.com/paper/understanding-fashionability-what-drives
Repo
Framework