Paper Group ANR 314
ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
Title | ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving |
Authors | Xibin Song, Peng Wang, Dingfu Zhou, Rui Zhu, Chenye Guan, Yuchao Dai, Hao Su, Hongdong Li, Ruigang Yang |
Abstract | Autonomous driving has attracted remarkable attention from both industry and academia. An important task is to estimate the 3D properties (e.g., translation, rotation and shape) of a moving or parked vehicle on the road. This task, while critical, is still under-researched in the computer vision community - partially owing to the lack of a large-scale, fully-annotated 3D car database suitable for autonomous driving research. In this paper, we contribute the first large-scale database suitable for 3D car instance understanding - ApolloCar3D. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20 times larger than PASCAL3D+ and KITTI, the current state of the art. To enable efficient labelling in 3D, we build a pipeline that considers 2D-3D keypoint correspondences for a single instance and 3D relationships among multiple instances. Equipped with such a dataset, we build various baseline algorithms with state-of-the-art deep convolutional neural networks. Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress towards its 3D pose and shape based on a deformable 3D car model with or without using semantic keypoints. We show that using keypoints significantly improves fitting performance. Finally, we develop a new 3D metric jointly considering 3D pose and 3D shape, allowing for comprehensive evaluation and ablation study. By comparing with human performance we suggest several future directions for further improvements. |
Tasks | 3D Car Instance Understanding, Autonomous Driving |
Published | 2018-11-29 |
URL | http://arxiv.org/abs/1811.12222v2 |
http://arxiv.org/pdf/1811.12222v2.pdf | |
PWC | https://paperswithcode.com/paper/apollocar3d-a-large-3d-car-instance |
Repo | |
Framework | |
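For readers unfamiliar with the keypoint-based baseline described above, it essentially reduces to solving a Perspective-n-Point (PnP) problem from 2D-3D keypoint correspondences. Below is a minimal, hedged sketch of that step using OpenCV; the CAD keypoints, camera intrinsics and ground-truth pose are invented placeholders, not values from the paper or the dataset.

```python
# Hedged sketch: recover a car's 6-DoF pose from 2D-3D keypoint correspondences
# with a PnP solver. The 2D detections are generated synthetically here so the
# example is self-checking; in the real pipeline they come from a keypoint CNN.
import numpy as np
import cv2

# Semantic keypoints on a canonical car CAD model, in metres (placeholders).
model_points_3d = np.array([
    [ 1.20,  0.80, 0.35],   # front-left wheel centre
    [ 1.20, -0.80, 0.35],   # front-right wheel centre
    [-1.30,  0.80, 0.35],   # rear-left wheel centre
    [-1.30, -0.80, 0.35],   # rear-right wheel centre
    [ 1.90,  0.00, 0.70],   # front logo
    [-1.95,  0.00, 0.95],   # rear licence plate
], dtype=np.float64)

K = np.array([[2300.0, 0.0, 960.0],      # placeholder pinhole intrinsics
              [0.0, 2300.0, 540.0],
              [0.0, 0.0, 1.0]])
dist = np.zeros(5)                        # assume no lens distortion

# Pretend pose of the car in the camera frame (what we want to recover).
rvec_true = np.array([0.05, 0.80, 0.02])
tvec_true = np.array([2.0, 1.5, 25.0])
image_points_2d, _ = cv2.projectPoints(model_points_3d, rvec_true, tvec_true, K, dist)

# Solve PnP (EPnP) to get rotation and absolute translation back.
ok, rvec, tvec = cv2.solvePnP(model_points_3d, image_points_2d, K, dist,
                              flags=cv2.SOLVEPNP_EPNP)
print("recovered translation (m):", tvec.ravel())   # approximately [2.0, 1.5, 25.0]
```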
Cloud No Longer a Silver Bullet, Edge to the Rescue
Title | Cloud No Longer a Silver Bullet, Edge to the Rescue |
Authors | Yuhao Zhu, Gu-Yeon Wei, David Brooks |
Abstract | This paper takes the position that, while cognitive computing today relies heavily on the cloud, we will soon see a paradigm shift where cognitive computing primarily happens on network edges. The shift toward edge devices is fundamentally propelled both by technological constraints in data centers and wireless network infrastructures and by practical considerations such as privacy and safety. The remainder of this paper lays out our view of how these constraints will impact future cognitive computing. Bringing cognitive computing to edge devices opens up several new opportunities and challenges, some of which demand new solutions and some of which require us to revisit entrenched techniques in light of new technologies. We close the paper with a call to action for future research. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05943v1 |
http://arxiv.org/pdf/1802.05943v1.pdf | |
PWC | https://paperswithcode.com/paper/cloud-no-longer-a-silver-bullet-edge-to-the |
Repo | |
Framework | |
Deep Continuous Conditional Random Fields with Asymmetric Inter-object Constraints for Online Multi-object Tracking
Title | Deep Continuous Conditional Random Fields with Asymmetric Inter-object Constraints for Online Multi-object Tracking |
Authors | Hui Zhou, Wanli Ouyang, Jian Cheng, Xiaogang Wang, Hongsheng Li |
Abstract | Online Multi-Object Tracking (MOT) is a challenging problem and has many important applications including intelligent surveillance, robot navigation and autonomous driving. In existing MOT methods, individual objects' movements and inter-object relations are mostly modeled separately, and the relations between them are still manually tuned. In addition, inter-object relations are mostly modeled in a symmetric way, which we argue is not an optimal setting. To tackle those difficulties, in this paper, we propose a Deep Continuous Conditional Random Field (DCCRF) for solving the online MOT problem in a track-by-detection framework. The DCCRF consists of unary and pairwise terms. The unary terms estimate tracked objects' displacements across time based on visual appearance information. They are modeled as deep Convolutional Neural Networks, which are able to learn discriminative visual features for tracklet association. The asymmetric pairwise terms model inter-object relations in an asymmetric way, which encourages high-confidence tracklets to help correct the errors of low-confidence tracklets without being much affected by them. The DCCRF is trained in an end-to-end manner to better adapt the influences of visual information as well as inter-object relations. Extensive experimental comparisons with state-of-the-art methods, as well as detailed component analysis of our proposed DCCRF on two public benchmarks, demonstrate the effectiveness of our proposed MOT framework. |
Tasks | Autonomous Driving, Multi-Object Tracking, Object Tracking, Online Multi-Object Tracking, Robot Navigation |
Published | 2018-06-04 |
URL | http://arxiv.org/abs/1806.01183v1 |
http://arxiv.org/pdf/1806.01183v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-continuous-conditional-random-fields |
Repo | |
Framework | |
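To make the asymmetric pairwise idea concrete, here is a toy sketch (not the paper's implementation): a continuous CRF over per-tracklet displacements whose pairwise weights are nonzero only from higher- to lower-confidence tracklets, minimized by plain gradient descent. The unary predictions and confidences are placeholders standing in for CNN outputs.

```python
# Hedged toy sketch of an asymmetric, confidence-weighted continuous CRF.
import numpy as np

unary = np.array([[2.0, 0.1], [1.8, 0.0], [5.0, -3.0]])   # CNN-predicted displacements
conf = np.array([0.95, 0.90, 0.30])                        # tracklet confidences
lam, lr = 1.0, 0.05

# Asymmetric pairwise weights: W[i, j] > 0 only when tracklet i is more
# confident than tracklet j, so corrections flow from i to j, not back.
W = np.maximum(conf[:, None] - conf[None, :], 0.0)

x = unary.copy()
for _ in range(200):
    grad = 2.0 * (x - unary)                                # unary term
    # Directed pairwise term: tracklet j is pulled toward the displacement of
    # every more confident tracklet i, weighted by W[i, j].
    pull = (W[:, :, None] * (x[None, :, :] - x[:, None, :])).sum(axis=0)
    grad += 2.0 * lam * pull
    x -= lr * grad

# The low-confidence outlier (last row) is drawn noticeably toward the
# high-confidence displacements, while the confident tracklets barely move.
print(x)
```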
Higher-order Spectral Clustering for Heterogeneous Graphs
Title | Higher-order Spectral Clustering for Heterogeneous Graphs |
Authors | Aldo G. Carranza, Ryan A. Rossi, Anup Rao, Eunyee Koh |
Abstract | Higher-order connectivity patterns such as small induced sub-graphs called graphlets (network motifs) are vital to understand the important components (modules/functional units) governing the configuration and behavior of complex networks. Existing work in higher-order clustering has focused on simple homogeneous graphs with a single node/edge type. However, heterogeneous graphs consisting of nodes and edges of different types are seemingly ubiquitous in the real world. In this work, we introduce the notion of typed graphlets, which explicitly capture the rich (typed) connectivity patterns in heterogeneous networks. Using typed graphlets as a basis, we develop a general principled framework for higher-order clustering in heterogeneous networks. The framework provides mathematical guarantees on the optimality of the higher-order clustering obtained. The experiments demonstrate the effectiveness of the framework quantitatively for three important applications: (i) clustering, (ii) link prediction, and (iii) graph compression. In particular, the approach achieves a mean improvement of 43x over all methods and graphs for clustering, while achieving 18.7% and 20.8% improvements for link prediction and graph compression, respectively. |
Tasks | Link Prediction |
Published | 2018-10-06 |
URL | http://arxiv.org/abs/1810.02959v1 |
http://arxiv.org/pdf/1810.02959v1.pdf | |
PWC | https://paperswithcode.com/paper/higher-order-spectral-clustering-for |
Repo | |
Framework | |
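The higher-order recipe the framework builds on can be illustrated with the simplest untyped case: replace the edge adjacency with a motif (triangle) adjacency and run spectral clustering on it. A hedged toy sketch follows; for typed graphlets one would additionally count only triangles whose node types match a chosen pattern. The graph below is a made-up example.

```python
# Hedged sketch of triangle-motif spectral clustering on a toy graph.
import numpy as np

# Two dense 4-cliques joined by a weak bridging triangle (2, 3, 4).
A = np.zeros((8, 8))
for group in [(0, 1, 2, 3), (4, 5, 6, 7)]:
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[2, 4] = A[4, 2] = 1.0
A[3, 4] = A[4, 3] = 1.0

# Motif adjacency: W[i, j] = number of triangles containing edge (i, j).
# For typed graphlets, only triangles whose node types match a target
# pattern would be counted here.
W = (A @ A) * A

# Normalized Laplacian of the motif-weighted graph, then the Fiedler vector.
d = W.sum(axis=1)
d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
vals, vecs = np.linalg.eigh(L)
clusters = (vecs[:, 1] > 0).astype(int)
print("higher-order clusters:", clusters)   # expected split: nodes 0-3 vs nodes 4-7
```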
Amanuensis: The Programmer’s Apprentice
Title | Amanuensis: The Programmer’s Apprentice |
Authors | Thomas Dean, Maurice Chiang, Marcus Gomez, Nate Gruver, Yousef Hindy, Michelle Lam, Peter Lu, Sophia Sanchez, Rohun Saxena, Michael Smith, Lucy Wang, Catherine Wong |
Abstract | This document provides an overview of the material covered in a course taught at Stanford in the spring quarter of 2018. The course draws upon insight from cognitive and systems neuroscience to implement hybrid connectionist and symbolic reasoning systems that leverage and extend the state of the art in machine learning by integrating human and machine intelligence. As a concrete example we focus on digital assistants that learn from continuous dialog with an expert software engineer while providing initial value as powerful analytical, computational and mathematical savants. Over time these savants learn cognitive strategies (domain-relevant problem solving skills) and develop intuitions (heuristics and the experience necessary for applying them) by learning from their expert associates. By doing so, these savants elevate their innate analytical skills, allowing them to partner on an equal footing as versatile collaborators - effectively serving as cognitive extensions and digital prostheses, thereby amplifying and emulating their human partner's conceptually-flexible thinking patterns and enabling improved access to and control over powerful computing resources. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1807.00082v2 |
http://arxiv.org/pdf/1807.00082v2.pdf | |
PWC | https://paperswithcode.com/paper/amanuensis-the-programmers-apprentice |
Repo | |
Framework | |
One Bit Matters: Understanding Adversarial Examples as the Abuse of Redundancy
Title | One Bit Matters: Understanding Adversarial Examples as the Abuse of Redundancy |
Authors | Jingkang Wang, Ruoxi Jia, Gerald Friedland, Bo Li, Costas Spanos |
Abstract | Despite the great success achieved in machine learning (ML), adversarial examples have caused concerns with regard to its trustworthiness: a small perturbation of an input results in an arbitrary failure of an otherwise seemingly well-trained ML model. While studies are being conducted to discover the intrinsic properties of adversarial examples, such as their transferability and universality, there is insufficient theoretical analysis to help understand the phenomenon in a way that can influence the design process of ML experiments. In this paper, we deduce an information-theoretic model which explains adversarial attacks as the abuse of feature redundancies in ML algorithms. We prove that feature redundancy is a necessary condition for the existence of adversarial examples. Our model helps to explain some major questions raised in many anecdotal studies on adversarial examples. Our theory is backed up by empirical measurements of the information content of benign and adversarial examples on both image and text datasets. Our measurements show that typical adversarial examples introduce just enough redundancy to overflow the decision making of an ML model trained on corresponding benign examples. We conclude with actionable recommendations to improve the robustness of machine learners against adversarial examples. |
Tasks | Decision Making |
Published | 2018-10-23 |
URL | http://arxiv.org/abs/1810.09650v1 |
http://arxiv.org/pdf/1810.09650v1.pdf | |
PWC | https://paperswithcode.com/paper/one-bit-matters-understanding-adversarial |
Repo | |
Framework | |
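As a very rough illustration of the kind of measurement described above, the sketch below uses compressed size as a crude proxy for information content and compares a clean toy image with a perturbed copy. This is only an assumption-laden stand-in for the paper's entropy measurements; the "image" and perturbation are made up.

```python
# Hedged sketch: compressed size as a crude proxy for information content.
import zlib
import numpy as np

rng = np.random.default_rng(0)
benign = np.tile(np.arange(32, dtype=np.uint8) * 8, (32, 1))    # smooth toy "image"
perturbation = rng.integers(-8, 9, size=benign.shape)            # small bounded noise
adversarial = np.clip(benign.astype(int) + perturbation, 0, 255).astype(np.uint8)

def compressed_bits(img: np.ndarray) -> int:
    """Compressed size in bits, used as a rough upper bound on information content."""
    return 8 * len(zlib.compress(img.tobytes(), level=9))

print("benign   :", compressed_bits(benign), "bits")
print("perturbed:", compressed_bits(adversarial), "bits")
# The perturbed copy needs noticeably more bits: the perturbation adds
# label-irrelevant (redundant) information that a classifier can be misled by.
```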
Tracking State Changes in Procedural Text: A Challenge Dataset and Models for Process Paragraph Comprehension
Title | Tracking State Changes in Procedural Text: A Challenge Dataset and Models for Process Paragraph Comprehension |
Authors | Bhavana Dalvi Mishra, Lifu Huang, Niket Tandon, Wen-tau Yih, Peter Clark |
Abstract | We present a new dataset and models for comprehending paragraphs about processes (e.g., photosynthesis), an important genre of text describing a dynamic world. The new dataset, ProPara, is the first to contain natural (rather than machine-generated) text about a changing world along with a full annotation of entity states (location and existence) during those changes (81k datapoints). The end-task, tracking the location and existence of entities through the text, is challenging because the causal effects of actions are often implicit and need to be inferred. We find that previous models that have worked well on synthetic data achieve only mediocre performance on ProPara, and introduce two new neural models that exploit alternative mechanisms for state prediction, in particular using LSTM input encoding and span prediction. The new models improve accuracy by up to 19%. The dataset and models are available to the community at http://data.allenai.org/propara. |
Tasks | |
Published | 2018-05-17 |
URL | http://arxiv.org/abs/1805.06975v1 |
http://arxiv.org/pdf/1805.06975v1.pdf | |
PWC | https://paperswithcode.com/paper/tracking-state-changes-in-procedural-text-a |
Repo | |
Framework | |
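For orientation, the end-task amounts to filling in a grid of (step, entity) states recording each entity's existence and location. The sketch below shows one possible minimal data structure for such a grid; the paragraph, entities and states are invented placeholders in the spirit of the ProPara annotation, not actual dataset content.

```python
# Hedged sketch of a (step, entity) -> location/existence state grid.
from dataclasses import dataclass

NONE = "-"        # entity does not exist at this step
UNKNOWN = "?"     # entity exists, location unknown

@dataclass
class StateGrid:
    entities: list    # e.g. ["water", "vapor"]
    steps: list       # one list of location strings per step, aligned with entities

    def location(self, step: int, entity: str) -> str:
        return self.steps[step][self.entities.index(entity)]

grid = StateGrid(
    entities=["water", "vapor"],
    steps=[
        ["leaf", NONE],    # before step 1: water is in the leaf, no vapor yet
        [NONE,   "air"],   # after "water evaporates": water gone, vapor in the air
    ],
)
print(grid.location(1, "vapor"))   # -> "air"
```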
Decentralized Exploration in Multi-Armed Bandits
Title | Decentralized Exploration in Multi-Armed Bandits |
Authors | Raphaël Féraud, Réda Alami, Romain Laroche |
Abstract | We consider the decentralized exploration problem: a set of players collaborate to identify the best arm by asynchronously interacting with the same stochastic environment. The objective is to ensure privacy in the best arm identification problem between asynchronous, collaborative, and thrifty players. In the context of a digital service, we advocate that this decentralized approach allows a good balance between the interests of users and those of service providers: the providers optimize their services while protecting the privacy of the users and saving resources. We define the privacy level as the amount of information an adversary could infer by intercepting the messages concerning a single user. We provide a generic algorithm, Decentralized Elimination, which uses any best arm identification algorithm as a subroutine. We prove that this algorithm ensures privacy with a low communication cost, and that, in comparison to the lower bound of the best arm identification problem, its sample complexity suffers from a penalty depending on the inverse of the probability of the most frequent players. Then, thanks to the genericity of the approach, we extend the proposed algorithm to non-stationary bandits. Finally, experiments illustrate and complete the analysis. |
Tasks | Multi-Armed Bandits |
Published | 2018-11-19 |
URL | https://arxiv.org/abs/1811.07763v4 |
https://arxiv.org/pdf/1811.07763v4.pdf | |
PWC | https://paperswithcode.com/paper/decentralized-exploration-in-multi-armed |
Repo | |
Framework | |
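Decentralized Elimination wraps a best arm identification routine; a standard choice for that subroutine is successive elimination with Hoeffding-style confidence bounds, sketched below for a single player (the decentralization and privacy layer is omitted). The arm means and confidence-radius constant are illustrative choices, not the paper's.

```python
# Hedged sketch: successive elimination for best arm identification.
import numpy as np

rng = np.random.default_rng(1)
true_means = np.array([0.3, 0.5, 0.45, 0.7])      # unknown Bernoulli arm means
delta = 0.05                                       # target failure probability

active = list(range(len(true_means)))
counts = np.zeros(len(true_means))
sums = np.zeros(len(true_means))
t = 0
while len(active) > 1:
    t += 1
    for a in active:                               # pull every surviving arm once
        sums[a] += rng.random() < true_means[a]
        counts[a] += 1
    means = sums[active] / counts[active]
    # Hoeffding-style confidence radius shared by all active arms.
    radius = np.sqrt(np.log(4 * len(true_means) * t**2 / delta) / (2 * t))
    best = means.max()
    # Drop arms whose upper confidence bound falls below the best lower bound.
    active = [a for a, m in zip(active, means) if m + radius >= best - radius]

print("identified best arm:", active[0])           # -> 3 with high probability
```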
Graph Cut Segmentation Methods Revisited with a Quantum Algorithm
Title | Graph Cut Segmentation Methods Revisited with a Quantum Algorithm |
Authors | Lisa Tse, Peter Mountney, Paul Klein, Simone Severini |
Abstract | The design and performance of computer vision algorithms are greatly influenced by the hardware on which they are implemented. CPUs, multi-core CPUs, FPGAs and GPUs have inspired new algorithms and enabled existing ideas to be realized. This is notably the case with GPUs, which have significantly changed the landscape of computer vision research through deep learning. As the end of Moore's law approaches, researchers and hardware manufacturers are exploring alternative hardware computing paradigms. Quantum computers are a very promising alternative and offer polynomial or even exponential speed-ups over conventional computing for some problems. This paper presents a novel approach to image segmentation that uses new quantum computing hardware. Segmentation is formulated as a graph cut problem that can be mapped to the quantum approximate optimization algorithm (QAOA). This algorithm can be implemented on current and near-term quantum computers. Encouraging results are presented on artificial and medical imaging data. This represents an important, practical step towards leveraging quantum computers for computer vision. |
Tasks | Semantic Segmentation |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.03050v2 |
http://arxiv.org/pdf/1812.03050v2.pdf | |
PWC | https://paperswithcode.com/paper/graph-cut-segmentation-methods-revisited-with |
Repo | |
Framework | |
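The mapping step can be illustrated classically: segmentation labels become Ising spins, pixel-similarity edges become cut penalties, and intensity priors become unary fields. The hedged sketch below builds such an objective for a four-pixel "image" and minimizes it by brute force; on quantum hardware the same cost function would instead be handed to a QAOA circuit. All weights are toy values.

```python
# Hedged sketch: a tiny segmentation-as-graph-cut objective in Ising form.
import itertools
import numpy as np

pixels = np.array([0.1, 0.2, 0.8, 0.9])            # intensities of a 1x4 "image"
edges = [(0, 1), (1, 2), (2, 3)]                    # neighbour graph

# Edge weight = similarity; cutting a high-similarity edge is expensive.
w = {e: np.exp(-(pixels[e[0]] - pixels[e[1]])**2 / 0.1) for e in edges}
# Unary bias: bright pixels prefer foreground (+1), dark pixels background (-1).
h = 2.0 * (pixels - 0.5)

def cost(spins):
    # Pay w_ij whenever the endpoints of an edge disagree, minus the unary reward.
    cut = sum(w[(i, j)] * (1 - spins[i] * spins[j]) / 2 for i, j in edges)
    unary = -sum(h[i] * s for i, s in enumerate(spins))
    return cut + unary

best = min(itertools.product([-1, 1], repeat=len(pixels)), key=cost)
print("labels (-1 = background, +1 = foreground):", best)   # (-1, -1, 1, 1)
```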
A Subpixel Registration Algorithm for Low PSNR Images
Title | A Subpixel Registration Algorithm for Low PSNR Images |
Authors | Song Feng, Linhua Deng, Guofeng Shu, Feng Wang, Hui Deng, Kaifan Ji |
Abstract | This paper presents a fast algorithm for obtaining high-accuracy subpixel translation of low PSNR images. Instead of locating the maximum point on the upsampled images or fitting the peak of the correlation surface, the proposed algorithm is based on measuring the centroid of the cross-correlation surface with the Modified Moment method. Synthetic images, real solar images and standard test images with added white Gaussian noise were tested, and the results show that the accuracy of our algorithm is comparable with that of other subpixel registration techniques while its processing speed is higher. The algorithm's drawbacks are also discussed at the end of this paper. |
Tasks | |
Published | 2018-03-31 |
URL | http://arxiv.org/abs/1804.00174v1 |
http://arxiv.org/pdf/1804.00174v1.pdf | |
PWC | https://paperswithcode.com/paper/a-subpixel-registration-algorithm-for-low |
Repo | |
Framework | |
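The core idea is easy to prototype: cross-correlate the two frames via the FFT and take an intensity-weighted centroid in a small window around the correlation peak. The sketch below is a simplified stand-in for the paper's Modified Moment estimator (no thresholding, no noise handling), tested on synthetic Gaussian blobs with a known subpixel shift.

```python
# Hedged sketch: subpixel shift from the centroid of the cross-correlation peak.
import numpy as np

def subpixel_shift(ref, img, win=3):
    # Circular cross-correlation via the FFT (peak at the shift of img w.r.t. ref).
    corr = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(img)).real
    corr = np.fft.fftshift(corr)
    py, px = np.unravel_index(np.argmax(corr), corr.shape)
    # Intensity-weighted centroid in a (2*win+1)^2 window around the integer peak
    # (assumes the peak is not at the image border).
    ys, xs = np.mgrid[py - win:py + win + 1, px - win:px + win + 1]
    patch = corr[ys, xs] - corr[ys, xs].min()        # keep weights non-negative
    cy = (ys * patch).sum() / patch.sum()
    cx = (xs * patch).sum() / patch.sum()
    centre = np.array(ref.shape) // 2
    return cy - centre[0], cx - centre[1]

# Synthetic test: a Gaussian blob shifted by a known subpixel amount.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
ref = np.exp(-((yy - 32) ** 2 + (xx - 32) ** 2) / 20.0)
img = np.exp(-((yy - 32 - 1.3) ** 2 + (xx - 32 + 0.6) ** 2) / 20.0)
print("estimated shift (dy, dx):", subpixel_shift(ref, img))   # true shift (1.3, -0.6)
```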
Towards Machine Learning Prediction of Deep Brain Stimulation (DBS) Intra-operative Efficacy Maps
Title | Towards Machine Learning Prediction of Deep Brain Stimulation (DBS) Intra-operative Efficacy Maps |
Authors | Camilo Bermudez, William Rodriguez, Yuankai Huo, Allison E. Hainline, Rui Li, Robert Shults, Pierre D. DHaese, Peter E. Konrad, Benoit M. Dawant, Bennett A. Landman |
Abstract | Deep brain stimulation (DBS) has the potential to improve the quality of life of people with a variety of neurological diseases. A key challenge in DBS is the placement of a stimulation electrode in the anatomical location that maximizes efficacy and minimizes side effects. Pre-operative localization of the optimal stimulation zone can reduce surgical times and morbidity. Current methods of producing efficacy probability maps follow anatomical guidance on magnetic resonance imaging (MRI) to identify the areas with the highest efficacy in a population. In this work, we propose to revisit this problem as a classification problem, where each voxel in the MRI is a sample informed by the surrounding anatomy. We use a patch-based convolutional neural network to classify a stimulation coordinate as having a positive reduction in symptoms during surgery. We use a cohort of 187 patients with a total of 2,869 stimulation coordinates, from which 3D patches were extracted and associated with an efficacy score. We compare our results with a registration-based method of surgical planning. We show an improvement in the classification of intraoperative stimulation coordinates as a positive response in reduction of symptoms, with an AUC of 0.670 compared to a baseline registration-based approach, which achieves an AUC of 0.627 (p < 0.01). Although additional validation is needed, the proposed classification framework and deep learning method appear well-suited for improving pre-surgical planning and personalizing treatment strategies. |
Tasks | |
Published | 2018-11-26 |
URL | http://arxiv.org/abs/1811.10415v1 |
http://arxiv.org/pdf/1811.10415v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-machine-learning-prediction-of-deep |
Repo | |
Framework | |
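A minimal sketch of a patch-based 3D CNN classifier of the kind described above follows; it is not the authors' architecture, and the patch size, layer widths and input are placeholders.

```python
# Hedged sketch: a small 3D patch classifier mapping an MRI patch around a
# candidate stimulation coordinate to the probability of a positive response.
import torch
import torch.nn as nn

class PatchEfficacyNet(nn.Module):
    def __init__(self, patch=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (patch // 4) ** 3, 64), nn.ReLU(),
            nn.Linear(64, 1),                      # logit for "positive response"
        )

    def forward(self, x):                          # x: (batch, 1, D, H, W)
        return self.classifier(self.features(x))

model = PatchEfficacyNet()
patches = torch.randn(4, 1, 32, 32, 32)            # placeholder MRI patches
prob = torch.sigmoid(model(patches))               # per-coordinate efficacy score
print(prob.shape)                                   # torch.Size([4, 1])
```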
MS-UEdin Submission to the WMT2018 APE Shared Task: Dual-Source Transformer for Automatic Post-Editing
Title | MS-UEdin Submission to the WMT2018 APE Shared Task: Dual-Source Transformer for Automatic Post-Editing |
Authors | Marcin Junczys-Dowmunt, Roman Grundkiewicz |
Abstract | This paper describes the Microsoft and University of Edinburgh submission to the Automatic Post-editing shared task at WMT2018. Based on training data and systems from the WMT2017 shared task, we re-implement our own models from the last shared task and introduce improvements based on extensive parameter sharing. Next we experiment with our implementation of dual-source transformer models and with data selection for the IT domain. Our submission decisively wins the SMT post-editing sub-task, establishing a new state of the art, and is a very close second (or equal, 16.46 vs. 16.50 TER) in the NMT sub-task. Based on the rather weak results in the NMT sub-task, we hypothesize that neural-on-neural APE might not actually be useful. |
Tasks | Automatic Post-Editing |
Published | 2018-09-01 |
URL | http://arxiv.org/abs/1809.00188v1 |
http://arxiv.org/pdf/1809.00188v1.pdf | |
PWC | https://paperswithcode.com/paper/ms-uedin-submission-to-the-wmt2018-ape-shared |
Repo | |
Framework | |
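One simple way to realize a dual-source setup is to encode the source sentence and the raw MT output separately and let the decoder attend over the concatenated memories, as in the hedged sketch below. The actual submission uses a purpose-built dual-source transformer with extensive parameter sharing; sizes and vocabularies here are placeholders, and positional encodings are omitted for brevity.

```python
# Hedged sketch: a bare-bones dual-source model for automatic post-editing.
import torch
import torch.nn as nn

class DualSourceAPE(nn.Module):
    def __init__(self, vocab=8000, d_model=256, nhead=4, layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)          # shared embeddings

        def encoder():
            layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            return nn.TransformerEncoder(layer, layers)

        self.src_encoder = encoder()                        # encodes the source sentence
        self.mt_encoder = encoder()                         # encodes the raw MT output
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, layers)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, src_ids, mt_ids, pe_ids):
        # Decoder attends over the concatenation of both encoded sources.
        memory = torch.cat([self.src_encoder(self.embed(src_ids)),
                            self.mt_encoder(self.embed(mt_ids))], dim=1)
        n = pe_ids.size(1)
        causal_mask = torch.full((n, n), float("-inf")).triu(diagonal=1)
        hidden = self.decoder(self.embed(pe_ids), memory, tgt_mask=causal_mask)
        return self.out(hidden)

model = DualSourceAPE()
src, mt, pe = (torch.randint(0, 8000, (2, 12)) for _ in range(3))
print(model(src, mt, pe).shape)                             # (2, 12, 8000)
```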
Learning stable and predictive structures in kinetic systems: Benefits of a causal approach
Title | Learning stable and predictive structures in kinetic systems: Benefits of a causal approach |
Authors | Niklas Pfister, Stefan Bauer, Jonas Peters |
Abstract | Learning kinetic systems from data is one of the core challenges in many fields. Identifying stable models is essential for the generalization capabilities of data-driven inference. We introduce a computationally efficient framework, called CausalKinetiX, that identifies structure from discrete-time, noisy observations generated from heterogeneous experiments. The algorithm assumes the existence of an underlying, invariant kinetic model, a key criterion for reproducible research. Results on both simulated and real-world examples suggest that learning the structure of kinetic systems benefits from a causal perspective. The identified variables and models allow for a concise description of the dynamics across multiple experimental settings and can be used for prediction in unseen experiments. We observe significant improvements compared to well-established approaches focusing solely on predictive performance, especially for out-of-sample generalization. |
Tasks | Causal Inference, Model Selection |
Published | 2018-10-28 |
URL | https://arxiv.org/abs/1810.11776v2 |
https://arxiv.org/pdf/1810.11776v2.pdf | |
PWC | https://paperswithcode.com/paper/identifying-causal-structure-in-large-scale |
Repo | |
Framework | |
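The flavour of invariance-based selection can be conveyed with a much-simplified sketch: rank candidate predictor sets by how much their fitted regression coefficients vary across experimental environments, preferring the most stable set. This is only a loose stand-in for CausalKinetiX, which works with kinetic (ODE) models and smoothed trajectories; the data below are synthetic placeholders.

```python
# Hedged sketch: ranking predictor sets by cross-environment coefficient stability.
import itertools
import numpy as np

rng = np.random.default_rng(0)

def simulate(env_shift, n=300):
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n) + env_shift              # distribution changes per environment
    y = 2.0 * x1 + rng.normal(scale=0.1, size=n)      # true mechanism uses only x1
    x3 = y + env_shift * rng.normal(size=n)           # downstream effect of y
    return np.column_stack([x1, x2, x3]), y

envs = [simulate(s) for s in (1.0, 2.0, -3.0)]

def instability(subset):
    """Variance, across environments, of the per-environment fitted coefficients."""
    coefs = []
    for X, y in envs:
        beta, *_ = np.linalg.lstsq(X[:, subset], y, rcond=None)
        coefs.append(beta)
    return np.var(np.stack(coefs), axis=0).sum()

candidates = [c for r in (1, 2) for c in itertools.combinations(range(3), r)]
ranked = sorted(candidates, key=instability)
print("most invariant predictor set:", ranked[0])     # expected: (0,) i.e. x1 only
```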
Controlling Covariate Shift using Balanced Normalization of Weights
Title | Controlling Covariate Shift using Balanced Normalization of Weights |
Authors | Aaron Defazio, Léon Bottou |
Abstract | We introduce a new normalization technique that exhibits the fast convergence properties of batch normalization using a transformation of layer weights instead of layer outputs. The proposed technique keeps the contribution of positive and negative weights to the layer output balanced. We validate our method on a set of standard benchmarks including CIFAR-10/100, SVHN and ILSVRC 2012 ImageNet. |
Tasks | |
Published | 2018-12-11 |
URL | https://arxiv.org/abs/1812.04549v2 |
https://arxiv.org/pdf/1812.04549v2.pdf | |
PWC | https://paperswithcode.com/paper/controlling-covariate-shift-using-equilibrium |
Repo | |
Framework | |
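The abstract does not spell out the transformation, so the sketch below shows one plausible reading of "balanced normalization of weights": rescale the positive and negative parts of each output unit's weight row so that they carry equal L1 mass. This is an assumption made for illustration only, not the paper's exact method.

```python
# Hedged sketch: balance positive and negative weight contributions per output unit.
import torch

def balance_rows(weight: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    pos = weight.clamp(min=0.0)
    neg = weight.clamp(max=0.0)
    pos_mass = pos.sum(dim=1, keepdim=True) + eps      # L1 mass of the positive part
    neg_mass = (-neg).sum(dim=1, keepdim=True) + eps    # L1 mass of the negative part
    # Scale each part to mass 0.5 so positives and negatives contribute equally
    # and each row has unit L1 norm overall.
    return 0.5 * pos / pos_mass + 0.5 * neg / neg_mass

w = torch.randn(4, 64)                 # raw weights of a linear layer (out=4, in=64)
wb = balance_rows(w)
print(wb.clamp(min=0).sum(dim=1))      # positive mass per row, each close to 0.5
print((-wb.clamp(max=0)).sum(dim=1))   # negative mass per row, each close to 0.5
```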
Deep Geodesic Learning for Segmentation and Anatomical Landmarking
Title | Deep Geodesic Learning for Segmentation and Anatomical Landmarking |
Authors | Neslisah Torosdagli, Denise K. Liberton, Payal Verma, Murat Sincan, Janice S. Lee, Ulas Bagci |
Abstract | In this paper, we propose a novel deep learning framework for anatomy segmentation and automatic landmarking. Specifically, we focus on the challenging problem of mandible segmentation from cone-beam computed tomography (CBCT) scans and the identification of 9 anatomical landmarks of the mandible in the geodesic space. The overall approach employs three inter-related steps. In step 1, we propose a deep neural network architecture with carefully designed regularization and network hyper-parameters to perform image segmentation without the need for data augmentation and complex post-processing refinement. In step 2, we formulate the landmark localization problem directly on the geodesic space for sparsely-spaced anatomical landmarks. In step 3, we propose to use a long short-term memory (LSTM) network to identify closely-spaced landmarks, which are rather difficult to obtain using other standard detection networks. The proposed fully automated method showed superior efficacy compared to state-of-the-art mandible segmentation and landmarking approaches in craniofacial anomalies and diseased states. We used a very challenging CBCT dataset of 50 patients with a high degree of craniomaxillofacial (CMF) variability that is realistic in clinical practice. Complementary to the quantitative analysis, qualitative visual inspection was conducted for distinct CBCT scans from 250 patients with high anatomical variability. We have also shown the feasibility of the proposed work on an independent dataset from the MICCAI Head-Neck Challenge (2015), achieving state-of-the-art performance. Lastly, we present an in-depth analysis of the proposed deep networks with respect to the choice of hyper-parameters such as pooling and activation functions. |
Tasks | Data Augmentation, Semantic Segmentation |
Published | 2018-10-06 |
URL | http://arxiv.org/abs/1810.04021v1 |
http://arxiv.org/pdf/1810.04021v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-geodesic-learning-for-segmentation-and |
Repo | |
Framework | |
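Step 3's use of an LSTM for closely-spaced landmarks can be sketched as a small sequential regressor that emits one landmark at a time, conditioned on a global feature vector and the previously predicted landmark. The sketch below is not the authors' network; feature sizes and the number of landmarks are placeholders.

```python
# Hedged sketch: sequential landmark regression with an LSTM cell.
import torch
import torch.nn as nn

class SequentialLandmarker(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, n_landmarks=5):
        super().__init__()
        self.n_landmarks = n_landmarks
        self.lstm = nn.LSTMCell(feat_dim + 3, hidden)   # input: features + previous (x, y, z)
        self.head = nn.Linear(hidden, 3)                # next landmark's coordinates

    def forward(self, feats):                           # feats: (batch, feat_dim)
        b = feats.size(0)
        h = feats.new_zeros(b, self.lstm.hidden_size)
        c = feats.new_zeros(b, self.lstm.hidden_size)
        prev = feats.new_zeros(b, 3)                    # start token: origin
        coords = []
        for _ in range(self.n_landmarks):
            h, c = self.lstm(torch.cat([feats, prev], dim=1), (h, c))
            prev = self.head(h)
            coords.append(prev)
        return torch.stack(coords, dim=1)               # (batch, n_landmarks, 3)

model = SequentialLandmarker()
feats = torch.randn(2, 128)                             # placeholder pooled segmentation features
print(model(feats).shape)                               # torch.Size([2, 5, 3])
```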