Paper Group ANR 199
PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
Title | PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation |
Authors | Danfei Xu, Dragomir Anguelov, Ashesh Jain |
Abstract | We present PointFusion, a generic 3D object detection method that leverages both image and 3D point cloud information. Unlike existing methods that either use multi-stage pipelines or hold sensor and dataset-specific assumptions, PointFusion is conceptually simple and application-agnostic. The image data and the raw point cloud data are independently processed by a CNN and a PointNet architecture, respectively. The resulting outputs are then combined by a novel fusion network, which predicts multiple 3D box hypotheses and their confidences, using the input 3D points as spatial anchors. We evaluate PointFusion on two distinctive datasets: the KITTI dataset, which features driving scenes captured with a lidar-camera setup, and the SUN-RGBD dataset, which captures indoor environments with RGB-D cameras. Our model is the first that is able to perform better than or on par with the state of the art on these diverse datasets without any dataset-specific model tuning. |
Tasks | 3D Object Detection, 6D Pose Estimation using RGB, Object Detection, Sensor Fusion |
Published | 2017-11-29 |
URL | http://arxiv.org/abs/1711.10871v2 |
http://arxiv.org/pdf/1711.10871v2.pdf | |
PWC | https://paperswithcode.com/paper/pointfusion-deep-sensor-fusion-for-3d |
Repo | |
Framework | |
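The "input 3D points as spatial anchors" idea in the PointFusion abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a network has already produced, for every input point, eight corner offsets relative to that point plus a confidence score, and it simply returns the hypothesis with the highest score. The function name `select_box` and the data layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def select_box(
    points: Sequence[Point],
    corner_offsets: Sequence[Sequence[Point]],
    scores: Sequence[float],
) -> List[Point]:
    """Return the 8 corners of the highest-scoring box hypothesis.

    Each input point acts as an anchor: its hypothesis is the point
    itself plus eight predicted (dx, dy, dz) corner offsets.
    """
    if not (len(points) == len(corner_offsets) == len(scores)):
        raise ValueError("points, corner_offsets and scores must align")
    if not points:
        raise ValueError("at least one anchor point is required")

    # Pick the anchor whose hypothesis the (assumed) network trusts most.
    best = max(range(len(points)), key=lambda i: scores[i])
    px, py, pz = points[best]

    # Corners are expressed relative to the anchor, so add the anchor back.
    return [(px + dx, py + dy, pz + dz) for dx, dy, dz in corner_offsets[best]]
```

Predicting offsets relative to a nearby point, rather than absolute coordinates, keeps the regression targets small regardless of where the object sits in the scene.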
Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection
Title | Zoom-in-Net: Deep Mining Lesions for Diabetic Retinopathy Detection |
Authors | Zhe Wang, Yanxin Yin, Jianping Shi, Wei Fang, Hongsheng Li, Xiaogang Wang |
Abstract | We propose a convolutional neural network-based algorithm for simultaneously diagnosing diabetic retinopathy and highlighting suspicious regions. Our contributions are twofold: 1) A network termed Zoom-in-Net, which mimics the zoom-in process of a clinician examining retinal images. Trained with only image-level supervision, Zoom-in-Net can generate attention maps that highlight suspicious regions, and it predicts the disease level accurately based on both the whole image and its high-resolution suspicious patches. 2) Only four bounding boxes generated from the automatically learned attention maps are enough to cover 80% of the lesions labeled by an experienced ophthalmologist, which shows the good localization ability of the attention maps. By clustering features at high-response locations on the attention maps, we discover meaningful clusters that contain potential lesions in diabetic retinopathy. Experiments show that our algorithm outperforms the state-of-the-art methods on two datasets, EyePACS and Messidor. |
Tasks | Diabetic Retinopathy Detection |
Published | 2017-06-14 |
URL | http://arxiv.org/abs/1706.04372v1 |
http://arxiv.org/pdf/1706.04372v1.pdf | |
PWC | https://paperswithcode.com/paper/zoom-in-net-deep-mining-lesions-for-diabetic |
Repo | |
Framework | |
Bootstrapping Labelled Dataset Construction for Cow Tracking and Behavior Analysis
Title | Bootstrapping Labelled Dataset Construction for Cow Tracking and Behavior Analysis |
Authors | Aram Ter-Sarkisov, Robert Ross, John Kelleher |
Abstract | This paper introduces a new approach to the long-term tracking of an object in a challenging environment. The object is a cow and the environment is an enclosure in a cowshed. Some of the key challenges in this domain are a cluttered background, low contrast, and high similarity between moving objects, which greatly reduce the efficiency of most existing approaches, including those based on background subtraction. Our approach is split into object localization, instance segmentation, learning, and tracking stages. Our solution is compared to a range of semi-supervised object tracking algorithms, and we show that its performance is strong and well suited to subsequent analysis. We present our solution as a first step towards broader tracking and behavior monitoring for cows in precision agriculture, with the ultimate objective of early detection of lameness. |
Tasks | Instance Segmentation, Object Localization, Object Tracking, Semantic Segmentation |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10571v1 |
http://arxiv.org/pdf/1703.10571v1.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-labelled-dataset-construction |
Repo | |
Framework | |
Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery
Title | Cross-Domain Self-supervised Multi-task Feature Learning using Synthetic Imagery |
Authors | Zhongzheng Ren, Yong Jae Lee |
Abstract | In human learning, it is common to use multiple sources of information jointly. However, most existing feature learning approaches learn from only a single task. In this paper, we propose a novel multi-task deep network to learn generalizable high-level visual representations. Since multi-task learning requires annotations for multiple properties of the same training instance, we look to synthetic images to train our network. To overcome the domain difference between real and synthetic data, we employ an unsupervised feature space domain adaptation method based on adversarial learning. Given an input synthetic RGB image, our network simultaneously predicts its surface normal, depth, and instance contour, while also minimizing the feature space domain differences between real and synthetic data. Through extensive experiments, we demonstrate that our network learns more transferable representations compared to single-task baselines. Our learned representation produces state-of-the-art transfer learning results on PASCAL VOC 2007 classification and 2012 detection. |
Tasks | Domain Adaptation, Multi-Task Learning, Transfer Learning |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.09082v1 |
http://arxiv.org/pdf/1711.09082v1.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-self-supervised-multi-task |
Repo | |
Framework | |
Do Neural Nets Learn Statistical Laws behind Natural Language?
Title | Do Neural Nets Learn Statistical Laws behind Natural Language? |
Authors | Shuntaro Takahashi, Kumiko Tanaka-Ishii |
Abstract | The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf’s law and Heaps’ law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf’s law and Heaps’ law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks. |
Tasks | Language Modelling |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04848v2 |
http://arxiv.org/pdf/1707.04848v2.pdf | |
PWC | https://paperswithcode.com/paper/do-neural-nets-learn-statistical-laws-behind |
Repo | |
Framework | |
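The two statistical properties named in this abstract can be measured on any token sequence. Zipf's law says that word frequency falls off roughly as a power of frequency rank; Heaps' law says that vocabulary size grows roughly as a power of text length. The sketch below only computes the raw quantities one would plot to check each law (for example on text sampled from a language model); it does not fit exponents, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_frequency(tokens: Iterable[str]) -> List[Tuple[int, int]]:
    """Zipf's law data: (rank, frequency) pairs, most frequent word first."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


def vocabulary_growth(tokens: Iterable[str]) -> List[int]:
    """Heaps' law data: number of distinct words seen after each token."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth
```

Plotting both outputs on log-log axes is the usual way to inspect them: an approximately straight line is consistent with the corresponding power law.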
Expected exponential loss for gaze-based video and volume ground truth annotation
Title | Expected exponential loss for gaze-based video and volume ground truth annotation |
Authors | Laurent Lejeune, Mario Christoudias, Raphael Sznitman |
Abstract | Many recent machine learning approaches used in medical imaging are highly reliant on large amounts of image and ground truth data. In the context of object segmentation, pixel-wise annotations are extremely expensive to collect, especially in video and 3D volumes. To reduce this annotation burden, we propose a novel framework that allows annotators to simply observe the object to segment while a $200 eye gaze tracker records where they have looked. Our method then estimates pixel-wise probabilities for the presence of the object throughout the sequence, from which we train a classifier in a semi-supervised setting using a novel Expected Exponential loss function. We show that our framework provides superior performance on a wide range of medical image settings compared to existing strategies, and that our method can also be combined with current crowd-sourcing paradigms. |
Tasks | Semantic Segmentation |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04905v1 |
http://arxiv.org/pdf/1707.04905v1.pdf | |
PWC | https://paperswithcode.com/paper/expected-exponential-loss-for-gaze-based |
Repo | |
Framework | |
Decentralised firewall for malware detection
Title | Decentralised firewall for malware detection |
Authors | Saurabh Raje, Shyamal Vaderia, Neil Wilson, Rudrakh Panigrahi |
Abstract | This paper describes the design and development of a decentralized firewall system powered by a novel malware detection engine. The firewall is built using blockchain technology. The detection engine aims to classify Portable Executable (PE) files as malicious or benign. File classification is carried out using a deep belief neural network (DBN) as the detection engine. Our approach is to model the files as grayscale images and use the DBN to classify those images into the aforementioned two classes. An extensive data set of 10,000 files is used to train the DBN. Validation is carried out using 4,000 files previously unexposed to the network. The final result of whether to allow or block a file is obtained by arriving at a proof of work based consensus in the blockchain network. |
Tasks | Malware Detection |
Published | 2017-11-03 |
URL | http://arxiv.org/abs/1711.01353v1 |
http://arxiv.org/pdf/1711.01353v1.pdf | |
PWC | https://paperswithcode.com/paper/decentralised-firewall-for-malware-detection |
Repo | |
Framework | |
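The step of modelling a file as a grayscale image can be sketched as below. The abstract does not give the image width or the padding scheme, so both are assumptions here: each byte becomes one pixel intensity in the range 0 to 255, rows have a fixed `width`, and the last row is padded with a constant value. The function name `bytes_to_grayscale` is hypothetical.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 16, pad: int = 0) -> List[List[int]]:
    """Reshape raw file bytes into rows of `width` pixel intensities (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    if not 0 <= pad <= 255:
        raise ValueError("pad must be a valid 8-bit intensity")

    pixels = list(data)  # iterating over bytes yields ints in 0..255

    # Pad the final row so that every row has the same length (assumed scheme).
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([pad] * (width - remainder))

    return [pixels[i:i + width] for i in range(0, len(pixels), width)]
```

The resulting rows can be handed to any image classifier that accepts single-channel input, usually after resizing to a fixed height.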
Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
Title | Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection |
Authors | Ruimin Sun, Xiaoyong Yuan, Pan He, Qile Zhu, Aokun Chen, Andre Gregio, Daniela Oliveira, Xiaolin Li |
Abstract | In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology and framework for efficient and effective real-time malware detection, leveraging the best of conventional machine learning (ML) and deep learning (DL) algorithms. In PROPEDEUTICA, all software processes in the system start execution subjected to a conventional ML detector for fast classification. If a piece of software receives a borderline classification, it is subjected to further analysis by more computationally expensive and more accurate DL methods, using our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays to the execution of software subjected to deep learning analysis as a way to “buy time” for DL analysis and to rate-limit the impact of possible malware in the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and 877 commonly used benign software samples from various categories for the Windows OS. Our results show that the false positive rate for conventional ML methods can reach 20%, and for modern DL methods it is usually below 6%. However, the classification time for DL can be 100X longer than for conventional ML methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional ML method) to 90.25%, and reduced the detection time by 54.86%. The percentage of software subjected to DL analysis was approximately 40% on average. In addition, the application of delays in software subjected to ML reduced the detection time by approximately 10%. Finally, we found and discussed a discrepancy between the detection accuracy offline (analysis after all traces are collected) and on-the-fly (analysis in tandem with trace collection). Our insights show that conventional ML and modern DL-based malware detectors in isolation cannot meet the needs of efficient and effective malware detection: high accuracy, low false positive rate, and short classification time. |
Tasks | Malware Detection |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01145v1 |
http://arxiv.org/pdf/1712.01145v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-fast-and-slow-propedeutica-for-real |
Repo | |
Framework | |
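The fast-then-slow routing described in the abstract can be sketched as a two-stage cascade. This is an illustration under stated assumptions, not the PROPEDEUTICA code: both detectors are passed in as plain scoring functions returning a malware probability, the "borderline" band is defined by two hypothetical thresholds `low` and `high`, and the 0.5 cut-off for the slow model is also an assumption. The execution delays mentioned in the abstract are not modelled.

```python
from typing import Callable, Tuple, TypeVar

Sample = TypeVar("Sample")


def cascade_classify(
    sample: Sample,
    fast_score: Callable[[Sample], float],
    slow_score: Callable[[Sample], float],
    low: float = 0.3,
    high: float = 0.7,
) -> Tuple[str, str]:
    """Return (label, stage), where stage records which detector decided."""
    if not 0.0 <= low < high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low < high <= 1")

    # Stage 1: cheap conventional detector handles the confident cases.
    p = fast_score(sample)
    if p <= low:
        return ("benign", "fast")
    if p >= high:
        return ("malware", "fast")

    # Stage 2: only borderline samples pay for the expensive detector.
    q = slow_score(sample)
    return ("malware" if q >= 0.5 else "benign", "slow")
```

Widening the band between `low` and `high` sends more samples to the slow detector, trading classification time for accuracy.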
Abductive, Causal, and Counterfactual Conditionals Under Incomplete Probabilistic Knowledge
Title | Abductive, Causal, and Counterfactual Conditionals Under Incomplete Probabilistic Knowledge |
Authors | Niki Pfeifer, Leena Tulkki |
Abstract | We study abductive, causal, and non-causal conditionals in indicative and counterfactual formulations using probabilistic truth table tasks under incomplete probabilistic knowledge (N = 80). We frame the task as a probability-logical inference problem. The most frequently observed response type across all conditions was a class of conditional event interpretations of conditionals; it was followed by conjunction interpretations. An interesting minority of participants neglected some of the relevant imprecision involved in the premises when inferring lower or upper probability bounds on the target conditional/counterfactual (“halfway responses”). We discuss the results in the light of coherence-based probability logic and the new paradigm psychology of reasoning. |
Tasks | |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03254v2 |
http://arxiv.org/pdf/1703.03254v2.pdf | |
PWC | https://paperswithcode.com/paper/abductive-causal-and-counterfactual |
Repo | |
Framework | |
Manifold Based Low-rank Regularization for Image Restoration and Semi-supervised Learning
Title | Manifold Based Low-rank Regularization for Image Restoration and Semi-supervised Learning |
Authors | Rongjie Lai, Jia Li |
Abstract | Low-rank structures play an important role in recent advances on many problems in image science and data science. As a natural extension of low-rank structures for data with nonlinear structures, the concept of the low-dimensional manifold structure has been considered in many data processing problems. Inspired by this concept, we consider a manifold based low-rank regularization as a linear approximation of manifold dimension. This regularization is less restrictive than the global low-rank regularization, and thus enjoys more flexibility to handle data with nonlinear structures. As applications, we apply the proposed regularization to classical inverse problems in image science and data science, including image inpainting, image super-resolution, X-ray computed tomography (CT) image reconstruction, and semi-supervised learning. We conduct intensive numerical experiments on several image restoration problems and on a semi-supervised learning problem of classifying handwritten digits using the MNIST data. Our numerical tests demonstrate the effectiveness of the proposed methods and illustrate that the new regularization methods produce outstanding results in comparison with many existing methods. |
Tasks | Image Inpainting, Image Reconstruction, Image Restoration, Image Super-Resolution, Super-Resolution |
Published | 2017-02-09 |
URL | http://arxiv.org/abs/1702.02680v1 |
http://arxiv.org/pdf/1702.02680v1.pdf | |
PWC | https://paperswithcode.com/paper/manifold-based-low-rank-regularization-for |
Repo | |
Framework | |
Tapping the sensorimotor trajectory
Title | Tapping the sensorimotor trajectory |
Authors | Oswald Berthold, Verena Hafner |
Abstract | In this paper, we propose the concept of sensorimotor tappings, a new graphical technique that explicitly represents relations between the time steps of an agent’s sensorimotor loop and a single training step of an adaptive internal model. In the simplest case this is a relation linking two time steps. In realistic cases these relations can extend over several time steps and over different sensory channels. The aim is to capture the footprint of information intake relative to the agent’s current time step. We argue that this view allows us to make prior considerations explicit and then use them in implementations without modification once they are established. Here we explain the basic idea, provide example tappings for standard configurations used in developmental models, and show how tappings can be applied to problems in related fields. |
Tasks | |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07622v2 |
http://arxiv.org/pdf/1704.07622v2.pdf | |
PWC | https://paperswithcode.com/paper/tapping-the-sensorimotor-trajectory |
Repo | |
Framework | |
Combining Search with Structured Data to Create a More Engaging User Experience in Open Domain Dialogue
Title | Combining Search with Structured Data to Create a More Engaging User Experience in Open Domain Dialogue |
Authors | Kevin K. Bowden, Shereen Oraby, Jiaqi Wu, Amita Misra, Marilyn Walker |
Abstract | The greatest challenges in building sophisticated open-domain conversational agents arise directly from the potential for ongoing mixed-initiative multi-turn dialogues, which do not follow a particular plan or pursue a particular fixed information need. In order to make coherent conversational contributions in this context, a conversational agent must be able to track the types and attributes of the entities under discussion in the conversation and know how they are related. In some cases, the agent can rely on structured information sources to help identify the relevant semantic relations and produce a turn, but in other cases, the only content available comes from search, and it may be unclear which semantic relations hold between the search results and the discourse context. A further constraint is that the system must produce its contribution to the ongoing conversation in real-time. This paper describes our experience building SlugBot for the 2017 Alexa Prize, and discusses how we leveraged search and structured data from different sources to help SlugBot produce dialogic turns and carry on conversations whose length over the semi-finals user evaluation period averaged 8:17 minutes. |
Tasks | |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05411v1 |
http://arxiv.org/pdf/1709.05411v1.pdf | |
PWC | https://paperswithcode.com/paper/combining-search-with-structured-data-to |
Repo | |
Framework | |
Geometric robustness of deep networks: analysis and improvement
Title | Geometric robustness of deep networks: analysis and improvement |
Authors | Can Kanbak, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard |
Abstract | Deep convolutional neural networks have been shown to be vulnerable to arbitrary geometric transformations. However, there is no systematic method to measure the invariance properties of deep networks to such transformations. We propose ManiFool as a simple yet scalable algorithm to measure the invariance of deep networks. In particular, our algorithm measures the robustness of deep networks to geometric transformations in a worst-case regime, as they can be problematic for sensitive applications. Our extensive experimental results show that ManiFool can be used to measure the invariance of fairly complex networks on high dimensional datasets, and that these values can be used for analyzing the reasons for it. Furthermore, we build on ManiFool to propose a new adversarial training scheme, and we show its effectiveness in improving the invariance properties of deep neural networks. |
Tasks | |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.09115v1 |
http://arxiv.org/pdf/1711.09115v1.pdf | |
PWC | https://paperswithcode.com/paper/geometric-robustness-of-deep-networks |
Repo | |
Framework | |
Integration of LiDAR and Hyperspectral Data for Land-cover Classification: A Case Study
Title | Integration of LiDAR and Hyperspectral Data for Land-cover Classification: A Case Study |
Authors | Pedram Ghamisi, Gabriele Cavallaro, Dan Wu, Jon Atli Benediktsson, Antonio Plaza |
Abstract | In this paper, an approach is proposed to fuse LiDAR and hyperspectral data, which considers both spectral and spatial information in a single framework. Here, an extended self-dual attribute profile (ESDAP) is investigated to extract spatial information from a hyperspectral data set. To extract spectral information, a few well-known classifiers have been used, such as support vector machines (SVMs), random forests (RFs), and artificial neural networks (ANNs). The proposed method accurately classifies the relatively volumetric data set in little CPU processing time, in a real ill-posed situation where there is no balance between the number of training samples and the number of features. The classification part of the proposed approach is fully automatic. |
Tasks | |
Published | 2017-07-09 |
URL | http://arxiv.org/abs/1707.02642v1 |
http://arxiv.org/pdf/1707.02642v1.pdf | |
PWC | https://paperswithcode.com/paper/integration-of-lidar-and-hyperspectral-data |
Repo | |
Framework | |
Measuring the Accuracy of Object Detectors and Trackers
Title | Measuring the Accuracy of Object Detectors and Trackers |
Authors | Tobias Bottger, Patrick Follmann, Michael Fauser |
Abstract | The accuracy of object detectors and trackers is most commonly evaluated by the Intersection over Union (IoU) criterion. To date, most approaches are restricted to axis-aligned or oriented boxes and, as a consequence, many datasets are only labeled with boxes. Nevertheless, axis-aligned or oriented boxes cannot accurately capture an object’s shape. To address this, a number of densely segmented datasets have started to emerge in both the object detection and the object tracking communities. However, evaluating the accuracy of object detectors and trackers that are restricted to boxes on densely segmented data is not straightforward. To close this gap, we introduce the relative Intersection over Union (rIoU) accuracy measure. The measure normalizes the IoU with the optimal box for the segmentation to generate an accuracy measure that ranges between 0 and 1 and allows a more precise measurement of accuracies. Furthermore, it enables an efficient and easy way to understand scenes and the strengths and weaknesses of an object detection or tracking approach. We show how the new measure can be efficiently calculated and present an easy-to-use evaluation framework. The framework is tested on the DAVIS and the VOT2016 segmentations and has been made available to the community. |
Tasks | Object Detection, Object Tracking |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07293v1 |
http://arxiv.org/pdf/1704.07293v1.pdf | |
PWC | https://paperswithcode.com/paper/measuring-the-accuracy-of-object-detectors |
Repo | |
Framework | |
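The normalization described in the rIoU abstract can be illustrated on a tiny pixel mask. This sketch covers axis-aligned boxes only and finds the optimal box by brute force over all candidate boxes, which is a stand-in chosen for clarity and not the efficient calculation the paper refers to. Masks are sets of `(x, y)` pixels, boxes are half-open `(x0, y0, x1, y1)` tuples, and all function names are hypothetical.

```python
from itertools import combinations
from typing import Set, Tuple

Pixel = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open on x1 and y1


def box_mask_iou(box: Box, mask: Set[Pixel]) -> float:
    """IoU between an axis-aligned box and a pixel mask."""
    x0, y0, x1, y1 = box
    box_area = max(0, x1 - x0) * max(0, y1 - y0)
    inter = sum(1 for (x, y) in mask if x0 <= x < x1 and y0 <= y < y1)
    union = box_area + len(mask) - inter
    return inter / union if union else 0.0


def best_box_iou(mask: Set[Pixel]) -> float:
    """Highest IoU any axis-aligned box can reach on this mask (brute force).

    Growing a box beyond the mask's tight bounding box only adds background
    pixels, so the search is limited to boxes inside that bounding box.
    """
    xs = [x for x, _ in mask]
    ys = [y for _, y in mask]
    x_edges = range(min(xs), max(xs) + 2)
    y_edges = range(min(ys), max(ys) + 2)
    return max(
        box_mask_iou((x0, y0, x1, y1), mask)
        for x0, x1 in combinations(x_edges, 2)
        for y0, y1 in combinations(y_edges, 2)
    )


def relative_iou(box: Box, mask: Set[Pixel]) -> float:
    """rIoU: the box's IoU divided by the best IoU achievable by any box."""
    if not mask:
        raise ValueError("mask must contain at least one pixel")
    return box_mask_iou(box, mask) / best_box_iou(mask)
```

For an L-shaped mask no box can reach an IoU of 1, so plain IoU penalizes even a perfect box detector; dividing by the best achievable value rescales the score so that the optimal box scores exactly 1.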