Paper Group ANR 967
A Cell-Division Search Technique for Inversion with Application to Picture-Discovery and Magnetotellurics
Title | A Cell-Division Search Technique for Inversion with Application to Picture-Discovery and Magnetotellurics |
Authors | Bradley Alexander, Yang Heng Lee |
Abstract | Solving inverse problems in natural sciences often requires a search process to find explanatory models that match collected field data. Inverse problems are often under-determined, meaning that there are many potential explanatory models for the same data. In such cases, using stochastic search, through providing multiple solutions, can help characterise which model features are most persistent and therefore likely to be real. Unfortunately, in some fields, large parameter spaces can make stochastic search intractable. In this work we improve upon previous work by defining a compact and expressive representation and search process able to describe and discover two- and three-dimensional spatial models. The search process takes place in stages, starting with greedy search, followed by alternating stages of evolutionary search and a novel model-splitting process inspired by cell division. We apply this framework to two problems - magnetotellurics and picture discovery. We show that our improved representation and search process are able to produce detailed models with low error residuals. |
Tasks | |
Published | 2018-04-21 |
URL | http://arxiv.org/abs/1804.07887v1 |
http://arxiv.org/pdf/1804.07887v1.pdf | |
PWC | https://paperswithcode.com/paper/a-cell-division-search-technique-for |
Repo | |
Framework | |
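The model-splitting step lends itself to a compact illustration. Below is a hedged toy sketch of the cell-division idea as the abstract describes it, not the authors' code: names like `Cell` and `split_cell` are hypothetical, and the representation is reduced to axis-aligned rectangles carrying a single parameter value.

```python
import random

# Toy sketch of the cell-division splitting step described in the abstract.
# A spatial model is a set of rectangular cells, each carrying one parameter
# value (e.g. resistivity); a split replaces one cell with two children that
# inherit the parent's value, so the model's predicted response is unchanged
# until evolutionary search mutates the children independently.

class Cell:
    def __init__(self, x0, x1, y0, y1, value):
        self.x0, self.x1, self.y0, self.y1 = x0, x1, y0, y1
        self.value = value

def split_cell(cell, axis):
    """Divide one cell into two equal halves along the given axis."""
    if axis == "x":
        mid = (cell.x0 + cell.x1) / 2.0
        return [Cell(cell.x0, mid, cell.y0, cell.y1, cell.value),
                Cell(mid, cell.x1, cell.y0, cell.y1, cell.value)]
    mid = (cell.y0 + cell.y1) / 2.0
    return [Cell(cell.x0, cell.x1, cell.y0, mid, cell.value),
            Cell(cell.x0, cell.x1, mid, cell.y1, cell.value)]

# One "division" stage: refine the model by splitting a random cell.
model = [Cell(0.0, 1.0, 0.0, 1.0, value=100.0)]
victim = random.choice(model)
model.remove(victim)
model.extend(split_cell(victim, axis=random.choice(["x", "y"])))
```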
Explainable Neural Networks based on Additive Index Models
Title | Explainable Neural Networks based on Additive Index Models |
Authors | Joel Vaughan, Agus Sudjianto, Erind Brahimi, Jie Chen, Vijayan N. Nair |
Abstract | Machine learning algorithms are increasingly being used due to their flexibility in model fitting and their improved predictive performance. However, the complexity of these models makes it hard for the data analyst to interpret and explain the results without additional tools. This has led to much research in developing various approaches to understand model behavior. In this paper, we present the Explainable Neural Network (xNN), a structured neural network designed especially to learn interpretable features. Unlike fully connected neural networks, the features engineered by the xNN can be extracted from the network in a relatively straightforward manner and the results displayed. With appropriate regularization, the xNN provides a parsimonious explanation of the relationship between the features and the output. We illustrate this interpretable feature-engineering property on simulated examples. |
Tasks | Feature Engineering |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01933v1 |
http://arxiv.org/pdf/1806.01933v1.pdf | |
PWC | https://paperswithcode.com/paper/explainable-neural-networks-based-on-additive |
Repo | |
Framework | |
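The xNN's additive index structure is concrete enough to sketch. The following is a minimal PyTorch illustration of a network whose output is a sum of ridge functions g_k(w_k^T x), each learned by a small subnetwork; the layer sizes and class names are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AdditiveIndexNet(nn.Module):
    def __init__(self, n_features, n_ridge=5, hidden=16):
        super().__init__()
        # One linear projection per ridge function: the index w_k^T x.
        self.projections = nn.Linear(n_features, n_ridge, bias=False)
        # One small univariate subnetwork g_k per index.
        self.ridge_fns = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1))
            for _ in range(n_ridge)
        ])

    def forward(self, x):
        z = self.projections(x)  # shape (batch, n_ridge)
        # Sum the ridge-function outputs; each g_k can be plotted on its own
        # against its index, which is what makes the features extractable.
        return sum(g(z[:, k:k + 1]) for k, g in enumerate(self.ridge_fns))

net = AdditiveIndexNet(n_features=10)
y_hat = net(torch.randn(32, 10))  # (32, 1)
```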
Cubic Range Error Model for Stereo Vision with Illuminators
Title | Cubic Range Error Model for Stereo Vision with Illuminators |
Authors | Marius Huber, Timo Hinzmann, Roland Siegwart, Larry H. Matthies |
Abstract | Use of low-cost depth sensors, such as a stereo camera setup with illuminators, is of particular interest for numerous applications ranging from robotics and transportation to mixed and augmented reality. The ability to quantify noise is crucial for these applications, e.g., when the sensor is used for map generation or to develop a sensor scheduling policy in a multi-sensor setup. Range error models provide uncertainty estimates and help weight the data correctly in instances where range measurements are taken from different vantage points or with different sensors. This weighting is important for fusing range data into a map in a meaningful way, i.e., so that high-confidence data is relied on most heavily. Such a model is derived in this work. We show that the range error for stereo systems with integrated illuminators is cubic and validate the proposed model experimentally with an off-the-shelf structured light stereo system. The experiments confirm the validity of the model and simplify the application of this type of sensor in robotics. The proposed error model is relevant to any stereo system with low ambient light where the main light source is located at the camera system. Among others, this is the case for structured light stereo systems and night stereo systems with headlights. In this work, we propose that the range error is cubic in range for stereo systems with integrated illuminators. Experimental validation with an off-the-shelf structured light stereo system shows that the exponent is between 2.4 and 2.6. The deviation is attributed to our model considering only shot noise. |
Tasks | |
Published | 2018-03-11 |
URL | http://arxiv.org/abs/1803.03932v1 |
http://arxiv.org/pdf/1803.03932v1.pdf | |
PWC | https://paperswithcode.com/paper/cubic-range-error-model-for-stereo-vision |
Repo | |
Framework | |
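One way to see where the cubic exponent comes from: standard stereo has sigma_z = z^2 * sigma_d / (f * b), and with the light source at the camera the pixel signal falls roughly as 1/z^2, so a shot-noise-limited disparity error grows as sigma_d ~ z. The sketch below reproduces that scaling with made-up constants; it illustrates the argument and is not the paper's derivation.

```python
import numpy as np

f_px = 700.0          # focal length in pixels (assumed)
baseline_m = 0.1      # stereo baseline in metres (assumed)
sigma_d0 = 0.1        # disparity noise at z = 1 m, in pixels (assumed)

z = np.linspace(0.5, 5.0, 50)                    # range in metres
sigma_d = sigma_d0 * z                           # shot-noise-limited matching error
sigma_z = z**2 * sigma_d / (f_px * baseline_m)   # ~ z**3 overall

# Fitting a power law recovers the exponent (exactly 3 in this toy model;
# the paper measures 2.4-2.6 on real hardware).
exponent = np.polyfit(np.log(z), np.log(sigma_z), 1)[0]
print(f"fitted range-error exponent: {exponent:.2f}")
```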
A Unified Probabilistic Model for Learning Latent Factors and Their Connectivities from High-Dimensional Data
Title | A Unified Probabilistic Model for Learning Latent Factors and Their Connectivities from High-Dimensional Data |
Authors | Ricardo Pio Monti, Aapo Hyvärinen |
Abstract | Connectivity estimation is challenging in the context of high-dimensional data. A useful preprocessing step is to group variables into clusters; however, it is not always clear how to do so from the perspective of connectivity estimation. Another practical challenge is that we may have data from multiple related classes (e.g., multiple subjects or conditions) and wish to incorporate constraints on the similarities across classes. We propose a probabilistic model which simultaneously performs both a grouping of variables (i.e., detecting community structure) and estimation of connectivities between the groups, which correspond to latent variables. The model is essentially a factor analysis model where the factors are allowed to have arbitrary correlations, while the factor loading matrix is constrained to express a community structure. The model can be applied to multiple classes so that the connectivities can differ between classes, while the community structure is the same for all classes. We propose an efficient estimation algorithm based on score matching, and prove the identifiability of the model. Finally, we present an extension to directed (causal) connectivities over latent variables. Simulations and experiments on fMRI data validate the practical utility of the method. |
Tasks | Connectivity Estimation |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09567v1 |
http://arxiv.org/pdf/1805.09567v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-probabilistic-model-for-learning |
Repo | |
Framework | |
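The generative structure described above can be sketched in a few lines: a factor analysis model whose loading matrix encodes community membership while the latent factors carry the connectivities. The sizes, the hard one-factor-per-variable assignment, and the example covariance below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vars, n_factors, n_samples = 30, 3, 1000

# Community structure: each observed variable loads on exactly one factor.
communities = rng.integers(0, n_factors, size=n_vars)
W = np.zeros((n_vars, n_factors))
W[np.arange(n_vars), communities] = rng.uniform(0.5, 1.5, size=n_vars)

# Correlated latent factors: their covariance is the "connectivity".
connectivity = np.array([[1.0, 0.6, 0.0],
                         [0.6, 1.0, -0.4],
                         [0.0, -0.4, 1.0]])
z = rng.multivariate_normal(np.zeros(n_factors), connectivity, size=n_samples)

# Factor analysis: x = W z + noise.
x = z @ W.T + 0.1 * rng.standard_normal((n_samples, n_vars))
print(x.shape)  # (1000, 30): high-dimensional data with latent communities
```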
Approximation beats concentration? An approximation view on inference with smooth radial kernels
Title | Approximation beats concentration? An approximation view on inference with smooth radial kernels |
Authors | Mikhail Belkin |
Abstract | Positive definite kernels and their associated Reproducing Kernel Hilbert Spaces provide a mathematically compelling and practically competitive framework for learning from data. In this paper we take the approximation theory point of view to explore various aspects of smooth kernels related to their inferential properties. We analyze the eigenvalue decay of kernel operators and matrices, properties of eigenfunctions/eigenvectors, and “Fourier” coefficients of functions in the kernel space restricted to a discrete set of data points. We also investigate the fitting capacity of kernels, giving explicit bounds on the fat-shattering dimension of balls in Reproducing Kernel Hilbert Spaces. Interestingly, the same properties that make kernels very effective approximators for functions in their “native” kernel space also limit their capacity to represent arbitrary functions. We discuss various implications, including those for gradient descent type methods. It is important to note that most of our bounds are measure independent. Moreover, at least in moderate dimension, the bounds for eigenvalues are much tighter than the bounds which can be obtained from the usual matrix concentration results. For example, we see that the eigenvalues of kernel matrices show nearly exponential decay, with constants depending only on the kernel and the domain. We call this the “approximation beats concentration” phenomenon, as even when the data are sampled from a probability distribution, some of their aspects are better understood in terms of approximation theory. |
Tasks | |
Published | 2018-01-10 |
URL | http://arxiv.org/abs/1801.03437v2 |
http://arxiv.org/pdf/1801.03437v2.pdf | |
PWC | https://paperswithcode.com/paper/approximation-beats-concentration-an |
Repo | |
Framework | |
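The near-exponential eigenvalue decay is easy to observe empirically. A short sketch, assuming a Gaussian (RBF) kernel and an arbitrary bandwidth and sample size:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, bandwidth = 500, 2, 1.0
X = rng.uniform(-1, 1, size=(n, d))

# Gaussian kernel matrix on the sample.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2 * bandwidth**2))

eigvals = np.linalg.eigvalsh(K)[::-1]  # descending order
# The top eigenvalues already span many orders of magnitude, and the
# successive ratios keep shrinking: faster-than-polynomial decay.
print(eigvals[:10] / eigvals[0])
```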
Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data
Title | Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data |
Authors | Eunjeong Jeong, Seungeun Oh, Hyesung Kim, Jihong Park, Mehdi Bennis, Seong-Lyun Kim |
Abstract | On-device machine learning (ML) enables the training process to exploit a massive amount of user-generated private data samples. To enjoy this benefit, inter-device communication overhead should be minimized. To this end, we propose federated distillation (FD), a distributed model training algorithm whose communication payload size is much smaller than that of a benchmark scheme, federated learning (FL), particularly when the model size is large. Moreover, user-generated data samples are likely to become non-IID across devices, which commonly degrades performance compared to the case with an IID dataset. To cope with this, we propose federated augmentation (FAug), where each device collectively trains a generative model and thereby augments its local data towards yielding an IID dataset. Empirical studies demonstrate that FD with FAug yields around 26x less communication overhead while achieving 95-98% test accuracy compared to FL. |
Tasks | |
Published | 2018-11-28 |
URL | http://arxiv.org/abs/1811.11479v1 |
http://arxiv.org/pdf/1811.11479v1.pdf | |
PWC | https://paperswithcode.com/paper/communication-efficient-on-device-machine |
Repo | |
Framework | |
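The communication saving comes from what is exchanged. A toy sketch of FD's logit-averaging exchange, with shapes and names assumed for illustration: each device uploads its mean logit vector per label and distills against the average over the other devices, so the payload is independent of model size.

```python
import numpy as np

n_devices, n_classes = 5, 10
rng = np.random.default_rng(0)

# Each device's locally computed per-label mean logits: row c is the
# average logit vector this device produces on its class-c samples.
local_logits = [rng.standard_normal((n_classes, n_classes))
                for _ in range(n_devices)]

def global_targets(device_id):
    """Average the per-label logits of all *other* devices."""
    others = [l for i, l in enumerate(local_logits) if i != device_id]
    return np.mean(others, axis=0)

# Each device adds a distillation term to its local loss, pulling its
# per-label outputs toward these targets. Payload per round is only
# n_classes * n_classes floats, regardless of how large the model is.
targets = global_targets(0)
print(targets.shape)  # (10, 10)
```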
DeepSSM: A Deep Learning Framework for Statistical Shape Modeling from Raw Images
Title | DeepSSM: A Deep Learning Framework for Statistical Shape Modeling from Raw Images |
Authors | Riddhish Bhalodia, Shireen Y. Elhabian, Ladislav Kavan, Ross T. Whitaker |
Abstract | Statistical shape modeling is an important tool to characterize variation in anatomical morphology. Typical shapes of interest are measured using 3D imaging and a subsequent pipeline of registration, segmentation, and some extraction of shape features or projections onto some lower-dimensional shape space, which facilitates subsequent statistical analysis. Many methods for constructing compact shape representations have been proposed, but are often impractical due to the sequence of image preprocessing operations, which involve significant parameter tuning, manual delineation, and/or quality control by the users. We propose DeepSSM: a deep learning approach to extract a low-dimensional shape representation directly from 3D images, requiring virtually no parameter tuning or user assistance. DeepSSM uses a convolutional neural network (CNN) that simultaneously localizes the biological structure of interest, establishes correspondences, and projects these points onto a low-dimensional shape representation in the form of PCA loadings within a point distribution model. To overcome the challenge of the limited availability of training images, we present a novel data augmentation procedure that uses existing correspondences on a relatively small set of processed images with shape statistics to create plausible training samples with known shape parameters. Hence, we leverage a limited set of CT/MRI scans (40-50) to produce the thousands of images needed to train a CNN. After training, the CNN automatically produces accurate low-dimensional shape representations for unseen images. We validate DeepSSM on three different applications: pediatric cranial CT for characterization of metopic craniosynostosis, femur CT scans identifying morphologic deformities of the hip due to femoroacetabular impingement, and left atrium MRI scans for atrial fibrillation recurrence prediction. |
Tasks | Data Augmentation |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1810.00111v1 |
http://arxiv.org/pdf/1810.00111v1.pdf | |
PWC | https://paperswithcode.com/paper/deepssm-a-deep-learning-framework-for |
Repo | |
Framework | |
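The core regression step can be sketched directly: a 3D CNN mapping a raw volume to PCA loadings of a point distribution model. The PyTorch layers below are illustrative assumptions about depth and width, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ShapeRegressor(nn.Module):
    def __init__(self, n_loadings=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(16, n_loadings)  # PCA loadings of the PDM

    def forward(self, volume):
        h = self.features(volume).flatten(1)
        return self.head(h)

# Training would regress these loadings against known shape parameters
# from the augmented sample set the abstract describes.
model = ShapeRegressor()
loadings = model(torch.randn(2, 1, 64, 64, 64))  # (2, 10)
```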
Brain Tumor Segmentation Using Deep Learning by Type Specific Sorting of Images
Title | Brain Tumor Segmentation Using Deep Learning by Type Specific Sorting of Images |
Authors | Zahra Sobhaninia, Safiyeh Rezaei, Alireza Noroozi, Mehdi Ahmadi, Hamidreza Zarrabi, Nader Karimi, Ali Emami, Shadrokh Samavi |
Abstract | Recently, deep learning has been playing a major role in the field of computer vision. One of its applications is the reduction of human judgment in the diagnosis of diseases. In particular, brain tumor diagnosis requires high accuracy, where minute errors in judgment may lead to disaster. For this reason, brain tumor segmentation is an important challenge for medical purposes. Several methods currently exist for tumor segmentation, but they all lack high accuracy. Here we present a deep learning solution for brain tumor segmentation. In this work, we studied different angles of brain MR images and applied different networks for segmentation. The effect of using separate networks for segmentation of MR images is evaluated by comparing the results with a single network. Experimental evaluations of the networks show that a Dice score of 0.73 is achieved with a single network and 0.79 is obtained with multiple networks. |
Tasks | Brain Tumor Segmentation |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07786v1 |
http://arxiv.org/pdf/1809.07786v1.pdf | |
PWC | https://paperswithcode.com/paper/brain-tumor-segmentation-using-deep-learning |
Repo | |
Framework | |
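For reference, the Dice score quoted above is the standard overlap metric between a predicted and a ground-truth mask; a generic implementation (not the authors' evaluation code):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

# Example on toy masks: two partially overlapping squares.
a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), dtype=bool); b[3:7, 3:7] = True
print(round(dice_score(a, b), 3))
```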
Imparting Interpretability to Word Embeddings while Preserving Semantic Structure
Title | Imparting Interpretability to Word Embeddings while Preserving Semantic Structure |
Authors | Lutfi Kerem Senel, Ihsan Utlu, Furkan Şahinuç, Haldun M. Ozaktas, Aykut Koç |
Abstract | As a ubiquitous method in natural language processing, word embeddings are extensively employed to map semantic properties of words into a dense vector representation. They capture semantic and syntactic relations among words, but the vectors corresponding to the words are only meaningful relative to each other. Neither the vector nor its dimensions have any absolute, interpretable meaning. We introduce an additive modification to the objective function of the embedding learning algorithm that encourages the embedding vectors of words that are semantically related to a predefined concept to take larger values along a specified dimension, while leaving the original semantic learning mechanism mostly unaffected. In other words, we align words that are already determined to be related along predefined concepts. Therefore, we impart interpretability to the word embedding by assigning meaning to its vector dimensions. The predefined concepts are derived from an external lexical resource, which in this paper is chosen as Roget’s Thesaurus. We observe that alignment along the chosen concepts is not limited to words in the Thesaurus and extends to other related words as well. We quantify the extent of interpretability and assignment of meaning from our experimental results. Manual human evaluation results are also presented to further verify that the proposed method increases interpretability. We also demonstrate the preservation of semantic coherence of the resulting vector space by using word-analogy and word-similarity tests. These tests show that the interpretability-imparted word embeddings obtained by the proposed framework do not sacrifice performance on common benchmark tests. |
Tasks | Word Embeddings |
Published | 2018-07-19 |
URL | https://arxiv.org/abs/1807.07279v3 |
https://arxiv.org/pdf/1807.07279v3.pdf | |
PWC | https://paperswithcode.com/paper/imparting-interpretability-to-word-embeddings |
Repo | |
Framework | |
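The additive modification can be sketched as an extra penalty term optimized jointly with the usual embedding loss. The exact penalty form (a squared hinge here) and all names below are assumptions for illustration; only the idea of pulling concept words upward along a dedicated dimension comes from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 50
E = 0.1 * rng.standard_normal((vocab_size, dim))  # embedding matrix

# concept_words[k] = indices of words assigned to concept k (from the
# lexical resource); indices here are arbitrary placeholders.
concept_words = {0: [3, 17, 42], 1: [5, 99]}
target, lam, lr = 1.0, 0.1, 0.05

def interpretability_grad(E):
    """Gradient of lam * sum_k sum_{w in k} max(0, target - E[w, k])**2."""
    grad = np.zeros_like(E)
    for k, words in concept_words.items():
        gap = np.maximum(0.0, target - E[words, k])
        grad[words, k] -= 2.0 * lam * gap  # pushes dimension k upward
    return grad

# One descent step on the added term; the original embedding loss would
# be optimized jointly so semantics are mostly unaffected.
E -= lr * interpretability_grad(E)
print(E[3, 0], E[17, 0])  # concept-0 words moved up along dimension 0
```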
TextWorld: A Learning Environment for Text-based Games
Title | TextWorld: A Learning Environment for Text-based Games |
Authors | Marc-Alexandre Côté, Ákos Kádár, Xingdi Yuan, Ben Kybartas, Tavian Barnes, Emery Fine, James Moore, Ruo Yu Tao, Matthew Hausknecht, Layla El Asri, Mahmoud Adada, Wendy Tay, Adam Trischler |
Abstract | We introduce TextWorld, a sandbox learning environment for the training and evaluation of RL agents on text-based games. TextWorld is a Python library that handles interactive play-through of text games, as well as backend functions like state tracking and reward assignment. It comes with a curated list of games whose features and challenges we have analyzed. More significantly, it enables users to handcraft or automatically generate new games. Its generative mechanisms give precise control over the difficulty, scope, and language of constructed games, and can be used to relax challenges inherent to commercial text games like partial observability and sparse rewards. By generating sets of varied but similar games, TextWorld can also be used to study generalization and transfer learning. We cast text-based games in the Reinforcement Learning formalism, use our framework to develop a set of benchmark games, and evaluate several baseline agents on this set and the curated list. |
Tasks | Transfer Learning |
Published | 2018-06-29 |
URL | https://arxiv.org/abs/1806.11532v2 |
https://arxiv.org/pdf/1806.11532v2.pdf | |
PWC | https://paperswithcode.com/paper/textworld-a-learning-environment-for-text |
Repo | |
Framework | |
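A minimal play-through loop, following the usage shown in the TextWorld README (the game path is a placeholder and the fixed command stands in for an agent's policy):

```python
import textworld

# Start an interactive environment around a locally available game file.
env = textworld.start("games/my_game.ulx")
game_state = env.reset()

reward, done, moves = 0, False, 0
while not done and moves < 10:
    command = "go north"  # a real agent would choose from game_state
    game_state, reward, done = env.step(command)
    moves += 1
env.close()
```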
PedX: Benchmark Dataset for Metric 3D Pose Estimation of Pedestrians in Complex Urban Intersections
Title | PedX: Benchmark Dataset for Metric 3D Pose Estimation of Pedestrians in Complex Urban Intersections |
Authors | Wonhui Kim, Manikandasriram Srinivasan Ramanagopal, Charles Barto, Ming-Yuan Yu, Karl Rosaen, Nick Goumas, Ram Vasudevan, Matthew Johnson-Roberson |
Abstract | This paper presents a novel dataset titled PedX, a large-scale multimodal collection of pedestrians at complex urban intersections. PedX consists of more than 5,000 pairs of high-resolution (12MP) stereo images and LiDAR data, along with 2D and 3D labels of pedestrians. We also present a novel 3D model fitting algorithm for automatic 3D labeling that harnesses constraints across different modalities and novel shape and temporal priors. All annotated 3D pedestrians are localized in real-world metric space, and the generated 3D models are validated using a mocap system configured in a controlled outdoor environment to simulate pedestrians in urban intersections. We also show that the manual 2D labels can be replaced by state-of-the-art automated labeling approaches, thereby facilitating the automatic generation of large-scale datasets. |
Tasks | 3D Pose Estimation, Pose Estimation |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03605v1 |
http://arxiv.org/pdf/1809.03605v1.pdf | |
PWC | https://paperswithcode.com/paper/pedx-benchmark-dataset-for-metric-3d-pose |
Repo | |
Framework | |
Joint Mapping and Calibration via Differentiable Sensor Fusion
Title | Joint Mapping and Calibration via Differentiable Sensor Fusion |
Authors | Jonathan P. Chen, Fritz Obermeyer, Vladimir Lyapunov, Lionel Gueguen, Noah D. Goodman |
Abstract | We leverage automatic differentiation (AD) and probabilistic programming to develop an end-to-end optimization algorithm for batch triangulation of a large number of unknown objects. Given noisy detections extracted from noisily geo-located street level imagery without depth information, we jointly estimate the number and location of objects of different types, together with parameters for sensor noise characteristics and prior distribution of objects conditioned on side information. The entire algorithm is framed as nested stochastic variational inference. An inner loop solves a soft data association problem via loopy belief propagation; a middle loop performs soft EM clustering using a regularized Newton solver (leveraging an AD framework); an outer loop backpropagates through the inner loops to train global parameters. We place priors over sensor parameters for different traffic object types, and demonstrate improvements with richer priors incorporating knowledge of the environment. We test our algorithm on detections of road signs observed by cars with mounted cameras, though in practice this technique can be used for any geo-tagged images. The detections were extracted by neural image detectors and classifiers, and we independently triangulate each type of sign (e.g. stop, traffic light). We find that our model is more robust to DNN misclassifications than current methods, generalizes across sign types, and can use geometric information to increase precision. Our algorithm outperforms our current production baseline based on k-means clustering. We show that variational inference training allows generalization by learning sign-specific parameters. |
Tasks | Calibration, Probabilistic Programming, Sensor Fusion |
Published | 2018-11-21 |
URL | http://arxiv.org/abs/1812.00880v2 |
http://arxiv.org/pdf/1812.00880v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-mapping-and-calibration-via |
Repo | |
Framework | |
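A heavily reduced sketch of the variational framing, for a single object with no data association: infer the object's location and the sensor noise scale from noisy detections. The abstract does not name a probabilistic programming framework; Pyro is an assumption here, and the paper's nested inference loops are not reproduced.

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

# Noisy geo-located detections of one object (placeholder coordinates).
detections = torch.tensor([[1.0, 2.1], [0.9, 1.8], [1.2, 2.0]])

def model(obs):
    # Prior over the object's 2D location and the sensor noise scale.
    loc = pyro.sample("loc", dist.Normal(torch.zeros(2), 10.0).to_event(1))
    noise = pyro.sample("noise", dist.LogNormal(0.0, 1.0))
    with pyro.plate("detections", obs.shape[0]):
        pyro.sample("obs", dist.Normal(loc, noise).to_event(1), obs=obs)

guide = AutoNormal(model)
svi = SVI(model, guide, Adam({"lr": 0.05}), loss=Trace_ELBO())
for _ in range(500):
    svi.step(detections)

print(guide.median()["loc"])  # posterior estimate of the object location
```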
Can 3D Pose be Learned from 2D Projections Alone?
Title | Can 3D Pose be Learned from 2D Projections Alone? |
Authors | Dylan Drover, Rohith MV, Ching-Hang Chen, Amit Agrawal, Ambrish Tyagi, Cong Phuoc Huynh |
Abstract | 3D pose estimation from a single image is a challenging task in computer vision. We present a weakly supervised approach to estimate 3D pose points, given only 2D pose landmarks. Our method does not require correspondences between 2D and 3D points to build explicit 3D priors. We utilize an adversarial framework to impose a prior on the 3D structure, learned solely from their random 2D projections. Given a set of 2D pose landmarks, the generator network hypothesizes their depths to obtain a 3D skeleton. We propose a novel Random Projection layer, which randomly projects the generated 3D skeleton and sends the resulting 2D pose to the discriminator. The discriminator improves by discriminating between the generated poses and pose samples from a real distribution of 2D poses. Training does not require correspondence between the 2D inputs to either the generator or the discriminator. We apply our approach to the task of 3D human pose estimation. Results on the Human3.6M dataset demonstrate that our approach outperforms many previous supervised and weakly supervised approaches. |
Tasks | 3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation |
Published | 2018-08-22 |
URL | http://arxiv.org/abs/1808.07182v1 |
http://arxiv.org/pdf/1808.07182v1.pdf | |
PWC | https://paperswithcode.com/paper/can-3d-pose-be-learned-from-2d-projections |
Repo | |
Framework | |
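The Random Projection layer's job can be sketched in NumPy: rotate a hypothesized skeleton by a random azimuth about the vertical axis and project to 2D. Orthographic projection and the joint count are simplifying assumptions; the paper's layer operates inside the network on batches.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_projection(skeleton_3d):
    """skeleton_3d: (n_joints, 3) array of (x, y, z) joint positions."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([[c, 0.0, s],
                      [0.0, 1.0, 0.0],
                      [-s, 0.0, c]])
    rotated = skeleton_3d @ rot_y.T
    return rotated[:, :2]  # drop depth: the projected 2D pose

# The discriminator would see these projections alongside real 2D poses.
skeleton = rng.standard_normal((17, 3))  # 17 joints, e.g. Human3.6M format
pose_2d = random_projection(skeleton)
print(pose_2d.shape)  # (17, 2)
```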
Balanced Sparsity for Efficient DNN Inference on GPU
Title | Balanced Sparsity for Efficient DNN Inference on GPU |
Authors | Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie |
Abstract | In trained deep neural networks, unstructured pruning can reduce redundant weights to lower storage cost. However, it requires customized hardware to speed up practical inference. Another trend accelerates sparse model inference on general-purpose hardware by adopting coarse-grained sparsity to prune or regularize consecutive weights for efficient computation, but this method often sacrifices model accuracy. In this paper, we propose a novel fine-grained sparsity approach, balanced sparsity, to achieve high model accuracy efficiently on commercial hardware. Our approach is adapted to the high-parallelism property of GPUs, showing strong potential for sparsity in the wide deployment of deep learning services. Experimental results show that balanced sparsity achieves up to 3.1x practical speedup for model inference on GPU, while retaining the same high model accuracy as fine-grained sparsity. |
Tasks | |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00206v4 |
http://arxiv.org/pdf/1811.00206v4.pdf | |
PWC | https://paperswithcode.com/paper/balanced-sparsity-for-efficient-dnn-inference |
Repo | |
Framework | |
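A hedged sketch of the balanced pruning pattern: split each weight-matrix row into equal-sized blocks and keep the same number of largest-magnitude weights in every block, so GPU thread groups get identical workloads. Block size and per-block budget below are illustrative choices.

```python
import numpy as np

def balanced_prune(W, block_size=8, keep_per_block=2):
    """Zero all but the top-k magnitudes within each block of each row."""
    out = np.zeros_like(W)
    rows, cols = W.shape
    assert cols % block_size == 0
    for r in range(rows):
        for b in range(0, cols, block_size):
            block = W[r, b:b + block_size]
            keep = np.argsort(np.abs(block))[-keep_per_block:]
            out[r, b + keep] = block[keep]
    return out

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16))
W_sparse = balanced_prune(W)
# Every 8-weight block in every row now holds exactly 2 nonzeros:
print((W_sparse.reshape(4, 2, 8) != 0).sum(axis=2))
```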
Multi-Task Deep Networks for Depth-Based 6D Object Pose and Joint Registration in Crowd Scenarios
Title | Multi-Task Deep Networks for Depth-Based 6D Object Pose and Joint Registration in Crowd Scenarios |
Authors | Juil Sock, Kwang In Kim, Caner Sahin, Tae-Kyun Kim |
Abstract | In bin-picking scenarios, multiple instances of an object of interest are stacked randomly in a pile, and hence the instances are inherently subject to challenges: severe occlusion, clutter, and similar-looking distractors. Most existing methods are, however, designed for single isolated object instances, while some recent methods tackle crowd scenarios with a post-refinement step that accounts for relations between multiple objects. In this paper, we address recovering the 6D poses of multiple instances in bin-picking scenarios in the depth modality by multi-task learning in deep neural networks. Our architecture jointly learns multiple sub-tasks: 2D detection, depth, and 3D pose estimation of individual objects; and joint registration of multiple objects. For training data generation, depth images of physically plausible object pose configurations are generated by placing a 3D object model in a physics simulation, which yields diverse occlusion patterns to learn. We adopt a state-of-the-art object detector, and 2D offsets are further estimated via a network to refine misaligned 2D detections. The depth and 3D pose estimator is designed to generate multiple hypotheses per detection. This allows the joint registration network to learn occlusion patterns and remove physically implausible pose hypotheses. We apply our architecture to both synthetic (our own and the Sileane dataset) and real (a public Bin-Picking dataset) data, showing that it significantly outperforms state-of-the-art methods by 15-31% in average precision. |
Tasks | 3D Pose Estimation, Multi-Task Learning, Pose Estimation |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.03891v1 |
http://arxiv.org/pdf/1806.03891v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-deep-networks-for-depth-based-6d |
Repo | |
Framework | |
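The "multiple hypotheses per detection" design can be sketched as a pose head emitting several candidates per object for the joint-registration stage to rank and prune. The translation-plus-quaternion parameterization and all sizes are assumptions, not the paper's exact head.

```python
import torch
import torch.nn as nn

class MultiHypothesisPoseHead(nn.Module):
    def __init__(self, feat_dim=256, n_hypotheses=4):
        super().__init__()
        self.n = n_hypotheses
        # Each hypothesis: 3D translation + unit quaternion = 7 numbers.
        self.fc = nn.Linear(feat_dim, n_hypotheses * 7)

    def forward(self, feat):
        out = self.fc(feat).view(-1, self.n, 7)
        trans, quat = out[..., :3], out[..., 3:]
        quat = quat / quat.norm(dim=-1, keepdim=True)  # valid rotations
        return trans, quat

head = MultiHypothesisPoseHead()
t, q = head(torch.randn(2, 256))  # 2 detections -> 4 hypotheses each
```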