Paper Group ANR 774
Generative Visual Rationales
Title | Generative Visual Rationales |
Authors | Jarrel Seah, Jennifer Tang, Andy Kitchen, Jonathan Seah |
Abstract | Interpretability and small labelled datasets are key issues in the practical application of deep learning, particularly in areas such as medicine. In this paper, we present a semi-supervised technique that addresses both these issues by leveraging large unlabelled datasets to encode and decode images into a dense latent representation. Using chest radiography as an example, we apply this encoder to other labelled datasets and apply simple models to the latent vectors to learn algorithms to identify heart failure. For each prediction, we generate visual rationales by optimizing a latent representation to minimize the prediction of disease while constrained by a similarity measure in image space. Decoding the resultant latent representation produces an image without apparent disease. The difference between the original decoding and the altered image forms an interpretable visual rationale for the algorithm’s prediction on that image. We also apply our method to the MNIST dataset and compare the generated rationales to other techniques described in the literature. |
Tasks | |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.04539v1 |
http://arxiv.org/pdf/1804.04539v1.pdf | |
PWC | https://paperswithcode.com/paper/generative-visual-rationales |
Repo | |
Framework | |
Latent Space Policies for Hierarchical Reinforcement Learning
Title | Latent Space Policies for Hierarchical Reinforcement Learning |
Authors | Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine |
Abstract | We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective. Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer. The maximum entropy objective causes these latent variables to be incorporated into the layer’s policy, and the higher level layer can directly control the behavior of the lower layer through this latent space. Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity: neither the higher layers nor the lower layers are constrained in their behavior. Our experimental evaluation demonstrates that we can improve on the performance of single-layer policies on standard benchmark tasks simply by adding additional layers, and that our method can solve more complex sparse-reward tasks by learning higher-level policies on top of high-entropy skills optimized for simple low-level objectives. |
Tasks | Hierarchical Reinforcement Learning |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02808v2 |
http://arxiv.org/pdf/1804.02808v2.pdf | |
PWC | https://paperswithcode.com/paper/latent-space-policies-for-hierarchical |
Repo | |
Framework | |
FASK with Interventional Knowledge Recovers Edges from the Sachs Model
Title | FASK with Interventional Knowledge Recovers Edges from the Sachs Model |
Authors | Joseph Ramsey, Bryan Andrews |
Abstract | We report a procedure that, in one step from continuous data with minimal preparation, recovers the graph found by Sachs et al. (2005), with only a few edges different. The algorithm, Fast Adjacency Skewness (FASK), relies on a mixture of linear reasoning and reasoning from the skewness of variables; the Sachs data is a good candidate for this procedure since the skewness of the variables is quite pronounced. We review the ground truth model from Sachs et al. as well as some of the fluctuations seen in the protein abundances in the system, give the Sachs model and the FASK model, and perform a detailed comparison. Some variation in hyper-parameters is explored, though the main result uses values at or near the defaults learned from work modeling fMRI data. |
Tasks | |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.03108v1 |
http://arxiv.org/pdf/1805.03108v1.pdf | |
PWC | https://paperswithcode.com/paper/fask-with-interventional-knowledge-recovers |
Repo | |
Framework | |
High Dynamic Range SLAM with Map-Aware Exposure Time Control
Title | High Dynamic Range SLAM with Map-Aware Exposure Time Control |
Authors | Sergey V. Alexandrov, Johann Prankl, Michael Zillich, Markus Vincze |
Abstract | The research in dense online 3D mapping is mostly focused on the geometrical accuracy and spatial extent of the reconstructions. Their color appearance is often neglected, leading to inconsistent colors and noticeable artifacts. We rectify this by extending a state-of-the-art SLAM system to accumulate colors in HDR space. We replace the simplistic pixel intensity averaging scheme with HDR color fusion rules tailored to the incremental nature of SLAM and a noise model suitable for off-the-shelf RGB-D cameras. Our main contribution is a map-aware exposure time controller. It makes decisions based on the global state of the map and predicted camera motion, attempting to maximize the information gain of each observation. We report a set of experiments demonstrating the improved texture quality and advantages of using the custom controller that is tightly integrated in the mapping loop. |
Tasks | |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07427v1 |
http://arxiv.org/pdf/1804.07427v1.pdf | |
PWC | https://paperswithcode.com/paper/high-dynamic-range-slam-with-map-aware |
Repo | |
Framework | |
Image Dataset for Visual Objects Classification in 3D Printing
Title | Image Dataset for Visual Objects Classification in 3D Printing |
Authors | Hongjia Li, Xiaolong Ma, Aditya Singh Rathore, Zhe Li, Qiyuan An, Chen Song, Wenyao Xu, Yanzhi Wang |
Abstract | The rapid development in additive manufacturing (AM), also known as 3D printing, has brought about potential risk and security issues along with significant benefits. In order to enhance the security level of the 3D printing process, the present research aims to detect and recognize illegal components using deep learning. In this work, we collected a dataset of 61,340 2D images (28x28 for each image) of 10 classes including guns and other non-gun objects, corresponding to the projection results of the original 3D models. To validate the dataset, we train a convolutional neural network (CNN) model for gun classification which can achieve 98.16% classification accuracy. |
Tasks | |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1803.00391v2 |
http://arxiv.org/pdf/1803.00391v2.pdf | |
PWC | https://paperswithcode.com/paper/image-dataset-for-visual-objects |
Repo | |
Framework | |
Metric Learning for Phoneme Perception
Title | Metric Learning for Phoneme Perception |
Authors | Yair Lakretz, Gal Chechik, Evan-Gary Cohen, Alessandro Treves, Naama Friedmann |
Abstract | Metric functions for phoneme perception capture the similarity structure among phonemes in a given language and therefore play a central role in phonology and psycho-linguistics. Various phenomena depend on phoneme similarity, such as spoken word recognition or serial recall from verbal working memory. This study presents a new framework for learning a metric function for perceptual distances among pairs of phonemes. Previous studies have proposed various metric functions, from simple measures counting the number of phonetic dimensions that two phonemes share (place-, manner-of-articulation and voicing), to more sophisticated ones such as deriving perceptual distances based on the number of natural classes that both phonemes belong to. However, previous studies have manually constructed the metric function, which may lead to an unsatisfactory account of the empirical data. This study presents a framework to derive the metric function from behavioral data on phoneme perception using learning algorithms. We first show that this approach outperforms previous metrics suggested in the literature in predicting perceptual distances among phoneme pairs. We then study several metric functions derived by the learning algorithms and show how perceptual saliencies of phonological features can be derived from them. For English, we show that the derived perceptual saliencies are in accordance with a previously described order among phonological features and show how the framework extends the results to more features. Finally, we explore how the metric function and perceptual saliencies of phonological features may vary across languages. To this end, we compare results based on two English datasets and a new dataset that we have collected for Hebrew. |
Tasks | Metric Learning |
Published | 2018-09-20 |
URL | http://arxiv.org/abs/1809.07824v1 |
http://arxiv.org/pdf/1809.07824v1.pdf | |
PWC | https://paperswithcode.com/paper/metric-learning-for-phoneme-perception |
Repo | |
Framework | |
Biased Embeddings from Wild Data: Measuring, Understanding and Removing
Title | Biased Embeddings from Wild Data: Measuring, Understanding and Removing |
Authors | Adam Sutton, Thomas Lansdall-Welfare, Nello Cristianini |
Abstract | Many modern Artificial Intelligence (AI) systems make use of data embeddings, particularly in the domain of Natural Language Processing (NLP). These embeddings are learnt from data that has been gathered “from the wild” and have been found to contain unwanted biases. In this paper we make three contributions towards measuring, understanding and removing this problem. We present a rigorous way to measure some of these biases, based on the use of word lists created for social psychology applications; we observe how gender bias in occupations reflects actual gender bias in the same occupations in the real world; and finally we demonstrate how a simple projection can significantly reduce the effects of embedding bias. All this is part of an ongoing effort to understand how trust can be built into AI systems. |
Tasks | |
Published | 2018-06-16 |
URL | http://arxiv.org/abs/1806.06301v1 |
http://arxiv.org/pdf/1806.06301v1.pdf | |
PWC | https://paperswithcode.com/paper/biased-embeddings-from-wild-data-measuring |
Repo | |
Framework | |
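The abstract above mentions that "a simple projection" can reduce embedding bias, but this listing does not give the authors' exact procedure. As a rough illustration only, the sketch below removes the component of each embedding along a single direction. The function name `remove_direction`, the toy vectors and the choice of bias direction are all assumptions made for this example.

```python
import numpy as np


def remove_direction(vectors, direction):
    """Project each row of `vectors` onto the subspace orthogonal to `direction`."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)              # unit vector along the assumed bias axis
    v = np.asarray(vectors, dtype=float)
    return v - np.outer(v @ d, d)          # subtract each row's component along d


# Toy 3-d "embeddings"; the numbers are made up for illustration.
emb = np.array([[1.0, 2.0, 0.5],
                [-0.5, 1.0, 2.0]])
bias_dir = np.array([1.0, 0.0, 0.0])       # hypothetical bias direction
debiased = remove_direction(emb, bias_dir)
```

After the projection every row has zero dot product with `bias_dir`, while the components orthogonal to it are unchanged. In practice the direction would have to be estimated from data, for example from word lists, which this sketch does not attempt.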
BCI decoder performance comparison of an LSTM recurrent neural network and a Kalman filter in retrospective simulation
Title | BCI decoder performance comparison of an LSTM recurrent neural network and a Kalman filter in retrospective simulation |
Authors | Tommy Hosman, Marco Vilela, Daniel Milstein, Jessica N. Kelemen, David M. Brandman, Leigh R. Hochberg, John D. Simeral |
Abstract | Intracortical brain computer interfaces (iBCIs) using linear Kalman decoders have enabled individuals with paralysis to control a computer cursor for continuous point-and-click typing on a virtual keyboard, browsing the internet, and using familiar tablet apps. However, further advances are needed to deliver iBCI-enabled cursor control approaching able-bodied performance. Motivated by recent evidence that nonlinear recurrent neural networks (RNNs) can provide higher performance iBCI cursor control in nonhuman primates (NHPs), we evaluated decoding of intended cursor velocity from human motor cortical signals using a long short-term memory (LSTM) RNN trained across multiple days of multi-electrode recordings. Running simulations with previously recorded intracortical signals from three BrainGate iBCI trial participants, we demonstrate an RNN that can substantially increase the bits-per-second metric in a high-speed cursor-based target selection task as well as a challenging small-target high-accuracy task when compared to a Kalman decoder. These results indicate that RNN decoding applied to human intracortical signals could achieve substantial performance advances in continuous 2-D cursor control and motivate a real-time RNN implementation for online evaluation by individuals with tetraplegia. |
Tasks | |
Published | 2018-12-24 |
URL | http://arxiv.org/abs/1812.09835v1 |
http://arxiv.org/pdf/1812.09835v1.pdf | |
PWC | https://paperswithcode.com/paper/bci-decoder-performance-comparison-of-an-lstm |
Repo | |
Framework | |
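For readers unfamiliar with the linear Kalman decoder used as the baseline above, the following is a generic textbook predict/update step, not the decoder from the paper. The function `kalman_step` and all matrix names are assumptions for illustration; in a cursor-decoding setting `x` would be the intended velocity and `y` the observed neural features.

```python
import numpy as np


def kalman_step(x, P, y, A, W, H, Q):
    """One predict/update step of a linear Kalman filter.

    x, P : previous state estimate and its covariance
    y    : current observation
    A, W : state transition matrix and process noise covariance
    H, Q : observation matrix and observation noise covariance
    """
    # Predict the next state from the linear dynamics.
    x_pred = A @ x
    P_pred = A @ P @ A.T + W
    # Update the prediction using the new observation.
    S = H @ P_pred @ H.T + Q                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


# Toy 2-d example with noiseless observations (Q = 0), values made up.
I2 = np.eye(2)
x1, P1 = kalman_step(np.zeros(2), I2, np.array([1.0, 2.0]),
                     A=I2, W=0.1 * I2, H=I2, Q=np.zeros((2, 2)))
```

With a zero observation noise covariance and an identity observation matrix, the filter trusts the observation completely, so `x1` equals `y`. An LSTM decoder replaces this fixed linear update with a learned nonlinear recurrence.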
Semantically Enhanced Dynamic Bayesian Network for Detecting Sepsis Mortality Risk in ICU Patients with Infection
Title | Semantically Enhanced Dynamic Bayesian Network for Detecting Sepsis Mortality Risk in ICU Patients with Infection |
Authors | Tony Wang, Tom Velez, Emilia Apostolova, Tim Tschampel, Thuy L. Ngo, Joy Hardison |
Abstract | Although timely sepsis diagnosis and prompt interventions in Intensive Care Unit (ICU) patients are associated with reduced mortality, early clinical recognition is frequently impeded by non-specific signs of infection and failure to detect signs of sepsis-induced organ dysfunction in a constellation of dynamically changing physiological data. The goal of this work is to identify patients at risk of life-threatening sepsis utilizing a data-centered and machine learning-driven approach. We derive a mortality risk predictive dynamic Bayesian network (DBN) guided by a customized sepsis knowledgebase and compare the predictive accuracy of the derived DBN with the Sepsis-related Organ Failure Assessment (SOFA) score, the Quick SOFA (qSOFA) score, the Simplified Acute Physiological Score (SAPS-II) and the Modified Early Warning Score (MEWS) tools. A customized sepsis ontology was used to derive the DBN node structure and semantically characterize temporal features derived from both structured physiological data and unstructured clinical notes. We assessed the performance in predicting mortality risk of the DBN predictive model and compared performance to other models using Receiver Operating Characteristic (ROC) curves, area under curve (AUROC), calibration curves, and risk distributions. The derived dataset consists of 24,506 ICU stays from 19,623 patients with evidence of suspected infection, with 2,829 patients deceased at discharge. The DBN AUROC was found to be 0.91, which outperformed the SOFA (0.843), qSOFA (0.66), MEWS (0.73), and SAPS-II (0.77) scoring tools. Continuous Net Reclassification Index and Integrated Discrimination Improvement analysis supported the superiority of the DBN. Compared with conventional rule-based risk scoring tools, the sepsis knowledgebase-driven DBN algorithm offers improved performance for predicting mortality of infected patients in ICUs. |
Tasks | Calibration |
Published | 2018-06-26 |
URL | http://arxiv.org/abs/1806.10174v1 |
http://arxiv.org/pdf/1806.10174v1.pdf | |
PWC | https://paperswithcode.com/paper/semantically-enhanced-dynamic-bayesian |
Repo | |
Framework | |
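The comparison above is reported in terms of AUROC. As a reminder of what that number measures, here is a small sketch that computes it as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy scores are assumptions for illustration and are unrelated to the paper's data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy risk scores and outcomes (1 = died, 0 = survived); values are made up.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = auroc(toy_scores, toy_labels)   # 3 of 4 positive/negative pairs ordered correctly: 0.75
```

The pairwise loop is quadratic in the number of cases, so for large cohorts a rank-based implementation would be preferable. An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation.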
Neural models of factuality
Title | Neural models of factuality |
Authors | Rachel Rudinger, Aaron Steven White, Benjamin Van Durme |
Abstract | We present two neural models for event factuality prediction, which yield significant performance gains over previous models on three event factuality datasets: FactBank, UW, and MEANTIME. We also present a substantial expansion of the It Happened portion of the Universal Decompositional Semantics dataset, yielding the largest event factuality dataset to date. We report model results on this extended factuality dataset as well. |
Tasks | |
Published | 2018-04-06 |
URL | http://arxiv.org/abs/1804.02472v1 |
http://arxiv.org/pdf/1804.02472v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-models-of-factuality |
Repo | |
Framework | |
Information Geometry of Orthogonal Initializations and Training
Title | Information Geometry of Orthogonal Initializations and Training |
Authors | Piotr A. Sokol, Il Memming Park |
Abstract | Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near $\ell_2$ isometries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that orthogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, from which we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, motivated by experimental results we show that a low condition number of the FIM is not predictive of faster learning. |
Tasks | |
Published | 2018-10-09 |
URL | https://arxiv.org/abs/1810.03785v2 |
https://arxiv.org/pdf/1810.03785v2.pdf | |
PWC | https://paperswithcode.com/paper/information-geometry-of-orthogonal |
Repo | |
Framework | |
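To make the isometry property mentioned in the abstract above concrete, the sketch below draws a random orthogonal weight matrix using a QR decomposition and checks that it preserves the $\ell_2$ norm of an input vector. This is a common way to build orthogonal initializations, offered here as an assumption rather than as the procedure used in the paper; `orthogonal_init` is a name chosen for this example.

```python
import numpy as np


def orthogonal_init(n, rng):
    """Return a random n x n orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Flip column signs using the diagonal of r so the result does not depend
    # on the sign convention of the QR routine.
    return q * np.sign(np.diag(r))


rng = np.random.default_rng(0)
w = orthogonal_init(64, rng)
x = rng.standard_normal(64)
norm_in = np.linalg.norm(x)
norm_out = np.linalg.norm(w @ x)   # equals norm_in up to floating-point error
```

Because `w.T @ w` is the identity, a linear layer with these weights neither shrinks nor amplifies its input, which is the norm-preservation property the abstract relates to faster training. Nonlinearities and non-square layers complicate this picture and are outside the scope of the sketch.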
How much should you ask? On the question structure in QA systems
Title | How much should you ask? On the question structure in QA systems |
Authors | Dominika Basaj, Barbara Rychalska, Przemyslaw Biecek, Anna Wroblewska |
Abstract | Datasets that boosted state-of-the-art solutions for Question Answering (QA) systems prove that it is possible to ask questions in a natural-language manner. However, users are still used to query-like systems where they type in keywords to search for an answer. In this study we validate which parts of questions are essential for obtaining a valid answer. To determine this, we take advantage of LIME, a framework that explains predictions by local approximation. We find that grammar and natural language are disregarded by QA. A state-of-the-art model can answer properly even if ‘asked’ only with a few words with high coefficients calculated with LIME. To our knowledge, this is the first time that a QA model has been explained with LIME. |
Tasks | Question Answering |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03734v1 |
http://arxiv.org/pdf/1809.03734v1.pdf | |
PWC | https://paperswithcode.com/paper/how-much-should-you-ask-on-the-question-1 |
Repo | |
Framework | |
Purely Geometric Scene Association and Retrieval - A Case for Macro Scale 3D Geometry
Title | Purely Geometric Scene Association and Retrieval - A Case for Macro Scale 3D Geometry |
Authors | Rahul Sawhney, Fuxin Li, Henrik I. Christensen, Charles L. Isbell |
Abstract | We address the problems of measuring geometric similarity between 3D scenes, represented through point clouds or range data frames, and associating them. Our approach leverages macro-scale 3D structural geometry - the relative configuration of arbitrary surfaces and relationships among structures that are potentially far apart. We express such discriminative information in a viewpoint-invariant feature space. These are subsequently encoded in a frame-level signature that can be utilized to measure geometric similarity. Such a characterization is robust to noise, incomplete and partially overlapping data besides viewpoint changes. We show how it can be employed to select a diverse set of data frames which have structurally similar content, and how to validate whether views with similar geometric content are from the same scene. The problem is formulated as one of general purpose retrieval from an unannotated, spatio-temporally unordered database. Empirical analysis indicates that the presented approach thoroughly outperforms baselines on depth / range data. Its depth-only performance is competitive with state-of-the-art approaches with RGB or RGB-D inputs, including ones based on deep learning. Experiments show retrieval performance to hold up well with much sparser databases, which is indicative of the approach’s robustness. The approach generalized well - it did not require dataset specific training, and scaled up in our experiments. Finally, we also demonstrate how geometrically diverse selection of views can result in richer 3D reconstructions. |
Tasks | |
Published | 2018-08-03 |
URL | http://arxiv.org/abs/1808.01343v1 |
http://arxiv.org/pdf/1808.01343v1.pdf | |
PWC | https://paperswithcode.com/paper/purely-geometric-scene-association-and |
Repo | |
Framework | |
Detecting The Objects on The Road Using Modular Lightweight Network
Title | Detecting The Objects on The Road Using Modular Lightweight Network |
Authors | Sen Cao, Yazhou Liu, Pongsak Lasang, Shengmei Shen |
Abstract | This paper presents a modular lightweight network model for road object detection, such as car, pedestrian and cyclist, especially when they are far away from the camera and their sizes are small. Great advances have been made for the deep networks, but small object detection is still a challenging task. In order to solve this problem, the majority of existing methods utilize complicated networks or bigger image sizes, which generally leads to higher computation cost. The proposed network model is referred to as modular feature fusion detector (MFFD), using a fast and efficient network architecture for detecting small objects. The contribution lies in the following aspects: 1) Two base modules have been designed for efficient computation: the Front module reduces the information loss from raw input images; the Tinier module decreases model size and computation cost, while ensuring the detection accuracy. 2) By stacking the base modules, we design a context features fusion framework for multi-scale object detection. 3) The proposed method is efficient in terms of model size and computation cost, which is applicable for resource limited devices, such as embedded systems for advanced driver assistance systems (ADAS). Comparisons with the state of the art on the challenging KITTI dataset reveal the superiority of the proposed method. In particular, 100 fps can be achieved on embedded GPUs such as Jetson TX2. |
Tasks | Object Detection |
Published | 2018-11-16 |
URL | http://arxiv.org/abs/1811.06641v1 |
http://arxiv.org/pdf/1811.06641v1.pdf | |
PWC | https://paperswithcode.com/paper/detecting-the-objects-on-the-road-using |
Repo | |
Framework | |
A Semi-supervised Spatial Spectral Regularized Manifold Local Scaling Cut With HGF for Dimensionality Reduction of Hyperspectral Images
Title | A Semi-supervised Spatial Spectral Regularized Manifold Local Scaling Cut With HGF for Dimensionality Reduction of Hyperspectral Images |
Authors | Ramanarayan Mohanty, SL Happy, Aurobinda Routray |
Abstract | Hyperspectral images (HSI) contain a wealth of information over hundreds of contiguous spectral bands, making it possible to classify materials through subtle spectral discrepancies. However, the classification of this rich spectral information is accompanied by challenges such as high dimensionality, singularity, limited training samples, lack of labeled data samples, heteroscedasticity and nonlinearity. To address these challenges, we propose a semi-supervised graph based dimensionality reduction method named ‘semi-supervised spatial spectral regularized manifold local scaling cut’ (S3RMLSC). The underlying idea of the proposed method is to exploit the limited labeled information from both the spectral and spatial domains along with the abundant unlabeled samples to facilitate the classification task by retaining the original distribution of the data. In S3RMLSC, a hierarchical guided filter (HGF) is initially used to smooth the pixels of the HSI data to preserve the spatial pixel consistency. This step is followed by the construction of linear patches from the nonlinear manifold by using the maximal linear patch (MLP) criterion. Then the inter-patch and intra-patch dissimilarity matrices are constructed in both spectral and spatial domains by regularized manifold local scaling cut (RMLSC) and neighboring pixel manifold local scaling cut (NPMLSC) respectively. Finally, we obtain the projection matrix by optimizing the updated semi-supervised spatial-spectral between-patch and total-patch dissimilarity. The effectiveness of the proposed DR algorithm is illustrated with publicly available real-world HSI datasets. |
Tasks | Dimensionality Reduction |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08223v1 |
http://arxiv.org/pdf/1811.08223v1.pdf | |
PWC | https://paperswithcode.com/paper/a-semi-supervised-spatial-spectral |
Repo | |
Framework | |