Paper Group ANR 1442
Inferring Compact Representations for Efficient Natural Language Understanding of Robot Instructions
Title | Inferring Compact Representations for Efficient Natural Language Understanding of Robot Instructions |
Authors | Siddharth Patki, Andrea F. Daniele, Matthew R. Walter, Thomas M. Howard |
Abstract | The speed and accuracy with which robots are able to interpret natural language is fundamental to realizing effective human-robot interaction. A great deal of attention has been paid to developing models and approximate inference algorithms that improve the efficiency of language understanding. However, existing methods still attempt to reason over a representation of the environment that is flat and unnecessarily detailed, which limits scalability. An open problem is then to develop methods capable of producing the most compact environment model sufficient for accurate and efficient natural language understanding. We propose a model that leverages environment-related information encoded within instructions to identify the subset of observations and perceptual classifiers necessary to perceive a succinct, instruction-specific environment representation. The framework uses three probabilistic graphical models trained from a corpus of annotated instructions to infer salient scene semantics, perceptual classifiers, and grounded symbols. Experimental results on two robots operating in different environments demonstrate that by exploiting the content and the structure of the instructions, our method learns compact environment representations that significantly improve the efficiency of natural language symbol grounding. |
Tasks | |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.09243v1 |
http://arxiv.org/pdf/1903.09243v1.pdf | |
PWC | https://paperswithcode.com/paper/inferring-compact-representations-for |
Repo | |
Framework | |
Universal Lipschitz Approximation in Bounded Depth Neural Networks
Title | Universal Lipschitz Approximation in Bounded Depth Neural Networks |
Authors | Jeremy E. J. Cohen, Todd Huster, Ra Cohen |
Abstract | Adversarial attacks against machine learning models are a significant obstacle to our increasing reliance on these models. Because of this, provably robust (certified) machine learning models are a major topic of interest. Lipschitz-continuous models present a promising approach to solving this problem. By leveraging the expressive power of a variant of neural networks which maintain low Lipschitz constants, we prove that three-layer neural networks using the FullSort activation function are Universal Lipschitz function Approximators (ULAs). This both explains experimental results and paves the way for the creation of better certified models going forward. We conclude by presenting experimental results which suggest that ULAs are not just a novelty, but a competitive approach to providing certified classifiers, and we use these results to motivate several potential topics of further research. |
Tasks | |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04861v1 |
http://arxiv.org/pdf/1904.04861v1.pdf | |
PWC | https://paperswithcode.com/paper/universal-lipschitz-approximation-in-bounded |
Repo | |
Framework | |
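A minimal sketch of the FullSort activation the abstract refers to, written in PyTorch. Sorting the pre-activations is a (data-dependent) permutation of coordinates, so the activation is 1-Lipschitz; the layer widths below are illustrative, and the weight-norm constraints the paper's construction also relies on are omitted for brevity.

```python
# Minimal sketch: a FullSort activation that simply sorts each layer's
# pre-activations. A sort is a permutation of coordinates, hence 1-Lipschitz.
import torch
import torch.nn as nn


class FullSort(nn.Module):
    def forward(self, x):
        # Sort along the feature dimension; a permutation, so norms are preserved.
        return torch.sort(x, dim=-1).values


net = nn.Sequential(
    nn.Linear(8, 16), FullSort(),
    nn.Linear(16, 16), FullSort(),
    nn.Linear(16, 1),
)

x = torch.randn(4, 8)
print(net(x).shape)  # torch.Size([4, 1])
```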
Specifying Weight Priors in Bayesian Deep Neural Networks with Empirical Bayes
Title | Specifying Weight Priors in Bayesian Deep Neural Networks with Empirical Bayes |
Authors | Ranganath Krishnan, Mahesh Subedar, Omesh Tickoo |
Abstract | Stochastic variational inference for Bayesian deep neural networks (DNNs) requires specifying priors and approximate posterior distributions over neural network weights. Specifying meaningful weight priors is a challenging problem, particularly when scaling variational inference to deeper architectures involving high-dimensional weight spaces. We propose the MOdel Priors with Empirical Bayes using DNN (MOPED) method to choose informed weight priors in Bayesian neural networks. We formulate a two-stage hierarchical model: we first find the maximum likelihood estimates of the weights with a deterministic DNN, and then set the weight priors using an empirical Bayes approach before inferring the posterior with variational inference. We empirically evaluate the proposed approach on real-world tasks including image classification, video activity recognition, and audio classification with neural network architectures of varying complexity. We also evaluate our proposed approach on a diabetic retinopathy diagnosis task and benchmark against state-of-the-art Bayesian deep learning techniques. We demonstrate that the MOPED method enables scalable variational inference and provides reliable uncertainty quantification. |
Tasks | Activity Recognition, Audio Classification, Bayesian Inference, Image Classification |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05323v3 |
https://arxiv.org/pdf/1906.05323v3.pdf | |
PWC | https://paperswithcode.com/paper/moped-efficient-priors-for-scalable |
Repo | |
Framework | |
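A minimal sketch of the two-stage empirical-Bayes idea in the abstract, written in PyTorch: fit a deterministic model to obtain maximum-likelihood weights, then use those weights as the mean of a Gaussian prior (and as the initialization of a mean-field variational posterior) before optimizing the ELBO. The tiny regression task, the prior scale `delta`, and the single Bayesian layer are illustrative assumptions, not the paper's exact settings.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence

# Stage 1: maximum likelihood estimate with an ordinary deterministic layer.
torch.manual_seed(0)
x = torch.randn(256, 4)
y = x @ torch.tensor([[1.0], [-2.0], [0.5], [0.0]]) + 0.1 * torch.randn(256, 1)
mle = nn.Linear(4, 1)
opt = torch.optim.Adam(mle.parameters(), lr=0.05)
for _ in range(500):
    opt.zero_grad()
    nn.functional.mse_loss(mle(x), y).backward()
    opt.step()

# Stage 2: empirical-Bayes prior centred at the MLE weights (delta is assumed).
delta = 0.1
w_mle = mle.weight.detach().flatten()
prior = Normal(w_mle, delta * w_mle.abs() + 1e-6)

# Mean-field variational posterior over the same weights, initialised at the MLE.
q_mu = w_mle.clone().requires_grad_(True)
q_rho = torch.full_like(w_mle, -4.0, requires_grad=True)  # std = softplus(rho)
opt = torch.optim.Adam([q_mu, q_rho], lr=0.01)
for _ in range(500):
    opt.zero_grad()
    q = Normal(q_mu, nn.functional.softplus(q_rho))
    w = q.rsample()                                   # reparameterised weight sample
    nll = nn.functional.mse_loss(x @ w.unsqueeze(1), y, reduction="sum")
    loss = nll + kl_divergence(q, prior).sum()        # negative ELBO
    loss.backward()
    opt.step()

print("posterior mean:", q_mu.detach())
```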
Fault Sneaking Attack: a Stealthy Framework for Misleading Deep Neural Networks
Title | Fault Sneaking Attack: a Stealthy Framework for Misleading Deep Neural Networks |
Authors | Pu Zhao, Siyue Wang, Cheng Gongye, Yanzhi Wang, Yunsi Fei, Xue Lin |
Abstract | Despite the great achievements of deep neural networks (DNNs), the vulnerability of state-of-the-art DNNs raises security concerns in many application domains requiring high reliability. We propose the fault sneaking attack on DNNs, where the adversary aims to misclassify certain input images into any target labels by modifying the DNN parameters. We apply ADMM (alternating direction method of multipliers) to solve the optimization problem of the fault sneaking attack with two constraints: 1) the classification of the other images should be unchanged and 2) the parameter modifications should be minimized. Specifically, the first constraint requires us not only to inject designated faults (misclassifications), but also to hide the faults for stealthy or sneaking considerations by maintaining model accuracy. The second constraint requires us to minimize the parameter modifications (using the L0 norm to measure the number of modifications and the L2 norm to measure their magnitude). Comprehensive experimental evaluation demonstrates that the proposed framework can inject multiple sneaking faults without losing overall test accuracy. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12032v1 |
https://arxiv.org/pdf/1905.12032v1.pdf | |
PWC | https://paperswithcode.com/paper/fault-sneaking-attack-a-stealthy-framework |
Repo | |
Framework | |
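A simplified sketch of the fault-sneaking objective in PyTorch. The paper solves the problem with ADMM under explicit L0/L2 constraints; the sketch below only illustrates the objective, optimizing a weight perturbation with plain penalty terms and Adam on a toy linear classifier.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
W = torch.randn(10, 32)                      # "trained" last-layer weights (toy)
x_all = torch.randn(200, 32)
orig_pred = (x_all @ W.t()).argmax(dim=1)    # predictions to be preserved

x_atk = x_all[:5]                            # inputs to misclassify
y_atk = torch.randint(0, 10, (5,))           # attacker-chosen target labels

delta = torch.zeros_like(W, requires_grad=True)     # weight modification
opt = torch.optim.Adam([delta], lr=0.01)
for _ in range(2000):
    opt.zero_grad()
    logits_all = x_all @ (W + delta).t()
    loss = (F.cross_entropy(logits_all[:5], y_atk)            # inject designated faults
            + F.cross_entropy(logits_all[5:], orig_pred[5:])  # stay stealthy on the rest
            + 1e-3 * delta.pow(2).sum())                      # keep the modification small
    loss.backward()
    opt.step()

new_pred = (x_all @ (W + delta).t()).argmax(dim=1)
print("targets hit      :", (new_pred[:5] == y_atk).float().mean().item())
print("others preserved :", (new_pred[5:] == orig_pred[5:]).float().mean().item())
```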
Aerial hyperspectral imagery and deep neural networks for high-throughput yield phenotyping in wheat
Title | Aerial hyperspectral imagery and deep neural networks for high-throughput yield phenotyping in wheat |
Authors | Ali Moghimi, Ce Yang, James A. Anderson |
Abstract | Crop production needs to increase in a sustainable manner to meet the growing global demand for food. To identify crop varieties with high yield potential, plant scientists and breeders evaluate the performance of hundreds of lines in multiple locations over several years. To facilitate the process of selecting advanced varieties, an automated framework was developed in this study. A hyperspectral camera was mounted on an unmanned aerial vehicle to collect aerial imagery with high spatial and spectral resolution. Aerial images were captured in two consecutive growing seasons from three experimental yield fields composed of hundreds of experimental plots (1 x 2.4 m), each containing a single wheat line. The grain from more than a thousand wheat plots was harvested by a combine, weighed, and recorded as the ground truth data. To leverage the high spatial resolution and investigate the yield variation within the plots, images of plots were divided into sub-plots by integrating image processing techniques and spectral mixture analysis with expert domain knowledge. Afterwards, the sub-plot dataset was divided into train, validation, and test sets using stratified sampling. After extracting features from each sub-plot, deep neural networks were trained for yield estimation. The coefficient of determination for predicting the yield of the test dataset at sub-plot scale was 0.79, with a root mean square error of 5.90 grams. In addition to providing insights into yield variation at sub-plot scale, the proposed framework can facilitate the process of high-throughput yield phenotyping as a valuable decision support tool. It offers the possibility of (i) remote visual inspection of the plots, (ii) studying the effect of crop density on yield, and (iii) optimizing plot size to investigate more lines in a dedicated field each year. |
Tasks | |
Published | 2019-06-23 |
URL | https://arxiv.org/abs/1906.09666v1 |
https://arxiv.org/pdf/1906.09666v1.pdf | |
PWC | https://paperswithcode.com/paper/aerial-hyperspectral-imagery-and-deep-neural |
Repo | |
Framework | |
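A minimal sketch of the evaluation pipeline described in the abstract, using scikit-learn: a stratified train/test split over yield bins, a small neural-network regressor on per-sub-plot features, and R² / RMSE on the held-out set. The synthetic features and the model size are placeholders for the paper's hyperspectral features and architecture.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 20))                       # stand-in for spectral features
yield_g = 40 + X[:, :5].sum(axis=1) * 3 + rng.normal(scale=4, size=1200)

# Stratify on yield quantile bins so train/test cover the full yield range.
bins = np.digitize(yield_g, np.quantile(yield_g, [0.25, 0.5, 0.75]))
X_tr, X_te, y_tr, y_te = train_test_split(
    X, yield_g, test_size=0.2, stratify=bins, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R^2 :", round(r2_score(y_te, pred), 3))
print("RMSE:", round(mean_squared_error(y_te, pred) ** 0.5, 2), "grams")
```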
Projective Decomposition and Matrix Equivalence up to Scale
Title | Projective Decomposition and Matrix Equivalence up to Scale |
Authors | Max Robinson |
Abstract | A data matrix may be seen simply as a means of organizing observations into rows (e.g., by measured object) and into columns (e.g., by measured variable) so that the observations can be analyzed with mathematical tools. As a mathematical object, a matrix defines a linear mapping between points representing weighted combinations of its rows (the row vector space) and points representing weighted combinations of its columns (the column vector space). From this perspective, a data matrix defines a relationship between the information that labels its rows and the information that labels its columns, and numerical methods are used to analyze this relationship. A first step is to normalize the data, transforming each observation from scales convenient for measurement to a common scale, on which addition and multiplication can meaningfully combine the different observations. For example, z-transformation rescales every variable to the same scale, standardized variation from an expected value, but ignores scale differences between measured objects. Here we develop the concepts and properties of projective decomposition, which applies the same normalization strategy to both rows and columns by separating the matrix into row- and column-scaling factors and a scale-normalized matrix. We show that different scalings of the same scale-normalized matrix form an equivalence class, and call the scale-normalized, canonical member of the class its scale-invariant form, which preserves all pairwise relative ratios. Projective decomposition therefore provides a means of normalizing the broad class of ratio-scale data, in which relative ratios are of primary interest, onto a common scale without altering the ratios of interest, while simultaneously accounting for scale effects in both organizations of the matrix values. Both of these properties distinguish it from z-transformation. |
Tasks | |
Published | 2019-01-04 |
URL | http://arxiv.org/abs/1901.01336v1 |
http://arxiv.org/pdf/1901.01336v1.pdf | |
PWC | https://paperswithcode.com/paper/projective-decomposition-and-matrix |
Repo | |
Framework | |
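An illustrative sketch of the row/column scaling idea in NumPy: factor X as diag(r) · W · diag(c) with a "scale-normalized" core W. The normalization target used here (unit root-mean-square for every row and column of W) and the simple alternating update are assumptions made for the sketch, not necessarily the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.exp(rng.normal(size=(6, 8)))          # positive, ratio-scale-like data

r = np.ones(X.shape[0])                      # row-scaling factors
c = np.ones(X.shape[1])                      # column-scaling factors
for _ in range(200):
    W = X / np.outer(r, c)
    r *= np.sqrt((W ** 2).mean(axis=1))      # push row RMS of W toward 1
    W = X / np.outer(r, c)
    c *= np.sqrt((W ** 2).mean(axis=0))      # push column RMS of W toward 1

W = X / np.outer(r, c)                       # scale-normalized core
print("row RMS   :", np.round(np.sqrt((W ** 2).mean(axis=1)), 3))
print("column RMS:", np.round(np.sqrt((W ** 2).mean(axis=0)), 3))
print("max reconstruction error:", np.abs(np.outer(r, c) * W - X).max())
```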
Deep Clustering for Mars Rover image datasets
Title | Deep Clustering for Mars Rover image datasets |
Authors | Vikas Ramachandra |
Abstract | In this paper, we build autoencoders to learn a latent space from unlabeled image datasets obtained from the Mars rover. Then, once the latent feature space has been learnt, we use k-means to cluster the data. We test the performance of the algorithm on a smaller labeled dataset, and report good accuracy and concordance with the ground truth labels. This is the first attempt to use deep learning based unsupervised algorithms to cluster Mars Rover images. This algorithm can be used to augment human annotations for such datasets (which are time consuming) and speed up the generation of ground truth labels for Mars Rover image data, and potentially other planetary and space images. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.06623v1 |
https://arxiv.org/pdf/1911.06623v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-clustering-for-mars-rover-image-datasets |
Repo | |
Framework | |
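A minimal sketch of the two-stage pipeline in the abstract: learn a latent space with an autoencoder (PyTorch), then cluster the latent codes with k-means (scikit-learn). Random images stand in for rover imagery, and the architecture sizes and number of clusters are illustrative.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

torch.manual_seed(0)
images = torch.rand(500, 1, 32, 32)          # placeholder for rover images

encoder = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128), nn.ReLU(),
                        nn.Linear(128, 16))
decoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                        nn.Linear(128, 32 * 32))

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
for _ in range(200):                         # quick reconstruction training
    opt.zero_grad()
    z = encoder(images)
    recon = decoder(z).view_as(images)
    nn.functional.mse_loss(recon, images).backward()
    opt.step()

with torch.no_grad():
    latents = encoder(images).numpy()        # learnt latent features
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(latents)
print("cluster sizes:", torch.bincount(torch.tensor(labels)).tolist())
```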
Inducing Syntactic Trees from BERT Representations
Title | Inducing Syntactic Trees from BERT Representations |
Authors | Rudolf Rosa, David Mareček |
Abstract | We use the English model of BERT and explore how a deletion of one word in a sentence changes representations of other words. Our hypothesis is that removing a reducible word (e.g. an adjective) does not affect the representation of other words so much as removing e.g. the main verb, which makes the sentence ungrammatical and of “high surprise” for the language model. We estimate reducibilities of individual words and also of longer continuous phrases (word n-grams), study their syntax-related properties, and then also use them to induce full dependency trees. |
Tasks | Language Modelling |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11511v1 |
https://arxiv.org/pdf/1906.11511v1.pdf | |
PWC | https://paperswithcode.com/paper/inducing-syntactic-trees-from-bert |
Repo | |
Framework | |
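A minimal sketch of the probing procedure, using the Hugging Face transformers library: delete one word at a time, re-encode the sentence, and measure how much the remaining words' BERT representations move. The toy sentence is chosen so that every word maps to a single WordPiece token (so positions stay aligned after deletion), and the mean L2 distance used as the score is an assumption for illustration.

```python
# pip install torch transformers
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def hidden_states(words):
    enc = tokenizer(" ".join(words), return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    return out.last_hidden_state[0, 1:-1]    # drop [CLS]/[SEP]

words = "the big dog slowly ran to the red house".split()
base = hidden_states(words)
assert base.shape[0] == len(words)           # toy sentence: one token per word

for i, w in enumerate(words):
    reduced = hidden_states(words[:i] + words[i + 1:])
    kept = torch.cat([base[:i], base[i + 1:]])          # same words, before deletion
    score = torch.norm(kept - reduced, dim=-1).mean().item()
    print(f"{w:>8s}  change in remaining representations: {score:.3f}")
```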
Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence
Title | Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence |
Authors | Dahyun Kim, Jihwan Bae, Yeonsik Jo, Jonghyun Choi |
Abstract | Incremental learning suffers from two challenging problems: forgetting of old knowledge and intransigence in learning new knowledge. Predictions by a model incrementally learned with a subset of the dataset are thus uncertain, and the uncertainty accumulates across tasks through knowledge transfer. To prevent overfitting to this uncertain knowledge, we propose to penalize confident fitting to it with a Maximum Entropy Regularizer (MER). Additionally, to reduce class imbalance and induce a self-paced curriculum on new classes, we exclude a few samples from the new classes in every mini-batch, which we call DropOut Sampling (DOS). We further rethink evaluation metrics for forgetting and intransigence in incremental learning by tracking each sample’s confusion at the transition of a task, since the existing metrics that compute the difference in accuracy are often misleading. We show that the proposed method, named ‘MEDIC’, outperforms state-of-the-art incremental learning algorithms in accuracy, forgetting, and intransigence measured by both the existing and the proposed metrics by a large margin in extensive empirical validations on CIFAR100 and a popular subset of the ImageNet dataset (TinyImageNet). |
Tasks | Transfer Learning |
Published | 2019-02-03 |
URL | http://arxiv.org/abs/1902.00829v1 |
http://arxiv.org/pdf/1902.00829v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-learning-with-maximum-entropy |
Repo | |
Framework | |
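A minimal sketch of a maximum-entropy regularizer in PyTorch: the usual cross-entropy loss minus a weighted predictive-entropy term, so confident (low-entropy) predictions are penalized. The weight `lam` and the toy batch are illustrative; the paper's DropOut Sampling and class-incremental protocol are not reproduced here.

```python
import torch
import torch.nn.functional as F

def mer_loss(logits, targets, lam=0.1):
    ce = F.cross_entropy(logits, targets)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    return ce - lam * entropy   # maximizing entropy discourages overconfident fitting

logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
loss = mer_loss(logits, targets)
loss.backward()
print(float(loss))
```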
Snooping Attacks on Deep Reinforcement Learning
Title | Snooping Attacks on Deep Reinforcement Learning |
Authors | Matthew Inkawhich, Yiran Chen, Hai Li |
Abstract | Adversarial attacks have exposed a significant security vulnerability in state-of-the-art machine learning models. Among these models are deep reinforcement learning agents. The existing methods for attacking reinforcement learning agents assume the adversary either has access to the target agent’s learned parameters or to the environment that the agent interacts with. In this work, we propose a new class of threat models, called snooping threat models, that are unique to reinforcement learning. In these snooping threat models, the adversary does not have the ability to interact with the target agent’s environment, and can only eavesdrop on the action and reward signals being exchanged between agent and environment. We show that adversaries operating in these highly constrained threat models can still launch devastating attacks against the target agent by training proxy models on related tasks and leveraging the transferability of adversarial examples. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11832v2 |
https://arxiv.org/pdf/1905.11832v2.pdf | |
PWC | https://paperswithcode.com/paper/snooping-attacks-on-deep-reinforcement |
Repo | |
Framework | |
Deep Reason: A Strong Baseline for Real-World Visual Reasoning
Title | Deep Reason: A Strong Baseline for Real-World Visual Reasoning |
Authors | Chenfei Wu, Yanzhao Zhou, Gen Li, Nan Duan, Duyu Tang, Xiaojie Wang |
Abstract | This paper presents a strong baseline for real-world visual reasoning (GQA), which achieves 60.93% accuracy in the GQA 2019 challenge and won sixth place. GQA is a large dataset with 22M questions involving spatial understanding and multi-step inference. To help further research in this area, we identify three crucial components that improve performance, namely: multi-source features, a fine-grained encoder, and a score-weighted ensemble. We provide a series of analyses of their impact on performance. |
Tasks | Visual Reasoning |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10226v2 |
https://arxiv.org/pdf/1905.10226v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-reason-a-strong-baseline-for-real-world |
Repo | |
Framework | |
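A tiny sketch of a score-weighted ensemble of the kind mentioned in the abstract: each model's answer distribution is weighted by its validation score before the distributions are summed and the highest-scoring answer is taken. The scores and distributions below are made up for illustration.

```python
import numpy as np

val_scores = np.array([0.58, 0.60, 0.61])            # per-model validation accuracy
answer_probs = np.array([                            # per-model answer distributions
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
])
ensemble = (val_scores[:, None] * answer_probs).sum(axis=0)
print("ensemble answer index:", int(ensemble.argmax()))
```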
Regular Partitions and Their Use in Structural Pattern Recognition
Title | Regular Partitions and Their Use in Structural Pattern Recognition |
Authors | Marco Fiorucci |
Abstract | Recent years are characterized by an unprecedented quantity of available network data, produced at an astonishing rate by a heterogeneous variety of interconnected sensors and devices. This high-throughput generation calls for the development of new effective methods to store, retrieve, understand and process massive network data. In this thesis, we tackle this challenge by introducing a framework to summarize large graphs based on Szemerédi’s Regularity Lemma (RL), which roughly states that any sufficiently large graph can almost entirely be partitioned into a bounded number of random-like bipartite graphs. The partition resulting from the RL gives rise to a summary, which inherits many of the essential structural properties of the original graph. We first extend a heuristic version of the RL to improve its efficiency and its robustness. We use the proposed algorithm to address graph-based clustering and image segmentation tasks. In the second part of the thesis, we introduce a new heuristic algorithm which improves the summary quality both in terms of reconstruction error and of noise filtering. We use the proposed heuristic to address the graph search problem defined under a similarity measure. Finally, we study the link among the regularity lemma, the stochastic block model and the minimum description length. This study provides us with a principled way to develop a graph decomposition algorithm based on the stochastic block model, fitted using likelihood maximization. |
Tasks | Semantic Segmentation |
Published | 2019-09-16 |
URL | https://arxiv.org/abs/1909.07420v2 |
https://arxiv.org/pdf/1909.07420v2.pdf | |
PWC | https://paperswithcode.com/paper/regular-partitions-and-their-use-in |
Repo | |
Framework | |
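An illustrative sketch of the graph-summary object in NumPy: given a partition of the vertices into blocks, the summary keeps only the edge density between every pair of blocks, and a graph can be reconstructed by filling each block pair with that density. The random graph and the fixed partition are placeholders; computing a regular (random-like) partition is the hard part and is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 120, 4
A = (rng.random((n, n)) < 0.15).astype(float)
A = np.triu(A, 1); A = A + A.T                        # simple undirected graph
blocks = np.array_split(np.arange(n), k)              # a fixed partition into k blocks

density = np.zeros((k, k))                            # the summary: block-pair densities
for i, bi in enumerate(blocks):
    for j, bj in enumerate(blocks):
        density[i, j] = A[np.ix_(bi, bj)].mean()

recon = np.zeros_like(A)                              # graph reconstructed from the summary
for i, bi in enumerate(blocks):
    for j, bj in enumerate(blocks):
        recon[np.ix_(bi, bj)] = density[i, j]

print("summary (block densities):\n", np.round(density, 3))
print("mean reconstruction error:", round(np.abs(A - recon).mean(), 4))
```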
Guided Anisotropic Diffusion and Iterative Learning for Weakly Supervised Change Detection
Title | Guided Anisotropic Diffusion and Iterative Learning for Weakly Supervised Change Detection |
Authors | Rodrigo Caye Daudt, Bertrand Le Saux, Alexandre Boulch, Yann Gousseau |
Abstract | Large scale datasets created from user labels or openly available data have become crucial to provide training data for large scale learning algorithms. While these datasets are easier to acquire, the data are frequently noisy and unreliable, which is motivating research on weakly supervised learning techniques. In this paper we propose an iterative learning method that extracts the useful information from a large scale change detection dataset generated from open vector data to train a fully convolutional network which surpasses the performance obtained by naive supervised learning. We also propose the guided anisotropic diffusion algorithm, which improves semantic segmentation results using the input images as guides to perform edge preserving filtering, and is used in conjunction with the iterative training method to improve results. |
Tasks | Semantic Segmentation |
Published | 2019-04-17 |
URL | http://arxiv.org/abs/1904.08208v1 |
http://arxiv.org/pdf/1904.08208v1.pdf | |
PWC | https://paperswithcode.com/paper/190408208 |
Repo | |
Framework | |
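An illustrative sketch of guide-driven edge-preserving diffusion in NumPy: a noisy segmentation probability map is smoothed with a Perona-Malik-style diffusivity computed from the guide image's gradients, so the map diffuses within regions but not across the guide's edges. The explicit scheme, step size, and `kappa` are assumptions for the sketch, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
regions = np.zeros((64, 64)); regions[:, 32:] = 1.0          # two ground-truth regions
guide = regions + 0.05 * rng.normal(size=regions.shape)      # guide image with one strong edge
prob = np.clip(0.2 + 0.6 * regions + 0.3 * rng.normal(size=regions.shape), 0, 1)

def grads(u):
    # differences toward the four neighbours (periodic borders, for brevity)
    return (np.roll(u, -1, 0) - u, np.roll(u, 1, 0) - u,
            np.roll(u, -1, 1) - u, np.roll(u, 1, 1) - u)

kappa, dt = 0.1, 0.2
# Diffusivity from the *guide* image: close to zero across its strong edges.
cn, cs, ce, cw = [np.exp(-(g / kappa) ** 2) for g in grads(guide)]

u = prob.copy()
for _ in range(100):
    gn, gs, ge, gw = grads(u)                 # gradients of the evolving probability map
    u = u + dt * (cn * gn + cs * gs + ce * ge + cw * gw)

print("noise (std) in left region, before/after:",
      prob[:, :30].std().round(3), u[:, :30].std().round(3))
print("left/right means after smoothing:",
      u[:, :30].mean().round(3), u[:, 34:].mean().round(3))
```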
On Robustness to Adversarial Examples and Polynomial Optimization
Title | On Robustness to Adversarial Examples and Polynomial Optimization |
Authors | Pranjal Awasthi, Abhratanu Dutta, Aravindan Vijayaraghavan |
Abstract | We study the design of computationally efficient algorithms with provable guarantees that are robust to adversarial (test time) perturbations. While there has been a proliferation of recent work on this topic due to its connections to the test-time robustness of deep networks, there is limited theoretical understanding of several basic questions, such as (i) when and how can one design provably robust learning algorithms? (ii) what is the price of achieving robustness to adversarial examples in a computationally efficient manner? The main contribution of this work is to exhibit a strong connection between achieving robustness to adversarial examples and a rich class of polynomial optimization problems, thereby making progress on the above questions. In particular, we leverage this connection to (a) design computationally efficient robust algorithms with provable guarantees for a large class of hypotheses, namely linear classifiers and degree-2 polynomial threshold functions (PTFs), (b) give a precise characterization of the price of achieving robustness in a computationally efficient manner for these classes, (c) design efficient algorithms to certify robustness and generate adversarial attacks in a principled manner for 2-layer neural networks. We empirically demonstrate the effectiveness of these attacks on real data. |
Tasks | |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04681v1 |
https://arxiv.org/pdf/1911.04681v1.pdf | |
PWC | https://paperswithcode.com/paper/on-robustness-to-adversarial-examples-and-1 |
Repo | |
Framework | |
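A worked example for the linear-classifier case referenced in part (a) of the abstract: for f(x) = sign(w·x + b), the smallest l2 perturbation that changes the prediction has norm |w·x + b| / ||w||_2 and points along w. This standard fact only illustrates certifying robustness for linear classifiers; the paper's degree-2 PTF and two-layer-network results are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.3
x = rng.normal(size=5)

margin = w @ x + b
radius = abs(margin) / np.linalg.norm(w)               # certified l2 robustness radius
delta = -1.001 * (margin / np.linalg.norm(w) ** 2) * w # just past the decision boundary

print("certified radius :", round(radius, 4))
print("||delta||_2      :", round(np.linalg.norm(delta), 4))
print("sign before/after:", np.sign(margin), np.sign(w @ (x + delta) + b))
```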
One-Shot Neural Architecture Search via Compressive Sensing
Title | One-Shot Neural Architecture Search via Compressive Sensing |
Authors | Minsu Cho, Mohammadreza Soltani, Chinmay Hegde |
Abstract | Neural architecture search (NAS), or automated design of neural network models, remains a very challenging meta-learning problem. Several recent works (called “one-shot” approaches) have focused on dramatically reducing NAS running time by leveraging proxy models that still provide architectures with competitive performance. In our work, we propose a new meta-learning algorithm that we call CoNAS, or Compressive sensing-based Neural Architecture Search. Our approach merges ideas from one-shot approaches with iterative techniques for learning low-degree sparse Boolean polynomial functions. We validate our approach on several standard test datasets, discover novel architectures hitherto unreported, and achieve competitive (or better) results in both performance and search time compared to existing NAS approaches. Further, we support our algorithm with a theoretical analysis, providing upper bounds on the number of measurements needed to perform reliable meta-learning; to our knowledge, these analysis tools are novel to the NAS literature and may be of independent interest. |
Tasks | Compressive Sensing, Meta-Learning, Neural Architecture Search |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.02869v1 |
https://arxiv.org/pdf/1906.02869v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-neural-architecture-search-via |
Repo | |
Framework | |
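An illustrative sketch of the compressive-sensing step, using NumPy and scikit-learn: evaluate a black-box "proxy accuracy" under random ±1 perturbations of the architecture encoding and recover a sparse surrogate with the Lasso from far fewer measurements than coefficients. The synthetic degree-1 sparse function below stands in for the low-degree sparse Boolean polynomial the paper actually recovers.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m = 100, 40                                         # 100 architecture choices, 40 measurements
true_coef = np.zeros(n)
true_coef[[3, 17, 42, 77]] = [0.8, -0.5, 0.6, -0.7]    # sparse ground-truth influence

alphas = rng.choice([-1.0, 1.0], size=(m, n))          # random one-shot perturbations
y = alphas @ true_coef + 0.01 * rng.normal(size=m)     # noisy proxy evaluations

lasso = Lasso(alpha=0.02, fit_intercept=False).fit(alphas, y)
support = np.flatnonzero(np.abs(lasso.coef_) > 0.1)
print("recovered support:", support)                   # should recover {3, 17, 42, 77}
```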