January 26, 2020

Paper Group ANR 1582

Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis. NASA: Neural Articulated Shape Approximation. Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing. Predicting dynamical system evolution with residual neural networks. Android Malware Detection Using Autoencoder. Gender Prediction …

Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis

Title Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis
Authors Ryan Rogers, Aaron Roth, Adam Smith, Nathan Srebro, Om Thakkar, Blake Woodworth
Abstract We design a general framework for answering adaptive statistical queries that focuses on providing explicit confidence intervals along with point estimates. Prior work in this area has either focused on providing tight confidence intervals for specific analyses, or providing general worst-case bounds for point estimates. Unfortunately, as we observe, these worst-case bounds are loose in many settings — often not even beating simple baselines like sample splitting. Our main contribution is to design a framework for providing valid, instance-specific confidence intervals for point estimates that can be generated by heuristics. When paired with good heuristics, this method gives guarantees that are orders of magnitude better than the best worst-case bounds. We provide a Python library implementing our method.
Tasks
Published 2019-06-21
URL https://arxiv.org/abs/1906.09231v2
PDF https://arxiv.org/pdf/1906.09231v2.pdf
PWC https://paperswithcode.com/paper/guaranteed-validity-for-empirical-approaches
Repo
Framework
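
The abstract names sample splitting as a simple baseline. The following is a minimal sketch of that baseline only, not of the authors' framework or their Python library. It assumes every query returns a value in [0, 1] and uses Hoeffding's inequality with a union bound over the queries; the function name and parameters are hypothetical.

```python
import math
import random

def sample_splitting_answers(data, queries, delta=0.05):
    """Answer each query on its own disjoint chunk of the data.

    Assumes every query maps a data point to a value in [0, 1].
    Returns (estimate, half_width) pairs. By Hoeffding's inequality and a
    union bound, all intervals hold at once with probability >= 1 - delta.
    """
    k = len(queries)
    m = len(data) // k  # data points available to each query
    if m == 0:
        raise ValueError("need at least one data point per query")
    half_width = math.sqrt(math.log(2 * k / delta) / (2 * m))
    answers = []
    for i, q in enumerate(queries):
        chunk = data[i * m:(i + 1) * m]  # fresh data, untouched by earlier queries
        estimate = sum(q(x) for x in chunk) / m
        answers.append((estimate, half_width))
    return answers

random.seed(0)
data = [random.random() for _ in range(10_000)]
queries = [lambda x: x, lambda x: float(x > 0.5)]
results = sample_splitting_answers(data, queries)
```

Because each chunk is used exactly once, a later query may depend on earlier answers without invalidating its interval. The cost is that the interval width shrinks with the chunk size rather than with the full sample size.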

NASA: Neural Articulated Shape Approximation

Title NASA: Neural Articulated Shape Approximation
Authors Boyang Deng, JP Lewis, Timothy Jeruzalski, Gerard Pons-Moll, Geoffrey Hinton, Mohammad Norouzi, Andrea Tagliasacchi
Abstract Efficient representation of articulated objects such as human bodies is an important problem in computer vision and graphics. To efficiently simulate deformation, existing approaches represent 3D objects using polygonal meshes and deform them using skinning techniques. This paper introduces neural articulated shape approximation (NASA), an alternative framework that enables efficient representation of articulated deformable objects using neural indicator functions that are conditioned on pose. Occupancy testing using NASA is straightforward, circumventing the complexity of meshes and the issue of water-tightness. We demonstrate the effectiveness of NASA for 3D tracking applications, and discuss other potential extensions.
Tasks
Published 2019-12-06
URL https://arxiv.org/abs/1912.03207v2
PDF https://arxiv.org/pdf/1912.03207v2.pdf
PWC https://paperswithcode.com/paper/nasa-neural-articulated-shape-approximation
Repo
Framework

Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing

Title Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing
Authors Liang Wang, Kezhi Wang, Cunhua Pan, Wei Xu, Nauman Aslam, Arumugam Nallanathan
Abstract In this paper, we consider a platform of flying mobile edge computing (F-MEC), where unmanned aerial vehicles (UAVs) serve as equipment providing computation resources and enable task offloading from user equipment (UE). We aim to minimize the energy consumption of all the UEs by optimizing the user association, resource allocation, and the trajectory of the UAVs. To this end, we first propose a Convex optimizAtion based Trajectory control algorithm (CAT), which solves the problem iteratively using the block coordinate descent (BCD) method. Then, to make real-time decisions while taking into account the dynamics of the environment (i.e., a UAV may take off from different locations), we propose a deep Reinforcement leArning based Trajectory control algorithm (RAT). In RAT, we apply Prioritized Experience Replay (PER) to improve the convergence of the training procedure. Unlike the convex optimization based algorithm, which may be susceptible to the initial points and requires iterations, RAT can be adapted to any take-off points of the UAVs and can obtain the solution more rapidly than CAT once the training process has been completed. Simulation results show that the proposed CAT and RAT achieve similar performance and both outperform traditional algorithms.
Tasks
Published 2019-11-10
URL https://arxiv.org/abs/1911.03887v1
PDF https://arxiv.org/pdf/1911.03887v1.pdf
PWC https://paperswithcode.com/paper/deep-reinforcement-learning-based-dynamic
Repo
Framework

Predicting dynamical system evolution with residual neural networks

Title Predicting dynamical system evolution with residual neural networks
Authors Artem Chashchin, Mikhail Botchev, Ivan Oseledets, George Ovchinnikov
Abstract Forecasting time series and time-dependent data is a common problem in many applications. One typical example is solving ordinary differential equation (ODE) systems $\dot{x}=F(x)$. Oftentimes the right hand side function $F(x)$ is not known explicitly and the ODE system is described by solution samples taken at some time points. Hence, ODE solvers cannot be used. In this paper, a data-driven approach to learning the evolution of dynamical systems is considered. We show how by training neural networks with ResNet-like architecture on the solution samples, models can be developed to predict the ODE system solution further in time. By evaluating the proposed approaches on three test ODE systems, we demonstrate that the neural network models are able to reproduce the main dynamics of the systems qualitatively well. Moreover, the predicted solution remains stable for much longer times than for other currently known models.
Tasks Time Series
Published 2019-10-11
URL https://arxiv.org/abs/1910.05233v1
PDF https://arxiv.org/pdf/1910.05233v1.pdf
PWC https://paperswithcode.com/paper/predicting-dynamical-system-evolution-with
Repo
Framework
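
The residual idea in the abstract, predicting the next state as the current state plus a learned correction, can be sketched in a few lines. This is an illustration under strong assumptions, not the authors' model: the learned correction here is a linear map fitted by least squares, standing in for a ResNet-like network, and the test system (a harmonic oscillator) and all names are chosen only for the example.

```python
import numpy as np

# Solution samples of the harmonic oscillator x'' = -x, state (x, v),
# taken at a fixed time step h. The right-hand side is treated as unknown.
h = 0.1
t = np.arange(200) * h
states = np.stack([np.cos(t), -np.sin(t)], axis=1)  # shape (200, 2)

# Residual model: x_{k+1} = x_k + g(x_k). Here g is linear, g(x) = x @ W,
# fitted by least squares on consecutive sample pairs.
inputs = states[:-1]
targets = states[1:] - states[:-1]
W, *_ = np.linalg.lstsq(inputs, targets, rcond=None)

def rollout(x0, steps):
    """Predict forward in time by repeatedly applying the residual update."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + xs[-1] @ W)
    return np.array(xs)

# Roll out 300 steps, i.e. beyond the 200 training samples.
pred = rollout(states[0], 300)
t_long = np.arange(301) * h
true = np.stack([np.cos(t_long), -np.sin(t_long)], axis=1)
max_error = np.abs(pred - true).max()
```

A linear correction is exact for this linear system, which is why the rollout stays accurate; for nonlinear systems the correction term would need a nonlinear model such as the networks the abstract describes.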

Android Malware Detection Using Autoencoder

Title Android Malware Detection Using Autoencoder
Authors Abdelmonim Naway, Yuancheng Li
Abstract Smartphones have become an intrinsic part of human life. The smartphone unifies diverse advanced characteristics. It enables users to store various data such as photos, health data, banking credentials, and personal information. The Android operating system is the prevalent mobile operating system and, at the same time, the operating system most targeted by malware developers. Recently, the unparalleled growth of Android malware has put pressure on researchers to propose effective methods to suppress the spread of the malware. In this paper, we propose a deep learning approach for Android malware detection. The proposed approach investigates five different feature sets and applies an Autoencoder to identify malware. The experimental results show that the proposed approach can identify malware with high accuracy.
Tasks Android Malware Detection, Malware Detection
Published 2019-01-14
URL http://arxiv.org/abs/1901.07315v1
PDF http://arxiv.org/pdf/1901.07315v1.pdf
PWC https://paperswithcode.com/paper/android-malware-detection-using-autoencoder
Repo
Framework

Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features

Title Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features
Authors Erhan Sezerer, Ozan Polatbilek, Selma Tekir
Abstract Author profiling is the characterization of an author through some key attributes such as gender, age, and language. In this paper, an RNN model with Attention (RNNwA) is proposed to predict the gender of a Twitter user using their tweets. Both word level and tweet level attentions are utilized to learn ‘where to look’. This model (https://github.com/Darg-Iztech/gender-prediction-from-tweets) is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and has competitive results on Spanish and Arabic.
Tasks Gender Prediction
Published 2019-08-22
URL https://arxiv.org/abs/1908.09919v2
PDF https://arxiv.org/pdf/1908.09919v2.pdf
PWC https://paperswithcode.com/paper/gender-prediction-from-tweets-improving
Repo
Framework

TMI: Thermodynamic inference of data manifolds

Title TMI: Thermodynamic inference of data manifolds
Authors Purushottam D. Dixit
Abstract The Gibbs-Boltzmann distribution offers a physically interpretable way to massively reduce the dimensionality of high dimensional probability distributions where the extensive variables are 'features' and the intensive variables are 'descriptors'. However, not all probability distributions can be modeled using the Gibbs-Boltzmann form. Here, we present TMI, Thermodynamic Manifold Inference, a thermodynamic approach to approximate a collection of arbitrary distributions. TMI simultaneously learns intensive and extensive variables from data and achieves dimensionality reduction through a multiplicative, positive valued, and interpretable decomposition of the data. Importantly, the reduced dimensional space of intensive parameters is not homogeneous. The Gibbs-Boltzmann distribution defines an analytically tractable Riemannian metric on the space of intensive variables allowing us to calculate geodesics and volume elements. We discuss the applications of TMI with multiple real and artificial data sets. Possible extensions are discussed as well.
Tasks Dimensionality Reduction
Published 2019-11-21
URL https://arxiv.org/abs/1911.09776v1
PDF https://arxiv.org/pdf/1911.09776v1.pdf
PWC https://paperswithcode.com/paper/tmi-thermodynamic-inference-of-data-manifolds
Repo
Framework

Gradient-free activation maximization for identifying effective stimuli

Title Gradient-free activation maximization for identifying effective stimuli
Authors Will Xiao, Gabriel Kreiman
Abstract A fundamental question for understanding brain function is what types of stimuli drive neurons to fire. In visual neuroscience, this question has also been posed as characterizing the receptive field of a neuron. The search for effective stimuli has traditionally been based on a combination of insights from previous studies, intuition, and luck. Recently, the same question has emerged in the study of units in convolutional neural networks (ConvNets), and together with this question a family of solutions was developed that is generally referred to as “feature visualization by activation maximization.” We sought to bring in tools and techniques developed for studying ConvNets to the study of biological neural networks. However, one key difference that impedes direct translation of tools is that gradients can be obtained from ConvNets using backpropagation, but such gradients are not available from the brain. To circumvent this problem, we developed a method for gradient-free activation maximization by combining a generative neural network with a genetic algorithm. We termed this method XDream (EXtending DeepDream with real-time evolution for activation maximization), and we have shown that this method can reliably create strong stimuli for neurons in the macaque visual cortex (Ponce et al., 2019). In this paper, we describe extensive experiments characterizing the XDream method by using ConvNet units as in silico models of neurons. We show that XDream is applicable across network layers, architectures, and training sets; examine design choices in the algorithm; and provide practical guides for choosing hyperparameters in the algorithm. XDream is an efficient algorithm for uncovering neuronal tuning preferences in black-box networks using a vast and diverse stimulus space.
Tasks
Published 2019-05-01
URL http://arxiv.org/abs/1905.00378v1
PDF http://arxiv.org/pdf/1905.00378v1.pdf
PWC https://paperswithcode.com/paper/gradient-free-activation-maximization-for
Repo
Framework
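
The combination described in the abstract, a generator that maps codes to stimuli plus a genetic algorithm that only sees scalar responses, can be illustrated with a toy loop. This is a sketch under stated assumptions and not the XDream implementation: `generator` and `unit_response` are hypothetical stand-ins for a generative network and a recorded neuron, and the selection, crossover, and mutation settings are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator(code):
    """Hypothetical stand-in for a generative network: latent code -> stimulus."""
    return np.tanh(code)

def unit_response(stimulus):
    """Hypothetical black-box unit: only its scalar output is observable."""
    preferred = np.linspace(-0.5, 0.5, stimulus.size)
    return -np.sum((stimulus - preferred) ** 2)

def evolve(pop_size=20, dim=8, generations=50, n_keep=5, sigma=0.2):
    """Gradient-free search over latent codes with a simple genetic algorithm."""
    codes = rng.normal(size=(pop_size, dim))
    history = []
    for _ in range(generations):
        scores = np.array([unit_response(generator(c)) for c in codes])
        order = np.argsort(scores)[::-1]          # best first
        history.append(scores[order[0]])
        parents = codes[order[:n_keep]]
        # Children: uniform crossover of two random parents plus Gaussian mutation.
        a = parents[rng.integers(n_keep, size=pop_size - n_keep)]
        b = parents[rng.integers(n_keep, size=pop_size - n_keep)]
        mask = rng.random(a.shape) < 0.5
        children = np.where(mask, a, b) + sigma * rng.normal(size=a.shape)
        codes = np.vstack([parents, children])    # elitism: parents survive unchanged
    return codes[0], history

best_code, history = evolve()
```

No gradient of `unit_response` is ever computed; the loop needs only the ability to present a stimulus and read back a number, which is the constraint the abstract identifies for biological neurons.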

Statistically Robust Neural Network Classification

Title Statistically Robust Neural Network Classification
Authors Benjie Wang, Stefan Webb, Tom Rainforth
Abstract Recently there has been much interest in quantifying the robustness of neural network classifiers through adversarial risk metrics. However, for problems where test-time corruptions occur in a probabilistic manner, rather than being generated by an explicit adversary, adversarial metrics typically do not provide an accurate or reliable indicator of robustness. To address this, we introduce a statistically robust risk (SRR) framework which measures robustness in expectation over both network inputs and a corruption distribution. Unlike many adversarial risk metrics, which typically require separate applications on a point-by-point basis, the SRR can easily be directly estimated for an entire network and used as a training objective in a stochastic gradient scheme. Furthermore, we show both theoretically and empirically that it can scale to higher-dimensional networks by providing superior generalization performance compared with comparable adversarial risks.
Tasks
Published 2019-12-10
URL https://arxiv.org/abs/1912.04884v2
PDF https://arxiv.org/pdf/1912.04884v2.pdf
PWC https://paperswithcode.com/paper/statistically-robust-neural-network
Repo
Framework
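
The abstract describes a risk taken in expectation over both inputs and a corruption distribution. One way to picture such a quantity is a plain Monte Carlo average, sketched below. The choices here are assumptions made for illustration and are not taken from the paper: a 0-1 loss, additive Gaussian corruption, a toy classifier, and hypothetical function names.

```python
import numpy as np

def classify(x):
    """Hypothetical stand-in classifier: sign of the first coordinate."""
    return (x[..., 0] > 0).astype(int)

def robust_risk_estimate(inputs, labels, sigma, n_corruptions=100, seed=0):
    """Monte Carlo estimate of the expected 0-1 loss under Gaussian input corruption."""
    rng = np.random.default_rng(seed)
    noise = sigma * rng.normal(size=(n_corruptions,) + inputs.shape)
    predictions = classify(inputs[None] + noise)   # shape (n_corruptions, n_inputs)
    return float(np.mean(predictions != labels[None]))

rng = np.random.default_rng(1)
inputs = rng.normal(size=(200, 2))
labels = classify(inputs)                          # clean inputs are classified perfectly
clean_risk = robust_risk_estimate(inputs, labels, sigma=0.0)
noisy_risk = robust_risk_estimate(inputs, labels, sigma=0.5)
```

The estimate is a single average over the whole dataset and all sampled corruptions, in contrast to adversarial metrics that search for a worst case around each point separately.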

Image Decomposition and Classification through a Generative Model

Title Image Decomposition and Classification through a Generative Model
Authors Houpu Yao, Malcolm Regan, Yezhou Yang, Yi Ren
Abstract We demonstrate in this paper that a generative model can be designed to perform classification tasks under challenging settings, including adversarial attacks and input distribution shifts. Specifically, we propose a conditional variational autoencoder that learns both the decomposition of inputs and the distributions of the resulting components. During test, we jointly optimize the latent variables of the generator and the relaxed component labels to find the best match between the given input and the output of the generator. The model demonstrates promising performance at recognizing overlapping components from the multiMNIST dataset, and novel component combinations from a traffic sign dataset. Experiments also show that the proposed model achieves high robustness on MNIST and NORB datasets, in particular for high-strength gradient attacks and non-gradient attacks.
Tasks
Published 2019-02-09
URL http://arxiv.org/abs/1902.03361v1
PDF http://arxiv.org/pdf/1902.03361v1.pdf
PWC https://paperswithcode.com/paper/image-decomposition-and-classification
Repo
Framework
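
GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries

Title GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries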
Authors Maksim Belousov, Nikola Milosevic, Ghada Alfattni, Haifa Alrdahi, Goran Nenadic
Abstract Monitoring the administration of drugs and adverse drug reactions are key parts of pharmacovigilance. In this paper, we explore the extraction of drug mentions and drug-related information (reason for taking a drug, route, frequency, dosage, strength, form, duration, and adverse events) from hospital discharge summaries through deep learning that relies on various representations for clinical named entity recognition. This work was officially part of the 2018 n2c2 shared task, and we use the data supplied as part of the task. We developed two deep learning architectures based on recurrent neural networks and pre-trained language models. We also explore the effect of augmenting word representations with semantic features for clinical named entity recognition. Our feature-augmented BiLSTM-CRF model achieved an F1-score of 92.67% and ranked 4th in the entity extraction sub-task among the systems submitted to the n2c2 challenge. The recurrent neural networks that use the pre-trained domain-specific word embeddings and a CRF layer for label optimization perform drug, adverse event and related entity extraction with a micro-averaged F1-score of over 91%. The augmentation of word vectors with semantic features extracted using available clinical NLP toolkits can further improve the performance. Word embeddings that are pre-trained on a large unannotated corpus of relevant documents and further fine-tuned to the task perform rather well. However, the augmentation of word embeddings with semantic features can help improve the performance (primarily by boosting precision) of drug-related named entity recognition from electronic health records.
Tasks Entity Extraction, Named Entity Recognition, Word Embeddings
Published 2019-09-23
URL https://arxiv.org/abs/1909.10390v1
PDF https://arxiv.org/pdf/1909.10390v1.pdf
PWC https://paperswithcode.com/paper/190910390
Repo
Framework
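
The abstract reports micro-averaged F1-scores. For readers unfamiliar with the metric, the sketch below shows how a micro-average pools counts across entity types before computing precision and recall. The counts are made up for illustration and are not results from the paper; the function name is hypothetical.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-type counts.

    counts: {entity_type: (true_positives, false_positives, false_negatives)}
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up counts for illustration only.
example = {"Drug": (90, 10, 5), "Dosage": (40, 5, 10), "ADE": (10, 5, 15)}
score = micro_f1(example)
```

Because counts are pooled first, frequent entity types dominate the micro-average, so a weak score on a rare type moves the overall number only slightly.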

Human-Machine Collaborative Design for Accelerated Design of Compact Deep Neural Networks for Autonomous Driving

Title Human-Machine Collaborative Design for Accelerated Design of Compact Deep Neural Networks for Autonomous Driving
Authors Mohammad Javad Shafiee, Mirko Nentwig, Yohannes Kassahun, Francis Li, Stanislav Bochkarev, Akif Kamal, David Dolson, Secil Altintas, Arif Virani, Alexander Wong
Abstract An effective deep learning development process is critical for widespread industrial adoption, particularly in the automotive sector. A typical industrial deep learning development cycle involves customizing and re-designing an off-the-shelf network architecture to meet the operational requirements of the target application, leading to considerable trial and error work by a machine learning practitioner. This approach greatly impedes development through long turnaround times and the unsatisfactory quality of the created models. As a result, a development platform that can aid engineers in greatly accelerating the design and production of compact, optimized deep neural networks is highly desirable. In this joint industrial case study, we study the efficacy of the GenSynth AI-assisted AI design platform for accelerating the design of custom, optimized deep neural networks for autonomous driving through human-machine collaborative design. We perform a quantitative examination by evaluating 10 different compact deep neural networks produced by GenSynth for the purpose of object detection via a NASNet-based user network prototype design, targeted at a low-cost GPU-based accelerated embedded system. Furthermore, we quantitatively assess the talent hours and GPU processing hours used by the GenSynth process and three other approaches based on the typical industrial development process. In addition, we quantify the annual cloud cost savings for comprehensive testing using networks produced by GenSynth. Finally, we assess the usability and merits of the GenSynth process through user feedback. The findings of this case study showed that GenSynth is easy to use and can be effective at accelerating the design and production of compact, customized deep neural networks.
Tasks Autonomous Driving, Object Detection
Published 2019-09-12
URL https://arxiv.org/abs/1909.05587v1
PDF https://arxiv.org/pdf/1909.05587v1.pdf
PWC https://paperswithcode.com/paper/human-machine-collaborative-design-for
Repo
Framework

Decoupling Gating from Linearity

Title Decoupling Gating from Linearity
Authors Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz
Abstract ReLU neural-networks have been in the focus of many recent theoretical works, trying to explain their empirical success. Nonetheless, there is still a gap between current theoretical results and empirical observations, even in the case of shallow (one hidden-layer) networks. For example, in the task of memorizing a random sample of size $m$ and dimension $d$, the best theoretical result requires the size of the network to be $\tilde{\Omega}(\frac{m^2}{d})$, while empirically a network of size slightly larger than $\frac{m}{d}$ is sufficient. To bridge this gap, we turn to study a simplified model for ReLU networks. We observe that a ReLU neuron is a product of a linear function with a gate (the latter determines whether the neuron is active or not), where both share a jointly trained weight vector. In this spirit, we introduce the Gated Linear Unit (GaLU), which simply decouples the linearity from the gating by assigning different vectors for each role. We show that GaLU networks allow us to get optimization and generalization results that are much stronger than those available for ReLU networks. Specifically, we show a memorization result for networks of size $\tilde{\Omega}(\frac{m}{d})$, and improved generalization bounds. Finally, we show that in some scenarios, GaLU networks behave similarly to ReLU networks, hence proving to be a good choice of a simplified model.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.05032v1
PDF https://arxiv.org/pdf/1906.05032v1.pdf
PWC https://paperswithcode.com/paper/decoupling-gating-from-linearity-1
Repo
Framework
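
The decoupling described in the abstract can be written out directly: a ReLU neuron is a linear function multiplied by a gate that shares its weight vector, and a GaLU neuron gives the gate its own vector. The sketch below follows only that description; the function names and the random test data are assumptions for illustration.

```python
import numpy as np

def relu_neuron(x, w):
    """ReLU(w.x) written as a linear function times a gate sharing the same w."""
    return (x @ w) * (x @ w > 0)

def galu_neuron(x, w, u):
    """GaLU as described in the abstract: linear part uses w, gate uses a separate u."""
    return (x @ w) * (x @ u > 0)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 inputs of dimension 3
w = rng.normal(size=3)        # weights of the linear part
u = rng.normal(size=3)        # separate gating vector

relu_out = relu_neuron(x, w)
galu_out = galu_neuron(x, w, u)
```

Setting `u = w` recovers the ReLU neuron. With a separate `u`, the output can be negative, since the gate may be open on inputs where the linear part is below zero.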

Policy Space Identification in Configurable Environments

Title Policy Space Identification in Configurable Environments
Authors Alberto Maria Metelli, Guglielmo Manneschi, Marcello Restelli
Abstract We study the problem of identifying the policy space of a learning agent, having access to a set of demonstrations generated by its optimal policy. We introduce an approach based on statistical testing to identify the set of policy parameters the agent can control, within a larger parametric policy space. After presenting two identification rules (combinatorial and simplified), applicable under different assumptions on the policy space, we provide a probabilistic analysis of the simplified one in the case of linear policies belonging to the exponential family. To improve the performance of our identification rules, we frame the problem in the recently introduced framework of Configurable Markov Decision Processes, exploiting the opportunity of configuring the environment to induce the agent to reveal which parameters it can control. Finally, we provide an empirical evaluation, on both discrete and continuous domains, to demonstrate the effectiveness of our identification rules.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.03984v1
PDF https://arxiv.org/pdf/1909.03984v1.pdf
PWC https://paperswithcode.com/paper/policy-space-identification-in-configurable
Repo
Framework

Deep Zero-Shot Learning for Scene Sketch

Title Deep Zero-Shot Learning for Scene Sketch
Authors Yao Xie, Peng Xu, Zhanyu Ma
Abstract We introduce a novel problem of scene sketch zero-shot learning (SSZSL), which is a challenging task, since (i) unlike for photos, the gap between the common semantic domain (e.g., word vectors) and sketches is too large to exploit common semantic knowledge as the bridge for knowledge transfer, and (ii) compared with single-object sketches, a more expressive feature representation for scene sketches is required to accommodate their high level of abstraction and complexity. To overcome these challenges, we propose a deep embedding model for scene sketch zero-shot learning. In particular, we propose the augmented semantic vector to conduct domain alignment by fusing multi-modal semantic knowledge (e.g., cartoon image, natural image, text description), and adopt an attention-based network for scene sketch feature learning. Moreover, we propose a novel distance metric to improve the similarity measure during testing. Extensive experiments and ablation studies demonstrate the benefit of our sketch-specific design.
Tasks Transfer Learning, Zero-Shot Learning
Published 2019-05-11
URL https://arxiv.org/abs/1905.04510v1
PDF https://arxiv.org/pdf/1905.04510v1.pdf
PWC https://paperswithcode.com/paper/deep-zero-shot-learning-for-scene-sketch
Repo
Framework