Paper Group ANR 1582
Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis
Title | Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis |
Authors | Ryan Rogers, Aaron Roth, Adam Smith, Nathan Srebro, Om Thakkar, Blake Woodworth |
Abstract | We design a general framework for answering adaptive statistical queries that focuses on providing explicit confidence intervals along with point estimates. Prior work in this area has either focused on providing tight confidence intervals for specific analyses, or providing general worst-case bounds for point estimates. Unfortunately, as we observe, these worst-case bounds are loose in many settings — often not even beating simple baselines like sample splitting. Our main contribution is to design a framework for providing valid, instance-specific confidence intervals for point estimates that can be generated by heuristics. When paired with good heuristics, this method gives guarantees that are orders of magnitude better than the best worst-case bounds. We provide a Python library implementing our method. |
Tasks | |
Published | 2019-06-21 |
URL | https://arxiv.org/abs/1906.09231v2 |
https://arxiv.org/pdf/1906.09231v2.pdf | |
PWC | https://paperswithcode.com/paper/guaranteed-validity-for-empirical-approaches |
Repo | |
Framework | |
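The sample-splitting baseline mentioned in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' library: it assumes each of `k` queries takes values in [0, 1], is answered on its own fresh chunk of `n // k` samples, and gets a Hoeffding interval with a union bound over the queries. The function name `split_interval_width` and the parameter names are hypothetical.

```python
import math

def split_interval_width(n, k, beta):
    """Half-width of a Hoeffding confidence interval under sample splitting.

    Assumptions (illustrative only): queries are bounded in [0, 1], each of the
    k queries uses a disjoint chunk of n // k samples, and the total failure
    probability beta is shared equally across queries by a union bound.
    """
    m = n // k  # samples available to each query
    return math.sqrt(math.log(2 * k / beta) / (2 * m))

# With 1000 samples and 10 adaptive queries, each query sees only 100 samples.
width_10 = split_interval_width(1000, 10, 0.05)
# Asking more queries shrinks each chunk, so the interval widens.
width_50 = split_interval_width(1000, 50, 0.05)
```

The interval grows roughly like the square root of `k / n`, which is the kind of simple reference point that worst-case adaptive bounds are compared against.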
NASA: Neural Articulated Shape Approximation
Title | NASA: Neural Articulated Shape Approximation |
Authors | Boyang Deng, JP Lewis, Timothy Jeruzalski, Gerard Pons-Moll, Geoffrey Hinton, Mohammad Norouzi, Andrea Tagliasacchi |
Abstract | Efficient representation of articulated objects such as human bodies is an important problem in computer vision and graphics. To efficiently simulate deformation, existing approaches represent 3D objects using polygonal meshes and deform them using skinning techniques. This paper introduces neural articulated shape approximation (NASA), an alternative framework that enables efficient representation of articulated deformable objects using neural indicator functions that are conditioned on pose. Occupancy testing using NASA is straightforward, circumventing the complexity of meshes and the issue of water-tightness. We demonstrate the effectiveness of NASA for 3D tracking applications, and discuss other potential extensions. |
Tasks | |
Published | 2019-12-06 |
URL | https://arxiv.org/abs/1912.03207v2 |
https://arxiv.org/pdf/1912.03207v2.pdf | |
PWC | https://paperswithcode.com/paper/nasa-neural-articulated-shape-approximation |
Repo | |
Framework | |
Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing
Title | Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing |
Authors | Liang Wang, Kezhi Wang, Cunhua Pan, Wei Xu, Nauman Aslam, Arumugam Nallanathan |
Abstract | In this paper, we consider a platform of flying mobile edge computing (F-MEC), where unmanned aerial vehicles (UAVs) serve as equipment providing computation resources, and they enable task offloading from user equipment (UE). We aim to minimize the energy consumption of all the UEs by optimizing the user association, resource allocation and the trajectory of the UAVs. To this end, we first propose a Convex optimizAtion based Trajectory control algorithm (CAT), which solves the problem iteratively using the block coordinate descent (BCD) method. Then, to make real-time decisions while taking into account the dynamics of the environment (i.e., UAVs may take off from different locations), we propose a deep Reinforcement leArning based Trajectory control algorithm (RAT). In RAT, we apply Prioritized Experience Replay (PER) to improve the convergence of the training procedure. Unlike the convex optimization based algorithm, which may be sensitive to the initial points and requires iterations, RAT can be adapted to any take-off points of the UAVs and can obtain the solution more rapidly than CAT once the training process has been completed. Simulation results show that the proposed CAT and RAT achieve similar performance and both outperform traditional algorithms. |
Tasks | |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03887v1 |
https://arxiv.org/pdf/1911.03887v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-reinforcement-learning-based-dynamic |
Repo | |
Framework | |
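The abstract above applies Prioritized Experience Replay (PER) inside RAT. A minimal sketch of the proportional sampling rule commonly used for PER is given below; it is not taken from the paper, and the function names, the exponent `alpha`, and the example priorities are assumptions made for illustration.

```python
import random

def per_probabilities(priorities, alpha=0.6):
    """Turn positive priorities into sampling probabilities p_i^alpha / sum_j p_j^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]

def sample_indices(priorities, batch_size, alpha=0.6, seed=0):
    """Draw replay-buffer indices with replacement, favouring high-priority transitions."""
    rng = random.Random(seed)
    probs = per_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)

# Hypothetical priorities (e.g. absolute TD errors) for three stored transitions.
probs = per_probabilities([1.0, 2.0, 4.0], alpha=1.0)
uniform = per_probabilities([1.0, 2.0, 4.0], alpha=0.0)
batch = sample_indices([1.0, 2.0, 4.0], batch_size=8)
```

Setting `alpha` to 0 recovers uniform replay, while larger values concentrate sampling on transitions with larger priority.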
Predicting dynamical system evolution with residual neural networks
Title | Predicting dynamical system evolution with residual neural networks |
Authors | Artem Chashchin, Mikhail Botchev, Ivan Oseledets, George Ovchinnikov |
Abstract | Forecasting time series and time-dependent data is a common problem in many applications. One typical example is solving ordinary differential equation (ODE) systems $\dot{x}=F(x)$. Oftentimes the right hand side function $F(x)$ is not known explicitly and the ODE system is described by solution samples taken at some time points. Hence, ODE solvers cannot be used. In this paper, a data-driven approach to learning the evolution of dynamical systems is considered. We show how by training neural networks with ResNet-like architecture on the solution samples, models can be developed to predict the ODE system solution further in time. By evaluating the proposed approaches on three test ODE systems, we demonstrate that the neural network models are able to reproduce the main dynamics of the systems qualitatively well. Moreover, the predicted solution remains stable for much longer times than for other currently known models. |
Tasks | Time Series |
Published | 2019-10-11 |
URL | https://arxiv.org/abs/1910.05233v1 |
https://arxiv.org/pdf/1910.05233v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-dynamical-system-evolution-with |
Repo | |
Framework | |
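The residual idea behind the ResNet-like predictors described above is that the model learns the increment between consecutive solution samples rather than the next state itself. The sketch below shows that update form only: a single least-squares scalar `a` stands in for the neural network, and the test system (exponential decay), the step size `h`, and all function names are assumptions chosen for illustration.

```python
import math

# Solution samples of dx/dt = -x taken at a fixed step h; the right-hand side
# is treated as unknown by the model, which only sees these samples.
h = 0.1
samples = [math.exp(-k * h) for k in range(50)]

def fit_residual_coefficient(xs):
    """Least-squares fit of a in the residual model x_next = x + a * x."""
    num = sum(x0 * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    den = sum(x0 * x0 for x0 in xs[:-1])
    return num / den

def rollout(x0, a, n_steps):
    """Apply the learned residual step repeatedly to predict further in time."""
    x = x0
    trajectory = [x]
    for _ in range(n_steps):
        x = x + a * x  # residual update: state plus learned increment
        trajectory.append(x)
    return trajectory

a = fit_residual_coefficient(samples)
prediction = rollout(samples[-1], a, 20)
```

For this linear system the fitted increment matches the exact one-step map, so the rollout stays on the true solution; a neural network would replace `a * x` when the dynamics are nonlinear.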
Android Malware Detection Using Autoencoder
Title | Android Malware Detection Using Autoencoder |
Authors | Abdelmonim Naway, Yuancheng Li |
Abstract | Smartphones have become an intrinsic part of human life. The smartphone unifies diverse advanced characteristics. It enables users to store various data such as photos, health data, bank credentials, and personal information. The Android operating system is the prevalent mobile operating system and, at the same time, the operating system most targeted by malware developers. Recently, the unparalleled growth of Android malware has put pressure on researchers to propose effective methods to suppress the spread of the malware. In this paper, we propose a deep learning approach for Android malware detection. The proposed approach investigates five different feature sets and applies an Autoencoder to identify malware. The experimental results show that the proposed approach can identify malware with high accuracy. |
Tasks | Android Malware Detection, Malware Detection |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.07315v1 |
http://arxiv.org/pdf/1901.07315v1.pdf | |
PWC | https://paperswithcode.com/paper/android-malware-detection-using-autoencoder |
Repo | |
Framework | |
Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features
Title | Gender Prediction from Tweets: Improving Neural Representations with Hand-Crafted Features |
Authors | Erhan Sezerer, Ozan Polatbilek, Selma Tekir |
Abstract | Author profiling is the characterization of an author through some key attributes such as gender, age, and language. In this paper, an RNN model with Attention (RNNwA) is proposed to predict the gender of a Twitter user using their tweets. Both word-level and tweet-level attention are utilized to learn ‘where to look’. This model (https://github.com/Darg-Iztech/gender-prediction-from-tweets) is improved by concatenating LSA-reduced n-gram features with the learned neural representation of a user. Both models are tested on three languages: English, Spanish, Arabic. The improved version of the proposed model (RNNwA + n-gram) achieves state-of-the-art performance on English and has competitive results on Spanish and Arabic. |
Tasks | Gender Prediction |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.09919v2 |
https://arxiv.org/pdf/1908.09919v2.pdf | |
PWC | https://paperswithcode.com/paper/gender-prediction-from-tweets-improving |
Repo | |
Framework | |
TMI: Thermodynamic inference of data manifolds
Title | TMI: Thermodynamic inference of data manifolds |
Authors | Purushottam D. Dixit |
Abstract | The Gibbs-Boltzmann distribution offers a physically interpretable way to massively reduce the dimensionality of high dimensional probability distributions where the extensive variables are ‘features’ and the intensive variables are ‘descriptors’. However, not all probability distributions can be modeled using the Gibbs-Boltzmann form. Here, we present TMI (Thermodynamic Manifold Inference), a thermodynamic approach to approximate a collection of arbitrary distributions. TMI simultaneously learns from data intensive and extensive variables and achieves dimensionality reduction through a multiplicative, positive valued, and interpretable decomposition of the data. Importantly, the reduced dimensional space of intensive parameters is not homogeneous. The Gibbs-Boltzmann distribution defines an analytically tractable Riemannian metric on the space of intensive variables allowing us to calculate geodesics and volume elements. We discuss the applications of TMI with multiple real and artificial data sets. Possible extensions are discussed as well. |
Tasks | Dimensionality Reduction |
Published | 2019-11-21 |
URL | https://arxiv.org/abs/1911.09776v1 |
https://arxiv.org/pdf/1911.09776v1.pdf | |
PWC | https://paperswithcode.com/paper/tmi-thermodynamic-inference-of-data-manifolds |
Repo | |
Framework | |
Gradient-free activation maximization for identifying effective stimuli
Title | Gradient-free activation maximization for identifying effective stimuli |
Authors | Will Xiao, Gabriel Kreiman |
Abstract | A fundamental question for understanding brain function is what types of stimuli drive neurons to fire. In visual neuroscience, this question has also been posed as characterizing the receptive field of a neuron. The search for effective stimuli has traditionally been based on a combination of insights from previous studies, intuition, and luck. Recently, the same question has emerged in the study of units in convolutional neural networks (ConvNets), and together with this question a family of solutions was developed that are generally referred to as “feature visualization by activation maximization.” We sought to bring in tools and techniques developed for studying ConvNets to the study of biological neural networks. However, one key difference that impedes direct translation of tools is that gradients can be obtained from ConvNets using backpropagation, but such gradients are not available from the brain. To circumvent this problem, we developed a method for gradient-free activation maximization by combining a generative neural network with a genetic algorithm. We termed this method XDream (EXtending DeepDream with real-time evolution for activation maximization), and we have shown that this method can reliably create strong stimuli for neurons in the macaque visual cortex (Ponce et al., 2019). In this paper, we describe extensive experiments characterizing the XDream method by using ConvNet units as in silico models of neurons. We show that XDream is applicable across network layers, architectures, and training sets; examine design choices in the algorithm; and provide practical guides for choosing hyperparameters in the algorithm. XDream is an efficient algorithm for uncovering neuronal tuning preferences in black-box networks using a vast and diverse stimulus space. |
Tasks | |
Published | 2019-05-01 |
URL | http://arxiv.org/abs/1905.00378v1 |
http://arxiv.org/pdf/1905.00378v1.pdf | |
PWC | https://paperswithcode.com/paper/gradient-free-activation-maximization-for |
Repo | |
Framework | |
Statistically Robust Neural Network Classification
Title | Statistically Robust Neural Network Classification |
Authors | Benjie Wang, Stefan Webb, Tom Rainforth |
Abstract | Recently there has been much interest in quantifying the robustness of neural network classifiers through adversarial risk metrics. However, for problems where test-time corruptions occur in a probabilistic manner, rather than being generated by an explicit adversary, adversarial metrics typically do not provide an accurate or reliable indicator of robustness. To address this, we introduce a statistically robust risk (SRR) framework which measures robustness in expectation over both network inputs and a corruption distribution. Unlike many adversarial risk metrics, which typically require separate applications on a point-by-point basis, the SRR can easily be directly estimated for an entire network and used as a training objective in a stochastic gradient scheme. Furthermore, we show both theoretically and empirically that it can scale to higher-dimensional networks by providing superior generalization performance compared with comparable adversarial risks. |
Tasks | |
Published | 2019-12-10 |
URL | https://arxiv.org/abs/1912.04884v2 |
https://arxiv.org/pdf/1912.04884v2.pdf | |
PWC | https://paperswithcode.com/paper/statistically-robust-neural-network |
Repo | |
Framework | |
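The abstract above describes a risk measured in expectation over both inputs and a corruption distribution. A Monte Carlo estimate of such a quantity might look like the sketch below. It is not the paper's estimator: the zero-one loss, the Gaussian corruption model, the toy threshold classifier, and every name in the code are assumptions made for illustration.

```python
import random

def classify(x):
    """Toy 1-D classifier standing in for a network: predicts 1 for positive inputs."""
    return 1 if x > 0.0 else 0

def statistical_risk(data, noise_std, n_corruptions=200, seed=0):
    """Average zero-one loss over data points and sampled Gaussian corruptions."""
    rng = random.Random(seed)
    errors = 0
    total = 0
    for x, y in data:
        for _ in range(n_corruptions):
            x_corrupt = x + rng.gauss(0.0, noise_std)  # sample one corruption
            errors += int(classify(x_corrupt) != y)
            total += 1
    return errors / total

# Two points sit close to the decision boundary, two sit far from it.
data = [(-2.0, 0), (-0.1, 0), (0.1, 1), (2.0, 1)]
clean = statistical_risk(data, noise_std=0.0)
noisy = statistical_risk(data, noise_std=0.5)
```

Because the estimate is a plain average over samples, it covers the whole data set at once, in contrast to worst-case searches that must be run point by point.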
Image Decomposition and Classification through a Generative Model
Title | Image Decomposition and Classification through a Generative Model |
Authors | Houpu Yao, Malcolm Regan, Yezhou Yang, Yi Ren |
Abstract | We demonstrate in this paper that a generative model can be designed to perform classification tasks under challenging settings, including adversarial attacks and input distribution shifts. Specifically, we propose a conditional variational autoencoder that learns both the decomposition of inputs and the distributions of the resulting components. During test, we jointly optimize the latent variables of the generator and the relaxed component labels to find the best match between the given input and the output of the generator. The model demonstrates promising performance at recognizing overlapping components from the multiMNIST dataset, and novel component combinations from a traffic sign dataset. Experiments also show that the proposed model achieves high robustness on MNIST and NORB datasets, in particular for high-strength gradient attacks and non-gradient attacks. |
Tasks | |
Published | 2019-02-09 |
URL | http://arxiv.org/abs/1902.03361v1 |
http://arxiv.org/pdf/1902.03361v1.pdf | |
PWC | https://paperswithcode.com/paper/image-decomposition-and-classification |
Repo | |
Framework | |
GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries
Title | GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries |
Authors | Maksim Belousov, Nikola Milosevic, Ghada Alfattni, Haifa Alrdahi, Goran Nenadic |
Abstract | Monitoring the administration of drugs and adverse drug reactions are key parts of pharmacovigilance. In this paper, we explore the extraction of drug mentions and drug-related information (reason for taking a drug, route, frequency, dosage, strength, form, duration, and adverse events) from hospital discharge summaries through deep learning that relies on various representations for clinical named entity recognition. This work was officially part of the 2018 n2c2 shared task, and we use the data supplied as part of the task. We developed two deep learning architectures based on recurrent neural networks and pre-trained language models. We also explore the effect of augmenting word representations with semantic features for clinical named entity recognition. Our feature-augmented BiLSTM-CRF model achieved an F1-score of 92.67% and ranked 4th in the entity extraction sub-task among the systems submitted to the n2c2 challenge. The recurrent neural networks that use the pre-trained domain-specific word embeddings and a CRF layer for label optimization perform drug, adverse event and related entities extraction with a micro-averaged F1-score of over 91%. The augmentation of word vectors with semantic features extracted using available clinical NLP toolkits can further improve the performance. Word embeddings that are pre-trained on a large unannotated corpus of relevant documents and further fine-tuned to the task perform rather well. However, the augmentation of word embeddings with semantic features can help improve the performance (primarily by boosting precision) of drug-related named entity recognition from electronic health records. |
Tasks | Entity Extraction, Named Entity Recognition, Word Embeddings |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10390v1 |
https://arxiv.org/pdf/1909.10390v1.pdf | |
PWC | https://paperswithcode.com/paper/190910390 |
Repo | |
Framework | |
Human-Machine Collaborative Design for Accelerated Design of Compact Deep Neural Networks for Autonomous Driving
Title | Human-Machine Collaborative Design for Accelerated Design of Compact Deep Neural Networks for Autonomous Driving |
Authors | Mohammad Javad Shafiee, Mirko Nentwig, Yohannes Kassahun, Francis Li, Stanislav Bochkarev, Akif Kamal, David Dolson, Secil Altintas, Arif Virani, Alexander Wong |
Abstract | An effective deep learning development process is critical for widespread industrial adoption, particularly in the automotive sector. A typical industrial deep learning development cycle involves customizing and re-designing an off-the-shelf network architecture to meet the operational requirements of the target application, leading to considerable trial and error work by a machine learning practitioner. This approach greatly impedes development through long turnaround times and unsatisfactory quality of the created models. As a result, a development platform that can aid engineers in greatly accelerating the design and production of compact, optimized deep neural networks is highly desirable. In this joint industrial case study, we study the efficacy of the GenSynth AI-assisted AI design platform for accelerating the design of custom, optimized deep neural networks for autonomous driving through human-machine collaborative design. We perform a quantitative examination by evaluating 10 different compact deep neural networks produced by GenSynth for the purpose of object detection via a NASNet-based user network prototype design, targeted at a low-cost GPU-based accelerated embedded system. Furthermore, we quantitatively assess the talent hours and GPU processing hours used by the GenSynth process and three other approaches based on the typical industrial development process. In addition, we quantify the annual cloud cost savings for comprehensive testing using networks produced by GenSynth. Finally, we assess the usability and merits of the GenSynth process through user feedback. The findings of this case study showed that GenSynth is easy to use and can be effective at accelerating the design and production of compact, customized deep neural networks. |
Tasks | Autonomous Driving, Object Detection |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05587v1 |
https://arxiv.org/pdf/1909.05587v1.pdf | |
PWC | https://paperswithcode.com/paper/human-machine-collaborative-design-for |
Repo | |
Framework | |
Decoupling Gating from Linearity
Title | Decoupling Gating from Linearity |
Authors | Jonathan Fiat, Eran Malach, Shai Shalev-Shwartz |
Abstract | ReLU neural-networks have been in the focus of many recent theoretical works, trying to explain their empirical success. Nonetheless, there is still a gap between current theoretical results and empirical observations, even in the case of shallow (one hidden-layer) networks. For example, in the task of memorizing a random sample of size $m$ and dimension $d$, the best theoretical result requires the size of the network to be $\tilde{\Omega}(\frac{m^2}{d})$, while empirically a network of size slightly larger than $\frac{m}{d}$ is sufficient. To bridge this gap, we turn to study a simplified model for ReLU networks. We observe that a ReLU neuron is a product of a linear function with a gate (the latter determines whether the neuron is active or not), where both share a jointly trained weight vector. In this spirit, we introduce the Gated Linear Unit (GaLU), which simply decouples the linearity from the gating by assigning different vectors for each role. We show that GaLU networks allow us to get optimization and generalization results that are much stronger than those available for ReLU networks. Specifically, we show a memorization result for networks of size $\tilde{\Omega}(\frac{m}{d})$, and improved generalization bounds. Finally, we show that in some scenarios, GaLU networks behave similarly to ReLU networks, hence proving to be a good choice of a simplified model. |
Tasks | |
Published | 2019-06-12 |
URL | https://arxiv.org/abs/1906.05032v1 |
https://arxiv.org/pdf/1906.05032v1.pdf | |
PWC | https://paperswithcode.com/paper/decoupling-gating-from-linearity-1 |
Repo | |
Framework | |
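The decoupling described in the abstract above can be illustrated for a single neuron. In the sketch below a ReLU neuron uses one weight vector `w` for both the gate and the linear part, while a GaLU neuron uses a separate gate vector `u`. The function names and the example vectors are assumptions, and the sketch covers only the forward computation of one unit, not the training procedure or the paper's results.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def relu_neuron(x, w):
    """ReLU as gate times linear function, with both parts sharing w."""
    gate = 1.0 if dot(w, x) > 0 else 0.0
    return gate * dot(w, x)

def galu_neuron(x, w, u):
    """GaLU: the gate depends on u, the linear part on w."""
    gate = 1.0 if dot(u, x) > 0 else 0.0
    return gate * dot(w, x)

x = [1.0, -2.0]
w = [0.5, 1.0]   # linear weights: <w, x> = -1.5
u = [1.0, 0.0]   # gate weights:   <u, x> = 1.0

relu_out = relu_neuron(x, w)     # gate closed because <w, x> <= 0
galu_out = galu_neuron(x, w, u)  # gate open, so the negative linear value passes
tied_out = galu_neuron(x, w, w)  # tying u to w recovers the ReLU neuron
```

Unlike ReLU, a GaLU unit can output negative values, since the sign of the gate and the sign of the linear part are no longer tied together.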
Policy Space Identification in Configurable Environments
Title | Policy Space Identification in Configurable Environments |
Authors | Alberto Maria Metelli, Guglielmo Manneschi, Marcello Restelli |
Abstract | We study the problem of identifying the policy space of a learning agent, having access to a set of demonstrations generated by its optimal policy. We introduce an approach based on statistical testing to identify the set of policy parameters the agent can control, within a larger parametric policy space. After presenting two identification rules (combinatorial and simplified), applicable under different assumptions on the policy space, we provide a probabilistic analysis of the simplified one in the case of linear policies belonging to the exponential family. To improve the performance of our identification rules, we frame the problem in the recently introduced framework of Configurable Markov Decision Processes, exploiting the opportunity of configuring the environment to induce the agent to reveal which parameters it can control. Finally, we provide an empirical evaluation, on both discrete and continuous domains, to demonstrate the effectiveness of our identification rules. |
Tasks | |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.03984v1 |
https://arxiv.org/pdf/1909.03984v1.pdf | |
PWC | https://paperswithcode.com/paper/policy-space-identification-in-configurable |
Repo | |
Framework | |
Deep Zero-Shot Learning for Scene Sketch
Title | Deep Zero-Shot Learning for Scene Sketch |
Authors | Yao Xie, Peng Xu, Zhanyu Ma |
Abstract | We introduce a novel problem of scene sketch zero-shot learning (SSZSL), which is a challenging task, since (i) different from photo, the gap between common semantic domain (e.g., word vector) and sketch is too huge to exploit common semantic knowledge as the bridge for knowledge transfer, and (ii) compared with single-object sketch, more expressive feature representation for scene sketch is required to accommodate its high-level of abstraction and complexity. To overcome these challenges, we propose a deep embedding model for scene sketch zero-shot learning. In particular, we propose the augmented semantic vector to conduct domain alignment by fusing multi-modal semantic knowledge (e.g., cartoon image, natural image, text description), and adopt attention-based network for scene sketch feature learning. Moreover, we propose a novel distance metric to improve the similarity measure during testing. Extensive experiments and ablation studies demonstrate the benefit of our sketch-specific design. |
Tasks | Transfer Learning, Zero-Shot Learning |
Published | 2019-05-11 |
URL | https://arxiv.org/abs/1905.04510v1 |
https://arxiv.org/pdf/1905.04510v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-zero-shot-learning-for-scene-sketch |
Repo | |
Framework | |