Paper Group ANR 46
Zero-Shot Reinforcement Learning with Deep Attention Convolutional Neural Networks
Title | Zero-Shot Reinforcement Learning with Deep Attention Convolutional Neural Networks |
Authors | Sahika Genc, Sunil Mallya, Sravan Bodapati, Tao Sun, Yunzhe Tao |
Abstract | Simulation-to-simulation and simulation-to-real world transfer of neural network models have been a difficult problem. To close the reality gap, prior methods for simulation-to-real world transfer focused on domain adaptation, decoupling perception and dynamics and solving each problem separately, and randomization of agent parameters and environment conditions to expose the learning agent to a variety of conditions. While these methods provide acceptable performance, the computational complexity required to capture a large variation of parameters for comprehensive scenarios on a given task such as autonomous driving or robotic manipulation is high. Our key contribution is to theoretically prove and empirically demonstrate that a deep attention convolutional neural network (DACNN) with specific visual sensor configuration performs as well as training on a dataset with high domain and parameter variation at lower computational complexity. Specifically, the attention network weights are learned through policy optimization to focus on local dependencies that lead to optimal actions, and do not require tuning in the real world for generalization. Our new architecture adapts perception with respect to the control objective, resulting in zero-shot learning without pre-training a perception network. To measure the impact of our new deep network architecture on domain adaptation, we consider autonomous driving as a use case. We perform an extensive set of experiments in simulation-to-simulation and simulation-to-real scenarios to compare our approach to several baselines including the current state-of-the-art models. |
Tasks | Autonomous Driving, Deep Attention, Domain Adaptation, Zero-Shot Learning |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00605v1 |
https://arxiv.org/pdf/2001.00605v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-reinforcement-learning-with-deep |
Repo | |
Framework | |
Estimating heterogeneous treatment effects with right-censored data via causal survival forests
Title | Estimating heterogeneous treatment effects with right-censored data via causal survival forests |
Authors | Yifan Cui, Michael R. Kosorok, Stefan Wager, Ruoqing Zhu |
Abstract | There is a fast-growing literature on estimating heterogeneous treatment effects via random forests in observational studies. However, there are few approaches available for right-censored survival data. In clinical trials, right-censored survival data are frequently encountered. Quantifying the causal relationship between a treatment and the survival outcome is of great interest. Random forests provide a robust, nonparametric approach to statistical estimation. In addition, recent developments allow forest-based methods to quantify the uncertainty of the estimated heterogeneous treatment effects. We propose causal survival forests that directly target estimation of the treatment effect from an observational study. We establish consistency and asymptotic normality of the proposed estimators and provide an estimator of the asymptotic variance that enables valid confidence intervals of the estimated treatment effect. The performance of our approach is demonstrated via extensive simulations and data from an HIV study. |
Tasks | |
Published | 2020-01-27 |
URL | https://arxiv.org/abs/2001.09887v1 |
https://arxiv.org/pdf/2001.09887v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-heterogeneous-treatment-effects |
Repo | |
Framework | |
MDEA: Malware Detection with Evolutionary Adversarial Learning
Title | MDEA: Malware Detection with Evolutionary Adversarial Learning |
Authors | Xiruo Wang, Risto Miikkulainen |
Abstract | Malware detection systems have used machine learning to detect malware in programs. These applications feed raw or processed binary data into neural network models to classify files as benign or malicious. Even though this approach has proven effective against dynamic changes, such as encrypting, obfuscating and packing techniques, it is vulnerable to specific evasion attacks in which small changes in the input data cause misclassification at test time. This paper proposes a new approach: MDEA, an Adversarial Malware Detection model that uses evolutionary optimization to create attack samples to make the network robust against evasion attacks. Retraining the model with the evolved malware samples improves its performance by a significant margin. |
Tasks | Malware Detection |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03331v1 |
https://arxiv.org/pdf/2002.03331v1.pdf | |
PWC | https://paperswithcode.com/paper/mdea-malware-detection-with-evolutionary |
Repo | |
Framework | |
Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples
Title | Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples |
Authors | Alejandro Barredo-Arrieta, Javier Del Ser |
Abstract | The last decade has witnessed the proliferation of Deep Learning models in many applications, achieving unrivaled levels of predictive performance. Unfortunately, the black-box nature of Deep Learning models has posed unanswered questions about what they learn from data. Certain application scenarios have highlighted the importance of assessing the bounds under which Deep Learning models operate, a problem addressed by using assorted approaches aimed at audiences from different domains. However, as the focus of the application is placed more on non-expert users, it becomes mandatory to provide the means for them to trust the model, just like a human gets familiar with a system or process: by understanding the hypothetical circumstances under which it fails. This is indeed the cornerstone of this research work: to undertake an adversarial analysis of a Deep Learning model. The proposed framework constructs counterfactual examples by ensuring their plausibility, i.e. there is a reasonable probability that a human could generate them without resorting to a computer program. Therefore, this work must be regarded as a valuable auditing exercise of the usable bounds a certain model is constrained within, thereby allowing for a much greater understanding of the capabilities and pitfalls of a model used in a real application. To this end, a Generative Adversarial Network (GAN) and multi-objective heuristics are used to furnish a plausible attack to the audited model, efficiently trading between the confusion of this model, the intensity and plausibility of the generated counterfactual. Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework. |
Tasks | |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11323v1 |
https://arxiv.org/pdf/2003.11323v1.pdf | |
PWC | https://paperswithcode.com/paper/plausible-counterfactuals-auditing-deep |
Repo | |
Framework | |
Kidney segmentation using 3D U-Net localized with Expectation Maximization
Title | Kidney segmentation using 3D U-Net localized with Expectation Maximization |
Authors | Omid Bazgir, Kai Barck, Richard A. D. Carano, Robby M. Weimer, Luke Xie |
Abstract | Kidney volume is greatly affected in several renal diseases. Precise and automatic segmentation of the kidney can help determine kidney size and evaluate renal function. Fully convolutional neural networks have been used to segment organs from large biomedical 3D images. While these networks demonstrate state-of-the-art segmentation performances, they do not immediately translate to small foreground objects, small sample sizes, and anisotropic resolution in MRI datasets. In this paper we propose a new framework to address some of the challenges for segmenting 3D MRI. These methods were implemented on preclinical MRI for segmenting kidneys in an animal model of lupus nephritis. Our implementation strategy is twofold: 1) to utilize additional MRI diffusion images to detect the general kidney area, and 2) to reduce the 3D U-Net kernels to handle small sample sizes. Using this approach, a Dice similarity coefficient of 0.88 was achieved with a limited dataset of n=196. This segmentation strategy with careful optimization can be applied to various renal injuries or other organ systems. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09075v1 |
https://arxiv.org/pdf/2003.09075v1.pdf | |
PWC | https://paperswithcode.com/paper/kidney-segmentation-using-3d-u-net-localized |
Repo | |
Framework | |
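The kidney segmentation abstract above reports a Dice similarity coefficient of 0.88. As a reminder of what that metric measures, here is a minimal sketch of Dice on binary masks, defined as twice the overlap divided by the total foreground count of both masks. The function name `dice_coefficient` and the flat-list mask representation are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks.

    Masks are flattened sequences of 0/1 voxel labels (an assumption made
    for this sketch; real 3D volumes would be flattened first).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks agree perfectly, so return 1.0 by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground voxel.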
Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations
Title | Chained Representation Cycling: Learning to Estimate 3D Human Pose and Shape by Cycling Between Representations |
Authors | Nadine Rueegg, Christoph Lassner, Michael J. Black, Konrad Schindler |
Abstract | The goal of many computer vision systems is to transform image pixels into 3D representations. Recent popular models use neural networks to regress directly from pixels to 3D object parameters. Such an approach works well when supervision is available, but in problems like human pose and shape estimation, it is difficult to obtain natural images with 3D ground truth. To go one step further, we propose a new architecture that facilitates unsupervised, or lightly supervised, learning. The idea is to break the problem into a series of transformations between increasingly abstract representations. Each step involves a cycle designed to be learnable without annotated training data, and the chain of cycles delivers the final solution. Specifically, we use 2D body part segments as an intermediate representation that contains enough information to be lifted to 3D, and at the same time is simple enough to be learned in an unsupervised way. We demonstrate the method by learning 3D human pose and shape from un-paired and un-annotated images. We also explore varying amounts of paired data and show that cycling greatly alleviates the need for paired data. While we present results for modeling humans, our formulation is general and can be applied to other vision problems. |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01613v1 |
https://arxiv.org/pdf/2001.01613v1.pdf | |
PWC | https://paperswithcode.com/paper/chained-representation-cycling-learning-to |
Repo | |
Framework | |
Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders
Title | Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders |
Authors | Yang Li, Julien Amelot, Xin Zhou, Samy Bengio, Si Si |
Abstract | There is increasing interest in the field in developing automatic machinery to facilitate the design process. In this paper, we focus on assisting graphical user interface (UI) layout design, a crucial task in app development. Given a partial layout, which a designer has entered, our model learns to complete the layout by predicting the remaining UI elements with a correct position and dimension as well as the hierarchical structures. Such automation will significantly ease the effort of UI designers and developers. While we focus on interface layout prediction, our model is generally applicable to other layout prediction problems that involve tree structures and 2-dimensional placements. Particularly, we design two versions of Transformer-based tree decoders: Pointer and Recursive Transformer, and experiment with these models on a public dataset. We also propose several metrics for measuring the accuracy of tree prediction and ground these metrics in the domain of user experience. These contribute a new task and methods to deep learning research. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.05308v1 |
https://arxiv.org/pdf/2001.05308v1.pdf | |
PWC | https://paperswithcode.com/paper/auto-completion-of-user-interface-layout-1 |
Repo | |
Framework | |
Channel-Attention Dense U-Net for Multichannel Speech Enhancement
Title | Channel-Attention Dense U-Net for Multichannel Speech Enhancement |
Authors | Bahareh Tolooshams, Ritwik Giri, Andrew H. Song, Umut Isik, Arvindh Krishnaswamy |
Abstract | Supervised deep learning has gained significant attention for speech enhancement recently. The state-of-the-art deep learning methods perform the task by learning a ratio/binary mask that is applied to the mixture in the time-frequency domain to produce the clean speech. Despite the great performance in the single-channel setting, these frameworks lag in performance in the multichannel setting as the majority of these methods a) fail to exploit the available spatial information fully, and b) still treat the deep architecture as a black box which may not be well-suited for multichannel audio processing. This paper addresses these drawbacks, a) by utilizing complex ratio masking instead of masking on the magnitude of the spectrogram, and more importantly, b) by introducing a channel-attention mechanism inside the deep architecture to mimic beamforming. We propose Channel-Attention Dense U-Net, in which we apply the channel-attention unit recursively on feature maps at every layer of the network, enabling the network to perform non-linear beamforming. We demonstrate the superior performance of the network against the state-of-the-art approaches on the CHiME-3 dataset. |
Tasks | Speech Enhancement |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11542v1 |
https://arxiv.org/pdf/2001.11542v1.pdf | |
PWC | https://paperswithcode.com/paper/channel-attention-dense-u-net-for |
Repo | |
Framework | |
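The speech enhancement abstract above contrasts complex ratio masking with masking on the spectrogram magnitude. The sketch below illustrates only that distinction on a toy time-frequency grid: a real-valued gain rescales each bin and keeps the mixture phase, while a complex mask can also rotate the phase. The function names and the nested-list spectrogram layout are assumptions for illustration; the network that predicts the mask is not shown.

```python
from typing import List

Spectrogram = List[List[complex]]  # [frequency bin][time frame], assumed layout


def apply_magnitude_mask(mixture: Spectrogram, gains: List[List[float]]) -> Spectrogram:
    """Scale each bin by a real gain; the mixture phase is left unchanged."""
    return [[g * y for g, y in zip(g_row, y_row)]
            for g_row, y_row in zip(gains, mixture)]


def apply_complex_mask(mixture: Spectrogram, mask: Spectrogram) -> Spectrogram:
    """Multiply each bin by a complex mask value, changing magnitude and phase."""
    return [[m * y for m, y in zip(m_row, y_row)]
            for m_row, y_row in zip(mask, mixture)]
```

Because the complex mask acts on both real and imaginary parts, it can in principle correct phase errors that a magnitude-only mask has to leave in place.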
High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model
Title | High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model |
Authors | Jinyu Li, Rui Zhao, Eric Sun, Jeremy H. M. Wong, Amit Das, Zhong Meng, Yifan Gong |
Abstract | While the community keeps promoting end-to-end models over conventional hybrid models, which usually are long short-term memory (LSTM) models trained with a cross entropy criterion followed by a sequence discriminative training criterion, we argue that such conventional hybrid models can still be significantly improved. In this paper, we detail our recent efforts to improve conventional hybrid LSTM acoustic models for high-accuracy and low-latency automatic speech recognition. To achieve high accuracy, we use a contextual layer trajectory LSTM (cltLSTM), which decouples the temporal modeling and target classification tasks, and incorporates future context frames to get more information for accurate acoustic modeling. We further improve the training strategy with sequence-level teacher-student learning. To obtain low latency, we design a two-head cltLSTM, in which one head has zero latency and the other head has a small latency, compared to an LSTM. When trained with Microsoft’s 65 thousand hours of anonymized training data and evaluated with test sets with 1.8 million words, the proposed two-head cltLSTM model with the proposed training strategy yields a 28.2% relative WER reduction over the conventional LSTM acoustic model, with a similar perceived latency. |
Tasks | Speech Recognition |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07482v1 |
https://arxiv.org/pdf/2003.07482v1.pdf | |
PWC | https://paperswithcode.com/paper/high-accuracy-and-low-latency-speech |
Repo | |
Framework | |
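The speech recognition abstract above quotes a 28.2% relative WER reduction. Relative reduction is the drop in word error rate divided by the baseline rate, which a short sketch makes concrete. The function name and the example WER values in the usage note are hypothetical; the abstract does not state the absolute error rates.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return (baseline - new) / baseline as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

For instance, with made-up values, a baseline WER of 10.0% and a new WER of 7.18% would correspond to a relative reduction of about 28.2%, even though the absolute drop is only 2.82 percentage points.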
Convolutional Mean: A Simple Convolutional Neural Network for Illuminant Estimation
Title | Convolutional Mean: A Simple Convolutional Neural Network for Illuminant Estimation |
Authors | Han Gong |
Abstract | We present Convolutional Mean (CM) - a simple and fast convolutional neural network for illuminant estimation. Our proposed method only requires a small neural network model (1.1K parameters) and a 48 x 32 thumbnail input image. Our unoptimized Python implementation takes 1 ms/image, which is arguably 3-3750x faster than the current leading solutions with similar accuracy. Using two public datasets, we show that our proposed light-weight method offers accuracy comparable to the current leading methods’ (which consist of thousands/millions of parameters) across several measures. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04911v1 |
https://arxiv.org/pdf/2001.04911v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-mean-a-simple-convolutional |
Repo | |
Framework | |
Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
Title | Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft |
Authors | Christian Scheller, Yanick Schraner, Manfred Vogel |
Abstract | Sample inefficiency of deep reinforcement learning methods is a major obstacle for their use in real-world applications. In this work, we show how human demonstrations can improve final performance of agents on the Minecraft minigame ObtainDiamond with only 8M frames of environment interaction. We propose a training procedure where policy networks are first trained on human data and later fine-tuned by reinforcement learning. Using a policy exploitation mechanism, experience replay and an additional loss against catastrophic forgetting, our best agent was able to achieve a mean score of 48. Our proposed solution placed 3rd in the NeurIPS MineRL Competition for Sample-Efficient Reinforcement Learning. |
Tasks | |
Published | 2020-03-12 |
URL | https://arxiv.org/abs/2003.06066v1 |
https://arxiv.org/pdf/2003.06066v1.pdf | |
PWC | https://paperswithcode.com/paper/sample-efficient-reinforcement-learning |
Repo | |
Framework | |
Federated Continual Learning with Adaptive Parameter Communication
Title | Federated Continual Learning with Adaptive Parameter Communication |
Authors | Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, Sung Ju Hwang |
Abstract | There has been a surge of interest in continual learning and federated learning, both of which are important in training deep neural networks in real-world scenarios. Yet little research has been done regarding the scenario where each client learns on a sequence of tasks from private local data. This problem of federated continual learning poses new challenges to continual learning, such as utilizing knowledge and preventing interference from tasks learned on other clients. To resolve these issues, we propose a novel federated continual learning framework, Federated continual learning with Adaptive Parameter Communication, which additively decomposes the network weights into global shared parameters and sparse task-specific parameters. This decomposition makes it possible to minimize interference between incompatible tasks, and also allows inter-client knowledge transfer by communicating the sparse task-specific parameters. Our federated continual learning framework is also communication-efficient, due to the high sparsity of the parameters and sparse parameter updates. We validate APC against existing federated learning and local continual learning methods under varying degrees of task similarity across clients, and show that our model significantly outperforms them with a large reduction in the communication cost. |
Tasks | Continual Learning, Transfer Learning |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03196v2 |
https://arxiv.org/pdf/2003.03196v2.pdf | |
PWC | https://paperswithcode.com/paper/federated-continual-learning-with-adaptive |
Repo | |
Framework | |
DANTE: A framework for mining and monitoring darknet traffic
Title | DANTE: A framework for mining and monitoring darknet traffic |
Authors | Dvir Cohen, Yisroel Mirsky, Yuval Elovici, Rami Puzis, Manuel Kamp, Tobias Martin, Asaf Shabtai |
Abstract | Trillions of network packets are sent over the Internet to destinations which do not exist. This ‘darknet’ traffic captures the activity of botnets and other malicious campaigns aiming to discover and compromise devices around the world. In order to mine threat intelligence from this data, one must be able to handle large streams of logs and represent the traffic patterns in a meaningful way. However, by observing how network ports (services) are used, it is possible to capture the intent of each transmission. In this paper, we present DANTE: a framework and algorithm for mining darknet traffic. DANTE learns the meaning of targeted network ports by applying Word2Vec to observed port sequences. Then, when a host sends a new sequence, DANTE represents the transmission as the average embedding of the ports found in that sequence. Finally, DANTE uses a novel and incremental time-series cluster tracking algorithm on observed sequences to detect recurring behaviors and new emerging threats. To evaluate the system, we ran DANTE on a full year of darknet traffic (over three terabytes) collected by the largest telecommunications provider in Europe, Deutsche Telekom, and analyzed the results. DANTE discovered 1,177 new emerging threats and was able to track malicious campaigns over time. We also compared DANTE to the current best approach and found DANTE to be more practical and effective at detecting darknet traffic patterns. |
Tasks | Time Series |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02575v1 |
https://arxiv.org/pdf/2003.02575v1.pdf | |
PWC | https://paperswithcode.com/paper/dante-a-framework-for-mining-and-monitoring |
Repo | |
Framework | |
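The DANTE abstract above describes representing a host's transmission as the average embedding of the ports in its sequence. A minimal sketch of that averaging step follows. The embedding table here is a hand-written toy with made-up values; in the described system the vectors would come from Word2Vec trained on observed port sequences. The names `PORT_EMBEDDINGS` and `embed_sequence`, and the choice to skip unknown ports, are assumptions of this sketch.

```python
from typing import Dict, List, Sequence

# Hypothetical 3-dimensional port embeddings with made-up values.
PORT_EMBEDDINGS: Dict[int, List[float]] = {
    22: [1.0, 0.0, 0.0],
    23: [0.0, 1.0, 0.0],
    80: [0.0, 0.0, 1.0],
}


def embed_sequence(ports: Sequence[int], table: Dict[int, List[float]]) -> List[float]:
    """Average the embeddings of the ports in one observed sequence.

    Ports missing from the table are skipped (an assumption of this sketch).
    """
    vectors = [table[p] for p in ports if p in table]
    if not vectors:
        raise ValueError("sequence contains no port with a known embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

The resulting fixed-length vectors are what a downstream clustering step could group, regardless of how many ports each sequence contained.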
Building Footprint Generation by Integrating Convolution Neural Network with Feature Pairwise Conditional Random Field (FPCRF)
Title | Building Footprint Generation by Integrating Convolution Neural Network with Feature Pairwise Conditional Random Field (FPCRF) |
Authors | Qingyu Li, Yilei Shi, Xin Huang, Xiao Xiang Zhu |
Abstract | Building footprint maps are vital to many remote sensing applications, such as 3D building modeling, urban planning, and disaster management. Due to the complexity of buildings, the accurate and reliable generation of the building footprint from remote sensing imagery is still a challenging task. In this work, an end-to-end building footprint generation approach that integrates a convolution neural network (CNN) and a graph model is proposed. The CNN serves as the feature extractor, while the graph model can take spatial correlation into consideration. Moreover, we propose to implement the feature pairwise conditional random field (FPCRF) as a graph model to preserve sharp boundaries and fine-grained segmentation. Experiments are conducted on four different datasets: (1) Planetscope satellite imagery of the cities of Munich, Paris, Rome, and Zurich; (2) ISPRS benchmark data from the city of Potsdam; (3) Dstl Kaggle dataset; and (4) Inria Aerial Image Labeling data of Austin, Chicago, Kitsap County, Western Tyrol, and Vienna. It is found that the proposed end-to-end building footprint generation framework with the FPCRF as the graph model further improves the accuracy of building footprint generation compared with using only a CNN, which is the current state of the art. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04600v1 |
https://arxiv.org/pdf/2002.04600v1.pdf | |
PWC | https://paperswithcode.com/paper/building-footprint-generation-by |
Repo | |
Framework | |
Computational optimization of convolutional neural networks using separated filters architecture
Title | Computational optimization of convolutional neural networks using separated filters architecture |
Authors | Elena Limonova, Alexander Sheshkus, Dmitry Nikolaev |
Abstract | This paper considers a convolutional neural network transformation that reduces computation complexity and thus speeds up neural network processing. Usage of convolutional neural networks (CNN) is the standard approach to image recognition despite the fact they can be too computationally demanding, for example for recognition on mobile platforms or in embedded systems. In this paper we propose a CNN structure transformation which expresses 2D convolution filters as a linear combination of separable filters. This makes it possible to obtain separated convolutional filters by standard training algorithms. We study the computation efficiency of this structure transformation and suggest a fast implementation easily handled by CPU or GPU. We demonstrate that CNNs of the proposed structure designed for letter and digit recognition show a 15% speedup without accuracy loss in an industrial image recognition system. In conclusion, we discuss the question of possible accuracy decrease and the application of the proposed transformation to different recognition problems. Keywords: convolutional neural networks, computational optimization, separable filters, complexity reduction. |
Tasks | |
Published | 2020-02-18 |
URL | https://arxiv.org/abs/2002.07754v1 |
https://arxiv.org/pdf/2002.07754v1.pdf | |
PWC | https://paperswithcode.com/paper/computational-optimization-of-convolutional |
Repo | |
Framework | |
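The abstract above builds on separable filters. The sketch below shows the basic identity that motivates them: a 2D filter that is the outer product of a column vector and a row vector can be applied as two 1D passes with the same result, using 2K multiplications per output pixel instead of K*K for a K x K filter. This covers only a single separable filter in plain Python; the paper's linear combination of such filters and its training procedure are not reproduced, and all function names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def correlate2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Sliding-window correlation without padding ('valid' output size)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def outer(col: List[float], row: List[float]) -> Matrix:
    """Build the rank-1 2D kernel col * row^T."""
    return [[c * r for r in row] for c in col]


def separable_correlate(image: Matrix, col: List[float], row: List[float]) -> Matrix:
    """Apply the 1 x K row filter, then the K x 1 column filter."""
    horizontal = correlate2d_valid(image, [row])
    return correlate2d_valid(horizontal, [[c] for c in col])
```

Since correlation is linear, a filter written as a weighted sum of several such outer products can be applied as the same weighted sum of separable passes.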