January 28, 2020


Paper Group ANR 801

The Ant Swarm Neuro-Evolution Procedure for Optimizing Recurrent Networks. Few-shot Adaptive Faster R-CNN. Semi-Supervised Multitask Learning on Multispectral Satellite Images Using Wasserstein Generative Adversarial Networks (GANs) for Predicting Poverty. Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems. Improvin …

The Ant Swarm Neuro-Evolution Procedure for Optimizing Recurrent Networks


Title	The Ant Swarm Neuro-Evolution Procedure for Optimizing Recurrent Networks
Authors	AbdElRahman A. ElSaid, Alexander G. Ororbia, Travis J. Desell
Abstract	Hand-crafting effective and efficient structures for recurrent neural networks (RNNs) is a difficult, expensive, and time-consuming process. To address this challenge, we propose a novel neuro-evolution algorithm based on ant colony optimization (ACO), called ant swarm neuro-evolution (ASNE), for directly optimizing RNN topologies. The procedure selects from multiple modern recurrent cell types such as Delta-RNN, GRU, LSTM, MGU and UGRNN cells, as well as recurrent connections which may span multiple layers and/or steps of time. In order to introduce an inductive bias that encourages the formation of sparser synaptic connectivity patterns, we investigate several variations of the core algorithm. We do so primarily by formulating different functions that drive the underlying pheromone simulation process (which mimic L1 and L2 regularization in standard machine learning) as well as by introducing ant agents with specialized roles (inspired by how real ant colonies operate), i.e., explorer ants that construct the initial feed forward structure and social ants which select nodes from the feed forward connections to subsequently craft recurrent memory structures. We also incorporate a Lamarckian strategy for weight initialization which reduces the number of backpropagation epochs required to locally train candidate RNNs, speeding up the neuro-evolution process. Our results demonstrate that the sparser RNNs evolved by ASNE significantly outperform traditional one and two layer architectures consisting of modern memory cells, as well as the well-known NEAT algorithm. Furthermore, we improve upon prior state-of-the-art results on the time series dataset utilized in our experiments.
Tasks	L2 Regularization, Time Series
Published	2019-09-26
URL	https://arxiv.org/abs/1909.11849v2
PDF	https://arxiv.org/pdf/1909.11849v2.pdf
PWC	https://paperswithcode.com/paper/the-ant-swarm-neuro-evolution-procedure-for
Repo
Framework
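
The abstract describes pheromone functions that mimic L1 and L2 regularization. As a minimal sketch of that idea, the snippet below shows pheromone-proportional edge selection with two decay rules. The function names, the `floor` parameter, and the exact decay forms (constant subtraction for the L1-like rule, proportional shrinkage for the L2-like rule) are assumptions made for illustration; they are not taken from the paper.

```python
import random


def choose_edge(pheromones, rng):
    """Pick an edge with probability proportional to its pheromone value."""
    edges = list(pheromones)
    weights = [pheromones[e] for e in edges]
    return rng.choices(edges, weights=weights, k=1)[0]


def decay_l1(pheromones, rate, floor=0.01):
    """Assumed L1-like decay: subtract the same constant from every edge."""
    return {e: max(floor, p - rate) for e, p in pheromones.items()}


def decay_l2(pheromones, rate, floor=0.01):
    """Assumed L2-like decay: shrink every edge in proportion to its value."""
    return {e: max(floor, p * (1.0 - rate)) for e, p in pheromones.items()}


def reward(pheromones, path, amount):
    """Deposit extra pheromone on the edges used by a well-performing network."""
    updated = dict(pheromones)
    for edge in path:
        updated[edge] += amount
    return updated
```

Under these assumed rules, the constant subtraction drives weakly used edges to the floor quickly, while the proportional rule mostly shrinks the strongest edges. That matches the usual intuition for L1 versus L2 penalties.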

Few-shot Adaptive Faster R-CNN


Title	Few-shot Adaptive Faster R-CNN
Authors	Tao Wang, Xiaopeng Zhang, Li Yuan, Jiashi Feng
Abstract	To mitigate the detection performance drop caused by domain shift, we aim to develop a novel few-shot adaptation approach that requires only a few target domain images with limited bounding box annotations. To this end, we first observe several significant challenges. First, the target domain data is highly insufficient, making most existing domain adaptation methods ineffective. Second, object detection involves simultaneous localization and classification, further complicating the model adaptation process. Third, the model suffers from over-adaptation (similar to overfitting when training with a few data examples) and instability risk that may lead to degraded detection performance in the target domain. To address these challenges, we first introduce a pairing mechanism over source and target features to alleviate the issue of insufficient target domain samples. We then propose a bi-level module to adapt the source trained detector to the target domain: 1) the split pooling based image level adaptation module uniformly extracts and aligns paired local patch features over locations, with different scales and aspect ratios; 2) the instance level adaptation module semantically aligns paired object features while avoiding inter-class confusion. Meanwhile, a source model feature regularization (SMFR) is applied to stabilize the adaptation process of the two modules. Combining these contributions gives a novel few-shot adaptive Faster-RCNN framework, termed FAFRCNN, which effectively adapts to the target domain with a few labeled samples. Experiments with multiple datasets show that our model achieves new state-of-the-art performance under both the few-shot domain adaptation (FDA) and unsupervised domain adaptation (UDA) settings of interest.
Tasks	Domain Adaptation, Object Detection, Unsupervised Domain Adaptation
Published	2019-03-22
URL	http://arxiv.org/abs/1903.09372v1
PDF	http://arxiv.org/pdf/1903.09372v1.pdf
PWC	https://paperswithcode.com/paper/few-shot-adaptive-faster-r-cnn
Repo
Framework

Semi-Supervised Multitask Learning on Multispectral Satellite Images Using Wasserstein Generative Adversarial Networks (GANs) for Predicting Poverty


Title	Semi-Supervised Multitask Learning on Multispectral Satellite Images Using Wasserstein Generative Adversarial Networks (GANs) for Predicting Poverty
Authors	Anthony Perez, Swetava Ganguli, Stefano Ermon, George Azzari, Marshall Burke, David Lobell
Abstract	Obtaining reliable data describing local poverty metrics at a granularity that is informative to policy-makers requires expensive and logistically difficult surveys, particularly in the developing world. Not surprisingly, the poverty-stricken regions are also the ones which have a high probability of being a war zone, have poor infrastructure and sometimes have governments that do not cooperate with internationally funded development efforts. We train a CNN on free and publicly available daytime satellite images of the African continent from Landsat 7 to build a model for predicting local economic livelihoods. Only 5% of the satellite images can be associated with labels (which are obtained from DHS Surveys), and thus a semi-supervised approach using a GAN is used (similar to the approach of Salimans, et al. (2016)), albeit with a more stable-to-train flavor of GANs called the Wasserstein GAN regularized with gradient penalty (Gulrajani, et al. (2017)). The method of multitask learning is employed to regularize the network and also create an end-to-end model for the prediction of multiple poverty metrics.
Tasks
Published	2019-02-13
URL	http://arxiv.org/abs/1902.11110v2
PDF	http://arxiv.org/pdf/1902.11110v2.pdf
PWC	https://paperswithcode.com/paper/semi-supervised-multitask-learning-on
Repo
Framework
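
For readers unfamiliar with the gradient-penalty regularizer mentioned in the abstract, the critic objective of the Wasserstein GAN with gradient penalty, as given by Gulrajani et al. (2017), can be written as below. Here $D$ is the critic, $P_r$ and $P_g$ are the real and generated distributions, $\hat{x}$ is sampled along straight lines between real and generated samples, and $\lambda$ is the penalty weight. How the semi-supervised and multitask terms of this particular paper are combined with it is not stated in the abstract, so they are not shown.

```latex
L_{\text{critic}}
  = \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
  - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
  + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big]
```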

Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems


Title	Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems
Authors	Zhe Dong, Bryan A. Seybold, Kevin P. Murphy, Hung H. Bui
Abstract	We propose an efficient inference method for switching nonlinear dynamical systems. The key idea is to learn an inference network which can be used as a proposal distribution for the continuous latent variables, while performing exact marginalization of the discrete latent variables. This allows us to use the reparameterization trick, and apply end-to-end training with stochastic gradient descent. We show that the proposed method can successfully segment time series data, including videos and 3D human pose, into meaningful "regimes" by using the piece-wise nonlinear dynamics.
Tasks	Time Series
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09588v2
PDF	https://arxiv.org/pdf/1910.09588v2.pdf
PWC	https://paperswithcode.com/paper/collapsed-amortized-variational-inference-for
Repo
Framework
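
Exact marginalization of a discrete chain of regimes is commonly done with the forward algorithm. The sketch below shows that step in log space for a generic chain. The function names and the three inputs (`log_init`, `log_trans`, `log_emit`) are hypothetical; in the paper's setting the per-step terms would come from the model given sampled continuous latents, which is not reproduced here.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_marginal(log_init, log_trans, log_emit):
    """Log-likelihood of a sequence with the discrete states summed out.

    log_init[k]     : log probability of starting in state k
    log_trans[j][k] : log probability of moving from state j to state k
    log_emit[t][k]  : log-likelihood term at step t under state k
    """
    num_states = len(log_init)
    alpha = [log_init[k] + log_emit[0][k] for k in range(num_states)]
    for t in range(1, len(log_emit)):
        alpha = [
            log_emit[t][k]
            + logsumexp([alpha[j] + log_trans[j][k] for j in range(num_states)])
            for k in range(num_states)
        ]
    return logsumexp(alpha)
```

The cost is linear in sequence length and quadratic in the number of states, rather than exponential as it would be when enumerating every state path.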

Improving learnability of neural networks: adding supplementary axes to disentangle data representation


Title	Improving learnability of neural networks: adding supplementary axes to disentangle data representation
Authors	Kim Bukweon, Lee Sung Min, Seo Jin Keun
Abstract	Over-parameterized deep neural networks have proven to be able to learn an arbitrary dataset with 100% training accuracy. Because of a risk of overfitting and computational cost issues, we cannot afford to increase the number of network nodes if we want to achieve better training results for medical images. Previous deep learning research shows that the training ability of a neural network improves dramatically (for the same epoch of training) when a few nodes with supplementary information are added to the network. These few informative nodes allow the network to learn features that are otherwise difficult to learn by generating a disentangled data representation. This paper analyzes how concatenation of additional information as supplementary axes affects the training of the neural networks. This analysis was conducted for a simple multilayer perceptron (MLP) classification model with a rectified linear unit (ReLU) on two-dimensional training data. We compared the networks with and without concatenation of supplementary information to support our analysis. The model with concatenation showed more robust and accurate training results compared to the model without concatenation. We also confirmed that our findings are valid for deeper convolutional neural networks (CNN) using ultrasound images and for a conditional generative adversarial network (cGAN) using the MNIST data.
Tasks
Published	2019-02-12
URL	http://arxiv.org/abs/1902.04205v1
PDF	http://arxiv.org/pdf/1902.04205v1.pdf
PWC	https://paperswithcode.com/paper/improving-learnability-of-neural-networks
Repo
Framework
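
A toy illustration of why a supplementary axis can help: XOR-labelled points in two dimensions cannot be split by a single linear boundary, but they can once one extra coordinate is concatenated. The XOR data, the product feature `x1 * x2`, and the hand-picked weights are all assumptions chosen for this example; the paper's own supplementary information and networks are different.

```python
def add_supplementary_axis(point):
    """Concatenate one extra coordinate (here the product x1 * x2) to a 2-D point."""
    x1, x2 = point
    return (x1, x2, x1 * x2)


def linear_classify(features, weights, bias):
    """Return 1 when the linear score is positive, otherwise 0."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0


# XOR-labelled points: not linearly separable in the original two axes.
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# A single hand-picked linear boundary in the augmented 3-D space.
weights, bias = (1.0, 1.0, -2.0), -0.5
predictions = [
    linear_classify(add_supplementary_axis(point), weights, bias)
    for point, _ in xor_data
]
```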

Accurate Trajectory Prediction for Autonomous Vehicles


Title	Accurate Trajectory Prediction for Autonomous Vehicles
Authors	Michael Diodato, Yu Li, Antonia Lovjer, Minsu Yeom, Albert Song, Yiyang Zeng, Abhay Khosla, Benedikt Schifferer, Manik Goyal, Iddo Drori
Abstract	Predicting vehicle trajectories, angle and speed is important for safe and comfortable driving. We demonstrate the best predicted angle, speed, and best performance overall winning the top three places of the ICCV 2019 Learning to Drive challenge. Our key contributions are (i) a general neural network system architecture which embeds and fuses together multiple inputs by encoding, and decodes multiple outputs using neural networks, (ii) using pre-trained neural networks for augmenting the given input data with segmentation maps and semantic information, and (iii) leveraging the form and distribution of the expected output in the model.
Tasks	Autonomous Vehicles, Trajectory Prediction
Published	2019-11-18
URL	https://arxiv.org/abs/1911.08568v1
PDF	https://arxiv.org/pdf/1911.08568v1.pdf
PWC	https://paperswithcode.com/paper/accurate-trajectory-prediction-for-autonomous
Repo
Framework

Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks


Title	Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks
Authors	David Stutz, Matthias Hein, Bernt Schiele
Abstract	Adversarial training yields robust models against a specific threat model, e.g., $L_\infty$ adversarial examples. Typically robustness does not generalize to previously unseen threat models, e.g., other $L_p$ norms, or larger perturbations. Our confidence-calibrated adversarial training (CCAT) tackles this problem by biasing the model towards low confidence predictions on adversarial examples. By allowing to reject examples with low confidence, robustness generalizes beyond the threat model employed during training. CCAT, trained only on $L_\infty$ adversarial examples, increases robustness against larger $L_\infty$, $L_2$, $L_1$ and $L_0$ attacks, adversarial frames, distal adversarial examples and corrupted examples and yields better clean accuracy compared to adversarial training. For thorough evaluation we developed novel white- and black-box attacks directly attacking CCAT by maximizing confidence. For each threat model, we use $7$ attacks with up to $50$ restarts and $5000$ iterations and report worst-case robust test error, extended to our confidence-thresholded setting, across all attacks.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06259v3
PDF	https://arxiv.org/pdf/1910.06259v3.pdf
PWC	https://paperswithcode.com/paper/confidence-calibrated-adversarial-training
Repo
Framework
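
The rejection step described in the abstract can be sketched as a confidence threshold on the softmax output. The function names and the threshold value are hypothetical, and this shows only the test-time rejection rule, not the CCAT training procedure.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(logits, threshold):
    """Return (label, confidence); label is None when confidence is below threshold."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None, confidence
    return probs.index(confidence), confidence
```

A model trained to output near-uniform predictions on adversarial examples would see most of them fall below the threshold and be rejected, which is the mechanism the abstract relies on.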

Dilated Convolution with Dilated GRU for Music Source Separation


Title	Dilated Convolution with Dilated GRU for Music Source Separation
Authors	Jen-Yu Liu, Yi-Hsuan Yang
Abstract	Stacked dilated convolutions used in Wavenet have been shown to be effective for generating high-quality audio. By replacing pooling/striding with dilation in convolution layers, they can preserve high-resolution information and still reach distant locations. Producing high-resolution predictions is also crucial in music source separation, whose goal is to separate different sound sources while maintaining the quality of the separated sounds. Therefore, this paper investigates using stacked dilated convolutions as the backbone for music source separation. However, while stacked dilated convolutions can reach wider context than standard convolutions, their effective receptive fields are still fixed and may not be wide enough for complex music audio signals. To reach information at remote locations, we propose to combine dilated convolution with a modified version of gated recurrent units (GRU) called the 'Dilated GRU' to form a block. A Dilated GRU unit receives information from k steps before instead of the previous step for a fixed k. This modification allows a GRU unit to reach a location with fewer recurrent steps and run faster because it can execute partially in parallel. We show that the proposed model with a stack of such blocks performs as well as or better than the state-of-the-art models for separating vocals and accompaniments.
Tasks	Music Source Separation
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01203v1
PDF	https://arxiv.org/pdf/1906.01203v1.pdf
PWC	https://paperswithcode.com/paper/dilated-convolution-with-dilated-gru-for
Repo
Framework
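
The recurrence change described in the abstract (reading the state from k steps back rather than from the previous step) can be sketched with a scalar GRU. The parameter layout, the zero initial state, and the gating convention `h = (1 - z) * h_prev + z * h_cand` are assumptions for illustration; the paper's block also includes dilated convolutions, which are omitted here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dilated_gru(inputs, k, params):
    """Scalar GRU whose recurrent input is the state from k steps earlier.

    params = (wz, uz, wr, ur, wh, uh): input and recurrent weights for the
    update gate, reset gate, and candidate state (biases omitted for brevity).
    """
    wz, uz, wr, ur, wh, uh = params
    states = []
    for t, x in enumerate(inputs):
        # Standard GRU would use states[t - 1]; the dilated variant looks k back.
        h_prev = states[t - k] if t >= k else 0.0
        z = sigmoid(wz * x + uz * h_prev)
        r = sigmoid(wr * x + ur * h_prev)
        h_cand = math.tanh(wh * x + uh * (r * h_prev))
        states.append((1.0 - z) * h_prev + z * h_cand)
    return states
```

With dilation k the sequence splits into k interleaved chains that never read each other's state, so the chains can be computed side by side. This is one way to read the abstract's remark about partial parallel execution.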

What does it mean to understand a neural network?


Title	What does it mean to understand a neural network?
Authors	Timothy P. Lillicrap, Konrad P. Kording
Abstract	We can define a neural network that can learn to recognize objects in less than 100 lines of code. However, after training, it is characterized by millions of weights that contain the knowledge about many object types across visual scenes. Such networks are thus dramatically easier to understand in terms of the code that makes them than the resulting properties, such as tuning or connections. In analogy, we conjecture that rules for development and learning in brains may be far easier to understand than their resulting properties. The analogy suggests that neuroscience would benefit from a focus on learning and development.
Tasks
Published	2019-07-15
URL	https://arxiv.org/abs/1907.06374v1
PDF	https://arxiv.org/pdf/1907.06374v1.pdf
PWC	https://paperswithcode.com/paper/what-does-it-mean-to-understand-a-neural
Repo
Framework

Crypto Mining Makes Noise


Title	Crypto Mining Makes Noise
Authors	Maurantonio Caprolu, Simone Raponi, Gabriele Oligeri, Roberto Di Pietro
Abstract	A new cybersecurity attack (cryptojacking) is emerging, in both the literature and in the wild, where an adversary illicitly runs Crypto-client software on the devices of unaware users. This attack has proven to be very effective given the simplicity of running a Crypto-client on a target device, e.g., by means of web-based Java scripting. In this scenario, we propose Crypto-Aegis, a solution to detect and identify Crypto-client network traffic, even when it is VPN-ed. In detail, our contributions are the following: (i) We identify and model a new type of attack, i.e., the sponge-attack, being a generalization of cryptojacking; (ii) We provide a detailed analysis of real network traffic generated by 3 major cryptocurrencies; (iii) We investigate how VPN tunneling shapes the network traffic generated by Crypto-clients by considering two major VPN brands; (iv) We propose Crypto-Aegis, a Machine Learning (ML) based framework that builds over the previous steps to detect crypto-mining activities; and, finally, (v) We compare our results against competing solutions in the literature. Evidence from our experimental campaign shows the exceptional quality and viability of our solution: Crypto-Aegis achieves an F1-score of 0.96 and an AUC of 0.99. Given the extent and novelty of the addressed threat, we believe that our approach and our results, other than being interesting on their own, also pave the way for further research in this area.
Tasks
Published	2019-10-21
URL	https://arxiv.org/abs/1910.09272v1
PDF	https://arxiv.org/pdf/1910.09272v1.pdf
PWC	https://paperswithcode.com/paper/crypto-mining-makes-noise
Repo
Framework

Secure Distributed On-Device Learning Networks With Byzantine Adversaries


Title	Secure Distributed On-Device Learning Networks With Byzantine Adversaries
Authors	Yanjie Dong, Julian Cheng, Md. Jahangir Hossain, Victor C. M. Leung
Abstract	A privacy concern arises when the central server holds copies of the datasets. Hence, there is a paradigm shift for the learning networks to change from centralized in-cloud learning to distributed on-device learning. Benefiting from parallel computing, the on-device learning networks have a lower bandwidth requirement than the in-cloud learning networks. Moreover, the on-device learning networks also have several desirable characteristics such as privacy preservation and flexibility. However, the on-device learning networks are vulnerable to malfunctioning terminals across the networks. The worst-case malfunctioning terminals are the Byzantine adversaries, which can perform arbitrary harmful operations to compromise the learned model based on full knowledge of the networks. Hence, the design of secure learning algorithms becomes an emerging topic in the on-device learning networks with Byzantine adversaries. In this article, we present a comprehensive overview of the prevalent secure learning algorithms for the two promising on-device learning networks: Federated-Learning networks and decentralized-learning networks. We also review several future research directions in the Federated-Learning and decentralized-learning networks.
Tasks
Published	2019-06-03
URL	https://arxiv.org/abs/1906.00887v1
PDF	https://arxiv.org/pdf/1906.00887v1.pdf
PWC	https://paperswithcode.com/paper/190600887
Repo
Framework

Active Learning with TensorBoard Projector


Title	Active Learning with TensorBoard Projector
Authors	Francois Luus, Naweed Khan, Ismail Akhalwaya
Abstract	An ML-based system for interactive labeling of image datasets is contributed in TensorBoard Projector to speed up image annotation performed by humans. The tool visualizes feature spaces and makes them directly editable by online integration of applied labels, and it is a system for verifying and managing machine learning data pertaining to labels. We propose realistic annotation emulation to evaluate the system design of interactive active learning, based on our improved semi-supervised extension of t-SNE dimensionality reduction. Our active learning tool can significantly increase labeling efficiency compared to uncertainty sampling, and we show that fewer than 100 labeling actions are typically sufficient for good classification on a variety of specialized image datasets. Our contribution is unique given that it needs to perform dimensionality reduction, feature space visualization and editing, interactive label propagation, low-complexity active learning, human perceptual modeling, annotation emulation and unsupervised feature extraction for specialized datasets in a production-quality implementation.
Tasks	Active Learning, Dimensionality Reduction
Published	2019-01-03
URL	http://arxiv.org/abs/1901.00675v1
PDF	http://arxiv.org/pdf/1901.00675v1.pdf
PWC	https://paperswithcode.com/paper/active-learning-with-tensorboard-projector
Repo
Framework

TMAV: Temporal Motionless Analysis of Video using CNN in MPSoC


Title	TMAV: Temporal Motionless Analysis of Video using CNN in MPSoC
Authors	Somdip Dey, Amit K. Singh, Dilip K. Prasad, Klaus D. McDonald-Maier
Abstract	Analyzing video for traffic categorization is an important pillar of Intelligent Transport Systems. However, it is difficult to analyze and predict traffic based on image frames because the representation of each frame may vary significantly within a short time period. This would also inaccurately represent the traffic over a longer period of time, as in the case of video. We propose a novel bio-inspired methodology that integrates analysis of the previous image frames of the video to represent the analysis of the current image frame, the same way a human being analyzes the current situation based on past experience. In our proposed methodology, called IRON-MAN (Integrated Rational prediction and Motionless ANalysis), we utilize Bayesian update on top of the individual image frame analysis in the videos, and this has resulted in highly accurate prediction of Temporal Motionless Analysis of the Videos (TMAV) for most of the chosen test cases. The proposed approach could be used for TMAV using Convolutional Neural Network (CNN) for applications where the number of objects in an image is the deciding factor for prediction, and results also show that our proposed approach outperforms the state-of-the-art for the chosen test case. We also introduce a new metric named Energy Consumption per Training Image (ECTI). Since different CNN-based models have different training capabilities and computing resource utilization, some of the models are more suitable for embedded device implementation than others, and the ECTI metric is useful to assess the suitability of using a CNN model in multi-processor systems-on-chips (MPSoCs) with a focus on energy consumption and reliability in terms of lifespan of the embedded device using these MPSoCs.
Tasks
Published	2019-02-15
URL	http://arxiv.org/abs/1902.05657v2
PDF	http://arxiv.org/pdf/1902.05657v2.pdf
PWC	https://paperswithcode.com/paper/tmav-temporal-motionless-analysis-of-video
Repo
Framework
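
The idea of applying a Bayesian update on top of per-frame analysis can be sketched as follows. Treating each frame's class scores as a likelihood, the uniform prior in the example, and the function names are all assumptions for illustration; the abstract does not specify how IRON-MAN forms its likelihoods.

```python
def bayes_update(prior, likelihood):
    """Posterior is proportional to prior times likelihood, then normalized."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        # No class is supported by both prior and evidence; keep the prior.
        return list(prior)
    return [u / total for u in unnormalized]


def fuse_frames(frame_scores, prior):
    """Fold the per-frame class scores into one belief, frame by frame."""
    belief = list(prior)
    for scores in frame_scores:
        belief = bayes_update(belief, scores)
    return belief


# Two classes (e.g. "heavy" vs "light" traffic); the last frame disagrees.
frames = [[0.8, 0.2], [0.8, 0.2], [0.3, 0.7]]
belief = fuse_frames(frames, prior=[0.5, 0.5])
```

In this example the belief still favors the first class after the third frame, since one disagreeing frame does not outweigh the two earlier ones.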

Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective


Title	Learning Curves for Deep Neural Networks: A Gaussian Field Theory Perspective
Authors	Omry Cohen, Or Malka, Zohar Ringel
Abstract	A series of recent works established a rigorous correspondence between very wide deep neural networks (DNNs), trained in a particular manner, and noiseless Bayesian Inference with a certain Gaussian Process (GP) known as the Neural Tangent Kernel (NTK). Here we extend a known field-theory formalism for GP inference to get a detailed understanding of learning curves in DNNs trained in the regime of this correspondence (NTK regime). In particular, a renormalization-group approach is used to show that noiseless GP inference using NTK, which lacks a good analytical handle, can be well approximated by noisy GP inference on a related kernel we call the renormalized NTK. Following this, a perturbation-theory analysis is carried out in one over the dataset size, yielding analytical expressions for the (fixed-teacher/fixed-target) leading and sub-leading asymptotics of the learning curves. At least for uniform datasets, a coherent picture emerges wherein fully-connected DNNs have a strong implicit bias towards functions which are low order polynomials of the input.
Tasks	Bayesian Inference, Gaussian Processes
Published	2019-06-12
URL	https://arxiv.org/abs/1906.05301v2
PDF	https://arxiv.org/pdf/1906.05301v2.pdf
PWC	https://paperswithcode.com/paper/learning-curves-for-deep-neural-networks-a
Repo
Framework

Stochastic Fairness and Language-Theoretic Fairness in Planning on Nondeterministic Domains


Title	Stochastic Fairness and Language-Theoretic Fairness in Planning on Nondeterministic Domains
Authors	Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin
Abstract	We address two central notions of fairness in the literature of planning on nondeterministic fully observable domains. The first, which we call stochastic fairness, is classical, and assumes an environment which operates probabilistically using possibly unknown probabilities. The second, which is language-theoretic, assumes that if an action is taken from a given state infinitely often then all its possible outcomes should appear infinitely often (we call this state-action fairness). While the two notions coincide for standard reachability goals, they diverge for temporally extended goals. This important difference has been overlooked in the planning literature, and we argue has led to confusion in a number of published algorithms which use reductions that were stated for state-action fairness, for which they are incorrect, while being correct for stochastic fairness. We remedy this and provide an optimal sound and complete algorithm for solving state-action fair planning for LTL/LTLf goals, as well as a correct proof of the lower bound of the goal-complexity (our proof is general enough that it provides new proofs also for the no-fairness and stochastic-fairness cases). Overall, we show that stochastic fairness is better behaved than state-action fairness.
Tasks
Published	2019-12-24
URL	https://arxiv.org/abs/1912.11203v1
PDF	https://arxiv.org/pdf/1912.11203v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-fairness-and-language-theoretic
Repo
Framework