Paper Group ANR 375
An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer. On Norm-Agnostic Robustness of Adversarial Training. Homogeneous Linear Inequality Constraints for Neural Network Activations. Quantum Inflation: A General Approach to Quantum Causal Compatibility. Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research. The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation. Recurrent Control Nets for Deep Reinforcement Learning. Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence. Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy. Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring. Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation. Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling. Multiplayer Bandit Learning, from Competition to Cooperation. RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation. Measurable Counterfactual Local Explanations for Any Classifier.
An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer
Title | An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer |
Authors | Elliott Zaresky-Williams |
Abstract | George Cybenko’s landmark 1989 paper showed that there exists a feedforward neural network, with exactly one hidden layer (and a finite number of neurons), that can approximate a given continuous function $f$ on the unit hypercube arbitrarily well. The paper did not address how to find the weights/parameters of such a network, or whether finding them would be computationally feasible. This paper outlines an algorithm for a neural network with exactly one hidden layer to reconstruct any continuous scalar- or vector-valued function. |
Tasks | |
Published | 2019-02-10 |
URL | http://arxiv.org/abs/1902.03638v1 |
http://arxiv.org/pdf/1902.03638v1.pdf | |
PWC | https://paperswithcode.com/paper/an-algorithm-for-approximating-continuous |
Repo | |
Framework | |
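
The paper’s own reconstruction algorithm is not spelled out in the abstract, so the sketch below only illustrates the setting of Cybenko’s theorem: fitting a one-hidden-layer sigmoid network to a continuous function on $[0,1]$ by plain gradient descent. The target function, hidden width, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: any continuous function on [0, 1]; sin is an arbitrary stand-in.
f = lambda t: np.sin(2 * np.pi * t)
x = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
y = f(x)

H = 32  # hidden width -- the theorem only promises that some finite H works
W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10000):
    h = sigmoid(x @ W1 + b1)            # hidden-layer activations
    pred = h @ W2 + b2                  # linear output layer
    err = pred - y
    # Backprop for mean-squared-error loss.
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * h * (1 - h)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print("max abs error after training:", np.abs(pred - y).max())
```

Cybenko’s result guarantees that some finite width suffices for any tolerance; it says nothing about gradient descent actually finding those weights, which is exactly the gap the paper addresses.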
On Norm-Agnostic Robustness of Adversarial Training
Title | On Norm-Agnostic Robustness of Adversarial Training |
Authors | Bai Li, Changyou Chen, Wenlin Wang, Lawrence Carin |
Abstract | Adversarial examples are carefully perturbed inputs for fooling machine learning models. A well-acknowledged defense method against such examples is adversarial training, where adversarial examples are injected into training data to increase robustness. In this paper, we propose a new attack to unveil an undesired property of the state-of-the-art adversarial training: it fails to obtain robustness against perturbations in $\ell_2$ and $\ell_\infty$ norms simultaneously. We discuss a possible solution to this issue, as well as its limitations. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06455v1 |
https://arxiv.org/pdf/1905.06455v1.pdf | |
PWC | https://paperswithcode.com/paper/on-norm-agnostic-robustness-of-adversarial |
Repo | |
Framework | |
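
The abstract does not define the paper’s new attack, so as context here is a minimal numpy sketch of the two standard gradient steps involved: the $\ell_\infty$ (FGSM-style) sign step and the $\ell_2$ normalized-gradient step. The toy gradient is made up; in practice it would be the loss gradient with respect to the input.

```python
import numpy as np

def linf_step(x, grad, eps):
    """FGSM-style step: move eps in the sign direction, the steepest
    ascent direction under an l_inf budget."""
    return x + eps * np.sign(grad)

def l2_step(x, grad, eps):
    """Steepest ascent under an l_2 budget: move eps along the
    normalized gradient."""
    norm = np.linalg.norm(grad) + 1e-12  # avoid division by zero
    return x + eps * grad / norm

# Toy usage with a made-up gradient; in practice grad = d(loss)/d(input).
x = np.zeros(4)
g = np.array([0.1, -2.0, 0.5, 0.0])
print(linf_step(x, g, eps=0.03))  # moves each coord by +/- eps (zeros stay put)
print(l2_step(x, g, eps=0.5))     # concentrates the budget on large coords
```

A model trained against one of these step types can remain vulnerable to the other, which is the norm-agnosticity gap the paper probes.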
Homogeneous Linear Inequality Constraints for Neural Network Activations
Title | Homogeneous Linear Inequality Constraints for Neural Network Activations |
Authors | Thomas Frerix, Matthias Nießner, Daniel Cremers |
Abstract | We propose a method to impose homogeneous linear inequality constraints of the form $Ax\leq 0$ on neural network activations. The proposed method allows a data-driven training approach to be combined with modeling prior knowledge about the task. One way to achieve this is by means of a projection step at test time after unconstrained training. However, this is an expensive operation. By directly incorporating the constraints into the architecture, we can significantly speed up inference at test time; for instance, our experiments show a speed-up of up to two orders of magnitude over a projection method. Our algorithm computes a suitable parameterization of the feasible set at initialization and uses standard variants of stochastic gradient descent to find solutions to the constrained network. Thus, the modeling constraints are always satisfied during training. Crucially, our approach avoids solving an optimization problem at each training step and avoids manually trading off data and constraint fidelity with additional hyperparameters. We consider constrained generative modeling as an important application domain and experimentally demonstrate the proposed method by constraining a variational autoencoder. |
Tasks | |
Published | 2019-02-05 |
URL | https://arxiv.org/abs/1902.01785v3 |
https://arxiv.org/pdf/1902.01785v3.pdf | |
PWC | https://paperswithcode.com/paper/linear-inequality-constraints-for-neural |
Repo | |
Framework | |
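
The paper computes a parameterization of the feasible set for an arbitrary $A$ at initialization; the sketch below shows the underlying idea only for one hand-picked special case, monotonicity constraints, where the cone $\{x : Ax \le 0\}$ has an obvious closed-form parameterization. The softplus/cumsum construction is an assumption for this case, not the paper’s general algorithm.

```python
import numpy as np

def softplus(u):
    # Numerically stable softplus: log(1 + e^u).
    return np.log1p(np.exp(-np.abs(u))) + np.maximum(u, 0)

def monotone_activations(params):
    """Map unconstrained params onto the feasible cone
    {x : x_1 <= x_2 <= ... <= x_n}, i.e. Ax <= 0 with rows of A of the
    form (..., 1, -1, ...). The first entry is free; softplus makes each
    increment nonnegative."""
    base, raw_increments = params[0], params[1:]
    return base + np.concatenate([[0.0], np.cumsum(softplus(raw_increments))])

# Any unconstrained parameter vector maps to a feasible point, so ordinary
# SGD on `params` never leaves the constraint set -- the key property the
# paper exploits (here for one hand-picked A, not the general construction).
rng = np.random.default_rng(0)
x = monotone_activations(rng.normal(size=8))
assert np.all(np.diff(x) >= 0)
print(x)
```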
Quantum Inflation: A General Approach to Quantum Causal Compatibility
Title | Quantum Inflation: A General Approach to Quantum Causal Compatibility |
Authors | Elie Wolfe, Alejandro Pozas-Kerstjens, Matan Grinberg, Denis Rosset, Antonio Acín, Miguel Navascues |
Abstract | Causality is a seminal concept in science: any research discipline, from sociology and medicine to physics and chemistry, aims at understanding the causes that could explain the correlations observed among some measured variables. While several methods exist to characterize classical causal models, no general construction is known for the quantum case. In this work we present quantum inflation, a systematic technique to falsify whether a given quantum causal model is compatible with some observed correlations. We demonstrate the power of the technique by reproducing known results and solving open problems for some paradigmatic examples of causal networks. Our results may find applications in many fields: from the characterization of correlations in quantum networks to the study of quantum effects in thermodynamic and biological processes. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10519v1 |
https://arxiv.org/pdf/1909.10519v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-inflation-a-general-approach-to |
Repo | |
Framework | |
Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research
Title | Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research |
Authors | Prasanna Balaprakash, Romain Egele, Misha Salim, Stefan Wild, Venkatram Vishwanath, Fangfang Xia, Tom Brettin, Rick Stevens |
Abstract | Cancer is a complex disease, the understanding and treatment of which are being aided through increases in the volume of collected data and in the scale of deployed computing power. Consequently, there is a growing need for the development of data-driven and, in particular, deep learning methods for various tasks such as cancer diagnosis, detection, prognosis, and prediction. Despite recent successes, however, designing high-performing deep learning models for nonimage and nontext cancer data is a time-consuming, trial-and-error, manual task that requires both cancer domain and deep learning expertise. To that end, we develop a reinforcement-learning-based neural architecture search to automate deep-learning-based predictive model development for a class of representative cancer data. We develop custom building blocks that allow domain experts to incorporate the cancer-data-specific characteristics. We show that our approach discovers deep neural network architectures that have significantly fewer trainable parameters, shorter training time, and accuracy similar to or higher than those of manually designed architectures. We study and demonstrate the scalability of our approach on up to 1,024 Intel Knights Landing nodes of the Theta supercomputer at the Argonne Leadership Computing Facility. |
Tasks | Neural Architecture Search |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00311v1 |
https://arxiv.org/pdf/1909.00311v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-reinforcement-learning-based-neural |
Repo | |
Framework | |
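
As a sketch of the reinforcement-learning-based search loop the abstract describes, the following toy REINFORCE controller samples architectures from a tiny made-up search space and updates its logits from a stand-in reward; the real system trains each sampled network on cancer data and scales the loop across supercomputer nodes. The search space, reward function, and hyperparameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy search space: pick one of several layer widths for each of 3 blocks.
CHOICES = [16, 32, 64, 128]
logits = np.zeros((3, len(CHOICES)))  # controller parameters

def sample_architecture(logits):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = [rng.choice(len(CHOICES), p=p) for p in probs]
    return idx, probs

def reward(arch_idx):
    """Stand-in for 'train the sampled network and measure validation
    accuracy'. Here: prefer mid-sized widths (a made-up objective)."""
    widths = np.array([CHOICES[i] for i in arch_idx])
    return -np.mean((np.log2(widths) - 5.0) ** 2)

lr, baseline = 0.1, 0.0
for step in range(500):
    idx, probs = sample_architecture(logits)
    r = reward(idx)
    baseline = 0.9 * baseline + 0.1 * r          # moving-average baseline
    for b, i in enumerate(idx):                  # REINFORCE update per block
        grad = -probs[b]                         # d log p(i) / d logits
        grad[i] += 1.0
        logits[b] += lr * (r - baseline) * grad

print("best widths:", [CHOICES[i] for i in np.argmax(logits, axis=1)])
```

The paper’s contribution is largely in the custom, cancer-data-specific building blocks and in scaling this loop to 1,024 nodes; neither is captured by the toy above.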
The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation
Title | The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation |
Authors | Christine Bauer |
Abstract | The task of a music recommender system is to predict what music item a particular user would like to listen to next. This position paper discusses the main challenges of the music preference prediction task: the lack of information on the many contextual factors influencing a user’s music preferences in existing open datasets; the lack of clarity about what the right choice of music is, and whether a right choice exists at all; the multitude of criteria (beyond accuracy) that have to be met for a “good” music item recommendation; and the need for explanations of relationships to identify (and potentially counteract) unwanted biases in recommendation approaches. The paper substantiates the position that the confluence of theoretical modeling (which seeks to explain behaviors) and algorithmic modeling (which seeks to predict behaviors) seems to be an effective avenue to take in computational modeling for music recommender systems. |
Tasks | Recommendation Systems |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07328v1 |
https://arxiv.org/pdf/1911.07328v1.pdf | |
PWC | https://paperswithcode.com/paper/the-potential-of-the-confluence-of |
Repo | |
Framework | |
Recurrent Control Nets for Deep Reinforcement Learning
Title | Recurrent Control Nets for Deep Reinforcement Learning |
Authors | Vincent Liu, Ademi Adeniji, Nathaniel Lee, Jason Zhao, Mario Srouji |
Abstract | Central Pattern Generators (CPGs) are biological neural circuits capable of producing coordinated rhythmic outputs in the absence of rhythmic input. As a result, they are responsible for most rhythmic motion in living organisms. This rhythmic control is broadly applicable to fields such as locomotive robotics and medical devices. In this paper, we explore the possibility of creating a self-sustaining CPG network for reinforcement learning that learns rhythmic motion more efficiently and across more general environments than the current multilayer perceptron (MLP) baseline models. Recent work introduces the Structured Control Net (SCN), which maintains linear and nonlinear modules for local and global control, respectively. Here, we show that time-sequence architectures such as Recurrent Neural Networks (RNNs) model CPGs effectively. Combining previous work with RNNs and SCNs, we introduce the Recurrent Control Net (RCN), which adds a linear component to the RNN. RCNs match and exceed the performance of baseline MLPs and SCNs across all environment tasks. Our findings confirm existing intuitions for RNNs on reinforcement learning tasks, and demonstrate the promise of SCN-like structures in reinforcement learning. |
Tasks | |
Published | 2019-01-06 |
URL | http://arxiv.org/abs/1901.01994v2 |
http://arxiv.org/pdf/1901.01994v2.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-control-nets-for-deep-reinforcement |
Repo | |
Framework | |
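
A minimal sketch of the structure the abstract implies: a linear module in parallel with a recurrent nonlinear module, their outputs summed to form the action. The vanilla-RNN cell, layer sizes, and initialization are assumptions; the paper’s exact wiring may differ.

```python
import numpy as np

class RecurrentControlNet:
    """Sketch of an RCN-style policy: linear module plus recurrent
    nonlinear module, outputs summed. Sizes are assumptions."""

    def __init__(self, obs_dim, act_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.K = rng.normal(0, 0.1, (act_dim, obs_dim))     # linear module
        self.Wx = rng.normal(0, 0.1, (hidden, obs_dim))     # recurrent module
        self.Wh = rng.normal(0, 0.1, (hidden, hidden))
        self.Wo = rng.normal(0, 0.1, (act_dim, hidden))
        self.h = np.zeros(hidden)                           # recurrent state

    def act(self, obs):
        self.h = np.tanh(self.Wx @ obs + self.Wh @ self.h)  # nonlinear, global
        return self.K @ obs + self.Wo @ self.h              # linear + nonlinear

policy = RecurrentControlNet(obs_dim=4, act_dim=2)
for _ in range(3):
    print(policy.act(np.ones(4)))  # hidden state evolves, so outputs differ
```

The recurrent state gives the policy a self-sustaining internal dynamic even under constant input, which is what lets it mimic a CPG.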
Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence
Title | Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence |
Authors | Nam Le |
Abstract | The so-called Baldwin Effect generally describes how learning, as a form of ontogenetic adaptation, can influence the process of phylogenetic adaptation, or evolution. This idea has also been taken into computation, in which evolution and learning are used as computational metaphors, including in evolving neural networks. This paper presents a technique called evolving self-taught neural networks - neural networks that can teach themselves without external supervision or reward. The self-taught neural network is intrinsically motivated. Moreover, the self-taught neural network is the product of the interplay between evolution and learning. We simulate a multi-agent system in which neural networks are used to control autonomous agents. These agents have to forage for resources and compete for their own survival. Experimental results show that the interaction between evolution and the ability to teach oneself in self-taught neural networks outperforms evolution and self-teaching alone. In particular, the emergence of an intelligent foraging strategy is demonstrated through that interaction. Indications for future work on evolving neural networks are also presented. |
Tasks | |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1906.08854v1 |
https://arxiv.org/pdf/1906.08854v1.pdf | |
PWC | https://paperswithcode.com/paper/evolving-self-taught-neural-networks-the |
Repo | |
Framework | |
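
The paper’s self-teaching signal comes from an internal teacher network; the generic skeleton below substitutes simple lifetime hill-climbing for learning, purely to show the evolution/learning interaction the abstract relies on: selection acts on learned performance while offspring inherit only innate genomes. The task, population sizes, and mutation scale are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = np.array([1.0, -2.0, 0.5])     # hidden optimum of the toy task

def fitness(w):
    return -np.sum((w - TARGET) ** 2)

def lifetime_learning(w, steps=10, sigma=0.05):
    # Stand-in for self-teaching: random local search during the agent's
    # lifetime (the paper uses an internal teacher network instead).
    for _ in range(steps):
        cand = w + sigma * rng.normal(size=w.shape)
        if fitness(cand) > fitness(w):
            w = cand
    return w

pop = rng.normal(size=(30, 3))
for gen in range(40):
    learned = np.array([lifetime_learning(w.copy()) for w in pop])
    scores = np.array([fitness(w) for w in learned])
    # Selection acts on *learned* performance, but offspring inherit the
    # *innate* genome -- the interaction behind the Baldwin Effect.
    parents = pop[np.argsort(scores)[-10:]]
    pop = parents[rng.integers(0, 10, size=30)] + 0.1 * rng.normal(size=(30, 3))

print("best innate fitness:", max(fitness(w) for w in pop))
```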
Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy
Title | Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy |
Authors | Sebastian Ament, John Gregoire, Carla Gomes |
Abstract | We propose a novel exponentially-modified Gaussian (EMG) mixture residual model. The EMG mixture is well suited to model residuals that are contaminated by a distribution with positive support. This is in contrast to commonly used robust residual models, like the Huber loss or $\ell_1$, which assume a symmetric contaminating distribution and are otherwise asymptotically biased. We propose an expectation-maximization algorithm to optimize an arbitrary model with respect to the EMG mixture. We apply the approach to linear regression and probabilistic matrix factorization (PMF). We compare against other residual models, including quantile regression. Our numerical experiments demonstrate the strengths of the EMG mixture on both tasks. The PMF model arises from considering spectroscopic data. In particular, we demonstrate the effectiveness of PMF in conjunction with the EMG mixture model on synthetic data and two real-world applications: X-ray diffraction and Raman spectroscopy. We show how our approach is effective in inferring background signals and systematic errors in data arising from these experimental settings, dramatically outperforming existing approaches and revealing the data’s physically meaningful components. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.05601v1 |
http://arxiv.org/pdf/1902.05601v1.pdf | |
PWC | https://paperswithcode.com/paper/exponentially-modified-gaussian-mixture-model |
Repo | |
Framework | |
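
The paper gives an EM algorithm for fitting arbitrary models under the EMG mixture residual; the sketch below shows just the E-step under assumed, fixed parameters, using scipy’s `exponnorm` (whose shape parameter is $K = 1/(\sigma\lambda)$) for the EMG density. The parameter values and synthetic contamination are illustrative.

```python
import numpy as np
from scipy.stats import norm, exponnorm

# Residual model from the abstract: a Gaussian "inlier" component mixed
# with an exponentially-modified Gaussian (EMG) component, whose
# exponential tail has positive support. Values here are assumptions.
sigma, lam, w = 0.1, 2.0, 0.3          # noise std, EMG rate, mixture weight
K = 1.0 / (sigma * lam)                # scipy's exponnorm shape parameter

def responsibilities(r):
    """E-step: posterior probability that each residual r came from the
    EMG (contaminating) component rather than the Gaussian one."""
    p_emg = w * exponnorm.pdf(r, K, loc=0.0, scale=sigma)
    p_gauss = (1 - w) * norm.pdf(r, loc=0.0, scale=sigma)
    return p_emg / (p_emg + p_gauss)

# Synthetic residuals: mostly Gaussian, 30% contaminated by a positive tail.
rng = np.random.default_rng(0)
r = rng.normal(0, sigma, 500)
r[:150] += rng.exponential(1 / lam, 150)
gamma = responsibilities(r)
print("mean responsibility, contaminated vs clean:",
      gamma[:150].mean(), gamma[150:].mean())
```

The M-step (not shown) would reweight the model fit by these responsibilities, which is what lets the approach separate background signal from symmetric noise.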
Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring
Title | Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring |
Authors | Kunjin Chen, Yu Zhang, Qin Wang, Jun Hu, Hang Fan, Jinliang He |
Abstract | Non-intrusive load monitoring addresses the challenging task of decomposing the aggregate signal of a household’s electricity consumption into appliance-level data without installing dedicated meters. By detecting load malfunction and recommending energy reduction programs, cost-effective non-intrusive load monitoring provides intelligent demand-side management for utilities and end users. In this paper, we boost the accuracy of energy disaggregation with a novel neural network structure named scale- and context-aware network, which exploits multi-scale features and contextual information. Specifically, we develop a multi-branch architecture with multiple receptive field sizes and branch-wise gates that connect the branches in the sub-networks. We build a self-attention module to facilitate the integration of global context, and we incorporate an adversarial loss and on-state augmentation to further improve the model’s performance. Extensive simulations on open datasets corroborate the merits of the proposed approach, which significantly outperforms state-of-the-art methods. |
Tasks | Non-Intrusive Load Monitoring |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07183v1 |
https://arxiv.org/pdf/1911.07183v1.pdf | |
PWC | https://paperswithcode.com/paper/scale-and-context-aware-convolutional-non |
Repo | |
Framework | |
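
A loose PyTorch sketch of the multi-scale ingredient from the abstract: parallel convolution branches with different receptive fields merged through branch-wise gates. The kernel sizes, sigmoid gating, and 1-D layout are assumptions; the full model also includes self-attention, an adversarial loss, and on-state augmentation, none of which are shown.

```python
import torch
import torch.nn as nn

class MultiBranchBlock(nn.Module):
    """Parallel 1-D conv branches with different receptive fields,
    merged through learned branch-wise gates."""

    def __init__(self, channels, kernel_sizes=(3, 7, 15)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes)
        self.gates = nn.Parameter(torch.zeros(len(kernel_sizes)))

    def forward(self, x):                      # x: (batch, channels, time)
        g = torch.sigmoid(self.gates)          # one gate per branch
        return sum(g[i] * b(x) for i, b in enumerate(self.branches))

block = MultiBranchBlock(channels=8)
y = block(torch.randn(2, 8, 128))
print(y.shape)  # torch.Size([2, 8, 128])
```

The differing kernel sizes let short transient spikes and long steady-state appliance signatures be captured in the same block, which is the multi-scale motivation the abstract gives.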
Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation
Title | Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation |
Authors | Faizal Hafiz, Akshya Swain, Eduardo MAM Mendes |
Abstract | The present study proposes a multi-objective framework for structure selection of nonlinear systems which are represented by polynomial NARX models. This framework integrates the key components of Multi-Criteria Decision Making (MCDM), which include preference handling, Multi-Objective Evolutionary Algorithms (MOEAs) and a posteriori selection. To this end, three well-known MOEAs, NSGA-II, SPEA-II and MOEA/D, are thoroughly investigated to determine whether there exists any significant difference in their search performance. The sensitivity of all these MOEAs to various qualitative and quantitative parameters, such as the choice of recombination mechanism and the crossover and mutation probabilities, is also studied. These issues are critically analyzed considering seven discrete-time and one continuous-time benchmark nonlinear systems as well as a practical case study of non-linear wave-force modeling. The results of this investigation demonstrate that MOEAs can be tailored to determine the correct structure of nonlinear systems. Further, it has been established through frequency-domain analysis that it is possible to identify multiple valid discrete-time models for continuous-time systems. A rigorous statistical analysis of MOEAs via performance sweet spots in the parameter space convincingly demonstrates that these algorithms are robust over a wide range of control parameters. |
Tasks | Decision Making |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06232v1 |
https://arxiv.org/pdf/1908.06232v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-objective-evolutionary-framework-for |
Repo | |
Framework | |
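
To make the multi-objective structure-selection framing concrete, the sketch below scores candidate polynomial NARX structures on two objectives (fit error and term count) and extracts the Pareto front. Exhaustive search over a toy term dictionary stands in for NSGA-II/SPEA-II/MOEA/D, and the synthetic system and dictionary are assumptions.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Toy NARX term dictionary: each candidate structure is a subset of these.
TERMS = {
    "y1":   lambda y, u, k: y[k-1],
    "u1":   lambda y, u, k: u[k-1],
    "y1^2": lambda y, u, k: y[k-1] ** 2,
    "y1u1": lambda y, u, k: y[k-1] * u[k-1],
    "u1^2": lambda y, u, k: u[k-1] ** 2,
}

# Synthetic data from a known structure: y(k) = 0.5 y(k-1) + 0.8 u(k-1)^2.
u = rng.uniform(-1, 1, 300)
y = np.zeros(300)
for k in range(1, 300):
    y[k] = 0.5 * y[k-1] + 0.8 * u[k-1] ** 2 + 0.01 * rng.normal()

def objectives(subset):
    """Two objectives from the MCDM framing: model error and model size."""
    X = np.column_stack([[TERMS[t](y, u, k) for k in range(1, 300)]
                         for t in subset])
    theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    mse = np.mean((y[1:] - X @ theta) ** 2)
    return mse, len(subset)

# Exhaustive search stands in for the MOEAs on this tiny space.
cands = [c for n in range(1, 4) for c in combinations(TERMS, n)]
scores = [(objectives(c), c) for c in cands]
pareto = [(s, c) for s, c in scores
          if not any(o[0] <= s[0] and o[1] <= s[1] and o != s
                     for o, _ in scores)]
for (mse, size), c in sorted(pareto, key=lambda t: t[0][1]):
    print(f"size={size}  mse={mse:.5f}  terms={c}")
```

The a posteriori selection step of the framework then amounts to picking one structure from this front according to the analyst’s preferences.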
Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling
Title | Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling |
Authors | Hao Peng, Jianxin Li, Hao Yan, Qiran Gong, Senzhang Wang, Lin Liu, Lihong Wang, Xiang Ren |
Abstract | Network representation learning, as an approach to learning low-dimensional representations of vertices, has attracted considerable research attention recently. It has been proven extremely useful in many machine learning tasks over large graphs. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize the performance guarantee. Specifically, we first partition a dynamic network into an updated part, covering the addition/deletion of links and vertices, and a retained part over time. Then we factorize the objective function of network embedding into the added, vanished and retained parts of the network. Next we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the node and parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference from that of the original objective function. Experimental results show that our proposal can significantly reduce the training time while preserving comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of the dynamic network embedding. We perform extensive experiments on multiple real-world large network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework; a speedup of up to 22 times has been achieved. |
Tasks | Link Prediction, Multi-Label Classification, Network Embedding, Representation Learning |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03586v1 |
https://arxiv.org/pdf/1906.03586v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-network-embedding-via-incremental |
Repo | |
Framework | |
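
A numpy sketch of the skip-gram-with-negative-sampling update that incremental dynamic-embedding methods re-apply locally: when an edge is added, only vectors around its endpoints are touched, rather than retraining the whole embedding. The gradient expressions are the standard SGNS ones; the paper’s incremental bookkeeping (partitioning into added/vanished/retained parts) is not shown, and negative-sample collisions with true neighbors are ignored for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def sgns_update(emb, ctx, i, j, neg_ids, lr=0.025):
    """One SGNS step for an observed edge (i, j) -- the building block an
    incremental method re-applies only to the changed parts of the graph."""
    # Positive pair: push the embedding of i toward the context vector of j.
    g = 1.0 - sigmoid(emb[i] @ ctx[j])
    grad_i = g * ctx[j]
    ctx[j] += lr * g * emb[i]
    # Negative samples: push i away from random non-neighbors.
    for n in neg_ids:
        g = -sigmoid(emb[i] @ ctx[n])
        grad_i += g * ctx[n]
        ctx[n] += lr * g * emb[i]
    emb[i] += lr * grad_i

V, D = 100, 16
emb = rng.normal(0, 0.1, (V, D))   # vertex embeddings
ctx = rng.normal(0, 0.1, (V, D))   # context vectors

# A newly added edge triggers only local updates around its endpoints:
sgns_update(emb, ctx, i=3, j=7, neg_ids=rng.integers(0, V, 5))
```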
Multiplayer Bandit Learning, from Competition to Cooperation
Title | Multiplayer Bandit Learning, from Competition to Cooperation |
Authors | Simina Brânzei, Yuval Peres |
Abstract | The stochastic multi-armed bandit model captures the tradeoff between exploration and exploitation. We study the effects of competition and cooperation on this tradeoff. Suppose there are $k$ arms and two players, Alice and Bob. In every round, each player pulls an arm, receives the resulting reward, and observes the choice of the other player but not their reward. Alice’s utility is $\Gamma_A + \lambda \Gamma_B$ (and similarly for Bob), where $\Gamma_A$ is Alice’s total reward and $\lambda \in [-1, 1]$ is a cooperation parameter. At $\lambda = -1$ the players are competing in a zero-sum game, at $\lambda = 1$, they are fully cooperating, and at $\lambda = 0$, they are neutral: each player’s utility is their own reward. The model is related to the economics literature on strategic experimentation, where players usually observe each other’s rewards. With discount factor $\beta$, the Gittins index reduces the one-player problem to the comparison between a risky arm, with a prior $\mu$, and a predictable arm, with success probability $p$. The value of $p$ where the player is indifferent between the arms is the Gittins index $g = g(\mu,\beta) > m$, where $m$ is the mean of the risky arm. We show that competing players explore less than a single player: there is $p^* \in (m, g)$ so that for all $p > p^*$, the players stay at the predictable arm. However, the players are not myopic: they still explore for some $p > m$. On the other hand, cooperating players explore more than a single player. We also show that neutral players learn from each other, receiving strictly higher total rewards than they would playing alone, for all $p \in (p^*, g)$, where $p^*$ is the threshold from the competing case. Finally, we show that competing and neutral players eventually settle on the same arm in every Nash equilibrium, while this can fail for cooperating players. |
Tasks | |
Published | 2019-08-03 |
URL | https://arxiv.org/abs/1908.01135v2 |
https://arxiv.org/pdf/1908.01135v2.pdf | |
PWC | https://paperswithcode.com/paper/multiplayer-bandit-learning-from-competition |
Repo | |
Framework | |
RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation
Title | RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation |
Authors | Sangyun Oh, Hye-Jin S. Kim, Jongeun Lee, Junmo Kim |
Abstract | We introduce the Repetition-Reduction network (RRNet) for resource-constrained depth estimation, offering significantly improved efficiency in terms of computation, memory and energy consumption. The proposed method is based on repetition-reduction (RR) blocks. The RR blocks consist of a set of repeated convolutions and a residual connection layer that takes the place of the pointwise reduction layer, with a linear connection to the decoder. RRNet helps reduce memory usage and power consumption in the residual connections to the decoder layers. RRNet consumes approximately 3.84 times less energy and 3.06 times less memory and is approximately 2.21 times faster, without increasing the demand on hardware resources relative to the baseline network (Godard et al., CVPR’17), outperforming current state-of-the-art lightweight architectures such as SqueezeNet, ShuffleNet, MobileNet and PyDNet. |
Tasks | Depth Estimation |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09707v2 |
https://arxiv.org/pdf/1907.09707v2.pdf | |
PWC | https://paperswithcode.com/paper/rrnet-repetition-reduction-network-for-energy |
Repo | |
Framework | |
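
The abstract describes RR blocks only loosely, so the PyTorch sketch below is a guess at the shape: a stack of repeated convolutions wrapped in a residual connection that feeds the decoder in place of a 1x1 pointwise reduction. The repetition count and channel sizes are assumptions, and the real block may differ substantially.

```python
import torch
import torch.nn as nn

class RRBlock(nn.Module):
    """Loose sketch of a repetition-reduction block: repeated convs plus
    a residual connection on the path to the decoder."""

    def __init__(self, channels, repeats=3):
        super().__init__()
        self.repeated = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                          nn.ReLU(inplace=True))
            for _ in range(repeats)])

    def forward(self, x):
        return x + self.repeated(x)   # residual connection to the decoder path

feat = torch.randn(1, 16, 32, 32)
print(RRBlock(16)(feat).shape)        # torch.Size([1, 16, 32, 32])
```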
Measurable Counterfactual Local Explanations for Any Classifier
Title | Measurable Counterfactual Local Explanations for Any Classifier |
Authors | Adam White, Artur d’Avila Garcez |
Abstract | We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if ‘things had been different’. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks’ knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR (Counterfactual Local Explanations via Regression) is introduced and evaluated. CLEAR generates w-counterfactual explanations that state the minimum changes necessary to flip a prediction’s classification. CLEAR then builds local regression models, using the w-counterfactuals to measure and improve the fidelity of its regressions. By contrast, the popular LIME method, which also uses regression to generate local explanations, neither measures its own fidelity nor generates counterfactuals. CLEAR’s regressions are found to have significantly higher fidelity than LIME’s, averaging over 45% higher in this paper’s four case studies. |
Tasks | |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03020v2 |
https://arxiv.org/pdf/1908.03020v2.pdf | |
PWC | https://paperswithcode.com/paper/measurable-counterfactual-local-explanations |
Repo | |
Framework | |
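
A small scikit-learn sketch of the two mechanisms the abstract pairs: searching one feature for the minimal change that flips a black-box prediction (a crude stand-in for CLEAR’s w-counterfactuals) and fitting a LIME-style local surrogate whose fidelity can then be checked against that flip point. The toy data, single-feature search, and agreement check are all simplifying assumptions; CLEAR’s actual fidelity measure uses distances to the decision boundary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Black box to explain: any classifier with .predict would do.
X = rng.normal(size=(500, 2))
labels = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
blackbox = LogisticRegression().fit(X, labels)

def counterfactual(x, feature, clf, grid=np.linspace(-3, 3, 601)):
    """Minimal single-feature change that flips clf's prediction -- a
    crude stand-in for CLEAR's w-counterfactuals."""
    base = clf.predict([x])[0]
    cands = np.tile(x, (len(grid), 1))
    cands[:, feature] = grid
    flipped = clf.predict(cands) != base
    if not flipped.any():
        return None
    dist = np.where(flipped, np.abs(grid - x[feature]), np.inf)
    return cands[np.argmin(dist)]

x0 = np.array([0.4, -0.1])
cf = counterfactual(x0, feature=0, clf=blackbox)
print("counterfactual:", cf)

# LIME-style local surrogate around x0; the counterfactual provides a
# point at which fidelity is checkable: the surrogate should flip there too.
Z = x0 + 0.5 * rng.normal(size=(200, 2))
surrogate = LogisticRegression().fit(Z, blackbox.predict(Z))
print("surrogate flips at counterfactual:",
      surrogate.predict([cf])[0] != blackbox.predict([x0])[0])
```

Per the abstract, CLEAR goes further by using such counterfactuals not just to audit the local regression but to improve it, which LIME does not do.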