Paper Group ANR 375
An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer. On Norm-Agnostic Robustness of Adversarial Training. Homogeneous Linear Inequality Constraints for Neural Network Activations. Quantum Inflation: A General Approach to Quantum Causal Compatibility. Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research. The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation. Recurrent Control Nets for Deep Reinforcement Learning. Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence. Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy. Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring. Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation. Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling. Multiplayer Bandit Learning, from Competition to Cooperation. RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation. Measurable Counterfactual Local Explanations for Any Classifier.
An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer
Title | An Algorithm for Approximating Continuous Functions on Compact Subsets with a Neural Network with one Hidden Layer |
Authors | Elliott Zaresky-Williams |
Abstract | George Cybenko’s landmark 1989 paper showed that there exists a feedforward neural network, with exactly one hidden layer (and a finite number of neurons), that can approximate a given continuous function $f$ on the unit hypercube arbitrarily well. The paper did not address how to find the weights/parameters of such a network, or whether finding them would be computationally feasible. This paper outlines an algorithm for a neural network with exactly one hidden layer to reconstruct any continuous scalar- or vector-valued function. |
Tasks | |
Published | 2019-02-10 |
URL | http://arxiv.org/abs/1902.03638v1 |
http://arxiv.org/pdf/1902.03638v1.pdf | |
PWC | https://paperswithcode.com/paper/an-algorithm-for-approximating-continuous |
Repo | |
Framework | |
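
The paper’s own reconstruction algorithm is not spelled out in the abstract, so the sketch below only illustrates the setting of Cybenko’s theorem: fitting a one-hidden-layer sigmoid network to a continuous function on $[0,1]$ by plain gradient descent. The target function, hidden width, learning rate, and step count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: any continuous function on [0, 1]; sin is an arbitrary stand-in.
f = lambda t: np.sin(2 * np.pi * t)
x = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
y = f(x)

H = 32  # hidden width -- the theorem only promises that some finite H works
W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10000):
    h = sigmoid(x @ W1 + b1)            # hidden-layer activations
    pred = h @ W2 + b2                  # linear output layer
    err = pred - y
    # Backprop for mean-squared-error loss.
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * h * (1 - h)
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print("max abs error after training:", np.abs(pred - y).max())
```

Cybenko’s result guarantees that some finite width suffices for any tolerance; it says nothing about gradient descent actually finding those weights, which is exactly the gap the paper addresses.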
On Norm-Agnostic Robustness of Adversarial Training
Title | On Norm-Agnostic Robustness of Adversarial Training |
Authors | Bai Li, Changyou Chen, Wenlin Wang, Lawrence Carin |
Abstract | Adversarial examples are carefully perturbed inputs for fooling machine learning models. A well-acknowledged defense method against such examples is adversarial training, where adversarial examples are injected into training data to increase robustness. In this paper, we propose a new attack to unveil an undesired property of the state-of-the-art adversarial training: it fails to obtain robustness against perturbations in $\ell_2$ and $\ell_\infty$ norms simultaneously. We discuss a possible solution to this issue, as well as its limitations. |
Tasks | |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06455v1 |
https://arxiv.org/pdf/1905.06455v1.pdf | |
PWC | https://paperswithcode.com/paper/on-norm-agnostic-robustness-of-adversarial |
Repo | |
Framework | |
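
The abstract does not define the paper’s new attack, so as context here is a minimal numpy sketch of the two standard gradient steps involved: the $\ell_\infty$ (FGSM-style) sign step and the $\ell_2$ normalized-gradient step. The toy gradient is made up; in practice it would be the loss gradient with respect to the input.

```python
import numpy as np

def linf_step(x, grad, eps):
    """FGSM-style step: move eps in the sign direction, the steepest
    ascent direction under an l_inf budget."""
    return x + eps * np.sign(grad)

def l2_step(x, grad, eps):
    """Steepest ascent under an l_2 budget: move eps along the
    normalized gradient."""
    norm = np.linalg.norm(grad) + 1e-12  # avoid division by zero
    return x + eps * grad / norm

# Toy usage with a made-up gradient; in practice grad = d(loss)/d(input).
x = np.zeros(4)
g = np.array([0.1, -2.0, 0.5, 0.0])
print(linf_step(x, g, eps=0.03))  # moves each coord by +/- eps (zeros stay put)
print(l2_step(x, g, eps=0.5))     # concentrates the budget on large coords
```

A model trained against one of these step types can remain vulnerable to the other, which is the norm-agnosticity gap the paper probes.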
Homogeneous Linear Inequality Constraints for Neural Network Activations
Title | Homogeneous Linear Inequality Constraints for Neural Network Activations |
Authors | Thomas Frerix, Matthias Nießner, Daniel Cremers |
Abstract | We propose a method to impose homogeneous linear inequality constraints of the form $Ax\leq 0$ on neural network activations. The proposed method allows a data-driven training approach to be combined with modeling prior knowledge about the task. One way to achieve this is by means of a projection step at test time after unconstrained training. However, this is an expensive operation. By directly incorporating the constraints into the architecture, we can significantly speed up inference at test time; for instance, our experiments show a speed-up of up to two orders of magnitude over a projection method. Our algorithm computes a suitable parameterization of the feasible set at initialization and uses standard variants of stochastic gradient descent to find solutions to the constrained network. Thus, the modeling constraints are always satisfied during training. Crucially, our approach avoids solving an optimization problem at each training step and avoids manually trading off data and constraint fidelity with additional hyperparameters. We consider constrained generative modeling as an important application domain and experimentally demonstrate the proposed method by constraining a variational autoencoder. |
Tasks | |
Published | 2019-02-05 |
URL | https://arxiv.org/abs/1902.01785v3 |
https://arxiv.org/pdf/1902.01785v3.pdf | |
PWC | https://paperswithcode.com/paper/linear-inequality-constraints-for-neural |
Repo | |
Framework | |
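
The paper computes a parameterization of the feasible set for an arbitrary $A$ at initialization; the sketch below shows the underlying idea only for one hand-picked special case, monotonicity constraints, where the cone $\{x : Ax \le 0\}$ has an obvious closed-form parameterization. The softplus/cumsum construction is an assumption for this case, not the paper’s general algorithm.

```python
import numpy as np

def softplus(u):
    # Numerically stable softplus: log(1 + e^u).
    return np.log1p(np.exp(-np.abs(u))) + np.maximum(u, 0)

def monotone_activations(params):
    """Map unconstrained params onto the feasible cone
    {x : x_1 <= x_2 <= ... <= x_n}, i.e. Ax <= 0 with rows of A of the
    form (..., 1, -1, ...). The first entry is free; softplus makes each
    increment nonnegative."""
    base, raw_increments = params[0], params[1:]
    return base + np.concatenate([[0.0], np.cumsum(softplus(raw_increments))])

# Any unconstrained parameter vector maps to a feasible point, so ordinary
# SGD on `params` never leaves the constraint set -- the key property the
# paper exploits (here for one hand-picked A, not the general construction).
rng = np.random.default_rng(0)
x = monotone_activations(rng.normal(size=8))
assert np.all(np.diff(x) >= 0)
print(x)
```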
Quantum Inflation: A General Approach to Quantum Causal Compatibility
Title | Quantum Inflation: A General Approach to Quantum Causal Compatibility |
Authors | Elie Wolfe, Alejandro Pozas-Kerstjens, Matan Grinberg, Denis Rosset, Antonio Acín, Miguel Navascues |
Abstract | Causality is a seminal concept in science: any research discipline, from sociology and medicine to physics and chemistry, aims at understanding the causes that could explain the correlations observed among some measured variables. While several methods exist to characterize classical causal models, no general construction is known for the quantum case. In this work we present quantum inflation, a systematic technique to falsify whether a given quantum causal model is compatible with some observed correlations. We demonstrate the power of the technique by reproducing known results and solving open problems for some paradigmatic examples of causal networks. Our results may find applications in many fields: from the characterization of correlations in quantum networks to the study of quantum effects in thermodynamic and biological processes. |
Tasks | |
Published | 2019-09-23 |
URL | https://arxiv.org/abs/1909.10519v1 |
https://arxiv.org/pdf/1909.10519v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-inflation-a-general-approach-to |
Repo | |
Framework | |
Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research
Title | Scalable Reinforcement-Learning-Based Neural Architecture Search for Cancer Deep Learning Research |
Authors | Prasanna Balaprakash, Romain Egele, Misha Salim, Stefan Wild, Venkatram Vishwanath, Fangfang Xia, Tom Brettin, Rick Stevens |
Abstract | Cancer is a complex disease, the understanding and treatment of which are being aided through increases in the volume of collected data and in the scale of deployed computing power. Consequently, there is a growing need for the development of data-driven and, in particular, deep learning methods for various tasks such as cancer diagnosis, detection, prognosis, and prediction. Despite recent successes, however, designing high-performing deep learning models for nonimage and nontext cancer data is a time-consuming, trial-and-error, manual task that requires both cancer domain and deep learning expertise. To that end, we develop a reinforcement-learning-based neural architecture search to automate deep-learning-based predictive model development for a class of representative cancer data. We develop custom building blocks that allow domain experts to incorporate the cancer-data-specific characteristics. We show that our approach discovers deep neural network architectures that have significantly fewer trainable parameters, shorter training time, and accuracy similar to or higher than those of manually designed architectures. We study and demonstrate the scalability of our approach on up to 1,024 Intel Knights Landing nodes of the Theta supercomputer at the Argonne Leadership Computing Facility. |
Tasks | Neural Architecture Search |
Published | 2019-09-01 |
URL | https://arxiv.org/abs/1909.00311v1 |
https://arxiv.org/pdf/1909.00311v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-reinforcement-learning-based-neural |
Repo | |
Framework | |
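
As a sketch of the reinforcement-learning-based search loop the abstract describes, the following toy REINFORCE controller samples architectures from a tiny made-up search space and updates its logits from a stand-in reward; the real system trains each sampled network on cancer data and scales the loop across supercomputer nodes. The search space, reward function, and hyperparameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy search space: pick one of several layer widths for each of 3 blocks.
CHOICES = [16, 32, 64, 128]
logits = np.zeros((3, len(CHOICES)))  # controller parameters

def sample_architecture(logits):
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    idx = [rng.choice(len(CHOICES), p=p) for p in probs]
    return idx, probs

def reward(arch_idx):
    """Stand-in for 'train the sampled network and measure validation
    accuracy'. Here: prefer mid-sized widths (a made-up objective)."""
    widths = np.array([CHOICES[i] for i in arch_idx])
    return -np.mean((np.log2(widths) - 5.0) ** 2)

lr, baseline = 0.1, 0.0
for step in range(500):
    idx, probs = sample_architecture(logits)
    r = reward(idx)
    baseline = 0.9 * baseline + 0.1 * r          # moving-average baseline
    for b, i in enumerate(idx):                  # REINFORCE update per block
        grad = -probs[b]                         # d log p(i) / d logits
        grad[i] += 1.0
        logits[b] += lr * (r - baseline) * grad

print("best widths:", [CHOICES[i] for i in np.argmax(logits, axis=1)])
```

The paper’s contribution is largely in the custom, cancer-data-specific building blocks and in scaling this loop to 1,024 nodes; neither is captured by the toy above.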
The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation
Title | The Potential of the Confluence of Theoretical and Algorithmic Modeling in Music Recommendation |
Authors | Christine Bauer |
Abstract | The task of a music recommender system is to predict what music item a particular user would like to listen to next. This position paper discusses the main challenges of the music preference prediction task: the lack of information on the many contextual factors influencing a user’s music preferences in existing open datasets; the lack of clarity about what the right choice of music is, and whether a right choice exists at all; the multitude of criteria (beyond accuracy) that have to be met for a “good” music item recommendation; and the need for explanations of relationships to identify (and potentially counteract) unwanted biases in recommendation approaches. The paper substantiates the position that the confluence of theoretical modeling (which seeks to explain behaviors) and algorithmic modeling (which seeks to predict behaviors) seems to be an effective avenue to take in computational modeling for music recommender systems. |
Tasks | Recommendation Systems |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07328v1 |
https://arxiv.org/pdf/1911.07328v1.pdf | |
PWC | https://paperswithcode.com/paper/the-potential-of-the-confluence-of |
Repo | |
Framework | |
Recurrent Control Nets for Deep Reinforcement Learning
Title | Recurrent Control Nets for Deep Reinforcement Learning |
Authors | Vincent Liu, Ademi Adeniji, Nathaniel Lee, Jason Zhao, Mario Srouji |
Abstract | Central Pattern Generators (CPGs) are biological neural circuits capable of producing coordinated rhythmic outputs in the absence of rhythmic input. As a result, they are responsible for most rhythmic motion in living organisms. This rhythmic control is broadly applicable to fields such as locomotive robotics and medical devices. In this paper, we explore the possibility of creating a self-sustaining CPG network for reinforcement learning that learns rhythmic motion more efficiently and across more general environments than the current multilayer perceptron (MLP) baseline models. Recent work introduces the Structured Control Net (SCN), which maintains linear and nonlinear modules for local and global control, respectively. Here, we show that time-sequence architectures such as Recurrent Neural Networks (RNNs) model CPGs effectively. Combining previous work with RNNs and SCNs, we introduce the Recurrent Control Net (RCN), which adds a linear component to the RNN. RCNs match and exceed the performance of baseline MLPs and SCNs across all environment tasks. Our findings confirm existing intuitions for RNNs on reinforcement learning tasks, and demonstrate the promise of SCN-like structures in reinforcement learning. |
Tasks | |
Published | 2019-01-06 |
URL | http://arxiv.org/abs/1901.01994v2 |
http://arxiv.org/pdf/1901.01994v2.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-control-nets-for-deep-reinforcement |
Repo | |
Framework | |
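
A minimal sketch of the structure the abstract implies: a linear module in parallel with a recurrent nonlinear module, their outputs summed to form the action. The vanilla-RNN cell, layer sizes, and initialization are assumptions; the paper’s exact wiring may differ.

```python
import numpy as np

class RecurrentControlNet:
    """Sketch of an RCN-style policy: linear module plus recurrent
    nonlinear module, outputs summed. Sizes are assumptions."""

    def __init__(self, obs_dim, act_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.K = rng.normal(0, 0.1, (act_dim, obs_dim))     # linear module
        self.Wx = rng.normal(0, 0.1, (hidden, obs_dim))     # recurrent module
        self.Wh = rng.normal(0, 0.1, (hidden, hidden))
        self.Wo = rng.normal(0, 0.1, (act_dim, hidden))
        self.h = np.zeros(hidden)                           # recurrent state

    def act(self, obs):
        self.h = np.tanh(self.Wx @ obs + self.Wh @ self.h)  # nonlinear, global
        return self.K @ obs + self.Wo @ self.h              # linear + nonlinear

policy = RecurrentControlNet(obs_dim=4, act_dim=2)
for _ in range(3):
    print(policy.act(np.ones(4)))  # hidden state evolves, so outputs differ
```

The recurrent state gives the policy a self-sustaining internal dynamic even under constant input, which is what lets it mimic a CPG.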
Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence
Title | Evolving Self-taught Neural Networks: The Baldwin Effect and the Emergence of Intelligence |
Authors | Nam Le |
Abstract | The so-called Baldwin Effect generally describes how learning, as a form of ontogenetic adaptation, can influence the process of phylogenetic adaptation, or evolution. This idea has also been taken into computation, in which evolution and learning are used as computational metaphors, including in evolving neural networks. This paper presents a technique called evolving self-taught neural networks - neural networks that can teach themselves without external supervision or reward. The self-taught neural network is intrinsically motivated. Moreover, the self-taught neural network is the product of the interplay between evolution and learning. We simulate a multi-agent system in which neural networks are used to control autonomous agents. These agents have to forage for resources and compete for their own survival. Experimental results show that the interaction between evolution and the ability to teach oneself in self-taught neural networks outperforms evolution and self-teaching alone. In particular, the emergence of an intelligent foraging strategy is demonstrated through that interaction. Indications for future work on evolving neural networks are also presented. |
Tasks | |
Published | 2019-04-04 |
URL | https://arxiv.org/abs/1906.08854v1 |
https://arxiv.org/pdf/1906.08854v1.pdf | |
PWC | https://paperswithcode.com/paper/evolving-self-taught-neural-networks-the |
Repo | |
Framework | |
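
The paper’s self-teaching signal comes from an internal teacher network; the generic skeleton below substitutes simple lifetime hill-climbing for learning, purely to show the evolution/learning interaction the abstract relies on: selection acts on learned performance while offspring inherit only innate genomes. The task, population sizes, and mutation scale are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = np.array([1.0, -2.0, 0.5])     # hidden optimum of the toy task

def fitness(w):
    return -np.sum((w - TARGET) ** 2)

def lifetime_learning(w, steps=10, sigma=0.05):
    # Stand-in for self-teaching: random local search during the agent's
    # lifetime (the paper uses an internal teacher network instead).
    for _ in range(steps):
        cand = w + sigma * rng.normal(size=w.shape)
        if fitness(cand) > fitness(w):
            w = cand
    return w

pop = rng.normal(size=(30, 3))
for gen in range(40):
    learned = np.array([lifetime_learning(w.copy()) for w in pop])
    scores = np.array([fitness(w) for w in learned])
    # Selection acts on *learned* performance, but offspring inherit the
    # *innate* genome -- the interaction behind the Baldwin Effect.
    parents = pop[np.argsort(scores)[-10:]]
    pop = parents[rng.integers(0, 10, size=30)] + 0.1 * rng.normal(size=(30, 3))

print("best innate fitness:", max(fitness(w) for w in pop))
```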
Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy
Title | Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy |
Authors | Sebastian Ament, John Gregoire, Carla Gomes |
Abstract | We propose a novel exponentially-modified Gaussian (EMG) mixture residual model. The EMG mixture is well suited to model residuals that are contaminated by a distribution with positive support. This is in contrast to commonly used robust residual models, like the Huber loss or $\ell_1$, which assume a symmetric contaminating distribution and are otherwise asymptotically biased. We propose an expectation-maximization algorithm to optimize an arbitrary model with respect to the EMG mixture. We apply the approach to linear regression and probabilistic matrix factorization (PMF). We compare against other residual models, including quantile regression. Our numerical experiments demonstrate the strengths of the EMG mixture on both tasks. The PMF model arises from considering spectroscopic data. In particular, we demonstrate the effectiveness of PMF in conjunction with the EMG mixture model on synthetic data and two real-world applications: X-ray diffraction and Raman spectroscopy. We show how our approach is effective in inferring background signals and systematic errors in data arising from these experimental settings, dramatically outperforming existing approaches and revealing the data’s physically meaningful components. |
Tasks | |
Published | 2019-02-14 |
URL | http://arxiv.org/abs/1902.05601v1 |
http://arxiv.org/pdf/1902.05601v1.pdf | |
PWC | https://paperswithcode.com/paper/exponentially-modified-gaussian-mixture-model |
Repo | |
Framework | |
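
The paper gives an EM algorithm for fitting arbitrary models under the EMG mixture residual; the sketch below shows just the E-step under assumed, fixed parameters, using scipy’s `exponnorm` (whose shape parameter is $K = 1/(\sigma\lambda)$) for the EMG density. The parameter values and synthetic contamination are illustrative.

```python
import numpy as np
from scipy.stats import norm, exponnorm

# Residual model from the abstract: a Gaussian "inlier" component mixed
# with an exponentially-modified Gaussian (EMG) component, whose
# exponential tail has positive support. Values here are assumptions.
sigma, lam, w = 0.1, 2.0, 0.3          # noise std, EMG rate, mixture weight
K = 1.0 / (sigma * lam)                # scipy's exponnorm shape parameter

def responsibilities(r):
    """E-step: posterior probability that each residual r came from the
    EMG (contaminating) component rather than the Gaussian one."""
    p_emg = w * exponnorm.pdf(r, K, loc=0.0, scale=sigma)
    p_gauss = (1 - w) * norm.pdf(r, loc=0.0, scale=sigma)
    return p_emg / (p_emg + p_gauss)

# Synthetic residuals: mostly Gaussian, 30% contaminated by a positive tail.
rng = np.random.default_rng(0)
r = rng.normal(0, sigma, 500)
r[:150] += rng.exponential(1 / lam, 150)
gamma = responsibilities(r)
print("mean responsibility, contaminated vs clean:",
      gamma[:150].mean(), gamma[150:].mean())
```

The M-step (not shown) would reweight the model fit by these responsibilities, which is what lets the approach separate background signal from symmetric noise.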
Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring
Title | Scale- and Context-Aware Convolutional Non-intrusive Load Monitoring |
Authors | Kunjin Chen, Yu Zhang, Qin Wang, Jun Hu, Hang Fan, Jinliang He |
Abstract | Non-intrusive load monitoring addresses the challenging task of decomposing the aggregate signal of a household’s electricity consumption into appliance-level data without installing dedicated meters. By detecting load malfunction and recommending energy reduction programs, cost-effective non-intrusive load monitoring provides intelligent demand-side management for utilities and end users. In this paper, we boost the accuracy of energy disaggregation with a novel neural network structure named scale- and context-aware network, which exploits multi-scale features and contextual information. Specifically, we develop a multi-branch architecture with multiple receptive field sizes and branch-wise gates that connect the branches in the sub-networks. We build a self-attention module to facilitate the integration of global context, and we incorporate an adversarial loss and on-state augmentation to further improve the model’s performance. Extensive simulations on open datasets corroborate the merits of the proposed approach, which significantly outperforms state-of-the-art methods. |
Tasks | Non-Intrusive Load Monitoring |
Published | 2019-11-17 |
URL | https://arxiv.org/abs/1911.07183v1 |
https://arxiv.org/pdf/1911.07183v1.pdf | |
PWC | https://paperswithcode.com/paper/scale-and-context-aware-convolutional-non |
Repo | |
Framework | |
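
A loose PyTorch sketch of the multi-scale ingredient from the abstract: parallel convolution branches with different receptive fields merged through branch-wise gates. The kernel sizes, sigmoid gating, and 1-D layout are assumptions; the full model also includes self-attention, an adversarial loss, and on-state augmentation, none of which are shown.

```python
import torch
import torch.nn as nn

class MultiBranchBlock(nn.Module):
    """Parallel 1-D conv branches with different receptive fields,
    merged through learned branch-wise gates."""

    def __init__(self, channels, kernel_sizes=(3, 7, 15)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, k, padding=k // 2)
            for k in kernel_sizes)
        self.gates = nn.Parameter(torch.zeros(len(kernel_sizes)))

    def forward(self, x):                      # x: (batch, channels, time)
        g = torch.sigmoid(self.gates)          # one gate per branch
        return sum(g[i] * b(x) for i, b in enumerate(self.branches))

block = MultiBranchBlock(channels=8)
y = block(torch.randn(2, 8, 128))
print(y.shape)  # torch.Size([2, 8, 128])
```

The differing kernel sizes let short transient spikes and long steady-state appliance signatures be captured in the same block, which is the multi-scale motivation the abstract gives.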
Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation
Title | Multi-Objective Evolutionary Framework for Non-linear System Identification: A Comprehensive Investigation |
Authors | Faizal Hafiz, Akshya Swain, Eduardo MAM Mendes |
Abstract | The present study proposes a multi-objective framework for structure selection of nonlinear systems which are represented by polynomial NARX models. This framework integrates the key components of Multi-Criteria Decision Making (MCDM), which include preference handling, Multi-Objective Evolutionary Algorithms (MOEAs) and a posteriori selection. To this end, three well-known MOEAs, NSGA-II, SPEA-II and MOEA/D, are thoroughly investigated to determine whether there exists any significant difference in their search performance. The sensitivity of all these MOEAs to various qualitative and quantitative parameters, such as the choice of recombination mechanism and the crossover and mutation probabilities, is also studied. These issues are critically analyzed considering seven discrete-time and one continuous-time benchmark nonlinear systems as well as a practical case study of non-linear wave-force modeling. The results of this investigation demonstrate that MOEAs can be tailored to determine the correct structure of nonlinear systems. Further, it has been established through frequency-domain analysis that it is possible to identify multiple valid discrete-time models for continuous-time systems. A rigorous statistical analysis of MOEAs via performance sweet spots in the parameter space convincingly demonstrates that these algorithms are robust over a wide range of control parameters. |
Tasks | Decision Making |
Published | 2019-08-17 |
URL | https://arxiv.org/abs/1908.06232v1 |
https://arxiv.org/pdf/1908.06232v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-objective-evolutionary-framework-for |
Repo | |
Framework | |
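
To make the multi-objective structure-selection framing concrete, the sketch below scores candidate polynomial NARX structures on two objectives (fit error and term count) and extracts the Pareto front. Exhaustive search over a toy term dictionary stands in for NSGA-II/SPEA-II/MOEA/D, and the synthetic system and dictionary are assumptions.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Toy NARX term dictionary: each candidate structure is a subset of these.
TERMS = {
    "y1":   lambda y, u, k: y[k-1],
    "u1":   lambda y, u, k: u[k-1],
    "y1^2": lambda y, u, k: y[k-1] ** 2,
    "y1u1": lambda y, u, k: y[k-1] * u[k-1],
    "u1^2": lambda y, u, k: u[k-1] ** 2,
}

# Synthetic data from a known structure: y(k) = 0.5 y(k-1) + 0.8 u(k-1)^2.
u = rng.uniform(-1, 1, 300)
y = np.zeros(300)
for k in range(1, 300):
    y[k] = 0.5 * y[k-1] + 0.8 * u[k-1] ** 2 + 0.01 * rng.normal()

def objectives(subset):
    """Two objectives from the MCDM framing: model error and model size."""
    X = np.column_stack([[TERMS[t](y, u, k) for k in range(1, 300)]
                         for t in subset])
    theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    mse = np.mean((y[1:] - X @ theta) ** 2)
    return mse, len(subset)

# Exhaustive search stands in for the MOEAs on this tiny space.
cands = [c for n in range(1, 4) for c in combinations(TERMS, n)]
scores = [(objectives(c), c) for c in cands]
pareto = [(s, c) for s, c in scores
          if not any(o[0] <= s[0] and o[1] <= s[1] and o != s
                     for o, _ in scores)]
for (mse, size), c in sorted(pareto, key=lambda t: t[0][1]):
    print(f"size={size}  mse={mse:.5f}  terms={c}")
```

The a posteriori selection step of the framework then amounts to picking one structure from this front according to the analyst’s preferences.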
Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling
Title | Dynamic Network Embedding via Incremental Skip-gram with Negative Sampling |
Authors | Hao Peng, Jianxin Li, Hao Yan, Qiran Gong, Senzhang Wang, Lin Liu, Lihong Wang, Xiang Ren |
Abstract | Network representation learning, as an approach to learning low-dimensional representations of vertices, has attracted considerable research attention recently. It has been proven extremely useful in many machine learning tasks over large graphs. Most existing methods focus on learning the structural representations of vertices in a static network, but cannot guarantee an accurate and efficient embedding in a dynamic network scenario. To address this issue, we present an efficient incremental skip-gram algorithm with negative sampling for dynamic network embedding, and provide a set of theoretical analyses to characterize the performance guarantee. Specifically, we first partition a dynamic network into an updated part, covering the addition/deletion of links and vertices, and a retained part over time. Then we factorize the objective function of network embedding into the added, vanished and retained parts of the network. Next we provide a new stochastic gradient-based method, guided by the partitions of the network, to update the node and parameter vectors. The proposed algorithm is proven to yield an objective function value with a bounded difference from that of the original objective function. Experimental results show that our proposal can significantly reduce the training time while preserving comparable performance. We also demonstrate the correctness of the theoretical analysis and the practical usefulness of the dynamic network embedding. We perform extensive experiments on multiple real-world large network datasets over multi-label classification and link prediction tasks to evaluate the effectiveness and efficiency of the proposed framework; a speedup of up to 22 times has been achieved. |
Tasks | Link Prediction, Multi-Label Classification, Network Embedding, Representation Learning |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03586v1 |
https://arxiv.org/pdf/1906.03586v1.pdf | |
PWC | https://paperswithcode.com/paper/dynamic-network-embedding-via-incremental |
Repo | |
Framework | |
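
A numpy sketch of the skip-gram-with-negative-sampling update that incremental dynamic-embedding methods re-apply locally: when an edge is added, only vectors around its endpoints are touched, rather than retraining the whole embedding. The gradient expressions are the standard SGNS ones; the paper’s incremental bookkeeping (partitioning into added/vanished/retained parts) is not shown, and negative-sample collisions with true neighbors are ignored for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def sgns_update(emb, ctx, i, j, neg_ids, lr=0.025):
    """One SGNS step for an observed edge (i, j) -- the building block an
    incremental method re-applies only to the changed parts of the graph."""
    # Positive pair: push the embedding of i toward the context vector of j.
    g = 1.0 - sigmoid(emb[i] @ ctx[j])
    grad_i = g * ctx[j]
    ctx[j] += lr * g * emb[i]
    # Negative samples: push i away from random non-neighbors.
    for n in neg_ids:
        g = -sigmoid(emb[i] @ ctx[n])
        grad_i += g * ctx[n]
        ctx[n] += lr * g * emb[i]
    emb[i] += lr * grad_i

V, D = 100, 16
emb = rng.normal(0, 0.1, (V, D))   # vertex embeddings
ctx = rng.normal(0, 0.1, (V, D))   # context vectors

# A newly added edge triggers only local updates around its endpoints:
sgns_update(emb, ctx, i=3, j=7, neg_ids=rng.integers(0, V, 5))
```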
Multiplayer Bandit Learning, from Competition to Cooperation
Title | Multiplayer Bandit Learning, from Competition to Cooperation |
Authors | Simina Brânzei, Yuval Peres |
Abstract | The stochastic multi-armed bandit model captures the tradeoff between exploration and exploitation. We study the effects of competition and cooperation on this tradeoff. Suppose there are $k$ arms and two players, Alice and Bob. In every round, each player pulls an arm, receives the resulting reward, and observes the choice of the other player but not their reward. Alice’s utility is $\Gamma_A + \lambda \Gamma_B$ (and similarly for Bob), where $\Gamma_A$ is Alice’s total reward and $\lambda \in [-1, 1]$ is a cooperation parameter. At $\lambda = -1$ the players are competing in a zero-sum game, at $\lambda = 1$, they are fully cooperating, and at $\lambda = 0$, they are neutral: each player’s utility is their own reward. The model is related to the economics literature on strategic experimentation, where players usually observe each other’s rewards. With discount factor $\beta$, the Gittins index reduces the one-player problem to the comparison between a risky arm, with a prior $\mu$, and a predictable arm, with success probability $p$. The value of $p$ where the player is indifferent between the arms is the Gittins index $g = g(\mu,\beta) > m$, where $m$ is the mean of the risky arm. We show that competing players explore less than a single player: there is $p^* \in (m, g)$ so that for all $p > p^*$, the players stay at the predictable arm. However, the players are not myopic: they still explore for some $p > m$. On the other hand, cooperating players explore more than a single player. We also show that neutral players learn from each other, receiving strictly higher total rewards than they would playing alone, for all $p \in (p^*, g)$, where $p^*$ is the threshold from the competing case. Finally, we show that competing and neutral players eventually settle on the same arm in every Nash equilibrium, while this can fail for cooperating players. |
Tasks | |
Published | 2019-08-03 |
URL | https://arxiv.org/abs/1908.01135v2 |
https://arxiv.org/pdf/1908.01135v2.pdf | |
PWC | https://paperswithcode.com/paper/multiplayer-bandit-learning-from-competition |
Repo | |
Framework | |
RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation
Title | RRNet: Repetition-Reduction Network for Energy Efficient Decoder of Depth Estimation |
Authors | Sangyun Oh, Hye-Jin S. Kim, Jongeun Lee, Junmo Kim |
Abstract | We introduce the Repetition-Reduction network (RRNet) for resource-constrained depth estimation, offering significantly improved efficiency in terms of computation, memory and energy consumption. The proposed method is based on repetition-reduction (RR) blocks. The RR blocks consist of a set of repeated convolutions and a residual connection layer that takes the place of the pointwise reduction layer, with a linear connection to the decoder. RRNet helps reduce memory usage and power consumption in the residual connections to the decoder layers. RRNet consumes approximately 3.84 times less energy and 3.06 times less memory and is approximately 2.21 times faster, without increasing the demand on hardware resources relative to the baseline network (Godard et al., CVPR’17), outperforming current state-of-the-art lightweight architectures such as SqueezeNet, ShuffleNet, MobileNet and PyDNet. |
Tasks | Depth Estimation |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09707v2 |
https://arxiv.org/pdf/1907.09707v2.pdf | |
PWC | https://paperswithcode.com/paper/rrnet-repetition-reduction-network-for-energy |
Repo | |
Framework | |
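
The abstract describes RR blocks only loosely, so the PyTorch sketch below is a guess at the shape: a stack of repeated convolutions wrapped in a residual connection that feeds the decoder in place of a 1x1 pointwise reduction. The repetition count and channel sizes are assumptions, and the real block may differ substantially.

```python
import torch
import torch.nn as nn

class RRBlock(nn.Module):
    """Loose sketch of a repetition-reduction block: repeated convs plus
    a residual connection on the path to the decoder."""

    def __init__(self, channels, repeats=3):
        super().__init__()
        self.repeated = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1),
                          nn.ReLU(inplace=True))
            for _ in range(repeats)])

    def forward(self, x):
        return x + self.repeated(x)   # residual connection to the decoder path

feat = torch.randn(1, 16, 32, 32)
print(RRBlock(16)(feat).shape)        # torch.Size([1, 16, 32, 32])
```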
Measurable Counterfactual Local Explanations for Any Classifier
Title | Measurable Counterfactual Local Explanations for Any Classifier |
Authors | Adam White, Artur d’Avila Garcez |
Abstract | We propose a novel method for explaining the predictions of any classifier. In our approach, local explanations are expected to explain both the outcome of a prediction and how that prediction would change if ‘things had been different’. Furthermore, we argue that satisfactory explanations cannot be dissociated from a notion and measure of fidelity, as advocated in the early days of neural networks’ knowledge extraction. We introduce a definition of fidelity to the underlying classifier for local explanation models which is based on distances to a target decision boundary. A system called CLEAR (Counterfactual Local Explanations via Regression) is introduced and evaluated. CLEAR generates w-counterfactual explanations that state the minimum changes necessary to flip a prediction’s classification. CLEAR then builds local regression models, using the w-counterfactuals to measure and improve the fidelity of its regressions. By contrast, the popular LIME method, which also uses regression to generate local explanations, neither measures its own fidelity nor generates counterfactuals. CLEAR’s regressions are found to have significantly higher fidelity than LIME’s, averaging over 45% higher in this paper’s four case studies. |
Tasks | |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.03020v2 |
https://arxiv.org/pdf/1908.03020v2.pdf | |
PWC | https://paperswithcode.com/paper/measurable-counterfactual-local-explanations |
Repo | |
Framework | |
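
A small scikit-learn sketch of the two mechanisms the abstract pairs: searching one feature for the minimal change that flips a black-box prediction (a crude stand-in for CLEAR’s w-counterfactuals) and fitting a LIME-style local surrogate whose fidelity can then be checked against that flip point. The toy data, single-feature search, and agreement check are all simplifying assumptions; CLEAR’s actual fidelity measure uses distances to the decision boundary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Black box to explain: any classifier with .predict would do.
X = rng.normal(size=(500, 2))
labels = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
blackbox = LogisticRegression().fit(X, labels)

def counterfactual(x, feature, clf, grid=np.linspace(-3, 3, 601)):
    """Minimal single-feature change that flips clf's prediction -- a
    crude stand-in for CLEAR's w-counterfactuals."""
    base = clf.predict([x])[0]
    cands = np.tile(x, (len(grid), 1))
    cands[:, feature] = grid
    flipped = clf.predict(cands) != base
    if not flipped.any():
        return None
    dist = np.where(flipped, np.abs(grid - x[feature]), np.inf)
    return cands[np.argmin(dist)]

x0 = np.array([0.4, -0.1])
cf = counterfactual(x0, feature=0, clf=blackbox)
print("counterfactual:", cf)

# LIME-style local surrogate around x0; the counterfactual provides a
# point at which fidelity is checkable: the surrogate should flip there too.
Z = x0 + 0.5 * rng.normal(size=(200, 2))
surrogate = LogisticRegression().fit(Z, blackbox.predict(Z))
print("surrogate flips at counterfactual:",
      surrogate.predict([cf])[0] != blackbox.predict([x0])[0])
```

Per the abstract, CLEAR goes further by using such counterfactuals not just to audit the local regression but to improve it, which LIME does not do.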