July 27, 2019

2736 words 13 mins read

Paper Group ANR 742

Stochastic Configuration Networks: Fundamentals and Algorithms. A Computational Framework for Multi-Modal Social Action Identification. Recursive Binary Neural Network Learning Model with 2.28b/Weight Storage Requirement. Source-side Prediction for Neural Headline Generation. A Probabilistic Framework for Nonlinearities in Stochastic Neural Network …

Stochastic Configuration Networks: Fundamentals and Algorithms

Title Stochastic Configuration Networks: Fundamentals and Algorithms
Authors Dianhui Wang, Ming Li
Abstract This paper contributes to the development of randomized methods for neural networks. The proposed learner model is generated incrementally by stochastic configuration (SC) algorithms, termed Stochastic Configuration Networks (SCNs). In contrast to existing randomized learning algorithms for single-layer feed-forward neural networks (SLFNNs), we randomly assign the input weights and biases of the hidden nodes in the light of a supervisory mechanism, and the output weights are analytically evaluated in either a constructive or a selective manner. As fundamentals of SCN-based data modelling techniques, we establish some theoretical results on the universal approximation property. Three versions of SC algorithms are presented for regression problems (applicable to classification problems as well) in this work. Simulation results concerning both function approximation and real-world data regression indicate some remarkable merits of the proposed SCNs: less human intervention in setting the network size, scope adaptation of the random parameters, fast learning, and sound generalization.
Tasks
Published 2017-02-10
URL http://arxiv.org/abs/1702.03180v4
PDF http://arxiv.org/pdf/1702.03180v4.pdf
PWC https://paperswithcode.com/paper/stochastic-configuration-networks
Repo
Framework
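
The incremental construction described in the abstract (randomly propose hidden-node parameters, keep a candidate only if it passes a supervisory check, then solve the output weights analytically) can be sketched as follows. This is a minimal illustration in the spirit of the selective variant, not the authors' reference implementation: the candidate score and the fixed scope `lam` of the random parameters are simplifying assumptions.

```python
import numpy as np

def scn_fit(X, y, max_nodes=50, candidates=20, lam=1.0, tol=1e-3, seed=0):
    """Minimal sketch of stochastic configuration learning for regression.

    Hidden nodes are proposed at random; the candidate most correlated with
    the current residual is kept, and output weights are re-solved by least
    squares over all accepted nodes (in the spirit of the selective variant).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    H = np.empty((n, 0))              # outputs of accepted hidden nodes
    residual = y.copy()
    for _ in range(max_nodes):
        best_h, best_score = None, 0.0
        for _ in range(candidates):
            w = rng.uniform(-lam, lam, size=d)   # random input weights
            b = rng.uniform(-lam, lam)           # random bias
            h = np.tanh(X @ w + b)
            score = (h @ residual) ** 2 / (h @ h)  # residual-reduction proxy
            if score > best_score:
                best_h, best_score = h, score
        H = np.column_stack([H, best_h])
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # analytic output weights
        residual = y - H @ beta
        if np.linalg.norm(residual) < tol:
            break
    return H, beta

# toy usage: approximate a 1-D function
X = np.linspace(-1, 1, 200)[:, None]
y = np.sin(4 * X[:, 0])
H, beta = scn_fit(X, y)
print("training RMSE:", np.sqrt(np.mean((H @ beta - y) ** 2)))
```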

A Computational Framework for Multi-Modal Social Action Identification

Title A Computational Framework for Multi-Modal Social Action Identification
Authors Jason Anastasopoulos, Jake Ryland Williams
Abstract We create a computational framework for understanding social action and demonstrate how this framework can be used to build an open-source event detection tool with scalable statistical machine learning algorithms and a subsampled database of over 600 million geo-tagged Tweets from around the world. These Tweets were collected between April 1st, 2014 and April 30th, 2015, notably the period in which the Black Lives Matter movement began. We demonstrate how these methods can be used diagnostically, by researchers, government officials, and the public, to understand peaceful and violent collective action at very fine-grained levels of time and geography.
Tasks
Published 2017-10-20
URL http://arxiv.org/abs/1710.07728v2
PDF http://arxiv.org/pdf/1710.07728v2.pdf
PWC https://paperswithcode.com/paper/a-computational-framework-for-multi-modal
Repo
Framework

Recursive Binary Neural Network Learning Model with 2.28b/Weight Storage Requirement

Title Recursive Binary Neural Network Learning Model with 2.28b/Weight Storage Requirement
Authors Tianchan Guan, Xiaoyang Zeng, Mingoo Seok
Abstract This paper presents a storage-efficient learning model, termed Recursive Binary Neural Networks, for sensing devices with a limited amount of on-chip data storage, on the order of less than hundreds of kilobytes. The main idea of the proposed model is to recursively recycle the data storage of synaptic weights (parameters) during training. This enables a device with a given storage constraint to train and instantiate a neural network classifier with a larger number of weights on chip and with fewer off-chip storage accesses, yielding higher classification accuracy, shorter training time, less energy dissipation, and a smaller on-chip storage requirement. We verified the training model with deep neural network classifiers on the permutation-invariant MNIST benchmark. Our model uses only 2.28 bits per weight while, under the same data storage constraint, achieving ~1% lower classification error than the conventional binary-weight learning model, which has to use 8 to 16 bits of storage per weight. To achieve a similar classification error, the conventional binary model requires ~4x more data storage for weights than the proposed model.
Tasks
Published 2017-09-15
URL http://arxiv.org/abs/1709.05306v1
PDF http://arxiv.org/pdf/1709.05306v1.pdf
PWC https://paperswithcode.com/paper/recursive-binary-neural-network-learning-1
Repo
Framework
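
For context, the sketch below shows the conventional binary-weight baseline the abstract compares against: a full-precision latent copy of every weight is kept during training (the 8-16 bits/weight cost), while only its sign is used in the forward pass. The paper's contribution, recursively reclaiming that latent storage between training rounds so more 1-bit weights fit on chip, is not reproduced here.

```python
import numpy as np

# Conventional binary-weight training (the baseline in the abstract): a
# full-precision latent weight is kept per parameter during training, but
# only its sign is used in the forward pass. The latent copy is what costs
# 8-16 bits/weight; the paper recursively reclaims that storage after each
# training round. This sketch shows only the baseline's straight-through
# update on dummy data.
rng = np.random.default_rng(0)
W_latent = rng.normal(0, 0.1, size=(784, 10))   # full-precision, on-chip cost
X = rng.normal(size=(32, 784))                  # a dummy mini-batch
y = rng.integers(0, 10, size=32)

for step in range(100):
    W_bin = np.sign(W_latent)                   # 1 bit/weight at inference
    logits = X @ W_bin
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    p[np.arange(32), y] -= 1.0                  # softmax cross-entropy grad
    grad = X.T @ p / 32
    W_latent -= 0.01 * grad                     # straight-through estimator
```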

Source-side Prediction for Neural Headline Generation

Title Source-side Prediction for Neural Headline Generation
Authors Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata
Abstract The encoder-decoder model is widely used in natural language generation tasks. However, the model sometimes suffers from repeated redundant generation, misses important phrases, and includes irrelevant entities. To address these problems, we propose a novel source-side token prediction module. Our method jointly estimates the probability distributions over the source and target vocabularies to capture the correspondence between source and target tokens. Experiments show that the proposed model outperforms the current state-of-the-art method on the headline generation task. Additionally, we show that our method is able to learn a reasonable token-wise correspondence without knowing any true alignments.
Tasks Text Generation
Published 2017-12-22
URL http://arxiv.org/abs/1712.08302v1
PDF http://arxiv.org/pdf/1712.08302v1.pdf
PWC https://paperswithcode.com/paper/source-side-prediction-for-neural-headline
Repo
Framework
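
A hedged sketch of the core idea, a decoder head that jointly predicts distributions over both the target and source vocabularies, is given below in PyTorch. The module and layer names are illustrative assumptions; the paper's exact architecture and loss weighting may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointVocabHead(nn.Module):
    """Sketch of a source-side prediction head: from each decoder state,
    predict both the next target token and a distribution over source-side
    tokens, so training encourages the model to track source/target
    correspondence. Names and sizes are illustrative, not the paper's."""
    def __init__(self, hidden, tgt_vocab, src_vocab):
        super().__init__()
        self.tgt_out = nn.Linear(hidden, tgt_vocab)
        self.src_out = nn.Linear(hidden, src_vocab)

    def forward(self, dec_state, tgt_gold, src_gold):
        # dec_state: (batch, hidden); gold token ids: (batch,)
        loss_tgt = F.cross_entropy(self.tgt_out(dec_state), tgt_gold)
        loss_src = F.cross_entropy(self.src_out(dec_state), src_gold)
        return loss_tgt + loss_src   # jointly estimated, as in the abstract

head = JointVocabHead(hidden=256, tgt_vocab=5000, src_vocab=5000)
state = torch.randn(8, 256)
loss = head(state, torch.randint(0, 5000, (8,)), torch.randint(0, 5000, (8,)))
loss.backward()
```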

A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks

Title A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks
Authors Qinliang Su, Xuejun Liao, Lawrence Carin
Abstract We present a probabilistic framework for nonlinearities, based on doubly truncated Gaussian distributions. By setting the truncation points appropriately, we are able to generate various types of nonlinearities within a unified framework, including the sigmoid, tanh, and ReLU, the most commonly used nonlinearities in neural networks. The framework readily integrates into existing stochastic neural networks (with hidden units characterized as random variables), allowing one, for the first time, to learn the nonlinearities alongside the model weights in these networks. Extensive experiments demonstrate the performance improvements brought about by the proposed framework when integrated with the restricted Boltzmann machine (RBM), the temporal RBM, and the truncated Gaussian graphical model (TGGM).
Tasks
Published 2017-09-18
URL http://arxiv.org/abs/1709.06123v1
PDF http://arxiv.org/pdf/1709.06123v1.pdf
PWC https://paperswithcode.com/paper/a-probabilistic-framework-for-nonlinearities
Repo
Framework
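
The mean activation of such a unit has a closed form via the standard doubly truncated Gaussian mean, which makes the claim easy to see numerically: truncating to [0, ∞) yields a smoothed-ReLU shape, while finite symmetric truncation points yield a saturating, tanh-like response. A small sketch using the textbook truncated-normal formula, not the authors' code:

```python
import numpy as np
from scipy.stats import norm

def truncated_gaussian_mean(mu, sigma, a, b):
    """E[h] for h ~ N(mu, sigma^2) truncated to [a, b] (standard formula)."""
    alpha, beta = (a - mu) / sigma, (b - mu) / sigma
    Z = norm.cdf(beta) - norm.cdf(alpha)
    return mu + sigma * (norm.pdf(alpha) - norm.pdf(beta)) / Z

x = np.linspace(-4, 4, 9)        # pre-activation values
# truncation to [0, inf): a smoothed ReLU
print(np.round(truncated_gaussian_mean(x, 1.0, 0.0, np.inf), 3))
# truncation to [-1, 1]: saturating, tanh-like
print(np.round(truncated_gaussian_mean(x, 1.0, -1.0, 1.0), 3))
```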

BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model

Title BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model
Authors Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, Benjamin Livshits
Abstract We propose a hybrid model of differential privacy that considers a combination of regular and opt-in users, who desire the differential privacy guarantees of the local privacy model and the trusted curator model, respectively. We demonstrate that within this model it is possible to design a new type of blended algorithm for the task of privately computing the head of a search log. This blended approach provides significant improvements in the utility of the obtained data compared to related work, while providing users with their desired privacy guarantees. Specifically, on two large search click data sets of 1.75 GB and 16 GB respectively, our approach attains NDCG values exceeding 95% across a range of privacy budget values.
Tasks
Published 2017-05-02
URL https://arxiv.org/abs/1705.00831v4
PDF https://arxiv.org/pdf/1705.00831v4.pdf
PWC https://paperswithcode.com/paper/blender-enabling-local-search-with-a-hybrid
Repo
Framework
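
The local-model half of such a hybrid can be illustrated with generalized randomized response over a small universe of records: each regular user perturbs their report locally, and the curator debiases the aggregate counts. This is a standard local-DP building block, not BLENDER's actual blended estimator:

```python
import numpy as np

def privatize(true_idx, k, eps, rng):
    """Generalized randomized response: with probability p keep the true
    record index, otherwise report a uniformly random one. With
    p = (e^eps - 1)/(e^eps + k - 1) this satisfies eps-local-DP."""
    p = (np.exp(eps) - 1) / (np.exp(eps) + k - 1)
    return true_idx if rng.random() < p else int(rng.integers(0, k))

def estimate_counts(reports, k, eps):
    """Debias the observed histogram into unbiased frequency estimates."""
    n = len(reports)
    p = (np.exp(eps) - 1) / (np.exp(eps) + k - 1)
    obs = np.bincount(reports, minlength=k)
    return (obs - n * (1 - p) / k) / p

rng = np.random.default_rng(0)
k, eps = 10, 2.0
truth = rng.integers(0, k, size=100_000)        # simulated user records
reports = np.array([privatize(t, k, eps, rng) for t in truth])
print(np.round(estimate_counts(reports, k, eps) / len(truth), 3))
```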

A Correspondence Between Random Neural Networks and Statistical Field Theory

Title A Correspondence Between Random Neural Networks and Statistical Field Theory
Authors Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein
Abstract A number of recent papers have provided evidence that practical design questions about neural networks may be tackled theoretically by studying the behavior of random networks. However, until now the tools available for analyzing random neural networks have been relatively ad hoc. In this work, we show that the distribution of pre-activations in random neural networks can be exactly mapped onto lattice models in statistical physics. We argue that several previous investigations of stochastic networks actually studied a particular factorial approximation to the full lattice model. For random linear networks and random rectified linear networks we show that the corresponding lattice models in the wide-network limit may be systematically approximated by a Gaussian distribution with covariance between the layers of the network. In each case, the approximate distribution can be diagonalized by Fourier transformation. We show that this approximation accurately describes the results of numerical simulations of wide random neural networks. Finally, we demonstrate that in each case the large-scale behavior of the random networks can be approximated by an effective field theory.
Tasks
Published 2017-10-18
URL http://arxiv.org/abs/1710.06570v1
PDF http://arxiv.org/pdf/1710.06570v1.pdf
PWC https://paperswithcode.com/paper/a-correspondence-between-random-neural
Repo
Framework
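
The wide-network Gaussian approximation is easy to probe numerically: sample a wide random ReLU network and check that the empirical pre-activation distribution at each layer stays close to a Gaussian with a layer-dependent variance. A small illustrative experiment, not from the paper:

```python
import numpy as np

# Empirical check of the wide-network Gaussian approximation: for a random
# ReLU network, pre-activations at each layer should look Gaussian as width
# grows (near-zero excess kurtosis), with variance evolving layer by layer.
rng = np.random.default_rng(0)
width, depth, sigma_w = 2048, 5, np.sqrt(2.0)   # He-style weight scale
x = rng.normal(size=width)
for layer in range(depth):
    W = rng.normal(0, sigma_w / np.sqrt(width), size=(width, width))
    z = W @ x                                   # pre-activations
    kurt = ((z - z.mean()) ** 4).mean() / z.var() ** 2 - 3
    print(f"layer {layer}: mean={z.mean():+.3f}, var={z.var():.3f}, "
          f"excess kurtosis={kurt:+.3f}")
    x = np.maximum(z, 0.0)                      # ReLU activations
```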

Over the Air Deep Learning Based Radio Signal Classification

Title Over the Air Deep Learning Based Radio Signal Classification
Authors Timothy J. O’Shea, Tamoghna Roy, T. Charles Clancy
Abstract We conduct an in-depth study of the performance of deep-learning-based radio signal classification for radio communications signals. We consider a rigorous baseline method using higher-order moments and gradient boosted tree classification, and compare performance between the two approaches across a range of configurations and channel impairments. We consider the effects of carrier frequency offset, symbol rate, and multi-path fading in simulation, and conduct over-the-air measurements of radio classification performance in the lab using software radios, comparing performance and training strategies for both. Finally, we conclude with a discussion of remaining problems and design considerations for using such techniques.
Tasks
Published 2017-12-13
URL http://arxiv.org/abs/1712.04578v1
PDF http://arxiv.org/pdf/1712.04578v1.pdf
PWC https://paperswithcode.com/paper/over-the-air-deep-learning-based-radio-signal
Repo
Framework
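
The baseline the abstract mentions builds features from higher-order statistics of the complex baseband signal and feeds them to a boosted tree classifier. A sketch of such moment features is below; the exact moment/cumulant set and classifier configuration used in the paper are assumptions not reproduced here.

```python
import numpy as np

def higher_order_moments(x, max_order=6):
    """Feature sketch: mixed moments M_{p,q} = E[x^(p-q) conj(x)^q] of a
    complex baseband signal x, flattened to real features. These would feed
    a gradient boosted tree classifier in the baseline pipeline."""
    x = x / np.sqrt(np.mean(np.abs(x) ** 2))     # power-normalize
    feats = []
    for p in range(2, max_order + 1):
        for q in range(p // 2 + 1):
            m = np.mean(x ** (p - q) * np.conj(x) ** q)
            feats += [m.real, m.imag]
    return np.array(feats)

# toy usage: QPSK-like symbols through a noisy channel
rng = np.random.default_rng(0)
symbols = rng.choice([1+1j, 1-1j, -1+1j, -1-1j], size=4096) / np.sqrt(2)
noisy = symbols + 0.1 * (rng.normal(size=4096) + 1j * rng.normal(size=4096))
print(higher_order_moments(noisy).round(3))
```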

Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids

Title Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids
Authors Raghuram Bharadwaj Diddigi, D. Sai Koti Reddy, Shalabh Bhatnagar
Abstract We consider the problem of minimizing the difference between the demand and the supply of power using microgrids. We set up multiple microgrids that provide electricity to a village. They have access to batteries that can store renewable power, and also to electrical lines from the main grid. During each time period, these microgrids need to decide how much renewable power to draw from the batteries and how much power to request from the main grid. We formulate this problem in the framework of a Markov Decision Process (MDP), similar to the one discussed in [1]. The power allotment to the village from the main grid is fixed and bounded, whereas the renewable energy generation is uncertain in nature. We therefore adapt a distributed version of the popular reinforcement learning technique Multi-Agent Q-Learning to the problem. Finally, we also consider a variant of this problem in which the cost of power production at the main site is taken into consideration. In this scenario the microgrids need to minimize the demand-supply deficit while maintaining the desired average cost of power production.
Tasks Q-Learning
Published 2017-08-25
URL http://arxiv.org/abs/1708.07732v2
PDF http://arxiv.org/pdf/1708.07732v2.pdf
PWC https://paperswithcode.com/paper/multi-agent-q-learning-for-minimizing-demand
Repo
Framework
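
The learning rule each microgrid agent runs is the standard tabular Q-learning update, shown below on a toy stand-in environment; the state/action discretization and reward are illustrative, not the paper's MDP.

```python
import numpy as np

# Standard tabular Q-learning update, as each microgrid agent would run it.
# States (e.g., discretized battery level and net demand) and actions (how
# much battery power to use) are illustrative discretizations.
n_states, n_actions = 20, 5
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    """Hypothetical environment: reward is the negative demand-supply deficit."""
    deficit = abs((s % 5) - a)               # stand-in dynamics
    return int(rng.integers(n_states)), -float(deficit)

s = 0
for t in range(10_000):
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s_next, r = step(s, a)
    # Q-learning: move Q(s,a) toward r + gamma * max_a' Q(s',a')
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next
```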

Bayesian Conditional Generative Adverserial Networks

Title Bayesian Conditional Generative Adverserial Networks
Authors M. Ehsan Abbasnejad, Qinfeng Shi, Iman Abbasnejad, Anton van den Hengel, Anthony Dick
Abstract Traditional GANs use a deterministic generator function (typically a neural network) to transform a random noise input $z$ into a sample $\mathbf{x}$ that the discriminator seeks to distinguish. We propose a new GAN, called Bayesian Conditional Generative Adversarial Networks (BC-GANs), that uses a random generator function to transform a deterministic input $y'$ into a sample $\mathbf{x}$. Our BC-GANs extend traditional GANs to a Bayesian framework and naturally handle unsupervised, supervised, and semi-supervised learning problems. Experiments show that the proposed BC-GANs outperform the state of the art.
Tasks
Published 2017-06-17
URL http://arxiv.org/abs/1706.05477v1
PDF http://arxiv.org/pdf/1706.05477v1.pdf
PWC https://paperswithcode.com/paper/bayesian-conditional-generative-adverserial
Repo
Framework

Non-line-of-sight tracking of people at long range

Title Non-line-of-sight tracking of people at long range
Authors Susan Chan, Ryan E. Warburton, Genevieve Gariepy, Jonathan Leach, Daniele Faccio
Abstract A remote-sensing system that can determine the position of hidden objects has applications in many critical real-life scenarios, such as search-and-rescue missions and safe autonomous driving. Previous work has shown the ability to range and image objects hidden from the direct line of sight, employing advanced optical imaging technologies aimed at small objects at short range. In this work we demonstrate a long-range tracking system based on single laser illumination and single-pixel single-photon detection. This enables us to track one or more people hidden from view at a stand-off distance of over 50 m. These results pave the way towards next-generation LiDAR systems that will reconstruct not only the direct-view scene but also the main elements hidden behind walls or corners.
Tasks Autonomous Driving
Published 2017-03-02
URL http://arxiv.org/abs/1703.02124v1
PDF http://arxiv.org/pdf/1703.02124v1.pdf
PWC https://paperswithcode.com/paper/non-line-of-sight-tracking-of-people-at-long
Repo
Framework
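
The ranging principle reduces to time-of-flight arithmetic: the photon's total path length is c·t, and subtracting the known laser-to-surface and surface-to-detector legs leaves the round trip to the hidden person. The sketch below assumes, for simplicity, that the illumination and observation spots coincide; full 2-D localization, as in the paper, combines such range constraints from multiple positions.

```python
C = 299_792_458.0  # speed of light, m/s

# Simplified NLOS ranging: the laser hits a surface spot, light scatters to
# the hidden person and back to the same spot, then to the detector. With
# the laser->spot (d_laser) and spot->detector (d_det) legs known, photon
# arrival time t gives the person's distance r from the spot:
#     c * t = d_laser + 2 * r + d_det
# (A coincident illumination/observation spot is a simplifying assumption.)
def hidden_range(t, d_laser, d_det):
    return (C * t - d_laser - d_det) / 2.0

t = 3.75e-7   # 375 ns photon arrival time (illustrative)
print(f"{hidden_range(t, 1.5, 1.5):.1f} m")   # ~54.7 m, a >50 m stand-off
```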

MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks

Title MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
Authors Minmin Chen
Abstract We introduce MinimalRNN, a new recurrent neural network architecture that achieves performance comparable to the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, to better interpretability and trainability. We demonstrate that by adopting this more restrictive update rule, MinimalRNN learns disentangled RNN states. We further examine the learning dynamics of different RNN structures using input-output Jacobians, and show that MinimalRNN is able to capture longer-range dependencies than existing RNN architectures.
Tasks
Published 2017-11-18
URL http://arxiv.org/abs/1711.06788v2
PDF http://arxiv.org/pdf/1711.06788v2.pdf
PWC https://paperswithcode.com/paper/minimalrnn-toward-more-interpretable-and
Repo
Framework
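
The minimal update the abstract alludes to is a single gated convex combination of the previous state and an embedded input. The sketch below follows that description; the exact parameterization of the input embedding and the gate are assumptions, not a reference implementation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class MinimalRNNCell:
    """Sketch of a MinimalRNN-style update: the input is embedded once,
    z_t = tanh(W x_t), and the state is a single gated convex combination
    h_t = u_t * h_{t-1} + (1 - u_t) * z_t. Details are assumptions based
    on the abstract."""
    def __init__(self, d_in, d_hid, rng):
        self.W  = rng.normal(0, 0.1, (d_hid, d_in))   # input embedding
        self.Uh = rng.normal(0, 0.1, (d_hid, d_hid))  # gate from h_{t-1}
        self.Uz = rng.normal(0, 0.1, (d_hid, d_hid))  # gate from z_t
        self.b  = np.zeros(d_hid)

    def step(self, x, h):
        z = np.tanh(self.W @ x)
        u = sigmoid(self.Uh @ h + self.Uz @ z + self.b)
        return u * h + (1.0 - u) * z

rng = np.random.default_rng(0)
cell, h = MinimalRNNCell(8, 16, rng), np.zeros(16)
for x in rng.normal(size=(20, 8)):    # run over a toy sequence
    h = cell.step(x, h)
print(h.round(3))
```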

Deep Neural Networks as 0-1 Mixed Integer Linear Programs: A Feasibility Study

Title Deep Neural Networks as 0-1 Mixed Integer Linear Programs: A Feasibility Study
Authors Matteo Fischetti, Jason Jo
Abstract Deep Neural Networks (DNNs) are very popular these days and are the subject of intense investigation. A DNN is made of layers of internal units (or neurons), each of which computes an affine combination of the outputs of the units in the previous layer, applies a nonlinear operator, and outputs the corresponding value (also known as its activation). A commonly used nonlinear operator is the so-called rectified linear unit (ReLU), whose output is just the maximum of its input value and zero. In this case (and in other similar cases such as max pooling, where the max operation involves more than one input value), one can model the DNN as a 0-1 Mixed Integer Linear Program (0-1 MILP) in which the continuous variables correspond to the output values of each unit, and a binary variable is associated with each ReLU to model its yes/no nature. In this paper we discuss the peculiarities of this kind of 0-1 MILP model and describe an effective bound-tightening technique intended to ease its solution. We also present possible applications of the 0-1 MILP model arising in feature visualization and in the construction of adversarial examples. Preliminary computational results are reported, aimed at investigating (on small DNNs) the computational performance of a state-of-the-art MILP solver when applied to a known test case, namely hand-written digit recognition.
Tasks
Published 2017-12-17
URL http://arxiv.org/abs/1712.06174v1
PDF http://arxiv.org/pdf/1712.06174v1.pdf
PWC https://paperswithcode.com/paper/deep-neural-networks-as-0-1-mixed-integer
Repo
Framework
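
The ReLU encoding sketched in the abstract can be written out explicitly with the usual big-M constraints; the paper's bound-tightening technique is about computing small valid values for the constants. The formulation below is the standard encoding, with one (assumed) sign convention for the binary variable:

```latex
% Big-M encoding of a ReLU unit x = max(0, w^T y + b) as 0-1 MILP constraints.
% M^+ and M^- are valid upper bounds on the positive and negative parts of
% the pre-activation; bound tightening shrinks them.
\begin{align}
  w^\top y + b &= x - s, & x &\ge 0,\; s \ge 0, \\
  x &\le M^+ z,          & s &\le M^- (1 - z), \\
  z &\in \{0, 1\}.
\end{align}
% z = 1 forces s = 0 (unit active:   x = w^T y + b >= 0);
% z = 0 forces x = 0 (unit inactive: s = -(w^T y + b) >= 0).
```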

Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications

Title Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications
Authors Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng
Abstract This paper studies the problem of multivariate linear regression where a portion of the observations is grossly corrupted or missing, and the magnitudes and locations of such occurrences are unknown a priori. To deal with this problem, we propose a new approach that explicitly considers the error source as well as its sparse nature. An interesting property of our approach lies in its ability to allow individual regression output elements or tasks to possess their own unique noise levels. Moreover, despite working with a non-smooth optimization problem, our approach is still guaranteed to converge to its optimal solution. Experiments on synthetic data demonstrate the competitiveness of our approach compared with existing multivariate regression models. In addition, our approach has been empirically validated with very promising results on two exemplar real-world applications: the first concerns the prediction of Big-Five personality traits based on user behaviors at social network sites (SNSs), while the second is 3D human hand pose estimation from depth images. The implementation of our approach and the comparison methods, as well as the involved datasets, are made publicly available in support of the open-source and reproducible research initiatives.
Tasks Hand Pose Estimation, Pose Estimation
Published 2017-01-11
URL http://arxiv.org/abs/1701.02892v1
PDF http://arxiv.org/pdf/1701.02892v1.pdf
PWC https://paperswithcode.com/paper/multivariate-regression-with-grossly
Repo
Framework
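
One way to make the error source and its sparseness explicit, in the spirit of the abstract, is an L1-penalized corruption term fitted by alternating minimization: a least-squares step for the weights and a soft-thresholding step for the corruptions. The sketch below is a simplified stand-in; the paper's formulation additionally allows per-task noise levels and comes with a convergence guarantee for its non-smooth objective.

```python
import numpy as np

def soft_threshold(A, tau):
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def robust_multivariate_regression(X, Y, lam=0.5, iters=50):
    """Minimal sketch: min_{W,G} 0.5*||Y - X W - G||_F^2 + lam*||G||_1,
    where G captures sparse gross corruptions. Alternates a least-squares
    W-step with a soft-thresholding G-step."""
    G = np.zeros_like(Y)
    for _ in range(iters):
        W, *_ = np.linalg.lstsq(X, Y - G, rcond=None)
        G = soft_threshold(Y - X @ W, lam)
    return W, G

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
W_true = rng.normal(size=(10, 3))
Y = X @ W_true + 0.05 * rng.normal(size=(200, 3))
mask = rng.random(Y.shape) < 0.05          # 5% grossly corrupted entries
Y[mask] += rng.normal(0, 10, size=mask.sum())
W_hat, G_hat = robust_multivariate_regression(X, Y)
print("weight error:", np.linalg.norm(W_hat - W_true) / np.linalg.norm(W_true))
```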

Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization

Title Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization
Authors Li Chen, Shuisheng Zhou, Zhuan Zhang
Abstract In machine learning, nonconvex optimization problems with multiple local optima are often encountered. The Graduated Optimization Algorithm (GOA) is a popular heuristic for obtaining global optima of nonconvex problems by progressively minimizing a series of increasingly accurate convex approximations to the nonconvex problem. Recently, GradOpt, an algorithm based on GOA, was proposed with strong theoretical and experimental results, but it mainly studies problems that consist of a single nonconvex part. This paper aims to find the global solution of a nonconvex objective composed of a convex part plus a nonconvex part, based on GOA. By gradually approximating the nonconvex part of the problem and minimizing the approximations with the Stochastic Variance Reduced Gradient (SVRG) or proximal SVRG, two new algorithms, SVRG-GOA and PSVRG-GOA, are proposed. We prove that the new algorithms have lower iteration complexity ($O(1/\varepsilon)$) than GradOpt ($O(1/\varepsilon^2)$). Some tricks, such as enlarging the shrink factor, using a projection step, stochastic gradients, and mini-batch techniques, are also given to accelerate the convergence of the proposed algorithms. Experimental results illustrate that the new algorithms, with similar performance, can converge to ‘global’ optima of the nonconvex problems, and that they converge faster than GradOpt and the nonconvex proximal SVRG.
Tasks
Published 2017-07-10
URL http://arxiv.org/abs/1707.02727v1
PDF http://arxiv.org/pdf/1707.02727v1.pdf
PWC https://paperswithcode.com/paper/stochastic-variance-reduction-gradient-for-a
Repo
Framework
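
The building block the new algorithms embed in the graduated scheme is a standard SVRG inner loop; the graduated outer loop then solves a sequence of progressively less-smoothed approximations, warm-starting each stage from the last. The sketch below uses a toy smoothed objective to show the structure; the smoothing scheme and objective are illustrative assumptions, not the paper's.

```python
import numpy as np

def svrg(grad_i, w, n, epochs=10, m=100, eta=0.05, rng=None):
    """Standard SVRG inner solver: variance-reduced stochastic gradient
    steps anchored at a periodically refreshed full gradient."""
    rng = rng or np.random.default_rng(0)
    for _ in range(epochs):
        w_snap = w.copy()
        full = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = int(rng.integers(n))
            w = w - eta * (grad_i(w, i) - grad_i(w_snap, i) + full)
    return w

# Graduated outer loop (sketch): minimize successively less-smoothed
# approximations, warm-starting each stage. Here delta flattens the
# nonconvex cosine term; delta = 0 recovers the original problem.
rng = np.random.default_rng(0)
data = rng.normal(1.0, 0.5, size=50)
w = np.array([5.0])
for delta in [4.0, 2.0, 1.0, 0.5, 0.0]:   # decreasing smoothing level
    # toy loss per sample: f_i(w) = (w - x_i)^2 + cos(w / (1 + delta))
    def grad_i(w, i, d=delta):
        return 2 * (w - data[i]) - np.sin(w / (1 + d)) / (1 + d)
    w = svrg(grad_i, w, n=len(data), rng=rng)
print("solution:", w)
```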