Paper Group ANR 742
Stochastic Configuration Networks: Fundamentals and Algorithms
Title | Stochastic Configuration Networks: Fundamentals and Algorithms |
Authors | Dianhui Wang, Ming Li |
Abstract | This paper contributes to the development of randomized methods for neural networks. The proposed learner model is generated incrementally by stochastic configuration (SC) algorithms and is termed a Stochastic Configuration Network (SCN). In contrast to the existing randomized learning algorithms for single layer feed-forward neural networks (SLFNNs), we randomly assign the input weights and biases of the hidden nodes in the light of a supervisory mechanism, and the output weights are analytically evaluated in either a constructive or a selective manner. As fundamentals of SCN-based data modelling techniques, we establish some theoretical results on the universal approximation property. Three versions of SC algorithms are presented for regression problems (applicable to classification problems as well) in this work. Simulation results concerning both function approximation and real world data regression indicate some remarkable merits of our proposed SCNs in terms of less human intervention in setting the network size, the scope adaptation of random parameters, fast learning and sound generalization. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03180v4 |
http://arxiv.org/pdf/1702.03180v4.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-configuration-networks |
Repo | |
Framework | |
A Computational Framework for Multi-Modal Social Action Identification
Title | A Computational Framework for Multi-Modal Social Action Identification |
Authors | Jason Anastasopoulos, Jake Ryland Williams |
Abstract | We create a computational framework for understanding social action and demonstrate how this framework can be used to build an open-source event detection tool with scalable statistical machine learning algorithms and a subsampled database of over 600 million geo-tagged Tweets from around the world. These Tweets were collected between April 1st, 2014 and April 30th, 2015, most notably when the Black Lives Matter movement began. We demonstrate how these methods can be used diagnostically (by researchers, government officials and the public) to understand peaceful and violent collective action at very fine-grained levels of time and geography. |
Tasks | |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07728v2 |
http://arxiv.org/pdf/1710.07728v2.pdf | |
PWC | https://paperswithcode.com/paper/a-computational-framework-for-multi-modal |
Repo | |
Framework | |
Recursive Binary Neural Network Learning Model with 2.28b/Weight Storage Requirement
Title | Recursive Binary Neural Network Learning Model with 2.28b/Weight Storage Requirement |
Authors | Tianchan Guan, Xiaoyang Zeng, Mingoo Seok |
Abstract | This paper presents a storage-efficient learning model titled Recursive Binary Neural Networks for sensing devices having a limited amount of on-chip data storage, such as less than hundreds of kilobytes. The main idea of the proposed model is to recursively recycle data storage of synaptic weights (parameters) during training. This enables a device with a given storage constraint to train and instantiate a neural network classifier with a larger number of weights on a chip and with fewer off-chip storage accesses. This in turn enables higher classification accuracy, shorter training time, less energy dissipation, and a smaller on-chip storage requirement. We verified the training model with deep neural network classifiers and the permutation-invariant MNIST benchmark. Under the same data storage constraint, our model uses only 2.28 bits/weight while achieving ~1% lower classification error than the conventional binary-weight learning model, which still has to use 8 to 16 bits of storage per weight. To achieve a similar classification error, the conventional binary model requires ~4x more data storage for weights than the proposed model. |
Tasks | |
Published | 2017-09-15 |
URL | http://arxiv.org/abs/1709.05306v1 |
http://arxiv.org/pdf/1709.05306v1.pdf | |
PWC | https://paperswithcode.com/paper/recursive-binary-neural-network-learning-1 |
Repo | |
Framework | |
Source-side Prediction for Neural Headline Generation
Title | Source-side Prediction for Neural Headline Generation |
Authors | Shun Kiyono, Sho Takase, Jun Suzuki, Naoaki Okazaki, Kentaro Inui, Masaaki Nagata |
Abstract | The encoder-decoder model is widely used in natural language generation tasks. However, the model sometimes suffers from repeated redundant generation, misses important phrases, and includes irrelevant entities. Toward solving these problems we propose a novel source-side token prediction module. Our method jointly estimates the probability distributions over source and target vocabularies to capture a correspondence between source and target tokens. The experiments show that the proposed model outperforms the current state-of-the-art method in the headline generation task. Additionally, we show that our method has an ability to learn a reasonable token-wise correspondence without knowing any true alignments. |
Tasks | Text Generation |
Published | 2017-12-22 |
URL | http://arxiv.org/abs/1712.08302v1 |
http://arxiv.org/pdf/1712.08302v1.pdf | |
PWC | https://paperswithcode.com/paper/source-side-prediction-for-neural-headline |
Repo | |
Framework | |
A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks
Title | A Probabilistic Framework for Nonlinearities in Stochastic Neural Networks |
Authors | Qinliang Su, Xuejun Liao, Lawrence Carin |
Abstract | We present a probabilistic framework for nonlinearities, based on doubly truncated Gaussian distributions. By setting the truncation points appropriately, we are able to generate various types of nonlinearities within a unified framework, including sigmoid, tanh and ReLU, the most commonly used nonlinearities in neural networks. The framework readily integrates into existing stochastic neural networks (with hidden units characterized as random variables), allowing one for the first time to learn the nonlinearities alongside model weights in these networks. Extensive experiments demonstrate the performance improvements brought about by the proposed framework when integrated with the restricted Boltzmann machine (RBM), temporal RBM and the truncated Gaussian graphical model (TGGM). |
Tasks | |
Published | 2017-09-18 |
URL | http://arxiv.org/abs/1709.06123v1 |
http://arxiv.org/pdf/1709.06123v1.pdf | |
PWC | https://paperswithcode.com/paper/a-probabilistic-framework-for-nonlinearities |
Repo | |
Framework | |
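The truncation idea in the abstract above can be illustrated numerically. The sketch below is not the authors' implementation: it computes the mean of a unit-variance Gaussian restricted to an interval and treats that mean, as a function of the location parameter `mu`, as the nonlinearity. The intervals `[0, inf)`, `[0, 1]` and `[-1, 1]`, and the helper name `truncated_gaussian_mean`, are assumptions chosen for illustration; the paper's own parameterization may differ.

```python
import math


def _pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def _cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def truncated_gaussian_mean(mu, lower, upper, sigma=1.0):
    """Mean of N(mu, sigma^2) restricted to the interval [lower, upper]."""
    alpha = (lower - mu) / sigma
    beta = (upper - mu) / sigma
    mass = _cdf(beta) - _cdf(alpha)  # probability inside the interval
    return mu + sigma * (_pdf(alpha) - _pdf(beta)) / mass


# Assumed truncation intervals (illustrative only).
relu_like = [truncated_gaussian_mean(m, 0.0, math.inf) for m in (-3.0, 0.0, 5.0)]
sigmoid_like = [truncated_gaussian_mean(m, 0.0, 1.0) for m in (-3.0, 0.5, 3.0)]
tanh_like = [truncated_gaussian_mean(m, -1.0, 1.0) for m in (-3.0, 0.0, 3.0)]
```

Under these assumed intervals the three curves stay in `[0, inf)`, `[0, 1]` and `[-1, 1]` respectively and increase with `mu`, which is the qualitative shape of ReLU, sigmoid and tanh. The formula becomes numerically fragile when `mu` lies far outside the interval, so the sketch keeps `mu` in a moderate range.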
BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model
Title | BLENDER: Enabling Local Search with a Hybrid Differential Privacy Model |
Authors | Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, Benjamin Livshits |
Abstract | We propose a hybrid model of differential privacy that considers a combination of regular and opt-in users who desire the differential privacy guarantees of the local privacy model and the trusted curator model, respectively. We demonstrate that within this model, it is possible to design a new type of blended algorithm for the task of privately computing the head of a search log. This blended approach provides significant improvements in the utility of obtained data compared to related work while providing users with their desired privacy guarantees. Specifically, on two large search click data sets, comprising 1.75 and 16 GB respectively, our approach attains NDCG values exceeding 95% across a range of privacy budget values. |
Tasks | |
Published | 2017-05-02 |
URL | https://arxiv.org/abs/1705.00831v4 |
https://arxiv.org/pdf/1705.00831v4.pdf | |
PWC | https://paperswithcode.com/paper/blender-enabling-local-search-with-a-hybrid |
Repo | |
Framework | |
A Correspondence Between Random Neural Networks and Statistical Field Theory
Title | A Correspondence Between Random Neural Networks and Statistical Field Theory |
Authors | Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein |
Abstract | A number of recent papers have provided evidence that practical design questions about neural networks may be tackled theoretically by studying the behavior of random networks. However, until now the tools available for analyzing random neural networks have been relatively ad-hoc. In this work, we show that the distribution of pre-activations in random neural networks can be exactly mapped onto lattice models in statistical physics. We argue that several previous investigations of stochastic networks actually studied a particular factorial approximation to the full lattice model. For random linear networks and random rectified linear networks we show that the corresponding lattice models in the wide network limit may be systematically approximated by a Gaussian distribution with covariance between the layers of the network. In each case, the approximate distribution can be diagonalized by Fourier transformation. We show that this approximation accurately describes the results of numerical simulations of wide random neural networks. Finally, we demonstrate that in each case the large scale behavior of the random networks can be approximated by an effective field theory. |
Tasks | |
Published | 2017-10-18 |
URL | http://arxiv.org/abs/1710.06570v1 |
http://arxiv.org/pdf/1710.06570v1.pdf | |
PWC | https://paperswithcode.com/paper/a-correspondence-between-random-neural |
Repo | |
Framework | |
Over the Air Deep Learning Based Radio Signal Classification
Title | Over the Air Deep Learning Based Radio Signal Classification |
Authors | Timothy J. O’Shea, Tamoghna Roy, T. Charles Clancy |
Abstract | We conduct an in-depth study on the performance of deep learning based radio signal classification for radio communications signals. We consider a rigorous baseline method using higher order moments and strong boosted gradient tree classification and compare performance between the two approaches across a range of configurations and channel impairments. We consider the effects of carrier frequency offset, symbol rate, and multi-path fading in simulation, conduct over-the-air measurement of radio classification performance in the lab using software radios, and compare performance and training strategies for both. Finally, we conclude with a discussion of remaining problems and design considerations for using such techniques. |
Tasks | |
Published | 2017-12-13 |
URL | http://arxiv.org/abs/1712.04578v1 |
http://arxiv.org/pdf/1712.04578v1.pdf | |
PWC | https://paperswithcode.com/paper/over-the-air-deep-learning-based-radio-signal |
Repo | |
Framework | |
Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids
Title | Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids |
Authors | Raghuram Bharadwaj Diddigi, D. Sai Koti Reddy, Shalabh Bhatnagar |
Abstract | We consider the problem of minimizing the difference between the demand and the supply of power using microgrids. We set up multiple microgrids that provide electricity to a village. They have access to batteries that can store renewable power and to the electrical lines from the main grid. During each time period, these microgrids need to decide on the amount of renewable power to be used from the batteries as well as the amount of power needed from the main grid. We formulate this problem as a Markov Decision Process (MDP), similar to the one discussed in [1]. The power allotment to the village from the main grid is fixed and bounded, whereas the renewable energy generation is uncertain in nature. Therefore, we adapt a distributed version of a popular reinforcement learning technique, Multi-Agent Q-Learning, to the problem. Finally, we also consider a variant of this problem where the cost of power production at the main site is taken into consideration. In this scenario the microgrids need to minimize the demand-supply deficit while maintaining the desired average cost of power production. |
Tasks | Q-Learning |
Published | 2017-08-25 |
URL | http://arxiv.org/abs/1708.07732v2 |
http://arxiv.org/pdf/1708.07732v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-agent-q-learning-for-minimizing-demand |
Repo | |
Framework | |
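The Q-learning step that the abstract above builds on can be sketched as a single tabular update. This is a generic illustration, not the paper's algorithm: the state encoding (a battery level), the action set (units of stored power to release), the reward (negative demand-supply deficit) and all numbers are hypothetical, and in a multi-agent setting each microgrid is assumed here to keep its own table.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical toy setup for one microgrid agent.
q_table = defaultdict(float)   # unseen (state, action) pairs start at 0.0
actions = [0, 1, 2]            # hypothetical: units of battery power to release
# Hypothetical transition: battery level 3 -> 2 with a deficit of 2 units.
new_value = q_update(q_table, state=3, action=1, reward=-2.0,
                     next_state=2, actions=actions)
```

Starting from an all-zero table, the first update moves the estimate one tenth of the way toward the observed reward, so repeated deficits gradually lower the value of the action that caused them.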
Bayesian Conditional Generative Adverserial Networks
Title | Bayesian Conditional Generative Adverserial Networks |
Authors | M. Ehsan Abbasnejad, Qinfeng Shi, Iman Abbasnejad, Anton van den Hengel, Anthony Dick |
Abstract | Traditional GANs use a deterministic generator function (typically a neural network) to transform a random noise input $z$ to a sample $\mathbf{x}$ that the discriminator seeks to distinguish. We propose a new GAN called Bayesian Conditional Generative Adversarial Networks (BC-GANs) that use a random generator function to transform a deterministic input $y'$ to a sample $\mathbf{x}$. Our BC-GANs extend traditional GANs to a Bayesian framework, and naturally handle unsupervised learning, supervised learning, and semi-supervised learning problems. Experiments show that the proposed BC-GANs outperform state-of-the-art methods. |
Tasks | |
Published | 2017-06-17 |
URL | http://arxiv.org/abs/1706.05477v1 |
http://arxiv.org/pdf/1706.05477v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-conditional-generative-adverserial |
Repo | |
Framework | |
Non-line-of-sight tracking of people at long range
Title | Non-line-of-sight tracking of people at long range |
Authors | Susan Chan, Ryan E. Warburton, Genevieve Gariepy, Jonathan Leach, Daniele Faccio |
Abstract | A remote-sensing system that can determine the position of hidden objects has applications in many critical real-life scenarios, such as search and rescue missions and safe autonomous driving. Previous work has shown the ability to range and image objects hidden from the direct line of sight, employing advanced optical imaging technologies aimed at small objects at short range. In this work we demonstrate a long-range tracking system based on single laser illumination and single-pixel single-photon detection. This enables us to track one or more people hidden from view at a stand-off distance of over 50 m. These results pave the way towards next generation LiDAR systems that will reconstruct not only the direct-view scene but also the main elements hidden behind walls or corners. |
Tasks | Autonomous Driving |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.02124v1 |
http://arxiv.org/pdf/1703.02124v1.pdf | |
PWC | https://paperswithcode.com/paper/non-line-of-sight-tracking-of-people-at-long |
Repo | |
Framework | |
MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
Title | MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks |
Authors | Minmin Chen |
Abstract | We introduce MinimalRNN, a new recurrent neural network architecture that achieves performance comparable to that of the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, to better interpretability and trainability. We demonstrate that by adopting the more restrictive update rule, MinimalRNN learns disentangled RNN states. We further examine the learning dynamics of different RNN structures using input-output Jacobians, and show that MinimalRNN is able to capture longer range dependencies than existing RNN architectures. |
Tasks | |
Published | 2017-11-18 |
URL | http://arxiv.org/abs/1711.06788v2 |
http://arxiv.org/pdf/1711.06788v2.pdf | |
PWC | https://paperswithcode.com/paper/minimalrnn-toward-more-interpretable-and |
Repo | |
Framework | |
Deep Neural Networks as 0-1 Mixed Integer Linear Programs: A Feasibility Study
Title | Deep Neural Networks as 0-1 Mixed Integer Linear Programs: A Feasibility Study |
Authors | Matteo Fischetti, Jason Jo |
Abstract | Deep Neural Networks (DNNs) are very popular these days, and are the subject of a very intense investigation. A DNN is made by layers of internal units (or neurons), each of which computes an affine combination of the output of the units in the previous layer, applies a nonlinear operator, and outputs the corresponding value (also known as activation). A commonly-used nonlinear operator is the so-called rectified linear unit (ReLU), whose output is just the maximum between its input value and zero. In this (and other similar cases like max pooling, where the max operation involves more than one input value), one can model the DNN as a 0-1 Mixed Integer Linear Program (0-1 MILP) where the continuous variables correspond to the output values of each unit, and a binary variable is associated with each ReLU to model its yes/no nature. In this paper we discuss the peculiarity of this kind of 0-1 MILP models, and describe an effective bound-tightening technique intended to ease its solution. We also present possible applications of the 0-1 MILP model arising in feature visualization and in the construction of adversarial examples. Preliminary computational results are reported, aimed at investigating (on small DNNs) the computational performance of a state-of-the-art MILP solver when applied to a known test case, namely, hand-written digit recognition. |
Tasks | |
Published | 2017-12-17 |
URL | http://arxiv.org/abs/1712.06174v1 |
http://arxiv.org/pdf/1712.06174v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-neural-networks-as-0-1-mixed-integer |
Repo | |
Framework | |
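The binary-variable modelling of a ReLU described in the abstract above can be checked on a single unit without a solver. The sketch below uses a standard textbook big-M encoding of `y = max(0, x)` with assumed bounds `lower <= x <= upper`; it is not necessarily the formulation used in the paper, and the function names are hypothetical. With binary `z`, the constraints are `y >= 0`, `y >= x`, `y <= x - lower * (1 - z)` and `y <= upper * z`.

```python
def relu_bigm_feasible(x, y, z, lower, upper, tol=1e-9):
    """Return True if (x, y, z) satisfies the big-M constraints for y = max(0, x).

    Assumes lower <= x <= upper with lower < 0 < upper.
    """
    return (
        z in (0, 1)                             # z is the binary on/off indicator
        and y >= -tol                           # y >= 0
        and y >= x - tol                        # y >= x
        and y <= x - lower * (1 - z) + tol      # z = 1 forces y <= x
        and y <= upper * z + tol                # z = 0 forces y <= 0
    )


def feasible_outputs(x, lower, upper, candidates):
    """List every candidate y that is feasible for some binary z."""
    return sorted(
        {y for y in candidates for z in (0, 1)
         if relu_bigm_feasible(x, y, z, lower, upper)}
    )


# Hypothetical bounds and a grid of candidate outputs from -2 to 3.
grid = [i / 4 for i in range(-8, 13)]
active = feasible_outputs(1.5, -2.0, 3.0, grid)     # unit is "on"
inactive = feasible_outputs(-1.0, -2.0, 3.0, grid)  # unit is "off"
```

For each input, only one candidate output survives, and it coincides with `max(0, x)`. Tighter values of `lower` and `upper` shrink the relaxation obtained when `z` is allowed to be fractional, which is one reason bound tightening matters for such models.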
Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications
Title | Multivariate Regression with Grossly Corrupted Observations: A Robust Approach and its Applications |
Authors | Xiaowei Zhang, Chi Xu, Yu Zhang, Tingshao Zhu, Li Cheng |
Abstract | This paper studies the problem of multivariate linear regression where a portion of the observations is grossly corrupted or is missing, and the magnitudes and locations of such occurrences are unknown a priori. To deal with this problem, we propose a new approach that explicitly considers the error source as well as its sparse nature. An interesting property of our approach lies in its ability to allow individual regression output elements or tasks to possess their own noise levels. Moreover, despite working with a non-smooth optimization problem, our approach is still guaranteed to converge to its optimal solution. Experiments on synthetic data demonstrate the competitiveness of our approach compared with existing multivariate regression models. In addition, our approach has been validated empirically with very promising results on two exemplar real-world applications: the first concerns the prediction of Big-Five personality based on user behaviors at social network sites (SNSs), while the second is 3D human hand pose estimation from depth images. The implementation of our approach and comparison methods as well as the involved datasets are made publicly available in support of the open-source and reproducible research initiatives. |
Tasks | Hand Pose Estimation, Pose Estimation |
Published | 2017-01-11 |
URL | http://arxiv.org/abs/1701.02892v1 |
http://arxiv.org/pdf/1701.02892v1.pdf | |
PWC | https://paperswithcode.com/paper/multivariate-regression-with-grossly |
Repo | |
Framework | |
Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization
Title | Stochastic Variance Reduction Gradient for a Non-convex Problem Using Graduated Optimization |
Authors | Li Chen, Shuisheng Zhou, Zhuan Zhang |
Abstract | In machine learning, nonconvex optimization problems with multiple local optima are often encountered. The Graduated Optimization Algorithm (GOA) is a popular heuristic for obtaining global optima of nonconvex problems by progressively minimizing a series of increasingly accurate convex approximations to the nonconvex problem. Recently, GradOpt, an algorithm based on GOA, was proposed with strong theoretical and experimental results, but it mainly studies problems that consist of a single nonconvex part. This paper aims to find the global solution of a nonconvex objective composed of a convex part plus a nonconvex part, based on GOA. By gradually approximating the nonconvex part of the problem and minimizing the approximations with the Stochastic Variance Reduced Gradient (SVRG) or proximal SVRG, two new algorithms, SVRG-GOA and PSVRG-GOA, are proposed. We prove that the new algorithms have lower iteration complexity ($O(1/\varepsilon)$) than GradOpt ($O(1/\varepsilon^2)$). Some tricks, such as enlarging the shrink factor and using a projection step, stochastic gradients, and mini-batches, are also given to accelerate the convergence of the proposed algorithms. Experimental results illustrate that the new algorithms, with similar performance, can converge to ‘global’ optima of the nonconvex problems, and that they converge faster than GradOpt and the nonconvex proximal SVRG. |
Tasks | |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02727v1 |
http://arxiv.org/pdf/1707.02727v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variance-reduction-gradient-for-a |
Repo | |
Framework | |
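The SVRG building block named in the abstract above can be sketched on a toy problem. This shows only plain SVRG on a one-dimensional least-squares objective; the graduated smoothing of SVRG-GOA and the proximal step of PSVRG-GOA are not included, and the data, step size and function names are assumptions for illustration.

```python
import random


def svrg(grad_i, n, w0, step, epochs, inner_steps, seed=0):
    """Minimize (1/n) * sum_i f_i(w) with plain SVRG; grad_i(w, i) is f_i'(w)."""
    rng = random.Random(seed)
    w_snapshot = w0
    for _ in range(epochs):
        # Full gradient at the snapshot, recomputed once per epoch.
        full_grad = sum(grad_i(w_snapshot, i) for i in range(n)) / n
        w = w_snapshot
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced gradient: stochastic gradient corrected by
            # the same sample's gradient at the snapshot.
            g = grad_i(w, i) - grad_i(w_snapshot, i) + full_grad
            w -= step * g
        w_snapshot = w
    return w_snapshot


# Toy data (hypothetical): f_i(w) = 0.5 * (a_i * w - b_i)^2 with b_i = 2 * a_i,
# so the minimizer is w = 2.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]


def least_squares_grad(w, i):
    return a[i] * (a[i] * w - b[i])


w_star = svrg(least_squares_grad, n=4, w0=0.0, step=0.02, epochs=30, inner_steps=8)
```

The correction term keeps the update unbiased while its variance shrinks as the iterate approaches the snapshot, which is what allows a constant step size here.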