Paper Group ANR 580
Non-parametric estimation of Jensen-Shannon Divergence in Generative Adversarial Network training
Title | Non-parametric estimation of Jensen-Shannon Divergence in Generative Adversarial Network training |
Authors | Mathieu Sinn, Ambrish Rawat |
Abstract | Generative Adversarial Networks (GANs) have become a widely popular framework for generative modelling of high-dimensional datasets. However, their training is well known to be difficult. This work presents a rigorous statistical analysis of GANs, providing straightforward explanations for common training pathologies such as vanishing gradients. Furthermore, it proposes a new training objective, Kernel GANs, and demonstrates its practical effectiveness on large-scale real-world data sets. A key element in the analysis is the distinction between training with respect to the (unknown) data distribution, and its empirical counterpart. To overcome issues in GAN training, we pursue the idea of smoothing the Jensen-Shannon Divergence (JSD) by incorporating noise in the input distributions of the discriminator. As we show, this effectively leads to an empirical version of the JSD in which the true and the generator densities are replaced by kernel density estimates, which leads to Kernel GANs. |
Tasks | |
Published | 2017-05-25 |
URL | http://arxiv.org/abs/1705.09199v3 |
http://arxiv.org/pdf/1705.09199v3.pdf | |
PWC | https://paperswithcode.com/paper/non-parametric-estimation-of-jensen-shannon |
Repo | |
Framework | |
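The abstract above describes an empirical JSD in which the true and generator densities are replaced by kernel density estimates. The following is a minimal one-dimensional sketch of that idea only, not the authors' Kernel GAN objective: the Gaussian kernel, the fixed `bandwidth`, the midpoint-rule integration grid, and the function names `gaussian_kde` and `kde_jsd` are all assumptions made for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def kde_jsd(real, fake, bandwidth=0.5, grid_size=2000):
    """Approximate the JSD (in nats) between the KDEs of two 1-D samples."""
    p = gaussian_kde(real, bandwidth)
    q = gaussian_kde(fake, bandwidth)
    # Integrate over a grid that covers both samples plus kernel tails.
    lo = min(min(real), min(fake)) - 6.0 * bandwidth
    hi = max(max(real), max(fake)) + 6.0 * bandwidth
    dx = (hi - lo) / grid_size
    total = 0.0
    for i in range(grid_size):
        x = lo + (i + 0.5) * dx  # midpoint rule
        px, qx = p(x), q(x)
        mx = 0.5 * (px + qx)  # mixture density
        # Terms with zero density contribute nothing (0 * log 0 = 0).
        if px > 0.0:
            total += 0.5 * px * (math.log(px) - math.log(mx)) * dx
        if qx > 0.0:
            total += 0.5 * qx * (math.log(qx) - math.log(mx)) * dx
    return total
```

Because both densities are smoothed, the estimate varies continuously as the generated samples move toward the real ones, whereas the JSD between two non-overlapping distributions is stuck at its maximum of log 2. In a real setting the bandwidth would need to be chosen with care, and a grid-based integral is only practical in very low dimensions.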
Variational inference for the multi-armed contextual bandit
Title | Variational inference for the multi-armed contextual bandit |
Authors | Iñigo Urteaga, Chris H. Wiggins |
Abstract | In many biomedical, science, and engineering problems, one must sequentially decide which action to take next so as to maximize rewards. One general class of algorithms for optimizing interactions with the world, while simultaneously learning how the world operates, is the multi-armed bandit setting and, in particular, the contextual bandit case. In this setting, for each executed action, one observes rewards that are dependent on a given ‘context’, available at each interaction with the world. The Thompson sampling algorithm has recently been shown to enjoy provable optimality properties for this set of problems, and to perform well in real-world settings. It facilitates generative and interpretable modeling of the problem at hand. Nevertheless, the design and complexity of the model limit its application, since one must both sample from the distributions modeled and calculate their expected rewards. We here show how these limitations can be overcome using variational inference to approximate complex models, applying to the reinforcement learning case advances developed for the inference case in the machine learning community over the past two decades. We consider contextual multi-armed bandit applications where the true reward distribution is unknown and complex, which we approximate with a mixture model whose parameters are inferred via variational inference. We show how the proposed variational Thompson sampling approach is accurate in approximating the true distribution, and attains reduced regrets even with complex reward distributions. The proposed algorithm is valuable for practical scenarios where restrictive modeling assumptions are undesirable. |
Tasks | Multi-Armed Bandits |
Published | 2017-09-10 |
URL | http://arxiv.org/abs/1709.03163v2 |
http://arxiv.org/pdf/1709.03163v2.pdf | |
PWC | https://paperswithcode.com/paper/variational-inference-for-the-multi-armed |
Repo | |
Framework | |
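As a reminder of the mechanism the abstract above builds on, here is a minimal sketch of Thompson sampling in its simplest setting: a non-contextual Bernoulli bandit with conjugate Beta posteriors. This is deliberately not the variational mixture-model approach the paper proposes; the function name `thompson_sampling`, the uniform Beta(1, 1) prior, and the simulated reward probabilities are assumptions for illustration.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1] * k  # Beta(1, 1) prior on each arm's success rate
    failures = [1] * k
    pulls = [0] * k
    for _ in range(rounds):
        # Sample one plausible mean reward per arm from its posterior.
        draws = [rng.betavariate(successes[a], failures[a]) for a in range(k)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda a: draws[a])
        # Simulated environment: Bernoulli reward with the arm's true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        # Conjugate posterior update.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls
```

The two requirements named in the abstract are visible here: the algorithm must sample from the modeled distribution (`betavariate`) and update it after each reward. With a conjugate prior both steps are trivial; with complex reward distributions neither has a closed form, which is the gap the abstract says variational inference is meant to close.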
Convergence analysis of the information matrix in Gaussian belief propagation
Title | Convergence analysis of the information matrix in Gaussian belief propagation |
Authors | Jian Du, Shaodan Ma, Yik-Chung Wu, Soummya Kar, José M. F. Moura |
Abstract | Gaussian belief propagation (BP) has been widely used for distributed estimation in large-scale networks such as the smart grid, communication networks, and social networks, where local measurements/observations are scattered over a wide geographical area. However, the convergence of Gaussian BP is still an open issue. In this paper, we consider the convergence of Gaussian BP, focusing in particular on the convergence of the information matrix. We show analytically that the exchanged message information matrix converges for arbitrary positive semidefinite initial value, and its distance to the unique positive definite limit matrix decreases exponentially fast. |
Tasks | |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.03969v1 |
http://arxiv.org/pdf/1704.03969v1.pdf | |
PWC | https://paperswithcode.com/paper/convergence-analysis-of-the-information |
Repo | |
Framework | |
A Random Block-Coordinate Douglas-Rachford Splitting Method with Low Computational Complexity for Binary Logistic Regression
Title | A Random Block-Coordinate Douglas-Rachford Splitting Method with Low Computational Complexity for Binary Logistic Regression |
Authors | Luis M. Briceno-Arias, Giovanni Chierchia, Emilie Chouzenoux, Jean-Christophe Pesquet |
Abstract | In this paper, we propose a new optimization algorithm for sparse logistic regression based on a stochastic version of the Douglas-Rachford splitting method. Our algorithm sweeps the training set by randomly selecting a mini-batch of data at each iteration, and it allows us to update the variables in a block coordinate manner. Our approach leverages the proximity operator of the logistic loss, which is expressed with the generalized Lambert W function. Experiments carried out on standard datasets demonstrate the efficiency of our approach w.r.t. stochastic gradient-like methods. |
Tasks | |
Published | 2017-12-25 |
URL | http://arxiv.org/abs/1712.09131v1 |
http://arxiv.org/pdf/1712.09131v1.pdf | |
PWC | https://paperswithcode.com/paper/a-random-block-coordinate-douglas-rachford |
Repo | |
Framework | |
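The abstract above relies on the proximity operator of the logistic loss, which the authors express in closed form with the generalized Lambert W function. The sketch below does not reproduce that closed form. Instead it assumes the scalar loss f(y) = log(1 + exp(-y)) and finds the same minimiser numerically by bisection on the optimality condition; the names `sigmoid` and `prox_logistic` are hypothetical.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function 1 / (1 + exp(-t))."""
    if t >= 0.0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def prox_logistic(x, gamma, iterations=200):
    """Minimise gamma * log(1 + exp(-y)) + 0.5 * (y - x)**2 over y.

    Setting the derivative to zero gives y - x - gamma * sigmoid(-y) = 0.
    The left-hand side is strictly increasing in y, negative at y = x and
    positive at y = x + gamma, so bisection on [x, x + gamma] finds the root.
    """
    lo, hi = x, x + gamma
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - x - gamma * sigmoid(-mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is slower than evaluating a closed-form expression, but it is easy to verify and serves as a reference when checking a faster implementation.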
Contextual Data Collection for Smart Cities
Title | Contextual Data Collection for Smart Cities |
Authors | Henrique Santos, Vasco Furtado, Paulo Pinheiro, Deborah L. McGuinness |
Abstract | As part of Smart Cities initiatives, national, regional and local governments all over the globe are under the mandate of being more open regarding how they share their data. Under this mandate, many of these governments are publishing data under the umbrella of open government data, which includes measurement data from city-wide sensor networks. Furthermore, many of these data are published in so-called data portals as documents that may be spreadsheets, comma-separated value (CSV) data files, or plain documents in PDF or Word documents. The sharing of these documents may be a convenient way for the data provider to convey and publish data but it is not the ideal way for data consumers to reuse the data. For example, the problems of reusing the data may range from difficulty opening a document that is provided in any format that is not plain text, to the actual problem of understanding the meaning of each piece of knowledge inside of the document. Our proposal tackles those challenges by identifying metadata that has been regarded to be relevant for measurement data and providing a schema for this metadata. We further leverage the Human-Aware Sensor Network Ontology (HASNetO) to build an architecture for data collected in urban environments. We discuss the use of HASNetO and the supporting infrastructure to manage both data and metadata in support of the City of Fortaleza, a large metropolitan area in Brazil. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01802v1 |
http://arxiv.org/pdf/1704.01802v1.pdf | |
PWC | https://paperswithcode.com/paper/contextual-data-collection-for-smart-cities |
Repo | |
Framework | |
Noisy Softplus: an activation function that enables SNNs to be trained as ANNs
Title | Noisy Softplus: an activation function that enables SNNs to be trained as ANNs |
Authors | Qian Liu, Yunhua Chen, Steve Furber |
Abstract | We extended the previously proposed activation function, Noisy Softplus, to fit the training of layered spiking neural networks (SNNs). Thus, any ANN employing Noisy Softplus neurons, even one with a deep architecture, can be trained simply by a traditional algorithm, for example Back Propagation (BP), and the trained weights can be used directly in the spiking version of the same network without any conversion. Furthermore, the training method can be generalised to other activation units, for instance Rectified Linear Units (ReLU), to train deep SNNs off-line. This research is crucial for providing an effective approach to SNN training, increasing the classification accuracy of SNNs with biological characteristics, and closing the gap between the performance of SNNs and ANNs. |
Tasks | |
Published | 2017-03-31 |
URL | http://arxiv.org/abs/1706.03609v1 |
http://arxiv.org/pdf/1706.03609v1.pdf | |
PWC | https://paperswithcode.com/paper/noisy-softplus-an-activation-function-that |
Repo | |
Framework | |
Offline Handwritten Recognition of Malayalam District Name - A Holistic Approach
Title | Offline Handwritten Recognition of Malayalam District Name - A Holistic Approach |
Authors | Jino P J, Kannan Balakrishnan |
Abstract | Various machine learning methods for writer-independent recognition of Malayalam handwritten district names are discussed in this paper. Data collected from 56 different writers are used for the experiments. The proposed work can be used for the recognition of the district in an address written in Malayalam. Different methods for dimensionality reduction are discussed. Features considered for recognition are the Histogram of Oriented Gradients descriptor, the number of black pixels in the upper and lower halves, and the length of the image. Classifiers used in this work are Neural Network, SVM and RandomForest. |
Tasks | Dimensionality Reduction |
Published | 2017-05-02 |
URL | http://arxiv.org/abs/1705.00794v1 |
http://arxiv.org/pdf/1705.00794v1.pdf | |
PWC | https://paperswithcode.com/paper/offline-handwritten-recognition-of-malayalam |
Repo | |
Framework | |
Human-Aware Sensor Network Ontology: Semantic Support for Empirical Data Collection
Title | Human-Aware Sensor Network Ontology: Semantic Support for Empirical Data Collection |
Authors | Paulo Pinheiro, Deborah L. McGuinness, Henrique Santos |
Abstract | Significant efforts have been made to understand and document knowledge related to scientific measurements. Many of those efforts resulted in one or more high-quality ontologies that describe some aspects of scientific measurements, but not in a comprehensive and coherently integrated manner. For instance, we note that many of these high-quality ontologies are not properly aligned, and more challenging, that they have different and often conflicting concepts and approaches for encoding knowledge about empirical measurements. As a result of this lack of an integrated view, it is often challenging for scientists to determine whether any two scientific measurements were taken in semantically compatible manners, thus making it difficult to decide whether measurements should be analyzed in combination or not. In this paper, we present the Human-Aware Sensor Network Ontology that is a comprehensive alignment and integration of a sensing infrastructure ontology and a provenance ontology. HASNetO has been under development for more than one year, and has been reviewed, shared and used by multiple scientific communities. The ontology has been in use to support the data management of a number of large-scale ecological monitoring activities (observations) and empirical experiments. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01806v1 |
http://arxiv.org/pdf/1704.01806v1.pdf | |
PWC | https://paperswithcode.com/paper/human-aware-sensor-network-ontology-semantic |
Repo | |
Framework | |
Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks
Title | Facial Expression Recognition Using Enhanced Deep 3D Convolutional Neural Networks |
Authors | Behzad Hasani, Mohammad H. Mahoor |
Abstract | Deep Neural Networks (DNNs) have been shown to outperform traditional methods in various visual recognition tasks including Facial Expression Recognition (FER). In spite of efforts made to improve the accuracy of FER systems using DNNs, existing methods are still not generalizable enough in practical applications. This paper proposes a 3D Convolutional Neural Network method for FER in videos. This new network architecture consists of 3D Inception-ResNet layers followed by an LSTM unit that together extract the spatial relations within facial images as well as the temporal relations between different frames in the video. Facial landmark points are also used as inputs to our network, which emphasizes the importance of facial components rather than the facial regions that may not contribute significantly to generating facial expressions. Our proposed method is evaluated using four publicly available databases in subject-independent and cross-database tasks and outperforms state-of-the-art methods. |
Tasks | Facial Expression Recognition |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07871v1 |
http://arxiv.org/pdf/1705.07871v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-expression-recognition-using-enhanced |
Repo | |
Framework | |
Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML
Title | Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML |
Authors | Xuezhe Ma, Pengcheng Yin, Jingzhou Liu, Graham Neubig, Eduard Hovy |
Abstract | Reward augmented maximum likelihood (RAML), a simple and effective learning framework to directly optimize towards the reward function in structured prediction tasks, has led to a number of impressive empirical successes. RAML incorporates task-specific reward by performing maximum-likelihood updates on candidate outputs sampled according to an exponentiated payoff distribution, which gives higher probabilities to candidates that are close to the reference output. While RAML is notable for its simplicity, efficiency, and its impressive empirical successes, the theoretical properties of RAML, especially the behavior of the exponentiated payoff distribution, have not been examined thoroughly. In this work, we introduce softmax Q-distribution estimation, a novel theoretical interpretation of RAML, which reveals the relation between RAML and Bayesian decision theory. The softmax Q-distribution can be regarded as a smooth approximation of the Bayes decision boundary, and the Bayes decision rule is achieved by decoding with this Q-distribution. We further show that RAML is equivalent to approximately estimating the softmax Q-distribution, with the temperature $\tau$ controlling approximation error. We perform two experiments, one on synthetic data of multi-class classification and one on real data of image captioning, to demonstrate the relationship between RAML and the proposed softmax Q-distribution estimation method, verifying our theoretical analysis. Additional experiments on three structured prediction tasks with rewards defined on sequential (named entity recognition), tree-based (dependency parsing) and irregular (machine translation) structures show notable improvements over maximum likelihood baselines. |
Tasks | Dependency Parsing, Image Captioning, Machine Translation, Named Entity Recognition, Structured Prediction |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.07136v3 |
http://arxiv.org/pdf/1705.07136v3.pdf | |
PWC | https://paperswithcode.com/paper/softmax-q-distribution-estimation-for |
Repo | |
Framework | |
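The exponentiated payoff distribution mentioned in the abstract above assigns each candidate output a probability proportional to exp(reward / τ). A minimal sketch follows, assuming a small, explicitly enumerated candidate set with precomputed rewards; the function name `exponentiated_payoff` and the example reward values are hypothetical, and a real structured prediction task would need a sampling scheme because its output space is far too large to enumerate.

```python
import math


def exponentiated_payoff(rewards, tau):
    """Return probabilities proportional to exp(reward / tau)."""
    scaled = [r / tau for r in rewards]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical rewards for three candidates; the first matches the reference.
probs = exponentiated_payoff([1.0, 0.5, 0.0], tau=0.5)
```

Lowering `tau` concentrates the mass on the highest-reward candidate, and raising it flattens the distribution toward uniform. This is the temperature that the abstract says controls the approximation error.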
A Simple Yet Efficient Rank One Update for Covariance Matrix Adaptation
Title | A Simple Yet Efficient Rank One Update for Covariance Matrix Adaptation |
Authors | Zhenhua Li, Qingfu Zhang |
Abstract | In this paper, we propose an efficient approximated rank one update for covariance matrix adaptation evolution strategy (CMA-ES). It makes use of two evolution paths as simple as those of CMA-ES, while avoiding the computational matrix decomposition. We analyze the algorithm’s properties and behaviors. We experimentally study the proposed algorithm’s performance. It generally outperforms or performs competitively with the Cholesky CMA-ES. |
Tasks | |
Published | 2017-10-11 |
URL | http://arxiv.org/abs/1710.03996v3 |
http://arxiv.org/pdf/1710.03996v3.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-yet-efficient-rank-one-update-for |
Repo | |
Framework | |
Scalable Deep Traffic Flow Neural Networks for Urban Traffic Congestion Prediction
Title | Scalable Deep Traffic Flow Neural Networks for Urban Traffic Congestion Prediction |
Authors | Mohammadhani Fouladgar, Mostafa Parchami, Ramez Elmasri, Amir Ghaderi |
Abstract | Tracking congestion throughout the road network is a critical component of intelligent transportation network management systems. Understanding how the traffic flows and short-term prediction of congestion occurrence due to rush-hour or incidents can be beneficial to such systems to effectively manage and direct the traffic to the most appropriate detours. Many of the current traffic flow prediction systems are designed by utilizing a central processing component where the prediction is carried out through aggregation of the information gathered from all measuring stations. However, centralized systems are not scalable and fail to provide real-time feedback to the system, whereas in a decentralized scheme, each node is responsible for predicting its own short-term congestion based on the local current measurements in neighboring nodes. We propose a decentralized deep learning-based method where each node accurately predicts its own congestion state in real-time based on the congestion state of the neighboring stations. Moreover, historical data from the deployment site is not required, which makes the proposed method more suitable for newly installed stations. In order to achieve higher performance, we introduce a regularized Euclidean loss function that favors high congestion samples over low congestion samples to avoid the impact of the unbalanced training dataset. A novel dataset for this purpose is designed based on the traffic data obtained from traffic control stations in northern California. Extensive experiments conducted on the designed benchmark reflect a successful congestion prediction. |
Tasks | |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01006v1 |
http://arxiv.org/pdf/1703.01006v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-deep-traffic-flow-neural-networks |
Repo | |
Framework | |
Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design
Title | Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design |
Authors | Yoav Levine, David Yakira, Nadav Cohen, Amnon Shashua |
Abstract | Deep convolutional networks have witnessed unprecedented success in various machine learning applications. Formal understanding of what makes these networks so successful is gradually unfolding, but for the most part there are still significant mysteries to unravel. The inductive bias, which reflects prior knowledge embedded in the network architecture, is one of them. In this work, we establish a fundamental connection between the fields of quantum physics and deep learning. We use this connection for asserting novel theoretical observations regarding the role that the number of channels in each layer of the convolutional network fulfills in the overall inductive bias. Specifically, we show an equivalence between the function realized by a deep convolutional arithmetic circuit (ConvAC) and a quantum many-body wave function, which relies on their common underlying tensorial structure. This facilitates the use of quantum entanglement measures as well-defined quantifiers of a deep network’s expressive ability to model intricate correlation structures of its inputs. Most importantly, the construction of a deep ConvAC in terms of a Tensor Network is made available. This description enables us to carry out a graph-theoretic analysis of a convolutional network, with which we demonstrate a direct control over the inductive bias of the deep network via its channel numbers, which are related to the min-cut in the underlying graph. This result is relevant to any practitioner designing a network for a specific task. We theoretically analyze ConvACs, and empirically validate our findings on more common ConvNets which involve ReLU activations and max pooling. Beyond the results described above, the description of a deep convolutional network in well-defined graph-theoretic tools and the formal connection to quantum entanglement are two interdisciplinary bridges that are brought forth by this work. |
Tasks | |
Published | 2017-04-05 |
URL | http://arxiv.org/abs/1704.01552v2 |
http://arxiv.org/pdf/1704.01552v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-and-quantum-entanglement |
Repo | |
Framework | |
Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification
Title | Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification |
Authors | Terence Fusco, Yaxin Bi, Haiying Wang, Fiona Browne |
Abstract | The key issues pertaining to collection of epidemic disease data for our analysis purposes are that it is a labour intensive, time consuming and expensive process resulting in availability of sparse sample data which we use to develop prediction models. To address this sparse data issue, we present novel Incremental Transductive methods to circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semisupervised machine learning including Bayesian models for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects. |
Tasks | |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01815v1 |
http://arxiv.org/pdf/1704.01815v1.pdf | |
PWC | https://paperswithcode.com/paper/incremental-transductive-learning-approaches |
Repo | |
Framework | |
Automatic Spatial Context-Sensitive Cloud/Cloud-Shadow Detection in Multi-Source Multi-Spectral Earth Observation Images: AutoCloud+
Title | Automatic Spatial Context-Sensitive Cloud/Cloud-Shadow Detection in Multi-Source Multi-Spectral Earth Observation Images: AutoCloud+ |
Authors | Andrea Baraldi |
Abstract | The proposed Earth observation (EO) based value adding system (EO VAS), hereafter identified as AutoCloud+, consists of an innovative EO image understanding system (EO IUS) design and implementation capable of automatic spatial context sensitive cloud/cloud shadow detection in multi source multi spectral (MS) EO imagery, whether or not radiometrically calibrated, acquired by multiple platforms, either spaceborne or airborne, including unmanned aerial vehicles (UAVs). It is worth mentioning that the same EO IUS architecture is suitable for a large variety of EO based value adding products and services, including: (i) low level image enhancement applications, such as automatic MS image topographic correction, co registration, mosaicking and compositing, (ii) high level MS image land cover (LC) and LC change (LCC) classification and (iii) content based image storage/retrieval in massive multi source EO image databases (big data mining). |
Tasks | Image Enhancement, Shadow Detection |
Published | 2017-01-16 |
URL | http://arxiv.org/abs/1701.04256v1 |
http://arxiv.org/pdf/1701.04256v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-spatial-context-sensitive |
Repo | |
Framework | |