Paper Group ANR 216
Mean Field Analysis of Deep Neural Networks
Title | Mean Field Analysis of Deep Neural Networks |
Authors | Justin Sirignano, Konstantinos Spiliopoulos |
Abstract | We analyze multi-layer neural networks in the asymptotic regime of simultaneously (A) large network sizes and (B) large numbers of stochastic gradient descent training iterations. We rigorously establish the limiting behavior of the multi-layer neural network output. The limit procedure is valid for any number of hidden layers and it naturally also describes the limiting behavior of the training loss. The ideas that we explore are to (a) take the limits of each hidden layer sequentially and (b) characterize the evolution of parameters in terms of their initialization. The limit satisfies a system of deterministic integro-differential equations. The proof uses methods from weak convergence and stochastic analysis. We show that, under suitable assumptions on the activation functions and the behavior for large times, the limit neural network recovers a global minimum (with zero loss for the objective function). |
Tasks | |
Published | 2019-03-11 |
URL | https://arxiv.org/abs/1903.04440v3 |
https://arxiv.org/pdf/1903.04440v3.pdf | |
PWC | https://paperswithcode.com/paper/mean-field-analysis-of-deep-neural-networks |
Repo | |
Framework | |
EE-AE: An Exclusivity Enhanced Unsupervised Feature Learning Approach
Title | EE-AE: An Exclusivity Enhanced Unsupervised Feature Learning Approach |
Authors | Jingcai Guo, Song Guo |
Abstract | Unsupervised learning has become increasingly important. As one of its key components, the autoencoder (AE) aims to learn a latent feature representation of data which is more robust and discriminative. However, most AE-based methods focus only on reconstruction within the encoder-decoder phase, which ignores the inherent relations in the data, i.e., statistical and geometrical dependence, and easily causes overfitting. To deal with this issue, we propose an Exclusivity Enhanced (EE) unsupervised feature learning approach to improve the conventional AE. To the best of our knowledge, our research is the first to combine such an exclusivity concept with feature extraction within an AE. Moreover, in this paper we also make some improvements to the stacked AE structure, especially for the connection of different layers from decoders; this could be regarded as a weight initialization trial. The experimental results show that our proposed approach can achieve remarkable performance compared with other related methods. |
Tasks | |
Published | 2019-03-30 |
URL | http://arxiv.org/abs/1904.00172v1 |
http://arxiv.org/pdf/1904.00172v1.pdf | |
PWC | https://paperswithcode.com/paper/ee-ae-an-exclusivity-enhanced-unsupervised |
Repo | |
Framework | |
A Simple Dual-decoder Model for Generating Response with Sentiment
Title | A Simple Dual-decoder Model for Generating Response with Sentiment |
Authors | Xiuyu Wu, Yunfang Wu |
Abstract | Generating human-like responses is one of the most challenging tasks for artificial intelligence. In a real application, after reading the same post, different people might write responses with positive or negative sentiment according to their own experiences and attitudes. To simulate this procedure, we propose a simple but effective dual-decoder model that generates a response with a particular sentiment by connecting two sentiment decoders to one encoder. To support training this model, we construct a new conversation dataset of the form (post, resp1, resp2), where the two responses contain opposite sentiment. Experimental results show that our dual-decoder model can generate diverse responses with the target sentiment, and obtains a significant performance gain in sentiment accuracy and word diversity over the traditional single-decoder model. We will make our data and code publicly available for further study. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06597v1 |
https://arxiv.org/pdf/1905.06597v1.pdf | |
PWC | https://paperswithcode.com/paper/a-simple-dual-decoder-model-for-generating |
Repo | |
Framework | |
SAG-VAE: End-to-end Joint Inference of Data Representations and Feature Relations
Title | SAG-VAE: End-to-end Joint Inference of Data Representations and Feature Relations |
Authors | Chen Wang, Chengyuan Deng, Vladimir Ivanov |
Abstract | Variational Autoencoders (VAEs) are powerful in data representation inference, but they cannot learn relations between features in their vanilla form and common variations. The ability to capture relations within data can provide the much-needed inductive bias necessary for building more robust Machine Learning algorithms with more interpretable results. In this paper, inspired by recent advances in relational learning using Graph Neural Networks, we propose the Self-Attention Graph Variational AutoEncoder (SAG-VAE) network, which can simultaneously learn feature relations and data representations in an end-to-end manner. SAG-VAE is trained by jointly inferring the posterior distribution of two types of latent variables, which denote the data representation and a shared graph structure, respectively. Furthermore, we introduce a novel self-attention graph network that improves the generative capabilities of SAG-VAE by parameterizing the generative distribution, allowing SAG-VAE to generate new data via graph convolution while remaining trainable via backpropagation. A learnable relational graph representation enhances SAG-VAE’s robustness to perturbation and noise, while also providing deeper intuition into model performance. Experiments based on graphs show that SAG-VAE is capable of approximately retrieving edges and links between nodes based entirely on feature observations. Finally, results on image data illustrate that SAG-VAE is fairly robust against perturbations in image reconstruction and sampling. |
Tasks | Image Reconstruction, Relational Reasoning |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.11984v2 |
https://arxiv.org/pdf/1911.11984v2.pdf | |
PWC | https://paperswithcode.com/paper/sag-vae-end-to-end-joint-inference-of-data |
Repo | |
Framework | |
Spectral clustering in the weighted stochastic block model
Title | Spectral clustering in the weighted stochastic block model |
Authors | Ian Gallagher, Anna Bertiger, Carey Priebe, Patrick Rubin-Delanchy |
Abstract | This paper is concerned with the statistical analysis of a real-valued symmetric data matrix. We assume a weighted stochastic block model: the matrix indices, taken to represent nodes, can be partitioned into communities so that all entries corresponding to a given community pair are replicates of the same random variable. Extending results previously known only for unweighted graphs, we provide a limit theorem showing that the point cloud obtained from spectrally embedding the data matrix follows a Gaussian mixture model where each community is represented with an elliptical component. We can therefore formally evaluate how well the communities separate under different data transformations, for example, whether it is productive to “take logs”. We find that performance is invariant to affine transformation of the entries, but this expected and desirable feature hinges on adaptively selecting the eigenvectors according to eigenvalue magnitude and using Gaussian clustering. We present a network anomaly detection problem with cyber-security data where the matrix of log p-values, as opposed to p-values, has both theoretical and empirical advantages. |
Tasks | Anomaly Detection |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05534v1 |
https://arxiv.org/pdf/1910.05534v1.pdf | |
PWC | https://paperswithcode.com/paper/spectral-clustering-in-the-weighted |
Repo | |
Framework | |
Geometric Image Correspondence Verification by Dense Pixel Matching
Title | Geometric Image Correspondence Verification by Dense Pixel Matching |
Authors | Zakaria Laskar, Iaroslav Melekhov, Hamed R. Tavakoli, Juha Ylioinas, Juho Kannala |
Abstract | This paper addresses the problem of determining dense pixel correspondences between two images and its application to geometric correspondence verification in image retrieval. The main contribution is a geometric correspondence verification approach for re-ranking a shortlist of retrieved database images based on their dense pair-wise matching with the query image at a pixel level. We determine a set of cyclically consistent dense pixel matches between the pair of images and evaluate local similarity of matched pixels using neural network based image descriptors. Final re-ranking is based on a novel similarity function, which fuses the local similarity metric with a global similarity metric and a geometric consistency measure computed for the matched pixels. For dense matching our approach utilizes a modified version of a recently proposed dense geometric correspondence network (DGC-Net), which we also improve by optimizing the architecture. The proposed model and similarity metric compare favourably to the state-of-the-art image retrieval methods. In addition, we apply our method to the problem of long-term visual localization demonstrating promising results and generalization across datasets. |
Tasks | Image Retrieval, Visual Localization |
Published | 2019-04-15 |
URL | https://arxiv.org/abs/1904.06882v2 |
https://arxiv.org/pdf/1904.06882v2.pdf | |
PWC | https://paperswithcode.com/paper/geometric-image-correspondence-verification |
Repo | |
Framework | |
Quantum-inspired annealers as Boltzmann generators for machine learning and statistical physics
Title | Quantum-inspired annealers as Boltzmann generators for machine learning and statistical physics |
Authors | Alexander E. Ulanov, Egor S. Tiunov, A. I. Lvovsky |
Abstract | Quantum simulators and processors are rapidly improving nowadays, but they are still not able to solve complex and multidimensional tasks of practical value. However, certain numerical algorithms inspired by the physics of real quantum devices prove to be efficient in application to specific problems, related, for example, to combinatorial optimization. Here we implement a numerical annealer based on simulating the coherent Ising machine as a tool to sample from a high-dimensional Boltzmann probability distribution with the energy functional defined by the classical Ising Hamiltonian. Samples provided by such a generator are then utilized for the partition function estimation of this distribution and for the training of a general Boltzmann machine. Our study opens up a door to practical application of numerical quantum-inspired annealers. |
Tasks | Combinatorial Optimization |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08480v1 |
https://arxiv.org/pdf/1912.08480v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-inspired-annealers-as-boltzmann |
Repo | |
Framework | |
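The abstract above refers to a Boltzmann distribution whose energy is the classical Ising Hamiltonian, and to estimating its partition function. As a minimal sketch of what those quantities are, the snippet below evaluates them exactly by brute-force enumeration on a tiny system. It is not the authors' annealer: the sign convention `H(s) = -sum_{i<j} J_ij s_i s_j - sum_i h_i s_i`, the inverse temperature `beta`, and all function names are assumptions chosen for illustration, and enumeration is only feasible for a handful of spins.

```python
import itertools
import math


def ising_energy(spins, J, h):
    """Classical Ising energy H(s) = -sum_{i<j} J[i][j] s_i s_j - sum_i h[i] s_i.

    The sign convention is an assumption; the abstract does not state one.
    """
    n = len(spins)
    pair = sum(J[i][j] * spins[i] * spins[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(h[i] * spins[i] for i in range(n))
    return -pair - field


def partition_function(J, h, beta):
    """Exact Z = sum over all 2^n spin configurations of exp(-beta * H(s))."""
    n = len(h)
    return sum(math.exp(-beta * ising_energy(s, J, h))
               for s in itertools.product((-1, 1), repeat=n))


def boltzmann_probability(spins, J, h, beta):
    """Probability of one configuration under the Boltzmann distribution."""
    weight = math.exp(-beta * ising_energy(spins, J, h))
    return weight / partition_function(J, h, beta)


# Two ferromagnetically coupled spins with no external field.
J = [[0.0, 1.0], [0.0, 0.0]]
h = [0.0, 0.0]
p_aligned = boltzmann_probability((1, 1), J, h, beta=1.0)
```

A sampler such as the one the paper describes is useful precisely because this exact sum grows as 2^n and becomes intractable for high-dimensional systems.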
Evaluating Adversarial Evasion Attacks in the Context of Wireless Communications
Title | Evaluating Adversarial Evasion Attacks in the Context of Wireless Communications |
Authors | Bryse Flowers, R. Michael Buehrer, William C. Headley |
Abstract | Recent advancements in radio frequency machine learning (RFML) have demonstrated the use of raw in-phase and quadrature (IQ) samples for multiple spectrum sensing tasks. Yet, deep learning techniques have been shown, in other applications, to be vulnerable to adversarial machine learning (ML) techniques, which seek to craft small perturbations that are added to the input to cause a misclassification. The current work differentiates the threats that adversarial ML poses to RFML systems based on where the attack is executed from: direct access to classifier input, synchronously transmitted over the air (OTA), or asynchronously transmitted from a separate device. Additionally, the current work develops a methodology for evaluating adversarial success in the context of wireless communications, where the primary metric of interest is bit error rate and not human perception, as is the case in image recognition. The methodology is demonstrated using the well known Fast Gradient Sign Method to evaluate the vulnerabilities of raw IQ based Automatic Modulation Classification and concludes RFML is vulnerable to adversarial examples, even in OTA attacks. However, RFML domain specific receiver effects, which would be encountered in an OTA attack, can present significant impairments to adversarial evasion. |
Tasks | |
Published | 2019-03-01 |
URL | http://arxiv.org/abs/1903.01563v1 |
http://arxiv.org/pdf/1903.01563v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-adversarial-evasion-attacks-in-the |
Repo | |
Framework | |
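The Fast Gradient Sign Method named in the abstract above perturbs an input by a small step `eps` in the direction of the sign of the loss gradient with respect to that input. The sketch below illustrates only that update rule. The toy logistic-regression classifier, its analytic gradient, and every name and value here are assumptions for illustration; the paper applies the method to a deep classifier operating on raw IQ samples, which this example does not model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def logistic_loss(x, y, w, b):
    """Binary cross-entropy of a logistic-regression classifier on one sample."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def fgsm_logistic(x, y, w, b, eps):
    """FGSM step: x_adv = x + eps * sign(d loss / d x).

    For logistic regression the input gradient is (p - y) * w.
    """
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical weights and input, chosen only to exercise the update rule.
w, b = [1.0, -2.0], 0.0
x, y = [0.5, 0.5], 1
x_adv = fgsm_logistic(x, y, w, b, eps=0.1)
```

Each coordinate moves by at most `eps`, so the perturbation is bounded in the max norm while the loss on the true label increases.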
MetaMixUp: Learning Adaptive Interpolation Policy of MixUp with Meta-Learning
Title | MetaMixUp: Learning Adaptive Interpolation Policy of MixUp with Meta-Learning |
Authors | Zhijun Mai, Guosheng Hu, Dexiong Chen, Fumin Shen, Heng Tao Shen |
Abstract | MixUp is an effective data augmentation method to regularize deep neural networks via random linear interpolations between pairs of samples and their labels. It plays an important role in model regularization, semi-supervised learning and domain adaptation. However, despite its empirical success, the deficiency of randomly mixing samples has been poorly studied. Since deep networks are capable of memorizing the entire dataset, the corrupted samples generated by vanilla MixUp with a badly chosen interpolation policy will degrade the performance of networks. To overcome the underfitting caused by corrupted samples, inspired by meta-learning (learning to learn), we propose a novel technique of learning to mixup in this work, namely, MetaMixUp. Unlike vanilla MixUp, which samples its interpolation policy from a predefined distribution, this paper introduces a meta-learning based online optimization approach to dynamically learn the interpolation policy in a data-adaptive way. The validation set performance via meta-learning captures the underfitting issue, which provides more information to refine the interpolation policy. Furthermore, we adapt our method for pseudo-label based semi-supervised learning (SSL) along with a refined pseudo-labeling strategy. In our experiments, our method achieves better performance than vanilla MixUp and its variants under the supervised learning configuration. In particular, extensive experiments show that our MetaMixUp-adapted SSL greatly outperforms MixUp and many state-of-the-art methods on the CIFAR-10 and SVHN benchmarks under the SSL configuration. |
Tasks | Data Augmentation, Domain Adaptation, Meta-Learning |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10059v1 |
https://arxiv.org/pdf/1908.10059v1.pdf | |
PWC | https://paperswithcode.com/paper/metamixup-learning-adaptive-interpolation |
Repo | |
Framework | |
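For readers unfamiliar with the baseline, vanilla MixUp as described in the abstract above forms a new training pair by linearly interpolating two samples and their labels with a mixing weight drawn from a fixed distribution. The sketch below shows that baseline only, not the meta-learned policy the paper proposes. Drawing the weight from Beta(alpha, alpha) follows the usual MixUp recipe and is an assumption here, since the abstract says only "a predefined distribution"; the function names are illustrative.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Interpolate two samples and their one-hot labels with weight lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha) (assumed fixed policy)."""
    return rng.betavariate(alpha, alpha)


# Mix a class-0 sample with a class-1 sample using a randomly drawn weight.
lam = sample_lambda(alpha=0.4, rng=random.Random(0))
x_mixed, y_mixed = mixup([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], lam)
```

In this baseline `alpha` is a hand-tuned hyperparameter; the paper's contribution is to replace that fixed choice with a policy learned from validation performance.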
Learning Algorithms via Neural Logic Networks
Title | Learning Algorithms via Neural Logic Networks |
Authors | Ali Payani, Faramarz Fekri |
Abstract | We propose a novel learning paradigm for Deep Neural Networks (DNNs) using Boolean logic algebra. We first present the basic differentiable operators of a Boolean system, such as conjunction, disjunction and exclusive-OR, and show how these elementary operators can be combined in a simple and meaningful way to form Neural Logic Networks (NLNs). We examine the effectiveness of the proposed NLN framework in learning Boolean functions and discrete-algorithmic tasks. We demonstrate that, in contrast to the implicit learning in the MLP approach, the proposed neural logic networks can explicitly learn logical functions that can be verified and interpreted by humans. In particular, we propose a new framework for learning inductive logic programming (ILP) problems by exploiting the explicit representational power of NLNs. We show that the proposed neural ILP solver is capable of feats such as predicate invention and recursion and can outperform the current state-of-the-art neural ILP solvers on a variety of benchmark tasks such as decimal addition and multiplication, and sorting on ordered list. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01554v1 |
http://arxiv.org/pdf/1904.01554v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-algorithms-via-neural-logic-networks |
Repo | |
Framework | |
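To make the phrase "differentiable operators of a Boolean system" concrete, the sketch below uses the standard product relaxation of conjunction, disjunction and exclusive-OR on truth values in [0, 1]. This is one common choice and an assumption on our part; the abstract does not give the paper's operator definitions, which may differ (for example by including learnable membership weights). The function names are illustrative.

```python
def soft_and(a, b):
    """Product relaxation of conjunction: equals a AND b on {0, 1}."""
    return a * b


def soft_or(a, b):
    """Probabilistic-sum relaxation of disjunction: equals a OR b on {0, 1}."""
    return a + b - a * b


def soft_xor(a, b):
    """Polynomial relaxation of exclusive-OR: equals a XOR b on {0, 1}."""
    return a + b - 2.0 * a * b


# On Boolean corners the relaxations reproduce the truth tables exactly;
# in between they vary smoothly, so gradients can flow through them.
table = [(a, b, soft_and(a, b), soft_or(a, b), soft_xor(a, b))
         for a in (0.0, 1.0) for b in (0.0, 1.0)]
```

Because each operator is a polynomial in its inputs, compositions of them remain differentiable, which is what allows a logic expression to be trained by gradient descent and then read off as an explicit formula.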
Adaptive Intelligent Secondary Control of Microgrids Using a Biologically-Inspired Reinforcement Learning
Title | Adaptive Intelligent Secondary Control of Microgrids Using a Biologically-Inspired Reinforcement Learning |
Authors | Mohammad Jafari, Vahid Sarfi, Amir Ghasemkhani, Hanif Livani, Lei Yang, Hao Xu |
Abstract | In this paper, a biologically-inspired adaptive intelligent secondary controller is developed for microgrids to tackle system dynamics uncertainties, faults, and/or disturbances. The developed adaptive biologically-inspired controller adopts a novel computational model of emotional learning in mammalian limbic system. The learning capability of the proposed biologically-inspired intelligent controller makes it a promising approach to deal with the power system non-linear and volatile dynamics without increasing the controller complexity, and maintain the voltage and frequency stabilities by using an efficient reference tracking mechanism. The performance of the proposed intelligent secondary controller is validated in terms of the voltage and frequency absolute errors in the simulated microgrid. Simulation results highlight the efficiency and robustness of the proposed intelligent controller under the fault conditions and different system uncertainties compared to other benchmark controllers. |
Tasks | |
Published | 2019-05-02 |
URL | http://arxiv.org/abs/1905.00557v1 |
http://arxiv.org/pdf/1905.00557v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-intelligent-secondary-control-of |
Repo | |
Framework | |
Uncertainty Measures and Prediction Quality Rating for the Semantic Segmentation of Nested Multi Resolution Street Scene Images
Title | Uncertainty Measures and Prediction Quality Rating for the Semantic Segmentation of Nested Multi Resolution Street Scene Images |
Authors | Matthias Rottmann, Marius Schubert |
Abstract | In the semantic segmentation of street scenes the reliability of the prediction and therefore uncertainty measures are of highest interest. We present a method that generates for each input image a hierarchy of nested crops around the image center and presents these, all re-scaled to the same size, to a neural network for semantic segmentation. The resulting softmax outputs are then post-processed such that we can investigate mean and variance over all image crops as well as mean and variance of uncertainty heat maps obtained from pixel-wise uncertainty measures, like the entropy, applied to each crop’s softmax output. In our tests, we use the publicly available DeepLabv3+ MobilenetV2 network (trained on the Cityscapes dataset) and demonstrate that the incorporation of crops improves the quality of the prediction and that we obtain more reliable uncertainty measures. These are then aggregated over predicted segments for either classifying between IoU=0 and IoU>0 (meta classification) or predicting the IoU via linear regression (meta regression). The latter yields reliable performance estimates for segmentation networks, in particular useful in the absence of ground truth. For the task of meta classification we obtain a classification accuracy of $81.93\%$ and an AUROC of $89.89\%$. For meta regression we obtain an $R^2$ value of $84.77\%$. These results yield significant improvements compared to other approaches. |
Tasks | Semantic Segmentation |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04516v1 |
http://arxiv.org/pdf/1904.04516v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-measures-and-prediction-quality |
Repo | |
Framework | |
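The pixel-wise entropy mentioned in the abstract above, together with its mean and variance across nested crops, can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions: the logits are made-up numbers standing in for the network's per-crop outputs at one pixel, the entropy is left unnormalised (in nats), and the crop generation, re-scaling and segment-wise aggregation steps of the paper are not reproduced.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_and_variance(values):
    """Population mean and variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


# Hypothetical logits for the same pixel as seen in three nested crops.
crop_logits = [[2.0, 0.5, 0.1], [1.8, 0.9, 0.2], [0.7, 0.6, 0.5]]
crop_entropies = [entropy(softmax(l)) for l in crop_logits]
mean_h, var_h = mean_and_variance(crop_entropies)
```

A high mean entropy indicates a pixel that is uncertain in every crop, while a high variance indicates that the prediction changes with the crop scale.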
Service Wrapper: a system for converting web data into web services
Title | Service Wrapper: a system for converting web data into web services |
Authors | Naibo Wang, Zhiling Luo, Xiya Lyu, Zitong Yang, Jianwei Yin |
Abstract | Web services are widely used in many areas via callable APIs; however, data are not always available in this way. We often need to get data from web pages whose structure is not in order. Many developers use web data extraction methods to generate wrappers that get useful content from websites and convert it into well-structured files. These methods, however, are designed specifically for professional wrapper program developers and are not friendly to users without expertise in this domain. In this work, we construct a service wrapper system to convert available data in web pages into web services. Additionally, a set of algorithms is introduced to solve problems in the whole conversion process. People can use our system to convert web data into web services with fool-style operations and invoke these services in one simple step, which greatly expands the use of web data. Our cases show the ease of use, high availability, and stability of our system. |
Tasks | |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.07786v1 |
https://arxiv.org/pdf/1910.07786v1.pdf | |
PWC | https://paperswithcode.com/paper/service-wrapper-a-system-for-converting-web |
Repo | |
Framework | |
Multi-Task Time Series Analysis applied to Drug Response Modelling
Title | Multi-Task Time Series Analysis applied to Drug Response Modelling |
Authors | Alex Bird, Christopher K. I. Williams, Christopher Hawthorne |
Abstract | Time series models such as dynamical systems are frequently fitted to a cohort of data, ignoring variation between individual entities such as patients. In this paper we show how these models can be personalised to an individual level while retaining statistical power, via use of multi-task learning (MTL). To our knowledge this is a novel development of MTL which applies to time series both with and without control inputs. The modelling framework is demonstrated on a physiological drug response problem which results in improved predictive accuracy and uncertainty estimation over existing state-of-the-art models. |
Tasks | Multi-Task Learning, Time Series, Time Series Analysis |
Published | 2019-03-21 |
URL | http://arxiv.org/abs/1903.08970v1 |
http://arxiv.org/pdf/1903.08970v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-task-time-series-analysis-applied-to |
Repo | |
Framework | |
Know2Look: Commonsense Knowledge for Visual Search
Title | Know2Look: Commonsense Knowledge for Visual Search |
Authors | Sreyasi Nag Chowdhury, Niket Tandon, Gerhard Weikum |
Abstract | With the rise in popularity of social media, images accompanied by contextual text form a huge section of the web. However, search and retrieval of documents are still largely dependent on solely textual cues. Although visual cues have started to gain focus, imperfections in object/scene detection prevent them from leading to significantly improved results. We hypothesize that the use of background commonsense knowledge on query terms can significantly aid in retrieval of documents with associated images. To this end we deploy three different modalities - text, visual cues, and commonsense knowledge pertaining to the query - as a recipe for efficient search and retrieval. |
Tasks | |
Published | 2019-09-02 |
URL | https://arxiv.org/abs/1909.00749v1 |
https://arxiv.org/pdf/1909.00749v1.pdf | |
PWC | https://paperswithcode.com/paper/know2look-commonsense-knowledge-for-visual-1 |
Repo | |
Framework | |