Paper Group ANR 633
Abstracting Probabilistic Models: A Logical Perspective
Making “fetch” happen: The influence of social and linguistic context on nonstandard word growth and decline
Information Theoretic Interpretation of Deep learning
On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
Is Ordered Weighted $\ell_1$ Regularized Regression Robust to Adversarial Perturbation? A Case Study on OSCAR
EMHMM Simulation Study
Autonomously and Simultaneously Refining Deep Neural Network Parameters by a Bi-Generative Adversarial Network Aided Genetic Algorithm
Hands-on Experience with Gaussian Processes (GPs): Implementing GPs in Python - I
An Empirical Study of Generative Models with Encoders
Watermark Retrieval from 3D Printed Objects via Convolutional Neural Networks
Knowledge Base Relation Detection via Multi-View Matching
Nonlocal flocking dynamics: Learning the fractional order of PDEs from particle simulations
Spatial-temporal Fusion Convolutional Neural Network for Simulated Driving Behavior Recognition
Optical Neural Networks
Efficient Relaxations for Dense CRFs with Sparse Higher Order Potentials
Abstracting Probabilistic Models: A Logical Perspective
Title | Abstracting Probabilistic Models: A Logical Perspective |
Authors | Vaishak Belle |
Abstract | Abstraction is a powerful idea widely used in science, to model, reason and explain the behavior of systems in a more tractable search space, by omitting irrelevant details. While notions of abstraction have matured for deterministic systems, the case for abstracting probabilistic models is not yet fully understood. In this paper, we provide a semantical framework for analyzing such abstractions from first principles. We develop the framework in a general way, allowing for expressive languages, including logic-based ones that admit relational and hierarchical constructs with stochastic primitives. We motivate a definition of consistency between a high-level model and its low-level counterpart, but also treat the case when the high-level model is missing critical information present in the low-level model. We prove properties of abstractions, both at the level of the parameter as well as the structure of the models. We conclude with some observations about how abstractions can be derived automatically. |
Tasks | |
Published | 2018-10-04 |
URL | https://arxiv.org/abs/1810.02434v3
PDF | https://arxiv.org/pdf/1810.02434v3.pdf
PWC | https://paperswithcode.com/paper/abstracting-probabilistic-relational-models |
Repo | |
Framework | |
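To make the consistency notion concrete, here is a toy illustration (not the paper's logical framework): a low-level joint distribution is abstracted through a mapping between worlds, and a high-level model is consistent exactly when it matches the pushforward of the low-level distribution. All names and numbers below are invented for illustration.

```python
from itertools import product

# Low-level model: a joint distribution over (rain, sprinkler).
# Purely illustrative; the paper treats expressive logical languages.
low = dict(zip(product([0, 1], repeat=2), [0.4, 0.2, 0.3, 0.1]))

def alpha(world):
    """Abstraction map: the high-level world 'wet' holds iff rain or sprinkler."""
    rain, sprinkler = world
    return int(rain or sprinkler)

# Pushforward of the low-level distribution through alpha.
pushed = {}
for world, p in low.items():
    pushed[alpha(world)] = pushed.get(alpha(world), 0.0) + p

# A candidate high-level model is consistent iff it matches the pushforward.
high = {0: 0.4, 1: 0.6}
consistent = all(abs(high[h] - pushed.get(h, 0.0)) < 1e-9 for h in high)
print(pushed, consistent)  # {0: 0.4, 1: 0.6} True
```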
Making “fetch” happen: The influence of social and linguistic context on nonstandard word growth and decline
Title | Making “fetch” happen: The influence of social and linguistic context on nonstandard word growth and decline |
Authors | Ian Stewart, Jacob Eisenstein |
Abstract | In an online community, new words come and go: today’s “haha” may be replaced by tomorrow’s “lol.” Changes in online writing are usually studied as a social process, with innovations diffusing through a network of individuals in a speech community. But unlike other types of innovation, language change is shaped and constrained by the system in which it takes part. To investigate the links between social and structural factors in language change, we undertake a large-scale analysis of nonstandard word growth in the online community Reddit. We find that dissemination across many linguistic contexts is a sign of growth: words that appear in more linguistic contexts grow faster and survive longer. We also find that social dissemination likely plays a less important role in explaining word growth and decline than previously hypothesized. |
Tasks | |
Published | 2018-02-09 |
URL | http://arxiv.org/abs/1802.04140v2
PDF | http://arxiv.org/pdf/1802.04140v2.pdf
PWC | https://paperswithcode.com/paper/making-fetch-happen-the-influence-of-social-1 |
Repo | |
Framework | |
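The paper's linguistic-dissemination idea, that words appearing in many contexts grow faster, can be approximated by a simple context-counting metric. The sketch below is an illustrative simplification (the function name, the bigram-context definition, and the normalization are assumptions, not the paper's exact formulation):

```python
def context_dissemination(tokens, word):
    """Fraction of the word's occurrences that sit in distinct
    (previous-word, next-word) contexts; an illustrative proxy for
    spreading across many linguistic contexts."""
    contexts, count = set(), 0
    for i, tok in enumerate(tokens):
        if tok == word:
            count += 1
            prev = tokens[i - 1] if i > 0 else "<s>"
            nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
            contexts.add((prev, nxt))
    return len(contexts) / count if count else 0.0

tokens = "lol that was funny lol so funny lol".split()
print(context_dissemination(tokens, "lol"))  # 1.0: three uses, three contexts
```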
Information Theoretic Interpretation of Deep learning
Title | Information Theoretic Interpretation of Deep learning |
Authors | Tianchen Zhao |
Abstract | We interpret part of the experimental results of Shwartz-Ziv and Tishby [2017]. Inspired by these results, we establish a conjecture about the dynamics of the machinery of deep neural networks. This conjecture can be used to explain the counterpart results of Saxe et al. [2018]. |
Tasks | |
Published | 2018-03-21 |
URL | http://arxiv.org/abs/1803.07980v2
PDF | http://arxiv.org/pdf/1803.07980v2.pdf
PWC | https://paperswithcode.com/paper/information-theoretic-interpretation-of-deep |
Repo | |
Framework | |
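The information-plane analyses this note interprets estimate mutual information between inputs and a layer's activations by discretizing the activations into bins. A minimal numpy sketch of that binning estimator (the bin count, sample count, and placeholder activations are assumptions, not this note's own experiments):

```python
import numpy as np

def mutual_information(x_ids, t, n_bins=30):
    """Estimate I(X;T) by binning activations T, as in information-plane
    analyses. x_ids: integer id per input sample; t: (n, d) activations."""
    # Discretize each activation dimension into equal-width bins.
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    t_binned = np.digitize(t, edges[1:-1])
    # Each distinct binned row is one discrete state of T.
    _, t_states = np.unique(t_binned, axis=0, return_inverse=True)

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    # I(X;T) = H(T) + H(X) - H(T,X); encode the joint state as one integer.
    joint = t_states * (x_ids.max() + 1) + x_ids
    return entropy(t_states) + entropy(x_ids) - entropy(joint)

rng = np.random.default_rng(0)
x_ids = np.arange(256)                   # one id per input sample
t = np.tanh(rng.normal(size=(256, 8)))   # placeholder layer activations
print(mutual_information(x_ids, t))
```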
On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
Title | On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups |
Authors | Risi Kondor, Shubhendu Trivedi |
Abstract | Convolutional neural networks have been extremely successful in the image recognition domain because they ensure equivariance to translations. There have been many recent attempts to generalize this framework to other domains, including graphs and data lying on manifolds. In this paper we give a rigorous, theoretical treatment of convolution and equivariance in neural networks with respect to not just translations, but the action of any compact group. Our main result is to prove that (given some natural constraints) convolutional structure is not just a sufficient, but also a necessary condition for equivariance to the action of a compact group. Our exposition makes use of concepts from representation theory and noncommutative harmonic analysis and derives new generalized convolution formulae. |
Tasks | |
Published | 2018-02-11 |
URL | http://arxiv.org/abs/1802.03690v3
PDF | http://arxiv.org/pdf/1802.03690v3.pdf
PWC | https://paperswithcode.com/paper/on-the-generalization-of-equivariance-and |
Repo | |
Framework | |
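For the special case of the cyclic group Z_n acting by circular shifts, the convolution the paper generalizes reduces to ordinary circular convolution, and equivariance can be checked numerically. A toy verification (not the paper's general representation-theoretic derivation):

```python
import numpy as np

def group_conv_zn(f, psi):
    """Convolution on the cyclic group Z_n: (f * psi)[x] = sum_y f[y] psi[x - y]."""
    n = len(f)
    return np.array([sum(f[y] * psi[(x - y) % n] for y in range(n))
                     for x in range(n)])

rng = np.random.default_rng(1)
f, psi = rng.normal(size=10), rng.normal(size=10)
shift = 3

lhs = np.roll(group_conv_zn(f, psi), shift)  # convolve, then act with g
rhs = group_conv_zn(np.roll(f, shift), psi)  # act with g, then convolve
print(np.allclose(lhs, rhs))                 # True: convolution is equivariant
```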
Is Ordered Weighted $\ell_1$ Regularized Regression Robust to Adversarial Perturbation? A Case Study on OSCAR
Title | Is Ordered Weighted $\ell_1$ Regularized Regression Robust to Adversarial Perturbation? A Case Study on OSCAR |
Authors | Pin-Yu Chen, Bhanukiran Vinzamuri, Sijia Liu |
Abstract | Many state-of-the-art machine learning models such as deep neural networks have recently been shown to be vulnerable to adversarial perturbations, especially in classification tasks. Motivated by adversarial machine learning, in this paper we investigate the robustness of sparse regression models with strongly correlated covariates to adversarially designed measurement noises. Specifically, we consider the family of ordered weighted $\ell_1$ (OWL) regularized regression methods and study the case of OSCAR (octagonal shrinkage clustering algorithm for regression) in the adversarial setting. Under a norm-bounded threat model, we formulate the process of finding a maximally disruptive noise for OWL-regularized regression as an optimization problem and illustrate the steps towards finding such a noise in the case of OSCAR. Experimental results demonstrate that the regression performance of grouping strongly correlated features can be severely degraded under our adversarial setting, even when the noise budget is significantly smaller than the ground-truth signals. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08706v2
PDF | http://arxiv.org/pdf/1809.08706v2.pdf
PWC | https://paperswithcode.com/paper/is-ordered-weighted-ell_1-regularized |
Repo | |
Framework | |
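The OWL regularizer applies a non-increasing weight vector to the coefficients sorted by absolute value; OSCAR is the instance with linearly decaying weights w_i = λ1 + λ2(p − i). A small numpy sketch of evaluating this penalty (the λ values and test vector are arbitrary placeholders):

```python
import numpy as np

def owl_norm(beta, w):
    """Ordered weighted l1 norm: sum_i w_i * |beta|_(i), with |beta| sorted
    in non-increasing order and w non-increasing."""
    return np.sort(np.abs(beta))[::-1] @ w

def oscar_weights(p, lam1=0.1, lam2=0.05):
    """OSCAR as an OWL instance: w_i = lam1 + lam2 * (p - i), i = 1..p."""
    return lam1 + lam2 * (p - np.arange(1, p + 1))

beta = np.array([0.5, -2.0, 0.0, 1.5])
w = oscar_weights(len(beta))
print(owl_norm(beta, w))  # 0.875 for these placeholder values
```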
EMHMM Simulation Study
Title | EMHMM Simulation Study |
Authors | Antoni B. Chan, Janet H. Hsiao |
Abstract | Eye Movement analysis with Hidden Markov Models (EMHMM) is a method for modeling eye fixation sequences using hidden Markov models (HMMs). In this report, we run a simulation study to investigate the estimation error for learning HMMs with variational Bayesian inference, with respect to the number of sequences and the sequence lengths. We also relate the estimation error measured by KL divergence and L1-norm to a corresponding distortion in the ground-truth HMM parameters. |
Tasks | Bayesian Inference |
Published | 2018-10-17 |
URL | https://arxiv.org/abs/1810.07435v2
PDF | https://arxiv.org/pdf/1810.07435v2.pdf
PWC | https://paperswithcode.com/paper/emhmm-simulation-study |
Repo | |
Framework | |
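The KL divergence between two HMMs has no closed form and is commonly approximated by Monte Carlo: sample sequences from one model and average the per-observation log-likelihood gap computed with the forward algorithm. A compact sketch of that approximation (the toy discrete HMMs are placeholders; EMHMM itself models fixation sequences with Gaussian emissions):

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (pi: initial, A: transition, B: emission), with per-step scaling."""
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum()); alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum()); alpha /= alpha.sum()
    return ll

def sample(pi, A, B, T, rng):
    s, seq = rng.choice(len(pi), p=pi), []
    for _ in range(T):
        seq.append(rng.choice(B.shape[1], p=B[s]))
        s = rng.choice(len(pi), p=A[s])
    return seq

rng = np.random.default_rng(0)
pi = np.array([0.6, 0.4])
A1 = np.array([[0.9, 0.1], [0.2, 0.8]]); A2 = np.array([[0.7, 0.3], [0.3, 0.7]])
B = np.array([[0.8, 0.2], [0.1, 0.9]])
T, n = 50, 200
kl = np.mean([(forward_loglik(seq, pi, A1, B) - forward_loglik(seq, pi, A2, B)) / T
              for seq in (sample(pi, A1, B, T, rng) for _ in range(n))])
print(kl)  # per-observation KL-rate estimate, >= 0 in expectation
```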
Autonomously and Simultaneously Refining Deep Neural Network Parameters by a Bi-Generative Adversarial Network Aided Genetic Algorithm
Title | Autonomously and Simultaneously Refining Deep Neural Network Parameters by a Bi-Generative Adversarial Network Aided Genetic Algorithm |
Authors | Yantao Lu, Burak Kakillioglu, Senem Velipasalar |
Abstract | The choice of parameters and the design of the network architecture are important factors affecting the performance of deep neural networks. Genetic Algorithms (GAs) have been used before to determine the parameters of a network. Yet, GAs perform a finite search over a discrete set of pre-defined candidates, and cannot, in general, generate unseen configurations. In this paper, to move from exploration to exploitation, we propose a novel and systematic method that autonomously and simultaneously optimizes multiple parameters of any deep neural network by using a GA aided by a bi-generative adversarial network (Bi-GAN). The proposed Bi-GAN allows the autonomous exploitation and choice of the number of neurons for fully-connected layers, and the number of filters for convolutional layers, from a large range of values. Our proposed Bi-GAN involves two generators, and two different models compete and improve each other progressively with a GAN-based strategy to optimize the networks during GA evolution. Our proposed approach can be used to autonomously refine the number of convolutional layers and dense layers, the number and size of kernels, and the number of neurons for the dense layers; choose the type of the activation function; and decide whether to use dropout and batch normalization or not, to improve the accuracy of different deep neural network architectures. Without loss of generality, the proposed method has been tested with the ModelNet database, and compared with the 3D ShapeNets and two GA-only methods. The results show that the presented approach can simultaneously and successfully optimize multiple neural network parameters, and achieve higher accuracy even with shallower networks. |
Tasks | |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.10244v1
PDF | http://arxiv.org/pdf/1809.10244v1.pdf
PWC | https://paperswithcode.com/paper/autonomously-and-simultaneously-refining-deep |
Repo | |
Framework | |
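The GA side of the method can be sketched as a selection-crossover-mutation loop over hyperparameter configurations. In this sketch the fitness function is a stand-in (a real run would train the candidate network and return validation accuracy), and the Bi-GAN generator that proposes unseen configurations is omitted:

```python
import random

random.seed(0)
SPACE = {"neurons": range(32, 1025), "filters": range(8, 257)}

def fitness(cand):
    # Stand-in: a real evaluation would train the network described by
    # `cand` and return its validation accuracy.
    return -abs(cand["neurons"] - 300) - abs(cand["filters"] - 64)

def mutate(cand):
    key = random.choice(list(SPACE))
    return {**cand, key: random.choice(SPACE[key])}

def crossover(a, b):
    return {k: random.choice((a[k], b[k])) for k in a}

pop = [{k: random.choice(v) for k, v in SPACE.items()} for _ in range(20)]
for gen in range(30):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                      # truncation selection
    children = [mutate(crossover(*random.sample(parents, 2)))
                for _ in range(10)]
    pop = parents + children
print(max(pop, key=fitness))
```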
Hands-on Experience with Gaussian Processes (GPs): Implementing GPs in Python - I
Title | Hands-on Experience with Gaussian Processes (GPs): Implementing GPs in Python - I |
Authors | Kshitij Tiwari |
Abstract | This document serves to complement our website, which was developed with the aim of exposing students to Gaussian Processes (GPs). GPs are non-parametric Bayesian regression models that are largely used by statisticians and geospatial data scientists for modeling spatial data. Several open-source libraries in Matlab [1], Python [2], R [3], etc. are already available for simple plug-and-use. The objective of this handout, and in turn the website, was to allow users to develop stand-alone GPs in Python while relying on minimal external dependencies. To this end, we use only the default Python modules and assist users in developing their own GPs from scratch, giving them an in-depth knowledge of what goes on under the hood. The module covers GP inference using maximum likelihood estimation (MLE) and gives examples with 1D (dummy) spatial data. |
Tasks | Gaussian Processes |
Published | 2018-09-06 |
URL | http://arxiv.org/abs/1809.01913v1
PDF | http://arxiv.org/pdf/1809.01913v1.pdf
PWC | https://paperswithcode.com/paper/hands-on-experience-with-gaussian-processes |
Repo | |
Framework | |
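In the spirit of the handout, the GP regression posterior can be written in a few lines. This sketch deviates from the handout's default-modules-only constraint by using numpy, and fixes the RBF kernel hyperparameters by hand instead of fitting them by MLE:

```python
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    """Squared-exponential kernel k(a,b) = var * exp(-|a-b|^2 / (2 ls^2))."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ls**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and covariance via the standard Cholesky recipe."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf(x_train, x_test)
    K_ss = rbf(x_test, x_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov

x = np.linspace(0, 5, 8)
y = np.sin(x)                          # dummy 1D "spatial" data
x_star = np.linspace(0, 5, 50)
mean, cov = gp_posterior(x, y, x_star)
print(mean[:5], np.diag(cov)[:5])
```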
An Empirical Study of Generative Models with Encoders
Title | An Empirical Study of Generative Models with Encoders |
Authors | Paul K. Rubenstein, Yunpeng Li, Dominik Roblek |
Abstract | Generative adversarial networks (GANs) are capable of producing high quality image samples. However, unlike variational autoencoders (VAEs), GANs lack encoders that provide the inverse mapping for the generators, i.e., encode images back to the latent space. In this work, we consider adversarially learned generative models that also have encoders. We evaluate models based on their ability to produce high quality samples and reconstructions of real images. Our main contributions are twofold: First, we find that the baseline Bidirectional GAN (BiGAN) can be improved upon with the addition of an autoencoder loss, at the expense of an extra hyper-parameter to tune. Second, we show that comparable performance to BiGAN can be obtained by simply training an encoder to invert the generator of a normal GAN. |
Tasks | |
Published | 2018-12-19 |
URL | http://arxiv.org/abs/1812.07909v1
PDF | http://arxiv.org/pdf/1812.07909v1.pdf
PWC | https://paperswithcode.com/paper/an-empirical-study-of-generative-models-with |
Repo | |
Framework | |
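The paper's second finding, that an encoder trained to invert a fixed generator performs comparably to BiGAN, reduces to minimizing a reconstruction loss through the frozen generator. A minimal PyTorch sketch with assumed toy dimensions (the MLP shapes and the pixel-space MSE objective are illustrative choices, not the paper's exact setup):

```python
import torch
import torch.nn as nn

z_dim, x_dim = 16, 64
G = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))
E = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, z_dim))

for p in G.parameters():          # the pretrained generator stays frozen
    p.requires_grad_(False)

opt = torch.optim.Adam(E.parameters(), lr=1e-3)
for step in range(200):
    z = torch.randn(32, z_dim)    # sample latents, synthesize "images"
    x = G(z)
    loss = nn.functional.mse_loss(G(E(x)), x)   # reconstruct through G
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```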
Watermark Retrieval from 3D Printed Objects via Convolutional Neural Networks
Title | Watermark Retrieval from 3D Printed Objects via Convolutional Neural Networks |
Authors | Xin Zhang, Qian Wang, Toby Breckon, Ioannis Ivrissimtzis |
Abstract | We present a method for reading digital data embedded in planar 3D printed surfaces. The data are organised in binary arrays and embedded as surface textures in a way inspired by QR codes. At the core of the retrieval method lies a Convolutional Neural Network, outputting a confidence map of the location of the surface textures encoding value 1 bits. Subsequently, the bit array is retrieved through a series of simple image processing and statistical operations applied on the confidence map. Extensive experimentation with images captured from various camera views, under various illumination conditions and from objects printed with various material colours, shows that the proposed method generalizes well and achieves the level of accuracy required in practical applications. |
Tasks | |
Published | 2018-11-19 |
URL | http://arxiv.org/abs/1811.07640v1
PDF | http://arxiv.org/pdf/1811.07640v1.pdf
PWC | https://paperswithcode.com/paper/watermark-retrieval-from-3d-printed-objects |
Repo | |
Framework | |
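The post-processing stage, recovering a bit array from the CNN's confidence map, can be illustrated by grid-pooling the map and thresholding each cell (the grid size, pooling choice, and threshold are assumptions; the paper applies a more careful sequence of image-processing and statistical operations):

```python
import numpy as np

def bits_from_confidence(conf, grid=(4, 4), thresh=0.5):
    """Split the confidence map into grid cells, average each cell,
    and threshold to recover the embedded bit array."""
    h, w = conf.shape
    gh, gw = grid
    cells = conf[: h - h % gh, : w - w % gw].reshape(
        gh, h // gh, gw, w // gw).mean(axis=(1, 3))
    return (cells > thresh).astype(int)

rng = np.random.default_rng(0)
truth = rng.integers(0, 2, size=(4, 4))
conf = np.kron(truth, np.ones((16, 16))) * 0.8 + 0.1          # ideal map
conf = np.clip(conf + rng.normal(0, 0.05, conf.shape), 0, 1)  # camera noise
print((bits_from_confidence(conf) == truth).all())            # True
```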
Knowledge Base Relation Detection via Multi-View Matching
Title | Knowledge Base Relation Detection via Multi-View Matching |
Authors | Yang Yu, Kazi Saidul Hasan, Mo Yu, Wei Zhang, Zhiguo Wang |
Abstract | Relation detection is a core component of Knowledge Base Question Answering (KBQA). In this paper, we propose a KB relation detection model based on multi-view matching, which utilizes more of the useful information extracted from the question and the KB. Matching inside each view is performed from multiple perspectives so that the two input texts are compared thoroughly. All these components are designed as an end-to-end trainable neural network model. Experiments on SimpleQuestions and WebQSP yield state-of-the-art results. |
Tasks | Knowledge Base Question Answering, Question Answering |
Published | 2018-03-01 |
URL | http://arxiv.org/abs/1803.00612v2
PDF | http://arxiv.org/pdf/1803.00612v2.pdf
PWC | https://paperswithcode.com/paper/knowledge-base-relation-detection-via-multi |
Repo | |
Framework | |
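A common form of the "multiple perspectives" comparison is a weighted cosine in which each perspective re-weights the two vectors element-wise before measuring similarity. A numpy sketch of that operation (dimensions and random weights are illustrative; in a real model the perspective weights are learned):

```python
import numpy as np

def multi_perspective_match(v1, v2, W):
    """m_k = cos(W[k] * v1, W[k] * v2): one similarity per perspective k,
    where each row of W re-weights the dimensions before comparison."""
    a, b = W * v1, W * v2                     # (K, d) broadcasts over rows
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den

rng = np.random.default_rng(0)
d, K = 8, 5
v1, v2 = rng.normal(size=d), rng.normal(size=d)
W = rng.uniform(0.0, 1.0, size=(K, d))        # learned in a real model
print(multi_perspective_match(v1, v2, W))     # K perspective scores
```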
Nonlocal flocking dynamics: Learning the fractional order of PDEs from particle simulations
Title | Nonlocal flocking dynamics: Learning the fractional order of PDEs from particle simulations |
Authors | Zhiping Mao, Zhen Li, George Em Karniadakis |
Abstract | Flocking refers to collective behavior of a large number of interacting entities, where the interactions between discrete individuals produce collective motion on the large scale. We employ an agent-based model to describe the microscopic dynamics of each individual in a flock, and use a fractional PDE to model the evolution of macroscopic quantities of interest. The macroscopic models with phenomenological interaction functions are derived by applying the continuum hypothesis to the microscopic model. Instead of specifying the fPDEs with an ad hoc fractional order for nonlocal flocking dynamics, we learn the effective nonlocal influence function in fPDEs directly from particle trajectories generated by the agent-based simulations. We demonstrate how the learning framework is used to connect the discrete agent-based model to the continuum fPDEs in 1D and 2D nonlocal flocking dynamics. In particular, a Cucker-Smale particle model is employed to describe the microscale dynamics of each individual, while Euler equations with nonlocal interaction terms are used to compute the evolution of macroscale quantities. The trajectories generated by the particle simulations mimic the field data of tracking logs that can be obtained experimentally. They can be used to learn the fractional order of the influence function using a Gaussian process regression model implemented with Bayesian optimization. We show that the numerical solution of the learned Euler equations solved by the finite volume scheme can yield correct density distributions consistent with the collective behavior of the agent-based system. The proposed method offers new insights into how to scale the discrete agent-based models to the continuum-based PDE models, and could serve as a paradigm for extracting effective governing equations for nonlocal flocking dynamics directly from particle trajectories. |
Tasks | |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11596v2
PDF | http://arxiv.org/pdf/1810.11596v2.pdf
PWC | https://paperswithcode.com/paper/nonlocal-flocking-dynamics-learning-the |
Repo | |
Framework | |
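The microscopic Cucker-Smale model referenced in the abstract steers each agent's velocity toward a weighted average of the others', with a communication weight that decays with distance. A minimal 1D numpy simulation (the form of ψ with exponent β, the step size, and the initial conditions are standard placeholder choices; in the paper the effective influence function is what gets learned):

```python
import numpy as np

def psi(r, beta=0.5):
    """Communication weight psi(r) = (1 + r^2)^(-beta)."""
    return (1.0 + r**2) ** (-beta)

rng = np.random.default_rng(0)
N, dt, steps = 50, 0.01, 2000
x = rng.uniform(0, 10, N)   # 1D positions
v = rng.normal(0, 1, N)     # 1D velocities

for _ in range(steps):
    r = np.abs(x[:, None] - x[None, :])
    # dv_i = (1/N) * sum_j psi(|x_i - x_j|) * (v_j - v_i)
    dv = (psi(r) * (v[None, :] - v[:, None])).sum(axis=1) / N
    v += dt * dv
    x += dt * v

print(v.std())  # near 0: velocities align (flocking consensus)
```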
Spatial-temporal Fusion Convolutional Neural Network for Simulated Driving Behavior Recognition
Title | Spatial-temporal Fusion Convolutional Neural Network for Simulated Driving Behavior Recognition |
Authors | Yaocong Hu, MingQi Lu, Xiaobo Lu |
Abstract | Abnormal driving behaviour is one of the leading causes of serious traffic accidents endangering human life. Therefore, the study of driving behaviour surveillance has become essential to traffic safety and public management. In this paper, we take up this problem and employ a two-stream CNN framework for video-based driving behaviour recognition, in which the spatial stream CNN captures appearance information from still frames, whilst the temporal stream CNN captures motion information from pre-computed optical flow displacements between a few adjacent video frames. We investigate different spatial-temporal fusion strategies to combine the intra-frame static clues and inter-frame dynamic clues for final behaviour recognition. To validate the effectiveness of the designed spatial-temporal deep-learning-based model, we create a simulated driving behaviour dataset containing 1237 videos of six different driving behaviours. Experimental results show that our proposed method obtains noticeable performance improvements compared to existing methods. |
Tasks | Optical Flow Estimation |
Published | 2018-12-03 |
URL | http://arxiv.org/abs/1812.00615v1
PDF | http://arxiv.org/pdf/1812.00615v1.pdf
PWC | https://paperswithcode.com/paper/spatial-temporal-fusion-convolutional-neural |
Repo | |
Framework | |
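The simplest of the fusion strategies investigated is late fusion: combine the class posteriors of the two streams, e.g. by a weighted average of softmax outputs. A toy sketch (the fusion weight and logits are placeholders; the six classes mirror the dataset's six behaviours):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(spatial_logits, temporal_logits, w=0.5):
    """Weighted average of per-stream class posteriors; other strategies
    fuse earlier, e.g. by combining intermediate feature maps."""
    return w * softmax(spatial_logits) + (1 - w) * softmax(temporal_logits)

spatial = np.array([2.0, 0.5, 0.1, 0.0, -1.0, 0.3])   # still-frame stream
temporal = np.array([1.5, 1.8, 0.2, -0.5, 0.0, 0.1])  # optical-flow stream
probs = late_fusion(spatial, temporal, w=0.6)
print(probs.argmax(), probs)  # predicted driving-behaviour class
```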
Optical Neural Networks
Title | Optical Neural Networks |
Authors | Grant Fennessy, Yevgeniy Vorobeychik |
Abstract | We develop a novel optical neural network (ONN) framework which introduces a degree of scalar invariance to image classification estimation. Taking a hint from the human eye, which has higher resolution near the center of the retina, images are broken out into multiple levels of varying zoom based on a focal point. Each level is passed through an identical convolutional neural network (CNN) in a Siamese fashion, and the results are recombined to produce a high accuracy estimate of the object class. ONNs act as a wrapper around existing CNNs, and can thus be applied to many existing algorithms to produce notable accuracy improvements without having to change the underlying architecture. |
Tasks | Image Classification |
Published | 2018-05-16 |
URL | http://arxiv.org/abs/1805.06082v2
PDF | http://arxiv.org/pdf/1805.06082v2.pdf
PWC | https://paperswithcode.com/paper/optical-neural-networks |
Repo | |
Framework | |
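The multi-level zoom idea can be sketched as extracting progressively tighter crops around a focal point and bringing each to the shared CNN's input size. Here plain numpy subsampling stands in for proper image resizing, and the focal point and halving zoom schedule are assumptions:

```python
import numpy as np

def zoom_pyramid(img, focal, levels=3, out=32):
    """Crop windows of halving size centred on `focal`, each subsampled to
    out x out; each level would be fed to the same (shared-weight) CNN."""
    h, w = img.shape
    crops, size = [], min(h, w)
    for _ in range(levels):
        cy = int(np.clip(focal[0], size // 2, h - size // 2))
        cx = int(np.clip(focal[1], size // 2, w - size // 2))
        win = img[cy - size // 2: cy + size // 2,
                  cx - size // 2: cx + size // 2]
        step = max(size // out, 1)           # crude stand-in for resizing
        crops.append(win[::step, ::step][:out, :out])
        size //= 2                           # next level zooms in 2x
    return crops

img = np.random.default_rng(0).random((256, 256))
for c in zoom_pyramid(img, focal=(128, 140)):
    print(c.shape)                           # (32, 32) at every zoom level
```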
Efficient Relaxations for Dense CRFs with Sparse Higher Order Potentials
Title | Efficient Relaxations for Dense CRFs with Sparse Higher Order Potentials |
Authors | Thomas Joy, Alban Desmaison, Thalaiyasingam Ajanthan, Rudy Bunel, Mathieu Salzmann, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar |
Abstract | Dense conditional random fields (CRFs) have become a popular framework for modelling several problems in computer vision such as stereo correspondence and multi-class semantic segmentation. By modelling long-range interactions, dense CRFs provide a labelling that captures finer detail than their sparse counterparts. Currently, the state-of-the-art algorithm performs mean-field inference using a filter-based method but fails to provide a strong theoretical guarantee on the quality of the solution. A question naturally arises as to whether it is possible to obtain a maximum a posteriori (MAP) estimate of a dense CRF using a principled method. Within this paper, we show that this is indeed possible. We will show that, by using a filter-based method, continuous relaxations of the MAP problem can be optimised efficiently using state-of-the-art algorithms. Specifically, we will solve a quadratic programming (QP) relaxation using the Frank-Wolfe algorithm and a linear programming (LP) relaxation by developing a proximal minimisation framework. By exploiting labelling consistency in the higher-order potentials and utilising the filter-based method, we are able to formulate the above algorithms such that each iteration has a complexity linear in the number of classes and random variables. The presented algorithms can be applied to any labelling problem using a dense CRF with sparse higher-order potentials. In this paper, we use semantic segmentation as an example application as it demonstrates the ability of the algorithm to scale to dense CRFs with large dimensions. We perform experiments on the Pascal dataset to indicate that the presented algorithms are able to attain lower energies than the mean-field inference method. |
Tasks | Semantic Segmentation |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.09028v2
PDF | http://arxiv.org/pdf/1805.09028v2.pdf
PWC | https://paperswithcode.com/paper/efficient-relaxations-for-dense-crfs-with |
Repo | |
Framework | |
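The QP relaxation can be illustrated with a generic Frank-Wolfe loop: relax each discrete label assignment to a point on the probability simplex and, at every iteration, move toward the vertex (a hard labelling) that minimizes the linearized energy. This toy uses a small dense PSD energy matrix rather than the paper's filter-based dense-CRF energy:

```python
import numpy as np

def frank_wolfe_labelling(unary, pairwise, iters=100):
    """Minimize sum_i <u_i, x_i> + x^T P x over one simplex per variable.
    x: (n, L) relaxed labelling. Toy stand-in for the dense-CRF QP."""
    n, L = unary.shape
    x = np.full((n, L), 1.0 / L)
    for t in range(iters):
        grad = unary + 2 * (pairwise @ x.reshape(-1)).reshape(n, L)
        s = np.zeros_like(x)                        # conditional gradient:
        s[np.arange(n), grad.argmin(axis=1)] = 1.0  # best vertex per simplex
        x += 2.0 / (t + 2) * (s - x)                # standard FW step size
    return x

rng = np.random.default_rng(0)
n, L = 6, 3
unary = rng.normal(size=(n, L))
M = rng.normal(size=(n * L, n * L)) * 0.1
pairwise = M @ M.T                                  # PSD => convex QP
x = frank_wolfe_labelling(unary, pairwise)
print(x.argmax(axis=1))                             # rounded hard labelling
```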