April 2, 2020

# Paper Group ANR 116

A Rotation-Invariant Framework for Deep Point Cloud Analysis. The Implicit and Explicit Regularization Effects of Dropout. Novel Language Resources for Hindi: An Aesthetics Text Corpus and a Comprehensive Stop Lemma List. PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy. Generalized Nested Rollout Policy Adaptation. Oracle low …

#### A Rotation-Invariant Framework for Deep Point Cloud Analysis

Title A Rotation-Invariant Framework for Deep Point Cloud Analysis
Authors Xianzhi Li, Ruihui Li, Guangyong Chen, Chi-Wing Fu, Daniel Cohen-Or, Pheng-Ann Heng
Abstract Recently, many deep neural networks were designed to process 3D point clouds, but a common drawback is that rotation invariance is not ensured, leading to poor generalization to arbitrary orientations. In this paper, we introduce a new low-level purely rotation-invariant representation to replace common 3D Cartesian coordinates as the network inputs. Also, we present a network architecture to embed these representations into features, encoding local relations between points and their neighbors, and the global shape structure. To alleviate inevitable global information loss caused by the rotation-invariant representations, we further introduce a region relation convolution to encode local and non-local information. We evaluate our method on multiple point cloud analysis tasks, including shape classification, part segmentation, and shape retrieval. Experimental results show that our method consistently achieves the best performance on inputs at arbitrary orientations, compared with the state of the art.
Published 2020-03-16
URL https://arxiv.org/abs/2003.07238v1
PDF https://arxiv.org/pdf/2003.07238v1.pdf
PWC https://paperswithcode.com/paper/a-rotation-invariant-framework-for-deep-point
Repo
Framework
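
The core idea, replacing Cartesian coordinates with quantities that rotations cannot change, can be sketched with simple invariants such as each point's distance to the centroid and to its nearest neighbor. This is an illustration of the general principle, not the paper's exact representation:

```python
import numpy as np

def rotation_invariant_features(points):
    """Map 3D points to simple rotation-invariant scalars.

    For each point, keep its distance to the cloud centroid and the
    distance to its nearest neighbor; both are unchanged by any rotation.
    """
    centroid = points.mean(axis=0)
    d_centroid = np.linalg.norm(points - centroid, axis=1)
    # pairwise distances; mask the diagonal so a point is not its own neighbor
    diff = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dists, np.inf)
    d_neighbor = dists.min(axis=1)
    return np.stack([d_centroid, d_neighbor], axis=1)

def random_rotation(rng):
    # QR decomposition of a random matrix yields an orthogonal matrix
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return q * np.sign(np.linalg.det(q))  # force a proper rotation (det = +1)

rng = np.random.default_rng(0)
cloud = rng.standard_normal((64, 3))
feats = rotation_invariant_features(cloud)
feats_rot = rotation_invariant_features(cloud @ random_rotation(rng).T)
print(np.allclose(feats, feats_rot))  # → True: features survive the rotation
```

Feeding such invariants (rather than raw coordinates) into a network makes the whole pipeline rotation-invariant by construction.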

#### The Implicit and Explicit Regularization Effects of Dropout

Title The Implicit and Explicit Regularization Effects of Dropout
Authors Colin Wei, Sham Kakade, Tengyu Ma
Abstract Dropout is a widely-used regularization technique, often required to obtain state-of-the-art performance for a number of architectures. This work demonstrates that dropout introduces two distinct but entangled regularization effects: an explicit effect (also studied in prior work) which occurs since dropout modifies the expected training objective, and, perhaps surprisingly, an additional implicit effect from the stochasticity in the dropout training update. This implicit regularization effect is analogous to the effect of stochasticity in small mini-batch stochastic gradient descent. We disentangle these two effects through controlled experiments. We then derive analytic simplifications which characterize each effect in terms of the derivatives of the model and the loss, for deep neural networks. We demonstrate that these simplified, analytic regularizers accurately capture the important aspects of dropout, showing that they can faithfully replace dropout in practice.
Published 2020-02-28
URL https://arxiv.org/abs/2002.12915v2
PDF https://arxiv.org/pdf/2002.12915v2.pdf
PWC https://paperswithcode.com/paper/the-implicit-and-explicit-regularization
Repo
Framework
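
The two effects can be seen in a toy simulation of standard inverted dropout: in expectation the activations are unchanged (the explicit effect only modifies the objective through higher-order terms), but every individual update sees a noisy version of them (the implicit, SGD-like effect). A minimal sketch:

```python
import numpy as np

def inverted_dropout(x, p, rng):
    """Standard inverted dropout: zero units with prob p, rescale by 1/(1-p)."""
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(0)
x = np.ones(10)
samples = np.array([inverted_dropout(x, 0.5, rng) for _ in range(20000)])

# In expectation the activations are preserved (mean ~ 1.0 per unit)...
print(samples.mean(axis=0).round(2))
# ...but each training update sees injected noise: for p = 0.5 and x = 1
# the per-unit variance is p/(1-p) * x^2 = 1.0, the source of the
# implicit regularization effect.
print(samples.var(axis=0).mean().round(2))
```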

#### Novel Language Resources for Hindi: An Aesthetics Text Corpus and a Comprehensive Stop Lemma List

Title Novel Language Resources for Hindi: An Aesthetics Text Corpus and a Comprehensive Stop Lemma List
Authors Gayatri Venugopal-Wairagade, Jatinderkumar R. Saini, Dhanya Pramod
Abstract This paper is an effort to complement the contributions made by researchers working toward the inclusion of non-English languages in natural language processing studies. Two novel Hindi language resources have been created and released for public consumption. The first resource is a corpus consisting of nearly a thousand pre-processed fictional and nonfictional texts spanning over a hundred years. The second resource is an exhaustive list of stop lemmas created from 12 corpora across multiple domains, consisting of over 13 million words, from which more than 200,000 lemmas were generated, and 11 publicly available stop word lists comprising over 1000 words, from which nearly 400 unique lemmas were generated. This research lays emphasis on the use of stop lemmas instead of stop words owing to the presence of various, but not all, morphological forms of a word in stop word lists, as opposed to the presence of only the root form of the word, from which variations could be derived if required. It was also observed that stop lemmas were more consistent across multiple sources as compared to stop words. In order to generate a stop lemma list, the parts of speech of the lemmas were investigated but rejected as it was found that there was no significant correlation between the rank of a word in the frequency list and its part of speech. The stop lemma list was assessed using a comparative method. A formal evaluation method is suggested as future work arising from this study.
Published 2020-02-01
URL https://arxiv.org/abs/2002.00171v1
PDF https://arxiv.org/pdf/2002.00171v1.pdf
PWC https://paperswithcode.com/paper/novel-language-resources-for-hindi-an
Repo
Framework
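
The general stop-lemma pipeline (lemmatize each corpus, rank lemmas by frequency, keep those highly ranked everywhere) can be sketched as follows. The toy lemma map stands in for a real Hindi lemmatizer, and the cutoff is illustrative; neither is from the paper:

```python
from collections import Counter

# Toy lemma map standing in for a real lemmatizer (hypothetical).
LEMMA_MAP = {"running": "run", "ran": "run", "runs": "run",
             "is": "be", "was": "be", "are": "be"}

def lemmas(tokens):
    return [LEMMA_MAP.get(t, t) for t in tokens]

def stop_lemmas(corpora, top_k=2):
    """Lemmas ranked among the top_k most frequent in *every* corpus."""
    per_corpus = [{lem for lem, _ in Counter(lemmas(c)).most_common(top_k)}
                  for c in corpora]
    return set.intersection(*per_corpus)

corpus_a = "is was are running the the the".split()
corpus_b = "ran runs is is the a".split()
print(sorted(stop_lemmas([corpus_a, corpus_b])))  # → ['be']
```

Working at the lemma level is what makes the list robust: all inflected forms collapse onto one entry, so the list does not depend on which surface forms happened to appear in a given corpus.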

#### PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy

Title PrivacyNet: Semi-Adversarial Networks for Multi-attribute Face Privacy
Authors Vahid Mirjalili, Sebastian Raschka, Arun Ross
Abstract Recent research has established the possibility of deducing soft-biometric attributes such as age, gender and race from an individual’s face image with high accuracy. However, this raises privacy concerns, especially when face images collected for biometric recognition purposes are used for attribute analysis without the person’s consent. To address this problem, we develop a technique for imparting soft biometric privacy to face images via an image perturbation methodology. The image perturbation is undertaken using a GAN-based Semi-Adversarial Network (SAN) - referred to as PrivacyNet - that modifies an input face image such that it can be used by a face matcher for matching purposes but cannot be reliably used by an attribute classifier. Further, PrivacyNet allows a person to choose specific attributes that have to be obfuscated in the input face images (e.g., age and race), while allowing for other types of attributes to be extracted (e.g., gender). Extensive experiments using multiple face matchers, multiple age/gender/race classifiers, and multiple face datasets demonstrate the generalizability of the proposed multi-attribute privacy enhancing method across multiple face and attribute classifiers.
Published 2020-01-02
URL https://arxiv.org/abs/2001.00561v2
PDF https://arxiv.org/pdf/2001.00561v2.pdf
Repo
Framework

#### Generalized Nested Rollout Policy Adaptation

Title Generalized Nested Rollout Policy Adaptation
Authors Tristan Cazenave
Abstract Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm for single player games. In this paper we propose to generalize NRPA with a temperature and a bias, and to analyze the algorithms theoretically. The generalized algorithm is named GNRPA. Experiments show it improves on NRPA for different application domains: SameGame and the Traveling Salesman Problem with Time Windows.
Published 2020-03-22
URL https://arxiv.org/abs/2003.10024v1
PDF https://arxiv.org/pdf/2003.10024v1.pdf
Repo
Framework
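
NRPA samples moves from a softmax over learned weights; one plausible reading of the generalization is a softmax with a temperature tau and a per-move bias, which reduces to plain NRPA when tau = 1 and all biases are zero (a sketch of the idea, not the paper's exact formula):

```python
import math

def gnrpa_policy(weights, biases, tau=1.0):
    """Softmax move probabilities with a temperature tau and per-move bias.

    With tau = 1 and zero biases this reduces to the plain NRPA policy.
    """
    logits = [w / tau + b for w, b in zip(weights, biases)]
    m = max(logits)                        # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

w = [1.0, 2.0, 0.5]
print(gnrpa_policy(w, [0, 0, 0]))           # plain NRPA softmax
print(gnrpa_policy(w, [0, 0, 0], tau=0.5))  # lower tau sharpens the choice
```

The bias lets domain knowledge (e.g. a heuristic preference for certain moves) shape the policy before any adaptation, while the temperature controls how greedy the playouts are.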

#### Oracle lower bounds for stochastic gradient sampling algorithms

Title Oracle lower bounds for stochastic gradient sampling algorithms
Authors Niladri S. Chatterji, Peter L. Bartlett, Philip M. Long
Abstract We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^{d}$, and prove an information theoretic lower bound on the number of stochastic gradient queries of the log density needed. Several popular sampling algorithms (including many Markov chain Monte Carlo methods) operate by using stochastic gradients of the log density to generate a sample; our results establish an information theoretic limit for all these algorithms. We show that for every algorithm, there exists a well-conditioned strongly log-concave target density for which the distribution of points generated by the algorithm would be at least $\varepsilon$ away from the target in total variation distance if the number of gradient queries is less than $\Omega(\sigma^2 d/\varepsilon^2)$, where $\sigma^2 d$ is the variance of the stochastic gradient. Our lower bound follows by combining the ideas of Le Cam deficiency routinely used in the comparison of statistical experiments along with standard information theoretic tools used in lower bounding Bayes risk functions. To the best of our knowledge our results provide the first nontrivial dimension-dependent lower bound for this problem.
Published 2020-02-01
URL https://arxiv.org/abs/2002.00291v1
PDF https://arxiv.org/pdf/2002.00291v1.pdf
Repo
Framework

#### A Review of Personality in Human Robot Interactions

Title A Review of Personality in Human Robot Interactions
Authors Lionel P. Robert, Rasha Alahmad, Connor Esterwood, Sangmi Kim, Sangseok You, Qiaoning Zhang
Abstract Personality has been identified as a vital factor in understanding the quality of human robot interactions. Despite this, research in this area remains fragmented and lacks a coherent framework. This makes it difficult to understand what we know and to identify what we do not. As a result, our knowledge of personality in human robot interactions has not kept pace with the deployment of robots in organizations or in our broader society. To address this shortcoming, this paper reviews 83 articles and 84 separate studies to assess the current state of human robot personality research. This review: (1) highlights major thematic research areas, (2) identifies gaps in the literature, (3) derives and presents major conclusions from the literature and (4) offers guidance for future research.
Published 2020-01-31
URL https://arxiv.org/abs/2001.11777v2
PDF https://arxiv.org/pdf/2001.11777v2.pdf
PWC https://paperswithcode.com/paper/a-review-of-personality-in-human-robot
Repo
Framework

#### Encoding-based Memory Modules for Recurrent Neural Networks

Title Encoding-based Memory Modules for Recurrent Neural Networks
Authors Antonio Carta, Alessandro Sperduti, Davide Bacciu
Abstract Learning to solve sequential tasks with recurrent models requires the ability to memorize long sequences and to extract task-relevant features from them. In this paper, we study the memorization subtask from the point of view of the design and training of recurrent neural networks. We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with a linear autoencoder for sequences. We extend the memorization component with a modular memory that encodes the hidden state sequence at different sampling frequencies. Additionally, we provide a specialized training algorithm that initializes the memory to efficiently encode the hidden activations of the network. The experimental results on synthetic and real-world datasets show that specializing the training algorithm to train the memorization component always improves the final performance whenever the memorization of long sequences is necessary to solve the problem.
Published 2020-01-31
URL https://arxiv.org/abs/2001.11771v1
PDF https://arxiv.org/pdf/2001.11771v1.pdf
PWC https://paperswithcode.com/paper/encoding-based-memory-modules-for-recurrent
Repo
Framework
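
The memorization component rests on a simple observation: if the memory update is linear, m_t = A x_t + B m_(t-1), then the final memory is a linear function of the entire input sequence, and a wide enough memory lets the sequence be decoded back exactly. A sketch under illustrative dimensions (random A and B, not the paper's trained autoencoder):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear memory update: m_t = A x_t + B m_{t-1}.
d_in, d_mem, T = 4, 32, 6
A = rng.standard_normal((d_mem, d_in))
B = rng.standard_normal((d_mem, d_mem)) * 0.3

def encode(xs):
    m = np.zeros(d_mem)
    for x in xs:
        m = A @ x + B @ m
    return m

# Unrolling gives m = sum_t B^(T-1-t) A x_t, a linear map of the stacked inputs.
blocks = [np.linalg.matrix_power(B, T - 1 - t) @ A for t in range(T)]
M = np.hstack(blocks)                       # shape (d_mem, T * d_in)

xs = rng.standard_normal((T, d_in))
m = encode(xs)
decoded, *_ = np.linalg.lstsq(M, m, rcond=None)
print(np.allclose(decoded, xs.ravel(), atol=1e-4))  # → True: full sequence recovered
```

The paper's contribution is, in part, training this linear component as an autoencoder so that the encoding is efficient for the actual hidden activations, rather than relying on random matrices as above.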

#### Hamiltonian Neural Networks for solving differential equations

Title Hamiltonian Neural Networks for solving differential equations
Authors Marios Mattheakis, David Sondak, Akshunna S. Dogra, Pavlos Protopapas
Abstract There has been a wave of interest in applying machine learning to study dynamical systems. In particular, neural networks have been applied to solve the equations of motion, and therefore, track the evolution of a system. In contrast to other applications of neural networks and machine learning, dynamical systems – depending on their underlying symmetries – possess invariants such as energy, momentum, and angular momentum. Traditional numerical iteration methods usually violate these conservation laws, propagating errors in time, and reducing the predictability of the method. We present a Hamiltonian neural network that solves the differential equations that govern dynamical systems. This unsupervised model learns solutions that satisfy Hamilton’s equations identically, up to an arbitrarily small error, and therefore conserve the Hamiltonian invariants. Once optimized, the proposed architecture can be considered a symplectic unit due to the introduction of an efficient parametric form of solutions. In addition, sharing the network parameters and choosing an appropriate activation function drastically improve the predictability of the network. An error analysis is derived, showing that the numerical errors depend on the overall network performance. The symplectic architecture is then employed to solve the equations for the nonlinear oscillator and the chaotic Henon-Heiles dynamical system. In both systems, the symplectic Euler integrator requires two orders of magnitude more evaluation points than the Hamiltonian network in order to achieve the same order of numerical error in the predicted phase space trajectories.
Published 2020-01-29
URL https://arxiv.org/abs/2001.11107v2
PDF https://arxiv.org/pdf/2001.11107v2.pdf
PWC https://paperswithcode.com/paper/hamiltonian-neural-networks-for-solving
Repo
Framework
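
The training signal is the residual of Hamilton's equations. For the harmonic oscillator H = (p² + q²)/2 they read dq/dt = p and dp/dt = -q, and a network producing (q, p) is trained to drive the residual to zero. Here is a sketch that only evaluates that residual (via finite differences rather than the paper's automatic differentiation), comparing the exact trajectory against a wrong candidate:

```python
import numpy as np

def hamilton_residual(t, q, p):
    """Mean squared residual of dq/dt = p, dp/dt = -q for H = (p^2 + q^2)/2."""
    dq = np.gradient(q, t)
    dp = np.gradient(p, t)
    return np.mean((dq - p) ** 2 + (dp + q) ** 2)

t = np.linspace(0, 2 * np.pi, 2000)
good = hamilton_residual(t, np.cos(t), -np.sin(t))    # exact trajectory
bad = hamilton_residual(t, np.cos(2 * t), -np.sin(t)) # wrong frequency
print(good < 1e-4 < bad)  # → True: the exact solution has ~zero residual
```

Minimizing this residual over the network parameters is what makes the method unsupervised: no precomputed trajectories are needed, only the equations themselves.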

#### Towards Deep Learning Models Resistant to Large Perturbations

Title Towards Deep Learning Models Resistant to Large Perturbations
Authors Amirreza Shaeiri, Rozhin Nobahari, Mohammad Hossein Rohban
Abstract Adversarial robustness has proven to be a required property of machine learning algorithms. A key and often overlooked aspect of this problem is to try to make the adversarial noise magnitude as large as possible to enhance the benefits of the model robustness. We show that the well-established algorithm called “adversarial training” fails to train a deep neural network given a large, but reasonable, perturbation magnitude. In this paper, we propose a simple yet effective initialization of the network weights that makes learning on higher levels of noise possible. We next evaluate this idea rigorously on MNIST ($\epsilon$ up to $\approx 0.40$) and CIFAR10 ($\epsilon$ up to $\approx 32/255$) datasets assuming the $\ell_{\infty}$ attack model. Additionally, in order to establish the limits of $\epsilon$ in which the learning is feasible, we study the optimal robust classifier assuming full access to the joint data and label distribution. Then, we provide some theoretical results on the adversarial accuracy for a simple multi-dimensional Bernoulli distribution, which yields some insights on the range of feasible perturbations for the MNIST dataset.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13370v1
PDF https://arxiv.org/pdf/2003.13370v1.pdf
PWC https://paperswithcode.com/paper/towards-deep-learning-models-resistant-to-2
Repo
Framework
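
For context, the inner loop of adversarial training is a projected gradient ascent (PGD) attack in the l-infinity ball; the paper's contribution concerns the weight initialization that makes training with large epsilon feasible, not the attack itself. A generic sketch of the attack, with an illustrative step-size rule:

```python
import numpy as np

def pgd_linfty(x, grad_fn, eps, steps=10):
    """Maximize the loss within an eps l-infinity ball via signed gradient steps."""
    x_adv = x.copy()
    step = 2.5 * eps / steps            # common heuristic step size (assumption)
    for _ in range(steps):
        x_adv = x_adv + step * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
    return x_adv

# Toy example: loss = w . x, so the gradient w.r.t. x is w and the attack
# should push every coordinate to the eps boundary in the direction sign(w).
w = np.array([1.0, -2.0, 0.5])
x = np.zeros(3)
x_adv = pgd_linfty(x, lambda z: w, eps=0.3)
print(x_adv)  # → [ 0.3 -0.3  0.3]
```

At large eps the inner maximization becomes much harder, which is why a plain run of this loop fails to train and the initialization matters.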

#### A study of local optima for learning feature interactions using neural networks

Title A study of local optima for learning feature interactions using neural networks
Abstract In many fields such as bioinformatics, high energy physics, power distribution, etc., it is desirable to learn non-linear models where a small number of variables are selected and the interaction between them is explicitly modeled to predict the response. In principle, neural networks (NNs) could accomplish this task since they can model non-linear feature interactions very well. However, NNs require large amounts of training data to generalize well. In this paper we study the data-starved regime where a NN is trained on a relatively small amount of training data. For that purpose we study feature selection for NNs, which is known to improve generalization for linear models. As an extreme case of data with feature selection and feature interactions we study the XOR-like data with irrelevant variables. We experimentally observed that the cross-entropy loss function on XOR-like data has many non-equivalent local optima, and the number of local optima grows exponentially with the number of irrelevant variables. To deal with the local minima and for feature selection we propose a node pruning and feature selection algorithm that improves the capability of NNs to find better local minima even when there are irrelevant variables. Finally, we show that the performance of a NN on real datasets can be improved using pruning, obtaining compact networks on a small number of features, with good prediction and interpretability.
Published 2020-02-11
URL https://arxiv.org/abs/2002.04322v1
PDF https://arxiv.org/pdf/2002.04322v1.pdf
PWC https://paperswithcode.com/paper/a-study-of-local-optima-for-learning-feature
Repo
Framework
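
The XOR-like benchmark is easy to reproduce: the label is the XOR of two relevant binary features, and the remaining columns are pure noise that the network must learn to ignore. A minimal generator (illustrative, not the paper's exact data pipeline):

```python
import numpy as np

def xor_like_data(n, n_irrelevant, rng):
    """XOR on two relevant bits, padded with irrelevant binary noise features."""
    x = rng.integers(0, 2, size=(n, 2 + n_irrelevant)).astype(float)
    y = np.logical_xor(x[:, 0], x[:, 1]).astype(int)  # only columns 0, 1 matter
    return x, y

rng = np.random.default_rng(0)
x, y = xor_like_data(1000, 8, rng)
print(x.shape)  # → (1000, 10); the label ignores the 8 noise columns
```

Because XOR is not linearly separable, a network must learn the interaction between the two relevant bits, and each subset of noise columns it latches onto corresponds to a different (non-equivalent) local optimum.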

#### Uncertainty Weighted Causal Graphs

Title Uncertainty Weighted Causal Graphs
Authors Eduardo C. Garrido-Merchán, C. Puente, A. Sobrino, J. A. Olivas
Abstract Causality has traditionally been a scientific way to generate knowledge by relating causes to effects. As a visual tool, causal graphs are helpful for representing and inferring new causal information. In previous works, we have automatically generated causal graphs associated with a given concept by analyzing sets of documents and extracting and representing the causal information found in them in that visual way. The retrieved information shows that causality is frequently imperfect rather than exact, a feature captured by the graph. In this work we attempt to go a step further, modelling the uncertainty in the graph through probabilities and thereby improving the management of the imprecision in the quoted graph.
Published 2020-02-02
URL https://arxiv.org/abs/2002.00429v2
PDF https://arxiv.org/pdf/2002.00429v2.pdf
PWC https://paperswithcode.com/paper/uncertainty-weighted-causal-graphs
Repo
Framework

#### Locality-sensitive hashing in function spaces

Title Locality-sensitive hashing in function spaces
Authors Will Shand, Stephen Becker
Abstract We discuss the problem of performing similarity search over function spaces. To perform search over such spaces in a reasonable amount of time, we use {\it locality-sensitive hashing} (LSH). We present two methods that allow LSH functions on $\mathbb{R}^N$ to be extended to $L^p$ spaces: one using function approximation in an orthonormal basis, and another using (quasi-)Monte Carlo-style techniques. We use the presented hashing schemes to construct an LSH family for Wasserstein distance over one-dimensional, continuous probability distributions.
Published 2020-02-10
URL https://arxiv.org/abs/2002.03909v1
PDF https://arxiv.org/pdf/2002.03909v1.pdf
PWC https://paperswithcode.com/paper/locality-sensitive-hashing-in-function-spaces
Repo
Framework
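
The basis-expansion route can be sketched concretely: represent a function on [0, 1] by its first few coefficients in an orthonormal cosine basis, then hash the coefficient vector with an ordinary random-hyperplane (SimHash) LSH. The basis, grid, and bit counts below are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

N_COEFF, N_BITS, GRID = 8, 16, 1024
t = (np.arange(GRID) + 0.5) / GRID
# Orthonormal cosine basis on [0, 1]: phi_0 = 1, phi_k = sqrt(2) cos(pi k t)
basis = np.vstack([np.ones(GRID)] +
                  [np.sqrt(2) * np.cos(np.pi * k * t) for k in range(1, N_COEFF)])
planes = rng.standard_normal((N_BITS, N_COEFF))

def hash_function(f):
    coeffs = basis @ f(t) / GRID       # approximate inner products <f, phi_k>
    return tuple(planes @ coeffs > 0)  # SimHash of the coefficient vector

h1 = hash_function(np.sin)
h2 = hash_function(lambda x: np.sin(x) + 0.001 * np.cos(3 * x))  # near-duplicate
h3 = hash_function(lambda x: np.cos(7 * x))                      # unrelated
print(sum(a != b for a, b in zip(h1, h2)))  # near functions: few differing bits
print(sum(a != b for a, b in zip(h1, h3)))
```

Since SimHash depends only on the direction of the coefficient vector, positively scaled copies of a function collide by construction, and nearby functions tend to share most hash bits.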

#### The Four Dimensions of Social Network Analysis: An Overview of Research Methods, Applications, and Software Tools

Title The Four Dimensions of Social Network Analysis: An Overview of Research Methods, Applications, and Software Tools
Authors David Camacho, Àngel Panizo-LLedot, Gema Bello-Orgaz, Antonio Gonzalez-Pardo, Erik Cambria
Abstract Social network based applications have experienced exponential growth in recent years. One of the reasons for this rise is that this application domain offers a particularly fertile place to test and develop the most advanced computational techniques to extract valuable information from the Web. The main contribution of this work is three-fold: (1) we provide an up-to-date literature review of the state of the art on social network analysis (SNA); (2) we propose a set of new metrics based on four essential features (or dimensions) in SNA; (3) finally, we provide a quantitative analysis of a set of popular SNA tools and frameworks. We have also performed a scientometric study to detect the most active research areas and application domains in this area. This work proposes the definition of four different dimensions, namely Pattern & Knowledge discovery, Information Fusion & Integration, Scalability, and Visualization, which are used to define a set of new metrics (termed degrees) in order to evaluate the different software tools and frameworks of SNA (a set of 20 SNA software tools are analyzed and ranked following these metrics). These dimensions, together with the defined degrees, allow evaluating and measuring the maturity of social network technologies, providing both a quantitative assessment of them and shedding light on the challenges and future trends in this active area.
Published 2020-02-21
URL https://arxiv.org/abs/2002.09485v1
PDF https://arxiv.org/pdf/2002.09485v1.pdf
PWC https://paperswithcode.com/paper/the-four-dimensions-of-social-network
Repo
Framework

#### Effective Diversity in Population-Based Reinforcement Learning

Title Effective Diversity in Population-Based Reinforcement Learning
Authors Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts
Abstract Maintaining a population of solutions has been shown to increase exploration in reinforcement learning, typically attributed to the greater diversity of behaviors considered. One such class of methods, novelty search, considers further boosting diversity across agents via a multi-objective optimization formulation. Despite the intuitive appeal, these mechanisms have several shortcomings. First, they make use of mean field updates, which induce cycling behaviors. Second, they often rely on handcrafted behavior characterizations, which require domain knowledge. Furthermore, boosting diversity often has a detrimental impact on optimizing already fruitful behaviors for rewards. Setting the relative importance of novelty versus reward is usually hardcoded or requires tedious tuning/annealing. In this paper, we introduce a novel measure of population-wide diversity, leveraging ideas from Determinantal Point Processes. We combine this in a principled fashion with the reward function to adapt to the degree of diversity during training, borrowing ideas from online learning. Combined with task-agnostic behavioral embeddings, we show this approach outperforms previous methods for multi-objective optimization, as well as vanilla algorithms solely optimizing for rewards.
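
A DPP-style diversity score can be sketched directly: embed each agent's behavior as a vector, build a similarity kernel over the population, and take its determinant. Identical behaviors collapse the determinant to zero, while well-separated behaviors push it toward one. The RBF kernel and bandwidth here are illustrative choices, not the paper's exact construction:

```python
import numpy as np

def dpp_diversity(embeddings, bandwidth=1.0):
    """Determinant of an RBF similarity kernel over behavior embeddings."""
    d2 = ((embeddings[:, None, :] - embeddings[None, :, :]) ** 2).sum(-1)
    k = np.exp(-d2 / (2 * bandwidth ** 2))  # ones on the diagonal
    return np.linalg.det(k)

rng = np.random.default_rng(0)
diverse = rng.standard_normal((5, 3)) * 3.0       # well-separated behaviors
clones = np.tile(rng.standard_normal(3), (5, 1))  # five identical agents
print(dpp_diversity(diverse) > dpp_diversity(clones))  # → True
```

Because the determinant measures the volume spanned by the behaviors jointly, it rewards a population that covers behavior space rather than merely rewarding each agent for being far from the mean.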