April 1, 2020

3188 words 15 mins read

Paper Group ANR 391

Exploring the Memorization-Generalization Continuum in Deep Learning. A Unified Framework for Gaussian Mixture Reduction with Composite Transportation Distance. Average-case Acceleration Through Spectral Density Estimation. Stochastic geometry to generalize the Mondrian Process. Too many cooks: Coordinating multi-agent collaboration through inverse …

Exploring the Memorization-Generalization Continuum in Deep Learning

Title Exploring the Memorization-Generalization Continuum in Deep Learning
Authors Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, Michael C. Mozer
Abstract Human learners appreciate that some facts demand memorization whereas other facts support generalization. For example, English verbs have irregular cases that must be memorized (e.g., go->went) and regular cases that generalize well (e.g., kiss->kissed, miss->missed). Likewise, deep neural networks have the capacity to memorize rare or irregular forms but nonetheless generalize across instances that share common patterns or structures. We analyze how individual instances are treated by a model on the memorization-generalization continuum via a consistency score. The score is the expected accuracy of a particular architecture for a held-out instance on a training set of a fixed size sampled from the data distribution. We obtain empirical estimates of this score for individual instances in multiple datasets, and we show that the score identifies out-of-distribution and mislabeled examples at one end of the continuum and regular examples at the other end. We explore three proxies to the consistency score: kernel density estimation on input and hidden representations; and the time course of training, i.e., learning speed. In addition to helping to understand the memorization versus generalization dynamics during training, the C-score proxies have potential application for out-of-distribution detection, curriculum learning, and active data collection.
Tasks Density Estimation, Out-of-Distribution Detection
Published 2020-02-08
URL https://arxiv.org/abs/2002.03206v1
PDF https://arxiv.org/pdf/2002.03206v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-memorization-generalization
Repo
Framework
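As a concrete toy illustration of the consistency score defined above (expected held-out accuracy over models trained on fixed-size random subsets), the sketch below estimates it for a single scalar instance with a 1-nearest-neighbour learner. The dataset and `train_1nn` are illustrative assumptions, not the paper's deep-network setup:

```python
import random

def consistency_score(x, y, data, train_fn, n_subsets=50, subset_size=30, seed=0):
    """Empirical C-score sketch: expected accuracy on the held-out point (x, y),
    averaged over models trained on random fixed-size subsets excluding it."""
    rng = random.Random(seed)
    pool = [p for p in data if p != (x, y)]
    hits = 0
    for _ in range(n_subsets):
        model = train_fn(rng.sample(pool, subset_size))
        hits += int(model(x) == y)
    return hits / n_subsets

def train_1nn(subset):
    """Toy learner: 1-nearest-neighbour on scalar inputs."""
    return lambda x: min(subset, key=lambda p: abs(p[0] - x))[1]
```

On a dataset where the label is the sign of the input, a regular instance scores near 1 while a mislabeled one scores near 0, matching the two ends of the continuum the abstract describes.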

A Unified Framework for Gaussian Mixture Reduction with Composite Transportation Distance

Title A Unified Framework for Gaussian Mixture Reduction with Composite Transportation Distance
Authors Qiong Zhang, Jiahua Chen
Abstract Gaussian mixture reduction (GMR) is the problem of approximating a finite Gaussian mixture by one with fewer components. It is widely used in density estimation, nonparametric belief propagation, and Bayesian recursive filtering. Although optimization and clustering-based algorithms have been proposed for GMR, they are either computationally expensive or lack theoretical support. In this work, we propose to perform GMR by minimizing the entropic regularized composite transportation distance between two mixtures. We show our approach provides a unified framework for GMR that is both interpretable and computationally efficient. Our work also bridges the gap between optimization and clustering-based approaches for GMR. A Majorization-Minimization algorithm is developed for our optimization problem and its theoretical convergence is established in this paper. Empirical experiments demonstrate the effectiveness of our approach, and the effect of the choice of transportation cost on the performance of GMR is also investigated.
Tasks Density Estimation
Published 2020-02-19
URL https://arxiv.org/abs/2002.08410v1
PDF https://arxiv.org/pdf/2002.08410v1.pdf
PWC https://paperswithcode.com/paper/a-unified-framework-for-gaussian-mixture
Repo
Framework
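The clustering-style view of GMR that the paper unifies can be sketched as an alternating assign/merge loop. Note the assumptions here: plain squared distance between component means stands in for the composite transportation cost, and moment matching is used as the merge step:

```python
def merge_gaussians(group):
    """Moment-matching merge of (weight, mean, var) components into one Gaussian."""
    w = sum(wi for wi, _, _ in group)
    mu = sum(wi * mi for wi, mi, _ in group) / w
    var = sum(wi * (vi + (mi - mu) ** 2) for wi, mi, vi in group) / w
    return (w, mu, var)

def reduce_mixture(mix, k, n_iter=20):
    """k-means-style reduction: assign each component to the nearest reduced
    mean, then re-merge each group by moment matching."""
    centers = mix[:k]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for comp in mix:
            j = min(range(k), key=lambda c: (comp[1] - centers[c][1]) ** 2)
            groups[j].append(comp)
        centers = [merge_gaussians(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers
```

A four-component mixture with modes near 0 and 10 reduces to two components whose means sit at the group centroids.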

Average-case Acceleration Through Spectral Density Estimation

Title Average-case Acceleration Through Spectral Density Estimation
Authors Fabian Pedregosa, Damien Scieur
Abstract We develop a framework for designing optimal quadratic optimization methods in terms of their average-case runtime. This yields a new class of methods that achieve acceleration through a model of the Hessian’s expected spectral density. We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions. These methods are momentum-based gradient algorithms whose hyper-parameters can be estimated without knowledge of the Hessian’s smallest singular value, in contrast with classical accelerated methods like Nesterov acceleration and Polyak momentum. Empirical results on quadratic, logistic regression and neural networks show the proposed methods always match and in many cases significantly improve over classical accelerated methods.
Tasks Density Estimation
Published 2020-02-12
URL https://arxiv.org/abs/2002.04756v2
PDF https://arxiv.org/pdf/2002.04756v2.pdf
PWC https://paperswithcode.com/paper/average-case-acceleration-through-spectral
Repo
Framework
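For context, the classical Polyak (heavy-ball) baseline that the abstract contrasts can be sketched as below. Its tuning requires the smallest eigenvalue `mu`, which is exactly what the paper's average-case methods avoid estimating:

```python
import math

def polyak_momentum(A, b, x0, L, mu, n_iter=200):
    """Heavy-ball iteration on the quadratic f(x) = x^T A x / 2 - b^T x,
    with the classical worst-case tuning from the extreme eigenvalues L, mu."""
    alpha = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
    beta = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2
    n = len(x0)
    x_prev, x = list(x0), list(x0)
    for _ in range(n_iter):
        grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        x_new = [x[i] - alpha * grad[i] + beta * (x[i] - x_prev[i])
                 for i in range(n)]
        x_prev, x = x, x_new
    return x
```

On a well-conditioned quadratic this converges to the minimizer A⁻¹b; the average-case methods of the paper replace the (L, mu) tuning with parameters derived from a model of the Hessian's expected spectral density.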

Stochastic geometry to generalize the Mondrian Process

Title Stochastic geometry to generalize the Mondrian Process
Authors Eliza O’Reilly, Ngoc Tran
Abstract The Mondrian process is a stochastic process that produces a recursive partition of space with random axis-aligned cuts. Random forests and Laplace kernel approximations built from the Mondrian process have led to efficient online learning methods and Bayesian optimization. By viewing the Mondrian process as a special case of the stable under iterated tessellation (STIT) process, we utilize tools from stochastic geometry to resolve three fundamental questions concerning the generalizability of the Mondrian process in machine learning. First, we show that the Mondrian process with general cut directions can be efficiently simulated, but it is unlikely to give rise to better classification or regression algorithms. Second, we characterize all possible kernels that generalizations of the Mondrian process can approximate. This includes, for instance, various forms of the weighted Laplace kernel and the exponential kernel. Third, we give an explicit formula for the density estimator arising from a Mondrian forest. This allows for precise comparisons between the Mondrian forest, the Mondrian kernel and the Laplace kernel in density estimation. Our paper calls for further developments at the novel intersection of stochastic geometry and machine learning.
Tasks Density Estimation
Published 2020-02-03
URL https://arxiv.org/abs/2002.00797v1
PDF https://arxiv.org/pdf/2002.00797v1.pdf
PWC https://paperswithcode.com/paper/stochastic-geometry-to-generalize-the
Repo
Framework
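A minimal simulation of the axis-aligned Mondrian process, following its standard recursive description: each box cuts after an exponential waiting time with rate equal to its total side length, choosing the cut axis in proportion to side length, until a lifetime budget is exhausted:

```python
import random

def mondrian(cell, budget, rng):
    """Sample a Mondrian partition of an axis-aligned box.
    cell: list of (lo, hi) intervals per dimension; budget: lifetime lambda."""
    lengths = [hi - lo for lo, hi in cell]
    cost = rng.expovariate(sum(lengths))  # waiting time until the next cut
    if cost > budget:
        return [cell]  # budget exhausted: this box is a leaf
    axis = rng.choices(range(len(cell)), weights=lengths)[0]  # axis ∝ length
    lo, hi = cell[axis]
    cut = rng.uniform(lo, hi)
    left, right = list(cell), list(cell)
    left[axis] = (lo, cut)
    right[axis] = (cut, hi)
    return mondrian(left, budget - cost, rng) + mondrian(right, budget - cost, rng)
```

The leaves always tile the original box exactly, which is the invariant the STIT viewpoint generalizes to non-axis-aligned cut directions.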

Too many cooks: Coordinating multi-agent collaboration through inverse planning

Title Too many cooks: Coordinating multi-agent collaboration through inverse planning
Authors Rose E. Wang, Sarah A. Wu, James A. Evans, Joshua B. Tenenbaum, David C. Parkes, Max Kleiman-Weiner
Abstract Collaboration requires agents to coordinate their behavior on the fly, sometimes cooperating to solve a single task together and other times dividing it up into sub-tasks to work on in parallel. Underlying the human ability to collaborate is theory-of-mind, the ability to infer the hidden mental states that drive others to act. Here, we develop Bayesian Delegation, a decentralized multi-agent learning mechanism with these abilities. Bayesian Delegation enables agents to rapidly infer the hidden intentions of others by inverse planning. These inferences enable agents to flexibly decide in the absence of communication when to cooperate on the same sub-task and when to work on different sub-tasks in parallel. We test this model in a suite of multi-agent Markov decision processes inspired by cooking problems. To succeed, agents must coordinate both their high-level plans (e.g., what sub-task they should work on) and their low-level actions (e.g., avoiding collisions). Bayesian Delegation bridges these two levels and rapidly aligns agents’ beliefs about who should work on what without any communication. When agents cooperate on the same sub-task, coordinated plans emerge that enable the group of agents to achieve tasks no agent can complete on their own. Our model outperforms lesioned agents without Bayesian Delegation or without the ability to cooperate on the same sub-task.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11778v1
PDF https://arxiv.org/pdf/2003.11778v1.pdf
PWC https://paperswithcode.com/paper/too-many-cooks-coordinating-multi-agent
Repo
Framework
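The core inference step of Bayesian Delegation, inferring which sub-task another agent is pursuing from an observed action, is ordinary Bayesian updating; in the full model the likelihoods come from planning under each candidate sub-task allocation. The sub-task and action names below are hypothetical:

```python
def update_beliefs(prior, likelihoods, action):
    """One inverse-planning step: P(sub-task | action) ∝ P(action | sub-task) P(sub-task)."""
    post = {t: prior[t] * likelihoods[t][action] for t in prior}
    z = sum(post.values())
    return {t: p / z for t, p in post.items()}
```

Watching a teammate head for the knife sharply raises the belief that they took the chopping sub-task, which lets an agent pick a complementary sub-task without communication.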

An adversarial learning framework for preserving users’ anonymity in face-based emotion recognition

Title An adversarial learning framework for preserving users’ anonymity in face-based emotion recognition
Authors Vansh Narula, Zhangyang Wang, Theodora Chaspari
Abstract Image and video-capturing technologies have permeated our everyday life. Such technologies can continuously monitor individuals’ expressions in real-life settings, affording us new insights into their emotional states and transitions, thus paving the way to novel well-being and healthcare applications. Yet, due to strong privacy concerns, the use of such technologies is met with strong skepticism, since current face-based emotion recognition systems relying on deep learning techniques tend to preserve substantial information related to the identity of the user, apart from the emotion-specific information. This paper proposes an adversarial learning framework which relies on a convolutional neural network (CNN) architecture trained through an iterative procedure for minimizing identity-specific information and maximizing emotion-dependent information. The proposed approach is evaluated through emotion classification and face identification metrics, and is compared against two CNNs, one trained solely for emotion recognition and the other trained solely for face identification. Experiments are performed using the Yale Face Dataset and Japanese Female Facial Expression Database. Results indicate that the proposed approach can learn a convolutional transformation for preserving emotion recognition accuracy and degrading face identity recognition, providing a foundation toward privacy-aware emotion recognition technologies.
Tasks Emotion Classification, Emotion Recognition, Face Identification
Published 2020-01-16
URL https://arxiv.org/abs/2001.06103v1
PDF https://arxiv.org/pdf/2001.06103v1.pdf
PWC https://paperswithcode.com/paper/an-adversarial-learning-framework-for
Repo
Framework

Fiedler Regularization: Learning Neural Networks with Graph Sparsity

Title Fiedler Regularization: Learning Neural Networks with Graph Sparsity
Authors Edric Tam, David Dunson
Abstract We introduce a novel regularization approach for deep learning that incorporates and respects the underlying graphical structure of the neural network. Existing regularization methods often focus on dropping/penalizing weights in a global manner that ignores the connectivity structure of the neural network. We propose to use the Fiedler value of the neural network’s underlying graph as a tool for regularization. We provide theoretical support for this approach via spectral graph theory. We demonstrate the convexity of this penalty and provide an approximate, variational approach for fast computation in practical training of neural networks. We provide bounds on such approximations. We provide an alternative but equivalent formulation of this framework in the form of a structurally weighted L1 penalty, thus linking our approach to sparsity induction. We performed experiments on datasets that compare Fiedler regularization with traditional regularization methods such as dropout and weight decay. Results demonstrate the efficacy of Fiedler regularization.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.00992v1
PDF https://arxiv.org/pdf/2003.00992v1.pdf
PWC https://paperswithcode.com/paper/fiedler-regularization-learning-neural
Repo
Framework
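The Fiedler value itself (the second-smallest eigenvalue of the graph Laplacian) is easy to compute for small graphs, as the sketch below shows using NumPy. It is zero exactly when the graph is disconnected, which is why driving it down pushes the network's weight graph toward disconnection, i.e., sparsity:

```python
import numpy as np

def fiedler_value(adj):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A
    for a symmetric (weighted) adjacency matrix."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.eigvalsh(L)[1]  # eigvalsh returns ascending eigenvalues
```

In training, the penalty would enter as `loss = task_loss + lam * fiedler_value(weight_graph)`; the paper's variational approximation and the equivalent weighted L1 form make this tractable at scale.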

Seasonal and Trend Forecasting of Tourist Arrivals: An Adaptive Multiscale Ensemble Learning Approach

Title Seasonal and Trend Forecasting of Tourist Arrivals: An Adaptive Multiscale Ensemble Learning Approach
Authors Shaolong Sun, Dan Bi, Ju-e Guo, Shouyang Wang
Abstract The accurate seasonal and trend forecasting of tourist arrivals is a very challenging task. Despite its importance, limited research has previously addressed seasonal and trend forecasting of tourist arrivals. In this study, a new adaptive multiscale ensemble (AME) learning approach incorporating variational mode decomposition (VMD) and least square support vector regression (LSSVR) is developed for short-, medium-, and long-term seasonal and trend forecasting of tourist arrivals. In the formulation of our developed AME learning approach, the original tourist arrival series are first decomposed into trend, seasonal and remainder volatility components. Then, ARIMA is used to forecast the trend component, SARIMA is used to forecast the seasonal component with a 12-month cycle, while LSSVR is used to forecast the remainder volatility components. Finally, the forecasting results of the three components are aggregated to generate an ensemble forecast of tourist arrivals by the LSSVR-based nonlinear ensemble approach. Furthermore, a direct strategy is used to implement multi-step-ahead forecasting. Using two accuracy measures and the Diebold-Mariano test, the empirical results demonstrate that our proposed AME learning approach can achieve higher level and directional forecasting accuracy compared with other benchmarks used in this study, indicating that our proposed approach is a promising model for forecasting tourist arrivals with high seasonality and volatility.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08021v2
PDF https://arxiv.org/pdf/2002.08021v2.pdf
PWC https://paperswithcode.com/paper/seasonal-and-trend-forecasting-of-tourist
Repo
Framework
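The decompose-forecast-aggregate idea can be sketched in a much-simplified additive form, with an OLS linear trend and per-phase seasonal means standing in for VMD, ARIMA/SARIMA, and LSSVR (all of which the actual approach uses instead):

```python
def fit_trend_seasonal(series, period):
    """Fit an OLS linear trend, then per-phase seasonal means of the residuals.
    A toy additive stand-in for the paper's VMD-based decomposition."""
    n = len(series)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    b = (sum((t - xbar) * (y - ybar) for t, y in enumerate(series))
         / sum((t - xbar) ** 2 for t in range(n)))
    a = ybar - b * xbar
    resid = [y - (a + b * t) for t, y in enumerate(series)]
    seasonal = [sum(resid[p::period]) / len(resid[p::period])
                for p in range(period)]
    return a, b, seasonal

def forecast(a, b, seasonal, t):
    """Aggregate the component forecasts: trend + repeating seasonal pattern."""
    return a + b * t + seasonal[t % len(seasonal)]
```

On a series that really is linear trend plus a fixed seasonal cycle, this recovers future values exactly; real arrivals data adds the volatility component that the LSSVR stage handles.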

The 1st Challenge on Remote Physiological Signal Sensing (RePSS)

Title The 1st Challenge on Remote Physiological Signal Sensing (RePSS)
Authors Xiaobai Li, Hu Han, Hao Lu, Xuesong Niu, Zitong Yu, Antitza Dantcheva, Guoying Zhao, Shiguang Shan
Abstract Remote measurement of physiological signals from videos is an emerging topic. The topic draws great interest, but the lack of publicly available benchmark databases and a fair validation platform are hindering its further development. For this concern, we organize the first challenge on Remote Physiological Signal Sensing (RePSS), in which two databases, VIPL and OBF, are provided as benchmarks for researchers to evaluate their approaches. The 1st challenge of RePSS focuses on measuring the average heart rate from facial videos, which is the basic problem of remote physiological measurement. This paper presents an overview of the challenge, including data, protocol, analysis of results and discussion. The top ranked solutions are highlighted to provide insights for researchers, and future directions are outlined for this topic and this challenge.
Tasks
Published 2020-03-26
URL https://arxiv.org/abs/2003.11756v1
PDF https://arxiv.org/pdf/2003.11756v1.pdf
PWC https://paperswithcode.com/paper/the-1st-challenge-on-remote-physiological
Repo
Framework
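Once a pulse signal has been extracted from video, estimating average heart rate reduces to finding the dominant frequency in the physiologically plausible band. The naive DFT scan below is an illustrative baseline, not any challenge entrant's method:

```python
import cmath
import math

def estimate_hr(signal, fs, f_lo=0.7, f_hi=3.0, step=0.02):
    """Average heart rate in bpm as the dominant frequency of a pulse signal,
    scanned over the ~42-180 bpm band with a naive DFT."""
    def power(f):
        return abs(sum(x * cmath.exp(-2j * math.pi * f * t / fs)
                       for t, x in enumerate(signal)))
    n_steps = int(round((f_hi - f_lo) / step)) + 1
    freqs = [f_lo + step * k for k in range(n_steps)]
    return 60.0 * max(freqs, key=power)
```

A clean 1.2 Hz pulse sampled at 30 fps comes back as 72 bpm; real rPPG signals require the denoising and tracking steps that the challenge solutions focus on.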

PaDGAN: A Generative Adversarial Network for Performance Augmented Diverse Designs

Title PaDGAN: A Generative Adversarial Network for Performance Augmented Diverse Designs
Authors Wei Chen, Faez Ahmed
Abstract Deep generative models are proven to be a useful tool for automatic design synthesis and design space exploration. When applied in engineering design, existing generative models face two challenges: 1) generated designs lack diversity and do not cover all areas of the design space and 2) it is difficult to explicitly improve the overall performance or quality of generated designs without excluding low-quality designs from the dataset, which may impair the performance of the trained model due to reduced training sample size. In this paper, we simultaneously address these challenges by proposing a new Determinantal Point Processes based loss function for probabilistic modeling of diversity and quality. With this new loss function, we develop a variant of the Generative Adversarial Network, named “Performance Augmented Diverse Generative Adversarial Network” or PaDGAN, which can generate novel high-quality designs with good coverage of the design space. We demonstrate that PaDGAN can generate diverse and high-quality designs on both synthetic and real-world examples and compare PaDGAN against other models such as the vanilla GAN and the BezierGAN. Unlike typical generative models that usually generate new designs by interpolating within the boundary of training data, we show that PaDGAN expands the design space boundary towards high-quality regions. The proposed method is broadly applicable to many tasks including design space exploration, design optimization, and creative solution recommendation.
Tasks Point Processes
Published 2020-02-26
URL https://arxiv.org/abs/2002.11304v1
PDF https://arxiv.org/pdf/2002.11304v1.pdf
PWC https://paperswithcode.com/paper/padgan-a-generative-adversarial-network-for
Repo
Framework
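The determinantal-point-process idea behind PaDGAN's loss can be sketched as a quality-weighted kernel determinant: the batch is cheap to generate only when it is both diverse (small kernel off-diagonals) and high quality (large q). The Gaussian kernel and the way the term would be weighted into the generator objective are illustrative assumptions here:

```python
import numpy as np

def dpp_quality_diversity_loss(samples, quality, sigma=1.0):
    """-log det of a DPP kernel L_ij = q_i q_j k(x_i, x_j): lower when the
    sample batch is simultaneously diverse and high quality."""
    X = np.asarray(samples, dtype=float)
    q = np.asarray(quality, dtype=float)
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    L = np.exp(-sq_dists / (2 * sigma ** 2)) * np.outer(q, q)
    _, logdet = np.linalg.slogdet(L)
    return -logdet
```

Two far-apart high-quality designs incur less loss than either a near-duplicate pair or a diverse but low-quality pair, which is exactly the trade-off the abstract describes.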

Unsupervised Adversarial Domain Adaptation for Implicit Discourse Relation Classification

Title Unsupervised Adversarial Domain Adaptation for Implicit Discourse Relation Classification
Authors Hsin-Ping Huang, Junyi Jessy Li
Abstract Implicit discourse relations are not only more challenging to classify, but also to annotate, than their explicit counterparts. We tackle situations where training data for implicit relations are lacking, and exploit domain adaptation from explicit relations (Ji et al., 2015). We present an unsupervised adversarial domain adaptive network equipped with a reconstruction component. Our system outperforms prior works and other adversarial benchmarks for unsupervised domain adaptation. Additionally, we extend our system to take advantage of labeled data if some are available.
Tasks Domain Adaptation, Implicit Discourse Relation Classification, Relation Classification, Unsupervised Domain Adaptation
Published 2020-03-04
URL https://arxiv.org/abs/2003.02244v1
PDF https://arxiv.org/pdf/2003.02244v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-adversarial-domain-adaptation-3
Repo
Framework

Bounds on the size of PC and URC formulas

Title Bounds on the size of PC and URC formulas
Authors Petr Kučera, Petr Savický
Abstract In this paper we investigate CNF formulas, for which the unit propagation is strong enough to derive a contradiction if the formula together with a partial assignment of the variables is unsatisfiable (unit refutation complete or URC formulas) or additionally to derive all implied literals if the formula is satisfiable (propagation complete or PC formulas). If a formula represents a function using existentially quantified auxiliary variables, it is called an encoding of the function. We prove several results on the sizes of PC and URC formulas and encodings. Among these are separations between the sizes of formulas of different types. Namely, we prove an exponential separation between the size of URC formulas and PC formulas and between the size of PC encodings using auxiliary variables and URC formulas. Besides this, we prove that the sizes of any two irredundant PC formulas for the same function differ at most by a factor polynomial in the number of the variables and present an example of a function demonstrating that a similar statement is not true for URC formulas. One of the separations above implies that a q-Horn formula may require an exponential number of additional clauses to become a URC formula. On the other hand, for every q-Horn formula, we present a polynomial size URC encoding of the same function using auxiliary variables. This encoding is not q-Horn in general.
Tasks
Published 2020-01-03
URL https://arxiv.org/abs/2001.00819v3
PDF https://arxiv.org/pdf/2001.00819v3.pdf
PWC https://paperswithcode.com/paper/bounds-on-the-size-of-pc-and-urc-formulas
Repo
Framework
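Unit propagation, the procedure that URC and PC formulas are defined against, fits in a few lines; a formula is URC precisely when this loop reaches the contradiction whenever the formula plus the assumed literals is unsatisfiable:

```python
def unit_propagate(clauses, assumptions):
    """Closure of unit propagation on CNF clauses given as lists of integer
    literals (-v is the negation of v). Returns the set of derived literals,
    or None if the empty clause (a contradiction) is derived."""
    assigned = set(assumptions)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assigned for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in assigned]
            if not open_lits:
                return None  # every literal falsified: contradiction
            if len(open_lits) == 1:  # unit clause: its literal is forced
                assigned.add(open_lits[0])
                changed = True
    return assigned
```

On the implication chain (¬1 ∨ 2), (¬2 ∨ 3), assuming literal 1 forces 2 and 3; adding the clause (¬3) then yields a refutation by propagation alone.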

ESBM: An Entity Summarization BenchMark

Title ESBM: An Entity Summarization BenchMark
Authors Qingxia Liu, Gong Cheng, Kalpa Gunaratna, Yuzhong Qu
Abstract Entity summarization is the problem of computing an optimal compact summary for an entity by selecting a size-constrained subset of triples from RDF data. Entity summarization supports a multiplicity of applications and has led to fruitful research. However, there is a lack of evaluation efforts that cover the broad spectrum of existing systems. One reason is a lack of benchmarks for evaluation. Some benchmarks are no longer available, while others are small and have limitations. In this paper, we create an Entity Summarization BenchMark (ESBM) which overcomes the limitations of existing benchmarks and meets standard desiderata for a benchmark. Using this largest available benchmark for evaluating general-purpose entity summarizers, we perform the most extensive experiment to date where 9 existing systems are compared. Considering that all of these systems are unsupervised, we also implement and evaluate a supervised learning based system for reference.
Tasks
Published 2020-03-08
URL https://arxiv.org/abs/2003.03734v1
PDF https://arxiv.org/pdf/2003.03734v1.pdf
PWC https://paperswithcode.com/paper/esbm-an-entity-summarization-benchmark
Repo
Framework
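The selection task itself, picking a size-constrained subset of triples, is often handled by relevance-plus-diversity heuristics. The greedy sketch below is a generic example of the genre, not any specific system evaluated in ESBM; the predicate-repetition penalty is an illustrative assumption:

```python
def greedy_summary(triples, score, k, repeat_penalty=0.5):
    """Pick k (subject, predicate, object) triples greedily by relevance score,
    discounting candidates whose predicate was already chosen."""
    chosen, seen = [], {}
    candidates = list(triples)
    while candidates and len(chosen) < k:
        best = max(candidates,
                   key=lambda t: score[t] * repeat_penalty ** seen.get(t[1], 0))
        chosen.append(best)
        candidates.remove(best)
        seen[best[1]] = seen.get(best[1], 0) + 1
    return chosen
```

With two `type` triples and one `birthPlace` triple, the summary of size 2 trades the second `type` fact for predicate coverage.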

Strength from Weakness: Fast Learning Using Weak Supervision

Title Strength from Weakness: Fast Learning Using Weak Supervision
Authors Joshua Robinson, Stefanie Jegelka, Suvrit Sra
Abstract We study generalization properties of weakly supervised learning. That is, learning where only a few “strong” labels (the actual target of our prediction) are present but many more “weak” labels are available. In particular, we show that having access to weak labels can significantly accelerate the learning rate for the strong task to the fast rate of $\mathcal{O}(\nicefrac1n)$, where $n$ denotes the number of strongly labeled data points. This acceleration can happen even if by itself the strongly labeled data admits only the slower $\mathcal{O}(\nicefrac{1}{\sqrt{n}})$ rate. The actual acceleration depends continuously on the number of weak labels available, and on the relation between the two tasks. Our theoretical results are reflected empirically across a range of tasks and illustrate how weak labels speed up learning on the strong task.
Tasks
Published 2020-02-19
URL https://arxiv.org/abs/2002.08483v1
PDF https://arxiv.org/pdf/2002.08483v1.pdf
PWC https://paperswithcode.com/paper/strength-from-weakness-fast-learning-using
Repo
Framework

GPS-Net: Graph Property Sensing Network for Scene Graph Generation

Title GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Authors Xin Lin, Changxing Ding, Jinquan Zeng, Dacheng Tao
Abstract Scene graph generation (SGG) aims to detect objects in an image along with their pairwise relationships. There are three key properties of scene graph that have been underexplored in recent works: namely, the edge direction information, the difference in priority between nodes, and the long-tailed distribution of relationships. Accordingly, in this paper, we propose a Graph Property Sensing Network (GPS-Net) that fully explores these three properties for SGG. First, we propose a novel message passing module that augments the node feature with node-specific contextual information and encodes the edge direction information via a tri-linear model. Second, we introduce a node priority sensitive loss to reflect the difference in priority between nodes during training. This is achieved by designing a mapping function that adjusts the focusing parameter in the focal loss. Third, since the frequency of relationships is affected by the long-tailed distribution problem, we mitigate this issue by first softening the distribution and then enabling it to be adjusted for each subject-object pair according to their visual appearance. Systematic experiments demonstrate the effectiveness of the proposed techniques. Moreover, GPS-Net achieves state-of-the-art performance on three popular databases: VG, OI, and VRD by significant gains under various settings and metrics. The code and models are available at \url{https://github.com/taksau/GPS-Net}.
Tasks Graph Generation, Scene Graph Generation
Published 2020-03-29
URL https://arxiv.org/abs/2003.12962v1
PDF https://arxiv.org/pdf/2003.12962v1.pdf
PWC https://paperswithcode.com/paper/gps-net-graph-property-sensing-network-for
Repo
Framework
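The node priority sensitive loss adjusts the focusing parameter of the focal loss per node. Below is the standard focal loss plus a hypothetical priority-to-gamma mapping; the paper designs its own mapping function, so the linear form here is only an assumption for illustration:

```python
import math

def focal_loss(p_true, gamma):
    """Focal loss -(1 - p)^gamma * log(p) on the true-class probability;
    gamma = 0 recovers cross-entropy, larger gamma down-weights easy examples."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

def node_priority_gamma(priority, gamma_max=2.0):
    """Hypothetical mapping from a node-priority score in [0, 1] to the
    focusing parameter: high-priority nodes keep more of their loss."""
    return gamma_max * (1.0 - priority)
```

A confidently classified example (p = 0.9) contributes far less loss under gamma = 2 than under plain cross-entropy, so lowering gamma for high-priority nodes keeps their gradient signal strong.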