Paper Group ANR 110
Learning Stable Deep Dynamics Models
Title | Learning Stable Deep Dynamics Models |
Authors | Gaurav Manek, J. Zico Kolter |
Abstract | Deep networks are commonly used to model dynamical systems, predicting how the state of a system will evolve over time (either autonomously or in response to control inputs). Despite the predictive power of these systems, it has been difficult to make formal claims about the basic properties of the learned systems. In this paper, we propose an approach for learning dynamical systems that are guaranteed to be stable over the entire state space. The approach works by jointly learning a dynamics model and Lyapunov function that guarantees non-expansiveness of the dynamics under the learned Lyapunov function. We show that such learning systems are able to model simple dynamical systems and can be combined with additional deep generative models to learn complex dynamics, such as video textures, in a fully end-to-end fashion. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06116v1 |
PDF | https://arxiv.org/pdf/2001.06116v1.pdf |
PWC | https://paperswithcode.com/paper/learning-stable-deep-dynamics-models-1 |
Repo | |
Framework | |
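The abstract's core construction (jointly learning dynamics and a Lyapunov function, then projecting the dynamics so the Lyapunov value decreases along trajectories) can be sketched in a few lines of PyTorch. This is a minimal sketch under simplifying assumptions: V is a plain MLP here, whereas the paper builds it from an input-convex network with extra positivity structure, and the layer sizes and alpha are illustrative.

```python
import torch
import torch.nn as nn

class StableDynamics(nn.Module):
    """Dynamics f(x) projected so a learned Lyapunov function V satisfies
    dV/dt <= -alpha * V along trajectories (a sketch; the paper's V is an
    input-convex network, simplified to an MLP here)."""
    def __init__(self, dim, alpha=0.1):
        super().__init__()
        self.f_hat = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))
        self.V = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, 1))
        self.alpha = alpha

    def forward(self, x):
        # x is the state input (a leaf tensor of shape (B, dim)).
        x = x.requires_grad_(True)
        v = self.V(x)
        (grad_v,) = torch.autograd.grad(v.sum(), x, create_graph=True)
        f = self.f_hat(x)
        # Amount by which f violates the decrease condition grad_V . f <= -alpha * V.
        viol = torch.relu((grad_v * f).sum(-1, keepdim=True) + self.alpha * v)
        # Project f back onto the set of directions along which V decreases.
        return f - viol * grad_v / (grad_v.pow(2).sum(-1, keepdim=True) + 1e-8)
```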
VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning
Title | VaB-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning |
Authors | Jongwon Choi, Kwang Moo Yi, Jihoon Kim, Jincho Choo, Byoungjip Kim, Jin-Yeop Chang, Youngjune Gwon, Hyung Jin Chang |
Abstract | Active Learning for discriminative models has largely been studied with the focus on individual samples, with less emphasis on how classes are distributed or which classes are hard to deal with. In this work, we show that this is harmful. We propose a method based on Bayes’ rule that can naturally incorporate class imbalance into the Active Learning framework. We derive that three terms should be considered together when estimating the probability of a classifier making a mistake for a given sample: i) the probability of mislabelling a class, ii) the likelihood of the data given a predicted class, and iii) the prior probability on the abundance of a predicted class. Implementing these terms requires a generative model and an intractable likelihood estimation. Therefore, we train a Variational Auto Encoder (VAE) for this purpose. To further tie the VAE with the classifier and facilitate VAE training, we use the classifier’s deep feature representations as input to the VAE. By considering all three probabilities, among them especially the data imbalance, we can substantially improve the potential of existing methods under a limited data budget. We show that our method can be applied to classification tasks on multiple different datasets – including one that is a real-world dataset with heavy data imbalance – significantly outperforming the state of the art. |
Tasks | Active Learning |
Published | 2020-03-25 |
URL | https://arxiv.org/abs/2003.11249v1 |
PDF | https://arxiv.org/pdf/2003.11249v1.pdf |
PWC | https://paperswithcode.com/paper/vab-al-incorporating-class-imbalance-and |
Repo | |
Framework | |
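The three-term decomposition in the abstract translates directly into a scoring rule for picking samples to label. Below is a hedged sketch, assuming the three quantities have already been estimated elsewhere: the mislabelling probabilities from held-out data, the per-class likelihoods from the VAE over classifier features, and the class priors from label counts.

```python
import numpy as np

def acquisition_score(log_p_mistake_given_c, log_px_given_c, log_prior_c):
    """Combine the abstract's three terms per class, in log space:
    i) probability of mislabelling class c, ii) VAE likelihood of the
    sample's features given class c, iii) prior abundance of class c.
    All inputs are log-probability arrays of shape (num_classes,)."""
    log_joint = log_p_mistake_given_c + log_px_given_c + log_prior_c
    # Marginalise over classes: a higher score means the classifier is
    # more likely wrong on this sample, so it is worth labelling.
    return np.logaddexp.reduce(log_joint)
```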
Development of accurate human head models for personalized electromagnetic dosimetry using deep learning
Title | Development of accurate human head models for personalized electromagnetic dosimetry using deep learning |
Authors | Essam A. Rashed, Jose Gomez-Tames, Akimasa Hirata |
Abstract | The development of personalized human head models from medical images has become an important topic in the electromagnetic dosimetry field, including the optimization of electrostimulation, safety assessments, etc. Human head models are commonly generated via the segmentation of magnetic resonance images into different anatomical tissues. This process is time-consuming and requires special experience for segmenting a relatively large number of tissues. Thus, it is challenging to accurately compute the electric field in different specific brain regions. Recently, deep learning has been applied for the segmentation of the human brain. However, most studies have focused on the segmentation of brain tissue only and little attention has been paid to other tissues, which are considerably important for electromagnetic dosimetry. In this study, we propose a new architecture for a convolutional neural network, named ForkNet, to perform the segmentation of whole human head structures, which is essential for evaluating the electric field distribution in the brain. The proposed network can be used to generate personalized head models and applied for the evaluation of the electric field in the brain during transcranial magnetic stimulation. Our computational results indicate that the head models generated using the proposed network exhibit strong matching with those created via manual segmentation in an intra-scanner segmentation task. |
Tasks | |
Published | 2020-02-21 |
URL | https://arxiv.org/abs/2002.09080v1 |
PDF | https://arxiv.org/pdf/2002.09080v1.pdf |
PWC | https://paperswithcode.com/paper/development-of-accurate-human-head-models-for |
Repo | |
Framework | |
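The name ForkNet suggests a shared trunk that forks into per-tissue outputs; a toy sketch of that shape is below. The layer sizes, tissue count, and 2D (slice-wise) setting are assumptions for illustration, not the published architecture.

```python
import torch.nn as nn

class ForkNetSketch(nn.Module):
    """Toy fork-shaped segmentation net: one shared encoder over an MR
    slice, one lightweight decoder branch per anatomical tissue."""
    def __init__(self, n_tissues=13):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.branches = nn.ModuleList(
            [nn.Conv2d(64, 1, 1) for _ in range(n_tissues)])

    def forward(self, mri_slice):          # (B, 1, H, W)
        h = self.encoder(mri_slice)
        return [branch(h) for branch in self.branches]  # one map per tissue
```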
Dynamic multi-object Gaussian process models: A framework for data-driven functional modelling of human joints
Title | Dynamic multi-object Gaussian process models: A framework for data-driven functional modelling of human joints |
Authors | Jean-Rassaire Fouefack, Bhushan Borotikar, Tania S. Douglas, Valérie Burdin, Tinashe E. M. Mutsvangwa |
Abstract | Statistical shape models (SSMs) are state-of-the-art medical image analysis tools for extracting and explaining features across a set of biological structures. However, a principled and robust way to combine shape and pose features has been elusive due to three main issues: 1) non-homogeneity of the data (data with linear and non-linear natural variation across features), 2) non-optimal representation of the 3D motion (rigid transformation representations that are not proportional to the kinetic energy that moves an object from one position to the other), and 3) artificial discretization of the models. In this paper, we propose a new dynamic multi-object statistical modelling framework for the analysis of human joints in a continuous domain. Specifically, we propose to normalise shape and dynamic spatial features in the same linearized statistical space, permitting the use of linear statistics; we adopt an optimal 3D motion representation for more accurate rigid transformation comparisons; and we provide a 3D shape and pose prediction protocol using Markov chain Monte Carlo sampling-based fitting. The framework affords an efficient generative dynamic multi-object modelling platform for biological joints. We validate the framework using controlled synthetic data. Finally, the framework is applied to an analysis of the human shoulder joint to compare its performance with standard SSM approaches in prediction of shape, while adding the advantage of determining relative pose between bones in a complex. Excellent validity is observed, and the shoulder joint shape-pose prediction results suggest that the novel framework may have utility for a range of medical image analysis applications. Furthermore, the framework is generic and can be extended to n > 2 objects, making it suitable for clinical and diagnostic methods for the management of joint disorders. |
Tasks | Pose Prediction |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.07904v1 |
PDF | https://arxiv.org/pdf/2001.07904v1.pdf |
PWC | https://paperswithcode.com/paper/dynamic-multi-object-gaussian-process-models |
Repo | |
Framework | |
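The fitting protocol in the abstract is Markov chain Monte Carlo over shape and pose coefficients. A generic random-walk Metropolis-Hastings sketch is below; the model-specific log-posterior (shape/pose likelihood plus prior in the linearised statistical space) is assumed to be supplied by the caller, and the step size is illustrative.

```python
import numpy as np

def mh_fit(log_posterior, x0, steps=5000, step_size=0.05, rng=None):
    """Random-walk Metropolis-Hastings over latent shape/pose coefficients.
    `log_posterior` is an assumed callable mapping a coefficient vector to
    its unnormalised log posterior."""
    rng = rng or np.random.default_rng(0)
    x, lp = np.asarray(x0, dtype=float), log_posterior(x0)
    samples = []
    for _ in range(steps):
        prop = x + step_size * rng.standard_normal(x.shape)
        lp_prop = log_posterior(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject
            x, lp = prop, lp_prop
        samples.append(x.copy())
    return np.asarray(samples)
```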
Brain-Inspired Model for Incremental Learning Using a Few Examples
Title | Brain-Inspired Model for Incremental Learning Using a Few Examples |
Authors | Ali Ayub, Alan Wagner |
Abstract | Incremental learning attempts to develop a classifier which learns continuously from a stream of data segregated into different classes. Deep learning approaches suffer from catastrophic forgetting when learning classes incrementally. We propose a novel approach to incremental learning, inspired by the concept learning model of the hippocampus, that represents each image class as centroids and does not suffer from catastrophic forgetting. Classification of a test image is accomplished using the distance of the test image to the n closest centroids. We further demonstrate that our approach can incrementally learn from only a few examples per class. Evaluations of our approach on three class-incremental learning benchmarks (Caltech-101, CUB-200-2011, and CIFAR-100), for both incremental and few-shot incremental learning, show state-of-the-art results in terms of classification accuracy over all learned classes. |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12411v1 |
PDF | https://arxiv.org/pdf/2002.12411v1.pdf |
PWC | https://paperswithcode.com/paper/brain-inspired-model-for-incremental-learning |
Repo | |
Framework | |
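The classification rule described in the abstract (distance of a test feature to the n closest centroids) is easy to sketch. The per-class mean-of-n-nearest aggregation below is an assumption; the paper's exact distance and aggregation rule may differ.

```python
import numpy as np

def predict(x, centroids_by_class, n=3):
    """Classify a test feature by its mean distance to the n closest
    centroids of each class; the smallest mean distance wins. A sketch."""
    best_class, best_score = None, np.inf
    for cls, cents in centroids_by_class.items():
        d = np.sort(np.linalg.norm(cents - x, axis=1))[:n]
        if d.mean() < best_score:
            best_class, best_score = cls, d.mean()
    return best_class
```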
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
Title | Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks |
Authors | Wei Hu, Lechao Xiao, Jeffrey Pennington |
Abstract | The selection of initial parameter values for gradient-based optimization of deep neural networks is one of the most impactful hyperparameter choices in deep learning systems, affecting both convergence times and model performance. Yet despite significant empirical and theoretical analysis, relatively little has been proved about the concrete effects of different initialization schemes. In this work, we analyze the effect of initialization in deep linear networks, and provide for the first time a rigorous proof that drawing the initial weights from the orthogonal group speeds up convergence relative to the standard Gaussian initialization with iid weights. We show that for deep networks, the width needed for efficient convergence to a global minimum with orthogonal initializations is independent of the depth, whereas the width needed for efficient convergence with Gaussian initializations scales linearly in the depth. Our results demonstrate how the benefits of a good initialization can persist throughout learning, suggesting an explanation for the recent empirical successes found by initializing very deep non-linear networks according to the principle of dynamical isometry. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.05992v1 |
PDF | https://arxiv.org/pdf/2001.05992v1.pdf |
PWC | https://paperswithcode.com/paper/provable-benefit-of-orthogonal-initialization-1 |
Repo | |
Framework | |
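The two initialization schemes the paper compares are both one-liners in PyTorch. A small sketch constructing the deep linear networks analysed in the paper, with the Gaussian variance scaled as 1/width so layer outputs stay comparable across the two schemes:

```python
import torch.nn as nn

def deep_linear(depth, width, orthogonal=True):
    """Deep linear network initialised either from the orthogonal group
    or with iid Gaussian weights (the paper's two schemes)."""
    layers = [nn.Linear(width, width, bias=False) for _ in range(depth)]
    for layer in layers:
        if orthogonal:
            nn.init.orthogonal_(layer.weight)                 # orthogonal group
        else:
            nn.init.normal_(layer.weight, std=width ** -0.5)  # iid Gaussian
    return nn.Sequential(*layers)
```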
Complexity, Stability Properties Of Mixed Games and Dynamic Algorithms, And Learning In The Sharing Economy
Title | Complexity, Stability Properties Of Mixed Games and Dynamic Algorithms, And Learning In The Sharing Economy |
Authors | Michael C. Nwogugu |
Abstract | The Sharing Economy (which includes Airbnb, Apple, Alibaba, Uber, WeWork, Ebay, Didi Chuxing, and Amazon) has blossomed across the world, triggered structural changes in industries, and significantly affected international capital flows, primarily by disobeying a wide variety of statutes and laws in many countries. These companies have also illegally reduced, and changed the nature of, competition in many industries, often to the detriment of social welfare. This article develops new dynamic pricing models for the SEOs and derives some stability properties of mixed games and dynamic algorithms which eliminate antitrust liability and also reduce deadweight losses, greed, regret, and GPS manipulation. The new dynamic pricing models contravene the Myerson-Satterthwaite Impossibility Theorem. |
Tasks | |
Published | 2020-01-18 |
URL | https://arxiv.org/abs/2001.08192v1 |
PDF | https://arxiv.org/pdf/2001.08192v1.pdf |
PWC | https://paperswithcode.com/paper/complexity-stability-properties-of-mixed |
Repo | |
Framework | |
A utility-based analysis of equilibria in multi-objective normal form games
Title | A utility-based analysis of equilibria in multi-objective normal form games |
Authors | Roxana Rădulescu, Patrick Mannion, Yijie Zhang, Diederik M. Roijers, Ann Nowé |
Abstract | In multi-objective multi-agent systems (MOMAS), agents explicitly consider the possible tradeoffs between conflicting objective functions. We argue that compromises between competing objectives in MOMAS should be analysed on the basis of the utility that these compromises have for the users of a system, where an agent’s utility function maps their payoff vectors to scalar utility values. This utility-based approach naturally leads to two different optimisation criteria for agents in a MOMAS: expected scalarised returns (ESR) and scalarised expected returns (SER). In this article, we explore the differences between these two criteria using the framework of multi-objective normal form games (MONFGs). We demonstrate that the choice of optimisation criterion (ESR or SER) can radically alter the set of equilibria in a MONFG when non-linear utility functions are used. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.08177v1 |
PDF | https://arxiv.org/pdf/2001.08177v1.pdf |
PWC | https://paperswithcode.com/paper/a-utility-based-analysis-of-equilibria-in |
Repo | |
Framework | |
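The ESR/SER distinction is a question of where the utility function is applied relative to the expectation, and a tiny numeric example shows why the two criteria diverge for non-linear utilities. The payoffs and utility below are invented for illustration:

```python
import numpy as np

def esr(payoff_vectors, utility):
    """Expected scalarised returns: scalarise each outcome, then average."""
    return np.mean([utility(p) for p in payoff_vectors])

def ser(payoff_vectors, utility):
    """Scalarised expected returns: average the payoff vectors, then scalarise."""
    return utility(np.mean(payoff_vectors, axis=0))

u = lambda v: v[0] * v[1]                                # non-linear utility
payoffs = [np.array([4.0, 0.0]), np.array([0.0, 4.0])]   # two equally likely outcomes
print(esr(payoffs, u))  # 0.0
print(ser(payoffs, u))  # 4.0 -- the two criteria rank policies differently
```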
Distributed Deep Convolutional Compression for Massive MIMO CSI Feedback
Title | Distributed Deep Convolutional Compression for Massive MIMO CSI Feedback |
Authors | Qianqian Yang, Mahdi Boloursaz Mashhadi, Deniz Gunduz |
Abstract | Massive multiple-input multiple-output (MIMO) systems require downlink channel state information (CSI) at the base station (BS) to achieve spatial diversity and multiplexing gains. In a frequency division duplex (FDD) multiuser massive MIMO network, each user needs to compress and feed back its downlink CSI to the BS. The CSI overhead scales with the numbers of antennas, users and subcarriers, and becomes a major bottleneck for the overall spectral efficiency. In this paper, we propose a deep learning (DL)-based CSI compression scheme, called DeepCMC, composed of convolutional layers followed by quantization and entropy coding blocks. In comparison with previous DL-based CSI reduction structures, DeepCMC includes quantization and entropy coding blocks and minimizes a weighted rate-distortion cost, which enables a trade-off between the CSI quality and its feedback overhead. Simulation results demonstrate that DeepCMC outperforms state-of-the-art CSI compression schemes in terms of the reconstruction quality of CSI for the same compression rate. We also propose a distributed version of DeepCMC for a multi-user MIMO scenario to encode and reconstruct the CSI from multiple users in a distributed manner. Distributed DeepCMC not only utilizes the inherent CSI structures of a single MIMO user for compression, but also benefits from the correlations among the channel matrices of nearby users to further improve the performance in comparison with DeepCMC. |
Tasks | Quantization |
Published | 2020-03-07 |
URL | https://arxiv.org/abs/2003.04684v1 |
PDF | https://arxiv.org/pdf/2003.04684v1.pdf |
PWC | https://paperswithcode.com/paper/distributed-deep-convolutional-compression |
Repo | |
Framework | |
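A rough sketch of a DeepCMC-style pipeline: convolutional encoder, uniform quantization made differentiable with a straight-through estimator, and a convolutional decoder. Channel counts and kernel sizes are assumptions, and the entropy model that supplies the rate term of the weighted rate-distortion loss is omitted; only the distortion path is shown.

```python
import torch
import torch.nn as nn

class DeepCMCSketch(nn.Module):
    """Illustrative CSI autoencoder: encode, quantize, decode. The CSI
    matrix enters as two channels (real and imaginary parts)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(2, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 8, 5, stride=2, padding=2))
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(8, 32, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 2, 5, stride=2, padding=2, output_padding=1))

    def forward(self, csi):                           # (B, 2, H, W)
        z = self.enc(csi)
        z_q = z + (torch.round(z) - z).detach()       # straight-through quantizer
        return self.dec(z_q)
```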
CRL: Class Representative Learning for Image Classification
Title | CRL: Class Representative Learning for Image Classification |
Authors | Mayanka Chandrashekar, Yugyung Lee |
Abstract | Building robust, real-time classifiers from diverse datasets is one of the most significant challenges for deep learning researchers, because there is a considerable gap between a model built with training (seen) data and real (unseen) data in applications. Recent works, including Zero-Shot Learning (ZSL), have attempted to deal with this problem of overcoming the apparent gap through transfer learning. In this paper, we propose a novel model, called the Class Representative Learning Model (CRL), that can be especially effective in image classification influenced by ZSL. In the CRL model, the learning step builds class representatives for the classes in a dataset by aggregating prominent features extracted from a Convolutional Neural Network (CNN); the inferencing step then matches the class representatives against new data. The proposed CRL model demonstrated superior performance compared to the current state-of-the-art research in ZSL and mobile deep learning. The proposed CRL model has been implemented and evaluated in a parallel environment, using Apache Spark, for both distributed learning and recognition. An extensive experimental study on the benchmark datasets ImageNet-1K, CalTech-101, CalTech-256, and CIFAR-100 shows that CRL can build a class distribution model with drastic improvement in learning and recognition performance without sacrificing accuracy, compared to the state-of-the-art performances in image classification. |
Tasks | Image Classification, Transfer Learning, Zero-Shot Learning |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06619v1 |
PDF | https://arxiv.org/pdf/2002.06619v1.pdf |
PWC | https://paperswithcode.com/paper/crl-class-representative-learning-for-image |
Repo | |
Framework | |
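The two steps the abstract describes, aggregating CNN features into class representatives and matching new samples against them, fit in a few lines. The mean aggregator and cosine matching below are plausible choices for a sketch, not necessarily the paper's.

```python
import numpy as np

def build_representatives(features, labels):
    """Learning step: one representative per class, here the feature mean."""
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(x, reps):
    """Inference step: match a feature vector to the most similar class
    representative (cosine similarity assumed)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(reps, key=lambda c: cos(x, reps[c]))
```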
Adaptive Federated Optimization
Title | Adaptive Federated Optimization |
Authors | Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan |
Abstract | Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Due to the heterogeneity of the client datasets, standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have had notable success in combating such issues. In this work, we propose federated versions of adaptive optimizers, including Adagrad, Adam, and Yogi, and analyze their convergence in the presence of heterogeneous data for general nonconvex settings. Our results highlight the interplay between client heterogeneity and communication efficiency. We also perform extensive experiments on these methods and show that the use of adaptive optimizers can significantly improve the performance of federated learning. |
Tasks | |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.00295v1 |
PDF | https://arxiv.org/pdf/2003.00295v1.pdf |
PWC | https://paperswithcode.com/paper/adaptive-federated-optimization |
Repo | |
Framework | |
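The paper's recipe treats the averaged client update as a pseudo-gradient and feeds it to an adaptive server optimizer. A sketch of one FedAdam-style server round, with illustrative hyperparameters:

```python
import numpy as np

def fed_adam_round(w, client_deltas, m, v, lr=0.1, b1=0.9, b2=0.99, tau=1e-3):
    """One server round: average the client model deltas into a
    pseudo-gradient, then apply an Adam-like update with adaptivity
    floor tau (per the paper's FedOpt template)."""
    delta = np.mean(client_deltas, axis=0)   # pseudo-gradient direction
    m = b1 * m + (1 - b1) * delta
    v = b2 * v + (1 - b2) * delta ** 2
    w = w + lr * m / (np.sqrt(v) + tau)
    return w, m, v
```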
How Transferable are the Representations Learned by Deep Q Agents?
Title | How Transferable are the Representations Learned by Deep Q Agents? |
Authors | Jacob Tyo, Zachary Lipton |
Abstract | In this paper, we consider the source of Deep Reinforcement Learning (DRL)'s sample complexity, asking how much derives from the requirement of learning useful representations of environment states and how much is due to the sample complexity of learning a policy. While for DRL agents, the distinction between representation and policy may not be clear, we seek new insight through a set of transfer learning experiments. In each experiment, we retain some fraction of layers trained on either the same game or a related game, comparing the benefits of transfer learning to learning a policy from scratch. Interestingly, we find that benefits due to transfer are highly variable in general and non-symmetric across pairs of tasks. Our experiments suggest that perhaps transfer from simpler environments can boost performance on more complex downstream tasks and that the requirements of learning a useful representation can range from negligible to the majority of the sample complexity, based on the environment. Furthermore, we find that fine-tuning generally outperforms training with the transferred layers frozen, confirming an insight first noted in the classification setting. |
Tasks | Transfer Learning |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10021v1 |
PDF | https://arxiv.org/pdf/2002.10021v1.pdf |
PWC | https://paperswithcode.com/paper/how-transferable-are-the-representations |
Repo | |
Framework | |
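The experimental manipulation described in the abstract, retaining some fraction of trained layers and optionally freezing them, looks roughly like this in PyTorch. A sketch only; it assumes the source and target networks share the same layer layout.

```python
import torch.nn as nn

def transfer_layers(source: nn.Module, target: nn.Module, n_layers: int, freeze: bool = False):
    """Copy the first n_layers of a trained source Q-network into a fresh
    target, optionally freezing them (the paper finds fine-tuning the
    transferred layers usually beats freezing)."""
    for s, t in list(zip(source.children(), target.children()))[:n_layers]:
        t.load_state_dict(s.state_dict())
        if freeze:
            for p in t.parameters():
                p.requires_grad = False
    return target
```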
Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems
Title | Reinforcement Learning for POMDP: Partitioned Rollout and Policy Iteration with Application to Autonomous Sequential Repair Problems |
Authors | Sushmita Bhattacharya, Sahil Badyal, Thomas Wheeler, Stephanie Gil, Dimitri Bertsekas |
Abstract | In this paper we consider infinite horizon discounted dynamic programming problems with finite state and control spaces, and partial state observations. We discuss an algorithm that uses multistep lookahead, truncated rollout with a known base policy, and a terminal cost function approximation. This algorithm is also used for policy improvement in an approximate policy iteration scheme, where successive policies are approximated by using a neural network classifier. A novel feature of our approach is that it is well suited for distributed computation through an extended belief space formulation and the use of a partitioned architecture, which is trained with multiple neural networks. We apply our methods in simulation to a class of sequential repair problems where a robot inspects and repairs a pipeline with potentially several rupture sites under partial information about the state of the pipeline. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04175v1 |
PDF | https://arxiv.org/pdf/2002.04175v1.pdf |
PWC | https://paperswithcode.com/paper/reinforcement-learning-for-pomdp-partitioned |
Repo | |
Framework | |
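The truncated-rollout evaluation at the heart of the algorithm can be sketched generically: simulate the known base policy for a fixed horizon from states sampled from the belief, then add a discounted terminal cost approximation. `sim`, `belief.sample`, and the cost-minimisation convention are assumptions of this sketch.

```python
import numpy as np

def truncated_rollout_value(sim, belief, base_policy, terminal_cost,
                            horizon=10, gamma=0.95, n_traj=32):
    """Estimate the cost of following the base policy from a belief state:
    truncated rollout plus a terminal cost function approximation.
    `sim(state, action)` is assumed to return (next_state, stage_cost)."""
    total = 0.0
    for _ in range(n_traj):
        s = belief.sample()
        cost, discount = 0.0, 1.0
        for _ in range(horizon):
            s, c = sim(s, base_policy(s))
            cost += discount * c
            discount *= gamma
        total += cost + discount * terminal_cost(s)
    return total / n_traj
```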
The continuous categorical: a novel simplex-valued exponential family
Title | The continuous categorical: a novel simplex-valued exponential family |
Authors | Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, John P. Cunningham |
Abstract | Simplex-valued data appear throughout statistics and machine learning, for example in the context of transfer learning and compression of deep networks. Existing models for this class of data rely on the Dirichlet distribution or other related loss functions; here we show these standard choices suffer systematically from a number of limitations, including bias and numerical issues that frustrate the use of flexible network models upstream of these distributions. We resolve these limitations by introducing a novel exponential family of distributions for modeling simplex-valued data - the continuous categorical, which arises as a nontrivial multivariate generalization of the recently discovered continuous Bernoulli. Unlike the Dirichlet and other typical choices, the continuous categorical results in a well-behaved probabilistic loss function that produces unbiased estimators, while preserving the mathematical simplicity of the Dirichlet. As well as exploring its theoretical properties, we introduce sampling methods for this distribution that are amenable to the reparameterization trick, and evaluate their performance. Lastly, we demonstrate that the continuous categorical outperforms standard choices empirically, across a simulation study, an applied example on multi-party elections, and a neural network compression task. |
Tasks | Neural Network Compression, Transfer Learning |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08563v1 |
PDF | https://arxiv.org/pdf/2002.08563v1.pdf |
PWC | https://paperswithcode.com/paper/the-continuous-categorical-a-novel-simplex |
Repo | |
Framework | |
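The continuous categorical has density proportional to prod_i lam_i^{x_i} for x on the simplex. The paper derives the normalising constant in closed form; the sketch below instead estimates it by Monte Carlo, taking the density with respect to the uniform measure on the simplex to keep the code short.

```python
import numpy as np

def cc_log_density(x, lam, n_mc=200_000, rng=None):
    """log p(x; lam) for the continuous categorical, with the normaliser
    E_u[prod_i lam_i^{u_i}] (u uniform on the simplex) estimated by Monte
    Carlo. Illustrative only; a closed form exists in the paper."""
    rng = rng or np.random.default_rng(0)
    u = rng.dirichlet(np.ones(len(lam)), size=n_mc)   # uniform simplex draws
    log_z = np.log(np.mean(np.exp(u @ np.log(lam))))
    return x @ np.log(lam) - log_z
```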
A Framework for the Computational Linguistic Analysis of Dehumanization
Title | A Framework for the Computational Linguistic Analysis of Dehumanization |
Authors | Julia Mendelsohn, Yulia Tsvetkov, Dan Jurafsky |
Abstract | Dehumanization is a pernicious psychological process that often leads to extreme intergroup bias, hate speech, and violence aimed at targeted social groups. Despite these serious consequences and the wealth of available data, dehumanization has not yet been computationally studied on a large scale. Drawing upon social psychology research, we create a computational linguistic framework for analyzing dehumanizing language by identifying linguistic correlates of salient components of dehumanization. We then apply this framework to analyze discussions of LGBTQ people in the New York Times from 1986 to 2015. Overall, we find increasingly humanizing descriptions of LGBTQ people over time. However, we find that the label homosexual has emerged to be much more strongly associated with dehumanizing attitudes than other labels, such as gay. Our proposed techniques highlight processes of linguistic variation and change in discourses surrounding marginalized groups. Furthermore, the ability to analyze dehumanizing language at a large scale has implications for automatically detecting and understanding media bias as well as abusive language online. |
Tasks | |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03014v1 |
PDF | https://arxiv.org/pdf/2003.03014v1.pdf |
PWC | https://paperswithcode.com/paper/a-framework-for-the-computational-linguistic |
Repo | |
Framework | |
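One way a framework like this can quantify a "linguistic correlate" is embedding proximity between a group label and a lexicon of dehumanizing concepts. The scoring below is purely illustrative and not the paper's specific measures.

```python
import numpy as np

def association_score(label_vec, concept_vecs):
    """Cosine similarity between a group label's word vector and the
    centroid of a dehumanizing-concept lexicon (illustrative measure)."""
    c = np.mean(concept_vecs, axis=0)
    return float(label_vec @ c / (np.linalg.norm(label_vec) * np.linalg.norm(c)))
```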