Paper Group ANR 1030
Bayesian Nonparametric Boolean Factor Models. A Syntactic Operator for Forgetting that Satisfies Strong Persistence. Low-rank approximations of hyperbolic embeddings. Certifiably Optimal Sparse Inverse Covariance Estimation. Regularizing Trajectory Optimization with Denoising Autoencoders. Neural Policy Gradient Methods: Global Optimality and Rates …
Bayesian Nonparametric Boolean Factor Models
Title | Bayesian Nonparametric Boolean Factor Models |
Authors | Tammo Rukat, Christopher Yau |
Abstract | We build upon probabilistic models for Boolean Matrix and Boolean Tensor factorisation that have recently been shown to solve these problems with unprecedented accuracy and to enable posterior inference to scale to billions of observations. Here, we lift the restriction of a pre-specified number of latent dimensions by introducing an Indian Buffet Process prior over factor matrices. Not only does the full factor-conditional take a computationally convenient form due to the logical dependencies in the model, but the posterior over the number of non-zero latent dimensions is also remarkably simple. It amounts to counting the number of false and true negative predictions, whereas positive predictions can be ignored. This constitutes a very transparent example of sampling-based posterior inference with an IBP prior and, importantly, lets us maintain extremely efficient inference. We discuss applications to simulated data, as well as to a real-world data matrix with 6 million entries. |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1907.00063v1 |
https://arxiv.org/pdf/1907.00063v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-nonparametric-boolean-factor-models |
Repo | |
Framework | |
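The OR-AND product at the core of Boolean matrix factorisation can be sketched in a few lines. This is a minimal illustration of the reconstruction rule x_ij = OR_k (u_ik AND v_jk), not the paper's probabilistic model or its IBP-based inference; all matrices below are toy values.

```python
# Boolean (OR-AND) matrix product underlying Boolean matrix factorisation:
# an observation x_ij is 1 if the two rows share any active latent dimension.

def boolean_product(U, V):
    """OR-AND product of binary factor matrices U (n x k) and V (m x k)."""
    n, m, k = len(U), len(V), len(U[0])
    return [[int(any(U[i][d] and V[j][d] for d in range(k)))
             for j in range(m)] for i in range(n)]

# Toy factors with two latent dimensions.
U = [[1, 0], [0, 1], [1, 1]]
V = [[1, 0], [1, 1]]
X = boolean_product(U, V)  # reconstructed Boolean data matrix
```

Because the reconstruction is a logical OR, adding a latent dimension can only turn zeros into ones, which is why (as the abstract notes) only the negative predictions matter when scoring the number of dimensions.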
A Syntactic Operator for Forgetting that Satisfies Strong Persistence
Title | A Syntactic Operator for Forgetting that Satisfies Strong Persistence |
Authors | Matti Berthold, Ricardo Gonçalves, Matthias Knorr, João Leite |
Abstract | Whereas the operation of forgetting has recently seen a considerable amount of attention in the context of Answer Set Programming (ASP), most of it has focused on theoretical aspects, leaving the practical issues largely untouched. Recent studies include results about what sets of properties operators should satisfy, as well as the abstract characterization of several operators and their theoretical limits. However, no concrete operators have been investigated. In this paper, we address this issue by presenting the first concrete operator that satisfies strong persistence - a property that seems to best capture the essence of forgetting in the context of ASP - whenever this is possible, and many other important properties. The operator is syntactic, limiting the computation of the forgetting result to manipulating the rules in which the atoms to be forgotten occur, naturally yielding a forgetting result that is close to the original program. This paper is under consideration for acceptance in TPLP. |
Tasks | |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12501v2 |
https://arxiv.org/pdf/1907.12501v2.pdf | |
PWC | https://paperswithcode.com/paper/a-syntactic-operator-for-forgetting-that |
Repo | |
Framework | |
Low-rank approximations of hyperbolic embeddings
Title | Low-rank approximations of hyperbolic embeddings |
Authors | Pratik Jawanpuria, Mayank Meghwanshi, Bamdev Mishra |
Abstract | The hyperbolic manifold is a smooth manifold of negative constant curvature. While the hyperbolic manifold is well-studied in the literature, it has lately gained interest in the machine learning and natural language processing communities due to its usefulness in modeling continuous hierarchies. Tasks with hierarchical structures are ubiquitous in those fields and there is general interest in learning hyperbolic representations or embeddings for such tasks. Additionally, these embeddings of related tasks may also share a low-rank subspace. In this work, we propose to learn hyperbolic embeddings such that they also lie in a low-dimensional subspace. In particular, we consider the problem of learning a low-rank factorization of hyperbolic embeddings. We cast these problems as manifold optimization problems and propose computationally efficient algorithms. Empirical results illustrate the efficacy of the proposed approach. |
Tasks | |
Published | 2019-03-18 |
URL | http://arxiv.org/abs/1903.07307v1 |
http://arxiv.org/pdf/1903.07307v1.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-approximations-of-hyperbolic |
Repo | |
Framework | |
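For readers unfamiliar with hyperbolic embeddings, here is a hedged sketch of the Lorentz (hyperboloid) model such embeddings typically live on, assuming the standard Minkowski inner product and geodesic distance; the paper's actual contribution, the low-rank factorization of the embeddings, is not reproduced here.

```python
import math

def minkowski_dot(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lift(v):
    """Lift a Euclidean point v onto the hyperboloid x0 = sqrt(1 + |v|^2)."""
    return [math.sqrt(1.0 + sum(c * c for c in v))] + list(v)

def hyperbolic_distance(x, y):
    """Geodesic distance on the hyperboloid: arccosh(-<x, y>_L)."""
    # max() guards against tiny negative round-off below 1.0.
    return math.acosh(max(1.0, -minkowski_dot(x, y)))

p, q = lift([0.3, 0.1]), lift([0.0, 0.0])  # q is the hyperboloid origin
d = hyperbolic_distance(p, q)
```

Distances grow roughly logarithmically in tree depth under this model, which is what makes it attractive for embedding hierarchies.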
Certifiably Optimal Sparse Inverse Covariance Estimation
Title | Certifiably Optimal Sparse Inverse Covariance Estimation |
Authors | Dimitris Bertsimas, Jourdain Lamperski, Jean Pauphilet |
Abstract | We consider the maximum likelihood estimation of sparse inverse covariance matrices. We demonstrate that current heuristic approaches primarily encourage robustness, instead of the desired sparsity. We give a novel approach that solves the cardinality-constrained likelihood problem to certifiable optimality. The approach uses techniques from mixed-integer optimization and convex optimization, and provides a high-quality solution with a guarantee on its suboptimality, even if the algorithm is terminated early. Using a variety of synthetic and real datasets, we demonstrate that our approach can solve problems where the dimension of the inverse covariance matrix is in the 1,000s. We also demonstrate that our approach produces significantly sparser solutions than Glasso and other popular learning procedures, makes fewer false discoveries, and still maintains state-of-the-art accuracy. |
Tasks | |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.10283v1 |
https://arxiv.org/pdf/1906.10283v1.pdf | |
PWC | https://paperswithcode.com/paper/certifiably-optimal-sparse-inverse-covariance |
Repo | |
Framework | |
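The objective being maximized is the Gaussian log-likelihood log det(Θ) − tr(SΘ), subject to a cardinality limit on the nonzero off-diagonal entries of Θ (unlike Glasso's ℓ1 penalty). A minimal 2×2 illustration, with toy numbers of my own choosing rather than the paper's data:

```python
import math

def loglik_2x2(theta, S):
    """Gaussian log-likelihood term log det(Theta) - trace(S @ Theta), 2x2 case."""
    det = theta[0][0] * theta[1][1] - theta[0][1] * theta[1][0]
    trace_S_theta = sum(S[i][j] * theta[j][i] for i in range(2) for j in range(2))
    return math.log(det) - trace_S_theta

S = [[1.0, 0.4], [0.4, 1.0]]            # empirical covariance (correlated data)
dense  = [[1.25, -0.5], [-0.5, 1.25]]   # off-diagonal entries allowed
sparse = [[1.0, 0.0], [0.0, 1.0]]       # cardinality budget 0: diagonal only
```

With correlated data the dense estimate attains a higher likelihood, so the cardinality constraint genuinely trades likelihood for sparsity; the paper's contribution is solving that constrained problem to certified optimality via mixed-integer optimization.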
Regularizing Trajectory Optimization with Denoising Autoencoders
Title | Regularizing Trajectory Optimization with Denoising Autoencoders |
Authors | Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola |
Abstract | Trajectory optimization using a learned model of the environment is one of the core elements of model-based reinforcement learning. This procedure often suffers from exploiting inaccuracies of the learned model. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the model of the environment. We show that the proposed regularization leads to improved planning with both gradient-based and gradient-free optimizers. We also demonstrate that using regularized trajectory optimization leads to rapid initial learning in a set of popular motor control tasks, which suggests that the proposed approach can be a useful tool for improving sample efficiency. |
Tasks | Denoising |
Published | 2019-03-28 |
URL | https://arxiv.org/abs/1903.11981v3 |
https://arxiv.org/pdf/1903.11981v3.pdf | |
PWC | https://paperswithcode.com/paper/regularizing-trajectory-optimization-with |
Repo | |
Framework | |
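The regularization idea above can be sketched as adding the denoising autoencoder's reconstruction error on a candidate trajectory to the planning cost: trajectories similar to those the DAE was trained on reconstruct well and are penalized less. The `denoise` and `task_cost` functions below are hypothetical stand-ins, not the paper's networks.

```python
def regularized_cost(trajectory, task_cost, denoise, lam=1.0):
    """Task cost plus a DAE reconstruction-error penalty (weight lam)."""
    recon = denoise(trajectory)
    penalty = sum((a - b) ** 2 for a, b in zip(trajectory, recon))
    return task_cost(trajectory) + lam * penalty

# Toy stand-ins: this "DAE" pulls states toward 0, so trajectories far
# from its (hypothetical) training distribution are penalised more.
denoise = lambda traj: [0.9 * s for s in traj]
task_cost = lambda traj: -sum(traj)          # e.g. reward progress
familiar, unfamiliar = [0.1, 0.2], [5.0, 6.0]
```

A gradient-based or gradient-free optimizer would then minimize `regularized_cost` over candidate trajectories, which discourages the planner from exploiting regions where the learned dynamics model is inaccurate.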
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Title | Neural Policy Gradient Methods: Global Optimality and Rates of Convergence |
Authors | Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang |
Abstract | Policy gradient methods with actor-critic schemes demonstrate tremendous empirical successes, especially when the actors and critics are parameterized by neural networks. However, it remains less clear whether such “neural” policy gradient methods converge to globally optimal policies and whether they even converge at all. We answer both questions affirmatively in the overparameterized regime. In detail, we prove that neural natural policy gradient converges to a globally optimal policy at a sublinear rate. Also, we show that neural vanilla policy gradient converges sublinearly to a stationary point. Meanwhile, by relating the suboptimality of the stationary points to the representation power of neural actor and critic classes, we prove the global optimality of all stationary points under mild regularity conditions. In particular, we show that a key to the global optimality and convergence is the “compatibility” between the actor and critic, which is ensured by sharing neural architectures and random initializations across the actor and critic. To the best of our knowledge, our analysis establishes the first global optimality and convergence guarantees for neural policy gradient methods. |
Tasks | Policy Gradient Methods |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1909.01150v3 |
https://arxiv.org/pdf/1909.01150v3.pdf | |
PWC | https://paperswithcode.com/paper/neural-policy-gradient-methods-global |
Repo | |
Framework | |
Find or Classify? Dual Strategy for Slot-Value Predictions on Multi-Domain Dialog State Tracking
Title | Find or Classify? Dual Strategy for Slot-Value Predictions on Multi-Domain Dialog State Tracking |
Authors | Jian-Guo Zhang, Kazuma Hashimoto, Chien-Sheng Wu, Yao Wan, Philip S. Yu, Richard Socher, Caiming Xiong |
Abstract | Dialog State Tracking (DST) is a core component in task-oriented dialog systems. Existing approaches for DST usually fall into two categories, i.e., picklist-based and span-based. On the one hand, picklist-based methods perform classification for each slot over a candidate-value list, under the condition that a pre-defined ontology is accessible. However, this is often impractical in industry, since full access to the ontology is hard to obtain. On the other hand, span-based methods track values for each slot by finding text spans in the dialog context. However, due to the diversity of value descriptions, it is hard to find a particular string in the dialog context. To mitigate these issues, this paper proposes a Dual Strategy for DST (DS-DST) that borrows the advantages of both picklist-based and span-based methods, by classifying over a picklist or finding values from a slot span. Empirical results show that DS-DST achieves state-of-the-art scores in terms of joint accuracy, i.e., 51.2% on the MultiWOZ 2.1 dataset, and 53.3% when the full ontology is accessible. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03544v2 |
https://arxiv.org/pdf/1910.03544v2.pdf | |
PWC | https://paperswithcode.com/paper/find-or-classify-dual-strategy-for-slot-value |
Repo | |
Framework | |
Image to Video Domain Adaptation Using Web Supervision
Title | Image to Video Domain Adaptation Using Web Supervision |
Authors | Andrew Kae, Yale Song |
Abstract | Training deep neural networks typically requires large amounts of labeled data, which may be scarce or expensive to obtain for a particular target domain. As an alternative, we can leverage webly-supervised data (i.e. results from a public search engine), which are relatively plentiful but may contain noisy results. In this work, we propose a novel two-stage approach to learn a video classifier using webly-supervised data. We argue that learning appearance features and then temporal features sequentially, rather than simultaneously, is an easier optimization for this task. We show this by first learning an image model from web images, which is used to initialize and train a video model. Our model applies domain adaptation to account for potential domain shift between the source domain (webly-supervised data) and the target domain, and also accounts for noise by adding a novel attention component. We report results competitive with the state of the art for webly-supervised approaches on UCF-101 (while simplifying the training process) and also evaluate on Kinetics for comparison. |
Tasks | Domain Adaptation |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01449v1 |
https://arxiv.org/pdf/1908.01449v1.pdf | |
PWC | https://paperswithcode.com/paper/image-to-video-domain-adaptation-using-web |
Repo | |
Framework | |
Deep Sub-Ensembles for Fast Uncertainty Estimation in Image Classification
Title | Deep Sub-Ensembles for Fast Uncertainty Estimation in Image Classification |
Authors | Matias Valdenegro-Toro |
Abstract | Fast estimates of model uncertainty are required for many robust robotics applications. Deep Ensembles provide state-of-the-art uncertainty estimates without requiring Bayesian methods, but remain computationally expensive. In this paper we propose deep sub-ensembles, an approximation to deep ensembles whose core idea is to ensemble only the layers close to the output, rather than the whole model. With ResNet-20 on the CIFAR10 dataset, we obtain a 1.5-2.5x speedup over a Deep Ensemble, with a small increase in error and NLL, and similarly a 5-15x speedup with a VGG-like network on the SVHN dataset. Our results show that this idea enables a trade-off between error and uncertainty quality versus computational performance. |
Tasks | Image Classification |
Published | 2019-10-17 |
URL | https://arxiv.org/abs/1910.08168v2 |
https://arxiv.org/pdf/1910.08168v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-sub-ensembles-for-fast-uncertainty |
Repo | |
Framework | |
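The sub-ensemble idea can be sketched as a single shared trunk feeding several independently trained output heads, with only the heads ensembled; the trunk runs once per input, which is where the speedup comes from. The functions below are toy stand-ins, not the paper's networks.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def sub_ensemble_predict(x, trunk, heads):
    """Run the shared trunk once, then average the softmax of every head."""
    h = trunk(x)                                # shared computation, done once
    probs = [softmax(head(h)) for head in heads]
    k = len(probs)
    return [sum(p[i] for p in probs) / k for i in range(len(probs[0]))]

# Toy trunk and two toy heads over a 2-class problem.
trunk = lambda x: [x, 2 * x]
heads = [lambda h: [h[0], h[1]], lambda h: [h[1], h[0]]]
p = sub_ensemble_predict(1.0, trunk, heads)     # averaged class probabilities
```

Disagreement among the heads (as in the toy example, where the averaged prediction is maximally uncertain) is what provides the uncertainty estimate.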
Graph Node Embeddings using Domain-Aware Biased Random Walks
Title | Graph Node Embeddings using Domain-Aware Biased Random Walks |
Authors | Sourav Mukherjee, Tim Oates, Ryan Wright |
Abstract | The recent proliferation of publicly available graph-structured data has sparked an interest in machine learning algorithms for graph data. Since most traditional machine learning algorithms assume data to be tabular, embedding algorithms that map graph data to real-valued vector spaces have become an active area of research. Existing graph embedding approaches are based purely on structural information and ignore any semantic information from the underlying domain. In this paper, we demonstrate that semantic information can play a useful role in computing graph embeddings. Specifically, we present a framework for devising embedding strategies aware of domain-specific interpretations of graph nodes and edges, and use knowledge of downstream machine learning tasks to identify relevant graph substructures. Using two real-life domains, we show that our framework yields embeddings that are simple to implement and yet achieve equal or greater accuracy in machine learning tasks compared to domain-independent approaches. |
Tasks | Graph Embedding |
Published | 2019-08-08 |
URL | https://arxiv.org/abs/1908.02947v1 |
https://arxiv.org/pdf/1908.02947v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-node-embeddings-using-domain-aware |
Repo | |
Framework | |
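A domain-aware biased random walk, the building block such embedding strategies feed into a skip-gram-style model, can be sketched as reweighting transition probabilities with domain-specific edge weights before sampling. The graph and weights below are illustrative, not from the paper.

```python
import random

def biased_walk(graph, weights, start, length, rng):
    """Random walk where each step is sampled proportionally to edge weights."""
    walk = [start]
    for _ in range(length - 1):
        nbrs = graph[walk[-1]]
        if not nbrs:
            break
        w = [weights.get((walk[-1], n), 1.0) for n in nbrs]  # default weight 1
        walk.append(rng.choices(nbrs, weights=w, k=1)[0])
    return walk

graph = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
weights = {("a", "b"): 10.0, ("a", "c"): 0.1}  # domain bias toward edge a->b
rng = random.Random(0)
walk = biased_walk(graph, weights, "a", 5, rng)
```

With uniform weights this reduces to an ordinary (structure-only) random walk; the domain knowledge enters purely through the edge-weight table.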
EdgeAI: A Vision for Deep Learning in IoT Era
Title | EdgeAI: A Vision for Deep Learning in IoT Era |
Authors | Kartikeya Bhardwaj, Naveen Suda, Radu Marculescu |
Abstract | The significant computational requirements of deep learning present a major bottleneck for its large-scale adoption on hardware-constrained IoT devices. Here, we envision a new paradigm called EdgeAI to address major impediments associated with deploying deep networks at the edge. Specifically, we discuss the existing directions in computation-aware deep learning and describe two new challenges in the IoT era: (1) data-independent deployment of learning, and (2) communication-aware distributed inference. We further present new directions from our recent research to alleviate these two challenges. Overcoming these challenges is crucial for the rapid adoption of learning on IoT devices in order to truly enable EdgeAI. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10356v1 |
https://arxiv.org/pdf/1910.10356v1.pdf | |
PWC | https://paperswithcode.com/paper/edgeai-a-vision-for-deep-learning-in-iot-era |
Repo | |
Framework | |
AGAN: Towards Automated Design of Generative Adversarial Networks
Title | AGAN: Towards Automated Design of Generative Adversarial Networks |
Authors | Hanchao Wang, Jun Huan |
Abstract | Recent progress in Generative Adversarial Networks (GANs) has shown promising signs of improving GAN training via architectural change. Despite some early success, at present the design of GAN architectures requires human expertise, laborious trial-and-error testing, and often draws inspiration from its image classification counterpart. In the current paper, we present the first neural architecture search algorithm specifically suited for GAN training: automated neural architecture search for deep generative models, or AGAN for short. For unsupervised image generation tasks on CIFAR-10, our algorithm finds an architecture that outperforms state-of-the-art models under the same regularization techniques. For supervised tasks, the automatically searched architectures also achieve highly competitive performance, outperforming the best human-invented architectures at resolution $32\times32$. Moreover, we empirically demonstrate that the modules learned by AGAN are transferable to other image generation tasks such as STL-10. |
Tasks | Image Classification, Image Generation, Neural Architecture Search |
Published | 2019-06-25 |
URL | https://arxiv.org/abs/1906.11080v1 |
https://arxiv.org/pdf/1906.11080v1.pdf | |
PWC | https://paperswithcode.com/paper/agan-towards-automated-design-of-generative |
Repo | |
Framework | |
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings
Title | Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings |
Authors | Dorottya Demszky, Nikhil Garg, Rob Voigt, James Zou, Matthew Gentzkow, Jesse Shapiro, Dan Jurafsky |
Abstract | We provide an NLP framework to uncover four linguistic dimensions of political polarization in social media: topic choice, framing, affect and illocutionary force. We quantify these aspects with existing lexical methods, and propose clustering of tweet embeddings as a means to identify salient topics for analysis across events; human evaluations show that our approach generates more cohesive topics than traditional LDA-based models. We apply our methods to study 4.4M tweets on 21 mass shootings. We provide evidence that the discussion of these events is highly polarized politically and that this polarization is primarily driven by partisan differences in framing rather than topic choice. We identify framing devices, such as grounding and the contrasting use of the terms “terrorist” and “crazy”, that contribute to polarization. Results pertaining to topic choice, affect and illocutionary force suggest that Republicans focus more on the shooter and event-specific facts (news) while Democrats focus more on the victims and call for policy changes. Our work contributes to a deeper understanding of the way group divisions manifest in language and to computational methods for studying them. |
Tasks | |
Published | 2019-04-02 |
URL | http://arxiv.org/abs/1904.01596v2 |
http://arxiv.org/pdf/1904.01596v2.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-polarization-in-social-media-method |
Repo | |
Framework | |
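The topic-identification step, clustering tweet embeddings, can be sketched with a plain k-means loop. This is a generic sketch on 2-d toy points, assuming nothing about the paper's actual embeddings, initialization, or cluster count.

```python
import random

def kmeans(points, k, rng, iters=20):
    """Plain k-means: assign each point to its nearest center, then recenter."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:  # keep the old center if a cluster empties out
                centers[j] = tuple(sum(c[i] for c in cl) / len(cl)
                                   for i in range(len(cl[0])))
    return centers, clusters

# Toy "embeddings": two well-separated groups.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.2, 4.9)]
rng = random.Random(0)
centers, clusters = kmeans(points, 2, rng)
```

In the paper's pipeline, each resulting cluster of tweet embeddings is then treated as a topic whose partisan composition can be measured.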
Suicidal Ideation Detection: A Review of Machine Learning Methods and Applications
Title | Suicidal Ideation Detection: A Review of Machine Learning Methods and Applications |
Authors | Shaoxiong Ji, Shirui Pan, Xue Li, Erik Cambria, Guodong Long, Zi Huang |
Abstract | Suicide is a critical issue in modern society. Early detection and prevention of suicide attempts are essential to saving lives. Current suicidal ideation detection methods include clinical methods based on the interaction between social workers or experts and the targeted individuals, and machine learning techniques with feature engineering or deep learning for automatic detection based on online social content. This is the first survey to comprehensively introduce and discuss the methods from these categories. Domain-specific applications of suicidal ideation detection are also reviewed according to their data sources, i.e., questionnaires, electronic health records, suicide notes, and online user content. To facilitate further research, several specific tasks and datasets are introduced. Finally, we summarize the limitations of current work and provide an outlook on further research directions. |
Tasks | Feature Engineering |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.12611v1 |
https://arxiv.org/pdf/1910.12611v1.pdf | |
PWC | https://paperswithcode.com/paper/suicidal-ideation-detection-a-review-of |
Repo | |
Framework | |
Mature GAIL: Imitation Learning for Low-level and High-dimensional Input using Global Encoder and Cost Transformation
Title | Mature GAIL: Imitation Learning for Low-level and High-dimensional Input using Global Encoder and Cost Transformation |
Authors | Wonsup Shin, Hyolim Kang, Sunghoon Hong |
Abstract | Recently, the GAIL framework and its variants have shown remarkable possibilities for solving practical MDP problems. However, detailed research on low-level, high-dimensional state inputs in this framework, such as image sequences, has not been conducted. Furthermore, the cost function learned in the traditional GAIL framework lies only in a negative range, acting as a non-penalizing reward and making it difficult for the agent to learn the optimal policy. In this paper, we propose a new algorithm based on the GAIL framework that includes a global encoder and a reward penalization mechanism. The global encoder solves two issues that arise when applying the GAIL framework to high-dimensional image states. We also show that the penalization mechanism provides a more adequate reward to the agent, resulting in stable performance improvement. Our approach is generally applicable to variants of the GAIL framework. We conducted in-depth experiments applying our method to various variants of the GAIL framework, and the results show that it significantly improves performance on low-level, high-dimensional tasks. |
Tasks | Imitation Learning |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.03200v1 |
https://arxiv.org/pdf/1909.03200v1.pdf | |
PWC | https://paperswithcode.com/paper/mature-gail-imitation-learning-for-low-level |
Repo | |
Framework | |
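To see why a cost confined to a negative range cannot penalize, compare two common GAIL-style reward shapes derived from a discriminator output d in (0, 1). The paper's exact penalization mechanism is not given in the abstract; this sketch only illustrates the sign issue it addresses.

```python
import math

def reward_positive(d):
    """Classic GAIL reward -log(1 - d): always >= 0, so it never penalises
    the agent, no matter how un-expert-like the behaviour looks."""
    return -math.log(1.0 - d)

def reward_penalizing(d):
    """A shifted form log(d) - log(1 - d): negative when the discriminator
    is confident the behaviour is not expert-like (d < 0.5)."""
    return math.log(d) - math.log(1.0 - d)
```

With the shifted form, trajectories the discriminator rejects receive genuinely negative reward, which is the kind of penalization the abstract argues stabilizes learning.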