Paper Group ANR 1694
Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over Simplex. Learning Fast Adaptation with Meta Strategy Optimization. Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization. Learning from Data-Rich Problems: A Case Study on Genetic Variant Calling. Putting Mach …
Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over Simplex
Title | Accelerating Monte Carlo Bayesian Inference via Approximating Predictive Uncertainty over Simplex |
Authors | Yufei Cui, Wuguannan Yao, Qiao Li, Antoni B. Chan, Chun Jason Xue |
Abstract | Estimating the predictive uncertainty of a Bayesian learning model is critical in various decision-making problems, e.g., reinforcement learning, adversarial attack detection, and self-driving cars. As the model posterior is almost always intractable, most efforts have focused on finding an accurate approximation of the true posterior. Even when a decent estimate of the model posterior is obtained, another approximation is required to compute the predictive distribution over the desired output. A common accurate solution is to use Monte Carlo (MC) integration. However, it needs to maintain a large number of samples, evaluate the model repeatedly, and average multiple model outputs. In many real-world cases, this is computationally prohibitive. In this work, assuming that the exact posterior or a decent approximation is obtained, we propose a generic framework to approximate the output probability distribution induced by the model posterior with a parameterized model and in an amortized fashion. The aim is to approximate the true uncertainty of a specific Bayesian model while alleviating the heavy workload of MC integration at test time. The proposed method is universally applicable to Bayesian classification models that allow for posterior sampling. Theoretically, we show that the idea of amortization incurs no additional costs on approximation performance. Empirical results validate the strong practical performance of our approach. |
Tasks | Adversarial Attack, Bayesian Inference, Decision Making |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12194v2 |
https://arxiv.org/pdf/1905.12194v2.pdf | |
PWC | https://paperswithcode.com/paper/accelerating-monte-carlo-bayesian-inference |
Repo | |
Framework | |
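The Monte Carlo integration that the abstract above describes as costly can be illustrated with a minimal sketch. It assumes a toy linear softmax classifier and uses random draws as a stand-in for posterior samples; the names `mc_predictive` and `weight_samples` are hypothetical and do not come from the paper, and the paper's amortized approximation itself is not shown.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def mc_predictive(x, weight_samples):
    """Average class probabilities over S posterior weight samples.

    x: (n_features,) input vector.
    weight_samples: (S, n_features, n_classes) weights of a toy linear
    classifier, assumed to be drawn from an (approximate) posterior.
    """
    # One model evaluation per posterior sample: this is the repeated cost.
    logits = np.einsum("f,sfc->sc", x, weight_samples)
    return softmax(logits).mean(axis=0)

rng = np.random.default_rng(0)
weight_samples = rng.normal(size=(200, 4, 3))  # stand-in for real posterior samples
x = rng.normal(size=4)
p = mc_predictive(x, weight_samples)  # point on the probability simplex
```

The cost grows linearly with the number of samples, which is the test-time workload the abstract aims to reduce.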
Learning Fast Adaptation with Meta Strategy Optimization
Title | Learning Fast Adaptation with Meta Strategy Optimization |
Authors | Wenhao Yu, Jie Tan, Yunfei Bai, Erwin Coumans, Sehoon Ha |
Abstract | The ability to walk in new scenarios is a key milestone on the path toward real-world applications of legged robots. In this work, we introduce Meta Strategy Optimization, a meta-learning algorithm for training policies with latent variable inputs that can quickly adapt to new scenarios with a handful of trials in the target environment. The key idea behind MSO is to expose the same adaptation process, Strategy Optimization (SO), to both the training and testing phases. This allows MSO to effectively learn locomotion skills as well as a latent space that is suitable for fast adaptation. We evaluate our method on a real quadruped robot and demonstrate successful adaptation in various scenarios, including sim-to-real transfer, walking with a weakened motor, or climbing up a slope. Furthermore, we quantitatively analyze the generalization capability of the trained policy in simulated environments. Both real and simulated experiments show that our method outperforms previous methods in adaptation to novel tasks. |
Tasks | Legged Robots, Meta-Learning |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.12995v2 |
https://arxiv.org/pdf/1909.12995v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-fast-adaptation-with-meta-strategy |
Repo | |
Framework | |
Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization
Title | Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization |
Authors | Feihu Huang, Shangqian Gao, Songcan Chen, Heng Huang |
Abstract | Alternating direction method of multipliers (ADMM) is a popular optimization tool for composite and constrained problems in machine learning. However, in many machine learning problems such as black-box attacks and bandit feedback, ADMM could fail because the explicit gradients of these problems are difficult or infeasible to obtain. Zeroth-order (gradient-free) methods can effectively solve these problems because they require only objective function values during optimization. Although a few zeroth-order ADMM methods exist, they rely on convexity of the objective function and are therefore limited in many applications. In this paper, we propose a class of fast zeroth-order stochastic ADMM methods (i.e., ZO-SVRG-ADMM and ZO-SAGA-ADMM) for solving nonconvex problems with multiple nonsmooth penalties, based on the coordinate smoothing gradient estimator. Moreover, we prove that both ZO-SVRG-ADMM and ZO-SAGA-ADMM have a convergence rate of $O(1/T)$, where $T$ denotes the number of iterations. In particular, our methods not only reach the best convergence rate $O(1/T)$ for nonconvex optimization, but also are able to effectively solve many complex machine learning problems with multiple regularized penalties and constraints. Finally, we conduct experiments on black-box binary classification and structured adversarial attacks on black-box deep neural networks to validate the efficiency of our algorithms. |
Tasks | Adversarial Attack |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12729v2 |
https://arxiv.org/pdf/1905.12729v2.pdf | |
PWC | https://paperswithcode.com/paper/zeroth-order-stochastic-alternating-direction |
Repo | |
Framework | |
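The idea of a gradient-free, coordinate-wise estimator mentioned in the abstract above can be sketched as follows. This is a central-difference form, which is one common way to build such an estimator from function values only; the exact estimator, smoothing parameters, and ADMM updates used in the paper may differ. The function name and the parameter `mu` are assumptions for illustration.

```python
import numpy as np

def coordinate_gradient_estimate(f, x, mu=1e-4):
    """Estimate the gradient of f at x from function values only.

    Each coordinate j uses a central difference with smoothing parameter mu:
        g_j = (f(x + mu * e_j) - f(x - mu * e_j)) / (2 * mu)
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = 1.0  # j-th standard basis vector
        grad[j] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    return grad

# Black-box objective: only its values are queried, never its gradient.
f = lambda v: float(np.sum(v ** 2))
x = np.array([1.0, -2.0, 0.5])
g = coordinate_gradient_estimate(f, x)  # close to the true gradient 2 * x
```

Each estimate costs two function queries per coordinate, which is why stochastic variance-reduced variants are attractive in high dimensions.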
Learning from Data-Rich Problems: A Case Study on Genetic Variant Calling
Title | Learning from Data-Rich Problems: A Case Study on Genetic Variant Calling |
Authors | Ren Yi, Pi-Chuan Chang, Gunjan Baid, Andrew Carroll |
Abstract | Next Generation Sequencing can sample the whole genome (WGS) or the 1-2% of the genome that codes for proteins, called the whole exome (WES). Machine learning approaches to variant calling achieve high accuracy on WGS data, but the reduced number of training examples causes training with WES data alone to achieve lower accuracy. We propose and compare three different data augmentation strategies for improving performance on WES data: 1) joint training with WES and WGS data, 2) warmstarting the WES model from a WGS model, and 3) joint training with the sequencing type specified. All three approaches show improved accuracy over a model trained using just WES data, suggesting the ability of models to generalize insights from the greater WGS data while retaining performance on the specialized WES problem. These data augmentation approaches may apply to other problem areas in genomics, where several specialized models would each see only a subset of the genome. |
Tasks | Data Augmentation |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.05151v2 |
https://arxiv.org/pdf/1911.05151v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-data-rich-problems-a-case-study |
Repo | |
Framework | |
Putting Machine Translation in Context with the Noisy Channel Model
Title | Putting Machine Translation in Context with the Noisy Channel Model |
Authors | Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer |
Abstract | We show that Bayes’ rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation. In our formulation, we estimate the probability of a candidate translation as the product of the unconditional probability of the candidate output document and the “reverse translation probability” of translating the candidate output back into the input source language document (the so-called “noisy channel” decomposition). A particular advantage of our model is that it requires only parallel sentences to train, rather than parallel documents, which are not always available. Using a new beam search reranking approximation to solve the decoding problem, we find that document language models outperform language models that assume independence between sentences, and that using either a document or sentence language model outperforms comparable models that directly estimate the translation probability. We obtain the best-published results on the NIST Chinese–English translation task, a standard task for evaluating document translation. Our model also outperforms the benchmark Transformer model by approximately 2.5 BLEU on the WMT19 Chinese–English translation task. |
Tasks | Language Modelling, Machine Translation |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00553v1 |
https://arxiv.org/pdf/1910.00553v1.pdf | |
PWC | https://paperswithcode.com/paper/putting-machine-translation-in-context-with-1 |
Repo | |
Framework | |
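The noisy channel decomposition in the abstract above scores a candidate translation y of a source x by p(y) · p(x | y), or log p(y) + log p(x | y) in log space. A minimal reranking sketch under that scoring rule follows. The two scoring callables and the toy log-probabilities are hypothetical stand-ins for a language model and a reverse translation model; any extra terms or weights used in the paper's reranker are not modeled here.

```python
def noisy_channel_rerank(source, candidates, lm_logprob, channel_logprob):
    """Sort candidate translations by log p(y) + log p(source | y), best first."""
    def score(y):
        return lm_logprob(y) + channel_logprob(source, y)
    return sorted(candidates, key=score, reverse=True)

# Toy log-probabilities (made up for illustration only).
lm_table = {"a": -2.0, "b": -1.0, "c": -3.0}       # log p(y)
channel_table = {"a": -0.5, "b": -4.0, "c": -0.1}  # log p(x | y)

ranked = noisy_channel_rerank(
    source="x",
    candidates=["a", "b", "c"],
    lm_logprob=lambda y: lm_table[y],
    channel_logprob=lambda x, y: channel_table[y],
)
# Combined scores: a = -2.5, c = -3.1, b = -5.0
```

Candidate "b" is the most fluent under the language model alone but ranks last once the reverse translation probability is included.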
Active Federated Learning
Title | Active Federated Learning |
Authors | Jack Goetz, Kshitiz Malik, Duc Bui, Seungwhan Moon, Honglei Liu, Anuj Kumar |
Abstract | Federated Learning allows for population level models to be trained without centralizing client data by transmitting the global model to clients, calculating gradients locally, then averaging the gradients. Downloading models and uploading gradients uses the client’s bandwidth, so minimizing these transmission costs is important. The data on each client is highly variable, so the benefit of training on different clients may differ dramatically. To exploit this we propose Active Federated Learning, where in each round clients are selected not uniformly at random, but with a probability conditioned on the current model and the data on the client to maximize efficiency. We propose a cheap, simple and intuitive sampling scheme which reduces the number of required training iterations by 20-70% while maintaining the same model accuracy, and which mimics well known resampling techniques under certain conditions. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12641v1 |
https://arxiv.org/pdf/1909.12641v1.pdf | |
PWC | https://paperswithcode.com/paper/active-federated-learning |
Repo | |
Framework | |
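The non-uniform client selection described in the abstract above can be illustrated with a short sketch. It assumes each client reports a scalar value reflecting how useful its data is under the current model, and it turns those values into sampling probabilities with a softmax. Both the valuation and the softmax are assumptions made for illustration; the abstract does not specify the paper's actual sampling scheme.

```python
import numpy as np

def select_clients(values, k, temperature=1.0, rng=None):
    """Sample k distinct clients with probability increasing in their value.

    values: one hypothetical usefulness score per client.
    Returns (chosen client indices, sampling probabilities).
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    logits = (values - values.max()) / temperature  # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    chosen = rng.choice(len(values), size=k, replace=False, p=probs)
    return chosen, probs

chosen, probs = select_clients(
    [0.1, 2.0, 0.5, 1.5], k=2, rng=np.random.default_rng(0)
)
```

A high temperature recovers near-uniform sampling, while a low temperature concentrates selection on the highest-valued clients.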
Mixing realities for sketch retrieval in Virtual Reality
Title | Mixing realities for sketch retrieval in Virtual Reality |
Authors | Daniele Giunchi, Stuart James, Donald Degraen, Anthony Steed |
Abstract | Drawing tools for Virtual Reality (VR) enable users to model 3D designs from within the virtual environment itself. These tools employ sketching and sculpting techniques known from desktop-based interfaces and apply them to hand-based controller interaction. While these techniques allow for mid-air sketching of basic shapes, it remains difficult for users to create detailed and comprehensive 3D models. In our work, we focus on supporting the user in designing the virtual environment around them by enhancing sketch-based interfaces with a supporting system for interactive model retrieval. Through sketching, an immersed user can query a database containing detailed 3D models and place them into the virtual environment. To understand supportive sketching within a virtual environment, we compare different methods of sketch interaction, i.e., 3D mid-air sketching, 2D sketching on a virtual tablet, 2D sketching on a fixed virtual whiteboard, and 2D sketching on a real tablet. Our results show that 3D mid-air sketching is considered to be a more intuitive method to search a collection of models, while the addition of physical devices creates confusion due to the complications of their inclusion within a virtual environment. While we pose our work as a retrieval problem for 3D models of chairs, our results can be extrapolated to other sketching tasks for virtual environments. |
Tasks | |
Published | 2019-10-25 |
URL | https://arxiv.org/abs/1910.11637v2 |
https://arxiv.org/pdf/1910.11637v2.pdf | |
PWC | https://paperswithcode.com/paper/mixing-realities-for-sketch-retrieval-in |
Repo | |
Framework | |
Cybernetical Concepts for Cellular Automaton and Artificial Neural Network Modelling and Implementation
Title | Cybernetical Concepts for Cellular Automaton and Artificial Neural Network Modelling and Implementation |
Authors | Patrik Christen, Olivier Del Fabbro |
Abstract | As a discipline, cybernetics has a long and rich history. In its first generation it not only had a worldwide span; in the area of computer modelling, for example, its proponents, such as John von Neumann, Stanislaw Ulam, Warren McCulloch and Walter Pitts, also came up with models and methods such as cellular automata and artificial neural networks, which are still the foundation of most modern modelling approaches. At the same time, cybernetics also got the attention of philosophers, such as the Frenchman Gilbert Simondon, who made use of cybernetical concepts in order to establish a metaphysics and a natural philosophy of individuation, thereby giving cybernetics a philosophical interpretation, which he baptised allagmatic. In this paper, we emphasise this allagmatic theory by showing how Simondon’s philosophical concepts can be used to formulate a generic computer model or metamodel for complex systems modelling and its implementation in program code, according to generic programming. We also present how the developed allagmatic metamodel is capable of building simple cellular automata and artificial neural networks. |
Tasks | |
Published | 2019-11-24 |
URL | https://arxiv.org/abs/2001.02037v1 |
https://arxiv.org/pdf/2001.02037v1.pdf | |
PWC | https://paperswithcode.com/paper/cybernetical-concepts-for-cellular-automaton |
Repo | |
Framework | |
Securing Tag-based recommender systems against profile injection attacks: A comparative study. (Extended Report)
Title | Securing Tag-based recommender systems against profile injection attacks: A comparative study. (Extended Report) |
Authors | Georgios K. Pitsilis, Heri Ramampiaro, Helge Langseth |
Abstract | This work addresses the challenges related to attacks on collaborative tagging systems, which often come in the form of malicious annotations or profile injection attacks. In particular, we study various countermeasures against two types of such attacks for social tagging systems, the Overload attack and the Piggyback attack. The countermeasure schemes studied here include baseline classifiers, such as a Naive Bayes filter and a Support Vector Machine, as well as a Deep Learning approach. Our evaluation, performed over synthetic spam data generated from the del.icio.us dataset, shows that in most cases Deep Learning can outperform the classical solutions, providing high-level protection against threats. |
Tasks | Recommendation Systems |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08422v1 |
http://arxiv.org/pdf/1901.08422v1.pdf | |
PWC | https://paperswithcode.com/paper/securing-tag-based-recommender-systems |
Repo | |
Framework | |
Noise-Level Estimation from Single Color Image Using Correlations Between Textures in RGB Channels
Title | Noise-Level Estimation from Single Color Image Using Correlations Between Textures in RGB Channels |
Authors | Akihiro Nakamura, Michihiro Kobayashi |
Abstract | We propose a simple method for estimating noise level from a single color image. In most image-denoising algorithms, an accurate noise-level estimate results in good denoising performance; however, it is difficult to estimate noise level from a single image because it is an ill-posed problem. We tackle this problem by using prior knowledge that textures are highly correlated between RGB channels and noise is uncorrelated to other signals. We also extended our method for RAW images because they are available in almost all digital cameras and often used in practical situations. Experiments show the high noise-estimation performance of our method in synthetic noisy images. We also applied our method to natural images including RAW images and achieved better noise-estimation performance than conventional methods. |
Tasks | Denoising, Image Denoising |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02566v1 |
http://arxiv.org/pdf/1904.02566v1.pdf | |
PWC | https://paperswithcode.com/paper/noise-level-estimation-from-single-color |
Repo | |
Framework | |
SAT vs CSP: a commentary
Title | SAT vs CSP: a commentary |
Authors | Toby Walsh |
Abstract | In 2000, I published a relatively comprehensive study of mappings between propositional satisfiability (SAT) and constraint satisfaction problems (CSPs) [Wal00]. I analysed four different mappings of SAT problems into CSPs, and two of CSPs into SAT problems. For each mapping, I compared the impact of achieving arc-consistency on the CSP with unit propagation on the corresponding SAT problems, and lifted these results to CSP algorithms that maintain (some level of) arc-consistency during search like FC and MAC, and to the Davis-Putnam procedure (which performs unit propagation at each search node). These results helped provide some insight into the relationship between propositional satisfiability and constraint satisfaction that set the scene for an important and valuable body of work that followed. I discuss here what prompted the paper, and what followed. |
Tasks | |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1910.00128v1 |
https://arxiv.org/pdf/1910.00128v1.pdf | |
PWC | https://paperswithcode.com/paper/sat-vs-csp-a-commentary |
Repo | |
Framework | |
PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space
Title | PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space |
Authors | Omer Anjum, Hongyu Gong, Suma Bhat, Wen-Mei Hwu, Jinjun Xiong |
Abstract | Finding the right reviewers to assess the quality of conference submissions is a time-consuming process for conference organizers. Given the importance of this step, various automated reviewer-paper matching solutions have been proposed to alleviate the burden. Prior approaches, including bag-of-words models and probabilistic topic models, have been inadequate to deal with the vocabulary mismatch and partial topic overlap between a paper submission and the reviewer’s expertise. Our approach, the common topic model, jointly models the topics common to the submission and the reviewer’s profile while relying on abstract topic vectors. Experiments and insightful evaluations on two datasets demonstrate that the proposed method achieves consistent improvements compared to available state-of-the-art implementations of paper-reviewer matching. |
Tasks | Topic Models |
Published | 2019-09-25 |
URL | https://arxiv.org/abs/1909.11258v1 |
https://arxiv.org/pdf/1909.11258v1.pdf | |
PWC | https://paperswithcode.com/paper/pare-a-paper-reviewer-matching-approach-using |
Repo | |
Framework | |
Probabilistic Inference of Binary Markov Random Fields in Spiking Neural Networks through Mean-field Approximation
Title | Probabilistic Inference of Binary Markov Random Fields in Spiking Neural Networks through Mean-field Approximation |
Authors | Yajing Zheng, Shanshan Jia, Zhaofei Yu, Tiejun Huang, Jian K. Liu, Yonghong Tian |
Abstract | Recent studies have suggested that the cognitive process of the human brain is realized as probabilistic inference and can be further modeled by probabilistic graphical models like Markov random fields. Nevertheless, it remains unclear how probabilistic inference can be implemented by a network of spiking neurons in the brain. Previous studies have tried to relate the inference equation of binary Markov random fields to the dynamic equation of spiking neural networks through the belief propagation algorithm and reparameterization, but they are valid only for Markov random fields with limited network structure. In this paper, we propose a spiking neural network model that can implement inference of arbitrary binary Markov random fields. Specifically, we design a spiking recurrent neural network and prove that its neuronal dynamics are mathematically equivalent to the inference process of Markov random fields by adopting mean-field theory. Furthermore, our mean-field approach unifies previous works. Theoretical analysis and experimental results, together with the application to image denoising, demonstrate that our proposed spiking neural network achieves results comparable to those of mean-field inference. |
Tasks | Denoising, Image Denoising |
Published | 2019-02-22 |
URL | https://arxiv.org/abs/1902.08411v3 |
https://arxiv.org/pdf/1902.08411v3.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-inference-of-binary-markov |
Repo | |
Framework | |
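The mean-field inference that the abstract above uses as its reference point can be sketched for a small binary Markov random field. The sketch assumes the textbook parameterization with variables x_i in {-1, +1} and p(x) proportional to exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j), for which the mean-field fixed point is m_i = tanh(h_i + sum_j J_ij m_j). It shows conventional mean-field iteration only, not the spiking network from the paper; the damping factor and the example values are illustrative choices.

```python
import numpy as np

def mean_field(h, J, n_iters=200, damping=0.5):
    """Damped fixed-point iteration for m_i = tanh(h_i + sum_j J_ij m_j).

    h: (n,) unary biases; J: (n, n) symmetric couplings with zero diagonal.
    Returns the approximate means m_i = E[x_i] for x_i in {-1, +1}.
    """
    h = np.asarray(h, dtype=float)
    J = np.asarray(J, dtype=float)
    m = np.zeros_like(h)
    for _ in range(n_iters):
        m_new = np.tanh(h + J @ m)
        m = damping * m + (1.0 - damping) * m_new  # damping aids convergence
    return m

# Three-node chain with weak positive couplings.
h = np.array([0.5, -0.2, 0.1])
J = np.array([[0.0, 0.3, 0.0],
              [0.3, 0.0, 0.3],
              [0.0, 0.3, 0.0]])
m = mean_field(h, J)
p_plus = (1.0 + m) / 2.0  # approximate marginals q(x_i = +1)
```

With all couplings set to zero the variables are independent and the result reduces to tanh(h).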
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue
Title | Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue |
Authors | Anusha Balakrishnan, Jinfeng Rao, Kartikeya Upasani, Michael White, Rajen Subba |
Abstract | Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems. Avenues like the E2E NLG Challenge have encouraged the development of neural approaches, particularly sequence-to-sequence (Seq2Seq) models for this problem. The semantic representations used, however, are often underspecified, which places a higher burden on the generation model for sentence planning, and also limits the extent to which generated responses can be controlled in a live system. In this paper, we (1) propose using tree-structured semantic representations, like those used in traditional rule-based NLG systems, for better discourse-level structuring and sentence-level planning; (2) introduce a challenging dataset using this representation for the weather domain; (3) introduce a constrained decoding approach for Seq2Seq models that leverages this representation to improve semantic correctness; and (4) demonstrate promising results on our dataset and the E2E dataset. |
Tasks | |
Published | 2019-06-17 |
URL | https://arxiv.org/abs/1906.07220v1 |
https://arxiv.org/pdf/1906.07220v1.pdf | |
PWC | https://paperswithcode.com/paper/constrained-decoding-for-neural-nlg-from |
Repo | |
Framework | |
You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions
Title | You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions |
Authors | Evonne Ng, Donglai Xiang, Hanbyul Joo, Kristen Grauman |
Abstract | The body pose of a person wearing a camera is of great interest for applications in augmented reality, healthcare, and robotics, yet much of the person’s body is out of view for a typical wearable camera. We propose a learning-based approach to estimate the camera wearer’s 3D body pose from egocentric video sequences. Our key insight is to leverage interactions with another person—whose body pose we can directly observe—as a signal inherently linked to the body pose of the first-person subject. We show that since interactions between individuals often induce a well-ordered series of back-and-forth responses, it is possible to learn a temporal model of the interlinked poses even though one party is largely out of view. We demonstrate our idea on a variety of domains with dyadic interaction and show the substantial impact on egocentric body pose estimation, which improves the state of the art. Video results are available at http://vision.cs.utexas.edu/projects/you2me/ |
Tasks | Pose Estimation |
Published | 2019-04-22 |
URL | https://arxiv.org/abs/1904.09882v2 |
https://arxiv.org/pdf/1904.09882v2.pdf | |
PWC | https://paperswithcode.com/paper/you2me-inferring-body-pose-in-egocentric |
Repo | |
Framework | |