April 2, 2020

2955 words 14 mins read

Paper Group ANR 238

Learning models of quantum systems from experiments. SwapText: Image Based Texts Transfer in Scenes. Everybody’s Talkin’: Let Me Talk as You Want. A working likelihood approach to support vector regression with a data-driven insensitivity parameter. Planning with Brain-inspired AI. Causal Inference With Selectively-Deconfounded Data. Learning to Le …

Learning models of quantum systems from experiments

Title Learning models of quantum systems from experiments
Authors Antonio A. Gentile, Brian Flynn, Sebastian Knauer, Nathan Wiebe, Stefano Paesani, Christopher E. Granade, John G. Rarity, Raffaele Santagati, Anthony Laing
Abstract An isolated system of interacting quantum particles is described by a Hamiltonian operator. Hamiltonian models underpin the study and analysis of physical and chemical processes throughout science and industry, so it is crucial that they are faithful to the system they represent. However, formulating and testing Hamiltonian models of quantum systems from experimental data is difficult, because it is impossible to directly observe which interactions the quantum system is subject to. Here, we propose and demonstrate an approach to retrieving a Hamiltonian model from experiments, using unsupervised machine learning. We test our methods experimentally on an electron spin in a nitrogen-vacancy centre interacting with its spin bath environment, and numerically, finding success rates up to 86%. By building agents capable of learning science, which recover meaningful representations, we can gain further insight into the physics of quantum systems.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.06169v1
PDF https://arxiv.org/pdf/2002.06169v1.pdf
PWC https://paperswithcode.com/paper/learning-models-of-quantum-systems-from
Repo
Framework
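The core loop here, stripped of the paper's agent machinery, is Bayesian comparison of candidate Hamiltonian models against measurement data. A minimal sketch, assuming a single-qubit model H = ω·σz/2 whose Ramsey-type measurement gives P(outcome 0) = cos²(ωt/2); the candidate frequencies stand in for candidate model structures and are my illustrative choice, not the paper's protocol:

```python
import math
import random

def p0(omega, t):
    # Probability of outcome 0 after evolving for time t under H = omega*sigma_z/2
    # in a Ramsey-type experiment: P(0) = cos^2(omega*t/2).
    return math.cos(omega * t / 2) ** 2

random.seed(0)
true_omega = 0.8
times = [0.5 * k for k in range(1, 21)]
data = []
for t in times:
    for _ in range(5):  # five projective-measurement shots per evolution time
        data.append((t, 0 if random.random() < p0(true_omega, t) else 1))

def log_likelihood(omega, data):
    ll = 0.0
    for t, outcome in data:
        p = min(max(p0(omega, t), 1e-12), 1 - 1e-12)  # clip for numerical safety
        ll += math.log(p if outcome == 0 else 1 - p)
    return ll

# Candidate "models" (candidate frequencies standing in for candidate
# Hamiltonian structures); keep the one the data favours.
candidates = [0.2, 0.5, 0.8, 1.1]
best = max(candidates, key=lambda w: log_likelihood(w, data))
```

With 100 simulated shots, the log-likelihood cleanly separates the true model from the alternatives; the paper's contribution is automating the generation and pruning of such candidates.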

SwapText: Image Based Texts Transfer in Scenes

Title SwapText: Image Based Texts Transfer in Scenes
Authors Qiangpeng Yang, Hongsheng Jin, Jun Huang, Wei Lin
Abstract Swapping text in scene images while preserving original fonts, colors, sizes and background textures is a challenging task due to the complex interplay between different factors. In this work, we present SwapText, a three-stage framework to transfer texts across scene images. First, a novel text swapping network is proposed to replace text labels only in the foreground image. Second, a background completion network is learned to reconstruct background images. Finally, the generated foreground image and background image are used to generate the word image by the fusion network. Using the proposed framework, we can manipulate the texts of the input images even with severe geometric distortion. Qualitative and quantitative results are presented on several scene text datasets, including regular and irregular text datasets. We conducted extensive experiments to demonstrate the usefulness of our method in applications such as image-based text translation and text image synthesis.
Tasks Image Generation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08152v1
PDF https://arxiv.org/pdf/2003.08152v1.pdf
PWC https://paperswithcode.com/paper/swaptext-image-based-texts-transfer-in-scenes
Repo
Framework

Everybody’s Talkin’: Let Me Talk as You Want

Title Everybody’s Talkin’: Let Me Talk as You Want
Authors Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
Abstract We present a method to edit a target portrait footage by taking an audio sequence as input to synthesize a photo-realistic video. This method is unique because it is highly dynamic. It does not assume a person-specific rendering network, yet it is capable of translating arbitrary source audio into arbitrary video output. Instead of learning a highly heterogeneous and nonlinear mapping from audio to the video directly, we first factorize each target video frame into orthogonal parameter spaces, i.e., expression, geometry, and pose, via monocular 3D face reconstruction. Next, a recurrent network is introduced to translate source audio into expression parameters that are primarily related to the audio content. The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio. The geometry and pose parameters of the target human portrait are retained, therefore preserving the context of the original video footage. Finally, we introduce a novel video rendering network and a dynamic programming method to construct a temporally coherent and photo-realistic video. Extensive experiments demonstrate the superiority of our method over existing approaches. Our method is end-to-end learnable and robust to voice variations in the source audio.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2020-01-15
URL https://arxiv.org/abs/2001.05201v1
PDF https://arxiv.org/pdf/2001.05201v1.pdf
PWC https://paperswithcode.com/paper/everybodys-talkin-let-me-talk-as-you-want
Repo
Framework

A working likelihood approach to support vector regression with a data-driven insensitivity parameter

Title A working likelihood approach to support vector regression with a data-driven insensitivity parameter
Authors Jinran Wu, You-Gan Wang
Abstract The insensitivity parameter in support vector regression determines the set of support vectors and thus greatly impacts prediction. A data-driven approach is proposed to determine an approximate value for this insensitivity parameter by minimizing a generalized loss function originating from the likelihood principle. This data-driven support vector regression also statistically standardizes samples using the scale of the noise. Nonlinear and linear numerical simulations with three types of noise ($\epsilon$-Laplacian, normal, and uniform distributions), together with five real benchmark data sets, are used to test the capacity of the proposed method. Across all of the simulations and the five case studies, the proposed support vector regression with a working-likelihood, data-driven insensitivity parameter is superior and has lower computational costs.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.03893v1
PDF https://arxiv.org/pdf/2003.03893v1.pdf
PWC https://paperswithcode.com/paper/a-working-likelihood-approach-to-support
Repo
Framework
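To make the likelihood-principle idea concrete, here is a toy sketch of choosing ε from pilot residuals. It assumes (my assumption, not the paper's exact formulation) the ε-insensitive Laplace density p(r) ∝ exp(−max(|r|−ε, 0)/σ), whose normalizer 2(ε+σ) penalizes an overly wide insensitive zone, so the negative log-likelihood has an interior minimum:

```python
import math
import random
import statistics

random.seed(1)
# Pilot residuals from an initial SVR fit; here simulated as uniform noise on [-1, 1].
residuals = [random.uniform(-1.0, 1.0) for _ in range(2000)]
sigma = statistics.pstdev(residuals)    # scale estimate used to standardize

def neg_log_lik(eps):
    # Working likelihood of the eps-insensitive Laplace density
    # p(r) = exp(-max(|r| - eps, 0) / sigma) / (2 * (eps + sigma)).
    avg = sum(max(abs(r) - eps, 0.0) for r in residuals) / len(residuals)
    return avg / sigma + math.log(2.0 * (eps + sigma))

grid = [i / 100.0 for i in range(0, 101)]
best_eps = min(grid, key=neg_log_lik)   # data-driven insensitivity parameter
```

For uniform noise on [−1, 1] the stationarity condition gives ε ≈ 1 − σ ≈ 0.42, so the grid search lands well away from both ε = 0 and the boundary; the selected ε would then be plugged into a standard SVR fit.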

Planning with Brain-inspired AI

Title Planning with Brain-inspired AI
Authors Naoya Arakawa
Abstract This article surveys engineering and neuroscientific models of planning as a cognitive function, which is regarded as a typical function of fluid intelligence in the discussion of general intelligence. It aims to present existing planning models as references for realizing the planning function in brain-inspired AI or artificial general intelligence (AGI). It also proposes themes for the research and development of brain-inspired AI from the viewpoint of tasks and architecture.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.12353v1
PDF https://arxiv.org/pdf/2003.12353v1.pdf
PWC https://paperswithcode.com/paper/planning-with-brain-inspired-ai
Repo
Framework

Causal Inference With Selectively-Deconfounded Data

Title Causal Inference With Selectively-Deconfounded Data
Authors Kyra Gan, Andrew A. Li, Zachary C. Lipton, Sridhar Tayur
Abstract Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data; (b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporating a (large) confounded observational dataset alongside a (small) deconfounded observational dataset when estimating the ATE. Our theoretical results show that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level. Moreover, in some cases—say, genetics—we could imagine retrospectively selecting samples to deconfound. We demonstrate that by strategically selecting these examples based upon the (already observed) treatment and outcome, we can reduce our data dependence further. Our theoretical and empirical results establish that the worst-case relative performance of our approach (vs. a natural benchmark) is bounded while our best-case gains are unbounded. Next, we demonstrate the benefits of selective deconfounding using a large real-world dataset related to genetic mutation in cancer. Finally, we introduce an online version of the problem, proposing two adaptive heuristics.
Tasks Causal Inference
Published 2020-02-25
URL https://arxiv.org/abs/2002.11096v1
PDF https://arxiv.org/pdf/2002.11096v1.pdf
PWC https://paperswithcode.com/paper/causal-inference-with-selectively
Repo
Framework
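The key identity behind combining the two datasets is that P(y|t,u) can be recovered via Bayes' rule from P(u|t,y) (estimable from the small deconfounded sample) and P(y|t) (estimable precisely from the large confounded sample). A minimal binary-variable simulation of that decomposition, with a hypothetical data-generating process of my own construction (true ATE = 0.3):

```python
import random

random.seed(2)

def simulate(n, keep_u):
    rows = []
    for _ in range(n):
        u = random.randint(0, 1)                              # unobserved confounder
        t = 1 if random.random() < (0.8 if u else 0.2) else 0
        y = 1 if random.random() < 0.2 + 0.3 * t + 0.4 * u else 0
        rows.append((u, t, y) if keep_u else (t, y))
    return rows

conf = simulate(50000, keep_u=False)     # large confounded dataset: (t, y) only
deconf = simulate(500, keep_u=True)      # small deconfounded dataset: (u, t, y)

# Naive difference in means on the confounded data (biased by the confounder).
y1 = [y for t, y in conf if t == 1]
y0 = [y for t, y in conf if t == 0]
naive = sum(y1) / len(y1) - sum(y0) / len(y0)

def p_y_given_t(y, t):
    rows = [yy for tt, yy in conf if tt == t]
    return sum(1 for yy in rows if yy == y) / len(rows)

def p_u_given_ty(u, t, y):
    rows = [uu for uu, tt, yy in deconf if tt == t and yy == y]
    return sum(1 for uu in rows if uu == u) / max(len(rows), 1)

p_u1 = sum(u for u, _, _ in deconf) / len(deconf)

def p_y_given_tu(y, t, u):
    # Bayes: P(y|t,u) = P(u|t,y) P(y|t) / sum_y' P(u|t,y') P(y'|t)
    num = p_u_given_ty(u, t, y) * p_y_given_t(y, t)
    den = sum(p_u_given_ty(u, t, yy) * p_y_given_t(yy, t) for yy in (0, 1))
    return num / den

# Backdoor adjustment with the combined pieces.
combined = sum(pu * (p_y_given_tu(1, 1, u) - p_y_given_tu(1, 0, u))
               for u, pu in ((0, 1 - p_u1), (1, p_u1)))
```

The naive estimate lands near 0.54 while the combined estimator recovers the true effect to within the sampling noise of the small deconfounded set, which is the efficiency gain the paper quantifies.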

Learning to Learn Single Domain Generalization

Title Learning to Learn Single Domain Generalization
Authors Fengchun Qiao, Long Zhao, Xi Peng
Abstract We are concerned with a worst-case scenario in model generalization, in the sense that a model aims to perform well on many unseen domains while there is only one single domain available for training. We propose a new method named adversarial domain augmentation to solve this Out-of-Distribution (OOD) generalization problem. The key idea is to leverage adversarial training to create “fictitious” yet “challenging” populations, from which a model can learn to generalize with theoretical guarantees. To facilitate fast and desirable domain augmentation, we cast the model training in a meta-learning scheme and use a Wasserstein Auto-Encoder (WAE) to relax the widely used worst-case constraint. Detailed theoretical analysis is provided to support our formulation, while extensive experiments on multiple benchmark datasets indicate its superior performance in tackling single domain generalization.
Tasks Domain Generalization, Meta-Learning
Published 2020-03-30
URL https://arxiv.org/abs/2003.13216v1
PDF https://arxiv.org/pdf/2003.13216v1.pdf
PWC https://paperswithcode.com/paper/learning-to-learn-single-domain
Repo
Framework
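The adversarial-augmentation core is easy to isolate: perturb a source-domain input along the gradient of the loss with respect to the input, producing a harder "fictitious" sample. A minimal FGSM-style sketch on a linear regressor (the paper's full method adds the WAE relaxation and meta-learning, which are omitted here):

```python
import random

random.seed(1)
# Toy linear model y_hat = w . x, with weights assumed already trained
# on the single source domain.
w = [0.7, -1.2, 0.4]
x = [random.gauss(0, 1) for _ in range(3)]
y = 0.5

def loss(w, x, y):
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    return r * r

def fgsm_augment(w, x, y, eps=0.1):
    # Gradient of the squared loss w.r.t. the *input* is 2*r*w; stepping x
    # along its sign yields a harder sample from a nearby fictitious domain.
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    sign = lambda v: 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
    return [xi + eps * sign(2 * r * wi) for wi, xi in zip(w, x)]

x_adv = fgsm_augment(w, x, y)
```

For a linear model the perturbed residual grows by exactly eps·‖w‖₁, so the augmented sample is strictly harder; training would then continue on a mix of original and augmented samples.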

Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning

Title Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning
Authors Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, N. V. Vinodchandran
Abstract We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems:
- Given sample access to two Bayesian networks $P_1$ and $P_2$ over known directed acyclic graphs $G_1$ and $G_2$ having $n$ nodes and bounded in-degree, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given sample access to two ferromagnetic Ising models $P_1$ and $P_2$ on $n$ variables with bounded width, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given sample access to two $n$-dimensional Gaussians $P_1$ and $P_2$, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given access to observations from two causal models $P$ and $Q$ on $n$ variables that are defined over known causal graphs, approximate $d_{tv}(P_a, Q_a)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples, where $P_a$ and $Q_a$ are the interventional distributions obtained by the intervention $do(A=a)$ on $P$ and $Q$ respectively for a particular variable $A$.
Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for {\em tolerant} testing of closeness of the above-mentioned structured high-dimensional distributions.
Tasks
Published 2020-02-13
URL https://arxiv.org/abs/2002.05378v2
PDF https://arxiv.org/pdf/2002.05378v2.pdf
PWC https://paperswithcode.com/paper/efficient-distance-approximation-for
Repo
Framework
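The learn-then-compute pattern the paper exploits can be shown in one dimension: learn the parameters of each distribution from samples, then compute the total variation distance d_tv = ½∫|p₁ − p₂| between the learned densities numerically. A sketch for two unit-variance Gaussians (the 1D setting and the Riemann-sum integration are simplifications for illustration):

```python
import math
import random

random.seed(2)

def gauss_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fit_gaussian(samples):
    # "Learning" step: estimate (mu, sigma) from samples.
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    return mu, math.sqrt(var)

def tv_distance(p1, p2, lo=-10.0, hi=10.0, steps=20000):
    # d_tv = 0.5 * integral of |p1 - p2|, approximated by a Riemann sum.
    h = (hi - lo) / steps
    return 0.5 * sum(abs(p1(lo + i * h) - p2(lo + i * h)) for i in range(steps)) * h

s1 = [random.gauss(0.0, 1.0) for _ in range(20000)]
s2 = [random.gauss(2.0, 1.0) for _ in range(20000)]
m1, sd1 = fit_gaussian(s1)
m2, sd2 = fit_gaussian(s2)
est = tv_distance(lambda x: gauss_pdf(x, m1, sd1),
                  lambda x: gauss_pdf(x, m2, sd2))
```

For N(0,1) vs N(2,1) the exact value is 2Φ(1) − 1 = erf(1/√2) ≈ 0.683, and the estimate lands within sampling error of it; the paper's contribution is making this pipeline sample- and time-efficient for structured high-dimensional families.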

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition

Title GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition
Authors Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin
Abstract Connectionist Temporal Classification (CTC) and attention mechanism are two main approaches used in recent scene text recognition works. Compared with attention-based methods, the CTC decoder has a much shorter inference time, yet a lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns better alignments and feature representations from a more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both regular and irregular scene text while maintaining a fast inference speed. Moreover, to further leverage the potential of the CTC decoder, a graph convolutional network (GCN) is proposed to learn the local correlations of extracted features. Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and requires 6 times less inference time than attention-based methods.
Tasks Scene Text Recognition
Published 2020-02-04
URL https://arxiv.org/abs/2002.01276v1
PDF https://arxiv.org/pdf/2002.01276v1.pdf
PWC https://paperswithcode.com/paper/gtc-guided-training-of-ctc-towards-efficient
Repo
Framework

Detection and Mitigation of Bias in Ted Talk Ratings

Title Detection and Mitigation of Bias in Ted Talk Ratings
Authors Rupam Acharyya, Shouman Das, Ankani Chattoraj, Oishani Sengupta, Md Iftekar Tanveer
Abstract Unbiased data collection is essential to guaranteeing fairness in artificial intelligence models. Implicit bias, a form of behavioral conditioning that leads us to attribute predetermined characteristics to members of certain groups, informs the data collection process. This paper quantifies implicit bias in viewer ratings of TED Talks, a diverse social platform for assessing social and professional performance, in order to present the correlations of different kinds of bias across sensitive attributes. Although the viewer ratings of these videos should purely reflect the speaker’s competence and skill, our analysis of the ratings demonstrates the presence of overwhelming and predominant implicit bias with respect to race and gender. In our paper, we present strategies to detect and mitigate bias that are critical to removing unfairness in AI.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.00683v1
PDF https://arxiv.org/pdf/2003.00683v1.pdf
PWC https://paperswithcode.com/paper/detection-and-mitigation-of-bias-in-ted-talk
Repo
Framework

Better Classifier Calibration for Small Data Sets

Title Better Classifier Calibration for Small Data Sets
Authors Tuomo Alasalmi, Jaakko Suutala, Heli Koskimäki, Juha Röning
Abstract Classifier calibration does not always go hand in hand with the classifier’s ability to separate the classes. There are applications where good classifier calibration, i.e. the ability to produce accurate probability estimates, is more important than class separation. When the amount of data for training is limited, the traditional approach to improving calibration starts to crumble. In this article we show how generating more data for calibration can improve calibration algorithm performance in many cases where a classifier is not naturally producing well-calibrated outputs and the traditional approach fails. The proposed approach adds computational cost, but given that the main use case is small data sets, this extra cost remains insignificant, and prediction time is comparable to that of other methods. Of the tested classifiers, the largest improvements were observed with the random forest and naive Bayes classifiers. Therefore, the proposed approach can be recommended at least for those classifiers when the amount of data available for training is limited and good calibration is essential.
Tasks Calibration
Published 2020-02-24
URL https://arxiv.org/abs/2002.10199v1
PDF https://arxiv.org/pdf/2002.10199v1.pdf
PWC https://paperswithcode.com/paper/better-classifier-calibration-for-small-data
Repo
Framework
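The generate-more-calibration-data idea can be illustrated with the simplest calibrator, Platt scaling. In this sketch the "data generation" step is a naive stand-in of my own (jitter the scores, keep the labels), not the paper's method, and the miscalibrated classifier is simulated by scores s whose true positive-class probability is s²:

```python
import math
import random

random.seed(3)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

# Simulated classifier: its raw score s is miscalibrated -- the true
# probability of the positive class is s**2, not s.
def sample(n):
    out = []
    for _ in range(n):
        s = random.random()
        out.append((s, 1 if random.random() < s * s else 0))
    return out

cal = sample(80)                          # small calibration set
# Data-generation step (a simple stand-in for the paper's approach):
# jitter the scores, keep the labels.
aug = cal + [(min(max(s + random.gauss(0, 0.05), 0.0), 1.0), y)
             for s, y in cal for _ in range(10)]

# Platt-style scaling: fit p = sigmoid(a*s + b) by gradient descent on log-loss.
a, b = 1.0, 0.0
for _ in range(3000):
    ga = gb = 0.0
    for s, y in aug:
        e = sig(a * s + b) - y
        ga += e * s
        gb += e
    a -= 2.0 * ga / len(aug)
    b -= 2.0 * gb / len(aug)

def log_loss(pred, data):
    eps = 1e-12
    return -sum(y * math.log(max(pred(s), eps)) +
                (1 - y) * math.log(max(1.0 - pred(s), eps))
                for s, y in data) / len(data)

test_set = sample(5000)
raw = log_loss(lambda s: s, test_set)          # use raw score as a probability
calibrated = log_loss(lambda s: sig(a * s + b), test_set)
```

On held-out data the calibrated probabilities achieve a lower log-loss than the raw scores, even though the calibrator saw only 80 genuine labels.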

Graph Prolongation Convolutional Networks: Explicitly Multiscale Machine Learning on Graphs, with Applications to Modeling of Biological Systems

Title Graph Prolongation Convolutional Networks: Explicitly Multiscale Machine Learning on Graphs, with Applications to Modeling of Biological Systems
Authors C. B. Scott, Eric Mjolsness
Abstract We define a novel type of ensemble Graph Convolutional Network (GCN) model. Using optimized linear projection operators to map between the spatial scales of a graph, this ensemble model learns to aggregate information from each scale for its final prediction. We calculate these linear projection operators as the infima of an objective function relating the structure matrices used for each GCN. Equipped with these projections, our model (a Graph Prolongation-Convolutional Network) outperforms other GCN ensemble models at predicting the potential energy of monomer subunits in a coarse-grained mechanochemical simulation of microtubule bending. We demonstrate these performance gains by measuring an estimate of the FLOPs spent to train each model, as well as wall-clock time. Because our model learns at multiple scales, it is possible to train at each scale according to a predetermined schedule of coarse vs. fine training. We examine several such schedules adapted from the Algebraic Multigrid (AMG) literature, and quantify the computational benefit of each. Finally, we demonstrate how under certain assumptions, our graph prolongation layers may be decomposed into a matrix outer product of smaller GCN operations.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.05842v1
PDF https://arxiv.org/pdf/2002.05842v1.pdf
PWC https://paperswithcode.com/paper/graph-prolongation-convolutional-networks
Repo
Framework
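The coarse↔fine mapping at the heart of the model is just a linear prolongation operator and its averaging adjoint. The paper learns these operators by optimization; the hand-built cluster-assignment matrix below is only a toy to show what prolongation and restriction do on a 4-node graph with 2 super-nodes:

```python
# Fine graph: 4 nodes; coarse graph: 2 super-nodes (clusters {0,1} and {2,3}).
P = [[1, 0], [1, 0], [0, 1], [0, 1]]   # prolongation matrix: coarse -> fine (4x2)

def prolong(u):
    # Copy each coarse value onto the fine nodes of its cluster: x = P u.
    return [sum(P[i][j] * u[j] for j in range(2)) for i in range(4)]

def restrict(x):
    # Averaging adjoint: mean of the fine values inside each cluster.
    return [sum(P[i][j] * x[i] for i in range(4)) /
            sum(P[i][j] for i in range(4)) for j in range(2)]

x_fine = [1.0, 3.0, 10.0, 14.0]
u_coarse = restrict(x_fine)     # cluster means
x_back = prolong(u_coarse)      # piecewise-constant approximation on the fine graph
```

Restrict-then-prolong replaces each fine signal with its cluster mean; a GCN running on the coarse graph sees `u_coarse`, and its output is prolonged back and combined with the fine-scale prediction.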

Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation

Title Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation
Authors Lukas J. Fiderer, Jonas Schuff, Daniel Braun
Abstract Quantum metrology promises unprecedented measurement precision but suffers in practice from the limited availability of resources such as the number of probes, their coherence time, or non-classical quantum states. The adaptive Bayesian approach to parameter estimation allows for an efficient use of resources thanks to adaptive experiment design. For its practical success, fast numerical solutions for the Bayesian update and the adaptive experiment design are crucial. Here we show that neural networks can be trained to become fast and strong experiment-design heuristics using a combination of an evolutionary strategy and reinforcement learning. Neural-network heuristics are shown to outperform established heuristics for the technologically important example of frequency estimation of a qubit that suffers from dephasing. Our method of creating neural-network heuristics is very general and complements the well-studied sequential Monte-Carlo method for Bayesian updates to form a complete framework for adaptive Bayesian quantum estimation.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.02183v1
PDF https://arxiv.org/pdf/2003.02183v1.pdf
PWC https://paperswithcode.com/paper/neural-network-heuristics-for-adaptive
Repo
Framework
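The sequential Monte-Carlo backbone the heuristics plug into is a particle-filter Bayesian update. A minimal sketch for qubit frequency estimation with P(outcome 0) = cos²(ωt/2); the fixed measurement-time schedule stands in for the experiment design (which the paper replaces with a trained network), and resampling of degenerate particles is omitted for brevity:

```python
import math
import random

random.seed(4)
true_omega = 0.7
n_particles = 2000
particles = [random.uniform(0.0, 1.0) for _ in range(n_particles)]  # uniform prior
weights = [1.0 / n_particles] * n_particles

def p0(omega, t):
    # Ramsey-type measurement: P(outcome 0) = cos^2(omega * t / 2).
    return math.cos(omega * t / 2) ** 2

for step in range(30):
    t = 1.0 + 0.5 * step                          # fixed "experiment design"
    outcome = 0 if random.random() < p0(true_omega, t) else 1
    # Bayesian update: reweight each particle by the outcome likelihood.
    for i, om in enumerate(particles):
        lik = p0(om, t) if outcome == 0 else 1.0 - p0(om, t)
        weights[i] *= lik
    z = sum(weights)
    weights = [w / z for w in weights]            # renormalize each step

est = sum(w * om for w, om in zip(weights, particles))  # posterior mean
```

After 30 simulated measurements the posterior mean concentrates near the true frequency; the paper's neural heuristic chooses each `t` adaptively from the current particle cloud instead of following a schedule.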

Adaptive Online Learning with Varying Norms

Title Adaptive Online Learning with Varying Norms
Authors Ashok Cutkosky
Abstract Given any increasing sequence of norms $\|\cdot\|_0,\dots,\|\cdot\|_{T-1}$, we provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$ in response to convex losses $\ell_t:W\to \mathbb{R}$ that guarantees regret $R_T(u)=\sum_{t=1}^T \ell_t(w_t)-\ell_t(u)\le \tilde O\left(\|u\|_{T-1}\sqrt{\sum_{t=1}^T \|g_t\|_{t-1,\star}^2}\right)$ where $g_t$ is a subgradient of $\ell_t$ at $w_t$. Our method does not require tuning to the value of $u$ and allows for arbitrary convex $W$. We apply this result to obtain new “full-matrix”-style regret bounds. Along the way, we provide a new examination of the full-matrix AdaGrad algorithm, suggesting a better learning rate value that improves significantly upon prior analysis. We use our new techniques to tune AdaGrad on-the-fly, realizing our improved bound in a concrete algorithm.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.03963v1
PDF https://arxiv.org/pdf/2002.03963v1.pdf
PWC https://paperswithcode.com/paper/adaptive-online-learning-with-varying-norms
Repo
Framework
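For readers less familiar with AdaGrad, the per-coordinate adaptive step the paper builds on looks like this. The sketch is standard diagonal AdaGrad on a toy quadratic, not the paper's full-matrix variant or its varying-norm algorithm:

```python
import math

# Minimize f(w) = 0.5 * (3*w0^2 + 0.5*w1^2) with diagonal AdaGrad.
w = [5.0, -5.0]
g2 = [0.0, 0.0]                 # running sum of squared gradients, per coordinate
eta, eps = 1.0, 1e-8

def grad(w):
    return [3.0 * w[0], 0.5 * w[1]]

f = lambda w: 0.5 * (3.0 * w[0] ** 2 + 0.5 * w[1] ** 2)
f0 = f(w)
for _ in range(500):
    g = grad(w)
    for i in range(2):
        g2[i] += g[i] * g[i]
        # Effective learning rate eta / sqrt(sum of squared gradients) shrinks
        # faster on the steeper coordinate -- the adaptivity AdaGrad provides.
        w[i] -= eta * g[i] / (math.sqrt(g2[i]) + eps)
```

The badly scaled coordinates receive automatically scaled step sizes, so a single `eta` works for both; the paper's analysis sharpens how that `eta` should be chosen and extends the idea to sequences of norms.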

On the role of surrogates in the efficient estimation of treatment effects with limited outcome data

Title On the role of surrogates in the efficient estimation of treatment effects with limited outcome data
Authors Nathan Kallus, Xiaojie Mao
Abstract We study the problem of estimating treatment effects when the outcome of primary interest (e.g., long-term health status) is only seldom observed but abundant surrogate observations (e.g., short-term health outcomes) are available. To investigate the role of surrogates in this setting, we derive the semiparametric efficiency lower bounds of the average treatment effect (ATE) both with and without surrogates, as well as in several intermediary settings. These bounds characterize the best-possible precision of ATE estimation in each case, and their difference quantifies the efficiency gains from optimally leveraging the surrogates in terms of key problem characteristics when only limited outcome data are available. We show these results apply in two important regimes: when the number of surrogate observations is comparable to primary-outcome observations and when the former dominates the latter. Importantly, we take a missing-data approach that circumvents strong surrogate conditions which are commonly assumed in previous literature but almost always fail in practice. To show how to leverage the efficiency gains of surrogate observations, we propose ATE estimators and inferential methods based on flexible machine learning methods to estimate nuisance parameters that appear in the influence functions. We show our estimators enjoy efficiency and robustness guarantees under weak conditions.
Tasks
Published 2020-03-27
URL https://arxiv.org/abs/2003.12408v1
PDF https://arxiv.org/pdf/2003.12408v1.pdf
PWC https://paperswithcode.com/paper/on-the-role-of-surrogates-in-the-efficient
Repo
Framework
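To fix intuition for how surrogates carry information about a rarely observed primary outcome, here is a bare imputation sketch under strong assumptions of my own (randomized treatment, a linear outcome–surrogate relation, outcomes missing at random); the paper's estimators are far more general, using influence functions and flexible ML nuisance estimates:

```python
import random

random.seed(5)
n, n_lab = 20000, 500
data = []
for i in range(n):
    t = random.randint(0, 1)                  # randomized treatment
    s = 0.5 * t + random.gauss(0, 1)          # surrogate, observed for everyone
    y = 2.0 * s + random.gauss(0, 1)          # primary outcome (true ATE = 1.0)
    data.append((t, s, y if i < n_lab else None))  # y observed only for a few

# Fit y ~ a*s + b by least squares on the small labeled subset.
lab = [(s, y) for t, s, y in data if y is not None]
ms = sum(s for s, _ in lab) / len(lab)
my = sum(y for _, y in lab) / len(lab)
a = (sum((s - ms) * (y - my) for s, y in lab) /
     sum((s - ms) ** 2 for s, _ in lab))
b = my - a * ms

# Impute the primary outcome from the surrogate everywhere, then take
# the difference in means between treatment arms.
t1 = [a * s + b for t, s, _ in data if t == 1]
t0 = [a * s + b for t, s, _ in data if t == 0]
ate = sum(t1) / len(t1) - sum(t0) / len(t0)
```

With only 500 primary outcomes out of 20,000 units, the imputed estimator recovers the true ATE of 1.0 to within sampling error, because the abundant surrogates pin down the arm means precisely.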