April 2, 2020

2955 words 14 mins read

Paper Group ANR 238

Learning models of quantum systems from experiments. SwapText: Image Based Texts Transfer in Scenes. Everybody’s Talkin’: Let Me Talk as You Want. A working likelihood approach to support vector regression with a data-driven insensitivity parameter. Planning with Brain-inspired AI. Causal Inference With Selectively-Deconfounded Data. Learning to Le …

Learning models of quantum systems from experiments

Title Learning models of quantum systems from experiments
Authors Antonio A. Gentile, Brian Flynn, Sebastian Knauer, Nathan Wiebe, Stefano Paesani, Christopher E. Granade, John G. Rarity, Raffaele Santagati, Anthony Laing
Abstract An isolated system of interacting quantum particles is described by a Hamiltonian operator. Hamiltonian models underpin the study and analysis of physical and chemical processes throughout science and industry, so it is crucial that they are faithful to the system they represent. However, formulating and testing Hamiltonian models of quantum systems from experimental data is difficult, because it is impossible to directly observe which interactions the quantum system is subject to. Here, we propose and demonstrate an approach to retrieving a Hamiltonian model from experiments, using unsupervised machine learning. We test our methods experimentally on an electron spin in a nitrogen-vacancy centre interacting with its spin bath environment, and numerically, finding success rates up to 86%. By building agents capable of learning science, which recover meaningful representations, we can gain further insight into the physics of quantum systems.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.06169v1
PDF https://arxiv.org/pdf/2002.06169v1.pdf
PWC https://paperswithcode.com/paper/learning-models-of-quantum-systems-from
Repo
Framework
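The core loop here, stripped of the paper's agent machinery, is Bayesian comparison of candidate Hamiltonian models against measurement data. A minimal sketch, assuming a single-qubit model H = ω·σz/2 whose Ramsey-type measurement gives P(outcome 0) = cos²(ωt/2); the candidate frequencies stand in for candidate model structures and are my illustrative choice, not the paper's protocol:

```python
import math
import random

def p0(omega, t):
    # Probability of outcome 0 after evolving for time t under H = omega*sigma_z/2
    # in a Ramsey-type experiment: P(0) = cos^2(omega*t/2).
    return math.cos(omega * t / 2) ** 2

random.seed(0)
true_omega = 0.8
times = [0.5 * k for k in range(1, 21)]
data = []
for t in times:
    for _ in range(5):  # five projective-measurement shots per evolution time
        data.append((t, 0 if random.random() < p0(true_omega, t) else 1))

def log_likelihood(omega, data):
    ll = 0.0
    for t, outcome in data:
        p = min(max(p0(omega, t), 1e-12), 1 - 1e-12)  # clip for numerical safety
        ll += math.log(p if outcome == 0 else 1 - p)
    return ll

# Candidate "models" (candidate frequencies standing in for candidate
# Hamiltonian structures); keep the one the data favours.
candidates = [0.2, 0.5, 0.8, 1.1]
best = max(candidates, key=lambda w: log_likelihood(w, data))
```

With 100 simulated shots, the log-likelihood cleanly separates the true model from the alternatives; the paper's contribution is automating the generation and pruning of such candidates.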

SwapText: Image Based Texts Transfer in Scenes

Title SwapText: Image Based Texts Transfer in Scenes
Authors Qiangpeng Yang, Hongsheng Jin, Jun Huang, Wei Lin
Abstract Swapping text in scene images while preserving original fonts, colors, sizes and background textures is a challenging task due to the complex interplay between different factors. In this work, we present SwapText, a three-stage framework to transfer texts across scene images. First, a novel text swapping network is proposed to replace text labels only in the foreground image. Second, a background completion network is learned to reconstruct background images. Finally, the generated foreground image and background image are used to generate the word image by the fusion network. Using the proposed framework, we can manipulate the texts of the input images even with severe geometric distortion. Qualitative and quantitative results are presented on several scene text datasets, including regular and irregular text datasets. We conducted extensive experiments to demonstrate the usefulness of our method in applications such as image-based text translation and text image synthesis.
Tasks Image Generation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08152v1
PDF https://arxiv.org/pdf/2003.08152v1.pdf
PWC https://paperswithcode.com/paper/swaptext-image-based-texts-transfer-in-scenes
Repo
Framework

Everybody’s Talkin’: Let Me Talk as You Want

Title Everybody’s Talkin’: Let Me Talk as You Want
Authors Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
Abstract We present a method to edit a target portrait footage by taking an audio sequence as input to synthesize a photo-realistic video. This method is unique because it is highly dynamic. It does not assume a person-specific rendering network, yet it is capable of translating arbitrary source audio into arbitrary video output. Instead of learning a highly heterogeneous and nonlinear mapping from audio to the video directly, we first factorize each target video frame into orthogonal parameter spaces, i.e., expression, geometry, and pose, via monocular 3D face reconstruction. Next, a recurrent network is introduced to translate source audio into expression parameters that are primarily related to the audio content. The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio. The geometry and pose parameters of the target human portrait are retained, therefore preserving the context of the original video footage. Finally, we introduce a novel video rendering network and a dynamic programming method to construct a temporally coherent and photo-realistic video. Extensive experiments demonstrate the superiority of our method over existing approaches. Our method is end-to-end learnable and robust to voice variations in the source audio.
Tasks 3D Face Reconstruction, Face Reconstruction
Published 2020-01-15
URL https://arxiv.org/abs/2001.05201v1
PDF https://arxiv.org/pdf/2001.05201v1.pdf
PWC https://paperswithcode.com/paper/everybodys-talkin-let-me-talk-as-you-want
Repo
Framework

A working likelihood approach to support vector regression with a data-driven insensitivity parameter

Title A working likelihood approach to support vector regression with a data-driven insensitivity parameter
Authors Jinran Wu, You-Gan Wang
Abstract The insensitivity parameter in support vector regression determines the set of support vectors and thus greatly impacts prediction. A data-driven approach is proposed to determine an approximate value for this insensitivity parameter by minimizing a generalized loss function originating from the likelihood principle. This data-driven support vector regression also statistically standardizes samples using the scale of the noise. Nonlinear and linear numerical simulations with three types of noise ($\epsilon$-Laplacian, normal, and uniform distributions), together with five real benchmark data sets, are used to test the capacity of the proposed method. Across all of the simulations and the five case studies, the proposed support vector regression with a working-likelihood, data-driven insensitivity parameter is superior and has lower computational costs.
Tasks
Published 2020-03-09
URL https://arxiv.org/abs/2003.03893v1
PDF https://arxiv.org/pdf/2003.03893v1.pdf
PWC https://paperswithcode.com/paper/a-working-likelihood-approach-to-support
Repo
Framework
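To make the likelihood-principle idea concrete, here is a toy sketch of choosing ε from pilot residuals. It assumes (my assumption, not the paper's exact formulation) the ε-insensitive Laplace density p(r) ∝ exp(−max(|r|−ε, 0)/σ), whose normalizer 2(ε+σ) penalizes an overly wide insensitive zone, so the negative log-likelihood has an interior minimum:

```python
import math
import random
import statistics

random.seed(1)
# Pilot residuals from an initial SVR fit; here simulated as uniform noise on [-1, 1].
residuals = [random.uniform(-1.0, 1.0) for _ in range(2000)]
sigma = statistics.pstdev(residuals)    # scale estimate used to standardize

def neg_log_lik(eps):
    # Working likelihood of the eps-insensitive Laplace density
    # p(r) = exp(-max(|r| - eps, 0) / sigma) / (2 * (eps + sigma)).
    avg = sum(max(abs(r) - eps, 0.0) for r in residuals) / len(residuals)
    return avg / sigma + math.log(2.0 * (eps + sigma))

grid = [i / 100.0 for i in range(0, 101)]
best_eps = min(grid, key=neg_log_lik)   # data-driven insensitivity parameter
```

For uniform noise on [−1, 1] the stationarity condition gives ε ≈ 1 − σ ≈ 0.42, so the grid search lands well away from both ε = 0 and the boundary; the selected ε would then be plugged into a standard SVR fit.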

Planning with Brain-inspired AI

Title Planning with Brain-inspired AI
Authors Naoya Arakawa
Abstract This article surveys engineering and neuroscientific models of planning as a cognitive function, which is regarded as a typical function of fluid intelligence in the discussion of general intelligence. It aims to present existing planning models as references for realizing the planning function in brain-inspired AI or artificial general intelligence (AGI). It also proposes themes for the research and development of brain-inspired AI from the viewpoint of tasks and architecture.
Tasks
Published 2020-03-25
URL https://arxiv.org/abs/2003.12353v1
PDF https://arxiv.org/pdf/2003.12353v1.pdf
PWC https://paperswithcode.com/paper/planning-with-brain-inspired-ai
Repo
Framework

Causal Inference With Selectively-Deconfounded Data

Title Causal Inference With Selectively-Deconfounded Data
Authors Kyra Gan, Andrew A. Li, Zachary C. Lipton, Sridhar Tayur
Abstract Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data; (b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporating a (large) confounded observational dataset alongside a (small) deconfounded observational dataset when estimating the ATE. Our theoretical results show that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level. Moreover, in some cases—say, genetics—we could imagine retrospectively selecting samples to deconfound. We demonstrate that by strategically selecting these examples based upon the (already observed) treatment and outcome, we can reduce our data dependence further. Our theoretical and empirical results establish that the worst-case relative performance of our approach (vs. a natural benchmark) is bounded while our best-case gains are unbounded. Next, we demonstrate the benefits of selective deconfounding using a large real-world dataset related to genetic mutation in cancer. Finally, we introduce an online version of the problem, proposing two adaptive heuristics.
Tasks Causal Inference
Published 2020-02-25
URL https://arxiv.org/abs/2002.11096v1
PDF https://arxiv.org/pdf/2002.11096v1.pdf
PWC https://paperswithcode.com/paper/causal-inference-with-selectively
Repo
Framework
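The key identity behind combining the two datasets is that P(y|t,u) can be recovered via Bayes' rule from P(u|t,y) (estimable from the small deconfounded sample) and P(y|t) (estimable precisely from the large confounded sample). A minimal binary-variable simulation of that decomposition, with a hypothetical data-generating process of my own construction (true ATE = 0.3):

```python
import random

random.seed(2)

def simulate(n, keep_u):
    rows = []
    for _ in range(n):
        u = random.randint(0, 1)                              # unobserved confounder
        t = 1 if random.random() < (0.8 if u else 0.2) else 0
        y = 1 if random.random() < 0.2 + 0.3 * t + 0.4 * u else 0
        rows.append((u, t, y) if keep_u else (t, y))
    return rows

conf = simulate(50000, keep_u=False)     # large confounded dataset: (t, y) only
deconf = simulate(500, keep_u=True)      # small deconfounded dataset: (u, t, y)

# Naive difference in means on the confounded data (biased by the confounder).
y1 = [y for t, y in conf if t == 1]
y0 = [y for t, y in conf if t == 0]
naive = sum(y1) / len(y1) - sum(y0) / len(y0)

def p_y_given_t(y, t):
    rows = [yy for tt, yy in conf if tt == t]
    return sum(1 for yy in rows if yy == y) / len(rows)

def p_u_given_ty(u, t, y):
    rows = [uu for uu, tt, yy in deconf if tt == t and yy == y]
    return sum(1 for uu in rows if uu == u) / max(len(rows), 1)

p_u1 = sum(u for u, _, _ in deconf) / len(deconf)

def p_y_given_tu(y, t, u):
    # Bayes: P(y|t,u) = P(u|t,y) P(y|t) / sum_y' P(u|t,y') P(y'|t)
    num = p_u_given_ty(u, t, y) * p_y_given_t(y, t)
    den = sum(p_u_given_ty(u, t, yy) * p_y_given_t(yy, t) for yy in (0, 1))
    return num / den

# Backdoor adjustment with the combined pieces.
combined = sum(pu * (p_y_given_tu(1, 1, u) - p_y_given_tu(1, 0, u))
               for u, pu in ((0, 1 - p_u1), (1, p_u1)))
```

The naive estimate lands near 0.54 while the combined estimator recovers the true effect to within the sampling noise of the small deconfounded set, which is the efficiency gain the paper quantifies.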

Learning to Learn Single Domain Generalization

Title Learning to Learn Single Domain Generalization
Authors Fengchun Qiao, Long Zhao, Xi Peng
Abstract We are concerned with a worst-case scenario in model generalization, in the sense that a model aims to perform well on many unseen domains while there is only one single domain available for training. We propose a new method named adversarial domain augmentation to solve this Out-of-Distribution (OOD) generalization problem. The key idea is to leverage adversarial training to create “fictitious” yet “challenging” populations, from which a model can learn to generalize with theoretical guarantees. To facilitate fast and desirable domain augmentation, we cast the model training in a meta-learning scheme and use a Wasserstein Auto-Encoder (WAE) to relax the widely used worst-case constraint. Detailed theoretical analysis is provided to support our formulation, while extensive experiments on multiple benchmark datasets indicate its superior performance in tackling single domain generalization.
Tasks Domain Generalization, Meta-Learning
Published 2020-03-30
URL https://arxiv.org/abs/2003.13216v1
PDF https://arxiv.org/pdf/2003.13216v1.pdf
PWC https://paperswithcode.com/paper/learning-to-learn-single-domain
Repo
Framework
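The adversarial-augmentation core is easy to isolate: perturb a source-domain input along the gradient of the loss with respect to the input, producing a harder "fictitious" sample. A minimal FGSM-style sketch on a linear regressor (the paper's full method adds the WAE relaxation and meta-learning, which are omitted here):

```python
import random

random.seed(1)
# Toy linear model y_hat = w . x, with weights assumed already trained
# on the single source domain.
w = [0.7, -1.2, 0.4]
x = [random.gauss(0, 1) for _ in range(3)]
y = 0.5

def loss(w, x, y):
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    return r * r

def fgsm_augment(w, x, y, eps=0.1):
    # Gradient of the squared loss w.r.t. the *input* is 2*r*w; stepping x
    # along its sign yields a harder sample from a nearby fictitious domain.
    r = sum(wi * xi for wi, xi in zip(w, x)) - y
    sign = lambda v: 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
    return [xi + eps * sign(2 * r * wi) for wi, xi in zip(w, x)]

x_adv = fgsm_augment(w, x, y)
```

For a linear model the perturbed residual grows by exactly eps·‖w‖₁, so the augmented sample is strictly harder; training would then continue on a mix of original and augmented samples.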

Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning

Title Efficient Distance Approximation for Structured High-Dimensional Distributions via Learning
Authors Arnab Bhattacharyya, Sutanu Gayen, Kuldeep S. Meel, N. V. Vinodchandran
Abstract We design efficient distance approximation algorithms for several classes of structured high-dimensional distributions. Specifically, we show algorithms for the following problems:
- Given sample access to two Bayesian networks $P_1$ and $P_2$ over known directed acyclic graphs $G_1$ and $G_2$ having $n$ nodes and bounded in-degree, approximate $d_{tv}(P_1,P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given sample access to two ferromagnetic Ising models $P_1$ and $P_2$ on $n$ variables with bounded width, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given sample access to two $n$-dimensional Gaussians $P_1$ and $P_2$, approximate $d_{tv}(P_1, P_2)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples and time
- Given access to observations from two causal models $P$ and $Q$ on $n$ variables that are defined over known causal graphs, approximate $d_{tv}(P_a, Q_a)$ to within additive error $\epsilon$ using $poly(n,\epsilon)$ samples, where $P_a$ and $Q_a$ are the interventional distributions obtained by the intervention $do(A=a)$ on $P$ and $Q$ respectively for a particular variable $A$.
Our results are the first efficient distance approximation algorithms for these well-studied problems. They are derived using a simple and general connection to distribution learning algorithms. The distance approximation algorithms imply new efficient algorithms for {\em tolerant} testing of closeness of the above-mentioned structured high-dimensional distributions.
Tasks
Published 2020-02-13
URL https://arxiv.org/abs/2002.05378v2
PDF https://arxiv.org/pdf/2002.05378v2.pdf
PWC https://paperswithcode.com/paper/efficient-distance-approximation-for
Repo
Framework
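The learn-then-compute pattern the paper exploits can be shown in one dimension: learn the parameters of each distribution from samples, then compute the total variation distance d_tv = ½∫|p₁ − p₂| between the learned densities numerically. A sketch for two unit-variance Gaussians (the 1D setting and the Riemann-sum integration are simplifications for illustration):

```python
import math
import random

random.seed(2)

def gauss_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fit_gaussian(samples):
    # "Learning" step: estimate (mu, sigma) from samples.
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    return mu, math.sqrt(var)

def tv_distance(p1, p2, lo=-10.0, hi=10.0, steps=20000):
    # d_tv = 0.5 * integral of |p1 - p2|, approximated by a Riemann sum.
    h = (hi - lo) / steps
    return 0.5 * sum(abs(p1(lo + i * h) - p2(lo + i * h)) for i in range(steps)) * h

s1 = [random.gauss(0.0, 1.0) for _ in range(20000)]
s2 = [random.gauss(2.0, 1.0) for _ in range(20000)]
m1, sd1 = fit_gaussian(s1)
m2, sd2 = fit_gaussian(s2)
est = tv_distance(lambda x: gauss_pdf(x, m1, sd1),
                  lambda x: gauss_pdf(x, m2, sd2))
```

For N(0,1) vs N(2,1) the exact value is 2Φ(1) − 1 = erf(1/√2) ≈ 0.683, and the estimate lands within sampling error of it; the paper's contribution is making this pipeline sample- and time-efficient for structured high-dimensional families.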

GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition

Title GTC: Guided Training of CTC Towards Efficient and Accurate Scene Text Recognition
Authors Wenyang Hu, Xiaocong Cai, Jun Hou, Shuai Yi, Zhiping Lin
Abstract Connectionist Temporal Classification (CTC) and attention mechanism are two main approaches used in recent scene text recognition works. Compared with attention-based methods, the CTC decoder has a much shorter inference time, yet a lower accuracy. To design an efficient and effective model, we propose the guided training of CTC (GTC), where the CTC model learns better alignments and feature representations from a more powerful attentional guidance. With the benefit of guided training, the CTC model achieves robust and accurate prediction for both regular and irregular scene text while maintaining a fast inference speed. Moreover, to further leverage the potential of the CTC decoder, a graph convolutional network (GCN) is proposed to learn the local correlations of extracted features. Extensive experiments on standard benchmarks demonstrate that our end-to-end model achieves a new state-of-the-art for regular and irregular scene text recognition and requires 6 times less inference time than attention-based methods.
Tasks Scene Text Recognition
Published 2020-02-04
URL https://arxiv.org/abs/2002.01276v1
PDF https://arxiv.org/pdf/2002.01276v1.pdf
PWC https://paperswithcode.com/paper/gtc-guided-training-of-ctc-towards-efficient
Repo
Framework

Detection and Mitigation of Bias in Ted Talk Ratings

Title Detection and Mitigation of Bias in Ted Talk Ratings
Authors Rupam Acharyya, Shouman Das, Ankani Chattoraj, Oishani Sengupta, Md Iftekar Tanveer
Abstract Unbiased data collection is essential to guaranteeing fairness in artificial intelligence models. Implicit bias, a form of behavioral conditioning that leads us to attribute predetermined characteristics to members of certain groups, informs the data collection process. This paper quantifies implicit bias in viewer ratings of TED Talks, a diverse social platform for assessing social and professional performance, in order to present the correlations of different kinds of bias across sensitive attributes. Although the viewer ratings of these videos should purely reflect the speaker’s competence and skill, our analysis of the ratings demonstrates the presence of overwhelming and predominant implicit bias with respect to race and gender. In our paper, we present strategies to detect and mitigate bias that are critical to removing unfairness in AI.
Tasks
Published 2020-03-02
URL https://arxiv.org/abs/2003.00683v1
PDF https://arxiv.org/pdf/2003.00683v1.pdf
PWC https://paperswithcode.com/paper/detection-and-mitigation-of-bias-in-ted-talk
Repo
Framework

Better Classifier Calibration for Small Data Sets

Title Better Classifier Calibration for Small Data Sets
Authors Tuomo Alasalmi, Jaakko Suutala, Heli Koskimäki, Juha Röning
Abstract Classifier calibration does not always go hand in hand with the classifier’s ability to separate the classes. There are applications where good classifier calibration, i.e. the ability to produce accurate probability estimates, is more important than class separation. When the amount of data for training is limited, the traditional approach to improving calibration starts to crumble. In this article we show how generating more data for calibration can improve calibration algorithm performance in many cases where a classifier is not naturally producing well-calibrated outputs and the traditional approach fails. The proposed approach adds computational cost, but given that the main use case is small data sets, this extra cost remains insignificant, and prediction time is comparable to that of other methods. Of the tested classifiers, the largest improvements were observed with the random forest and naive Bayes classifiers. Therefore, the proposed approach can be recommended at least for those classifiers when the amount of data available for training is limited and good calibration is essential.
Tasks Calibration
Published 2020-02-24
URL https://arxiv.org/abs/2002.10199v1
PDF https://arxiv.org/pdf/2002.10199v1.pdf
PWC https://paperswithcode.com/paper/better-classifier-calibration-for-small-data
Repo
Framework
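The generate-more-calibration-data idea can be illustrated with the simplest calibrator, Platt scaling. In this sketch the "data generation" step is a naive stand-in of my own (jitter the scores, keep the labels), not the paper's method, and the miscalibrated classifier is simulated by scores s whose true positive-class probability is s²:

```python
import math
import random

random.seed(3)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

# Simulated classifier: its raw score s is miscalibrated -- the true
# probability of the positive class is s**2, not s.
def sample(n):
    out = []
    for _ in range(n):
        s = random.random()
        out.append((s, 1 if random.random() < s * s else 0))
    return out

cal = sample(80)                          # small calibration set
# Data-generation step (a simple stand-in for the paper's approach):
# jitter the scores, keep the labels.
aug = cal + [(min(max(s + random.gauss(0, 0.05), 0.0), 1.0), y)
             for s, y in cal for _ in range(10)]

# Platt-style scaling: fit p = sigmoid(a*s + b) by gradient descent on log-loss.
a, b = 1.0, 0.0
for _ in range(3000):
    ga = gb = 0.0
    for s, y in aug:
        e = sig(a * s + b) - y
        ga += e * s
        gb += e
    a -= 2.0 * ga / len(aug)
    b -= 2.0 * gb / len(aug)

def log_loss(pred, data):
    eps = 1e-12
    return -sum(y * math.log(max(pred(s), eps)) +
                (1 - y) * math.log(max(1.0 - pred(s), eps))
                for s, y in data) / len(data)

test_set = sample(5000)
raw = log_loss(lambda s: s, test_set)          # use raw score as a probability
calibrated = log_loss(lambda s: sig(a * s + b), test_set)
```

On held-out data the calibrated probabilities achieve a lower log-loss than the raw scores, even though the calibrator saw only 80 genuine labels.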

Graph Prolongation Convolutional Networks: Explicitly Multiscale Machine Learning on Graphs, with Applications to Modeling of Biological Systems

Title Graph Prolongation Convolutional Networks: Explicitly Multiscale Machine Learning on Graphs, with Applications to Modeling of Biological Systems
Authors C. B. Scott, Eric Mjolsness
Abstract We define a novel type of ensemble Graph Convolutional Network (GCN) model. Using optimized linear projection operators to map between the spatial scales of a graph, this ensemble model learns to aggregate information from each scale for its final prediction. We calculate these linear projection operators as the infima of an objective function relating the structure matrices used for each GCN. Equipped with these projections, our model (a Graph Prolongation-Convolutional Network) outperforms other GCN ensemble models at predicting the potential energy of monomer subunits in a coarse-grained mechanochemical simulation of microtubule bending. We demonstrate these performance gains by measuring an estimate of the FLOPs spent to train each model, as well as wall-clock time. Because our model learns at multiple scales, it is possible to train at each scale according to a predetermined schedule of coarse vs. fine training. We examine several such schedules adapted from the Algebraic Multigrid (AMG) literature, and quantify the computational benefit of each. Finally, we demonstrate how under certain assumptions, our graph prolongation layers may be decomposed into a matrix outer product of smaller GCN operations.
Tasks
Published 2020-02-14
URL https://arxiv.org/abs/2002.05842v1
PDF https://arxiv.org/pdf/2002.05842v1.pdf
PWC https://paperswithcode.com/paper/graph-prolongation-convolutional-networks
Repo
Framework
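The coarse↔fine mapping at the heart of the model is just a linear prolongation operator and its averaging adjoint. The paper learns these operators by optimization; the hand-built cluster-assignment matrix below is only a toy to show what prolongation and restriction do on a 4-node graph with 2 super-nodes:

```python
# Fine graph: 4 nodes; coarse graph: 2 super-nodes (clusters {0,1} and {2,3}).
P = [[1, 0], [1, 0], [0, 1], [0, 1]]   # prolongation matrix: coarse -> fine (4x2)

def prolong(u):
    # Copy each coarse value onto the fine nodes of its cluster: x = P u.
    return [sum(P[i][j] * u[j] for j in range(2)) for i in range(4)]

def restrict(x):
    # Averaging adjoint: mean of the fine values inside each cluster.
    return [sum(P[i][j] * x[i] for i in range(4)) /
            sum(P[i][j] for i in range(4)) for j in range(2)]

x_fine = [1.0, 3.0, 10.0, 14.0]
u_coarse = restrict(x_fine)     # cluster means
x_back = prolong(u_coarse)      # piecewise-constant approximation on the fine graph
```

Restrict-then-prolong replaces each fine signal with its cluster mean; a GCN running on the coarse graph sees `u_coarse`, and its output is prolonged back and combined with the fine-scale prediction.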

Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation

Title Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation
Authors Lukas J. Fiderer, Jonas Schuff, Daniel Braun
Abstract Quantum metrology promises unprecedented measurement precision but suffers in practice from the limited availability of resources such as the number of probes, their coherence time, or non-classical quantum states. The adaptive Bayesian approach to parameter estimation allows for an efficient use of resources thanks to adaptive experiment design. For its practical success, fast numerical solutions for the Bayesian update and the adaptive experiment design are crucial. Here we show that neural networks can be trained to become fast and strong experiment-design heuristics using a combination of an evolutionary strategy and reinforcement learning. Neural-network heuristics are shown to outperform established heuristics for the technologically important example of frequency estimation of a qubit that suffers from dephasing. Our method of creating neural-network heuristics is very general and complements the well-studied sequential Monte-Carlo method for Bayesian updates to form a complete framework for adaptive Bayesian quantum estimation.
Tasks
Published 2020-03-04
URL https://arxiv.org/abs/2003.02183v1
PDF https://arxiv.org/pdf/2003.02183v1.pdf
PWC https://paperswithcode.com/paper/neural-network-heuristics-for-adaptive
Repo
Framework
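The sequential Monte-Carlo backbone the heuristics plug into is a particle-filter Bayesian update. A minimal sketch for qubit frequency estimation with P(outcome 0) = cos²(ωt/2); the fixed measurement-time schedule stands in for the experiment design (which the paper replaces with a trained network), and resampling of degenerate particles is omitted for brevity:

```python
import math
import random

random.seed(4)
true_omega = 0.7
n_particles = 2000
particles = [random.uniform(0.0, 1.0) for _ in range(n_particles)]  # uniform prior
weights = [1.0 / n_particles] * n_particles

def p0(omega, t):
    # Ramsey-type measurement: P(outcome 0) = cos^2(omega * t / 2).
    return math.cos(omega * t / 2) ** 2

for step in range(30):
    t = 1.0 + 0.5 * step                          # fixed "experiment design"
    outcome = 0 if random.random() < p0(true_omega, t) else 1
    # Bayesian update: reweight each particle by the outcome likelihood.
    for i, om in enumerate(particles):
        lik = p0(om, t) if outcome == 0 else 1.0 - p0(om, t)
        weights[i] *= lik
    z = sum(weights)
    weights = [w / z for w in weights]            # renormalize each step

est = sum(w * om for w, om in zip(weights, particles))  # posterior mean
```

After 30 simulated measurements the posterior mean concentrates near the true frequency; the paper's neural heuristic chooses each `t` adaptively from the current particle cloud instead of following a schedule.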

Adaptive Online Learning with Varying Norms

Title Adaptive Online Learning with Varying Norms
Authors Ashok Cutkosky
Abstract Given any increasing sequence of norms $\|\cdot\|_0,\dots,\|\cdot\|_{T-1}$, we provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$ in response to convex losses $\ell_t:W\to \mathbb{R}$ that guarantees regret $R_T(u)=\sum_{t=1}^T \ell_t(w_t)-\ell_t(u)\le \tilde O\left(\|u\|_{T-1}\sqrt{\sum_{t=1}^T \|g_t\|_{t-1,\star}^2}\right)$ where $g_t$ is a subgradient of $\ell_t$ at $w_t$. Our method does not require tuning to the value of $u$ and allows for arbitrary convex $W$. We apply this result to obtain new “full-matrix”-style regret bounds. Along the way, we provide a new examination of the full-matrix AdaGrad algorithm, suggesting a better learning rate value that improves significantly upon prior analysis. We use our new techniques to tune AdaGrad on-the-fly, realizing our improved bound in a concrete algorithm.
Tasks
Published 2020-02-10
URL https://arxiv.org/abs/2002.03963v1
PDF https://arxiv.org/pdf/2002.03963v1.pdf
PWC https://paperswithcode.com/paper/adaptive-online-learning-with-varying-norms
Repo
Framework
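For readers less familiar with AdaGrad, the per-coordinate adaptive step the paper builds on looks like this. The sketch is standard diagonal AdaGrad on a toy quadratic, not the paper's full-matrix variant or its varying-norm algorithm:

```python
import math

# Minimize f(w) = 0.5 * (3*w0^2 + 0.5*w1^2) with diagonal AdaGrad.
w = [5.0, -5.0]
g2 = [0.0, 0.0]                 # running sum of squared gradients, per coordinate
eta, eps = 1.0, 1e-8

def grad(w):
    return [3.0 * w[0], 0.5 * w[1]]

f = lambda w: 0.5 * (3.0 * w[0] ** 2 + 0.5 * w[1] ** 2)
f0 = f(w)
for _ in range(500):
    g = grad(w)
    for i in range(2):
        g2[i] += g[i] * g[i]
        # Effective learning rate eta / sqrt(sum of squared gradients) shrinks
        # faster on the steeper coordinate -- the adaptivity AdaGrad provides.
        w[i] -= eta * g[i] / (math.sqrt(g2[i]) + eps)
```

The badly scaled coordinates receive automatically scaled step sizes, so a single `eta` works for both; the paper's analysis sharpens how that `eta` should be chosen and extends the idea to sequences of norms.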

On the role of surrogates in the efficient estimation of treatment effects with limited outcome data

Title On the role of surrogates in the efficient estimation of treatment effects with limited outcome data
Authors Nathan Kallus, Xiaojie Mao
Abstract We study the problem of estimating treatment effects when the outcome of primary interest (e.g., long-term health status) is only seldom observed but abundant surrogate observations (e.g., short-term health outcomes) are available. To investigate the role of surrogates in this setting, we derive the semiparametric efficiency lower bounds of the average treatment effect (ATE) both with and without surrogates, as well as in several intermediary settings. These bounds characterize the best-possible precision of ATE estimation in each case, and their difference quantifies the efficiency gains from optimally leveraging the surrogates in terms of key problem characteristics when only limited outcome data are available. We show these results apply in two important regimes: when the number of surrogate observations is comparable to primary-outcome observations and when the former dominates the latter. Importantly, we take a missing-data approach that circumvents strong surrogate conditions which are commonly assumed in previous literature but almost always fail in practice. To show how to leverage the efficiency gains of surrogate observations, we propose ATE estimators and inferential methods based on flexible machine learning methods to estimate nuisance parameters that appear in the influence functions. We show our estimators enjoy efficiency and robustness guarantees under weak conditions.
Tasks
Published 2020-03-27
URL https://arxiv.org/abs/2003.12408v1
PDF https://arxiv.org/pdf/2003.12408v1.pdf
PWC https://paperswithcode.com/paper/on-the-role-of-surrogates-in-the-efficient
Repo
Framework
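To fix intuition for how surrogates carry information about a rarely observed primary outcome, here is a bare imputation sketch under strong assumptions of my own (randomized treatment, a linear outcome–surrogate relation, outcomes missing at random); the paper's estimators are far more general, using influence functions and flexible ML nuisance estimates:

```python
import random

random.seed(5)
n, n_lab = 20000, 500
data = []
for i in range(n):
    t = random.randint(0, 1)                  # randomized treatment
    s = 0.5 * t + random.gauss(0, 1)          # surrogate, observed for everyone
    y = 2.0 * s + random.gauss(0, 1)          # primary outcome (true ATE = 1.0)
    data.append((t, s, y if i < n_lab else None))  # y observed only for a few

# Fit y ~ a*s + b by least squares on the small labeled subset.
lab = [(s, y) for t, s, y in data if y is not None]
ms = sum(s for s, _ in lab) / len(lab)
my = sum(y for _, y in lab) / len(lab)
a = (sum((s - ms) * (y - my) for s, y in lab) /
     sum((s - ms) ** 2 for s, _ in lab))
b = my - a * ms

# Impute the primary outcome from the surrogate everywhere, then take
# the difference in means between treatment arms.
t1 = [a * s + b for t, s, _ in data if t == 1]
t0 = [a * s + b for t, s, _ in data if t == 0]
ate = sum(t1) / len(t1) - sum(t0) / len(t0)
```

With only 500 primary outcomes out of 20,000 units, the imputed estimator recovers the true ATE of 1.0 to within sampling error, because the abundant surrogates pin down the arm means precisely.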