Paper Group ANR 786
Actively Avoiding Nonsense in Generative Models. Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders. Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences. Inference in Graded Bayesian Networks. An interpretable multiple kernel learning approach for the discovery of integrative cancer subtype …
Actively Avoiding Nonsense in Generative Models
Title | Actively Avoiding Nonsense in Generative Models |
Authors | Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos |
Abstract | A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data. This happens due to “model error,” i.e., when the true data-generating distribution does not fit within the class of generative models being learned. To address this, we propose a model of active distribution learning using a binary invalidity oracle that identifies some examples as clearly invalid, together with random positive examples sampled from the true distribution. The goal is to maximize the likelihood of the positive examples subject to the constraint of (almost) never generating examples labeled invalid by the oracle. Guarantees are agnostic, relative to a class of probability distributions. We show that, while proper learning often requires exponentially many queries to the invalidity oracle, improper distribution learning can be done using polynomially many queries. |
Tasks | |
Published | 2018-02-20 |
URL | http://arxiv.org/abs/1802.07229v1 |
http://arxiv.org/pdf/1802.07229v1.pdf | |
PWC | https://paperswithcode.com/paper/actively-avoiding-nonsense-in-generative |
Repo | |
Framework | |
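To make the query model concrete, here is a minimal sketch under toy assumptions: the hypothesis class is a small grid of Gaussians, and `invalidity_oracle` is a hypothetical stand-in for the paper's binary oracle. A candidate is kept only if it (empirically) almost never generates invalid examples, and among the survivors the one maximizing the likelihood of the positive samples wins. This illustrates the protocol, not the paper's actual algorithm or guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

def invalidity_oracle(x):
    """Hypothetical oracle: points outside [0, 10] are clearly invalid."""
    return x < 0 or x > 10

# Random positive examples drawn from the (unknown) true distribution.
positives = rng.normal(loc=3.0, scale=1.0, size=200)

# Hypothesis class: Gaussians on a fixed grid of means and scales.
candidates = [(mu, sigma) for mu in np.linspace(0, 6, 13) for sigma in (0.5, 1.0, 2.0)]

def log_likelihood(mu, sigma, xs):
    return np.sum(-0.5 * ((xs - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi)))

best, best_ll = None, -np.inf
for mu, sigma in candidates:
    draws = rng.normal(mu, sigma, size=500)
    # Query the oracle on samples the candidate would generate and discard
    # candidates that generate clearly invalid examples too often.
    invalid_rate = np.mean([invalidity_oracle(x) for x in draws])
    if invalid_rate > 0.01:
        continue
    ll = log_likelihood(mu, sigma, positives)
    if ll > best_ll:
        best, best_ll = (mu, sigma), ll

print("selected (mu, sigma):", best)
```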
Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders
Title | Resisting Adversarial Attacks using Gaussian Mixture Variational Autoencoders |
Authors | Partha Ghosh, Arpan Losalka, Michael J Black |
Abstract | Susceptibility of deep neural networks to adversarial attacks poses a major theoretical and practical challenge. All efforts to harden classifiers against such attacks have seen limited success. Two distinct categories of samples to which deep networks are vulnerable, “adversarial samples” and “fooling samples”, have been tackled separately so far due to the difficulty posed when considered together. In this work, we show how one can address them both under one unified framework. We tie a discriminative model to a generative model so that the adversarial objective entails a conflict. Our model has the form of a variational autoencoder with a Gaussian mixture prior on the latent vector, where each mixture component corresponds to one of the classes in the data. This enables us to perform selective classification, leading to the rejection of adversarial samples instead of their misclassification. Our method inherently provides a way of learning a selective classifier in a semi-supervised scenario as well, which can resist adversarial attacks. We also show how one can reclassify the rejected adversarial samples. |
Tasks | |
Published | 2018-05-31 |
URL | http://arxiv.org/abs/1806.00081v2 |
http://arxiv.org/pdf/1806.00081v2.pdf | |
PWC | https://paperswithcode.com/paper/resisting-adversarial-attacks-using-gaussian |
Repo | |
Framework | |
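A minimal PyTorch sketch (not the authors' code) of the core mechanism: a VAE encoder whose latent prior is a mixture of per-class Gaussians, with classification by nearest mixture mean and rejection when that distance exceeds a threshold. Layer sizes, the threshold, and the fixed random class means are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GMVAE(nn.Module):
    def __init__(self, in_dim=784, latent_dim=8, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu_head = nn.Linear(256, latent_dim)
        self.logvar_head = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim), nn.Sigmoid())
        # One prior mean per class, kept well separated in latent space.
        self.class_means = nn.Parameter(torch.randn(n_classes, latent_dim) * 3)

    def encode(self, x):
        h = self.encoder(x)
        return self.mu_head(h), self.logvar_head(h)

    def selective_classify(self, x, reject_threshold=4.0):
        mu, _ = self.encode(x)
        d = torch.cdist(mu, self.class_means)   # distance to each class mean
        dist, label = d.min(dim=1)
        label[dist > reject_threshold] = -1     # -1 means "reject the sample"
        return label

model = GMVAE()
x = torch.rand(5, 784)
print(model.selective_classify(x))   # untrained, so most inputs are rejected
```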
Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences
Title | Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences |
Authors | Chen Qin, Wenjia Bai, Jo Schlemper, Steffen E. Petersen, Stefan K. Piechnik, Stefan Neubauer, Daniel Rueckert |
Abstract | Cardiac motion estimation and segmentation play important roles in quantitatively assessing cardiac function and diagnosing cardiovascular diseases. In this paper, we propose a novel deep learning method for joint estimation of motion and segmentation from cardiac MR image sequences. The proposed network consists of two branches: a cardiac motion estimation branch built on a novel unsupervised Siamese-style recurrent spatial transformer network, and a cardiac segmentation branch based on a fully convolutional network. In particular, a joint multi-scale feature encoder is learned by optimizing the segmentation branch and the motion estimation branch simultaneously. This enables weakly-supervised segmentation by taking advantage of features learned without supervision in the motion estimation branch from a large amount of unannotated data. Experimental results using cardiac MR images from 220 subjects show that the joint learning of both tasks is complementary and that the proposed models significantly outperform the competing methods in terms of accuracy and speed. |
Tasks | Cardiac Segmentation, Motion Estimation |
Published | 2018-06-11 |
URL | http://arxiv.org/abs/1806.04066v1 |
http://arxiv.org/pdf/1806.04066v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-learning-of-motion-estimation-and |
Repo | |
Framework | |
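A schematic PyTorch sketch of the joint design described above: a shared feature encoder feeding two branches, one predicting a dense motion field between a pair of frames and one predicting a segmentation map. Layer sizes and the two-head layout are illustrative, not the paper's exact Siamese recurrent spatial-transformer architecture.

```python
import torch
import torch.nn as nn

class JointNet(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(            # shared feature encoder
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.motion_head = nn.Conv2d(128, 2, 3, padding=1)  # 2-channel flow (dx, dy)
        self.seg_head = nn.Conv2d(64, n_classes, 3, padding=1)

    def forward(self, frame_t, frame_t1):
        f_t, f_t1 = self.encoder(frame_t), self.encoder(frame_t1)
        flow = self.motion_head(torch.cat([f_t, f_t1], dim=1))  # motion branch sees both frames
        seg = self.seg_head(f_t)                                # segmentation branch
        return flow, seg

net = JointNet()
a, b = torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)
flow, seg = net(a, b)
print(flow.shape, seg.shape)  # (2, 2, 64, 64) (2, 4, 64, 64)
```

Training both heads against one summed loss is what lets the unannotated frame pairs shape the features the segmentation branch reuses.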
Inference in Graded Bayesian Networks
Title | Inference in Graded Bayesian Networks |
Authors | Robert Leppert, Karl-Heinz Zimmermann |
Abstract | Machine learning provides algorithms that can learn from data and make inferences or predictions on data. Bayesian networks are a class of graphical models that represent a collection of random variables and their conditional dependencies by directed acyclic graphs. In this paper, an inference algorithm for the hidden random variables of a Bayesian network is given by using the tropicalization of the marginal distribution of the observed variables. By restricting the topological structure to graded networks, we establish an inference algorithm that evaluates the hidden random variables rank by rank and in this way yields the most probable states of the hidden variables. This algorithm can be viewed as a generalized version of the Viterbi algorithm for graded Bayesian networks. |
Tasks | |
Published | 2018-12-23 |
URL | http://arxiv.org/abs/1901.01837v1 |
http://arxiv.org/pdf/1901.01837v1.pdf | |
PWC | https://paperswithcode.com/paper/inference-in-graded-bayesian-networks |
Repo | |
Framework | |
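The tropicalization mentioned in the abstract replaces (sum, product) over probabilities with (max, +) over log-probabilities. As a concrete special case, the sketch below runs Viterbi on a chain-structured (HMM-like) network in this tropical semiring; the paper's algorithm generalizes the same rank-by-rank evaluation to graded DAGs. All probabilities here are illustrative.

```python
import numpy as np

log_init = np.log(np.array([0.6, 0.4]))            # P(h_0)
log_trans = np.log(np.array([[0.7, 0.3],           # P(h_t | h_{t-1})
                             [0.4, 0.6]]))
log_emit = np.log(np.array([[0.9, 0.1],            # P(obs | h)
                            [0.2, 0.8]]))
obs = [0, 1, 1, 0]

# Tropical "marginalization": max replaces sum, + replaces *.
score = log_init + log_emit[:, obs[0]]
back = []
for o in obs[1:]:
    cand = score[:, None] + log_trans    # cand[i, j]: come from state i, go to j
    back.append(cand.argmax(axis=0))     # best predecessor for each state j
    score = cand.max(axis=0) + log_emit[:, o]

# Backtrack the most probable hidden states.
state = int(score.argmax())
path = [state]
for ptr in reversed(back):
    state = int(ptr[state])
    path.append(state)
print("most probable hidden states:", path[::-1])
```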
An interpretable multiple kernel learning approach for the discovery of integrative cancer subtypes
Title | An interpretable multiple kernel learning approach for the discovery of integrative cancer subtypes |
Authors | Nora K. Speicher, Nico Pfeifer |
Abstract | Due to the complexity of cancer, clustering algorithms have been used to disentangle the observed heterogeneity and identify cancer subtypes that can be treated specifically. While kernel-based clustering approaches allow the use of more than one input matrix, which is an important factor when considering a multidimensional disease like cancer, the clustering results remain hard to evaluate and, in many cases, it is unclear which piece of information had which impact on the final result. In this paper, we propose an extension of multiple kernel learning clustering that enables the characterization of each identified patient cluster based on the features that had the highest impact on the result. To this end, we combine feature clustering with multiple kernel dimensionality reduction and introduce FIPPA, a score which measures the feature cluster impact on a patient cluster. Results: We applied the approach to different cancer types described by four different data types with the aim of identifying integrative patient subtypes and understanding which features were most important for their identification. Our results show that our method not only has state-of-the-art performance according to standard measures (e.g., survival analysis) but, based on the high-impact features, also produces meaningful explanations for the molecular bases of the subtypes. This could provide an important step in the validation of potential cancer subtypes and enable the formulation of new hypotheses concerning individual patient groups. Similar analyses are possible for other disease phenotypes. |
Tasks | Dimensionality Reduction, Discovery Of Integrative Cancer Subtypes, Survival Analysis |
Published | 2018-11-20 |
URL | http://arxiv.org/abs/1811.08102v1 |
http://arxiv.org/pdf/1811.08102v1.pdf | |
PWC | https://paperswithcode.com/paper/an-interpretable-multiple-kernel-learning |
Repo | |
Framework | |
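A sketch of the multiple-kernel clustering backbone such an approach builds on: several data views are turned into kernels, combined with (here, uniform) weights, embedded via kernel PCA, and clustered. The FIPPA score itself is the paper's contribution and is not reproduced; the random data, uniform weights, and parameters below are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
views = [rng.normal(size=(100, 20)) for _ in range(3)]  # e.g. expression, methylation, ...
kernels = [rbf_kernel(v) for v in views]
weights = np.ones(len(kernels)) / len(kernels)          # uniform here; learned in MKL
K = sum(w * k for w, k in zip(weights, kernels))        # combined kernel

# Kernel PCA: project onto top eigenvectors of the centered kernel matrix.
n = K.shape[0]
H = np.eye(n) - np.ones((n, n)) / n
eigvals, eigvecs = np.linalg.eigh(H @ K @ H)            # ascending eigenvalues
embedding = eigvecs[:, -5:] * np.sqrt(np.maximum(eigvals[-5:], 0))

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(embedding)
print(np.bincount(labels))   # patient cluster sizes
```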
Recurrent Attention Unit
Title | Recurrent Attention Unit |
Authors | Guoqiang Zhong, Guohua Yue, Xiao Ling |
Abstract | Recurrent Neural Networks (RNNs) have been successfully applied to many sequence learning problems, such as handwriting recognition, image description, natural language processing, and video motion analysis. After years of development, researchers have improved the internal structure of the RNN and introduced many variants. Among these, the Gated Recurrent Unit (GRU) is one of the most widely used RNN models. However, the GRU lacks the capability of adaptively paying attention to certain regions or locations, which may cause information redundancy or loss during learning. In this paper, we propose an RNN model, called the Recurrent Attention Unit (RAU), which seamlessly integrates the attention mechanism into the interior of the GRU by adding an attention gate. The attention gate enhances the GRU’s ability to remember long-term dependencies and helps memory cells quickly discard unimportant content. RAU extracts information from sequential data by adaptively selecting a sequence of regions or locations and paying more attention to the selected regions during learning. Extensive experiments on image classification, sentiment classification and language modeling show that RAU consistently outperforms the GRU and other baseline methods. |
Tasks | Image Classification, Language Modelling, Sentiment Analysis |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12754v1 |
http://arxiv.org/pdf/1810.12754v1.pdf | |
PWC | https://paperswithcode.com/paper/recurrent-attention-unit |
Repo | |
Framework | |
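A schematic PyTorch cell in the spirit of the description above: a standard GRU cell augmented with an extra sigmoid "attention gate", computed from the input and the previous hidden state, that rescales how much of the candidate update is kept. The exact RAU equations are in the paper; this is an illustrative variant.

```python
import torch
import torch.nn as nn

class AttentionGRUCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gru = nn.GRUCell(input_size, hidden_size)
        self.attn_gate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h):
        # Attention gate: element-wise weights in (0, 1) from input + state.
        a = torch.sigmoid(self.attn_gate(torch.cat([x, h], dim=1)))
        h_new = self.gru(x, h)
        # The gate decides how much of the GRU update to keep per dimension.
        return a * h_new + (1 - a) * h

cell = AttentionGRUCell(16, 32)
h = torch.zeros(4, 32)
for t in range(10):               # unroll over a toy sequence
    h = cell(torch.rand(4, 16), h)
print(h.shape)
```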
Iterative Classroom Teaching
Title | Iterative Classroom Teaching |
Authors | Teresa Yeo, Parameswaran Kamalaruban, Adish Singla, Arpit Merchant, Thibault Asselborn, Louis Faucon, Pierre Dillenbourg, Volkan Cevher |
Abstract | We consider the machine teaching problem in a classroom-like setting wherein the teacher has to deliver the same examples to a diverse group of students. Their diversity stems from differences in their initial internal states as well as their learning rates. We prove that a teacher with full knowledge about the learning dynamics of the students can teach a target concept to the entire classroom using O(min{d,N} log(1/eps)) examples, where d is the ambient dimension of the problem, N is the number of learners, and eps is the accuracy parameter. We show the robustness of our teaching strategy when the teacher has limited knowledge of the learners’ internal dynamics as provided by a noisy oracle. Further, we study the trade-off between the learners’ workload and the teacher’s cost in teaching the target concept. Our experiments validate our theoretical results and suggest that appropriately partitioning the classroom into homogeneous groups provides a balance between these two objectives. |
Tasks | |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03537v2 |
http://arxiv.org/pdf/1811.03537v2.pdf | |
PWC | https://paperswithcode.com/paper/iterative-classroom-teaching |
Repo | |
Framework | |
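A toy sketch of classroom teaching for linear learners under the full-knowledge setting: the teacher tracks every learner's weight vector and learning rate and, each round, greedily picks the candidate example whose gradient step most reduces the classroom's average distance to the target concept. The pool, rates, and dimensions are illustrative, and greedy selection stands in for the paper's strategy.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_learners = 5, 8
w_star = rng.normal(size=d)                      # target concept
W = rng.normal(size=(n_learners, d))             # diverse initial states
rates = rng.uniform(0.05, 0.3, size=n_learners)  # diverse learning rates
pool = rng.normal(size=(200, d))                 # candidate teaching examples

def step(W, x, y):
    """Every learner takes its own squared-loss gradient step on (x, y)."""
    preds = W @ x
    return W - rates[:, None] * (preds - y)[:, None] * x

for _ in range(50):
    # Greedy teacher: simulate each candidate, keep the best one.
    best_x, best_err = None, np.inf
    for x in pool[rng.choice(len(pool), 20, replace=False)]:
        W_next = step(W, x, w_star @ x)          # label comes from the target concept
        err = np.mean(np.sum((W_next - w_star) ** 2, axis=1))
        if err < best_err:
            best_x, best_err = x, err
    W = step(W, best_x, w_star @ best_x)

print("avg distance to target:", np.mean(np.linalg.norm(W - w_star, axis=1)))
```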
Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
Title | Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes |
Authors | Chris Junchi Li, Zhaoran Wang, Han Liu |
Abstract | Solving statistical learning problems often involves nonconvex optimization. Despite the empirical success of nonconvex statistical optimization methods, their global dynamics, especially convergence to the desirable local minima, remain less well understood in theory. In this paper, we propose a new analytic paradigm based on diffusion processes to characterize the global dynamics of nonconvex statistical optimization. As a concrete example, we study stochastic gradient descent (SGD) for the tensor decomposition formulation of independent component analysis. In particular, we cast different phases of SGD into diffusion processes, i.e., solutions to stochastic differential equations. Initialized from an unstable equilibrium, the global dynamics of SGD transition through three consecutive phases: (i) an unstable Ornstein-Uhlenbeck process slowly departing from the initialization, (ii) the solution to an ordinary differential equation, which quickly evolves towards the desirable local minimum, and (iii) a stable Ornstein-Uhlenbeck process oscillating around the desirable local minimum. Our proof techniques are based upon Stroock and Varadhan’s weak convergence of Markov chains to diffusion processes, which are of independent interest. |
Tasks | |
Published | 2018-08-29 |
URL | http://arxiv.org/abs/1808.09642v1 |
http://arxiv.org/pdf/1808.09642v1.pdf | |
PWC | https://paperswithcode.com/paper/online-ica-understanding-global-dynamics-of |
Repo | |
Framework | |
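An illustrative version of the kind of nonconvex iteration the paper analyzes: online SGD maximizing the fourth moment of a projection over the unit sphere, which recovers one independent component for heavy-tailed sources. The mixing setup and step size are toy assumptions, not the paper's exact tensor-decomposition formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
A = np.linalg.qr(rng.normal(size=(d, d)))[0]   # orthogonal mixing matrix

def sample_x():
    s = rng.laplace(size=d)                    # heavy-tailed independent sources
    return A @ s

u = rng.normal(size=d)
u /= np.linalg.norm(u)                         # random start, near an unstable equilibrium
eta = 1e-3
for t in range(100_000):
    x = sample_x()
    g = 4 * (u @ x) ** 3 * x                   # stochastic gradient of E[(u^T x)^4]
    u = u + eta * g
    u /= np.linalg.norm(u)                     # project back to the unit sphere

# u should align with one column of A, i.e., one independent component.
print(np.round(np.abs(A.T @ u), 2))
```

The three phases described in the abstract are visible empirically: slow escape from the random start, rapid deterministic-looking progress, then noisy oscillation around the recovered component.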
Trading algorithms with learning in latent alpha models
Title | Trading algorithms with learning in latent alpha models |
Authors | Philippe Casgrain, Sebastian Jaimungal |
Abstract | Alpha signals for statistical arbitrage strategies are often driven by latent factors. This paper analyses how to optimally trade with latent factors that cause prices to jump and diffuse. Moreover, we account for the effect of the trader’s actions on quoted prices and the prices they receive from trading. Under fairly general assumptions, we demonstrate how the trader can learn the posterior distribution over the latent states, and explicitly solve the latent optimal trading problem. We provide a verification theorem, and a methodology for calibrating the model by deriving a variation of the expectation-maximization algorithm. To illustrate the efficacy of the optimal strategy, we demonstrate its performance through simulations and compare it to strategies which ignore learning in the latent factors. We also provide calibration results for a particular model using Intel Corporation stock as an example. |
Tasks | Calibration |
Published | 2018-06-12 |
URL | http://arxiv.org/abs/1806.04472v1 |
http://arxiv.org/pdf/1806.04472v1.pdf | |
PWC | https://paperswithcode.com/paper/trading-algorithms-with-learning-in-latent |
Repo | |
Framework | |
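A minimal sketch of the learning component: a Bayes filter over a discrete latent alpha regime, updated from observed price innovations, with a crude position signal that leans on the posterior-weighted drift. This illustrates filtering the latent states, not the paper's optimal trading strategy; all dynamics below are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
drifts = np.array([-0.05, 0.0, 0.05])   # per-regime price drift
P = np.array([[0.95, 0.04, 0.01],       # regime transition matrix
              [0.02, 0.96, 0.02],
              [0.01, 0.04, 0.95]])
sigma = 0.1
belief = np.ones(3) / 3                 # prior over latent regimes

true_state = 2
for t in range(200):
    true_state = rng.choice(3, p=P[true_state])
    dS = drifts[true_state] + sigma * rng.normal()   # observed price change
    # Filtering: predict with P, then correct with the Gaussian likelihood of dS.
    belief = belief @ P
    belief *= np.exp(-0.5 * ((dS - drifts) / sigma) ** 2)
    belief /= belief.sum()

position = belief @ drifts              # lean with the posterior-expected alpha
print("posterior:", np.round(belief, 3), "position signal:", round(position, 4))
```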
A Dynamical Systems Perspective on Nonsmooth Constrained Optimization
Title | A Dynamical Systems Perspective on Nonsmooth Constrained Optimization |
Authors | Guilherme França, Daniel P. Robinson, René Vidal |
Abstract | The acceleration technique introduced by Nesterov for gradient descent is widely used in machine learning, but its principles are not yet fully understood. Recently, significant progress has been made to close this understanding gap through a continuous-time dynamical systems perspective associated with gradient methods for smooth and unconstrained problems. Here we extend this perspective to nonsmooth and linearly constrained problems by deriving nonsmooth dynamical systems related to variants of the relaxed and accelerated alternating direction method of multipliers (ADMM). We introduce two new ADMM variants, one based on Nesterov’s acceleration and the other inspired by Polyak’s heavy ball method, and derive differential inclusions modelling these algorithms in the continuous-time limit. Using a nonsmooth Lyapunov analysis, we obtain rate-of-convergence results for these dynamical systems in the convex and strongly convex setting that illustrate an interesting tradeoff between Nesterov and heavy ball acceleration. |
Tasks | |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04048v2 |
http://arxiv.org/pdf/1808.04048v2.pdf | |
PWC | https://paperswithcode.com/paper/a-dynamical-systems-perspective-on-nonsmooth |
Repo | |
Framework | |
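To make the discrete-time side of this discussion concrete, here is a runnable toy of accelerated ADMM on the lasso: standard ADMM updates plus a Nesterov-style momentum step on the auxiliary and dual variables, in the style of "fast ADMM" schemes. The paper's two variants and their differential inclusions are more general; the problem data here are random.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam, rho = 30, 50, 0.1, 1.0
A, b = rng.normal(size=(m, n)), rng.normal(size=m)
L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))   # factor once, reuse each iteration

def soft(v, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0)

x = z = u = z_hat = u_hat = np.zeros(n)
a = 1.0
for k in range(300):
    rhs = A.T @ b + rho * (z_hat - u_hat)
    x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # x-minimization
    z_new = soft(x + u_hat, lam / rho)                 # z-minimization
    u_new = u_hat + x - z_new                          # dual update
    a_new = (1 + np.sqrt(1 + 4 * a * a)) / 2           # Nesterov momentum schedule
    z_hat = z_new + (a - 1) / a_new * (z_new - z)      # momentum on z and u
    u_hat = u_new + (a - 1) / a_new * (u_new - u)
    z, u, a = z_new, u_new, a_new

print("nonzeros in solution:", int(np.sum(np.abs(z) > 1e-6)))
```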
Retrieve-and-Read: Multi-task Learning of Information Retrieval and Reading Comprehension
Title | Retrieve-and-Read: Multi-task Learning of Information Retrieval and Reading Comprehension |
Authors | Kyosuke Nishida, Itsumi Saito, Atsushi Otsuka, Hisako Asano, Junji Tomita |
Abstract | This study considers the task of machine reading at scale (MRS) wherein, given a question, a system first performs the information retrieval (IR) task of finding relevant passages in a knowledge source and then carries out the reading comprehension (RC) task of extracting an answer span from the passages. Previous MRS studies, in which the IR component was trained without considering answer spans, struggled to accurately find a small number of relevant passages from a large set of passages. In this paper, we propose a simple and effective approach that incorporates the IR and RC tasks by using supervised multi-task learning in order that the IR component can be trained by considering answer spans. Experimental results on the standard benchmark, answering SQuAD questions using the full Wikipedia as the knowledge source, showed that our model achieved state-of-the-art performance. Moreover, we thoroughly evaluated the individual contributions of our model components with our new Japanese dataset and SQuAD. The results showed significant improvements in the IR task and provided a new perspective on IR for RC: it is effective to teach which part of the passage answers the question rather than to give only a relevance score to the whole passage. |
Tasks | Information Retrieval, Multi-Task Learning, Reading Comprehension |
Published | 2018-08-31 |
URL | http://arxiv.org/abs/1808.10628v1 |
http://arxiv.org/pdf/1808.10628v1.pdf | |
PWC | https://paperswithcode.com/paper/retrieve-and-read-multi-task-learning-of |
Repo | |
Framework | |
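A schematic PyTorch sketch of the multi-task idea: a shared passage encoder feeding (i) a relevance head for retrieval and (ii) start/end span heads for reading comprehension, trained with a summed loss so the IR component is also supervised by answer spans. The encoder, sizes, and losses are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RetrieveAndRead(nn.Module):
    def __init__(self, vocab=10000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.relevance = nn.Linear(2 * dim, 1)   # IR head (pooled over tokens)
        self.span = nn.Linear(2 * dim, 2)        # RC head: start/end logits per token

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))          # (B, T, 2*dim)
        rel = self.relevance(h.mean(dim=1)).squeeze(-1)  # (B,) passage relevance
        start, end = self.span(h).unbind(dim=-1)         # (B, T) each
        return rel, start, end

model = RetrieveAndRead()
tokens = torch.randint(0, 10000, (4, 50))
rel, start, end = model(tokens)
labels_rel = torch.tensor([1., 0., 0., 1.])
gold_start, gold_end = torch.tensor([3, 0, 0, 7]), torch.tensor([5, 0, 0, 9])
loss = (nn.functional.binary_cross_entropy_with_logits(rel, labels_rel)
        + nn.functional.cross_entropy(start, gold_start)
        + nn.functional.cross_entropy(end, gold_end))    # joint multi-task loss
loss.backward()
print(float(loss))
```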
Active and Adaptive Sequential learning
Title | Active and Adaptive Sequential learning |
Authors | Yuheng Bu, Jiaxun Lu, Venugopal V. Veeravalli |
Abstract | A framework is introduced for actively and adaptively solving a sequence of machine learning problems that change in a bounded manner from one time step to the next. An algorithm is developed that actively queries the labels of the most informative samples from an unlabeled data pool, and that adapts to the change by utilizing the information acquired in the previous steps. Our analysis shows that the proposed active learning algorithm based on stochastic gradient descent achieves a near-optimal excess risk performance for maximum likelihood estimation. Furthermore, an estimator of the change in the learning problems is constructed using the active learning samples, which provides an adaptive sample size selection rule that guarantees the excess risk is bounded for a sufficiently large number of time steps. Experiments with synthetic and real data are presented to validate our algorithm and theoretical results. |
Tasks | Active Learning |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11710v1 |
http://arxiv.org/pdf/1805.11710v1.pdf | |
PWC | https://paperswithcode.com/paper/active-and-adaptive-sequential-learning |
Repo | |
Framework | |
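A toy sketch of the active component: at each step, query the label of the most uncertain sample in the pool and take an SGD step on the logistic loss. The paper's adaptive sample-size selection rule is not reproduced; the data, step size, and uncertainty criterion are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
w_true = rng.normal(size=d)
pool = rng.normal(size=(2000, d))

def query_label(x):
    """Oracle: noisy label from a logistic model."""
    return 1 if rng.random() < 1 / (1 + np.exp(-x @ w_true)) else 0

w, eta, queried = np.zeros(d), 0.5, set()
for t in range(300):
    p = 1 / (1 + np.exp(-pool @ w))
    scores = np.abs(p - 0.5)            # low score = most uncertain sample
    scores[list(queried)] = np.inf      # never re-query the same point
    i = int(scores.argmin())
    queried.add(i)
    y = query_label(pool[i])
    w += eta * (y - p[i]) * pool[i]     # SGD step on the logistic loss

cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
print("alignment with true w:", round(cos, 3))
```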
A new multilayer optical film optimal method based on deep q-learning
Title | A new multilayer optical film optimal method based on deep q-learning |
Authors | Anqing Jiang, Osamu Yoshie, LiangYao Chen |
Abstract | Multi-layer optical films have important applications in optical communication, optical absorbers, optical filters, and beyond. Different algorithms for multi-layer optical film design have been developed, such as the simplex method, colony algorithms, and genetic algorithms, and these have greatly accelerated the design and manufacture of multi-layer films. However, traditional numerical algorithms converge to local optima, which means they cannot offer material researchers a globally optimal solution. In recent years, owing to the rapid development of artificial intelligence, optimizing optical film structures with AI algorithms has become possible. In this paper, we introduce a new optical film design algorithm based on deep Q-learning. This model can converge to the global optimum of the optical thin-film structure, which greatly improves the design efficiency of multi-layer films. |
Tasks | Q-Learning |
Published | 2018-12-07 |
URL | http://arxiv.org/abs/1812.02873v1 |
http://arxiv.org/pdf/1812.02873v1.pdf | |
PWC | https://paperswithcode.com/paper/a-new-multilayer-optical-film-optimal-method |
Repo | |
Framework | |
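A toy sketch of the idea: treat film design as sequential decisions (pick the next layer's thickness) and learn action values with Q-learning. Two simplifications are assumed here: the layer-wise merit function is a stand-in for a real transfer-matrix optical model, and a tabular learner replaces the paper's deep network.

```python
import numpy as np

rng = np.random.default_rng(0)
thicknesses = [50, 100, 150, 200]        # nm, the discrete action set
target = np.array([100, 150, 100, 200])  # stack the toy merit function prefers
n_layers, n_actions = len(target), len(thicknesses)

def reward(layer, a):
    """Stand-in merit function; a real design would score optical response."""
    return -((thicknesses[a] - target[layer]) ** 2) / 1e4

Q = np.zeros((n_layers, n_actions))      # Q[layer_index, action]
eps, alpha, gamma = 0.2, 0.1, 1.0
for episode in range(3000):
    for layer in range(n_layers):        # build the film one layer at a time
        # Epsilon-greedy action selection.
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[layer].argmax())
        nxt = Q[layer + 1].max() if layer + 1 < n_layers else 0.0
        # Standard Q-learning update with bootstrapped next-layer value.
        Q[layer, a] += alpha * (reward(layer, a) + gamma * nxt - Q[layer, a])

print("learned design:", [thicknesses[int(Q[l].argmax())] for l in range(n_layers)])
```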
Characterisation of (Sub)sequential Rational Functions over a General Class of Monoids
Title | Characterisation of (Sub)sequential Rational Functions over a General Class of Monoids |
Authors | Stefan Gerdjikov |
Abstract | In this technical report we describe a general class of monoids for which (sub)sequential rational functions can be characterised in terms of a congruence relation in the flavour of the Myhill-Nerode relation. The class of monoids that we consider can be described in terms of natural algebraic axioms; it contains the free monoids, groups, and the tropical monoid, and is closed under Cartesian products. |
Tasks | |
Published | 2018-01-28 |
URL | http://arxiv.org/abs/1801.10063v1 |
http://arxiv.org/pdf/1801.10063v1.pdf | |
PWC | https://paperswithcode.com/paper/characterisation-of-subsequential-rational |
Repo | |
Framework | |
Projective Splitting with Forward Steps only Requires Continuity
Title | Projective Splitting with Forward Steps only Requires Continuity |
Authors | Patrick R. Johnstone, Jonathan Eckstein |
Abstract | A recent innovation in projective splitting algorithms for monotone operator inclusions has been the development of a procedure using two forward steps instead of the customary proximal steps for operators that are Lipschitz continuous. This paper shows that the Lipschitz assumption is unnecessary when the forward steps are performed in finite-dimensional spaces: a backtracking linesearch yields a convergent algorithm for operators that are merely continuous with full domain. |
Tasks | |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.07180v1 |
http://arxiv.org/pdf/1809.07180v1.pdf | |
PWC | https://paperswithcode.com/paper/projective-splitting-with-forward-steps-only |
Repo | |
Framework | |