July 26, 2019

Paper Group ANR 785

Stochastic Cubic Regularization for Fast Nonconvex Optimization. Completing a joint PMF from projections: a low-rank coupled tensor factorization approach. Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control. Model Extraction Warning in MLaaS Paradigm. Max-Margin Invariant Features from Transformed Unlabeled Data. A Te …

Stochastic Cubic Regularization for Fast Nonconvex Optimization


Title	Stochastic Cubic Regularization for Fast Nonconvex Optimization
Authors	Nilesh Tripuraneni, Mitchell Stern, Chi Jin, Jeffrey Regier, Michael I. Jordan
Abstract	This paper proposes a stochastic variant of a classic algorithm—the cubic-regularized Newton method [Nesterov and Polyak 2006]. The proposed algorithm efficiently escapes saddle points and finds approximate local minima for general smooth, nonconvex functions in only $\mathcal{\tilde{O}}(\epsilon^{-3.5})$ stochastic gradient and stochastic Hessian-vector product evaluations. The latter can be computed as efficiently as stochastic gradients. This improves upon the $\mathcal{\tilde{O}}(\epsilon^{-4})$ rate of stochastic gradient descent. Our rate matches the best-known result for finding local minima without requiring any delicate acceleration or variance-reduction techniques.
Tasks
Published	2017-11-08
URL	http://arxiv.org/abs/1711.02838v2
PDF	http://arxiv.org/pdf/1711.02838v2.pdf
PWC	https://paperswithcode.com/paper/stochastic-cubic-regularization-for-fast
Repo
Framework
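
To make the idea of a cubic-regularized Newton step concrete, here is a minimal deterministic sketch. It is not the authors' stochastic algorithm: the toy objective `f`, the constants `rho`, `lr`, and `inner_iters`, and the choice of plain gradient descent on the cubic submodel are all assumptions made for illustration. In a stochastic variant, `grad` and `hvp` would presumably be replaced by minibatch estimates.

```python
import numpy as np

# Toy nonconvex objective (an assumption for illustration):
# saddle point at (0, 0), local minima at (0, +1) and (0, -1).
def f(x):
    return 0.5 * x[0] ** 2 - 0.5 * x[1] ** 2 + 0.25 * x[1] ** 4

def grad(x):
    return np.array([x[0], -x[1] + x[1] ** 3])

def hvp(x, v):
    # Hessian-vector product; the Hessian here is diag(1, 3*x1^2 - 1),
    # so the full matrix never has to be formed.
    return np.array([v[0], (3.0 * x[1] ** 2 - 1.0) * v[1]])

def cubic_step(x, rho=10.0, lr=0.05, inner_iters=200):
    """Approximately minimize the cubic submodel
        m(s) = g.s + 0.5 * s.H.s + (rho / 6) * ||s||^3
    by gradient descent on s, using only Hessian-vector products."""
    g = grad(x)
    s = np.zeros_like(x)
    for _ in range(inner_iters):
        # Gradient of m(s): g + H s + (rho / 2) * ||s|| * s
        model_grad = g + hvp(x, s) + 0.5 * rho * np.linalg.norm(s) * s
        s = s - lr * model_grad
    return x + s

x0 = np.array([1.0, 0.01])   # start close to the saddle direction
x = x0.copy()
for _ in range(100):
    x = cubic_step(x)
print("f(x0) =", f(x0), " f(x) =", f(x), " x =", x)
```

The cubic term keeps each step bounded even where the Hessian has negative curvature, which is what lets the iterate move away from the saddle instead of stalling there.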

Completing a joint PMF from projections: a low-rank coupled tensor factorization approach


Title	Completing a joint PMF from projections: a low-rank coupled tensor factorization approach
Authors	Nikos Kargas, Nicholas D. Sidiropoulos
Abstract	There has recently been considerable interest in completing a low-rank matrix or tensor given only a small fraction (or few linear combinations) of its entries. Related approaches have found considerable success in the area of recommender systems, under machine learning. From a statistical estimation point of view, the gold standard is to have access to the joint probability distribution of all pertinent random variables, from which any desired optimal estimator can be readily derived. In practice high-dimensional joint distributions are very hard to estimate, and only estimates of low-dimensional projections may be available. We show that it is possible to identify higher-order joint PMFs from lower-order marginalized PMFs using coupled low-rank tensor factorization. Our approach features guaranteed identifiability when the full joint PMF is of low-enough rank, and effective approximation otherwise. We provide an algorithmic approach to compute the sought factors, and illustrate the merits of our approach using rating prediction as an example.
Tasks	Recommendation Systems
Published	2017-02-16
URL	http://arxiv.org/abs/1702.05184v1
PDF	http://arxiv.org/pdf/1702.05184v1.pdf
PWC	https://paperswithcode.com/paper/completing-a-joint-pmf-from-projections-a-low
Repo
Framework
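
One common way to read "low-rank joint PMF" is a canonical polyadic (CP) model with a hidden variable, in which the variables are conditionally independent given the hidden state. The sketch below assumes that reading; the sizes `F`, `I`, `J`, `K` and the random factors are hypothetical, and it only checks that a pairwise marginal of such a joint PMF shares the same factors. It does not implement the paper's factorization algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
F, I, J, K = 3, 4, 5, 6   # assumed rank and alphabet sizes

def random_pmf_columns(rows, cols):
    # Each column is a conditional PMF, e.g. P(X = i | H = f).
    M = rng.random((rows, cols))
    return M / M.sum(axis=0, keepdims=True)

lam = rng.random(F)
lam /= lam.sum()                      # prior over the hidden state H
A, B, C = (random_pmf_columns(n, F) for n in (I, J, K))

# Rank-F joint PMF: P(i, j, k) = sum_f lam[f] * A[i, f] * B[j, f] * C[k, f]
joint = np.einsum("f,if,jf,kf->ijk", lam, A, B, C)

# Marginalizing out the third variable gives a matrix built from the
# same factors: P(i, j) = A diag(lam) B^T.
P_ij = joint.sum(axis=2)
P_ij_from_factors = (A * lam) @ B.T

print(np.allclose(P_ij, P_ij_from_factors))
```

Because every lower-order marginal is built from the same factor matrices, factoring several marginals jointly (the "coupled" part) constrains the factors of the full joint PMF.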

Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control


Title	Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
Authors	Sanket Kamthe, Marc Peter Deisenroth
Abstract	Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the majority of autonomous RL algorithms require a large number of interactions with the environment. A large number of interactions may be impractical in many real-world applications, such as robotics, and many practical systems have to obey limitations in the form of state space or control constraints. To reduce the number of system interactions while simultaneously handling constraints, we propose a model-based RL framework based on probabilistic Model Predictive Control (MPC). In particular, we propose to learn a probabilistic transition model using Gaussian Processes (GPs) to incorporate model uncertainty into long-term predictions, thereby reducing the impact of model errors. We then use MPC to find a control sequence that minimises the expected long-term cost. We provide theoretical guarantees for first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. We demonstrate that our approach not only achieves state-of-the-art data efficiency, but is also a principled way for RL in constrained environments.
Tasks	Gaussian Processes
Published	2017-06-20
URL	http://arxiv.org/abs/1706.06491v2
PDF	http://arxiv.org/pdf/1706.06491v2.pdf
PWC	https://paperswithcode.com/paper/data-efficient-reinforcement-learning-with
Repo
Framework

Model Extraction Warning in MLaaS Paradigm


Title	Model Extraction Warning in MLaaS Paradigm
Authors	Manish Kesarwani, Bhaskar Mukhoty, Vijay Arya, Sameep Mehta
Abstract	Cloud vendors are increasingly offering machine learning services as part of their platform and services portfolios. These services enable the deployment of machine learning models on the cloud that are offered on a pay-per-query basis to application developers and end users. However, recent work has shown that the hosted models are susceptible to extraction attacks. Adversaries may launch queries to steal the model and compromise future query payments or privacy of the training data. In this work, we present a cloud-based extraction monitor that can quantify the extraction status of models by observing the query and response streams of both individual and colluding adversarial users. We present a novel technique that uses information gain to measure the model learning rate by users with an increasing number of queries. Additionally, we present an alternate technique that maintains intelligent query summaries to measure the learning rate relative to the coverage of the input feature space in the presence of collusion. Both these approaches have low computational overhead and can easily be offered as services to model owners to warn them of possible extraction attacks from adversaries. We present performance results for these approaches for decision tree models deployed on the BigML MLaaS platform, using open source datasets and different adversarial attack strategies.
Tasks	Adversarial Attack
Published	2017-11-20
URL	http://arxiv.org/abs/1711.07221v1
PDF	http://arxiv.org/pdf/1711.07221v1.pdf
PWC	https://paperswithcode.com/paper/model-extraction-warning-in-mlaas-paradigm
Repo
Framework
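
For readers unfamiliar with the quantity mentioned in the abstract, the following sketch computes textbook information gain for a binary split of labelled points. It is only the standard definition; how the extraction monitor applies it to query and response streams is not described in the abstract, and the helper names `entropy` and `information_gain` are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(labels, mask):
    """Entropy reduction obtained by splitting `labels` with boolean `mask`."""
    labels = np.asarray(labels)
    mask = np.asarray(mask, dtype=bool)
    gain = entropy(labels)
    for part in (labels[mask], labels[~mask]):
        if part.size:  # skip an empty side of the split
            gain -= part.size / labels.size * entropy(part)
    return gain

labels = [0, 0, 1, 1]
print(information_gain(labels, [True, True, False, False]))   # split separates the classes
print(information_gain(labels, [True, False, True, False]))   # split tells us nothing
```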

Max-Margin Invariant Features from Transformed Unlabeled Data


Title	Max-Margin Invariant Features from Transformed Unlabeled Data
Authors	Dipan K. Pal, Ashwin A. Kannan, Gautam Arakalgud, Marios Savvides
Abstract	The study of representations invariant to common transformations of the data is important to learning. Most techniques have focused on local approximate invariance implemented within expensive optimization frameworks lacking explicit theoretical guarantees. In this paper, we study kernels that are invariant to a unitary group while having theoretical guarantees in addressing the important practical issue of unavailability of transformed versions of labelled data, a problem we call the Unlabeled Transformation Problem, which is a special form of semi-supervised learning and one-shot learning. We present a theoretically motivated alternate approach to the invariant kernel SVM, based on which we propose Max-Margin Invariant Features (MMIF) to solve this problem. As an illustration, we design a framework for face recognition and demonstrate the efficacy of our approach on a large-scale semi-synthetic dataset with 153,000 images and a new challenging protocol on Labelled Faces in the Wild (LFW) while outperforming strong baselines.
Tasks	Face Recognition, One-Shot Learning
Published	2017-10-24
URL	http://arxiv.org/abs/1710.08585v1
PDF	http://arxiv.org/pdf/1710.08585v1.pdf
PWC	https://paperswithcode.com/paper/max-margin-invariant-features-from
Repo
Framework

A Teacher-Student Framework for Zero-Resource Neural Machine Translation


Title	A Teacher-Student Framework for Zero-Resource Neural Machine Translation
Authors	Yun Chen, Yang Liu, Yong Cheng, Victor O. K. Li
Abstract	While end-to-end neural machine translation (NMT) has made remarkable progress recently, it still suffers from the data scarcity problem for low-resource language pairs and domains. In this paper, we propose a method for zero-resource NMT by assuming that parallel sentences have close probabilities of generating a sentence in a third language. Based on this assumption, our method is able to train a source-to-target NMT model (“student”) without parallel corpora available, guided by an existing pivot-to-target NMT model (“teacher”) on a source-pivot parallel corpus. Experimental results show that the proposed method significantly improves over a baseline pivot-based model by +3.0 BLEU points across various language pairs.
Tasks	Machine Translation
Published	2017-05-02
URL	http://arxiv.org/abs/1705.00753v1
PDF	http://arxiv.org/pdf/1705.00753v1.pdf
PWC	https://paperswithcode.com/paper/a-teacher-student-framework-for-zero-resource
Repo
Framework

A Multi-Layer K-means Approach for Multi-Sensor Data Pattern Recognition in Multi-Target Localization


Title	A Multi-Layer K-means Approach for Multi-Sensor Data Pattern Recognition in Multi-Target Localization
Authors	Samuel Silva, Rengan Suresh, Feng Tao, Johnathan Votion, Yongcan Cao
Abstract	Data-target association is an important step in multi-target localization for the intelligent operation of unmanned systems in numerous applications such as search and rescue, traffic management and surveillance. The objective of this paper is to present an innovative data association learning approach named multi-layer K-means (MLKM) based on leveraging the advantages of some existing machine learning approaches, including K-means, K-means++, and deep neural networks. To enable accurate data association from different sensors for efficient target localization, MLKM relies on the clustering capabilities of K-means++ structured in a multi-layer framework with an error correction feature motivated by backpropagation, which is well known in deep learning research. To show the effectiveness of the MLKM method, numerous simulation examples are conducted to compare its performance with K-means, K-means++, and deep neural networks.
Tasks
Published	2017-05-30
URL	http://arxiv.org/abs/1705.10757v1
PDF	http://arxiv.org/pdf/1705.10757v1.pdf
PWC	https://paperswithcode.com/paper/a-multi-layer-k-means-approach-for-multi
Repo
Framework
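
Since the abstract builds on K-means++, a short sketch of its seeding rule may help: each new center is drawn with probability proportional to the squared distance to the nearest center already chosen. This is the standard K-means++ initialization only, not the multi-layer MLKM method; the function name `kmeans_pp_init` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """Pick k initial centers from the rows of X using K-means++ seeding."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]          # first center: uniform at random
    for _ in range(k - 1):
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        # Squared distance from every point to its nearest chosen center.
        d2 = (diffs ** 2).sum(axis=2).min(axis=1)
        probs = d2 / d2.sum()
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)

rng = np.random.default_rng(0)
# Three well-separated synthetic "targets" with 50 noisy measurements each.
true_targets = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
X = np.concatenate([t + 0.1 * rng.standard_normal((50, 2)) for t in true_targets])

centers = kmeans_pp_init(X, k=3, rng=rng)
print(centers)
```

Points that are already close to a chosen center get a very small selection probability, so the seeds tend to spread across distinct clusters before the usual K-means iterations start.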

Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach


Title	Heterogeneous Face Attribute Estimation: A Deep Multi-Task Learning Approach
Authors	Hu Han, Anil K. Jain, Fang Wang, Shiguang Shan, Xilin Chen
Abstract	Face attribute estimation has many potential applications in video surveillance, face retrieval, and social media. While a number of methods have been proposed for face attribute estimation, most of them did not explicitly consider the attribute correlation and heterogeneity (e.g., ordinal vs. nominal and holistic vs. local) during feature representation learning. In this paper, we present a Deep Multi-Task Learning (DMTL) approach to jointly estimate multiple heterogeneous attributes from a single face image. In DMTL, we tackle attribute correlation and heterogeneity with convolutional neural networks (CNNs) consisting of shared feature learning for all the attributes, and category-specific feature learning for heterogeneous attributes. We also introduce an unconstrained face database (LFW+), an extension of public-domain LFW, with heterogeneous demographic attributes (age, gender, and race) obtained via crowdsourcing. Experimental results on benchmarks with multiple face attributes (MORPH II, LFW+, CelebA, LFWA, and FotW) show that the proposed approach has superior performance compared to the state of the art. Finally, evaluations on a public-domain face database (LAP) with a single attribute show that the proposed approach has excellent generalization ability.
Tasks	Multi-Task Learning, Representation Learning
Published	2017-06-03
URL	http://arxiv.org/abs/1706.00906v3
PDF	http://arxiv.org/pdf/1706.00906v3.pdf
PWC	https://paperswithcode.com/paper/heterogeneous-face-attribute-estimation-a
Repo
Framework

Towards automated patient data cleaning using deep learning: A feasibility study on the standardization of organ labeling


Title	Towards automated patient data cleaning using deep learning: A feasibility study on the standardization of organ labeling
Authors	Timothy Rozario, Troy Long, Mingli Chen, Weiguo Lu, Steve Jiang
Abstract	Data cleaning consumes about 80% of the time spent on data analysis for clinical research projects. This is a much bigger problem in the era of big data and machine learning in the field of medicine where large volumes of data are being generated. We report an initial effort towards automated patient data cleaning using deep learning: the standardization of organ labeling in radiation therapy. Organs are often labeled inconsistently at different institutions (sometimes even within the same institution) and at different time periods, which poses a problem for clinical research, especially for multi-institutional collaborative clinical research where the acquired patient data is not being used effectively. We developed a convolutional neural network (CNN) to automatically identify each organ in the CT image and then label it with the standardized nomenclature presented at AAPM Task Group 263. We tested this model on the CT images of 54 patients with prostate cancer and 100 patients with head and neck cancer who previously received radiation therapy. The model achieved 100% accuracy in detecting organs and assigning standardized labels for the patients tested. This work shows the feasibility of using deep learning in patient data cleaning that enables standardized datasets to be generated for effective intra- and inter-institutional collaborative clinical research.
Tasks
Published	2017-12-30
URL	http://arxiv.org/abs/1801.00096v1
PDF	http://arxiv.org/pdf/1801.00096v1.pdf
PWC	https://paperswithcode.com/paper/towards-automated-patient-data-cleaning-using
Repo
Framework

How Generative Adversarial Networks and Their Variants Work: An Overview


Title	How Generative Adversarial Networks and Their Variants Work: An Overview
Authors	Yongjun Hong, Uiwon Hwang, Jaeyoon Yoo, Sungroh Yoon
Abstract	Generative Adversarial Networks (GAN) have received wide attention in the machine learning field for their potential to learn high-dimensional, complex real data distributions. Specifically, they do not rely on any assumptions about the distribution and can generate real-like samples from latent space in a simple manner. This powerful property has led GAN to be applied to various applications such as image synthesis, image attribute editing, image translation, domain adaptation and other academic fields. In this paper, we aim to discuss the details of GAN for those readers who are familiar with GAN but do not comprehend it deeply, or who wish to view GAN from various perspectives. In addition, we explain how GAN operates and the fundamental meaning of various objective functions that have been suggested recently. We then focus on how the GAN can be combined with an autoencoder framework. Finally, we enumerate the GAN variants that are applied to various tasks and other fields for those who are interested in exploiting GAN for their research.
Tasks	Domain Adaptation, Image Generation
Published	2017-11-16
URL	http://arxiv.org/abs/1711.05914v9
PDF	http://arxiv.org/pdf/1711.05914v9.pdf
PWC	https://paperswithcode.com/paper/how-generative-adversarial-networks-and-their
Repo
Framework

ApproxDBN: Approximate Computing for Discriminative Deep Belief Networks


Title	ApproxDBN: Approximate Computing for Discriminative Deep Belief Networks
Authors	Xiaojing Xu, Srinjoy Das, Ken Kreutz-Delgado
Abstract	Probabilistic generative neural networks are useful for many applications, such as image classification, speech recognition and occlusion removal. However, the power budget for hardware implementations of neural networks can be extremely tight. To address this challenge we describe a design methodology for using approximate computing methods to implement Approximate Deep Belief Networks (ApproxDBNs) by systematically exploring the use of (1) limited precision of variables; (2) criticality analysis to identify the nodes in the network which can operate with such limited precision while allowing the network to maintain target accuracy levels; and (3) a greedy search methodology with incremental retraining to determine the optimal reduction in precision to maximize power savings under user-specified accuracy constraints. Experimental results show that significant bit-length reduction can be achieved by our ApproxDBN with constrained accuracy loss.
Tasks	Image Classification, Speech Recognition
Published	2017-04-13
URL	http://arxiv.org/abs/1704.03993v3
PDF	http://arxiv.org/pdf/1704.03993v3.pdf
PWC	https://paperswithcode.com/paper/approxdbn-approximate-computing-for
Repo
Framework
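
As a rough illustration of what "limited precision of variables" can mean, the sketch below rounds values to a signed fixed-point grid with a given total bit length. The specific number format, the function name `quantize_fixed_point`, and the example values are assumptions; the abstract does not state which representation ApproxDBN uses.

```python
import numpy as np

def quantize_fixed_point(x, total_bits, frac_bits):
    """Round x to signed fixed point with `total_bits` bits in total,
    `frac_bits` of them fractional, saturating at the representable range."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(np.asarray(x) * scale), qmin, qmax)
    return q / scale

w = np.array([0.3, -1.26, 5.0])
print(quantize_fixed_point(w, total_bits=8, frac_bits=4))   # [ 0.3125 -1.25    5.    ]
print(quantize_fixed_point(w, total_bits=4, frac_bits=2))   # [ 0.25   -1.25    1.75  ]
```

With fewer bits the grid gets coarser and large values saturate (5.0 becomes 1.75 in the 4-bit case), which is the kind of accuracy cost a precision search has to weigh against power savings.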

Depression and Self-Harm Risk Assessment in Online Forums


Title	Depression and Self-Harm Risk Assessment in Online Forums
Authors	Andrew Yates, Arman Cohan, Nazli Goharian
Abstract	Users suffering from mental health conditions often turn to online resources for support, including specialized online support communities or general communities such as Twitter and Reddit. In this work, we present a neural framework for supporting and studying users in both types of communities. We propose methods for identifying posts in support communities that may indicate a risk of self-harm, and demonstrate that our approach outperforms strong previously proposed methods for identifying such posts. Self-harm is closely related to depression, which makes identifying depressed users on general forums a crucial related task. We introduce a large-scale general forum dataset (“RSDD”) consisting of users with self-reported depression diagnoses matched with control users. We show how our method can be applied to effectively identify depressed users from their use of language alone. We demonstrate that our method outperforms strong baselines on this general forum dataset.
Tasks
Published	2017-09-06
URL	http://arxiv.org/abs/1709.01848v1
PDF	http://arxiv.org/pdf/1709.01848v1.pdf
PWC	https://paperswithcode.com/paper/depression-and-self-harm-risk-assessment-in
Repo
Framework

Sequential Local Learning for Latent Graphical Models


Title	Sequential Local Learning for Latent Graphical Models
Authors	Sejun Park, Eunho Yang, Jinwoo Shin
Abstract	Learning parameters of latent graphical models (GM) is inherently much harder than that of non-latent ones since the latent variables make the corresponding log-likelihood non-concave. Nevertheless, expectation-maximization schemes are popularly used in practice, but they are typically stuck in local optima. In recent years, the method of moments has provided a refreshing angle for resolving the non-convex issue, but it is applicable to a quite limited class of latent GMs. In this paper, we aim to enhance its power by enlarging this class of latent GMs. To this end, we introduce two novel concepts, coined marginalization and conditioning, which can reduce the problem of learning a larger GM to that of a smaller one. More importantly, they lead to a sequential learning framework that repeatedly increases the learning portion of a given latent GM, and thus covers a significantly broader and more complicated class of loopy latent GMs which include convolutional and random regular models.
Tasks
Published	2017-03-12
URL	http://arxiv.org/abs/1703.04082v2
PDF	http://arxiv.org/pdf/1703.04082v2.pdf
PWC	https://paperswithcode.com/paper/sequential-local-learning-for-latent
Repo
Framework

Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification


Title	Discriminative Nonlinear Analysis Operator Learning: When Cosparse Model Meets Image Classification
Authors	Zaidao Wen, Biao Hou, Licheng Jiao
Abstract	The dictionary learning framework based on the linear synthesis model has achieved remarkable performance in image classification in the last decade. As a generative feature model, however, it suffers from some intrinsic deficiencies. In this paper, we propose a novel parametric nonlinear analysis cosparse model (NACM) with which a unique feature vector can be extracted much more efficiently. Additionally, we derive a deep insight to demonstrate that NACM is capable of simultaneously learning the task-adapted feature transformation and regularization to encode our preferences, domain prior knowledge and task-oriented supervised information into the features. The proposed NACM is devoted to the classification task as a discriminative feature model and yields a novel discriminative nonlinear analysis operator learning framework (DNAOL). The theoretical analysis and experimental performance clearly demonstrate that DNAOL not only achieves better or at least competitive classification accuracy compared with state-of-the-art algorithms, but can also dramatically reduce the time complexity in both training and testing phases.
Tasks	Dictionary Learning, Image Classification
Published	2017-04-30
URL	http://arxiv.org/abs/1705.00322v1
PDF	http://arxiv.org/pdf/1705.00322v1.pdf
PWC	https://paperswithcode.com/paper/discriminative-nonlinear-analysis-operator
Repo
Framework

How Well Can Generative Adversarial Networks Learn Densities: A Nonparametric View


Title	How Well Can Generative Adversarial Networks Learn Densities: A Nonparametric View
Authors	Tengyuan Liang
Abstract	We study in this paper the rate of convergence for learning densities under the Generative Adversarial Networks (GAN) framework, borrowing insights from nonparametric statistics. We introduce an improved GAN estimator that achieves a faster rate, through simultaneously leveraging the level of smoothness in the target density and the evaluation metric, which in theory remedies the mode collapse problem reported in the literature. A minimax lower bound is constructed to show that when the dimension is large, the exponent in the rate for the new GAN estimator is near optimal. One can view our results as answering in a quantitative way how well GAN learns a wide range of densities with different smoothness properties, under a hierarchy of evaluation metrics. As a byproduct, we also obtain improved generalization bounds for GAN with a deeper ReLU discriminator network.
Tasks
Published	2017-12-21
URL	http://arxiv.org/abs/1712.08244v2
PDF	http://arxiv.org/pdf/1712.08244v2.pdf
PWC	https://paperswithcode.com/paper/how-well-can-generative-adversarial-networks
Repo
Framework