January 27, 2020

Paper Group ANR 1090

K-nn active learning under local smoothness condition. Fully Parallel Hyperparameter Search: Reshaped Space-Filling. Learning to Control Latent Representations for Few-Shot Learning of Named Entities. Superintelligence Safety: A Requirements Engineering Perspective. Toward Sensor-based Sleep Monitoring with Electrodermal Activity Measures. Imagined …

K-nn active learning under local smoothness condition


Title	K-nn active learning under local smoothness condition
Authors	Boris Ndjia Njike, Xavier Siebert
Abstract	There is a large body of work on convergence rates in either passive or active learning. Here we outline some of the results that have been obtained, more specifically in a nonparametric setting under assumptions about the smoothness and the margin noise. We also discuss the relative merits of these underlying assumptions by putting active learning in perspective with recent work on passive learning. We provide a novel active learning algorithm with a rate of convergence better than in passive learning, using a particular smoothness assumption customized for $k$-nearest neighbors. This smoothness assumption provides a dependence on the marginal distribution of the instance space unlike other recent algorithms. Our algorithm thus avoids the strong density assumption that supposes the existence of the density function of the marginal distribution of the instance space and is therefore more generally applicable.
Tasks	Active Learning
Published	2019-02-08
URL	http://arxiv.org/abs/1902.03055v2
PDF	http://arxiv.org/pdf/1902.03055v2.pdf
PWC	https://paperswithcode.com/paper/k-nn-active-learning-under-local-smoothness
Repo
Framework

Fully Parallel Hyperparameter Search: Reshaped Space-Filling


Title	Fully Parallel Hyperparameter Search: Reshaped Space-Filling
Authors	M. -L. Cauwet, C. Couprie, J. Dehos, P. Luc, J. Rapin, M. Riviere, F. Teytaud, O. Teytaud
Abstract	Space-filling designs such as scrambled-Hammersley, Latin Hypercube Sampling and Jittered Sampling have been proposed for fully parallel hyperparameter search, and were shown to be more effective than random or grid search. In this paper, we show that these designs only improve over random search by a constant factor. In contrast, we introduce a new approach based on reshaping the search distribution, which leads to substantial gains over random search, both theoretically and empirically. We propose two flavors of reshaping. First, when the distribution of the optimum is some known $P_0$, we propose Recentering, which uses as search distribution a modified version of $P_0$ tightened closer to the center of the domain, in a dimension-dependent and budget-dependent manner. Second, we show that in a wide range of experiments with $P_0$ unknown, using a proposed Cauchy transformation, which simultaneously has a heavier tail (for unbounded hyperparameters) and is closer to the boundaries (for bounded hyperparameters), leads to improved performances. Besides artificial experiments and simple real world tests on clustering or Salmon mappings, we check our proposed methods on expensive artificial intelligence tasks such as attend/infer/repeat, video next frame segmentation forecasting and progressive generative adversarial networks.
Tasks
Published	2019-10-18
URL	https://arxiv.org/abs/1910.08406v2
PDF	https://arxiv.org/pdf/1910.08406v2.pdf
PWC	https://paperswithcode.com/paper/fully-parallel-hyperparameter-search-reshaped
Repo
Framework
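
As a rough illustration of the reshaping idea described in the abstract above, the following sketch runs a one-shot (fully parallel) search in which all candidates are drawn up front, and compares a Gaussian search distribution with a heavier-tailed Cauchy one. The function names, the `scale` parameter, and the toy objective are assumptions made for illustration; they are not the authors' implementation, and the paper's dimension-dependent and budget-dependent recentering constants are not reproduced here.

```python
import math
import random


def sample_gaussian(dim, rng, scale=1.0):
    """Draw one candidate from an isotropic Gaussian; scale < 1 tightens it toward the center."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]


def sample_cauchy(dim, rng, scale=1.0):
    """Draw one candidate with independent Cauchy coordinates (inverse-CDF sampling)."""
    return [scale * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]


def one_shot_search(objective, sampler, budget, dim, seed=0, scale=1.0):
    """Sample every candidate before any evaluation, then keep the best one."""
    rng = random.Random(seed)
    candidates = [sampler(dim, rng, scale) for _ in range(budget)]
    return min(candidates, key=objective)


def shifted_sphere(x):
    """Toy objective (hypothetical): the optimum sits far from the origin, at (5, ..., 5)."""
    return sum((xi - 5.0) ** 2 for xi in x)


if __name__ == "__main__":
    for name, sampler in [("gaussian", sample_gaussian), ("cauchy", sample_cauchy)]:
        best = one_shot_search(shifted_sphere, sampler, budget=256, dim=2, seed=0)
        print(name, round(shifted_sphere(best), 3))
```

Passing `scale` below 1 to `sample_gaussian` mimics a search distribution tightened toward the center of the domain, while `sample_cauchy` places more mass far from the center; which choice helps depends on where the optimum actually lies.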

Learning to Control Latent Representations for Few-Shot Learning of Named Entities


Title	Learning to Control Latent Representations for Few-Shot Learning of Named Entities
Authors	Omar U. Florez, Erik Mueller
Abstract	Humans excel in continuously learning with small data without forgetting how to solve old problems. However, neural networks require large datasets to compute latent representations across different tasks while minimizing a loss function. For example, a natural language understanding (NLU) system will often deal with emerging entities during its deployment as interactions with users in realistic scenarios will generate new and infrequent names, events, and locations. Here, we address this scenario by introducing an RL trainable controller that disentangles the representation learning of a neural encoder from its memory management role. Our proposed solution is straightforward and simple: we train a controller to execute an optimal sequence of reading and writing operations on an external memory with the goal of leveraging diverse activations from the past and providing accurate predictions. Our approach is named Learning to Control (LTC) and allows few-shot learning with two degrees of memory plasticity. We experimentally show that our system obtains accurate results for few-shot learning of entity recognition in the Stanford Task-Oriented Dialogue dataset.
Tasks	Few-Shot Learning, Representation Learning
Published	2019-11-19
URL	https://arxiv.org/abs/1911.08542v1
PDF	https://arxiv.org/pdf/1911.08542v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-control-latent-representations
Repo
Framework

Superintelligence Safety: A Requirements Engineering Perspective


Title	Superintelligence Safety: A Requirements Engineering Perspective
Authors	Hermann Kaindl, Jonas Ferdigg
Abstract	Under the headline “AI safety”, a wide-reaching issue is being discussed, whether in the future some “superhuman artificial intelligence” / “superintelligence” could pose a threat to humanity. In addition, the late Stephen Hawking warned that the rise of robots may be disastrous for mankind. A major concern is that even benevolent superhuman artificial intelligence (AI) may become seriously harmful if its given goals are not exactly aligned with ours, or if we cannot specify precisely its objective function. Metaphorically, this is compared to King Midas in Greek mythology, who expressed the wish that everything he touched should turn to gold, but obviously this wish was not specified precisely enough. In our view, this sounds like requirements problems and the challenge of their precise formulation. (To the best of our knowledge, this has not been pointed out yet.) As usual in requirements engineering (RE), ambiguity or incompleteness may cause problems. In addition, the overall issue calls for a major RE endeavor, figuring out the wishes and the needs with regard to a superintelligence, which will in our opinion most likely be a very complex software-intensive system based on AI. This may even entail theoretically defining an extended requirements problem.
Tasks
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12152v1
PDF	https://arxiv.org/pdf/1909.12152v1.pdf
PWC	https://paperswithcode.com/paper/superintelligence-safety-a-requirements
Repo
Framework

Toward Sensor-based Sleep Monitoring with Electrodermal Activity Measures


Title	Toward Sensor-based Sleep Monitoring with Electrodermal Activity Measures
Authors	William Romine, Tanvi Banerjee, Garrett Goodman
Abstract	We use self-report and electrodermal activity (EDA) wearable sensor data from 77 nights of sleep on six participants to test the efficacy of EDA data for sleep monitoring. We used factor analysis to find latent factors in the EDA data, and causal model search to find the most probable graphical model accounting for self-reported sleep efficiency (SE), sleep quality (SQ), and the latent EDA factors. Structural equation modeling was used to confirm fit of the extracted graph. Based on the generated graph, logistic regression and naive Bayes models were used to test the efficacy of the EDA data in predicting SE and SQ. Six EDA features extracted from the total signal over a night’s sleep could be explained by two latent factors, EDA Magnitude and EDA Storms. EDA Magnitude performed as a strong predictor for SE to aid detection of substantial changes in time asleep. The performance of EDA Magnitude and SE in classifying SQ showed promise for wearable sleep monitoring applications. However, our data suggest that obtaining a more accurate sensor-based measure of SE will be necessary before smaller changes in SQ can be detected from EDA sensor data alone.
Tasks	Sleep Quality
Published	2019-01-31
URL	http://arxiv.org/abs/1901.11440v1
PDF	http://arxiv.org/pdf/1901.11440v1.pdf
PWC	https://paperswithcode.com/paper/toward-sensor-based-sleep-monitoring-with
Repo
Framework

Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models


Title	Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Authors	Arunkumar Byravan, Jost Tobias Springenberg, Abbas Abdolmaleki, Roland Hafner, Michael Neunert, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller
Abstract	Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper, we explore how model-based Reinforcement Learning (RL) can facilitate transfer to new tasks. We develop an algorithm that learns an action-conditional, predictive model of expected future observations, rewards and values from which a policy can be derived by following the gradient of the estimated value along imagined trajectories. We show how robust policy optimization can be achieved in robot manipulation tasks even with approximate models that are learned directly from vision and proprioception. We evaluate the efficacy of our approach in a transfer learning scenario, re-using previously learned models on tasks with different reward structures and visual distractors, and show a significant improvement in learning speed compared to strong off-policy baselines. Videos with results can be found at https://sites.google.com/view/ivg-corl19
Tasks	Transfer Learning
Published	2019-10-09
URL	https://arxiv.org/abs/1910.04142v1
PDF	https://arxiv.org/pdf/1910.04142v1.pdf
PWC	https://paperswithcode.com/paper/imagined-value-gradients-model-based-policy
Repo
Framework

Using Near Infrared Spectroscopy and Machine Learning to diagnose Systemic Sclerosis


Title	Using Near Infrared Spectroscopy and Machine Learning to diagnose Systemic Sclerosis
Authors	Joelle Feijó de França, Hugo Abreu Mendes, Lucas Gallindo Costa, Andrea Tavares Dantas, Angela Luzia Branco Pinto Duarte, Anderson Stevens Leônidas Gomes, Emery Cleiton Cabral Correia Lins
Abstract	The motivation of this work is the use of non-invasive and low-cost techniques to obtain a faster and more accurate diagnosis of systemic sclerosis (SSc), a rheumatic, autoimmune, chronic and rare disease. The technique in question is Near Infrared Spectroscopy (NIRS). Spectra were acquired from three different regions of the volunteers’ hands. Machine learning algorithms are used to classify and search for the best optical wavelength. The results demonstrate that it is easy to obtain the wavelength bands that are more important for the diagnosis. We use the RFECV and SVC algorithms. The results suggest that the most important wavelength band is at 1270 nm, referring to the luminescence of Singlet Oxygen. The results indicate that the Proximal Interphalangeal Joints region returns better accuracy scores. Optical spectrometers can be found at low prices and can be easily used in clinical evaluations, while the algorithms used are widely available on open source platforms.
Tasks
Published	2019-08-16
URL	https://arxiv.org/abs/1908.06137v1
PDF	https://arxiv.org/pdf/1908.06137v1.pdf
PWC	https://paperswithcode.com/paper/using-near-infrared-spectroscopy-and-machine
Repo
Framework

Engineering problems in machine learning systems


Title	Engineering problems in machine learning systems
Authors	Hiroshi Kuwajima, Hirotoshi Yasuoka, Toshihiro Nakae
Abstract	Fatal accidents are a major issue hindering the wide acceptance of safety-critical systems that employ machine learning and deep learning models, such as automated driving vehicles. In order to use machine learning in a safety-critical system, it is necessary to demonstrate the safety and security of the system through engineering processes. However, thus far, no such widely accepted engineering concepts or frameworks have been established for these systems. The key to using a machine learning model in a deductively engineered system is decomposing the data-driven training of machine learning models into requirement, design, and verification, particularly for machine learning models used in safety-critical systems. Simultaneously, open problems and relevant technical fields are not organized in a manner that enables researchers to select a theme and work on it. In this study, we identify, classify, and explore the open problems in engineering (safety-critical) machine learning systems — that is, in terms of requirement, design, and verification of machine learning models and systems — as well as discuss related works and research directions, using automated driving vehicles as an example. Our results show that machine learning models are characterized by a lack of requirements specification, lack of design specification, lack of interpretability, and lack of robustness. We also perform a gap analysis on a conventional system quality standard SQuARE with the characteristics of machine learning models to study quality models for machine learning systems. We find that a lack of requirements specification and lack of robustness have the greatest impact on conventional quality models.
Tasks
Published	2019-04-01
URL	https://arxiv.org/abs/1904.00001v2
PDF	https://arxiv.org/pdf/1904.00001v2.pdf
PWC	https://paperswithcode.com/paper/open-problems-in-engineering-machine-learning
Repo
Framework

Acceleration via Symplectic Discretization of High-Resolution Differential Equations


Title	Acceleration via Symplectic Discretization of High-Resolution Differential Equations
Authors	Bin Shi, Simon S. Du, Weijie J. Su, Michael I. Jordan
Abstract	We study first-order optimization methods obtained by discretizing ordinary differential equations (ODEs) corresponding to Nesterov’s accelerated gradient methods (NAGs) and Polyak’s heavy-ball method. We consider three discretization schemes: an explicit Euler scheme, an implicit Euler scheme, and a symplectic scheme. We show that the optimization algorithm generated by applying the symplectic scheme to a high-resolution ODE proposed by Shi et al. [2018] achieves an accelerated rate for minimizing smooth strongly convex functions. On the other hand, the resulting algorithm either fails to achieve acceleration or is impractical when the scheme is implicit, the ODE is low-resolution, or the scheme is explicit.
Tasks
Published	2019-02-11
URL	https://arxiv.org/abs/1902.03694v2
PDF	https://arxiv.org/pdf/1902.03694v2.pdf
PWC	https://paperswithcode.com/paper/acceleration-via-symplectic-discretization-of
Repo
Framework
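
To make the contrast between discretization schemes concrete, here is a minimal sketch that applies an explicit Euler step and a symplectic-style (semi-implicit) Euler step to the low-resolution heavy-ball ODE $\ddot{X} + 2\sqrt{\mu}\dot{X} + \nabla f(X) = 0$ on a toy quadratic. The ODE form, the quadratic, the step size, and all function names are assumptions chosen for illustration; this is not the high-resolution ODE or the algorithm analyzed in the paper, and it only shows that the two schemes can behave differently at the same step size.

```python
import math

MU, L = 1.0, 100.0  # toy strong-convexity and smoothness constants (assumed)


def f(x):
    """Toy quadratic objective with curvature MU along x[0] and L along x[1]."""
    return 0.5 * (MU * x[0] ** 2 + L * x[1] ** 2)


def grad_f(x):
    return [MU * x[0], L * x[1]]


def explicit_euler(x, v, h):
    """Both updates use the old state (x_k, v_k)."""
    g = grad_f(x)
    x_new = [xi + h * vi for xi, vi in zip(x, v)]
    v_new = [vi - h * (2.0 * math.sqrt(MU) * vi + gi) for vi, gi in zip(v, g)]
    return x_new, v_new


def symplectic_euler(x, v, h):
    """Update the velocity first, then move the position with the new velocity."""
    g = grad_f(x)
    v_new = [vi - h * (2.0 * math.sqrt(MU) * vi + gi) for vi, gi in zip(v, g)]
    x_new = [xi + h * vi for xi, vi in zip(x, v_new)]
    return x_new, v_new


def run(step, h=0.05, iters=500):
    """Integrate from x = (1, 1), v = (0, 0) and return the final objective value."""
    x, v = [1.0, 1.0], [0.0, 0.0]
    for _ in range(iters):
        x, v = step(x, v, h)
    return f(x)
```

With these assumed constants, `run(symplectic_euler)` drives the objective close to zero, whereas `run(explicit_euler)` blows up along the high-curvature coordinate at the same step size; the only difference between the two steps is whether the position update uses the old or the new velocity.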

Avoiding Jammers: A Reinforcement Learning Approach


Title	Avoiding Jammers: A Reinforcement Learning Approach
Authors	Serkan Ak, Stefan Bruggenwirth
Abstract	This paper investigates the anti-jamming performance of a cognitive radar under a partially observable Markov decision process (POMDP) model. First, we obtain an explicit expression for uncertainty of jammer dynamics, which paves the way for illuminating the performance metric of probability of being jammed for the radar beyond a conventional signal-to-noise ratio ($\mathsf{SNR}$) based analysis. Considering two frequency hopping strategies developed in the framework of reinforcement learning (RL), this performance metric is analyzed with deep Q-network (DQN) and long short term memory (LSTM) networks under various uncertainty values. Finally, the requirement of the target network in the RL algorithm for both network architectures is replaced with a softmax operator. Simulation results show that this operator improves upon the performance of the traditional target network.
Tasks
Published	2019-11-20
URL	https://arxiv.org/abs/1911.08874v2
PDF	https://arxiv.org/pdf/1911.08874v2.pdf
PWC	https://paperswithcode.com/paper/avoiding-jammers-a-reinforcement-learning
Repo
Framework
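
One way to picture the replacement of a target network by a softmax operator is to compute the bootstrap value in the temporal-difference target as a Boltzmann-weighted average of the online network's Q-values, rather than as a hard maximum taken from a separate target network. The sketch below assumes this Boltzmann form; the inverse temperature `beta`, the function names, and the example Q-values are hypothetical, and the exact operator used in the paper may differ.

```python
import math


def boltzmann_softmax_value(q_values, beta):
    """Softmax-weighted average of Q-values; beta=0 gives the mean, large beta approaches the max."""
    q_max = max(q_values)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def td_target(reward, next_q_values, gamma=0.99, beta=5.0, done=False):
    """One-step TD target that bootstraps from the softmax value of the next state."""
    if done:
        return reward
    return reward + gamma * boltzmann_softmax_value(next_q_values, beta)


if __name__ == "__main__":
    next_q = [1.0, 2.0, 3.0]  # hypothetical Q-values for three frequency-hopping actions
    print(td_target(reward=1.0, next_q_values=next_q))
```

The single parameter `beta` interpolates between averaging over actions and taking the greedy maximum, which is the design knob such a sketch exposes in place of a periodically synchronized target network.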

Universal Adversarial Perturbations Against Person Re-Identification


Title	Universal Adversarial Perturbations Against Person Re-Identification
Authors	Wenjie Ding, Xing Wei, Xiaopeng Hong, Rongrong Ji, Yihong Gong
Abstract	Person re-identification (re-ID) has made great progress and achieved high performance in recent years with the development of deep learning. However, although re-ID is a security-related application, few studies consider the safety of person re-ID systems. In this paper, we attempt to explore the robustness of current person re-ID models against adversarial samples. Specifically, we attack the re-ID models using universal adversarial perturbations (UAPs), which are especially dangerous to surveillance systems because they could fool the model on most pedestrian images with little overhead. Existing methods for UAPs mainly consider classification models, while tasks in open set scenarios like re-ID are rarely explored. Re-ID attacks differ from classification attacks in that the former discard the decision boundary at test time and care more about the ranking list. Therefore, we propose an effective method to train UAPs against person re-ID models from the global list-wise perspective. Furthermore, to increase the impact of the attack on different models and datasets, we propose a novel UAP learning method based on total variation minimization. Extensive experiments validate the effectiveness of our proposed method.
Tasks	Person Re-Identification
Published	2019-10-30
URL	https://arxiv.org/abs/1910.14184v1
PDF	https://arxiv.org/pdf/1910.14184v1.pdf
PWC	https://paperswithcode.com/paper/universal-adversarial-perturbations-against-2
Repo
Framework

Impact of Prior Knowledge and Data Correlation on Privacy Leakage: A Unified Analysis


Title	Impact of Prior Knowledge and Data Correlation on Privacy Leakage: A Unified Analysis
Authors	Yanan Li, Xuebin Ren, Shusen Yang, Xinyu Yang
Abstract	It has been widely understood that differential privacy (DP) can guarantee rigorous privacy against adversaries with arbitrary prior knowledge. However, recent studies demonstrate that this may not be true for correlated data, and indicate that three factors could influence privacy leakage: the data correlation pattern, prior knowledge of adversaries, and sensitivity of the query function. This poses a fundamental problem: what is the mathematical relationship between the three factors and privacy leakage? In this paper, we present a unified analysis of this problem. A new privacy definition, named prior differential privacy (PDP), is proposed to evaluate privacy leakage considering the exact prior knowledge possessed by the adversary. We use two models, the weighted hierarchical graph (WHG) and the multivariate Gaussian model to analyze discrete and continuous data, respectively. We demonstrate that positive, negative, and hybrid correlations have distinct impacts on privacy leakage. Considering general correlations, a closed-form expression of privacy leakage is derived for continuous data, and a chain rule is presented for discrete data. Our results are valid for general linear queries, including count, sum, mean, and histogram. Numerical experiments are presented to verify our theoretical analysis.
Tasks
Published	2019-06-05
URL	https://arxiv.org/abs/1906.02606v1
PDF	https://arxiv.org/pdf/1906.02606v1.pdf
PWC	https://paperswithcode.com/paper/impact-of-prior-knowledge-and-data
Repo
Framework

On-Device Machine Learning: An Algorithms and Learning Theory Perspective


Title	On-Device Machine Learning: An Algorithms and Learning Theory Perspective
Authors	Sauptik Dhar, Junyao Guo, Jiayi Liu, Samarth Tripathi, Unmesh Kurup, Mohak Shah
Abstract	The current paradigm for using machine learning models on a device is to train a model in the cloud and perform inference using the trained model on the device. However, with the increasing number of smart devices and improved hardware, there is interest in performing model training on the device. Given this surge in interest, a comprehensive survey of the field from a device-agnostic perspective sets the stage for both understanding the state-of-the-art and for identifying open challenges and future avenues of research. Since on-device learning is an expansive field with connections to a large number of related topics in AI and machine learning (including online learning, model adaptation, one/few-shot learning, etc), covering such a large number of topics in a single survey is impractical. Instead, this survey finds a middle ground by reformulating the problem of on-device learning as resource constrained learning where the resources are compute and memory. This reformulation allows tools, techniques, and algorithms from a wide variety of research areas to be compared equitably. In addition to summarizing the state of the art, the survey also identifies a number of challenges and next steps for both the algorithmic and theoretical aspects of on-device learning.
Tasks	Few-Shot Learning
Published	2019-11-02
URL	https://arxiv.org/abs/1911.00623v1
PDF	https://arxiv.org/pdf/1911.00623v1.pdf
PWC	https://paperswithcode.com/paper/on-device-machine-learning-an-algorithms-and
Repo
Framework

Complexity of Highly Parallel Non-Smooth Convex Optimization


Title	Complexity of Highly Parallel Non-Smooth Convex Optimization
Authors	Sébastien Bubeck, Qijia Jiang, Yin Tat Lee, Yuanzhi Li, Aaron Sidford
Abstract	A landmark result of non-smooth convex optimization is that gradient descent is an optimal algorithm whenever the number of computed gradients is smaller than the dimension $d$. In this paper we study the extension of this result to the parallel optimization setting. Namely we consider optimization algorithms interacting with a highly parallel gradient oracle, that is one that can answer $\mathrm{poly}(d)$ gradient queries in parallel. We show that in this case gradient descent is optimal only up to $\tilde{O}(\sqrt{d})$ rounds of interactions with the oracle. The lower bound improves upon a decades old construction by Nemirovski which proves optimality only up to $d^{1/3}$ rounds (as recently observed by Balkanski and Singer), and the suboptimality of gradient descent after $\sqrt{d}$ rounds was already observed by Duchi, Bartlett and Wainwright. In the latter regime we propose a new method with improved complexity, which we conjecture to be optimal. The analysis of this new method is based upon a generalized version of the recent results on optimal acceleration for highly smooth convex optimization.
Tasks
Published	2019-06-25
URL	https://arxiv.org/abs/1906.10655v1
PDF	https://arxiv.org/pdf/1906.10655v1.pdf
PWC	https://paperswithcode.com/paper/complexity-of-highly-parallel-non-smooth
Repo
Framework

Hindsight Analysis of the Chicago Food Inspection Forecasting Model


Title	Hindsight Analysis of the Chicago Food Inspection Forecasting Model
Authors	Vinesh Kannan, Matthew A. Shapiro, Mustafa Bilgic
Abstract	The Chicago Department of Public Health (CDPH) conducts routine food inspections of over 15,000 food establishments to ensure the health and safety of their patrons. In 2015, CDPH deployed a machine learning model to schedule inspections of establishments based on their likelihood to commit critical food code violations. The City of Chicago released the training data and source code for the model, allowing anyone to examine the model. We provide the first independent analysis of the model, the data, the predictor variables, the performance metrics, and the underlying assumptions. We present a summary of our findings, share lessons learned, and make recommendations to address some of the issues our analysis unearthed.
Tasks
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04906v1
PDF	https://arxiv.org/pdf/1910.04906v1.pdf
PWC	https://paperswithcode.com/paper/hindsight-analysis-of-the-chicago-food
Repo
Framework