January 27, 2020


Paper Group ANR 1221



Fast approximation of orthogonal matrices and application to PCA

Title Fast approximation of orthogonal matrices and application to PCA
Authors Cristian Rusu, Lorenzo Rosasco
Abstract We study the problem of approximating orthogonal matrices so that their application is numerically fast and yet accurate. We find an approximation by solving an optimization problem over a set of structured matrices that we call extended orthogonal Givens transformations, which include Givens rotations as a special case. We propose an efficient greedy algorithm to solve such a problem and show that it strikes a balance between approximation accuracy and speed of computation. The approach is relevant to spectral methods and we illustrate its application to PCA.
Tasks
Published 2019-07-18
URL https://arxiv.org/abs/1907.08697v3
PDF https://arxiv.org/pdf/1907.08697v3.pdf
PWC https://paperswithcode.com/paper/fast-approximation-of-orthogonal-matrices-and
Repo
Framework
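The greedy construction can be pictured with plain Givens rotations. This is only the flavor of the approach, not the paper's algorithm: the extended orthogonal Givens transformations and the actual selection rule are more refined, and the pair search here is brute force.

```python
import numpy as np

def givens(n, i, j, theta):
    """Dense Givens rotation G(i, j, theta): identity except on the
    (i, j) plane, where it rotates by theta."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = G[j, j] = c
    G[i, j], G[j, i] = -s, s
    return G

def greedy_givens_approx(U, n_rotations):
    """Greedily accumulate Givens rotations approaching the orthogonal
    matrix U in Frobenius norm. Each step picks the plane (i, j) and the
    angle with the largest trace improvement, which has a closed form."""
    n = U.shape[0]
    approx = np.eye(n)
    rotations = []
    for _ in range(n_rotations):
        R = approx.T @ U                          # residual still to explain
        best = (-1.0, 0, 1, 0.0)
        for i in range(n):
            for j in range(i + 1, n):
                a = R[i, i] + R[j, j]
                b = R[j, i] - R[i, j]
                improvement = np.hypot(a, b) - a  # gain over doing nothing
                if improvement > best[0]:
                    best = (improvement, i, j, np.arctan2(b, a))
        _, i, j, theta = best
        approx = approx @ givens(n, i, j, theta)
        rotations.append((i, j, theta))
    return approx, rotations
```

Applying the stored rotations to a vector costs a constant number of operations each, which is where the speedup over a dense n-by-n multiply comes from.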

Can a Robot Become a Movie Director? Learning Artistic Principles for Aerial Cinematography

Title Can a Robot Become a Movie Director? Learning Artistic Principles for Aerial Cinematography
Authors Mirko Gschwindt, Efe Camci, Rogerio Bonatti, Wenshan Wang, Erdal Kayacan, Sebastian Scherer
Abstract Aerial filming is constantly gaining importance due to recent advances in drone technology. It invites many intriguing, unsolved problems at the intersection of aesthetic and scientific challenges. In this work, we propose a deep reinforcement learning agent that supervises motion planning of a filming drone by making desirable shot-mode selections based on the aesthetic value of video shots. Unlike most current state-of-the-art approaches, which require explicit guidance by a human expert, our drone learns how to make favorable viewpoint selections from experience. We propose a learning scheme that exploits aesthetic features of retrospective shots in order to extract a desirable policy for better prospective shots. We train our agent in realistic AirSim simulations using both a hand-crafted reward function and reward from direct human input. We then deploy the same agent on a real DJI M210 drone in order to test how well our approach generalizes to real-world conditions. Finally, we conduct a comprehensive user study in which participants rate the shot quality of our methods. Videos of the system in action can be seen at https://youtu.be/qmVw6mfyEmw.
Tasks Motion Planning
Published 2019-04-04
URL https://arxiv.org/abs/1904.02579v2
PDF https://arxiv.org/pdf/1904.02579v2.pdf
PWC https://paperswithcode.com/paper/can-a-robot-become-a-movie-director-learning
Repo
Framework
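The core loop of learning shot-mode selections from reward can be sketched with tabular Q-learning. This is a toy stand-in for the paper's deep RL agent, which uses learned aesthetic features as state rather than a small table; states, modes, and rewards below are hypothetical.

```python
import numpy as np

class ShotModeAgent:
    """Toy tabular Q-learning over a handful of discrete shot modes."""
    def __init__(self, n_states, n_modes, lr=0.5, gamma=0.9, eps=0.2):
        self.q = np.zeros((n_states, n_modes))
        self.lr, self.gamma, self.eps = lr, gamma, eps

    def act(self, s, rng):
        # Epsilon-greedy: mostly exploit the current shot-mode estimates.
        if rng.random() < self.eps:
            return int(rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q[s]))

    def learn(self, s, a, r, s_next):
        # Standard one-step temporal-difference update.
        td = r + self.gamma * self.q[s_next].max() - self.q[s, a]
        self.q[s, a] += self.lr * td
```

With reward coming either from a hand-crafted aesthetic score or from human ratings, as in the paper, the same update applies unchanged.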

Comparing Efficiency of Expert Data Aggregation Methods

Title Comparing Efficiency of Expert Data Aggregation Methods
Authors Sergii Kadenko, Vitaliy Tsyganok
Abstract Expert estimation of objects takes place when there are no benchmark values of object weights, but these weights still have to be defined. This makes it problematic to define the efficiency of expert estimation methods. We propose to define the efficiency of such methods based on the stability of their results under perturbations of input data. We compare two modifications of a combinatorial method of expert data aggregation (spanning tree enumeration). Using these two methods as an example, we illustrate two approaches to efficiency evaluation. The first approach is based on real data, obtained through estimation of a set of model objects by a group of experts. The second approach is based on simulation of the whole expert examination cycle (including expert estimates). When evaluating the efficiency of the two modifications of the combinatorial expert data aggregation method, the simulation-based approach proved more robust and credible. Our experimental study confirms that if the weights of spanning trees are taken into consideration, the results of the combinatorial data aggregation method become more stable. Thus, the weighted spanning tree enumeration method has an advantage over the non-weighted method (and, consequently, over the logarithmic least squares and row geometric mean methods).
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.04888v1
PDF https://arxiv.org/pdf/1911.04888v1.pdf
PWC https://paperswithcode.com/paper/comparing-efficiency-of-expert-data
Repo
Framework
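As a reference point for the baselines named at the end of the abstract, the row geometric mean method recovers priority weights from a reciprocal pairwise-comparison matrix. A minimal sketch, illustrative only and not the paper's spanning-tree method:

```python
import numpy as np

def row_geometric_mean(P):
    """Priority weights from a reciprocal pairwise-comparison matrix P,
    where P[i, j] is the expert's estimate of the ratio w_i / w_j.
    The weight vector is the normalized geometric mean of each row."""
    g = np.prod(P, axis=1) ** (1.0 / P.shape[0])
    return g / g.sum()
```

For a perfectly consistent matrix this recovers the true weights exactly; the paper's point is how such methods behave when the input comparisons are perturbed.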

Target Tracking for Contextual Bandits: Application to Demand Side Management

Title Target Tracking for Contextual Bandits: Application to Demand Side Management
Authors Margaux Brégère, Pierre Gaillard, Yannig Goude, Gilles Stoltz
Abstract We propose a contextual-bandit approach for demand side management by offering price incentives. More precisely, a target mean consumption is set at each round and the mean consumption is modeled as a complex function of the distribution of prices sent and of some contextual variables such as the temperature, weather, and so on. The performance of our strategies is measured in quadratic losses through a regret criterion. We offer $T^{2/3}$ upper bounds on this regret (up to poly-logarithmic terms)—and even faster rates under stronger assumptions—for strategies inspired by standard strategies for contextual bandits (like LinUCB, see Li et al., 2010). Simulations on a real data set gathered by UK Power Networks, in which price incentives were offered, show that our strategies are effective and may indeed manage demand response by suitably picking the price levels.
Tasks Multi-Armed Bandits
Published 2019-01-28
URL https://arxiv.org/abs/1901.09532v2
PDF https://arxiv.org/pdf/1901.09532v2.pdf
PWC https://paperswithcode.com/paper/target-tracking-for-contextual-bandits
Repo
Framework
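The LinUCB baseline referenced above (Li et al., 2010) fits in a few lines. This is the generic contextual-bandit machinery the paper's strategies are inspired by, not the target-tracking strategy itself:

```python
import numpy as np

class LinUCB:
    """Minimal LinUCB: per-arm ridge-regression estimate of the reward
    plus an upper-confidence bonus that shrinks as the arm is explored."""
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm X^T X + I
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm X^T y

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                     # ridge estimate
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)
            scores.append(theta @ x + bonus)      # optimism in the face of uncertainty
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

In the demand-side-management setting, "arms" would correspond to price levels and the context to variables such as temperature and weather.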

Information-Geometric Set Embeddings (IGSE): From Sets to Probability Distributions

Title Information-Geometric Set Embeddings (IGSE): From Sets to Probability Distributions
Authors Ke Sun, Frank Nielsen
Abstract This letter introduces an abstract learning problem called set embedding: the objective is to map sets into probability distributions while losing as little information as possible. We relate set union and intersection operations to corresponding interpolations of probability distributions. We also demonstrate a preliminary solution with experimental results on toy set embedding examples.
Tasks
Published 2019-11-27
URL https://arxiv.org/abs/1911.12463v2
PDF https://arxiv.org/pdf/1911.12463v2.pdf
PWC https://paperswithcode.com/paper/information-geometric-set-embeddings-igse
Repo
Framework
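The union/intersection analogy can be made concrete with the two standard information-geometric interpolations between distributions. Whether these are exactly the paper's constructions is an assumption here; they are shown only to illustrate why the analogy is plausible:

```python
import numpy as np

def m_interpolate(p, q, t=0.5):
    """Mixture-family (m-) interpolation: an arithmetic mixture, so the
    support of the result is the union of the supports of p and q."""
    return (1 - t) * p + t * q

def e_interpolate(p, q, t=0.5):
    """Exponential-family (e-) interpolation: a normalized geometric mean,
    so mass survives only where both p and q place mass (intersection-like)."""
    r = p ** (1 - t) * q ** t
    return r / r.sum()
```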

ElectroLens: Understanding Atomistic Simulations Through Spatially-resolved Visualization of High-dimensional Features

Title ElectroLens: Understanding Atomistic Simulations Through Spatially-resolved Visualization of High-dimensional Features
Authors Xiangyun Lei, Fred Hohman, Duen Horng Chau, Andrew J. Medford
Abstract In recent years, machine learning (ML) has gained significant popularity in the field of chemical informatics and electronic structure theory. These techniques often require researchers to engineer abstract “features” that encode chemical concepts into a mathematical form compatible with the input to machine-learning models. However, there is no existing tool to connect these abstract features back to the actual chemical system, making it difficult to diagnose failures and to build intuition about the meaning of the features. We present ElectroLens, a new visualization tool for high-dimensional, spatially-resolved features that tackles this problem. The tool visualizes high-dimensional data sets of atomistic and electron-environment features through a series of linked 3D views and 2D plots, and it connects different derived features to their corresponding regions in 3D via interactive selection. It is built to be scalable and to integrate with existing infrastructure.
Tasks
Published 2019-08-20
URL https://arxiv.org/abs/1908.08381v1
PDF https://arxiv.org/pdf/1908.08381v1.pdf
PWC https://paperswithcode.com/paper/electrolens-understanding-atomistic
Repo
Framework

Explaining Visual Models by Causal Attribution

Title Explaining Visual Models by Causal Attribution
Authors Álvaro Parafita, Jordi Vitrià
Abstract Model explanations based on pure observational data cannot compute the effects of features reliably, due to their inability to estimate how each factor alteration could affect the rest. We argue that explanations should be based on the causal model of the data and the derived intervened causal models, that represent the data distribution subject to interventions. With these models, we can compute counterfactuals, new samples that will inform us how the model reacts to feature changes on our input. We propose a novel explanation methodology based on Causal Counterfactuals and identify the limitations of current Image Generative Models in their application to counterfactual creation.
Tasks
Published 2019-09-19
URL https://arxiv.org/abs/1909.08891v1
PDF https://arxiv.org/pdf/1909.08891v1.pdf
PWC https://paperswithcode.com/paper/explaining-visual-models-by-causal
Repo
Framework
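The observational-versus-interventional distinction driving the argument can be illustrated with a toy structural causal model. The variables and coefficients below are hypothetical, not the paper's image-domain models:

```python
import numpy as np

def sample_scm(n, rng, do_x=None):
    """Tiny linear SCM with a confounder: Z -> X and Z -> Y, X -> Y.
    Passing do_x performs the intervention do(X = x): the Z -> X edge is
    cut, which is exactly what conditioning on X in observational data
    fails to capture."""
    z = rng.normal(size=n)
    x = 2.0 * z + 0.1 * rng.normal(size=n) if do_x is None else np.full(n, do_x)
    y = 3.0 * x + z + 0.1 * rng.normal(size=n)
    return z, x, y
```

Under do(X = 1) the mean of Y is 3 (the direct effect only), whereas observationally conditioning on X near 1 also picks up the confounding path through Z.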

xSLUE: A Benchmark and Analysis Platform for Cross-Style Language Understanding and Evaluation

Title xSLUE: A Benchmark and Analysis Platform for Cross-Style Language Understanding and Evaluation
Authors Dongyeop Kang, Eduard Hovy
Abstract Every natural text is written in some style. The style is formed by a complex combination of different stylistic factors, including formality markers, emotions, metaphors, etc. Some factors implicitly reflect the author’s personality, while others are explicitly controlled by the author’s choices in order to achieve some personal or social goal. One cannot form a complete understanding of a text and its author without considering these factors. The factors combine and co-vary in complex ways to form styles. Studying the nature of the covarying combinations sheds light on stylistic language in general, sometimes called cross-style language understanding. This paper provides a benchmark corpus (xSLUE) with an online platform (http://xslue.com) for cross-style language understanding and evaluation. The benchmark contains text in 15 different styles and 23 classification tasks. For each task, we provide the fine-tuned classifier for further analysis. Our analysis shows that some styles are highly dependent on each other (e.g., impoliteness and offense), and some domains (e.g., tweets, political debates) are stylistically more diverse than others (e.g., academic manuscripts). We discuss the technical challenges of cross-style understanding and potential directions for future research: cross-style modeling which shares the internal representation for low-resource or low-performance styles and other applications such as cross-style generation.
Tasks
Published 2019-11-09
URL https://arxiv.org/abs/1911.03663v1
PDF https://arxiv.org/pdf/1911.03663v1.pdf
PWC https://paperswithcode.com/paper/xslue-a-benchmark-and-analysis-platform-for
Repo
Framework

A Personalized Preference Learning Framework for Caching in Mobile Networks

Title A Personalized Preference Learning Framework for Caching in Mobile Networks
Authors Adeel Malik, Joongheon Kim, Kwang Soon Kim, Won-Yong Shin
Abstract This paper comprehensively studies a content-centric mobile network based on a preference learning framework, where each mobile user is equipped with a finite-size cache. We consider a practical scenario where each user requests a content file according to its own preferences, which is motivated by the existence of heterogeneity in file preferences among different users. Under our model, we consider a single-hop-based device-to-device (D2D) content delivery protocol and characterize the average hit ratio for the following two file preference cases: the personalized file preferences and the common file preferences. By assuming that the model parameters such as user activity levels, user file preferences, and file popularity are unknown and thus need to be inferred, we present a collaborative filtering (CF)-based approach to learn these parameters. Then, we reformulate the hit ratio maximization problems into a submodular function maximization and propose two computationally efficient algorithms including a greedy approach to efficiently solve the cache allocation problems. We analyze the computational complexity of each algorithm. Moreover, we analyze the corresponding level of the approximation that our greedy algorithm can achieve compared to the optimal solution. Using a real-world dataset, we demonstrate that the proposed framework employing the personalized file preferences brings substantial gains over its counterpart for various system parameters.
Tasks
Published 2019-04-15
URL https://arxiv.org/abs/1904.06744v3
PDF https://arxiv.org/pdf/1904.06744v3.pdf
PWC https://paperswithcode.com/paper/a-personalized-preference-learning-framework
Repo
Framework
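The greedy step for the submodular reformulation mentioned above looks as follows in the abstract. The coverage-style objective in the test is a stand-in for the paper's hit-ratio function:

```python
def greedy_cache(contents, cache_size, gain):
    """Greedy maximization of a monotone submodular set function `gain`
    (e.g. an expected cache hit ratio): repeatedly add the content with
    the largest marginal gain. For such objectives this classic greedy
    rule achieves a (1 - 1/e) approximation of the optimum."""
    cached = set()
    for _ in range(cache_size):
        best = max((c for c in contents if c not in cached),
                   key=lambda c: gain(cached | {c}) - gain(cached))
        cached.add(best)
    return cached
```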

Sparsity Emerges Naturally in Neural Language Models

Title Sparsity Emerges Naturally in Neural Language Models
Authors Naomi Saphra, Adam Lopez
Abstract Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally become roughly sparse? Using the Taxi-Euclidean norm to measure sparsity, we find that frequent input words are associated with concentrated or sparse activations, while frequent target words are associated with dispersed activations but concentrated gradients. We find that gradients associated with function words are more concentrated than the gradients of content words, even controlling for word frequency.
Tasks
Published 2019-07-22
URL https://arxiv.org/abs/1908.01817v1
PDF https://arxiv.org/pdf/1908.01817v1.pdf
PWC https://paperswithcode.com/paper/sparsity-emerges-naturally-in-neural-language
Repo
Framework
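The name "Taxi-Euclidean norm" presumably refers to the ratio of the taxicab (L1) norm to the Euclidean (L2) norm, a classic scale-invariant sparsity measure; the paper's exact definition may differ, so treat this as an assumption:

```python
import numpy as np

def taxi_euclidean(v):
    """Ratio of the taxicab (L1) norm to the Euclidean (L2) norm.
    Ranges from 1 (a single nonzero entry: maximally concentrated)
    to sqrt(len(v)) (all entries equal: maximally dispersed),
    so lower values mean sparser activations or gradients."""
    v = np.asarray(v, dtype=float)
    return np.abs(v).sum() / np.sqrt((v ** 2).sum())
```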

Precision disease networks (PDN)

Title Precision disease networks (PDN)
Authors J. Cabrera, D. Amaratunga, W. Kostis, J Kostis
Abstract This paper presents a method for building patient-based networks that we call precision disease networks (PDNs), and their use for predicting medical outcomes. Our methodology consists of building one network per patient that describes the disease evolution of that patient (the PDN), and storing the networks as a set of features in a dataset of PDNs, one per observation. We cluster the PDN data and study the within- and between-cluster variability. In addition, we develop data visualization techniques in order to display, compare, and summarize the network data. Finally, we analyze a dataset of heart disease patients from MIDAS (the Myocardial Infarction Data Acquisition System), a New Jersey statewide database, in order to show that the network data improve the prediction of important patient outcomes such as death or cardiovascular death, when compared with standard statistical analysis.
Tasks
Published 2019-10-30
URL https://arxiv.org/abs/1910.14460v1
PDF https://arxiv.org/pdf/1910.14460v1.pdf
PWC https://paperswithcode.com/paper/precision-disease-networks-pdn
Repo
Framework

$\texttt{DeepSqueeze}$: Decentralization Meets Error-Compensated Compression

Title $\texttt{DeepSqueeze}$: Decentralization Meets Error-Compensated Compression
Authors Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Ji Liu
Abstract Communication is a key bottleneck in distributed training. Recently, an \emph{error-compensated} compression technique designed for \emph{centralized} learning has achieved notable success, showing significant advantages over state-of-the-art compression-based methods in saving communication cost. Since \emph{decentralized} training has been shown to be superior to traditional \emph{centralized} training in communication-restricted scenarios, a natural question is how to apply the error-compensated technique to decentralized learning to further reduce the communication cost. However, a trivial extension of compression-based centralized training algorithms does not exist for the decentralized scenario: a key difference between centralized and decentralized training makes this extension extremely non-trivial. In this paper, we propose an algorithmic design, named $\texttt{DeepSqueeze}$, that employs error-compensated stochastic gradient descent in the decentralized scenario. Both theoretical analysis and an empirical study show that the proposed $\texttt{DeepSqueeze}$ algorithm outperforms existing compression-based decentralized learning algorithms. To the best of our knowledge, this is the first work to apply error-compensated compression to decentralized learning.
Tasks
Published 2019-07-17
URL https://arxiv.org/abs/1907.07346v2
PDF https://arxiv.org/pdf/1907.07346v2.pdf
PWC https://paperswithcode.com/paper/textttdeepsqueeze-parallel-stochastic
Repo
Framework
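The error-compensation mechanism itself is easy to sketch in the single-worker case: the part of the gradient the compressor drops is remembered and re-injected at the next step instead of being lost. This shows the mechanism DeepSqueeze extends to the decentralized setting, not the paper's multi-worker algorithm:

```python
import numpy as np

def top_k(v, k):
    """Keep only the k largest-magnitude entries (a standard compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ec_sgd(grad, x0, lr=0.1, k=1, steps=300):
    """Error-compensated compressed SGD, single-worker sketch."""
    x, e = x0.copy(), np.zeros_like(x0)
    for _ in range(steps):
        g = grad(x) + e      # add back the remembered compression error
        c = top_k(g, k)      # only this compressed part would be transmitted
        e = g - c            # remember what the compressor dropped
        x -= lr * c
    return x
```

Without the error term `e`, top-k compression can stall on coordinates that are never selected; with it, every coordinate's contribution is eventually applied.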

Sublinear Time Numerical Linear Algebra for Structured Matrices

Title Sublinear Time Numerical Linear Algebra for Structured Matrices
Authors Xiaofei Shi, David P. Woodruff
Abstract We show how to solve a number of problems in numerical linear algebra, such as least squares regression, $\ell_p$-regression for any $p \geq 1$, low rank approximation, and kernel regression, in time $T(A) \cdot \mathrm{poly}(\log(nd))$, where for a given input matrix $A \in \mathbb{R}^{n \times d}$, $T(A)$ is the time needed to compute $A\cdot y$ for an arbitrary vector $y \in \mathbb{R}^d$. Since $T(A) \leq O(\mathrm{nnz}(A))$, where $\mathrm{nnz}(A)$ denotes the number of non-zero entries of $A$, the time is no worse, up to polylogarithmic factors, than all of the recent advances for such problems that run in input-sparsity time. However, for many applications, $T(A)$ can be much smaller than $\mathrm{nnz}(A)$, yielding significantly sublinear time algorithms. For example, in the overconstrained $(1+\epsilon)$-approximate polynomial interpolation problem, $A$ is a Vandermonde matrix and $T(A) = O(n \log n)$; in this case our running time is $n \cdot \mathrm{poly}(\log n) + \mathrm{poly}(d/\epsilon)$ and we recover the results of Avron et al. (2013) as a special case. For overconstrained autoregression, which is a common problem arising in dynamical systems, $T(A) = O(n \log n)$, and we immediately obtain $n \cdot \mathrm{poly}(\log n) + \mathrm{poly}(d/\epsilon)$ time. For kernel autoregression, we significantly improve the running time of prior algorithms for general kernels. For the important case of autoregression with the polynomial kernel and arbitrary target vector $b\in\mathbb{R}^n$, we obtain even faster algorithms. Our algorithms show that, perhaps surprisingly, most of these optimization problems do not require much more time than a polylogarithmic number of matrix-vector multiplications.
Tasks
Published 2019-12-12
URL https://arxiv.org/abs/1912.06060v1
PDF https://arxiv.org/pdf/1912.06060v1.pdf
PWC https://paperswithcode.com/paper/sublinear-time-numerical-linear-algebra-for
Repo
Framework
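The access model behind the result is worth making concrete: the algorithms only ever touch $A$ through fast products $A \cdot y$. A simple (non-sublinear) illustration of that model is least squares solved purely through matvec callbacks, here via steepest descent with exact line search rather than anything from the paper:

```python
import numpy as np

def lsq_via_matvec(matvec, rmatvec, b, dim, steps=500):
    """Least squares without ever materializing the matrix A: only
    matvec(y) = A @ y and rmatvec(y) = A^T @ y are called, so a structured
    A (Vandermonde, Toeplitz, ...) with a fast multiply keeps every
    iteration cheap."""
    x = np.zeros(dim)
    for _ in range(steps):
        r = matvec(x) - b                     # residual A x - b
        g = rmatvec(r)                        # gradient of 0.5 * ||A x - b||^2
        Ag = matvec(g)
        step = (g @ g) / (Ag @ Ag + 1e-30)    # exact line search along -g
        x -= step * g
    return x
```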

Kernel Wasserstein Distance

Title Kernel Wasserstein Distance
Authors Jung Hun Oh, Maryam Pouryahya, Aditi Iyer, Aditya P. Apte, Allen Tannenbaum, Joseph O. Deasy
Abstract The Wasserstein distance is a powerful metric based on the theory of optimal transport. It gives a natural measure of the distance between two distributions with a wide range of applications. In contrast to a number of the common divergences on distributions such as Kullback-Leibler or Jensen-Shannon, it is (weakly) continuous, and thus ideal for analyzing corrupted data. To date, however, no kernel methods for dealing with nonlinear data have been proposed via the Wasserstein distance. In this work, we develop a novel method to compute the L2-Wasserstein distance in a kernel space implemented using the kernel trick. The latter is a general method in machine learning employed to handle data in a nonlinear manner. We evaluate the proposed approach in identifying computerized tomography (CT) slices with dental artifacts in head and neck cancer, performing unsupervised hierarchical clustering on the resulting Wasserstein distance matrix that is computed on imaging texture features extracted from each CT slice. Our experiments show that the kernel approach outperforms classical non-kernel approaches in identifying CT slices with artifacts.
Tasks
Published 2019-05-22
URL https://arxiv.org/abs/1905.09314v1
PDF https://arxiv.org/pdf/1905.09314v1.pdf
PWC https://paperswithcode.com/paper/kernel-wasserstein-distance
Repo
Framework
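A concrete anchor for the distance itself: between two Gaussians, the L2-Wasserstein distance has a closed form in the means and covariances. The kernelized version in the paper replaces these statistics with kernel-space quantities, which is omitted here:

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric PSD matrix square root via an eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

def gaussian_w2(m1, C1, m2, C2):
    """Closed-form L2-Wasserstein distance between N(m1, C1) and N(m2, C2):
    W2^2 = ||m1 - m2||^2 + tr(C1 + C2 - 2 (C2^{1/2} C1 C2^{1/2})^{1/2})."""
    s = psd_sqrt(C2)
    cross = psd_sqrt(s @ C1 @ s)
    w2_sq = np.sum((m1 - m2) ** 2) + np.trace(C1 + C2 - 2 * cross)
    return np.sqrt(max(w2_sq, 0.0))  # clamp tiny negative rounding error
```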

A Comparative Study for Unsupervised Network Representation Learning

Title A Comparative Study for Unsupervised Network Representation Learning
Authors Megha Khosla, Vinay Setty, Avishek Anand
Abstract There has been appreciable progress in unsupervised network representation learning (UNRL) approaches over graphs recently, with flexible random-walk approaches, new optimization objectives, and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. In this paper we theoretically group different approaches under a unifying framework and empirically investigate the effectiveness of different network representation methods. In particular, we argue that most UNRL approaches either explicitly or implicitly model and exploit the context information of a node. Consequently, we propose a framework that casts a variety of approaches – random walk based, matrix factorization based, and deep learning based – into a unified context-based optimization function. We systematically group the methods based on their similarities and differences, and study these differences in detail, which we later use to explain their performance differences on downstream tasks. We conduct a large-scale empirical study considering 9 popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks – node classification and link prediction. We find that there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, the task, and the structural properties of the underlying graph. In addition, we report the common pitfalls in evaluating UNRL methods and offer suggestions for experimental design and interpretation of results.
Tasks Link Prediction, Node Classification, Representation Learning
Published 2019-03-19
URL https://arxiv.org/abs/1903.07902v6
PDF https://arxiv.org/pdf/1903.07902v6.pdf
PWC https://paperswithcode.com/paper/a-comprehensive-comparison-of-unsupervised
Repo
Framework
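The unifying view above, that UNRL methods model a node's context, can be sketched end to end: random walks produce contexts, windowed co-occurrence counts summarize them, and a factorization yields embeddings. This is a DeepWalk-flavored toy, not any single method from the study:

```python
import numpy as np

def walk_context_embedding(adj, dim, walk_len=40, n_walks=10, window=2, seed=0):
    """Node embeddings from random-walk contexts: accumulate windowed
    co-occurrence counts along walks, dampen them, and factorize with an
    SVD. `adj` maps each node to the list of its neighbors."""
    rng = np.random.default_rng(seed)
    n = len(adj)
    C = np.zeros((n, n))
    for start in range(n):
        for _ in range(n_walks):
            walk, v = [start], start
            for _ in range(walk_len - 1):
                v = int(rng.choice(adj[v]))
                walk.append(v)
            for i, u in enumerate(walk):          # count windowed co-occurrences
                for w in walk[max(0, i - window): i]:
                    C[u, w] += 1
                    C[w, u] += 1
    M = np.log1p(C)                               # dampen raw counts
    U, S, _ = np.linalg.svd(M)
    return U[:, :dim] * np.sqrt(S[:dim])
```

Swapping the walk strategy, the count transformation, or the factorization recovers much of the design space the paper's unified objective covers.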