July 28, 2019

3088 words 15 mins read

Paper Group ANR 233

Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events. On the Power of Truncated SVD for General High-rank Matrix Estimation Problems. Learning Semantic Correspondences in Technical Documentation. Analysis of Thompson Sampling for Gaussian Process Optimization in the Bandit Setting. Crim …

Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events


Title	Balancing Novelty and Salience: Adaptive Learning to Rank Entities for Timeline Summarization of High-impact Events
Authors	Tuan Tran, Claudia Niederée, Nattiya Kanhabua, Ujwal Gadiraju, Avishek Anand
Abstract	Long-running, high-impact events such as the Boston Marathon bombing often develop through many stages and involve a large number of entities in their unfolding. Timeline summarization of an event by key sentences eases story digestion, but does not distinguish between what a user remembers and what she might want to re-check. In this work, we present a novel approach for timeline summarization of high-impact events, which uses entities instead of sentences for summarizing the event at each individual point in time. Such entity summaries can serve as both (1) important memory cues in a retrospective event consideration and (2) pointers for personalized event exploration. In order to automatically create such summaries, it is crucial to identify the “right” entities for inclusion. We propose to learn a ranking function for entities, with a dynamically adapted trade-off between the in-document salience of entities and the informativeness of entities across documents, i.e., the level of new information associated with an entity for a time point under consideration. Furthermore, for capturing collective attention for an entity we use an innovative soft labeling approach based on Wikipedia. Our experiments on a real large news datasets confirm the effectiveness of the proposed methods.
Tasks	Learning-To-Rank, Timeline Summarization
Published	2017-01-14
URL	http://arxiv.org/abs/1701.03947v1
PDF	http://arxiv.org/pdf/1701.03947v1.pdf
PWC	https://paperswithcode.com/paper/balancing-novelty-and-salience-adaptive
Repo
Framework

On the Power of Truncated SVD for General High-rank Matrix Estimation Problems


Title	On the Power of Truncated SVD for General High-rank Matrix Estimation Problems
Authors	Simon S. Du, Yining Wang, Aarti Singh
Abstract	We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\widehat{A}-A_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max{\epsilon^{-4},k^2}\mu_0^2\A_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\widehat{A}_k-A_F \leq (1+O(\epsilon))\A-A_k_F$ with high probability. (2)High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\widehat{A}_k-A_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\A-A_k_F + O(\sqrt{k}\nu)$. (3) Low-rank Estimation of high-dimensional covariance: Given $N$ i.i.d.~samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max{\epsilon^{-4},k^2}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\widehat{A}_k-A_F \leq (1+O(\epsilon))\A-A_k_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
Tasks	Matrix Completion
Published	2017-02-22
URL	http://arxiv.org/abs/1702.06861v2
PDF	http://arxiv.org/pdf/1702.06861v2.pdf
PWC	https://paperswithcode.com/paper/on-the-power-of-truncated-svd-for-general
Repo
Framework

Learning Semantic Correspondences in Technical Documentation


Title	Learning Semantic Correspondences in Technical Documentation
Authors	Kyle Richardson, Jonas Kuhn
Abstract	We consider the problem of translating high-level textual descriptions to formal representations in technical documentation as part of an effort to model the meaning of such documentation. We focus specifically on the problem of learning translational correspondences between text descriptions and grounded representations in the target documentation, such as formal representation of functions or code templates. Our approach exploits the parallel nature of such documentation, or the tight coupling between high-level text and the low-level representations we aim to learn. Data is collected by mining technical documents for such parallel text-representation pairs, which we use to train a simple semantic parsing model. We report new baseline results on sixteen novel datasets, including the standard library documentation for nine popular programming languages across seven natural languages, and a small collection of Unix utility manuals.
Tasks	Semantic Parsing
Published	2017-05-13
URL	http://arxiv.org/abs/1705.04815v1
PDF	http://arxiv.org/pdf/1705.04815v1.pdf
PWC	https://paperswithcode.com/paper/learning-semantic-correspondences-in
Repo
Framework

Analysis of Thompson Sampling for Gaussian Process Optimization in the Bandit Setting


Title	Analysis of Thompson Sampling for Gaussian Process Optimization in the Bandit Setting
Authors	Kinjal Basu, Souvik Ghosh
Abstract	We consider the global optimization of a function over a continuous domain. At every evaluation attempt, we can observe the function at a chosen point in the domain and we reap the reward of the value observed. We assume that drawing these observations are expensive and noisy. We frame it as a continuum-armed bandit problem with a Gaussian Process prior on the function. In this regime, most algorithms have been developed to minimize some form of regret. Contrary to this popular norm, in this paper, we study the convergence of the sequential point $\boldsymbol{x}^t$ to the global optimizer $\boldsymbol{x}^*$ for the Thompson Sampling approach. Under some assumptions and regularity conditions, we show an exponential rate of convergence to the true optimal.
Tasks
Published	2017-05-18
URL	http://arxiv.org/abs/1705.06808v3
PDF	http://arxiv.org/pdf/1705.06808v3.pdf
PWC	https://paperswithcode.com/paper/analysis-of-thompson-sampling-for-gaussian
Repo
Framework

Crime incidents embedding using restricted Boltzmann machines


Title	Crime incidents embedding using restricted Boltzmann machines
Authors	Shixiang Zhu, Yao Xie
Abstract	We present a new approach for detecting related crime series, by unsupervised learning of the latent feature embeddings from narratives of crime record via the Gaussian-Bernoulli Restricted Boltzmann Machines (RBM). This is a drastically different approach from prior work on crime analysis, which typically considers only time and location and at most category information. After the embedding, related cases are closer to each other in the Euclidean feature space, and the unrelated cases are far apart, which is a good property can enable subsequent analysis such as detection and clustering of related cases. Experiments over several series of related crime incidents hand labeled by the Atlanta Police Department reveal the promise of our embedding methods.
Tasks
Published	2017-10-28
URL	http://arxiv.org/abs/1710.10513v3
PDF	http://arxiv.org/pdf/1710.10513v3.pdf
PWC	https://paperswithcode.com/paper/crime-incidents-embedding-using-restricted
Repo
Framework

Sparse-TDA: Sparse Realization of Topological Data Analysis for Multi-Way Classification


Title	Sparse-TDA: Sparse Realization of Topological Data Analysis for Multi-Way Classification
Authors	Wei Guo, Krithika Manohar, Steven L. Brunton, Ashis G. Banerjee
Abstract	Topological data analysis (TDA) has emerged as one of the most promising techniques to reconstruct the unknown shapes of high-dimensional spaces from observed data samples. TDA, thus, yields key shape descriptors in the form of persistent topological features that can be used for any supervised or unsupervised learning task, including multi-way classification. Sparse sampling, on the other hand, provides a highly efficient technique to reconstruct signals in the spatial-temporal domain from just a few carefully-chosen samples. Here, we present a new method, referred to as the Sparse-TDA algorithm, that combines favorable aspects of the two techniques. This combination is realized by selecting an optimal set of sparse pixel samples from the persistent features generated by a vector-based TDA algorithm. These sparse samples are selected from a low-rank matrix representation of persistent features using QR pivoting. We show that the Sparse-TDA method demonstrates promising performance on three benchmark problems related to human posture recognition and image texture classification.
Tasks	Texture Classification, Topological Data Analysis
Published	2017-01-12
URL	http://arxiv.org/abs/1701.03212v4
PDF	http://arxiv.org/pdf/1701.03212v4.pdf
PWC	https://paperswithcode.com/paper/sparse-tda-sparse-realization-of-topological
Repo
Framework

A Framework for Knowledge Management and Automated Reasoning Applied on Intelligent Transport Systems


Title	A Framework for Knowledge Management and Automated Reasoning Applied on Intelligent Transport Systems
Authors	Aneta Vulgarakis Feljan, Athanasios Karapantelakis, Leonid Mokrushin, Hongxin Liang, Rafia Inam, Elena Fersman, Carlos R. B. Azevedo, Klaus Raizer, Ricardo S. Souza
Abstract	Cyber-Physical Systems in general, and Intelligent Transport Systems (ITS) in particular use heterogeneous data sources combined with problem solving expertise in order to make critical decisions that may lead to some form of actions e.g., driver notifications, change of traffic light signals and braking to prevent an accident. Currently, a major part of the decision process is done by human domain experts, which is time-consuming, tedious and error-prone. Additionally, due to the intrinsic nature of knowledge possession this decision process cannot be easily replicated or reused. Therefore, there is a need for automating the reasoning processes by providing computational systems a formal representation of the domain knowledge and a set of methods to process that knowledge. In this paper, we propose a knowledge model that can be used to express both declarative knowledge about the systems’ components, their relations and their current state, as well as procedural knowledge representing possible system behavior. In addition, we introduce a framework for knowledge management and automated reasoning (KMARF). The idea behind KMARF is to automatically select an appropriate problem solver based on formalized reasoning expertise in the knowledge base, and convert a problem definition to the corresponding format. This approach automates reasoning, thus reducing operational costs, and enables reusability of knowledge and methods across different domains. We illustrate the approach on a transportation planning use case.
Tasks
Published	2017-01-11
URL	http://arxiv.org/abs/1701.03000v1
PDF	http://arxiv.org/pdf/1701.03000v1.pdf
PWC	https://paperswithcode.com/paper/a-framework-for-knowledge-management-and
Repo
Framework

Evolution of Convolutional Highway Networks


Title	Evolution of Convolutional Highway Networks
Authors	Oliver Kramer
Abstract	Convolutional highways are deep networks based on multiple stacked convolutional layers for feature preprocessing. We introduce an evolutionary algorithm (EA) for optimization of the structure and hyperparameters of convolutional highways and demonstrate the potential of this optimization setting on the well-known MNIST data set. The (1+1)-EA employs Rechenberg’s mutation rate control and a niching mechanism to overcome local optima adapts the optimization approach. An experimental study shows that the EA is capable of improving the state-of-the-art network contribution and of evolving highway networks from scratch.
Tasks
Published	2017-09-11
URL	http://arxiv.org/abs/1709.03247v1
PDF	http://arxiv.org/pdf/1709.03247v1.pdf
PWC	https://paperswithcode.com/paper/evolution-of-convolutional-highway-networks
Repo
Framework

Detecting changes in slope with an $L_0$ penalty


Title	Detecting changes in slope with an $L_0$ penalty
Authors	Robert Maidstone, Paul Fearnhead, Adam Letchford
Abstract	Whilst there are many approaches to detecting changes in mean for a univariate time-series, the problem of detecting multiple changes in slope has comparatively been ignored. Part of the reason for this is that detecting changes in slope is much more challenging. For example, simple binary segmentation procedures do not work for this problem, whilst efficient dynamic programming methods that work well for the change in mean problem cannot be directly used for detecting changes in slope. We present a novel dynamic programming approach, CPOP, for finding the “best” continuous piecewise-linear fit to data. We define best based on a criterion that measures fit to data using the residual sum of squares, but penalises complexity based on an $L_0$ penalty on changes in slope. We show that using such a criterion is more reliable at estimating changepoint locations than approaches that penalise complexity using an $L_1$ penalty. Empirically CPOP has good computational properties, and can analyse a time-series with over 10,000 observations and over 100 changes in a few minutes. Our method is used to analyse data on the motion of bacteria, and provides fits to the data that both have substantially smaller residual sum of squares and are more parsimonious than two competing approaches.
Tasks	Time Series
Published	2017-01-06
URL	http://arxiv.org/abs/1701.01672v2
PDF	http://arxiv.org/pdf/1701.01672v2.pdf
PWC	https://paperswithcode.com/paper/detecting-changes-in-slope-with-an-l_0
Repo
Framework

Stabiliser states are efficiently PAC-learnable


Title	Stabiliser states are efficiently PAC-learnable
Authors	Andrea Rocchetto
Abstract	The exponential scaling of the wave function is a fundamental property of quantum systems with far reaching implications in our ability to process quantum information. A problem where these are particularly relevant is quantum state tomography. State tomography, whose objective is to obtain a full description of a quantum system, can be analysed in the framework of computational learning theory. In this model, quantum states have been shown to be Probably Approximately Correct (PAC)-learnable with sample complexity linear in the number of qubits. However, it is conjectured that in general quantum states require an exponential amount of computation to be learned. Here, using results from the literature on the efficient classical simulation of quantum systems, we show that stabiliser states are efficiently PAC-learnable. Our results solve an open problem formulated by Aaronson [Proc. R. Soc. A, 2088, (2007)] and propose learning theory as a tool for exploring the power of quantum computation.
Tasks	Quantum State Tomography
Published	2017-04-30
URL	http://arxiv.org/abs/1705.00345v2
PDF	http://arxiv.org/pdf/1705.00345v2.pdf
PWC	https://paperswithcode.com/paper/stabiliser-states-are-efficiently-pac
Repo
Framework

Robust Online Matrix Factorization for Dynamic Background Subtraction


Title	Robust Online Matrix Factorization for Dynamic Background Subtraction
Authors	Hongwei Yong, Deyu Meng, Wangmeng Zuo, Lei Zhang
Abstract	We propose an effective online background subtraction method, which can be robustly applied to practical videos that have variations in both foreground and background. Different from previous methods which often model the foreground as Gaussian or Laplacian distributions, we model the foreground for each frame with a specific mixture of Gaussians (MoG) distribution, which is updated online frame by frame. Particularly, our MoG model in each frame is regularized by the learned foreground/background knowledge in previous frames. This makes our online MoG model highly robust, stable and adaptive to practical foreground and background variations. The proposed model can be formulated as a concise probabilistic MAP model, which can be readily solved by EM algorithm. We further embed an affine transformation operator into the proposed model, which can be automatically adjusted to fit a wide range of video background transformations and make the method more robust to camera movements. With using the sub-sampling technique, the proposed method can be accelerated to execute more than 250 frames per second on average, meeting the requirement of real-time background subtraction for practical video processing tasks. The superiority of the proposed method is substantiated by extensive experiments implemented on synthetic and real videos, as compared with state-of-the-art online and offline background subtraction methods.
Tasks
Published	2017-05-28
URL	http://arxiv.org/abs/1705.10000v1
PDF	http://arxiv.org/pdf/1705.10000v1.pdf
PWC	https://paperswithcode.com/paper/robust-online-matrix-factorization-for
Repo
Framework

Proportionate gradient updates with PercentDelta


Title	Proportionate gradient updates with PercentDelta
Authors	Sami Abu-El-Haija
Abstract	Deep Neural Networks are generally trained using iterative gradient updates. Magnitudes of gradients are affected by many factors, including choice of activation functions and initialization. More importantly, gradient magnitudes can greatly differ across layers, with some layers receiving much smaller gradients than others. causing some layers to train slower than others and therefore slowing down the overall convergence. We analytically explain this disproportionality. Then we propose to explicitly train all layers at the same speed, by scaling the gradient w.r.t. every trainable tensor to be proportional to its current value. In particular, at every batch, we want to update all trainable tensors, such that the relative change of the L1-norm of the tensors is the same, across all layers of the network, throughout training time. Experiments on MNIST show that our method appropriately scales gradients, such that the relative change in trainable tensors is approximately equal across layers. In addition, measuring the test accuracy with training time, shows that our method trains faster than other methods, giving higher test accuracy given same budget of training steps.
Tasks
Published	2017-08-24
URL	http://arxiv.org/abs/1708.07227v1
PDF	http://arxiv.org/pdf/1708.07227v1.pdf
PWC	https://paperswithcode.com/paper/proportionate-gradient-updates-with
Repo
Framework

RubyStar: A Non-Task-Oriented Mixture Model Dialog System


Title	RubyStar: A Non-Task-Oriented Mixture Model Dialog System
Authors	Huiting Liu, Tao Lin, Hanfei Sun, Weijian Lin, Chih-Wei Chang, Teng Zhong, Alexander Rudnicky
Abstract	RubyStar is a dialog system designed to create “human-like” conversation by combining different response generation strategies. RubyStar conducts a non-task-oriented conversation on general topics by using an ensemble of rule-based, retrieval-based and generative methods. Topic detection, engagement monitoring, and context tracking are used for managing interaction. Predictable elements of conversation, such as the bot’s backstory and simple question answering are handled by separate modules. We describe a rating scheme we developed for evaluating response generation. We find that character-level RNN is an effective generation model for general responses, with proper parameter settings; however other kinds of conversation topics might benefit from using other models.
Tasks	Question Answering
Published	2017-11-08
URL	http://arxiv.org/abs/1711.02781v3
PDF	http://arxiv.org/pdf/1711.02781v3.pdf
PWC	https://paperswithcode.com/paper/rubystar-a-non-task-oriented-mixture-model
Repo
Framework

Non-Asymptotic Analysis of Robust Control from Coarse-Grained Identification


Title	Non-Asymptotic Analysis of Robust Control from Coarse-Grained Identification
Authors	Stephen Tu, Ross Boczar, Andrew Packard, Benjamin Recht
Abstract	This work explores the trade-off between the number of samples required to accurately build models of dynamical systems and the degradation of performance in various control objectives due to a coarse approximation. In particular, we show that simple models can be easily fit from input/output data and are sufficient for achieving various control objectives. We derive bounds on the number of noisy input/output samples from a stable linear time-invariant system that are sufficient to guarantee that the corresponding finite impulse response approximation is close to the true system in the $\mathcal{H}_\infty$-norm. We demonstrate that these demands are lower than those derived in prior art which aimed to accurately identify dynamical models. We also explore how different physical input constraints, such as power constraints, affect the sample complexity. Finally, we show how our analysis fits within the established framework of robust control, by demonstrating how a controller designed for an approximate system provably meets performance objectives on the true system.
Tasks
Published	2017-07-15
URL	http://arxiv.org/abs/1707.04791v2
PDF	http://arxiv.org/pdf/1707.04791v2.pdf
PWC	https://paperswithcode.com/paper/non-asymptotic-analysis-of-robust-control
Repo
Framework

A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity


Title	A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity
Authors	Pablo Hernandez-Leal, Michael Kaisers, Tim Baarslag, Enrique Munoz de Cote
Abstract	The key challenge in multiagent learning is learning a best response to the behaviour of other agents, which may be non-stationary: if the other agents adapt their strategy as well, the learning target moves. Disparate streams of research have approached non-stationarity from several angles, which make a variety of implicit assumptions that make it hard to keep an overview of the state of the art and to validate the innovation and significance of new works. This survey presents a coherent overview of work that addresses opponent-induced non-stationarity with tools from game theory, reinforcement learning and multi-armed bandits. Further, we reflect on the principle approaches how algorithms model and cope with this non-stationarity, arriving at a new framework and five categories (in increasing order of sophistication): ignore, forget, respond to target models, learn models, and theory of mind. A wide range of state-of-the-art algorithms is classified into a taxonomy, using these categories and key characteristics of the environment (e.g., observability) and adaptation behaviour of the opponents (e.g., smooth, abrupt). To clarify even further we present illustrative variations of one domain, contrasting the strengths and limitations of each category. Finally, we discuss in which environments the different approaches yield most merit, and point to promising avenues of future research.
Tasks	Multi-Armed Bandits
Published	2017-07-28
URL	http://arxiv.org/abs/1707.09183v2
PDF	http://arxiv.org/pdf/1707.09183v2.pdf
PWC	https://paperswithcode.com/paper/a-survey-of-learning-in-multiagent
Repo
Framework