October 18, 2019

3390 words 16 mins read

Paper Group ANR 445

Chaining Mutual Information and Tightening Generalization Bounds. Forecasting market states. Model-based Exception Mining for Object-Relational Data. What is known about Vertex Cover Kernelization?. Next Hit Predictor - Self-exciting Risk Modeling for Predicting Next Locations of Serial Crimes. Test without Trust: Optimal Locally Private Distributi …

Chaining Mutual Information and Tightening Generalization Bounds

Title Chaining Mutual Information and Tightening Generalization Bounds
Authors Amir R. Asadi, Emmanuel Abbe, Sergio Verdú
Abstract Bounding the generalization error of learning algorithms has a long history, which yet falls short in explaining various generalization successes including those of deep learning. Two important difficulties are (i) exploiting the dependencies between the hypotheses, (ii) exploiting the dependence between the algorithm’s input and output. Progress on the first point was made with the chaining method, originating from the work of Kolmogorov, and used in the VC-dimension bound. More recently, progress on the second point was made with the mutual information method by Russo and Zou ‘15. Yet, these two methods are currently disjoint. In this paper, we introduce a technique to combine the chaining and mutual information methods, to obtain a generalization bound that is both algorithm-dependent and that exploits the dependencies between the hypotheses. We provide an example in which our bound significantly outperforms both the chaining and the mutual information bounds. As a corollary, we tighten Dudley’s inequality when the learning algorithm chooses its output from a small subset of hypotheses with high probability.
Tasks
Published 2018-06-11
URL https://arxiv.org/abs/1806.03803v2
PDF https://arxiv.org/pdf/1806.03803v2.pdf
PWC https://paperswithcode.com/paper/chaining-mutual-information-and-tightening
Repo
Framework
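
For context, the mutual information bound that this paper chains together with the chaining method can be stated compactly. The form below is the standard statement of the Russo and Zou '15 bound (as later phrased by Xu and Raginsky), written in generic notation rather than the paper's own:

```latex
% Standard mutual-information generalization bound (Russo & Zou '15;
% Xu & Raginsky '17), stated in generic notation, not the paper's.
% Assume the loss \ell(w, Z) is \sigma-sub-Gaussian in Z for every
% hypothesis w, S = (Z_1, \dots, Z_n) is the training sample, and W is
% the (possibly randomized) output of the learning algorithm.
\[
  \bigl|\, \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr] \,\bigr|
  \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S; W)} ,
\]
% where L_\mu is the population risk, L_S the empirical risk, and
% I(S; W) the mutual information between sample and output hypothesis.
% The paper's contribution is to replace the single I(S; W) term with a
% chained sum of mutual-information terms over successively finer
% quantizations of the hypothesis set.
```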

Forecasting market states

Title Forecasting market states
Authors Pier Francesco Procacci, Tomaso Aste
Abstract We propose a novel methodology to define, analyze and forecast market states. In our approach market states are identified by a reference sparse precision matrix and a vector of expectation values. In our procedure, each multivariate observation is associated with a given market state according to a minimization of a penalized Mahalanobis distance. The procedure is made computationally very efficient and can be used with a large number of assets. We demonstrate that this procedure is successful at clustering different states of the markets in an unsupervised manner. In particular, we describe an experiment with one hundred log-returns and two states in which the methodology automatically associates states prevalently to pre- and post-crisis periods, with one state gathering periods with average positive returns and the other state periods with average negative returns, thereby spontaneously discovering the common classification of 'bull' and 'bear' markets. In another experiment, with again one hundred log-returns and two states, we demonstrate that this procedure can be efficiently used to forecast off-sample future market states with significant prediction accuracy. This methodology opens the way to a range of applications in risk management and trading strategies in the context where the correlation structure plays a central role.
Tasks
Published 2018-07-13
URL https://arxiv.org/abs/1807.05836v3
PDF https://arxiv.org/pdf/1807.05836v3.pdf
PWC https://paperswithcode.com/paper/forecasting-market-states
Repo
Framework
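
A minimal sketch of the state-assignment step described in the abstract above: each multivariate observation is mapped to the market state whose mean and sparse precision matrix minimize a penalized Mahalanobis distance. The Gaussian-style log-determinant term and the per-state penalty vector are assumptions made for illustration; they are not the paper's exact objective.

```python
import numpy as np

def assign_states(X, means, precisions, penalties=None):
    """Assign each row of X (observations x assets) to the market state whose
    mean and precision matrix minimize a penalized Mahalanobis distance.
    `means[k]` and `precisions[k]` describe state k; `penalties[k]` is a
    per-state penalty term (an assumption of this sketch)."""
    n_states = len(means)
    if penalties is None:
        penalties = np.zeros(n_states)
    scores = np.empty((X.shape[0], n_states))
    for k in range(n_states):
        diff = X - means[k]
        maha = np.einsum("ij,jk,ik->i", diff, precisions[k], diff)
        # Subtract the log-determinant so the score behaves like a negative
        # Gaussian log-likelihood, then add the penalty for this state.
        _, logdet = np.linalg.slogdet(precisions[k])
        scores[:, k] = maha - logdet + penalties[k]
    return scores.argmin(axis=1)

# Toy usage: two states, one hundred assets.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 100))
means = [np.full(100, 0.01), np.full(100, -0.01)]
precisions = [np.eye(100), np.eye(100)]
labels = assign_states(X, means, precisions)
```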

Model-based Exception Mining for Object-Relational Data

Title Model-based Exception Mining for Object-Relational Data
Authors Fatemeh Riahi, Oliver Schulte
Abstract This paper is based on a previous publication [29]. Our work extends exception mining and outlier detection to the case of object-relational data. Object-relational data represent a complex heterogeneous network [12], which comprises objects of different types, links among these objects, also of different types, and attributes of these links. This special structure prohibits a direct vectorial data representation. We follow the well-established Exceptional Model Mining framework, which leverages machine learning models for exception mining: an object is exceptional to the extent that a model learned for the object data differs from a model learned for the general population. Exceptional objects can be viewed as outliers. We apply state-of-the-art probabilistic modelling techniques for object-relational data that construct a graphical model (Bayesian network), which compactly represents probabilistic associations in the data. A new metric, derived from the learned object-relational model, quantifies the extent to which the individual association pattern of a potential outlier deviates from that of the whole population. The metric is based on the likelihood ratio of two parameter vectors: one that represents the population associations, and another that represents the individual associations. Our method is validated on synthetic datasets and on real-world datasets about soccer matches and movies. Compared to baseline methods, our novel transformed likelihood ratio achieved the best detection accuracy on all datasets.
Tasks Outlier Detection
Published 2018-07-01
URL http://arxiv.org/abs/1807.00381v1
PDF http://arxiv.org/pdf/1807.00381v1.pdf
PWC https://paperswithcode.com/paper/model-based-exception-mining-for-object
Repo
Framework
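
To make the likelihood-ratio metric above concrete, here is a heavily simplified sketch over independent binary features with Bernoulli parameters. The paper's metric is defined over object-relational Bayesian networks and uses a transformed ratio, so everything below (function name, feature model, inputs) is a hypothetical illustration of the basic idea only.

```python
import numpy as np

def log_likelihood_ratio(counts, n_total, theta_individual, theta_population, eps=1e-9):
    """Basic likelihood-ratio outlier score over independent binary features.

    counts[i]        -- times feature i was observed true for the object
    n_total[i]       -- number of observations of feature i for the object
    theta_individual -- Bernoulli parameters fit to the object's own data
    theta_population -- Bernoulli parameters fit to the whole population

    Score = log P(object data | individual params)
          - log P(object data | population params).
    Larger values mean the object's association pattern deviates more from
    the population's. This is a simplification of the paper's
    object-relational, Bayesian-network-based metric.
    """
    counts = np.asarray(counts, dtype=float)
    n_total = np.asarray(n_total, dtype=float)
    t_i = np.clip(theta_individual, eps, 1 - eps)
    t_p = np.clip(theta_population, eps, 1 - eps)
    ll_ind = counts * np.log(t_i) + (n_total - counts) * np.log(1 - t_i)
    ll_pop = counts * np.log(t_p) + (n_total - counts) * np.log(1 - t_p)
    return float(np.sum(ll_ind - ll_pop))
```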

What is known about Vertex Cover Kernelization?

Title What is known about Vertex Cover Kernelization?
Authors Michael R. Fellows, Lars Jaffke, Aliz Izabella Király, Frances A. Rosamond, Mathias Weller
Abstract We are pleased to dedicate this survey on kernelization of the Vertex Cover problem, to Professor Juraj Hromkovi\v{c} on the occasion of his 60th birthday. The Vertex Cover problem is often referred to as the Drosophila of parameterized complexity. It enjoys a long history. New and worthy perspectives will always be demonstrated first with concrete results here. This survey discusses several research directions in Vertex Cover kernelization. The Barrier Degree of Vertex Cover kernelization is discussed. We have reduction rules that kernelize vertices of small degree, including in this paper new results that reduce graphs almost to minimum degree five. Can this process go on forever? What is the minimum vertex-degree barrier for polynomial-time kernelization? Assuming the Exponential-Time Hypothesis, there is a minimum degree barrier. The idea of automated kernelization is discussed. We here report the first experimental results of an AI-guided branching algorithm for Vertex Cover whose logic seems amenable for application in finding reduction rules to kernelize small-degree vertices. The survey highlights a central open problem in parameterized complexity. Happy Birthday, Juraj!
Tasks
Published 2018-11-23
URL https://arxiv.org/abs/1811.09429v4
PDF https://arxiv.org/pdf/1811.09429v4.pdf
PWC https://paperswithcode.com/paper/what-is-known-about-vertex-cover
Repo
Framework
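
As a concrete reference point for the kind of reduction rules the survey discusses, the sketch below implements the textbook high-degree rule and the Buss size bound, yielding the classical quadratic kernel. It is not the new almost-minimum-degree-five reductions reported in the paper.

```python
def buss_kernelize(edges, k):
    """Classical Vertex Cover kernelization (Buss' rules).

    edges: iterable of 2-element vertex pairs; k: solution-size parameter.
    Returns (reduced_edges, forced_vertices, k_remaining), or None if the
    instance is a no-instance. Shown only to illustrate the style of
    reduction rules discussed in the survey.
    """
    edges = {frozenset(e) for e in edges}
    forced = set()
    changed = True
    while changed:
        changed = False
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # High-degree rule: a vertex of degree > k must be in every cover of
        # size <= k, so take it and delete its incident edges.
        for v, d in degree.items():
            if d > k:
                forced.add(v)
                edges = {e for e in edges if v not in e}
                k -= 1
                if k < 0:
                    return None
                changed = True
                break
    # Size bound: after the high-degree rule, a yes-instance has at most
    # k^2 edges, since each of the <= k cover vertices covers <= k edges.
    if len(edges) > k * k:
        return None
    return edges, forced, k
```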

Next Hit Predictor - Self-exciting Risk Modeling for Predicting Next Locations of Serial Crimes

Title Next Hit Predictor - Self-exciting Risk Modeling for Predicting Next Locations of Serial Crimes
Authors Yunyi Li, Tong Wang
Abstract Our goal is to predict the location of the next crime in a crime series, based on the identified previous offenses in the series. We build a predictive model called Next Hit Predictor (NHP) that finds the most likely location of the next serial crime via a carefully designed risk model. The risk model follows the paradigm of a self-exciting point process which consists of a background crime risk and triggered risks stimulated by previous offenses in the series. Thus, NHP creates a risk map for a crime series at hand. To train the risk model, we formulate a convex learning objective that considers pairwise rankings of locations and use stochastic gradient descent to learn the optimal parameters. Next Hit Predictor incorporates both spatial-temporal features and geographical characteristics of prior crime locations in the series. Next Hit Predictor has demonstrated promising results on decades’ worth of serial crime data collected by the Crime Analysis Unit of the Cambridge Police Department in Massachusetts, USA.
Tasks
Published 2018-12-13
URL http://arxiv.org/abs/1812.05224v1
PDF http://arxiv.org/pdf/1812.05224v1.pdf
PWC https://paperswithcode.com/paper/next-hit-predictor-self-exciting-risk
Repo
Framework
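
A minimal sketch of the self-exciting risk surface described above: a constant background risk plus triggered risks centred on previous offenses, here with a Gaussian spatial kernel and exponential temporal decay. The kernel shapes and parameter names are assumptions for illustration; NHP learns its parameters from pairwise rankings of locations, which is not reproduced here.

```python
import numpy as np

def risk_map(grid_xy, past_events, mu0=0.1, sigma=1.0, omega=0.05):
    """Evaluate a self-exciting risk surface on a grid of candidate locations.

    grid_xy     -- (m, 2) array of candidate locations
    past_events -- list of (x, y, t) tuples for previous offenses in the series
    mu0         -- constant background risk (assumed form)
    sigma       -- spatial bandwidth of the Gaussian triggering kernel
    omega       -- temporal decay rate

    risk(s) = mu0 + sum_i exp(-omega * (t_now - t_i)) * N(s; (x_i, y_i), sigma^2 I)
    """
    grid_xy = np.asarray(grid_xy, dtype=float)
    risk = np.full(len(grid_xy), mu0)
    t_now = max(t for _, _, t in past_events)
    for x, y, t in past_events:
        d2 = np.sum((grid_xy - np.array([x, y])) ** 2, axis=1)
        spatial = np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
        risk += np.exp(-omega * (t_now - t)) * spatial
    return risk

# The predicted "next hit" is simply the highest-risk grid cell:
# grid_xy[np.argmax(risk_map(grid_xy, past_events))]
```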

Test without Trust: Optimal Locally Private Distribution Testing

Title Test without Trust: Optimal Locally Private Distribution Testing
Authors Jayadev Acharya, Clément L. Canonne, Cody Freitag, Himanshu Tyagi
Abstract We study the problem of distribution testing when the samples can only be accessed using a locally differentially private mechanism and focus on two representative testing questions of identity (goodness-of-fit) and independence testing for discrete distributions. We are concerned with two settings: First, when we insist on using an already deployed, general-purpose locally differentially private mechanism such as the popular RAPPOR or the recently introduced Hadamard Response for collecting data, and must build our tests based on the data collected via this mechanism; and second, when no such restriction is imposed, and we can design a bespoke mechanism specifically for testing. For the latter purpose, we introduce the Randomized Aggregated Private Testing Optimal Response (RAPTOR) mechanism, which is remarkably simple and requires only one bit of communication per sample. We propose tests based on these mechanisms and analyze their sample complexities. Each proposed test can be implemented efficiently. In each case (barring one), we complement our performance bounds for algorithms with information-theoretic lower bounds and establish sample optimality of our proposed algorithm. A peculiar feature that emerges is that our sample-optimal algorithm based on RAPTOR uses public coins, and any test based on RAPPOR or Hadamard Response, which are both private-coin mechanisms, requires significantly more samples.
Tasks
Published 2018-08-07
URL http://arxiv.org/abs/1808.02174v1
PDF http://arxiv.org/pdf/1808.02174v1.pdf
PWC https://paperswithcode.com/paper/test-without-trust-optimal-locally-private
Repo
Framework
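
The RAPTOR construction itself is not reproduced here. The sketch below shows plain binary randomized response, the simplest one-bit locally differentially private mechanism, purely to illustrate the one-bit-per-sample, no-trust setting in which the paper's tests operate.

```python
import math
import random

def randomized_response(bit, epsilon):
    """Report a single private bit under epsilon-local differential privacy.

    With probability e^eps / (e^eps + 1) the true bit is sent, otherwise it
    is flipped. This is generic randomized response, not the RAPTOR
    mechanism from the paper.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the mean of the true bits from the noisy reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    noisy_mean = sum(reports) / len(reports)
    # E[report] = (1 - p) + true_mean * (2p - 1), so invert that relation.
    return (noisy_mean - (1 - p)) / (2 * p - 1)
```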

YASENN: Explaining Neural Networks via Partitioning Activation Sequences

Title YASENN: Explaining Neural Networks via Partitioning Activation Sequences
Authors Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin
Abstract We introduce a novel approach to feed-forward neural network interpretation based on partitioning the space of sequences of neuron activations. In line with this approach, we propose a model-specific interpretation method, called YASENN. Our method inherits many advantages of model-agnostic distillation, such as an ability to focus on the particular input region and to express an explanation in terms of features different from those observed by a neural network. Moreover, examination of distillation error makes the method applicable to the problems with low tolerance to interpretation mistakes. Technically, YASENN distills the network with an ensemble of layer-wise gradient boosting decision trees and encodes the sequences of neuron activations with leaf indices. The finite number of unique codes induces a partitioning of the input space. Each partition may be described in a variety of ways, including examination of an interpretable model (e.g. a logistic regression or a decision tree) trained to discriminate between objects of those partitions. Our experiments provide an intuition behind the method and demonstrate revealed artifacts in neural network decision making.
Tasks Interpretable Machine Learning
Published 2018-11-07
URL http://arxiv.org/abs/1811.02783v1
PDF http://arxiv.org/pdf/1811.02783v1.pdf
PWC https://paperswithcode.com/paper/yasenn-explaining-neural-networks-via
Repo
Framework
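
To make the activation-sequence encoding concrete, the hedged sketch below distills a scalar network output layer by layer with gradient-boosted trees and uses the trees' leaf indices as discrete codes, so that samples with identical codes fall in the same partition cell. It relies on scikit-learn's GradientBoostingRegressor and its apply method, uses a single scalar target per layer for simplicity, and is an illustration of the idea rather than the authors' implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def leaf_code_partition(layer_activations, final_scores, n_estimators=20):
    """Encode samples by the leaf indices of layer-wise GBDT distillers.

    layer_activations -- list of (n_samples, d_l) arrays, one per layer
    final_scores      -- (n_samples,) scalar network outputs to distill

    For each layer, fit a small gradient-boosting model that predicts the
    network's output from that layer's activations, then record the leaf
    index each sample falls into in every tree. Concatenating these indices
    gives a discrete code; equal codes define one cell of the partition.
    """
    codes = []
    for acts in layer_activations:
        gbdt = GradientBoostingRegressor(n_estimators=n_estimators, max_depth=3)
        gbdt.fit(acts, final_scores)
        # .apply returns, for every sample, the leaf index in each tree.
        leaves = gbdt.apply(acts).reshape(len(final_scores), -1)
        codes.append(leaves)
    codes = np.hstack(codes).astype(int)
    # Samples sharing an identical row of leaf indices land in the same cell.
    _, cell_ids = np.unique(codes, axis=0, return_inverse=True)
    return cell_ids
```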

A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models

Title A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models
Authors Alexander New, Kristin P. Bennett
Abstract We consider the problem in precision health of grouping people into subpopulations based on their degree of vulnerability to a risk factor. These subpopulations cannot be discovered with traditional clustering techniques because their quality is evaluated with a supervised metric: the ease of modeling a response variable over observations within them. Instead, we apply the supervised cadre model (SCM), which does use this metric. We extend the SCM formalism so that it may be applied to multivariate regression and binary classification problems. We also develop a way to use conditional entropy to assess the confidence in the process by which a subject is assigned their cadre. Using the SCM, we generalize the environment-wide association study (EWAS) workflow to be able to model heterogeneity in population risk. In our EWAS, we consider more than two hundred environmental exposure factors and find their association with diastolic blood pressure, systolic blood pressure, and hypertension. This requires adapting the SCM to be applicable to data generated by a complex survey design. After correcting for false positives, we found 25 exposure variables that had a significant association with at least one of our response variables. Eight of these were significant for a discovered subpopulation but not for the overall population. Some of these associations have been identified by previous researchers, while others appear to be novel. We examine several discovered subpopulations in detail, and we find that they are interpretable and that they suggest further research questions.
Tasks
Published 2018-08-14
URL http://arxiv.org/abs/1808.04880v2
PDF http://arxiv.org/pdf/1808.04880v2.pdf
PWC https://paperswithcode.com/paper/a-precision-environment-wide-association
Repo
Framework
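
The conditional-entropy confidence measure mentioned above reduces to a one-liner: given a subject's cadre-membership probabilities, low entropy means a confident assignment and entropy near log(number of cadres) means an ambiguous one. A minimal sketch (the function name and interface are ours):

```python
import numpy as np

def assignment_entropy(membership_probs):
    """Entropy (in nats) of a subject's cadre-membership distribution.

    membership_probs -- array of probabilities over cadres, summing to 1.
    Values near 0 indicate a confident assignment to one cadre; values near
    log(n_cadres) indicate an ambiguous assignment.
    """
    p = np.clip(np.asarray(membership_probs, dtype=float), 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

# Example: confident vs. ambiguous assignments over three cadres.
print(assignment_entropy([0.98, 0.01, 0.01]))  # ~0.11
print(assignment_entropy([0.34, 0.33, 0.33]))  # ~1.10 (close to ln 3)
```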

Interpreting Deep Classifier by Visual Distillation of Dark Knowledge

Title Interpreting Deep Classifier by Visual Distillation of Dark Knowledge
Authors Kai Xu, Dae Hoon Park, Chang Yi, Charles Sutton
Abstract Interpreting black box classifiers, such as deep networks, allows an analyst to validate a classifier before it is deployed in a high-stakes setting. A natural idea is to visualize the deep network’s representations, so as to “see what the network sees”. In this paper, we demonstrate that standard dimension reduction methods in this setting can yield uninformative or even misleading visualizations. Instead, we present DarkSight, which visually summarizes the predictions of a classifier in a way inspired by the notion of dark knowledge. DarkSight embeds the data points into a low-dimensional space such that it is easy to compress the deep classifier into a simpler one, essentially combining model compression and dimension reduction. We compare DarkSight against t-SNE both qualitatively and quantitatively, demonstrating that DarkSight visualizations are more informative. Our method additionally yields a new confidence measure based on dark knowledge by quantifying how unusual a given vector of predictions is.
Tasks Dimensionality Reduction, Model Compression
Published 2018-03-11
URL http://arxiv.org/abs/1803.04042v1
PDF http://arxiv.org/pdf/1803.04042v1.pdf
PWC https://paperswithcode.com/paper/interpreting-deep-classifier-by-visual
Repo
Framework

Fitting a deeply-nested hierarchical model to a large book review dataset using a moment-based estimator

Title Fitting a deeply-nested hierarchical model to a large book review dataset using a moment-based estimator
Authors Ningshan Zhang, Kyle Schmaus, Patrick O. Perry
Abstract We consider a particular instance of a common problem in recommender systems: using a database of book reviews to inform user-targeted recommendations. In our dataset, books are categorized into genres and sub-genres. To exploit this nested taxonomy, we use a hierarchical model that enables information pooling across similar items at many levels within the genre hierarchy. The main challenge in deploying this model is computational: the data sizes are large, and fitting the model at scale using off-the-shelf maximum likelihood procedures is prohibitive. To get around this computational bottleneck, we extend a moment-based fitting procedure proposed for fitting single-level hierarchical models to the general case of arbitrarily deep hierarchies. This extension is an order of magnitude faster than standard maximum likelihood procedures. The fitting method can be deployed beyond recommender systems to general contexts with deeply-nested hierarchical generalized linear mixed models.
Tasks Recommendation Systems
Published 2018-06-01
URL http://arxiv.org/abs/1806.02321v1
PDF http://arxiv.org/pdf/1806.02321v1.pdf
PWC https://paperswithcode.com/paper/fitting-a-deeply-nested-hierarchical-model-to
Repo
Framework

The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions

Title The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions
Authors José Lopes, Nils Hemmingsson, Oliver Åstrand
Abstract This paper describes the Spot the Difference Corpus, which contains 54 interactions between pairs of subjects interacting to find differences in two very similar scenes. The setup used, the participants’ metadata and details about the collection are described. We are releasing this corpus of task-oriented spontaneous dialogues. This release includes rich transcriptions, annotations, audio and video. We believe that this dataset constitutes a valuable resource to study several dimensions of human communication, ranging from turn-taking to the study of referring expressions. In our preliminary analyses we have looked at task success (how many differences were found out of the total number of differences) and how it evolves over time. In addition, we have looked at scene complexity, as measured by the entropy of the RGB components, and how it could relate to speech overlaps, interruptions and the expression of uncertainty. We found a tendency for more complex scenes to contain more competitive interruptions.
Tasks
Published 2018-05-14
URL http://arxiv.org/abs/1805.05091v1
PDF http://arxiv.org/pdf/1805.05091v1.pdf
PWC https://paperswithcode.com/paper/the-spot-the-difference-corpus-a-multi-modal
Repo
Framework
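
The scene-complexity measure mentioned in the abstract (entropy of the RGB components) can be computed as a simple per-channel histogram entropy. The sketch below assumes 8-bit channels and averages over R, G and B; the exact definition used by the authors may differ.

```python
import numpy as np

def rgb_entropy(image):
    """Mean Shannon entropy (bits) of the R, G and B channel histograms.

    image -- (H, W, 3) uint8 array. Higher values indicate a more complex
    scene under this simple proxy.
    """
    entropies = []
    for c in range(3):
        hist, _ = np.histogram(image[..., c], bins=256, range=(0, 256))
        p = hist / hist.sum()
        p = p[p > 0]  # drop empty bins before taking logs
        entropies.append(-np.sum(p * np.log2(p)))
    return float(np.mean(entropies))
```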

Parsimonious Network based on Fuzzy Inference System (PANFIS) for Time Series Feature Prediction of Low Speed Slew Bearing Prognosis

Title Parsimonious Network based on Fuzzy Inference System (PANFIS) for Time Series Feature Prediction of Low Speed Slew Bearing Prognosis
Authors Wahyu Caesarendra, Mahardhika Pratama, Tegoeh Tjahjowidodo, Kiet Tieud, Buyung Kosasih
Abstract In recent years, the utilization of rotating parts, e.g. bearings and gears, has continuously supported manufacturing lines in producing output of consistent quality. Due to their critical role, the breakdown of these components might significantly impact the production rate. Proper condition-based monitoring (CBM) is among the few ways to maintain and monitor rotating systems. Prognosis, one of the major tasks in CBM, which predicts and estimates the remaining useful life of the machine, has attracted significant interest for decades. This paper presents a literature review of prognosis approaches from papers published in the last decade. The prognostic approaches are described comprehensively to provide a better idea of how to select an appropriate prognosis method for specific needs. An advanced predictive analytics method, namely the Parsimonious Network Based on Fuzzy Inference System (PANFIS), is proposed and tested on low-speed slew bearing data. PANFIS differs from conventional prognostic approaches in that it supports online lifelong prognostics without requiring a retraining or reconfiguration phase. The method is applied to normal-to-failure bearing vibration data collected over 139 days to predict the time-domain features of slew bearing vibration signals. The performance of the proposed method is compared to established methods such as ANFIS, eTS, and Simp_eTS. The results suggest that PANFIS offers outstanding performance compared to the other methods.
Tasks Time Series
Published 2018-02-05
URL http://arxiv.org/abs/1802.09332v1
PDF http://arxiv.org/pdf/1802.09332v1.pdf
PWC https://paperswithcode.com/paper/parsimonious-network-based-on-fuzzy-inference
Repo
Framework

Learning Determinantal Point Processes by Corrective Negative Sampling

Title Learning Determinantal Point Processes by Corrective Negative Sampling
Authors Zelda Mariet, Mike Gartrell, Suvrit Sra
Abstract Determinantal Point Processes (DPPs) have attracted significant interest from the machine-learning community due to their ability to elegantly and tractably model the delicate balance between quality and diversity of sets. DPPs are commonly learned from data using maximum likelihood estimation (MLE). While fitting observed sets well, MLE for DPPs may also assign high likelihoods to unobserved sets that are far from the true generative distribution of the data. To address this issue, which reduces the quality of the learned model, we introduce a novel optimization problem, Contrastive Estimation (CE), which encodes information about “negative” samples into the basic learning model. CE is grounded in the successful use of negative information in machine-vision and language modeling. Depending on the chosen negative distribution (which may be static or evolve during optimization), CE assumes two different forms, which we analyze theoretically and experimentally. We evaluate our new model on real-world datasets; on a challenging dataset, CE learning delivers a considerable improvement in predictive performance over a DPP learned without using contrastive information.
Tasks Language Modelling, Point Processes
Published 2018-02-15
URL http://arxiv.org/abs/1802.05649v4
PDF http://arxiv.org/pdf/1802.05649v4.pdf
PWC https://paperswithcode.com/paper/learning-determinantal-point-processes-by
Repo
Framework
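
For reference, the L-ensemble likelihood that MLE maximizes for DPPs, together with the general shape of a contrastive objective that also pushes down the likelihood of negative samples, in standard DPP notation. The exact forms of the paper's two CE variants are not reproduced here:

```latex
% L-ensemble DPP likelihood for an observed subset Y of the ground set
% (standard notation; L_Y is the principal submatrix of the kernel L
% indexed by Y, and I is the identity):
\[
  P_L(Y) \;=\; \frac{\det(L_Y)}{\det(L + I)} .
\]
% MLE maximizes \sum_{Y \in \mathcal{D}} \log P_L(Y) over the observed sets.
% A contrastive objective additionally penalizes the likelihood assigned to
% "negative" sets Y^- drawn from a chosen negative distribution, e.g.
\[
  \max_{L} \;\; \sum_{Y \in \mathcal{D}} \log P_L(Y)
  \;-\; \lambda \sum_{Y^- \in \mathcal{N}} \log P_L(Y^-) ,
\]
% where \lambda and the negative collection \mathcal{N} are illustrative
% choices, not the paper's exact formulation.
```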

Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment

Title Cross-domain Dialogue Policy Transfer via Simultaneous Speech-act and Slot Alignment
Authors Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
Abstract Dialogue policy transfer enables us to build dialogue policies in a target domain with little data by leveraging knowledge from a source domain with plenty of data. Dialogue sentences are usually represented by speech-acts and domain slots, and the dialogue policy transfer is usually achieved by assigning a slot mapping matrix based on human heuristics. However, existing dialogue policy transfer methods cannot transfer across dialogue domains with different speech-acts, for example, between systems built by different companies. Also, they depend on either common slots or slot entropy, which are not available when the source and target slots are totally disjoint and no database is available to calculate the slot entropy. To solve this problem, we propose a Policy tRansfer across dOMaIns and SpEech-acts (PROMISE) model, which is able to transfer dialogue policies across domains with different speech-acts and disjoint slots. The PROMISE model can learn to align different speech-acts and slots simultaneously, and it does not require common slots or the calculation of the slot entropy. Experiments on both real-world dialogue data and simulations demonstrate that the PROMISE model can effectively transfer dialogue policies across domains with different speech-acts and disjoint slots.
Tasks
Published 2018-04-20
URL http://arxiv.org/abs/1804.07691v1
PDF http://arxiv.org/pdf/1804.07691v1.pdf
PWC https://paperswithcode.com/paper/cross-domain-dialogue-policy-transfer-via
Repo
Framework

The price of debiasing automatic metrics in natural language evaluation

Title The price of debiasing automatic metrics in natural language evaluation
Authors Arun Tejasvi Chaganty, Stephan Mussmann, Percy Liang
Abstract For evaluating generation systems, automatic metrics such as BLEU cost nothing to run but have been shown to correlate poorly with human judgment, leading to systematic bias against certain model improvements. On the other hand, averaging human judgments, the unbiased gold standard, is often too expensive. In this paper, we use control variates to combine automatic metrics with human evaluation to obtain an unbiased estimator with lower cost than human evaluation alone. In practice, however, we obtain only a 7-13% cost reduction on evaluating summarization and open-response question answering systems. We then prove that our estimator is optimal: there is no unbiased estimator with lower cost. Our theory further highlights the two fundamental bottlenecks—the automatic metric and the prompt shown to human evaluators—both of which need to be improved to obtain greater cost savings.
Tasks Question Answering
Published 2018-07-06
URL http://arxiv.org/abs/1807.02202v1
PDF http://arxiv.org/pdf/1807.02202v1.pdf
PWC https://paperswithcode.com/paper/the-price-of-debiasing-automatic-metrics-in
Repo
Framework
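
The control-variates construction behind the estimator can be sketched as follows: combine the unbiased but expensive human judgments with the cheap automatic metric, whose mean over the full evaluation set is known without human labels, so that the estimate stays unbiased while its variance drops. The names and the plug-in choice of the coefficient below are ours, not the paper's exact estimator.

```python
import numpy as np

def control_variate_estimate(human, metric, metric_mean):
    """Control-variates estimate of the mean human judgment.

    human       -- human judgments on the sampled outputs (expensive)
    metric      -- automatic metric scores on the same outputs (cheap)
    metric_mean -- mean of the automatic metric over the full evaluation set,
                   computable without any human labels

    Estimator: mean(human) - alpha * (mean(metric) - metric_mean), which is
    unbiased for any fixed alpha; alpha = Cov(human, metric) / Var(metric)
    minimizes its variance. This is the generic construction, not the
    paper's exact estimator.
    """
    human = np.asarray(human, dtype=float)
    metric = np.asarray(metric, dtype=float)
    alpha = np.cov(human, metric)[0, 1] / np.var(metric, ddof=1)
    return float(human.mean() - alpha * (metric.mean() - metric_mean))
```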