May 6, 2019

3164 words 15 mins read

Paper Group ANR 278

Choice by Elimination via Deep Neural Networks. RhoanaNet Pipeline: Dense Automatic Neural Annotation. Graph Aggregation. TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs. Surrogate-Assisted Partial Order-based Evolutionary Optimisation. Which Learning Algorithms Can Generalize Identity-Based Rules …

Choice by Elimination via Deep Neural Networks


Title	Choice by Elimination via Deep Neural Networks
Authors	Truyen Tran, Dinh Phung, Svetha Venkatesh
Abstract	We introduce Neural Choice by Elimination, a new framework that integrates deep neural networks into probabilistic sequential choice models for learning to rank. Given a set of items to chose from, the elimination strategy starts with the whole item set and iteratively eliminates the least worthy item in the remaining subset. We prove that the choice by elimination is equivalent to marginalizing out the random Gompertz latent utilities. Coupled with the choice model is the recently introduced Neural Highway Networks for approximating arbitrarily complex rank functions. We evaluate the proposed framework on a large-scale public dataset with over 425K items, drawn from the Yahoo! learning to rank challenge. It is demonstrated that the proposed method is competitive against state-of-the-art learning to rank methods.
Tasks	Learning-To-Rank
Published	2016-02-17
URL	http://arxiv.org/abs/1602.05285v1
PDF	http://arxiv.org/pdf/1602.05285v1.pdf
PWC	https://paperswithcode.com/paper/choice-by-elimination-via-deep-neural
Repo
Framework

RhoanaNet Pipeline: Dense Automatic Neural Annotation


Title	RhoanaNet Pipeline: Dense Automatic Neural Annotation
Authors	Seymour Knowles-Barley, Verena Kaynig, Thouis Ray Jones, Alyssa Wilson, Joshua Morgan, Dongil Lee, Daniel Berger, Narayanan Kasthuri, Jeff W. Lichtman, Hanspeter Pfister
Abstract	Reconstructing a synaptic wiring diagram, or connectome, from electron microscopy (EM) images of brain tissue currently requires many hours of manual annotation or proofreading (Kasthuri and Lichtman, 2010; Lichtman and Sanes, 2008; Seung, 2009). The desire to reconstruct ever larger and more complex networks has pushed the collection of ever larger EM datasets. A cubic millimeter of raw imaging data would take up 1 PB of storage and present an annotation project that would be impractical without relying heavily on automatic segmentation methods. The RhoanaNet image processing pipeline was developed to automatically segment large volumes of EM data and ease the burden of manual proofreading and annotation. Based on (Kaynig et al., 2015), we updated every stage of the software pipeline to provide better throughput performance and higher quality segmentation results. We used state of the art deep learning techniques to generate improved membrane probability maps, and Gala (Nunez-Iglesias et al., 2014) was used to agglomerate 2D segments into 3D objects. We applied the RhoanaNet pipeline to four densely annotated EM datasets, two from mouse cortex, one from cerebellum and one from mouse lateral geniculate nucleus (LGN). All training and test data is made available for benchmark comparisons. The best segmentation results obtained gave $V^\text{Info}_\text{F-score}$ scores of 0.9054 and 09182 for the cortex datasets, 0.9438 for LGN, and 0.9150 for Cerebellum. The RhoanaNet pipeline is open source software. All source code, training data, test data, and annotations for all four benchmark datasets are available at www.rhoana.org.
Tasks
Published	2016-11-21
URL	http://arxiv.org/abs/1611.06973v1
PDF	http://arxiv.org/pdf/1611.06973v1.pdf
PWC	https://paperswithcode.com/paper/rhoananet-pipeline-dense-automatic-neural
Repo
Framework

Graph Aggregation


Title	Graph Aggregation
Authors	Ulle Endriss, Umberto Grandi
Abstract	Graph aggregation is the process of computing a single output graph that constitutes a good compromise between several input graphs, each provided by a different source. One needs to perform graph aggregation in a wide variety of situations, e.g., when applying a voting rule (graphs as preference orders), when consolidating conflicting views regarding the relationships between arguments in a debate (graphs as abstract argumentation frameworks), or when computing a consensus between several alternative clusterings of a given dataset (graphs as equivalence relations). In this paper, we introduce a formal framework for graph aggregation grounded in social choice theory. Our focus is on understanding which properties shared by the individual input graphs will transfer to the output graph returned by a given aggregation rule. We consider both common properties of graphs, such as transitivity and reflexivity, and arbitrary properties expressible in certain fragments of modal logic. Our results establish several connections between the types of properties preserved under aggregation and the choice-theoretic axioms satisfied by the rules used. The most important of these results is a powerful impossibility theorem that generalises Arrow’s seminal result for the aggregation of preference orders to a large collection of different types of graphs.
Tasks	Abstract Argumentation
Published	2016-09-13
URL	http://arxiv.org/abs/1609.03765v1
PDF	http://arxiv.org/pdf/1609.03765v1.pdf
PWC	https://paperswithcode.com/paper/graph-aggregation
Repo
Framework

TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs


Title	TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs
Authors	Jiazhen He, Benjamin I. P. Rubinstein, James Bailey, Rui Zhang, Sandra Milligan
Abstract	This paper explores the suitability of using automatically discovered topics from MOOC discussion forums for modelling students’ academic abilities. The Rasch model from psychometrics is a popular generative probabilistic model that relates latent student skill, latent item difficulty, and observed student-item responses within a principled, unified framework. According to scholarly educational theory, discovered topics can be regarded as appropriate measurement items if (1) students’ participation across the discovered topics is well fit by the Rasch model, and if (2) the topics are interpretable to subject-matter experts as being educationally meaningful. Such Rasch-scaled topics, with associated difficulty levels, could be of potential benefit to curriculum refinement, student assessment and personalised feedback. The technical challenge that remains, is to discover meaningful topics that simultaneously achieve good statistical fit with the Rasch model. To address this challenge, we combine the Rasch model with non-negative matrix factorisation based topic modelling, jointly fitting both models. We demonstrate the suitability of our approach with quantitative experiments on data from three Coursera MOOCs, and with qualitative survey results on topic interpretability on a Discrete Optimisation MOOC.
Tasks
Published	2016-07-29
URL	http://arxiv.org/abs/1607.08720v2
PDF	http://arxiv.org/pdf/1607.08720v2.pdf
PWC	https://paperswithcode.com/paper/topicresponse-a-marriage-of-topic-modelling
Repo
Framework

Surrogate-Assisted Partial Order-based Evolutionary Optimisation


Title	Surrogate-Assisted Partial Order-based Evolutionary Optimisation
Authors	Vanessa Volz, Günter Rudolph, Boris Naujoks
Abstract	In this paper, we propose a novel approach (SAPEO) to support the survival selection process in multi-objective evolutionary algorithms with surrogate models - it dynamically chooses individuals to evaluate exactly based on the model uncertainty and the distinctness of the population. We introduce variants that differ in terms of the risk they allow when doing survival selection. Here, the anytime performance of different SAPEO variants is evaluated in conjunction with an SMS-EMOA using the BBOB bi-objective benchmark. We compare the obtained results with the performance of the regular SMS-EMOA, as well as another surrogate-assisted approach. The results open up general questions about the applicability and required conditions for surrogate-assisted multi-objective evolutionary algorithms to be tackled in the future.
Tasks
Published	2016-11-01
URL	http://arxiv.org/abs/1611.00260v1
PDF	http://arxiv.org/pdf/1611.00260v1.pdf
PWC	https://paperswithcode.com/paper/surrogate-assisted-partial-order-based
Repo
Framework

Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?


Title	Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?
Authors	Paul Tupper, Bobak Shahriari
Abstract	We propose a novel framework for the analysis of learning algorithms that allows us to say when such algorithms can and cannot generalize certain patterns from training data to test data. In particular we focus on situations where the rule that must be learned concerns two components of a stimulus being identical. We call such a basis for discrimination an identity-based rule. Identity-based rules have proven to be difficult or impossible for certain types of learning algorithms to acquire from limited datasets. This is in contrast to human behaviour on similar tasks. Here we provide a framework for rigorously establishing which learning algorithms will fail at generalizing identity-based rules to novel stimuli. We use this framework to show that such algorithms are unable to generalize identity-based rules to novel inputs unless trained on virtually all possible inputs. We demonstrate these results computationally with a multilayer feedforward neural network.
Tasks
Published	2016-05-12
URL	http://arxiv.org/abs/1605.04002v1
PDF	http://arxiv.org/pdf/1605.04002v1.pdf
PWC	https://paperswithcode.com/paper/which-learning-algorithms-can-generalize
Repo
Framework

Data-Driven Threshold Machine: Scan Statistics, Change-Point Detection, and Extreme Bandits


Title	Data-Driven Threshold Machine: Scan Statistics, Change-Point Detection, and Extreme Bandits
Authors	Shuang Li, Yao Xie, Le Song
Abstract	We present a novel distribution-free approach, the data-driven threshold machine (DTM), for a fundamental problem at the core of many learning tasks: choose a threshold for a given pre-specified level that bounds the tail probability of the maximum of a (possibly dependent but stationary) random sequence. We do not assume data distribution, but rather relying on the asymptotic distribution of extremal values, and reduce the problem to estimate three parameters of the extreme value distributions and the extremal index. We specially take care of data dependence via estimating extremal index since in many settings, such as scan statistics, change-point detection, and extreme bandits, where dependence in the sequence of statistics can be significant. Key features of our DTM also include robustness and the computational efficiency, and it only requires one sample path to form a reliable estimate of the threshold, in contrast to the Monte Carlo sampling approach which requires drawing a large number of sample paths. We demonstrate the good performance of DTM via numerical examples in various dependent settings.
Tasks	Change Point Detection
Published	2016-10-14
URL	http://arxiv.org/abs/1610.04599v1
PDF	http://arxiv.org/pdf/1610.04599v1.pdf
PWC	https://paperswithcode.com/paper/data-driven-threshold-machine-scan-statistics
Repo
Framework

Exact Bayesian inference for off-line change-point detection in tree-structured graphical models


Title	Exact Bayesian inference for off-line change-point detection in tree-structured graphical models
Authors	Loïc Schwaller, Stéphane Robin
Abstract	We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same time by combining classical dynamic programming with algebraic results pertaining to spanning trees. In particular, we show that quantities such as posterior distributions for change-points or posterior edge probabilities over time can efficiently be obtained. We illustrate our results on both synthetic and experimental data arising from biology and neuroscience.
Tasks	Bayesian Inference, Change Point Detection, Time Series
Published	2016-03-25
URL	http://arxiv.org/abs/1603.07871v2
PDF	http://arxiv.org/pdf/1603.07871v2.pdf
PWC	https://paperswithcode.com/paper/exact-bayesian-inference-for-off-line-change
Repo
Framework


Title	Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction
Authors	Seyed Mohammad Taheri, Hamidreza Mahyar, Mohammad Firouzi, Elahe Ghalebi K., Radu Grosu, Ali Movaghar
Abstract	Recommendation plays an increasingly important role in our daily lives. Recommender systems automatically suggest items to users that might be interesting for them. Recent studies illustrate that incorporating social trust in Matrix Factorization methods demonstrably improves accuracy of rating prediction. Such approaches mainly use the trust scores explicitly expressed by users. However, it is often challenging to have users provide explicit trust scores of each other. There exist quite a few works, which propose Trust Metrics to compute and predict trust scores between users based on their interactions. In this paper, first we present how social relation can be extracted from users’ ratings to items by describing Hellinger distance between users in recommender systems. Then, we propose to incorporate the predicted trust scores into social matrix factorization models. By analyzing social relation extraction from three well-known real-world datasets, which both: trust and recommendation data available, we conclude that using the implicit social relation in social recommendation techniques has almost the same performance compared to the actual trust scores explicitly expressed by users. Hence, we build our method, called Hell-TrustSVD, on top of the state-of-the-art social recommendation technique to incorporate both the extracted implicit social relations and ratings given by users on the prediction of items for an active user. To the best of our knowledge, this is the first work to extend TrustSVD with extracted social trust information. The experimental results support the idea of employing implicit trust into matrix factorization whenever explicit trust is not available, can perform much better than the state-of-the-art approaches in user rating prediction.
Tasks	Recommendation Systems, Relation Extraction
Published	2016-12-05
URL	http://arxiv.org/abs/1612.01428v2
PDF	http://arxiv.org/pdf/1612.01428v2.pdf
PWC	https://paperswithcode.com/paper/extracting-implicit-social-relation-for
Repo
Framework

Self-Averaging Expectation Propagation


Title	Self-Averaging Expectation Propagation
Authors	Burak Çakmak, Manfred Opper, Bernard H. Fleury, Ole Winther
Abstract	We investigate the problem of approximate Bayesian inference for a general class of observation models by means of the expectation propagation (EP) framework for large systems under some statistical assumptions. Our approach tries to overcome the numerical bottleneck of EP caused by the inversion of large matrices. Assuming that the measurement matrices are realizations of specific types of ensembles we use the concept of freeness from random matrix theory to show that the EP cavity variances exhibit an asymptotic self-averaging property. They can be pre-computed using specific generating functions, i.e. the R- and/or S-transforms in free probability, which do not require matrix inversions. Our approach extends the framework of (generalized) approximate message passing – assumes zero-mean iid entries of the measurement matrix – to a general class of random matrix ensembles. The generalization is via a simple formulation of the R- and/or S-transforms of the limiting eigenvalue distribution of the Gramian of the measurement matrix. We demonstrate the performance of our approach on a signal recovery problem of nonlinear compressed sensing and compare it with that of EP.
Tasks	Bayesian Inference
Published	2016-08-23
URL	http://arxiv.org/abs/1608.06602v1
PDF	http://arxiv.org/pdf/1608.06602v1.pdf
PWC	https://paperswithcode.com/paper/self-averaging-expectation-propagation
Repo
Framework

Self-Paced Multi-Task Learning


Title	Self-Paced Multi-Task Learning
Authors	Changsheng Li, Junchi Yan, Fan Wei, Weishan Dong, Qingshan Liu, Hongyuan Zha
Abstract	In this paper, we propose a novel multi-task learning (MTL) framework, called Self-Paced Multi-Task Learning (SPMTL). Different from previous works treating all tasks and instances equally when training, SPMTL attempts to jointly learn the tasks by taking into consideration the complexities of both tasks and instances. This is inspired by the cognitive process of human brain that often learns from the easy to the hard. We construct a compact SPMTL formulation by proposing a new task-oriented regularizer that can jointly prioritize the tasks and the instances. Thus it can be interpreted as a self-paced learner for MTL. A simple yet effective algorithm is designed for optimizing the proposed objective function. An error bound for a simplified formulation is also analyzed theoretically. Experimental results on toy and real-world datasets demonstrate the effectiveness of the proposed approach, compared to the state-of-the-art methods.
Tasks	Multi-Task Learning
Published	2016-04-06
URL	http://arxiv.org/abs/1604.01474v2
PDF	http://arxiv.org/pdf/1604.01474v2.pdf
PWC	https://paperswithcode.com/paper/self-paced-multi-task-learning
Repo
Framework

An incremental linear-time learning algorithm for the Optimum-Path Forest classifier


Title	An incremental linear-time learning algorithm for the Optimum-Path Forest classifier
Authors	Moacir Ponti, Mateus Riva
Abstract	We present a classification method with incremental capabilities based on the Optimum-Path Forest classifier (OPF). The OPF considers instances as nodes of a fully-connected training graph, arc weights represent distances between two feature vectors. Our algorithm includes new instances in an OPF in linear-time, while keeping similar accuracies when compared with the original quadratic-time model.
Tasks
Published	2016-04-12
URL	http://arxiv.org/abs/1604.03346v5
PDF	http://arxiv.org/pdf/1604.03346v5.pdf
PWC	https://paperswithcode.com/paper/an-incremental-linear-time-learning-algorithm
Repo
Framework

Covariance of Motion and Appearance Featuresfor Spatio Temporal Recognition Tasks


Title	Covariance of Motion and Appearance Featuresfor Spatio Temporal Recognition Tasks
Authors	Subhabrata Bhattacharya, Nasim Souly, Mubarak Shah
Abstract	In this paper, we introduce an end-to-end framework for video analysis focused towards practical scenarios built on theoretical foundations from sparse representation, including a novel descriptor for general purpose video analysis. In our approach, we compute kinematic features from optical flow and first and second-order derivatives of intensities to represent motion and appearance respectively. These features are then used to construct covariance matrices which capture joint statistics of both low-level motion and appearance features extracted from a video. Using an over-complete dictionary of the covariance based descriptors built from labeled training samples, we formulate low-level event recognition as a sparse linear approximation problem. Within this, we pose the sparse decomposition of a covariance matrix, which also conforms to the space of semi-positive definite matrices, as a determinant maximization problem. Also since covariance matrices lie on non-linear Riemannian manifolds, we compare our former approach with a sparse linear approximation alternative that is suitable for equivalent vector spaces of covariance matrices. This is done by searching for the best projection of the query data on a dictionary using an Orthogonal Matching pursuit algorithm. We show the applicability of our video descriptor in two different application domains - namely low-level event recognition in unconstrained scenarios and gesture recognition using one shot learning. Our experiments provide promising insights in large scale video analysis.
Tasks	Gesture Recognition, One-Shot Learning, Optical Flow Estimation
Published	2016-06-16
URL	http://arxiv.org/abs/1606.05355v1
PDF	http://arxiv.org/pdf/1606.05355v1.pdf
PWC	https://paperswithcode.com/paper/covariance-of-motion-and-appearance
Repo
Framework


Title	Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling
Authors	Kar Wai Lim, Changyou Chen, Wray Buntine
Abstract	Twitter data is extremely noisy – each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network in a full Bayesian nonparametric way. The TN topic model employs the hierarchical Poisson-Dirichlet processes (PDP) for text modeling and a Gaussian process random function model for social network modeling. We show that the TN topic model significantly outperforms several existing nonparametric models due to its flexibility. Moreover, the TN topic model enables additional informative inference such as authors’ interests, hashtag analysis, as well as leading to further applications such as author recommendation, automatic topic labeling and hashtag suggestion. Note our general inference framework can readily be applied to other topic models with embedded PDP nodes.
Tasks	Topic Models
Published	2016-09-22
URL	http://arxiv.org/abs/1609.06791v1
PDF	http://arxiv.org/pdf/1609.06791v1.pdf
PWC	https://paperswithcode.com/paper/twitter-network-topic-model-a-full-bayesian
Repo
Framework

Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions


Title	Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Authors	Carl-Johann Simon-Gabriel, Bernhard Schölkopf
Abstract	Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures $\mu$ from some set $M$ to functions in a reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of two mapped measures is a semi-metric $d_k$ over $M$. We study three questions. (I) For a given kernel, what sets $M$ can be embedded? (II) When is the embedding injective over $M$ (in which case $d_k$ is a metric)? (III) How does the $d_k$-induced topology compare to other topologies on $M$? The existing machine learning literature has addressed these questions in cases where $M$ is (a subset of) the finite regular Borel measures. We unify, improve and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz-) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, $d_k$ metrises the weak convergence of probability measures if and only if $k$ is continuous and characteristic.
Tasks
Published	2016-04-18
URL	https://arxiv.org/abs/1604.05251v2
PDF	https://arxiv.org/pdf/1604.05251v2.pdf
PWC	https://paperswithcode.com/paper/kernel-distribution-embeddings-universal
Repo
Framework