Paper Group ANR 278
Choice by Elimination via Deep Neural Networks. RhoanaNet Pipeline: Dense Automatic Neural Annotation. Graph Aggregation. TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs. Surrogate-Assisted Partial Order-based Evolutionary Optimisation. Which Learning Algorithms Can Generalize Identity-Based Rules …
Choice by Elimination via Deep Neural Networks
Title | Choice by Elimination via Deep Neural Networks |
Authors | Truyen Tran, Dinh Phung, Svetha Venkatesh |
Abstract | We introduce Neural Choice by Elimination, a new framework that integrates deep neural networks into probabilistic sequential choice models for learning to rank. Given a set of items to chose from, the elimination strategy starts with the whole item set and iteratively eliminates the least worthy item in the remaining subset. We prove that the choice by elimination is equivalent to marginalizing out the random Gompertz latent utilities. Coupled with the choice model is the recently introduced Neural Highway Networks for approximating arbitrarily complex rank functions. We evaluate the proposed framework on a large-scale public dataset with over 425K items, drawn from the Yahoo! learning to rank challenge. It is demonstrated that the proposed method is competitive against state-of-the-art learning to rank methods. |
Tasks | Learning-To-Rank |
Published | 2016-02-17 |
URL | http://arxiv.org/abs/1602.05285v1 |
http://arxiv.org/pdf/1602.05285v1.pdf | |
PWC | https://paperswithcode.com/paper/choice-by-elimination-via-deep-neural |
Repo | |
Framework | |
RhoanaNet Pipeline: Dense Automatic Neural Annotation
Title | RhoanaNet Pipeline: Dense Automatic Neural Annotation |
Authors | Seymour Knowles-Barley, Verena Kaynig, Thouis Ray Jones, Alyssa Wilson, Joshua Morgan, Dongil Lee, Daniel Berger, Narayanan Kasthuri, Jeff W. Lichtman, Hanspeter Pfister |
Abstract | Reconstructing a synaptic wiring diagram, or connectome, from electron microscopy (EM) images of brain tissue currently requires many hours of manual annotation or proofreading (Kasthuri and Lichtman, 2010; Lichtman and Sanes, 2008; Seung, 2009). The desire to reconstruct ever larger and more complex networks has pushed the collection of ever larger EM datasets. A cubic millimeter of raw imaging data would take up 1 PB of storage and present an annotation project that would be impractical without relying heavily on automatic segmentation methods. The RhoanaNet image processing pipeline was developed to automatically segment large volumes of EM data and ease the burden of manual proofreading and annotation. Based on (Kaynig et al., 2015), we updated every stage of the software pipeline to provide better throughput performance and higher quality segmentation results. We used state of the art deep learning techniques to generate improved membrane probability maps, and Gala (Nunez-Iglesias et al., 2014) was used to agglomerate 2D segments into 3D objects. We applied the RhoanaNet pipeline to four densely annotated EM datasets, two from mouse cortex, one from cerebellum and one from mouse lateral geniculate nucleus (LGN). All training and test data is made available for benchmark comparisons. The best segmentation results obtained gave $V^\text{Info}_\text{F-score}$ scores of 0.9054 and 09182 for the cortex datasets, 0.9438 for LGN, and 0.9150 for Cerebellum. The RhoanaNet pipeline is open source software. All source code, training data, test data, and annotations for all four benchmark datasets are available at www.rhoana.org. |
Tasks | |
Published | 2016-11-21 |
URL | http://arxiv.org/abs/1611.06973v1 |
http://arxiv.org/pdf/1611.06973v1.pdf | |
PWC | https://paperswithcode.com/paper/rhoananet-pipeline-dense-automatic-neural |
Repo | |
Framework | |
Graph Aggregation
Title | Graph Aggregation |
Authors | Ulle Endriss, Umberto Grandi |
Abstract | Graph aggregation is the process of computing a single output graph that constitutes a good compromise between several input graphs, each provided by a different source. One needs to perform graph aggregation in a wide variety of situations, e.g., when applying a voting rule (graphs as preference orders), when consolidating conflicting views regarding the relationships between arguments in a debate (graphs as abstract argumentation frameworks), or when computing a consensus between several alternative clusterings of a given dataset (graphs as equivalence relations). In this paper, we introduce a formal framework for graph aggregation grounded in social choice theory. Our focus is on understanding which properties shared by the individual input graphs will transfer to the output graph returned by a given aggregation rule. We consider both common properties of graphs, such as transitivity and reflexivity, and arbitrary properties expressible in certain fragments of modal logic. Our results establish several connections between the types of properties preserved under aggregation and the choice-theoretic axioms satisfied by the rules used. The most important of these results is a powerful impossibility theorem that generalises Arrow’s seminal result for the aggregation of preference orders to a large collection of different types of graphs. |
Tasks | Abstract Argumentation |
Published | 2016-09-13 |
URL | http://arxiv.org/abs/1609.03765v1 |
http://arxiv.org/pdf/1609.03765v1.pdf | |
PWC | https://paperswithcode.com/paper/graph-aggregation |
Repo | |
Framework | |
TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs
Title | TopicResponse: A Marriage of Topic Modelling and Rasch Modelling for Automatic Measurement in MOOCs |
Authors | Jiazhen He, Benjamin I. P. Rubinstein, James Bailey, Rui Zhang, Sandra Milligan |
Abstract | This paper explores the suitability of using automatically discovered topics from MOOC discussion forums for modelling students’ academic abilities. The Rasch model from psychometrics is a popular generative probabilistic model that relates latent student skill, latent item difficulty, and observed student-item responses within a principled, unified framework. According to scholarly educational theory, discovered topics can be regarded as appropriate measurement items if (1) students’ participation across the discovered topics is well fit by the Rasch model, and if (2) the topics are interpretable to subject-matter experts as being educationally meaningful. Such Rasch-scaled topics, with associated difficulty levels, could be of potential benefit to curriculum refinement, student assessment and personalised feedback. The technical challenge that remains, is to discover meaningful topics that simultaneously achieve good statistical fit with the Rasch model. To address this challenge, we combine the Rasch model with non-negative matrix factorisation based topic modelling, jointly fitting both models. We demonstrate the suitability of our approach with quantitative experiments on data from three Coursera MOOCs, and with qualitative survey results on topic interpretability on a Discrete Optimisation MOOC. |
Tasks | |
Published | 2016-07-29 |
URL | http://arxiv.org/abs/1607.08720v2 |
http://arxiv.org/pdf/1607.08720v2.pdf | |
PWC | https://paperswithcode.com/paper/topicresponse-a-marriage-of-topic-modelling |
Repo | |
Framework | |
Surrogate-Assisted Partial Order-based Evolutionary Optimisation
Title | Surrogate-Assisted Partial Order-based Evolutionary Optimisation |
Authors | Vanessa Volz, Günter Rudolph, Boris Naujoks |
Abstract | In this paper, we propose a novel approach (SAPEO) to support the survival selection process in multi-objective evolutionary algorithms with surrogate models - it dynamically chooses individuals to evaluate exactly based on the model uncertainty and the distinctness of the population. We introduce variants that differ in terms of the risk they allow when doing survival selection. Here, the anytime performance of different SAPEO variants is evaluated in conjunction with an SMS-EMOA using the BBOB bi-objective benchmark. We compare the obtained results with the performance of the regular SMS-EMOA, as well as another surrogate-assisted approach. The results open up general questions about the applicability and required conditions for surrogate-assisted multi-objective evolutionary algorithms to be tackled in the future. |
Tasks | |
Published | 2016-11-01 |
URL | http://arxiv.org/abs/1611.00260v1 |
http://arxiv.org/pdf/1611.00260v1.pdf | |
PWC | https://paperswithcode.com/paper/surrogate-assisted-partial-order-based |
Repo | |
Framework | |
Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs?
Title | Which Learning Algorithms Can Generalize Identity-Based Rules to Novel Inputs? |
Authors | Paul Tupper, Bobak Shahriari |
Abstract | We propose a novel framework for the analysis of learning algorithms that allows us to say when such algorithms can and cannot generalize certain patterns from training data to test data. In particular we focus on situations where the rule that must be learned concerns two components of a stimulus being identical. We call such a basis for discrimination an identity-based rule. Identity-based rules have proven to be difficult or impossible for certain types of learning algorithms to acquire from limited datasets. This is in contrast to human behaviour on similar tasks. Here we provide a framework for rigorously establishing which learning algorithms will fail at generalizing identity-based rules to novel stimuli. We use this framework to show that such algorithms are unable to generalize identity-based rules to novel inputs unless trained on virtually all possible inputs. We demonstrate these results computationally with a multilayer feedforward neural network. |
Tasks | |
Published | 2016-05-12 |
URL | http://arxiv.org/abs/1605.04002v1 |
http://arxiv.org/pdf/1605.04002v1.pdf | |
PWC | https://paperswithcode.com/paper/which-learning-algorithms-can-generalize |
Repo | |
Framework | |
Data-Driven Threshold Machine: Scan Statistics, Change-Point Detection, and Extreme Bandits
Title | Data-Driven Threshold Machine: Scan Statistics, Change-Point Detection, and Extreme Bandits |
Authors | Shuang Li, Yao Xie, Le Song |
Abstract | We present a novel distribution-free approach, the data-driven threshold machine (DTM), for a fundamental problem at the core of many learning tasks: choose a threshold for a given pre-specified level that bounds the tail probability of the maximum of a (possibly dependent but stationary) random sequence. We do not assume data distribution, but rather relying on the asymptotic distribution of extremal values, and reduce the problem to estimate three parameters of the extreme value distributions and the extremal index. We specially take care of data dependence via estimating extremal index since in many settings, such as scan statistics, change-point detection, and extreme bandits, where dependence in the sequence of statistics can be significant. Key features of our DTM also include robustness and the computational efficiency, and it only requires one sample path to form a reliable estimate of the threshold, in contrast to the Monte Carlo sampling approach which requires drawing a large number of sample paths. We demonstrate the good performance of DTM via numerical examples in various dependent settings. |
Tasks | Change Point Detection |
Published | 2016-10-14 |
URL | http://arxiv.org/abs/1610.04599v1 |
http://arxiv.org/pdf/1610.04599v1.pdf | |
PWC | https://paperswithcode.com/paper/data-driven-threshold-machine-scan-statistics |
Repo | |
Framework | |
Exact Bayesian inference for off-line change-point detection in tree-structured graphical models
Title | Exact Bayesian inference for off-line change-point detection in tree-structured graphical models |
Authors | Loïc Schwaller, Stéphane Robin |
Abstract | We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same time by combining classical dynamic programming with algebraic results pertaining to spanning trees. In particular, we show that quantities such as posterior distributions for change-points or posterior edge probabilities over time can efficiently be obtained. We illustrate our results on both synthetic and experimental data arising from biology and neuroscience. |
Tasks | Bayesian Inference, Change Point Detection, Time Series |
Published | 2016-03-25 |
URL | http://arxiv.org/abs/1603.07871v2 |
http://arxiv.org/pdf/1603.07871v2.pdf | |
PWC | https://paperswithcode.com/paper/exact-bayesian-inference-for-off-line-change |
Repo | |
Framework | |
Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction
Title | Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction |
Authors | Seyed Mohammad Taheri, Hamidreza Mahyar, Mohammad Firouzi, Elahe Ghalebi K., Radu Grosu, Ali Movaghar |
Abstract | Recommendation plays an increasingly important role in our daily lives. Recommender systems automatically suggest items to users that might be interesting for them. Recent studies illustrate that incorporating social trust in Matrix Factorization methods demonstrably improves accuracy of rating prediction. Such approaches mainly use the trust scores explicitly expressed by users. However, it is often challenging to have users provide explicit trust scores of each other. There exist quite a few works, which propose Trust Metrics to compute and predict trust scores between users based on their interactions. In this paper, first we present how social relation can be extracted from users’ ratings to items by describing Hellinger distance between users in recommender systems. Then, we propose to incorporate the predicted trust scores into social matrix factorization models. By analyzing social relation extraction from three well-known real-world datasets, which both: trust and recommendation data available, we conclude that using the implicit social relation in social recommendation techniques has almost the same performance compared to the actual trust scores explicitly expressed by users. Hence, we build our method, called Hell-TrustSVD, on top of the state-of-the-art social recommendation technique to incorporate both the extracted implicit social relations and ratings given by users on the prediction of items for an active user. To the best of our knowledge, this is the first work to extend TrustSVD with extracted social trust information. The experimental results support the idea of employing implicit trust into matrix factorization whenever explicit trust is not available, can perform much better than the state-of-the-art approaches in user rating prediction. |
Tasks | Recommendation Systems, Relation Extraction |
Published | 2016-12-05 |
URL | http://arxiv.org/abs/1612.01428v2 |
http://arxiv.org/pdf/1612.01428v2.pdf | |
PWC | https://paperswithcode.com/paper/extracting-implicit-social-relation-for |
Repo | |
Framework | |
Self-Averaging Expectation Propagation
Title | Self-Averaging Expectation Propagation |
Authors | Burak Çakmak, Manfred Opper, Bernard H. Fleury, Ole Winther |
Abstract | We investigate the problem of approximate Bayesian inference for a general class of observation models by means of the expectation propagation (EP) framework for large systems under some statistical assumptions. Our approach tries to overcome the numerical bottleneck of EP caused by the inversion of large matrices. Assuming that the measurement matrices are realizations of specific types of ensembles we use the concept of freeness from random matrix theory to show that the EP cavity variances exhibit an asymptotic self-averaging property. They can be pre-computed using specific generating functions, i.e. the R- and/or S-transforms in free probability, which do not require matrix inversions. Our approach extends the framework of (generalized) approximate message passing – assumes zero-mean iid entries of the measurement matrix – to a general class of random matrix ensembles. The generalization is via a simple formulation of the R- and/or S-transforms of the limiting eigenvalue distribution of the Gramian of the measurement matrix. We demonstrate the performance of our approach on a signal recovery problem of nonlinear compressed sensing and compare it with that of EP. |
Tasks | Bayesian Inference |
Published | 2016-08-23 |
URL | http://arxiv.org/abs/1608.06602v1 |
http://arxiv.org/pdf/1608.06602v1.pdf | |
PWC | https://paperswithcode.com/paper/self-averaging-expectation-propagation |
Repo | |
Framework | |
Self-Paced Multi-Task Learning
Title | Self-Paced Multi-Task Learning |
Authors | Changsheng Li, Junchi Yan, Fan Wei, Weishan Dong, Qingshan Liu, Hongyuan Zha |
Abstract | In this paper, we propose a novel multi-task learning (MTL) framework, called Self-Paced Multi-Task Learning (SPMTL). Different from previous works treating all tasks and instances equally when training, SPMTL attempts to jointly learn the tasks by taking into consideration the complexities of both tasks and instances. This is inspired by the cognitive process of human brain that often learns from the easy to the hard. We construct a compact SPMTL formulation by proposing a new task-oriented regularizer that can jointly prioritize the tasks and the instances. Thus it can be interpreted as a self-paced learner for MTL. A simple yet effective algorithm is designed for optimizing the proposed objective function. An error bound for a simplified formulation is also analyzed theoretically. Experimental results on toy and real-world datasets demonstrate the effectiveness of the proposed approach, compared to the state-of-the-art methods. |
Tasks | Multi-Task Learning |
Published | 2016-04-06 |
URL | http://arxiv.org/abs/1604.01474v2 |
http://arxiv.org/pdf/1604.01474v2.pdf | |
PWC | https://paperswithcode.com/paper/self-paced-multi-task-learning |
Repo | |
Framework | |
An incremental linear-time learning algorithm for the Optimum-Path Forest classifier
Title | An incremental linear-time learning algorithm for the Optimum-Path Forest classifier |
Authors | Moacir Ponti, Mateus Riva |
Abstract | We present a classification method with incremental capabilities based on the Optimum-Path Forest classifier (OPF). The OPF considers instances as nodes of a fully-connected training graph, arc weights represent distances between two feature vectors. Our algorithm includes new instances in an OPF in linear-time, while keeping similar accuracies when compared with the original quadratic-time model. |
Tasks | |
Published | 2016-04-12 |
URL | http://arxiv.org/abs/1604.03346v5 |
http://arxiv.org/pdf/1604.03346v5.pdf | |
PWC | https://paperswithcode.com/paper/an-incremental-linear-time-learning-algorithm |
Repo | |
Framework | |
Covariance of Motion and Appearance Featuresfor Spatio Temporal Recognition Tasks
Title | Covariance of Motion and Appearance Featuresfor Spatio Temporal Recognition Tasks |
Authors | Subhabrata Bhattacharya, Nasim Souly, Mubarak Shah |
Abstract | In this paper, we introduce an end-to-end framework for video analysis focused towards practical scenarios built on theoretical foundations from sparse representation, including a novel descriptor for general purpose video analysis. In our approach, we compute kinematic features from optical flow and first and second-order derivatives of intensities to represent motion and appearance respectively. These features are then used to construct covariance matrices which capture joint statistics of both low-level motion and appearance features extracted from a video. Using an over-complete dictionary of the covariance based descriptors built from labeled training samples, we formulate low-level event recognition as a sparse linear approximation problem. Within this, we pose the sparse decomposition of a covariance matrix, which also conforms to the space of semi-positive definite matrices, as a determinant maximization problem. Also since covariance matrices lie on non-linear Riemannian manifolds, we compare our former approach with a sparse linear approximation alternative that is suitable for equivalent vector spaces of covariance matrices. This is done by searching for the best projection of the query data on a dictionary using an Orthogonal Matching pursuit algorithm. We show the applicability of our video descriptor in two different application domains - namely low-level event recognition in unconstrained scenarios and gesture recognition using one shot learning. Our experiments provide promising insights in large scale video analysis. |
Tasks | Gesture Recognition, One-Shot Learning, Optical Flow Estimation |
Published | 2016-06-16 |
URL | http://arxiv.org/abs/1606.05355v1 |
http://arxiv.org/pdf/1606.05355v1.pdf | |
PWC | https://paperswithcode.com/paper/covariance-of-motion-and-appearance |
Repo | |
Framework | |
Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling
Title | Twitter-Network Topic Model: A Full Bayesian Treatment for Social Network and Text Modeling |
Authors | Kar Wai Lim, Changyou Chen, Wray Buntine |
Abstract | Twitter data is extremely noisy – each tweet is short, unstructured and with informal language, a challenge for current topic modeling. On the other hand, tweets are accompanied by extra information such as authorship, hashtags and the user-follower network. Exploiting this additional information, we propose the Twitter-Network (TN) topic model to jointly model the text and the social network in a full Bayesian nonparametric way. The TN topic model employs the hierarchical Poisson-Dirichlet processes (PDP) for text modeling and a Gaussian process random function model for social network modeling. We show that the TN topic model significantly outperforms several existing nonparametric models due to its flexibility. Moreover, the TN topic model enables additional informative inference such as authors’ interests, hashtag analysis, as well as leading to further applications such as author recommendation, automatic topic labeling and hashtag suggestion. Note our general inference framework can readily be applied to other topic models with embedded PDP nodes. |
Tasks | Topic Models |
Published | 2016-09-22 |
URL | http://arxiv.org/abs/1609.06791v1 |
http://arxiv.org/pdf/1609.06791v1.pdf | |
PWC | https://paperswithcode.com/paper/twitter-network-topic-model-a-full-bayesian |
Repo | |
Framework | |
Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Title | Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions |
Authors | Carl-Johann Simon-Gabriel, Bernhard Schölkopf |
Abstract | Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures $\mu$ from some set $M$ to functions in a reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of two mapped measures is a semi-metric $d_k$ over $M$. We study three questions. (I) For a given kernel, what sets $M$ can be embedded? (II) When is the embedding injective over $M$ (in which case $d_k$ is a metric)? (III) How does the $d_k$-induced topology compare to other topologies on $M$? The existing machine learning literature has addressed these questions in cases where $M$ is (a subset of) the finite regular Borel measures. We unify, improve and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz-) distributions, i.e., generalised measures, but the reader is free to focus on measures only. In particular, we systemise and extend various (partly known) equivalences between different notions of universal, characteristic and strictly positive definite kernels, and show that on an underlying locally compact Hausdorff space, $d_k$ metrises the weak convergence of probability measures if and only if $k$ is continuous and characteristic. |
Tasks | |
Published | 2016-04-18 |
URL | https://arxiv.org/abs/1604.05251v2 |
https://arxiv.org/pdf/1604.05251v2.pdf | |
PWC | https://paperswithcode.com/paper/kernel-distribution-embeddings-universal |
Repo | |
Framework | |