Paper Group ANR 568
Using Redescription Mining to Relate Clinical and Biological Characteristics of Cognitively Impaired and Alzheimer’s Disease Patients
Title | Using Redescription Mining to Relate Clinical and Biological Characteristics of Cognitively Impaired and Alzheimer’s Disease Patients |
Authors | Matej Mihelčić, Goran Šimić, Mirjana Babić Leko, Nada Lavrač, Sašo Džeroski, Tomislav Šmuc |
Abstract | We used redescription mining to find interpretable rules revealing associations between clinical and biological determinants that provide insights into Alzheimer’s disease (AD). We extended the CLUS-RM redescription mining algorithm to a constraint-based redescription mining (CBRM) setting, which enables several modes of targeted exploration of specific, user-constrained associations. Redescription mining enabled finding specific constructs of clinical and biological attributes that describe many groups of subjects of different size, homogeneity and level of cognitive impairment. We confirmed some previously known findings. However, in some instances, as with the attributes testosterone, the imaging attribute Spatial Pattern of Abnormalities for Recognition of Early AD, and the levels of leptin and angiopoietin-2 in plasma, we corroborated previously debatable findings or provided additional information about these variables and their association with AD pathogenesis. Applying redescription mining to ADNI data resulted in the discovery of one largely unknown attribute: Pregnancy-Associated Protein-A (PAPP-A), which we found highly associated with cognitive impairment in AD. Statistically significant correlations (p <= 0.01) were found between PAPP-A and various clinical tests. The importance of this finding lies in the fact that PAPP-A is a metalloproteinase known to cleave insulin-like growth factor binding proteins. Since it also shares similar substrates with the A Disintegrin and Metalloproteinase family of enzymes that act as α-secretase to physiologically cleave amyloid precursor protein (APP) in the non-amyloidogenic pathway, it could be directly involved in the metabolism of APP very early during the disease course. Therefore, further studies should investigate the role of PAPP-A in the development of AD more thoroughly. |
Tasks | |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.06831v2 |
http://arxiv.org/pdf/1702.06831v2.pdf | |
PWC | https://paperswithcode.com/paper/using-redescription-mining-to-relate-clinical |
Repo | |
Framework | |
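To make the redescription idea concrete: a redescription pairs a query over one attribute set (e.g. clinical) with a query over another (e.g. biological) so that both describe nearly the same subjects, scored by the Jaccard similarity of their supports. A toy sketch on synthetic data (the attributes, thresholds, and the coupling between them are hypothetical, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cohort: one clinical and one biological attribute per subject.
n = 500
mmse = rng.normal(24, 4, n)                  # clinical: cognitive test score
papp_a = rng.normal(10, 3, n) - 0.3 * mmse   # biological: coupled to cognition

# A redescription is a pair of queries over different attribute sets that
# describe (nearly) the same subjects; quality is the Jaccard similarity.
clinical_query = mmse < 22                   # "cognitively impaired"
biological_query = papp_a > 3.5              # "elevated PAPP-A"

support_both = np.sum(clinical_query & biological_query)
support_either = np.sum(clinical_query | biological_query)
jaccard = support_both / support_either
```

A miner such as CLUS-RM searches over many such query pairs, keeping those with high Jaccard and sufficient support.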
Nonlinear network-based quantitative trait prediction from transcriptomic data
Title | Nonlinear network-based quantitative trait prediction from transcriptomic data |
Authors | Emilie Devijver, Mélina Gallopin, Emeline Perthame |
Abstract | Quantitatively predicting phenotype variables from the expression changes in a set of candidate genes is of great interest in molecular biology, but it is also a challenging task for several reasons. First, the collected biological observations might be heterogeneous and correspond to different biological mechanisms. Second, the gene expression variables used to predict the phenotype are potentially highly correlated, since genes interact through unknown regulatory networks. In this paper, we present a novel approach designed to predict a quantitative trait from transcriptomic data, taking into account the heterogeneity in biological samples and the hidden gene regulatory networks underlying different biological mechanisms. The proposed model performs well on prediction, and it is also fully parametric, which facilitates downstream biological interpretation. The model provides clusters of individuals based on the relation between gene expression data and the phenotype, and also leads to the inference of a gene regulatory network specific to each cluster of individuals. We perform numerical simulations to demonstrate that our model is competitive with other prediction models, and we demonstrate its predictive performance and interpretability by predicting alcohol sensitivity from transcriptomic data on real data from the Drosophila melanogaster Genetic Reference Panel (DGRP). |
Tasks | |
Published | 2017-01-26 |
URL | http://arxiv.org/abs/1701.07899v5 |
http://arxiv.org/pdf/1701.07899v5.pdf | |
PWC | https://paperswithcode.com/paper/nonlinear-network-based-quantitative-trait |
Repo | |
Framework | |
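The clustering-plus-prediction idea — latent groups of individuals, each with its own expression-to-trait regression — can be sketched as an EM algorithm for a mixture of linear regressions on synthetic data. The paper's actual model additionally infers a per-cluster gene network; everything below (sizes, shared noise variance, random initialisation) is an illustrative stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: individuals fall into latent clusters, each with its
# own linear gene-expression -> trait map.
n, p, K = 200, 3, 2
X = rng.standard_normal((n, p))
true_beta = np.array([[2.0, 0.0, -1.0], [-1.5, 1.0, 0.5]])
z_true = rng.integers(0, K, n)
y = np.einsum('ij,ij->i', X, true_beta[z_true]) + 0.1 * rng.standard_normal(n)

# EM for a mixture of linear regressions (shared noise variance, for brevity).
beta = rng.standard_normal((K, p))
pi, sigma2 = np.full(K, 1.0 / K), 1.0
for _ in range(100):
    # E-step: responsibilities from Gaussian log-likelihoods.
    resid = y[:, None] - X @ beta.T                 # (n, K)
    logp = np.log(pi) - 0.5 * resid ** 2 / sigma2
    logp -= logp.max(axis=1, keepdims=True)
    r = np.exp(logp)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: weighted least squares per cluster.
    for k in range(K):
        Wk = r[:, k] + 1e-8                         # floor avoids a singular system
        beta[k] = np.linalg.solve(X.T @ (Wk[:, None] * X), X.T @ (Wk * y))
    pi = np.clip(r.mean(axis=0), 1e-12, None)
    sigma2 = np.sum(r * (y[:, None] - X @ beta.T) ** 2) / n

cluster = r.argmax(axis=1)   # inferred cluster per individual
```

The responsibilities `r` play the role of the model's soft clustering of individuals by the expression–phenotype relation.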
Probabilistic Vehicle Trajectory Prediction over Occupancy Grid Map via Recurrent Neural Network
Title | Probabilistic Vehicle Trajectory Prediction over Occupancy Grid Map via Recurrent Neural Network |
Authors | ByeoungDo Kim, Chang Mook Kang, Seung Hi Lee, Hyunmin Chae, Jaekyum Kim, Chung Choo Chung, Jun Won Choi |
Abstract | In this paper, we propose an efficient vehicle trajectory prediction framework based on a recurrent neural network. The characteristics of a vehicle’s trajectory differ from those of regular moving objects, since the trajectory is affected by various latent factors including road structure, traffic rules, and the driver’s intention. Previous state-of-the-art approaches use a sophisticated vehicle behavior model describing these factors and derive a complex trajectory prediction algorithm, which requires a system designer to conduct intensive model optimization for practical use. Our approach is data-driven and simple to use in that it learns the complex behavior of vehicles from a massive amount of trajectory data through a deep neural network model. The proposed trajectory prediction method employs a recurrent neural network, the long short-term memory (LSTM), to analyze the temporal behavior and predict the future coordinates of the surrounding vehicles. The proposed scheme feeds the sequence of vehicles’ coordinates obtained from sensor measurements to the LSTM and produces probabilistic information on the future location of the vehicles over an occupancy grid map. Experiments conducted using data collected from highway driving show that the proposed method can produce reasonably good estimates of future trajectories. |
Tasks | Trajectory Prediction |
Published | 2017-04-24 |
URL | http://arxiv.org/abs/1704.07049v2 |
http://arxiv.org/pdf/1704.07049v2.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-vehicle-trajectory-prediction |
Repo | |
Framework | |
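As a sketch of the pipeline the abstract describes — coordinate sequences in, a probability distribution over occupancy-grid cells out — here is a minimal NumPy LSTM with randomly initialised weights standing in for trained parameters. All sizes are hypothetical; the paper does not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

INPUT_DIM = 2        # (x, y) coordinate of a surrounding vehicle
HIDDEN_DIM = 16
GRID_CELLS = 8 * 8   # occupancy grid flattened to 64 cells

# Random parameters stand in for trained weights.
Wx = rng.normal(0, 0.1, (4 * HIDDEN_DIM, INPUT_DIM))
Wh = rng.normal(0, 0.1, (4 * HIDDEN_DIM, HIDDEN_DIM))
b = np.zeros(4 * HIDDEN_DIM)
W_out = rng.normal(0, 0.1, (GRID_CELLS, HIDDEN_DIM))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    """One LSTM step: gates computed from input x and previous state (h, c)."""
    z = Wx @ x + Wh @ h + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def predict_occupancy(trajectory):
    """Feed a coordinate sequence through the LSTM; softmax over grid cells."""
    h, c = np.zeros(HIDDEN_DIM), np.zeros(HIDDEN_DIM)
    for x in trajectory:
        h, c = lstm_step(np.asarray(x, float), h, c)
    logits = W_out @ h
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Usage: a short observed trajectory of one vehicle.
probs = predict_occupancy([(0.0, 0.0), (1.0, 0.2), (2.1, 0.5)])
```

The softmax head is what turns the hidden state into the "probabilistic information over the occupancy grid" the abstract mentions.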
Tree Memory Networks for Modelling Long-term Temporal Dependencies
Title | Tree Memory Networks for Modelling Long-term Temporal Dependencies |
Authors | Tharindu Fernando, Simon Denman, Aaron McFadyen, Sridha Sridharan, Clinton Fookes |
Abstract | In the domain of sequence modelling, Recurrent Neural Networks (RNNs) have achieved impressive results in a variety of application areas, including visual question answering, part-of-speech tagging and machine translation. However, this success in modelling short-term dependencies has not transferred to application areas such as trajectory prediction, which require capturing both short-term and long-term relationships. In this paper, we propose a Tree Memory Network (TMN) for modelling long-term and short-term relationships in sequence-to-sequence mapping problems. The proposed network architecture is composed of an input module, a controller and a memory module. In contrast to related literature, which models the memory as a sequence of historical states, we model the memory as a recursive tree structure. This structure more effectively captures temporal dependencies across both short-term and long-term sequences via its hierarchy. We demonstrate the effectiveness and flexibility of the proposed TMN in two practical problems, aircraft trajectory modelling and pedestrian trajectory modelling in a surveillance setting, and in both cases we outperform the current state-of-the-art. Furthermore, we perform an in-depth analysis of the evolution of the memory module content over time and provide visual evidence of how the proposed TMN maps both long-term and short-term relationships efficiently via its hierarchical structure. |
Tasks | Machine Translation, Part-Of-Speech Tagging, Question Answering, Trajectory Prediction, Visual Question Answering |
Published | 2017-03-12 |
URL | http://arxiv.org/abs/1703.04706v2 |
http://arxiv.org/pdf/1703.04706v2.pdf | |
PWC | https://paperswithcode.com/paper/tree-memory-networks-for-modelling-long-term |
Repo | |
Framework | |
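The recursive tree memory can be sketched as follows: historical hidden states are folded pairwise up a binary tree by a combination function, so the root summarises both recent (shallow) and distant (deep) history. The dimensions and the randomly initialised combiner below are hypothetical stand-ins for the learned components described in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                                  # hidden state size (hypothetical)
W = rng.normal(0, 0.3, (D, 2 * D))     # stands in for a learned combiner

def combine(left, right):
    """Parent node = nonlinearity of a learned map of its two children."""
    return np.tanh(W @ np.concatenate([left, right]))

def tree_memory(states):
    """Fold a sequence of historical states into a binary memory tree;
    the root summarises the whole history hierarchically."""
    level = list(states)
    while len(level) > 1:
        if len(level) % 2:             # carry an odd node up unchanged
            level, carry = level[:-1], [level[-1]]
        else:
            carry = []
        level = [combine(level[i], level[i + 1])
                 for i in range(0, len(level), 2)] + carry
    return level[0]

root = tree_memory([rng.standard_normal(D) for _ in range(6)])
```

In the full architecture the controller attends over the tree nodes rather than reading only the root; this sketch shows just the recursive memory construction.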
Preliminary results on Ontology-based Open Data Publishing
Title | Preliminary results on Ontology-based Open Data Publishing |
Authors | Gianluca Cima |
Abstract | Despite the current interest in Open Data publishing, a formal and comprehensive methodology that supports an organization in deciding which data to publish, and that provides precise procedures for publishing high-quality data, is still missing. In this paper we argue that the Ontology-based Data Management paradigm can provide a formal basis for a principled approach to publishing high-quality, semantically annotated Open Data. We describe two main approaches to using an ontology for this endeavor, and then present some technical results on one of them, called bottom-up, where the specification of the data to be published is given in terms of the sources, and specific techniques allow deriving suitable annotations for interpreting the published data in the light of the ontology. |
Tasks | |
Published | 2017-05-30 |
URL | http://arxiv.org/abs/1705.10480v2 |
http://arxiv.org/pdf/1705.10480v2.pdf | |
PWC | https://paperswithcode.com/paper/preliminary-results-on-ontology-based-open |
Repo | |
Framework | |
Structured Sparse Modelling with Hierarchical GP
Title | Structured Sparse Modelling with Hierarchical GP |
Authors | Danil Kuzin, Olga Isupova, Lyudmila Mihaylova |
Abstract | In this paper a new Bayesian model for sparse linear regression with a spatio-temporal structure is proposed. It incorporates structural assumptions through a hierarchical Gaussian process prior on the spike-and-slab coefficients. We design an inference algorithm based on Expectation Propagation and evaluate the model on real data. |
Tasks | |
Published | 2017-04-27 |
URL | http://arxiv.org/abs/1704.08727v1 |
http://arxiv.org/pdf/1704.08727v1.pdf | |
PWC | https://paperswithcode.com/paper/structured-sparse-modelling-with-hierarchical |
Repo | |
Framework | |
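A minimal sketch of the prior structure described here: a Gaussian process draw is turned into spatially smooth inclusion probabilities for spike-and-slab regression coefficients, so nearby coefficients tend to be jointly zero or jointly active. The logistic link, kernel, and sizes are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical spatial grid of coefficient locations.
n = 50
t = np.linspace(0, 1, n)

# Squared-exponential GP covariance encodes spatial smoothness.
lengthscale = 0.1
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / lengthscale ** 2)
K += 1e-6 * np.eye(n)                  # jitter for numerical stability

# GP draw -> smoothly varying inclusion probabilities for the spikes.
g = np.linalg.cholesky(K) @ rng.standard_normal(n)
p_include = 1.0 / (1.0 + np.exp(-g))   # logistic link (a modelling choice)

# Spike-and-slab draw: zero with prob 1 - p, Gaussian slab otherwise.
spikes = rng.random(n) < p_include
w = np.where(spikes, rng.normal(0.0, 1.0, n), 0.0)
```

Sampling the prior like this is only for intuition; the paper performs posterior inference over this structure with Expectation Propagation.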
The Imprecisions of Precision Measures in Process Mining
Title | The Imprecisions of Precision Measures in Process Mining |
Authors | Niek Tax, Xixi Lu, Natalia Sidorova, Dirk Fahland, Wil M. P. van der Aalst |
Abstract | In process mining, precision measures are used to quantify how much a process model over-approximates the behavior seen in an event log. Although several measures have been proposed over the years, no research has been done to validate whether these measures achieve the intended aim of quantifying over-approximation in a consistent way for all models and logs. This paper fills this gap by postulating a number of axioms for quantifying precision consistently for any log and any model. Further, we show through counter-examples that none of the existing measures consistently quantifies precision. |
Tasks | |
Published | 2017-05-03 |
URL | http://arxiv.org/abs/1705.03303v2 |
http://arxiv.org/pdf/1705.03303v2.pdf | |
PWC | https://paperswithcode.com/paper/the-imprecisions-of-precision-measures-in |
Repo | |
Framework | |
Semi-supervised learning
Title | Semi-supervised learning |
Authors | Alejandro Cholaquidis, Ricardo Fraiman, Mariela Sued |
Abstract | Semi-supervised learning deals with the problem of how, if possible, to take advantage of a huge amount of unclassified data to perform classification in situations where, typically, the labelled data are few. Even though this is not always possible (it depends on how useful knowing the distribution of the unlabelled data is for inferring the labels), several algorithms have been proposed recently. We propose a new algorithm that, under almost necessary conditions, asymptotically attains the performance of the best theoretical rule as the size of the unlabeled data tends to infinity. The set of necessary assumptions, although reasonable, shows that semi-parametric classification only works for very well conditioned problems. |
Tasks | |
Published | 2017-09-17 |
URL | http://arxiv.org/abs/1709.05673v2 |
http://arxiv.org/pdf/1709.05673v2.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning |
Repo | |
Framework | |
The reparameterization trick for acquisition functions
Title | The reparameterization trick for acquisition functions |
Authors | James T. Wilson, Riccardo Moriconi, Frank Hutter, Marc Peter Deisenroth |
Abstract | Bayesian optimization is a sample-efficient approach to solving global optimization problems. Along with a surrogate model, this approach relies on theoretically motivated value heuristics (acquisition functions) to guide the search process. Maximizing acquisition functions yields the best performance; unfortunately, this ideal is difficult to achieve since optimizing acquisition functions per se is frequently non-trivial. This statement is especially true in the parallel setting, where acquisition functions are routinely non-convex, high-dimensional, and intractable. Here, we demonstrate how many popular acquisition functions can be formulated as Gaussian integrals amenable to the reparameterization trick and, in consequence, to gradient-based optimization. Further, we use this reparameterized representation to derive an efficient Monte Carlo estimator for the upper confidence bound acquisition function in the context of parallel selection. |
Tasks | |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00424v1 |
http://arxiv.org/pdf/1712.00424v1.pdf | |
PWC | https://paperswithcode.com/paper/the-reparameterization-trick-for-acquisition |
Repo | |
Framework | |
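The core idea can be sketched concretely: a q-point acquisition such as parallel expected improvement is a Gaussian integral, so drawing base noise z ~ N(0, I) and mapping it through y = mu + Lz makes the Monte Carlo estimate a deterministic (and differentiable) function of the posterior parameters. The posterior below is a hypothetical example, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GP posterior over a batch of q = 3 candidate points.
mu = np.array([0.2, 0.5, 0.1])                   # posterior means
Sigma = np.array([[0.30, 0.05, 0.00],
                  [0.05, 0.20, 0.02],
                  [0.00, 0.02, 0.25]])           # posterior covariance
best_so_far = 0.4                                # incumbent value

def q_expected_improvement(mu, Sigma, best, n_samples=100_000):
    """Monte Carlo q-EI via the reparameterization trick:
    y = mu + L z with z ~ N(0, I), so each sample is a deterministic
    function of (mu, Sigma) given the fixed base noise z."""
    L = np.linalg.cholesky(Sigma)
    z = rng.standard_normal((n_samples, len(mu)))
    y = mu + z @ L.T                  # reparameterized posterior samples
    improvement = np.maximum(y.max(axis=1) - best, 0.0)
    return improvement.mean()

qei = q_expected_improvement(mu, Sigma, best_so_far)
```

Because the samples depend smoothly on mu and Sigma, gradients of this estimator with respect to the candidate locations can be taken through the Cholesky map, which is what enables gradient-based acquisition optimization.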
Sobolev Norm Learning Rates for Regularized Least-Squares Algorithm
Title | Sobolev Norm Learning Rates for Regularized Least-Squares Algorithm |
Authors | Simon Fischer, Ingo Steinwart |
Abstract | Learning rates for least-squares regression are typically expressed in terms of $L_2$-norms. In this paper we extend these rates to norms stronger than the $L_2$-norm without requiring the regression function to be contained in the hypothesis space. In the special case of Sobolev reproducing kernel Hilbert spaces used as hypothesis spaces, these stronger norms coincide with fractional Sobolev norms between the Sobolev space used and $L_2$. As a consequence, not only the target function but also some of its derivatives can be estimated without changing the algorithm. From a technical point of view, we combine the well-known integral operator techniques with an embedding property, which so far has only been used in combination with empirical-process arguments. This combination results in new finite-sample bounds with respect to the stronger norms. From these finite-sample bounds our rates easily follow. Finally, we prove the asymptotic optimality of our results in many cases. |
Tasks | |
Published | 2017-02-23 |
URL | https://arxiv.org/abs/1702.07254v2 |
https://arxiv.org/pdf/1702.07254v2.pdf | |
PWC | https://paperswithcode.com/paper/sobolev-norm-learning-rates-for-regularized |
Repo | |
Framework | |
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Title | Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints |
Authors | Jieyu Zhao, Tianlu Wang, Mark Yatskar, Vicente Ordonez, Kai-Wei Chang |
Abstract | Language is increasingly being used to define rich visual recognition problems with supporting image collections sourced from the web. Structured prediction models are used in these tasks to take advantage of correlations between co-occurring labels and visual input but risk inadvertently encoding social biases found in web corpora. In this work, we study data and models associated with multilabel object classification and visual semantic role labeling. We find that (a) datasets for these tasks contain significant gender bias and (b) models trained on these datasets further amplify existing bias. For example, the activity cooking is over 33% more likely to involve females than males in a training set, and a trained model further amplifies the disparity to 68% at test time. We propose to inject corpus-level constraints for calibrating existing structured prediction models and design an algorithm based on Lagrangian relaxation for collective inference. Our method results in almost no performance loss for the underlying recognition task but decreases the magnitude of bias amplification by 47.5% and 40.5% for multilabel classification and visual semantic role labeling, respectively. |
Tasks | Object Classification, Semantic Role Labeling, Structured Prediction |
Published | 2017-07-29 |
URL | http://arxiv.org/abs/1707.09457v1 |
http://arxiv.org/pdf/1707.09457v1.pdf | |
PWC | https://paperswithcode.com/paper/men-also-like-shopping-reducing-gender-bias |
Repo | |
Framework | |
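In the simplest binary case, the calibration idea reduces to a one-dimensional Lagrangian search: shift every score by a multiplier until the corpus-level prediction ratio matches the training ratio. This toy sketch uses synthetic scores and bisection in place of the paper's general Lagrangian-relaxation inference over structured outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model scores: log-odds that each test image is labelled
# "woman cooking" rather than "man cooking" (amplified bias).
scores = rng.normal(0.8, 1.0, 1000)
target_ratio = 0.66          # corpus-level constraint from training data

def predicted_ratio(scores, lam):
    """Fraction labelled 'woman' after shifting every score by -lam."""
    return np.mean(scores - lam > 0)

# One-dimensional Lagrangian search: the ratio is monotone in lam,
# so bisection finds the multiplier that satisfies the constraint.
lo, hi = -10.0, 10.0
for _ in range(60):
    lam = 0.5 * (lo + hi)
    if predicted_ratio(scores, lam) > target_ratio:
        lo = lam             # still over-predicting "woman": shift harder
    else:
        hi = lam

calibrated = predicted_ratio(scores, lam)
```

The same multiplier idea extends to structured models by adding one Lagrange multiplier per corpus-level constraint and re-running collective inference.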
Gradient-based Inference for Networks with Output Constraints
Title | Gradient-based Inference for Networks with Output Constraints |
Authors | Jay Yoon Lee, Sanket Vaibhav Mehta, Michael Wick, Jean-Baptiste Tristan, Jaime Carbonell |
Abstract | Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test-time, we nudge continuous model weights until the network’s unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints but improves accuracy, even when the underlying network is state-of-the-art. |
Tasks | Constituency Parsing, Semantic Role Labeling, Structured Prediction |
Published | 2017-07-26 |
URL | http://arxiv.org/abs/1707.08608v3 |
http://arxiv.org/pdf/1707.08608v3.pdf | |
PWC | https://paperswithcode.com/paper/gradient-based-inference-for-networks-with |
Repo | |
Framework | |
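A toy illustration of the gradient-based inference idea: for a single test input, the weights of a (here trivial, logistic) model are nudged down the gradient of a constraint penalty until the output satisfies the constraint. The model, the constraint, and the step size are all hypothetical stand-ins for the paper's networks and hard output constraints:

```python
import numpy as np

# Toy stand-in for a trained network: a logistic scorer whose output
# must satisfy a hard constraint (here: probability of tag 1 >= 0.9).
w = np.array([0.5, -0.3])
x = np.array([1.0, 2.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w):
    return sigmoid(w @ x)

def gbi(w, target=0.9, lr=0.05, steps=500):
    """Gradient-based inference: at test time, descend the hinge penalty
    max(0, target - p) on the weights, stopping once the constraint holds."""
    w = w.copy()
    for _ in range(steps):
        p = forward(w)
        if p >= target:
            break
        w = w + lr * p * (1 - p) * x   # negative gradient of the penalty
    return w

w_nudged = gbi(w)
```

In the paper the penalty measures violation of structural constraints (e.g. tree validity) and the nudged weights are discarded after each input; only the constrained output is kept.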
Online Reweighted Least Squares Algorithm for Sparse Recovery and Application to Short-Wave Infrared Imaging
Title | Online Reweighted Least Squares Algorithm for Sparse Recovery and Application to Short-Wave Infrared Imaging |
Authors | Subhadip Mukherjee, Deepak R., Huaijin Chen, Ashok Veeraraghavan, Chandra Sekhar Seelamantula |
Abstract | We address the problem of sparse recovery in an online setting, where random linear measurements of a sparse signal are revealed sequentially and the objective is to recover the underlying signal. We propose a reweighted least squares (RLS) algorithm to solve the problem of online sparse reconstruction, wherein a system of linear equations is solved using conjugate gradient with the arrival of every new measurement. The proposed online algorithm is useful in settings where one seeks a progressive decoding strategy that reconstructs a sparse signal from linear measurements without waiting until all measurements are acquired. Moreover, it is also useful in applications where processing all the measurements with a batch algorithm is infeasible owing to computational and storage constraints. One need not fix the number of measurements a priori; rather, one can keep collecting measurements until the quality of reconstruction is satisfactory and stop once the reconstruction is sufficiently accurate. We provide a proof of concept by comparing the performance of our algorithm with the RLS-based batch reconstruction strategy, known as iteratively reweighted least squares (IRLS), on natural images. Experiments on a recently proposed focal-plane-array-based imaging setup show up to 1 dB improvement in output peak signal-to-noise ratio compared with total-variation-based reconstruction. |
Tasks | |
Published | 2017-06-29 |
URL | http://arxiv.org/abs/1706.09585v1 |
http://arxiv.org/pdf/1706.09585v1.pdf | |
PWC | https://paperswithcode.com/paper/online-reweighted-least-squares-algorithm-for |
Repo | |
Framework | |
What Really is Deep Learning Doing?
Title | What Really is Deep Learning Doing? |
Authors | Chuyu Xiong |
Abstract | Deep learning has achieved great success in many areas, from computer vision to natural language processing, to game playing, and much more. Yet what deep learning is really doing is still an open question. There is a lot of work in this direction. For example, [5] tried to explain deep learning by group renormalization, and [6] tried to explain deep learning from the viewpoint of function approximation. In order to address this crucial question, here we view deep learning from the perspective of mechanical learning and learning machines (see [1], [2]). From this particular angle, we can see deep learning much better and answer with confidence what deep learning is really doing, why it works well, how it works, and how much data is necessary for learning. We will also discuss the advantages and disadvantages of deep learning at the end of this work. |
Tasks | |
Published | 2017-11-06 |
URL | http://arxiv.org/abs/1711.03577v1 |
http://arxiv.org/pdf/1711.03577v1.pdf | |
PWC | https://paperswithcode.com/paper/what-really-is-deep-learning-doing |
Repo | |
Framework | |
Near Perfect Protein Multi-Label Classification with Deep Neural Networks
Title | Near Perfect Protein Multi-Label Classification with Deep Neural Networks |
Authors | Balazs Szalkai, Vince Grolmusz |
Abstract | Artificial neural networks (ANNs) have gained a well-deserved popularity among machine learning tools following their recent successful applications to image and sound processing and classification problems. ANNs have also been applied to predicting the family or function of a protein from its residue sequence. Here we present two new ANNs with multi-label classification ability, showing impressive accuracy when classifying protein sequences into 698 UniProt families (AUC=99.99%) and 983 Gene Ontology classes (AUC=99.45%). |
Tasks | Multi-Label Classification |
Published | 2017-03-30 |
URL | http://arxiv.org/abs/1703.10663v1 |
http://arxiv.org/pdf/1703.10663v1.pdf | |
PWC | https://paperswithcode.com/paper/near-perfect-protein-multi-label |
Repo | |
Framework | |