Paper Group ANR 813
Concepts and Applications of Conformal Prediction in Computational Drug Discovery. Sparse hierarchical representation learning on molecular graphs. Statistical and machine learning ensemble modelling to forecast sea surface temperature. Exploiting Temporality for Semi-Supervised Video Segmentation. Action Anticipation for Collaborative Environments …
Concepts and Applications of Conformal Prediction in Computational Drug Discovery
Title | Concepts and Applications of Conformal Prediction in Computational Drug Discovery |
Authors | Isidro Cortés-Ciriano, Andreas Bender |
Abstract | Estimating the reliability of individual predictions is key to increasing the adoption of computational models and artificial intelligence in preclinical drug discovery, as well as to fostering their application to guide decision making in clinical settings. Among the large number of algorithms developed over the last decades to compute prediction errors, Conformal Prediction (CP) has gained increasing attention in the computational drug discovery community. A major reason for its recent popularity is the ease of interpretation of the computed prediction errors in both classification and regression tasks. For instance, at a confidence level of 90% the true value will be within the predicted confidence intervals in at least 90% of the cases. This so-called validity of conformal predictors is guaranteed by the robust mathematical foundation underlying CP. The versatility of CP relies on its minimal computational footprint, as it can be easily coupled to any machine learning algorithm at little computational cost. In this review, we summarize underlying concepts and practical applications of CP with a particular focus on virtual screening and activity modelling, and list open source implementations of relevant software. Finally, we describe the current limitations in the field, and provide a perspective on future opportunities for CP in preclinical and clinical drug discovery. |
Tasks | Decision Making, Drug Discovery |
Published | 2019-08-09 |
URL | https://arxiv.org/abs/1908.03569v1 |
https://arxiv.org/pdf/1908.03569v1.pdf | |
PWC | https://paperswithcode.com/paper/concepts-and-applications-of-conformal |
Repo | |
Framework | |
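The validity property described in the abstract above can be illustrated with a minimal sketch of split (inductive) conformal regression. This is an illustration under stated assumptions, not code from the review: the function name `split_conformal_interval`, the absolute-residual nonconformity score, and the held-out calibration set are all choices made here for the example. Under the usual exchangeability assumption, intervals built this way cover the true value with at least the requested confidence.

```python
import math
import numpy as np

def split_conformal_interval(cal_true, cal_pred, test_pred, confidence=0.9):
    """Hypothetical helper: symmetric prediction intervals from calibration residuals."""
    cal_true = np.asarray(cal_true, dtype=float)
    cal_pred = np.asarray(cal_pred, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)

    # Nonconformity score: absolute residual on the held-out calibration set.
    scores = np.sort(np.abs(cal_true - cal_pred))
    n = scores.size

    # Finite-sample rank: the ceil((n + 1) * confidence)-th smallest score.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration points for this confidence level:
        # the only valid interval is unbounded.
        half_width = np.inf
    else:
        half_width = scores[k - 1]

    return test_pred - half_width, test_pred + half_width
```

The point predictions can come from any underlying model, which is what makes the approach cheap to attach to an existing pipeline: only the calibration residuals and one sort are needed.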
Sparse hierarchical representation learning on molecular graphs
Title | Sparse hierarchical representation learning on molecular graphs |
Authors | Matthias Bal, Hagen Triendl, Mariana Assmann, Michael Craig, Lawrence Phillips, Jarvist Moore Frost, Usman Bashir, Noor Shaker, Vid Stojevic |
Abstract | Architectures for sparse hierarchical representation learning have recently been proposed for graph-structured data, but so far assume the absence of edge features in the graph. We close this gap and propose a method to pool graphs with edge features, inspired by the hierarchical nature of chemistry. In particular, we introduce two types of pooling layers compatible with an edge-feature graph-convolutional architecture and investigate their performance for molecules relevant to drug discovery on a set of two classification and two regression benchmark datasets of MoleculeNet. We find that our models significantly outperform previous benchmarks on three of the datasets and reach state-of-the-art results on the fourth benchmark, with pooling improving performance for three out of four tasks, keeping performance stable on the fourth task, and generally speeding up the training process. |
Tasks | Drug Discovery, Representation Learning |
Published | 2019-08-06 |
URL | https://arxiv.org/abs/1908.02065v1 |
https://arxiv.org/pdf/1908.02065v1.pdf | |
PWC | https://paperswithcode.com/paper/sparse-hierarchical-representation-learning |
Repo | |
Framework | |
Statistical and machine learning ensemble modelling to forecast sea surface temperature
Title | Statistical and machine learning ensemble modelling to forecast sea surface temperature |
Authors | Stefan Wolff, Fearghal O’Donncha, Bei Chen |
Abstract | In situ and remotely sensed observations have huge potential to develop data-driven predictive models for oceanography. A suite of machine learning models, including regression, decision tree and deep learning approaches, was developed to estimate sea surface temperatures (SST). Training data consisted of satellite-derived SST and atmospheric data from The Weather Company. Models were evaluated in terms of accuracy and computational complexity. Predictive skills were assessed against observations and a state-of-the-art, physics-based model from the European Centre for Medium-Range Weather Forecasts. Results demonstrated that by combining automated feature engineering with machine-learning approaches, accuracy comparable to existing state-of-the-art can be achieved. Models captured seasonal trends in the data together with short-term variations driven by atmospheric forcing. Further, the results demonstrated that machine-learning-based approaches can be used as transportable prediction tools for ocean variables – a challenge for existing physics-based approaches that rely heavily on user parametrisation to specific geography and topography. The low computational cost of inference makes the approach particularly attractive for edge-based computing where predictive models could be deployed on low-power devices in the marine environment. |
Tasks | Automated Feature Engineering, Feature Engineering, Weather Forecasting |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08573v1 |
https://arxiv.org/pdf/1909.08573v1.pdf | |
PWC | https://paperswithcode.com/paper/statistical-and-machine-learning-ensemble |
Repo | |
Framework | |
Exploiting Temporality for Semi-Supervised Video Segmentation
Title | Exploiting Temporality for Semi-Supervised Video Segmentation |
Authors | Radu Sibechi, Olaf Booij, Nora Baka, Peter Bloem |
Abstract | In recent years, there has been remarkable progress in supervised image segmentation. Video segmentation is less explored, despite the temporal dimension being highly informative. For example, semantic labels that cannot be accurately detected in the current frame may be inferred by incorporating information from previous frames. However, video segmentation is challenging due to the amount of data that needs to be processed and, more importantly, the cost involved in obtaining ground truth annotations for each frame. In this paper, we tackle the issue of label scarcity by using consecutive frames of a video, where only one frame is annotated. We propose a deep, end-to-end trainable model which leverages temporal information in order to make use of easy-to-acquire unlabeled data. Our network architecture relies on a novel interconnection of two components: a fully convolutional network to model spatial information and temporal units that are employed at intermediate levels of the convolutional network in order to propagate information through time. The main contribution of this work is the guidance of the temporal signal through the network. We show that only placing a temporal module between the encoder and decoder is suboptimal (baseline). Our extensive experiments on the CityScapes dataset indicate that the resulting model can leverage unlabeled temporal frames and significantly outperform both the frame-by-frame image segmentation and the baseline approach. |
Tasks | Semantic Segmentation, Video Semantic Segmentation |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11309v1 |
https://arxiv.org/pdf/1908.11309v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-temporality-for-semi-supervised |
Repo | |
Framework | |
Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction
Title | Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction |
Authors | Clebeson Canuto dos Santos, Plinio Moreno, Jorge Leonide Aching Samatelo, Raquel Frizera Vassallo, José Santos-Victor |
Abstract | To interact effectively with humans in collaborative environments, machines need to be able to anticipate future events, in order to execute actions in a timely manner. However, the observation of human limb movements may not be sufficient to anticipate their actions in an unambiguous manner. In this work we consider two additional sources of information (i.e. context) over time, gaze movements and object information, and study how these additional contextual cues improve the action anticipation performance. We address action anticipation as a classification task, where the model takes the available information as the input, and predicts the most likely action. We propose to use the uncertainty about each prediction as an online decision-making criterion for action anticipation. Uncertainty is modeled as a stochastic process applied to a time-based neural network architecture, which improves the conventional class-likelihood (i.e. deterministic) criterion. The main contributions of this paper are three-fold: (i) we propose a deep architecture that outperforms previous results in the action anticipation task; (ii) we show that contextual information is important to disambiguate the interpretation of similar actions; (iii) we propose the minimization of uncertainty as a more effective criterion for action anticipation, when compared with the maximization of class probability. Our results on the Acticipate dataset showed the importance of contextual information and the uncertainty criterion for action anticipation. We achieve an average accuracy of 98.75% in the anticipation task using only an average of 25% of observations. In addition, considering that a good anticipation model should also perform well in the action recognition task, we achieve an average accuracy of 100% in action recognition on the Acticipate dataset, when the entire observation set is used. |
Tasks | Decision Making |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00714v1 |
https://arxiv.org/pdf/1910.00714v1.pdf | |
PWC | https://paperswithcode.com/paper/action-anticipation-for-collaborative |
Repo | |
Framework | |
Sequential Experimental Design for Transductive Linear Bandits
Title | Sequential Experimental Design for Transductive Linear Bandits |
Authors | Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff |
Abstract | In this paper we introduce the transductive linear bandit problem: given a set of measurement vectors $\mathcal{X}\subset \mathbb{R}^d$, a set of items $\mathcal{Z}\subset \mathbb{R}^d$, a fixed confidence $\delta$, and an unknown vector $\theta^{\ast}\in \mathbb{R}^d$, the goal is to infer $\text{argmax}_{z\in \mathcal{Z}} z^\top\theta^\ast$ with probability $1-\delta$ by making as few sequentially chosen noisy measurements of the form $x^\top\theta^{\ast}$ as possible. When $\mathcal{X}=\mathcal{Z}$, this setting generalizes linear bandits, and when $\mathcal{X}$ is the standard basis vectors and $\mathcal{Z}\subset \{0,1\}^d$, combinatorial bandits. Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost. As an example, in drug discovery the compounds and dosages $\mathcal{X}$ a practitioner may be willing to evaluate in the lab in vitro due to cost or safety reasons may differ vastly from those compounds and dosages $\mathcal{Z}$ that can be safely administered to patients in vivo. Alternatively, in recommender systems for books, the set of books $\mathcal{X}$ a user is queried about may be restricted to well known best-sellers even though the goal might be to recommend more esoteric titles $\mathcal{Z}$. In this paper, we provide instance-dependent lower bounds for the transductive setting, an algorithm that matches these up to logarithmic factors, and an evaluation. In particular, we provide the first non-asymptotic algorithm for linear bandits that nearly achieves the information theoretic lower bound. |
Tasks | Drug Discovery, Recommendation Systems |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08399v1 |
https://arxiv.org/pdf/1906.08399v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-experimental-design-for |
Repo | |
Framework | |
Attention-based Conditioning Methods for External Knowledge Integration
Title | Attention-based Conditioning Methods for External Knowledge Integration |
Authors | Katerina Margatina, Christos Baziotis, Alexandros Potamianos |
Abstract | In this paper, we present a novel approach for incorporating external knowledge in Recurrent Neural Networks (RNNs). We propose the integration of lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution enforces the contribution of the most salient words for the task at hand. We introduce three methods, namely attentional concatenation, feature-based gating and affine transformation. Experiments on six benchmark datasets show the effectiveness of our methods. Attentional feature-based gating yields consistent performance improvement across tasks. Our approach is implemented as a simple add-on module for RNN-based models with minimal computational overhead and can be adapted to any deep neural architecture. |
Tasks | |
Published | 2019-06-09 |
URL | https://arxiv.org/abs/1906.03674v1 |
https://arxiv.org/pdf/1906.03674v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-based-conditioning-methods-for |
Repo | |
Framework | |
Scalable approximate inference for state space models with normalising flows
Title | Scalable approximate inference for state space models with normalising flows |
Authors | Tom Ryder, Andrew Golightly, Isaac Matthews, Dennis Prangle |
Abstract | By exploiting mini-batch stochastic gradient optimisation, variational inference has had great success in scaling up approximate Bayesian inference to big data. To date, however, this strategy has only been applicable to models of independent data. Here we extend mini-batch variational methods to state space models of time series data. To do so we introduce a novel generative model as our variational approximation, a local inverse autoregressive flow. This allows a subsequence to be sampled without sampling the entire distribution. Hence we can perform training iterations using short portions of the time series at low computational cost. We illustrate our method on AR(1), Lotka-Volterra and FitzHugh-Nagumo models, achieving accurate parameter estimation in a short time. |
Tasks | Bayesian Inference, Normalising Flows, Time Series |
Published | 2019-10-02 |
URL | https://arxiv.org/abs/1910.00879v1 |
https://arxiv.org/pdf/1910.00879v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-approximate-inference-for-state |
Repo | |
Framework | |
Debiased Bayesian inference for average treatment effects
Title | Debiased Bayesian inference for average treatment effects |
Authors | Kolyan Ray, Botond Szabo |
Abstract | Bayesian approaches have become increasingly popular in causal inference problems due to their conceptual simplicity, excellent performance and in-built uncertainty quantification (‘posterior credible sets’). We investigate Bayesian inference for average treatment effects from observational data, which is a challenging problem due to the missing counterfactuals and selection bias. Working in the standard potential outcomes framework, we propose a data-driven modification to an arbitrary (nonparametric) prior based on the propensity score that corrects for the first-order posterior bias, thereby improving performance. We illustrate our method for Gaussian process (GP) priors using (semi-)synthetic data. Our experiments demonstrate significant improvement in both estimation accuracy and uncertainty quantification compared to the unmodified GP, rendering our approach highly competitive with the state-of-the-art. |
Tasks | Bayesian Inference, Causal Inference |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12078v1 |
https://arxiv.org/pdf/1909.12078v1.pdf | |
PWC | https://paperswithcode.com/paper/debiased-bayesian-inference-for-average |
Repo | |
Framework | |
Implementing Ranking-Based Semantics in ConArg: a Preliminary Report
Title | Implementing Ranking-Based Semantics in ConArg: a Preliminary Report |
Authors | Stefano Bistarelli, Francesco Faloci, Carlo Taticchi |
Abstract | ConArg is a suite of tools that offers a wide series of applications for dealing with argumentation problems. In this work, we present the advances we made in implementing a ranking-based semantics, based on computational choice power indexes, within ConArg. This kind of semantics represents a method for sorting the arguments of an abstract argumentation framework, according to some preference relation. The ranking-based semantics we implement relies on the Shapley, Banzhaf, Deegan-Packel and Johnston power indexes, transferring well-known properties from computational social choice to argumentation framework ranking-based semantics. |
Tasks | Abstract Argumentation |
Published | 2019-08-21 |
URL | https://arxiv.org/abs/1908.07784v2 |
https://arxiv.org/pdf/1908.07784v2.pdf | |
PWC | https://paperswithcode.com/paper/190807784 |
Repo | |
Framework | |
A Surprising Density of Illusionable Natural Speech
Title | A Surprising Density of Illusionable Natural Speech |
Authors | Melody Y. Guan, Gregory Valiant |
Abstract | Recent work on adversarial examples has demonstrated that most natural inputs can be perturbed to fool even state-of-the-art machine learning systems. But does this happen for humans as well? In this work, we investigate: what fraction of natural instances of speech can be turned into “illusions” which either alter humans’ perception or result in different people having significantly different perceptions? We first consider the McGurk effect, the phenomenon by which adding a carefully chosen video clip to the audio channel affects the viewer’s perception of what is said (McGurk and MacDonald, 1976). We obtain empirical estimates that a significant fraction of both words and sentences occurring in natural speech have some susceptibility to this effect. We also learn models for predicting McGurk illusionability. Finally we demonstrate that the Yanny or Laurel auditory illusion (Pressnitzer et al., 2018) is not an isolated occurrence by generating several very different new instances. We believe that the surprising density of illusionable natural speech warrants further investigation, from the perspectives of both security and cognitive science. Supplementary videos are available at: https://www.youtube.com/playlist?list=PLaX7t1K-e_fF2iaenoKznCatm0RC37B_k. |
Tasks | |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.01040v3 |
https://arxiv.org/pdf/1906.01040v3.pdf | |
PWC | https://paperswithcode.com/paper/a-surprising-density-of-illusionable-natural |
Repo | |
Framework | |
Modeling Inter-Speaker Relationship in XLNet for Contextual Spoken Language Understanding
Title | Modeling Inter-Speaker Relationship in XLNet for Contextual Spoken Language Understanding |
Authors | Jonggu Kim, Jong-Hyeok Lee |
Abstract | We propose two methods to capture relevant history information in a multi-turn dialogue by modeling inter-speaker relationship for spoken language understanding (SLU). Our methods are tailored for and therefore compatible with XLNet, which is a state-of-the-art pretrained model, so we verified our models built on top of XLNet. In our experiments, all models achieved higher accuracy than state-of-the-art contextual SLU models on two benchmark datasets. Analysis of the results demonstrated that the proposed methods are effective in improving the SLU accuracy of XLNet. These methods to identify important dialogue history will be useful to alleviate ambiguity in SLU of the current utterance. |
Tasks | Spoken Language Understanding |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12531v1 |
https://arxiv.org/pdf/1910.12531v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-inter-speaker-relationship-in-xlnet |
Repo | |
Framework | |
3D hierarchical optimization for Multi-view depth map coding
Title | 3D hierarchical optimization for Multi-view depth map coding |
Authors | Marc Maceira, David Varas, Josep-Ramon Morros, Javier Ruiz-Hidalgo, Ferran Marques |
Abstract | Depth data has a widespread use since the popularity of high-resolution 3D sensors. In multi-view sequences, depth information is used to supplement the color data of each view. This article proposes a joint encoding of multiple depth maps with a unique representation. Color and depth images of each view are segmented independently and combined in an optimal Rate-Distortion fashion. The resulting partitions are projected to a reference view where a coherent hierarchy for the multiple views is built. A Rate-Distortion optimization is applied to obtain the final segmentation choosing nodes of the hierarchy. The consistent segmentation is used to robustly encode depth maps of multiple views obtaining competitive results with HEVC coding standards. Available at: http://link.springer.com/article/10.1007/s11042-017-5409-z |
Tasks | |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00376v1 |
https://arxiv.org/pdf/1911.00376v1.pdf | |
PWC | https://paperswithcode.com/paper/3d-hierarchical-optimization-for-multi-view |
Repo | |
Framework | |
“How do urban incidents affect traffic speed?” A Deep Graph Convolutional Network for Incident-driven Traffic Speed Prediction
Title | “How do urban incidents affect traffic speed?” A Deep Graph Convolutional Network for Incident-driven Traffic Speed Prediction |
Authors | Qinge Xie, Tiancheng Guo, Yang Chen, Yu Xiao, Xin Wang, Ben Y. Zhao |
Abstract | Accurate traffic speed prediction is an important and challenging topic for transportation planning. Previous studies on traffic speed prediction predominantly used spatio-temporal and context features for prediction. However, they have not made good use of the impact of urban traffic incidents. In this work, we aim to make use of the information of urban incidents to achieve a better prediction of traffic speed. Our incident-driven prediction framework consists of three processes. First, we propose a critical incident discovery method to discover urban traffic incidents with high impact on traffic speed. Second, we design a binary classifier, which uses deep learning methods to extract the latent incident impact features from the middle layer of the classifier. Combining the above methods, we propose a Deep Incident-Aware Graph Convolutional Network (DIGC-Net) to effectively incorporate urban traffic incident, spatio-temporal, periodic and context features for traffic speed prediction. We conduct experiments on two real-world urban traffic datasets of San Francisco and New York City. The results demonstrate the superior performance of our model compared to the competing benchmarks. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01242v1 |
https://arxiv.org/pdf/1912.01242v1.pdf | |
PWC | https://paperswithcode.com/paper/how-do-urban-incidents-affect-traffic-speed-a |
Repo | |
Framework | |
Scalable Deep Neural Networks via Low-Rank Matrix Factorization
Title | Scalable Deep Neural Networks via Low-Rank Matrix Factorization |
Authors | Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa |
Abstract | Compressing deep neural networks (DNNs) is important for real-world applications operating on resource-constrained devices. However, it is difficult to change the model size once the training is completed, so re-training is needed to configure models suitable for different devices. In this paper, we propose a novel method that enables DNNs to flexibly change their size after training. We factorize the weight matrices of the DNNs via singular value decomposition (SVD) and change their ranks according to the target size. In contrast with existing methods, we introduce simple criteria that characterize the importance of each basis and layer, which enables effective compression while keeping the error and complexity of the models as small as possible. In experiments on multiple image-classification tasks, our method exhibits favorable performance compared with other methods. |
Tasks | Image Classification |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13141v1 |
https://arxiv.org/pdf/1910.13141v1.pdf | |
PWC | https://paperswithcode.com/paper/scalable-deep-neural-networks-via-low-rank-1 |
Repo | |
Framework | |
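The core operation named in the abstract above, factorizing a weight matrix via SVD and changing its rank, can be sketched as follows. This is a minimal illustration under assumptions made here, not the authors' method: the helper name `low_rank_factors` is hypothetical, and it simply keeps the largest singular values, whereas the paper introduces its own importance criteria for choosing bases and layers.

```python
import numpy as np

def low_rank_factors(weight, rank):
    """Hypothetical helper: split `weight` (out x in) into two thin factors."""
    # Thin SVD: weight = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Keep the `rank` largest singular values and fold them into the left factor.
    a = u[:, :rank] * s[:rank]   # shape (out, rank)
    b = vt[:rank, :]             # shape (rank, in)
    return a, b
```

Replacing one dense layer of shape (out, in) with the two factors changes the parameter count from out * in to rank * (out + in), so the rank can be lowered after training to fit a smaller device, at the cost of an approximation error governed by the discarded singular values.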