Paper Group ANR 306
Heterogeneous Strategy Particle Swarm Optimization. Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes. Attentive Pooling Networks. Extraction of airway trees using multiple hypothesis tracking and template matching. A Linear Algebraic Approach to Datalog Evaluation. Learning Efficient Algorithms with Hierar …
Heterogeneous Strategy Particle Swarm Optimization
Title | Heterogeneous Strategy Particle Swarm Optimization |
Authors | Wen-Bo Du, Wen Ying, Gang Yan, Yan-Bo Zhu, Xian-Bin Cao |
Abstract | PSO is a widely recognized optimization algorithm inspired by social swarm. In this brief we present a heterogeneous strategy particle swarm optimization (HSPSO), in which a proportion of particles adopt a fully informed strategy to enhance the converging speed while the rest are singly informed to maintain the diversity. Our extensive numerical experiments show that HSPSO algorithm is able to obtain satisfactory solutions, outperforming both PSO and the fully informed PSO. The evolution process is examined from both structural and microscopic points of view. We find that the cooperation between two types of particles can facilitate a good balance between exploration and exploitation, yielding better performance. We demonstrate the applicability of HSPSO on the filter design problem. |
Tasks | |
Published | 2016-07-30 |
URL | http://arxiv.org/abs/1608.00138v1 |
http://arxiv.org/pdf/1608.00138v1.pdf | |
PWC | https://paperswithcode.com/paper/heterogeneous-strategy-particle-swarm |
Repo | |
Framework | |
Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes
Title | Transfer Learning Across Patient Variations with Hidden Parameter Markov Decision Processes |
Authors | Taylor Killian, George Konidaris, Finale Doshi-Velez |
Abstract | Due to physiological variation, patients diagnosed with the same condition may exhibit divergent, but related, responses to the same treatments. Hidden Parameter Markov Decision Processes (HiP-MDPs) tackle this transfer-learning problem by embedding these tasks into a low-dimensional space. However, the original formulation of HiP-MDP had a critical flaw: the embedding uncertainty was modeled independently of the agent’s state uncertainty, requiring an unnatural training procedure in which all tasks visited every part of the state space—possible for robots that can be moved to a particular location, impossible for human patients. We update the HiP-MDP framework and extend it to more robustly develop personalized medicine strategies for HIV treatment. |
Tasks | Transfer Learning |
Published | 2016-12-01 |
URL | http://arxiv.org/abs/1612.00475v1 |
http://arxiv.org/pdf/1612.00475v1.pdf | |
PWC | https://paperswithcode.com/paper/transfer-learning-across-patient-variations |
Repo | |
Framework | |
Attentive Pooling Networks
Title | Attentive Pooling Networks |
Authors | Cicero dos Santos, Ming Tan, Bing Xiang, Bowen Zhou |
Abstract | In this work, we propose Attentive Pooling (AP), a two-way attention mechanism for discriminative model training. In the context of pair-wise ranking or classification with neural networks, AP enables the pooling layer to be aware of the current input pair, in a way that information from the two input items can directly influence the computation of each other’s representations. Along with such representations of the paired inputs, AP jointly learns a similarity measure over projected segments (e.g. trigrams) of the pair, and subsequently, derives the corresponding attention vector for each input to guide the pooling. Our two-way attention mechanism is a general framework independent of the underlying representation learning, and it has been applied to both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) in our studies. The empirical results, from three very different benchmark tasks of question answering/answer selection, demonstrate that our proposed models outperform a variety of strong baselines and achieve state-of-the-art performance in all the benchmarks. |
Tasks | Answer Selection, Question Answering, Representation Learning |
Published | 2016-02-11 |
URL | http://arxiv.org/abs/1602.03609v1 |
http://arxiv.org/pdf/1602.03609v1.pdf | |
PWC | https://paperswithcode.com/paper/attentive-pooling-networks |
Repo | |
Framework | |
Extraction of airway trees using multiple hypothesis tracking and template matching
Title | Extraction of airway trees using multiple hypothesis tracking and template matching |
Authors | Raghavendra Selvan, Jens Petersen, Jesper H. Pedersen, Marleen de Bruijne |
Abstract | Knowledge of airway tree morphology has important clinical applications in diagnosis of chronic obstructive pulmonary disease. We present an automatic tree extraction method based on multiple hypothesis tracking and template matching for this purpose and evaluate its performance on chest CT images. The method is adapted from a semi-automatic method devised for vessel segmentation. Idealized tubular templates are constructed that match airway probability obtained from a trained classifier and ranked based on their relative significance. Several such regularly spaced templates form the local hypotheses used in constructing a multiple hypothesis tree, which is then traversed to reach decisions. The proposed modifications remove the need for local thresholding of hypotheses as decisions are made entirely based on statistical comparisons involving the hypothesis tree. The results show improvements in performance when compared to the original method and region growing on intensity images. We also compare the method with region growing on the probability images, where the presented method does not show substantial improvement, but we expect it to be less sensitive to local anomalies in the data. |
Tasks | |
Published | 2016-11-24 |
URL | http://arxiv.org/abs/1611.08131v1 |
http://arxiv.org/pdf/1611.08131v1.pdf | |
PWC | https://paperswithcode.com/paper/extraction-of-airway-trees-using-multiple |
Repo | |
Framework | |
A Linear Algebraic Approach to Datalog Evaluation
Title | A Linear Algebraic Approach to Datalog Evaluation |
Authors | Taisuke Sato |
Abstract | In this paper, we propose a fundamentally new approach to Datalog evaluation. Given a linear Datalog program DB written using N constants and binary predicates, we first translate if-and-only-if completions of clauses in DB into a set Eq(DB) of matrix equations with a non-linear operation where relations in M_DB, the least Herbrand model of DB, are encoded as adjacency matrices. We then translate Eq(DB) into another, but purely linear matrix equations tilde_Eq(DB). It is proved that the least solution of tilde_Eq(DB) in the sense of matrix ordering is converted to the least solution of Eq(DB) and the latter gives M_DB as a set of adjacency matrices. Hence computing the least solution of tilde_Eq(DB) is equivalent to computing M_DB specified by DB. For a class of tail recursive programs and for some other types of programs, our approach achieves O(N^3) time complexity irrespective of the number of variables in a clause since only matrix operations costing O(N^3) or less are used. We conducted two experiments that compute the least Herbrand models of linear Datalog programs. The first experiment computes transitive closure of artificial data and real network data taken from the Koblenz Network Collection. The second one compared the proposed approach with the state-of-the-art symbolic systems including two Prolog systems and two ASP systems, in terms of computation time for a transitive closure program and the same generation program. In the experiment, it is observed that our linear algebraic approach runs 10^1 ~ 10^4 times faster than the symbolic systems when data is not sparse. To appear in Theory and Practice of Logic Programming (TPLP). |
Tasks | |
Published | 2016-07-30 |
URL | http://arxiv.org/abs/1608.00139v2 |
http://arxiv.org/pdf/1608.00139v2.pdf | |
PWC | https://paperswithcode.com/paper/a-linear-algebraic-approach-to-datalog |
Repo | |
Framework | |
Learning Efficient Algorithms with Hierarchical Attentive Memory
Title | Learning Efficient Algorithms with Hierarchical Attentive Memory |
Authors | Marcin Andrychowicz, Karol Kurach |
Abstract | In this paper, we propose and investigate a novel memory architecture for neural networks called Hierarchical Attentive Memory (HAM). It is based on a binary tree with leaves corresponding to memory cells. This allows HAM to perform memory access in O(log n) complexity, which is a significant improvement over the standard attention mechanism that requires O(n) operations, where n is the size of the memory. We show that an LSTM network augmented with HAM can learn algorithms for problems like merging, sorting or binary searching from pure input-output examples. In particular, it learns to sort n numbers in time O(n log n) and generalizes well to input sequences much longer than the ones seen during the training. We also show that HAM can be trained to act like classic data structures: a stack, a FIFO queue and a priority queue. |
Tasks | |
Published | 2016-02-09 |
URL | http://arxiv.org/abs/1602.03218v2 |
http://arxiv.org/pdf/1602.03218v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-efficient-algorithms-with |
Repo | |
Framework | |
Interacting with Massive Behavioral Data
Title | Interacting with Massive Behavioral Data |
Authors | Shih-Chieh Su |
Abstract | In this short paper, we propose the split-diffuse (SD) algorithm that takes the output of an existing word embedding algorithm, and distributes the data points uniformly across the visualization space. The result improves the perceivability and the interactability by the human. We apply the SD algorithm to analyze the user behavior through access logs within the cyber security domain. The result, named the topic grids, is a set of grids on various topics generated from the logs. On the same set of grids, different behavioral metrics can be shown on different targets over different periods of time, to provide visualization and interaction to the human experts. Analysis, investigation, and other types of interaction can be performed on the topic grids more efficiently than on the output of existing dimension reduction methods. In addition to the cyber security domain, the topic grids can be further applied to other domains like e-commerce, credit card transaction, customer service to analyze the behavior in a large scale. |
Tasks | Dimensionality Reduction |
Published | 2016-08-26 |
URL | http://arxiv.org/abs/1608.07619v1 |
http://arxiv.org/pdf/1608.07619v1.pdf | |
PWC | https://paperswithcode.com/paper/interacting-with-massive-behavioral-data |
Repo | |
Framework | |
Correntropy Maximization via ADMM - Application to Robust Hyperspectral Unmixing
Title | Correntropy Maximization via ADMM - Application to Robust Hyperspectral Unmixing |
Authors | Fei Zhu, Abderrahim Halimi, Paul Honeine, Badong Chen, Nanning Zheng |
Abstract | In hyperspectral images, some spectral bands suffer from low signal-to-noise ratio due to noisy acquisition and atmospheric effects, thus requiring robust techniques for the unmixing problem. This paper presents a robust supervised spectral unmixing approach for hyperspectral images. The robustness is achieved by writing the unmixing problem as the maximization of the correntropy criterion subject to the most commonly used constraints. Two unmixing problems are derived: the first problem considers the fully-constrained unmixing, with both the non-negativity and sum-to-one constraints, while the second one deals with the non-negativity and the sparsity-promoting of the abundances. The corresponding optimization problems are solved efficiently using an alternating direction method of multipliers (ADMM) approach. Experiments on synthetic and real hyperspectral images validate the performance of the proposed algorithms for different scenarios, demonstrating that the correntropy-based unmixing is robust to outlier bands. |
Tasks | Hyperspectral Unmixing |
Published | 2016-02-04 |
URL | http://arxiv.org/abs/1602.01729v1 |
http://arxiv.org/pdf/1602.01729v1.pdf | |
PWC | https://paperswithcode.com/paper/correntropy-maximization-via-admm-application |
Repo | |
Framework | |
Markov Chain Modeling and Simulation of Breathing Patterns
Title | Markov Chain Modeling and Simulation of Breathing Patterns |
Authors | Davide Alinovi, Gianluigi Ferrari, Francesco Pisani, Riccardo Raheli |
Abstract | The lack of large video databases obtained from real patients with respiratory disorders makes the design and optimization of video-based monitoring systems quite critical. The purpose of this study is the development of suitable models and simulators of breathing behaviors and disorders, such as respiratory pauses and apneas, in order to allow efficient design and test of video-based monitoring systems. More precisely, a novel Continuous-Time Markov Chain (CTMC) statistical model of breathing patterns is presented. The Respiratory Rate (RR) pattern, estimated by measured vital signs of hospital-monitored patients, is approximated as a CTMC, whose states and parameters are selected through an appropriate statistical analysis. Then, two simulators, software- and hardware-based, are proposed. After validation of the CTMC model, the proposed simulators are tested with previously developed video-based algorithms for the estimation of the RR and the detection of apnea events. Examples of application to assess the performance of systems for video-based RR estimation and apnea detection are presented. The results, in terms of Kullback-Leibler divergence, show that realistic breathing patterns, including specific respiratory disorders, can be accurately described by the proposed model; moreover, the simulators are able to reproduce practical breathing patterns for video analysis. The presented CTMC statistical model can be strategic to describe realistic breathing patterns and devise simulators useful to develop and test novel and effective video processing-based monitoring systems. |
Tasks | |
Published | 2016-10-05 |
URL | http://arxiv.org/abs/1610.01444v1 |
http://arxiv.org/pdf/1610.01444v1.pdf | |
PWC | https://paperswithcode.com/paper/markov-chain-modeling-and-simulation-of |
Repo | |
Framework | |
Adaptive and Scalable Android Malware Detection through Online Learning
Title | Adaptive and Scalable Android Malware Detection through Online Learning |
Authors | Annamalai Narayanan, Liu Yang, Lihui Chen, Liu Jinliang |
Abstract | It is well-known that malware constantly evolves so as to evade detection and this causes the entire malware population to be non-stationary. Contrary to this fact, prior works on machine learning based Android malware detection have assumed that the distribution of the observed malware characteristics (i.e., features) do not change over time. In this work, we address the problem of malware population drift and propose a novel online machine learning based framework, named DroidOL to handle it and effectively detect malware. In order to perform accurate detection, security-sensitive behaviors are captured from apps in the form of inter-procedural control-flow sub-graph features using a state-of-the-art graph kernel. In order to perform scalable detection and to adapt to the drift and evolution in malware population, an online passive-aggressive classifier is used. In a large-scale comparative analysis with more than 87,000 apps, DroidOL achieves 84.29% accuracy outperforming two state-of-the-art malware techniques by more than 20% in their typical batch learning setting and more than 3% when they are continuously re-trained. Our experimental findings strongly indicate that online learning based approaches are highly suitable for real-world malware detection. |
Tasks | Android Malware Detection, Malware Detection |
Published | 2016-06-23 |
URL | http://arxiv.org/abs/1606.07150v2 |
http://arxiv.org/pdf/1606.07150v2.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-scalable-android-malware |
Repo | |
Framework | |
Importance sampling strategy for non-convex randomized block-coordinate descent
Title | Importance sampling strategy for non-convex randomized block-coordinate descent |
Authors | Rémi Flamary, Alain Rakotomamonjy, Gilles Gasso |
Abstract | As the number of samples and dimensionality of optimization problems related to statistics an machine learning explode, block coordinate descent algorithms have gained popularity since they reduce the original problem to several smaller ones. Coordinates to be optimized are usually selected randomly according to a given probability distribution. We introduce an importance sampling strategy that helps randomized coordinate descent algorithms to focus on blocks that are still far from convergence. The framework applies to problems composed of the sum of two possibly non-convex terms, one being separable and non-smooth. We have compared our algorithm to a full gradient proximal approach as well as to a randomized block coordinate algorithm that considers uniform sampling and cyclic block coordinate descent. Experimental evidences show the clear benefit of using an importance sampling strategy. |
Tasks | |
Published | 2016-06-23 |
URL | http://arxiv.org/abs/1606.07286v1 |
http://arxiv.org/pdf/1606.07286v1.pdf | |
PWC | https://paperswithcode.com/paper/importance-sampling-strategy-for-non-convex |
Repo | |
Framework | |
Compositional Model based Fisher Vector Coding for Image Classification
Title | Compositional Model based Fisher Vector Coding for Image Classification |
Authors | Lingqiao Liu, Peng Wang, Chunhua Shen, Lei Wang, Anton van den Hengel, Chao Wang, Heng Tao Shen |
Abstract | Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) to depict the generation process of local features. However, the representative power of the GMM could be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes and the number of prototypes is usually small in FVC. To handle this limitation, in this paper we break the convention which assumes that a local feature is drawn from one of few Gaussian distributions. Instead, we adopt a compositional mechanism which assumes that a local feature is drawn from a Gaussian distribution whose mean vector is composed as the linear combination of multiple key components and the combination weight is a latent random variable. In this way, we can greatly enhance the representative power of the generative model of FVC. To implement our idea, we designed two particular generative models with such a compositional mechanism. |
Tasks | Image Classification |
Published | 2016-01-16 |
URL | http://arxiv.org/abs/1601.04143v3 |
http://arxiv.org/pdf/1601.04143v3.pdf | |
PWC | https://paperswithcode.com/paper/compositional-model-based-fisher-vector |
Repo | |
Framework | |
Hit-and-Run for Sampling and Planning in Non-Convex Spaces
Title | Hit-and-Run for Sampling and Planning in Non-Convex Spaces |
Authors | Yasin Abbasi-Yadkori, Peter L. Bartlett, Victor Gabillon, Alan Malek |
Abstract | We propose the Hit-and-Run algorithm for planning and sampling problems in non-convex spaces. For sampling, we show the first analysis of the Hit-and-Run algorithm in non-convex spaces and show that it mixes fast as long as certain smoothness conditions are satisfied. In particular, our analysis reveals an intriguing connection between fast mixing and the existence of smooth measure-preserving mappings from a convex space to the non-convex space. For planning, we show advantages of Hit-and-Run compared to state-of-the-art planning methods such as Rapidly-Exploring Random Trees. |
Tasks | |
Published | 2016-10-19 |
URL | http://arxiv.org/abs/1610.08865v1 |
http://arxiv.org/pdf/1610.08865v1.pdf | |
PWC | https://paperswithcode.com/paper/hit-and-run-for-sampling-and-planning-in-non |
Repo | |
Framework | |
Generalized Optimal Matching Methods for Causal Inference
Title | Generalized Optimal Matching Methods for Causal Inference |
Authors | Nathan Kallus |
Abstract | We develop an encompassing framework for matching, covariate balancing, and doubly-robust methods for causal inference from observational data called generalized optimal matching (GOM). The framework is given by generalizing a new functional-analytical formulation of optimal matching, giving rise to the class of GOM methods, for which we provide a single unified theory to analyze tractability, consistency, and efficiency. Many commonly used existing methods are included in GOM and, using their GOM interpretation, can be extended to optimally and automatically trade off balance for variance and outperform their standard counterparts. As a subclass, GOM gives rise to kernel optimal matching (KOM), which, as supported by new theoretical and empirical results, is notable for combining many of the positive properties of other methods in one. KOM, which is solved as a linearly-constrained convex-quadratic optimization problem, inherits both the interpretability and model-free consistency of matching but can also achieve the $\sqrt{n}$-consistency of well-specified regression and the efficiency and robustness of doubly robust methods. In settings of limited overlap, KOM enables a very transparent method for interval estimation for partial identification and robust coverage. We demonstrate these benefits in examples with both synthetic and real data |
Tasks | Causal Inference |
Published | 2016-12-26 |
URL | http://arxiv.org/abs/1612.08321v3 |
http://arxiv.org/pdf/1612.08321v3.pdf | |
PWC | https://paperswithcode.com/paper/generalized-optimal-matching-methods-for |
Repo | |
Framework | |
Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes
Title | Detecting Dependencies in Sparse, Multivariate Databases Using Probabilistic Programming and Non-parametric Bayes |
Authors | Feras Saad, Vikash Mansinghka |
Abstract | Datasets with hundreds of variables and many missing values are commonplace. In this setting, it is both statistically and computationally challenging to detect true predictive relationships between variables and also to suppress false positives. This paper proposes an approach that combines probabilistic programming, information theory, and non-parametric Bayes. It shows how to use Bayesian non-parametric modeling to (i) build an ensemble of joint probability models for all the variables; (ii) efficiently detect marginal independencies; and (iii) estimate the conditional mutual information between arbitrary subsets of variables, subject to a broad class of constraints. Users can access these capabilities using BayesDB, a probabilistic programming platform for probabilistic data analysis, by writing queries in a simple, SQL-like language. This paper demonstrates empirically that the method can (i) detect context-specific (in)dependencies on challenging synthetic problems and (ii) yield improved sensitivity and specificity over baselines from statistics and machine learning, on a real-world database of over 300 sparsely observed indicators of macroeconomic development and public health. |
Tasks | Probabilistic Programming |
Published | 2016-11-05 |
URL | http://arxiv.org/abs/1611.01708v2 |
http://arxiv.org/pdf/1611.01708v2.pdf | |
PWC | https://paperswithcode.com/paper/detecting-dependencies-in-sparse-multivariate |
Repo | |
Framework | |