January 26, 2020

Paper Group ANR 1586

Accelerated Flow for Probability Distributions. Stochastic gradient Markov chain Monte Carlo. Biomedical Evidence Generation Engine. Regularized Non-negative Spectral Embedding for Clustering. High Utility Interval-Based Sequences. Template-Based Automatic Search of Compact Semantic Segmentation Architectures. How multilingual is Multilingual BERT? …

Accelerated Flow for Probability Distributions


Title	Accelerated Flow for Probability Distributions
Authors	Amirhossein Taghvaei, Prashant G. Mehta
Abstract	This paper presents a methodology and numerical algorithms for constructing accelerated gradient flows on the space of probability distributions. In particular, we extend the recent variational formulation of accelerated gradient methods in (Wibisono et al., 2016) from vector-valued variables to probability distributions. The variational problem is modeled as a mean-field optimal control problem. The maximum principle of optimal control theory is used to derive Hamilton’s equations for the optimal gradient flow. Hamilton’s equations are shown to achieve the accelerated form of density transport from any initial probability distribution to a target probability distribution. A quantitative estimate on the asymptotic convergence rate is provided based on a Lyapunov function construction, when the objective functional is displacement convex. Two numerical approximations are presented to implement Hamilton’s equations as a system of $N$ interacting particles. The continuous limit of Nesterov’s algorithm is shown to be a special case with $N=1$. The algorithm is illustrated with numerical examples.
Tasks
Published	2019-01-10
URL	http://arxiv.org/abs/1901.03317v2
PDF	http://arxiv.org/pdf/1901.03317v2.pdf
PWC	https://paperswithcode.com/paper/accelerated-flow-for-probability
Repo
Framework
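
For orientation, the $N=1$ special case mentioned in the abstract refers to the continuous-time limit of Nesterov's method. A commonly cited form of that limit is the second-order ODE of Su, Boyd, and Candès, shown below. It is not taken from this abstract, and the notation is assumed here: $f$ is a smooth convex objective and $X_t$ is the state at time $t$.

```latex
\ddot{X}_t + \frac{3}{t}\,\dot{X}_t + \nabla f(X_t) = 0
```

The paper's particle system replaces this single trajectory with $N$ interacting particles; the exact form of those equations is given in the paper, not in the abstract.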

Stochastic gradient Markov chain Monte Carlo


Title	Stochastic gradient Markov chain Monte Carlo
Authors	Christopher Nemeth, Paul Fearnhead
Abstract	Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that, in general, performing exact inference requires all of the data to be processed at each iteration of the algorithm. For large data sets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this paper, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilises data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online.
Tasks	Bayesian Inference
Published	2019-07-16
URL	https://arxiv.org/abs/1907.06986v1
PDF	https://arxiv.org/pdf/1907.06986v1.pdf
PWC	https://paperswithcode.com/paper/stochastic-gradient-markov-chain-monte-carlo
Repo
Framework
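
To make the data-subsampling idea concrete, here is a minimal sketch of one well-known SGMCMC algorithm, stochastic gradient Langevin dynamics (SGLD). It is not the paper's R code. The model (unit-variance Gaussian likelihood with a zero-mean Gaussian prior on the mean), the function name `sgld_gaussian_mean`, and all default parameter values are assumptions chosen for illustration.

```python
import math
import random


def sgld_gaussian_mean(data, n_iters=5000, batch_size=10, step=1e-4,
                       prior_var=10.0, seed=0):
    """Sample the mean of a unit-variance Gaussian with SGLD.

    Illustrative model (an assumption, not from the paper):
        theta ~ N(0, prior_var),  x_i | theta ~ N(theta, 1).
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_iters):
        # Subsample a minibatch instead of touching all n data points.
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Unbiased estimate of the full-data log-likelihood gradient:
        # rescale the minibatch sum by n / batch_size.
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        # Langevin update: half-step along the gradient plus N(0, step) noise.
        noise = rng.gauss(0.0, math.sqrt(step))
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples
```

Each iteration costs `batch_size` likelihood-gradient evaluations rather than `len(data)`, which is the per-iteration saving the abstract describes. With a fixed step size the samples are only approximately from the posterior, so in practice the step size is tuned or decreased over time.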

Biomedical Evidence Generation Engine


Title	Biomedical Evidence Generation Engine
Authors	Sendong Zhao, Fei Wang
Abstract	With the rapid development of precision medicine, a large amount of health data (such as electronic health records, gene sequencing, medical images, etc.) has been produced. This has driven growing interest in data-driven insight discovery from these data. A reasonable way to verify the derived insights is to check them against the biomedical literature. However, manual verification is inefficient and not scalable. Therefore, an intelligent technique is necessary to solve this problem. In this paper, we propose the task of biomedical evidence generation, which is novel and differs from existing NLP tasks. Furthermore, we developed a biomedical evidence generation engine for this task with a pipeline of three components: a literature retrieval module, a skeleton information identification module, and a text summarization module.
Tasks	Information Retrieval, Question Answering, Text Matching, Text Summarization
Published	2019-11-11
URL	https://arxiv.org/abs/1911.06146v2
PDF	https://arxiv.org/pdf/1911.06146v2.pdf
PWC	https://paperswithcode.com/paper/interactive-attention-for-semantic-text
Repo
Framework

Regularized Non-negative Spectral Embedding for Clustering


Title	Regularized Non-negative Spectral Embedding for Clustering
Authors	Yifei Wang, Rui Liu, Yong Chen, Hui Zhangs, Zhiwen Ye
Abstract	Spectral Clustering is a popular technique to split data points into groups, especially for complex datasets. The algorithms in the Spectral Clustering family typically consist of multiple separate stages (such as similarity matrix construction, low-dimensional embedding, and K-Means clustering as post-processing), which may lead to sub-optimal results because of the possible mismatch between different stages. In this paper, we propose an end-to-end single-stage learning method for clustering called Regularized Non-negative Spectral Embedding (RNSE), which extends Spectral Clustering with the adaptive learning of the similarity matrix and utilizes non-negative constraints to facilitate one-step clustering (directly from data points to clustering labels). Two well-founded methods, successive alternating projection and strategic multiplicative update, are employed to solve the challenging optimization problems in RNSE. Extensive experiments on both synthetic and real-world datasets demonstrate RNSE’s superior clustering performance over some state-of-the-art competitors.
Tasks
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00179v1
PDF	https://arxiv.org/pdf/1911.00179v1.pdf
PWC	https://paperswithcode.com/paper/regularized-non-negative-spectral-embedding
Repo
Framework

High Utility Interval-Based Sequences


Title	High Utility Interval-Based Sequences
Authors	S. Mohammad Mirbagheri, Howard J. Hamilton
Abstract	Sequential pattern mining is an interesting research area with a broad range of applications. Most prior research on sequential pattern mining has considered point-based data where events occur instantaneously. However, in many application domains, events persist over intervals of time of varying lengths. Furthermore, traditional frameworks of sequential pattern mining assume all events to have the same weight or utility. This simplifying assumption neglects the opportunity to find informative patterns in terms of utilities such as costs. To address these issues, we incorporate the concept of utility into interval-based sequences and define a framework to mine high utility patterns in interval-based sequences, i.e., patterns whose utility meets or exceeds a minimum threshold. In the proposed framework, the utility of events is considered while assuming multiple events can occur coincidentally and persist over varying periods of time. An Apriori-based algorithm named High Utility Interval-based Pattern Miner (HUIPMiner) is proposed and applied to real datasets. To achieve an efficient solution, HUIPMiner is augmented with a pruning strategy. Experimental results show that HUIPMiner is an effective solution to the problem of mining high utility interval-based sequences.
Tasks	Sequential Pattern Mining
Published	2019-12-24
URL	https://arxiv.org/abs/1912.11165v1
PDF	https://arxiv.org/pdf/1912.11165v1.pdf
PWC	https://paperswithcode.com/paper/high-utility-interval-based-sequences
Repo
Framework

Template-Based Automatic Search of Compact Semantic Segmentation Architectures


Title	Template-Based Automatic Search of Compact Semantic Segmentation Architectures
Authors	Vladimir Nekrasov, Chunhua Shen, Ian Reid
Abstract	Automatic search of neural architectures for various vision and natural language tasks is becoming a prominent tool as it allows high-performing structures to be discovered on any dataset of interest. Nevertheless, on more difficult domains, such as dense per-pixel classification, current automatic approaches are limited in their scope: due to their strong reliance on existing image classifiers, they tend to search only for a handful of additional layers, with discovered architectures still containing a large number of parameters. In contrast, in this work we propose a novel solution able to find light-weight and accurate segmentation architectures starting from only a few blocks of a pre-trained classification network. To this end, we progressively build up a methodology that relies on templates of sets of operations, predicts which template should be applied and how many times at each step, while also generating the connectivity structure and downsampling factors. All these decisions are made by a recurrent neural network that is rewarded based on the score of the emitted architecture on the holdout set and trained using reinforcement learning. One discovered architecture achieves 63.2% mean IoU on CamVid and 67.8% on CityScapes having only 270K parameters.
Tasks	Semantic Segmentation
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02365v1
PDF	http://arxiv.org/pdf/1904.02365v1.pdf
PWC	https://paperswithcode.com/paper/template-based-automatic-search-of-compact
Repo
Framework
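
The mean IoU figures quoted in the abstract average the per-class intersection-over-union between predicted and ground-truth pixel labels. A minimal sketch of that metric follows. The function name `mean_iou`, the flat-list input format, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration; benchmark protocols typically accumulate the counts over a whole dataset and may handle ignored labels differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for flat sequences of class labels.

    pred and target are equal-length sequences of integer class ids,
    e.g. a flattened segmentation mask.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that never appear
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class from range(num_classes) appears in the inputs")
    return sum(ious) / len(ious)
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages an IoU of 1/2 for class 0 and 2/3 for class 1.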

How multilingual is Multilingual BERT?


Title	How multilingual is Multilingual BERT?
Authors	Telmo Pires, Eva Schlinger, Dan Garrette
Abstract	In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2018) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, we present a large number of probing experiments, showing that transfer is possible even to languages in different scripts, that transfer works best between typologically similar languages, that monolingual corpora can train models for code-switching, and that the model can find translation pairs. From these results, we can conclude that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs.
Tasks	Language Modelling
Published	2019-06-04
URL	https://arxiv.org/abs/1906.01502v1
PDF	https://arxiv.org/pdf/1906.01502v1.pdf
PWC	https://paperswithcode.com/paper/how-multilingual-is-multilingual-bert
Repo
Framework

Context-Aware Multipath Networks


Title	Context-Aware Multipath Networks
Authors	Dumindu Tissera, Kumara Kahatapitiya, Rukshan Wijesinghe, Subha Fernando, Ranga Rodrigo
Abstract	Making a single network effectively address diverse contexts (learning the variations within a dataset or multiple datasets) is an intriguing step towards achieving generalized intelligence. Existing approaches of deepening, widening, and assembling networks are not cost-effective in general. In view of this, networks which can allocate resources according to the context of the input and regulate the flow of information across the network are effective. In this paper, we present Context-Aware Multipath Network (CAMNet), a multi-path neural network with data-dependent routing between parallel tensors. We show that our model performs as a generalized model capturing variations in individual datasets and multiple different datasets, both simultaneously and sequentially. CAMNet outperforms the equivalent single-path, multi-path, and deeper single-path networks on classification and pixel-labeling tasks, considering datasets individually, sequentially, and in combination. The data-dependent routing between tensors in CAMNet enables the model to control the flow of information end-to-end, deciding which resources are common and which are domain-specific.
Tasks
Published	2019-07-26
URL	https://arxiv.org/abs/1907.11519v1
PDF	https://arxiv.org/pdf/1907.11519v1.pdf
PWC	https://paperswithcode.com/paper/context-aware-multipath-networks
Repo
Framework

Unsupervised Part Mining for Fine-grained Image Classification


Title	Unsupervised Part Mining for Fine-grained Image Classification
Authors	Jian Zhang, Runsheng Zhang, Yaping Huang, Qi Zou
Abstract	Fine-grained image classification remains challenging due to the large intra-class variance and small inter-class variance. Since the subtle visual differences among subcategories lie only in local regions of discriminative parts, part localization is a key issue for fine-grained image classification. Most existing approaches localize objects or parts in an image with object or part annotations, which are expensive and labor-intensive. To tackle this issue, we propose a fully unsupervised part mining (UPM) approach to localize the discriminative parts without even image-level annotations, which largely improves the fine-grained classification performance. We first utilize pattern mining techniques to discover frequent patterns, i.e., co-occurrence highlighted regions, in the feature maps extracted from a pre-trained convolutional neural network (CNN) model. Inspired by the fact that these relevant meaningful patterns typically hold appearance and spatial consistency, we then cluster the mined regions to obtain the cluster centers, and the discriminative parts surrounding the cluster centers are generated. Importantly, no annotations or sophisticated training procedures are used in our proposed part localization approach. Finally, a multi-stream classification network is built for aggregating the original, object-level and part-level features simultaneously. Compared with other state-of-the-art approaches, our UPM approach achieves competitive performance.
Tasks	Fine-Grained Image Classification, Image Classification
Published	2019-02-26
URL	http://arxiv.org/abs/1902.09941v1
PDF	http://arxiv.org/pdf/1902.09941v1.pdf
PWC	https://paperswithcode.com/paper/unsupervised-part-mining-for-fine-grained
Repo
Framework

Street Scene: A new dataset and evaluation protocol for video anomaly detection


Title	Street Scene: A new dataset and evaluation protocol for video anomaly detection
Authors	Bharathkumar Ramachandra, Michael Jones
Abstract	Progress in video anomaly detection research is currently slowed by small datasets that lack a wide variety of activities as well as flawed evaluation criteria. This paper aims to help move this research effort forward by introducing a large and varied new dataset called Street Scene, as well as two new evaluation criteria that provide a better estimate of how an algorithm will perform in practice. In addition to the new dataset and evaluation criteria, we present two variations of a novel baseline video anomaly detection algorithm and show they are much more accurate on Street Scene than two state-of-the-art algorithms from the literature.
Tasks	Anomaly Detection
Published	2019-02-15
URL	https://arxiv.org/abs/1902.05872v3
PDF	https://arxiv.org/pdf/1902.05872v3.pdf
PWC	https://paperswithcode.com/paper/street-scene-a-new-dataset-and-evaluation
Repo
Framework

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs


Title	Hyper-SAGNN: a self-attention based graph neural network for hypergraphs
Authors	Ruochi Zhang, Yuesong Zou, Jian Ma
Abstract	Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
Tasks	Graph Representation Learning, Representation Learning
Published	2019-11-06
URL	https://arxiv.org/abs/1911.02613v1
PDF	https://arxiv.org/pdf/1911.02613v1.pdf
PWC	https://paperswithcode.com/paper/hyper-sagnn-a-self-attention-based-graph-1
Repo
Framework

Do we train on test data? Purging CIFAR of near-duplicates


Title	Do we train on test data? Purging CIFAR of near-duplicates
Authors	Björn Barz, Joachim Denzler
Abstract	We find that 3.3% and 10% of the images from the CIFAR-10 and CIFAR-100 test sets, respectively, have duplicates in the training set. This may incur a bias on the comparison of image recognition techniques with respect to their generalization capability on these heavily benchmarked datasets. To eliminate this bias, we provide the “fair CIFAR” (ciFAIR) dataset, where we replaced all duplicates in the test sets with new images sampled from the same domain. The training set remains unchanged, in order not to invalidate pre-trained models. We then re-evaluate the classification performance of various popular state-of-the-art CNN architectures on these new test sets to investigate whether recent research has overfitted to memorizing data instead of learning abstract concepts. Fortunately, this does not seem to be the case yet. The ciFAIR dataset and pre-trained models are available at https://cvjena.github.io/cifair/, where we also maintain a leaderboard.
Tasks
Published	2019-02-01
URL	http://arxiv.org/abs/1902.00423v1
PDF	http://arxiv.org/pdf/1902.00423v1.pdf
PWC	https://paperswithcode.com/paper/do-we-train-on-test-data-purging-cifar-of
Repo
Framework
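
One generic way to shortlist train/test duplicate candidates is a nearest-neighbour search over image feature vectors. The sketch below illustrates that idea only and is not the authors' procedure, which the abstract does not spell out. The function names, the use of cosine similarity, and the `threshold` value are all hypothetical, and the features are assumed to be precomputed by some feature extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate_candidates(train_feats, test_feats, threshold=0.95):
    """Return (test_index, train_index, similarity) for suspicious pairs.

    For each test vector, find its most similar training vector and keep
    the pair if the similarity reaches the (hypothetical) threshold.
    """
    if not train_feats:
        return []
    candidates = []
    for test_idx, test_vec in enumerate(test_feats):
        best_idx, best_sim = max(
            ((i, cosine_similarity(test_vec, train_vec))
             for i, train_vec in enumerate(train_feats)),
            key=lambda pair: pair[1],
        )
        if best_sim >= threshold:
            candidates.append((test_idx, best_idx, best_sim))
    return candidates
```

Pairs flagged this way are only candidates; deciding whether two images are true near-duplicates generally needs a visual check, and the brute-force search is quadratic in the number of images.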

Structure Learning of Gaussian Markov Random Fields with False Discovery Rate Control


Title	Structure Learning of Gaussian Markov Random Fields with False Discovery Rate Control
Authors	Sangkyun Lee, Piotr Sobczyk, Malgorzata Bogdan
Abstract	In this paper, we propose a new estimation procedure for discovering the structure of Gaussian Markov random fields (MRFs) with false discovery rate (FDR) control, making use of the sorted l1-norm (SL1) regularization. A Gaussian MRF is an acyclic graph representing a multivariate Gaussian distribution, where nodes are random variables and edges represent the conditional dependence between the connected nodes. Since it is possible to learn the edge structure of Gaussian MRFs directly from data, Gaussian MRFs provide an excellent way to understand complex data by revealing the dependence structure among many input features, such as genes, sensors, users, documents, etc. In learning the graphical structure of Gaussian MRFs, the goal is to discover the actual edges of the underlying but unknown probabilistic graphical model. This becomes more complicated as the number of random variables (features) p increases relative to the number of data points n. In particular, when $p \gg n$, it is statistically unavoidable for any estimation procedure to include false edges. Therefore, there have been many attempts to reduce the false detection of edges, in particular, using different types of regularization on the learning parameters. Our method makes use of the SL1 regularization, introduced recently for model selection in linear regression. We focus on a benefit of SL1 regularization: it can be used to control the FDR of detecting important random variables. Adapting SL1 for probabilistic graphical models, we show that SL1 can be used for the structure learning of Gaussian MRFs using our suggested procedure nsSLOPE (neighborhood selection Sorted L-One Penalized Estimation), controlling the FDR of detecting edges.
Tasks	Model Selection
Published	2019-10-24
URL	https://arxiv.org/abs/1910.10860v1
PDF	https://arxiv.org/pdf/1910.10860v1.pdf
PWC	https://paperswithcode.com/paper/structure-learning-of-gaussian-markov-random
Repo
Framework
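
The sorted l1-norm at the heart of SL1 regularization pairs the largest penalty weight with the largest coefficient magnitude, the second largest with the second largest, and so on. A minimal sketch of evaluating that penalty follows. The function name `sorted_l1_norm` and the example weights are assumptions for illustration; the specific weight sequence used by nsSLOPE is defined in the paper, not here.

```python
def sorted_l1_norm(beta, lambdas):
    """Sorted l1-norm: sum_i lambdas[i] * |beta|_(i).

    |beta|_(1) >= |beta|_(2) >= ... are the coefficient magnitudes in
    decreasing order, and lambdas must be non-negative and non-increasing.
    """
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    # Sort magnitudes so the largest coefficient meets the largest weight.
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))
```

With all weights equal, the penalty reduces to a scaled ordinary l1-norm; a strictly decreasing sequence penalizes the largest coefficients most heavily, which is the mechanism the abstract ties to FDR control.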

Non-contact Infant Sleep Apnea Detection


Title	Non-contact Infant Sleep Apnea Detection
Authors	Gihan Jayatilaka, Harshana Weligampola, Suren Sritharan, Pankayraj Pathmanathan, Roshan Ragel, Isuru Nawinne
Abstract	Sleep apnea is a breathing disorder where a person repeatedly stops breathing in sleep. Early detection is crucial for infants because the condition can have long-term adverse effects. The existing accurate detection mechanism (pulse oximetry) is a skin contact measurement. The existing non-contact mechanisms (acoustics, video processing) are not accurate enough. This paper presents a novel algorithm for the detection of sleep apnea with video processing. The solution is non-contact, accurate and lightweight enough to run on a single board computer. The paper discusses the accuracy of the algorithm on real data, its advantages and limitations, and suggests future improvements.
Tasks	Sleep apnea detection
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04725v1
PDF	https://arxiv.org/pdf/1910.04725v1.pdf
PWC	https://paperswithcode.com/paper/non-contact-infant-sleep-apnea-detection
Repo
Framework

Robust Cross-lingual Embeddings from Parallel Sentences


Title	Robust Cross-lingual Embeddings from Parallel Sentences
Authors	Ali Sabet, Prakhar Gupta, Jean-Baptiste Cordonnier, Robert West, Martin Jaggi
Abstract	Recent advances in cross-lingual word embeddings have primarily relied on mapping-based methods, which project pretrained word embeddings from different languages into a shared space through a linear transformation. However, these approaches assume word embedding spaces are isomorphic between different languages, which has been shown not to hold in practice (Søgaard et al., 2018), and fundamentally limits their performance. This motivates investigating joint learning methods which can overcome this impediment, by simultaneously learning embeddings across languages via a cross-lingual term in the training objective. Given the abundance of parallel data available (Tiedemann, 2012), we propose a bilingual extension of the CBOW method which leverages sentence-aligned corpora to obtain robust cross-lingual word and sentence representations. Our approach significantly improves cross-lingual sentence retrieval performance over all other approaches, and convincingly outscores mapping methods while maintaining parity with jointly trained methods on word-translation. It also achieves parity with a deep RNN method on a zero-shot cross-lingual document classification task, requiring far fewer computational resources for training and inference. As an additional advantage, our bilingual method also improves the quality of monolingual word vectors despite training on much smaller datasets. We make our code and models publicly available.
Tasks	Cross-Lingual Document Classification, Document Classification, Word Embeddings
Published	2019-12-28
URL	https://arxiv.org/abs/1912.12481v1
PDF	https://arxiv.org/pdf/1912.12481v1.pdf
PWC	https://paperswithcode.com/paper/robust-cross-lingual-embeddings-from-parallel-1
Repo
Framework