January 30, 2020

3199 words 16 mins read

Paper Group ANR 397

Markov chain Monte Carlo algorithms with sequential proposals. On-Device Neural Net Inference with Mobile GPUs. Contrast Phase Classification with a Generative Adversarial Network. Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?. Biologically inspired architectures for sample-efficient deep reinforcement learning. Prun …

Markov chain Monte Carlo algorithms with sequential proposals

Title Markov chain Monte Carlo algorithms with sequential proposals
Authors Joonha Park, Yves F. Atchadé
Abstract We explore a general framework in Markov chain Monte Carlo (MCMC) sampling where sequential proposals are tried as candidates for the next state of the Markov chain. This sequential-proposal framework can be applied to various existing MCMC methods, including Metropolis-Hastings algorithms using random proposals and methods that use deterministic proposals such as Hamiltonian Monte Carlo (HMC) or the bouncy particle sampler. Sequential-proposal MCMC methods construct the same Markov chains as those constructed by the delayed rejection method under certain circumstances. In the context of HMC, the sequential-proposal approach has been proposed as extra chance generalized hybrid Monte Carlo (XCGHMC). We develop two novel methods in which the trajectories leading to proposals in HMC are automatically tuned to avoid doubling back, as in the No-U-Turn sampler (NUTS). The numerical efficiency of these new methods compares favorably with that of NUTS. We additionally show that the sequential-proposal bouncy particle sampler enables the constructed Markov chain to pass through regions of low target density and thus facilitates better mixing of the chain when the target density is multimodal.
Tasks
Published 2019-07-15
URL https://arxiv.org/abs/1907.06544v3
PDF https://arxiv.org/pdf/1907.06544v3.pdf
PWC https://paperswithcode.com/paper/markov-chain-monte-carlo-algorithms-with
Repo
Framework
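To make the sequential-proposal idea above concrete, here is a minimal Python/NumPy sketch of a random-walk Metropolis variant that shares one uniform across several proposal attempts per iteration. It is only an illustration of the general framework under a symmetric-proposal assumption, not a reproduction of the paper's HMC or bouncy-particle algorithms; the function name and tuning constants are made up for the example.

```python
import numpy as np

def sequential_proposal_rwm(log_target, x0, n_iter=5000, n_tries=3, step=0.5, rng=None):
    """Simplified sketch of a sequential-proposal random-walk Metropolis step.

    One uniform is drawn per iteration and up to `n_tries` proposals are
    generated one after another (each centered at the previous candidate);
    the first candidate whose density ratio to the current state exceeds the
    uniform is accepted. Illustration only, not the paper's algorithms.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    chain = np.empty((n_iter, x.size))
    for t in range(n_iter):
        u = rng.uniform()                                  # shared by all tries
        y = x.copy()
        for _ in range(n_tries):
            y = y + step * rng.standard_normal(x.size)     # symmetric proposal
            if np.exp(log_target(y) - log_target(x)) > u:
                x = y                                      # accept first passing candidate
                break
        chain[t] = x
    return chain

# Example: sample a bimodal 1-D target, where extra attempts help the chain mix.
logp = lambda z: np.logaddexp(-0.5 * (z[0] - 3.0) ** 2, -0.5 * (z[0] + 3.0) ** 2)
samples = sequential_proposal_rwm(logp, x0=[0.0])
```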

On-Device Neural Net Inference with Mobile GPUs

Title On-Device Neural Net Inference with Mobile GPUs
Authors Juhyun Lee, Nikolay Chirkov, Ekaterina Ignasheva, Yury Pisarchyk, Mogan Shieh, Fabio Riccardi, Raman Sarokin, Andrei Kulik, Matthias Grundmann
Abstract On-device inference of machine learning models for mobile phones is desirable due to its lower latency and increased privacy. Running such a compute-intensive task solely on the mobile CPU, however, can be difficult due to limited computing power, thermal constraints, and energy consumption. App developers and researchers have begun exploiting hardware accelerators to overcome these challenges. Recently, device manufacturers have been adding neural processing units to high-end phones for on-device inference, but these account for only a small fraction of hand-held devices. In this paper, we present how we leverage the mobile GPU, a ubiquitous hardware accelerator on virtually every phone, to run inference of deep neural networks in real-time for both Android and iOS devices. By describing our architecture, we also discuss how to design networks that are mobile GPU-friendly. Our state-of-the-art mobile GPU inference engine is integrated into the open-source project TensorFlow Lite and publicly available at https://tensorflow.org/lite.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.01989v1
PDF https://arxiv.org/pdf/1907.01989v1.pdf
PWC https://paperswithcode.com/paper/on-device-neural-net-inference-with-mobile
Repo
Framework
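The engine described above ships as the TensorFlow Lite GPU delegate. As a rough illustration of how a delegate-backed interpreter is driven, here is a hedged Python sketch; the model path and the delegate library name are placeholders, and on Android/iOS the delegate is normally enabled through the Java/Swift/C++ APIs instead.

```python
import numpy as np
import tensorflow as tf

MODEL_PATH = "model.tflite"                              # placeholder path
GPU_DELEGATE_LIB = "libtensorflowlite_gpu_delegate.so"   # assumed/illustrative name

try:
    delegate = tf.lite.experimental.load_delegate(GPU_DELEGATE_LIB)
    interpreter = tf.lite.Interpreter(model_path=MODEL_PATH,
                                      experimental_delegates=[delegate])
except (ValueError, OSError):
    # Fall back to CPU execution if the GPU delegate is unavailable.
    interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)

interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.zeros(inp["shape"], dtype=inp["dtype"])           # dummy input batch
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
print(interpreter.get_tensor(out["index"]).shape)
```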

Contrast Phase Classification with a Generative Adversarial Network

Title Contrast Phase Classification with a Generative Adversarial Network
Authors Yucheng Tang, Ho Hin Lee, Yuchen Xu, Olivia Tang, Yunqiang Chen, Dashan Gao, Shizhong Han, Riqiang Gao, Camilo Bermudez, Michael R. Savona, Richard G. Abramson, Yuankai Huo, Bennett A. Landman
Abstract Dynamic contrast enhanced computed tomography (CT) is an imaging technique that provides critical information on the relationship of vascular structure and dynamics in the context of underlying anatomy. A key challenge for image processing with contrast enhanced CT is that phase discrepancies are latent in different tissues due to contrast protocols, vascular dynamics, and metabolism variance. Previous studies have proposed deep learning frameworks inspired by computer vision for classifying contrast enhancement. Here, we revisit the challenge in the context of whole abdomen contrast enhanced CTs. To capture and compensate for the complex contrast changes, we propose a novel discriminator in the form of a multi-domain disentangled representation learning network. The goal of this network is to learn an intermediate representation that separates contrast enhancement from anatomy and enables classification of images with varying contrast time. Briefly, our unpaired contrast disentangling GAN (CD-GAN) discriminator follows the ResNet architecture to classify a CT scan from different enhancement phases. To evaluate the approach, we trained the enhancement phase classifier on 21060 slices from two clinical cohorts of 230 subjects. Testing was performed on 9100 slices from 30 independent subjects who had been imaged with CT scans from all contrast phases. Performance was quantified in terms of the multi-class normalized confusion matrix. The proposed network significantly improved correspondence over the UNet, ResNet50, and StarGAN baselines, which achieved accuracy scores of 0.54, 0.55, and 0.62, respectively, compared to 0.91 for the proposed method. The proposed discriminator from the disentangled network presents a promising technique that may allow deeper modeling of dynamic imaging against patient-specific anatomies.
Tasks Computed Tomography (CT), Representation Learning
Published 2019-11-14
URL https://arxiv.org/abs/1911.06395v1
PDF https://arxiv.org/pdf/1911.06395v1.pdf
PWC https://paperswithcode.com/paper/contrast-phase-classification-with-a
Repo
Framework
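The full CD-GAN is not spelled out in the abstract, but its discriminator is described as a ResNet that classifies the enhancement phase of a CT slice. A minimal PyTorch sketch of such a phase classifier is below; the number of phases and the input preprocessing are assumptions, and this is only a classification head, not the disentangling GAN.

```python
import torch
import torch.nn as nn
from torchvision import models

# Number of contrast phases is an assumption for illustration only.
NUM_PHASES = 5

def build_phase_classifier(num_phases: int = NUM_PHASES) -> nn.Module:
    """ResNet-50 backbone with a phase-classification head (sketch only)."""
    backbone = models.resnet50()                          # randomly initialized by default
    backbone.fc = nn.Linear(backbone.fc.in_features, num_phases)
    return backbone

model = build_phase_classifier()
scans = torch.randn(2, 3, 224, 224)                       # CT slices replicated to 3 channels
logits = model(scans)                                     # shape (2, NUM_PHASES)
print(logits.shape)
```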

Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?

Title Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?
Authors Yunus Can Bilge, Nazli Ikizler-Cinbis, Ramazan Gokberk Cinbis
Abstract We introduce the problem of zero-shot sign language recognition (ZSSLR), where the goal is to leverage models learned over the seen sign class examples to recognize the instances of unseen signs. To this end, we propose to utilize the readily available descriptions in sign language dictionaries as an intermediate-level semantic representation for knowledge transfer. We introduce a new benchmark dataset called ASL-Text that consists of 250 sign language classes and their accompanying textual descriptions. Compared to the ZSL datasets in other domains (such as object recognition), our dataset contains a limited number of training examples for a large number of classes, which poses a significant challenge. We propose a framework that operates over the body and hand regions by means of 3D-CNNs, and models longer temporal relationships via bidirectional LSTMs. By leveraging the descriptive text embeddings along with these spatio-temporal representations within a zero-shot learning framework, we show that textual data can indeed be useful in uncovering sign languages. We anticipate that the introduced approach and the accompanying dataset will provide a basis for further exploration of this new zero-shot learning problem.
Tasks Object Recognition, Sign Language Recognition, Transfer Learning, Zero-Shot Learning
Published 2019-07-24
URL https://arxiv.org/abs/1907.10292v1
PDF https://arxiv.org/pdf/1907.10292v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-sign-language-recognition-can
Repo
Framework
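At inference time, zero-shot recognition of this kind reduces to scoring a spatio-temporal clip embedding against text embeddings of the unseen classes. A toy NumPy sketch of that scoring step follows; the embeddings are random stand-ins for the paper's 3D-CNN+BiLSTM video features and description embeddings, and the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholders: in the paper, clip embeddings come from 3D-CNNs + BiLSTMs and
# class embeddings from textual descriptions; here they are random vectors.
video_emb = rng.standard_normal((4, 300))            # 4 test clips
unseen_class_emb = rng.standard_normal((50, 300))    # 50 unseen sign classes

def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

# Zero-shot prediction: pick the unseen class whose text embedding is most
# similar (cosine) to the clip embedding.
scores = l2_normalize(video_emb) @ l2_normalize(unseen_class_emb).T
pred = scores.argmax(axis=1)
print(pred)
```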

Biologically inspired architectures for sample-efficient deep reinforcement learning

Title Biologically inspired architectures for sample-efficient deep reinforcement learning
Authors Pierre H. Richemond, Arinbjörn Kolbeinsson, Yike Guo
Abstract Deep reinforcement learning exacts a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation. In this work, we use tensor factorization in order to learn more compact representations for reinforcement learning policies. We show empirically that in the low-data regime, it is possible to learn online policies with 2 to 10 times fewer total coefficients, with little to no loss of performance. We also leverage progress in second order optimization, and use the theory of wavelet scattering to further reduce the number of learned coefficients, by foregoing learning the topmost convolutional layer filters altogether. We evaluate our results on the Atari suite against recent baseline algorithms that represent the state-of-the-art in data efficiency, and get comparable results with an order of magnitude gain in weight parsimony.
Tasks
Published 2019-11-25
URL https://arxiv.org/abs/1911.11285v1
PDF https://arxiv.org/pdf/1911.11285v1.pdf
PWC https://paperswithcode.com/paper/biologically-inspired-architectures-for
Repo
Framework
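The simplest instance of the compact parameterizations discussed above is a low-rank factorization of a layer's weight matrix. The PyTorch sketch below shows the parameter savings such a factorization buys; it is illustrative only and does not reproduce the paper's tensor factorizations, scattering transform, or second-order optimizer.

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer W ~ U V with rank r, a simple instance of a factorized
    (compact) parameterization; the paper factorizes policy networks with
    tensor methods, which this sketch only approximates in spirit."""

    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Linear(in_features, rank, bias=False)
        self.V = nn.Linear(rank, out_features, bias=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.V(self.U(x))

full = nn.Linear(512, 512)
compact = LowRankLinear(512, 512, rank=32)
n_full = sum(p.numel() for p in full.parameters())
n_compact = sum(p.numel() for p in compact.parameters())
print(f"parameter reduction: {n_full / n_compact:.1f}x")   # roughly 8x here
```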

Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study

Title Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study
Authors Leonid Boytsov, Eric Nyberg
Abstract We focus on low-dimensional non-metric search, where tree-based approaches permit efficient and accurate retrieval while having short indexing time. These methods rely on space partitioning and require a pruning rule to avoid visiting unpromising parts. We consider two known data-driven approaches to extend these rules to non-metric spaces: TriGen and a piece-wise linear approximation of the pruning rule. We propose and evaluate two adaptations of TriGen to non-symmetric similarities (TriGen does not support non-symmetric distances). We also evaluate a hybrid of TriGen and the piece-wise linear approximation pruning. We find that this hybrid approach is often more effective than either of the pruning rules. We make our software publicly available.
Tasks
Published 2019-10-08
URL https://arxiv.org/abs/1910.03539v1
PDF https://arxiv.org/pdf/1910.03539v1.pdf
PWC https://paperswithcode.com/paper/pruning-algorithms-for-low-dimensional-non
Repo
Framework
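As a rough picture of the piece-wise linear pruning rule mentioned above, the sketch below interpolates a pruning threshold as a function of the query-to-pivot distance and skips a partition when the current k-th neighbour is already closer than that threshold. The knot values, function names, and the VP-tree-style setting are assumptions for illustration, not the paper's fitted rules.

```python
import numpy as np

# The knots would be fit from data in practice (as in the paper's piece-wise
# linear approximation); the values below are purely illustrative.
knot_d   = np.array([0.0, 0.5, 1.0, 2.0, 4.0])   # distance from query to pivot
knot_tau = np.array([0.1, 0.3, 0.6, 1.2, 2.5])   # pruning threshold at each knot

def can_prune_outer_ball(dist_query_pivot: float, kth_neighbor_dist: float) -> bool:
    """Prune the outer partition if the current k-th neighbour distance is
    below the piece-wise linear threshold for this query-to-pivot distance."""
    tau = np.interp(dist_query_pivot, knot_d, knot_tau)
    return kth_neighbor_dist < tau

print(can_prune_outer_ball(0.8, 0.4))   # True: the subtree can be skipped
```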

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

Title Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator
Authors Karl Krauth, Stephen Tu, Benjamin Recht
Abstract We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks. Our analysis quantifies the tension between policy improvement and policy evaluation, and suggests that policy evaluation is the dominant factor in terms of sample complexity. Specifically, we show that to obtain a controller that is within $\varepsilon$ of the optimal LQR controller, each step of policy evaluation requires at most $(n+d)^3/\varepsilon^2$ samples, where $n$ is the dimension of the state vector and $d$ is the dimension of the input vector. On the other hand, only $\log(1/\varepsilon)$ policy improvement steps suffice, resulting in an overall sample complexity of $(n+d)^3 \varepsilon^{-2} \log(1/\varepsilon)$. We furthermore build on our analysis and construct a simple adaptive procedure based on $\varepsilon$-greedy exploration which relies on approximate PI as a sub-routine and obtains $T^{2/3}$ regret, improving upon a recent result of Abbasi-Yadkori et al.
Tasks Continuous Control
Published 2019-05-30
URL https://arxiv.org/abs/1905.12842v1
PDF https://arxiv.org/pdf/1905.12842v1.pdf
PWC https://paperswithcode.com/paper/finite-time-analysis-of-approximate-policy
Repo
Framework
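For reference, exact policy iteration for discrete-time LQR alternates a policy-evaluation step (a discrete Lyapunov equation) with a greedy improvement step; the paper's approximate PI replaces the evaluation with a sample-based estimate and bounds how many samples each step needs. A NumPy/SciPy sketch of the exact recursion, on an arbitrary stable system, is below.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def exact_policy_iteration_lqr(A, B, Q, R, n_iters=20):
    """Exact policy iteration for discrete-time LQR with u_t = K x_t
    (reference recursion only, not the paper's sample-based procedure)."""
    n, d = B.shape
    K = np.zeros((d, n))                       # zero policy; A below is stable
    for _ in range(n_iters):
        A_cl = A + B @ K
        # Policy evaluation: cost-to-go P solves P = Q + K'RK + A_cl' P A_cl.
        P = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)
        # Policy improvement (greedy with respect to the evaluated cost-to-go).
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

A = np.array([[0.9, 0.1], [0.0, 0.9]])         # illustrative stable dynamics
B = np.array([[0.0], [0.1]])
K, P = exact_policy_iteration_lqr(A, B, Q=np.eye(2), R=np.eye(1))
print(K)
```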

A Planning Framework for Persistent, Multi-UAV Coverage with Global Deconfliction

Title A Planning Framework for Persistent, Multi-UAV Coverage with Global Deconfliction
Authors Tushar Kusnur, Shohin Mukherjee, Dhruv Mauria Saxena, Tomoya Fukami, Takayuki Koyama, Oren Salzman, Maxim Likhachev
Abstract Planning for multi-robot coverage seeks to determine collision-free paths for a fleet of robots, enabling them to collectively observe points of interest in an environment. Persistent coverage is a variant of traditional coverage where coverage-levels in the environment decay over time. Thus, robots have to continuously revisit parts of the environment to maintain a desired coverage-level. Facilitating this in the real world demands we tackle numerous subproblems. While there exist standard solutions to these subproblems, there is no complete framework that addresses all of their individual challenges as a whole in a practical setting. We adapt and combine these solutions to present a planning framework for persistent coverage with multiple unmanned aerial vehicles (UAVs). Specifically, we run a continuous loop of goal assignment and globally deconflicting, kinodynamic path planning for multiple UAVs. We evaluate our framework in simulation as well as the real world. In particular, we demonstrate that (i) our framework degrades gracefully: given sufficient resources, we maintain persistent coverage, and if resources are insufficient (e.g., having too few UAVs for a given size of the environment), coverage-levels decay slowly; and (ii) planning with global deconfliction in our framework incurs a negligibly higher price compared to other weaker, more local collision-checking schemes. (Video: https://youtu.be/aqDs6Wymp5Q)
Tasks
Published 2019-08-25
URL https://arxiv.org/abs/1908.09236v3
PDF https://arxiv.org/pdf/1908.09236v3.pdf
PWC https://paperswithcode.com/paper/a-planning-framework-for-persistent-multi-uav
Repo
Framework
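The notion of decaying coverage-levels can be pictured with a toy grid model: every cell's coverage decays each timestep and is restored when a UAV passes over it. The sketch below is only this toy dynamic, with random-walking UAVs standing in for the framework's goal assignment and deconflicted kinodynamic planning; all constants are made up.

```python
import numpy as np

GRID, DECAY, STEPS = (20, 20), 0.98, 50
coverage = np.ones(GRID)
rng = np.random.default_rng(1)
uav_positions = [(5, 5), (15, 12)]               # two UAVs doing random walks

for _ in range(STEPS):
    coverage *= DECAY                             # coverage-levels decay over time
    new_positions = []
    for (r, c) in uav_positions:
        r = int(np.clip(r + rng.integers(-1, 2), 0, GRID[0] - 1))
        c = int(np.clip(c + rng.integers(-1, 2), 0, GRID[1] - 1))
        coverage[r, c] = 1.0                      # revisiting restores coverage
        new_positions.append((r, c))
    uav_positions = new_positions

print(f"mean coverage-level after {STEPS} steps: {coverage.mean():.3f}")
```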

Ontology Based Global and Collective Motion Patterns for Event Classification in Basketball Videos

Title Ontology Based Global and Collective Motion Patterns for Event Classification in Basketball Videos
Authors Lifang Wu, Zhou Yang, Jiaoyu He, Meng Jian, Yaowen Xu, Dezhong Xu, Chang Wen Chen
Abstract In multi-person videos, especially team sport videos, a semantic event is usually represented as a confrontation between two teams of players, which can be represented as collective motion. In broadcast basketball videos, specific camera motions are used to present specific events. Therefore, a semantic event in broadcast basketball videos is closely related to both the global motion (camera motion) and the collective motion. A semantic event in basketball videos can be generally divided into three stages: pre-event, event occurrence (event-occ), and post-event. In this paper, we propose an ontology-based global and collective motion pattern (On_GCMP) algorithm for basketball event classification. First, a two-stage GCMP based event classification scheme is proposed. The GCMP is extracted using optical flow. The two-stage scheme progressively combines a five-class event classification algorithm on event-occs and a two-class event classification algorithm on pre-events. Both algorithms utilize sequential convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to extract the spatial and temporal features of GCMP for event classification. Second, we utilize post-event segments to predict success/failure using algorithms based on deep features of images in the video frames (RGB_DF_VF). Finally, the event classification results and success/failure classification results are integrated to obtain the final results. To evaluate the proposed scheme, we collected a new dataset called NCAA+, which is automatically obtained from the NCAA dataset by extending the fixed length of video clips forward and backward of the corresponding semantic events. The experimental results demonstrate that the proposed scheme achieves a mean average precision of 58.10% on NCAA+, which is 6.50% higher than the state-of-the-art on NCAA.
Tasks Optical Flow Estimation
Published 2019-03-16
URL http://arxiv.org/abs/1903.06879v2
PDF http://arxiv.org/pdf/1903.06879v2.pdf
PWC https://paperswithcode.com/paper/ontology-based-global-and-collective-motion
Repo
Framework
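The GCMP features above are built from dense optical flow. As a hedged illustration of only the very first step, the OpenCV sketch below computes Farneback flow between two synthetic frames and summarizes it into crude global-motion and dispersion statistics; the real pipeline feeds richer flow-based patterns into sequential CNN and LSTM models.

```python
import cv2
import numpy as np

# Synthetic stand-in frames: a noise image and a horizontally shifted copy,
# simulating a camera pan (real inputs would be consecutive video frames).
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)
curr = np.roll(prev, shift=3, axis=1)

flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# Crude summaries: mean displacement as a global (camera) motion proxy and
# the dispersion of displacements as a collective-motion proxy.
global_motion = flow.reshape(-1, 2).mean(axis=0)
collective_motion = flow.reshape(-1, 2).std(axis=0)
print(global_motion, collective_motion)
```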

Confounder Selection via Support Intersection

Title Confounder Selection via Support Intersection
Authors Shinyuu Lee, Yuru Zhu
Abstract Confounding matters in almost all observational studies that focus on causality. In order to eliminate bias caused by confounders, a substantial number of features often needs to be collected for the analysis. In this case, the large-p-small-n problem can arise and dimension reduction techniques are required. However, traditional variable selection methods, which focus on prediction, are problematic in this setting. Throughout this paper, we analyze this issue in detail and assume sparsity of the confounders, which differs from previous work. Under this assumption, we propose several variable selection methods based on support intersection to pick out the confounders. We also discuss different approaches for estimating the causal effect and testing unconfoundedness. Finally, we provide numerical simulations to support our claims and compare against common heuristic methods, as well as an application to a real dataset.
Tasks
Published 2019-12-25
URL https://arxiv.org/abs/1912.11652v1
PDF https://arxiv.org/pdf/1912.11652v1.pdf
PWC https://paperswithcode.com/paper/confounder-selection-via-support-intersection
Repo
Framework
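A minimal sketch of the support-intersection idea, under assumptions of linear models and an off-the-shelf Lasso, is shown below: confounders are taken to be the features selected both when regressing the outcome (given the treatment) and when regressing the treatment on the covariates. The data-generating process, penalty values, and estimator choice are illustrative and not the paper's.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
# Features 0-2 are confounders (they drive both treatment and outcome);
# feature 3 only drives the treatment and feature 4 only the outcome.
treatment = X[:, :3].sum(axis=1) + X[:, 3] + 0.5 * rng.standard_normal(n)
outcome = 2.0 * treatment + X[:, :3].sum(axis=1) + X[:, 4] + 0.5 * rng.standard_normal(n)

# Sparse regression of the outcome on (treatment, X) and of the treatment on X;
# candidate confounders are the features appearing in both supports.
lasso_y = Lasso(alpha=0.1).fit(np.column_stack([treatment, X]), outcome)
lasso_t = Lasso(alpha=0.1).fit(X, treatment)
supp_y = set(np.flatnonzero(lasso_y.coef_[1:]))   # drop the treatment coefficient
supp_t = set(np.flatnonzero(lasso_t.coef_))
print(sorted(supp_y & supp_t))                     # true confounders are 0, 1, 2
```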

Named Entity Recognition Only from Word Embeddings

Title Named Entity Recognition Only from Word Embeddings
Authors Ying Luo, Hai Zhao, Junlang Zhan
Abstract Deep neural network models have helped named entity (NE) recognition achieve amazing performance without handcrafting features. However, existing systems require large amounts of human annotated training data. Efforts have been made to replace human annotations with external knowledge (e.g., NE dictionary, part-of-speech tags), while it is another challenge to obtain such effective resources. In this work, we propose a fully unsupervised NE recognition model which only needs to take informative clues from pre-trained word embeddings. We first apply Gaussian Hidden Markov Model and Deep Autoencoding Gaussian Mixture Model on word embeddings for entity span detection and type prediction, and then further design an instance selector based on reinforcement learning to distinguish positive sentences from noisy sentences and refine these coarse-grained annotations through neural networks. Extensive experiments on CoNLL benchmark datasets demonstrate that our proposed light NE recognition model achieves remarkable performance without using any annotated lexicon or corpus.
Tasks Named Entity Recognition, Word Embeddings
Published 2019-08-31
URL https://arxiv.org/abs/1909.00164v1
PDF https://arxiv.org/pdf/1909.00164v1.pdf
PWC https://paperswithcode.com/paper/named-entity-recognition-only-from-word
Repo
Framework
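The type-prediction step above clusters embedding vectors under a Gaussian mixture assumption. The sketch below uses a plain scikit-learn GaussianMixture on random stand-in embeddings to show the shape of that step; the paper's Gaussian HMM span detector, deep autoencoding GMM, and reinforcement-learned instance selector are not reproduced here, and the number of types is an assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for pre-trained word embeddings of candidate entity mentions.
mention_embeddings = rng.standard_normal((1000, 100))

N_TYPES = 4   # e.g. PER / LOC / ORG / MISC on CoNLL (assumed for illustration)
gmm = GaussianMixture(n_components=N_TYPES, covariance_type="diag", random_state=0)
cluster_ids = gmm.fit_predict(mention_embeddings)

# Each cluster would later be mapped to an entity type and the noisy labels
# refined, along the lines of the instance selector described in the abstract.
print(np.bincount(cluster_ids))
```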

Entity Projection via Machine Translation for Cross-Lingual NER

Title Entity Projection via Machine Translation for Cross-Lingual NER
Authors Alankar Jain, Bhargavi Paranjape, Zachary C. Lipton
Abstract Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for named entity recognition. Motivated by this fact, we leverage machine translation to improve annotation-projection approaches to cross-lingual named entity recognition. We propose a system that improves over prior entity-projection methods by: (a) leveraging machine translation systems twice: first for translating sentences and subsequently for translating entities; (b) matching entities based on orthographic and phonetic similarity; and (c) identifying matches based on distributional statistics derived from the dataset. Our approach improves upon current state-of-the-art methods for cross-lingual named entity recognition on 5 diverse languages by an average of 4.1 points. Further, our method achieves state-of-the-art F_1 scores for Armenian, outperforming even a monolingual model trained on Armenian source data.
Tasks Machine Translation, Named Entity Recognition
Published 2019-08-31
URL https://arxiv.org/abs/1909.05356v2
PDF https://arxiv.org/pdf/1909.05356v2.pdf
PWC https://paperswithcode.com/paper/entity-projection-via-machine-translation-for
Repo
Framework
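Step (b) above matches translated entities to candidate spans by orthographic and phonetic similarity. A toy sketch of the orthographic half, using Python's difflib character-overlap ratio with a made-up threshold, is below; the paper's phonetic matching and distributional statistics are not included.

```python
from difflib import SequenceMatcher

def orthographic_match(source_entity, candidates, threshold=0.7):
    """Match a projected entity to the most orthographically similar candidate
    span (sketch; the paper also uses phonetic similarity and corpus-level
    distributional statistics)."""
    scored = [(SequenceMatcher(None, source_entity.lower(), c.lower()).ratio(), c)
              for c in candidates]
    best_score, best = max(scored)
    return best if best_score >= threshold else None

print(orthographic_match("Yerevan", ["Erevan", "Armenia", "capital"]))  # -> "Erevan"
```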

RES-PCA: A Scalable Approach to Recovering Low-rank Matrices

Title RES-PCA: A Scalable Approach to Recovering Low-rank Matrices
Authors Chong Peng, Chenglizhao Chen, Zhao Kang, Jianbo Li, Qiang Cheng
Abstract Robust principal component analysis (RPCA) has drawn significant attention due to its powerful capability in recovering low-rank matrices as well as its successful applications to various real-world problems. Current state-of-the-art algorithms usually need to compute the singular value decomposition of large matrices, which generally has at least quadratic or even cubic complexity. This drawback has limited the application of RPCA to real-world problems. To combat this drawback, in this paper we propose a new type of RPCA method, RES-PCA, which is linearly efficient and scalable in both data size and dimension. For comparison, AltProj, an existing scalable approach to RPCA, requires precise knowledge of the true rank; otherwise, it may fail to recover low-rank matrices. By contrast, our method works with or without knowing the true rank; even when both methods work, ours is faster. Extensive experiments have been performed and testify to the effectiveness of the proposed method both quantitatively and in visual quality, which suggests that our method is suitable to be employed as a light-weight, scalable component for RPCA in any application pipeline.
Tasks
Published 2019-04-16
URL http://arxiv.org/abs/1904.07497v1
PDF http://arxiv.org/pdf/1904.07497v1.pdf
PWC https://paperswithcode.com/paper/res-pca-a-scalable-approach-to-recovering-low
Repo
Framework

The Renyi Gaussian Process: Towards Improved Generalization

Title The Renyi Gaussian Process: Towards Improved Generalization
Authors Xubo Yue, Raed Kontar
Abstract We introduce an alternative closed-form lower bound on the Gaussian process ($\mathcal{GP}$) likelihood based on the Rényi $\alpha$-divergence. This new lower bound can be viewed as a convex combination of the Nyström approximation and the exact $\mathcal{GP}$. The key advantage of this bound is its capability to control and tune the enforced regularization on the model; it thus generalizes traditional variational $\mathcal{GP}$ regression. From a theoretical perspective, we provide the convergence rate and risk bound for inference using our proposed approach. Experiments on real data show that the proposed algorithm may be able to deliver improvement over several $\mathcal{GP}$ inference methods.
Tasks
Published 2019-10-15
URL https://arxiv.org/abs/1910.06990v3
PDF https://arxiv.org/pdf/1910.06990v3.pdf
PWC https://paperswithcode.com/paper/the-renyi-gaussian-process
Repo
Framework
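The bound above is built from the Nyström approximation of the kernel matrix, $Q = K_{nm} K_{mm}^{-1} K_{mn}$. The NumPy sketch below computes that building block for an assumed RBF kernel with randomly chosen inducing points; it illustrates the ingredient the abstract refers to, not the Rényi lower bound itself.

```python
import numpy as np

def rbf_kernel(X, Z, lengthscale=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
Z = X[rng.choice(200, size=20, replace=False)]        # inducing points

K_nn = rbf_kernel(X, X)
K_nm = rbf_kernel(X, Z)
K_mm = rbf_kernel(Z, Z)

# Nystrom approximation Q = K_nm K_mm^{-1} K_mn (jitter added for stability).
Q_nn = K_nm @ np.linalg.solve(K_mm + 1e-8 * np.eye(20), K_nm.T)

print(np.linalg.norm(K_nn - Q_nn) / np.linalg.norm(K_nn))   # relative approximation error
```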

Learning Shared Cross-modality Representation Using Multispectral-LiDAR and Hyperspectral Data

Title Learning Shared Cross-modality Representation Using Multispectral-LiDAR and Hyperspectral Data
Authors Danfeng Hong, Jocelyn Chanussot, Naoto Yokoya, Jian Kang, Xiao Xiang Zhu
Abstract Due to the ever-growing diversity of data sources, multi-modality feature learning has attracted more and more attention. However, most of these methods jointly learn feature representations from the modalities present in both the training and test sets, and are less investigated when certain modalities are absent in the test phase. To this end, in this letter, we propose to learn a shared feature space across multiple modalities in the training process. In this way, an out-of-sample example from any single modality can be directly projected onto the learned space for a more effective cross-modality representation. More significantly, the shared space is regarded as a latent subspace in our proposed method, which connects the original multi-modal samples with label information to further improve the feature discrimination. Experiments are conducted on the multispectral-LiDAR and hyperspectral dataset provided by the 2018 IEEE GRSS Data Fusion Contest to demonstrate the effectiveness and superiority of the proposed method in comparison with several popular baselines.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08837v1
PDF https://arxiv.org/pdf/1912.08837v1.pdf
PWC https://paperswithcode.com/paper/learning-shared-cross-modality-representation
Repo
Framework
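As a generic point of comparison for the shared-space idea (not the paper's label-coupled subspace method), the sketch below uses CCA from scikit-learn to project two synthetic modalities, standing in for multispectral-LiDAR and hyperspectral features, into a common space and checks that the paired projections are strongly correlated.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n = 300
latent = rng.standard_normal((n, 5))                                     # shared structure
msl = latent @ rng.standard_normal((5, 30)) + 0.1 * rng.standard_normal((n, 30))    # "multispectral-LiDAR"
hsi = latent @ rng.standard_normal((5, 120)) + 0.1 * rng.standard_normal((n, 120))  # "hyperspectral"

# CCA serves here only as a generic shared-subspace baseline; the paper's
# method additionally ties the latent subspace to label information.
cca = CCA(n_components=5)
msl_shared, hsi_shared = cca.fit_transform(msl, hsi)

# Both modalities land in a common space: paired components correlate strongly.
corr = [np.corrcoef(msl_shared[:, k], hsi_shared[:, k])[0, 1] for k in range(5)]
print(np.round(corr, 2))
```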