Paper Group ANR 397
Markov chain Monte Carlo algorithms with sequential proposals. On-Device Neural Net Inference with Mobile GPUs. Contrast Phase Classification with a Generative Adversarial Network. Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?. Biologically inspired architectures for sample-efficient deep reinforcement learning. Prun …
Markov chain Monte Carlo algorithms with sequential proposals
Title | Markov chain Monte Carlo algorithms with sequential proposals |
Authors | Joonha Park, Yves F. Atchadé |
Abstract | We explore a general framework in Markov chain Monte Carlo (MCMC) sampling where sequential proposals are tried as candidates for the next state of the Markov chain. This sequential-proposal framework can be applied to various existing MCMC methods, including Metropolis-Hastings algorithms using random proposals and methods that use deterministic proposals such as Hamiltonian Monte Carlo (HMC) or the bouncy particle sampler. Sequential-proposal MCMC methods construct the same Markov chains as those constructed by the delayed rejection method under certain circumstances. In the context of HMC, the sequential-proposal approach has been proposed as extra chance generalized hybrid Monte Carlo (XCGHMC). We develop two novel methods in which the trajectories leading to proposals in HMC are automatically tuned to avoid doubling back, as in the No-U-Turn sampler (NUTS). The numerical efficiency of these new methods compares favorably to that of NUTS. We additionally show that the sequential-proposal bouncy particle sampler enables the constructed Markov chain to pass through regions of low target density and thus facilitates better mixing of the chain when the target density is multimodal. |
Tasks | |
Published | 2019-07-15 |
URL | https://arxiv.org/abs/1907.06544v3 |
https://arxiv.org/pdf/1907.06544v3.pdf | |
PWC | https://paperswithcode.com/paper/markov-chain-monte-carlo-algorithms-with |
Repo | |
Framework | |
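The sequential-proposal idea in the abstract above can be illustrated with a small random-walk sampler. This is a minimal sketch under stated assumptions, not the authors' code: it assumes a symmetric Gaussian proposal, a standard normal target, and the simplest variant in which one uniform draw is shared by all proposals in a sequence and the first proposal whose density ratio exceeds it is accepted. The names `log_target`, `sequential_proposal_step`, `step` and `max_proposals` are hypothetical.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a standard normal; illustrative choice."""
    return -0.5 * x * x


def sequential_proposal_step(x, rng, step=1.0, max_proposals=5):
    """One transition that tries up to `max_proposals` chained proposals.

    A single uniform draw is shared by every proposal in the sequence. The
    first candidate y with target(y) / target(x) above that draw is accepted;
    if none qualifies, the chain stays at x. With max_proposals=1 this is
    ordinary random-walk Metropolis.
    """
    log_u = math.log(1.0 - rng.random())  # log of a Uniform(0, 1] draw
    y = x
    for _ in range(max_proposals):
        y = y + rng.gauss(0.0, step)  # each proposal starts from the previous one
        if log_u < log_target(y) - log_target(x):
            return y
    return x


def run_chain(n_samples, seed=0, **kwargs):
    """Run the sampler from x = 0 and return the list of visited states."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_samples):
        x = sequential_proposal_step(x, rng, **kwargs)
        samples.append(x)
    return samples


samples = run_chain(20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

Because the proposal is symmetric, the acceptance test only needs the target ratio between the candidate and the starting state; a non-symmetric proposal would need the forward and backward proposal densities as well.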
On-Device Neural Net Inference with Mobile GPUs
Title | On-Device Neural Net Inference with Mobile GPUs |
Authors | Juhyun Lee, Nikolay Chirkov, Ekaterina Ignasheva, Yury Pisarchyk, Mogan Shieh, Fabio Riccardi, Raman Sarokin, Andrei Kulik, Matthias Grundmann |
Abstract | On-device inference of machine learning models for mobile phones is desirable due to its lower latency and increased privacy. Running such a compute-intensive task solely on the mobile CPU, however, can be difficult due to limited computing power, thermal constraints, and energy consumption. App developers and researchers have begun exploiting hardware accelerators to overcome these challenges. Recently, device manufacturers are adding neural processing units into high-end phones for on-device inference, but these account for only a small fraction of hand-held devices. In this paper, we present how we leverage the mobile GPU, a ubiquitous hardware accelerator on virtually every phone, to run inference of deep neural networks in real-time for both Android and iOS devices. By describing our architecture, we also discuss how to design networks that are mobile GPU-friendly. Our state-of-the-art mobile GPU inference engine is integrated into the open-source project TensorFlow Lite and publicly available at https://tensorflow.org/lite. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.01989v1 |
https://arxiv.org/pdf/1907.01989v1.pdf | |
PWC | https://paperswithcode.com/paper/on-device-neural-net-inference-with-mobile |
Repo | |
Framework | |
Contrast Phase Classification with a Generative Adversarial Network
Title | Contrast Phase Classification with a Generative Adversarial Network |
Authors | Yucheng Tang, Ho Hin Lee, Yuchen Xu, Olivia Tang, Yunqiang Chen, Dashan Gao, Shizhong Han, Riqiang Gao, Camilo Bermudez, Michael R. Savona, Richard G. Abramson, Yuankai Huo, Bennett A. Landman |
Abstract | Dynamic contrast enhanced computed tomography (CT) is an imaging technique that provides critical information on the relationship of vascular structure and dynamics in the context of underlying anatomy. A key challenge for image processing with contrast enhanced CT is that phase discrepancies are latent in different tissues due to contrast protocols, vascular dynamics, and metabolism variance. Previous studies with deep learning frameworks have been proposed for classifying contrast enhancement with networks inspired by computer vision. Here, we revisit the challenge in the context of whole abdomen contrast enhanced CTs. To capture and compensate for the complex contrast changes, we propose a novel discriminator in the form of a multi-domain disentangled representation learning network. The goal of this network is to learn an intermediate representation that separates contrast enhancement from anatomy and enables classification of images with varying contrast time. Briefly, our unpaired contrast disentangling GAN (CD-GAN) discriminator follows the ResNet architecture to classify a CT scan from different enhancement phases. To evaluate the approach, we trained the enhancement phase classifier on 21060 slices from two clinical cohorts of 230 subjects. Testing was performed on 9100 slices from 30 independent subjects who had been imaged with CT scans from all contrast phases. Performance was quantified in terms of the multi-class normalized confusion matrix. The proposed network reached an accuracy of 0.91, a significant improvement over the baseline UNet, ResNet50 and StarGAN, which scored 0.54, 0.55 and 0.62, respectively. The proposed discriminator from the disentangled network presents a promising technique that may allow deeper modeling of dynamic imaging against patient specific anatomies. |
Tasks | Computed Tomography (CT), Representation Learning |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06395v1 |
https://arxiv.org/pdf/1911.06395v1.pdf | |
PWC | https://paperswithcode.com/paper/contrast-phase-classification-with-a |
Repo | |
Framework | |
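The abstract above reports a multi-class normalized confusion matrix and accuracy scores. As a rough illustration of how such quantities are computed, the sketch below normalizes each row by the number of true examples in that class; the abstract does not say which normalization was used, so row normalization is an assumption, and the labels are made up for three hypothetical phases.

```python
def normalized_confusion_matrix(y_true, y_pred, n_classes):
    """Return a matrix whose entry [t][p] is the fraction of class-t samples predicted as p."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        # A class with no true samples yields an all-zero row.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels for three hypothetical contrast phases (0, 1, 2).
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 0, 2]

cm = normalized_confusion_matrix(y_true, y_pred, 3)
for row in cm:
    print(row)
print("accuracy:", accuracy(y_true, y_pred))
```

The diagonal of the row-normalized matrix gives per-class recall, which makes it easier to see which phases are confused with each other than a single accuracy figure does.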
Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?
Title | Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages? |
Authors | Yunus Can Bilge, Nazli Ikizler-Cinbis, Ramazan Gokberk Cinbis |
Abstract | We introduce the problem of zero-shot sign language recognition (ZSSLR), where the goal is to leverage models learned over the seen sign class examples to recognize the instances of unseen signs. To this end, we propose to utilize the readily available descriptions in sign language dictionaries as an intermediate-level semantic representation for knowledge transfer. We introduce a new benchmark dataset called ASL-Text that consists of 250 sign language classes and their accompanying textual descriptions. Compared to the ZSL datasets in other domains (such as object recognition), our dataset consists of a limited number of training examples for a large number of classes, which imposes a significant challenge. We propose a framework that operates over the body and hand regions by means of 3D-CNNs, and models longer temporal relationships via bidirectional LSTMs. By leveraging the descriptive text embeddings along with these spatio-temporal representations within a zero-shot learning framework, we show that textual data can indeed be useful in uncovering sign languages. We anticipate that the introduced approach and the accompanying dataset will provide a basis for further exploration of this new zero-shot learning problem. |
Tasks | Object Recognition, Sign Language Recognition, Transfer Learning, Zero-Shot Learning |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10292v1 |
https://arxiv.org/pdf/1907.10292v1.pdf | |
PWC | https://paperswithcode.com/paper/zero-shot-sign-language-recognition-can |
Repo | |
Framework | |
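The zero-shot setup described above, where unseen sign classes are recognized through their textual descriptions, can be sketched as a nearest-embedding lookup. This is a generic zero-shot inference step, not the paper's model: it assumes a video has already been mapped into the same space as the class text embeddings by a model trained on seen classes, and every vector and class name below is made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_unseen(video_embedding, class_text_embeddings):
    """Return the unseen class whose text embedding is most similar to the video."""
    return max(
        class_text_embeddings,
        key=lambda name: cosine(video_embedding, class_text_embeddings[name]),
    )


# Hypothetical text embeddings for two unseen sign classes.
class_text_embeddings = {
    "sign_a": [1.0, 0.0, 0.0],
    "sign_b": [0.0, 1.0, 0.0],
}
# Hypothetical video representation already projected into the text space.
video_embedding = [0.2, 0.9, 0.1]

print(predict_unseen(video_embedding, class_text_embeddings))
```

No training example of either class is needed at prediction time; only the text-derived class embeddings are.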
Biologically inspired architectures for sample-efficient deep reinforcement learning
Title | Biologically inspired architectures for sample-efficient deep reinforcement learning |
Authors | Pierre H. Richemond, Arinbjörn Kolbeinsson, Yike Guo |
Abstract | Deep reinforcement learning requires a heavy price in terms of sample efficiency and overparameterization in the neural networks used for function approximation. In this work, we use tensor factorization in order to learn more compact representations for reinforcement learning policies. We show empirically that in the low-data regime, it is possible to learn online policies with 2 to 10 times fewer total coefficients, with little to no loss of performance. We also leverage progress in second order optimization, and use the theory of wavelet scattering to further reduce the number of learned coefficients, by forgoing learning the topmost convolutional layer filters altogether. We evaluate our results on the Atari suite against recent baseline algorithms that represent the state-of-the-art in data efficiency, and get comparable results with an order of magnitude gain in weight parsimony. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.11285v1 |
https://arxiv.org/pdf/1911.11285v1.pdf | |
PWC | https://paperswithcode.com/paper/biologically-inspired-architectures-for |
Repo | |
Framework | |
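To make the coefficient savings mentioned above concrete, the sketch below counts parameters for the simplest factorization, a rank-r product of two matrices standing in for a dense weight matrix. The paper uses tensor factorization of policy networks; the matrix case, the layer size and the rank here are illustrative assumptions only.

```python
def dense_params(rows, cols):
    """Number of coefficients in a dense rows x cols weight matrix."""
    return rows * cols


def low_rank_params(rows, cols, rank):
    """Number of coefficients when W is stored as U (rows x rank) times V (rank x cols)."""
    return rank * (rows + cols)


# Hypothetical 512 x 512 layer factorized at rank 32.
rows, cols, rank = 512, 512, 32
dense = dense_params(rows, cols)
factored = low_rank_params(rows, cols, rank)
print(dense, factored, dense / factored)
```

The factorized form is smaller whenever the rank is below rows * cols / (rows + cols), which is 256 for this layer size.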
Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study
Title | Pruning Algorithms for Low-Dimensional Non-metric k-NN Search: A Case Study |
Authors | Leonid Boytsov, Eric Nyberg |
Abstract | We focus on low-dimensional non-metric search, where tree-based approaches permit efficient and accurate retrieval while having short indexing time. These methods rely on space partitioning and require a pruning rule to avoid visiting unpromising parts. We consider two known data-driven approaches to extend these rules to non-metric spaces: TriGen and a piece-wise linear approximation of the pruning rule. We propose and evaluate two adaptations of TriGen to non-symmetric similarities (TriGen does not support non-symmetric distances). We also evaluate a hybrid of TriGen and the piece-wise linear approximation pruning. We find that this hybrid approach is often more effective than either of the pruning rules. We make our software publicly available. |
Tasks | |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03539v1 |
https://arxiv.org/pdf/1910.03539v1.pdf | |
PWC | https://paperswithcode.com/paper/pruning-algorithms-for-low-dimensional-non |
Repo | |
Framework | |
Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator
Title | Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator |
Authors | Karl Krauth, Stephen Tu, Benjamin Recht |
Abstract | We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks. Our analysis quantifies the tension between policy improvement and policy evaluation, and suggests that policy evaluation is the dominant factor in terms of sample complexity. Specifically, we show that to obtain a controller that is within $\varepsilon$ of the optimal LQR controller, each step of policy evaluation requires at most $(n+d)^3/\varepsilon^2$ samples, where $n$ is the dimension of the state vector and $d$ is the dimension of the input vector. On the other hand, only $\log(1/\varepsilon)$ policy improvement steps suffice, resulting in an overall sample complexity of $(n+d)^3 \varepsilon^{-2} \log(1/\varepsilon)$. We furthermore build on our analysis and construct a simple adaptive procedure based on $\varepsilon$-greedy exploration which relies on approximate PI as a sub-routine and obtains $T^{2/3}$ regret, improving upon a recent result of Abbasi-Yadkori et al. |
Tasks | Continuous Control |
Published | 2019-05-30 |
URL | https://arxiv.org/abs/1905.12842v1 |
https://arxiv.org/pdf/1905.12842v1.pdf | |
PWC | https://paperswithcode.com/paper/finite-time-analysis-of-approximate-policy |
Repo | |
Framework | |
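The alternation between policy evaluation and policy improvement discussed above can be seen in a scalar LQR problem. The sketch below runs exact, model-based policy iteration, where evaluation is done in closed form; the paper instead studies the approximate case in which evaluation is estimated from samples. The system coefficients `a`, `b`, the costs `q`, `r` and the initial gain `k0` are made-up values.

```python
def policy_evaluation(a, b, q, r, k):
    """Cost coefficient p of the policy u = -k * x, so that J(x) = p * x**2."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        raise ValueError("policy is not stabilizing")
    # p solves p = q + r * k**2 + closed_loop**2 * p.
    return (q + r * k * k) / (1.0 - closed_loop * closed_loop)


def policy_improvement(a, b, r, p):
    """Greedy gain with respect to the cost-to-go p * x**2."""
    return (b * p * a) / (r + b * b * p)


def policy_iteration(a, b, q, r, k0, n_iters=20):
    """Alternate evaluation and improvement, starting from a stabilizing gain k0."""
    k = k0
    for _ in range(n_iters):
        p = policy_evaluation(a, b, q, r, k)
        k = policy_improvement(a, b, r, p)
    return k, policy_evaluation(a, b, q, r, k)


def riccati_residual(a, b, q, r, p):
    """Residual of the scalar discrete-time algebraic Riccati equation at p."""
    return q + a * a * p - (a * b * p) ** 2 / (r + b * b * p) - p


# x_{t+1} = a * x_t + b * u_t with stage cost q * x**2 + r * u**2.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k, p = policy_iteration(a, b, q, r, k0=1.0)
print(f"gain={k:.6f} cost={p:.6f} residual={riccati_residual(a, b, q, r, p):.2e}")
```

With exact evaluation the gain converges in a handful of iterations, which is consistent with the abstract's point that few improvement steps are needed and that the cost lies mainly in evaluation.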
A Planning Framework for Persistent, Multi-UAV Coverage with Global Deconfliction
Title | A Planning Framework for Persistent, Multi-UAV Coverage with Global Deconfliction |
Authors | Tushar Kusnur, Shohin Mukherjee, Dhruv Mauria Saxena, Tomoya Fukami, Takayuki Koyama, Oren Salzman, Maxim Likhachev |
Abstract | Planning for multi-robot coverage seeks to determine collision-free paths for a fleet of robots, enabling them to collectively observe points of interest in an environment. Persistent coverage is a variant of traditional coverage where coverage-levels in the environment decay over time. Thus, robots have to continuously revisit parts of the environment to maintain a desired coverage-level. Facilitating this in the real world demands we tackle numerous subproblems. While there exist standard solutions to these subproblems, there is no complete framework that addresses all of their individual challenges as a whole in a practical setting. We adapt and combine these solutions to present a planning framework for persistent coverage with multiple unmanned aerial vehicles (UAVs). Specifically, we run a continuous loop of goal assignment and globally deconflicting, kinodynamic path planning for multiple UAVs. We evaluate our framework in simulation as well as the real world. In particular, we demonstrate that (i) our framework exhibits graceful coverage: given sufficient resources, we maintain persistent coverage; if resources are insufficient (e.g., having too few UAVs for a given size of the environment), coverage-levels decay slowly, and (ii) planning with global deconfliction in our framework incurs a negligibly higher price compared to other weaker, more local collision-checking schemes. (Video: https://youtu.be/aqDs6Wymp5Q) |
Tasks | |
Published | 2019-08-25 |
URL | https://arxiv.org/abs/1908.09236v3 |
https://arxiv.org/pdf/1908.09236v3.pdf | |
PWC | https://paperswithcode.com/paper/a-planning-framework-for-persistent-multi-uav |
Repo | |
Framework | |
Ontology Based Global and Collective Motion Patterns for Event Classification in Basketball Videos
Title | Ontology Based Global and Collective Motion Patterns for Event Classification in Basketball Videos |
Authors | Lifang Wu, Zhou Yang, Jiaoyu He, Meng Jian, Yaowen Xu, Dezhong Xu, Chang Wen Chen |
Abstract | In multi-person videos, especially team sport videos, a semantic event is usually represented as a confrontation between two teams of players, which can be represented as collective motion. In broadcast basketball videos, specific camera motions are used to present specific events. Therefore, a semantic event in broadcast basketball videos is closely related to both the global motion (camera motion) and the collective motion. A semantic event in basketball videos can be generally divided into three stages: pre-event, event occurrence (event-occ), and post-event. In this paper, we propose an ontology-based global and collective motion pattern (On_GCMP) algorithm for basketball event classification. First, a two-stage GCMP based event classification scheme is proposed. The GCMP is extracted using optical flow. The two-stage scheme progressively combines a five-class event classification algorithm on event-occs and a two-class event classification algorithm on pre-events. Both algorithms utilize sequential convolutional neural networks (CNNs) and long short-term memory (LSTM) networks to extract the spatial and temporal features of GCMP for event classification. Second, we utilize post-event segments to predict success/failure using deep features of images in the video frames (RGB_DF_VF) based algorithms. Finally the event classification results and success/failure classification results are integrated to obtain the final results. To evaluate the proposed scheme, we collected a new dataset called NCAA+, which is automatically obtained from the NCAA dataset by extending the fixed length of video clips forward and backward of the corresponding semantic events. The experimental results demonstrate that the proposed scheme achieves the mean average precision of 58.10% on NCAA+. It is higher by 6.50% than state-of-the-art on NCAA. |
Tasks | Optical Flow Estimation |
Published | 2019-03-16 |
URL | http://arxiv.org/abs/1903.06879v2 |
http://arxiv.org/pdf/1903.06879v2.pdf | |
PWC | https://paperswithcode.com/paper/ontology-based-global-and-collective-motion |
Repo | |
Framework | |
Confounder Selection via Support Intersection
Title | Confounder Selection via Support Intersection |
Authors | Shinyuu Lee, Yuru Zhu |
Abstract | Confounding matters in almost all observational studies that focus on causality. In order to eliminate bias caused by confounders, oftentimes a substantial number of features need to be collected in the analysis. In this case, a large-p, small-n problem can arise and a dimension reduction technique is required. However, traditional variable selection methods, which focus on prediction, are problematic in this setting. Throughout this paper, we analyze this issue in detail and assume sparsity of the confounders, which differs from previous work. Under this assumption we propose several variable selection methods based on support intersection to pick out the confounders. We also discuss different approaches for estimating the causal effect and testing unconfoundedness. Finally, we provide numerical simulations to support our claims and compare with common heuristic methods, as well as applications to a real dataset. |
Tasks | |
Published | 2019-12-25 |
URL | https://arxiv.org/abs/1912.11652v1 |
https://arxiv.org/pdf/1912.11652v1.pdf | |
PWC | https://paperswithcode.com/paper/confounder-selection-via-support-intersection |
Repo | |
Framework | |
Named Entity Recognition Only from Word Embeddings
Title | Named Entity Recognition Only from Word Embeddings |
Authors | Ying Luo, Hai Zhao, Junlang Zhan |
Abstract | Deep neural network models have helped named entity (NE) recognition achieve amazing performance without handcrafting features. However, existing systems require large amounts of human annotated training data. Efforts have been made to replace human annotations with external knowledge (e.g., NE dictionary, part-of-speech tags), while it is another challenge to obtain such effective resources. In this work, we propose a fully unsupervised NE recognition model which only needs to take informative clues from pre-trained word embeddings. We first apply Gaussian Hidden Markov Model and Deep Autoencoding Gaussian Mixture Model on word embeddings for entity span detection and type prediction, and then further design an instance selector based on reinforcement learning to distinguish positive sentences from noisy sentences and refine these coarse-grained annotations through neural networks. Extensive experiments on CoNLL benchmark datasets demonstrate that our proposed light NE recognition model achieves remarkable performance without using any annotated lexicon or corpus. |
Tasks | Named Entity Recognition, Word Embeddings |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00164v1 |
https://arxiv.org/pdf/1909.00164v1.pdf | |
PWC | https://paperswithcode.com/paper/named-entity-recognition-only-from-word |
Repo | |
Framework | |
Entity Projection via Machine Translation for Cross-Lingual NER
Title | Entity Projection via Machine Translation for Cross-Lingual NER |
Authors | Alankar Jain, Bhargavi Paranjape, Zachary C. Lipton |
Abstract | Although over 100 languages are supported by strong off-the-shelf machine translation systems, only a subset of them possess large annotated corpora for named entity recognition. Motivated by this fact, we leverage machine translation to improve annotation-projection approaches to cross-lingual named entity recognition. We propose a system that improves over prior entity-projection methods by: (a) leveraging machine translation systems twice: first for translating sentences and subsequently for translating entities; (b) matching entities based on orthographic and phonetic similarity; and (c) identifying matches based on distributional statistics derived from the dataset. Our approach improves upon current state-of-the-art methods for cross-lingual named entity recognition on 5 diverse languages by an average of 4.1 points. Further, our method achieves state-of-the-art F_1 scores for Armenian, outperforming even a monolingual model trained on Armenian source data. |
Tasks | Machine Translation, Named Entity Recognition |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.05356v2 |
https://arxiv.org/pdf/1909.05356v2.pdf | |
PWC | https://paperswithcode.com/paper/entity-projection-via-machine-translation-for |
Repo | |
Framework | |
RES-PCA: A Scalable Approach to Recovering Low-rank Matrices
Title | RES-PCA: A Scalable Approach to Recovering Low-rank Matrices |
Authors | Chong Peng, Chenglizhao Chen, Zhao Kang, Jianbo Li, Qiang Cheng |
Abstract | Robust principal component analysis (RPCA) has drawn significant attention due to its powerful capability in recovering low-rank matrices as well as successful applications in various real world problems. The current state-of-the-art algorithms usually need to solve singular value decomposition of large matrices, which generally has at least a quadratic or even cubic complexity. This drawback has limited the application of RPCA in solving real world problems. To combat this drawback, in this paper we propose a new type of RPCA method, RES-PCA, which is linearly efficient and scalable in both data size and dimension. For comparison, AltProj, an existing scalable approach to RPCA, requires precise knowledge of the true rank; otherwise, it may fail to recover low-rank matrices. By contrast, our method works with or without knowing the true rank; even when both methods work, our method is faster. Extensive experiments have been performed and testify to the effectiveness of the proposed method quantitatively and in visual quality, which suggests that our method is suitable to be employed as a light-weight, scalable component for RPCA in any application pipeline. |
Tasks | |
Published | 2019-04-16 |
URL | http://arxiv.org/abs/1904.07497v1 |
http://arxiv.org/pdf/1904.07497v1.pdf | |
PWC | https://paperswithcode.com/paper/res-pca-a-scalable-approach-to-recovering-low |
Repo | |
Framework | |
The Renyi Gaussian Process: Towards Improved Generalization
Title | The Renyi Gaussian Process: Towards Improved Generalization |
Authors | Xubo Yue, Raed Kontar |
Abstract | We introduce an alternative closed form lower bound on the Gaussian process ($\mathcal{GP}$) likelihood based on the Rényi $\alpha$-divergence. This new lower bound can be viewed as a convex combination of the Nyström approximation and the exact $\mathcal{GP}$. The key advantage of this bound is its capability to control and tune the enforced regularization on the model and thus is a generalization of the traditional variational $\mathcal{GP}$ regression. From a theoretical perspective, we provide the convergence rate and risk bound for inference using our proposed approach. Experiments on real data show that the proposed algorithm may be able to deliver improvement over several $\mathcal{GP}$ inference methods. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06990v3 |
https://arxiv.org/pdf/1910.06990v3.pdf | |
PWC | https://paperswithcode.com/paper/the-renyi-gaussian-process |
Repo | |
Framework | |
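For reference, the Rényi $\alpha$-divergence named in the abstract above has the following standard definition for densities $p$ and $q$. The abstract does not state the paper's lower bound itself, so it is not reproduced here.

$$
D_{\alpha}[p \,\|\, q] = \frac{1}{\alpha - 1} \log \int p(x)^{\alpha}\, q(x)^{1-\alpha}\, dx, \qquad \alpha > 0,\ \alpha \neq 1 .
$$

In the limit $\alpha \to 1$ this recovers the Kullback-Leibler divergence $\mathrm{KL}[p \,\|\, q]$, the divergence used in standard variational inference, which fits the abstract's description of the bound as a generalization of variational $\mathcal{GP}$ regression.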
Learning Shared Cross-modality Representation Using Multispectral-LiDAR and Hyperspectral Data
Title | Learning Shared Cross-modality Representation Using Multispectral-LiDAR and Hyperspectral Data |
Authors | Danfeng Hong, Jocelyn Chanussot, Naoto Yokoya, Jian Kang, Xiao Xiang Zhu |
Abstract | Due to the ever-growing diversity of data sources, multi-modality feature learning has attracted more and more attention. However, most of these methods are designed to jointly learn feature representations from multiple modalities that exist in both the training and test sets, and they are less investigated in the absence of a certain modality in the test phase. To this end, in this letter, we propose to learn a shared feature space across multiple modalities in the training process. In this way, out-of-sample data from any of the modalities can be directly projected onto the learned space for a more effective cross-modality representation. More significantly, the shared space is regarded as a latent subspace in our proposed method, which connects the original multi-modal samples with label information to further improve the feature discrimination. Experiments are conducted on the multispectral-LiDAR and hyperspectral dataset provided by the 2018 IEEE GRSS Data Fusion Contest to demonstrate the effectiveness and superiority of the proposed method in comparison with several popular baselines. |
Tasks | |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.08837v1 |
https://arxiv.org/pdf/1912.08837v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-shared-cross-modality-representation |
Repo | |
Framework | |