October 20, 2019


Paper Group AWR 310


Distractor-aware Siamese Networks for Visual Object Tracking. DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback. A multi-contrast MRI approach to thalamus segmentation. Evolving simple programs for playing Atari games. Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression. Fictitious GAN: Training GANs …

Distractor-aware Siamese Networks for Visual Object Tracking

Title Distractor-aware Siamese Networks for Visual Object Tracking
Authors Zheng Zhu, Qiang Wang, Bo Li, Wei Wu, Junjie Yan, Weiming Hu
Abstract Recently, Siamese networks have drawn great attention in the visual tracking community because of their balanced accuracy and speed. However, features used in most Siamese tracking approaches can only discriminate foreground from the non-semantic backgrounds. The semantic backgrounds are always considered as distractors, which hinders the robustness of Siamese trackers. In this paper, we focus on learning distractor-aware Siamese networks for accurate and long-term tracking. To this end, features used in traditional Siamese trackers are first analyzed. We observe that the imbalanced distribution of training data makes the learned features less discriminative. During the off-line training phase, an effective sampling strategy is introduced to control this distribution and make the model focus on the semantic distractors. During inference, a novel distractor-aware module is designed to perform incremental learning, which can effectively transfer the general embedding to the current video domain. In addition, we extend the proposed approach for long-term tracking by introducing a simple yet effective local-to-global search region strategy. Extensive experiments on benchmarks show that our approach significantly outperforms the state of the art, yielding 9.6% relative gain on the VOT2016 dataset and 35.9% relative gain on the UAV20L dataset. The proposed tracker can perform at 160 FPS on short-term benchmarks and 110 FPS on long-term benchmarks.
Tasks Object Tracking, Visual Object Tracking, Visual Tracking
Published 2018-08-18
URL http://arxiv.org/abs/1808.06048v1
PDF http://arxiv.org/pdf/1808.06048v1.pdf
PWC https://paperswithcode.com/paper/distractor-aware-siamese-networks-for-visual
Repo https://github.com/foolwood/DaSiamRPN
Framework pytorch

DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback

Title DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback
Authors Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, Shin-ichi Maeda
Abstract Exploration has been one of the greatest challenges in reinforcement learning (RL), which is a large obstacle in the application of RL to robotics. Even with state-of-the-art RL algorithms, building a well-learned agent often requires too many trials, mainly due to the difficulty of matching its actions with rewards in the distant future. A remedy for this is to train an agent with real-time feedback from a human observer who immediately gives rewards for some actions. This study tackles a series of challenges for introducing such a human-in-the-loop RL scheme. The first contribution of this work is our experiments with a precisely modeled human observer: binary, delay, stochasticity, unsustainability, and natural reaction. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their baselines in Maze and Taxi simulated environments. Furthermore, we demonstrate a real-world human-in-the-loop RL application where a camera automatically recognizes a user’s facial expressions as feedback to the agent while the agent explores a maze.
Tasks
Published 2018-10-28
URL http://arxiv.org/abs/1810.11748v1
PDF http://arxiv.org/pdf/1810.11748v1.pdf
PWC https://paperswithcode.com/paper/dqn-tamer-human-in-the-loop-reinforcement
Repo https://github.com/JulienDesvergnes/human-reinforcement-learning
Framework tf

A multi-contrast MRI approach to thalamus segmentation

Title A multi-contrast MRI approach to thalamus segmentation
Authors Veronica Corona, Jan Lellmann, Peter Nestor, Carola-Bibiane Schoenlieb, Julio Acosta-Cabronero
Abstract Thalamic alterations are relevant to many neurological disorders including Alzheimer’s disease, Parkinson’s disease and multiple sclerosis. Routine interventions to improve symptom severity in movement disorders, for example, often consist of surgery or deep brain stimulation to diencephalic nuclei. Therefore, accurate delineation of grey matter thalamic subregions is of the utmost clinical importance. MRI is highly appropriate for structural segmentation as it provides different views of the anatomy from a single scanning session. With several contrasts potentially available, it is also of increasing importance to develop new image segmentation techniques that can operate multi-spectrally. We hereby propose a new segmentation method for use with multi-modality data, which we evaluated for automated segmentation of major thalamic subnuclear groups using T1-, T2*-weighted and quantitative susceptibility mapping (QSM) information. The proposed method consists of four steps: highly iterative image co-registration, manual segmentation on the average training-data template, supervised learning for pattern recognition, and a final convex optimisation step imposing further spatial constraints to refine the solution. This led to solutions in greater agreement with manual segmentation than the standard Morel atlas based approach. Furthermore, we show that the multi-contrast approach boosts segmentation performance. We then investigated whether prior knowledge using the training-template contours could further improve convex segmentation accuracy and robustness, which led to highly precise multi-contrast segmentations in single subjects. This approach can be extended to most 3D imaging data types and any region of interest discernible in single scans or multi-subject templates.
Tasks Semantic Segmentation
Published 2018-07-27
URL http://arxiv.org/abs/1807.10757v1
PDF http://arxiv.org/pdf/1807.10757v1.pdf
PWC https://paperswithcode.com/paper/a-multi-contrast-mri-approach-to-thalamus
Repo https://github.com/veronicacorona/multicontrastSegmentation
Framework none

Evolving simple programs for playing Atari games

Title Evolving simple programs for playing Atari games
Authors Dennis G Wilson, Sylvain Cussat-Blanc, Hervé Luga, Julian F Miller
Abstract Cartesian Genetic Programming (CGP) has previously shown capabilities in image processing tasks by evolving programs with a function set specialized for computer vision. A similar approach can be applied to Atari playing. Programs are evolved using mixed type CGP with a function set suited for matrix operations, including image processing, but allowing for controller behavior to emerge. While the programs are relatively small, many controllers are competitive with state of the art methods for the Atari benchmark set and require less training time. By evaluating the programs of the best evolved individuals, simple but effective strategies can be found.
Tasks Atari Games
Published 2018-06-14
URL http://arxiv.org/abs/1806.05695v1
PDF http://arxiv.org/pdf/1806.05695v1.pdf
PWC https://paperswithcode.com/paper/evolving-simple-programs-for-playing-atari
Repo https://github.com/JacobLaney/cgp-tetris
Framework none

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

Title Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
Authors Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji
Abstract Compressing convolutional neural networks (CNNs) has received ever-increasing research focus. However, most existing CNN compression methods do not interpret their inherent structures to distinguish the implicit redundancy. In this paper, we investigate the problem of CNN compression from a novel interpretable perspective. The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression. Kernel clustering is further conducted based on the KSE indicator to accomplish high-precision CNN compression. KSE is capable of simultaneously compressing each layer in an efficient way, which is significantly faster compared to previous data-driven feature map pruning methods. We comprehensively evaluate the compression and speedup of the proposed method on CIFAR-10, SVHN and ImageNet 2012. Our method demonstrates superior performance gains over previous ones. In particular, it achieves $4.7\times$ FLOPs reduction and $2.9\times$ compression on ResNet-50 with only a Top-5 accuracy drop of 0.35% on ImageNet 2012, which significantly outperforms state-of-the-art methods.
Tasks Model Compression
Published 2018-12-11
URL http://arxiv.org/abs/1812.04368v2
PDF http://arxiv.org/pdf/1812.04368v2.pdf
PWC https://paperswithcode.com/paper/exploiting-kernel-sparsity-and-entropy-for
Repo https://github.com/yuchaoli/KSE
Framework pytorch

Fictitious GAN: Training GANs with Historical Models

Title Fictitious GAN: Training GANs with Historical Models
Authors Hao Ge, Yin Xia, Xu Chen, Randall Berry, Ying Wu
Abstract Generative adversarial networks (GANs) are powerful tools for learning generative models. In practice, the training may suffer from lack of convergence. GANs are commonly viewed as a two-player zero-sum game between two neural networks. Here, we leverage this game theoretic view to study the convergence behavior of the training process. Inspired by the fictitious play learning process, a novel training method, referred to as Fictitious GAN, is introduced. Fictitious GAN trains the deep neural networks using a mixture of historical models. Specifically, the discriminator (resp. generator) is updated according to the best-response to the mixture outputs from a sequence of previously trained generators (resp. discriminators). It is shown that Fictitious GAN can effectively resolve some convergence issues that cannot be resolved by the standard training approach. It is proved that asymptotically the average of the generator outputs has the same distribution as the data samples.
Tasks
Published 2018-03-23
URL http://arxiv.org/abs/1803.08647v2
PDF http://arxiv.org/pdf/1803.08647v2.pdf
PWC https://paperswithcode.com/paper/fictitious-gan-training-gans-with-historical
Repo https://github.com/pijel/fGAN
Framework pytorch
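
The fictitious play process that the abstract cites as inspiration can be illustrated on a tiny zero-sum game. The following is a minimal sketch, not the paper's GAN training procedure: it assumes rock-paper-scissors as a stand-in game, and the names `PAYOFF`, `best_response`, and `fictitious_play` are hypothetical. Each player repeatedly best-responds to the opponent's historical action counts, and the empirical action frequencies drift toward the uniform equilibrium mixture.

```python
# Fictitious play on rock-paper-scissors (illustrative toy, not Fictitious GAN itself).
# PAYOFF[i][j] is the row player's payoff; the column player receives the negative.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def best_response(values):
    """Index of the action with the highest value (ties go to the lowest index)."""
    return max(range(len(values)), key=lambda a: values[a])


def fictitious_play(rounds):
    """Return the empirical mixed strategies of both players after `rounds` rounds."""
    row_counts = [1, 0, 0]  # arbitrary initial play (assumption)
    col_counts = [1, 0, 0]
    for _ in range(rounds):
        # Each player best-responds to the opponent's historical action counts.
        row_values = [sum(PAYOFF[i][j] * col_counts[j] for j in range(3)) for i in range(3)]
        col_values = [sum(-PAYOFF[i][j] * row_counts[i] for i in range(3)) for j in range(3)]
        row_counts[best_response(row_values)] += 1
        col_counts[best_response(col_values)] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return [c / row_total for c in row_counts], [c / col_total for c in col_counts]


row_mix, col_mix = fictitious_play(20000)
print(row_mix, col_mix)
```

In the GAN setting described above, the "historical action counts" would correspond to a mixture over previously trained generators or discriminators; this toy version only shows the averaging idea on a matrix game.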

SMART: An Open Source Data Labeling Platform for Supervised Learning

Title SMART: An Open Source Data Labeling Platform for Supervised Learning
Authors Rob Chew, Michael Wenger, Caroline Kery, Jason Nance, Keith Richards, Emily Hadley, Peter Baumgartner
Abstract SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates inter-rater reliability statistics to provide insight into label quality. SMART is designed to be platform agnostic and easily deployable to meet the needs of as many different research teams as possible. The project website contains links to the code repository and extensive user documentation.
Tasks Active Learning
Published 2018-12-11
URL http://arxiv.org/abs/1812.06591v1
PDF http://arxiv.org/pdf/1812.06591v1.pdf
PWC https://paperswithcode.com/paper/smart-an-open-source-data-labeling-platform
Repo https://github.com/XDgov/ML-NLP-Resource-List
Framework none

But How Does It Work in Theory? Linear SVM with Random Features

Title But How Does It Work in Theory? Linear SVM with Random Features
Authors Yitong Sun, Anna Gilbert, Ambuj Tewari
Abstract We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve the learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used. Our work extends the previous fast rate analysis of random features method from least square loss to 0-1 loss. We also show that the reweighted feature selection method, which approximates the optimized feature map, helps improve the performance of RFSVM in experiments on a synthetic data set.
Tasks Feature Selection
Published 2018-09-12
URL http://arxiv.org/abs/1809.04481v3
PDF http://arxiv.org/pdf/1809.04481v3.pdf
PWC https://paperswithcode.com/paper/but-how-does-it-work-in-theory-linear-svm
Repo https://github.com/syitong/randfourier
Framework none
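
To make "random features" concrete, here is a minimal sketch of plain random Fourier features for a Gaussian (RBF) kernel. It is an assumption that this standard construction is the relevant instance; it does not implement the optimized or reweighted feature map the abstract analyzes, and the names `make_feature_map` and `rbf_kernel` are hypothetical. A linear SVM would then be trained on the outputs of `phi`.

```python
import math
import random


def make_feature_map(dim, num_features, sigma, seed=0):
    """Build a map x -> z(x) whose inner products approximate an RBF kernel."""
    rng = random.Random(seed)
    # Frequencies drawn from N(0, I / sigma^2), phases uniform on [0, 2*pi].
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def rbf_kernel(x, y, sigma):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


phi = make_feature_map(dim=2, num_features=5000, sigma=1.0)
x, y = [0.3, -0.5], [1.0, 0.2]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # inner product of random features
exact = rbf_kernel(x, y, sigma=1.0)
print(approx, exact)
```

The approximation error shrinks roughly like one over the square root of the number of features, which is why the number of features needed relative to the sample size is the quantity of interest.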

Open Set Domain Adaptation by Backpropagation

Title Open Set Domain Adaptation by Backpropagation
Authors Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada
Abstract Numerous algorithms have been proposed for transferring knowledge from a label-rich domain (source) to a label-scarce domain (target). Almost all of them are proposed for a closed-set scenario, where the source and the target domain completely share the class of their samples. We call the shared class the “known class.” However, in practice, when samples in the target domain are not labeled, we cannot know whether the domains share the class. A target domain can contain samples of classes that are not shared by the source domain. We call such classes the “unknown class” and algorithms that work well in the open set situation are very practical. However, most existing distribution matching methods for domain adaptation do not work well in this setting because unknown target samples should not be aligned with the source. In this paper, we propose a method for an open set domain adaptation scenario which utilizes adversarial training. A classifier is trained to make a boundary between the source and the target samples whereas a generator is trained to make target samples far from the boundary. Thus, we assign two options to the feature generator: aligning them with source known samples or rejecting them as unknown target samples. This approach allows extracting features that separate unknown target samples from known target samples. Our method was extensively evaluated in domain adaptation setting and outperformed other methods by a large margin in most settings.
Tasks Domain Adaptation
Published 2018-04-27
URL http://arxiv.org/abs/1804.10427v2
PDF http://arxiv.org/pdf/1804.10427v2.pdf
PWC https://paperswithcode.com/paper/open-set-domain-adaptation-by-backpropagation
Repo https://github.com/ChenJinBIT/OSDA
Framework tf

Studying the Plasticity in Deep Convolutional Neural Networks using Random Pruning

Title Studying the Plasticity in Deep Convolutional Neural Networks using Random Pruning
Authors Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran
Abstract Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, l1-norm) and retain only the top ranked filters. Once the low scoring filters are pruned away the remainder of the network is fine tuned and is shown to give performance comparable to the original unpruned network. In this work, we report experiments which suggest that the comparable performance of the pruned network is not due to the specific criterion chosen but due to the inherent plasticity of deep neural networks which allows them to recover from the loss of pruned filters once the rest of the filters are fine-tuned. Specifically we show counter-intuitive results wherein by randomly pruning 25-50% filters from deep CNNs we are able to obtain the same performance as obtained by using state-of-the-art pruning methods. We empirically validate our claims by doing an exhaustive evaluation with VGG-16 and ResNet-50. We also evaluate a real world scenario where a CNN trained on all 1000 ImageNet classes needs to be tested on only a small set of classes at test time (say, only animals). We create a new benchmark dataset from ImageNet to evaluate such class specific pruning and show that even here a random pruning strategy gives close to state-of-the-art performance. Unlike existing approaches which mainly focus on the task of image classification, in this work we also report results on object detection and image segmentation. We show that using a simple random pruning strategy we can achieve significant speed up in object detection (74% improvement in fps) while retaining the same accuracy as that of the original Faster RCNN model. Similarly we show that the performance of a pruned Segmentation Network (SegNet) is actually very similar to that of the original unpruned SegNet.
Tasks Image Classification, Object Detection, Semantic Segmentation
Published 2018-12-26
URL http://arxiv.org/abs/1812.10240v1
PDF http://arxiv.org/pdf/1812.10240v1.pdf
PWC https://paperswithcode.com/paper/studying-the-plasticity-in-deep-convolutional
Repo https://github.com/marcoancona/TorchPruner
Framework pytorch
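
The two filter-selection strategies contrasted in the abstract (ranking by l1-norm versus pruning at random) can be sketched as follows. This is an illustrative sketch only: filters are assumed to be flattened lists of weights, the function names are hypothetical, and the fine-tuning step that follows pruning in the paper is omitted.

```python
import random


def l1_norm(filter_weights):
    """Sum of absolute weights of one (flattened) convolutional filter."""
    return sum(abs(w) for w in filter_weights)


def keep_by_l1(filters, keep_fraction):
    """Indices of the filters with the largest l1-norm (criterion-based pruning)."""
    n_keep = max(1, int(len(filters) * keep_fraction))
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True)
    return sorted(ranked[:n_keep])


def keep_at_random(filters, keep_fraction, seed=0):
    """Indices of a uniformly random subset of filters (random pruning)."""
    n_keep = max(1, int(len(filters) * keep_fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(filters)), n_keep))


# Toy layer with four flattened filters (made-up weights).
filters = [[0.1, -0.2], [1.5, -1.0], [0.0, 0.05], [-0.7, 0.9]]
print(keep_by_l1(filters, keep_fraction=0.5))      # keeps the two largest-norm filters
print(keep_at_random(filters, keep_fraction=0.5))  # keeps two filters chosen at random
```

Both functions return the same number of surviving filters, so the pruned networks have identical cost; the abstract's claim is that, after fine-tuning, which filters survive matters less than expected.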

Global Encoding for Abstractive Summarization

Title Global Encoding for Abstractive Summarization
Authors Junyang Lin, Xu Sun, Shuming Ma, Qi Su
Abstract In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance. To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context. It consists of a convolutional gated unit to perform global encoding to improve the representations of the source-side information. Evaluations on the LCSTS and the English Gigaword both demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of reducing repetition.
Tasks Abstractive Text Summarization
Published 2018-05-10
URL http://arxiv.org/abs/1805.03989v2
PDF http://arxiv.org/pdf/1805.03989v2.pdf
PWC https://paperswithcode.com/paper/global-encoding-for-abstractive-summarization
Repo https://github.com/wuhao050698/Abstractive-Summarization
Framework pytorch

Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing

Title Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing
Authors Jonas Mueller, Vasilis Syrgkanis, Matt Taddy
Abstract We consider dynamic pricing with many products under an evolving but low-dimensional demand model. Assuming the temporal variation in cross-elasticities exhibits low-rank structure based on fixed (latent) features of the products, we show that the revenue maximization problem reduces to an online bandit convex optimization with side information given by the observed demands. We design dynamic pricing algorithms whose revenue approaches that of the best fixed price vector in hindsight, at a rate that only depends on the intrinsic rank of the demand model and not the number of products. Our approach applies a bandit convex optimization algorithm in a projected low-dimensional space spanned by the latent product features, while simultaneously learning this span via online singular value decomposition of a carefully-crafted matrix containing the observed demands.
Tasks
Published 2018-01-30
URL https://arxiv.org/abs/1801.10242v2
PDF https://arxiv.org/pdf/1801.10242v2.pdf
PWC https://paperswithcode.com/paper/low-rank-bandit-methods-for-high-dimensional
Repo https://github.com/jwmueller/BanditDynamicPricing
Framework none

UniParse: A universal graph-based parsing toolkit

Title UniParse: A universal graph-based parsing toolkit
Authors Daniel Varab, Natalie Schluter
Abstract This paper describes the design and use of the graph-based parsing framework and toolkit UniParse, released as an open-source python software package. UniParse as a framework novelly streamlines research prototyping, development and evaluation of graph-based dependency parsing architectures. UniParse does this by enabling highly efficient, sufficiently independent, easily readable, and easily extensible implementations for all dependency parser components. We distribute the toolkit with ready-made configurations as re-implementations of all current state-of-the-art first-order graph-based parsers, including even more efficient Cython implementations of both encoders and decoders, as well as the required specialised loss functions.
Tasks Dependency Parsing
Published 2018-07-11
URL http://arxiv.org/abs/1807.04053v1
PDF http://arxiv.org/pdf/1807.04053v1.pdf
PWC https://paperswithcode.com/paper/uniparse-a-universal-graph-based-parsing
Repo https://github.com/ITUnlp/UniParse
Framework none

Integrative Analysis of Patient Health Records and Neuroimages via Memory-based Graph Convolutional Network

Title Integrative Analysis of Patient Health Records and Neuroimages via Memory-based Graph Convolutional Network
Authors Xi Sheryl Zhang, Jingyuan Chou, Fei Wang
Abstract With the arrival of the big data era, more and more data are becoming readily available in various real-world applications and those data are usually highly heterogeneous. Taking computational medicine as an example, we have both Electronic Health Records (EHR) and medical images for each patient. For complicated diseases such as Parkinson’s and Alzheimer’s, both EHR and neuroimaging information are very important for disease understanding because they contain complementary aspects of the disease. However, EHR and neuroimage are completely different. So far the existing research has been mainly focusing on one of them. In this paper, we propose a framework, Memory-Based Graph Convolution Network (MemGCN), to perform integrative analysis with such multi-modal data. Specifically, GCN is used to extract useful information from the patients’ neuroimages. The information contained in the patient EHRs before the acquisition of each brain image is captured by a memory network because of its sequential nature. The information contained in each brain image is combined with the information read out from the memory network to infer the disease state at the image acquisition timestamp. To further enhance the analytical power of MemGCN, we also designed a multi-hop strategy that allows multiple reads and updates of the memory to be performed at each iteration. We conduct experiments using the patient data from the Parkinson’s Progression Markers Initiative (PPMI) with the task of classification of Parkinson’s Disease (PD) cases versus controls. We demonstrate that superior classification performance can be achieved with our proposed framework, compared with existing approaches involving a single type of data.
Tasks
Published 2018-09-17
URL https://arxiv.org/abs/1809.06018v4
PDF https://arxiv.org/pdf/1809.06018v4.pdf
PWC https://paperswithcode.com/paper/integrative-analysis-of-patient-health
Repo https://github.com/sheryl-ai/MemGCN
Framework tf

Medical Image Imputation from Image Collections

Title Medical Image Imputation from Image Collections
Authors Adrian V. Dalca, Katherine L. Bouman, William T. Freeman, Natalia S. Rost, Mert R. Sabuncu, Polina Golland
Abstract We present an algorithm for creating high resolution anatomically plausible images consistent with acquired clinical brain MRI scans with large inter-slice spacing. Although large data sets of clinical images contain a wealth of information, time constraints during acquisition result in sparse scans that fail to capture much of the anatomy. These characteristics often render computational analysis impractical as many image analysis algorithms tend to fail when applied to such images. Highly specialized algorithms that explicitly handle sparse slice spacing do not generalize well across problem domains. In contrast, we aim to enable application of existing algorithms that were originally developed for high resolution research scans to significantly undersampled scans. We introduce a generative model that captures fine-scale anatomical structure across subjects in clinical image collections and derive an algorithm for filling in the missing data in scans with large inter-slice spacing. Our experimental results demonstrate that the resulting method outperforms state-of-the-art upsampling super-resolution techniques, and promises to facilitate subsequent analysis not previously possible with scans of this quality. Our implementation is freely available at https://github.com/adalca/papago .
Tasks Image Imputation, Imputation, Super-Resolution
Published 2018-08-17
URL http://arxiv.org/abs/1808.05732v1
PDF http://arxiv.org/pdf/1808.05732v1.pdf
PWC https://paperswithcode.com/paper/medical-image-imputation-from-image
Repo https://github.com/adalca/patchlib
Framework none