Paper Group AWR 310
Distractor-aware Siamese Networks for Visual Object Tracking
Title | Distractor-aware Siamese Networks for Visual Object Tracking |
Authors | Zheng Zhu, Qiang Wang, Bo Li, Wei Wu, Junjie Yan, Weiming Hu |
Abstract | Recently, Siamese networks have drawn great attention in the visual tracking community because of their balanced accuracy and speed. However, the features used in most Siamese tracking approaches can only discriminate the foreground from non-semantic backgrounds. Semantic backgrounds are always considered distractors, which hinders the robustness of Siamese trackers. In this paper, we focus on learning distractor-aware Siamese networks for accurate and long-term tracking. To this end, we first analyze the features used in traditional Siamese trackers. We observe that the imbalanced distribution of training data makes the learned features less discriminative. During the off-line training phase, an effective sampling strategy is introduced to control this distribution and make the model focus on semantic distractors. During inference, a novel distractor-aware module is designed to perform incremental learning, which can effectively transfer the general embedding to the current video domain. In addition, we extend the proposed approach to long-term tracking by introducing a simple yet effective local-to-global search region strategy. Extensive experiments on benchmarks show that our approach significantly outperforms the state of the art, yielding a 9.6% relative gain on the VOT2016 dataset and a 35.9% relative gain on the UAV20L dataset. The proposed tracker can perform at 160 FPS on short-term benchmarks and 110 FPS on long-term benchmarks. |
Tasks | Object Tracking, Visual Object Tracking, Visual Tracking |
Published | 2018-08-18 |
URL | http://arxiv.org/abs/1808.06048v1 |
http://arxiv.org/pdf/1808.06048v1.pdf | |
PWC | https://paperswithcode.com/paper/distractor-aware-siamese-networks-for-visual |
Repo | https://github.com/foolwood/DaSiamRPN |
Framework | pytorch |
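A minimal NumPy sketch of the distractor-aware re-ranking idea at inference time: candidate scores are penalised by their similarity to previously collected distractors. The embeddings, the single weight `alpha`, and the uniform distractor weighting are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def distractor_aware_score(exemplar, candidates, distractors, alpha=0.5):
    """Re-rank candidates by target similarity minus distractor similarity.

    exemplar:    (d,) embedding of the tracked target
    candidates:  (n, d) embeddings of proposals in the search region
    distractors: (k, d) embeddings of hard negatives from earlier frames
    """
    target_sim = candidates @ exemplar               # similarity to the target
    if len(distractors) == 0:
        return target_sim
    distractor_sim = candidates @ distractors.T      # similarity to each distractor
    return target_sim - alpha * distractor_sim.mean(axis=1)

rng = np.random.default_rng(0)
scores = distractor_aware_score(rng.normal(size=16),
                                rng.normal(size=(100, 16)),
                                rng.normal(size=(5, 16)))
best = int(np.argmax(scores))
```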
DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback
Title | DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback |
Authors | Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, Shin-ichi Maeda |
Abstract | Exploration has been one of the greatest challenges in reinforcement learning (RL), and it remains a large obstacle to applying RL to robotics. Even with state-of-the-art RL algorithms, building a well-learned agent often requires too many trials, mainly due to the difficulty of matching actions with rewards in the distant future. A remedy is to train an agent with real-time feedback from a human observer who immediately rewards some actions. This study tackles a series of challenges in introducing such a human-in-the-loop RL scheme. The first contribution of this work is a set of experiments with a precisely modeled human observer, capturing binary feedback, delay, stochasticity, unsustainability, and natural reactions. We also propose an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards. We find that DQN-TAMER agents outperform their baselines in simulated Maze and Taxi environments. Furthermore, we demonstrate a real-world human-in-the-loop RL application where a camera automatically recognizes a user’s facial expressions as feedback to the agent while the agent explores a maze. |
Tasks | |
Published | 2018-10-28 |
URL | http://arxiv.org/abs/1810.11748v1 |
http://arxiv.org/pdf/1810.11748v1.pdf | |
PWC | https://paperswithcode.com/paper/dqn-tamer-human-in-the-loop-reinforcement |
Repo | https://github.com/JulienDesvergnes/human-reinforcement-learning |
Framework | tf |
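A hedged sketch of the action-selection rule the abstract implies: blend a Q-network that models distant rewards with a feedback network that models immediate human reward, annealing the human weight over time. The names `q_net`/`h_net` and the decay schedule are assumptions for illustration, not the authors' exact hyperparameters.

```python
import numpy as np

class DQNTamerAgent:
    def __init__(self, q_net, h_net, beta=0.9, decay=0.9999):
        self.q_net = q_net    # estimates long-term return Q(s, a)
        self.h_net = h_net    # estimates expected human feedback H(s, a)
        self.beta = beta      # current weight on the human-feedback model
        self.decay = decay    # anneal toward pure Q-learning over time

    def act(self, state):
        q, h = self.q_net(state), self.h_net(state)
        self.beta *= self.decay
        return int(np.argmax((1.0 - self.beta) * q + self.beta * h))

# Toy stand-ins for trained networks over a 3-action environment.
agent = DQNTamerAgent(q_net=lambda s: np.array([0.1, 0.5, 0.2]),
                      h_net=lambda s: np.array([0.9, 0.0, 0.1]))
action = agent.act(state=None)
```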
A multi-contrast MRI approach to thalamus segmentation
Title | A multi-contrast MRI approach to thalamus segmentation |
Authors | Veronica Corona, Jan Lellmann, Peter Nestor, Carola-Bibiane Schoenlieb, Julio Acosta-Cabronero |
Abstract | Thalamic alterations are relevant to many neurological disorders including Alzheimer’s disease, Parkinson’s disease and multiple sclerosis. Routine interventions to improve symptom severity in movement disorders, for example, often consist of surgery or deep brain stimulation to diencephalic nuclei. Therefore, accurate delineation of grey matter thalamic subregions is of the utmost clinical importance. MRI is highly appropriate for structural segmentation as it provides different views of the anatomy from a single scanning session. With several contrasts potentially available, it is also increasingly important to develop new image segmentation techniques that can operate multi-spectrally. We hereby propose a new segmentation method for use with multi-modality data, which we evaluated for automated segmentation of major thalamic subnuclear groups using T1-, T2*-weighted and quantitative susceptibility mapping (QSM) information. The proposed method consists of four steps: highly iterative image co-registration, manual segmentation on the average training-data template, supervised learning for pattern recognition, and a final convex optimisation step imposing further spatial constraints to refine the solution. This led to solutions in greater agreement with manual segmentation than the standard Morel-atlas-based approach. Furthermore, we show that the multi-contrast approach boosts segmentation performance. We then investigated whether prior knowledge using the training-template contours could further improve convex segmentation accuracy and robustness, which led to highly precise multi-contrast segmentations in single subjects. This approach can be extended to most 3D imaging data types and any region of interest discernible in single scans or multi-subject templates. |
Tasks | Semantic Segmentation |
Published | 2018-07-27 |
URL | http://arxiv.org/abs/1807.10757v1 |
http://arxiv.org/pdf/1807.10757v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-contrast-mri-approach-to-thalamus |
Repo | https://github.com/veronicacorona/multicontrastSegmentation |
Framework | none |
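To make the supervised-learning step concrete, here is an illustrative sketch of voxelwise multi-contrast classification: co-registered T1, T2* and QSM intensities are stacked as features for an off-the-shelf classifier. The registration, template contours, and final convex spatial-regularisation step from the paper are not reproduced; the random data and the random-forest choice are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def voxel_features(t1, t2star, qsm):
    """Stack co-registered contrasts into an (n_voxels, 3) feature matrix."""
    return np.stack([t1.ravel(), t2star.ravel(), qsm.ravel()], axis=1)

# Hypothetical training volumes plus a manually labelled subnuclei map.
rng = np.random.default_rng(0)
t1, t2s, qsm = (rng.random((32, 32, 32)) for _ in range(3))
labels = rng.integers(0, 5, size=32 ** 3)   # 0 = background, 1..4 = nuclear groups

clf = RandomForestClassifier(n_estimators=50).fit(voxel_features(t1, t2s, qsm), labels)
pred = clf.predict(voxel_features(t1, t2s, qsm)).reshape(32, 32, 32)
```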
Evolving simple programs for playing Atari games
Title | Evolving simple programs for playing Atari games |
Authors | Dennis G Wilson, Sylvain Cussat-Blanc, Hervé Luga, Julian F Miller |
Abstract | Cartesian Genetic Programming (CGP) has previously shown capabilities in image processing tasks by evolving programs with a function set specialized for computer vision. A similar approach can be applied to Atari playing. Programs are evolved using mixed-type CGP with a function set suited for matrix operations, including image processing, while still allowing controller behavior to emerge. While the programs are relatively small, many controllers are competitive with state-of-the-art methods on the Atari benchmark set and require less training time. By examining the programs of the best evolved individuals, simple but effective strategies can be found. |
Tasks | Atari Games |
Published | 2018-06-14 |
URL | http://arxiv.org/abs/1806.05695v1 |
http://arxiv.org/pdf/1806.05695v1.pdf | |
PWC | https://paperswithcode.com/paper/evolving-simple-programs-for-playing-atari |
Repo | https://github.com/JacobLaney/cgp-tetris |
Framework | none |
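A minimal sketch of Cartesian Genetic Programming in the spirit of the paper: a genome is a sequence of nodes, each choosing a function and two inputs from anything computed earlier. The paper's matrix-capable function set and Atari screen inputs are reduced to scalar toy operations here.

```python
import numpy as np

FUNCS = [np.add, np.subtract, np.multiply, np.maximum]

def evaluate(genome, inputs):
    """genome: list of (func_idx, in1, in2) triples; returns the last node's value."""
    values = list(inputs)
    for f_idx, i1, i2 in genome:
        values.append(FUNCS[f_idx](values[i1], values[i2]))
    return values[-1]

rng = np.random.default_rng(1)
n_in, n_nodes = 4, 8
genome = [(rng.integers(len(FUNCS)),     # which function this node applies
           rng.integers(n_in + i),       # first connection to an earlier value
           rng.integers(n_in + i))       # second connection
          for i in range(n_nodes)]
action_score = evaluate(genome, rng.normal(size=n_in))
```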
Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression
Title | Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression |
Authors | Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji |
Abstract | Compressing convolutional neural networks (CNNs) has received ever-increasing research focus. However, most existing CNN compression methods do not interpret their inherent structures to distinguish the implicit redundancy. In this paper, we investigate the problem of CNN compression from a novel interpretable perspective. The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression. Kernel clustering is further conducted based on the KSE indicator to accomplish high-precision CNN compression. KSE is capable of simultaneously compressing each layer in an efficient way, which is significantly faster compared to previous data-driven feature map pruning methods. We comprehensively evaluate the compression and speedup of the proposed method on CIFAR-10, SVHN and ImageNet 2012. Our method demonstrates superior performance gains over previous ones. In particular, it achieves a 4.7× FLOPs reduction and 2.9× compression on ResNet-50 with only a Top-5 accuracy drop of 0.35% on ImageNet 2012, which significantly outperforms state-of-the-art methods. |
Tasks | Model Compression |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.04368v2 |
http://arxiv.org/pdf/1812.04368v2.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-kernel-sparsity-and-entropy-for |
Repo | https://github.com/yuchaoli/KSE |
Framework | pytorch |
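A simplified sketch of the KSE indicator: for each input channel of a convolution, combine the sparsity of its 2D kernels with an entropy term measuring kernel diversity, and rank channels for compression. The distance-based density estimate and the normalisation below are loose approximations of the paper's exact definitions.

```python
import numpy as np

def kse_indicator(weight, k=5, alpha=1.0):
    """weight: (out_ch, in_ch, kh, kw) conv tensor -> one score per input channel."""
    out_ch, in_ch = weight.shape[:2]
    scores = np.empty(in_ch)
    for c in range(in_ch):
        kernels = weight[:, c].reshape(out_ch, -1)      # the 2D kernels for channel c
        sparsity = np.abs(kernels).sum()                # s_c: kernel sparsity
        dists = np.linalg.norm(kernels[:, None] - kernels[None], axis=-1)
        density = np.sort(dists, axis=1)[:, 1:k + 1].sum(axis=1)  # k-NN distances
        p = density / density.sum()
        entropy = -np.sum(p * np.log(p + 1e-12))        # e_c: kernel entropy
        scores[c] = np.sqrt(sparsity / (1.0 + alpha * entropy))
    return scores

w = np.random.default_rng(0).normal(size=(64, 32, 3, 3))
channel_scores = kse_indicator(w)   # low-score channels are compression candidates
```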
Fictitious GAN: Training GANs with Historical Models
Title | Fictitious GAN: Training GANs with Historical Models |
Authors | Hao Ge, Yin Xia, Xu Chen, Randall Berry, Ying Wu |
Abstract | Generative adversarial networks (GANs) are powerful tools for learning generative models. In practice, however, training may suffer from a lack of convergence. GANs are commonly viewed as a two-player zero-sum game between two neural networks. Here, we leverage this game-theoretic view to study the convergence behavior of the training process. Inspired by the fictitious play learning process, a novel training method, referred to as Fictitious GAN, is introduced. Fictitious GAN trains the deep neural networks using a mixture of historical models. Specifically, the discriminator (resp. generator) is updated according to the best response to the mixture outputs from a sequence of previously trained generators (resp. discriminators). It is shown that Fictitious GAN can effectively resolve some convergence issues that cannot be resolved by the standard training approach. It is also proved that, asymptotically, the average of the generator outputs has the same distribution as the data samples. |
Tasks | |
Published | 2018-03-23 |
URL | http://arxiv.org/abs/1803.08647v2 |
http://arxiv.org/pdf/1803.08647v2.pdf | |
PWC | https://paperswithcode.com/paper/fictitious-gan-training-gans-with-historical |
Repo | https://github.com/pijel/fGAN |
Framework | pytorch |
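A runnable toy sketch of the fictitious-play training loop the abstract describes: each player best-responds to a model sampled from the opponent's stored history, and the histories grow as training proceeds. The tiny networks, the 2-D toy data, and the update cadence are assumptions for illustration.

```python
import copy, random
import torch
import torch.nn as nn

latent = 8
G = nn.Sequential(nn.Linear(latent, 16), nn.ReLU(), nn.Linear(16, 2))
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)
g_history, d_history = [copy.deepcopy(G)], [copy.deepcopy(D)]

for step in range(100):
    real = torch.randn(64, 2) + 3.0                  # toy "real" distribution
    z = torch.randn(64, latent)

    # Discriminator best-responds to a generator drawn from the history.
    fake_old = random.choice(g_history)(z).detach()
    d_loss = -(torch.log(D(real) + 1e-8).mean()
               + torch.log(1 - D(fake_old) + 1e-8).mean())
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator best-responds to a discriminator drawn from the history.
    D_old = random.choice(d_history)
    g_loss = torch.log(1 - D_old(G(z)) + 1e-8).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    if step % 10 == 0:                               # grow the historical mixtures
        g_history.append(copy.deepcopy(G))
        d_history.append(copy.deepcopy(D))
```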
SMART: An Open Source Data Labeling Platform for Supervised Learning
Title | SMART: An Open Source Data Labeling Platform for Supervised Learning |
Authors | Rob Chew, Michael Wenger, Caroline Kery, Jason Nance, Keith Richards, Emily Hadley, Peter Baumgartner |
Abstract | SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates inter-rater reliability statistics to provide insight into label quality. SMART is designed to be platform agnostic and easily deployable to meet the needs of as many different research teams as possible. The project website contains links to the code repository and extensive user documentation. |
Tasks | Active Learning |
Published | 2018-12-11 |
URL | http://arxiv.org/abs/1812.06591v1 |
http://arxiv.org/pdf/1812.06591v1.pdf | |
PWC | https://paperswithcode.com/paper/smart-an-open-source-data-labeling-platform |
Repo | https://github.com/XDgov/ML-NLP-Resource-List |
Framework | none |
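SMART's active-learning support boils down to loops like the following generic uncertainty-sampling sketch (an illustration of the technique, not SMART's internal code); the model choice and the least-confidence criterion are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pick_next_batch(model, candidates, batch_size=10):
    """Return indices of the candidates the model is least confident about."""
    probs = model.predict_proba(candidates)
    uncertainty = 1.0 - probs.max(axis=1)    # least-confidence criterion
    return np.argsort(-uncertainty)[:batch_size]

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 5)), rng.integers(0, 2, 500)
seed = list(range(20))                       # small seed set a human has labelled
model = LogisticRegression().fit(X[seed], y[seed])
to_label = pick_next_batch(model, np.delete(X, seed, axis=0))  # send to annotators
```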
But How Does It Work in Theory? Linear SVM with Random Features
Title | But How Does It Work in Theory? Linear SVM with Random Features |
Authors | Yitong Sun, Anna Gilbert, Ambuj Tewari |
Abstract | We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve a learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used. Our work extends the previous fast-rate analysis of the random features method from least squares loss to the 0-1 loss. We also show that the reweighted feature selection method, which approximates the optimized feature map, helps improve the performance of RFSVM in experiments on a synthetic data set. |
Tasks | Feature Selection |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04481v3 |
http://arxiv.org/pdf/1809.04481v3.pdf | |
PWC | https://paperswithcode.com/paper/but-how-does-it-work-in-theory-linear-svm |
Repo | https://github.com/syitong/randfourier |
Framework | none |
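A compact sketch of the RFSVM setup the paper analyses: approximate a Gaussian kernel with $N\ll m$ random Fourier features, then fit a linear SVM on the transformed data. The synthetic dataset and hyperparameters are illustrative, and this plain sampler omits the optimized/reweighted feature map the paper studies.

```python
import numpy as np
from sklearn.svm import LinearSVC

def random_fourier_features(X, N, gamma=1.0, rng=None):
    """Map X (m, d) to (m, N) features approximating an RBF kernel."""
    rng = rng or np.random.default_rng(0)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], N))
    b = rng.uniform(0, 2 * np.pi, size=N)
    return np.sqrt(2.0 / N) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 2).astype(int)   # a nonlinear decision rule

Z = random_fourier_features(X, N=100)               # N << m, as in the analysis
clf = LinearSVC(dual=False).fit(Z, y)
print("train accuracy:", clf.score(Z, y))
```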
Open Set Domain Adaptation by Backpropagation
Title | Open Set Domain Adaptation by Backpropagation |
Authors | Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada |
Abstract | Numerous algorithms have been proposed for transferring knowledge from a label-rich domain (source) to a label-scarce domain (target). Almost all of them are proposed for a closed-set scenario, where the source and the target domain completely share the class of their samples. We call the shared class the “known class.” However, in practice, when samples in the target domain are not labeled, we cannot know whether the domains share the class. A target domain can contain samples of classes that are not shared by the source domain. We call such classes the “unknown class,” and algorithms that work well in this open-set situation are very practical. However, most existing distribution-matching methods for domain adaptation do not work well in this setting because unknown target samples should not be aligned with the source. In this paper, we propose a method for the open set domain adaptation scenario which utilizes adversarial training. A classifier is trained to make a boundary between the source and the target samples, whereas a generator is trained to make target samples far from the boundary. Thus, we assign two options to the feature generator: aligning target samples with known source samples or rejecting them as unknown target samples. This approach allows extracting features that separate unknown target samples from known target samples. Our method was extensively evaluated in domain adaptation settings and outperformed other methods by a large margin in most settings. |
Tasks | Domain Adaptation |
Published | 2018-04-27 |
URL | http://arxiv.org/abs/1804.10427v2 |
http://arxiv.org/pdf/1804.10427v2.pdf | |
PWC | https://paperswithcode.com/paper/open-set-domain-adaptation-by-backpropagation |
Repo | https://github.com/ChenJinBIT/OSDA |
Framework | tf |
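A schematic PyTorch sketch of the adversarial objective the abstract describes: the classifier gets an extra “unknown” class and pulls target samples' unknown probability toward a boundary t, while a gradient-reversal layer makes the feature generator push it away. The toy networks, dimensions, and t = 0.5 are assumptions.

```python
import torch
import torch.nn as nn

t = 0.5                                              # boundary for p(unknown)
num_known = 10
G = nn.Sequential(nn.Linear(32, 64), nn.ReLU())      # feature generator
C = nn.Linear(64, num_known + 1)                     # classifier + "unknown" class

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad                                 # flip gradients into the generator

def open_set_loss(x_src, y_src, x_tgt):
    # Source: ordinary supervised classification over the known classes.
    cls_loss = nn.functional.cross_entropy(C(G(x_src)), y_src)
    # Target: classifier pulls p(unknown) toward t; generator pushes it away.
    p_unknown = torch.softmax(C(GradReverse.apply(G(x_tgt))), dim=1)[:, -1]
    adv_loss = nn.functional.binary_cross_entropy(
        p_unknown, torch.full_like(p_unknown, t))
    return cls_loss + adv_loss

loss = open_set_loss(torch.randn(8, 32),
                     torch.randint(0, num_known, (8,)),
                     torch.randn(8, 32))
loss.backward()
```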
Studying the Plasticity in Deep Convolutional Neural Networks using Random Pruning
Title | Studying the Plasticity in Deep Convolutional Neural Networks using Random Pruning |
Authors | Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran |
Abstract | Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, l1-norm) and retain only the top-ranked filters. Once the low-scoring filters are pruned away, the remainder of the network is fine-tuned and is shown to give performance comparable to the original unpruned network. In this work, we report experiments which suggest that the comparable performance of the pruned network is not due to the specific criterion chosen but due to the inherent plasticity of deep neural networks, which allows them to recover from the loss of pruned filters once the rest of the filters are fine-tuned. Specifically, we show counter-intuitive results wherein by randomly pruning 25-50% of the filters from deep CNNs we are able to obtain the same performance as state-of-the-art pruning methods. We empirically validate our claims with an exhaustive evaluation on VGG-16 and ResNet-50. We also evaluate a real-world scenario where a CNN trained on all 1000 ImageNet classes needs to be tested on only a small set of classes at test time (say, only animals). We create a new benchmark dataset from ImageNet to evaluate such class-specific pruning and show that even here a random pruning strategy gives close to state-of-the-art performance. Unlike existing approaches, which mainly focus on the task of image classification, in this work we also report results on object detection and image segmentation. We show that using a simple random pruning strategy we can achieve a significant speed-up in object detection (a 74% improvement in fps) while retaining the same accuracy as the original Faster RCNN model. Similarly, we show that the performance of a pruned Segmentation Network (SegNet) is very similar to that of the original unpruned SegNet. |
Tasks | Image Classification, Object Detection, Semantic Segmentation |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10240v1 |
http://arxiv.org/pdf/1812.10240v1.pdf | |
PWC | https://paperswithcode.com/paper/studying-the-plasticity-in-deep-convolutional |
Repo | https://github.com/marcoancona/TorchPruner |
Framework | pytorch |
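A minimal sketch of the random pruning strategy the paper studies: zero out a random fraction of a convolution's filters and rely on fine-tuning to recover accuracy. Masking the weights (rather than physically removing filters and rewiring the next layer) keeps the sketch short.

```python
import torch
import torch.nn as nn

def random_prune_conv(conv: nn.Conv2d, fraction: float = 0.5):
    """Zero a random subset of output filters; return the pruned indices."""
    n_prune = int(fraction * conv.out_channels)
    idx = torch.randperm(conv.out_channels)[:n_prune]
    with torch.no_grad():
        conv.weight[idx] = 0.0
        if conv.bias is not None:
            conv.bias[idx] = 0.0
    return idx

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
pruned = random_prune_conv(conv, fraction=0.25)
# ...then fine-tune the network, keeping the pruned filters clamped at zero.
```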
Global Encoding for Abstractive Summarization
Title | Global Encoding for Abstractive Summarization |
Authors | Junyang Lin, Xu Sun, Shuming Ma, Qi Su |
Abstract | In neural abstractive summarization, the conventional sequence-to-sequence (seq2seq) model often suffers from repetition and semantic irrelevance. To tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context. It consists of a convolutional gated unit to perform global encoding to improve the representations of the source-side information. Evaluations on the LCSTS and the English Gigaword both demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of reducing repetition. |
Tasks | Abstractive Text Summarization |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03989v2 |
http://arxiv.org/pdf/1805.03989v2.pdf | |
PWC | https://paperswithcode.com/paper/global-encoding-for-abstractive-summarization |
Repo | https://github.com/wuhao050698/Abstractive-Summarization |
Framework | pytorch |
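A hedged sketch of the convolutional gated unit idea: convolve over the sequence of encoder states to gather context beyond each timestep, then gate the original states elementwise. The kernel size and the single-layer gating form are simplifications of the paper's module.

```python
import torch
import torch.nn as nn

class ConvGatedUnit(nn.Module):
    def __init__(self, hidden: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(hidden, hidden, kernel_size,
                              padding=kernel_size // 2)

    def forward(self, enc_states):                 # (batch, seq_len, hidden)
        ctx = self.conv(enc_states.transpose(1, 2)).transpose(1, 2)
        return enc_states * torch.sigmoid(ctx)     # gate each state by its context

gated = ConvGatedUnit(hidden=256)(torch.randn(4, 50, 256))
```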
Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing
Title | Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing |
Authors | Jonas Mueller, Vasilis Syrgkanis, Matt Taddy |
Abstract | We consider dynamic pricing with many products under an evolving but low-dimensional demand model. Assuming the temporal variation in cross-elasticities exhibits low-rank structure based on fixed (latent) features of the products, we show that the revenue maximization problem reduces to an online bandit convex optimization with side information given by the observed demands. We design dynamic pricing algorithms whose revenue approaches that of the best fixed price vector in hindsight, at a rate that only depends on the intrinsic rank of the demand model and not the number of products. Our approach applies a bandit convex optimization algorithm in a projected low-dimensional space spanned by the latent product features, while simultaneously learning this span via online singular value decomposition of a carefully-crafted matrix containing the observed demands. |
Tasks | |
Published | 2018-01-30 |
URL | https://arxiv.org/abs/1801.10242v2 |
https://arxiv.org/pdf/1801.10242v2.pdf | |
PWC | https://paperswithcode.com/paper/low-rank-bandit-methods-for-high-dimensional |
Repo | https://github.com/jwmueller/BanditDynamicPricing |
Framework | none |
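An illustrative sketch of the two intertwined pieces the abstract describes: estimate the low-dimensional span of demand from an SVD of the observed demand history, and take bandit-style price steps restricted to that span. The projected-gradient update below (which ignores the price-elasticity term of the revenue gradient) is a generic stand-in, not the paper's exact algorithm.

```python
import numpy as np

def estimate_span(demand_history, rank):
    """Top-`rank` left singular vectors of the stacked demand observations."""
    U, _, _ = np.linalg.svd(np.asarray(demand_history).T, full_matrices=False)
    return U[:, :rank]                        # (n_products, rank) basis of the span

def bandit_price_step(prices, demand, basis, lr=0.01):
    """Move prices along the revenue gradient projected onto the learned span."""
    revenue_grad = demand                     # d(p . q)/dp = q, ignoring dq/dp
    step = basis @ (basis.T @ revenue_grad)   # restrict the step to the span
    return prices + lr * step

rng = np.random.default_rng(0)
history = [rng.normal(size=50) for _ in range(30)]   # 30 rounds, 50 products
basis = estimate_span(history, rank=3)
prices = bandit_price_step(np.ones(50), history[-1], basis)
```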
UniParse: A universal graph-based parsing toolkit
Title | UniParse: A universal graph-based parsing toolkit |
Authors | Daniel Varab, Natalie Schluter |
Abstract | This paper describes the design and use of the graph-based parsing framework and toolkit UniParse, released as an open-source Python software package. As a framework, UniParse streamlines the research prototyping, development, and evaluation of graph-based dependency parsing architectures. It does this by enabling highly efficient, sufficiently independent, easily readable, and easily extensible implementations of all dependency parser components. We distribute the toolkit with ready-made configurations that re-implement all current state-of-the-art first-order graph-based parsers, including even more efficient Cython implementations of both encoders and decoders, as well as the required specialised loss functions. |
Tasks | Dependency Parsing |
Published | 2018-07-11 |
URL | http://arxiv.org/abs/1807.04053v1 |
http://arxiv.org/pdf/1807.04053v1.pdf | |
PWC | https://paperswithcode.com/paper/uniparse-a-universal-graph-based-parsing |
Repo | https://github.com/ITUnlp/UniParse |
Framework | none |
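UniParse factors a graph-based parser into an arc scorer, a decoder, and a loss. The sketch below mirrors that interface with a toy scorer and a greedy head-per-token decoder standing in for the real CLE/Eisner decoders the toolkit ships (greedy selection does not guarantee a well-formed tree).

```python
import numpy as np

def score_arcs(embeddings):
    """Arc-factored scores: score[h, d] for head h -> dependent d."""
    return embeddings @ embeddings.T          # toy scorer in place of a trained one

def greedy_decode(scores):
    """Pick the highest-scoring head per token; token 0 is the artificial ROOT."""
    heads = np.zeros(scores.shape[0], dtype=int)
    for dep in range(1, scores.shape[0]):
        column = scores[:, dep].copy()
        column[dep] = -np.inf                 # forbid self-loops
        heads[dep] = int(np.argmax(column))
    return heads

emb = np.random.default_rng(0).normal(size=(6, 8))   # ROOT + 5 tokens
heads = greedy_decode(score_arcs(emb))
```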
Integrative Analysis of Patient Health Records and Neuroimages via Memory-based Graph Convolutional Network
Title | Integrative Analysis of Patient Health Records and Neuroimages via Memory-based Graph Convolutional Network |
Authors | Xi Sheryl Zhang, Jingyuan Chou, Fei Wang |
Abstract | With the arrival of the big data era, more and more data are becoming readily available in various real-world applications, and those data are usually highly heterogeneous. Taking computational medicine as an example, we have both Electronic Health Records (EHR) and medical images for each patient. For complicated diseases such as Parkinson’s and Alzheimer’s, both EHR and neuroimaging information are very important for disease understanding because they contain complementary aspects of the disease. However, EHRs and neuroimages are completely different modalities, and so far existing research has mainly focused on one of them. In this paper, we propose a framework, Memory-Based Graph Convolution Network (MemGCN), to perform integrative analysis with such multi-modal data. Specifically, a GCN is used to extract useful information from the patients’ neuroimages. The information contained in the patient EHRs before the acquisition of each brain image is captured by a memory network because of its sequential nature. The information contained in each brain image is combined with the information read out from the memory network to infer the disease state at the image acquisition timestamp. To further enhance the analytical power of MemGCN, we also designed a multi-hop strategy that allows multiple reads and updates of the memory to be performed at each iteration. We conduct experiments using patient data from the Parkinson’s Progression Markers Initiative (PPMI) with the task of classifying Parkinson’s Disease (PD) cases versus controls. We demonstrate that superior classification performance can be achieved with our proposed framework compared with existing approaches involving a single type of data. |
Tasks | |
Published | 2018-09-17 |
URL | https://arxiv.org/abs/1809.06018v4 |
https://arxiv.org/pdf/1809.06018v4.pdf | |
PWC | https://paperswithcode.com/paper/integrative-analysis-of-patient-health |
Repo | https://github.com/sheryl-ai/MemGCN |
Framework | tf |
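A schematic sketch of MemGCN's multi-hop read-and-combine step as the abstract describes it: graph-convolved imaging features query a memory of sequential EHR embeddings, with several read/update hops before classification. The dimensions and the attention/update forms are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MultiHopMemoryRead(nn.Module):
    def __init__(self, dim: int, hops: int = 3):
        super().__init__()
        self.hops = hops
        self.update = nn.Linear(dim, dim)

    def forward(self, gcn_features, memory):
        # gcn_features: (batch, dim) from a GCN over the neuroimage
        # memory:       (batch, n_records, dim) EHR entries before the scan
        q = gcn_features
        for _ in range(self.hops):
            attn = torch.softmax(memory @ q.unsqueeze(-1), dim=1)  # (b, n, 1)
            read = (attn * memory).sum(dim=1)                      # weighted read-out
            q = torch.relu(self.update(q + read))                  # update the query
        return q                                                   # fused representation

fused = MultiHopMemoryRead(dim=64)(torch.randn(2, 64), torch.randn(2, 20, 64))
```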
Medical Image Imputation from Image Collections
Title | Medical Image Imputation from Image Collections |
Authors | Adrian V. Dalca, Katherine L. Bouman, William T. Freeman, Natalia S. Rost, Mert R. Sabuncu, Polina Golland |
Abstract | We present an algorithm for creating high resolution anatomically plausible images consistent with acquired clinical brain MRI scans with large inter-slice spacing. Although large data sets of clinical images contain a wealth of information, time constraints during acquisition result in sparse scans that fail to capture much of the anatomy. These characteristics often render computational analysis impractical as many image analysis algorithms tend to fail when applied to such images. Highly specialized algorithms that explicitly handle sparse slice spacing do not generalize well across problem domains. In contrast, we aim to enable application of existing algorithms that were originally developed for high resolution research scans to significantly undersampled scans. We introduce a generative model that captures fine-scale anatomical structure across subjects in clinical image collections and derive an algorithm for filling in the missing data in scans with large inter-slice spacing. Our experimental results demonstrate that the resulting method outperforms state-of-the-art upsampling super-resolution techniques, and promises to facilitate subsequent analysis not previously possible with scans of this quality. Our implementation is freely available at https://github.com/adalca/papago. |
Tasks | Image Imputation, Imputation, Super-Resolution |
Published | 2018-08-17 |
URL | http://arxiv.org/abs/1808.05732v1 |
http://arxiv.org/pdf/1808.05732v1.pdf | |
PWC | https://paperswithcode.com/paper/medical-image-imputation-from-image |
Repo | https://github.com/adalca/patchlib |
Framework | none |
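A toy sketch of the imputation setting: a clinical volume with large inter-slice spacing is filled in by blending interpolation between acquired slices with a prior learned from an image collection. The "prior" here is just the collection mean and the blend weight is fixed; the paper learns a much richer patch-based generative model. This sketch assumes the acquired indices include the first and last slices.

```python
import numpy as np

def impute_slices(sparse_vol, acquired_idx, collection):
    """Fill missing slices from collection statistics and acquired neighbours."""
    prior = collection.mean(axis=0)                 # (z, y, x) mean template
    full = prior.copy()
    full[acquired_idx] = sparse_vol                 # keep the measured slices
    for z in range(full.shape[0]):
        if z in acquired_idx:
            continue
        lo = max(i for i in acquired_idx if i < z)  # nearest acquired slice below
        hi = min(i for i in acquired_idx if i > z)  # nearest acquired slice above
        w = (z - lo) / (hi - lo)
        interp = (1 - w) * full[lo] + w * full[hi]
        full[z] = 0.5 * (interp + prior[z])         # blend interpolation with the prior
    return full

rng = np.random.default_rng(0)
collection = rng.random((10, 16, 32, 32))           # 10 high-resolution volumes
vol = impute_slices(rng.random((4, 32, 32)), [0, 5, 10, 15], collection)
```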