July 30, 2019

2907 words 14 mins read

Paper Group AWR 26

Generalized orderless pooling performs implicit salient matching

Title Generalized orderless pooling performs implicit salient matching
Authors Marcel Simon, Yang Gao, Trevor Darrell, Joachim Denzler, Erik Rodner
Abstract Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to “alpha-pooling”, allowing for learning the pooling strategy during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify the parts of training images that have the highest influence on the prediction for a given test image. This allows for justifying decisions to users and for analyzing the influence of semantic parts. For example, we can show that the higher-capacity VGG16 model focuses much more on the bird’s head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze what changes when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00487v3
PDF http://arxiv.org/pdf/1705.00487v3.pdf
PWC https://paperswithcode.com/paper/generalized-orderless-pooling-performs
Repo https://github.com/KeremTurgutlu/bcnn
Framework none
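
The encoding described in the abstract interpolates between average pooling and bilinear pooling via a single learnable exponent. Below is a minimal PyTorch sketch of such an alpha-pooling layer, reconstructed from the abstract rather than the authors' code, so the exact normalization and the signed-power formulation are assumptions:

```python
import torch

def alpha_pooling(feats, alpha):
    """Orderless pooling over spatial positions, generalizing
    average (alpha -> 1) and bilinear (alpha = 2) pooling.

    feats: (N, C, H, W) convolutional feature map.
    Returns an (N, C*C) image encoding.
    """
    n, c, h, w = feats.shape
    x = feats.reshape(n, c, h * w)                      # (N, C, P)
    # signed power keeps the sign while rescaling magnitudes
    x_alpha = torch.sign(x) * x.abs().pow(alpha - 1.0)
    # outer product of transformed and raw features, averaged over positions
    enc = torch.einsum('ncp,ndp->ncd', x_alpha, x) / (h * w)
    return enc.reshape(n, c * c)

# alpha can be learned jointly with the rest of the network, e.g.:
alpha = torch.nn.Parameter(torch.tensor(2.0))
```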

On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset

Title On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset
Authors Abien Fred Agarap
Abstract This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM (Agarap, 2017), Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset (Wolberg, Street, & Mangasarian, 1992), measuring their classification test accuracy, sensitivity, and specificity. The dataset consists of features computed from digitized images of fine-needle aspirate (FNA) tests on a breast mass (Wolberg, Street, & Mangasarian, 1992). For the implementation of the ML algorithms, the dataset was partitioned 70% for the training phase and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well on the classification task (all exceeded 90% test accuracy). The MLP algorithm stands out among the implemented algorithms with a test accuracy of ~99.04%.
Tasks Breast Cancer Detection
Published 2017-11-20
URL http://arxiv.org/abs/1711.07831v4
PDF http://arxiv.org/pdf/1711.07831v4.pdf
PWC https://paperswithcode.com/paper/on-breast-cancer-detection-an-application-of
Repo https://github.com/cyberninja22/On-Breast-Cancer-Detection-An-Application-of-Machine-Learning-Algorithms-on-the-Wisconsin-Diagnosti
Framework none
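
WDBC ships with scikit-learn, so the paper's 70/30 protocol is easy to reproduce in spirit. A sketch with an MLP stand-in for the paper's implementation; the hyper-parameters here are illustrative, not the ones the author assigned:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)   # the WDBC dataset
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# note: scikit-learn encodes benign=1, malignant=0, so "positive" below
# means benign; swap the classes if sensitivity should track malignancy
tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()
print('accuracy   :', accuracy_score(y_te, y_pred))
print('sensitivity:', tp / (tp + fn))
print('specificity:', tn / (tn + fp))
```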

Deep Expander Networks: Efficient Deep Networks from Graph Theory

Title Deep Expander Networks: Efficient Deep Networks from Graph Theory
Authors Ameya Prabhu, Girish Varma, Anoop Namboodiri
Abstract Efficient CNN designs like ResNets and DenseNets were proposed to improve accuracy vs. efficiency trade-offs. They essentially increased connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs that are simultaneously sparse and well connected. Sparsity results in efficiency, while good connectivity preserves the expressive power of the CNN. We use expander graphs, a well-studied class of graphs from theoretical computer science that satisfies both properties, to model connections between filters in CNNs and design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: each node influences every node in a layer in logarithmically many steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander-based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique with the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope that this work motivates other approaches that utilize results from graph theory to develop efficient network architectures.
Tasks
Published 2017-11-23
URL http://arxiv.org/abs/1711.08757v3
PDF http://arxiv.org/pdf/1711.08757v3.pdf
PWC https://paperswithcode.com/paper/deep-expander-networks-efficient-deep
Repo https://github.com/osmr/imgclsmob
Framework mxnet
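
The core trick is fixing a sparse but well-connected bipartite graph between consecutive layers' filters. A hedged PyTorch sketch of an expander-style linear layer; it uses random d-regular connectivity (random bipartite graphs are expanders with high probability), whereas the paper also discusses explicit constructions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExpanderLinear(nn.Module):
    """Linear layer masked by a sparse bipartite expander graph."""
    def __init__(self, in_features, out_features, degree):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        mask = torch.zeros(out_features, in_features)
        for i in range(out_features):
            # each output unit connects to `degree` random inputs
            mask[i, torch.randperm(in_features)[:degree]] = 1.0
        self.register_buffer('mask', mask)   # fixed connectivity, not trained

    def forward(self, x):
        return F.linear(x, self.weight * self.mask)

layer = ExpanderLinear(512, 512, degree=32)   # ~16x fewer effective weights
```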

A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion

Title A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion
Authors Lin Xiong, Jayashree Karlekar, Jian Zhao, Yi Cheng, Yan Xu, Jiashi Feng, Sugiri Pranata, Shengmei Shen
Abstract Unconstrained face recognition performance evaluations have traditionally focused on the Labeled Faces in the Wild (LFW) dataset for imagery and the YouTube Faces (YTF) dataset for videos over the last couple of years. Spectacular progress in this field has resulted in saturated verification and identification accuracies on those benchmark datasets. In this paper, we propose a unified learning framework named Transferred Deep Feature Fusion (TDFF) targeting the new IARPA Janus Benchmark A (IJB-A) face recognition dataset released by the NIST face challenge. The IJB-A dataset includes real-world unconstrained faces from 500 subjects with full pose and illumination variations, which are much harder than the LFW and YTF datasets. Inspired by transfer learning, we train two advanced deep convolutional neural networks (DCNNs) on two different large datasets in the source domain. By exploring the complementarity of the two distinct DCNNs, deep feature fusion is applied after feature extraction in the target domain. Template-specific linear SVMs are then adopted to enhance the discrimination of the framework. Finally, multiple matching scores corresponding to different templates are merged into the final results. This simple unified framework exhibits excellent performance on the IJB-A dataset. Based on the proposed approach, we have submitted our IJB-A results to the National Institute of Standards and Technology (NIST) for official evaluation. Moreover, by introducing new data and an advanced neural architecture, our method outperforms the state-of-the-art by a wide margin on the IJB-A dataset.
Tasks Face Recognition, Transfer Learning
Published 2017-04-03
URL http://arxiv.org/abs/1704.00438v2
PDF http://arxiv.org/pdf/1704.00438v2.pdf
PWC https://paperswithcode.com/paper/a-good-practice-towards-top-performance-of
Repo https://github.com/bruinxiong/Evaluation_IJBA
Framework none
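
The pipeline after feature extraction is simple to sketch: fuse the two networks' embeddings, then train one linear SVM per template. The concatenation fusion and the SVM settings below are plausible stand-ins, not the paper's exact configuration:

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

def fuse(feat_a, feat_b):
    # L2-normalize each DCNN's embedding, then concatenate, so the two
    # complementary feature spaces contribute on a comparable scale
    return np.hstack([normalize(feat_a), normalize(feat_b)])

def template_svm(template_feats, negative_feats):
    """Template-specific linear SVM: the template's fused features are
    positives; a fixed negative gallery provides the contrast."""
    X = np.vstack([template_feats, negative_feats])
    y = np.hstack([np.ones(len(template_feats)),
                   np.zeros(len(negative_feats))])
    return LinearSVC(C=1.0).fit(X, y)
```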

Towards A Rigorous Science of Interpretable Machine Learning

Title Towards A Rigorous Science of Interpretable Machine Learning
Authors Finale Doshi-Velez, Been Kim
Abstract As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.
Tasks Interpretable Machine Learning
Published 2017-02-28
URL http://arxiv.org/abs/1702.08608v2
PDF http://arxiv.org/pdf/1702.08608v2.pdf
PWC https://paperswithcode.com/paper/towards-a-rigorous-science-of-interpretable
Repo https://github.com/QueensGambit/PGM-Causal-Reasoning
Framework none

Image Segmentation to Distinguish Between Overlapping Human Chromosomes

Title Image Segmentation to Distinguish Between Overlapping Human Chromosomes
Authors R. Lily Hu, Jeremy Karnowski, Ross Fadely, Jean-Patrick Pommier
Abstract In medicine, visualizing chromosomes is important for medical diagnostics, drug development, and biomedical research. Unfortunately, chromosomes often overlap, and it is necessary to identify and distinguish between the overlapping chromosomes. A segmentation solution that is fast and automated will enable the scaling of cost-effective medicine and biomedical research. We apply neural-network-based image segmentation to the problem of distinguishing between partially overlapping DNA chromosomes. A convolutional neural network is customized for this problem. The results achieved intersection-over-union (IoU) scores of 94.7% for the overlapping region and 88-94% on the non-overlapping chromosome regions.
Tasks Semantic Segmentation
Published 2017-12-20
URL http://arxiv.org/abs/1712.07639v1
PDF http://arxiv.org/pdf/1712.07639v1.pdf
PWC https://paperswithcode.com/paper/image-segmentation-to-distinguish-between
Repo https://github.com/LilyHu/Academic_Papers
Framework none
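
The reported metric is per-region IoU over a multi-label mask. A small sketch; the label encoding (0=background, 1/2=single chromosomes, 3=overlap) is an assumption about how the masks are laid out:

```python
import numpy as np

def class_iou(pred, target, cls):
    """Intersection-over-union for one label in a segmentation mask."""
    p, t = pred == cls, target == cls
    inter = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    return inter / union if union else float('nan')

# e.g. IoU of the overlapping region under the assumed encoding:
# class_iou(pred_mask, true_mask, cls=3)
```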

Variational Dropout Sparsifies Deep Neural Networks

Title Variational Dropout Sparsifies Deep Neural Networks
Authors Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Abstract We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, this leads to extremely sparse solutions in both fully-connected and convolutional layers. The effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease in accuracy.
Tasks Sparse Learning
Published 2017-01-19
URL http://arxiv.org/abs/1701.05369v3
PDF http://arxiv.org/pdf/1701.05369v3.pdf
PWC https://paperswithcode.com/paper/variational-dropout-sparsifies-deep-neural
Repo https://github.com/Leensman/VarDropPytorch
Framework pytorch
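
A sketch of a sparsifying variational dropout layer following the paper: per-weight log sigma^2 parameters, the local reparameterization trick during training, the paper's sigmoid-based approximation of the KL term, and pruning weights whose log alpha exceeds a threshold at test time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VDLinear(nn.Module):
    """Linear layer with sparse variational dropout (per-weight alpha)."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.theta = nn.Parameter(torch.randn(n_out, n_in) * 0.01)
        self.log_sigma2 = nn.Parameter(torch.full((n_out, n_in), -10.0))

    @property
    def log_alpha(self):
        # alpha = sigma^2 / theta^2 may grow without bound
        return (self.log_sigma2
                - 2.0 * torch.log(self.theta.abs() + 1e-8)).clamp(-10, 10)

    def forward(self, x):
        if self.training:
            # local reparameterization: sample pre-activations, not weights
            mu = F.linear(x, self.theta)
            var = F.linear(x * x, self.log_sigma2.exp())
            return mu + var.clamp(min=1e-8).sqrt() * torch.randn_like(mu)
        mask = (self.log_alpha < 3.0).float()    # prune high-alpha weights
        return F.linear(x, self.theta * mask)

    def kl(self):
        # approximation of the KL term from the paper
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        a = self.log_alpha
        neg_kl = k1 * torch.sigmoid(k2 + k3 * a) - 0.5 * F.softplus(-a) - k1
        return -neg_kl.sum()
```

Training minimizes the usual data loss plus the sum of kl() over all such layers, typically with the KL weight annealed from 0 to 1.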

Repulsion Loss: Detecting Pedestrians in a Crowd

Title Repulsion Loss: Detecting Pedestrians in a Crowd
Authors Xinlong Wang, Tete Xiao, Yuning Jiang, Shuai Shao, Jian Sun, Chunhua Shen
Abstract Detecting individual pedestrians in a crowd remains a challenging problem, since pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore experimentally how a state-of-the-art pedestrian detector is harmed by crowd occlusion, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: attraction by the target, and repulsion by other surrounding objects. The repulsion term prevents the proposal from shifting to surrounding objects, thus leading to more crowd-robust localization. Our detector trained with repulsion loss outperforms all state-of-the-art methods with a significant improvement in occlusion cases.
Tasks Pedestrian Detection
Published 2017-11-21
URL http://arxiv.org/abs/1711.07752v2
PDF http://arxiv.org/pdf/1711.07752v2.pdf
PWC https://paperswithcode.com/paper/repulsion-loss-detecting-pedestrians-in-a
Repo https://github.com/bailvwangzi/repulsion_loss_ssd
Framework pytorch
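
The repulsion term in sketch form: penalize the intersection-over-ground-truth-area (IoG) between a proposal and its non-target neighbouring ground-truth box with the paper's smoothed-ln function. Loss coefficients and the box-matching logic are omitted here:

```python
import math
import torch

def smooth_ln(x, sigma=0.5):
    """Smoothed ln penalty: log barrier below sigma, linear above."""
    return torch.where(
        x <= sigma,
        -torch.log1p(-x),
        (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma),
    )

def iog(pred, gt):
    """Intersection over ground-truth area; boxes are (x1, y1, x2, y2)."""
    ix1 = torch.max(pred[:, 0], gt[:, 0]); iy1 = torch.max(pred[:, 1], gt[:, 1])
    ix2 = torch.min(pred[:, 2], gt[:, 2]); iy2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_gt = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / area_gt.clamp(min=1e-8)

def rep_gt_loss(pred, rep_gt):
    # push each proposal away from the non-target GT box it overlaps most
    return smooth_ln(iog(pred, rep_gt).clamp(max=1 - 1e-6)).mean()
```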

Illuminating Pedestrians via Simultaneous Detection & Segmentation

Title Illuminating Pedestrians via Simultaneous Detection & Segmentation
Authors Garrick Brazil, Xi Yin, Xiaoming Liu
Abstract Pedestrian detection is a critical problem in computer vision with significant impact on safety in urban autonomous driving. In this work, we explore how semantic segmentation can be used to boost pedestrian detection accuracy while having little to no impact on network efficiency. We propose a segmentation infusion network to enable joint supervision on semantic segmentation and pedestrian detection. When placed properly, the additional supervision helps guide features in shared layers to become more sophisticated and helpful for the downstream pedestrian detector. Using this approach, we find weakly annotated boxes to be sufficient for considerable performance gains. We provide an in-depth analysis to demonstrate how shared layers are shaped by the segmentation supervision. In doing so, we show that the resulting feature maps become more semantically meaningful and robust to shape and occlusion. Overall, our simultaneous detection and segmentation framework achieves a considerable gain over the state-of-the-art on the Caltech pedestrian dataset, competitive performance on KITTI, and executes 2x faster than competitive methods.
Tasks Autonomous Driving, Pedestrian Detection, Semantic Segmentation
Published 2017-06-26
URL http://arxiv.org/abs/1706.08564v1
PDF http://arxiv.org/pdf/1706.08564v1.pdf
PWC https://paperswithcode.com/paper/illuminating-pedestrians-via-simultaneous
Repo https://github.com/Ricardozzf/sdsrcnn
Framework none
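
The "weakly annotated boxes are sufficient" observation means the segmentation targets can be derived directly from detection boxes. A hedged sketch of that infusion supervision; seg_head and the coordinate handling are placeholders for whatever the shared backbone provides:

```python
import torch
import torch.nn.functional as F

def boxes_to_mask(boxes, h, w):
    """Weak segmentation target: fill each annotated pedestrian box.
    Boxes are assumed already scaled to feature-map coordinates."""
    mask = torch.zeros(1, 1, h, w)
    for x1, y1, x2, y2 in boxes.round().long():
        mask[..., y1:y2, x1:x2] = 1.0
    return mask

def infusion_loss(shared_feat, boxes, seg_head):
    """Auxiliary segmentation supervision on shared backbone features."""
    logits = seg_head(shared_feat)                    # (1, 1, h, w)
    target = boxes_to_mask(boxes, *logits.shape[-2:])
    return F.binary_cross_entropy_with_logits(logits, target)

# joint objective (sketch): detection_loss + lambda_seg * infusion_loss(...)
```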

Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression

Title Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression
Authors Kelly Peterson, Ognjen Rudovic, Ricardo Guerrero, Rosalind W. Picard
Abstract In this paper, we introduce the use of a personalized Gaussian Process model (pGP) to predict the key metrics of Alzheimer’s Disease progression (MMSE, ADAS-Cog13, CDRSB and CS) based on each patient’s previous visits. We start by learning a population-level model using multi-modal data from previously seen patients using the base Gaussian Process (GP) regression. Then, this model is adapted sequentially over time to a new patient using domain adaptive GPs to form the patient’s pGP. We show that this new approach, together with an auto-regressive formulation, leads to significant improvements in forecasting future clinical status and cognitive scores for target patients when compared to modeling the population with traditional GPs.
Tasks Future prediction, Gaussian Processes
Published 2017-12-01
URL http://arxiv.org/abs/1712.00181v4
PDF http://arxiv.org/pdf/1712.00181v4.pdf
PWC https://paperswithcode.com/paper/personalized-gaussian-processes-for-future
Repo https://github.com/yuriautsumi/PersonalizedGP
Framework tf
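
The paper adapts a population GP to each patient with domain-adaptive GPs. As a rough, hedged stand-in (not the authors' adaptation scheme), one can fit a population GP and then refit with the target patient's past visits appended, keeping the learned kernel fixed; the toy data below is purely illustrative:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# toy stand-ins: rows = visits, columns = multi-modal features (e.g.
# previous scores in the auto-regressive setup); target = next score
X_pop, y_pop = rng.normal(size=(200, 4)), rng.normal(size=200)
X_patient, y_patient = rng.normal(size=(3, 4)), rng.normal(size=3)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
pop_gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
pop_gp.fit(X_pop, y_pop)                 # population-level model

# personalize: refit with the patient's visits, keeping the kernel fixed
p_gp = GaussianProcessRegressor(kernel=pop_gp.kernel_, optimizer=None,
                                normalize_y=True)
p_gp.fit(np.vstack([X_pop, X_patient]), np.concatenate([y_pop, y_patient]))
mean, std = p_gp.predict(X_patient[-1:], return_std=True)
```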

Deep Q-learning from Demonstrations

Title Deep Q-learning from Demonstrations
Authors Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys
Abstract Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable in a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this paper we study a setting where the agent may access data from previous control of the system. We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages even relatively small amounts of demonstration data to massively accelerate the learning process, and that automatically assesses the necessary ratio of demonstration data while learning thanks to a prioritized replay mechanism. DQfD works by combining temporal difference updates with supervised classification of the demonstrator’s actions. We show that DQfD has better initial performance than Prioritized Dueling Double Deep Q-Networks (PDD DQN): it starts with better scores on the first million steps on 41 of 42 games, and on average it takes PDD DQN 83 million steps to catch up to DQfD’s performance. DQfD learns to out-perform the best demonstration given in 14 of 42 games. In addition, DQfD leverages human demonstrations to achieve state-of-the-art results for 11 games. Finally, we show that DQfD performs better than three related algorithms for incorporating demonstration data into DQN.
Tasks Decision Making, Q-Learning
Published 2017-04-12
URL http://arxiv.org/abs/1704.03732v4
PDF http://arxiv.org/pdf/1704.03732v4.pdf
PWC https://paperswithcode.com/paper/deep-q-learning-from-demonstrations
Repo https://github.com/LilTwo/DRL-using-PyTorch
Framework pytorch
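
The supervised half of DQfD is a large-margin classification loss on demonstration states: the demonstrator's action must beat every other action by a margin. A PyTorch sketch (treat the margin value as an assumption):

```python
import torch

def large_margin_loss(q_values, demo_actions, margin=0.8):
    """DQfD supervised term: max_a [Q(s,a) + l(a_E, a)] - Q(s, a_E).

    q_values: (B, A) Q-values for a batch of demonstration states.
    demo_actions: (B,) actions the demonstrator actually took.
    """
    b = q_values.size(0)
    # l(a_E, a) is 0 for the expert action and `margin` otherwise
    l = torch.full_like(q_values, margin)
    l[torch.arange(b), demo_actions] = 0.0
    q_expert = q_values[torch.arange(b), demo_actions]
    return ((q_values + l).max(dim=1).values - q_expert).mean()

# full objective (sketch): 1-step TD + n-step TD
#                          + lambda * large_margin_loss + L2 regularization
```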

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Title Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Authors Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson
Abstract Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
Tasks Multi-agent Reinforcement Learning, Q-Learning, Starcraft
Published 2017-02-28
URL http://arxiv.org/abs/1702.08887v3
PDF http://arxiv.org/pdf/1702.08887v3.pdf
PWC https://paperswithcode.com/paper/stabilising-experience-replay-for-deep-multi
Repo https://github.com/MUmarJaved/MultiAgent-Distributed-Reinforcement-Learning
Framework tf
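
The fingerprint idea is cheap to implement: augment each observation with a low-dimensional summary of where the other agents' policies are in training, e.g. the training iteration and the exploration rate. A sketch; the normalization scheme is an assumption:

```python
import numpy as np

def add_fingerprint(obs, train_iter, epsilon, max_iters):
    """Append a fingerprint disambiguating the age of replayed data."""
    fp = np.array([train_iter / max_iters, epsilon], dtype=np.float32)
    return np.concatenate([obs.astype(np.float32), fp])

# store add_fingerprint(obs, t, eps, T) in the replay buffer instead of
# obs, so each agent's Q-network can condition on when a transition was
# generated and compensate for the other agents' changing policies
```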

SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels

Title SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
Authors Matthias Fey, Jan Eric Lenssen, Frank Weichert, Heinrich Müller
Abstract We present Spline-based Convolutional Neural Networks (SplineCNNs), a variant of deep neural networks for irregularly structured and geometric input, e.g., graphs or meshes. Our main contribution is a novel convolution operator based on B-splines that makes the computation time independent of the kernel size, thanks to the local support property of the B-spline basis functions. As a result, we obtain a generalization of the traditional CNN convolution operator, using continuous kernel functions parametrized by a fixed number of trainable weights. In contrast to related approaches that filter in the spectral domain, the proposed method aggregates features purely in the spatial domain. In addition, SplineCNN allows entirely end-to-end training of deep architectures, using only the geometric structure as input instead of handcrafted feature descriptors. For validation, we apply our method to tasks from the fields of image graph classification, shape correspondence, and graph node classification, and show that it matches or outperforms state-of-the-art approaches while being significantly faster and having favorable properties like domain independence.
Tasks Graph Classification, Node Classification
Published 2017-11-24
URL http://arxiv.org/abs/1711.08920v2
PDF http://arxiv.org/pdf/1711.08920v2.pdf
PWC https://paperswithcode.com/paper/splinecnn-fast-geometric-deep-learning-with
Repo https://github.com/rusty1s/pytorch_geometric
Framework pytorch
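
The linked pytorch_geometric repo ships the operator as SplineConv. A minimal usage sketch on a toy graph (the exact constructor arguments may vary across library versions):

```python
import torch
from torch_geometric.nn import SplineConv

# toy graph: 4 nodes with 3 features each, 4 directed edges
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 0]])
# pseudo-coordinates in [0, 1]^dim parametrize the B-spline kernel;
# for meshes these are, e.g., normalized relative node positions
edge_attr = torch.rand(edge_index.size(1), 2)

conv = SplineConv(in_channels=3, out_channels=16, dim=2, kernel_size=5)
out = conv(x, edge_index, edge_attr)   # (4, 16) node features
```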

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

Title JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
Authors Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault
Abstract We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits to not only correct grammatical errors but also make the original text more native-sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and ways in which they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC.
Tasks Grammatical Error Correction
Published 2017-02-14
URL http://arxiv.org/abs/1702.04066v1
PDF http://arxiv.org/pdf/1702.04066v1.pdf
PWC https://paperswithcode.com/paper/jfleg-a-fluency-corpus-and-benchmark-for
Repo https://github.com/keisks/jfleg
Framework none
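
Systems are scored against JFLEG's four fluency references per sentence. The official GLEU scorer lives in the linked repo; as a rough, hedged stand-in, NLTK's corpus_gleu (a related but not identical metric) illustrates the shape of the evaluation. The file names below are assumptions:

```python
from nltk.translate.gleu_score import corpus_gleu

def load(path):
    with open(path, encoding='utf-8') as f:
        return [line.split() for line in f]

# one system output plus JFLEG's four human fluency references
hyps = load('system_output.txt')
refs = list(zip(load('dev.ref0'), load('dev.ref1'),
                load('dev.ref2'), load('dev.ref3')))
print('GLEU (nltk variant):', corpus_gleu(refs, hyps))
```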

OpenNMT: Open-Source Toolkit for Neural Machine Translation

Title OpenNMT: Open-Source Toolkit for Neural Machine Translation
Authors Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
Abstract We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques.
Tasks Machine Translation
Published 2017-01-10
URL http://arxiv.org/abs/1701.02810v2
PDF http://arxiv.org/pdf/1701.02810v2.pdf
PWC https://paperswithcode.com/paper/opennmt-open-source-toolkit-for-neural
Repo https://github.com/OpenNMT/OpenNMT
Framework torch