July 30, 2019

2907 words 14 mins read

Paper Group AWR 26

Generalized orderless pooling performs implicit salient matching

Title Generalized orderless pooling performs implicit salient matching
Authors Marcel Simon, Yang Gao, Trevor Darrell, Joachim Denzler, Erik Rodner
Abstract Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to “alpha-pooling”, allowing for learning the pooling strategy during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify the parts of training images that have the highest influence on the prediction for a given test image. This allows for justifying decisions to users and for analyzing the influence of semantic parts. For example, we can show that the higher-capacity VGG16 model focuses much more on the bird’s head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze what changes when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets.
Tasks
Published 2017-05-01
URL http://arxiv.org/abs/1705.00487v3
PDF http://arxiv.org/pdf/1705.00487v3.pdf
PWC https://paperswithcode.com/paper/generalized-orderless-pooling-performs
Repo https://github.com/KeremTurgutlu/bcnn
Framework none
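
The encoding described in the abstract interpolates between average pooling and bilinear pooling via a single learnable exponent. Below is a minimal PyTorch sketch of such an alpha-pooling layer, reconstructed from the abstract rather than the authors' code, so the exact normalization and the signed-power formulation are assumptions:

```python
import torch

def alpha_pooling(feats, alpha):
    """Orderless pooling over spatial positions, generalizing
    average (alpha -> 1) and bilinear (alpha = 2) pooling.

    feats: (N, C, H, W) convolutional feature map.
    Returns an (N, C*C) image encoding.
    """
    n, c, h, w = feats.shape
    x = feats.reshape(n, c, h * w)                      # (N, C, P)
    # signed power keeps the sign while rescaling magnitudes
    x_alpha = torch.sign(x) * x.abs().pow(alpha - 1.0)
    # outer product of transformed and raw features, averaged over positions
    enc = torch.einsum('ncp,ndp->ncd', x_alpha, x) / (h * w)
    return enc.reshape(n, c * c)

# alpha can be learned jointly with the rest of the network, e.g.:
alpha = torch.nn.Parameter(torch.tensor(2.0))
```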

On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset

Title On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset
Authors Abien Fred Agarap
Abstract This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM (Agarap, 2017), Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset (Wolberg, Street, & Mangasarian, 1992), measuring their classification test accuracy, sensitivity, and specificity. The dataset consists of features computed from digitized images of fine-needle aspirate (FNA) tests on a breast mass (Wolberg, Street, & Mangasarian, 1992). For the implementation of the ML algorithms, the dataset was partitioned 70% for the training phase and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well on the classification task (all exceeded 90% test accuracy). The MLP algorithm stands out among the implemented algorithms with a test accuracy of ~99.04%.
Tasks Breast Cancer Detection
Published 2017-11-20
URL http://arxiv.org/abs/1711.07831v4
PDF http://arxiv.org/pdf/1711.07831v4.pdf
PWC https://paperswithcode.com/paper/on-breast-cancer-detection-an-application-of
Repo https://github.com/cyberninja22/On-Breast-Cancer-Detection-An-Application-of-Machine-Learning-Algorithms-on-the-Wisconsin-Diagnosti
Framework none
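
WDBC ships with scikit-learn, so the paper's 70/30 protocol is easy to reproduce in spirit. A sketch with an MLP stand-in for the paper's implementation; the hyper-parameters here are illustrative, not the ones the author assigned:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)   # the WDBC dataset
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000,
                                  random_state=0))
clf.fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# note: scikit-learn encodes benign=1, malignant=0, so "positive" below
# means benign; swap the classes if sensitivity should track malignancy
tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()
print('accuracy   :', accuracy_score(y_te, y_pred))
print('sensitivity:', tp / (tp + fn))
print('specificity:', tn / (tn + fp))
```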

Deep Expander Networks: Efficient Deep Networks from Graph Theory

Title Deep Expander Networks: Efficient Deep Networks from Graph Theory
Authors Ameya Prabhu, Girish Varma, Anoop Namboodiri
Abstract Efficient CNN designs like ResNets and DenseNets were proposed to improve accuracy vs. efficiency trade-offs. They essentially increased connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs that are simultaneously sparse and well connected. Sparsity results in efficiency, while good connectivity preserves the expressive power of the CNN. We use expander graphs, a well-studied class of graphs from theoretical computer science that satisfies both properties, to model connections between filters in CNNs and design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: each node influences every node in a layer in logarithmically many steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander-based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique with the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope that this work motivates other approaches that utilize results from graph theory to develop efficient network architectures.
Tasks
Published 2017-11-23
URL http://arxiv.org/abs/1711.08757v3
PDF http://arxiv.org/pdf/1711.08757v3.pdf
PWC https://paperswithcode.com/paper/deep-expander-networks-efficient-deep
Repo https://github.com/osmr/imgclsmob
Framework mxnet
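
The core trick is fixing a sparse but well-connected bipartite graph between consecutive layers' filters. A hedged PyTorch sketch of an expander-style linear layer; it uses random d-regular connectivity (random bipartite graphs are expanders with high probability), whereas the paper also discusses explicit constructions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExpanderLinear(nn.Module):
    """Linear layer masked by a sparse bipartite expander graph."""
    def __init__(self, in_features, out_features, degree):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        mask = torch.zeros(out_features, in_features)
        for i in range(out_features):
            # each output unit connects to `degree` random inputs
            mask[i, torch.randperm(in_features)[:degree]] = 1.0
        self.register_buffer('mask', mask)   # fixed connectivity, not trained

    def forward(self, x):
        return F.linear(x, self.weight * self.mask)

layer = ExpanderLinear(512, 512, degree=32)   # ~16x fewer effective weights
```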

A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion

Title A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion
Authors Lin Xiong, Jayashree Karlekar, Jian Zhao, Yi Cheng, Yan Xu, Jiashi Feng, Sugiri Pranata, Shengmei Shen
Abstract Unconstrained face recognition performance evaluations have traditionally focused on the Labeled Faces in the Wild (LFW) dataset for imagery and the YouTube Faces (YTF) dataset for videos over the last couple of years. Spectacular progress in this field has resulted in saturated verification and identification accuracies on those benchmark datasets. In this paper, we propose a unified learning framework named Transferred Deep Feature Fusion (TDFF) targeting the new IARPA Janus Benchmark A (IJB-A) face recognition dataset released by the NIST face challenge. The IJB-A dataset includes real-world unconstrained faces from 500 subjects with full pose and illumination variations, which are much harder than the LFW and YTF datasets. Inspired by transfer learning, we train two advanced deep convolutional neural networks (DCNNs) on two different large datasets in the source domain. By exploring the complementarity of the two distinct DCNNs, deep feature fusion is applied after feature extraction in the target domain. Template-specific linear SVMs are then adopted to enhance the discrimination of the framework. Finally, multiple matching scores corresponding to different templates are merged into the final results. This simple unified framework exhibits excellent performance on the IJB-A dataset. Based on the proposed approach, we have submitted our IJB-A results to the National Institute of Standards and Technology (NIST) for official evaluation. Moreover, by introducing new data and an advanced neural architecture, our method outperforms the state-of-the-art by a wide margin on the IJB-A dataset.
Tasks Face Recognition, Transfer Learning
Published 2017-04-03
URL http://arxiv.org/abs/1704.00438v2
PDF http://arxiv.org/pdf/1704.00438v2.pdf
PWC https://paperswithcode.com/paper/a-good-practice-towards-top-performance-of
Repo https://github.com/bruinxiong/Evaluation_IJBA
Framework none
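
The pipeline after feature extraction is simple to sketch: fuse the two networks' embeddings, then train one linear SVM per template. The concatenation fusion and the SVM settings below are plausible stand-ins, not the paper's exact configuration:

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

def fuse(feat_a, feat_b):
    # L2-normalize each DCNN's embedding, then concatenate, so the two
    # complementary feature spaces contribute on a comparable scale
    return np.hstack([normalize(feat_a), normalize(feat_b)])

def template_svm(template_feats, negative_feats):
    """Template-specific linear SVM: the template's fused features are
    positives; a fixed negative gallery provides the contrast."""
    X = np.vstack([template_feats, negative_feats])
    y = np.hstack([np.ones(len(template_feats)),
                   np.zeros(len(negative_feats))])
    return LinearSVC(C=1.0).fit(X, y)
```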

Towards A Rigorous Science of Interpretable Machine Learning

Title Towards A Rigorous Science of Interpretable Machine Learning
Authors Finale Doshi-Velez, Been Kim
Abstract As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning.
Tasks Interpretable Machine Learning
Published 2017-02-28
URL http://arxiv.org/abs/1702.08608v2
PDF http://arxiv.org/pdf/1702.08608v2.pdf
PWC https://paperswithcode.com/paper/towards-a-rigorous-science-of-interpretable
Repo https://github.com/QueensGambit/PGM-Causal-Reasoning
Framework none

Image Segmentation to Distinguish Between Overlapping Human Chromosomes

Title Image Segmentation to Distinguish Between Overlapping Human Chromosomes
Authors R. Lily Hu, Jeremy Karnowski, Ross Fadely, Jean-Patrick Pommier
Abstract In medicine, visualizing chromosomes is important for medical diagnostics, drug development, and biomedical research. Unfortunately, chromosomes often overlap, and it is necessary to identify and distinguish between the overlapping chromosomes. A segmentation solution that is fast and automated will enable the scaling of cost-effective medicine and biomedical research. We apply neural-network-based image segmentation to the problem of distinguishing between partially overlapping DNA chromosomes. A convolutional neural network is customized for this problem. The results achieved intersection-over-union (IoU) scores of 94.7% for the overlapping region and 88-94% on the non-overlapping chromosome regions.
Tasks Semantic Segmentation
Published 2017-12-20
URL http://arxiv.org/abs/1712.07639v1
PDF http://arxiv.org/pdf/1712.07639v1.pdf
PWC https://paperswithcode.com/paper/image-segmentation-to-distinguish-between
Repo https://github.com/LilyHu/Academic_Papers
Framework none
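
The reported metric is per-region IoU over a multi-label mask. A small sketch; the label encoding (0=background, 1/2=single chromosomes, 3=overlap) is an assumption about how the masks are laid out:

```python
import numpy as np

def class_iou(pred, target, cls):
    """Intersection-over-union for one label in a segmentation mask."""
    p, t = pred == cls, target == cls
    inter = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    return inter / union if union else float('nan')

# e.g. IoU of the overlapping region under the assumed encoding:
# class_iou(pred_mask, true_mask, cls=3)
```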

Variational Dropout Sparsifies Deep Neural Networks

Title Variational Dropout Sparsifies Deep Neural Networks
Authors Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Abstract We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, this leads to extremely sparse solutions in both fully-connected and convolutional layers. The effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease in accuracy.
Tasks Sparse Learning
Published 2017-01-19
URL http://arxiv.org/abs/1701.05369v3
PDF http://arxiv.org/pdf/1701.05369v3.pdf
PWC https://paperswithcode.com/paper/variational-dropout-sparsifies-deep-neural
Repo https://github.com/Leensman/VarDropPytorch
Framework pytorch
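
A sketch of a sparsifying variational dropout layer following the paper: per-weight log sigma^2 parameters, the local reparameterization trick during training, the paper's sigmoid-based approximation of the KL term, and pruning weights whose log alpha exceeds a threshold at test time:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VDLinear(nn.Module):
    """Linear layer with sparse variational dropout (per-weight alpha)."""
    def __init__(self, n_in, n_out):
        super().__init__()
        self.theta = nn.Parameter(torch.randn(n_out, n_in) * 0.01)
        self.log_sigma2 = nn.Parameter(torch.full((n_out, n_in), -10.0))

    @property
    def log_alpha(self):
        # alpha = sigma^2 / theta^2 may grow without bound
        return (self.log_sigma2
                - 2.0 * torch.log(self.theta.abs() + 1e-8)).clamp(-10, 10)

    def forward(self, x):
        if self.training:
            # local reparameterization: sample pre-activations, not weights
            mu = F.linear(x, self.theta)
            var = F.linear(x * x, self.log_sigma2.exp())
            return mu + var.clamp(min=1e-8).sqrt() * torch.randn_like(mu)
        mask = (self.log_alpha < 3.0).float()    # prune high-alpha weights
        return F.linear(x, self.theta * mask)

    def kl(self):
        # approximation of the KL term from the paper
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        a = self.log_alpha
        neg_kl = k1 * torch.sigmoid(k2 + k3 * a) - 0.5 * F.softplus(-a) - k1
        return -neg_kl.sum()
```

Training minimizes the usual data loss plus the sum of kl() over all such layers, typically with the KL weight annealed from 0 to 1.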

Repulsion Loss: Detecting Pedestrians in a Crowd

Title Repulsion Loss: Detecting Pedestrians in a Crowd
Authors Xinlong Wang, Tete Xiao, Yuning Jiang, Shuai Shao, Jian Sun, Chunhua Shen
Abstract Detecting individual pedestrians in a crowd remains a challenging problem, since pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore experimentally how a state-of-the-art pedestrian detector is harmed by crowd occlusion, providing insights into the crowd occlusion problem. Then, we propose a novel bounding box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: attraction by the target, and repulsion by other surrounding objects. The repulsion term prevents the proposal from shifting to surrounding objects, thus leading to more crowd-robust localization. Our detector trained with repulsion loss outperforms all state-of-the-art methods with a significant improvement in occlusion cases.
Tasks Pedestrian Detection
Published 2017-11-21
URL http://arxiv.org/abs/1711.07752v2
PDF http://arxiv.org/pdf/1711.07752v2.pdf
PWC https://paperswithcode.com/paper/repulsion-loss-detecting-pedestrians-in-a
Repo https://github.com/bailvwangzi/repulsion_loss_ssd
Framework pytorch
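
The repulsion term in sketch form: penalize the intersection-over-ground-truth-area (IoG) between a proposal and its non-target neighbouring ground-truth box with the paper's smoothed-ln function. Loss coefficients and the box-matching logic are omitted here:

```python
import math
import torch

def smooth_ln(x, sigma=0.5):
    """Smoothed ln penalty: log barrier below sigma, linear above."""
    return torch.where(
        x <= sigma,
        -torch.log1p(-x),
        (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma),
    )

def iog(pred, gt):
    """Intersection over ground-truth area; boxes are (x1, y1, x2, y2)."""
    ix1 = torch.max(pred[:, 0], gt[:, 0]); iy1 = torch.max(pred[:, 1], gt[:, 1])
    ix2 = torch.min(pred[:, 2], gt[:, 2]); iy2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_gt = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / area_gt.clamp(min=1e-8)

def rep_gt_loss(pred, rep_gt):
    # push each proposal away from the non-target GT box it overlaps most
    return smooth_ln(iog(pred, rep_gt).clamp(max=1 - 1e-6)).mean()
```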

Illuminating Pedestrians via Simultaneous Detection & Segmentation

Title Illuminating Pedestrians via Simultaneous Detection & Segmentation
Authors Garrick Brazil, Xi Yin, Xiaoming Liu
Abstract Pedestrian detection is a critical problem in computer vision with significant impact on safety in urban autonomous driving. In this work, we explore how semantic segmentation can be used to boost pedestrian detection accuracy while having little to no impact on network efficiency. We propose a segmentation infusion network to enable joint supervision on semantic segmentation and pedestrian detection. When placed properly, the additional supervision helps guide features in shared layers to become more sophisticated and helpful for the downstream pedestrian detector. Using this approach, we find weakly annotated boxes to be sufficient for considerable performance gains. We provide an in-depth analysis to demonstrate how shared layers are shaped by the segmentation supervision. In doing so, we show that the resulting feature maps become more semantically meaningful and robust to shape and occlusion. Overall, our simultaneous detection and segmentation framework achieves a considerable gain over the state-of-the-art on the Caltech pedestrian dataset, competitive performance on KITTI, and executes 2x faster than competitive methods.
Tasks Autonomous Driving, Pedestrian Detection, Semantic Segmentation
Published 2017-06-26
URL http://arxiv.org/abs/1706.08564v1
PDF http://arxiv.org/pdf/1706.08564v1.pdf
PWC https://paperswithcode.com/paper/illuminating-pedestrians-via-simultaneous
Repo https://github.com/Ricardozzf/sdsrcnn
Framework none
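
The "weakly annotated boxes are sufficient" observation means the segmentation targets can be derived directly from detection boxes. A hedged sketch of that infusion supervision; seg_head and the coordinate handling are placeholders for whatever the shared backbone provides:

```python
import torch
import torch.nn.functional as F

def boxes_to_mask(boxes, h, w):
    """Weak segmentation target: fill each annotated pedestrian box.
    Boxes are assumed already scaled to feature-map coordinates."""
    mask = torch.zeros(1, 1, h, w)
    for x1, y1, x2, y2 in boxes.round().long():
        mask[..., y1:y2, x1:x2] = 1.0
    return mask

def infusion_loss(shared_feat, boxes, seg_head):
    """Auxiliary segmentation supervision on shared backbone features."""
    logits = seg_head(shared_feat)                    # (1, 1, h, w)
    target = boxes_to_mask(boxes, *logits.shape[-2:])
    return F.binary_cross_entropy_with_logits(logits, target)

# joint objective (sketch): detection_loss + lambda_seg * infusion_loss(...)
```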

Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression

Title Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression
Authors Kelly Peterson, Ognjen Rudovic, Ricardo Guerrero, Rosalind W. Picard
Abstract In this paper, we introduce the use of a personalized Gaussian Process model (pGP) to predict the key metrics of Alzheimer’s Disease progression (MMSE, ADAS-Cog13, CDRSB and CS) based on each patient’s previous visits. We start by learning a population-level model using multi-modal data from previously seen patients using the base Gaussian Process (GP) regression. Then, this model is adapted sequentially over time to a new patient using domain adaptive GPs to form the patient’s pGP. We show that this new approach, together with an auto-regressive formulation, leads to significant improvements in forecasting future clinical status and cognitive scores for target patients when compared to modeling the population with traditional GPs.
Tasks Future prediction, Gaussian Processes
Published 2017-12-01
URL http://arxiv.org/abs/1712.00181v4
PDF http://arxiv.org/pdf/1712.00181v4.pdf
PWC https://paperswithcode.com/paper/personalized-gaussian-processes-for-future
Repo https://github.com/yuriautsumi/PersonalizedGP
Framework tf
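
The paper adapts a population GP to each patient with domain-adaptive GPs. As a rough, hedged stand-in (not the authors' adaptation scheme), one can fit a population GP and then refit with the target patient's past visits appended, keeping the learned kernel fixed; the toy data below is purely illustrative:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# toy stand-ins: rows = visits, columns = multi-modal features (e.g.
# previous scores in the auto-regressive setup); target = next score
X_pop, y_pop = rng.normal(size=(200, 4)), rng.normal(size=200)
X_patient, y_patient = rng.normal(size=(3, 4)), rng.normal(size=3)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
pop_gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
pop_gp.fit(X_pop, y_pop)                 # population-level model

# personalize: refit with the patient's visits, keeping the kernel fixed
p_gp = GaussianProcessRegressor(kernel=pop_gp.kernel_, optimizer=None,
                                normalize_y=True)
p_gp.fit(np.vstack([X_pop, X_patient]), np.concatenate([y_pop, y_patient]))
mean, std = p_gp.predict(X_patient[-1:], return_std=True)
```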

Deep Q-learning from Demonstrations

Title Deep Q-learning from Demonstrations
Authors Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys
Abstract Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable in a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this paper we study a setting where the agent may access data from previous control of the system. We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages even relatively small amounts of demonstration data to massively accelerate the learning process, and that automatically assesses the necessary ratio of demonstration data while learning thanks to a prioritized replay mechanism. DQfD works by combining temporal difference updates with supervised classification of the demonstrator’s actions. We show that DQfD has better initial performance than Prioritized Dueling Double Deep Q-Networks (PDD DQN): it starts with better scores on the first million steps on 41 of 42 games, and on average it takes PDD DQN 83 million steps to catch up to DQfD’s performance. DQfD learns to out-perform the best demonstration given in 14 of 42 games. In addition, DQfD leverages human demonstrations to achieve state-of-the-art results for 11 games. Finally, we show that DQfD performs better than three related algorithms for incorporating demonstration data into DQN.
Tasks Decision Making, Q-Learning
Published 2017-04-12
URL http://arxiv.org/abs/1704.03732v4
PDF http://arxiv.org/pdf/1704.03732v4.pdf
PWC https://paperswithcode.com/paper/deep-q-learning-from-demonstrations
Repo https://github.com/LilTwo/DRL-using-PyTorch
Framework pytorch
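
The supervised half of DQfD is a large-margin classification loss on demonstration states: the demonstrator's action must beat every other action by a margin. A PyTorch sketch (treat the margin value as an assumption):

```python
import torch

def large_margin_loss(q_values, demo_actions, margin=0.8):
    """DQfD supervised term: max_a [Q(s,a) + l(a_E, a)] - Q(s, a_E).

    q_values: (B, A) Q-values for a batch of demonstration states.
    demo_actions: (B,) actions the demonstrator actually took.
    """
    b = q_values.size(0)
    # l(a_E, a) is 0 for the expert action and `margin` otherwise
    l = torch.full_like(q_values, margin)
    l[torch.arange(b), demo_actions] = 0.0
    q_expert = q_values[torch.arange(b), demo_actions]
    return ((q_values + l).max(dim=1).values - q_expert).mean()

# full objective (sketch): 1-step TD + n-step TD
#                          + lambda * large_margin_loss + L2 regularization
```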

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Title Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Authors Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson
Abstract Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
Tasks Multi-agent Reinforcement Learning, Q-Learning, Starcraft
Published 2017-02-28
URL http://arxiv.org/abs/1702.08887v3
PDF http://arxiv.org/pdf/1702.08887v3.pdf
PWC https://paperswithcode.com/paper/stabilising-experience-replay-for-deep-multi
Repo https://github.com/MUmarJaved/MultiAgent-Distributed-Reinforcement-Learning
Framework tf
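
The fingerprint idea is cheap to implement: augment each observation with a low-dimensional summary of where the other agents' policies are in training, e.g. the training iteration and the exploration rate. A sketch; the normalization scheme is an assumption:

```python
import numpy as np

def add_fingerprint(obs, train_iter, epsilon, max_iters):
    """Append a fingerprint disambiguating the age of replayed data."""
    fp = np.array([train_iter / max_iters, epsilon], dtype=np.float32)
    return np.concatenate([obs.astype(np.float32), fp])

# store add_fingerprint(obs, t, eps, T) in the replay buffer instead of
# obs, so each agent's Q-network can condition on when a transition was
# generated and compensate for the other agents' changing policies
```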

SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels

Title SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
Authors Matthias Fey, Jan Eric Lenssen, Frank Weichert, Heinrich Müller
Abstract We present Spline-based Convolutional Neural Networks (SplineCNNs), a variant of deep neural networks for irregularly structured and geometric input, e.g., graphs or meshes. Our main contribution is a novel convolution operator based on B-splines that makes the computation time independent of the kernel size, thanks to the local support property of the B-spline basis functions. As a result, we obtain a generalization of the traditional CNN convolution operator, using continuous kernel functions parametrized by a fixed number of trainable weights. In contrast to related approaches that filter in the spectral domain, the proposed method aggregates features purely in the spatial domain. In addition, SplineCNN allows entirely end-to-end training of deep architectures, using only the geometric structure as input instead of handcrafted feature descriptors. For validation, we apply our method to tasks from the fields of image graph classification, shape correspondence, and graph node classification, and show that it matches or outperforms state-of-the-art approaches while being significantly faster and having favorable properties like domain independence.
Tasks Graph Classification, Node Classification
Published 2017-11-24
URL http://arxiv.org/abs/1711.08920v2
PDF http://arxiv.org/pdf/1711.08920v2.pdf
PWC https://paperswithcode.com/paper/splinecnn-fast-geometric-deep-learning-with
Repo https://github.com/rusty1s/pytorch_geometric
Framework pytorch
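
The linked pytorch_geometric repo ships the operator as SplineConv. A minimal usage sketch on a toy graph (the exact constructor arguments may vary across library versions):

```python
import torch
from torch_geometric.nn import SplineConv

# toy graph: 4 nodes with 3 features each, 4 directed edges
x = torch.randn(4, 3)
edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 0]])
# pseudo-coordinates in [0, 1]^dim parametrize the B-spline kernel;
# for meshes these are, e.g., normalized relative node positions
edge_attr = torch.rand(edge_index.size(1), 2)

conv = SplineConv(in_channels=3, out_channels=16, dim=2, kernel_size=5)
out = conv(x, edge_index, edge_attr)   # (4, 16) node features
```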

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction

Title JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
Authors Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault
Abstract We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits to not only correct grammatical errors but also make the original text more native-sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and ways in which they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC.
Tasks Grammatical Error Correction
Published 2017-02-14
URL http://arxiv.org/abs/1702.04066v1
PDF http://arxiv.org/pdf/1702.04066v1.pdf
PWC https://paperswithcode.com/paper/jfleg-a-fluency-corpus-and-benchmark-for
Repo https://github.com/keisks/jfleg
Framework none
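
Systems are scored against JFLEG's four fluency references per sentence. The official GLEU scorer lives in the linked repo; as a rough, hedged stand-in, NLTK's corpus_gleu (a related but not identical metric) illustrates the shape of the evaluation. The file names below are assumptions:

```python
from nltk.translate.gleu_score import corpus_gleu

def load(path):
    with open(path, encoding='utf-8') as f:
        return [line.split() for line in f]

# one system output plus JFLEG's four human fluency references
hyps = load('system_output.txt')
refs = list(zip(load('dev.ref0'), load('dev.ref1'),
                load('dev.ref2'), load('dev.ref3')))
print('GLEU (nltk variant):', corpus_gleu(refs, hyps))
```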

OpenNMT: Open-Source Toolkit for Neural Machine Translation

Title OpenNMT: Open-Source Toolkit for Neural Machine Translation
Authors Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush
Abstract We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques.
Tasks Machine Translation
Published 2017-01-10
URL http://arxiv.org/abs/1701.02810v2
PDF http://arxiv.org/pdf/1701.02810v2.pdf
PWC https://paperswithcode.com/paper/opennmt-open-source-toolkit-for-neural
Repo https://github.com/OpenNMT/OpenNMT
Framework torch