Paper Group AWR 26
Generalized orderless pooling performs implicit salient matching. On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset. Deep Expander Networks: Efficient Deep Networks from Graph Theory. A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion. Towards A Rigorous Science of Interpretable Machine Learning. Image Segmentation to Distinguish Between Overlapping Human Chromosomes. Variational Dropout Sparsifies Deep Neural Networks. Repulsion Loss: Detecting Pedestrians in a Crowd. Illuminating Pedestrians via Simultaneous Detection & Segmentation. Personalized Gaussian Processes for Future Prediction of Alzheimer's Disease Progression. Deep Q-learning from Demonstrations. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning. SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels. JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction. OpenNMT: Open-Source Toolkit for Neural Machine Translation.
Generalized orderless pooling performs implicit salient matching
Title | Generalized orderless pooling performs implicit salient matching |
Authors | Marcel Simon, Yang Gao, Trevor Darrell, Joachim Denzler, Erik Rodner |
Abstract | Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to "alpha-pooling", allowing the pooling strategy to be learned during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify the parts of training images that have the highest influence on the prediction for a given test image. This allows us to justify decisions to users and to analyze the influence of semantic parts. For example, we can show that the higher-capacity VGG16 model focuses much more on the bird's head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze what changes when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets. |
Tasks | |
Published | 2017-05-01 |
URL | http://arxiv.org/abs/1705.00487v3 |
PDF | http://arxiv.org/pdf/1705.00487v3.pdf |
PWC | https://paperswithcode.com/paper/generalized-orderless-pooling-performs |
Repo | https://github.com/KeremTurgutlu/bcnn |
Framework | none |
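To make the alpha-pooling idea concrete, here is a minimal numpy sketch of one plausible formulation: a signed element-wise power of each local descriptor, outer-multiplied with the descriptor itself and averaged over spatial locations. With alpha=2 this recovers bilinear pooling, and alpha=1 corresponds to average pooling in the paper's framing; the exact parametrization below (and the fixed rather than learned alpha) is an assumption, not the authors' code.

```python
import numpy as np

def alpha_pool(features, alpha):
    """Alpha-pooling over spatial locations (hypothetical formulation).

    features: array of shape (num_locations, channels), e.g. a flattened
    conv5 feature map. alpha=2 recovers bilinear pooling; in the paper,
    alpha is a trainable parameter rather than fixed as it is here.
    """
    # signed element-wise power keeps the operation defined for negative inputs
    powered = np.sign(features) * np.abs(features) ** (alpha - 1.0)
    # average of outer products over all spatial locations
    pooled = np.einsum('nc,nd->cd', powered, features) / features.shape[0]
    return pooled.ravel()

x = np.random.randn(49, 512).astype(np.float32)  # e.g. a 7x7x512 feature map
print(alpha_pool(x, alpha=2.0).shape)            # (262144,) bilinear-style code
```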
On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset
Title | On Breast Cancer Detection: An Application of Machine Learning Algorithms on the Wisconsin Diagnostic Dataset |
Authors | Abien Fred Agarap |
Abstract | This paper presents a comparison of six machine learning (ML) algorithms: GRU-SVM (Agarap, 2017), Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor (NN) search, Softmax Regression, and Support Vector Machine (SVM) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset (Wolberg, Street, & Mangasarian, 1992), measuring their classification test accuracy and their sensitivity and specificity values. The dataset consists of features computed from digitized images of fine-needle aspiration (FNA) tests on a breast mass (Wolberg, Street, & Mangasarian, 1992). For the implementation of the ML algorithms, the dataset was partitioned as follows: 70% for the training phase and 30% for the testing phase. The hyper-parameters used for all the classifiers were manually assigned. Results show that all the presented ML algorithms performed well (all exceeded 90% test accuracy) on the classification task. The MLP algorithm stands out among the implemented algorithms with a test accuracy of ~99.04%. |
Tasks | Breast Cancer Detection |
Published | 2017-11-20 |
URL | http://arxiv.org/abs/1711.07831v4 |
PDF | http://arxiv.org/pdf/1711.07831v4.pdf |
PWC | https://paperswithcode.com/paper/on-breast-cancer-detection-an-application-of |
Repo | https://github.com/cyberninja22/On-Breast-Cancer-Detection-An-Application-of-Machine-Learning-Algorithms-on-the-Wisconsin-Diagnosti |
Framework | none |
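The paper's protocol is easy to reproduce in outline with scikit-learn, which ships the WDBC data. The sketch below follows the 70/30 split and reports accuracy, sensitivity, and specificity for an MLP; the hidden-layer sizes and other hyper-parameters are illustrative guesses, not the paper's settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

X, y = load_breast_cancer(return_X_y=True)  # WDBC: 569 samples, 30 features
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)  # 70/30 as in the paper

scaler = StandardScaler().fit(X_train)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
clf.fit(scaler.transform(X_train), y_train)

pred = clf.predict(scaler.transform(X_test))
# scikit-learn encodes 0 = malignant, 1 = benign; benign is "positive" here
tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
print('accuracy   :', accuracy_score(y_test, pred))
print('sensitivity:', tp / (tp + fn))  # true positive rate
print('specificity:', tn / (tn + fp))  # true negative rate
```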
Deep Expander Networks: Efficient Deep Networks from Graph Theory
Title | Deep Expander Networks: Efficient Deep Networks from Graph Theory |
Authors | Ameya Prabhu, Girish Varma, Anoop Namboodiri |
Abstract | Efficient CNN designs like ResNets and DenseNets were proposed to improve accuracy vs. efficiency trade-offs. They essentially increased connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs that are simultaneously sparse and well connected. Sparsity results in efficiency, while good connectedness preserves the expressive power of the CNN. We use expander graphs, a well-studied class of graphs from theoretical computer science that satisfies these properties. Expander graphs are used to model connections between filters in CNNs to design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: each node influences every node in a layer within logarithmically many steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander-based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique with the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope this work motivates other approaches that utilize results from graph theory to develop efficient network architectures. |
Tasks | |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08757v3 |
PDF | http://arxiv.org/pdf/1711.08757v3.pdf |
PWC | https://paperswithcode.com/paper/deep-expander-networks-efficient-deep |
Repo | https://github.com/osmr/imgclsmob |
Framework | mxnet |
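A minimal PyTorch sketch of the core idea, assuming the simplest construction: a linear layer whose weight matrix is masked by a fixed random d-regular bipartite graph (random d-regular bipartite graphs are expanders with high probability). The class name `XLinear` and the initialization scheme are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class XLinear(nn.Module):
    """Sparse linear layer with expander-style connectivity (sketch).

    Each output unit connects to `d` randomly chosen inputs; the mask is
    sampled once at construction and kept fixed during training.
    """
    def __init__(self, in_features, out_features, d):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        mask = torch.zeros(out_features, in_features)
        for row in range(out_features):
            cols = torch.randperm(in_features)[:d]
            mask[row, cols] = 1.0
        self.register_buffer('mask', mask)  # fixed, never trained

    def forward(self, x):
        return x @ (self.weight * self.mask).t()

layer = XLinear(512, 256, d=32)          # 16x fewer effective connections
print(layer(torch.randn(8, 512)).shape)  # torch.Size([8, 256])
```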
A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion
Title | A Good Practice Towards Top Performance of Face Recognition: Transferred Deep Feature Fusion |
Authors | Lin Xiong, Jayashree Karlekar, Jian Zhao, Yi Cheng, Yan Xu, Jiashi Feng, Sugiri Pranata, Shengmei Shen |
Abstract | Unconstrained face recognition performance evaluations have traditionally focused on the Labeled Faces in the Wild (LFW) dataset for imagery and the YouTube Faces (YTF) dataset for videos over the last couple of years. Spectacular progress in this field has resulted in saturated verification and identification accuracies on those benchmark datasets. In this paper, we propose a unified learning framework named Transferred Deep Feature Fusion (TDFF) targeting the new IARPA Janus Benchmark A (IJB-A) face recognition dataset released by the NIST face challenge. The IJB-A dataset includes real-world unconstrained faces from 500 subjects with full pose and illumination variations, making it much harder than the LFW and YTF datasets. Inspired by transfer learning, we train two advanced deep convolutional neural networks (DCNNs) on two different large datasets in the source domain. Exploiting the complementarity of the two distinct DCNNs, deep feature fusion is applied after feature extraction in the target domain. Then, template-specific linear SVMs are adopted to enhance the discriminative power of the framework. Finally, multiple matching scores corresponding to different templates are merged into the final results. This simple unified framework exhibits excellent performance on the IJB-A dataset. Based on the proposed approach, we have submitted our IJB-A results to the National Institute of Standards and Technology (NIST) for official evaluation. Moreover, by introducing new data and an advanced neural architecture, our method outperforms the state-of-the-art by a wide margin on the IJB-A dataset. |
Tasks | Face Recognition, Transfer Learning |
Published | 2017-04-03 |
URL | http://arxiv.org/abs/1704.00438v2 |
PDF | http://arxiv.org/pdf/1704.00438v2.pdf |
PWC | https://paperswithcode.com/paper/a-good-practice-towards-top-performance-of |
Repo | https://github.com/bruinxiong/Evaluation_IJBA |
Framework | none |
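A hedged sketch of the fusion plus template-specific SVM step described in the abstract: L2-normalize and concatenate the features from the two networks, then train a one-vs-rest linear SVM per template. All data, the negative-pool construction, and the C value are illustrative assumptions; only the overall pipeline shape comes from the abstract.

```python
import numpy as np
from sklearn.svm import LinearSVC

def l2norm(v):
    return v / (np.linalg.norm(v, axis=1, keepdims=True) + 1e-10)

# Hypothetical stand-ins for features extracted by the two source-domain DCNNs
feats_net_a = np.random.randn(200, 256).astype(np.float32)
feats_net_b = np.random.randn(200, 256).astype(np.float32)
fused = np.concatenate([l2norm(feats_net_a), l2norm(feats_net_b)], axis=1)

# One template-specific SVM: this template's media vs. a pool of negatives
template_feats = fused[:5]      # media belonging to one IJB-A template
negative_feats = fused[100:]    # background/negative set (assumed)
X = np.vstack([template_feats, negative_feats])
y = np.array([1] * len(template_feats) + [0] * len(negative_feats))
svm = LinearSVC(C=10.0).fit(X, y)

probe = fused[50:51]
print(svm.decision_function(probe))  # one of the scores merged per template
```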
Towards A Rigorous Science of Interpretable Machine Learning
Title | Towards A Rigorous Science of Interpretable Machine Learning |
Authors | Finale Doshi-Velez, Been Kim |
Abstract | As machine learning systems become ubiquitous, there has been a surge of interest in interpretable machine learning: systems that provide explanation for their outputs. These explanations are often used to qualitatively assess other criteria such as safety or non-discrimination. However, despite the interest in interpretability, there is very little consensus on what interpretable machine learning is and how it should be measured. In this position paper, we first define interpretability and describe when interpretability is needed (and when it is not). Next, we suggest a taxonomy for rigorous evaluation and expose open questions towards a more rigorous science of interpretable machine learning. |
Tasks | Interpretable Machine Learning |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08608v2 |
PDF | http://arxiv.org/pdf/1702.08608v2.pdf |
PWC | https://paperswithcode.com/paper/towards-a-rigorous-science-of-interpretable |
Repo | https://github.com/QueensGambit/PGM-Causal-Reasoning |
Framework | none |
Image Segmentation to Distinguish Between Overlapping Human Chromosomes
Title | Image Segmentation to Distinguish Between Overlapping Human Chromosomes |
Authors | R. Lily Hu, Jeremy Karnowski, Ross Fadely, Jean-Patrick Pommier |
Abstract | In medicine, visualizing chromosomes is important for diagnostics, drug development, and biomedical research. Unfortunately, chromosomes often overlap, and it is necessary to identify and distinguish between the overlapping chromosomes. A segmentation solution that is fast and automated will enable the scaling of cost-effective medicine and biomedical research. We apply neural-network-based image segmentation to the problem of distinguishing between partially overlapping DNA chromosomes. A convolutional neural network is customized for this problem. The results achieved intersection-over-union (IOU) scores of 94.7% for the overlapping region and 88-94% for the non-overlapping chromosome regions. |
Tasks | Semantic Segmentation |
Published | 2017-12-20 |
URL | http://arxiv.org/abs/1712.07639v1 |
PDF | http://arxiv.org/pdf/1712.07639v1.pdf |
PWC | https://paperswithcode.com/paper/image-segmentation-to-distinguish-between |
Repo | https://github.com/LilyHu/Academic_Papers |
Framework | none |
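The reported scores are per-class IOU over the label map. A minimal numpy sketch of that metric follows; the four-class label convention (background, each chromosome, overlap) is an assumption consistent with the task description, not necessarily the paper's exact encoding.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """Intersection-over-union per class for integer label maps."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union else float('nan'))
    return ious

# Assumed labels: 0=background, 1/2=the two chromosomes, 3=overlap region
pred   = np.random.randint(0, 4, size=(94, 93))
target = np.random.randint(0, 4, size=(94, 93))
print(per_class_iou(pred, target, num_classes=4))
```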
Variational Dropout Sparsifies Deep Neural Networks
Title | Variational Dropout Sparsifies Deep Neural Networks |
Authors | Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov |
Abstract | We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, this leads to extremely sparse solutions in both fully-connected and convolutional layers. The effect is similar to the automatic relevance determination effect in empirical Bayes, but has a number of advantages. We reduce the number of parameters by up to 280 times on LeNet architectures and by up to 68 times on VGG-like networks with a negligible decrease in accuracy. |
Tasks | Sparse Learning |
Published | 2017-01-19 |
URL | http://arxiv.org/abs/1701.05369v3 |
PDF | http://arxiv.org/pdf/1701.05369v3.pdf |
PWC | https://paperswithcode.com/paper/variational-dropout-sparsifies-deep-neural |
Repo | https://github.com/Leensman/VarDropPytorch |
Framework | pytorch |
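A compact PyTorch sketch of a sparse variational dropout linear layer, using the polynomial KL approximation reported in the paper and the common log-alpha > 3 pruning threshold. The initialization values and layer structure are assumptions; the KL constants come from Molchanov et al. (2017).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseVDLinear(nn.Module):
    """Linear layer with per-weight variational dropout (sketch)."""
    def __init__(self, in_f, out_f):
        super().__init__()
        self.theta = nn.Parameter(torch.randn(out_f, in_f) * 0.01)
        self.log_sigma2 = nn.Parameter(torch.full((out_f, in_f), -10.0))

    @property
    def log_alpha(self):
        return self.log_sigma2 - torch.log(self.theta ** 2 + 1e-10)

    def forward(self, x):
        if self.training:  # local reparameterization trick
            mean = F.linear(x, self.theta)
            std = torch.sqrt(F.linear(x ** 2, torch.exp(self.log_sigma2)) + 1e-10)
            return mean + std * torch.randn_like(mean)
        # at test time, prune weights with high dropout rate (log alpha > 3)
        return F.linear(x, self.theta * (self.log_alpha < 3.0).float())

    def kl(self):
        # polynomial approximation of -KL from the paper
        k1, k2, k3 = 0.63576, 1.8732, 1.48695
        la = self.log_alpha
        neg_kl = k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1
        return -neg_kl.sum()
```

During training, the objective would be the task loss plus the summed `kl()` terms of all such layers, typically scaled by a warm-up coefficient.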
Repulsion Loss: Detecting Pedestrians in a Crowd
Title | Repulsion Loss: Detecting Pedestrians in a Crowd |
Authors | Xinlong Wang, Tete Xiao, Yuning Jiang, Shuai Shao, Jian Sun, Chunhua Shen |
Abstract | Detecting individual pedestrians in a crowd remains a challenging problem, since pedestrians often gather together and occlude each other in real-world scenarios. In this paper, we first explore how a state-of-the-art pedestrian detector is harmed by crowd occlusion via experimentation, providing insights into the crowd occlusion problem. We then propose a novel bounding-box regression loss specifically designed for crowd scenes, termed repulsion loss. This loss is driven by two motivations: attraction by the target and repulsion by other surrounding objects. The repulsion term prevents a proposal from shifting to surrounding objects, leading to more crowd-robust localization. Our detector trained with repulsion loss outperforms all state-of-the-art methods, with a significant improvement in occlusion cases. |
Tasks | Pedestrian Detection |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07752v2 |
PDF | http://arxiv.org/pdf/1711.07752v2.pdf |
PWC | https://paperswithcode.com/paper/repulsion-loss-detecting-pedestrians-in-a |
Repo | https://github.com/bailvwangzi/repulsion_loss_ssd |
Framework | pytorch |
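A sketch of the repulsion term between a prediction and a non-target ground truth (the RepGT term), using the paper's smooth-ln penalty on intersection-over-ground-truth (IoG). The pairing of predictions with their most-overlapping non-target boxes is omitted here; the box values are illustrative.

```python
import torch

def iog(pred, gt):
    """Intersection over ground-truth area for [x1, y1, x2, y2] boxes."""
    x1 = torch.max(pred[:, 0], gt[:, 0]); y1 = torch.max(pred[:, 1], gt[:, 1])
    x2 = torch.min(pred[:, 2], gt[:, 2]); y2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    gt_area = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    return inter / (gt_area + 1e-10)

def smooth_ln(x, sigma=0.5):
    """Log barrier below sigma, linear continuation above it."""
    return torch.where(
        x <= sigma,
        -torch.log(1.0 - x + 1e-10),
        (x - sigma) / (1.0 - sigma) - torch.log(torch.tensor(1.0 - sigma)),
    )

# RepGT: push each prediction away from its most-overlapping non-target GT
preds = torch.tensor([[10., 10., 50., 90.]])
other_gts = torch.tensor([[30., 12., 70., 88.]])  # a neighbouring pedestrian
print(smooth_ln(iog(preds, other_gts)).mean())    # the repulsion penalty
```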
Illuminating Pedestrians via Simultaneous Detection & Segmentation
Title | Illuminating Pedestrians via Simultaneous Detection & Segmentation |
Authors | Garrick Brazil, Xi Yin, Xiaoming Liu |
Abstract | Pedestrian detection is a critical problem in computer vision with significant impact on safety in urban autonomous driving. In this work, we explore how semantic segmentation can be used to boost pedestrian detection accuracy while having little to no impact on network efficiency. We propose a segmentation infusion network to enable joint supervision on semantic segmentation and pedestrian detection. When placed properly, the additional supervision helps guide features in shared layers to become more sophisticated and helpful for the downstream pedestrian detector. Using this approach, we find weakly annotated boxes to be sufficient for considerable performance gains. We provide an in-depth analysis to demonstrate how shared layers are shaped by the segmentation supervision. In doing so, we show that the resulting feature maps become more semantically meaningful and robust to shape and occlusion. Overall, our simultaneous detection and segmentation framework achieves a considerable gain over the state-of-the-art on the Caltech pedestrian dataset, competitive performance on KITTI, and executes 2x faster than competitive methods. |
Tasks | Autonomous Driving, Pedestrian Detection, Semantic Segmentation |
Published | 2017-06-26 |
URL | http://arxiv.org/abs/1706.08564v1 |
PDF | http://arxiv.org/pdf/1706.08564v1.pdf |
PWC | https://paperswithcode.com/paper/illuminating-pedestrians-via-simultaneous |
Repo | https://github.com/Ricardozzf/sdsrcnn |
Framework | none |
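A hedged PyTorch sketch of the infusion idea: an auxiliary segmentation head on shared detector features, supervised with weak masks derived from pedestrian boxes, whose loss is added to the detection loss. The head design, stride, and loss weight are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegInfusionHead(nn.Module):
    """Auxiliary segmentation branch over shared detector features (sketch)."""
    def __init__(self, in_channels):
        super().__init__()
        self.seg = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, shared_feats):
        return self.seg(shared_feats)  # per-pixel pedestrian logit

def boxes_to_mask(boxes, h, w, stride):
    """Weak mask: mark every cell covered by a (downscaled) pedestrian box."""
    mask = torch.zeros(1, 1, h, w)
    for x1, y1, x2, y2 in boxes:
        mask[..., int(y1 / stride):int(y2 / stride) + 1,
                  int(x1 / stride):int(x2 / stride) + 1] = 1.0
    return mask

feats = torch.randn(1, 512, 38, 50)  # e.g. conv5 of a shared backbone
head = SegInfusionHead(512)
mask = boxes_to_mask([(120., 80., 180., 260.)], 38, 50, stride=16)
seg_loss = F.binary_cross_entropy_with_logits(head(feats), mask)
# total = detection_loss + lambda_seg * seg_loss  (lambda_seg is a tuned weight)
print(seg_loss)
```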
Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression
Title | Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression |
Authors | Kelly Peterson, Ognjen Rudovic, Ricardo Guerrero, Rosalind W. Picard |
Abstract | In this paper, we introduce the use of a personalized Gaussian Process model (pGP) to predict the key metrics of Alzheimer’s Disease progression (MMSE, ADAS-Cog13, CDRSB and CS) based on each patient’s previous visits. We start by learning a population-level model using multi-modal data from previously seen patients using the base Gaussian Process (GP) regression. Then, this model is adapted sequentially over time to a new patient using domain adaptive GPs to form the patient’s pGP. We show that this new approach, together with an auto-regressive formulation, leads to significant improvements in forecasting future clinical status and cognitive scores for target patients when compared to modeling the population with traditional GPs. |
Tasks | Future prediction, Gaussian Processes |
Published | 2017-12-01 |
URL | http://arxiv.org/abs/1712.00181v4 |
PDF | http://arxiv.org/pdf/1712.00181v4.pdf |
PWC | https://paperswithcode.com/paper/personalized-gaussian-processes-for-future |
Repo | https://github.com/yuriautsumi/PersonalizedGP |
Framework | tf |
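A scikit-learn sketch of one common population-to-patient adaptation scheme consistent with the abstract: fit a population GP on previously seen patients, then fit a second GP on the target patient's residuals so that its posterior corrects the population prediction at the next visit. This is a simplification of the paper's domain-adaptive GP, with toy 1-D "visit time" data standing in for the multi-modal inputs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)

# Population model from previously seen patients (toy 1-D input)
X_pop = np.random.rand(200, 1) * 10
y_pop = np.sin(X_pop).ravel() + 0.1 * np.random.randn(200)
pop_gp = GaussianProcessRegressor(kernel=kernel).fit(X_pop, y_pop)

# Personalization: model the target patient's residuals w.r.t. the population GP
X_pat = np.array([[1.0], [2.5], [4.0]])  # the patient's past visits
y_pat = np.sin(X_pat).ravel() + 0.3      # patient-specific offset (toy)
residuals = y_pat - pop_gp.predict(X_pat)
pgp = GaussianProcessRegressor(kernel=kernel).fit(X_pat, residuals)

X_next = np.array([[5.5]])               # forecast the next visit
print(pop_gp.predict(X_next) + pgp.predict(X_next))
```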
Deep Q-learning from Demonstrations
Title | Deep Q-learning from Demonstrations |
Authors | Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys |
Abstract | Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable in a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this paper we study a setting where the agent may access data from previous control of the system. We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages even relatively small amounts of demonstration data to massively accelerate the learning process, and that automatically assesses the necessary ratio of demonstration data while learning, thanks to a prioritized replay mechanism. DQfD works by combining temporal-difference updates with supervised classification of the demonstrator's actions. We show that DQfD has better initial performance than Prioritized Dueling Double Deep Q-Networks (PDD DQN): it starts with better scores on the first million steps on 41 of 42 games, and on average it takes PDD DQN 83 million steps to catch up to DQfD's performance. DQfD learns to outperform the best demonstration given in 14 of 42 games. In addition, DQfD leverages human demonstrations to achieve state-of-the-art results on 11 games. Finally, we show that DQfD performs better than three related algorithms for incorporating demonstration data into DQN. |
Tasks | Decision Making, Q-Learning |
Published | 2017-04-12 |
URL | http://arxiv.org/abs/1704.03732v4 |
PDF | http://arxiv.org/pdf/1704.03732v4.pdf |
PWC | https://paperswithcode.com/paper/deep-q-learning-from-demonstrations |
Repo | https://github.com/LilTwo/DRL-using-PyTorch |
Framework | pytorch |
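The supervised component DQfD adds to its TD updates is a large-margin classification loss on demonstration transitions: max_a [Q(s,a) + l(a_E, a)] - Q(s, a_E), where l is a positive margin for non-expert actions. A PyTorch sketch follows; the margin value 0.8 comes from the paper, while the batch shapes are illustrative.

```python
import torch

def large_margin_loss(q_values, expert_actions, margin=0.8):
    """Supervised DQfD term: max_a [Q(s,a) + l(a_E, a)] - Q(s, a_E).

    q_values: (batch, num_actions); expert_actions: (batch,).
    l(a_E, a) equals `margin` for a != a_E and 0 otherwise.
    """
    margins = torch.full_like(q_values, margin)
    margins.scatter_(1, expert_actions.unsqueeze(1), 0.0)  # zero at expert action
    augmented_max = (q_values + margins).max(dim=1).values
    q_expert = q_values.gather(1, expert_actions.unsqueeze(1)).squeeze(1)
    return (augmented_max - q_expert).mean()

q = torch.randn(4, 6, requires_grad=True)
a_e = torch.tensor([0, 2, 5, 1])
loss = large_margin_loss(q, a_e)
# the full DQfD loss also adds 1-step TD, n-step TD, and L2 terms (weighted)
print(loss)
```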
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Title | Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning |
Authors | Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson |
Abstract | Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep Q-learning relies. This paper proposes two methods that address this problem: 1) using a multi-agent variant of importance sampling to naturally decay obsolete data and 2) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL. |
Tasks | Multi-agent Reinforcement Learning, Q-Learning, Starcraft |
Published | 2017-02-28 |
URL | http://arxiv.org/abs/1702.08887v3 |
PDF | http://arxiv.org/pdf/1702.08887v3.pdf |
PWC | https://paperswithcode.com/paper/stabilising-experience-replay-for-deep-multi |
Repo | https://github.com/MUmarJaved/MultiAgent-Distributed-Reinforcement-Learning |
Framework | tf |
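The fingerprint idea from the abstract is simple to sketch: tag each agent's observation with a low-dimensional summary of where the other agents' policies were in training (e.g., the training-iteration fraction and the exploration rate) before storing it in replay, so the value function can disambiguate the age of sampled data. The specific two-component fingerprint below is one choice discussed in the paper; the shapes are illustrative.

```python
import numpy as np

def add_fingerprint(obs, train_step, total_steps, epsilon):
    """Append a policy-age fingerprint to an observation (sketch)."""
    fingerprint = np.array([train_step / total_steps, epsilon], dtype=np.float32)
    return np.concatenate([obs, fingerprint])

obs = np.random.rand(32).astype(np.float32)  # one agent's raw observation
aug = add_fingerprint(obs, train_step=40_000, total_steps=1_000_000, epsilon=0.35)
# The replay buffer stores `aug`; the Q-network conditions on the fingerprint,
# so stale transitions become identifiable instead of appearing non-stationary.
print(aug.shape)  # (34,)
```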
SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
Title | SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels |
Authors | Matthias Fey, Jan Eric Lenssen, Frank Weichert, Heinrich Müller |
Abstract | We present Spline-based Convolutional Neural Networks (SplineCNNs), a variant of deep neural networks for irregularly structured and geometric input, e.g., graphs or meshes. Our main contribution is a novel convolution operator based on B-splines that makes the computation time independent of the kernel size, due to the local support property of the B-spline basis functions. As a result, we obtain a generalization of the traditional CNN convolution operator using continuous kernel functions parametrized by a fixed number of trainable weights. In contrast to related approaches that filter in the spectral domain, the proposed method aggregates features purely in the spatial domain. In addition, SplineCNN allows complete end-to-end training of deep architectures, using only the geometric structure as input instead of handcrafted feature descriptors. For validation, we apply our method to tasks from the fields of image graph classification, shape correspondence, and graph node classification, and show that it matches or outperforms state-of-the-art approaches while being significantly faster and having favorable properties like domain independence. |
Tasks | Graph Classification, Node Classification |
Published | 2017-11-24 |
URL | http://arxiv.org/abs/1711.08920v2 |
PDF | http://arxiv.org/pdf/1711.08920v2.pdf |
PWC | https://paperswithcode.com/paper/splinecnn-fast-geometric-deep-learning-with |
Repo | https://github.com/rusty1s/pytorch_geometric |
Framework | pytorch |
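The linked pytorch_geometric repository ships this operator as SplineConv. A minimal usage sketch follows, assuming the torch_geometric API (constructor and forward signature may differ across versions); the toy graph and pseudo-coordinates are illustrative.

```python
import torch
from torch_geometric.nn import SplineConv

# Toy graph: 4 nodes, 3 undirected edges given in both directions
x = torch.randn(4, 16)                     # node features
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])
edge_attr = torch.rand(6, 2)               # pseudo-coordinates in [0, 1]^2

conv = SplineConv(in_channels=16, out_channels=32, dim=2, kernel_size=5)
out = conv(x, edge_index, edge_attr)       # B-spline-weighted aggregation
print(out.shape)                           # torch.Size([4, 32])
```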
JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
Title | JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction |
Authors | Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault |
Abstract | We present a new parallel corpus, the JHU FLuency-Extended GUG corpus (JFLEG), for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits that not only correct grammatical errors but also make the original text more native-sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC. |
Tasks | Grammatical Error Correction |
Published | 2017-02-14 |
URL | http://arxiv.org/abs/1702.04066v1 |
PDF | http://arxiv.org/pdf/1702.04066v1.pdf |
PWC | https://paperswithcode.com/paper/jfleg-a-fluency-corpus-and-benchmark-for |
Repo | https://github.com/keisks/jfleg |
Framework | none |
OpenNMT: Open-Source Toolkit for Neural Machine Translation
Title | OpenNMT: Open-Source Toolkit for Neural Machine Translation |
Authors | Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush |
Abstract | We describe an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements. The toolkit consists of modeling and translation support, as well as detailed pedagogical documentation about the underlying techniques. |
Tasks | Machine Translation |
Published | 2017-01-10 |
URL | http://arxiv.org/abs/1701.02810v2 |
PDF | http://arxiv.org/pdf/1701.02810v2.pdf |
PWC | https://paperswithcode.com/paper/opennmt-open-source-toolkit-for-neural |
Repo | https://github.com/OpenNMT/OpenNMT |
Framework | torch |