Paper Group AWR 129
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples. Selective Deep Convolutional Features for Image Retrieval. Calibrated Boosting-Forest. Gaussian Prototypical Networks for Few-Shot Learning on Omniglot. Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks …
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples
Title | Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples |
Authors | Kimin Lee, Honglak Lee, Kibok Lee, Jinwoo Shin |
Abstract | The problem of detecting whether a test sample comes from the in-distribution (i.e., the distribution of the classifier's training data) or from an out-of-distribution sufficiently different from it arises in many real-world machine learning applications. However, state-of-the-art deep neural networks are known to be highly overconfident in their predictions, i.e., they do not distinguish in- from out-of-distribution inputs. Recently, several threshold-based detectors operating on pre-trained neural classifiers have been proposed to handle this issue. However, the performance of these prior works depends heavily on how the classifiers are trained, since they focus only on improving the inference procedure. In this paper, we develop a novel training method for classifiers so that such inference algorithms can work better. In particular, we suggest two additional terms added to the original loss (e.g., cross entropy). The first forces the classifier to be less confident on out-of-distribution samples, and the second (implicitly) generates the most effective training samples for the first. In essence, our method jointly trains both a classification network and a generative network for out-of-distribution samples. We demonstrate its effectiveness using deep convolutional neural networks on various popular image datasets. |
Tasks | |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09325v3 |
http://arxiv.org/pdf/1711.09325v3.pdf | |
PWC | https://paperswithcode.com/paper/training-confidence-calibrated-classifiers |
Repo | https://github.com/alinlab/Confident_classifier |
Framework | pytorch |
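The two extra loss terms described in the abstract amount to a confidence penalty that pushes the classifier's softmax output toward the uniform distribution on out-of-distribution inputs, paired with a GAN that generates such inputs near the in-distribution boundary. A minimal PyTorch sketch of the classifier-side loss, assuming `logits_in`/`labels_in` come from in-distribution data and `logits_out` from generated boundary samples (names are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def confident_classifier_loss(logits_in, labels_in, logits_out, beta=1.0):
    """Cross entropy on in-distribution data plus a KL term that pushes
    the predictive distribution on OOD samples toward uniform."""
    ce = F.cross_entropy(logits_in, labels_in)
    log_p_out = F.log_softmax(logits_out, dim=1)
    # KL(U || p) equals -mean_k(log p_k) up to an additive constant (-log K)
    kl_uniform = -(log_p_out.mean(dim=1)).mean()
    return ce + beta * kl_uniform
```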
Selective Deep Convolutional Features for Image Retrieval
Title | Selective Deep Convolutional Features for Image Retrieval |
Authors | Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung |
Abstract | Convolutional Neural Networks (CNNs) are a very powerful approach for extracting discriminative local descriptors for effective image search. Recent work adopts fine-tuning strategies to further improve the discriminative power of the descriptors. Taking a different approach, in this paper, we propose a novel framework to achieve competitive retrieval performance. Firstly, we propose various masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, to select a representative subset of local convolutional features and remove a large number of redundant features. We demonstrate that this can effectively address the burstiness issue and improve retrieval accuracy. Secondly, we propose to employ recent embedding and aggregating methods to further enhance feature discriminability. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art retrieval accuracy. |
Tasks | Image Retrieval |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00809v2 |
http://arxiv.org/pdf/1707.00809v2.pdf | |
PWC | https://paperswithcode.com/paper/selective-deep-convolutional-features-for |
Repo | https://github.com/hnanhtuan/selectiveConvFeature |
Framework | none |
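The masking idea can be sketched in a few lines: MAX-mask keeps the spatial locations that win the per-channel maximum, while SUM-mask keeps locations whose channel-summed activation is high (the mean threshold below is an assumption, not the paper's exact rule). A rough NumPy sketch:

```python
import numpy as np

def max_mask(fmap):
    """fmap: (H, W, C) conv feature map. Keep spatial locations that are
    the spatial maximum of at least one channel (MAX-mask)."""
    h, w, c = fmap.shape
    flat = fmap.reshape(-1, c)            # (H*W, C)
    winners = np.argmax(flat, axis=0)     # best location per channel
    mask = np.zeros(h * w, dtype=bool)
    mask[winners] = True
    return mask.reshape(h, w)

def sum_mask(fmap):
    """Keep locations whose channel-sum activation exceeds the spatial mean
    (threshold choice is an assumption)."""
    s = fmap.sum(axis=2)
    return s > s.mean()

def select_features(fmap, mask):
    """Return the selected local descriptors, one per kept location."""
    return fmap[mask]                     # (num_kept, C)
```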
Calibrated Boosting-Forest
Title | Calibrated Boosting-Forest |
Authors | Haozhen Wu |
Abstract | Excellent ranking power along with well calibrated probability estimates are needed in many classification tasks. In this paper, we introduce a technique, Calibrated Boosting-Forest that captures both. This novel technique is an ensemble of gradient boosting machines that can support both continuous and binary labels. While offering superior ranking power over any individual regression or classification model, Calibrated Boosting-Forest is able to preserve well calibrated posterior probabilities. Along with these benefits, we provide an alternative to the tedious step of tuning gradient boosting machines. We demonstrate that tuning Calibrated Boosting-Forest can be reduced to a simple hyper-parameter selection. We further establish that increasing this hyper-parameter improves the ranking performance with diminishing returns. We examine the effectiveness of Calibrated Boosting-Forest on ligand-based virtual screening where both continuous and binary labels are available and compare the performance of Calibrated Boosting-Forest with logistic regression, gradient boosting machine and deep learning. Calibrated Boosting-Forest achieved an approximately 48% improvement compared to a state-of-the-art deep learning model. Moreover, it achieved around 95% improvement on probability quality measurement compared to the best individual gradient boosting machine. Calibrated Boosting-Forest offers a benchmark demonstration that in the field of ligand-based virtual screening, deep learning is not the universally dominant machine learning model, and well-calibrated probabilities can better facilitate the virtual screening process. |
Tasks | |
Published | 2017-10-16 |
URL | http://arxiv.org/abs/1710.05476v3 |
http://arxiv.org/pdf/1710.05476v3.pdf | |
PWC | https://paperswithcode.com/paper/calibrated-boosting-forest |
Repo | https://github.com/haozhenWu/Calibrated-Boosting-Forest |
Framework | none |
Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
Title | Gaussian Prototypical Networks for Few-Shot Learning on Omniglot |
Authors | Stanislav Fort |
Abstract | We propose a novel architecture for $k$-shot classification on the Omniglot dataset. Building on prototypical networks, we extend their architecture to what we call Gaussian prototypical networks. Prototypical networks learn a map between images and embedding vectors, and use their clustering for classification. In our model, a part of the encoder output is interpreted as a confidence region estimate about the embedding point, and expressed as a Gaussian covariance matrix. Our network then constructs a direction and class dependent distance metric on the embedding space, using uncertainties of individual data points as weights. We show that Gaussian prototypical networks are a preferred architecture over vanilla prototypical networks with an equivalent number of parameters. We report state-of-the-art performance in 1-shot and 5-shot classification in both the 5-way and 20-way regimes (for 5-shot 5-way, we are comparable to previous state-of-the-art) on the Omniglot dataset. We explore artificially down-sampling a fraction of images in the training set, which improves our performance even further. We therefore hypothesize that Gaussian prototypical networks might perform better in less homogeneous, noisier datasets, which are commonplace in real world applications. |
Tasks | Few-Shot Learning, Omniglot |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02735v1 |
http://arxiv.org/pdf/1708.02735v1.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-prototypical-networks-for-few-shot |
Repo | https://github.com/stanislavfort/gaussian-prototypical-networks |
Framework | tf |
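A minimal sketch of the Gaussian-prototype idea with a diagonal covariance: the encoder predicts an embedding and per-dimension inverse variances, prototypes are precision-weighted class means, and classification uses the resulting weighted distance. Function and variable names are illustrative, and this simplifies the paper's covariance parameterizations:

```python
import numpy as np

def gaussian_prototypes(embeddings, inv_var, labels, num_classes):
    """Confidence-weighted class prototypes.
    embeddings: (N, D), inv_var: (N, D) per-dimension inverse variances
    predicted by the encoder, labels: (N,)."""
    protos, proto_inv_var = [], []
    for c in range(num_classes):
        idx = labels == c
        w = inv_var[idx]                                  # (n_c, D)
        protos.append((w * embeddings[idx]).sum(0) / w.sum(0))
        proto_inv_var.append(w.sum(0))                    # aggregated precision
    return np.stack(protos), np.stack(proto_inv_var)

def weighted_sq_distance(query, protos, proto_inv_var):
    """Direction-dependent squared distance used for classification."""
    diff = query[:, None, :] - protos[None, :, :]         # (Q, K, D)
    return (proto_inv_var[None] * diff ** 2).sum(-1)      # (Q, K)
```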
Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks
Title | Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks |
Authors | Patrick Ferdinand Christ, Florian Ettlinger, Felix Grün, Mohamed Ezzeldin A. Elshaera, Jana Lipkova, Sebastian Schlecht, Freba Ahmaddy, Sunil Tatavarty, Marc Bickel, Patrick Bilic, Markus Rempfler, Felix Hofmann, Melvin D Anastasi, Seyed-Ahmad Ahmadi, Georgios Kaissis, Julian Holch, Wieland Sommer, Rickmer Braren, Volker Heinemann, Bjoern Menze |
Abstract | Automatic segmentation of the liver and hepatic lesions is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment liver and lesions in CT and MRI abdomen images using cascaded fully convolutional neural networks (CFCNs), enabling segmentation for large-scale medical trials and quantitative image analysis. We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver as ROI input for a second FCN. The second FCN solely segments lesions within the predicted liver ROIs of step 1. CFCN models were trained on an abdominal CT dataset comprising 100 hepatic tumor volumes. Validations on further datasets show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for liver with computation times below 100s per volume. We further experimentally demonstrate the robustness of the proposed method on 38 MRI liver tumor volumes and the public 3DIRCAD dataset. |
Tasks | Automatic Liver And Tumor Segmentation, Lesion Segmentation |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.05970v2 |
http://arxiv.org/pdf/1702.05970v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-liver-and-tumor-segmentation-of-ct |
Repo | https://github.com/IBBM/Cascaded-FCN |
Framework | tf |
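The cascade reduces to two steps at inference time: predict a liver mask, then restrict the lesion network's predictions to that region (the paper actually crops the predicted liver ROI before the second FCN; masking the second network's output is a simplification). A hedged sketch with placeholder model callables:

```python
import numpy as np

def cascaded_segmentation(volume, liver_fcn, lesion_fcn, threshold=0.5):
    """Two-step cascade: segment the liver, then keep lesion predictions only
    inside the predicted liver ROI. liver_fcn / lesion_fcn are any callables
    returning per-voxel probabilities (placeholders, not the repo's API)."""
    liver_prob = liver_fcn(volume)
    liver_mask = liver_prob > threshold
    # restrict the second network's output to the liver region
    lesion_prob = lesion_fcn(volume) * liver_mask
    lesion_mask = lesion_prob > threshold
    return liver_mask, lesion_mask
```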
MMD GAN: Towards Deeper Understanding of Moment Matching Network
Title | MMD GAN: Towards Deeper Understanding of Moment Matching Network |
Authors | Chun-Liang Li, Wei-Cheng Chang, Yu Cheng, Yiming Yang, Barnabás Póczos |
Abstract | Generative moment matching network (GMMN) is a deep generative model that differs from Generative Adversarial Network (GAN) by replacing the discriminator in GAN with a two-sample test based on kernel maximum mean discrepancy (MMD). Although some theoretical guarantees of MMD have been studied, the empirical performance of GMMN is still not as competitive as that of GAN on challenging and large benchmark datasets. The computational efficiency of GMMN is also less desirable in comparison with GAN, partially due to its requirement for a rather large batch size during the training. In this paper, we propose to improve both the model expressiveness of GMMN and its computational efficiency by introducing adversarial kernel learning techniques, as the replacement of a fixed Gaussian kernel in the original GMMN. The new approach combines the key ideas in both GMMN and GAN, hence we name it MMD GAN. The new distance measure in MMD GAN is a meaningful loss that enjoys the advantage of weak topology and can be optimized via gradient descent with relatively small batch sizes. In our evaluation on multiple benchmark datasets, including MNIST, CIFAR-10, CelebA and LSUN, MMD GAN significantly outperforms GMMN, and is competitive with other representative GAN works. |
Tasks | |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.08584v3 |
http://arxiv.org/pdf/1705.08584v3.pdf | |
PWC | https://paperswithcode.com/paper/mmd-gan-towards-deeper-understanding-of |
Repo | https://github.com/OctoberChang/MMD-GAN |
Framework | pytorch |
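The core quantity is a kernel two-sample statistic computed on features from an adversarially trained encoder: the generator minimizes it while the encoder (the learned kernel) maximizes it. A minimal PyTorch sketch of the biased squared-MMD estimate with a Gaussian kernel (the paper's bandwidth handling is richer than this single-sigma version, and names are illustrative):

```python
import torch

def gaussian_kernel(a, b, sigma=1.0):
    """a: (n, d), b: (m, d) feature matrices."""
    d2 = torch.cdist(a, b) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2(f_real, f_fake, sigma=1.0):
    """Biased estimate of squared MMD between real and generated samples,
    computed on features f(.) from the adversarially trained encoder."""
    k_xx = gaussian_kernel(f_real, f_real, sigma).mean()
    k_yy = gaussian_kernel(f_fake, f_fake, sigma).mean()
    k_xy = gaussian_kernel(f_real, f_fake, sigma).mean()
    return k_xx + k_yy - 2 * k_xy
```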
The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams
Title | The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams |
Authors | David Graus, Daan Odijk, Maarten de Rijke |
Abstract | We study how collective memories are formed online. We do so by tracking entities that emerge in public discourse, that is, in online text streams such as social media and news streams, before they are incorporated into Wikipedia, which, we argue, can be viewed as an online place for collective memory. By tracking how entities emerge in public discourse, i.e., the temporal patterns between their first mention in online text streams and subsequent incorporation into collective memory, we gain insights into how the collective remembrance process happens online. Specifically, we analyze nearly 80,000 entities as they emerge in online text streams before they are incorporated into Wikipedia. The online text streams we use for our analysis comprise social media and news streams, and span over 579 million documents in a timespan of 18 months. We discover two main emergence patterns: entities that emerge in a “bursty” fashion, i.e., that appear in public discourse without a precedent, blast into activity and transition into collective memory. Other entities display a “delayed” pattern, where they appear in public discourse, experience a period of inactivity, and then resurface before transitioning into our cultural collective memory. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04039v2 |
http://arxiv.org/pdf/1701.04039v2.pdf | |
PWC | https://paperswithcode.com/paper/the-birth-of-collective-memories-analyzing |
Repo | https://github.com/graus/emerging-entities-timeseries |
Framework | none |
Improving Object Detection from Scratch via Gated Feature Reuse
Title | Improving Object Detection from Scratch via Gated Feature Reuse |
Authors | Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides |
Abstract | In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models). We call our module GFR (Gated Feature Reuse), which exhibits two main advantages. First, we introduce a novel gate-controlled prediction strategy enabled by Squeeze-and-Excitation to adaptively enhance or attenuate supervision at different scales based on the input object size. As a result, our model is more effective in detecting diverse sizes of objects. Second, we propose a feature-pyramids structure to squeeze rich spatial and semantic features into a single prediction layer, which strengthens feature representation and reduces the number of parameters to learn. We apply the proposed structure on DSOD and SSD detection frameworks, and evaluate the performance on PASCAL VOC 2007, 2012 and COCO datasets. With fewer model parameters, GFR-DSOD outperforms the baseline DSOD by 1.4%, 1.1%, 1.7% and 0.6%, respectively. GFR-SSD also outperforms the original SSD and SSD with dense prediction by 3.6% and 2.8% on VOC 2007 dataset. Code is available at: https://github.com/szq0214/GFR-DSOD . |
Tasks | Object Detection |
Published | 2017-12-04 |
URL | https://arxiv.org/abs/1712.00886v2 |
https://arxiv.org/pdf/1712.00886v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-object-detectors-from-scratch-with |
Repo | https://github.com/szq0214/GRP-DSOD |
Framework | pytorch |
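The gate-controlled prediction is built on a Squeeze-and-Excitation style channel gate: globally pool a feature map, pass it through a small bottleneck MLP, and rescale channels before prediction. A sketch of such a gate (illustrative of the mechanism, not the exact GFR module):

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Squeeze-and-Excitation style gate: global pooling followed by a small
    bottleneck MLP that rescales each channel of the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                   # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                             # excite: per-channel rescaling
```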
Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection
Title | Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection |
Authors | Xiaojun Xu, Chang Liu, Qian Feng, Heng Yin, Le Song, Dawn Song |
Abstract | The problem of cross-platform binary code similarity detection aims at detecting whether two binary functions coming from different platforms are similar or not. It has many security applications, including plagiarism detection, malware detection, vulnerability search, etc. Existing approaches rely on approximate graph matching algorithms, which are inevitably slow and sometimes inaccurate, and hard to adapt to a new task. To address these issues, in this work, we propose a novel neural network-based approach to compute the embedding, i.e., a numeric vector, based on the control flow graph of each binary function, then the similarity detection can be done efficiently by measuring the distance between the embeddings for two functions. We implement a prototype called Gemini. Our extensive evaluation shows that Gemini outperforms the state-of-the-art approaches by large margins with respect to similarity detection accuracy. Further, Gemini can speed up prior art’s embedding generation time by 3 to 4 orders of magnitude and reduce the required training time from more than 1 week down to 30 minutes to 10 hours. Our real world case studies demonstrate that Gemini can identify significantly more vulnerable firmware images than the state-of-the-art, i.e., Genius. Our research showcases a successful application of deep learning on computer security problems. |
Tasks | Graph Embedding, Graph Matching, Malware Detection |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06525v4 |
http://arxiv.org/pdf/1708.06525v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-based-graph-embedding-for |
Repo | https://github.com/ruchikagargdiwakar/ml_cyber_security_usecases |
Framework | none |
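The embedding side can be pictured as a few rounds of message passing over the control flow graph followed by a graph-level readout; similarity is then a simple distance between the two function embeddings. A toy NumPy sketch of that pipeline (weight shapes and the propagation rule are illustrative simplifications of the paper's network):

```python
import numpy as np

def graph_embedding(adj, feats, W1, W2, iters=5):
    """Simple message-passing embedding of a control flow graph.
    adj: (n, n) adjacency, feats: (n, f) basic-block features,
    W1: (f, d), W2: (d, d). Returns a single d-dim graph embedding."""
    mu = np.zeros((adj.shape[0], W1.shape[1]))
    for _ in range(iters):
        mu = np.tanh(feats @ W1 + adj @ mu @ W2)   # aggregate neighbor states
    return mu.sum(axis=0)                          # graph-level readout

def similarity(emb_a, emb_b):
    """Cosine similarity between two function embeddings."""
    return emb_a @ emb_b / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b) + 1e-8)
```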
Unified Deep Supervised Domain Adaptation and Generalization
Title | Unified Deep Supervised Domain Adaptation and Generalization |
Authors | Saeid Motiian, Marco Piccirilli, Donald A. Adjeroh, Gianfranco Doretto |
Abstract | This work provides a unified framework for addressing the problem of visual supervised domain adaptation and generalization with deep models. The main idea is to exploit the Siamese architecture to learn an embedding subspace that is discriminative, and where mapped visual domains are semantically aligned and yet maximally separated. The supervised setting becomes attractive especially when only few target data samples need to be labeled. In this scenario, alignment and separation of semantic probability distributions is difficult because of the lack of data. We found that reverting to point-wise surrogates of distribution distances and similarities provides an effective solution. In addition, the approach adapts quickly, requiring an extremely low number of labeled target training samples; even one per category can be effective. The approach is extended to domain generalization. For both applications the experiments show very promising results. |
Tasks | Domain Adaptation, Domain Generalization |
Published | 2017-09-28 |
URL | http://arxiv.org/abs/1709.10190v1 |
http://arxiv.org/pdf/1709.10190v1.pdf | |
PWC | https://paperswithcode.com/paper/unified-deep-supervised-domain-adaptation-and |
Repo | https://github.com/samotiian/CCSA |
Framework | none |
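The "point-wise surrogates" boil down to a contrastive loss over paired source/target embeddings: same-class pairs are pulled together, different-class pairs are pushed at least a margin apart. A minimal PyTorch sketch of that semantic alignment/separation term (names and the exact weighting are assumptions):

```python
import torch
import torch.nn.functional as F

def csa_loss(f_src, f_tgt, same_class, margin=1.0):
    """Contrastive semantic alignment: f_src, f_tgt are (N, D) embeddings of
    paired source/target samples, same_class is a (N,) 0/1 tensor."""
    d = F.pairwise_distance(f_src, f_tgt)
    # pull same-class pairs together
    align = (same_class.float() * d ** 2).mean()
    # push different-class pairs at least `margin` apart
    separate = ((1 - same_class.float()) * torch.clamp(margin - d, min=0) ** 2).mean()
    return align + separate
```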
Learning to Generalize: Meta-Learning for Domain Generalization
Title | Learning to Generalize: Meta-Learning for Domain Generalization |
Authors | Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales |
Abstract | Domain shift refers to the well known problem that a model trained in one source domain performs poorly when applied to a target domain with different statistics. Domain Generalization (DG) techniques attempt to alleviate this issue by producing models which by design generalize well to novel testing domains. We propose a novel meta-learning method for domain generalization. Rather than designing a specific model that is robust to domain shift as in most previous DG work, we propose a model agnostic training procedure for DG. Our algorithm simulates train/test domain shift during training by synthesizing virtual testing domains within each mini-batch. The meta-optimization objective requires that steps to improve training domain performance should also improve testing domain performance. This meta-learning procedure trains models with good generalization ability to novel domains. We evaluate our method and achieve state of the art results on a recent cross-domain image classification benchmark, as well as demonstrating its potential on two classic reinforcement learning tasks. |
Tasks | Domain Generalization, Image Classification, Meta-Learning |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.03463v1 |
http://arxiv.org/pdf/1710.03463v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generalize-meta-learning-for |
Repo | https://github.com/HAHA-DL/MLDG |
Framework | pytorch |
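The meta-objective can be approximated with a virtual gradient step: compute the meta-train loss, take a small step on a copy of the model, and add the gradient of the meta-test loss evaluated at the updated parameters. A simplified first-order PyTorch sketch (the paper back-propagates through the virtual step rather than using this approximation; names are illustrative):

```python
import copy
import torch

def mldg_step(model, optimizer, loss_fn, meta_train, meta_test, alpha=1e-2, beta=1.0):
    """One first-order MLDG-style update. meta_train / meta_test are lists of
    (x, y) batches drawn from the meta-train and held-out virtual test domains."""
    optimizer.zero_grad()

    train_loss = sum(loss_fn(model(x), y) for x, y in meta_train)
    train_grads = torch.autograd.grad(train_loss, model.parameters())

    # virtual update on a detached copy of the model
    updated = copy.deepcopy(model)
    with torch.no_grad():
        for p, g in zip(updated.parameters(), train_grads):
            p -= alpha * g

    test_loss = sum(loss_fn(updated(x), y) for x, y in meta_test)
    test_grads = torch.autograd.grad(test_loss, updated.parameters())

    # combine the two gradient contributions and step the real model
    for p, g_tr, g_te in zip(model.parameters(), train_grads, test_grads):
        p.grad = g_tr + beta * g_te
    optimizer.step()
    return train_loss.item(), test_loss.item()
```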
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Title | Progressive Growing of GANs for Improved Quality, Stability, and Variation |
Authors | Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen |
Abstract | We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset. |
Tasks | Face Generation, Image Generation |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10196v3 |
http://arxiv.org/pdf/1710.10196v3.pdf | |
PWC | https://paperswithcode.com/paper/progressive-growing-of-gans-for-improved |
Repo | https://github.com/MNaplesDevelopment/DCGAN-PyTorch |
Framework | pytorch |
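The transition between resolutions is a simple linear fade-in: the output of the new, higher-resolution block is blended with the upsampled output of the previous stage while a coefficient alpha ramps from 0 to 1 during training. A minimal PyTorch sketch of that blend:

```python
import torch
import torch.nn.functional as F

def fade_in(x_prev, x_new, alpha):
    """Progressive-growing transition: blend the upsampled output of the
    previous (lower) resolution stage with the new block's output.
    x_prev: (N, C, H, W), x_new: (N, C, 2H, 2W), alpha in [0, 1]."""
    up = F.interpolate(x_prev, scale_factor=2, mode='nearest')
    return (1.0 - alpha) * up + alpha * x_new
```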
The k-means-u* algorithm: non-local jumps and greedy retries improve k-means++ clustering
Title | The k-means-u* algorithm: non-local jumps and greedy retries improve k-means++ clustering |
Authors | Bernd Fritzke |
Abstract | We present a new clustering algorithm called k-means-u* which in many cases is able to significantly improve the clusterings found by k-means++, the current de-facto standard for clustering in Euclidean spaces. First we introduce the k-means-u algorithm which starts from a result of k-means++ and attempts to improve it with a sequence of non-local “jumps” alternated by runs of standard k-means. Each jump transfers the “least useful” center towards the center with the largest local error, offset by a small random vector. This is continued as long as the error decreases and often leads to an improved solution. Occasionally k-means-u terminates despite obvious remaining optimization possibilities. By allowing a limited number of retries for the last jump it is frequently possible to reach better local minima. The resulting algorithm is called k-means-u* and dominates k-means++ wrt. solution quality which is demonstrated empirically using various data sets. By construction the logarithmic quality bound established for k-means++ holds for k-means-u* as well. |
Tasks | |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.09059v2 |
http://arxiv.org/pdf/1706.09059v2.pdf | |
PWC | https://paperswithcode.com/paper/the-k-means-u-algorithm-non-local-jumps-and |
Repo | https://github.com/gittar/k-means-u-star |
Framework | none |
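The jump step can be sketched directly on top of scikit-learn's k-means++: move the "least useful" center next to the center with the largest local error, re-run standard k-means, and keep the result only if the total error drops. This is a hedged simplification — the paper's utility criterion and the greedy-retry logic of k-means-u* are not reproduced here:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_u(X, k, eps=1e-3, seed=0):
    """Sketch of the k-means-u loop with a simplified 'least useful' criterion
    (the cluster with the smallest error donates its center)."""
    rng = np.random.default_rng(seed)
    km = KMeans(n_clusters=k, init='k-means++', n_init=10, random_state=seed).fit(X)
    centers, error = km.cluster_centers_, km.inertia_

    while True:
        labels = km.labels_
        per_cluster = np.array([((X[labels == c] - centers[c]) ** 2).sum()
                                for c in range(k)])
        donor, target = per_cluster.argmin(), per_cluster.argmax()
        trial = centers.copy()
        # non-local jump: place the donor center near the worst cluster's center
        trial[donor] = centers[target] + eps * rng.standard_normal(X.shape[1])
        km_trial = KMeans(n_clusters=k, init=trial, n_init=1).fit(X)
        if km_trial.inertia_ >= error:
            break                                   # jump did not help; stop
        km, centers, error = km_trial, km_trial.cluster_centers_, km_trial.inertia_
    return centers, error
```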
Egocentric Video Description based on Temporally-Linked Sequences
Title | Egocentric Video Description based on Temporally-Linked Sequences |
Authors | Marc Bolaños, Álvaro Peris, Francisco Casacuberta, Sergi Soler, Petia Radeva |
Abstract | Egocentric vision consists in acquiring images along the day from a first person point-of-view using wearable cameras. The automatic analysis of this information allows us to discover daily patterns for improving the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also publish the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Furthermore, we prove that our proposal outperforms classical attentional encoder-decoder methods for video description. |
Tasks | Video Description |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02163v3 |
http://arxiv.org/pdf/1704.02163v3.pdf | |
PWC | https://paperswithcode.com/paper/egocentric-video-description-based-on |
Repo | https://github.com/MarcBS/TMA |
Framework | none |
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Title | VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection |
Authors | Yin Zhou, Oncel Tuzel |
Abstract | Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird’s eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR. |
Tasks | 3D Object Detection, Autonomous Navigation, Feature Engineering, Object Detection, Object Localization |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06396v1 |
http://arxiv.org/pdf/1711.06396v1.pdf | |
PWC | https://paperswithcode.com/paper/voxelnet-end-to-end-learning-for-point-cloud |
Repo | https://github.com/Lw510107/pointnet-2018.6.27- |
Framework | tf |
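The VFE layer is a point-wise fully connected network followed by per-voxel max pooling, with the pooled feature concatenated back onto every point feature. A minimal PyTorch sketch of one such layer (the real implementation also masks padded/empty points, omitted here):

```python
import torch
import torch.nn as nn

class VFELayer(nn.Module):
    """Voxel Feature Encoding layer: point-wise FC, per-voxel max pooling,
    and concatenation of the aggregated feature onto each point."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(in_dim, out_dim // 2),
                                nn.BatchNorm1d(out_dim // 2),
                                nn.ReLU(inplace=True))

    def forward(self, points):                    # points: (voxels, pts, in_dim)
        v, t, _ = points.shape
        pw = self.fc(points.reshape(v * t, -1)).reshape(v, t, -1)
        agg = pw.max(dim=1, keepdim=True).values.expand(-1, t, -1)
        return torch.cat([pw, agg], dim=2)        # (voxels, pts, out_dim)
```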