Paper Group AWR 129
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples. Selective Deep Convolutional Features for Image Retrieval. Calibrated Boosting-Forest. Gaussian Prototypical Networks for Few-Shot Learning on Omniglot. Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks …
Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples
Title | Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples |
Authors | Kimin Lee, Honglak Lee, Kibok Lee, Jinwoo Shin |
Abstract | The problem of detecting whether a test sample comes from the in-distribution (i.e., the distribution of the classifier's training data) or from an out-of-distribution sufficiently different from it arises in many real-world machine learning applications. However, state-of-the-art deep neural networks are known to be highly overconfident in their predictions, i.e., they do not distinguish in- from out-of-distribution inputs. Recently, several threshold-based detectors operating on pre-trained neural classifiers have been proposed to handle this issue. However, the performance of these prior works depends heavily on how the classifiers are trained, since they focus only on improving the inference procedure. In this paper, we develop a novel training method for classifiers so that such inference algorithms can work better. In particular, we suggest two additional terms added to the original loss (e.g., cross entropy). The first forces the classifier to be less confident on out-of-distribution samples, and the second (implicitly) generates the most effective training samples for the first. In essence, our method jointly trains both a classification network and a generative network for out-of-distribution samples. We demonstrate its effectiveness using deep convolutional neural networks on various popular image datasets. |
Tasks | |
Published | 2017-11-26 |
URL | http://arxiv.org/abs/1711.09325v3 |
http://arxiv.org/pdf/1711.09325v3.pdf | |
PWC | https://paperswithcode.com/paper/training-confidence-calibrated-classifiers |
Repo | https://github.com/alinlab/Confident_classifier |
Framework | pytorch |
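The two extra loss terms described in the abstract amount to a confidence penalty that pushes the classifier's softmax output toward the uniform distribution on out-of-distribution inputs, paired with a GAN that generates such inputs near the in-distribution boundary. A minimal PyTorch sketch of the classifier-side loss, assuming `logits_in`/`labels_in` come from in-distribution data and `logits_out` from generated boundary samples (names are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def confident_classifier_loss(logits_in, labels_in, logits_out, beta=1.0):
    """Cross entropy on in-distribution data plus a KL term that pushes
    the predictive distribution on OOD samples toward uniform."""
    ce = F.cross_entropy(logits_in, labels_in)
    log_p_out = F.log_softmax(logits_out, dim=1)
    # KL(U || p) equals -mean_k(log p_k) up to an additive constant (-log K)
    kl_uniform = -(log_p_out.mean(dim=1)).mean()
    return ce + beta * kl_uniform
```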
Selective Deep Convolutional Features for Image Retrieval
Title | Selective Deep Convolutional Features for Image Retrieval |
Authors | Tuan Hoang, Thanh-Toan Do, Dang-Khoa Le Tan, Ngai-Man Cheung |
Abstract | Convolutional Neural Networks (CNNs) are a very powerful approach for extracting discriminative local descriptors for effective image search. Recent work adopts fine-tuning strategies to further improve the discriminative power of the descriptors. Taking a different approach, in this paper, we propose a novel framework to achieve competitive retrieval performance. Firstly, we propose various masking schemes, namely SIFT-mask, SUM-mask, and MAX-mask, to select a representative subset of local convolutional features and remove a large number of redundant features. We demonstrate that this can effectively address the burstiness issue and improve retrieval accuracy. Secondly, we propose to employ recent embedding and aggregating methods to further enhance feature discriminability. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art retrieval accuracy. |
Tasks | Image Retrieval |
Published | 2017-07-04 |
URL | http://arxiv.org/abs/1707.00809v2 |
http://arxiv.org/pdf/1707.00809v2.pdf | |
PWC | https://paperswithcode.com/paper/selective-deep-convolutional-features-for |
Repo | https://github.com/hnanhtuan/selectiveConvFeature |
Framework | none |
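The masking idea can be sketched in a few lines: MAX-mask keeps the spatial locations that win the per-channel maximum, while SUM-mask keeps locations whose channel-summed activation is high (the mean threshold below is an assumption, not the paper's exact rule). A rough NumPy sketch:

```python
import numpy as np

def max_mask(fmap):
    """fmap: (H, W, C) conv feature map. Keep spatial locations that are
    the spatial maximum of at least one channel (MAX-mask)."""
    h, w, c = fmap.shape
    flat = fmap.reshape(-1, c)            # (H*W, C)
    winners = np.argmax(flat, axis=0)     # best location per channel
    mask = np.zeros(h * w, dtype=bool)
    mask[winners] = True
    return mask.reshape(h, w)

def sum_mask(fmap):
    """Keep locations whose channel-sum activation exceeds the spatial mean
    (threshold choice is an assumption)."""
    s = fmap.sum(axis=2)
    return s > s.mean()

def select_features(fmap, mask):
    """Return the selected local descriptors, one per kept location."""
    return fmap[mask]                     # (num_kept, C)
```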
Calibrated Boosting-Forest
Title | Calibrated Boosting-Forest |
Authors | Haozhen Wu |
Abstract | Excellent ranking power along with well calibrated probability estimates are needed in many classification tasks. In this paper, we introduce a technique, Calibrated Boosting-Forest that captures both. This novel technique is an ensemble of gradient boosting machines that can support both continuous and binary labels. While offering superior ranking power over any individual regression or classification model, Calibrated Boosting-Forest is able to preserve well calibrated posterior probabilities. Along with these benefits, we provide an alternative to the tedious step of tuning gradient boosting machines. We demonstrate that tuning Calibrated Boosting-Forest can be reduced to a simple hyper-parameter selection. We further establish that increasing this hyper-parameter improves the ranking performance with diminishing returns. We examine the effectiveness of Calibrated Boosting-Forest on ligand-based virtual screening where both continuous and binary labels are available and compare the performance of Calibrated Boosting-Forest with logistic regression, gradient boosting machine and deep learning. Calibrated Boosting-Forest achieved an approximately 48% improvement compared to a state-of-the-art deep learning model. Moreover, it achieved around 95% improvement on probability quality measurement compared to the best individual gradient boosting machine. Calibrated Boosting-Forest offers a benchmark demonstration that in the field of ligand-based virtual screening, deep learning is not the universally dominant machine learning model, and well-calibrated probabilities can better facilitate the virtual screening process. |
Tasks | |
Published | 2017-10-16 |
URL | http://arxiv.org/abs/1710.05476v3 |
http://arxiv.org/pdf/1710.05476v3.pdf | |
PWC | https://paperswithcode.com/paper/calibrated-boosting-forest |
Repo | https://github.com/haozhenWu/Calibrated-Boosting-Forest |
Framework | none |
Gaussian Prototypical Networks for Few-Shot Learning on Omniglot
Title | Gaussian Prototypical Networks for Few-Shot Learning on Omniglot |
Authors | Stanislav Fort |
Abstract | We propose a novel architecture for $k$-shot classification on the Omniglot dataset. Building on prototypical networks, we extend their architecture to what we call Gaussian prototypical networks. Prototypical networks learn a map between images and embedding vectors, and use their clustering for classification. In our model, a part of the encoder output is interpreted as a confidence region estimate about the embedding point, and expressed as a Gaussian covariance matrix. Our network then constructs a direction and class dependent distance metric on the embedding space, using uncertainties of individual data points as weights. We show that Gaussian prototypical networks are a preferred architecture over vanilla prototypical networks with an equivalent number of parameters. We report state-of-the-art performance in 1-shot and 5-shot classification in both the 5-way and 20-way regimes (for 5-shot 5-way, we are comparable to previous state-of-the-art) on the Omniglot dataset. We explore artificially down-sampling a fraction of images in the training set, which improves our performance even further. We therefore hypothesize that Gaussian prototypical networks might perform better in less homogeneous, noisier datasets, which are commonplace in real world applications. |
Tasks | Few-Shot Learning, Omniglot |
Published | 2017-08-09 |
URL | http://arxiv.org/abs/1708.02735v1 |
http://arxiv.org/pdf/1708.02735v1.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-prototypical-networks-for-few-shot |
Repo | https://github.com/stanislavfort/gaussian-prototypical-networks |
Framework | tf |
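A minimal sketch of the Gaussian-prototype idea with a diagonal covariance: the encoder predicts an embedding and per-dimension inverse variances, prototypes are precision-weighted class means, and classification uses the resulting weighted distance. Function and variable names are illustrative, and this simplifies the paper's covariance parameterizations:

```python
import numpy as np

def gaussian_prototypes(embeddings, inv_var, labels, num_classes):
    """Confidence-weighted class prototypes.
    embeddings: (N, D), inv_var: (N, D) per-dimension inverse variances
    predicted by the encoder, labels: (N,)."""
    protos, proto_inv_var = [], []
    for c in range(num_classes):
        idx = labels == c
        w = inv_var[idx]                                  # (n_c, D)
        protos.append((w * embeddings[idx]).sum(0) / w.sum(0))
        proto_inv_var.append(w.sum(0))                    # aggregated precision
    return np.stack(protos), np.stack(proto_inv_var)

def weighted_sq_distance(query, protos, proto_inv_var):
    """Direction-dependent squared distance used for classification."""
    diff = query[:, None, :] - protos[None, :, :]         # (Q, K, D)
    return (proto_inv_var[None] * diff ** 2).sum(-1)      # (Q, K)
```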
Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks
Title | Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks |
Authors | Patrick Ferdinand Christ, Florian Ettlinger, Felix Grün, Mohamed Ezzeldin A. Elshaera, Jana Lipkova, Sebastian Schlecht, Freba Ahmaddy, Sunil Tatavarty, Marc Bickel, Patrick Bilic, Markus Rempfler, Felix Hofmann, Melvin D Anastasi, Seyed-Ahmad Ahmadi, Georgios Kaissis, Julian Holch, Wieland Sommer, Rickmer Braren, Volker Heinemann, Bjoern Menze |
Abstract | Automatic segmentation of the liver and hepatic lesions is an important step towards deriving quantitative biomarkers for accurate clinical diagnosis and computer-aided decision support systems. This paper presents a method to automatically segment liver and lesions in CT and MRI abdomen images using cascaded fully convolutional neural networks (CFCNs), enabling segmentation for large-scale medical trials and quantitative image analysis. We train and cascade two FCNs for a combined segmentation of the liver and its lesions. In the first step, we train an FCN to segment the liver as ROI input for a second FCN. The second FCN solely segments lesions within the predicted liver ROIs of step 1. CFCN models were trained on an abdominal CT dataset comprising 100 hepatic tumor volumes. Validations on further datasets show that CFCN-based semantic liver and lesion segmentation achieves Dice scores over 94% for liver with computation times below 100s per volume. We further experimentally demonstrate the robustness of the proposed method on 38 MRI liver tumor volumes and the public 3DIRCAD dataset. |
Tasks | Automatic Liver And Tumor Segmentation, Lesion Segmentation |
Published | 2017-02-20 |
URL | http://arxiv.org/abs/1702.05970v2 |
http://arxiv.org/pdf/1702.05970v2.pdf | |
PWC | https://paperswithcode.com/paper/automatic-liver-and-tumor-segmentation-of-ct |
Repo | https://github.com/IBBM/Cascaded-FCN |
Framework | tf |
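The cascade reduces to two steps at inference time: predict a liver mask, then restrict the lesion network's predictions to that region (the paper actually crops the predicted liver ROI before the second FCN; masking the second network's output is a simplification). A hedged sketch with placeholder model callables:

```python
import numpy as np

def cascaded_segmentation(volume, liver_fcn, lesion_fcn, threshold=0.5):
    """Two-step cascade: segment the liver, then keep lesion predictions only
    inside the predicted liver ROI. liver_fcn / lesion_fcn are any callables
    returning per-voxel probabilities (placeholders, not the repo's API)."""
    liver_prob = liver_fcn(volume)
    liver_mask = liver_prob > threshold
    # restrict the second network's output to the liver region
    lesion_prob = lesion_fcn(volume) * liver_mask
    lesion_mask = lesion_prob > threshold
    return liver_mask, lesion_mask
```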
MMD GAN: Towards Deeper Understanding of Moment Matching Network
Title | MMD GAN: Towards Deeper Understanding of Moment Matching Network |
Authors | Chun-Liang Li, Wei-Cheng Chang, Yu Cheng, Yiming Yang, Barnabás Póczos |
Abstract | Generative moment matching network (GMMN) is a deep generative model that differs from Generative Adversarial Network (GAN) by replacing the discriminator in GAN with a two-sample test based on kernel maximum mean discrepancy (MMD). Although some theoretical guarantees of MMD have been studied, the empirical performance of GMMN is still not as competitive as that of GAN on challenging and large benchmark datasets. The computational efficiency of GMMN is also less desirable in comparison with GAN, partially due to its requirement for a rather large batch size during the training. In this paper, we propose to improve both the model expressiveness of GMMN and its computational efficiency by introducing adversarial kernel learning techniques, as the replacement of a fixed Gaussian kernel in the original GMMN. The new approach combines the key ideas in both GMMN and GAN, hence we name it MMD GAN. The new distance measure in MMD GAN is a meaningful loss that enjoys the advantage of weak topology and can be optimized via gradient descent with relatively small batch sizes. In our evaluation on multiple benchmark datasets, including MNIST, CIFAR-10, CelebA and LSUN, MMD GAN significantly outperforms GMMN, and is competitive with other representative GAN works. |
Tasks | |
Published | 2017-05-24 |
URL | http://arxiv.org/abs/1705.08584v3 |
http://arxiv.org/pdf/1705.08584v3.pdf | |
PWC | https://paperswithcode.com/paper/mmd-gan-towards-deeper-understanding-of |
Repo | https://github.com/OctoberChang/MMD-GAN |
Framework | pytorch |
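The core quantity is a kernel two-sample statistic computed on features from an adversarially trained encoder: the generator minimizes it while the encoder (the learned kernel) maximizes it. A minimal PyTorch sketch of the biased squared-MMD estimate with a Gaussian kernel (the paper's bandwidth handling is richer than this single-sigma version, and names are illustrative):

```python
import torch

def gaussian_kernel(a, b, sigma=1.0):
    """a: (n, d), b: (m, d) feature matrices."""
    d2 = torch.cdist(a, b) ** 2
    return torch.exp(-d2 / (2 * sigma ** 2))

def mmd2(f_real, f_fake, sigma=1.0):
    """Biased estimate of squared MMD between real and generated samples,
    computed on features f(.) from the adversarially trained encoder."""
    k_xx = gaussian_kernel(f_real, f_real, sigma).mean()
    k_yy = gaussian_kernel(f_fake, f_fake, sigma).mean()
    k_xy = gaussian_kernel(f_real, f_fake, sigma).mean()
    return k_xx + k_yy - 2 * k_xy
```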
The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams
Title | The Birth of Collective Memories: Analyzing Emerging Entities in Text Streams |
Authors | David Graus, Daan Odijk, Maarten de Rijke |
Abstract | We study how collective memories are formed online. We do so by tracking entities that emerge in public discourse, that is, in online text streams such as social media and news streams, before they are incorporated into Wikipedia, which, we argue, can be viewed as an online place for collective memory. By tracking how entities emerge in public discourse, i.e., the temporal patterns between their first mention in online text streams and subsequent incorporation into collective memory, we gain insights into how the collective remembrance process happens online. Specifically, we analyze nearly 80,000 entities as they emerge in online text streams before they are incorporated into Wikipedia. The online text streams we use for our analysis comprise social media and news streams, and span over 579 million documents in a timespan of 18 months. We discover two main emergence patterns: entities that emerge in a “bursty” fashion, i.e., that appear in public discourse without a precedent, blast into activity and transition into collective memory. Other entities display a “delayed” pattern, where they appear in public discourse, experience a period of inactivity, and then resurface before transitioning into our cultural collective memory. |
Tasks | |
Published | 2017-01-15 |
URL | http://arxiv.org/abs/1701.04039v2 |
http://arxiv.org/pdf/1701.04039v2.pdf | |
PWC | https://paperswithcode.com/paper/the-birth-of-collective-memories-analyzing |
Repo | https://github.com/graus/emerging-entities-timeseries |
Framework | none |
Improving Object Detection from Scratch via Gated Feature Reuse
Title | Improving Object Detection from Scratch via Gated Feature Reuse |
Authors | Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides |
Abstract | In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models). We call our module GFR (Gated Feature Reuse), which exhibits two main advantages. First, we introduce a novel gate-controlled prediction strategy enabled by Squeeze-and-Excitation to adaptively enhance or attenuate supervision at different scales based on the input object size. As a result, our model is more effective in detecting diverse sizes of objects. Second, we propose a feature-pyramids structure to squeeze rich spatial and semantic features into a single prediction layer, which strengthens feature representation and reduces the number of parameters to learn. We apply the proposed structure on DSOD and SSD detection frameworks, and evaluate the performance on PASCAL VOC 2007, 2012 and COCO datasets. With fewer model parameters, GFR-DSOD outperforms the baseline DSOD by 1.4%, 1.1%, 1.7% and 0.6%, respectively. GFR-SSD also outperforms the original SSD and SSD with dense prediction by 3.6% and 2.8% on VOC 2007 dataset. Code is available at: https://github.com/szq0214/GFR-DSOD . |
Tasks | Object Detection |
Published | 2017-12-04 |
URL | https://arxiv.org/abs/1712.00886v2 |
https://arxiv.org/pdf/1712.00886v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-object-detectors-from-scratch-with |
Repo | https://github.com/szq0214/GRP-DSOD |
Framework | pytorch |
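The gate-controlled prediction is built on a Squeeze-and-Excitation style channel gate: globally pool a feature map, pass it through a small bottleneck MLP, and rescale channels before prediction. A sketch of such a gate (illustrative of the mechanism, not the exact GFR module):

```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Squeeze-and-Excitation style gate: global pooling followed by a small
    bottleneck MLP that rescales each channel of the feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                        # x: (N, C, H, W)
        w = x.mean(dim=(2, 3))                   # squeeze: global average pool
        w = self.fc(w).unsqueeze(-1).unsqueeze(-1)
        return x * w                             # excite: per-channel rescaling
```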
Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection
Title | Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection |
Authors | Xiaojun Xu, Chang Liu, Qian Feng, Heng Yin, Le Song, Dawn Song |
Abstract | The problem of cross-platform binary code similarity detection aims at detecting whether two binary functions coming from different platforms are similar or not. It has many security applications, including plagiarism detection, malware detection, vulnerability search, etc. Existing approaches rely on approximate graph matching algorithms, which are inevitably slow and sometimes inaccurate, and hard to adapt to a new task. To address these issues, in this work, we propose a novel neural network-based approach to compute the embedding, i.e., a numeric vector, based on the control flow graph of each binary function, then the similarity detection can be done efficiently by measuring the distance between the embeddings for two functions. We implement a prototype called Gemini. Our extensive evaluation shows that Gemini outperforms the state-of-the-art approaches by large margins with respect to similarity detection accuracy. Further, Gemini can speed up prior art’s embedding generation time by 3 to 4 orders of magnitude and reduce the required training time from more than 1 week down to 30 minutes to 10 hours. Our real world case studies demonstrate that Gemini can identify significantly more vulnerable firmware images than the state-of-the-art, i.e., Genius. Our research showcases a successful application of deep learning on computer security problems. |
Tasks | Graph Embedding, Graph Matching, Malware Detection |
Published | 2017-08-22 |
URL | http://arxiv.org/abs/1708.06525v4 |
http://arxiv.org/pdf/1708.06525v4.pdf | |
PWC | https://paperswithcode.com/paper/neural-network-based-graph-embedding-for |
Repo | https://github.com/ruchikagargdiwakar/ml_cyber_security_usecases |
Framework | none |
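The embedding side can be pictured as a few rounds of message passing over the control flow graph followed by a graph-level readout; similarity is then a simple distance between the two function embeddings. A toy NumPy sketch of that pipeline (weight shapes and the propagation rule are illustrative simplifications of the paper's network):

```python
import numpy as np

def graph_embedding(adj, feats, W1, W2, iters=5):
    """Simple message-passing embedding of a control flow graph.
    adj: (n, n) adjacency, feats: (n, f) basic-block features,
    W1: (f, d), W2: (d, d). Returns a single d-dim graph embedding."""
    mu = np.zeros((adj.shape[0], W1.shape[1]))
    for _ in range(iters):
        mu = np.tanh(feats @ W1 + adj @ mu @ W2)   # aggregate neighbor states
    return mu.sum(axis=0)                          # graph-level readout

def similarity(emb_a, emb_b):
    """Cosine similarity between two function embeddings."""
    return emb_a @ emb_b / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b) + 1e-8)
```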
Unified Deep Supervised Domain Adaptation and Generalization
Title | Unified Deep Supervised Domain Adaptation and Generalization |
Authors | Saeid Motiian, Marco Piccirilli, Donald A. Adjeroh, Gianfranco Doretto |
Abstract | This work provides a unified framework for addressing the problem of visual supervised domain adaptation and generalization with deep models. The main idea is to exploit the Siamese architecture to learn an embedding subspace that is discriminative, and where mapped visual domains are semantically aligned and yet maximally separated. The supervised setting becomes attractive especially when only few target data samples need to be labeled. In this scenario, alignment and separation of semantic probability distributions is difficult because of the lack of data. We found that reverting to point-wise surrogates of distribution distances and similarities provides an effective solution. In addition, the approach adapts quickly, requiring an extremely low number of labeled target training samples; even one per category can be effective. The approach is extended to domain generalization. For both applications the experiments show very promising results. |
Tasks | Domain Adaptation, Domain Generalization |
Published | 2017-09-28 |
URL | http://arxiv.org/abs/1709.10190v1 |
http://arxiv.org/pdf/1709.10190v1.pdf | |
PWC | https://paperswithcode.com/paper/unified-deep-supervised-domain-adaptation-and |
Repo | https://github.com/samotiian/CCSA |
Framework | none |
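The "point-wise surrogates" boil down to a contrastive loss over paired source/target embeddings: same-class pairs are pulled together, different-class pairs are pushed at least a margin apart. A minimal PyTorch sketch of that semantic alignment/separation term (names and the exact weighting are assumptions):

```python
import torch
import torch.nn.functional as F

def csa_loss(f_src, f_tgt, same_class, margin=1.0):
    """Contrastive semantic alignment: f_src, f_tgt are (N, D) embeddings of
    paired source/target samples, same_class is a (N,) 0/1 tensor."""
    d = F.pairwise_distance(f_src, f_tgt)
    # pull same-class pairs together
    align = (same_class.float() * d ** 2).mean()
    # push different-class pairs at least `margin` apart
    separate = ((1 - same_class.float()) * torch.clamp(margin - d, min=0) ** 2).mean()
    return align + separate
```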
Learning to Generalize: Meta-Learning for Domain Generalization
Title | Learning to Generalize: Meta-Learning for Domain Generalization |
Authors | Da Li, Yongxin Yang, Yi-Zhe Song, Timothy M. Hospedales |
Abstract | Domain shift refers to the well known problem that a model trained in one source domain performs poorly when applied to a target domain with different statistics. Domain Generalization (DG) techniques attempt to alleviate this issue by producing models which by design generalize well to novel testing domains. We propose a novel meta-learning method for domain generalization. Rather than designing a specific model that is robust to domain shift as in most previous DG work, we propose a model agnostic training procedure for DG. Our algorithm simulates train/test domain shift during training by synthesizing virtual testing domains within each mini-batch. The meta-optimization objective requires that steps to improve training domain performance should also improve testing domain performance. This meta-learning procedure trains models with good generalization ability to novel domains. We evaluate our method and achieve state of the art results on a recent cross-domain image classification benchmark, as well as demonstrating its potential on two classic reinforcement learning tasks. |
Tasks | Domain Generalization, Image Classification, Meta-Learning |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.03463v1 |
http://arxiv.org/pdf/1710.03463v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-generalize-meta-learning-for |
Repo | https://github.com/HAHA-DL/MLDG |
Framework | pytorch |
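The meta-objective can be approximated with a virtual gradient step: compute the meta-train loss, take a small step on a copy of the model, and add the gradient of the meta-test loss evaluated at the updated parameters. A simplified first-order PyTorch sketch (the paper back-propagates through the virtual step rather than using this approximation; names are illustrative):

```python
import copy
import torch

def mldg_step(model, optimizer, loss_fn, meta_train, meta_test, alpha=1e-2, beta=1.0):
    """One first-order MLDG-style update. meta_train / meta_test are lists of
    (x, y) batches drawn from the meta-train and held-out virtual test domains."""
    optimizer.zero_grad()

    train_loss = sum(loss_fn(model(x), y) for x, y in meta_train)
    train_grads = torch.autograd.grad(train_loss, model.parameters())

    # virtual update on a detached copy of the model
    updated = copy.deepcopy(model)
    with torch.no_grad():
        for p, g in zip(updated.parameters(), train_grads):
            p -= alpha * g

    test_loss = sum(loss_fn(updated(x), y) for x, y in meta_test)
    test_grads = torch.autograd.grad(test_loss, updated.parameters())

    # combine the two gradient contributions and step the real model
    for p, g_tr, g_te in zip(model.parameters(), train_grads, test_grads):
        p.grad = g_tr + beta * g_te
    optimizer.step()
    return train_loss.item(), test_loss.item()
```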
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Title | Progressive Growing of GANs for Improved Quality, Stability, and Variation |
Authors | Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen |
Abstract | We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset. |
Tasks | Face Generation, Image Generation |
Published | 2017-10-27 |
URL | http://arxiv.org/abs/1710.10196v3 |
http://arxiv.org/pdf/1710.10196v3.pdf | |
PWC | https://paperswithcode.com/paper/progressive-growing-of-gans-for-improved |
Repo | https://github.com/MNaplesDevelopment/DCGAN-PyTorch |
Framework | pytorch |
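The transition between resolutions is a simple linear fade-in: the output of the new, higher-resolution block is blended with the upsampled output of the previous stage while a coefficient alpha ramps from 0 to 1 during training. A minimal PyTorch sketch of that blend:

```python
import torch
import torch.nn.functional as F

def fade_in(x_prev, x_new, alpha):
    """Progressive-growing transition: blend the upsampled output of the
    previous (lower) resolution stage with the new block's output.
    x_prev: (N, C, H, W), x_new: (N, C, 2H, 2W), alpha in [0, 1]."""
    up = F.interpolate(x_prev, scale_factor=2, mode='nearest')
    return (1.0 - alpha) * up + alpha * x_new
```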
The k-means-u* algorithm: non-local jumps and greedy retries improve k-means++ clustering
Title | The k-means-u* algorithm: non-local jumps and greedy retries improve k-means++ clustering |
Authors | Bernd Fritzke |
Abstract | We present a new clustering algorithm called k-means-u* which in many cases is able to significantly improve the clusterings found by k-means++, the current de-facto standard for clustering in Euclidean spaces. First we introduce the k-means-u algorithm which starts from a result of k-means++ and attempts to improve it with a sequence of non-local “jumps” alternated by runs of standard k-means. Each jump transfers the “least useful” center towards the center with the largest local error, offset by a small random vector. This is continued as long as the error decreases and often leads to an improved solution. Occasionally k-means-u terminates despite obvious remaining optimization possibilities. By allowing a limited number of retries for the last jump it is frequently possible to reach better local minima. The resulting algorithm is called k-means-u* and dominates k-means++ wrt. solution quality which is demonstrated empirically using various data sets. By construction the logarithmic quality bound established for k-means++ holds for k-means-u* as well. |
Tasks | |
Published | 2017-06-27 |
URL | http://arxiv.org/abs/1706.09059v2 |
http://arxiv.org/pdf/1706.09059v2.pdf | |
PWC | https://paperswithcode.com/paper/the-k-means-u-algorithm-non-local-jumps-and |
Repo | https://github.com/gittar/k-means-u-star |
Framework | none |
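The jump step can be sketched directly on top of scikit-learn's k-means++: move the "least useful" center next to the center with the largest local error, re-run standard k-means, and keep the result only if the total error drops. This is a hedged simplification — the paper's utility criterion and the greedy-retry logic of k-means-u* are not reproduced here:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_u(X, k, eps=1e-3, seed=0):
    """Sketch of the k-means-u loop with a simplified 'least useful' criterion
    (the cluster with the smallest error donates its center)."""
    rng = np.random.default_rng(seed)
    km = KMeans(n_clusters=k, init='k-means++', n_init=10, random_state=seed).fit(X)
    centers, error = km.cluster_centers_, km.inertia_

    while True:
        labels = km.labels_
        per_cluster = np.array([((X[labels == c] - centers[c]) ** 2).sum()
                                for c in range(k)])
        donor, target = per_cluster.argmin(), per_cluster.argmax()
        trial = centers.copy()
        # non-local jump: place the donor center near the worst cluster's center
        trial[donor] = centers[target] + eps * rng.standard_normal(X.shape[1])
        km_trial = KMeans(n_clusters=k, init=trial, n_init=1).fit(X)
        if km_trial.inertia_ >= error:
            break                                   # jump did not help; stop
        km, centers, error = km_trial, km_trial.cluster_centers_, km_trial.inertia_
    return centers, error
```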
Egocentric Video Description based on Temporally-Linked Sequences
Title | Egocentric Video Description based on Temporally-Linked Sequences |
Authors | Marc Bolaños, Álvaro Peris, Francisco Casacuberta, Sergi Soler, Petia Radeva |
Abstract | Egocentric vision consists in acquiring images along the day from a first person point-of-view using wearable cameras. The automatic analysis of this information allows us to discover daily patterns for improving the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also publish the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Furthermore, we prove that our proposal outperforms classical attentional encoder-decoder methods for video description. |
Tasks | Video Description |
Published | 2017-04-07 |
URL | http://arxiv.org/abs/1704.02163v3 |
http://arxiv.org/pdf/1704.02163v3.pdf | |
PWC | https://paperswithcode.com/paper/egocentric-video-description-based-on |
Repo | https://github.com/MarcBS/TMA |
Framework | none |
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Title | VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection |
Authors | Yin Zhou, Oncel Tuzel |
Abstract | Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird’s eye view projection. In this work, we remove the need for manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR. |
Tasks | 3D Object Detection, Autonomous Navigation, Feature Engineering, Object Detection, Object Localization |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06396v1 |
http://arxiv.org/pdf/1711.06396v1.pdf | |
PWC | https://paperswithcode.com/paper/voxelnet-end-to-end-learning-for-point-cloud |
Repo | https://github.com/Lw510107/pointnet-2018.6.27- |
Framework | tf |
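The VFE layer is a point-wise fully connected network followed by per-voxel max pooling, with the pooled feature concatenated back onto every point feature. A minimal PyTorch sketch of one such layer (the real implementation also masks padded/empty points, omitted here):

```python
import torch
import torch.nn as nn

class VFELayer(nn.Module):
    """Voxel Feature Encoding layer: point-wise FC, per-voxel max pooling,
    and concatenation of the aggregated feature onto each point."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(in_dim, out_dim // 2),
                                nn.BatchNorm1d(out_dim // 2),
                                nn.ReLU(inplace=True))

    def forward(self, points):                    # points: (voxels, pts, in_dim)
        v, t, _ = points.shape
        pw = self.fc(points.reshape(v * t, -1)).reshape(v, t, -1)
        agg = pw.max(dim=1, keepdim=True).values.expand(-1, t, -1)
        return torch.cat([pw, agg], dim=2)        # (voxels, pts, out_dim)
```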