February 1, 2020

Paper Group AWR 323

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

Title One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers
Authors Ari S. Morcos, Haonan Yu, Michela Paganini, Yuandong Tian
Abstract The success of lottery ticket initializations (Frankle and Carbin, 2019) suggests that small, sparsified networks can be trained so long as the network is initialized appropriately. Unfortunately, finding these “winning ticket” initializations is computationally expensive. One potential solution is to reuse the same winning tickets across a variety of datasets and optimizers. However, the generality of winning ticket initializations remains unclear. Here, we attempt to answer this question by generating winning tickets for one training configuration (optimizer and dataset) and evaluating their performance on another configuration. Perhaps surprisingly, we found that, within the natural images domain, winning ticket initializations generalized across a variety of datasets, including Fashion MNIST, SVHN, CIFAR-10/100, ImageNet, and Places365, often achieving performance close to that of winning tickets generated on the same dataset. Moreover, winning tickets generated using larger datasets consistently transferred better than those generated using smaller datasets. We also found that winning ticket initializations generalize across optimizers while maintaining high performance. These results suggest that winning ticket initializations generated by sufficiently large datasets contain inductive biases generic to neural networks more broadly that improve training across many settings, and they provide hope for the development of better initialization methods.
Tasks
Published 2019-06-06
URL https://arxiv.org/abs/1906.02773v2
PDF https://arxiv.org/pdf/1906.02773v2.pdf
PWC https://paperswithcode.com/paper/one-ticket-to-win-them-all-generalizing
Repo https://github.com/varungohil/Generalizing-Lottery-Tickets
Framework pytorch
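
To make the transfer recipe concrete, here is a minimal sketch of the ticket-transfer setup, assuming one-shot magnitude pruning (the paper follows the iterative pruning procedure) on toy two-layer weights; `trained` is a stand-in for the weights after training on the source dataset:

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Keep only the largest-magnitude fraction (1 - sparsity) of weights."""
    flat = np.abs(np.concatenate([w.ravel() for w in weights]))
    thresh = np.quantile(flat, sparsity)
    return [(np.abs(w) > thresh).astype(w.dtype) for w in weights]

rng = np.random.default_rng(0)
init = [rng.normal(size=(784, 300)), rng.normal(size=(300, 10))]  # original init
# Stand-in for the weights after training on the SOURCE dataset:
trained = [w + 0.1 * rng.normal(size=w.shape) for w in init]

mask = magnitude_mask(trained, sparsity=0.9)       # the "winning ticket" mask
ticket = [w0 * m for w0, m in zip(init, mask)]     # mask applied to the init
# `ticket` (mask + original init) is then retrained on the TARGET dataset.
```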

Automated Segmentation of CT Scans for Normal Pressure Hydrocephalus

Title Automated Segmentation of CT Scans for Normal Pressure Hydrocephalus
Authors Angela Zhang, Po-Yu Kao, Ronald Sahyouni, Ashutosh Shelat, Jefferson Chen, B. S. Manjunath
Abstract Normal Pressure Hydrocephalus (NPH) is one of the few reversible forms of dementia. Due to their low cost and versatility, Computed Tomography (CT) scans have long been used as an aid to help diagnose intracerebral anomalies such as NPH. However, no well-defined and effective protocol currently exists for the analysis of CT scan-based ventricular, cerebral mass, and subarachnoid space volumes in the setting of NPH. The Evans ratio, an approximation of the ratio of ventricle to brain volume using only one 2D slice of the scan, has been proposed but is not robust. Instead of manually measuring a 2-dimensional proxy for the ratio of ventricle volume to brain volume, this study proposes an automated method of calculating the brain volumes for better recognition of NPH from a radiological standpoint. The method first aligns the subject CT volume to a common space through an affine transformation, then uses a random forest classifier to mask relevant tissue types. A 3D morphological segmentation method is used to partition the brain volume, which in turn is used to train machine learning methods to classify the subjects into non-NPH vs. NPH based on volumetric information. The proposed algorithm has increased sensitivity compared to the Evans ratio thresholding method.
Tasks Computed Tomography (CT)
Published 2019-01-25
URL https://arxiv.org/abs/1901.09088v2
PDF https://arxiv.org/pdf/1901.09088v2.pdf
PWC https://paperswithcode.com/paper/fully-automated-volumetric-classification-in
Repo https://github.com/UCSB-VRL/NPH_Prediction
Framework none
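
The tissue-masking step of the pipeline can be sketched with scikit-learn; the features, labels, and scan below are random stand-ins (so the printed ratio is meaningless numerically), but the structure mirrors the described method:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Stand-ins for per-voxel features (e.g., intensity plus neighborhood
# statistics) and tissue labels {0: background, 1: brain, 2: ventricle/CSF}.
X_train = rng.normal(size=(5000, 4))
y_train = rng.integers(0, 3, size=5000)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

X_scan = rng.normal(size=(20000, 4))   # voxels of one affinely aligned scan
tissue = rf.predict(X_scan)
# Volumetric quantity of the kind the downstream classifier consumes:
ratio = (tissue == 2).sum() / max((tissue == 1).sum(), 1)
print(f"ventricle-to-brain volume ratio: {ratio:.3f}")
```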

Probabilistic Face Embeddings

Title Probabilistic Face Embeddings
Authors Yichun Shi, Anil K. Jain
Abstract Embedding methods have achieved success in face recognition by comparing facial features in a latent semantic space. However, in a fully unconstrained face setting, the facial features learned by the embedding model could be ambiguous or may not even be present in the input face, leading to noisy representations. We propose Probabilistic Face Embeddings (PFEs), which represent each face image as a Gaussian distribution in the latent space. The mean of the distribution estimates the most likely feature values while the variance shows the uncertainty in the feature values. Probabilistic solutions can then be naturally derived for matching and fusing PFEs using the uncertainty information. Empirical evaluation on different baseline models, training datasets, and benchmarks shows that the proposed method can improve the face recognition performance of deterministic embeddings by converting them into PFEs. The uncertainties estimated by PFEs also serve as good indicators of the potential matching accuracy, which are important for a risk-controlled recognition system.
Tasks Face Recognition
Published 2019-04-21
URL https://arxiv.org/abs/1904.09658v4
PDF https://arxiv.org/pdf/1904.09658v4.pdf
PWC https://paperswithcode.com/paper/probabilistic-face-embeddings
Repo https://github.com/seasonSH/Probabilistic-Face-Embeddings
Framework tf
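
The matching rule follows directly from the Gaussian representation. Here is a sketch of the mutual likelihood score used to compare two PFEs (additive constants dropped): close means with small variances score high.

```python
import numpy as np

def mutual_likelihood_score(mu1, var1, mu2, var2):
    """Log-likelihood (up to an additive constant) that two Gaussian
    embeddings were generated by the same underlying identity."""
    v = var1 + var2
    return -0.5 * np.sum((mu1 - mu2) ** 2 / v + np.log(v))

rng = np.random.default_rng(0)
mu_a, mu_b = rng.normal(size=128), rng.normal(size=128)
var_a, var_b = np.full(128, 0.1), np.full(128, 0.1)
print(mutual_likelihood_score(mu_a, var_a, mu_b, var_b))
```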

GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks

Title GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks
Authors Avraam Chatzimichailidis, Franz-Josef Pfreundt, Nicolas R. Gauger, Janis Keuper
Abstract Current training methods for deep neural networks boil down to very high-dimensional and non-convex optimization problems which are usually solved by a wide range of stochastic gradient descent methods. While these approaches tend to work in practice, there are still many gaps in the theoretical understanding of key aspects like convergence and generalization guarantees, which are induced by the properties of the optimization surface (loss landscape). In order to gain deeper insights, a number of recent publications proposed methods to visualize and analyze the optimization surfaces. However, the computational cost of these methods is very high, making it hardly possible to use them on larger networks. In this paper, we present the GradVis Toolbox, an open-source library for efficient and scalable visualization and analysis of deep neural network loss landscapes in TensorFlow and PyTorch. Introducing more efficient mathematical formulations and a novel parallelization scheme, GradVis allows plotting 2D and 3D projections of optimization surfaces and trajectories, as well as high-resolution second-order gradient information for large networks.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12108v2
PDF https://arxiv.org/pdf/1909.12108v2.pdf
PWC https://paperswithcode.com/paper/gradvis-visualization-and-second-order
Repo https://github.com/cc-hpc-itwm/GradVis
Framework pytorch
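
This is not GradVis's actual API, but the kind of 2D loss-surface slice such tools compute can be sketched on a toy logistic regression by sampling the loss along two norm-rescaled random directions around a parameter point:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # toy data
y = rng.integers(0, 2, size=200)

def loss(w):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

w0 = rng.normal(size=5)                          # point to visualize around
d1, d2 = rng.normal(size=5), rng.normal(size=5)  # two random directions,
d1 *= np.linalg.norm(w0) / np.linalg.norm(d1)    # rescaled to the parameter
d2 *= np.linalg.norm(w0) / np.linalg.norm(d2)    # norm (cf. filter normalization)

alphas = np.linspace(-1, 1, 41)
surface = np.array([[loss(w0 + a * d1 + b * d2) for b in alphas]
                    for a in alphas])            # render as contour / 3D plot
```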

Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval

Title Hierarchy-of-Visual-Words: a Learning-based Approach for Trademark Image Retrieval
Authors Vítor N. Lourenço, Gabriela G. Silva, Leandro A. F. Fernandes
Abstract In this paper, we present the Hierarchy-of-Visual-Words (HoVW), a novel trademark image retrieval (TIR) method that decomposes images into simpler geometric shapes and defines a descriptor for binary trademark image representation by encoding the hierarchical arrangement of component shapes. The proposed hierarchical organization of visual data stores each component shape as a visual word. It is capable of representing the geometry of individual elements and the topology of the trademark image, making the descriptor robust against linear transformations as well as some degree of nonlinear transformation. Experiments show that HoVW outperforms previous TIR methods on the MPEG-7 CE-1 and MPEG-7 CE-2 image databases.
Tasks Image Retrieval, Trademark Retrieval
Published 2019-08-07
URL https://arxiv.org/abs/1908.02786v1
PDF https://arxiv.org/pdf/1908.02786v1.pdf
PWC https://paperswithcode.com/paper/hierarchy-of-visual-words-a-learning-based
Repo https://github.com/Prograf-UFF/HoVW
Framework none
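
The hierarchy itself is the paper's contribution; for contrast, here is a sketch of the flat visual-word baseline it builds on, quantizing (random, stand-in) per-shape descriptors into a k-means vocabulary and pooling one image's words into a histogram:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
shape_descriptors = rng.normal(size=(500, 16))   # stand-in shape descriptors

# Quantize component-shape descriptors into a visual-word vocabulary;
# HoVW additionally arranges one image's words into a hierarchy.
vocab = KMeans(n_clusters=32, n_init=10, random_state=0).fit(shape_descriptors)
words = vocab.predict(shape_descriptors[:20])    # word ids for one image
hist = np.bincount(words, minlength=32)          # flat bag-of-words descriptor
```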

DDSL: Deep Differentiable Simplex Layer for Learning Geometric Signals

Title DDSL: Deep Differentiable Simplex Layer for Learning Geometric Signals
Authors Chiyu “Max” Jiang, Dana Lynn Ona Lansigan, Philip Marcus, Matthias Nießner
Abstract We present the Deep Differentiable Simplex Layer (DDSL), a neural network layer for geometric deep learning. The DDSL is a differentiable layer, compatible with deep neural networks, for bridging simplex mesh-based geometry representations (point clouds, line meshes, triangular meshes, tetrahedral meshes) with raster images (e.g., 2D/3D grids). The DDSL uses the Non-Uniform Fourier Transform (NUFT) to perform differentiable, efficient, anti-aliased rasterization of simplex-based signals. We present a complete theoretical framework for the process as well as an efficient backpropagation algorithm. Compared to previous differentiable renderers and rasterizers, the DDSL generalizes to arbitrary simplex degrees and dimensions. In particular, we explore its applications to 2D shapes and illustrate two applications of this method: (1) mesh editing and optimization guided by neural network outputs, and (2) using the DDSL as a differentiable rasterization loss to facilitate end-to-end training of polygon generators. We validate the effectiveness of gradient-based shape optimization with the example of airfoil optimization, and by using the differentiable rasterization loss to facilitate end-to-end training, we surpass the state of the art for polygonal image segmentation given ground-truth bounding boxes.
Tasks Semantic Segmentation
Published 2019-01-30
URL https://arxiv.org/abs/1901.11082v3
PDF https://arxiv.org/pdf/1901.11082v3.pdf
PWC https://paperswithcode.com/paper/ddsl-deep-differentiable-simplex-layer-for
Repo https://github.com/maxjiang93/DDSL
Framework pytorch
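
A minimal 1D analog of Fourier-domain rasterization (the paper handles simplices in higher dimensions): the interval's Fourier coefficients are analytic, hence smooth, functions of its endpoints, so the band-limited raster is both anti-aliased and differentiable.

```python
import numpy as np

a, b = 0.30, 0.62        # 1-simplex (interval) to rasterize onto [0, 1)
N, K = 64, 16            # output resolution, highest retained frequency

k = np.arange(-K, K + 1)
c = np.empty(k.shape, dtype=complex)
nz = k != 0
# Analytic Fourier coefficients of the interval's indicator function:
c[nz] = (np.exp(-2j * np.pi * k[nz] * a)
         - np.exp(-2j * np.pi * k[nz] * b)) / (2j * np.pi * k[nz])
c[~nz] = b - a           # DC coefficient = interval length

x = np.arange(N) / N
raster = (c * np.exp(2j * np.pi * np.outer(x, k))).sum(axis=1).real
# `raster` is a band-limited (anti-aliased) image of the interval.
```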

Privacy and Utility Preserving Sensor-Data Transformations

Title Privacy and Utility Preserving Sensor-Data Transformations
Authors Mohammad Malekzadeh, Richard G. Clegg, Andrea Cavallaro, Hamed Haddadi
Abstract Sensitive inferences and user re-identification are major threats to privacy when raw sensor data from wearable or portable devices are shared with cloud-assisted applications. To mitigate these threats, we propose mechanisms to transform sensor data before sharing them with applications running on users’ devices. These transformations aim at eliminating patterns that can be used for user re-identification or for inferring potentially sensitive activities, while introducing a minor utility loss for the target application (or task). We show that, on gesture and activity recognition tasks, we can prevent inference of potentially sensitive activities while keeping the reduction in recognition accuracy of non-sensitive activities to less than 5 percentage points. We also show that we can reduce the accuracy of user re-identification and of the potential inference of gender to the level of a random guess, while keeping the accuracy of activity recognition comparable to that obtained on the original data.
Tasks Activity Recognition
Published 2019-11-14
URL https://arxiv.org/abs/1911.05996v1
PDF https://arxiv.org/pdf/1911.05996v1.pdf
PWC https://paperswithcode.com/paper/privacy-and-utility-preserving-sensor-data
Repo https://github.com/mmalekzadeh/motion-sense
Framework none
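
A generic sketch of the adversarial objective the abstract implies, not the paper's specific transformation mechanisms: a learned on-device transform keeps the activity label decodable while pushing re-identification toward chance. All shapes and modules below are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, T, C, n_act, n_id = 32, 128, 6, 4, 10     # batch, window, channels (assumed)

transform = nn.Sequential(nn.Flatten(), nn.Linear(T * C, T * C))  # data transform
task_head = nn.Linear(T * C, n_act)          # utility: activity recognition
id_head = nn.Linear(T * C, n_id)             # threat: user re-identification

x = torch.randn(B, T, C)                     # stand-in sensor windows
y_act = torch.randint(0, n_act, (B,))
y_id = torch.randint(0, n_id, (B,))

ce = nn.CrossEntropyLoss()
z = transform(x)
# Keep the task decodable while destroying identity information:
loss = ce(task_head(z), y_act) - ce(id_head(z), y_id)
loss.backward()      # in a real loop, only `transform`'s parameters step here
```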

Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation

Title Exploring Explicit Domain Supervision for Latent Space Disentanglement in Unpaired Image-to-Image Translation
Authors Jianxin Lin, Zhibo Chen, Yingce Xia, Sen Liu, Tao Qin, Jiebo Luo
Abstract Image-to-image translation tasks have been widely investigated with Generative Adversarial Networks (GANs). However, existing approaches are mostly designed in an unsupervised manner, while little attention has been paid to domain information within unpaired data. In this paper, we treat domain information as explicit supervision and design an unpaired image-to-image translation framework, Domain-supervised GAN (DosGAN), which takes the first step towards the exploration of explicit domain supervision. In contrast to representing domain characteristics using different generators or domain codes, we pre-train a classification network to explicitly classify the domain of an image. After pre-training, this network is used to extract the domain-specific features of each image. Such features, together with the domain-independent features extracted by another encoder (shared across different domains), are used to generate an image in the target domain. Extensive experiments on multiple facial attribute translation, multiple identity translation, multiple season translation and conditional edges-to-shoes/handbags demonstrate the effectiveness of our method. In addition, we can transfer the domain-specific feature extractor obtained on the Facescrub dataset with domain supervision information to unseen domains, such as faces in the CelebA dataset. We also succeed in achieving conditional translation with any two images in CelebA, whereas previous models like StarGAN cannot handle this task.
Tasks Image-to-Image Translation
Published 2019-02-11
URL https://arxiv.org/abs/1902.03782v4
PDF https://arxiv.org/pdf/1902.03782v4.pdf
PWC https://paperswithcode.com/paper/unpaired-image-to-image-translation-with
Repo https://github.com/linjx-ustc1106/DosGAN-PyTorch
Framework pytorch
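
A toy sketch of the feature routing described above, with linear layers standing in for the real convolutional networks: the pre-trained domain classifier's hidden activations supply domain-specific features, which are concatenated with domain-independent features before decoding.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
feat_dim, d, n_domains = 128, 64, 5

# Toy stand-in for the pre-trained domain classifier; its hidden
# activations double as domain-specific features after pre-training.
domain_cls = nn.Sequential(nn.Linear(feat_dim, d), nn.ReLU(),
                           nn.Linear(d, n_domains))
encoder = nn.Linear(feat_dim, d)             # domain-independent encoder (shared)
decoder = nn.Linear(2 * d, feat_dim)         # generator head

def domain_features(v):
    return domain_cls[1](domain_cls[0](v))   # penultimate activations

src = torch.randn(1, feat_dim)               # source-domain image features
ref = torch.randn(1, feat_dim)               # an image from the target domain
fake = decoder(torch.cat([encoder(src), domain_features(ref)], dim=1))
```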

DaiMoN: A Decentralized Artificial Intelligence Model Network

Title DaiMoN: A Decentralized Artificial Intelligence Model Network
Authors Surat Teerapittayanon, H. T. Kung
Abstract We introduce DaiMoN, a decentralized artificial intelligence model network, which incentivizes peer collaboration in improving the accuracy of machine learning models for a given classification problem. It is an autonomous network where peers may submit models with improved accuracy and other peers may verify the accuracy improvement. The system maintains an append-only decentralized ledger to keep the log of critical information, including who has trained the model and improved its accuracy, when it has been improved, by how much it has improved, and where to find the newly updated model. DaiMoN rewards these contributing peers with cryptographic tokens. A main feature of DaiMoN is that it allows peers to verify the accuracy improvement of submitted models without knowing the test labels. This is an essential component in order to mitigate intentional model overfitting by model-improving peers. To enable this model accuracy evaluation with hidden test labels, DaiMoN uses a novel learnable Distance Embedding for Labels (DEL) function proposed in this paper. Specific to each test dataset, DEL scrambles the test label vector by embedding it in a low-dimensional space while approximately preserving the distance between the dataset’s test label vector and a label vector inferred by the classifier. It therefore allows proof-of-improvement (PoI) by peers without providing them access to true test labels. We provide analysis and empirical evidence that under DEL, peers can accurately assess model accuracy. We also argue that it is hard to invert the embedding function and thus DEL is resilient against attacks aiming to recover test labels in order to cheat. Our prototype implementation of DaiMoN is available at https://github.com/steerapi/daimon.
Tasks
Published 2019-07-19
URL https://arxiv.org/abs/1907.08377v1
PDF https://arxiv.org/pdf/1907.08377v1.pdf
PWC https://paperswithcode.com/paper/daimon-a-decentralized-artificial
Repo https://github.com/steerapi/daimon
Framework none
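
DEL itself is learned; as a sketch of the distance-preserving idea, a fixed random projection (Johnson-Lindenstrauss style, a stand-in rather than the paper's function) already lets a verifier who only holds the embedded label vector gauge how far a submission is from the hidden labels:

```python
import numpy as np

rng = np.random.default_rng(0)
n_test, dim = 10_000, 256

# Stand-in for DEL: a fixed random projection approximately preserves
# pairwise distances between label vectors without revealing the labels.
P = rng.normal(size=(dim, n_test)) / np.sqrt(dim)

y_true = rng.integers(0, 10, n_test).astype(float)    # hidden test labels
flips = rng.random(n_test) < 0.1                       # a ~90%-accurate model
y_pred = np.where(flips, rng.integers(0, 10, n_test), y_true).astype(float)

d_plain = np.linalg.norm(y_true - y_pred)              # needs the raw labels
d_embed = np.linalg.norm(P @ y_true - P @ y_pred)      # only needs P @ y_true
print(f"true distance {d_plain:.1f} vs embedded {d_embed:.1f}")
```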

Interpretable CNNs for Object Classification

Title Interpretable CNNs for Object Classification
Authors Quanshi Zhang, Xin Wang, Ying Nian Wu, Huilin Zhou, Song-Chun Zhu
Abstract This paper proposes a generic method to learn interpretable convolutional filters in a deep convolutional neural network (CNN) for object classification, where each interpretable filter encodes features of a specific object part. Our method does not require additional annotations of object parts or textures for supervision. Instead, we use the same training data as traditional CNNs. Our method automatically assigns each interpretable filter in a high conv-layer with an object part of a certain category during the learning process. Such explicit knowledge representations in the conv-layers of a CNN help people clarify the logic encoded in the CNN, i.e., answering what patterns the CNN extracts from an input image and uses for prediction. We have tested our method using different benchmark CNNs with various structures to demonstrate the broad applicability of our method. Experiments have shown that our interpretable filters are much more semantically meaningful than traditional filters.
Tasks Object Classification
Published 2019-01-08
URL https://arxiv.org/abs/1901.02413v2
PDF https://arxiv.org/pdf/1901.02413v2.pdf
PWC https://paperswithcode.com/paper/interpretable-cnns
Repo https://github.com/Zymrael/paper-notes
Framework pytorch
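
The paper's actual filter loss is more involved; as a rough, hypothetical proxy for the same intent, one can penalize spatially diffuse activation maps so that each filter concentrates on a single, part-like location:

```python
import torch

def part_locality_loss(fmap):
    """Hypothetical proxy regularizer (NOT the paper's exact loss): the
    spatial entropy of each filter's activation map, which is low when a
    filter fires at one location, as a part detector would."""
    b, c, h, w = fmap.shape
    p = torch.softmax(fmap.reshape(b, c, h * w), dim=-1)
    return -(p * torch.log(p + 1e-9)).sum(-1).mean()

fmap = torch.randn(8, 16, 14, 14)   # a toy conv feature map
print(part_locality_loss(fmap))     # would be added to the classification loss
```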

Semi-Supervised and Task-Driven Data Augmentation

Title Semi-Supervised and Task-Driven Data Augmentation
Authors Krishna Chaitanya, Neerav Karani, Christian Baumgartner, Olivio Donati, Anton Becker, Ender Konukoglu
Abstract Supervised deep learning methods for segmentation require large amounts of labelled training data, without which they are prone to overfitting and do not generalize well to unseen images. In practice, obtaining a large number of annotations from clinical experts is expensive and time-consuming. One way to address the scarcity of annotated examples is data augmentation using random spatial and intensity transformations. Recently, it has been proposed to use generative models to synthesize realistic training examples, complementing the random augmentation. So far, these methods have yielded limited gains over the random augmentation. However, there is potential to improve the approach by (i) explicitly modeling deformation fields (non-affine spatial transformations) and intensity transformations and (ii) leveraging unlabelled data during the generative process. With this motivation, we propose a novel task-driven data augmentation method in which, to synthesize new training examples, a generative network explicitly models and applies deformation fields and additive intensity masks to existing labelled data, modeling shape and intensity variations, respectively. Crucially, the generative model is optimized to be conducive to the task, in this case segmentation, and constrained to match the distribution of images observed from labelled and unlabelled samples. Furthermore, explicit modeling of deformation fields allows synthesizing segmentation masks and images in exact correspondence by simply applying the generated transformation to an input image and the corresponding annotation. Our experiments on cardiac magnetic resonance images (MRI) showed that, for the task of segmentation in small training data scenarios, the proposed method substantially outperforms conventional augmentation techniques.
Tasks Data Augmentation
Published 2019-02-11
URL http://arxiv.org/abs/1902.05396v2
PDF http://arxiv.org/pdf/1902.05396v2.pdf
PWC https://paperswithcode.com/paper/semi-supervised-and-task-driven-data
Repo https://github.com/krishnabits001/task_driven_data_augmentation
Framework tf
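
The exact-correspondence trick is easy to sketch: apply one smooth deformation field to the image and its mask together. Here the field is random; in the paper it comes from a trained generator.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def random_deform(image, label, sigma=8.0, alpha=10.0, seed=0):
    """Warp an image and its segmentation mask with ONE smooth random
    deformation field so they stay in exact correspondence."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    dx = gaussian_filter(rng.normal(size=(h, w)), sigma) * alpha
    dy = gaussian_filter(rng.normal(size=(h, w)), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys + dy, xs + dx])
    return (map_coordinates(image, coords, order=1),   # bilinear for image
            map_coordinates(label, coords, order=0))   # nearest for labels

img = np.random.rand(64, 64)
seg = (np.random.rand(64, 64) > 0.5).astype(int)
aug_img, aug_seg = random_deform(img, seg)
```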

PyODDS: An End-to-End Outlier Detection System

Title PyODDS: An End-to-End Outlier Detection System
Authors Yuening Li, Daochen Zha, Na Zou, Xia Hu
Abstract PyODDS is an end-to-end Python system for outlier detection with database support. PyODDS provides outlier detection algorithms that meet the demands of users in different fields, with or without a data science or machine learning background. PyODDS gives the ability to execute machine learning algorithms in-database without moving data out of the database server or over the network. It also provides access to a wide range of outlier detection algorithms, including statistical analysis and more recent deep learning based approaches. PyODDS is released under the MIT open-source license and is currently available at https://github.com/datamllab/pyodds, with official documentation at https://pyodds.github.io/.
Tasks Outlier Detection
Published 2019-10-07
URL https://arxiv.org/abs/1910.02575v2
PDF https://arxiv.org/pdf/1910.02575v2.pdf
PWC https://paperswithcode.com/paper/pyodds-an-end-to-end-outlier-detection-system
Repo https://github.com/datamllab/pyodds
Framework tf
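
This is not PyODDS's actual API (see the linked documentation for that); the sketch below only illustrates the database-backed, end-to-end flow, with sqlite3 and scikit-learn's IsolationForest standing in for the database server and detector:

```python
import sqlite3
import numpy as np
from sklearn.ensemble import IsolationForest

con = sqlite3.connect(":memory:")            # stand-in for the database server
con.execute("CREATE TABLE readings (a REAL, b REAL)")
rng = np.random.default_rng(0)
rows = rng.normal(size=(500, 2))
rows[:5] += 8                                # inject a few obvious outliers
con.executemany("INSERT INTO readings VALUES (?, ?)", rows.tolist())

X = np.array(con.execute("SELECT a, b FROM readings").fetchall())
labels = IsolationForest(random_state=0).fit_predict(X)   # -1 marks outliers
print((labels == -1).sum(), "outliers flagged")
```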

Dynamic Evaluation of Transformer Language Models

Title Dynamic Evaluation of Transformer Language Models
Authors Ben Krause, Emmanuel Kahembwe, Iain Murray, Steve Renals
Abstract This research note combines two methods that have recently improved the state of the art in language modeling: Transformers and dynamic evaluation. Transformers use stacked layers of self-attention that allow them to capture long range dependencies in sequential data. Dynamic evaluation fits models to the recent sequence history, allowing them to assign higher probabilities to re-occurring sequential patterns. By applying dynamic evaluation to Transformer-XL models, we improve the state of the art on enwik8 from 0.99 to 0.94 bits/char, text8 from 1.08 to 1.04 bits/char, and WikiText-103 from 18.3 to 16.4 perplexity points.
Tasks Language Modelling
Published 2019-04-17
URL http://arxiv.org/abs/1904.08378v1
PDF http://arxiv.org/pdf/1904.08378v1.pdf
PWC https://paperswithcode.com/paper/dynamic-evaluation-of-transformer-language
Repo https://github.com/benkrause/dynamiceval-transformer
Framework tf
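
Dynamic evaluation reduces to a simple loop: score each segment with the current parameters, then take a gradient step on that segment before moving on. A toy sketch, where a bigram model stands in for Transformer-XL and all hyperparameters are assumptions:

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, L, seg = 27, 2048, 128
data = torch.randint(0, vocab, (L,))          # stand-in evaluation stream

model = nn.Sequential(nn.Embedding(vocab, 64), nn.Linear(64, vocab))  # toy LM
opt = torch.optim.SGD(model.parameters(), lr=1e-2)   # dynamic-eval step size
ce = nn.CrossEntropyLoss()

total, n = 0.0, 0
for i in range(0, L - seg - 1, seg):
    x, y = data[i:i + seg], data[i + 1:i + seg + 1]
    loss = ce(model(x), y)
    total, n = total + loss.item() * seg, n + seg   # score BEFORE adapting
    opt.zero_grad(); loss.backward(); opt.step()    # then fit to this segment
print(f"{total / n / math.log(2):.3f} bits/char")
```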

Unsupervised Microvascular Image Segmentation Using an Active Contours Mimicking Neural Network

Title Unsupervised Microvascular Image Segmentation Using an Active Contours Mimicking Neural Network
Authors Shir Gur, Lior Wolf, Lior Golgher, Pablo Blinder
Abstract The task of blood vessel segmentation in microscopy images is crucial for many diagnostic and research applications. However, vessels can look vastly different, depending on the transient imaging conditions, and collecting data for supervised training is laborious. We present a novel deep learning method for unsupervised segmentation of blood vessels. The method is inspired by the field of active contours, and we introduce a new loss term, which is based on the morphological Active Contours Without Edges (ACWE) optimization method. The role of the morphological operators is played by novel pooling layers that are incorporated into the network’s architecture. We demonstrate the challenges faced by previous supervised learning solutions when the imaging conditions shift. Our unsupervised method outperforms such previous methods both on the labeled dataset and when applied to similar but different datasets. Our code, as well as efficient PyTorch reimplementations of the baseline methods VesselNN and DeepVess, is available on GitHub: https://github.com/shirgur/UMIS.
Tasks Semantic Segmentation
Published 2019-08-04
URL https://arxiv.org/abs/1908.01373v2
PDF https://arxiv.org/pdf/1908.01373v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-microvascular-image-segmentation
Repo https://github.com/shirgur/UMIS
Framework pytorch
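
The region term of the ACWE (Chan-Vese) energy that the loss builds on is compact; here is a sketch of it for a soft mask (the paper's morphological smoothing via pooling layers is omitted):

```python
import numpy as np

def acwe_region_loss(img, u, lam1=1.0, lam2=1.0):
    """ACWE/Chan-Vese region energy for a soft mask u in [0, 1]: intensity
    should be near-constant inside the vessel mask and outside it."""
    c1 = (img * u).sum() / (u.sum() + 1e-8)               # mean inside
    c2 = (img * (1 - u)).sum() / ((1 - u).sum() + 1e-8)   # mean outside
    return (lam1 * ((img - c1) ** 2 * u).sum()
            + lam2 * ((img - c2) ** 2 * (1 - u)).sum())

img = np.random.rand(64, 64)
u = np.random.rand(64, 64)          # network's soft segmentation output
print(acwe_region_loss(img, u))
```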

Kinematic Single Vehicle Trajectory Prediction Baselines and Applications with the NGSIM Dataset

Title Kinematic Single Vehicle Trajectory Prediction Baselines and Applications with the NGSIM Dataset
Authors Jean Mercat, Nicole El Zoghby, Guillaume Sandou, Dominique Beauvois, Guillermo Pita Gil
Abstract In the recent vehicle trajectory prediction literature, the most common baselines are briefly introduced without the information necessary to reproduce them. In this article, we produce reproducible vehicle prediction results from simple models. For that purpose, the process is made explicit and the code is available. The baseline models are a constant velocity model and a single-vehicle prediction model. They are applied to the NGSIM US-101 and I-80 datasets using only relative positions. Thus, the process can be reproduced with any database containing tracking of vehicle positions. The evaluation reports Root Mean Squared Error (RMSE), Final Displacement Error (FDE), Negative Log-Likelihood (NLL), and Miss Rate (MR). The NLL estimation needs a careful definition because several formulations that differ from the mathematical definition are used in other works. This article is meant to be used along with the published code to establish baselines for further work. An extension is proposed to replace the constant velocity assumption with a learned model using a recurrent neural network. This brings good improvements in accuracy and uncertainty estimation and opens possibilities for both complex and interpretable models.
Tasks Trajectory Prediction
Published 2019-08-29
URL https://arxiv.org/abs/1908.11472v2
PDF https://arxiv.org/pdf/1908.11472v2.pdf
PWC https://paperswithcode.com/paper/inertial-single-vehicle-trajectory-prediction
Repo https://github.com/jmercat/KalmanBaseline
Framework pytorch
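
The constant velocity baseline can be sketched as a linear Kalman filter: filter the observed track, then roll the model forward open-loop over the prediction horizon. All noise settings below are assumptions, not the paper's tuned values.

```python
import numpy as np

dt = 0.1                                   # NGSIM-like sampling period (s)
F = np.array([[1, 0, dt, 0],               # constant-velocity transition
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)
H = np.array([[1., 0, 0, 0],               # we only observe positions
              [0., 1, 0, 0]])
Q = np.eye(4) * 1e-2                       # process noise (assumed)
R = np.eye(2) * 1e-1                       # measurement noise (assumed)

x, P = np.zeros(4), np.eye(4)
track = [(0.0, 0.0), (1.0, 0.2), (2.1, 0.4), (3.0, 0.55)]  # toy positions

for z in track:                            # filter the observed history
    x, P = F @ x, F @ P @ F.T + Q          # predict
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (np.asarray(z) - H @ x)    # update
    P = (np.eye(4) - K @ H) @ P

preds = []
for _ in range(50):                        # 5 s horizon, open-loop rollout
    x, P = F @ x, F @ P @ F.T + Q
    preds.append(x[:2].copy())             # predicted future positions
```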