Paper Group ANR 1362
Forecasting Granular Audience Size for Online Advertising
Title | Forecasting Granular Audience Size for Online Advertising |
Authors | Ritwik Sinha, Dhruv Singal, Pranav Maneriker, Kushal Chawla, Yash Shrivastava, Deepak Pai, Atanu R Sinha |
Abstract | Orchestration of campaigns for online display advertising requires marketers to forecast audience size at the granularity of specific attributes of web traffic, characterized by the categorical nature of all attributes (e.g. {US, Chrome, Mobile}). With each attribute taking many values, the very large attribute combination set makes estimating audience size for any specific attribute combination challenging. We modify Eclat, a frequent itemset mining (FIM) algorithm, to accommodate categorical variables. For consequent frequent and infrequent itemsets, we then provide forecasts using time series analysis with conditional probabilities to aid approximation. An extensive simulation, based on typical characteristics of audience data, is built to stress test our modified-FIM approach. In two real datasets, comparison with baselines, including neural network models, shows that our method lowers the computation time of FIM for categorical data. On hold-out samples we show that the proposed forecasting method outperforms these baselines. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02412v1 |
http://arxiv.org/pdf/1901.02412v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-granular-audience-size-for-online |
Repo | |
Framework | |
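The abstract above says Eclat is modified to accommodate categorical attributes but does not spell out the algorithm; as a rough point of reference, here is a minimal sketch of vanilla Eclat (vertical tidset intersection) over categorical attribute=value items, using a made-up toy traffic log rather than the paper's data or its modifications:

```python
from itertools import combinations

# Hypothetical traffic log: each hit is a set of categorical attribute=value items.
hits = [
    {"country=US", "browser=Chrome", "device=Mobile"},
    {"country=US", "browser=Chrome", "device=Desktop"},
    {"country=IN", "browser=Chrome", "device=Mobile"},
    {"country=US", "browser=Safari", "device=Mobile"},
]

def eclat(hits, min_support=2):
    """Vertical (Eclat-style) frequent itemset mining: each item keeps the set of
    transaction ids it occurs in, and itemsets grow by intersecting those tidsets."""
    tidsets = {}
    for tid, hit in enumerate(hits):
        for item in hit:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def grow(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            new_tids = prefix_tids & tids if prefix else tids
            if len(new_tids) >= min_support:
                itemset = prefix + (item,)
                frequent[itemset] = len(new_tids)
                # Extend only with later items to avoid generating duplicates.
                grow(itemset, new_tids, candidates[i + 1:])

    grow((), set(), sorted(tidsets.items()))
    return frequent

for itemset, support in eclat(hits).items():
    print(itemset, support)
```

Each frequent attribute combination's support series over time would then feed the forecasting step described in the abstract.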
Robustness Certificates Against Adversarial Examples for ReLU Networks
Title | Robustness Certificates Against Adversarial Examples for ReLU Networks |
Authors | Sahil Singla, Soheil Feizi |
Abstract | While neural networks have achieved high performance in different learning tasks, their accuracy drops significantly in the presence of small adversarial perturbations to inputs. Defenses based on regularization and adversarial training are often followed by new attacks to defeat them. In this paper, we propose attack-agnostic robustness certificates for a multi-label classification problem using a deep ReLU network. Although computing the exact distance of a given input sample to the classification decision boundary requires solving a non-convex optimization, we characterize two lower bounds for such distances, namely the simplex certificate and the decision boundary certificate. These robustness certificates leverage the piece-wise linear structure of ReLU networks and use the fact that in a polyhedron around a given sample, the prediction function is linear. In particular, the proposed simplex certificate has a closed-form, is differentiable and is an order of magnitude faster to compute than the existing methods even for deep networks. In addition to theoretical bounds, we provide numerical results for our certificates over MNIST and compare them with some existing upper bounds. |
Tasks | Multi-Label Classification |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.01235v2 |
http://arxiv.org/pdf/1902.01235v2.pdf | |
PWC | https://paperswithcode.com/paper/robustness-certificates-against-adversarial |
Repo | |
Framework | |
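The paper's simplex and decision-boundary certificates are not reproduced in the abstract; as a hedged illustration of the underlying idea only, note that inside the activation polyhedron containing x a ReLU network is exactly linear, so the pairwise margin divided by the gradient-difference norm gives the distance to each locally linear class boundary. The toy computation below uses that formula as a local margin estimate; it is not the paper's closed-form simplex certificate and is not a sound global certificate without the region constraints the paper handles.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy ReLU classifier; within the activation region containing x, it is exactly linear.
net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

def local_margin_certificate(net, x):
    """min over j != c of (f_c(x) - f_j(x)) / ||grad f_c(x) - grad f_j(x)||_2,
    i.e. the distance to the nearest boundary of the locally linear model."""
    x = x.clone().requires_grad_(True)
    logits = net(x)
    c = logits.argmax().item()
    grads = []
    for k in range(logits.numel()):
        g, = torch.autograd.grad(logits[k], x, retain_graph=True)
        grads.append(g)
    best = float("inf")
    for j in range(logits.numel()):
        if j == c:
            continue
        margin = (logits[c] - logits[j]).item()
        denom = (grads[c] - grads[j]).norm().item()
        if denom > 0:
            best = min(best, margin / denom)
    return best

x = torch.randn(784)
print("local margin estimate:", local_margin_certificate(net, x))
```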
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
Title | WaLDORf: Wasteless Language-model Distillation On Reading-comprehension |
Authors | James Yi Tian, Alexander P. Kreuzer, Pai-Hung Chen, Hans-Martin Will |
Abstract | Transformer based Very Large Language Models (VLLMs) like BERT, XLNet and RoBERTa, have recently shown tremendous performance on a large variety of Natural Language Understanding (NLU) tasks. However, due to their size, these VLLMs are extremely resource intensive and cumbersome to deploy at production time. Several recent publications have looked into various ways to distil knowledge from a transformer based VLLM (most commonly BERT-Base) into a smaller model which can run much faster at inference time. Here, we propose a novel set of techniques which together produce a task-specific hybrid convolutional and transformer model, WaLDORf, that achieves state-of-the-art inference speed while still being more accurate than previous distilled models. |
Tasks | Language Modelling, Reading Comprehension |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06638v2 |
https://arxiv.org/pdf/1912.06638v2.pdf | |
PWC | https://paperswithcode.com/paper/waldorf-wasteless-language-model-distillation |
Repo | |
Framework | |
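WaLDORf's hybrid convolutional/transformer architecture and its exact distillation targets are not detailed in the abstract; the generic soft-target distillation objective this line of work builds on (temperature-softened teacher probabilities mixed with the usual hard-label loss) can be sketched as follows, with random toy tensors standing in for real model outputs:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard soft-target knowledge distillation: KL between temperature-softened
    teacher and student distributions, mixed with the usual hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                       # rescale so gradients are comparable to the hard loss
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits for a batch of 8 examples and 5 classes.
s = torch.randn(8, 5, requires_grad=True)
t = torch.randn(8, 5)
y = torch.randint(0, 5, (8,))
print(distillation_loss(s, t, y))
```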
Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence
Title | Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence |
Authors | Dianbo Liu, Timothy A Miller, Kenneth D. Mandl |
Abstract | A patient’s health information is generally fragmented across silos. Though it is technically feasible to unite data for analysis in a manner that underpins a rapid learning healthcare system, privacy concerns and regulatory barriers limit data centralization. Machine learning can be conducted in a federated manner on patient datasets with the same set of variables, but separated across sites of care. But federated learning cannot handle the situation where different data types for a given patient are separated vertically across different organizations. We call methods that enable machine learning model training on data separated by two or more degrees “confederated machine learning.” We built and evaluated a confederated machine learning model to stratify the risk of accidental falls among the elderly. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02109v1 |
https://arxiv.org/pdf/1910.02109v1.pdf | |
PWC | https://paperswithcode.com/paper/confederated-machine-learning-on-horizontally |
Repo | |
Framework | |
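As the abstract notes, plain federated learning covers only the horizontally separated case (same variables, different patients per site); the confederated extension to vertically separated data is the paper's contribution and is not sketched here. For orientation, a minimal federated-averaging baseline over horizontally split sites, with synthetic data and a toy logistic-regression learner, looks like:

```python
import numpy as np

def local_logreg_step(w, X, y, lr=0.1, epochs=20):
    """A few epochs of plain logistic-regression gradient descent on one site's data."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def federated_average(sites, dim, rounds=10):
    """FedAvg over horizontally separated sites: each site trains locally on its own
    patients, and only the parameters (weighted by site size) are pooled centrally."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in sites)
    for _ in range(rounds):
        local = [local_logreg_step(w, X, y) for X, y in sites]
        w = sum(len(y) * wk for wk, (_, y) in zip(local, sites)) / total
    return w

rng = np.random.default_rng(0)
sites = [(rng.normal(size=(50, 4)), rng.integers(0, 2, 50).astype(float)) for _ in range(3)]
print(federated_average(sites, dim=4))
```

Raw patient records never leave their site; only model parameters are exchanged, which is what makes the approach compatible with the privacy constraints described above.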
Fusion-supervised Deep Cross-modal Hashing
Title | Fusion-supervised Deep Cross-modal Hashing |
Authors | Li Wang, Lei Zhu, En Yu, Jiande Sun, Huaxiang Zhang |
Abstract | Deep hashing has recently received attention in cross-modal retrieval for its impressive advantages. However, existing hashing methods for cross-modal retrieval cannot fully capture the heterogeneous multi-modal correlation and exploit the semantic information. In this paper, we propose a novel Fusion-supervised Deep Cross-modal Hashing (FDCH) approach. Firstly, FDCH learns unified binary codes through a fusion hash network with paired samples as input, which effectively enhances the modeling of the correlation of heterogeneous multi-modal data. Then, these high-quality unified hash codes further supervise the training of the modality-specific hash networks for encoding out-of-sample queries. Meanwhile, both pair-wise similarity information and classification information are embedded in the hash networks under one stream framework, which simultaneously preserves cross-modal similarity and keeps semantic consistency. Experimental results on two benchmark datasets demonstrate the state-of-the-art performance of FDCH. |
Tasks | Cross-Modal Retrieval |
Published | 2019-04-25 |
URL | https://arxiv.org/abs/1904.11171v2 |
https://arxiv.org/pdf/1904.11171v2.pdf | |
PWC | https://paperswithcode.com/paper/fusion-supervised-deep-cross-modal-hashing |
Repo | |
Framework | |
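FDCH's fusion hash network is not specified in the abstract; the kind of pairwise-similarity objective that deep cross-modal hashing methods commonly optimize (a negative log-likelihood over relaxed, tanh-valued codes plus a quantization term) is sketched below as an illustration, not as the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def pairwise_hash_loss(img_codes, txt_codes, sim):
    """theta_ij = 0.5 * <u_i, v_j>; similar pairs (sim=1) are pushed toward high inner
    product and dissimilar pairs (sim=0) toward low, via the pairwise log-likelihood."""
    theta = 0.5 * img_codes @ txt_codes.t()
    return (F.softplus(theta) - sim * theta).mean()

def quantization_loss(codes):
    """Encourage the relaxed codes to sit near the binary vertices {-1, +1}."""
    return (codes - codes.sign()).pow(2).mean()

# Toy relaxed codes for 4 images and 4 texts with 16-bit hashes.
u = torch.tanh(torch.randn(4, 16, requires_grad=True))
v = torch.tanh(torch.randn(4, 16, requires_grad=True))
S = torch.eye(4)  # paired image/text are similar
loss = pairwise_hash_loss(u, v, S) + 0.1 * quantization_loss(u) + 0.1 * quantization_loss(v)
print(loss)
```

At retrieval time, sign(u) and sign(v) give the binary codes that are compared by Hamming distance.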
QUEST: Quantized embedding space for transferring knowledge
Title | QUEST: Quantized embedding space for transferring knowledge |
Authors | Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord |
Abstract | Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network. Most of the existing knowledge distillation methods direct the student to follow the teacher by matching the teacher’s output, feature maps or their distribution. In this work, we propose a novel way to achieve this goal: by distilling the knowledge through a quantized space. According to our method, the teacher’s feature maps are quantized to represent the main visual concepts encompassed in the feature maps. The student is then asked to predict the quantized representation, which thus forms the task that the student uses to learn from the teacher. Despite its simplicity, we show that our approach is able to yield results that improve the state of the art on knowledge distillation. To that end, we provide an extensive evaluation across several network architectures and most commonly used benchmark datasets. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01540v1 |
https://arxiv.org/pdf/1912.01540v1.pdf | |
PWC | https://paperswithcode.com/paper/quest-quantized-embedding-space-for |
Repo | |
Framework | |
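The abstract describes quantizing the teacher's feature maps into a small set of visual concepts and asking the student to predict those assignments; a bare-bones version of that idea (a k-means codebook over stand-in teacher feature vectors and a linear student head trained with cross-entropy on the cluster labels) could be sketched as:

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Assumed teacher feature vectors (e.g. one per spatial location), 256-dim.
teacher_feats = rng.normal(size=(1000, 256)).astype(np.float32)

# Quantize the teacher's feature space into K "visual words".
K = 32
codebook = KMeans(n_clusters=K, n_init=10, random_state=0).fit(teacher_feats)
assignments = torch.from_numpy(codebook.labels_).long()

# The student maps its own (here random, lower-dimensional) features to K logits and
# is trained to predict which teacher codeword each position falls into.
student_feats = torch.randn(1000, 64)
student_head = torch.nn.Linear(64, K)
logits = student_head(student_feats)
loss = F.cross_entropy(logits, assignments)
print(loss)
```

Predicting discrete codeword indices, rather than regressing the teacher's raw feature values, is the distinguishing design choice the abstract highlights.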
Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks
Title | Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks |
Authors | Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim |
Abstract | We present a deep learning method for interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks are connected both internally and externally so that the networks are trained jointly and interact with each other to solve the complex video object segmentation problem. We propose a new multi-round training scheme for interactive video object segmentation so that the networks can learn how to understand the user’s intention and update incorrect estimations during the training. At test time, our method produces high-quality results and also runs fast enough to work with users interactively. We evaluated the proposed method quantitatively on the interactive track benchmark at the DAVIS Challenge 2018. We outperformed other competing methods by a significant margin in both the speed and the accuracy. We also demonstrated that our method works well with real user interactions. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09791v2 |
http://arxiv.org/pdf/1904.09791v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-user-guided-video-object-segmentation-by |
Repo | |
Framework | |
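The two-network design in the abstract can be pictured as a loop: per round, an interaction network turns user scribbles into a refined mask on the annotated frame, and a propagation network sweeps that mask to the remaining frames. The skeleton below uses single-convolution placeholders for both networks and random tensors for the video, so it only illustrates the control flow, not the paper's models:

```python
import torch
import torch.nn as nn

class InteractionNet(nn.Module):
    """Placeholder: refines the mask of the user-annotated frame, given the frame,
    the previous mask estimate, and the user scribbles."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3 + 1 + 1, 1, kernel_size=3, padding=1)
    def forward(self, frame, prev_mask, scribble):
        return torch.sigmoid(self.conv(torch.cat([frame, prev_mask, scribble], dim=1)))

class PropagationNet(nn.Module):
    """Placeholder: propagates a mask from the previous frame to the current one."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3 + 1, 1, kernel_size=3, padding=1)
    def forward(self, frame, neighbor_mask):
        return torch.sigmoid(self.conv(torch.cat([frame, neighbor_mask], dim=1)))

@torch.no_grad()
def segment(frames, scribbles_per_round, annotated_frame=0, rounds=2):
    inter, prop = InteractionNet(), PropagationNet()
    T = frames.shape[0]
    masks = torch.zeros(T, 1, *frames.shape[2:])
    for r in range(rounds):
        # 1) Interaction: update the annotated frame from the user's scribble.
        masks[annotated_frame:annotated_frame + 1] = inter(
            frames[annotated_frame:annotated_frame + 1],
            masks[annotated_frame:annotated_frame + 1],
            scribbles_per_round[r])
        # 2) Propagation: sweep the updated mask forward through the clip.
        for t in range(annotated_frame + 1, T):
            masks[t:t + 1] = prop(frames[t:t + 1], masks[t - 1:t])
    return masks

frames = torch.rand(5, 3, 32, 32)                 # 5-frame toy clip
scribbles = [torch.rand(1, 1, 32, 32) for _ in range(2)]
print(segment(frames, scribbles).shape)
```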
UniVSE: Robust Visual Semantic Embeddings via Structured Semantic Representations
Title | UniVSE: Robust Visual Semantic Embeddings via Structured Semantic Representations |
Authors | Hao Wu, Jiayuan Mao, Yufeng Zhang, Yuning Jiang, Lei Li, Weiwei Sun, Wei-Ying Ma |
Abstract | We propose Unified Visual-Semantic Embeddings (UniVSE) for learning a joint space of visual and textual concepts. The space unifies the concepts at different levels, including objects, attributes, relations, and full scenes. A contrastive learning approach is proposed for the fine-grained alignment from only image-caption pairs. Moreover, we present an effective approach for enforcing the coverage of semantic components that appear in the sentence. We demonstrate the robustness of Unified VSE in defending text-domain adversarial attacks on cross-modal retrieval tasks. Such robustness also empowers the use of visual cues to resolve word dependencies in novel sentences. |
Tasks | Cross-Modal Retrieval |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05521v2 |
http://arxiv.org/pdf/1904.05521v2.pdf | |
PWC | https://paperswithcode.com/paper/unified-visual-semantic-embeddings-bridging |
Repo | |
Framework | |
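The contrastive learning approach over image-caption pairs mentioned in the abstract is not given in closed form here; a common formulation of such an objective, a hinge loss that pushes each matched pair's cosine similarity above every mismatched pair's by a margin in the style of visual-semantic embedding losses, is sketched below as an assumed stand-in rather than UniVSE's exact loss:

```python
import torch
import torch.nn.functional as F

def contrastive_vse_loss(img_emb, txt_emb, margin=0.2):
    """Hinge-based contrastive loss over a batch of matched image-caption pairs:
    each matched pair's similarity should beat every mismatched pair's by a margin."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    sim = img @ txt.t()                       # cosine similarities, (B, B)
    pos = sim.diag().unsqueeze(1)             # matched-pair similarities, (B, 1)
    cost_img = (margin + sim - pos).clamp(min=0)      # image i vs. wrong captions j
    cost_txt = (margin + sim - pos.t()).clamp(min=0)  # caption j vs. wrong images i
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    cost_img = cost_img.masked_fill(mask, 0)
    cost_txt = cost_txt.masked_fill(mask, 0)
    return cost_img.mean() + cost_txt.mean()

img = torch.randn(8, 128, requires_grad=True)
txt = torch.randn(8, 128, requires_grad=True)
print(contrastive_vse_loss(img, txt))
```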
Automated Machine Learning: State-of-The-Art and Open Challenges
Title | Automated Machine Learning: State-of-The-Art and Open Challenges |
Authors | Radwa Elshawi, Mohamed Maher, Sherif Sakr |
Abstract | With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address these challenges. Thus, there was a crucial need for automating the process of building good machine learning models. In the last few years, several techniques and frameworks have been introduced to tackle the challenge of automating the process of Combined Algorithm Selection and Hyper-parameter tuning (CASH) in the machine learning domain. The main aim of these techniques is to reduce the role of the human in the loop and fill the gap for non-expert machine learning users by playing the role of the domain expert. In this paper, we present a comprehensive survey for the state-of-the-art efforts in tackling the CASH problem. In addition, we highlight the research work of automating the other steps of the full complex machine learning pipeline (AutoML) from data understanding to model deployment. Furthermore, we provide comprehensive coverage for the various tools and frameworks that have been introduced in this domain. Finally, we discuss some of the research directions and open challenges that need to be addressed in order to achieve the vision and goals of the AutoML process. |
Tasks | AutoML |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02287v2 |
https://arxiv.org/pdf/1906.02287v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-machine-learning-state-of-the-art |
Repo | |
Framework | |
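To make the CASH problem concrete: it is a joint search over which learning algorithm to use and which hyperparameters to give it. The naive random-search sketch below over a tiny scikit-learn space (an illustrative stand-in; the surveyed systems use far smarter Bayesian and bandit-based strategies) shows the shape of the problem:

```python
import random
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

random.seed(0)
X, y = load_breast_cancer(return_X_y=True)

# Joint search space: (algorithm, hyperparameter sampler) pairs.
space = [
    (LogisticRegression, lambda: {"C": 10 ** random.uniform(-3, 3), "max_iter": 2000}),
    (RandomForestClassifier, lambda: {"n_estimators": random.randint(10, 200),
                                      "max_depth": random.randint(2, 10)}),
    (SVC, lambda: {"C": 10 ** random.uniform(-3, 3), "gamma": "scale"}),
]

best = (None, None, -1.0)
for _ in range(20):  # naive random search over the joint space
    algo, sample = random.choice(space)
    params = sample()
    score = cross_val_score(algo(**params), X, y, cv=3).mean()
    if score > best[2]:
        best = (algo.__name__, params, score)

print("best:", best)
```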
A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications
Title | A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications |
Authors | Asif J. Chowdhury, Wenqiang Yang, Kareem E. Abdelfatah, Mehdi Zare, Andreas Heyden, Gabriel Terejanu |
Abstract | Computational catalyst discovery involves the development of microkinetic reactor models based on estimated parameters determined from density functional theory (DFT). For complex surface chemistries, the cost of calculating the adsorption energies by DFT for a large number of reaction intermediates can become prohibitive. Here, we have identified appropriate descriptors and machine learning models that can be used to predict part of these adsorption energies given data on the rest of them. Our investigations also included the case when the species data used to train the predictive model is of different size relative to the species the model tries to predict - an extrapolation in the data space which is typically difficult with regular machine learning models. We have developed a neural network based predictive model that combines an established model with the concepts of a convolutional neural network that, when extrapolating, achieves significant improvement over the previous models. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00623v1 |
https://arxiv.org/pdf/1910.00623v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multiple-filter-based-neural-network |
Repo | |
Framework | |
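The abstract does not give the descriptors or the exact architecture; as a loose illustration of the "multiple filter" idea (parallel convolutions with different kernel sizes over a variable-length, per-atom descriptor sequence, pooled to a fixed-size vector so species of unseen size can still be scored), a toy regressor might look like the following, where all dimensions are assumptions:

```python
import torch
import torch.nn as nn

class MultiFilterRegressor(nn.Module):
    """Parallel 1D convolutions with different kernel sizes over a per-atom descriptor
    sequence; global max-pooling makes the output independent of species size, which
    is what permits extrapolating to larger or smaller species than seen in training."""
    def __init__(self, descriptor_dim=8, channels=16, kernel_sizes=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(descriptor_dim, channels, k, padding=k - 1) for k in kernel_sizes
        )
        self.head = nn.Linear(channels * len(kernel_sizes), 1)

    def forward(self, x):                      # x: (batch, descriptor_dim, n_atoms)
        pooled = [conv(x).amax(dim=-1) for conv in self.branches]
        return self.head(torch.cat(pooled, dim=-1)).squeeze(-1)

model = MultiFilterRegressor()
small_species = torch.randn(4, 8, 3)           # species sizes seen in training
large_species = torch.randn(4, 8, 7)           # extrapolation target: more atoms
print(model(small_species).shape, model(large_species).shape)
```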
Context Aware Road-user Importance Estimation (iCARE)
Title | Context Aware Road-user Importance Estimation (iCARE) |
Authors | Alireza Rahimpour, Sujitha Martin, Ashish Tawari, Hairong Qi |
Abstract | Road-users are a critical part of decision-making for both self-driving cars and driver assistance systems. Some road-users, however, are more important for decision-making than others because of their respective intentions, ego vehicle’s intention and their effects on each other. In this paper, we propose a novel architecture for road-user importance estimation which takes advantage of the local and global context of the scene. For local context, the model exploits the appearance of the road users (which captures orientation, intention, etc.) and their location relative to ego-vehicle. The global context in our model is defined based on the feature map of the convolutional layer of the module which predicts the future path of the ego-vehicle and contains rich global information of the scene (e.g., infrastructure, road lanes, etc.), as well as the ego vehicle’s intention information. Moreover, this paper introduces a new data set of real-world driving, concentrated around intersections and including annotations of important road users. Systematic evaluations of our proposed method against several baselines show promising results. |
Tasks | Decision Making, Self-Driving Cars |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.05152v1 |
https://arxiv.org/pdf/1909.05152v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-road-user-importance-estimation |
Repo | |
Framework | |
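The abstract describes fusing each road user's appearance and relative location with a global context vector taken from the ego-path prediction module's convolutional features; a skeletal fusion head under assumed feature shapes (none of the dimensions or the backbone come from the paper) could be:

```python
import torch
import torch.nn as nn

class ImportanceHead(nn.Module):
    """Scores each road user by fusing its local appearance feature and relative
    location with a pooled global-context vector (assumed to come from the conv
    layer of an ego-path prediction module)."""
    def __init__(self, app_dim=256, ctx_channels=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(app_dim + 2 + ctx_channels, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, appearance, rel_location, context_map):
        # context_map: (B, ctx_channels, H, W) -> one pooled global vector per image
        ctx = context_map.mean(dim=(2, 3))
        # Broadcast the global vector to every detected road user in the image.
        ctx = ctx.unsqueeze(1).expand(-1, appearance.size(1), -1)
        x = torch.cat([appearance, rel_location, ctx], dim=-1)
        return self.fuse(x).squeeze(-1)        # importance score per road user

head = ImportanceHead()
scores = head(torch.randn(2, 5, 256),          # 2 images, 5 road users each
              torch.randn(2, 5, 2),            # (dx, dy) relative to the ego-vehicle
              torch.randn(2, 64, 8, 8))
print(scores.shape)                            # -> torch.Size([2, 5])
```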
Cross-Spectral Face Hallucination via Disentangling Independent Factors
Title | Cross-Spectral Face Hallucination via Disentangling Independent Factors |
Authors | Boyan Duan, Chaoyou Fu, Yi Li, Xingguang Song, Ran He |
Abstract | The cross-sensor gap is one of the challenges that have aroused much research interest in Heterogeneous Face Recognition (HFR). Although recent methods have attempted to fill the gap with deep generative networks, most of them suffer from the inevitable misalignment between different face modalities. Instead of imaging sensors, the misalignment primarily results from facial geometric variations that are independent of the spectrum. Rather than building a monolithic but complex structure, this paper proposes a Pose Aligned Cross-spectral Hallucination (PACH) approach to disentangle the independent factors and deal with them in individual stages. In the first stage, an Unsupervised Face Alignment (UFA) module is designed to align the facial shapes of the near-infrared (NIR) images with those of the visible (VIS) images in a generative way, where UV maps are effectively utilized as the shape guidance. Thus the task of the second stage becomes spectrum translation with aligned paired data. We develop a Texture Prior Synthesis (TPS) module to achieve complexion control and consequently generate more realistic VIS images than existing methods. Experiments on three challenging NIR-VIS datasets verify the effectiveness of our approach in producing visually appealing images and achieving state-of-the-art performance in HFR. |
Tasks | Face Alignment, Face Hallucination, Face Recognition, Heterogeneous Face Recognition |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04365v2 |
https://arxiv.org/pdf/1909.04365v2.pdf | |
PWC | https://paperswithcode.com/paper/pose-agnostic-cross-spectral-hallucination |
Repo | |
Framework | |
Correcting rural building annotations in OpenStreetMap using convolutional neural networks
Title | Correcting rural building annotations in OpenStreetMap using convolutional neural networks |
Authors | John E. Vargas-Muñoz, Sylvain Lobry, Alexandre X. Falcão, Devis Tuia |
Abstract | Rural building mapping is paramount to support demographic studies and plan actions in response to crises that affect those areas. Rural building annotations exist in OpenStreetMap (OSM), but their quality and quantity are not sufficient for training models that can create accurate rural building maps. The problems with these annotations essentially fall into three categories: (i) most commonly, many annotations are geometrically misaligned with the updated imagery; (ii) some annotations do not correspond to buildings in the images (they are misannotations or the buildings have been destroyed); and (iii) some annotations are missing for buildings in the images (the buildings were never annotated or were built between subsequent image acquisitions). First, we propose a method based on Markov Random Field (MRF) to align the buildings with their annotations. The method maximizes the correlation between annotations and a building probability map while enforcing that nearby buildings have similar alignment vectors. Second, the annotations with no evidence in the building probability map are removed. Third, we present a method to detect non-annotated buildings with predefined shapes and add their annotation. The proposed methodology shows considerable improvement in accuracy of the OSM annotations for two regions of Tanzania and Zimbabwe, being more accurate than state-of-the-art baselines. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08190v1 |
http://arxiv.org/pdf/1901.08190v1.pdf | |
PWC | https://paperswithcode.com/paper/correcting-rural-building-annotations-in |
Repo | |
Framework | |
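The alignment step can be read as MAP inference in an MRF whose data term is the building probability covered by each shifted annotation and whose pairwise term penalizes neighboring annotations with different alignment vectors. The toy sketch below runs iterated conditional modes on synthetic footprints and a random probability map; it is a simplification for illustration, not the paper's solver or energy:

```python
import numpy as np

rng = np.random.default_rng(0)
prob_map = rng.random((60, 60))                       # stand-in CNN building-probability map

# Toy annotations: 3x3 square footprints given as (row_indices, col_indices).
def square(r, c, size=3):
    rr, cc = np.meshgrid(np.arange(r, r + size), np.arange(c, c + size), indexing="ij")
    return rr.ravel(), cc.ravel()

footprints = [square(10, 10), square(14, 12), square(40, 40)]
neighbors = {0: [1], 1: [0], 2: []}                   # only nearby annotations interact

def unary(prob_map, fp, shift):
    """Data term: total building probability covered by the shifted footprint."""
    rows = np.clip(fp[0] + shift[0], 0, prob_map.shape[0] - 1)
    cols = np.clip(fp[1] + shift[1], 0, prob_map.shape[1] - 1)
    return prob_map[rows, cols].sum()

def align_annotations(prob_map, footprints, neighbors, max_shift=3, smooth=0.5, sweeps=5):
    """Iterated-conditional-modes inference on a toy MRF: each annotation picks the
    alignment shift maximizing its data term minus a penalty for differing from the
    shifts of neighboring annotations (the smoothness prior from the abstract)."""
    shifts = [(0, 0)] * len(footprints)
    candidates = [(dy, dx) for dy in range(-max_shift, max_shift + 1)
                           for dx in range(-max_shift, max_shift + 1)]
    for _ in range(sweeps):
        for i, fp in enumerate(footprints):
            def score(s, i=i, fp=fp):
                pairwise = sum(abs(s[0] - shifts[j][0]) + abs(s[1] - shifts[j][1])
                               for j in neighbors[i])
                return unary(prob_map, fp, s) - smooth * pairwise
            shifts[i] = max(candidates, key=score)
    return shifts

print(align_annotations(prob_map, footprints, neighbors))
```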
LAMP-HQ: A Large-Scale Multi-Pose High-Quality Database and Benchmark for NIR-VIS Face Recognition
Title | LAMP-HQ: A Large-Scale Multi-Pose High-Quality Database and Benchmark for NIR-VIS Face Recognition |
Authors | Aijing Yu, Haoxue Wu, Huaibo Huang, Zhen Lei, Ran He |
Abstract | Near-infrared-visible (NIR-VIS) heterogeneous face recognition matches NIR to corresponding VIS face images. However, due to the sensing gap, NIR images often lose some identity information so that the recognition issue is more difficult than conventional VIS face recognition. Recently, NIR-VIS heterogeneous face recognition has attracted considerable attention in the computer vision community because of its convenience and adaptability in practical applications. Various deep learning-based methods have been proposed and substantially increased the recognition performance, but the lack of NIR-VIS training samples leads to the difficulty of the model training process. In this paper, we propose a new Large-Scale Multi-Pose High-Quality NIR-VIS database LAMP-HQ containing 56,788 NIR and 16,828 VIS images of 573 subjects with large diversities in pose, illumination, attribute, scene and accessory. We furnish a benchmark along with the protocol for NIR-VIS face recognition via generation on LAMP-HQ, including Pixel2Pixel, CycleGAN, and ADFL. Furthermore, we propose a novel exemplar-based variational spectral attention network to produce high-fidelity VIS images from NIR data. A spectral conditional attention module is introduced to reduce the domain gap between NIR and VIS data and then improve the performance of NIR-VIS heterogeneous face recognition on various databases including the LAMP-HQ. |
Tasks | Face Recognition, Heterogeneous Face Recognition |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.07809v2 |
https://arxiv.org/pdf/1912.07809v2.pdf | |
PWC | https://paperswithcode.com/paper/lamp-hq-a-large-scale-multi-pose-high-quality |
Repo | |
Framework | |
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
Title | Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization |
Authors | Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie |
Abstract | Spiking neural network is an important family of models to emulate the brain, which has been widely adopted by neuromorphic platforms. In the meantime, it is well-known that the huge memory and compute costs of neural networks greatly hinder the execution with high efficiency, especially on edge devices. To this end, model compression is proposed as a promising technique to improve the running efficiency via parameter and operation reduction. Therefore, it is interesting to investigate how much an SNN model can be compressed without compromising much functionality. However, this is quite challenging because SNNs usually behave distinctly from deep learning models. Specifically, i) the accuracy of spike-coded SNNs is usually sensitive to any network change; ii) the computation of SNNs is event-driven rather than static. Here we present a comprehensive SNN compression through three steps. First, we formulate the connection pruning and the weight quantization as a supervised learning-based constrained optimization problem. Second, we combine the emerging spatio-temporal backpropagation and the powerful alternating direction method of multipliers to solve the problem with minimum accuracy loss. Third, we further propose an activity regularization to reduce the spike events for fewer active operations. We define several quantitative metrics to evaluate the compression performance for SNNs and validate our methodology in pattern recognition tasks over MNIST, N-MNIST, and CIFAR10 datasets. Extensive comparisons between different compression strategies, the corresponding result analysis, and some interesting insights are provided. To the best of our knowledge, this is the first work that studies SNN compression in a comprehensive manner by exploiting all possible compression ways and achieves better results. Our work offers a promising solution to pursue ultra-efficient neuromorphic systems. |
Tasks | Model Compression, Quantization |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00822v1 |
https://arxiv.org/pdf/1911.00822v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-snn-compression-using-admm |
Repo | |
Framework | |
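Two of the three steps lend themselves to tiny illustrations: the activity regularization (penalizing average firing rates so fewer events are processed) and the projection that a weight-sparsity constraint typically reduces to inside pruning-ADMM (keeping only the largest-magnitude weights). The sketch below shows both on toy tensors; the full ADMM formulation and STBP training are in the paper and are not reproduced here:

```python
import torch

def activity_regularizer(spikes_per_layer, lam=1e-4):
    """Penalize the average firing rate of each layer; fewer spikes means fewer
    event-driven operations on neuromorphic hardware."""
    return lam * sum(s.float().mean() for s in spikes_per_layer)

def project_topk(weight, sparsity=0.8):
    """Projection used in a pruning-ADMM auxiliary update: keep only the
    largest-magnitude (1 - sparsity) fraction of weights and zero the rest."""
    k = max(1, int(weight.numel() * (1 - sparsity)))
    flat = weight.abs().flatten()
    threshold = flat.topk(k).values.min()
    return weight * (weight.abs() >= threshold)

# Toy usage: binary spike trains from two layers, shaped (time, batch, neurons).
spikes = [torch.randint(0, 2, (10, 4, 100)), torch.randint(0, 2, (10, 4, 50))]
task_loss = torch.tensor(0.7)                      # stand-in for the STBP task loss
total_loss = task_loss + activity_regularizer(spikes)
print(total_loss, project_topk(torch.randn(8, 8)).count_nonzero())
```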