Paper Group ANR 1362
Forecasting Granular Audience Size for Online Advertising
Title | Forecasting Granular Audience Size for Online Advertising |
Authors | Ritwik Sinha, Dhruv Singal, Pranav Maneriker, Kushal Chawla, Yash Shrivastava, Deepak Pai, Atanu R Sinha |
Abstract | Orchestration of campaigns for online display advertising requires marketers to forecast audience size at the granularity of specific attributes of web traffic, characterized by the categorical nature of all attributes (e.g. {US, Chrome, Mobile}). With each attribute taking many values, the very large attribute combination set makes estimating audience size for any specific attribute combination challenging. We modify Eclat, a frequent itemset mining (FIM) algorithm, to accommodate categorical variables. For consequent frequent and infrequent itemsets, we then provide forecasts using time series analysis with conditional probabilities to aid approximation. An extensive simulation, based on typical characteristics of audience data, is built to stress test our modified-FIM approach. In two real datasets, comparison with baselines, including neural network models, shows that our method lowers the computation time of FIM for categorical data. On hold-out samples we show that the proposed forecasting method outperforms these baselines. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02412v1 |
http://arxiv.org/pdf/1901.02412v1.pdf | |
PWC | https://paperswithcode.com/paper/forecasting-granular-audience-size-for-online |
Repo | |
Framework | |
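The abstract above says Eclat is modified to accommodate categorical attributes but does not spell out the algorithm; as a rough point of reference, here is a minimal sketch of vanilla Eclat (vertical tidset intersection) over categorical attribute=value items, using a made-up toy traffic log rather than the paper's data or its modifications:

```python
from itertools import combinations

# Hypothetical traffic log: each hit is a set of categorical attribute=value items.
hits = [
    {"country=US", "browser=Chrome", "device=Mobile"},
    {"country=US", "browser=Chrome", "device=Desktop"},
    {"country=IN", "browser=Chrome", "device=Mobile"},
    {"country=US", "browser=Safari", "device=Mobile"},
]

def eclat(hits, min_support=2):
    """Vertical (Eclat-style) frequent itemset mining: each item keeps the set of
    transaction ids it occurs in, and itemsets grow by intersecting those tidsets."""
    tidsets = {}
    for tid, hit in enumerate(hits):
        for item in hit:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def grow(prefix, prefix_tids, candidates):
        for i, (item, tids) in enumerate(candidates):
            new_tids = prefix_tids & tids if prefix else tids
            if len(new_tids) >= min_support:
                itemset = prefix + (item,)
                frequent[itemset] = len(new_tids)
                # Extend only with later items to avoid generating duplicates.
                grow(itemset, new_tids, candidates[i + 1:])

    grow((), set(), sorted(tidsets.items()))
    return frequent

for itemset, support in eclat(hits).items():
    print(itemset, support)
```

Each frequent attribute combination's support series over time would then feed the forecasting step described in the abstract.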
Robustness Certificates Against Adversarial Examples for ReLU Networks
Title | Robustness Certificates Against Adversarial Examples for ReLU Networks |
Authors | Sahil Singla, Soheil Feizi |
Abstract | While neural networks have achieved high performance in different learning tasks, their accuracy drops significantly in the presence of small adversarial perturbations to inputs. Defenses based on regularization and adversarial training are often followed by new attacks to defeat them. In this paper, we propose attack-agnostic robustness certificates for a multi-label classification problem using a deep ReLU network. Although computing the exact distance of a given input sample to the classification decision boundary requires solving a non-convex optimization, we characterize two lower bounds for such distances, namely the simplex certificate and the decision boundary certificate. These robustness certificates leverage the piece-wise linear structure of ReLU networks and use the fact that in a polyhedron around a given sample, the prediction function is linear. In particular, the proposed simplex certificate has a closed-form, is differentiable and is an order of magnitude faster to compute than the existing methods even for deep networks. In addition to theoretical bounds, we provide numerical results for our certificates over MNIST and compare them with some existing upper bounds. |
Tasks | Multi-Label Classification |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.01235v2 |
http://arxiv.org/pdf/1902.01235v2.pdf | |
PWC | https://paperswithcode.com/paper/robustness-certificates-against-adversarial |
Repo | |
Framework | |
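The paper's simplex and decision-boundary certificates are not reproduced in the abstract; as a hedged illustration of the underlying idea only, note that inside the activation polyhedron containing x a ReLU network is exactly linear, so the pairwise margin divided by the gradient-difference norm gives the distance to each locally linear class boundary. The toy computation below uses that formula as a local margin estimate; it is not the paper's closed-form simplex certificate and is not a sound global certificate without the region constraints the paper handles.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy ReLU classifier; within the activation region containing x, it is exactly linear.
net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

def local_margin_certificate(net, x):
    """min over j != c of (f_c(x) - f_j(x)) / ||grad f_c(x) - grad f_j(x)||_2,
    i.e. the distance to the nearest boundary of the locally linear model."""
    x = x.clone().requires_grad_(True)
    logits = net(x)
    c = logits.argmax().item()
    grads = []
    for k in range(logits.numel()):
        g, = torch.autograd.grad(logits[k], x, retain_graph=True)
        grads.append(g)
    best = float("inf")
    for j in range(logits.numel()):
        if j == c:
            continue
        margin = (logits[c] - logits[j]).item()
        denom = (grads[c] - grads[j]).norm().item()
        if denom > 0:
            best = min(best, margin / denom)
    return best

x = torch.randn(784)
print("local margin estimate:", local_margin_certificate(net, x))
```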
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
Title | WaLDORf: Wasteless Language-model Distillation On Reading-comprehension |
Authors | James Yi Tian, Alexander P. Kreuzer, Pai-Hung Chen, Hans-Martin Will |
Abstract | Transformer based Very Large Language Models (VLLMs) like BERT, XLNet and RoBERTa, have recently shown tremendous performance on a large variety of Natural Language Understanding (NLU) tasks. However, due to their size, these VLLMs are extremely resource intensive and cumbersome to deploy at production time. Several recent publications have looked into various ways to distil knowledge from a transformer based VLLM (most commonly BERT-Base) into a smaller model which can run much faster at inference time. Here, we propose a novel set of techniques which together produce a task-specific hybrid convolutional and transformer model, WaLDORf, that achieves state-of-the-art inference speed while still being more accurate than previous distilled models. |
Tasks | Language Modelling, Reading Comprehension |
Published | 2019-12-13 |
URL | https://arxiv.org/abs/1912.06638v2 |
https://arxiv.org/pdf/1912.06638v2.pdf | |
PWC | https://paperswithcode.com/paper/waldorf-wasteless-language-model-distillation |
Repo | |
Framework | |
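WaLDORf's hybrid convolutional/transformer architecture and its exact distillation targets are not detailed in the abstract; the generic soft-target distillation objective this line of work builds on (temperature-softened teacher probabilities mixed with the usual hard-label loss) can be sketched as follows, with random toy tensors standing in for real model outputs:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard soft-target knowledge distillation: KL between temperature-softened
    teacher and student distributions, mixed with the usual hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                       # rescale so gradients are comparable to the hard loss
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits for a batch of 8 examples and 5 classes.
s = torch.randn(8, 5, requires_grad=True)
t = torch.randn(8, 5)
y = torch.randint(0, 5, (8,))
print(distillation_loss(s, t, y))
```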
Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence
Title | Confederated Machine Learning on Horizontally and Vertically Separated Medical Data for Large-Scale Health System Intelligence |
Authors | Dianbo Liu, Timothy A Miller, Kenneth D. Mandl |
Abstract | A patient’s health information is generally fragmented across silos. Though it is technically feasible to unite data for analysis in a manner that underpins a rapid learning healthcare system, privacy concerns and regulatory barriers limit data centralization. Machine learning can be conducted in a federated manner on patient datasets with the same set of variables, but separated across sites of care. But federated learning cannot handle the situation where different data types for a given patient are separated vertically across different organizations. We call methods that enable machine learning model training on data separated by two or more degrees “confederated machine learning.” We built and evaluated a confederated machine learning model to stratify the risk of accidental falls among the elderly. |
Tasks | |
Published | 2019-10-04 |
URL | https://arxiv.org/abs/1910.02109v1 |
https://arxiv.org/pdf/1910.02109v1.pdf | |
PWC | https://paperswithcode.com/paper/confederated-machine-learning-on-horizontally |
Repo | |
Framework | |
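As the abstract notes, plain federated learning covers only the horizontally separated case (same variables, different patients per site); the confederated extension to vertically separated data is the paper's contribution and is not sketched here. For orientation, a minimal federated-averaging baseline over horizontally split sites, with synthetic data and a toy logistic-regression learner, looks like:

```python
import numpy as np

def local_logreg_step(w, X, y, lr=0.1, epochs=20):
    """A few epochs of plain logistic-regression gradient descent on one site's data."""
    w = w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def federated_average(sites, dim, rounds=10):
    """FedAvg over horizontally separated sites: each site trains locally on its own
    patients, and only the parameters (weighted by site size) are pooled centrally."""
    w = np.zeros(dim)
    total = sum(len(y) for _, y in sites)
    for _ in range(rounds):
        local = [local_logreg_step(w, X, y) for X, y in sites]
        w = sum(len(y) * wk for wk, (_, y) in zip(local, sites)) / total
    return w

rng = np.random.default_rng(0)
sites = [(rng.normal(size=(50, 4)), rng.integers(0, 2, 50).astype(float)) for _ in range(3)]
print(federated_average(sites, dim=4))
```

Raw patient records never leave their site; only model parameters are exchanged, which is what makes the approach compatible with the privacy constraints described above.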
Fusion-supervised Deep Cross-modal Hashing
Title | Fusion-supervised Deep Cross-modal Hashing |
Authors | Li Wang, Lei Zhu, En Yu, Jiande Sun, Huaxiang Zhang |
Abstract | Deep hashing has recently received attention in cross-modal retrieval for its impressive advantages. However, existing hashing methods for cross-modal retrieval cannot fully capture the heterogeneous multi-modal correlation and exploit the semantic information. In this paper, we propose a novel Fusion-supervised Deep Cross-modal Hashing (FDCH) approach. Firstly, FDCH learns unified binary codes through a fusion hash network with paired samples as input, which effectively enhances the modeling of the correlation of heterogeneous multi-modal data. Then, these high-quality unified hash codes further supervise the training of the modality-specific hash networks for encoding out-of-sample queries. Meanwhile, both pair-wise similarity information and classification information are embedded in the hash networks under one stream framework, which simultaneously preserves cross-modal similarity and keeps semantic consistency. Experimental results on two benchmark datasets demonstrate the state-of-the-art performance of FDCH. |
Tasks | Cross-Modal Retrieval |
Published | 2019-04-25 |
URL | https://arxiv.org/abs/1904.11171v2 |
https://arxiv.org/pdf/1904.11171v2.pdf | |
PWC | https://paperswithcode.com/paper/fusion-supervised-deep-cross-modal-hashing |
Repo | |
Framework | |
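FDCH's fusion hash network is not specified in the abstract; the kind of pairwise-similarity objective that deep cross-modal hashing methods commonly optimize (a negative log-likelihood over relaxed, tanh-valued codes plus a quantization term) is sketched below as an illustration, not as the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def pairwise_hash_loss(img_codes, txt_codes, sim):
    """theta_ij = 0.5 * <u_i, v_j>; similar pairs (sim=1) are pushed toward high inner
    product and dissimilar pairs (sim=0) toward low, via the pairwise log-likelihood."""
    theta = 0.5 * img_codes @ txt_codes.t()
    return (F.softplus(theta) - sim * theta).mean()

def quantization_loss(codes):
    """Encourage the relaxed codes to sit near the binary vertices {-1, +1}."""
    return (codes - codes.sign()).pow(2).mean()

# Toy relaxed codes for 4 images and 4 texts with 16-bit hashes.
u = torch.tanh(torch.randn(4, 16, requires_grad=True))
v = torch.tanh(torch.randn(4, 16, requires_grad=True))
S = torch.eye(4)  # paired image/text are similar
loss = pairwise_hash_loss(u, v, S) + 0.1 * quantization_loss(u) + 0.1 * quantization_loss(v)
print(loss)
```

At retrieval time, sign(u) and sign(v) give the binary codes that are compared by Hamming distance.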
QUEST: Quantized embedding space for transferring knowledge
Title | QUEST: Quantized embedding space for transferring knowledge |
Authors | Himalaya Jain, Spyros Gidaris, Nikos Komodakis, Patrick Pérez, Matthieu Cord |
Abstract | Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high capacity teacher network. Most of the existing knowledge distillation methods direct the student to follow the teacher by matching the teacher’s output, feature maps or their distribution. In this work, we propose a novel way to achieve this goal: by distilling the knowledge through a quantized space. According to our method, the teacher’s feature maps are quantized to represent the main visual concepts encompassed in the feature maps. The student is then asked to predict the quantized representation, which thus forms the task that the student uses to learn from the teacher. Despite its simplicity, we show that our approach is able to yield results that improve the state of the art on knowledge distillation. To that end, we provide an extensive evaluation across several network architectures and most commonly used benchmark datasets. |
Tasks | |
Published | 2019-12-03 |
URL | https://arxiv.org/abs/1912.01540v1 |
https://arxiv.org/pdf/1912.01540v1.pdf | |
PWC | https://paperswithcode.com/paper/quest-quantized-embedding-space-for |
Repo | |
Framework | |
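The abstract describes quantizing the teacher's feature maps into a small set of visual concepts and asking the student to predict those assignments; a bare-bones version of that idea (a k-means codebook over stand-in teacher feature vectors and a linear student head trained with cross-entropy on the cluster labels) could be sketched as:

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Assumed teacher feature vectors (e.g. one per spatial location), 256-dim.
teacher_feats = rng.normal(size=(1000, 256)).astype(np.float32)

# Quantize the teacher's feature space into K "visual words".
K = 32
codebook = KMeans(n_clusters=K, n_init=10, random_state=0).fit(teacher_feats)
assignments = torch.from_numpy(codebook.labels_).long()

# The student maps its own (here random, lower-dimensional) features to K logits and
# is trained to predict which teacher codeword each position falls into.
student_feats = torch.randn(1000, 64)
student_head = torch.nn.Linear(64, K)
logits = student_head(student_feats)
loss = F.cross_entropy(logits, assignments)
print(loss)
```

Predicting discrete codeword indices, rather than regressing the teacher's raw feature values, is the distinguishing design choice the abstract highlights.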
Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks
Title | Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks |
Authors | Seoung Wug Oh, Joon-Young Lee, Ning Xu, Seon Joo Kim |
Abstract | We present a deep learning method for interactive video object segmentation. Our method is built upon two core operations, interaction and propagation, and each operation is conducted by Convolutional Neural Networks. The two networks are connected both internally and externally so that the networks are trained jointly and interact with each other to solve the complex video object segmentation problem. We propose a new multi-round training scheme for interactive video object segmentation so that the networks can learn how to understand the user’s intention and update incorrect estimations during the training. At test time, our method produces high-quality results and also runs fast enough to work with users interactively. We evaluated the proposed method quantitatively on the interactive track benchmark at the DAVIS Challenge 2018. We outperformed other competing methods by a significant margin in both the speed and the accuracy. We also demonstrated that our method works well with real user interactions. |
Tasks | Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09791v2 |
http://arxiv.org/pdf/1904.09791v2.pdf | |
PWC | https://paperswithcode.com/paper/fast-user-guided-video-object-segmentation-by |
Repo | |
Framework | |
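The two-network design in the abstract can be pictured as a loop: per round, an interaction network turns user scribbles into a refined mask on the annotated frame, and a propagation network sweeps that mask to the remaining frames. The skeleton below uses single-convolution placeholders for both networks and random tensors for the video, so it only illustrates the control flow, not the paper's models:

```python
import torch
import torch.nn as nn

class InteractionNet(nn.Module):
    """Placeholder: refines the mask of the user-annotated frame, given the frame,
    the previous mask estimate, and the user scribbles."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3 + 1 + 1, 1, kernel_size=3, padding=1)
    def forward(self, frame, prev_mask, scribble):
        return torch.sigmoid(self.conv(torch.cat([frame, prev_mask, scribble], dim=1)))

class PropagationNet(nn.Module):
    """Placeholder: propagates a mask from the previous frame to the current one."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3 + 1, 1, kernel_size=3, padding=1)
    def forward(self, frame, neighbor_mask):
        return torch.sigmoid(self.conv(torch.cat([frame, neighbor_mask], dim=1)))

@torch.no_grad()
def segment(frames, scribbles_per_round, annotated_frame=0, rounds=2):
    inter, prop = InteractionNet(), PropagationNet()
    T = frames.shape[0]
    masks = torch.zeros(T, 1, *frames.shape[2:])
    for r in range(rounds):
        # 1) Interaction: update the annotated frame from the user's scribble.
        masks[annotated_frame:annotated_frame + 1] = inter(
            frames[annotated_frame:annotated_frame + 1],
            masks[annotated_frame:annotated_frame + 1],
            scribbles_per_round[r])
        # 2) Propagation: sweep the updated mask forward through the clip.
        for t in range(annotated_frame + 1, T):
            masks[t:t + 1] = prop(frames[t:t + 1], masks[t - 1:t])
    return masks

frames = torch.rand(5, 3, 32, 32)                 # 5-frame toy clip
scribbles = [torch.rand(1, 1, 32, 32) for _ in range(2)]
print(segment(frames, scribbles).shape)
```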
UniVSE: Robust Visual Semantic Embeddings via Structured Semantic Representations
Title | UniVSE: Robust Visual Semantic Embeddings via Structured Semantic Representations |
Authors | Hao Wu, Jiayuan Mao, Yufeng Zhang, Yuning Jiang, Lei Li, Weiwei Sun, Wei-Ying Ma |
Abstract | We propose Unified Visual-Semantic Embeddings (UniVSE) for learning a joint space of visual and textual concepts. The space unifies the concepts at different levels, including objects, attributes, relations, and full scenes. A contrastive learning approach is proposed for the fine-grained alignment from only image-caption pairs. Moreover, we present an effective approach for enforcing the coverage of semantic components that appear in the sentence. We demonstrate the robustness of Unified VSE in defending text-domain adversarial attacks on cross-modal retrieval tasks. Such robustness also empowers the use of visual cues to resolve word dependencies in novel sentences. |
Tasks | Cross-Modal Retrieval |
Published | 2019-04-11 |
URL | http://arxiv.org/abs/1904.05521v2 |
http://arxiv.org/pdf/1904.05521v2.pdf | |
PWC | https://paperswithcode.com/paper/unified-visual-semantic-embeddings-bridging |
Repo | |
Framework | |
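The contrastive learning approach over image-caption pairs mentioned in the abstract is not given in closed form here; a common formulation of such an objective, a hinge loss that pushes each matched pair's cosine similarity above every mismatched pair's by a margin in the style of visual-semantic embedding losses, is sketched below as an assumed stand-in rather than UniVSE's exact loss:

```python
import torch
import torch.nn.functional as F

def contrastive_vse_loss(img_emb, txt_emb, margin=0.2):
    """Hinge-based contrastive loss over a batch of matched image-caption pairs:
    each matched pair's similarity should beat every mismatched pair's by a margin."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    sim = img @ txt.t()                       # cosine similarities, (B, B)
    pos = sim.diag().unsqueeze(1)             # matched-pair similarities, (B, 1)
    cost_img = (margin + sim - pos).clamp(min=0)      # image i vs. wrong captions j
    cost_txt = (margin + sim - pos.t()).clamp(min=0)  # caption j vs. wrong images i
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    cost_img = cost_img.masked_fill(mask, 0)
    cost_txt = cost_txt.masked_fill(mask, 0)
    return cost_img.mean() + cost_txt.mean()

img = torch.randn(8, 128, requires_grad=True)
txt = torch.randn(8, 128, requires_grad=True)
print(contrastive_vse_loss(img, txt))
```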
Automated Machine Learning: State-of-The-Art and Open Challenges
Title | Automated Machine Learning: State-of-The-Art and Open Challenges |
Authors | Radwa Elshawi, Mohamed Maher, Sherif Sakr |
Abstract | With the continuous and vast increase in the amount of data in our digital world, it has been acknowledged that the number of knowledgeable data scientists cannot scale to address these challenges. Thus, there was a crucial need for automating the process of building good machine learning models. In the last few years, several techniques and frameworks have been introduced to tackle the challenge of automating the process of Combined Algorithm Selection and Hyper-parameter tuning (CASH) in the machine learning domain. The main aim of these techniques is to reduce the role of the human in the loop and fill the gap for non-expert machine learning users by playing the role of the domain expert. In this paper, we present a comprehensive survey for the state-of-the-art efforts in tackling the CASH problem. In addition, we highlight the research work of automating the other steps of the full complex machine learning pipeline (AutoML) from data understanding to model deployment. Furthermore, we provide comprehensive coverage for the various tools and frameworks that have been introduced in this domain. Finally, we discuss some of the research directions and open challenges that need to be addressed in order to achieve the vision and goals of the AutoML process. |
Tasks | AutoML |
Published | 2019-06-05 |
URL | https://arxiv.org/abs/1906.02287v2 |
https://arxiv.org/pdf/1906.02287v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-machine-learning-state-of-the-art |
Repo | |
Framework | |
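To make the CASH problem concrete: it is a joint search over which learning algorithm to use and which hyperparameters to give it. The naive random-search sketch below over a tiny scikit-learn space (an illustrative stand-in; the surveyed systems use far smarter Bayesian and bandit-based strategies) shows the shape of the problem:

```python
import random
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

random.seed(0)
X, y = load_breast_cancer(return_X_y=True)

# Joint search space: (algorithm, hyperparameter sampler) pairs.
space = [
    (LogisticRegression, lambda: {"C": 10 ** random.uniform(-3, 3), "max_iter": 2000}),
    (RandomForestClassifier, lambda: {"n_estimators": random.randint(10, 200),
                                      "max_depth": random.randint(2, 10)}),
    (SVC, lambda: {"C": 10 ** random.uniform(-3, 3), "gamma": "scale"}),
]

best = (None, None, -1.0)
for _ in range(20):  # naive random search over the joint space
    algo, sample = random.choice(space)
    params = sample()
    score = cross_val_score(algo(**params), X, y, cv=3).mean()
    if score > best[2]:
        best = (algo.__name__, params, score)

print("best:", best)
```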
A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications
Title | A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications |
Authors | Asif J. Chowdhury, Wenqiang Yang, Kareem E. Abdelfatah, Mehdi Zare, Andreas Heyden, Gabriel Terejanu |
Abstract | Computational catalyst discovery involves the development of microkinetic reactor models based on estimated parameters determined from density functional theory (DFT). For complex surface chemistries, the cost of calculating the adsorption energies by DFT for a large number of reaction intermediates can become prohibitive. Here, we have identified appropriate descriptors and machine learning models that can be used to predict part of these adsorption energies given data on the rest of them. Our investigations also included the case when the species data used to train the predictive model is of different size relative to the species the model tries to predict - an extrapolation in the data space which is typically difficult with regular machine learning models. We have developed a neural network based predictive model that combines an established model with the concepts of a convolutional neural network that, when extrapolating, achieves significant improvement over the previous models. |
Tasks | |
Published | 2019-10-01 |
URL | https://arxiv.org/abs/1910.00623v1 |
https://arxiv.org/pdf/1910.00623v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multiple-filter-based-neural-network |
Repo | |
Framework | |
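The abstract does not give the descriptors or the exact architecture; as a loose illustration of the "multiple filter" idea (parallel convolutions with different kernel sizes over a variable-length, per-atom descriptor sequence, pooled to a fixed-size vector so species of unseen size can still be scored), a toy regressor might look like the following, where all dimensions are assumptions:

```python
import torch
import torch.nn as nn

class MultiFilterRegressor(nn.Module):
    """Parallel 1D convolutions with different kernel sizes over a per-atom descriptor
    sequence; global max-pooling makes the output independent of species size, which
    is what permits extrapolating to larger or smaller species than seen in training."""
    def __init__(self, descriptor_dim=8, channels=16, kernel_sizes=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(descriptor_dim, channels, k, padding=k - 1) for k in kernel_sizes
        )
        self.head = nn.Linear(channels * len(kernel_sizes), 1)

    def forward(self, x):                      # x: (batch, descriptor_dim, n_atoms)
        pooled = [conv(x).amax(dim=-1) for conv in self.branches]
        return self.head(torch.cat(pooled, dim=-1)).squeeze(-1)

model = MultiFilterRegressor()
small_species = torch.randn(4, 8, 3)           # species sizes seen in training
large_species = torch.randn(4, 8, 7)           # extrapolation target: more atoms
print(model(small_species).shape, model(large_species).shape)
```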
Context Aware Road-user Importance Estimation (iCARE)
Title | Context Aware Road-user Importance Estimation (iCARE) |
Authors | Alireza Rahimpour, Sujitha Martin, Ashish Tawari, Hairong Qi |
Abstract | Road-users are a critical part of decision-making for both self-driving cars and driver assistance systems. Some road-users, however, are more important for decision-making than others because of their respective intentions, ego vehicle’s intention and their effects on each other. In this paper, we propose a novel architecture for road-user importance estimation which takes advantage of the local and global context of the scene. For local context, the model exploits the appearance of the road users (which captures orientation, intention, etc.) and their location relative to ego-vehicle. The global context in our model is defined based on the feature map of the convolutional layer of the module which predicts the future path of the ego-vehicle and contains rich global information of the scene (e.g., infrastructure, road lanes, etc.), as well as the ego vehicle’s intention information. Moreover, this paper introduces a new data set of real-world driving, concentrated around intersections and including annotations of important road users. Systematic evaluations of our proposed method against several baselines show promising results. |
Tasks | Decision Making, Self-Driving Cars |
Published | 2019-08-30 |
URL | https://arxiv.org/abs/1909.05152v1 |
https://arxiv.org/pdf/1909.05152v1.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-road-user-importance-estimation |
Repo | |
Framework | |
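The abstract describes fusing each road user's appearance and relative location with a global context vector taken from the ego-path prediction module's convolutional features; a skeletal fusion head under assumed feature shapes (none of the dimensions or the backbone come from the paper) could be:

```python
import torch
import torch.nn as nn

class ImportanceHead(nn.Module):
    """Scores each road user by fusing its local appearance feature and relative
    location with a pooled global-context vector (assumed to come from the conv
    layer of an ego-path prediction module)."""
    def __init__(self, app_dim=256, ctx_channels=64):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(app_dim + 2 + ctx_channels, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, appearance, rel_location, context_map):
        # context_map: (B, ctx_channels, H, W) -> one pooled global vector per image
        ctx = context_map.mean(dim=(2, 3))
        # Broadcast the global vector to every detected road user in the image.
        ctx = ctx.unsqueeze(1).expand(-1, appearance.size(1), -1)
        x = torch.cat([appearance, rel_location, ctx], dim=-1)
        return self.fuse(x).squeeze(-1)        # importance score per road user

head = ImportanceHead()
scores = head(torch.randn(2, 5, 256),          # 2 images, 5 road users each
              torch.randn(2, 5, 2),            # (dx, dy) relative to the ego-vehicle
              torch.randn(2, 64, 8, 8))
print(scores.shape)                            # -> torch.Size([2, 5])
```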
Cross-Spectral Face Hallucination via Disentangling Independent Factors
Title | Cross-Spectral Face Hallucination via Disentangling Independent Factors |
Authors | Boyan Duan, Chaoyou Fu, Yi Li, Xingguang Song, Ran He |
Abstract | The cross-sensor gap is one of the challenges that have aroused much research interest in Heterogeneous Face Recognition (HFR). Although recent methods have attempted to fill the gap with deep generative networks, most of them suffer from the inevitable misalignment between different face modalities. Instead of imaging sensors, the misalignment primarily results from facial geometric variations that are independent of the spectrum. Rather than building a monolithic but complex structure, this paper proposes a Pose Aligned Cross-spectral Hallucination (PACH) approach to disentangle the independent factors and deal with them in individual stages. In the first stage, an Unsupervised Face Alignment (UFA) module is designed to align the facial shapes of the near-infrared (NIR) images with those of the visible (VIS) images in a generative way, where UV maps are effectively utilized as the shape guidance. Thus the task of the second stage becomes spectrum translation with aligned paired data. We develop a Texture Prior Synthesis (TPS) module to achieve complexion control and consequently generate more realistic VIS images than existing methods. Experiments on three challenging NIR-VIS datasets verify the effectiveness of our approach in producing visually appealing images and achieving state-of-the-art performance in HFR. |
Tasks | Face Alignment, Face Hallucination, Face Recognition, Heterogeneous Face Recognition |
Published | 2019-09-10 |
URL | https://arxiv.org/abs/1909.04365v2 |
https://arxiv.org/pdf/1909.04365v2.pdf | |
PWC | https://paperswithcode.com/paper/pose-agnostic-cross-spectral-hallucination |
Repo | |
Framework | |
Correcting rural building annotations in OpenStreetMap using convolutional neural networks
Title | Correcting rural building annotations in OpenStreetMap using convolutional neural networks |
Authors | John E. Vargas-Muñoz, Sylvain Lobry, Alexandre X. Falcão, Devis Tuia |
Abstract | Rural building mapping is paramount to support demographic studies and plan actions in response to crises that affect those areas. Rural building annotations exist in OpenStreetMap (OSM), but their quality and quantity are not sufficient for training models that can create accurate rural building maps. The problems with these annotations essentially fall into three categories: (i) most commonly, many annotations are geometrically misaligned with the updated imagery; (ii) some annotations do not correspond to buildings in the images (they are misannotations or the buildings have been destroyed); and (iii) some annotations are missing for buildings in the images (the buildings were never annotated or were built between subsequent image acquisitions). First, we propose a method based on Markov Random Field (MRF) to align the buildings with their annotations. The method maximizes the correlation between annotations and a building probability map while enforcing that nearby buildings have similar alignment vectors. Second, the annotations with no evidence in the building probability map are removed. Third, we present a method to detect non-annotated buildings with predefined shapes and add their annotation. The proposed methodology shows considerable improvement in accuracy of the OSM annotations for two regions of Tanzania and Zimbabwe, being more accurate than state-of-the-art baselines. |
Tasks | |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08190v1 |
http://arxiv.org/pdf/1901.08190v1.pdf | |
PWC | https://paperswithcode.com/paper/correcting-rural-building-annotations-in |
Repo | |
Framework | |
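The alignment step can be read as MAP inference in an MRF whose data term is the building probability covered by each shifted annotation and whose pairwise term penalizes neighboring annotations with different alignment vectors. The toy sketch below runs iterated conditional modes on synthetic footprints and a random probability map; it is a simplification for illustration, not the paper's solver or energy:

```python
import numpy as np

rng = np.random.default_rng(0)
prob_map = rng.random((60, 60))                       # stand-in CNN building-probability map

# Toy annotations: 3x3 square footprints given as (row_indices, col_indices).
def square(r, c, size=3):
    rr, cc = np.meshgrid(np.arange(r, r + size), np.arange(c, c + size), indexing="ij")
    return rr.ravel(), cc.ravel()

footprints = [square(10, 10), square(14, 12), square(40, 40)]
neighbors = {0: [1], 1: [0], 2: []}                   # only nearby annotations interact

def unary(prob_map, fp, shift):
    """Data term: total building probability covered by the shifted footprint."""
    rows = np.clip(fp[0] + shift[0], 0, prob_map.shape[0] - 1)
    cols = np.clip(fp[1] + shift[1], 0, prob_map.shape[1] - 1)
    return prob_map[rows, cols].sum()

def align_annotations(prob_map, footprints, neighbors, max_shift=3, smooth=0.5, sweeps=5):
    """Iterated-conditional-modes inference on a toy MRF: each annotation picks the
    alignment shift maximizing its data term minus a penalty for differing from the
    shifts of neighboring annotations (the smoothness prior from the abstract)."""
    shifts = [(0, 0)] * len(footprints)
    candidates = [(dy, dx) for dy in range(-max_shift, max_shift + 1)
                           for dx in range(-max_shift, max_shift + 1)]
    for _ in range(sweeps):
        for i, fp in enumerate(footprints):
            def score(s, i=i, fp=fp):
                pairwise = sum(abs(s[0] - shifts[j][0]) + abs(s[1] - shifts[j][1])
                               for j in neighbors[i])
                return unary(prob_map, fp, s) - smooth * pairwise
            shifts[i] = max(candidates, key=score)
    return shifts

print(align_annotations(prob_map, footprints, neighbors))
```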
LAMP-HQ: A Large-Scale Multi-Pose High-Quality Database and Benchmark for NIR-VIS Face Recognition
Title | LAMP-HQ: A Large-Scale Multi-Pose High-Quality Database and Benchmark for NIR-VIS Face Recognition |
Authors | Aijing Yu, Haoxue Wu, Huaibo Huang, Zhen Lei, Ran He |
Abstract | Near-infrared-visible (NIR-VIS) heterogeneous face recognition matches NIR to corresponding VIS face images. However, due to the sensing gap, NIR images often lose some identity information so that the recognition issue is more difficult than conventional VIS face recognition. Recently, NIR-VIS heterogeneous face recognition has attracted considerable attention in the computer vision community because of its convenience and adaptability in practical applications. Various deep learning-based methods have been proposed and substantially increased the recognition performance, but the lack of NIR-VIS training samples leads to the difficulty of the model training process. In this paper, we propose a new Large-Scale Multi-Pose High-Quality NIR-VIS database LAMP-HQ containing 56,788 NIR and 16,828 VIS images of 573 subjects with large diversities in pose, illumination, attribute, scene and accessory. We furnish a benchmark along with the protocol for NIR-VIS face recognition via generation on LAMP-HQ, including Pixel2Pixel, CycleGAN, and ADFL. Furthermore, we propose a novel exemplar-based variational spectral attention network to produce high-fidelity VIS images from NIR data. A spectral conditional attention module is introduced to reduce the domain gap between NIR and VIS data and then improve the performance of NIR-VIS heterogeneous face recognition on various databases including the LAMP-HQ. |
Tasks | Face Recognition, Heterogeneous Face Recognition |
Published | 2019-12-17 |
URL | https://arxiv.org/abs/1912.07809v2 |
https://arxiv.org/pdf/1912.07809v2.pdf | |
PWC | https://paperswithcode.com/paper/lamp-hq-a-large-scale-multi-pose-high-quality |
Repo | |
Framework | |
Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization
Title | Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization |
Authors | Lei Deng, Yujie Wu, Yifan Hu, Ling Liang, Guoqi Li, Xing Hu, Yufei Ding, Peng Li, Yuan Xie |
Abstract | Spiking neural network is an important family of models to emulate the brain, which has been widely adopted by neuromorphic platforms. In the meantime, it is well-known that the huge memory and compute costs of neural networks greatly hinder the execution with high efficiency, especially on edge devices. To this end, model compression is proposed as a promising technique to improve the running efficiency via parameter and operation reduction. Therefore, it is interesting to investigate how much an SNN model can be compressed without compromising much functionality. However, this is quite challenging because SNNs usually behave distinctly from deep learning models. Specifically, i) the accuracy of spike-coded SNNs is usually sensitive to any network change; ii) the computation of SNNs is event-driven rather than static. Here we present a comprehensive SNN compression through three steps. First, we formulate the connection pruning and the weight quantization as a supervised learning-based constrained optimization problem. Second, we combine the emerging spatio-temporal backpropagation and the powerful alternating direction method of multipliers to solve the problem with minimum accuracy loss. Third, we further propose an activity regularization to reduce the spike events for fewer active operations. We define several quantitative metrics to evaluate the compression performance for SNNs and validate our methodology in pattern recognition tasks over MNIST, N-MNIST, and CIFAR10 datasets. Extensive comparisons between different compression strategies, the corresponding result analysis, and some interesting insights are provided. To the best of our knowledge, this is the first work that studies SNN compression in a comprehensive manner by exploiting all possible compression ways and achieves better results. Our work offers a promising solution to pursue ultra-efficient neuromorphic systems. |
Tasks | Model Compression, Quantization |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00822v1 |
https://arxiv.org/pdf/1911.00822v1.pdf | |
PWC | https://paperswithcode.com/paper/comprehensive-snn-compression-using-admm |
Repo | |
Framework | |
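Two of the three steps lend themselves to tiny illustrations: the activity regularization (penalizing average firing rates so fewer events are processed) and the projection that a weight-sparsity constraint typically reduces to inside pruning-ADMM (keeping only the largest-magnitude weights). The sketch below shows both on toy tensors; the full ADMM formulation and STBP training are in the paper and are not reproduced here:

```python
import torch

def activity_regularizer(spikes_per_layer, lam=1e-4):
    """Penalize the average firing rate of each layer; fewer spikes means fewer
    event-driven operations on neuromorphic hardware."""
    return lam * sum(s.float().mean() for s in spikes_per_layer)

def project_topk(weight, sparsity=0.8):
    """Projection used in a pruning-ADMM auxiliary update: keep only the
    largest-magnitude (1 - sparsity) fraction of weights and zero the rest."""
    k = max(1, int(weight.numel() * (1 - sparsity)))
    flat = weight.abs().flatten()
    threshold = flat.topk(k).values.min()
    return weight * (weight.abs() >= threshold)

# Toy usage: binary spike trains from two layers, shaped (time, batch, neurons).
spikes = [torch.randint(0, 2, (10, 4, 100)), torch.randint(0, 2, (10, 4, 50))]
task_loss = torch.tensor(0.7)                      # stand-in for the STBP task loss
total_loss = task_loss + activity_regularizer(spikes)
print(total_loss, project_topk(torch.randn(8, 8)).count_nonzero())
```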