Paper Group ANR 173
Uncertainty Quantification for Bayesian Optimization. Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging. DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding. The Indian Chefs Process. A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency …
Uncertainty Quantification for Bayesian Optimization
Title | Uncertainty Quantification for Bayesian Optimization |
Authors | Rui Tuo, Wenjia Wang |
Abstract | Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, by constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01569v1 |
https://arxiv.org/pdf/2002.01569v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-quantification-for-bayesian |
Repo | |
Framework | |
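The construction the abstract describes, confidence regions for the maximizer built from uniform GP error bounds, can be illustrated with a simple (non-rigorous) sketch: keep every candidate point whose upper confidence bound still exceeds the best lower confidence bound. The toy objective, kernel, and width multiplier `beta` below are illustrative choices, not the paper's calibrated values.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
f = lambda x: -(x - 0.3) ** 2                 # toy objective, maximized at x = 0.3
X = rng.uniform(0, 1, size=(8, 1))            # stand-in for sequentially sampled points
y = f(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6).fit(X, y)

grid = np.linspace(0, 1, 200).reshape(-1, 1)
mu, sd = gp.predict(grid, return_std=True)
beta = 2.0                                    # illustrative width; the paper derives valid levels
ucb, lcb = mu + beta * sd, mu - beta * sd

# Candidate maximizers: points whose UCB still beats the best guaranteed value.
region = grid[ucb >= lcb.max()].ravel()
print(f"confidence region spans [{region.min():.2f}, {region.max():.2f}]")
```

With noiseless observations of a smooth objective, the surviving set concentrates around the true maximizer at 0.3.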
Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging
Title | Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging |
Authors | Mamatha Thota, Stefanos Kollias, Mark Swainson, Georgios Leontidis |
Abstract | Retail food packaging contains information which informs choice and can be vital to consumer health, including product name, ingredients list, nutritional information, allergens, preparation guidelines, pack weight, storage and shelf life information (use-by / best before dates). The presence and accuracy of such information is critical to ensure a detailed understanding of the product and to reduce the potential for health risks. Consequently, erroneous or illegible labeling has the potential to be highly detrimental to consumers and many other stakeholders in the supply chain. In this paper, a multi-source deep learning-based domain adaptation system is proposed and tested to identify and verify the presence and legibility of use-by date information from food packaging photos taken as part of the validation process as the products pass along the food production line. This was achieved by improving the generalization of the techniques through the use of multi-source datasets to extract domain-invariant representations for all domains, and by aligning the distributions of all pairs of source and target domains in a common feature space, along with the class boundaries. The proposed system performed very well in the conducted experiments on automating the verification process and reducing labeling errors that could otherwise threaten public health and contravene legal requirements for food packaging information and accuracy. Comprehensive experiments on our food packaging datasets demonstrate that the proposed multi-source deep domain adaptation method significantly improves the classification accuracy and therefore has great potential for application and beneficial impact in food manufacturing control systems. |
Tasks | Domain Adaptation |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10335v1 |
https://arxiv.org/pdf/2001.10335v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-source-deep-domain-adaptation-for |
Repo | |
Framework | |
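Aligning distributions across source–target pairs is commonly implemented with a discrepancy loss such as maximum mean discrepancy (MMD). The paper's full system is a deep multi-source network, but a toy MMD term conveys what "aligning distributions in a common feature space" asks the feature extractor to minimize; all data below are synthetic stand-ins.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
src_a = rng.normal(0.0, 1.0, (100, 4))   # synthetic stand-ins for per-domain features
src_b = rng.normal(0.1, 1.0, (100, 4))   # a second, nearby source domain
target = rng.normal(2.0, 1.0, (100, 4))  # a shifted target domain

# Aligning every (source, target) pair in a shared feature space amounts to
# training the feature extractor to drive terms like these toward zero.
print(mmd_rbf(src_a, target), mmd_rbf(src_a, src_b))
```

The shifted target yields a much larger discrepancy than the two nearby sources, which is exactly the signal an alignment loss penalizes.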
DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding
Title | DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding |
Authors | Yuyu Zhang, Ping Nie, Xiubo Geng, Arun Ramamurthy, Le Song, Daxin Jiang |
Abstract | Recent studies on open-domain question answering have achieved prominent performance improvement using pre-trained language models such as BERT. State-of-the-art approaches typically follow the “retrieve and read” pipeline and employ a BERT-based reranker to filter retrieved documents before feeding them into the reader module. The BERT retriever takes as input the concatenation of the question and each retrieved document. Despite the success of these approaches in terms of QA accuracy, due to the concatenation, they can hardly handle a high throughput of incoming questions, each with a large collection of retrieved documents. To address the efficiency problem, we propose DC-BERT, a decoupled contextual encoding framework that has dual BERT models: an online BERT which encodes the question only once, and an offline BERT which pre-encodes all the documents and caches their encodings. On SQuAD Open and Natural Questions Open datasets, DC-BERT achieves 10x speedup on document retrieval, while retaining most (about 98%) of the QA performance compared to state-of-the-art approaches for open-domain question answering. |
Tasks | Open-Domain Question Answering, Question Answering |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12591v1 |
https://arxiv.org/pdf/2002.12591v1.pdf | |
PWC | https://paperswithcode.com/paper/dc-bert-decoupling-question-and-document-for |
Repo | |
Framework | |
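The decoupling idea is easy to sketch: one encoder runs once per question, while the other encoder's per-document outputs are precomputed and cached. The toy encoders below are stand-ins for the two BERT models (DC-BERT also applies interaction layers on top of the cached encodings, omitted here).

```python
from functools import lru_cache

# Toy stand-ins for the two encoders: an online one for questions and an
# offline one whose per-document encodings are computed once and cached.
def online_encode(question):
    return (len(question) % 7, question.count(" "))

@lru_cache(maxsize=None)                  # the offline cache: each doc encoded once
def offline_encode(document):
    return (len(document) % 11, document.count(" "))

def score(question, documents):
    q = online_encode(question)           # exactly one online pass per question
    return [sum(a * b for a, b in zip(q, offline_encode(d))) for d in documents]

docs = ["alpha beta", "gamma delta epsilon"]
print(score("who wrote alpha?", docs))
print(score("when was gamma?", docs))
print(offline_encode.cache_info())        # 2 hits: no document was re-encoded
```

The second question reuses both cached document encodings, which is the source of the retrieval speedup the abstract reports.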
The Indian Chefs Process
Title | The Indian Chefs Process |
Authors | Patrick Dallaire, Luca Ambrogioni, Ludovic Trottier, Umut Güçlü, Max Hinne, Philippe Giguère, Brahim Chaib-Draa, Marcel van Gerven, Francois Laviolette |
Abstract | This paper introduces the Indian Chefs Process (ICP), a Bayesian nonparametric prior on the joint space of infinite directed acyclic graphs (DAGs) and orders that generalizes Indian Buffet Processes. As our construction shows, the proposed distribution relies on a latent Beta Process controlling both the orders and outgoing connection probabilities of the nodes, and yields a probability distribution on sparse infinite graphs. The main advantage of the ICP over previously proposed Bayesian nonparametric priors for DAG structures is its greater flexibility. To the best of our knowledge, the ICP is the first Bayesian nonparametric model supporting every possible DAG. We demonstrate the usefulness of the ICP on learning the structure of deep generative sigmoid networks as well as convolutional neural networks. |
Tasks | |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.10657v1 |
https://arxiv.org/pdf/2001.10657v1.pdf | |
PWC | https://paperswithcode.com/paper/the-indian-chefs-process |
Repo | |
Framework | |
A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency
Title | A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency |
Authors | Xinyi Guo, Jinfeng Li |
Abstract | A novel social networks sentiment analysis model is proposed based on Twitter sentiment score (TSS) for real-time prediction of the future stock market price FTSE 100, as compared with conventional econometric models of investor sentiment based on closed-end fund discount (CEFD). The proposed TSS model features a new baseline correlation approach, which not only exhibits a decent prediction accuracy, but also reduces the computation burden and enables fast decision making without knowledge of historical data. Polynomial regression, classification modelling and lexicon-based sentiment analysis are performed using R. The obtained TSS predicts the future stock market trend in advance by 15 time samples (30 working hours) with an accuracy of 67.22% using the proposed baseline criterion without referring to historical TSS or market data. Specifically, TSS’s prediction performance for an upward market is found to be far better than that for a downward market. Under logistic regression and linear discriminant analysis, the accuracy of TSS in predicting the upward trend of the future market achieves 97.87%. |
Tasks | Decision Making, Sentiment Analysis, Twitter Sentiment Analysis |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08137v1 |
https://arxiv.org/pdf/2003.08137v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-twitter-sentiment-analysis-model-with |
Repo | |
Framework | |
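A minimal Python sketch (the paper works in R) of lexicon-based TSS with a fixed-baseline decision rule; the mini-lexicon, tweets, and baseline value are hypothetical, but they show why no historical TSS or market data is needed at decision time.

```python
# Hypothetical mini-lexicon; the paper performs full lexicon-based scoring in R.
LEXICON = {"bull": 1, "gain": 1, "strong": 1, "bear": -1, "loss": -1, "weak": -1}

def tweet_score(text):
    return sum(LEXICON.get(w, 0) for w in text.lower().split())

def tss(tweets):
    """Twitter sentiment score: mean lexicon polarity over a window of tweets."""
    return sum(map(tweet_score, tweets)) / len(tweets)

def signal(score, baseline=0.0):
    """Baseline-style rule: act on TSS crossing a fixed baseline, so no
    historical TSS or market data is consulted at decision time."""
    return "up" if score > baseline else "down"

window = ["FTSE looking strong today", "big gain on open", "the bears look weak today"]
print(tss(window), signal(tss(window)))
```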
M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
Title | M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking |
Authors | Zhengzheng Tu, Chun Lin, Chenglong Li, Jin Tang, Bin Luo |
Abstract | Classifying confusing samples in the course of RGBT tracking is a quite challenging problem which has not yet been satisfactorily solved. Existing methods only focus on enlarging the boundary between positive and negative samples, but the structured information of samples might be harmed, e.g., confusing positive samples may be closer to the anchor than normal positive samples. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking in this paper. In particular, we design a multi-margin structured loss to distinguish the confusing samples, which play a most critical role in boosting tracking performance. To alleviate this problem, we additionally enlarge the boundaries between confusing positive samples and normal ones, and between confusing negative samples and normal ones, with predefined margins, by exploiting the structured information of all samples in each modality. Moreover, a cross-modality constraint is employed to reduce the difference between modalities and push positive samples closer to the anchor than negative ones from both modalities. In addition, to achieve quality-aware RGB and thermal feature fusion, we introduce modality attentions and learn them using a feature fusion module in our network. Extensive experiments on large-scale datasets testify that our framework clearly improves the tracking performance and outperforms the state-of-the-art RGBT trackers. |
Tasks | Metric Learning |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07650v1 |
https://arxiv.org/pdf/2003.07650v1.pdf | |
PWC | https://paperswithcode.com/paper/m5l-multi-modal-multi-margin-metric-learning |
Repo | |
Framework | |
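The multi-margin idea, wider margins for confusing samples than for normal ones, can be sketched as a hinge triplet loss with a per-sample margin. This is a simplified stand-in for the structured loss in M$^5$L, not the paper's exact formulation.

```python
import numpy as np

def multi_margin_triplet(anchor, pos, neg, confusing, m_normal=0.2, m_conf=0.5):
    """Hinge triplet loss with a wider margin when the triplet involves a
    confusing sample -- a simplified stand-in for the M^5L structured loss."""
    dist = lambda a, b: np.linalg.norm(a - b, axis=-1)
    margin = np.where(confusing, m_conf, m_normal)
    return np.maximum(0.0, dist(anchor, pos) - dist(anchor, neg) + margin)

anchor = np.zeros((2, 2))
pos = np.array([[0.1, 0.0], [0.1, 0.0]])    # identical geometry in both rows...
neg = np.array([[0.3, 0.0], [0.3, 0.0]])
losses = multi_margin_triplet(anchor, pos, neg, confusing=np.array([False, True]))
print(losses)   # ...but only the confusing triplet still incurs a penalty
```

The wider margin keeps pushing confusing samples apart even after normal triplets already satisfy their boundary.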
Sideways: Depth-Parallel Training of Video Models
Title | Sideways: Depth-Parallel Training of Video Models |
Authors | Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean |
Abstract | We propose Sideways, an approximate backpropagation scheme for training video models. In standard backpropagation, the gradients and activations at every computation step through the model are temporally synchronized. The forward activations need to be stored until the backward pass is executed, preventing inter-layer (depth) parallelization. However, can we leverage smooth, redundant input streams such as videos to develop a more efficient training scheme? Here, we explore an alternative to backpropagation; we overwrite network activations whenever new ones, i.e., from new frames, become available. Such a more gradual accumulation of information from both passes breaks the precise correspondence between gradients and activations, leading to theoretically more noisy weight updates. Counter-intuitively, we show that Sideways training of deep convolutional video networks not only still converges, but can also potentially exhibit better generalization compared to standard synchronized backpropagation. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06232v3 |
https://arxiv.org/pdf/2001.06232v3.pdf | |
PWC | https://paperswithcode.com/paper/sideways-depth-parallel-training-of-video |
Repo | |
Framework | |
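A toy depth-parallel pipeline conveys the mechanism: each layer consumes the activation its predecessor produced on the previous tick, so all layers can run concurrently while information from a frame reaches depth d only after d ticks. The "+ 1" layer below is a placeholder for real computation.

```python
# Depth-parallel forward sketch: on every tick, each "layer" consumes the
# activation its predecessor produced on the PREVIOUS tick, so all layers can
# run concurrently over a smooth stream of frames.
def run_sideways(frames, n_layers=3):
    acts = [0.0] * (n_layers + 1)      # acts[i] = latest output of layer i
    outputs = []
    for frame in frames:
        new = acts[:]
        new[0] = frame                 # slot 0 holds the freshest input frame
        for i in range(1, n_layers + 1):
            new[i] = acts[i - 1] + 1   # reads the predecessor's stale activation
        acts = new
        outputs.append(acts[-1])
    return outputs

# A frame needs n_layers ticks to reach the top; afterwards the output tracks
# the (stale) stream, which is harmless when consecutive frames are similar.
print(run_sideways([10, 20, 30, 40, 50]))   # [1, 2, 3, 13, 23]
```

On real video the mismatch between gradients and overwritten activations is the "noisy update" the abstract describes; on smooth streams it stays small.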
Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning
Title | Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning |
Authors | Mahesh Kumar Krishna Reddy, Mohammad Hossain, Mrigank Rochan, Yang Wang |
Abstract | We consider the problem of few-shot scene adaptive crowd counting. Given a target camera scene, our goal is to adapt a model to this specific scene with only a few labeled images of that scene. The solution to this problem has potential applications in numerous real-world scenarios, where we would ideally like to deploy a crowd counting model specially adapted to a target camera. We address this challenge by taking inspiration from the recently introduced learning-to-learn paradigm in the context of the few-shot regime. During training, our method learns the model parameters in a way that facilitates fast adaptation to the target scene. At test time, given a target scene with a small amount of labeled data, our method quickly adapts to that scene with a few gradient updates to the learned parameters. Our extensive experimental results show that the proposed approach outperforms other alternatives in few-shot scene adaptive crowd counting. |
Tasks | Crowd Counting, Meta-Learning |
Published | 2020-02-01 |
URL | https://arxiv.org/abs/2002.00264v2 |
https://arxiv.org/pdf/2002.00264v2.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-scene-adaptive-crowd-counting-using |
Repo | |
Framework | |
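The train/adapt split can be sketched with a first-order meta-learning loop on toy one-dimensional "scenes". A Reptile-style outer update is used here for brevity; it is a stand-in for the paper's learning-to-learn procedure, not its exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
w, lr_in, lr_out = 0.0, 0.1, 0.5          # meta-init and the two learning rates

def adapt(w, slope, steps):
    """Inner loop: a few gradient steps on one scene's regression y = slope * x."""
    x = rng.uniform(-1, 1, 20)
    for _ in range(steps):
        grad = 2 * np.mean((w * x - slope * x) * x)   # d/dw of the MSE
        w -= lr_in * grad
    return w

# Outer loop: move the init toward each scene's adapted weights (Reptile-style),
# so the init ends up a few gradient steps away from every scene's optimum.
for _ in range(200):
    scene_slope = rng.uniform(2.0, 3.0)   # each "scene" has its own target slope
    w += lr_out * (adapt(w, scene_slope, steps=3) - w)

# Few-shot adaptation: one gradient step on an unseen scene already lands close.
w_new = adapt(w, slope=2.5, steps=1)
print(round(w, 2), round(w_new, 2))
```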
Auditing ML Models for Individual Bias and Unfairness
Title | Auditing ML Models for Individual Bias and Unfairness |
Authors | Songkai Xue, Mikhail Yurochkin, Yuekai Sun |
Abstract | We consider the task of auditing ML models for individual bias/unfairness. We formalize the task as an optimization problem and develop a suite of inferential tools for the optimal value. Our tools permit us to obtain asymptotic confidence intervals and hypothesis tests that cover the target/control the Type I error rate exactly. To demonstrate the utility of our tools, we use them to reveal the gender and racial biases in Northpointe’s COMPAS recidivism prediction instrument. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05048v1 |
https://arxiv.org/pdf/2003.05048v1.pdf | |
PWC | https://paperswithcode.com/paper/auditing-ml-models-for-individual-bias-and |
Repo | |
Framework | |
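A much-simplified probe of the quantity being audited, how far predictions move between individuals who differ only in a sensitive attribute, can be sketched as below. The paper instead formalizes the audit as an optimization over a fair metric and derives exact asymptotic inference, which this toy (with entirely synthetic data) omits.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
sensitive = rng.integers(0, 2, n)        # hypothetical protected attribute
skill = rng.normal(0, 1, n)              # a legitimate feature
X = np.column_stack([skill, sensitive])
y = (skill + 1.5 * sensitive + rng.normal(0, 0.5, n) > 0).astype(int)  # biased labels

model = LogisticRegression().fit(X, y)

# Compare each individual with a counterpart identical except for the attribute.
X_flip = X.copy()
X_flip[:, 1] = 1 - X_flip[:, 1]
gap = np.abs(model.predict_proba(X)[:, 1] - model.predict_proba(X_flip)[:, 1])
print(round(gap.mean(), 3))   # a large mean gap flags individual-level bias
```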
SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On
Title | SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On |
Authors | Surgan Jandial, Ayush Chopra, Kumar Ayush, Mayur Hemani, Abhijeet Kumar, Balaji Krishnamurthy |
Abstract | Image-based virtual try-on for fashion has gained considerable attention recently. The task requires trying on a clothing item on a target model image. An efficient framework for this is composed of two stages: (1) warping (transforming) the try-on cloth to align with the pose and shape of the target model, and (2) a texture transfer module to seamlessly integrate the warped try-on cloth onto the target model image. Existing methods suffer from artifacts and distortions in their try-on output. In this work, we present SieveNet, a framework for robust image-based virtual try-on. Firstly, we introduce a multi-stage coarse-to-fine warping network to better model fine-grained intricacies (while transforming the try-on cloth) and train it with a novel perceptual geometric matching loss. Next, we introduce a try-on cloth conditioned segmentation mask prior to improve the texture transfer network. Finally, we also introduce a dueling triplet loss strategy for training the texture translation network which further improves the quality of the generated try-on results. We present extensive qualitative and quantitative evaluations of each component of the proposed pipeline and show significant performance improvements against the current state-of-the-art method. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06265v1 |
https://arxiv.org/pdf/2001.06265v1.pdf | |
PWC | https://paperswithcode.com/paper/sievenet-a-unified-framework-for-robust-image |
Repo | |
Framework | |
The Deep Learning Compiler: A Comprehensive Survey
Title | The Deep Learning Compiler: A Comprehensive Survey |
Authors | Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Depei Qian |
Abstract | The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. Generally, DL compilers take the DL models described in different DL frameworks as input, and then generate optimized codes for diverse DL hardware as output. However, none of the existing surveys has analyzed the unique design of DL compilers comprehensively. In this paper, we perform a comprehensive survey of existing DL compilers by dissecting the commonly adopted design in detail, with emphasis on the DL-oriented multi-level IRs and frontend/backend optimizations. Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects. In addition, we present a detailed analysis of the multi-level IR design and compiler optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey paper focusing on the unique design of DL compilers, which we hope can pave the way for future research on DL compilers. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.03794v2 |
https://arxiv.org/pdf/2002.03794v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deep-learning-compiler-a-comprehensive |
Repo | |
Framework | |
Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
Title | Improving noise robust automatic speech recognition with single-channel time-domain enhancement network |
Authors | Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani |
Abstract | With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, ASR performance in noisy conditions of single-channel systems remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods (denoising) have brought only limited performance gains over state-of-the-art ASR back-ends trained on multi-condition training data. Recently, there has been much research on neural network-based SE methods working in the time domain showing levels of performance never attained before. However, it has not been established whether the high enhancement performance achieved by such time-domain approaches could be translated into ASR improvements. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error rate reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction. |
Tasks | Denoising, Speech Enhancement, Speech Recognition |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.03998v1 |
https://arxiv.org/pdf/2003.03998v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-noise-robust-automatic-speech |
Repo | |
Framework | |
A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application
Title | A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application |
Authors | Vidhi Lalchand |
Abstract | The aim of this work is to propose a meta-algorithm for automatic classification in the presence of discrete binary classes. Classifier learning in the presence of overlapping class distributions is a challenging problem in machine learning. Overlapping classes are described by the presence of ambiguous areas in the feature space with a high density of points belonging to both classes. This often occurs in real-world datasets; one such example is numeric data denoting properties of particle decays derived from high-energy accelerators like the Large Hadron Collider (LHC). A significant body of research targeting the class overlap problem uses ensemble classifiers to boost the performance of algorithms by using them iteratively in multiple stages or using multiple copies of the same model on different subsets of the input training data. The former is called boosting and the latter is called bagging. The algorithm proposed in this thesis targets a challenging classification problem in high energy physics - that of improving the statistical significance of the Higgs discovery. The underlying dataset used to train the algorithm is experimental data built from the official ATLAS full-detector simulation with Higgs events (signal) mixed with different background events (background) that closely mimic the statistical properties of the signal generating class overlap. The algorithm proposed is a variant of the classical boosted decision tree which is known to be one of the most successful analysis techniques in experimental physics. The algorithm utilizes a unified framework that combines two meta-learning techniques - bagging and boosting. The results show that this combination only works in the presence of a randomization trick in the base learners. |
Tasks | Meta-Learning |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06880v1 |
https://arxiv.org/pdf/2001.06880v1.pdf | |
PWC | https://paperswithcode.com/paper/a-meta-algorithm-for-classification-using |
Repo | |
Framework | |
Efficient Trainable Front-Ends for Neural Speech Enhancement
Title | Efficient Trainable Front-Ends for Neural Speech Enhancement |
Authors | Jonah Casebeer, Umut Isik, Shrikant Venkataramani, Arvindh Krishnaswamy |
Abstract | Many neural speech enhancement and source separation systems operate in the time-frequency domain. Such models often benefit from making their Short-Time Fourier Transform (STFT) front-ends trainable. In the current literature, these are implemented as large Discrete Fourier Transform matrices, which are prohibitively inefficient for low-compute systems. We present an efficient, trainable front-end based on the butterfly mechanism to compute the Fast Fourier Transform, and show its accuracy and efficiency benefits for low-compute neural speech enhancement models. We also explore the effects of making the STFT window trainable. |
Tasks | Speech Enhancement |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09286v1 |
https://arxiv.org/pdf/2002.09286v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-trainable-front-ends-for-neural |
Repo | |
Framework | |
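The butterfly mechanism the abstract refers to is the radix-2 Cooley–Tukey factorization of the DFT; a trainable front-end can parameterize these O(N log N) stages instead of a dense DFT matrix. A plain, non-trainable version of the structure:

```python
import numpy as np

def butterfly_fft(x):
    """Radix-2 Cooley-Tukey FFT as explicit butterfly stages (len(x) a power of 2)."""
    n = len(x)
    if n == 1:
        return x.astype(complex)
    even, odd = butterfly_fft(x[0::2]), butterfly_fft(x[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)   # the butterfly weights
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(butterfly_fft(x), np.fft.fft(x)))   # True
```

Making the twiddle factors (and the analysis window) learnable parameters is, roughly, what a butterfly-based trainable STFT front-end does.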
Goldilocks Neural Networks
Title | Goldilocks Neural Networks |
Authors | Jan Rosenzweig, Zoran Cvetkovic, Ivana Rosenzweig |
Abstract | We introduce the new “Goldilocks” class of activation functions, which non-linearly deform the input signal only locally, when the input signal is in the appropriate range. The small local deformation of the signal enables better understanding of how and why the signal is transformed through the layers. Numerical results on the CIFAR-10 and CIFAR-100 data sets show that Goldilocks networks perform better than, or comparably to, SELU and ReLU networks, while making the deformation of data through the layers tractable. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.05059v2 |
https://arxiv.org/pdf/2002.05059v2.pdf | |
PWC | https://paperswithcode.com/paper/goldilocks-neural-networks |
Repo | |
Framework | |
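The abstract does not give the functional form, so the activation below is only one plausible "locally deforming" shape: identity far from zero, with a nonlinearity confined to a "just right" input range. It is a hypothetical illustration, not the paper's definition.

```python
import numpy as np

def goldilocks(x, width=1.0, strength=0.5):
    """Identity far from zero; a smooth deformation only where |x| < ~width.
    A hypothetical local-deformation shape, not the paper's exact function."""
    bump = np.exp(-(x / width) ** 2)        # gates where the nonlinearity acts
    return x + strength * bump * np.tanh(x)

x = np.array([-10.0, -0.5, 0.0, 0.5, 10.0])
print(goldilocks(x))   # ~identity at the extremes, deformed near zero
```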