April 2, 2020

3001 words 15 mins read

Paper Group ANR 173


Uncertainty Quantification for Bayesian Optimization. Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging. DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding. The Indian Chefs Process. A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improve …

Uncertainty Quantification for Bayesian Optimization

Title Uncertainty Quantification for Bayesian Optimization
Authors Rui Tuo, Wenjia Wang
Abstract Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, in terms of constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
Published 2020-02-04
URL https://arxiv.org/abs/2002.01569v1
PDF https://arxiv.org/pdf/2002.01569v1.pdf
PWC https://paperswithcode.com/paper/uncertainty-quantification-for-bayesian
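A deliberately naive version of such a confidence region can be sketched in a few lines (this is not the paper's construction, whose bound width is derived rigorously; the kernel, grid, beta, and toy objective below are all made-up choices): fit a GP to the evaluated points and keep every candidate whose upper confidence bound reaches the best lower confidence bound.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)                      # toy objective on [0, 2]

X = rng.uniform(0, 2, size=8)                    # evaluated points
y = f(X)

def rbf(a, b, ls=0.5):
    """Squared-exponential kernel matrix between point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

K = rbf(X, X) + 1e-6 * np.eye(len(X))            # jitter for numerical stability
grid = np.linspace(0, 2, 200)
Ks = rbf(grid, X)

mu = Ks @ np.linalg.solve(K, y)                  # posterior mean on the grid
var = 1.0 - np.einsum("ij,jk,ik->i", Ks, np.linalg.inv(K), Ks)
sigma = np.sqrt(np.maximum(var, 0.0))

beta = 2.0                                       # bound width; the paper derives valid choices
lcb, ucb = mu - beta * sigma, mu + beta * sigma

# Keep every grid point whose upper bound beats the best lower bound
region = grid[ucb >= lcb.max()]
print(f"region for the maximizer: [{region.min():.2f}, {region.max():.2f}]")
```

By construction the region always contains the posterior-mean maximizer; the paper's uniform error bounds are what make the stated coverage guarantee hold for sequential designs.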

Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging

Title Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging
Authors Mamatha Thota, Stefanos Kollias, Mark Swainson, Georgios Leontidis
Abstract Retail food packaging contains information which informs choice and can be vital to consumer health, including the product name, ingredients list, nutritional information, allergens, preparation guidelines, pack weight, and storage and shelf life information (use-by / best before dates). The presence and accuracy of such information is critical to ensure a detailed understanding of the product and to reduce the potential for health risks. Consequently, erroneous or illegible labeling has the potential to be highly detrimental to consumers and many other stakeholders in the supply chain. In this paper, a multi-source deep learning-based domain adaptation system is proposed and tested to identify and verify the presence and legibility of use-by date information in photos of food packaging taken as part of the validation process as the products pass along the food production line. This was achieved by improving the generalization of the techniques through the use of multi-source datasets, extracting domain-invariant representations for all domains and aligning the distributions of all pairs of source and target domains in a common feature space, along with the class boundaries. The proposed system performed very well in the conducted experiments, automating the verification process and reducing labeling errors that could otherwise threaten public health and contravene legal requirements for food packaging information and accuracy. Comprehensive experiments on our food packaging datasets demonstrate that the proposed multi-source deep domain adaptation method significantly improves classification accuracy and therefore has great potential for application and beneficial impact in food manufacturing control systems.
Tasks Domain Adaptation
Published 2020-01-28
URL https://arxiv.org/abs/2001.10335v1
PDF https://arxiv.org/pdf/2001.10335v1.pdf
PWC https://paperswithcode.com/paper/multi-source-deep-domain-adaptation-for
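The distribution-alignment step the abstract describes is often implemented with losses such as maximum mean discrepancy (MMD). As a hedged illustration (the paper's exact alignment objective may differ; the feature dimensions and Gaussian data below are invented), the following measures how far apart two feature distributions are:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=4.0):
    """Biased estimate of squared MMD with an RBF kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(1)
src = rng.normal(0.0, 1.0, size=(64, 16))       # features from one source domain
tgt = rng.normal(0.5, 1.0, size=(64, 16))       # shifted target-domain features
aligned = rng.normal(0.5, 1.0, size=(64, 16))   # features after (ideal) alignment

print(rbf_mmd2(src, tgt), rbf_mmd2(aligned, tgt))
```

Minimizing such a discrepancy between each source-target pair, while also respecting class boundaries, is the general recipe the abstract points at.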

DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding

Title DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding
Authors Yuyu Zhang, Ping Nie, Xiubo Geng, Arun Ramamurthy, Le Song, Daxin Jiang
Abstract Recent studies on open-domain question answering have achieved prominent performance improvements using pre-trained language models such as BERT. State-of-the-art approaches typically follow the “retrieve and read” pipeline and employ a BERT-based reranker to filter retrieved documents before feeding them into the reader module. The BERT retriever takes as input the concatenation of the question and each retrieved document. Despite the success of these approaches in terms of QA accuracy, due to the concatenation they can barely handle a high throughput of incoming questions, each with a large collection of retrieved documents. To address this efficiency problem, we propose DC-BERT, a decoupled contextual encoding framework with dual BERT models: an online BERT, which encodes the question only once, and an offline BERT, which pre-encodes all the documents and caches their encodings. On the SQuAD Open and Natural Questions Open datasets, DC-BERT achieves a 10x speedup on document retrieval while retaining most (about 98%) of the QA performance of state-of-the-art approaches to open-domain question answering.
Tasks Open-Domain Question Answering, Question Answering
Published 2020-02-28
URL https://arxiv.org/abs/2002.12591v1
PDF https://arxiv.org/pdf/2002.12591v1.pdf
PWC https://paperswithcode.com/paper/dc-bert-decoupling-question-and-document-for
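The decoupling idea can be made concrete with a toy retrieval sketch. Everything below is invented for illustration — the `encode` function is a hashed bag-of-words stand-in, not a BERT model — but the offline/online split mirrors the framework: documents are encoded once and cached, and only the question is encoded at query time.

```python
import numpy as np

def encode(texts, dim=32):
    """Stand-in for a BERT encoder: a hashed bag-of-words vector, just to make
    the offline/online split concrete (made up, not DC-BERT's encoder)."""
    out = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            out[i, sum(ord(c) for c in tok) % dim] += 1.0
    return out / np.maximum(np.linalg.norm(out, axis=1, keepdims=True), 1e-9)

docs = ["the capital of france is paris",
        "bert is a language model",
        "paris hosted the olympics"]
doc_cache = encode(docs)            # offline BERT: pre-encode and cache documents

question = "what is the capital of france"
q_vec = encode([question]).ravel()  # online BERT: encode the question only once
scores = doc_cache @ q_vec          # lightweight interaction replaces joint encoding
print(docs[int(np.argmax(scores))])
```

The speedup comes from replacing a full joint forward pass per question-document pair with a single question encoding plus cheap interactions against the cache.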

The Indian Chefs Process

Title The Indian Chefs Process
Authors Patrick Dallaire, Luca Ambrogioni, Ludovic Trottier, Umut Güçlü, Max Hinne, Philippe Giguère, Brahim Chaib-Draa, Marcel van Gerven, Francois Laviolette
Abstract This paper introduces the Indian Chefs Process (ICP), a Bayesian nonparametric prior on the joint space of infinite directed acyclic graphs (DAGs) and orders that generalizes Indian Buffet Processes. As our construction shows, the proposed distribution relies on a latent Beta Process controlling both the orders and outgoing connection probabilities of the nodes, and yields a probability distribution on sparse infinite graphs. The main advantage of the ICP over previously proposed Bayesian nonparametric priors for DAG structures is its greater flexibility. To the best of our knowledge, the ICP is the first Bayesian nonparametric model supporting every possible DAG. We demonstrate the usefulness of the ICP on learning the structure of deep generative sigmoid networks as well as convolutional neural networks.
Published 2020-01-29
URL https://arxiv.org/abs/2001.10657v1
PDF https://arxiv.org/pdf/2001.10657v1.pdf
PWC https://paperswithcode.com/paper/the-indian-chefs-process
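The core trick that makes order-based priors yield DAGs can be shown with a toy sampler (this is not the ICP itself, which couples orders and connection probabilities through a latent Beta Process over infinitely many nodes; the finite size and Beta parameters below are arbitrary): edges are only allowed from higher-order nodes to lower-order ones, which guarantees acyclicity.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 6
order = rng.permutation(n)              # latent order of the nodes
p_out = rng.beta(1.0, 3.0, size=n)      # per-node outgoing connection probabilities

A = np.zeros((n, n), dtype=int)         # A[i, j] = 1 means an edge i -> j
for i in range(n):
    for j in range(n):
        if order[i] > order[j] and rng.random() < p_out[i]:
            A[i, j] = 1

# A DAG's adjacency matrix is nilpotent: tr(A^k) = 0 for every k >= 1
M, acyclic = A.copy(), True
for _ in range(n):
    acyclic &= np.trace(M) == 0
    M = M @ A
print("edges:", A.sum(), "acyclic:", bool(acyclic))
```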

A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency

Title A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency
Authors Xinyi Guo, Jinfeng Li
Abstract A novel social network sentiment analysis model is proposed based on the Twitter sentiment score (TSS) for real-time prediction of the future stock market price of the FTSE 100, as compared with conventional econometric models of investor sentiment based on the closed-end fund discount (CEFD). The proposed TSS model features a new baseline correlation approach, which not only exhibits decent prediction accuracy, but also reduces the computational burden and enables fast decision making without knowledge of historical data. Polynomial regression, classification modelling and lexicon-based sentiment analysis are performed using R. The obtained TSS predicts the future stock market trend 15 time samples (30 working hours) in advance with an accuracy of 67.22% using the proposed baseline criterion, without referring to historical TSS or market data. Specifically, TSS’s prediction performance on an upward market is found to be far better than that on a downward market. Under logistic regression and linear discriminant analysis, the accuracy of TSS in predicting an upward trend of the future market reaches 97.87%.
Tasks Decision Making, Sentiment Analysis, Twitter Sentiment Analysis
Published 2020-03-18
URL https://arxiv.org/abs/2003.08137v1
PDF https://arxiv.org/pdf/2003.08137v1.pdf
PWC https://paperswithcode.com/paper/a-novel-twitter-sentiment-analysis-model-with
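A minimal lexicon-based sentiment score in the spirit of the TSS can be sketched as follows (the paper's pipeline is built in R; the lexicon, tweets, and zero threshold below are all made up):

```python
lexicon = {"gain": 1, "rally": 1, "strong": 1, "loss": -1, "crash": -1, "weak": -1}

def tss(tweets):
    """Average per-tweet sentiment: (# positive - # negative words) per tweet."""
    scores = [sum(lexicon.get(tok, 0) for tok in t.lower().split()) for t in tweets]
    return sum(scores) / len(scores)

window = ["Markets rally on strong earnings",
          "FTSE posts modest gain",
          "Analysts warn of weak outlook"]
score = tss(window)
signal = "up" if score > 0 else "down"   # toy baseline threshold at zero
print(score, signal)
```

The paper's baseline correlation approach refines the thresholding step; the point of the sketch is only that the score is computed from the current tweet window, without historical data.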

M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking

Title M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
Authors Zhengzheng Tu, Chun Lin, Chenglong Li, Jin Tang, Bin Luo
Abstract Classifying the confusing samples in the course of RGBT tracking is a quite challenging problem which has not yet been satisfactorily solved. Existing methods only focus on enlarging the boundary between positive and negative samples; however, the structured information of samples might be harmed, e.g., confusing positive samples may end up closer to the anchor than normal positive samples. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking in this paper. In particular, we design a multi-margin structured loss to distinguish the confusing samples, which play the most critical role in boosting tracking performance. Specifically, we enlarge the boundaries between confusing positive samples and normal ones, and between confusing negative samples and normal ones, with predefined margins, by exploiting the structured information of all samples in each modality. Moreover, a cross-modality constraint is employed to reduce the difference between modalities and to push positive samples closer to the anchor than negative ones from both modalities. In addition, to achieve quality-aware RGB and thermal feature fusion, we introduce modality attentions and learn them using a feature fusion module in our network. Extensive experiments on large-scale datasets testify that our framework clearly improves the tracking performance and outperforms the state-of-the-art RGBT trackers.
Tasks Metric Learning
Published 2020-03-17
URL https://arxiv.org/abs/2003.07650v1
PDF https://arxiv.org/pdf/2003.07650v1.pdf
PWC https://paperswithcode.com/paper/m5l-multi-modal-multi-margin-metric-learning
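The margin idea at the heart of the framework can be illustrated with a plain triplet loss (the actual M$^5$L loss uses structured, per-group margins across two modalities, which this sketch does not reproduce): assigning a larger predefined margin to confusing positives penalizes them even when easy positives incur no loss.

```python
import numpy as np

def triplet_loss(anchor, pos, neg, margin):
    """Plain triplet margin loss; M^5L builds on this with per-group margins
    and a cross-modality constraint (not reproduced here)."""
    d_pos = np.linalg.norm(anchor - pos)
    d_neg = np.linalg.norm(anchor - neg)
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0.0, 0.0])
normal_pos = np.array([0.2, 0.0])       # an easy positive, near the anchor
confusing_pos = np.array([0.9, 0.0])    # a confusing positive, farther out
neg = np.array([0.8, 0.6])

l_normal = triplet_loss(anchor, normal_pos, neg, margin=0.3)
l_confusing = triplet_loss(anchor, confusing_pos, neg, margin=0.5)  # larger margin
print(l_normal, l_confusing)            # only the confusing positive is penalized
```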

Sideways: Depth-Parallel Training of Video Models

Title Sideways: Depth-Parallel Training of Video Models
Authors Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean
Abstract We propose Sideways, an approximate backpropagation scheme for training video models. In standard backpropagation, the gradients and activations at every computation step through the model are temporally synchronized. The forward activations need to be stored until the backward pass is executed, preventing inter-layer (depth) parallelization. However, can we leverage smooth, redundant input streams such as videos to develop a more efficient training scheme? Here, we explore an alternative to backpropagation; we overwrite network activations whenever new ones, i.e., from new frames, become available. Such a more gradual accumulation of information from both passes breaks the precise correspondence between gradients and activations, leading to theoretically more noisy weight updates. Counter-intuitively, we show that Sideways training of deep convolutional video networks not only still converges, but can also potentially exhibit better generalization compared to standard synchronized backpropagation.
Published 2020-01-17
URL https://arxiv.org/abs/2001.06232v3
PDF https://arxiv.org/pdf/2001.06232v3.pdf
PWC https://paperswithcode.com/paper/sideways-depth-parallel-training-of-video
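The depth-parallel intuition — each layer consuming its predecessor's previous activation and overwriting its own buffer — can be simulated without any gradients (this toy models only the forward pipelining, not Sideways' approximate backward pass; the `layer` function is a stand-in):

```python
def layer(x):
    return x + 1                        # stand-in for one network layer

stream = [10, 20, 30, 40]               # incoming video "frames"
n_layers = 3
acts = [None] * n_layers                # one activation buffer per layer

outputs = []
for frame in stream + [None] * n_layers:        # extra ticks drain the pipeline
    prev = [frame] + acts[:-1]                  # each layer reads its predecessor's
    acts = [layer(x) if x is not None else None # PREVIOUS output, then overwrites
            for x in prev]                      # its own buffer
    if acts[-1] is not None:
        outputs.append(acts[-1])

print(outputs)                          # each frame emerges n_layers ticks later
```

Because every layer only touches its own buffer and its predecessor's last output, all layers could run concurrently; the gradient/activation mismatch the abstract mentions arises when backward passes are interleaved into this same pipeline.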

Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning

Title Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning
Authors Mahesh Kumar Krishna Reddy, Mohammad Hossain, Mrigank Rochan, Yang Wang
Abstract We consider the problem of few-shot scene adaptive crowd counting. Given a target camera scene, our goal is to adapt a model to this specific scene with only a few labeled images of that scene. The solution to this problem has potential applications in numerous real-world scenarios where we would ideally like to deploy a crowd counting model specially adapted to a target camera. We address this challenge by taking inspiration from the recently introduced learning-to-learn paradigm in the context of the few-shot regime. During training, our method learns the model parameters in a way that facilitates fast adaptation to the target scene. At test time, given a target scene with a small number of labeled images, our method quickly adapts to that scene with a few gradient updates to the learned parameters. Our extensive experimental results show that the proposed approach outperforms other alternatives in few-shot scene adaptive crowd counting.
Tasks Crowd Counting, Meta-Learning
Published 2020-02-01
URL https://arxiv.org/abs/2002.00264v2
PDF https://arxiv.org/pdf/2002.00264v2.pdf
PWC https://paperswithcode.com/paper/few-shot-scene-adaptive-crowd-counting-using
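A first-order, one-parameter sketch of the learning-to-learn recipe (a generic MAML-style toy on linear regression, not the paper's crowd counting setup; the task family, learning rates, and step counts are all invented) shows meta-training an initialization that adapts to a new "scene" in a few gradient steps:

```python
import numpy as np

rng = np.random.default_rng(0)
lr_inner, lr_meta, w0 = 0.1, 0.05, 0.0

def task_grad(w, a, x):
    """Gradient of the MSE for a linear 'scene' y = a * x w.r.t. the weight w."""
    return 2 * np.mean((w * x - a * x) * x)

for _ in range(500):                    # meta-training over randomly drawn tasks
    a = rng.uniform(1.0, 3.0)           # each task = a new target slope ("scene")
    x = rng.normal(size=16)
    w_adapted = w0 - lr_inner * task_grad(w0, a, x)   # inner adaptation step
    w0 -= lr_meta * task_grad(w_adapted, a, x)        # first-order meta-update

# Test time: adapt to an unseen "scene" (a = 2.5) with a few gradient updates
w, a, x = w0, 2.5, rng.normal(size=16)
for _ in range(3):
    w -= lr_inner * task_grad(w, a, x)
print(round(w0, 2), round(w, 2))        # w moves from the meta-init toward 2.5
```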

Auditing ML Models for Individual Bias and Unfairness

Title Auditing ML Models for Individual Bias and Unfairness
Authors Songkai Xue, Mikhail Yurochkin, Yuekai Sun
Abstract We consider the task of auditing ML models for individual bias/unfairness. We formalize the task in an optimization problem and develop a suite of inferential tools for the optimal value. Our tools permit us to obtain asymptotic confidence intervals and hypothesis tests that cover the target/control the Type I error rate exactly. To demonstrate the utility of our tools, we use them to reveal the gender and racial biases in Northpointe’s COMPAS recidivism prediction instrument.
Published 2020-03-11
URL https://arxiv.org/abs/2003.05048v1
PDF https://arxiv.org/pdf/2003.05048v1.pdf
PWC https://paperswithcode.com/paper/auditing-ml-models-for-individual-bias-and
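A far simpler audit than the paper's optimization-based tooling — but in the same spirit — is a flip test on synthetic data: perturb only the protected attribute of each individual and count how often the model's decision changes (the model weights and data below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

w = np.array([1.0, -0.5, 0.8])            # toy linear model; the last weight
X = rng.normal(size=(200, 3))             # acts on the protected attribute
X[:, 2] = rng.integers(0, 2, size=200)    # binary protected attribute

def predict(X):
    return (X @ w > 0).astype(int)

X_flip = X.copy()
X_flip[:, 2] = 1 - X_flip[:, 2]           # counterfactual individuals

flip_rate = np.mean(predict(X) != predict(X_flip))
print(f"decisions changed by flipping the protected attribute: {flip_rate:.0%}")
```

The paper goes well beyond this: it formulates the audit as an optimization problem and gives confidence intervals and hypothesis tests for the resulting bias measure.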

SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On

Title SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On
Authors Surgan Jandial, Ayush Chopra, Kumar Ayush, Mayur Hemani, Abhijeet Kumar, Balaji Krishnamurthy
Abstract Image-based virtual try-on for fashion has gained considerable attention recently. The task requires trying on a clothing item on a target model image. An efficient framework for this is composed of two stages: (1) warping (transforming) the try-on cloth to align with the pose and shape of the target model, and (2) a texture transfer module to seamlessly integrate the warped try-on cloth onto the target model image. Existing methods suffer from artifacts and distortions in their try-on output. In this work, we present SieveNet, a framework for robust image-based virtual try-on. Firstly, we introduce a multi-stage coarse-to-fine warping network to better model fine-grained intricacies (while transforming the try-on cloth) and train it with a novel perceptual geometric matching loss. Next, we introduce a try-on cloth conditioned segmentation mask prior to improve the texture transfer network. Finally, we also introduce a dueling triplet loss strategy for training the texture translation network which further improves the quality of the generated try-on results. We present extensive qualitative and quantitative evaluations of each component of the proposed pipeline and show significant performance improvements against the current state-of-the-art method.
Published 2020-01-17
URL https://arxiv.org/abs/2001.06265v1
PDF https://arxiv.org/pdf/2001.06265v1.pdf
PWC https://paperswithcode.com/paper/sievenet-a-unified-framework-for-robust-image
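The warping stage can be illustrated in miniature with a single affine transform of cloth pixel coordinates (real systems, including this one, predict richer coarse-to-fine warps; the transform matrix here is invented):

```python
import numpy as np

theta = np.array([[0.9, 0.1, 2.0],        # made-up affine parameters:
                  [0.0, 1.1, -1.0]])      # in-plane scale/shear plus translation

h, w = 4, 4
ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])   # homogeneous coords
warped = theta @ coords                                       # 2 x (h*w) positions
print(warped[:, 0], warped[:, -1])        # where corners (0,0) and (3,3) land
```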

The Deep Learning Compiler: A Comprehensive Survey

Title The Deep Learning Compiler: A Comprehensive Survey
Authors Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Depei Qian
Abstract The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. These DL compilers take the DL models described in different DL frameworks as input and generate optimized code for diverse DL hardware as output. However, no existing survey has analyzed the unique design of DL compilers comprehensively. In this paper, we perform a comprehensive survey of existing DL compilers by dissecting their commonly adopted designs in detail, with an emphasis on DL-oriented multi-level IRs and frontend/backend optimizations. Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects. In addition, we present a detailed analysis of multi-level IR design and compiler optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey paper focusing on the unique design of DL compilers, and we hope it can pave the way for future research on DL compilers.
Published 2020-02-06
URL https://arxiv.org/abs/2002.03794v2
PDF https://arxiv.org/pdf/2002.03794v2.pdf
PWC https://paperswithcode.com/paper/the-deep-learning-compiler-a-comprehensive
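The multi-level IR design the survey emphasizes can be caricatured in a few lines: a high-level graph node is lowered to a loop-level form before execution (this mini-"IR" is invented for illustration and bears no relation to any real compiler's internals):

```python
def lower_matmul(m, n, k):
    """'Lower' a high-level matmul node to an explicit loop nest."""
    def kernel(A, B):
        C = [[0.0] * n for _ in range(m)]
        for i in range(m):            # the loop-level IR after lowering,
            for j in range(n):        # where tiling/vectorization passes
                for p in range(k):    # would normally apply
                    C[i][j] += A[i][p] * B[p][j]
        return C
    return kernel

graph_ir = {"op": "matmul", "shape": (2, 2, 2)}   # high-level graph IR node
kernel = lower_matmul(*graph_ir["shape"])         # frontend lowering step
print(kernel([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Real DL compilers insert several IR levels between these two extremes and run graph-level and loop-level optimizations at each one before emitting hardware-specific code.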

Improving noise robust automatic speech recognition with single-channel time-domain enhancement network

Title Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
Authors Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani
Abstract With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, the ASR performance of single-channel systems in noisy conditions remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods (denoising) have brought only limited performance gains over a state-of-the-art ASR back-end trained on multi-condition training data. Recently, there has been much research on neural network-based SE methods working in the time domain, showing levels of performance never attained before. However, it has not been established whether the high enhancement performance achieved by such time-domain approaches could be translated into ASR improvements. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction.
Tasks Denoising, Speech Enhancement, Speech Recognition
Published 2020-03-09
URL https://arxiv.org/abs/2003.03998v1
PDF https://arxiv.org/pdf/2003.03998v1.pdf
PWC https://paperswithcode.com/paper/improving-noise-robust-automatic-speech
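For reference, "relative word error reduction" compares the enhanced word error rate against the baseline; with hypothetical numbers (not from the paper), a drop from 20% to 14% WER is a 30% relative reduction:

```python
def relative_wer_reduction(wer_baseline, wer_enhanced):
    """Fraction of the baseline word error rate removed by enhancement."""
    return (wer_baseline - wer_enhanced) / wer_baseline

# Hypothetical numbers (not from the paper): 20% WER down to 14% WER
reduction = relative_wer_reduction(0.20, 0.14)
print(f"{reduction:.0%} relative word error reduction")
```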

A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application

Title A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application
Authors Vidhi Lalchand
Abstract The aim of this work is to propose a meta-algorithm for automatic classification in the presence of discrete binary classes. Classifier learning in the presence of overlapping class distributions is a challenging problem in machine learning. Overlapping classes are characterized by ambiguous areas in the feature space with a high density of points belonging to both classes. This often occurs in real-world datasets; one such example is numeric data describing properties of particle decays derived from high-energy accelerators like the Large Hadron Collider (LHC). A significant body of research targeting the class overlap problem uses ensemble classifiers to boost the performance of algorithms, either by applying them iteratively in multiple stages or by training multiple copies of the same model on different subsets of the input training data. The former is called boosting and the latter bagging. The algorithm proposed in this thesis targets a challenging classification problem in high energy physics: improving the statistical significance of the Higgs discovery. The underlying dataset used to train the algorithm is experimental data built from the official ATLAS full-detector simulation, with Higgs events (signal) mixed with different background events (background) that closely mimic the statistical properties of the signal, generating class overlap. The proposed algorithm is a variant of the classical boosted decision tree, which is known to be one of the most successful analysis techniques in experimental physics. The algorithm utilizes a unified framework that combines two meta-learning techniques, bagging and boosting. The results show that this combination only works in the presence of a randomization trick in the base learners.
Tasks Meta-Learning
Published 2020-01-19
URL https://arxiv.org/abs/2001.06880v1
PDF https://arxiv.org/pdf/2001.06880v1.pdf
PWC https://paperswithcode.com/paper/a-meta-algorithm-for-classification-using
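The bagging-plus-boosting combination can be sketched with decision stumps (a generic toy, not the thesis's randomized variant; the synthetic overlapping data below stands in for the ATLAS simulation):

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, y, w):
    """Best weighted decision stump: (error, feature, threshold, sign)."""
    best = (np.inf, 0, 0.0, 1)
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for sign in (1, -1):
                pred = np.where(X[:, f] > thr, sign, -sign)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost(X, y, rounds=5):
    """Boosting: upweight the points the current ensemble gets wrong."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        err, f, thr, sign = fit_stump(X, y, w)
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-9))
        pred = np.where(X[:, f] > thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, f, thr, sign))
    return ensemble

def predict(ensembles, X):
    votes = sum(a * np.where(X[:, f] > t, s, -s)
                for ens in ensembles for a, f, t, s in ens)
    return np.sign(votes)

# Overlapping two-class data, loosely mimicking signal/background mixtures
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 300) > 0, 1, -1)

bags = []
for _ in range(5):                         # bagging: bootstrap resamples ...
    idx = rng.integers(0, len(y), len(y))
    bags.append(adaboost(X[idx], y[idx]))  # ... each fit with boosting
print(f"train accuracy: {np.mean(predict(bags, X) == y):.2f}")
```

The thesis's finding is that this bagging-of-boosted-ensembles only pays off when the base learners are additionally randomized, which the sketch omits.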

Efficient Trainable Front-Ends for Neural Speech Enhancement

Title Efficient Trainable Front-Ends for Neural Speech Enhancement
Authors Jonah Casebeer, Umut Isik, Shrikant Venkataramani, Arvindh Krishnaswamy
Abstract Many neural speech enhancement and source separation systems operate in the time-frequency domain. Such models often benefit from making their Short-Time Fourier Transform (STFT) front-ends trainable. In the current literature, these are implemented as large Discrete Fourier Transform matrices, which are prohibitively inefficient for low-compute systems. We present an efficient, trainable front-end based on the butterfly mechanism used to compute the Fast Fourier Transform, and show its accuracy and efficiency benefits for low-compute neural speech enhancement models. We also explore the effects of making the STFT window trainable.
Tasks Speech Enhancement
Published 2020-02-20
URL https://arxiv.org/abs/2002.09286v1
PDF https://arxiv.org/pdf/2002.09286v1.pdf
PWC https://paperswithcode.com/paper/efficient-trainable-front-ends-for-neural
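The butterfly mechanism referred to is the structure of the radix-2 Cooley-Tukey FFT, which replaces an O(N²) dense DFT matrix with O(N log N) butterfly stages; in the paper the twiddle factors become trainable parameters, whereas in this sketch they stay fixed:

```python
import numpy as np

def fft_butterfly(x):
    """Radix-2 Cooley-Tukey FFT via butterflies (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return x
    even, odd = fft_butterfly(x[0::2]), fft_butterfly(x[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)   # fixed twiddle factors
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(0).normal(size=8)
print(np.allclose(fft_butterfly(x.astype(complex)), np.fft.fft(x)))
```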

Goldilocks Neural Networks

Title Goldilocks Neural Networks
Authors Jan Rosenzweig, Zoran Cvetkovic, Ivana Roenzweig
Abstract We introduce the new “Goldilocks” class of activation functions, which non-linearly deform the input signal only locally, when the input signal is in the appropriate range. The small local deformation of the signal enables a better understanding of how and why the signal is transformed through the layers. Numerical results on the CIFAR-10 and CIFAR-100 datasets show that Goldilocks networks perform better than, or comparably to, SELU and ReLU networks, while making the deformation of data through the layers tractable.
Published 2020-02-11
URL https://arxiv.org/abs/2002.05059v2
PDF https://arxiv.org/pdf/2002.05059v2.pdf
PWC https://paperswithcode.com/paper/goldilocks-neural-networks
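The abstract does not give the activation's formula, so the function below is a hypothetical stand-in in the same spirit: an identity map everywhere except a local Gaussian bump, so the signal is deformed only in a narrow input range (center, width, and amplitude are invented):

```python
import numpy as np

def goldilocks_like(x, center=0.0, width=0.5, amp=0.3):
    """Identity plus a local Gaussian bump: nonlinear only near `center`."""
    return x + amp * np.exp(-((x - center) ** 2) / (2 * width**2))

x = np.array([-5.0, 0.0, 5.0])
out = goldilocks_like(x)
print(out)      # far from the bump the activation is ~identity; near it, shifted
```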