Paper Group ANR 173
Uncertainty Quantification for Bayesian Optimization. Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging. DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding. The Indian Chefs Process. A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency …
Uncertainty Quantification for Bayesian Optimization
Title | Uncertainty Quantification for Bayesian Optimization |
Authors | Rui Tuo, Wenjia Wang |
Abstract | Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, by constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria. |
Tasks | |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01569v1 |
https://arxiv.org/pdf/2002.01569v1.pdf | |
PWC | https://paperswithcode.com/paper/uncertainty-quantification-for-bayesian |
Repo | |
Framework | |
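The construction the abstract describes, confidence regions for the maximizer built from uniform GP error bounds, can be illustrated with a simple (non-rigorous) sketch: keep every candidate point whose upper confidence bound still exceeds the best lower confidence bound. The toy objective, kernel, and width multiplier `beta` below are illustrative choices, not the paper's calibrated values.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
f = lambda x: -(x - 0.3) ** 2                 # toy objective, maximized at x = 0.3
X = rng.uniform(0, 1, size=(8, 1))            # stand-in for sequentially sampled points
y = f(X).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6).fit(X, y)

grid = np.linspace(0, 1, 200).reshape(-1, 1)
mu, sd = gp.predict(grid, return_std=True)
beta = 2.0                                    # illustrative width; the paper derives valid levels
ucb, lcb = mu + beta * sd, mu - beta * sd

# Candidate maximizers: points whose UCB still beats the best guaranteed value.
region = grid[ucb >= lcb.max()].ravel()
print(f"confidence region spans [{region.min():.2f}, {region.max():.2f}]")
```

With noiseless observations of a smooth objective, the surviving set concentrates around the true maximizer at 0.3.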
Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging
Title | Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging |
Authors | Mamatha Thota, Stefanos Kollias, Mark Swainson, Georgios Leontidis |
Abstract | Retail food packaging contains information which informs choice and can be vital to consumer health, including product name, ingredients list, nutritional information, allergens, preparation guidelines, pack weight, storage and shelf life information (use-by / best before dates). The presence and accuracy of such information is critical to ensure a detailed understanding of the product and to reduce the potential for health risks. Consequently, erroneous or illegible labeling has the potential to be highly detrimental to consumers and many other stakeholders in the supply chain. In this paper, a multi-source deep learning-based domain adaptation system is proposed and tested to identify and verify the presence and legibility of use-by date information from food packaging photos taken as part of the validation process as the products pass along the food production line. This was achieved by improving the generalization of the techniques through the use of multi-source datasets to extract domain-invariant representations for all domains, and by aligning the distributions of all pairs of source and target domains in a common feature space, along with the class boundaries. The proposed system performed very well in the conducted experiments on automating the verification process and reducing labeling errors that could otherwise threaten public health and contravene legal requirements for food packaging information and accuracy. Comprehensive experiments on our food packaging datasets demonstrate that the proposed multi-source deep domain adaptation method significantly improves the classification accuracy and therefore has great potential for application and beneficial impact in food manufacturing control systems. |
Tasks | Domain Adaptation |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10335v1 |
https://arxiv.org/pdf/2001.10335v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-source-deep-domain-adaptation-for |
Repo | |
Framework | |
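Aligning distributions across source–target pairs is commonly implemented with a discrepancy loss such as maximum mean discrepancy (MMD). The paper's full system is a deep multi-source network, but a toy MMD term conveys what "aligning distributions in a common feature space" asks the feature extractor to minimize; all data below are synthetic stand-ins.

```python
import numpy as np

def mmd_rbf(X, Y, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
src_a = rng.normal(0.0, 1.0, (100, 4))   # synthetic stand-ins for per-domain features
src_b = rng.normal(0.1, 1.0, (100, 4))   # a second, nearby source domain
target = rng.normal(2.0, 1.0, (100, 4))  # a shifted target domain

# Aligning every (source, target) pair in a shared feature space amounts to
# training the feature extractor to drive terms like these toward zero.
print(mmd_rbf(src_a, target), mmd_rbf(src_a, src_b))
```

The shifted target yields a much larger discrepancy than the two nearby sources, which is exactly the signal an alignment loss penalizes.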
DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding
Title | DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding |
Authors | Yuyu Zhang, Ping Nie, Xiubo Geng, Arun Ramamurthy, Le Song, Daxin Jiang |
Abstract | Recent studies on open-domain question answering have achieved prominent performance improvement using pre-trained language models such as BERT. State-of-the-art approaches typically follow the “retrieve and read” pipeline and employ a BERT-based reranker to filter retrieved documents before feeding them into the reader module. The BERT retriever takes as input the concatenation of the question and each retrieved document. Despite the success of these approaches in terms of QA accuracy, due to the concatenation, they can hardly handle a high throughput of incoming questions, each with a large collection of retrieved documents. To address the efficiency problem, we propose DC-BERT, a decoupled contextual encoding framework that has dual BERT models: an online BERT which encodes the question only once, and an offline BERT which pre-encodes all the documents and caches their encodings. On SQuAD Open and Natural Questions Open datasets, DC-BERT achieves 10x speedup on document retrieval, while retaining most (about 98%) of the QA performance compared to state-of-the-art approaches for open-domain question answering. |
Tasks | Open-Domain Question Answering, Question Answering |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12591v1 |
https://arxiv.org/pdf/2002.12591v1.pdf | |
PWC | https://paperswithcode.com/paper/dc-bert-decoupling-question-and-document-for |
Repo | |
Framework | |
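The decoupling idea is easy to sketch: one encoder runs once per question, while the other encoder's per-document outputs are precomputed and cached. The toy encoders below are stand-ins for the two BERT models (DC-BERT also applies interaction layers on top of the cached encodings, omitted here).

```python
from functools import lru_cache

# Toy stand-ins for the two encoders: an online one for questions and an
# offline one whose per-document encodings are computed once and cached.
def online_encode(question):
    return (len(question) % 7, question.count(" "))

@lru_cache(maxsize=None)                  # the offline cache: each doc encoded once
def offline_encode(document):
    return (len(document) % 11, document.count(" "))

def score(question, documents):
    q = online_encode(question)           # exactly one online pass per question
    return [sum(a * b for a, b in zip(q, offline_encode(d))) for d in documents]

docs = ["alpha beta", "gamma delta epsilon"]
print(score("who wrote alpha?", docs))
print(score("when was gamma?", docs))
print(offline_encode.cache_info())        # 2 hits: no document was re-encoded
```

The second question reuses both cached document encodings, which is the source of the retrieval speedup the abstract reports.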
The Indian Chefs Process
Title | The Indian Chefs Process |
Authors | Patrick Dallaire, Luca Ambrogioni, Ludovic Trottier, Umut Güçlü, Max Hinne, Philippe Giguère, Brahim Chaib-Draa, Marcel van Gerven, Francois Laviolette |
Abstract | This paper introduces the Indian Chefs Process (ICP), a Bayesian nonparametric prior on the joint space of infinite directed acyclic graphs (DAGs) and orders that generalizes Indian Buffet Processes. As our construction shows, the proposed distribution relies on a latent Beta Process controlling both the orders and outgoing connection probabilities of the nodes, and yields a probability distribution on sparse infinite graphs. The main advantage of the ICP over previously proposed Bayesian nonparametric priors for DAG structures is its greater flexibility. To the best of our knowledge, the ICP is the first Bayesian nonparametric model supporting every possible DAG. We demonstrate the usefulness of the ICP on learning the structure of deep generative sigmoid networks as well as convolutional neural networks. |
Tasks | |
Published | 2020-01-29 |
URL | https://arxiv.org/abs/2001.10657v1 |
https://arxiv.org/pdf/2001.10657v1.pdf | |
PWC | https://paperswithcode.com/paper/the-indian-chefs-process |
Repo | |
Framework | |
A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency
Title | A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency |
Authors | Xinyi Guo, Jinfeng Li |
Abstract | A novel social networks sentiment analysis model is proposed based on Twitter sentiment score (TSS) for real-time prediction of the future stock market price FTSE 100, as compared with conventional econometric models of investor sentiment based on closed-end fund discount (CEFD). The proposed TSS model features a new baseline correlation approach, which not only exhibits a decent prediction accuracy, but also reduces the computation burden and enables fast decision making without knowledge of historical data. Polynomial regression, classification modelling and lexicon-based sentiment analysis are performed using R. The obtained TSS predicts the future stock market trend in advance by 15 time samples (30 working hours) with an accuracy of 67.22% using the proposed baseline criterion without referring to historical TSS or market data. Specifically, TSS’s prediction performance for an upward market is found to be far better than that for a downward market. Under logistic regression and linear discriminant analysis, the accuracy of TSS in predicting the upward trend of the future market achieves 97.87%. |
Tasks | Decision Making, Sentiment Analysis, Twitter Sentiment Analysis |
Published | 2020-03-18 |
URL | https://arxiv.org/abs/2003.08137v1 |
https://arxiv.org/pdf/2003.08137v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-twitter-sentiment-analysis-model-with |
Repo | |
Framework | |
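A minimal Python sketch (the paper works in R) of lexicon-based TSS with a fixed-baseline decision rule; the mini-lexicon, tweets, and baseline value are hypothetical, but they show why no historical TSS or market data is needed at decision time.

```python
# Hypothetical mini-lexicon; the paper performs full lexicon-based scoring in R.
LEXICON = {"bull": 1, "gain": 1, "strong": 1, "bear": -1, "loss": -1, "weak": -1}

def tweet_score(text):
    return sum(LEXICON.get(w, 0) for w in text.lower().split())

def tss(tweets):
    """Twitter sentiment score: mean lexicon polarity over a window of tweets."""
    return sum(map(tweet_score, tweets)) / len(tweets)

def signal(score, baseline=0.0):
    """Baseline-style rule: act on TSS crossing a fixed baseline, so no
    historical TSS or market data is consulted at decision time."""
    return "up" if score > baseline else "down"

window = ["FTSE looking strong today", "big gain on open", "the bears look weak today"]
print(tss(window), signal(tss(window)))
```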
M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking
Title | M$^5$L: Multi-Modal Multi-Margin Metric Learning for RGBT Tracking |
Authors | Zhengzheng Tu, Chun Lin, Chenglong Li, Jin Tang, Bin Luo |
Abstract | Classifying confusing samples in the course of RGBT tracking is a quite challenging problem which has not yet been satisfactorily solved. Existing methods only focus on enlarging the boundary between positive and negative samples, but the structured information of samples might be harmed, e.g., confusing positive samples may be closer to the anchor than normal positive samples. To handle this problem, we propose a novel Multi-Modal Multi-Margin Metric Learning framework, named M$^5$L, for RGBT tracking in this paper. In particular, we design a multi-margin structured loss to distinguish the confusing samples, which play a most critical role in boosting tracking performance. To alleviate this problem, we additionally enlarge the boundaries between confusing positive samples and normal ones, and between confusing negative samples and normal ones, with predefined margins, by exploiting the structured information of all samples in each modality. Moreover, a cross-modality constraint is employed to reduce the difference between modalities and push positive samples closer to the anchor than negative ones from both modalities. In addition, to achieve quality-aware RGB and thermal feature fusion, we introduce modality attentions and learn them using a feature fusion module in our network. Extensive experiments on large-scale datasets testify that our framework clearly improves the tracking performance and outperforms the state-of-the-art RGBT trackers. |
Tasks | Metric Learning |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07650v1 |
https://arxiv.org/pdf/2003.07650v1.pdf | |
PWC | https://paperswithcode.com/paper/m5l-multi-modal-multi-margin-metric-learning |
Repo | |
Framework | |
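The multi-margin idea, wider margins for confusing samples than for normal ones, can be sketched as a hinge triplet loss with a per-sample margin. This is a simplified stand-in for the structured loss in M$^5$L, not the paper's exact formulation.

```python
import numpy as np

def multi_margin_triplet(anchor, pos, neg, confusing, m_normal=0.2, m_conf=0.5):
    """Hinge triplet loss with a wider margin when the triplet involves a
    confusing sample -- a simplified stand-in for the M^5L structured loss."""
    dist = lambda a, b: np.linalg.norm(a - b, axis=-1)
    margin = np.where(confusing, m_conf, m_normal)
    return np.maximum(0.0, dist(anchor, pos) - dist(anchor, neg) + margin)

anchor = np.zeros((2, 2))
pos = np.array([[0.1, 0.0], [0.1, 0.0]])    # identical geometry in both rows...
neg = np.array([[0.3, 0.0], [0.3, 0.0]])
losses = multi_margin_triplet(anchor, pos, neg, confusing=np.array([False, True]))
print(losses)   # ...but only the confusing triplet still incurs a penalty
```

The wider margin keeps pushing confusing samples apart even after normal triplets already satisfy their boundary.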
Sideways: Depth-Parallel Training of Video Models
Title | Sideways: Depth-Parallel Training of Video Models |
Authors | Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean |
Abstract | We propose Sideways, an approximate backpropagation scheme for training video models. In standard backpropagation, the gradients and activations at every computation step through the model are temporally synchronized. The forward activations need to be stored until the backward pass is executed, preventing inter-layer (depth) parallelization. However, can we leverage smooth, redundant input streams such as videos to develop a more efficient training scheme? Here, we explore an alternative to backpropagation; we overwrite network activations whenever new ones, i.e., from new frames, become available. Such a more gradual accumulation of information from both passes breaks the precise correspondence between gradients and activations, leading to theoretically more noisy weight updates. Counter-intuitively, we show that Sideways training of deep convolutional video networks not only still converges, but can also potentially exhibit better generalization compared to standard synchronized backpropagation. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06232v3 |
https://arxiv.org/pdf/2001.06232v3.pdf | |
PWC | https://paperswithcode.com/paper/sideways-depth-parallel-training-of-video |
Repo | |
Framework | |
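A toy depth-parallel pipeline conveys the mechanism: each layer consumes the activation its predecessor produced on the previous tick, so all layers can run concurrently while information from a frame reaches depth d only after d ticks. The "+ 1" layer below is a placeholder for real computation.

```python
# Depth-parallel forward sketch: on every tick, each "layer" consumes the
# activation its predecessor produced on the PREVIOUS tick, so all layers can
# run concurrently over a smooth stream of frames.
def run_sideways(frames, n_layers=3):
    acts = [0.0] * (n_layers + 1)      # acts[i] = latest output of layer i
    outputs = []
    for frame in frames:
        new = acts[:]
        new[0] = frame                 # slot 0 holds the freshest input frame
        for i in range(1, n_layers + 1):
            new[i] = acts[i - 1] + 1   # reads the predecessor's stale activation
        acts = new
        outputs.append(acts[-1])
    return outputs

# A frame needs n_layers ticks to reach the top; afterwards the output tracks
# the (stale) stream, which is harmless when consecutive frames are similar.
print(run_sideways([10, 20, 30, 40, 50]))   # [1, 2, 3, 13, 23]
```

On real video the mismatch between gradients and overwritten activations is the "noisy update" the abstract describes; on smooth streams it stays small.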
Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning
Title | Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning |
Authors | Mahesh Kumar Krishna Reddy, Mohammad Hossain, Mrigank Rochan, Yang Wang |
Abstract | We consider the problem of few-shot scene adaptive crowd counting. Given a target camera scene, our goal is to adapt a model to this specific scene with only a few labeled images of that scene. The solution to this problem has potential applications in numerous real-world scenarios, where we would ideally like to deploy a crowd counting model specially adapted to a target camera. We address this challenge by taking inspiration from the recently introduced learning-to-learn paradigm in the context of the few-shot regime. During training, our method learns the model parameters in a way that facilitates fast adaptation to the target scene. At test time, given a target scene with a small amount of labeled data, our method quickly adapts to that scene with a few gradient updates to the learned parameters. Our extensive experimental results show that the proposed approach outperforms other alternatives in few-shot scene adaptive crowd counting. |
Tasks | Crowd Counting, Meta-Learning |
Published | 2020-02-01 |
URL | https://arxiv.org/abs/2002.00264v2 |
https://arxiv.org/pdf/2002.00264v2.pdf | |
PWC | https://paperswithcode.com/paper/few-shot-scene-adaptive-crowd-counting-using |
Repo | |
Framework | |
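The train/adapt split can be sketched with a first-order meta-learning loop on toy one-dimensional "scenes". A Reptile-style outer update is used here for brevity; it is a stand-in for the paper's learning-to-learn procedure, not its exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
w, lr_in, lr_out = 0.0, 0.1, 0.5          # meta-init and the two learning rates

def adapt(w, slope, steps):
    """Inner loop: a few gradient steps on one scene's regression y = slope * x."""
    x = rng.uniform(-1, 1, 20)
    for _ in range(steps):
        grad = 2 * np.mean((w * x - slope * x) * x)   # d/dw of the MSE
        w -= lr_in * grad
    return w

# Outer loop: move the init toward each scene's adapted weights (Reptile-style),
# so the init ends up a few gradient steps away from every scene's optimum.
for _ in range(200):
    scene_slope = rng.uniform(2.0, 3.0)   # each "scene" has its own target slope
    w += lr_out * (adapt(w, scene_slope, steps=3) - w)

# Few-shot adaptation: one gradient step on an unseen scene already lands close.
w_new = adapt(w, slope=2.5, steps=1)
print(round(w, 2), round(w_new, 2))
```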
Auditing ML Models for Individual Bias and Unfairness
Title | Auditing ML Models for Individual Bias and Unfairness |
Authors | Songkai Xue, Mikhail Yurochkin, Yuekai Sun |
Abstract | We consider the task of auditing ML models for individual bias/unfairness. We formalize the task as an optimization problem and develop a suite of inferential tools for the optimal value. Our tools permit us to obtain asymptotic confidence intervals and hypothesis tests that cover the target/control the Type I error rate exactly. To demonstrate the utility of our tools, we use them to reveal the gender and racial biases in Northpointe’s COMPAS recidivism prediction instrument. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05048v1 |
https://arxiv.org/pdf/2003.05048v1.pdf | |
PWC | https://paperswithcode.com/paper/auditing-ml-models-for-individual-bias-and |
Repo | |
Framework | |
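A much-simplified probe of the quantity being audited, how far predictions move between individuals who differ only in a sensitive attribute, can be sketched as below. The paper instead formalizes the audit as an optimization over a fair metric and derives exact asymptotic inference, which this toy (with entirely synthetic data) omits.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
sensitive = rng.integers(0, 2, n)        # hypothetical protected attribute
skill = rng.normal(0, 1, n)              # a legitimate feature
X = np.column_stack([skill, sensitive])
y = (skill + 1.5 * sensitive + rng.normal(0, 0.5, n) > 0).astype(int)  # biased labels

model = LogisticRegression().fit(X, y)

# Compare each individual with a counterpart identical except for the attribute.
X_flip = X.copy()
X_flip[:, 1] = 1 - X_flip[:, 1]
gap = np.abs(model.predict_proba(X)[:, 1] - model.predict_proba(X_flip)[:, 1])
print(round(gap.mean(), 3))   # a large mean gap flags individual-level bias
```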
SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On
Title | SieveNet: A Unified Framework for Robust Image-Based Virtual Try-On |
Authors | Surgan Jandial, Ayush Chopra, Kumar Ayush, Mayur Hemani, Abhijeet Kumar, Balaji Krishnamurthy |
Abstract | Image-based virtual try-on for fashion has gained considerable attention recently. The task requires trying on a clothing item on a target model image. An efficient framework for this is composed of two stages: (1) warping (transforming) the try-on cloth to align with the pose and shape of the target model, and (2) a texture transfer module to seamlessly integrate the warped try-on cloth onto the target model image. Existing methods suffer from artifacts and distortions in their try-on output. In this work, we present SieveNet, a framework for robust image-based virtual try-on. Firstly, we introduce a multi-stage coarse-to-fine warping network to better model fine-grained intricacies (while transforming the try-on cloth) and train it with a novel perceptual geometric matching loss. Next, we introduce a try-on cloth conditioned segmentation mask prior to improve the texture transfer network. Finally, we also introduce a dueling triplet loss strategy for training the texture translation network which further improves the quality of the generated try-on results. We present extensive qualitative and quantitative evaluations of each component of the proposed pipeline and show significant performance improvements against the current state-of-the-art method. |
Tasks | |
Published | 2020-01-17 |
URL | https://arxiv.org/abs/2001.06265v1 |
https://arxiv.org/pdf/2001.06265v1.pdf | |
PWC | https://paperswithcode.com/paper/sievenet-a-unified-framework-for-robust-image |
Repo | |
Framework | |
The Deep Learning Compiler: A Comprehensive Survey
Title | The Deep Learning Compiler: A Comprehensive Survey |
Authors | Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Depei Qian |
Abstract | The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. Generally, DL compilers take the DL models described in different DL frameworks as input, and then generate optimized codes for diverse DL hardware as output. However, none of the existing surveys has analyzed the unique design of DL compilers comprehensively. In this paper, we perform a comprehensive survey of existing DL compilers by dissecting the commonly adopted design in detail, with emphasis on the DL-oriented multi-level IRs and frontend/backend optimizations. Specifically, we provide a comprehensive comparison among existing DL compilers from various aspects. In addition, we present a detailed analysis of the multi-level IR design and compiler optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey paper focusing on the unique design of DL compilers, which we hope can pave the way for future research on DL compilers. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.03794v2 |
https://arxiv.org/pdf/2002.03794v2.pdf | |
PWC | https://paperswithcode.com/paper/the-deep-learning-compiler-a-comprehensive |
Repo | |
Framework | |
Improving noise robust automatic speech recognition with single-channel time-domain enhancement network
Title | Improving noise robust automatic speech recognition with single-channel time-domain enhancement network |
Authors | Keisuke Kinoshita, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani |
Abstract | With the advent of deep learning, research on noise-robust automatic speech recognition (ASR) has progressed rapidly. However, ASR performance in noisy conditions of single-channel systems remains unsatisfactory. Indeed, most single-channel speech enhancement (SE) methods (denoising) have brought only limited performance gains over state-of-the-art ASR back-ends trained on multi-condition training data. Recently, there has been much research on neural network-based SE methods working in the time domain showing levels of performance never attained before. However, it has not been established whether the high enhancement performance achieved by such time-domain approaches could be translated into ASR improvements. In this paper, we show that a single-channel time-domain denoising approach can significantly improve ASR performance, providing more than 30% relative word error rate reduction over a strong ASR back-end on the real evaluation data of the single-channel track of the CHiME-4 dataset. These positive results demonstrate that single-channel noise reduction can still improve ASR performance, which should open the door to more research in that direction. |
Tasks | Denoising, Speech Enhancement, Speech Recognition |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.03998v1 |
https://arxiv.org/pdf/2003.03998v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-noise-robust-automatic-speech |
Repo | |
Framework | |
A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application
Title | A meta-algorithm for classification using random recursive tree ensembles: A high energy physics application |
Authors | Vidhi Lalchand |
Abstract | The aim of this work is to propose a meta-algorithm for automatic classification in the presence of discrete binary classes. Classifier learning in the presence of overlapping class distributions is a challenging problem in machine learning. Overlapping classes are described by the presence of ambiguous areas in the feature space with a high density of points belonging to both classes. This often occurs in real-world datasets; one such example is numeric data denoting properties of particle decays derived from high-energy accelerators like the Large Hadron Collider (LHC). A significant body of research targeting the class overlap problem uses ensemble classifiers to boost the performance of algorithms by using them iteratively in multiple stages or using multiple copies of the same model on different subsets of the input training data. The former is called boosting and the latter is called bagging. The algorithm proposed in this thesis targets a challenging classification problem in high energy physics - that of improving the statistical significance of the Higgs discovery. The underlying dataset used to train the algorithm is experimental data built from the official ATLAS full-detector simulation with Higgs events (signal) mixed with different background events (background) that closely mimic the statistical properties of the signal generating class overlap. The algorithm proposed is a variant of the classical boosted decision tree which is known to be one of the most successful analysis techniques in experimental physics. The algorithm utilizes a unified framework that combines two meta-learning techniques - bagging and boosting. The results show that this combination only works in the presence of a randomization trick in the base learners. |
Tasks | Meta-Learning |
Published | 2020-01-19 |
URL | https://arxiv.org/abs/2001.06880v1 |
https://arxiv.org/pdf/2001.06880v1.pdf | |
PWC | https://paperswithcode.com/paper/a-meta-algorithm-for-classification-using |
Repo | |
Framework | |
Efficient Trainable Front-Ends for Neural Speech Enhancement
Title | Efficient Trainable Front-Ends for Neural Speech Enhancement |
Authors | Jonah Casebeer, Umut Isik, Shrikant Venkataramani, Arvindh Krishnaswamy |
Abstract | Many neural speech enhancement and source separation systems operate in the time-frequency domain. Such models often benefit from making their Short-Time Fourier Transform (STFT) front-ends trainable. In the current literature, these are implemented as large Discrete Fourier Transform matrices, which are prohibitively inefficient for low-compute systems. We present an efficient, trainable front-end based on the butterfly mechanism to compute the Fast Fourier Transform, and show its accuracy and efficiency benefits for low-compute neural speech enhancement models. We also explore the effects of making the STFT window trainable. |
Tasks | Speech Enhancement |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.09286v1 |
https://arxiv.org/pdf/2002.09286v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-trainable-front-ends-for-neural |
Repo | |
Framework | |
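The butterfly mechanism the abstract refers to is the radix-2 Cooley–Tukey factorization of the DFT; a trainable front-end can parameterize these O(N log N) stages instead of a dense DFT matrix. A plain, non-trainable version of the structure:

```python
import numpy as np

def butterfly_fft(x):
    """Radix-2 Cooley-Tukey FFT as explicit butterfly stages (len(x) a power of 2)."""
    n = len(x)
    if n == 1:
        return x.astype(complex)
    even, odd = butterfly_fft(x[0::2]), butterfly_fft(x[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)   # the butterfly weights
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(butterfly_fft(x), np.fft.fft(x)))   # True
```

Making the twiddle factors (and the analysis window) learnable parameters is, roughly, what a butterfly-based trainable STFT front-end does.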
Goldilocks Neural Networks
Title | Goldilocks Neural Networks |
Authors | Jan Rosenzweig, Zoran Cvetkovic, Ivana Rosenzweig |
Abstract | We introduce the new “Goldilocks” class of activation functions, which non-linearly deform the input signal only locally, when the input signal is in the appropriate range. The small local deformation of the signal enables better understanding of how and why the signal is transformed through the layers. Numerical results on the CIFAR-10 and CIFAR-100 data sets show that Goldilocks networks perform better than, or comparably to, SELU and ReLU networks, while making the deformation of data through the layers tractable. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.05059v2 |
https://arxiv.org/pdf/2002.05059v2.pdf | |
PWC | https://paperswithcode.com/paper/goldilocks-neural-networks |
Repo | |
Framework | |
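The abstract does not give the functional form, so the activation below is only one plausible "locally deforming" shape: identity far from zero, with a nonlinearity confined to a "just right" input range. It is a hypothetical illustration, not the paper's definition.

```python
import numpy as np

def goldilocks(x, width=1.0, strength=0.5):
    """Identity far from zero; a smooth deformation only where |x| < ~width.
    A hypothetical local-deformation shape, not the paper's exact function."""
    bump = np.exp(-(x / width) ** 2)        # gates where the nonlinearity acts
    return x + strength * bump * np.tanh(x)

x = np.array([-10.0, -0.5, 0.0, 0.5, 10.0])
print(goldilocks(x))   # ~identity at the extremes, deformed near zero
```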