February 2, 2020

Paper Group AWR 6

Do Not Trust Additive Explanations

Title Do Not Trust Additive Explanations
Authors Alicja Gosiewska, Przemyslaw Biecek
Abstract Explainable Artificial Intelligence (XAI) has attracted a lot of attention recently. Explainability is presented as a remedy for the lack of trust in model predictions. Model-agnostic tools such as LIME, SHAP, or Break Down promise instance-level interpretability for any complex machine learning model. But how reliable are these additive explanations? Can we rely on additive explanations for non-additive models? In this paper, we (1) examine the behavior of the most popular instance-level explanations in the presence of interactions, (2) introduce a new method that can handle interactions in instance-level explanations, and (3) perform a large-scale benchmark to see how frequently additive explanations may be misleading.
Tasks
Published 2019-03-27
URL https://arxiv.org/abs/1903.11420v2
PDF https://arxiv.org/pdf/1903.11420v2.pdf
PWC https://paperswithcode.com/paper/ibreakdown-uncertainty-of-model-explanations
Repo https://github.com/ModelOriented/iBreakDown
Framework none
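
To make the paper's warning concrete, here is a toy, hand-rolled Shapley computation (illustrative only, not the iBreakDown code): for a purely interactive model f(x1, x2) = x1 * x2, any additive attribution has to split the interaction between the two features, so the explanation looks the same as for a genuinely additive model.

```python
# Exact Shapley values for a pure-interaction model f(x1, x2) = x1 * x2.
from itertools import combinations
from math import factorial

def f(x, baseline, subset):
    # Model output when only features in `subset` are known; unknown
    # features are replaced by their baseline value.
    z = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return z[0] * z[1]  # a pure interaction, nothing additive about it

def shapley(x, baseline):
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for s in combinations(others, k):
                w = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi[i] += w * (f(x, baseline, set(s) | {i}) - f(x, baseline, set(s)))
    return phi

print(shapley([1.0, 1.0], [0.0, 0.0]))  # [0.5, 0.5]: the x1*x2 interaction
# is silently split in half, exactly the failure mode the paper probes.
```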

From Open Set to Closed Set: Counting Objects by Spatial Divide-and-Conquer

Title From Open Set to Closed Set: Counting Objects by Spatial Divide-and-Conquer
Authors Haipeng Xiong, Hao Lu, Chengxin Liu, Liang Liu, Zhiguo Cao, Chunhua Shen
Abstract Visual counting, a task that predicts the number of objects in an image/video, is an open-set problem by nature, i.e., the count can vary in $[0, +\infty)$ in theory. However, the collected images and labeled count values are limited in reality, which means that only a small closed set is observed. Existing methods typically model this task in a regression manner, so they are likely to suffer on an unseen scene with counts out of the scope of the closed set. In fact, counting is decomposable: a dense region can always be divided until sub-region counts are within the previously observed closed set. Inspired by this idea, we propose a simple but effective approach, the Spatial Divide-and-Conquer Network (S-DCNet). S-DCNet only learns from a closed set but can generalize well to open-set scenarios via S-DC. S-DCNet is also efficient: to avoid repeatedly computing sub-region convolutional features, S-DC is executed on the feature map instead of on the input image. S-DCNet achieves state-of-the-art performance on three crowd counting datasets (ShanghaiTech, UCF_CC_50 and UCF-QNRF), a vehicle counting dataset (TRANCOS) and a plant counting dataset (MTC). Compared to the previous best methods, S-DCNet brings a 20.2% relative improvement on ShanghaiTech Part B, 20.9% on UCF-QNRF, 22.5% on TRANCOS and 15.1% on MTC. Code has been made available at: https://github.com/xhp-hust-2018-2011/S-DCNet.
Tasks Crowd Counting
Published 2019-08-15
URL https://arxiv.org/abs/1908.06473v1
PDF https://arxiv.org/pdf/1908.06473v1.pdf
PWC https://paperswithcode.com/paper/from-open-set-to-closed-set-counting-objects
Repo https://github.com/xhp-hust-2018-2011/S-DCNet
Framework pytorch
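
The divide-and-conquer recursion at the heart of S-DCNet can be sketched in a few lines. This is a simplification: the real network divides feature maps rather than pixels and learns when to divide, and `predict_count` below is a hypothetical stand-in regressor.

```python
# Minimal sketch of spatial divide-and-conquer counting.
import numpy as np

CLOSED_SET_MAX = 20.0  # largest count observed during training (assumed)

def predict_count(patch):
    # Hypothetical counter: pretend a density map integrates to the count.
    return float(patch.sum())

def sdc_count(patch, max_count=CLOSED_SET_MAX):
    c = predict_count(patch)
    h, w = patch.shape
    if c <= max_count or min(h, w) < 2:
        return c  # count lies in the observed closed set: trust it
    # Otherwise divide the region 2x2 and count each quadrant recursively.
    hh, ww = h // 2, w // 2
    return (sdc_count(patch[:hh, :ww]) + sdc_count(patch[:hh, ww:]) +
            sdc_count(patch[hh:, :ww]) + sdc_count(patch[hh:, ww:]))

density = np.random.rand(128, 128)  # stand-in density map
print(sdc_count(density))           # far above CLOSED_SET_MAX, yet handled
```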

TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP

Title TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP
Authors Nils Rethmeier, Vageesh Kumar Saxena, Isabelle Augenstein
Abstract While state-of-the-art NLP explainability (XAI) methods focus on supervised, per-instance end or diagnostic probing task evaluation [4, 2, 10], this is insufficient to interpret and quantify model knowledge transfer during (un-)supervised training. By instead expressing each neuron as an interpretable token-activation distribution collected over many instances, one can quantify and guide visual exploration of neuron-knowledge change between model training stages to analyze transfer beyond probing tasks and the per-instance level. This allows one to analyze: (RQ1) how neurons abstract knowledge during unsupervised pretraining; (RQ2) how pretrained neurons zero-shot transfer knowledge to new domain data; and (RQ3) how supervised tasks reorder pretrained neuron knowledge abstractions. Since the meaningfulness of XAI methods is hard to quantify [11, 4], we analyze three example learning setups (RQ1-3) to empirically verify that our method (TX-Ray) identifies transfer-(ir)relevant neurons for pruning (RQ3), and that its transfer metrics coincide with traditional measures like perplexity (RQ1). We also find that TX-Ray-guided pruning of supervision-(ir)relevant neuron-knowledge (RQ3) can identify 'lottery ticket'-like [9, 40] neurons that drive model performance and robustness. Upon inspecting pruned neurons, we find that task-relevant neuron-knowledge ('tickets') appears (over-)fit, while task-irrelevant neurons lower overfitting, i.e., TX-Ray identifies neurons that generalize, transfer or specialize model-knowledge [25]. Finally, through RQ1-3, we find that TX-Ray helps to explore and quantify the dynamics of (continual) knowledge transfer, and that it can shed light on neuron-knowledge specialization and generalization, complementing (costly) supervised probing task procurement and established 'summary' statistics like perplexity, ROC or F scores.
Tasks Transfer Learning
Published 2019-12-02
URL https://arxiv.org/abs/1912.00982v1
PDF https://arxiv.org/pdf/1912.00982v1.pdf
PWC https://paperswithcode.com/paper/tx-ray-quantifying-and-explaining-model
Repo https://github.com/copenlu/tx-ray
Framework pytorch
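
A rough sketch of the core abstraction as I read the abstract (simplified, not the released code): express a neuron as a distribution over the tokens on which it fired maximally, then compare the same neuron across two training stages with a divergence such as the Hellinger distance.

```python
# Neuron-as-token-distribution sketch with a Hellinger comparison.
from collections import Counter
import math

def neuron_token_distribution(tokens, activations, neuron):
    # For each token occurrence, record it if `neuron` had the max activation.
    wins = Counter(tok for tok, act in zip(tokens, activations)
                   if max(range(len(act)), key=act.__getitem__) == neuron)
    total = sum(wins.values()) or 1
    return {tok: c / total for tok, c in wins.items()}

def hellinger(p, q):
    keys = set(p) | set(q)
    return math.sqrt(0.5 * sum((math.sqrt(p.get(k, 0)) - math.sqrt(q.get(k, 0))) ** 2
                               for k in keys))

toks = ["the", "cat", "sat", "the", "mat"]
acts_pre  = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
acts_post = [[0.9, 0.1], [0.7, 0.3], [0.1, 0.9], [0.8, 0.2], [0.6, 0.4]]
p = neuron_token_distribution(toks, acts_pre, 0)
q = neuron_token_distribution(toks, acts_post, 0)
print(hellinger(p, q))  # nonzero shift = neuron-knowledge change across stages
```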

Domain Intersection and Domain Difference

Title Domain Intersection and Domain Difference
Authors Sagie Benaim, Michael Khaitov, Tomer Galanti, Lior Wolf
Abstract We present a method for recovering the content shared between two visual domains as well as the content that is unique to each domain. This allows us to map from one domain to the other in a way that removes the content specific to the first domain and imports the content specific to the second from any image in the second domain. In addition, our method enables generation of images from the intersection of the two domains as well as their union, despite having no such samples during training. The method is shown analytically to contain all the sufficient and necessary constraints. It also outperforms the literature methods in an extensive set of experiments. Our code is available at https://github.com/sagiebenaim/DomainIntersectionDifference.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11628v1
PDF https://arxiv.org/pdf/1908.11628v1.pdf
PWC https://paperswithcode.com/paper/domain-intersection-and-domain-difference
Repo https://github.com/sagiebenaim/DomainIntersectionDifference
Framework pytorch
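
A hedged sketch of the latent split the abstract describes (stand-in linear encoder and decoder, not the authors' architecture): each image is encoded into a code shared across domains plus a domain-specific code, and cross-domain translation swaps the specific part.

```python
# Shared/specific latent split with a swap-based translation, as a toy.
import torch
import torch.nn as nn

class Split(nn.Module):
    def __init__(self, dim=64, shared=32):
        super().__init__()
        self.enc = nn.Linear(dim, dim)  # stand-in encoder
        self.dec = nn.Linear(dim, dim)  # stand-in decoder
        self.shared = shared

    def encode(self, x):
        z = self.enc(x)
        return z[:, :self.shared], z[:, self.shared:]  # (common, specific)

    def translate(self, x_a, x_b):
        common_a, _ = self.encode(x_a)       # content shared by both domains
        _, specific_b = self.encode(x_b)     # content unique to domain B
        return self.dec(torch.cat([common_a, specific_b], dim=1))

m = Split()
out = m.translate(torch.randn(4, 64), torch.randn(4, 64))
print(out.shape)  # torch.Size([4, 64])
```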

Automatically Neutralizing Subjective Bias in Text

Title Automatically Neutralizing Subjective Bias in Text
Authors Reid Pryzant, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, Diyi Yang
Abstract Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity - introducing attitudes via framing, presupposing truth, and casting doubt - remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view (“neutralizing” biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque CONCURRENT system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable MODULAR algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.
Tasks Text Generation
Published 2019-11-21
URL https://arxiv.org/abs/1911.09709v3
PDF https://arxiv.org/pdf/1911.09709v3.pdf
PWC https://paperswithcode.com/paper/automatically-neutralizing-subjective-bias-in
Repo https://github.com/rpryzant/neutralizing-bias
Framework pytorch
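
The MODULAR recipe, detect subjective tokens then edit them, can be illustrated with deliberately crude stand-ins: a lexicon lookup instead of the BERT tagger, and substitution/deletion instead of the join-embedding decoder.

```python
# Tag-then-edit pipeline sketch; both steps are hypothetical stand-ins.
SUBJECTIVE = {"amazing": None, "notorious": "well-known", "claimed": "said"}

def tag(tokens):
    # Step 1 stand-in: flag problematic (subjective) words.
    return [t.lower() in SUBJECTIVE for t in tokens]

def edit(tokens, flags):
    # Step 2 stand-in: substitute a neutral word, or drop it entirely.
    out = []
    for t, flagged in zip(tokens, flags):
        if not flagged:
            out.append(t)
        elif SUBJECTIVE[t.lower()] is not None:
            out.append(SUBJECTIVE[t.lower()])
    return out

sent = "The notorious senator claimed victory".split()
print(" ".join(edit(sent, tag(sent))))  # The well-known senator said victory
```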

Learning Content-Weighted Deep Image Compression

Title Learning Content-Weighted Deep Image Compression
Authors Mu Li, Wangmeng Zuo, Shuhang Gu, Jane You, David Zhang
Abstract Learning-based lossy image compression usually involves the joint optimization of rate-distortion performance. Most existing methods adopt spatially invariant bit-length allocation and incorporate discrete entropy approximation to constrain the compression rate. Nonetheless, the information content is spatially variant, and regions with complex and salient structures are generally more essential to image compression. Taking the spatial variation of image content into account, this paper presents a content-weighted encoder-decoder model, which involves an importance map subnet to produce an importance mask for locally adaptive bit-rate allocation. Consequently, the summation of the importance mask can be utilized as an alternative to entropy estimation for compression rate control. Furthermore, the quantized representations of the learned code and importance map are still spatially dependent, and can be losslessly compressed using arithmetic coding. To compress the codes effectively and efficiently, we propose a trimmed convolutional network to predict the conditional probability of quantized codes. Experiments show that the proposed method produces visually much better results and performs favorably in comparison with deep and traditional lossy image compression approaches.
Tasks Image Compression
Published 2019-04-01
URL http://arxiv.org/abs/1904.00664v1
PDF http://arxiv.org/pdf/1904.00664v1.pdf
PWC https://paperswithcode.com/paper/learning-content-weighted-deep-image
Repo https://github.com/limuhit/CWIC
Framework caffe2
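
A minimal numpy sketch of the importance-mask idea as described in the abstract (my reading, not the released code): an importance map p in [0, 1] decides how many of the n code channels survive at each spatial location, so the mask's sum serves as a proxy for the bit cost.

```python
# Importance-map-driven channel masking for locally adaptive bit allocation.
import numpy as np

def importance_mask(p, n_channels):
    # Keep ceil(p * n) leading channels at each pixel, zero the rest.
    levels = np.ceil(p * n_channels).astype(int)   # (H, W)
    ch = np.arange(n_channels)[:, None, None]      # (C, 1, 1)
    return (ch < levels[None]).astype(np.float32)  # (C, H, W)

p = np.random.rand(8, 8)                   # importance map from the subnet
mask = importance_mask(p, n_channels=32)
code = np.random.randn(32, 8, 8) * mask    # locally adaptive bit allocation
print(mask.sum(), "~ proxy for the compression rate")
```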

Rectangular Bounding Process

Title Rectangular Bounding Process
Authors Xuhui Fan, Bin Li, Scott Anthony Sisson
Abstract Stochastic partition models divide a multi-dimensional space into a number of rectangular regions, such that the data within each region exhibit certain types of homogeneity. Due to the nature of their partition strategy, existing partition models may create many unnecessary divisions in sparse regions when trying to describe data in dense regions. To avoid this problem we introduce a new parsimonious partition model – the Rectangular Bounding Process (RBP) – to efficiently partition multi-dimensional spaces, by employing a bounding strategy to enclose data points within rectangular bounding boxes. Unlike existing approaches, the RBP possesses several attractive theoretical properties that make it a powerful nonparametric partition prior on a hypercube. In particular, the RBP is self-consistent and as such can be directly extended from a finite hypercube to infinite (unbounded) space. We apply the RBP to regression trees and relational models as a flexible partition prior. The experimental results validate the merit of the RBP in its rich yet parsimonious expressiveness compared to the state-of-the-art methods.
Tasks
Published 2019-03-10
URL http://arxiv.org/abs/1903.03906v1
PDF http://arxiv.org/pdf/1903.03906v1.pdf
PWC https://paperswithcode.com/paper/rectangular-bounding-process-1
Repo https://github.com/xuhuifan/RBP
Framework none
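
A toy sampler in the spirit of the RBP's bounding strategy. Note that the distributions below (Poisson box count, uniform anchors, exponential side lengths) are assumptions made for illustration, not the paper's exact generative process.

```python
# Toy axis-aligned bounding-box sampler on the unit cube.
import numpy as np

rng = np.random.default_rng(0)

def sample_boxes(rate=3.0, length_scale=0.3, d=2):
    k = rng.poisson(rate)                    # number of bounding boxes (assumed)
    lo = rng.uniform(0.0, 1.0, size=(k, d))  # anchor corner of each box
    ln = rng.exponential(length_scale, size=(k, d))
    hi = np.minimum(lo + ln, 1.0)            # truncate boxes to [0, 1]^d
    return lo, hi

lo, hi = sample_boxes()
for a, b in zip(lo, hi):
    print("box", np.round(a, 2), "->", np.round(b, 2))
```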

Episodic Training for Domain Generalization

Title Episodic Training for Domain Generalization
Authors Da Li, Jianshu Zhang, Yongxin Yang, Cong Liu, Yi-Zhe Song, Timothy M. Hospedales
Abstract Domain generalization (DG) is the challenging and topical problem of learning models that generalize to novel testing domains with different statistics than a set of known training domains. The simple approach of aggregating data from all source domains and training a single deep neural network end-to-end on all the data provides a surprisingly strong baseline that surpasses many prior published methods. In this paper, we build on this strong baseline by designing an episodic training procedure that trains a single deep network in a way that exposes it to the domain shift that characterises a novel domain at runtime. Specifically, we decompose a deep network into feature extractor and classifier components, and then train each component by simulating it interacting with a partner who is badly tuned for the current domain. This makes both components more robust, ultimately leading to our networks producing state-of-the-art performance on three DG benchmarks. Furthermore, we consider the pervasive workflow of using an ImageNet trained CNN as a fixed feature extractor for downstream recognition tasks. Using the Visual Decathlon benchmark, we demonstrate that our episodic-DG training improves the performance of such a general-purpose feature extractor by explicitly training a feature for robustness to novel problems. This shows that DG training can benefit standard practice in computer vision.
Tasks Domain Generalization
Published 2019-01-31
URL https://arxiv.org/abs/1902.00113v3
PDF https://arxiv.org/pdf/1902.00113v3.pdf
PWC https://paperswithcode.com/paper/episodic-training-for-domain-generalization
Repo https://github.com/Emma0118/domain-generalization
Framework pytorch
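
One episodic step can be sketched as follows (my paraphrase of the abstract, not the authors' code): domain i's feature extractor is trained through a frozen classifier that was tuned for a different domain j, exposing the features to domain shift.

```python
# One episodic training step with a mismatched, frozen partner classifier.
import torch
import torch.nn.functional as F

def episodic_feature_step(feat_i, clf_j, opt, x, y):
    clf_j.requires_grad_(False)   # badly-matched partner stays frozen
    logits = clf_j(feat_i(x))     # features must survive the mismatch
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()               # gradients flow only into the extractor
    opt.step()
    return loss.item()

feat = torch.nn.Linear(16, 8)     # stand-in feature extractor (domain i)
clf = torch.nn.Linear(8, 3)       # stand-in classifier tuned for domain j
opt = torch.optim.SGD(feat.parameters(), lr=0.1)
x, y = torch.randn(4, 16), torch.tensor([0, 1, 2, 0])
print(episodic_feature_step(feat, clf, opt, x, y))
```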

Principled analytic classifier for positive-unlabeled learning via weighted integral probability metric

Title Principled analytic classifier for positive-unlabeled learning via weighted integral probability metric
Authors Yongchan Kwon, Wonyoung Kim, Masashi Sugiyama, Myunghee Cho Paik
Abstract We consider the problem of learning a binary classifier from only positive and unlabeled observations (called PU learning). Recent studies in PU learning have shown superior performance theoretically and empirically. However, most existing algorithms may not be suitable for large-scale datasets because they face repeated computations of a large Gram matrix or require massive hyperparameter optimization. In this paper, we propose a computationally efficient and theoretically grounded PU learning algorithm. The proposed PU learning algorithm produces a closed-form classifier when the hypothesis space is a closed ball in reproducing kernel Hilbert space. In addition, we establish upper bounds of the estimation error and the excess risk. The obtained estimation error bound is sharper than existing results and the derived excess risk bound has an explicit form, which vanishes as sample sizes increase. Finally, we conduct extensive numerical experiments using both synthetic and real datasets, demonstrating improved accuracy, scalability, and robustness of the proposed algorithm.
Tasks Hyperparameter Optimization
Published 2019-01-28
URL https://arxiv.org/abs/1901.09503v6
PDF https://arxiv.org/pdf/1901.09503v6.pdf
PWC https://paperswithcode.com/paper/an-analytic-formulation-for-positive
Repo https://github.com/eraser347/WMMD_PU
Framework pytorch
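
A hedged numpy sketch of the closed-form flavor of classifier the abstract points to, a kernel mean embedding contrast between the positive and unlabeled sets; the exact weighting in the paper's estimator differs.

```python
# Kernel-mean-contrast PU scoring, a simplified stand-in for the paper's rule.
import numpy as np

def rbf(a, b, gamma=1.0):
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def pu_score(x, x_pos, x_unl, class_prior=0.4):
    s_pos = rbf(x, x_pos).mean(axis=1)  # mean similarity to positives
    s_unl = rbf(x, x_unl).mean(axis=1)  # mean similarity to unlabeled
    # Prior-dependent weighting (assumed form, for illustration only).
    return class_prior * s_pos - (1 - class_prior) * s_unl

rng = np.random.default_rng(1)
x_pos = rng.normal(2.0, 1.0, (50, 2))
x_unl = np.vstack([rng.normal(2.0, 1.0, (20, 2)), rng.normal(-2.0, 1.0, (30, 2))])
x_test = np.array([[2.0, 2.0], [-2.0, -2.0]])
print(np.sign(pu_score(x_test, x_pos, x_unl)))  # ~[+1, -1]
```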

Image-based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era

Title Image-based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era
Authors Xian-Feng Han, Hamid Laga, Mohammed Bennamoun
Abstract 3D reconstruction is a longstanding ill-posed problem, which has been explored for decades by the computer vision, computer graphics, and machine learning communities. Since 2015, image-based 3D reconstruction using convolutional neural networks (CNN) has attracted increasing interest and demonstrated impressive performance. Given this new era of rapid evolution, this article provides a comprehensive survey of the recent developments in this field. We focus on the works that use deep learning techniques to estimate the 3D shape of generic objects either from a single or multiple RGB images. We organize the literature based on the shape representations, the network architectures, and the training mechanisms they use. While this survey is intended for methods which reconstruct generic objects, we also review some of the recent works which focus on specific object classes such as human body shapes and faces. We provide an analysis and comparison of the performance of some key papers, summarize some of the open problems in this field, and discuss promising directions for future research.
Tasks 3D Object Reconstruction, 3D Reconstruction, Object Reconstruction
Published 2019-06-15
URL https://arxiv.org/abs/1906.06543v3
PDF https://arxiv.org/pdf/1906.06543v3.pdf
PWC https://paperswithcode.com/paper/image-based-3d-object-reconstruction-state-of
Repo https://github.com/natowi/3D-Reconstruction-with-Neural-Network
Framework pytorch

Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency

Title Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency
Authors Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry
Abstract Convolutional neural networks (CNNs) are commonly trained using a fixed spatial image size predetermined for a given model. Although trained on images of a specific size, it is well established that CNNs can evaluate a wide range of image sizes at test time by adjusting the size of intermediate feature maps. In this work, we describe and evaluate a novel mixed-size training regime that mixes several image sizes at training time. We demonstrate that models trained using our method are more resilient to image size changes and generalize well even on small images. This allows faster inference by using smaller images at test time. For instance, we obtain 76.43% top-1 accuracy using ResNet50 with an image size of 160, which matches the accuracy of the baseline model with 2x fewer computations. Furthermore, for a given image size used at test time, we show this method can be exploited either to accelerate training or to improve the final test accuracy. For example, we are able to reach 79.27% accuracy with a model evaluated at a 288 spatial size, for a relative improvement of 14% over the baseline.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.08986v1
PDF https://arxiv.org/pdf/1908.08986v1.pdf
PWC https://paperswithcode.com/paper/mix-match-training-convnets-with-mixed-image
Repo https://github.com/vaapopescu/gradient-pruning
Framework pytorch
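
A minimal sketch of a mixed-size regime in this spirit (the size set and per-batch sampling below are assumptions, not the paper's schedule): resample the spatial size of every batch so the network sees many resolutions during training.

```python
# Per-batch random resizing for mixed-size training.
import random
import torch
import torch.nn.functional as F

SIZES = [128, 160, 192, 224]  # assumed size set

def mixed_size_batch(x):
    s = random.choice(SIZES)
    # Bilinear resize of the whole batch to the sampled resolution.
    return F.interpolate(x, size=(s, s), mode="bilinear", align_corners=False)

# Size-agnostic stand-in model: global pooling makes any input size work.
model = torch.nn.Sequential(torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
                            torch.nn.Linear(3, 10))
x = torch.randn(8, 3, 224, 224)
logits = model(mixed_size_batch(x))
print(logits.shape)  # torch.Size([8, 10]) regardless of the sampled size
```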

Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention

Title Deep Floor Plan Recognition Using a Multi-Task Network with Room-Boundary-Guided Attention
Authors Zhiliang Zeng, Xianzhi Li, Ying Kin Yu, Chi-Wing Fu
Abstract This paper presents a new approach to recognize elements in floor plan layouts. Besides walls and rooms, we aim to recognize diverse floor plan elements, such as doors, windows and different types of rooms, in the floor layouts. To this end, we model a hierarchy of floor plan elements and design a deep multi-task neural network with two tasks: one to learn to predict room-boundary elements, and the other to predict rooms with types. More importantly, we formulate the room-boundary-guided attention mechanism in our spatial contextual module to carefully take room-boundary features into account to enhance the room-type predictions. Furthermore, we design a cross-and-within-task weighted loss to balance the multi-label tasks and prepare two new datasets for floor plan recognition. Experimental results demonstrate the superiority and effectiveness of our network over the state-of-the-art methods.
Tasks
Published 2019-08-29
URL https://arxiv.org/abs/1908.11025v1
PDF https://arxiv.org/pdf/1908.11025v1.pdf
PWC https://paperswithcode.com/paper/deep-floor-plan-recognition-using-a-multi
Repo https://github.com/zlzeng/DeepFloorplan
Framework tf
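
A loose sketch of room-boundary-guided attention as the abstract describes it (my simplification, not the released code): the boundary branch's features gate the room-type branch's features before the room-type prediction head.

```python
# Boundary-features-as-attention-gate sketch.
import torch
import torch.nn as nn

class BoundaryGuidedAttention(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # Attention weights derived from room-boundary features.
        self.gate = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Sigmoid())

    def forward(self, room_feat, boundary_feat):
        return room_feat * self.gate(boundary_feat)

attn = BoundaryGuidedAttention()
room = torch.randn(1, 32, 64, 64)       # room-type branch features
boundary = torch.randn(1, 32, 64, 64)   # room-boundary branch features
print(attn(room, boundary).shape)       # torch.Size([1, 32, 64, 64])
```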

Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption

Title Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption
Authors Wei Ma, George H. Chen
Abstract Matrix completion is often applied to data with entries missing not at random (MNAR). For example, consider a recommendation system where users tend to only reveal ratings for items they like. In this case, a matrix completion method that relies on entries being revealed at uniformly sampled row and column indices can yield overly optimistic predictions of unseen user ratings. Recently, various papers have shown that we can reduce this bias in MNAR matrix completion if we know the probabilities of different matrix entries being missing. These probabilities are typically modeled using logistic regression or naive Bayes, which make strong assumptions and lack guarantees on the accuracy of the estimated probabilities. In this paper, we suggest a simple approach to estimating these probabilities that avoids these shortcomings. Our approach follows from the observation that missingness patterns in real data often exhibit low nuclear norm structure. We can then estimate the missingness probabilities by feeding the (always fully-observed) binary matrix specifying which entries are revealed or missing to an existing nuclear-norm-constrained matrix completion algorithm by Davenport et al. [2014]. Thus, we tackle MNAR matrix completion by solving a different matrix completion problem first that recovers missingness probabilities. We establish finite-sample error bounds for how accurate these probability estimates are and how well these estimates debias standard matrix completion losses for the original matrix to be completed. Our experiments show that the proposed debiasing strategy can improve a variety of existing matrix completion algorithms, and achieves downstream matrix completion accuracy at least as good as logistic regression and naive Bayes debiasing baselines that require additional auxiliary information.
Tasks Matrix Completion
Published 2019-10-28
URL https://arxiv.org/abs/1910.12774v2
PDF https://arxiv.org/pdf/1910.12774v2.pdf
PWC https://paperswithcode.com/paper/missing-not-at-random-in-matrix-completion
Repo https://github.com/georgehc/mnar_mc
Framework none
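
The debiasing step reduces to inverse-propensity weighting once the missingness probabilities are in hand. In the sketch below the propensities are generated synthetically; the paper instead estimates them by feeding the binary reveal mask to a 1-bit, nuclear-norm-constrained matrix completion solver.

```python
# Inverse-propensity-weighted matrix completion loss on revealed entries.
import numpy as np

def ipw_mse(X_hat, X_obs, mask, propensity):
    # Down-weight entries that were likely to be revealed anyway.
    w = mask / np.clip(propensity, 1e-3, 1.0)
    return (w * (X_hat - X_obs) ** 2).sum() / mask.size

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 15))
propensity = 1 / (1 + np.exp(-X))       # MNAR: liked entries revealed more
mask = (rng.uniform(size=X.shape) < propensity).astype(float)
X_obs = np.where(mask > 0, X, 0.0)
print(ipw_mse(np.zeros_like(X), X_obs, mask, propensity))
```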

cGANs with Multi-Hinge Loss

Title cGANs with Multi-Hinge Loss
Authors Ilya Kavalerov, Wojciech Czaja, Rama Chellappa
Abstract We propose a new algorithm to incorporate class conditional information into the discriminator of GANs via a multi-class generalization of the commonly used hinge loss. Our approach is in contrast to most GAN frameworks in that we train a single classifier for K+1 classes with one loss function, instead of a real/fake discriminator or a discriminator-classifier pair. We show that learning a single good classifier and a single state-of-the-art generator simultaneously is possible in supervised and semi-supervised settings. With our multi-hinge loss modification we were able to improve the state-of-the-art CIFAR10 IS & FID to 9.58 & 6.40, CIFAR100 IS & FID to 14.36 & 13.32, and STL10 IS & FID to 12.16 & 17.44. The code written with PyTorch is available at https://github.com/ilyakava/BigGAN-PyTorch.
Tasks Conditional Image Generation
Published 2019-12-09
URL https://arxiv.org/abs/1912.04216v1
PDF https://arxiv.org/pdf/1912.04216v1.pdf
PWC https://paperswithcode.com/paper/cgans-with-multi-hinge-loss
Repo https://github.com/ilyakava/BigGAN-PyTorch
Framework pytorch
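
One plausible reading of a K+1-class multi-hinge discriminator loss (the exact margin form in the paper may differ; this is a standard multi-class hinge with a reserved fake class):

```python
# Multi-class hinge loss over K real classes plus one fake class.
import torch

def multi_hinge(scores, target):
    # scores: (B, K+1) class scores; target: (B,) class index (K = fake).
    true = scores.gather(1, target[:, None])   # (B, 1) true-class score
    margin = 1.0 + scores - true               # hinge margin vs. every class
    margin.scatter_(1, target[:, None], 0.0)   # exclude the true class
    return margin.clamp(min=0).sum(dim=1).mean()

scores = torch.randn(4, 11)               # K=10 real classes + 1 fake class
target = torch.tensor([0, 3, 10, 10])     # two real samples, two fakes
print(multi_hinge(scores, target))
```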

Generative Pre-Training for Speech with Autoregressive Predictive Coding

Title Generative Pre-Training for Speech with Autoregressive Predictive Coding
Authors Yu-An Chung, James Glass
Abstract Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging. In this paper we propose to use autoregressive predictive coding (APC), a recently proposed self-supervised objective, as a generative pre-training approach for learning meaningful, non-specific, and transferable speech representations. We pre-train APC on large-scale unlabeled data and conduct transfer learning experiments on three speech applications that require different information about speech characteristics to perform well: speech recognition, speech translation, and speaker identification. Extensive experiments show that APC not only outperforms surface features (e.g., log Mel spectrograms) and other popular representation learning methods on all three tasks, but is also effective at reducing downstream labeled data size and model parameters. We also investigate the use of Transformers for modeling APC and find it superior to RNNs.
Tasks Representation Learning, Speaker Identification, Speech Recognition, Transfer Learning
Published 2019-10-23
URL https://arxiv.org/abs/1910.12607v2
PDF https://arxiv.org/pdf/1910.12607v2.pdf
PWC https://paperswithcode.com/paper/generative-pre-training-for-speech-with
Repo https://github.com/iamyuanchung/Autoregressive-Predictive-Coding
Framework pytorch
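
The APC objective itself is compact: predict the log-Mel frame n steps ahead with an autoregressive model under an L1 loss. A bare-bones sketch follows (the hyperparameters are assumptions):

```python
# Autoregressive predictive coding: predict the frame `shift` steps ahead.
import torch
import torch.nn as nn

class APC(nn.Module):
    def __init__(self, n_mels=80, hidden=512, shift=3):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_mels)
        self.shift = shift  # predict `shift` frames into the future

    def forward(self, mel):                # mel: (B, T, n_mels)
        h, _ = self.rnn(mel[:, :-self.shift])
        pred = self.head(h)                # predictions for frames t+shift
        return torch.nn.functional.l1_loss(pred, mel[:, self.shift:])

mel = torch.randn(2, 100, 80)   # stand-in log-Mel batch
print(APC()(mel))               # pre-training loss on unlabeled speech
```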