February 1, 2020

Paper Group AWR 298

A Simple Baseline for Bayesian Uncertainty in Deep Learning. Causal Reasoning from Meta-reinforcement Learning. Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search. A Global-Local Emebdding Module for Fashion Landmark Detection. Cross-Lingual Machine Reading Comprehension. A Comprehensive guide to Bayesian Convolutional …

A Simple Baseline for Bayesian Uncertainty in Deep Learning


Title	A Simple Baseline for Bayesian Uncertainty in Deep Learning
Authors	Wesley Maddox, Timur Garipov, Pavel Izmailov, Dmitry Vetrov, Andrew Gordon Wilson
Abstract	We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization in deep learning. With SWAG, we fit a Gaussian using the SWA solution as the first moment and a low rank plus diagonal covariance also derived from the SGD iterates, forming an approximate posterior distribution over neural network weights; we then sample from this Gaussian distribution to perform Bayesian model averaging. We empirically find that SWAG approximates the shape of the true posterior, in accordance with results describing the stationary distribution of SGD iterates. Moreover, we demonstrate that SWAG performs well on a wide variety of tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, SGLD, and temperature scaling.
Tasks	Bayesian Inference, Calibration, Transfer Learning
Published	2019-02-07
URL	https://arxiv.org/abs/1902.02476v2
PDF	https://arxiv.org/pdf/1902.02476v2.pdf
PWC	https://paperswithcode.com/paper/a-simple-baseline-for-bayesian-uncertainty-in
Repo	https://github.com/SamuelGuilluy/Bayesian_ML_SWAG
Framework	pytorch
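
The moment-matching idea in the SWAG abstract can be illustrated with a minimal sketch. It covers only a diagonal covariance (the low-rank term mentioned in the abstract is omitted), uses plain Python lists in place of network weights, and the names `swag_diag_moments`, `sample_weights` and `bayesian_model_average` are hypothetical, not taken from the linked repository.

```python
import math
import random


def swag_diag_moments(iterates):
    """First moment and diagonal variance of a list of SGD weight vectors."""
    n = len(iterates)
    dim = len(iterates[0])
    mean = [sum(w[i] for w in iterates) / n for i in range(dim)]
    sq_mean = [sum(w[i] ** 2 for w in iterates) / n for i in range(dim)]
    # Variance = E[w^2] - E[w]^2, clamped at zero for numerical safety.
    var = [max(s - m * m, 0.0) for m, s in zip(mean, sq_mean)]
    return mean, var


def sample_weights(mean, var, rng):
    """Draw one weight vector from the fitted diagonal Gaussian."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]


def bayesian_model_average(predict, weight_samples, x):
    """Average a scalar prediction over sampled weight vectors."""
    preds = [predict(w, x) for w in weight_samples]
    return sum(preds) / len(preds)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "SGD iterates" for a two-parameter linear model (assumed data).
    iterates = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]
    mean, var = swag_diag_moments(iterates)
    samples = [sample_weights(mean, var, rng) for _ in range(30)]
    linear = lambda w, x: w[0] * x + w[1]
    print(bayesian_model_average(linear, samples, 1.0))
```

In practice the moments would be accumulated online during training rather than from a stored list of iterates; the list form is used here only to keep the sketch short.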

Causal Reasoning from Meta-reinforcement Learning


Title	Causal Reasoning from Meta-reinforcement Learning
Authors	Ishita Dasgupta, Jane Wang, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, David Raposo, Edward Hughes, Peter Battaglia, Matthew Botvinick, Zeb Kurth-Nelson
Abstract	Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents. Here we explore whether causal reasoning can emerge via meta-reinforcement learning. We train a recurrent network with model-free reinforcement learning to solve a range of problems that each contain causal structure. We find that the trained agent can perform causal reasoning in novel situations in order to obtain rewards. The agent can select informative interventions, draw causal inferences from observational data, and make counterfactual predictions. Although established formal causal reasoning algorithms also exist, in this paper we show that such reasoning can arise from model-free reinforcement learning, and suggest that causal reasoning in complex settings may benefit from the more end-to-end learning-based approaches presented here. This work also offers new strategies for structured exploration in reinforcement learning, by providing agents with the ability to perform – and interpret – experiments.
Tasks
Published	2019-01-23
URL	http://arxiv.org/abs/1901.08162v1
PDF	http://arxiv.org/pdf/1901.08162v1.pdf
PWC	https://paperswithcode.com/paper/causal-reasoning-from-meta-reinforcement
Repo	https://github.com/kantneel/causal-metarl
Framework	none

Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search


Title	Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search
Authors	Xiangxiang Chu, Tianbao Zhou, Bo Zhang, Jixiang Li
Abstract	Differentiable Architecture Search (DARTS) is now a widely disseminated weight-sharing neural architecture search method. However, it suffers from well-known performance collapse due to an inevitable aggregation of skip connections. In this paper, we first disclose that its root cause lies in an unfair advantage in exclusive competition. Through experiments, we show that if either of two conditions is broken, the collapse disappears. Thereby, we present a novel approach called Fair DARTS where the exclusive competition is relaxed to be collaborative. Specifically, we let each operation’s architectural weight be independent of others. Yet there is still an important issue of discretization discrepancy. We then propose a zero-one loss to push architectural weights towards zero or one, which approximates an expected multi-hot solution. Our experiments are performed on two mainstream search spaces, and we derive new state-of-the-art results on CIFAR-10 and ImageNet. Our code is available on https://github.com/xiaomi-automl/fairdarts .
Tasks	AutoML, Neural Architecture Search
Published	2019-11-27
URL	https://arxiv.org/abs/1911.12126v2
PDF	https://arxiv.org/pdf/1911.12126v2.pdf
PWC	https://paperswithcode.com/paper/fair-darts-eliminating-unfair-advantages-in
Repo	https://github.com/xiaomi-automl/fairdarts
Framework	pytorch
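
The contrast between exclusive and collaborative competition, and the push of architectural weights towards zero or one, can be sketched as follows. This is an illustrative sketch under assumptions: sigmoid activations stand in for "independent" weights, softmax for the exclusive baseline, and the squared-distance-from-0.5 form of the auxiliary loss is one function with the described effect, not necessarily the exact one used in the linked code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Exclusive competition: weights are coupled and sum to one."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def independent_weights(alphas):
    """Collaborative setting: each operation gets its own sigmoid weight."""
    return [sigmoid(a) for a in alphas]


def zero_one_loss(alphas):
    """Auxiliary loss that is lowest when every sigmoid weight is near 0 or 1."""
    return -sum((sigmoid(a) - 0.5) ** 2 for a in alphas) / len(alphas)


if __name__ == "__main__":
    alphas = [2.0, 0.5, -1.0]  # hypothetical architecture parameters
    print(softmax(alphas))
    print(independent_weights(alphas))
    print(zero_one_loss(alphas))
```

Raising one entry of `alphas` lowers every other softmax weight but leaves the other sigmoid weights unchanged, which is the sense in which the competition is relaxed.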

A Global-Local Emebdding Module for Fashion Landmark Detection


Title	A Global-Local Emebdding Module for Fashion Landmark Detection
Authors	Sumin Lee, Sungchan Oh, Chanho Jung, Changick Kim
Abstract	Detecting fashion landmarks is a fundamental technique for visual clothing analysis. Due to the large variation and non-rigid deformation of clothes, localizing fashion landmarks suffers from large spatial variances across poses, scales, and styles. Therefore, understanding contextual knowledge of clothes is required for accurate landmark detection. To that end, in this paper, we propose a fashion landmark detection network with a global-local embedding module. The global-local embedding module is based on a non-local operation for capturing long-range dependencies and a subsequent convolution operation for adopting local neighborhood relations. With this processing, the network can consider both global and local contextual knowledge for a clothing image. We demonstrate that our proposed method has an excellent ability to learn advanced deep feature representations for fashion landmark detection. Experimental results on two benchmark datasets show that the proposed network outperforms the state-of-the-art methods. Our code is available at https://github.com/shumming/GLE_FLD.
Tasks
Published	2019-08-28
URL	https://arxiv.org/abs/1908.10548v1
PDF	https://arxiv.org/pdf/1908.10548v1.pdf
PWC	https://paperswithcode.com/paper/a-global-local-emebdding-module-for-fashion
Repo	https://github.com/shumming/GLE_FLD
Framework	pytorch

Cross-Lingual Machine Reading Comprehension


Title	Cross-Lingual Machine Reading Comprehension
Authors	Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu
Abstract	Though the community has made great progress on the Machine Reading Comprehension (MRC) task, most previous works solve English-based MRC problems, and there are few efforts on other languages, mainly due to the lack of large-scale training data. In this paper, we propose the Cross-Lingual Machine Reading Comprehension (CLMRC) task for languages other than English. Firstly, we present several back-translation approaches for the CLMRC task, which are straightforward to adopt. However, accurately aligning the answer into another language is difficult and could introduce additional noise. In this context, we propose a novel model called Dual BERT, which takes advantage of the large-scale training data provided by a rich-resource language (such as English), learns the semantic relations between the passage and question in a bilingual context, and then utilizes the learned knowledge to improve reading comprehension performance in a low-resource language. We conduct experiments on two Chinese machine reading comprehension datasets, CMRC 2018 and DRCD. The results show consistent and significant improvements over various state-of-the-art systems by a large margin, which demonstrates the potential of the CLMRC task. Resources available: https://github.com/ymcui/Cross-Lingual-MRC
Tasks	Machine Reading Comprehension, Reading Comprehension
Published	2019-09-01
URL	https://arxiv.org/abs/1909.00361v1
PDF	https://arxiv.org/pdf/1909.00361v1.pdf
PWC	https://paperswithcode.com/paper/cross-lingual-machine-reading-comprehension
Repo	https://github.com/ymcui/Cross-Lingual-MRC
Framework	tf

A Comprehensive guide to Bayesian Convolutional Neural Network with Variational Inference


Title	A Comprehensive guide to Bayesian Convolutional Neural Network with Variational Inference
Authors	Kumar Shridhar, Felix Laumann, Marcus Liwicki
Abstract	Artificial Neural Networks are connectionist systems that perform a given task by learning on examples without having prior knowledge about the task. This is done by finding an optimal point estimate for the weights in every node. Generally, networks using point estimates as weights perform well with large datasets, but they fail to express uncertainty in regions with little or no data, leading to overconfident decisions. In this paper, a Bayesian Convolutional Neural Network (BayesCNN) using Variational Inference is proposed, which introduces a probability distribution over the weights. Furthermore, the proposed BayesCNN architecture is applied to tasks like Image Classification, Image Super-Resolution and Generative Adversarial Networks. The results are compared to point-estimate-based architectures on the MNIST, CIFAR-10 and CIFAR-100 datasets for the Image Classification task, on the BSD300 dataset for the Image Super-Resolution task and on the CIFAR-10 dataset again for the Generative Adversarial Network task. BayesCNN is based on Bayes by Backprop, which derives a variational approximation to the true posterior. We, therefore, introduce the idea of applying two convolutional operations, one for the mean and one for the variance. Our proposed method not only achieves performance equivalent to frequentist inference in identical architectures but also incorporates a measurement for uncertainties and regularisation. It further eliminates the use of dropout in the model. Moreover, we predict how certain the model prediction is based on the epistemic and aleatoric uncertainties and empirically show how the uncertainty can decrease, allowing the decisions made by the network to become more deterministic as the training accuracy increases. Finally, we propose ways to prune the Bayesian architecture and to make it more computationally and time efficient.
Tasks	Bayesian Inference, Image Classification, Image Super-Resolution, Super-Resolution
Published	2019-01-08
URL	http://arxiv.org/abs/1901.02731v1
PDF	http://arxiv.org/pdf/1901.02731v1.pdf
PWC	https://paperswithcode.com/paper/a-comprehensive-guide-to-bayesian
Repo	https://github.com/kumar-shridhar/PyTorch-BayesianCNN
Framework	pytorch
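
The idea of applying two convolutions, one for the mean and one for the variance, can be sketched in one dimension. This is a simplified illustration assuming independent Gaussian weights with means `w_mu` and standard deviations `w_sigma`; the function names are hypothetical and the sketch is plain Python rather than the PyTorch code in the linked repository.

```python
import math
import random


def conv1d(x, w):
    """Valid 1-D cross-correlation of signal x with kernel w."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]


def bayes_conv1d(x, w_mu, w_sigma, rng):
    """Sample layer activations from the distribution induced by Gaussian weights."""
    # First convolution: mean of the activations.
    act_mean = conv1d(x, w_mu)
    # Second convolution: variance of the activations, using squared inputs
    # and squared weight standard deviations.
    act_var = conv1d([v * v for v in x], [s * s for s in w_sigma])
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(act_mean, act_var)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 2.0, 3.0, 4.0]
    print(bayes_conv1d(x, w_mu=[0.5, -0.5], w_sigma=[0.1, 0.1], rng=rng))
```

Setting every entry of `w_sigma` to zero recovers an ordinary deterministic convolution, which is a convenient sanity check.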

Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids


Title	Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids
Authors	Despoina Paschalidou, Ali Osman Ulusoy, Andreas Geiger
Abstract	Abstracting complex 3D shapes with parsimonious part-based representations has been a long-standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computationally expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.
Tasks
Published	2019-04-22
URL	http://arxiv.org/abs/1904.09970v1
PDF	http://arxiv.org/pdf/1904.09970v1.pdf
PWC	https://paperswithcode.com/paper/superquadrics-revisited-learning-3d-shape
Repo	https://github.com/paschalidoud/superquadric_parsing
Framework	pytorch
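
For readers unfamiliar with superquadrics, the standard implicit (inside-outside) function gives a feel for why they are more expressive than cuboids. The sketch below assumes an axis-aligned superquadric centred at the origin with sizes `(a1, a2, a3)` and shape exponents `(e1, e2)`; it does not reproduce the paper's network or its Chamfer loss, and `superquadric_f` is a hypothetical name.

```python
def superquadric_f(point, size, shape):
    """Inside-outside function: < 1 inside, 1 on the surface, > 1 outside."""
    # Normalise each coordinate by the size along its axis.
    x, y, z = (abs(p) / a for p, a in zip(point, size))
    e1, e2 = shape
    return (x ** (2 / e2) + y ** (2 / e2)) ** (e2 / e1) + z ** (2 / e1)


if __name__ == "__main__":
    corner = (0.9, 0.9, 0.9)
    # Shape (1, 1) is an ellipsoid; small exponents approach a box.
    print(superquadric_f(corner, (1, 1, 1), (1.0, 1.0)))
    print(superquadric_f(corner, (1, 1, 1), (0.1, 0.1)))
```

The same point near a corner lies outside the unit sphere but inside the box-like shape, so a single parametric family covers both.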

Efficiently Checking Actual Causality with SAT Solving


Title	Efficiently Checking Actual Causality with SAT Solving
Authors	Amjad Ibrahim, Simon Rehwald, Alexander Pretschner
Abstract	Recent formal approaches towards causality have made the concept ready for incorporation into the technical world. However, causality reasoning is computationally hard; and no general algorithmic approach exists that efficiently infers the causes for effects. Thus, checking causality in the context of complex, multi-agent, and distributed socio-technical systems is a significant challenge. Therefore, we conceptualize an intelligent and novel algorithmic approach towards checking causality in acyclic causal models with binary variables, utilizing the optimization power in the solvers of the Boolean Satisfiability Problem (SAT). We present two SAT encodings, and an empirical evaluation of their efficiency and scalability. We show that causality is computed efficiently in less than 5 seconds for models that consist of more than 4000 variables.
Tasks
Published	2019-04-30
URL	http://arxiv.org/abs/1904.13101v1
PDF	http://arxiv.org/pdf/1904.13101v1.pdf
PWC	https://paperswithcode.com/paper/efficiently-checking-actual-causality-with
Repo	https://github.com/amjadKhalifah/HP2SAT1.0
Framework	none
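
To make the setting concrete, the sketch below evaluates a tiny acyclic causal model with binary variables and runs the simplest counterfactual ("but-for") test by brute force. It is not the paper's SAT encoding and does not implement the full definition of actual causality; the model, variable names and helper functions are all assumptions for illustration.

```python
def evaluate(equations, exogenous, interventions=None):
    """Evaluate structural equations (given in topological order)."""
    interventions = interventions or {}
    values = dict(exogenous)
    for var, fn in equations:
        # An intervened variable ignores its equation.
        values[var] = interventions[var] if var in interventions else fn(values)
    return values


def is_but_for_cause(equations, exogenous, cause, effect):
    """True if flipping `cause` alone changes the value of `effect`."""
    actual = evaluate(equations, exogenous)
    flipped = evaluate(equations, exogenous, {cause: 1 - actual[cause]})
    return flipped[effect] != actual[effect]


def forest_fire_model(combine):
    """Hypothetical model: fire depends on lightning (L) and a dropped match (MD)."""
    return [
        ("L", lambda v: v["U_L"]),
        ("MD", lambda v: v["U_MD"]),
        ("FF", lambda v: combine(v["L"], v["MD"])),
    ]


if __name__ == "__main__":
    context = {"U_L": 1, "U_MD": 1}
    conjunctive = forest_fire_model(lambda a, b: a & b)
    disjunctive = forest_fire_model(lambda a, b: a | b)
    print(is_but_for_cause(conjunctive, context, "L", "FF"))
    print(is_but_for_cause(disjunctive, context, "L", "FF"))
```

In the disjunctive model neither factor passes the but-for test on its own, which is exactly the kind of case that requires searching over contingencies and motivates handing the search to a solver.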

Density Encoding Enables Resource-Efficient Randomly Connected Neural Networks


Title	Density Encoding Enables Resource-Efficient Randomly Connected Neural Networks
Authors	Denis Kleyko, Mansour Kheffache, E. Paxon Frady, Urban Wiklund, Evgeny Osipov
Abstract	The deployment of machine learning algorithms on resource-constrained edge devices is an important challenge from both theoretical and applied points of view. In this article, we focus on resource-efficient randomly connected neural networks known as Random Vector Functional Link (RVFL) networks since their simple design and extremely fast training time make them very attractive for solving many applied classification tasks. We propose to represent input features via the density-based encoding known in the area of stochastic computing and use the operations of binding and bundling from the area of hyperdimensional computing for obtaining the activations of the hidden neurons. Using a collection of 121 real-world datasets from the UCI Machine Learning Repository, we empirically show that the proposed approach demonstrates higher average accuracy than the conventional RVFL. We also demonstrate that it is possible to represent the readout matrix using only integers in a limited range with minimal loss in the accuracy. In this case, the proposed approach operates only on small n-bits integers, which results in a computationally efficient architecture. Finally, through hardware FPGA implementations, we show that such an approach consumes approximately eleven times less energy than that of the conventional RVFL.
Tasks
Published	2019-09-19
URL	https://arxiv.org/abs/1909.09153v1
PDF	https://arxiv.org/pdf/1909.09153v1.pdf
PWC	https://paperswithcode.com/paper/density-encoding-enables-resource-efficient
Repo	https://github.com/sweetwenwen/Stochastic-computing-based-neural-network-accelerator
Framework	none
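
The three ingredients named in the abstract (density-based encoding, binding, bundling) can be sketched with small integer vectors. The conventions here are assumptions: features are scaled to [0, 1], encoded as bipolar vectors whose number of +1 entries grows with the value, bound by element-wise multiplication, bundled by element-wise addition, and limited to a small integer range by a clipping step. The function names are hypothetical.

```python
def density_encode(value, n):
    """Encode a value in [0, 1] as n bipolar entries; more +1s means larger value."""
    level = round(value * n)
    return [1] * level + [-1] * (n - level)


def bind(a, b):
    """Binding: element-wise multiplication of two bipolar vectors."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling: element-wise sum over a list of vectors."""
    return [sum(col) for col in zip(*vectors)]


def clip(vec, kappa):
    """Keep activations in the integer range [-kappa, kappa]."""
    return [max(-kappa, min(kappa, v)) for v in vec]


if __name__ == "__main__":
    features = [0.25, 0.75]
    keys = [[1, -1, 1, -1], [-1, -1, 1, 1]]  # assumed fixed random keys, one per feature
    bound = [bind(k, density_encode(f, 4)) for k, f in zip(keys, features)]
    print(clip(bundle(bound), kappa=1))
```

Every step operates on small integers, which is the property the abstract links to efficient hardware implementations.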

Connecting Vision and Language with Localized Narratives


Title	Connecting Vision and Language with Localized Narratives
Authors	Jordi Pont-Tuset, Jasper Uijlings, Soravit Changpinyo, Radu Soricut, Vittorio Ferrari
Abstract	This paper proposes Localized Narratives, a new form of multimodal image annotations connecting vision and language. We ask annotators to describe an image with their voice while simultaneously hovering their mouse over the region they are describing. Since the voice and the mouse pointer are synchronized, we can localize every single word in the description. This dense visual grounding takes the form of a mouse trace segment per word and is unique to our data. We annotate 628k images with Localized Narratives: the whole COCO dataset and 504k images of the Open Images dataset, which we make publicly available. We provide an extensive analysis of these annotations showing they are diverse, accurate, and efficient to produce. We also demonstrate their utility on the application of controlled image captioning.
Tasks	Image Captioning, Image Generation
Published	2019-12-06
URL	https://arxiv.org/abs/1912.03098v3
PDF	https://arxiv.org/pdf/1912.03098v3.pdf
PWC	https://paperswithcode.com/paper/connecting-vision-and-language-with-localized
Repo	https://github.com/google/localized-narratives
Framework	none

Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation


Title	Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation
Authors	Amit Moryossef, Ido Dagan, Yoav Goldberg
Abstract	We follow the step-by-step approach to neural data-to-text generation we proposed in Moryossef et al (2019), in which the generation process is divided into a text-planning stage followed by a plan-realization stage. We suggest four extensions to that framework: (1) we introduce a trainable neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model’s ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts; (4) we incorporate a simple but effective referring expression generation module. These extensions result in a generation process that is faster, more fluent, and more accurate.
Tasks	Data-to-Text Generation, Text Generation
Published	2019-09-22
URL	https://arxiv.org/abs/1909.09986v1
PDF	https://arxiv.org/pdf/1909.09986v1.pdf
PWC	https://paperswithcode.com/paper/190909986
Repo	https://github.com/AmitMY/chimera
Framework	none

Real Image Denoising with Feature Attention


Title	Real Image Denoising with Feature Attention
Authors	Saeed Anwar, Nick Barnes
Abstract	Deep convolutional neural networks perform better on images containing spatially invariant noise (synthetic noise); however, their performance is limited on real-noisy photographs and requires multiple-stage network modeling. To advance the practicability of denoising algorithms, this paper proposes a novel single-stage blind real image denoising network (RIDNet) by employing a modular architecture. We use a residual on the residual structure to ease the flow of low-frequency information and apply feature attention to exploit the channel dependencies. Furthermore, the evaluation in terms of quantitative metrics and visual quality on three synthetic and four real noisy datasets against 19 state-of-the-art algorithms demonstrates the superiority of our RIDNet.
Tasks	Denoising, Image Denoising
Published	2019-04-16
URL	https://arxiv.org/abs/1904.07396v2
PDF	https://arxiv.org/pdf/1904.07396v2.pdf
PWC	https://paperswithcode.com/paper/real-image-denoising-with-feature-attention
Repo	https://github.com/saeed-anwar/RIDNet
Framework	pytorch

AI-IMU Dead-Reckoning


Title	AI-IMU Dead-Reckoning
Authors	Martin Brossard, Axel Barrau, Silvère Bonnabel
Abstract	In this paper we propose a novel accurate method for dead-reckoning of wheeled vehicles based only on an Inertial Measurement Unit (IMU). In the context of intelligent vehicles, robust and accurate dead-reckoning based on the IMU may prove useful to correlate feeds from imaging sensors, to safely navigate through obstructions, or for safe emergency stops in the extreme case of exteroceptive sensor failure. The key components of the method are the Kalman filter and the use of deep neural networks to dynamically adapt the noise parameters of the filter. The method is tested on the KITTI odometry dataset, and our dead-reckoning inertial method based only on the IMU accurately estimates the 3D position, velocity, and orientation of the vehicle and self-calibrates the IMU biases. We achieve on average a 1.10% translational error and the algorithm competes with top-ranked methods which, by contrast, use LiDAR or stereo vision. We make our implementation open-source at: https://github.com/mbrossar/ai-imu-dr
Tasks	Dead-Reckoning Prediction
Published	2019-04-12
URL	http://arxiv.org/abs/1904.06064v1
PDF	http://arxiv.org/pdf/1904.06064v1.pdf
PWC	https://paperswithcode.com/paper/ai-imu-dead-reckoning
Repo	https://github.com/mbrossar/ai-imu-dr
Framework	pytorch
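
Why adapting the noise parameters matters can be seen in a scalar Kalman measurement update. This is a textbook one-dimensional sketch, not the filter used in the linked implementation; the measurement noise `r` is passed in by hand here, whereas the abstract describes a neural network adapting such parameters dynamically.

```python
def kalman_update(x, p, z, r):
    """Scalar Kalman update: state x, variance p, measurement z, noise variance r."""
    k = p / (p + r)          # Kalman gain
    x_new = x + k * (z - x)  # move the estimate towards the measurement
    p_new = (1.0 - k) * p    # the uncertainty shrinks after the update
    return x_new, p_new


if __name__ == "__main__":
    # The same measurement is trusted less as the assumed noise grows.
    for r in (0.01, 1.0, 100.0):
        print(r, kalman_update(x=0.0, p=1.0, z=2.0, r=r))
```

A small `r` pulls the estimate almost all the way to the measurement, while a large `r` leaves it close to the prior, so choosing `r` well per time step directly controls how much each observation is trusted.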

Quadratization in discrete optimization and quantum mechanics


Title	Quadratization in discrete optimization and quantum mechanics
Authors	Nike Dattani
Abstract	A book about turning high-degree optimization problems into quadratic optimization problems that maintain the same global minimum (ground state). This book explores quadratizations for pseudo-Boolean optimization, perturbative gadgets used in QMA completeness theorems, and also non-perturbative k-local to 2-local transformations used for quantum mechanics, quantum annealing and universal adiabatic quantum computing. The book contains ~70 different Hamiltonian transformations, each of them on a separate page, where the cost (in number of auxiliary binary variables or auxiliary qubits, or number of sub-modular terms, or in graph connectivity, etc.), pros, cons, examples, and references are given. One can therefore look up a quadratization appropriate for the specific term(s) that need to be quadratized, much like using an integral table to look up the integral that needs to be done. This book is therefore useful for writing compilers to transform general optimization problems, into a form that quantum annealing or universal adiabatic quantum computing hardware requires; or for transforming quantum chemistry problems written in the Jordan-Wigner or Bravyi-Kitaev form, into a form where all multi-qubit interactions become 2-qubit pairwise interactions, without changing the desired ground state. Applications cited include computer vision problems (e.g. image de-noising, un-blurring, etc.), number theory (e.g. integer factoring), graph theory (e.g. Ramsey number determination), and quantum chemistry. The book is open source, and anyone can make modifications here: https://github.com/HPQC-LABS/Book_About_Quadratization.
Tasks
Published	2019-01-14
URL	https://arxiv.org/abs/1901.04405v2
PDF	https://arxiv.org/pdf/1901.04405v2.pdf
PWC	https://paperswithcode.com/paper/quadratization-in-discrete-optimization-and
Repo	https://github.com/HPQC-LABS/Book_About_Quadratization
Framework	none
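
As a concrete instance of a quadratization, the sketch below reduces the cubic term x1*x2*x3 to a quadratic function by substituting an auxiliary binary variable y for the product x1*x2 and adding a penalty that vanishes only when y equals x1*x2 (the classic substitution approach usually attributed to Rosenberg). It was chosen for illustration and is checked by brute force; the function names and the default penalty strength are assumptions.

```python
from itertools import product


def rosenberg_penalty(x1, x2, y):
    """Zero when y == x1 * x2, at least 1 otherwise (quadratic in the variables)."""
    return x1 * x2 - 2 * x1 * y - 2 * x2 * y + 3 * y


def quadratized(x1, x2, x3, y, strength=1):
    """Quadratic stand-in for the cubic term x1 * x2 * x3."""
    return y * x3 + strength * rosenberg_penalty(x1, x2, y)


def check_equivalence():
    """Minimising over y reproduces the cubic term for all 8 assignments."""
    return all(
        min(quadratized(x1, x2, x3, y) for y in (0, 1)) == x1 * x2 * x3
        for x1, x2, x3 in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print(check_equivalence())
```

The cost of this transformation is one auxiliary variable per substituted product, which is the kind of trade-off the book tabulates for each transformation.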

Hierarchical Importance Weighted Autoencoders


Title	Hierarchical Importance Weighted Autoencoders
Authors	Chin-Wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville
Abstract	Importance weighted variational inference (Burda et al., 2015) uses multiple i.i.d. samples to have a tighter variational lower bound. We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation. The hope is that the proposals would coordinate to make up for the error made by one another to reduce the variance of the importance estimator. Theoretically, we analyze the condition under which convergence of the estimator variance can be connected to convergence of the lower bound. Empirically, we confirm that maximization of the lower bound does implicitly minimize variance. Further analysis shows that this is a result of negative correlation induced by the proposed hierarchical meta sampling scheme, and performance of inference also improves when the number of samples increases.
Tasks
Published	2019-05-13
URL	https://arxiv.org/abs/1905.04866v1
PDF	https://arxiv.org/pdf/1905.04866v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-importance-weighted-autoencoders
Repo	https://github.com/CW-Huang/HIWAE
Framework	pytorch
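
The tighter bound obtained from multiple samples can be computed from per-sample log importance weights. The sketch below shows only the standard importance weighted bound with i.i.d. samples, evaluated with a numerically stable log-mean-exp; it does not implement the hierarchical proposal described in the abstract, and the log weights are made-up numbers.

```python
import math


def iw_bound(log_weights):
    """log of the mean importance weight, computed stably via log-sum-exp."""
    m = max(log_weights)
    mean_scaled = sum(math.exp(lw - m) for lw in log_weights) / len(log_weights)
    return m + math.log(mean_scaled)


def single_sample_bound(log_weights):
    """Average of the log weights, i.e. the usual one-sample lower bound estimate."""
    return sum(log_weights) / len(log_weights)


if __name__ == "__main__":
    log_w = [-1.0, -3.0, -2.0]  # hypothetical log p(x, z_k) - log q(z_k | x)
    print(single_sample_bound(log_w), iw_bound(log_w))
```

By Jensen's inequality the multi-sample value is never below the average of the log weights, and it is dominated by the largest weights, which is why reducing the variance of the weights is the focus of the paper.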