Paper Group AWR 338
Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation
Title | Does It Make Sense? And Why? A Pilot Study for Sense Making and Explanation |
Authors | Cunxiang Wang, Shuailong Liang, Yue Zhang, Xiaonan Li, Tian Gao |
Abstract | Introducing common sense to natural language understanding systems has received increasing research attention. It remains a fundamental question how to evaluate whether a system has the capability of sense making. Existing benchmarks measure commonsense knowledge indirectly and without explanation. In this paper, we release a benchmark to directly test whether a system can differentiate natural language statements that make sense from those that do not. In addition, a system is asked to identify the most crucial reason why a statement does not make sense. We evaluate models trained on large-scale language modeling tasks as well as human performance, showing that there are different challenges for system sense making. |
Tasks | Common Sense Reasoning, Language Modelling |
Published | 2019-06-02 |
URL | https://arxiv.org/abs/1906.00363v1 |
https://arxiv.org/pdf/1906.00363v1.pdf | |
PWC | https://paperswithcode.com/paper/190600363 |
Repo | https://github.com/wangcunxiang/Sen-Making-and-Explanation |
Framework | tf |
Weakly Supervised Energy-Based Learning for Action Segmentation
Title | Weakly Supervised Energy-Based Learning for Action Segmentation |
Authors | Jun Li, Peng Lei, Sinisa Todorovic |
Abstract | This paper is about labeling video frames with action classes under weak supervision in training, where we have access to a temporal ordering of actions, but their start and end frames in training videos are unknown. Following prior work, we use an HMM grounded on a Gated Recurrent Unit (GRU) for frame labeling. Our key contribution is a new constrained discriminative forward loss (CDFL) that we use for training the HMM and GRU under weak supervision. While prior work typically estimates the loss on a single, inferred video segmentation, our CDFL discriminates between the energy of all valid and invalid frame labelings of a training video. A valid frame labeling satisfies the ground-truth temporal ordering of actions, whereas an invalid one violates the ground truth. We specify an efficient recursive algorithm for computing the CDFL in terms of the logadd function of the segmentation energy. Our evaluation on action segmentation and alignment gives superior results to those of the state of the art on the benchmark Breakfast Action, Hollywood Extended, and 50Salads datasets. |
Tasks | action segmentation, Video Semantic Segmentation |
Published | 2019-09-28 |
URL | https://arxiv.org/abs/1909.13155v1 |
https://arxiv.org/pdf/1909.13155v1.pdf | |
PWC | https://paperswithcode.com/paper/weakly-supervised-energy-based-learning-for |
Repo | https://github.com/JunLi-Galios/CDFL |
Framework | pytorch |
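For intuition, here is a brute-force toy of the CDFL idea in Python: contrast the soft (log-sum-exp) energy of all frame labelings that violate the ground-truth action ordering against that of all labelings that satisfy it. The enumeration stands in for the paper's efficient recursion, and the energy form and validity check are simplifying assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def logadd(energies):
    """log-sum-exp of negative energies: a soft aggregate over a set of labelings."""
    return np.logaddexp.reduce(-np.asarray(energies))

def cdfl_toy(frame_scores, action_order):
    """Toy constrained discriminative forward loss.

    frame_scores: (T, C) per-frame energies (e.g. negative log-likelihoods).
    action_order: ground-truth temporal ordering of action labels.
    Enumerates every labeling of a *tiny* video by brute force; the paper
    replaces this with a recursive algorithm over segmentation energies.
    """
    T, C = frame_scores.shape
    valid, invalid = [], []
    for labeling in itertools.product(range(C), repeat=T):
        energy = sum(frame_scores[t, a] for t, a in enumerate(labeling))
        # a labeling is valid iff its run-length-collapsed sequence equals the order
        collapsed = [a for i, a in enumerate(labeling) if i == 0 or a != labeling[i - 1]]
        (valid if collapsed == list(action_order) else invalid).append(energy)
    # discriminate the soft energy of invalid labelings against valid ones
    return logadd(invalid) - logadd(valid)

scores = np.random.rand(4, 3)      # 4 frames, 3 action classes
print(cdfl_toy(scores, [0, 2]))    # low when valid labelings have low energy
```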
A Biologically Inspired Visual Working Memory for Deep Networks
Title | A Biologically Inspired Visual Working Memory for Deep Networks |
Authors | Ethan Harris, Mahesan Niranjan, Jonathon Hare |
Abstract | The ability to look multiple times through a series of pose-adjusted glimpses is fundamental to human vision. This critical faculty allows us to understand highly complex visual scenes. Short term memory plays an integral role in aggregating the information obtained from these glimpses and informing our interpretation of the scene. Computational models have attempted to address glimpsing and visual attention but have failed to incorporate the notion of memory. We introduce a novel, biologically inspired visual working memory architecture that we term the Hebb-Rosenblatt memory. We subsequently introduce a fully differentiable Short Term Attentive Working Memory model (STAWM) which uses transformational attention to learn a memory over each image it sees. The state of our Hebb-Rosenblatt memory is embedded in STAWM as the weight space of a layer. By projecting different queries through this layer we can obtain goal-oriented latent representations for tasks including classification and visual reconstruction. Our model obtains highly competitive classification performance on MNIST and CIFAR-10. As demonstrated through the CelebA dataset, to perform reconstruction the model learns to make a sequence of updates to a canvas which constitute a parts-based representation. Classification with the self-supervised representation obtained from MNIST is shown to be in line with state-of-the-art models (none of which use a visual attention mechanism). Finally, we show that STAWM can be trained under the dual constraints of classification and reconstruction to provide an interpretable visual sketchpad which helps open the ‘black-box’ of deep learning. |
Tasks | |
Published | 2019-01-09 |
URL | http://arxiv.org/abs/1901.03665v1 |
http://arxiv.org/pdf/1901.03665v1.pdf | |
PWC | https://paperswithcode.com/paper/a-biologically-inspired-visual-working-memory |
Repo | https://github.com/ethanwharris/STAWM |
Framework | pytorch |
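A minimal sketch of the "memory as the weight space of a layer" idea: a matrix is written with a Hebbian outer-product rule as glimpses arrive, and read by projecting a query through it. Dimensions, the learning-rate handling, and the auto-associative write are illustrative assumptions, not the STAWM architecture itself.

```python
import torch

class HebbRosenblattMemory(torch.nn.Module):
    """Working memory whose state is the weight matrix of a linear layer,
    updated with a Hebbian (outer-product) rule per glimpse."""
    def __init__(self, dim, eta=0.1):
        super().__init__()
        self.eta = eta
        self.register_buffer("M", torch.zeros(dim, dim))  # memory = layer weights

    def write(self, pre, post):
        # Hebbian update: co-active pre/post units strengthen their connection
        self.M = self.M + self.eta * post.t() @ pre

    def read(self, query):
        # project a task-specific query through the memory's weight space
        return query @ self.M.t()

mem = HebbRosenblattMemory(dim=64)
for _ in range(4):                    # four glimpses of an image
    g = torch.randn(1, 64)            # glimpse feature (stand-in for a CNN)
    mem.write(pre=g, post=g)          # auto-associative write
print(mem.read(torch.randn(1, 64)).shape)   # goal-oriented readout: (1, 64)
```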
Unifying Variational Inference and PAC-Bayes for Supervised Learning that Scales
Title | Unifying Variational Inference and PAC-Bayes for Supervised Learning that Scales |
Authors | Sanjay Thakur, Herke Van Hoof, Gunshi Gupta, David Meger |
Abstract | Neural network based controllers hold enormous potential to learn complex, high-dimensional functions. However, they are prone to overfitting and unwarranted extrapolations. PAC-Bayes is a generalized framework that is more resistant to overfitting and yields performance bounds that hold with arbitrarily high probability, even on unjustified extrapolations. However, optimizing to learn such a function and a bound is intractable for complex tasks. In this work, we propose a method to simultaneously learn such a function and estimate performance bounds that scale organically to high-dimensional, non-linear environments without making any explicit assumptions about the environment. We build our approach on a parallel that we draw between the formulations called ELBO and PAC-Bayes when the risk metric is negative log likelihood. Through our experiments on multiple high-dimensional MuJoCo locomotion tasks, we validate the correctness of our theory, show its ability to generalize better, and investigate the factors that are important for its learning. The code for all the experiments is available at https://bit.ly/2qv0JjA. |
Tasks | |
Published | 2019-10-23 |
URL | https://arxiv.org/abs/1910.10367v2 |
https://arxiv.org/pdf/1910.10367v2.pdf | |
PWC | https://paperswithcode.com/paper/unifying-variational-inference-and-pac-bayes |
Repo | https://github.com/sanjaythakur/Unifying-VI-and-PAC-Bayes-for-Learning-that-Scales |
Framework | tf |
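The ELBO/PAC-Bayes parallel can be made concrete in a few lines: with negative log-likelihood as the risk, minimizing E_q[NLL] + KL(q || p) / n is (up to scaling) both the negative ELBO and a PAC-Bayes bound surrogate. The sketch below uses a diagonal-Gaussian posterior over "weights" and a quadratic stand-in for the NLL; all names are illustrative, not the paper's API.

```python
import torch

def elbo_pac_bayes_objective(nll_fn, mu, log_sigma, prior_std=1.0,
                             n_data=1000, samples=8):
    """E_q[NLL] + KL(q || p) / n for a diagonal Gaussian q = N(mu, sigma^2)."""
    sigma = log_sigma.exp()
    # Monte Carlo estimate of expected risk under q (reparameterization trick)
    risk = torch.stack([nll_fn(mu + sigma * torch.randn_like(mu))
                        for _ in range(samples)]).mean()
    # closed-form KL between q and a zero-mean isotropic Gaussian prior
    kl = 0.5 * ((sigma**2 + mu**2) / prior_std**2 - 1.0 - 2.0 * log_sigma
                + 2.0 * torch.log(torch.tensor(prior_std))).sum()
    return risk + kl / n_data

# toy usage: the "weights" are a 2-vector, risk is a quadratic stand-in for NLL
mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
loss = elbo_pac_bayes_objective(lambda w: (w**2).sum(), mu, log_sigma)
loss.backward()
```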
Well-calibrated Model Uncertainty with Temperature Scaling for Dropout Variational Inference
Title | Well-calibrated Model Uncertainty with Temperature Scaling for Dropout Variational Inference |
Authors | Max-Heinrich Laves, Sontje Ihler, Karl-Philipp Kortmann, Tobias Ortmaier |
Abstract | Model uncertainty obtained by variational Bayesian inference with Monte Carlo dropout is prone to miscalibration: the uncertainty does not represent the model error well. In this paper, temperature scaling is extended to dropout variational inference to calibrate model uncertainty. Expected uncertainty calibration error (UCE) is presented as a metric to measure miscalibration of uncertainty. The effectiveness of this approach is evaluated on CIFAR-10/100 for recent CNN architectures. Experimental results show that temperature scaling considerably reduces miscalibration in terms of UCE and enables robust rejection of uncertain predictions. The proposed approach can easily be derived from frequentist temperature scaling and yields well-calibrated model uncertainty. It is simple to implement and does not affect the model accuracy. |
Tasks | Bayesian Inference, Calibration |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1909.13550v3 |
https://arxiv.org/pdf/1909.13550v3.pdf | |
PWC | https://paperswithcode.com/paper/well-calibrated-model-uncertainty-with |
Repo | https://github.com/mlaves/bayesian-temperature-scaling |
Framework | pytorch |
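The UCE metric is easy to sketch: bin predictions by their uncertainty and average the gap between each bin's mean uncertainty and its empirical error rate, weighted by bin size. (Temperature scaling itself just divides the logits by a learned T in each MC dropout pass.) The equal-width binning below is an assumption about details the abstract does not spell out.

```python
import numpy as np

def expected_uncertainty_calibration_error(uncertainty, errors, n_bins=10):
    """UCE sketch: sum over bins of |error rate - mean uncertainty|, weighted
    by the fraction of samples in each bin.

    uncertainty: (N,) values in [0, 1]; errors: (N,) 0/1 misclassification flags."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    uce, n = 0.0, len(uncertainty)
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (uncertainty > lo) & (uncertainty <= hi)
        if mask.any():
            uce += mask.sum() / n * abs(errors[mask].mean() - uncertainty[mask].mean())
    return uce

# toy usage: errors drawn so that uncertainty is well calibrated by construction
u = np.random.rand(1000)
err = (np.random.rand(1000) < u).astype(float)
print(expected_uncertainty_calibration_error(u, err))  # close to 0
```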
Particle Flow Bayes’ Rule
Title | Particle Flow Bayes’ Rule |
Authors | Xinshi Chen, Hanjun Dai, Le Song |
Abstract | We present a particle flow realization of Bayes’ rule, where an ODE-based neural operator is used to transport particles from a prior to its posterior after a new observation. We prove that such an ODE operator exists. Its neural parameterization can be trained in a meta-learning framework, allowing this operator to reason about the effect of an individual observation on the posterior, and thus generalize across different priors and observations and to sequential Bayesian inference. We demonstrate the generalization ability of our particle flow Bayes operator in several canonical and high-dimensional examples. |
Tasks | Bayesian Inference, Meta-Learning |
Published | 2019-02-02 |
URL | https://arxiv.org/abs/1902.00640v3 |
https://arxiv.org/pdf/1902.00640v3.pdf | |
PWC | https://paperswithcode.com/paper/meta-particle-flow-for-sequential-bayesian |
Repo | https://github.com/xinshi-chen/ParticleFlowBayesRule |
Framework | pytorch |
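The transport idea can be sketched with a learned velocity field integrated by Euler steps: a network predicts a velocity for each particle, conditioned on the new observation and the flow time, and integration moves prior particles toward the posterior. Architecture, conditioning, and the fixed-step Euler solver are illustrative assumptions.

```python
import torch

class FlowOperator(torch.nn.Module):
    """ODE-based Bayes operator sketch: v = net(particle, observation, time),
    integrated with Euler steps to transport prior samples."""
    def __init__(self, dim, obs_dim, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + obs_dim + 1, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, dim))

    def forward(self, particles, obs, steps=10, dt=0.1):
        for k in range(steps):
            t = torch.full((particles.shape[0], 1), k * dt)
            inp = torch.cat([particles, obs.expand(particles.shape[0], -1), t], dim=1)
            particles = particles + dt * self.net(inp)   # one Euler step of the ODE
        return particles

flow = FlowOperator(dim=2, obs_dim=1)
prior_particles = torch.randn(128, 2)              # samples from the prior
posterior_particles = flow(prior_particles, obs=torch.tensor([[0.7]]))
print(posterior_particles.shape)                   # (128, 2)
```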
Learning Bayesian posteriors with neural networks for gravitational-wave inference
Title | Learning Bayesian posteriors with neural networks for gravitational-wave inference |
Authors | Alvin J. K. Chua, Michele Vallisneri |
Abstract | We seek to achieve the Holy Grail of Bayesian inference for gravitational-wave astronomy: using deep-learning techniques to instantly produce the posterior $p(\theta \mid D)$ for the source parameters $\theta$, given the detector data $D$. To do so, we train a deep neural network to take as input a signal + noise data set (drawn from the astrophysical source-parameter prior and the sampling distribution of detector noise), and to output a parametrized approximation of the corresponding posterior. We rely on a compact representation of the data based on reduced-order modeling, which we generate efficiently using a separate neural-network waveform interpolant [A. J. K. Chua, C. R. Galley & M. Vallisneri, Phys. Rev. Lett. 122, 211101 (2019)]. Our scheme has broad relevance to gravitational-wave applications such as low-latency parameter estimation and characterizing the science returns of future experiments. Source code and trained networks are available online at https://github.com/vallis/truebayes. |
Tasks | Bayesian Inference |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05966v3 |
https://arxiv.org/pdf/1909.05966v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-bayes-theorem-with-a-neural-network |
Repo | https://github.com/vallis/truebayes |
Framework | pytorch |
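A condensed sketch of the training idea: a network maps simulated detector data to the parameters of an approximate posterior (here a diagonal Gaussian over source parameters), trained by maximizing the likelihood of the true parameters that generated each data set. The sizes, the Gaussian family, and the toy "waveform" are assumptions; the paper uses a reduced-order data representation and richer posterior parameterizations.

```python
import torch

data_dim, theta_dim = 100, 2
net = torch.nn.Sequential(torch.nn.Linear(data_dim, 128), torch.nn.ReLU(),
                          torch.nn.Linear(128, 2 * theta_dim))

def posterior_nll(D, theta_true):
    """Negative log of a diagonal-Gaussian q(theta | D), constants dropped."""
    out = net(D)
    mu, log_var = out[:, :theta_dim], out[:, theta_dim:]
    return (0.5 * ((theta_true - mu)**2 / log_var.exp() + log_var)).sum(dim=1).mean()

theta = torch.randn(32, theta_dim)                 # source parameters from the prior
signal = theta.repeat(1, data_dim // theta_dim)    # stand-in for a waveform model
D = signal + 0.1 * torch.randn(32, data_dim)       # add detector noise
posterior_nll(D, theta).backward()                 # one training step's gradient
```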
Boundary Loss for Remote Sensing Imagery Semantic Segmentation
Title | Boundary Loss for Remote Sensing Imagery Semantic Segmentation |
Authors | Alexey Bokhovkin, Evgeny Burnaev |
Abstract | In response to the growing importance of geospatial data, its analysis, including semantic segmentation, has become an increasingly popular task in computer vision. Convolutional neural networks are powerful visual models that yield hierarchies of features, and practitioners widely use them to process remote sensing data. In remote sensing image segmentation, multiple instances of one class with precisely defined boundaries are common, and it is crucial to extract those boundaries accurately: the accuracy of segment boundary delineation directly influences the quality of the segmented areas. However, widely used segmentation loss functions such as BCE, IoU loss, or Dice loss do not penalize misalignment of boundaries sufficiently. In this paper, we propose a novel loss function, namely a differentiable surrogate of a metric accounting for the accuracy of boundary detection. The loss function can be used with any neural network for binary segmentation. We validated our loss function with various modifications of UNet on a synthetic dataset, as well as on real-world data (ISPRS Potsdam, INRIA AIL). Trained with the proposed loss function, models outperform baseline methods in terms of IoU score. |
Tasks | Boundary Detection, Semantic Segmentation |
Published | 2019-05-20 |
URL | https://arxiv.org/abs/1905.07852v1 |
https://arxiv.org/pdf/1905.07852v1.pdf | |
PWC | https://paperswithcode.com/paper/boundary-loss-for-remote-sensing-imagery |
Repo | https://github.com/yiskw713/boundary_loss_for_remote_sensing |
Framework | pytorch |
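A sketch of one common way to build such a differentiable boundary surrogate: extract soft boundaries with max-pooling, dilate them to tolerate small misalignments, and turn boundary precision/recall into a BF1-style score to minimize as 1 - BF1. Kernel sizes are illustrative hyperparameters, and this is a reading of the general recipe rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def boundary_loss(pred, gt, theta0=3, theta=5):
    """pred: (B, 1, H, W) sigmoid probabilities; gt: (B, 1, H, W) binary mask."""
    def soft_boundary(x, k):
        # boundary = dilated background minus background (edge of the foreground)
        return F.max_pool2d(1 - x, k, stride=1, padding=k // 2) - (1 - x)
    pred_b, gt_b = soft_boundary(pred, theta0), soft_boundary(gt, theta0)
    # dilated boundaries tolerate misalignments up to ~theta pixels
    pred_b_ext = F.max_pool2d(pred_b, theta, stride=1, padding=theta // 2)
    gt_b_ext = F.max_pool2d(gt_b, theta, stride=1, padding=theta // 2)
    eps = 1e-7
    precision = (pred_b * gt_b_ext).sum() / (pred_b.sum() + eps)
    recall = (gt_b * pred_b_ext).sum() / (gt_b.sum() + eps)
    bf1 = 2 * precision * recall / (precision + recall + eps)
    return 1 - bf1          # differentiable surrogate of the boundary F1 metric

pred = torch.rand(2, 1, 64, 64, requires_grad=True)
gt = (torch.rand(2, 1, 64, 64) > 0.5).float()
boundary_loss(pred, gt).backward()
```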
LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning
Title | LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning |
Authors | Huaiyu Li, Weiming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Bao-Gang Hu |
Abstract | In this work, we propose a novel meta-learning approach for few-shot classification, which learns transferable prior knowledge across tasks and directly produces network parameters for similar unseen tasks with training samples. Our approach, called LGM-Net, includes two key modules, namely, TargetNet and MetaNet. The TargetNet module is a neural network for solving a specific task and the MetaNet module aims at learning to generate functional weights for TargetNet by observing training samples. We also present an intertask normalization strategy for the training process to leverage common information shared across different tasks. The experimental results on Omniglot and miniImageNet datasets demonstrate that LGM-Net can effectively adapt to similar unseen tasks and achieve competitive performance, and the results on synthetic datasets show that transferable prior knowledge is learned by the MetaNet module via mapping training data to functional weights. LGM-Net enables fast learning and adaptation since no further tuning steps are required compared to other meta-learning approaches. |
Tasks | Few-Shot Learning, Meta-Learning, Omniglot |
Published | 2019-05-15 |
URL | https://arxiv.org/abs/1905.06331v1 |
https://arxiv.org/pdf/1905.06331v1.pdf | |
PWC | https://paperswithcode.com/paper/lgm-net-learning-to-generate-matching |
Repo | https://github.com/likesiwell/LGM-Net |
Framework | tf |
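The two-module split is the key mechanism: a MetaNet observes the task's support samples and emits weights that a TargetNet applies functionally, so no gradient steps are needed at test time. The sketch below uses a single generated linear classifier over precomputed features; pooling and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

class MetaNet(torch.nn.Module):
    """Observes support features and emits functional weights for TargetNet."""
    def __init__(self, feat_dim, n_way):
        super().__init__()
        self.n_way = n_way
        self.gen = torch.nn.Linear(feat_dim, n_way * feat_dim)

    def forward(self, support_feats):
        task_embedding = support_feats.mean(dim=0)   # summarize the task
        return self.gen(task_embedding).view(self.n_way, -1)

def target_net(query_feats, W):
    # TargetNet is applied *functionally* with the generated weights
    return F.linear(query_feats, W)

meta = MetaNet(feat_dim=64, n_way=5)
support = torch.randn(25, 64)          # 5-way 5-shot support features
W = meta(support)                      # generated classifier weights
logits = target_net(torch.randn(15, 64), W)
print(logits.shape)                    # (15, 5): no fine-tuning steps required
```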
Image Outpainting and Harmonization using Generative Adversarial Networks
Title | Image Outpainting and Harmonization using Generative Adversarial Networks |
Authors | Basile Van Hoorick |
Abstract | Although the inherently ambiguous task of predicting what resides beyond all four edges of an image has rarely been explored before, we demonstrate that GANs hold powerful potential in producing reasonable extrapolations. Two outpainting methods are proposed that aim to instigate this line of research: the first approach uses a context encoder inspired by common inpainting architectures and paradigms, while the second approach adds an extra post-processing step using a single-image generative model. This way, the hallucinated details are integrated with the style of the original image, in an attempt to further boost the quality of the result and possibly allow for arbitrary output resolutions to be supported. |
Tasks | Conditional Image Generation, Image Outpainting |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10960v2 |
https://arxiv.org/pdf/1912.10960v2.pdf | |
PWC | https://paperswithcode.com/paper/image-outpainting-and-harmonization-using |
Repo | https://github.com/basilevh/image-outpainting |
Framework | pytorch |
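The outpainting training setup reduces to a masking scheme: the target is the full image, and the generator input is the same image with all four borders blanked out plus a mask channel marking the region to hallucinate. The border width below is an illustrative choice, not the paper's expansion factor.

```python
import torch

def make_outpainting_batch(img, border=16):
    """img: (B, C, H, W) tensor in [0, 1]; returns (generator input, target)."""
    mask = torch.ones_like(img[:, :1])   # 1 = known pixels, 0 = to hallucinate
    mask[:, :, :border, :] = 0           # top strip
    mask[:, :, -border:, :] = 0          # bottom strip
    mask[:, :, :, :border] = 0           # left strip
    mask[:, :, :, -border:] = 0          # right strip
    masked = img * mask                  # blank the four edge strips
    return torch.cat([masked, mask], dim=1), img

inp, target = make_outpainting_batch(torch.rand(4, 3, 128, 128))
print(inp.shape)   # (4, 4, 128, 128): RGB + mask channel for the generator
```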
RecurSIA-RRT: Recursive translatable point-set pattern discovery with removal of redundant translators
Title | RecurSIA-RRT: Recursive translatable point-set pattern discovery with removal of redundant translators |
Authors | David Meredith |
Abstract | We introduce two algorithms, RECURSIA and RRT, designed to increase the compression factor achievable using point-set cover algorithms based on the SIA and SIATEC pattern discovery algorithms. SIA computes the maximal translatable patterns (MTPs) in a point set, while SIATEC computes the translational equivalence class (TEC) of every MTP in a point set, where the TEC of an MTP is the set of translationally invariant occurrences of that MTP in the point set. In its output, SIATEC encodes each MTP TEC as a pair, <P,V>, where P is the first occurrence of the MTP and V is the set of non-zero vectors that map P onto its other occurrences. RECURSIA recursively applies a TEC cover algorithm to the pattern P, in each TEC, <P,V>, that it discovers. RRT attempts to remove translators from V in each TEC without reducing the total set of points covered by the TEC. When evaluated with COSIATEC, SIATECCompress and Forth’s algorithm on the JKU Patterns Development Database, using RECURSIA with or without RRT increased compression factor and recall but reduced precision. Using RRT alone increased compression factor and reduced recall and precision, but had a smaller effect than RECURSIA. |
Tasks | |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12286v2 |
https://arxiv.org/pdf/1906.12286v2.pdf | |
PWC | https://paperswithcode.com/paper/recursia-rrt-recursive-translatable-point-set |
Repo | https://github.com/chromamorph/omnisia-recursia-rrt-mml-2019 |
Framework | none |
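The MTP computation at the core of SIA fits in a few lines of plain Python: for every ordered pair of points, record the translation vector between them, then group starting points by vector; each group is the maximal translatable pattern (MTP) for that vector. This is only the SIA step; RECURSIA's recursion over TEC patterns and RRT's translator removal are omitted.

```python
from collections import defaultdict

def sia(points):
    """Group points by the vectors that translate them onto other points."""
    points = sorted(points)
    mtps = defaultdict(list)
    for i, p in enumerate(points):
        for q in points[i + 1:]:                 # only lexicographically later points
            vector = (q[0] - p[0], q[1] - p[1])
            mtps[vector].append(p)               # p is translatable by this vector
    return dict(mtps)

# toy point set (onset, pitch): a three-note pattern and its repeat shifted by (4, 0)
pattern = [(0, 60), (1, 62), (2, 64)]
points = pattern + [(x + 4, y) for x, y in pattern]
for vec, mtp in sia(points).items():
    if len(mtp) >= 3:
        print(vec, mtp)   # (4, 0) maps the whole pattern onto its repetition
```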
Intent term selection and refinement in e-commerce queries
Title | Intent term selection and refinement in e-commerce queries |
Authors | Saurav Manchanda, Mohit Sharma, George Karypis |
Abstract | In e-commerce, a user tends to search for the desired product by issuing a query to the search engine and examining the retrieved results. If the search engine correctly understands the user’s query, it returns results corresponding to products whose attributes match the terms in the query that are representative of the query’s product intent. However, the search engine may fail to retrieve results that satisfy the query’s product intent and thus degrade the user experience, due to different issues in query processing: (i) when multiple terms are present in a query, it may fail to determine the relevant terms that are representative of the query’s product intent, and (ii) it may suffer from a vocabulary gap between the terms in the query and the product’s description, i.e., terms used in the query are semantically similar to but different from the terms in the product description. Hence, identifying the terms that describe the query’s product intent, and predicting additional terms that describe it better than the existing query terms, are essential tasks in e-commerce search. In this paper, we leverage the historical query reformulation logs of a major e-commerce retailer to develop distant-supervised approaches to both problems. Our approaches exploit the fact that the significance of a term depends on the context (other terms in its neighborhood) in which it is used, in order to learn the importance of the term towards the query’s product intent. We show that identifying and emphasizing the terms that define the query’s product intent leads to a 3% improvement in ranking. Moreover, for the tasks of identifying the important terms in a query and predicting the additional terms that represent product intent, experiments illustrate that our approaches outperform the non-contextual baselines. |
Tasks | |
Published | 2019-08-22 |
URL | https://arxiv.org/abs/1908.08564v1 |
https://arxiv.org/pdf/1908.08564v1.pdf | |
PWC | https://paperswithcode.com/paper/intent-term-selection-and-refinement-in-e |
Repo | https://github.com/gurdaspuriya/query_intent |
Framework | pytorch |
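A sketch of the distant-supervision signal from reformulation logs, under one plausible reading of the abstract: terms the user kept when reformulating are treated as representative of product intent, and terms the user added become targets for intent-term prediction. This labeling heuristic is an illustrative assumption, not the paper's exact procedure.

```python
def distant_labels(query, reformulation):
    """Derive per-term importance labels and prediction targets from one
    (query, reformulation) pair in the logs."""
    q, r = set(query.split()), set(reformulation.split())
    kept = [(t, 1 if t in r else 0) for t in query.split()]   # retained => important
    added = sorted(r - q)                                     # terms to predict
    return kept, added

kept, added = distant_labels("cheap red running shoes", "red nike running shoes")
print(kept)    # cheap -> 0 (dropped), red/running/shoes -> 1 (retained)
print(added)   # ['nike']
```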
Towards Learning Universal, Regional, and Local Hydrological Behaviors via Machine-Learning Applied to Large-Sample Datasets
Title | Towards Learning Universal, Regional, and Local Hydrological Behaviors via Machine-Learning Applied to Large-Sample Datasets |
Authors | Frederik Kratzert, Daniel Klotz, Guy Shalev, Günter Klambauer, Sepp Hochreiter, Grey Nearing |
Abstract | Regional rainfall-runoff modeling is an old but still mostly outstanding problem in the hydrological sciences. The problem is that traditional hydrological models degrade significantly in performance when calibrated for multiple basins together instead of for a single basin alone. In this paper, we propose a novel, data-driven approach using Long Short-Term Memory networks (LSTMs) and demonstrate that under a ‘big data’ paradigm, this is not necessarily the case. By training a single LSTM model on 531 basins from the CAMELS data set, using meteorological time series data and static catchment attributes, we were able to significantly improve performance compared to several different hydrological benchmark models. Our proposed approach not only significantly outperforms hydrological models that were calibrated regionally but also achieves better performance than hydrological models that were calibrated for each basin individually. Furthermore, we propose an adaptation of the standard LSTM architecture, which we call an Entity-Aware LSTM (EA-LSTM), that allows for learning catchment similarities and embedding them as a feature layer in a deep learning model. We show that this learned catchment similarity corresponds well with what we would expect from prior hydrological understanding. |
Tasks | Time Series |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08456v2 |
https://arxiv.org/pdf/1907.08456v2.pdf | |
PWC | https://paperswithcode.com/paper/benchmarking-a-catchment-aware-long-short |
Repo | https://github.com/kratzert/ealstm_regional_modeling |
Framework | pytorch |
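The EA-LSTM modification is small and concrete: the input gate is computed once from the static catchment attributes (so it acts as a learned, entity-specific feature selector), while the remaining gates see the dynamic meteorological inputs. The cell below is a condensed sketch of that idea; layer sizes are illustrative.

```python
import torch

class EALSTMCell(torch.nn.Module):
    """Entity-Aware LSTM cell: input gate from static features only."""
    def __init__(self, dyn_dim, stat_dim, hidden):
        super().__init__()
        self.static_gate = torch.nn.Linear(stat_dim, hidden)           # input gate i
        self.dynamic = torch.nn.Linear(dyn_dim + hidden, 3 * hidden)   # gates f, o, g

    def forward(self, x_dyn, x_stat, h, c):
        i = torch.sigmoid(self.static_gate(x_stat))    # entity-aware input gate
        f, o, g = self.dynamic(torch.cat([x_dyn, h], dim=1)).chunk(3, dim=1)
        c = torch.sigmoid(f) * c + i * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = EALSTMCell(dyn_dim=5, stat_dim=27, hidden=32)
h = c = torch.zeros(1, 32)
for t in range(10):   # unroll over a meteorological forcing series
    h, c = cell(torch.randn(1, 5), torch.randn(1, 27), h, c)
```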
Generative Models for Effective ML on Private, Decentralized Datasets
Title | Generative Models for Effective ML on Private, Decentralized Datasets |
Authors | Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas |
Abstract | To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data - of representative samples, of outliers, of misclassifications - is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-provided labels. However, manual data inspection is problematic for privacy sensitive datasets, such as those representing the behavior of real-world individuals. Furthermore, manual data inspection is impossible in the increasingly important setting of federated learning, where raw examples are stored at the edge and the modeler may only access aggregated outputs such as metrics or model parameters. This paper demonstrates that generative models - trained using federated methods and with formal differential privacy guarantees - can be used effectively to debug many commonly occurring data issues even when the data cannot be directly inspected. We explore these methods in applications to text with differentially private federated RNNs and to images using a novel algorithm for differentially private federated GANs. |
Tasks | |
Published | 2019-11-15 |
URL | https://arxiv.org/abs/1911.06679v2 |
https://arxiv.org/pdf/1911.06679v2.pdf | |
PWC | https://paperswithcode.com/paper/generative-models-for-effective-ml-on-private-1 |
Repo | https://github.com/tensorflow/gan |
Framework | tf |
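The "federated methods with formal differential privacy guarantees" part follows the DP federated averaging recipe: clip each client's model update to a norm bound, average, and add Gaussian noise calibrated to that bound. A minimal sketch of one round, with illustrative constants and a flat parameter vector standing in for real model weights:

```python
import torch

def dp_federated_round(global_params, client_updates, clip_norm=1.0, noise_mult=1.1):
    """One DP-FedAvg-style round over a list of per-client update vectors."""
    clipped = []
    for delta in client_updates:
        norm = delta.norm()
        clipped.append(delta * min(1.0, clip_norm / (norm + 1e-12)))  # clip to bound
    avg = torch.stack(clipped).mean(dim=0)
    # Gaussian noise scaled to the clipping bound and the number of clients
    noise = torch.randn_like(avg) * noise_mult * clip_norm / len(client_updates)
    return global_params + avg + noise

params = torch.zeros(10)
updates = [torch.randn(10) for _ in range(100)]   # one update per client
params = dp_federated_round(params, updates)
```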
Confident Learning: Estimating Uncertainty in Dataset Labels
Title | Confident Learning: Estimating Uncertainty in Dataset Labels |
Authors | Curtis G. Northcutt, Lu Jiang, Isaac L. Chuang |
Abstract | Learning exists in the context of data, yet notions of \emph{confidence} typically focus on model predictions, not label quality. Confident learning (CL) has emerged as an approach for characterizing, identifying, and learning with noisy labels in datasets, based on the principles of pruning noisy data, counting to estimate noise, and ranking examples to train with confidence. Here, we generalize CL, building on the assumption of a classification noise process, to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels. This generalized CL, open-sourced as \texttt{cleanlab}, is provably consistent across reasonable conditions, and experimentally performant on ImageNet and CIFAR, outperforming seven recent approaches when label noise is non-uniform. \texttt{cleanlab} also quantifies ontological class overlap, and can increase model accuracy (e.g. ResNet) by providing clean data for training. |
Tasks | |
Published | 2019-10-31 |
URL | https://arxiv.org/abs/1911.00068v2 |
https://arxiv.org/pdf/1911.00068v2.pdf | |
PWC | https://paperswithcode.com/paper/confident-learning-estimating-uncertainty-in |
Repo | https://github.com/cgnorthcutt/cleanlab |
Framework | pytorch |
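The counting step at the heart of confident learning can be sketched directly: an example is counted into entry (i, j) of the "confident joint" when its given label is i but its predicted probability for class j exceeds that class's average self-confidence threshold; the off-diagonal mass then estimates label errors. The normalization and calibration details from the paper (and from `cleanlab`) are omitted here.

```python
import numpy as np

def confident_joint(probs, noisy_labels):
    """probs: (N, C) out-of-sample predicted probabilities; noisy_labels: (N,)."""
    n, C = probs.shape
    # per-class threshold: mean predicted probability over examples labeled j
    thresholds = np.array([probs[noisy_labels == j, j].mean() for j in range(C)])
    cj = np.zeros((C, C))
    for x in range(n):
        above = [j for j in range(C) if probs[x, j] >= thresholds[j]]
        if above:
            j = max(above, key=lambda k: probs[x, k])   # most confident class
            cj[noisy_labels[x], j] += 1
    return cj   # off-diagonal entries estimate label errors

probs = np.random.dirichlet(np.ones(3), size=500)   # stand-in for model outputs
labels = probs.argmax(axis=1)                       # labels agree with predictions
print(confident_joint(probs, labels))               # mass is mostly diagonal
```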