October 20, 2019

3108 words 15 mins read

Paper Group AWR 216

Pattern Search Multidimensional Scaling

Title Pattern Search Multidimensional Scaling
Authors Georgios Paraskevopoulos, Efthymios Tzinis, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Alexandros Potamianos
Abstract We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient descent, we sample and evaluate possible “moves” in a sphere of fixed radius for each point in the embedded space. A fixed-point convergence guarantee can be shown by formulating the proposed algorithm as an instance of the General Pattern Search (GPS) framework. Evaluation on both clean and noisy synthetic datasets shows that pattern search MDS can accurately infer the intrinsic geometry of manifolds embedded in high-dimensional spaces. Additionally, experiments on real data, even under noisy conditions, demonstrate that the proposed pattern search MDS yields state-of-the-art results.
Tasks
Published 2018-06-01
URL https://arxiv.org/abs/1806.00416v3
PDF https://arxiv.org/pdf/1806.00416v3.pdf
PWC https://paperswithcode.com/paper/pattern-search-multidimensional-scaling
Repo https://github.com/georgepar/pattern-search-mds
Framework none
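
The abstract describes a concrete procedure, so a small sketch helps fix ideas: replace the MDS gradient step with a per-point search over candidate moves of radius r, shrinking r when no move improves the stress. This is an illustrative reading of the abstract under assumed details (raw stress objective, ± coordinate-axis moves), not the authors' implementation; see the linked repo for that.

```python
import numpy as np

def stress(X, D):
    # Raw stress: squared mismatch between target distances D and
    # the pairwise distances of the current embedding X.
    E = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sum((D - E) ** 2)

def pattern_search_mds(D, dim=2, r=1.0, min_r=1e-4, shrink=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(size=(n, dim))
    moves = np.vstack([np.eye(dim), -np.eye(dim)])  # +/- coordinate directions
    while r > min_r:
        improved = False
        for i in range(n):
            base = stress(X, D)
            for m in moves:
                X[i] += r * m              # trial move on a sphere of radius r
                if stress(X, D) < base:
                    improved = True        # keep the improving move
                    break
                X[i] -= r * m              # revert and try the next direction
        if not improved:
            r *= shrink                    # no point moved: refine the radius
    return X
```

The GPS-style convergence argument hinges on exactly this refinement step: the radius only shrinks when no candidate move improves the objective.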

Towards Universal Dialogue State Tracking

Title Towards Universal Dialogue State Tracking
Authors Liliang Ren, Kaige Xie, Lu Chen, Kai Yu
Abstract Dialogue state tracking is the core part of a spoken dialogue system. It estimates the beliefs over possible user goals at every dialogue turn. However, most current approaches have difficulty scaling to large dialogue domains. They have one or more of the following limitations: (a) some models do not work when the slot values in the ontology change dynamically; (b) the number of model parameters is proportional to the number of slots; (c) some models extract features based on hand-crafted lexicons. To tackle these challenges, we propose StateNet, a universal dialogue state tracker. It is independent of the number of values, shares parameters across all slots, and uses pre-trained word vectors instead of explicit semantic dictionaries. Our experiments on two datasets show that our approach not only overcomes these limitations, but also significantly outperforms state-of-the-art approaches.
Tasks Dialogue State Tracking
Published 2018-10-22
URL http://arxiv.org/abs/1810.09587v1
PDF http://arxiv.org/pdf/1810.09587v1.pdf
PWC https://paperswithcode.com/paper/towards-universal-dialogue-state-tracking
Repo https://github.com/renll/StateNet
Framework mxnet
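
To make the three design points concrete (value-count independence, parameter sharing across slots, pre-trained word vectors in place of lexicons), here is a hedged sketch of a scorer with those properties. The shapes, layer choices, and names below are assumptions for illustration, not StateNet's actual architecture.

```python
import torch
import torch.nn as nn

class ValueIndependentScorer(nn.Module):
    """One shared network for all slots; no per-value parameters."""
    def __init__(self, utt_dim, emb_dim):
        super().__init__()
        # Maps a dialogue representation, conditioned on the slot's own
        # word vector, into the word-embedding space.
        self.proj = nn.Sequential(
            nn.Linear(utt_dim + emb_dim, emb_dim), nn.ReLU(),
            nn.Linear(emb_dim, emb_dim),
        )

    def forward(self, dialogue_vec, slot_emb, value_embs):
        # dialogue_vec: (utt_dim,)   encoded dialogue history
        # slot_emb:     (emb_dim,)   pre-trained vector of the slot name
        # value_embs:   (num_values, emb_dim) pre-trained candidate vectors
        h = self.proj(torch.cat([dialogue_vec, slot_emb]))
        scores = -torch.norm(value_embs - h, dim=-1)  # any number of values
        return torch.softmax(scores, dim=-1)
```

Because values enter only through their word vectors, the ontology can change at test time without retraining.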

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization

Title A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization
Authors Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Preksha Nema, Mitesh M. Khapra, Shreyas Shetty
Abstract Structured data summarization involves the generation of natural language summaries from structured input data. In this work, we consider summarizing structured data occurring in the form of tables, as they are prevalent across a wide variety of domains. We formulate the standard table summarization problem, which deals with tables conforming to a single predefined schema. To this end, we propose a mixed hierarchical attention based encoder-decoder model which is able to leverage the structure in addition to the content of the tables. Our experiments on the publicly available WEATHERGOV dataset show an improvement of around 18 BLEU points (~30%) over the current state-of-the-art.
Tasks Data Summarization
Published 2018-04-20
URL http://arxiv.org/abs/1804.07790v1
PDF http://arxiv.org/pdf/1804.07790v1.pdf
PWC https://paperswithcode.com/paper/a-mixed-hierarchical-attention-based-encoder
Repo https://github.com/parajain/StructuredData_To_Descriptions
Framework tf
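
One plausible way to read “mixed hierarchical attention” is attention over the cells of each field combined with attention over the fields themselves, so structure and content both steer the decoder. The mixing rule below (field attention re-weighting cell attention) is an assumption for illustration; the paper's exact formulation may differ.

```python
import torch

def hierarchical_attention(cell_h, field_h, dec_h):
    # cell_h:  (num_fields, cells_per_field, d) encoder states per cell
    # field_h: (num_fields, d) one summary state per field (the structure)
    # dec_h:   (d,) current decoder state
    word_attn = torch.softmax(torch.einsum('fwd,d->fw', cell_h, dec_h), dim=-1)
    field_attn = torch.softmax(torch.einsum('fd,d->f', field_h, dec_h), dim=-1)
    mixed = field_attn[:, None] * word_attn   # field attention gates cell attention
    return torch.einsum('fw,fwd->d', mixed, cell_h)  # context vector for decoding
```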

Fair and Diverse DPP-based Data Summarization

Title Fair and Diverse DPP-based Data Summarization
Authors L. Elisa Celis, Vijay Keswani, Damian Straszak, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi
Abstract Sampling methods that choose a subset of the data proportional to its diversity in the feature space are popular for data summarization. However, recent studies have noted the occurrence of bias (under- or over-representation of a certain gender or race) in such data summarization methods. In this paper we initiate a study of the problem of outputting a diverse and fair summary of a given dataset. We work with a well-studied determinantal measure of diversity and the corresponding distributions (DPPs), and present a framework that allows us to incorporate a general class of fairness constraints into such distributions. Sampling efficiently from these constrained determinantal distributions, however, faces a complexity barrier; we present a fast sampler that is provably good when the input vectors satisfy a natural property. Our experimental results on a real-world and an image dataset show that the diversity of the samples produced by adding fairness constraints is not too far from the unconstrained case, for which we also provide a theoretical explanation.
Tasks Data Summarization
Published 2018-02-12
URL http://arxiv.org/abs/1802.04023v1
PDF http://arxiv.org/pdf/1802.04023v1.pdf
PWC https://paperswithcode.com/paper/fair-and-diverse-dpp-based-data-summarization
Repo https://github.com/DamianStraszak/FairDiverseDPPSampling
Framework none
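
As a concrete, simplified stand-in for the constrained sampler, the sketch below does greedy MAP inference for a DPP (pick the item with the largest log-determinant gain) while a partition constraint caps each group's contribution. The paper's actual contribution is a provably good sampler for the constrained distribution; greedy MAP is only used here to show how fairness quotas interact with determinantal diversity.

```python
import numpy as np

def fair_greedy_dpp(L, groups, quota, k):
    # L: (n, n) PSD similarity kernel; groups[i]: group id of item i;
    # quota: {group_id: max picks}; k: desired summary size.
    n = L.shape[0]
    picked, counts = [], {g: 0 for g in quota}
    while len(picked) < k:
        best, best_val = None, -np.inf
        for i in range(n):
            if i in picked or counts[groups[i]] >= quota[groups[i]]:
                continue  # item already chosen or its group is at quota
            idx = picked + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break  # remaining groups exhausted before reaching k items
        picked.append(best)
        counts[groups[best]] += 1
    return picked
```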

Localized Structured Prediction

Title Localized Structured Prediction
Authors Carlo Ciliberto, Francis Bach, Alessandro Rudi
Abstract Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made of “parts”) that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator.
Tasks Structured Prediction
Published 2018-06-06
URL https://arxiv.org/abs/1806.02402v3
PDF https://arxiv.org/pdf/1806.02402v3.pdf
PWC https://paperswithcode.com/paper/localized-structured-prediction
Repo https://github.com/cciliber/localized-structured-prediction
Framework none

Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control

Title Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control
Authors Abhijit Guha Roy, Sailesh Conjeti, Nassir Navab, Christian Wachinger
Abstract We introduce Bayesian QuickNAT for the automated quality control of whole-brain segmentation on MRI T1 scans. Alongside the Bayesian fully convolutional neural network, we also present inherent measures of segmentation uncertainty that allow for quality control per brain structure. For estimating model uncertainty, we follow a Bayesian approach, wherein Monte Carlo (MC) samples from the posterior distribution are generated by keeping the dropout layers active at test time. Entropy over the MC samples provides a voxel-wise model uncertainty map, whereas the expectation over the MC predictions provides the final segmentation. Beyond voxel-wise uncertainty, we introduce four metrics to quantify structure-wise uncertainty in segmentation for quality control. We report experiments on four out-of-sample datasets comprising diverse age ranges, pathologies, and imaging artifacts. The proposed structure-wise uncertainty metrics are highly correlated with the Dice score estimated against manual annotation and therefore present an inherent measure of segmentation quality. In particular, the intersection over union over all the MC samples is a suitable proxy for the Dice score. In addition to quality control at scan level, we propose to incorporate the structure-wise uncertainty as a measure of confidence to enable reliable group analysis on large data repositories. We envisage that the introduced uncertainty metrics will help assess the fidelity of automated deep-learning-based segmentation methods for large-scale population studies, as they enable automated quality control and group analyses in processing large data repositories.
Tasks Brain Segmentation
Published 2018-11-24
URL http://arxiv.org/abs/1811.09800v1
PDF http://arxiv.org/pdf/1811.09800v1.pdf
PWC https://paperswithcode.com/paper/bayesian-quicknat-model-uncertainty-in-deep
Repo https://github.com/abhi4ssj/BayesianQuickNAT
Framework none
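
The uncertainty mechanism in the abstract is standard MC dropout, so it admits a compact sketch: keep dropout active at test time, average softmax outputs over T passes for the segmentation, and take voxel-wise entropy as the uncertainty map. The model here is a placeholder, not QuickNAT itself.

```python
import torch

def mc_dropout_segment(model, volume, T=10, eps=1e-8):
    # Caution: .train() also switches batch norm to batch statistics;
    # real code should enable only the dropout layers.
    model.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(volume), dim=1)
                             for _ in range(T)])       # (T, B, classes, ...)
    mean = probs.mean(dim=0)                           # expectation -> segmentation
    entropy = -(mean * (mean + eps).log()).sum(dim=1)  # voxel-wise uncertainty
    return mean.argmax(dim=1), entropy
```

The paper's structure-wise metrics are then aggregates over this kind of sample set, e.g., the intersection over union of a structure's mask across the T samples.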

Deep Metric Transfer for Label Propagation with Limited Annotated Data

Title Deep Metric Transfer for Label Propagation with Limited Annotated Data
Authors Bin Liu, Zhirong Wu, Han Hu, Stephen Lin
Abstract We study object recognition under the constraint that each object class is represented by only very few observations. Semi-supervised learning, transfer learning, and few-shot recognition are all concerned with achieving fast generalization from few labeled examples. In this paper, we propose a generic framework that utilizes unlabeled data to aid generalization for all three tasks. Our approach is to create much more training data through label propagation from the few labeled examples to a vast collection of unannotated images. The main contribution of the paper is to show that such a label propagation scheme can be highly effective when the similarity metric used for propagation is transferred from other related domains. We test various combinations of supervised and unsupervised metric learning methods with various label propagation algorithms. We find that our framework is very generic and not sensitive to any specific technique. By taking advantage of unlabeled data in this way, we achieve significant improvements on all three tasks.
Tasks Metric Learning, Object Recognition, Transfer Learning
Published 2018-12-20
URL https://arxiv.org/abs/1812.08781v2
PDF https://arxiv.org/pdf/1812.08781v2.pdf
PWC https://paperswithcode.com/paper/deep-metric-transfer-for-label-propagation
Repo https://github.com/microsoft/metric-transfer.pytorch
Framework pytorch
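
The propagation step itself is classical; the paper's point is that the affinities should come from a metric transferred from a related domain. A hedged sketch using the well-known iterative scheme of Zhou et al. (one of several propagation algorithms the paper tests), assuming L2-normalized features from the transferred metric network:

```python
import numpy as np

def propagate_labels(features, labels, labeled_idx, n_classes,
                     alpha=0.99, iters=50):
    # features: (n, d) L2-normalized embeddings from the transferred metric.
    S = features @ features.T                 # cosine affinities
    np.fill_diagonal(S, 0)
    S = np.maximum(S, 0)
    d = S.sum(1) + 1e-8
    W = S / np.sqrt(np.outer(d, d))           # symmetric normalization
    Y = np.zeros((features.shape[0], n_classes))
    Y[labeled_idx, labels[labeled_idx]] = 1   # seed with the few labels
    F = Y.copy()
    for _ in range(iters):
        F = alpha * W @ F + (1 - alpha) * Y   # diffuse, stay anchored to seeds
    return F.argmax(1)                        # pseudo-labels for the unlabeled set
```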

Modeling Uncertainty with Hedged Instance Embedding

Title Modeling Uncertainty with Hedged Instance Embedding
Authors Seong Joon Oh, Kevin Murphy, Jiyan Pan, Joseph Roth, Florian Schroff, Andrew Gallagher
Abstract Instance embeddings are an efficient and versatile image representation that facilitates applications like recognition, verification, retrieval, and clustering. Many metric learning methods represent the input as a single point in the embedding space. Often the distance between points is used as a proxy for match confidence. However, this can fail to represent uncertainty arising when the input is ambiguous, e.g., due to occlusion or blurriness. This work addresses this issue and explicitly models the uncertainty by hedging the location of each input in the embedding space. We introduce the hedged instance embedding (HIB) in which embeddings are modeled as random variables and the model is trained under the variational information bottleneck principle. Empirical results on our new N-digit MNIST dataset show that our method leads to the desired behavior of hedging its bets across the embedding space upon encountering ambiguous inputs. This results in improved performance for image matching and classification tasks, more structure in the learned embedding space, and an ability to compute a per-exemplar uncertainty measure that is correlated with downstream performance.
Tasks Metric Learning
Published 2018-09-30
URL https://arxiv.org/abs/1810.00319v6
PDF https://arxiv.org/pdf/1810.00319v6.pdf
PWC https://paperswithcode.com/paper/modeling-uncertainty-with-hedged-instance
Repo https://github.com/google/n-digit-mnist
Framework none
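
The abstract names the two ingredients precisely: embeddings as random variables, and training under the variational information bottleneck. A hedged sketch, assuming diagonal Gaussian embeddings, a sigmoid-of-distance match model, and a unit-Gaussian prior (the weighting and the match model are my assumptions, not the paper's exact choices):

```python
import torch

def match_prob(mu1, sigma1, mu2, sigma2, a, b, K=8):
    # Monte Carlo estimate of p(match): sample K embeddings per input via
    # the reparameterization trick, average a sigmoid match model over pairs.
    z1 = mu1 + sigma1 * torch.randn(K, mu1.shape[-1])
    z2 = mu2 + sigma2 * torch.randn(K, mu2.shape[-1])
    d = torch.cdist(z1, z2)                   # (K, K) sample-pair distances
    return torch.sigmoid(-a * d + b).mean().clamp(1e-6, 1 - 1e-6)

def hib_loss(mu1, sigma1, mu2, sigma2, is_match, a, b, beta=1e-3):
    p = match_prob(mu1, sigma1, mu2, sigma2, a, b)
    nll = -(is_match * p.log() + (1 - is_match) * (1 - p).log())
    # VIB-style regularizer: KL(N(mu, sigma^2) || N(0, I)) for each input.
    kl = 0.5 * sum((m**2 + s**2 - 2 * s.log() - 1).sum()
                   for m, s in [(mu1, sigma1), (mu2, sigma2)])
    return nll + beta * kl
```

An ambiguous input then learns a large sigma, spreading ("hedging") its samples across the embedding space, which is the behavior the abstract describes.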

Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework

Title Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
Authors Jie Chen, Cheen-Hau Tan, Junhui Hou, Lap-Pui Chau, He Li
Abstract Rain removal is important for improving the robustness of outdoor vision-based systems. Current rain removal methods show limitations either for complex dynamic scenes shot from fast-moving cameras, or under torrential rainfall with opaque occlusions. We propose a novel derain algorithm, which applies superpixel (SP) segmentation to decompose the scene into depth-consistent units. Alignment of scene content is done at the SP level, which proves robust to rain occlusion and fast camera motion. Two alignment output tensors, i.e., the optimal temporal match tensor and the sorted spatial-temporal match tensor, provide informative clues for rain streak location and occluded background content, and are used to generate an intermediate derain output. These tensors are subsequently used as input features for a convolutional neural network that restores high-frequency details to the intermediate output, compensating for mis-alignment blur. Extensive evaluations show that up to a 5 dB reconstruction PSNR advantage is achieved over state-of-the-art methods. Visual inspection shows that much cleaner rain removal is achieved, especially for highly dynamic scenes with heavy and opaque rainfall shot from a fast-moving camera.
Tasks Rain Removal
Published 2018-03-28
URL http://arxiv.org/abs/1803.10433v1
PDF http://arxiv.org/pdf/1803.10433v1.pdf
PWC https://paperswithcode.com/paper/robust-video-content-alignment-and-1
Repo https://github.com/hotndy/SPAC-SupplementaryMaterials
Framework none

Stochastic Adversarial Video Prediction

Title Stochastic Adversarial Video Prediction
Authors Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine
Abstract Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging – the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) latent variational variable models that explicitly model underlying stochasticity and (b) adversarially-trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially-trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary. Combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these aspects.
Tasks Representation Learning, Video Prediction
Published 2018-04-04
URL http://arxiv.org/abs/1804.01523v1
PDF http://arxiv.org/pdf/1804.01523v1.pdf
PWC https://paperswithcode.com/paper/stochastic-adversarial-video-prediction
Repo https://github.com/alexlee-gk/video_prediction
Framework tf
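
The complementarity argument reduces to an additive objective: a VAE-style term (reconstruction plus a KL on the latent) for coverage of possible futures, plus a GAN term for realism. A schematic per-step generator loss, with weights and loss choices assumed rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def savp_generator_loss(pred, target, mu, logvar, disc_fake_logits,
                        lambda_kl=1e-4, lambda_gan=1e-2):
    recon = F.l1_loss(pred, target)            # match the observed future
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    gan = F.binary_cross_entropy_with_logits(  # fool the frame discriminator
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    return recon + lambda_kl * kl + lambda_gan * gan
```

Dropping the GAN term recovers a plain stochastic (VAE-style) predictor; dropping the reconstruction and KL terms recovers a plain adversarial predictor, which is exactly the pair of baselines the paper argues are individually insufficient.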

Rain Removal in Traffic Surveillance: Does it Matter?

Title Rain Removal in Traffic Surveillance: Does it Matter?
Authors Chris H. Bahnsen, Thomas B. Moeslund
Abstract Varying weather conditions, including rainfall and snowfall, are generally regarded as a challenge for computer vision algorithms. One proposed solution to the challenges induced by rain and snowfall is to artificially remove the rain from images or video using rain removal algorithms. The promise of these algorithms is that the rain-removed image frames will improve the performance of subsequent segmentation and tracking algorithms. However, rain removal algorithms are typically evaluated on their ability to remove synthetic rain on a small subset of images, and their behavior on real-world videos, when integrated with a typical computer vision pipeline, is currently unknown. In this paper, we review the existing rain removal algorithms and propose a new dataset that consists of 22 traffic surveillance sequences under a broad variety of weather conditions that all include either rain or snowfall. We propose a new evaluation protocol that assesses rain removal algorithms on their ability to improve the performance of subsequent segmentation, instance segmentation, and feature tracking algorithms under rain and snow. If successful, the de-rained frames of a rain removal algorithm should improve segmentation performance and increase the number of accurately tracked features. The results show that a recent single-frame-based rain removal algorithm increases segmentation performance by 19.7% on our proposed dataset, but decreases feature tracking performance and shows mixed results with recent instance segmentation methods. However, the best video-based rain removal algorithm improves feature tracking accuracy by 7.72%.
Tasks Instance Segmentation, Rain Removal, Semantic Segmentation
Published 2018-10-30
URL http://arxiv.org/abs/1810.12574v1
PDF http://arxiv.org/pdf/1810.12574v1.pdf
PWC https://paperswithcode.com/paper/rain-removal-in-traffic-surveillance-does-it
Repo https://github.com/chrisbahnsen/aau-rainsnow-eval
Framework none
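
The proposed protocol is easy to state in code: score the downstream task on raw frames and on de-rained frames, and report the difference. All callables below are placeholders standing in for a derain method, a segmentation (or tracking) model, and its metric.

```python
def derain_benefit(frames, labels, derain, segment, score):
    # Positive return value = the rain removal step helped the pipeline.
    raw = score(segment(frames), labels)
    derained = score(segment(derain(frames)), labels)
    return derained - raw
```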

A Large Dataset for Improving Patch Matching

Title A Large Dataset for Improving Patch Matching
Authors Rahul Mitra, Nehal Doiphode, Utkarsh Gautam, Sanath Narayan, Shuaib Ahmed, Sharat Chandran, Arjun Jain
Abstract We propose a new dataset for learning local image descriptors which can be used for significantly improved patch matching. Our proposed dataset consists of an order of magnitude more scenes, images, and positive and negative correspondences than the currently available Multi-View Stereo (MVS) dataset from Brown et al. The new dataset also has better coverage of overall viewpoint, scale, and lighting changes in comparison to the MVS dataset. Our dataset additionally provides supplementary information such as RGB patches with scale and rotation values, and intrinsic and extrinsic camera parameters, which, as shown later, can be used to customize training data per application. We train an existing state-of-the-art model on our dataset and evaluate on publicly available benchmarks such as the HPatches dataset and the Strecha et al. dataset to quantify image descriptor performance. Experimental evaluations show that descriptors trained on our proposed dataset outperform the current state-of-the-art descriptors trained on MVS by 8%, 4%, and 10% on the matching, verification, and retrieval tasks, respectively, on the HPatches dataset. Similarly, on the Strecha dataset, we see an improvement of 3-5% for the matching task in non-planar scenes.
Tasks
Published 2018-01-04
URL http://arxiv.org/abs/1801.01466v3
PDF http://arxiv.org/pdf/1801.01466v3.pdf
PWC https://paperswithcode.com/paper/a-large-dataset-for-improving-patch-matching
Repo https://github.com/rmitra/PS-Dataset
Framework none

Bayesian Compression for Natural Language Processing

Title Bayesian Compression for Natural Language Processing
Authors Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
Abstract In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows proportionally to the vocabulary size. We propose a Bayesian sparsification technique for RNNs which allows compressing the RNN dozens or hundreds of times without time-consuming hyperparameter tuning. We also generalize the model for vocabulary sparsification, filtering out unnecessary words to compress the RNN even further. We show that the choice of retained words is interpretable. Code is available on GitHub: https://github.com/tipt0p/SparseBayesianRNN
Tasks
Published 2018-10-25
URL http://arxiv.org/abs/1810.10927v2
PDF http://arxiv.org/pdf/1810.10927v2.pdf
PWC https://paperswithcode.com/paper/bayesian-compression-for-natural-language
Repo https://github.com/ars-ashuha/variational-dropout-sparsifies-dnn
Framework tf
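
The sparsification family the abstract builds on (sparse variational dropout) decides pruning from per-weight posteriors, which is easy to sketch: a weight is dropped when its noise-to-signal ratio log α = log σ² − log μ² exceeds a threshold, and a vocabulary word can be dropped when its whole embedding row is pruned, which is how vocabulary sparsification falls out. The threshold and details below are conventional assumptions, not the paper's exact recipe.

```python
import torch

def prune(mu, log_sigma2, threshold=3.0):
    # log alpha > threshold  =>  noise dominates the weight  =>  set it to 0.
    log_alpha = log_sigma2 - torch.log(mu.pow(2) + 1e-8)
    mask = (log_alpha < threshold).float()
    return mu * mask, mask

def kept_words(embedding_mask):
    # A word survives vocabulary sparsification if any weight in its
    # embedding row survives pruning.
    return (embedding_mask > 0).any(dim=1).nonzero().flatten()
```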

Inferencing Based on Unsupervised Learning of Disentangled Representations

Title Inferencing Based on Unsupervised Learning of Disentangled Representations
Authors Tobias Hinz, Stefan Wermter
Abstract Combining Generative Adversarial Networks (GANs) with encoders that learn to encode data points has shown promising results in learning data representations in an unsupervised way. We propose a framework that combines an encoder and a generator to learn disentangled representations which encode meaningful information about the data distribution without the need for any labels. While current approaches focus mostly on the generative aspects of GANs, our framework can be used to perform inference on both real and generated data points. Experiments on several data sets show that the encoder learns interpretable, disentangled representations which encode descriptive properties and can be used to sample images that exhibit specific characteristics.
Tasks Representation Learning, Unsupervised Image Classification, Unsupervised MNIST, Unsupervised Representation Learning
Published 2018-03-07
URL http://arxiv.org/abs/1803.02627v1
PDF http://arxiv.org/pdf/1803.02627v1.pdf
PWC https://paperswithcode.com/paper/inferencing-based-on-unsupervised-learning-of
Repo https://github.com/tohinz/Bidirectional-InfoGAN
Framework tf
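
A minimal sketch of the encoder-plus-generator training signal, in the spirit of a bidirectional GAN: the discriminator judges (image, code) pairs, so fooling it pushes the encoder and generator toward inverting each other without any labels. D, E, and G are placeholder modules; the paper's InfoGAN-style disentanglement terms are omitted here.

```python
import torch
from torch.nn.functional import binary_cross_entropy_with_logits as bce

def bigan_losses(D, E, G, x_real, z_prior):
    x_fake = G(z_prior)             # generate an image from a sampled code
    z_real = E(x_real)              # infer a code for a real image
    d_real = D(x_real, z_real)      # discriminator sees joint (image, code)
    d_fake = D(x_fake, z_prior)
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    # Encoder and generator are trained jointly to fool the discriminator.
    eg_loss = bce(d_real, torch.zeros_like(d_real)) + \
              bce(d_fake, torch.ones_like(d_fake))
    return d_loss, eg_loss
```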

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

Title Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
Authors Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev
Abstract We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains. We define a new complex and cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets. In this way, the task requires the model to generalize well to both new SQL queries and new database schemas. Spider is distinct from most previous semantic parsing tasks, which all use a single database and the exact same programs in the train and test sets. We experiment with various state-of-the-art models, and the best model achieves only 12.4% exact matching accuracy in the database split setting. This shows that Spider presents a strong challenge for future research. Our dataset and task are publicly available at https://yale-lily.github.io/spider
Tasks Semantic Parsing, Text-To-Sql
Published 2018-09-24
URL http://arxiv.org/abs/1809.08887v5
PDF http://arxiv.org/pdf/1809.08887v5.pdf
PWC https://paperswithcode.com/paper/spider-a-large-scale-human-labeled-dataset
Repo https://github.com/taoyds/spider
Framework tf