October 20, 2019

3108 words 15 mins read

Paper Group AWR 216

Pattern Search Multidimensional Scaling

Title Pattern Search Multidimensional Scaling
Authors Georgios Paraskevopoulos, Efthymios Tzinis, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Alexandros Potamianos
Abstract We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient descent, we sample and evaluate possible “moves” in a sphere of fixed radius for each point in the embedded space. A fixed-point convergence guarantee can be shown by formulating the proposed algorithm as an instance of the General Pattern Search (GPS) framework. Evaluation on both clean and noisy synthetic datasets shows that pattern search MDS can accurately infer the intrinsic geometry of manifolds embedded in high-dimensional spaces. Additionally, experiments on real data, even under noisy conditions, demonstrate that the proposed pattern search MDS yields state-of-the-art results.
Tasks
Published 2018-06-01
URL https://arxiv.org/abs/1806.00416v3
PDF https://arxiv.org/pdf/1806.00416v3.pdf
PWC https://paperswithcode.com/paper/pattern-search-multidimensional-scaling
Repo https://github.com/georgepar/pattern-search-mds
Framework none
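
The abstract describes a concrete procedure, so a small sketch helps fix ideas: replace the MDS gradient step with a per-point search over candidate moves of radius r, shrinking r when no move improves the stress. This is an illustrative reading of the abstract under assumed details (raw stress objective, ± coordinate-axis moves), not the authors' implementation; see the linked repo for that.

```python
import numpy as np

def stress(X, D):
    # Raw stress: squared mismatch between target distances D and
    # the pairwise distances of the current embedding X.
    E = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sum((D - E) ** 2)

def pattern_search_mds(D, dim=2, r=1.0, min_r=1e-4, shrink=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(size=(n, dim))
    moves = np.vstack([np.eye(dim), -np.eye(dim)])  # +/- coordinate directions
    while r > min_r:
        improved = False
        for i in range(n):
            base = stress(X, D)
            for m in moves:
                X[i] += r * m              # trial move on a sphere of radius r
                if stress(X, D) < base:
                    improved = True        # keep the improving move
                    break
                X[i] -= r * m              # revert and try the next direction
        if not improved:
            r *= shrink                    # no point moved: refine the radius
    return X
```

The GPS-style convergence argument hinges on exactly this refinement step: the radius only shrinks when no candidate move improves the objective.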

Towards Universal Dialogue State Tracking

Title Towards Universal Dialogue State Tracking
Authors Liliang Ren, Kaige Xie, Lu Chen, Kai Yu
Abstract Dialogue state tracking is the core part of a spoken dialogue system. It estimates the beliefs over possible user goals at every dialogue turn. However, most current approaches have difficulty scaling to large dialogue domains. They have one or more of the following limitations: (a) some models do not work when the slot values in the ontology change dynamically; (b) the number of model parameters is proportional to the number of slots; (c) some models extract features based on hand-crafted lexicons. To tackle these challenges, we propose StateNet, a universal dialogue state tracker. It is independent of the number of values, shares parameters across all slots, and uses pre-trained word vectors instead of explicit semantic dictionaries. Our experiments on two datasets show that our approach not only overcomes these limitations, but also significantly outperforms state-of-the-art approaches.
Tasks Dialogue State Tracking
Published 2018-10-22
URL http://arxiv.org/abs/1810.09587v1
PDF http://arxiv.org/pdf/1810.09587v1.pdf
PWC https://paperswithcode.com/paper/towards-universal-dialogue-state-tracking
Repo https://github.com/renll/StateNet
Framework mxnet
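
To make the three design points concrete (value-count independence, parameter sharing across slots, pre-trained word vectors in place of lexicons), here is a hedged sketch of a scorer with those properties. The shapes, layer choices, and names below are assumptions for illustration, not StateNet's actual architecture.

```python
import torch
import torch.nn as nn

class ValueIndependentScorer(nn.Module):
    """One shared network for all slots; no per-value parameters."""
    def __init__(self, utt_dim, emb_dim):
        super().__init__()
        # Maps a dialogue representation, conditioned on the slot's own
        # word vector, into the word-embedding space.
        self.proj = nn.Sequential(
            nn.Linear(utt_dim + emb_dim, emb_dim), nn.ReLU(),
            nn.Linear(emb_dim, emb_dim),
        )

    def forward(self, dialogue_vec, slot_emb, value_embs):
        # dialogue_vec: (utt_dim,)   encoded dialogue history
        # slot_emb:     (emb_dim,)   pre-trained vector of the slot name
        # value_embs:   (num_values, emb_dim) pre-trained candidate vectors
        h = self.proj(torch.cat([dialogue_vec, slot_emb]))
        scores = -torch.norm(value_embs - h, dim=-1)  # any number of values
        return torch.softmax(scores, dim=-1)
```

Because values enter only through their word vectors, the ontology can change at test time without retraining.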

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization

Title A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization
Authors Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Preksha Nema, Mitesh M. Khapra, Shreyas Shetty
Abstract Structured data summarization involves the generation of natural language summaries from structured input data. In this work, we consider summarizing structured data occurring in the form of tables, as they are prevalent across a wide variety of domains. We formulate the standard table summarization problem, which deals with tables conforming to a single predefined schema. To this end, we propose a mixed hierarchical attention based encoder-decoder model which is able to leverage the structure in addition to the content of the tables. Our experiments on the publicly available WEATHERGOV dataset show an improvement of around 18 BLEU points (~30%) over the current state-of-the-art.
Tasks Data Summarization
Published 2018-04-20
URL http://arxiv.org/abs/1804.07790v1
PDF http://arxiv.org/pdf/1804.07790v1.pdf
PWC https://paperswithcode.com/paper/a-mixed-hierarchical-attention-based-encoder
Repo https://github.com/parajain/StructuredData_To_Descriptions
Framework tf
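
One plausible way to read “mixed hierarchical attention” is attention over the cells of each field combined with attention over the fields themselves, so structure and content both steer the decoder. The mixing rule below (field attention re-weighting cell attention) is an assumption for illustration; the paper's exact formulation may differ.

```python
import torch

def hierarchical_attention(cell_h, field_h, dec_h):
    # cell_h:  (num_fields, cells_per_field, d) encoder states per cell
    # field_h: (num_fields, d) one summary state per field (the structure)
    # dec_h:   (d,) current decoder state
    word_attn = torch.softmax(torch.einsum('fwd,d->fw', cell_h, dec_h), dim=-1)
    field_attn = torch.softmax(torch.einsum('fd,d->f', field_h, dec_h), dim=-1)
    mixed = field_attn[:, None] * word_attn   # field attention gates cell attention
    return torch.einsum('fw,fwd->d', mixed, cell_h)  # context vector for decoding
```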

Fair and Diverse DPP-based Data Summarization

Title Fair and Diverse DPP-based Data Summarization
Authors L. Elisa Celis, Vijay Keswani, Damian Straszak, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi
Abstract Sampling methods that choose a subset of the data proportional to its diversity in the feature space are popular for data summarization. However, recent studies have noted the occurrence of bias (under- or over-representation of a certain gender or race) in such data summarization methods. In this paper we initiate a study of the problem of outputting a diverse and fair summary of a given dataset. We work with a well-studied determinantal measure of diversity and the corresponding distributions (DPPs), and present a framework that allows us to incorporate a general class of fairness constraints into such distributions. Sampling efficiently from these constrained determinantal distributions, however, faces a complexity barrier; we present a fast sampler that is provably good when the input vectors satisfy a natural property. Our experimental results on a real-world and an image dataset show that the diversity of the samples produced by adding fairness constraints is not too far from the unconstrained case, for which we also provide a theoretical explanation.
Tasks Data Summarization
Published 2018-02-12
URL http://arxiv.org/abs/1802.04023v1
PDF http://arxiv.org/pdf/1802.04023v1.pdf
PWC https://paperswithcode.com/paper/fair-and-diverse-dpp-based-data-summarization
Repo https://github.com/DamianStraszak/FairDiverseDPPSampling
Framework none
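
As a concrete, simplified stand-in for the constrained sampler, the sketch below does greedy MAP inference for a DPP (pick the item with the largest log-determinant gain) while a partition constraint caps each group's contribution. The paper's actual contribution is a provably good sampler for the constrained distribution; greedy MAP is only used here to show how fairness quotas interact with determinantal diversity.

```python
import numpy as np

def fair_greedy_dpp(L, groups, quota, k):
    # L: (n, n) PSD similarity kernel; groups[i]: group id of item i;
    # quota: {group_id: max picks}; k: desired summary size.
    n = L.shape[0]
    picked, counts = [], {g: 0 for g in quota}
    while len(picked) < k:
        best, best_val = None, -np.inf
        for i in range(n):
            if i in picked or counts[groups[i]] >= quota[groups[i]]:
                continue  # item already chosen or its group is at quota
            idx = picked + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        if best is None:
            break  # remaining groups exhausted before reaching k items
        picked.append(best)
        counts[groups[best]] += 1
    return picked
```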

Localized Structured Prediction

Title Localized Structured Prediction
Authors Carlo Ciliberto, Francis Bach, Alessandro Rudi
Abstract Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made of “parts”) that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator.
Tasks Structured Prediction
Published 2018-06-06
URL https://arxiv.org/abs/1806.02402v3
PDF https://arxiv.org/pdf/1806.02402v3.pdf
PWC https://paperswithcode.com/paper/localized-structured-prediction
Repo https://github.com/cciliber/localized-structured-prediction
Framework none

Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control

Title Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control
Authors Abhijit Guha Roy, Sailesh Conjeti, Nassir Navab, Christian Wachinger
Abstract We introduce Bayesian QuickNAT for the automated quality control of whole-brain segmentation on MRI T1 scans. Alongside the Bayesian fully convolutional neural network, we also present inherent measures of segmentation uncertainty that allow for quality control per brain structure. For estimating model uncertainty, we follow a Bayesian approach, wherein Monte Carlo (MC) samples from the posterior distribution are generated by keeping the dropout layers active at test time. Entropy over the MC samples provides a voxel-wise model uncertainty map, whereas the expectation over the MC predictions provides the final segmentation. Beyond voxel-wise uncertainty, we introduce four metrics to quantify structure-wise uncertainty in segmentation for quality control. We report experiments on four out-of-sample datasets comprising diverse age ranges, pathologies, and imaging artifacts. The proposed structure-wise uncertainty metrics are highly correlated with the Dice score estimated against manual annotation and therefore present an inherent measure of segmentation quality. In particular, the intersection over union over all the MC samples is a suitable proxy for the Dice score. In addition to quality control at scan level, we propose to incorporate the structure-wise uncertainty as a measure of confidence to enable reliable group analysis on large data repositories. We envisage that the introduced uncertainty metrics will help assess the fidelity of automated deep-learning-based segmentation methods for large-scale population studies, as they enable automated quality control and group analyses in processing large data repositories.
Tasks Brain Segmentation
Published 2018-11-24
URL http://arxiv.org/abs/1811.09800v1
PDF http://arxiv.org/pdf/1811.09800v1.pdf
PWC https://paperswithcode.com/paper/bayesian-quicknat-model-uncertainty-in-deep
Repo https://github.com/abhi4ssj/BayesianQuickNAT
Framework none
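
The uncertainty mechanism in the abstract is standard MC dropout, so it admits a compact sketch: keep dropout active at test time, average softmax outputs over T passes for the segmentation, and take voxel-wise entropy as the uncertainty map. The model here is a placeholder, not QuickNAT itself.

```python
import torch

def mc_dropout_segment(model, volume, T=10, eps=1e-8):
    # Caution: .train() also switches batch norm to batch statistics;
    # real code should enable only the dropout layers.
    model.train()
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(volume), dim=1)
                             for _ in range(T)])       # (T, B, classes, ...)
    mean = probs.mean(dim=0)                           # expectation -> segmentation
    entropy = -(mean * (mean + eps).log()).sum(dim=1)  # voxel-wise uncertainty
    return mean.argmax(dim=1), entropy
```

The paper's structure-wise metrics are then aggregates over this kind of sample set, e.g., the intersection over union of a structure's mask across the T samples.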

Deep Metric Transfer for Label Propagation with Limited Annotated Data

Title Deep Metric Transfer for Label Propagation with Limited Annotated Data
Authors Bin Liu, Zhirong Wu, Han Hu, Stephen Lin
Abstract We study object recognition under the constraint that each object class is represented by only very few observations. Semi-supervised learning, transfer learning, and few-shot recognition are all concerned with achieving fast generalization from few labeled examples. In this paper, we propose a generic framework that utilizes unlabeled data to aid generalization for all three tasks. Our approach is to create much more training data through label propagation from the few labeled examples to a vast collection of unannotated images. The main contribution of the paper is to show that such a label propagation scheme can be highly effective when the similarity metric used for propagation is transferred from other related domains. We test various combinations of supervised and unsupervised metric learning methods with various label propagation algorithms. We find that our framework is very generic and not sensitive to any specific technique. By taking advantage of unlabeled data in this way, we achieve significant improvements on all three tasks.
Tasks Metric Learning, Object Recognition, Transfer Learning
Published 2018-12-20
URL https://arxiv.org/abs/1812.08781v2
PDF https://arxiv.org/pdf/1812.08781v2.pdf
PWC https://paperswithcode.com/paper/deep-metric-transfer-for-label-propagation
Repo https://github.com/microsoft/metric-transfer.pytorch
Framework pytorch
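
The propagation step itself is classical; the paper's point is that the affinities should come from a metric transferred from a related domain. A hedged sketch using the well-known iterative scheme of Zhou et al. (one of several propagation algorithms the paper tests), assuming L2-normalized features from the transferred metric network:

```python
import numpy as np

def propagate_labels(features, labels, labeled_idx, n_classes,
                     alpha=0.99, iters=50):
    # features: (n, d) L2-normalized embeddings from the transferred metric.
    S = features @ features.T                 # cosine affinities
    np.fill_diagonal(S, 0)
    S = np.maximum(S, 0)
    d = S.sum(1) + 1e-8
    W = S / np.sqrt(np.outer(d, d))           # symmetric normalization
    Y = np.zeros((features.shape[0], n_classes))
    Y[labeled_idx, labels[labeled_idx]] = 1   # seed with the few labels
    F = Y.copy()
    for _ in range(iters):
        F = alpha * W @ F + (1 - alpha) * Y   # diffuse, stay anchored to seeds
    return F.argmax(1)                        # pseudo-labels for the unlabeled set
```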

Modeling Uncertainty with Hedged Instance Embedding

Title Modeling Uncertainty with Hedged Instance Embedding
Authors Seong Joon Oh, Kevin Murphy, Jiyan Pan, Joseph Roth, Florian Schroff, Andrew Gallagher
Abstract Instance embeddings are an efficient and versatile image representation that facilitates applications like recognition, verification, retrieval, and clustering. Many metric learning methods represent the input as a single point in the embedding space. Often the distance between points is used as a proxy for match confidence. However, this can fail to represent uncertainty arising when the input is ambiguous, e.g., due to occlusion or blurriness. This work addresses this issue and explicitly models the uncertainty by hedging the location of each input in the embedding space. We introduce the hedged instance embedding (HIB) in which embeddings are modeled as random variables and the model is trained under the variational information bottleneck principle. Empirical results on our new N-digit MNIST dataset show that our method leads to the desired behavior of hedging its bets across the embedding space upon encountering ambiguous inputs. This results in improved performance for image matching and classification tasks, more structure in the learned embedding space, and an ability to compute a per-exemplar uncertainty measure that is correlated with downstream performance.
Tasks Metric Learning
Published 2018-09-30
URL https://arxiv.org/abs/1810.00319v6
PDF https://arxiv.org/pdf/1810.00319v6.pdf
PWC https://paperswithcode.com/paper/modeling-uncertainty-with-hedged-instance
Repo https://github.com/google/n-digit-mnist
Framework none
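
The abstract names the two ingredients precisely: embeddings as random variables, and training under the variational information bottleneck. A hedged sketch, assuming diagonal Gaussian embeddings, a sigmoid-of-distance match model, and a unit-Gaussian prior (the weighting and the match model are my assumptions, not the paper's exact choices):

```python
import torch

def match_prob(mu1, sigma1, mu2, sigma2, a, b, K=8):
    # Monte Carlo estimate of p(match): sample K embeddings per input via
    # the reparameterization trick, average a sigmoid match model over pairs.
    z1 = mu1 + sigma1 * torch.randn(K, mu1.shape[-1])
    z2 = mu2 + sigma2 * torch.randn(K, mu2.shape[-1])
    d = torch.cdist(z1, z2)                   # (K, K) sample-pair distances
    return torch.sigmoid(-a * d + b).mean().clamp(1e-6, 1 - 1e-6)

def hib_loss(mu1, sigma1, mu2, sigma2, is_match, a, b, beta=1e-3):
    p = match_prob(mu1, sigma1, mu2, sigma2, a, b)
    nll = -(is_match * p.log() + (1 - is_match) * (1 - p).log())
    # VIB-style regularizer: KL(N(mu, sigma^2) || N(0, I)) for each input.
    kl = 0.5 * sum((m**2 + s**2 - 2 * s.log() - 1).sum()
                   for m, s in [(mu1, sigma1), (mu2, sigma2)])
    return nll + beta * kl
```

An ambiguous input then learns a large sigma, spreading ("hedging") its samples across the embedding space, which is the behavior the abstract describes.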

Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework

Title Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
Authors Jie Chen, Cheen-Hau Tan, Junhui Hou, Lap-Pui Chau, He Li
Abstract Rain removal is important for improving the robustness of outdoor vision-based systems. Current rain removal methods show limitations either for complex dynamic scenes shot from fast-moving cameras, or under torrential rainfall with opaque occlusions. We propose a novel derain algorithm, which applies superpixel (SP) segmentation to decompose the scene into depth-consistent units. Alignment of scene content is done at the SP level, which proves robust to rain occlusion and fast camera motion. Two alignment output tensors, i.e., the optimal temporal match tensor and the sorted spatial-temporal match tensor, provide informative clues for rain streak location and occluded background content, and are used to generate an intermediate derain output. These tensors are subsequently used as input features for a convolutional neural network that restores high-frequency details to the intermediate output, compensating for mis-alignment blur. Extensive evaluations show that up to a 5 dB reconstruction PSNR advantage is achieved over state-of-the-art methods. Visual inspection shows that much cleaner rain removal is achieved, especially for highly dynamic scenes with heavy and opaque rainfall shot from a fast-moving camera.
Tasks Rain Removal
Published 2018-03-28
URL http://arxiv.org/abs/1803.10433v1
PDF http://arxiv.org/pdf/1803.10433v1.pdf
PWC https://paperswithcode.com/paper/robust-video-content-alignment-and-1
Repo https://github.com/hotndy/SPAC-SupplementaryMaterials
Framework none

Stochastic Adversarial Video Prediction

Title Stochastic Adversarial Video Prediction
Authors Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine
Abstract Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging – the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) latent variational variable models that explicitly model underlying stochasticity and (b) adversarially-trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially-trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary. Combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these aspects.
Tasks Representation Learning, Video Prediction
Published 2018-04-04
URL http://arxiv.org/abs/1804.01523v1
PDF http://arxiv.org/pdf/1804.01523v1.pdf
PWC https://paperswithcode.com/paper/stochastic-adversarial-video-prediction
Repo https://github.com/alexlee-gk/video_prediction
Framework tf
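
The complementarity argument reduces to an additive objective: a VAE-style term (reconstruction plus a KL on the latent) for coverage of possible futures, plus a GAN term for realism. A schematic per-step generator loss, with weights and loss choices assumed rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def savp_generator_loss(pred, target, mu, logvar, disc_fake_logits,
                        lambda_kl=1e-4, lambda_gan=1e-2):
    recon = F.l1_loss(pred, target)            # match the observed future
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    gan = F.binary_cross_entropy_with_logits(  # fool the frame discriminator
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    return recon + lambda_kl * kl + lambda_gan * gan
```

Dropping the GAN term recovers a plain stochastic (VAE-style) predictor; dropping the reconstruction and KL terms recovers a plain adversarial predictor, which is exactly the pair of baselines the paper argues are individually insufficient.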

Rain Removal in Traffic Surveillance: Does it Matter?

Title Rain Removal in Traffic Surveillance: Does it Matter?
Authors Chris H. Bahnsen, Thomas B. Moeslund
Abstract Varying weather conditions, including rainfall and snowfall, are generally regarded as a challenge for computer vision algorithms. One proposed solution to the challenges induced by rain and snowfall is to artificially remove the rain from images or video using rain removal algorithms. The promise of these algorithms is that the rain-removed image frames will improve the performance of subsequent segmentation and tracking algorithms. However, rain removal algorithms are typically evaluated on their ability to remove synthetic rain on a small subset of images, and their behavior on real-world videos, when integrated with a typical computer vision pipeline, is currently unknown. In this paper, we review the existing rain removal algorithms and propose a new dataset that consists of 22 traffic surveillance sequences under a broad variety of weather conditions that all include either rain or snowfall. We propose a new evaluation protocol that assesses rain removal algorithms on their ability to improve the performance of subsequent segmentation, instance segmentation, and feature tracking algorithms under rain and snow. If successful, the de-rained frames of a rain removal algorithm should improve segmentation performance and increase the number of accurately tracked features. The results show that a recent single-frame-based rain removal algorithm increases segmentation performance by 19.7% on our proposed dataset, but decreases feature tracking performance and shows mixed results with recent instance segmentation methods. However, the best video-based rain removal algorithm improves feature tracking accuracy by 7.72%.
Tasks Instance Segmentation, Rain Removal, Semantic Segmentation
Published 2018-10-30
URL http://arxiv.org/abs/1810.12574v1
PDF http://arxiv.org/pdf/1810.12574v1.pdf
PWC https://paperswithcode.com/paper/rain-removal-in-traffic-surveillance-does-it
Repo https://github.com/chrisbahnsen/aau-rainsnow-eval
Framework none
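
The proposed protocol is easy to state in code: score the downstream task on raw frames and on de-rained frames, and report the difference. All callables below are placeholders standing in for a derain method, a segmentation (or tracking) model, and its metric.

```python
def derain_benefit(frames, labels, derain, segment, score):
    # Positive return value = the rain removal step helped the pipeline.
    raw = score(segment(frames), labels)
    derained = score(segment(derain(frames)), labels)
    return derained - raw
```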

A Large Dataset for Improving Patch Matching

Title A Large Dataset for Improving Patch Matching
Authors Rahul Mitra, Nehal Doiphode, Utkarsh Gautam, Sanath Narayan, Shuaib Ahmed, Sharat Chandran, Arjun Jain
Abstract We propose a new dataset for learning local image descriptors which can be used for significantly improved patch matching. Our proposed dataset consists of an order of magnitude more scenes, images, and positive and negative correspondences than the currently available Multi-View Stereo (MVS) dataset from Brown et al. The new dataset also has better coverage of overall viewpoint, scale, and lighting changes in comparison to the MVS dataset. Our dataset additionally provides supplementary information such as RGB patches with scale and rotation values, and intrinsic and extrinsic camera parameters, which, as shown later, can be used to customize training data per application. We train an existing state-of-the-art model on our dataset and evaluate on publicly available benchmarks such as the HPatches dataset and the Strecha et al. dataset to quantify image descriptor performance. Experimental evaluations show that descriptors trained on our proposed dataset outperform the current state-of-the-art descriptors trained on MVS by 8%, 4%, and 10% on the matching, verification, and retrieval tasks, respectively, on the HPatches dataset. Similarly, on the Strecha dataset, we see an improvement of 3-5% for the matching task in non-planar scenes.
Tasks
Published 2018-01-04
URL http://arxiv.org/abs/1801.01466v3
PDF http://arxiv.org/pdf/1801.01466v3.pdf
PWC https://paperswithcode.com/paper/a-large-dataset-for-improving-patch-matching
Repo https://github.com/rmitra/PS-Dataset
Framework none

Bayesian Compression for Natural Language Processing

Title Bayesian Compression for Natural Language Processing
Authors Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
Abstract In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows proportionally to the vocabulary size. We propose a Bayesian sparsification technique for RNNs which allows compressing the RNN dozens or hundreds of times without time-consuming hyperparameter tuning. We also generalize the model for vocabulary sparsification, filtering out unnecessary words to compress the RNN even further. We show that the choice of retained words is interpretable. Code is available on GitHub: https://github.com/tipt0p/SparseBayesianRNN
Tasks
Published 2018-10-25
URL http://arxiv.org/abs/1810.10927v2
PDF http://arxiv.org/pdf/1810.10927v2.pdf
PWC https://paperswithcode.com/paper/bayesian-compression-for-natural-language
Repo https://github.com/ars-ashuha/variational-dropout-sparsifies-dnn
Framework tf
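
The sparsification family the abstract builds on (sparse variational dropout) decides pruning from per-weight posteriors, which is easy to sketch: a weight is dropped when its noise-to-signal ratio log α = log σ² − log μ² exceeds a threshold, and a vocabulary word can be dropped when its whole embedding row is pruned, which is how vocabulary sparsification falls out. The threshold and details below are conventional assumptions, not the paper's exact recipe.

```python
import torch

def prune(mu, log_sigma2, threshold=3.0):
    # log alpha > threshold  =>  noise dominates the weight  =>  set it to 0.
    log_alpha = log_sigma2 - torch.log(mu.pow(2) + 1e-8)
    mask = (log_alpha < threshold).float()
    return mu * mask, mask

def kept_words(embedding_mask):
    # A word survives vocabulary sparsification if any weight in its
    # embedding row survives pruning.
    return (embedding_mask > 0).any(dim=1).nonzero().flatten()
```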

Inferencing Based on Unsupervised Learning of Disentangled Representations

Title Inferencing Based on Unsupervised Learning of Disentangled Representations
Authors Tobias Hinz, Stefan Wermter
Abstract Combining Generative Adversarial Networks (GANs) with encoders that learn to encode data points has shown promising results in learning data representations in an unsupervised way. We propose a framework that combines an encoder and a generator to learn disentangled representations which encode meaningful information about the data distribution without the need for any labels. While current approaches focus mostly on the generative aspects of GANs, our framework can be used to perform inference on both real and generated data points. Experiments on several data sets show that the encoder learns interpretable, disentangled representations which encode descriptive properties and can be used to sample images that exhibit specific characteristics.
Tasks Representation Learning, Unsupervised Image Classification, Unsupervised MNIST, Unsupervised Representation Learning
Published 2018-03-07
URL http://arxiv.org/abs/1803.02627v1
PDF http://arxiv.org/pdf/1803.02627v1.pdf
PWC https://paperswithcode.com/paper/inferencing-based-on-unsupervised-learning-of
Repo https://github.com/tohinz/Bidirectional-InfoGAN
Framework tf
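
A minimal sketch of the encoder-plus-generator training signal, in the spirit of a bidirectional GAN: the discriminator judges (image, code) pairs, so fooling it pushes the encoder and generator toward inverting each other without any labels. D, E, and G are placeholder modules; the paper's InfoGAN-style disentanglement terms are omitted here.

```python
import torch
from torch.nn.functional import binary_cross_entropy_with_logits as bce

def bigan_losses(D, E, G, x_real, z_prior):
    x_fake = G(z_prior)             # generate an image from a sampled code
    z_real = E(x_real)              # infer a code for a real image
    d_real = D(x_real, z_real)      # discriminator sees joint (image, code)
    d_fake = D(x_fake, z_prior)
    d_loss = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    # Encoder and generator are trained jointly to fool the discriminator.
    eg_loss = bce(d_real, torch.zeros_like(d_real)) + \
              bce(d_fake, torch.ones_like(d_fake))
    return d_loss, eg_loss
```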

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

Title Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
Authors Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev
Abstract We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains. We define a new complex and cross-domain semantic parsing and text-to-SQL task in which different complex SQL queries and databases appear in the train and test sets. In this way, the task requires the model to generalize well to both new SQL queries and new database schemas. Spider is distinct from most previous semantic parsing tasks, which all use a single database and the exact same programs in the train and test sets. We experiment with various state-of-the-art models, and the best model achieves only 12.4% exact matching accuracy in the database split setting. This shows that Spider presents a strong challenge for future research. Our dataset and task are publicly available at https://yale-lily.github.io/spider
Tasks Semantic Parsing, Text-To-Sql
Published 2018-09-24
URL http://arxiv.org/abs/1809.08887v5
PDF http://arxiv.org/pdf/1809.08887v5.pdf
PWC https://paperswithcode.com/paper/spider-a-large-scale-human-labeled-dataset
Repo https://github.com/taoyds/spider
Framework tf