Paper Group AWR 216
Contents: Pattern Search Multidimensional Scaling · Towards Universal Dialogue State Tracking · A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization · Fair and Diverse DPP-based Data Summarization · Localized Structured Prediction · Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control · Deep Metric Transfer for Label Propagation with Limited Annotated Data · Modeling Uncertainty with Hedged Instance Embedding · Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework · Stochastic Adversarial Video Prediction · Rain Removal in Traffic Surveillance: Does it Matter? · A Large Dataset for Improving Patch Matching · Bayesian Compression for Natural Language Processing · Inferencing Based on Unsupervised Learning of Disentangled Representations · Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
Pattern Search Multidimensional Scaling
Title | Pattern Search Multidimensional Scaling |
Authors | Georgios Paraskevopoulos, Efthymios Tzinis, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Alexandros Potamianos |
Abstract | We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient descent, we sample and evaluate possible “moves” in a sphere of fixed radius for each point in the embedded space. A fixed-point convergence guarantee can be shown by formulating the proposed algorithm as an instance of the General Pattern Search (GPS) framework. Evaluation on both clean and noisy synthetic datasets shows that pattern search MDS can accurately infer the intrinsic geometry of manifolds embedded in high-dimensional spaces. Additionally, experiments on real data, even under noisy conditions, demonstrate that the proposed pattern search MDS yields state-of-the-art results. |
Tasks | |
Published | 2018-06-01 |
URL | https://arxiv.org/abs/1806.00416v3 |
PDF | https://arxiv.org/pdf/1806.00416v3.pdf |
PWC | https://paperswithcode.com/paper/pattern-search-multidimensional-scaling |
Repo | https://github.com/georgepar/pattern-search-mds |
Framework | none |
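The core loop the abstract describes (evaluating candidate "moves" on a sphere of fixed radius around each embedded point, and shrinking the radius when nothing improves) is compact enough to sketch. The axis-aligned move set and the halving radius schedule below are illustrative assumptions, not necessarily the authors' exact GPS instantiation:

```python
# A minimal sketch of pattern-search MDS, assuming axis-aligned +/- moves
# and a halving radius schedule; the paper's move sampling may differ.
import numpy as np

def stress(X, D):
    """Raw stress: squared mismatch between embedded and target distances."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sum((d - D) ** 2)

def pattern_search_mds(D, dim=2, radius=1.0, shrink=0.5, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.standard_normal((n, dim))
    while radius > tol:
        improved = False
        for i in range(n):
            best, best_move = stress(X, D), None
            for axis in range(dim):
                for sign in (+1.0, -1.0):
                    X[i, axis] += sign * radius   # try a move on the sphere
                    s = stress(X, D)
                    if s < best:
                        best, best_move = s, (axis, sign)
                    X[i, axis] -= sign * radius   # revert the trial move
            if best_move is not None:
                axis, sign = best_move
                X[i, axis] += sign * radius       # keep the best improving move
                improved = True
        if not improved:
            radius *= shrink  # no move helped anywhere: contract the search sphere
    return X
```

Given a pairwise distance matrix `D`, `pattern_search_mds(D)` returns a 2-D embedding; since a move is applied only if it lowers the stress, the loop mirrors the monotone descent that the GPS formulation's fixed-point guarantee relies on.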
Towards Universal Dialogue State Tracking
Title | Towards Universal Dialogue State Tracking |
Authors | Liliang Ren, Kaige Xie, Lu Chen, Kai Yu |
Abstract | Dialogue state tracking is the core part of a spoken dialogue system. It estimates the beliefs of the user’s possible goals at every dialogue turn. However, most current approaches are difficult to scale to large dialogue domains. They have one or more of the following limitations: (a) some models do not work when slot values in the ontology change dynamically; (b) the number of model parameters is proportional to the number of slots; (c) some models extract features based on hand-crafted lexicons. To tackle these challenges, we propose StateNet, a universal dialogue state tracker. It is independent of the number of values, shares parameters across all slots, and uses pre-trained word vectors instead of explicit semantic dictionaries. Our experiments on two datasets show that our approach not only overcomes these limitations, but also significantly outperforms state-of-the-art approaches. |
Tasks | Dialogue State Tracking |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09587v1 |
PDF | http://arxiv.org/pdf/1810.09587v1.pdf |
PWC | https://paperswithcode.com/paper/towards-universal-dialogue-state-tracking |
Repo | https://github.com/renll/StateNet |
Framework | mxnet |
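The three limitations in the abstract suggest the shape of the tracker: one network shared across all slots whose output is compared against pre-trained word vectors of candidate values, so neither the parameter count nor the output layer depends on the ontology. A hedged sketch of that scoring idea follows; layer sizes and the turn encoder are placeholders, not the published StateNet architecture:

```python
# Sketch of a slot-independent, value-count-independent tracker head.
import torch
import torch.nn as nn

class UniversalTracker(nn.Module):
    def __init__(self, turn_dim, word_dim, hidden=128):
        super().__init__()
        # Shared across all slots: parameters do not grow with the ontology.
        self.fuse = nn.Sequential(
            nn.Linear(turn_dim + word_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, word_dim),
        )

    def forward(self, turn_repr, slot_vec, value_vecs):
        """Score candidate values by distance in word-vector space.

        turn_repr:  (turn_dim,)  encoding of the current dialogue turn
        slot_vec:   (word_dim,)  pre-trained embedding of the slot name
        value_vecs: (num_values, word_dim) embeddings of candidate values
        """
        query = self.fuse(torch.cat([turn_repr, slot_vec]))
        scores = -torch.norm(value_vecs - query, dim=-1)
        return scores.softmax(dim=-1)  # belief over candidate values
```

Because values enter only through their word vectors, the ontology can change at test time without retraining, which is the property the abstract highlights.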
A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization
Title | A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization |
Authors | Parag Jain, Anirban Laha, Karthik Sankaranarayanan, Preksha Nema, Mitesh M. Khapra, Shreyas Shetty |
Abstract | Structured data summarization involves generation of natural language summaries from structured input data. In this work, we consider summarizing structured data occurring in the form of tables, as they are prevalent across a wide variety of domains. We formulate the standard table summarization problem, which deals with tables conforming to a single predefined schema. To this end, we propose a mixed hierarchical attention based encoder-decoder model which is able to leverage the structure in addition to the content of the tables. Our experiments on the publicly available WEATHERGOV dataset show an improvement of around 18 BLEU points (~30%) over the current state-of-the-art. |
Tasks | Data Summarization |
Published | 2018-04-20 |
URL | http://arxiv.org/abs/1804.07790v1 |
PDF | http://arxiv.org/pdf/1804.07790v1.pdf |
PWC | https://paperswithcode.com/paper/a-mixed-hierarchical-attention-based-encoder |
Repo | https://github.com/parajain/StructuredData_To_Descriptions |
Framework | tf |
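To illustrate what "leveraging structure in addition to content" can mean for a table, here is a toy mixing of field-level and word-level attention. This is an assumption-laden sketch of how two attention distributions could be combined; the actual model's encoder-decoder details differ:

```python
# Toy two-level ("mixed hierarchical") attention over a table: attend over
# fields (structure) and over words within fields (content), then mix.
import torch
import torch.nn.functional as F

def mixed_hierarchical_attention(decoder_state, field_keys, word_keys, word_values):
    """
    decoder_state: (d,)
    field_keys:    (F, d)      one key per table field
    word_keys:     (F, W, d)   keys for words inside each field
    word_values:   (F, W, d)   content vectors for those words
    """
    field_attn = F.softmax(field_keys @ decoder_state, dim=0)     # (F,)
    word_attn = F.softmax(word_keys @ decoder_state, dim=-1)      # (F, W)
    mixed = field_attn[:, None] * word_attn                       # weight words by their field
    mixed = mixed / mixed.sum()                                   # renormalize to a distribution
    context = (mixed[..., None] * word_values).sum(dim=(0, 1))    # (d,) context vector
    return context
```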
Fair and Diverse DPP-based Data Summarization
Title | Fair and Diverse DPP-based Data Summarization |
Authors | L. Elisa Celis, Vijay Keswani, Damian Straszak, Amit Deshpande, Tarun Kathuria, Nisheeth K. Vishnoi |
Abstract | Sampling methods that choose a subset of the data proportional to its diversity in the feature space are popular for data summarization. However, recent studies have noted the occurrence of bias (under- or over-representation of a certain gender or race) in such data summarization methods. In this paper we initiate a study of the problem of outputting a diverse and fair summary of a given dataset. We work with a well-studied determinantal measure of diversity and the corresponding distributions (DPPs) and present a framework that allows us to incorporate a general class of fairness constraints into such distributions. Sampling efficiently from these constrained determinantal distributions, however, faces a complexity barrier; we present a fast sampler that is provably good when the input vectors satisfy a natural property. Our experimental results on a real-world dataset and an image dataset show that the diversity of the samples produced by adding fairness constraints is not too far from the unconstrained case, and we also provide a theoretical explanation of this behavior. |
Tasks | Data Summarization |
Published | 2018-02-12 |
URL | http://arxiv.org/abs/1802.04023v1 |
PDF | http://arxiv.org/pdf/1802.04023v1.pdf |
PWC | https://paperswithcode.com/paper/fair-and-diverse-dpp-based-data-summarization |
Repo | https://github.com/DamianStraszak/FairDiverseDPPSampling |
Framework | none |
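As a concrete, if simplified, picture of "diverse and fair" selection, the sketch below greedily maximizes a determinantal objective under per-group caps (a partition-style fairness constraint). This greedy MAP heuristic is assumed purely for illustration; it is not the paper's provably-good sampler for constrained DPPs:

```python
# Greedy determinantal selection under per-group caps (illustrative only).
import numpy as np

def fair_greedy_dpp(L, groups, caps, k):
    """Greedily pick k items maximizing det(L[S, S]) subject to per-group caps.

    L:      (n, n) PSD kernel encoding item similarity/diversity
    groups: (n,) group label of each item; every label must appear in caps
    caps:   dict group -> max items allowed from that group
    """
    n = L.shape[0]
    selected, counts = [], {g: 0 for g in caps}
    for _ in range(k):
        best_gain, best_i = -np.inf, None
        for i in range(n):
            g = groups[i]
            if i in selected or counts[g] >= caps[g]:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
            if sign > 0 and logdet > best_gain:
                best_gain, best_i = logdet, i
        if best_i is None:
            break  # no feasible item improves the determinant
        selected.append(best_i)
        counts[groups[best_i]] += 1
    return selected
```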
Localized Structured Prediction
Title | Localized Structured Prediction |
Authors | Carlo Ciliberto, Francis Bach, Alessandro Rudi |
Abstract | Key to structured prediction is exploiting the problem structure to simplify the learning process. A major challenge arises when data exhibit a local structure (e.g., are made of “parts”) that can be leveraged to better approximate the relation between (parts of) the input and (parts of) the output. Recent literature on signal processing, and in particular computer vision, has shown that capturing these aspects is indeed essential to achieve state-of-the-art performance. While such algorithms are typically derived on a case-by-case basis, in this work we propose the first theoretical framework to deal with part-based data from a general perspective. We derive a novel approach to deal with these problems and study its generalization properties within the setting of statistical learning theory. Our analysis is novel in that it explicitly quantifies the benefits of leveraging the part-based structure of the problem with respect to the learning rates of the proposed estimator. |
Tasks | Structured Prediction |
Published | 2018-06-06 |
URL | https://arxiv.org/abs/1806.02402v3 |
PDF | https://arxiv.org/pdf/1806.02402v3.pdf |
PWC | https://paperswithcode.com/paper/localized-structured-prediction |
Repo | https://github.com/cciliber/localized-structured-prediction |
Framework | none |
Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control
Title | Bayesian QuickNAT: Model Uncertainty in Deep Whole-Brain Segmentation for Structure-wise Quality Control |
Authors | Abhijit Guha Roy, Sailesh Conjeti, Nassir Navab, Christian Wachinger |
Abstract | We introduce Bayesian QuickNAT for the automated quality control of whole-brain segmentation on T1-weighted MRI scans. In addition to the Bayesian fully convolutional neural network, we present inherent measures of segmentation uncertainty that allow for quality control per brain structure. For estimating model uncertainty, we follow a Bayesian approach wherein Monte Carlo (MC) samples from the posterior distribution are generated by keeping the dropout layers active at test time. Entropy over the MC samples provides a voxel-wise model uncertainty map, whereas the expectation over the MC predictions provides the final segmentation. Beyond voxel-wise uncertainty, we introduce four metrics to quantify structure-wise uncertainty in segmentation for quality control. We report experiments on four out-of-sample datasets comprising diverse age ranges, pathologies, and imaging artifacts. The proposed structure-wise uncertainty metrics are highly correlated with the Dice score estimated against manual annotation and therefore present an inherent measure of segmentation quality. In particular, the intersection over union over all the MC samples is a suitable proxy for the Dice score. In addition to quality control at scan level, we propose to incorporate the structure-wise uncertainty as a measure of confidence for reliable group analysis on large data repositories. We envisage that the introduced uncertainty metrics will help assess the fidelity of automated deep-learning-based segmentation methods for large-scale population studies, as they enable automated quality control and group analyses when processing large data repositories. |
Tasks | Brain Segmentation |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09800v1 |
PDF | http://arxiv.org/pdf/1811.09800v1.pdf |
PWC | https://paperswithcode.com/paper/bayesian-quicknat-model-uncertainty-in-deep |
Repo | https://github.com/abhi4ssj/BayesianQuickNAT |
Framework | none |
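The MC-dropout recipe in the abstract is compact enough to sketch directly: keep dropout active at test time, average T stochastic softmax outputs for the segmentation, and use voxel-wise entropy as the uncertainty map. A minimal sketch, where `model` stands in for any segmentation network with dropout layers:

```python
# MC-dropout inference: mean prediction plus voxel-wise entropy map.
import torch

def mc_dropout_segment(model, scan, T=10, eps=1e-8):
    model.train()  # keeps nn.Dropout active; in practice freeze batch-norm stats
    with torch.no_grad():
        probs = torch.stack([model(scan).softmax(dim=1) for _ in range(T)])  # (T,B,C,...)
    mean = probs.mean(dim=0)
    segmentation = mean.argmax(dim=1)                   # expectation over MC predictions
    entropy = -(mean * (mean + eps).log()).sum(dim=1)   # voxel-wise uncertainty
    return segmentation, entropy, probs
```

The structure-wise metrics the paper proposes, such as the intersection over union of a structure's mask across the T samples, can be computed from the returned `probs` tensor.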
Deep Metric Transfer for Label Propagation with Limited Annotated Data
Title | Deep Metric Transfer for Label Propagation with Limited Annotated Data |
Authors | Bin Liu, Zhirong Wu, Han Hu, Stephen Lin |
Abstract | We study object recognition under the constraint that each object class is represented by only very few observations. Semi-supervised learning, transfer learning, and few-shot recognition are all concerned with achieving fast generalization from few labeled examples. In this paper, we propose a generic framework that utilizes unlabeled data to aid generalization in all three settings. Our approach is to create much more training data through label propagation from the few labeled examples to a vast collection of unannotated images. The main contribution of the paper is to show that such a label propagation scheme can be highly effective when the similarity metric used for propagation is transferred from other, related domains. We test various combinations of supervised and unsupervised metric learning methods with various label propagation algorithms. We find that our framework is very generic and not sensitive to any specific technique. By taking advantage of unlabeled data in this way, we achieve significant improvements on all three tasks. |
Tasks | Metric Learning, Object Recognition, Transfer Learning |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08781v2 |
PDF | https://arxiv.org/pdf/1812.08781v2.pdf |
PWC | https://paperswithcode.com/paper/deep-metric-transfer-for-label-propagation |
Repo | https://github.com/microsoft/metric-transfer.pytorch |
Framework | pytorch |
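A minimal sketch of the propagation step, assuming the classic Zhou et al. label-spreading rule over a kNN graph built in the transferred embedding space; the paper itself compares several metric learning methods and propagation variants, so treat this as one representative instantiation:

```python
# Label spreading over a kNN graph built with transferred embeddings.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def propagate_labels(embeddings, y, labeled_idx, n_classes, k=20, alpha=0.99, iters=50):
    # Symmetrically normalized affinity over a kNN graph in the transferred metric.
    W = kneighbors_graph(embeddings, k, mode="connectivity").toarray()
    W = np.maximum(W, W.T)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d) + 1e-12)

    Y = np.zeros((len(embeddings), n_classes))
    Y[labeled_idx, y[labeled_idx]] = 1.0  # one-hot seeds from the few labels
    F = Y.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y  # diffuse, anchored at the seeds
    return F.argmax(axis=1)  # pseudo-labels for the unannotated images
```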
Modeling Uncertainty with Hedged Instance Embedding
Title | Modeling Uncertainty with Hedged Instance Embedding |
Authors | Seong Joon Oh, Kevin Murphy, Jiyan Pan, Joseph Roth, Florian Schroff, Andrew Gallagher |
Abstract | Instance embeddings are an efficient and versatile image representation that facilitates applications like recognition, verification, retrieval, and clustering. Many metric learning methods represent the input as a single point in the embedding space. Often the distance between points is used as a proxy for match confidence. However, this can fail to represent uncertainty arising when the input is ambiguous, e.g., due to occlusion or blurriness. This work addresses this issue and explicitly models the uncertainty by hedging the location of each input in the embedding space. We introduce the hedged instance embedding (HIB) in which embeddings are modeled as random variables and the model is trained under the variational information bottleneck principle. Empirical results on our new N-digit MNIST dataset show that our method leads to the desired behavior of hedging its bets across the embedding space upon encountering ambiguous inputs. This results in improved performance for image matching and classification tasks, more structure in the learned embedding space, and an ability to compute a per-exemplar uncertainty measure that is correlated with downstream performance. |
Tasks | Metric Learning |
Published | 2018-09-30 |
URL | https://arxiv.org/abs/1810.00319v6 |
PDF | https://arxiv.org/pdf/1810.00319v6.pdf |
PWC | https://paperswithcode.com/paper/modeling-uncertainty-with-hedged-instance |
Repo | https://github.com/google/n-digit-mnist |
Framework | none |
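The "hedging" idea can be illustrated in a few lines: each input is embedded as a Gaussian rather than a point, and match confidence marginalizes over that uncertainty via Monte Carlo samples. The soft-match form below follows the spirit of HIB, with the calibration constants `a` and `b` (learned in the paper) fixed here for illustration:

```python
# Match probability between two stochastic (Gaussian) embeddings.
import torch

def match_probability(mu1, sigma1, mu2, sigma2, K=8, a=1.0, b=0.0):
    """P(match) averaged over K samples from each Gaussian embedding."""
    z1 = mu1 + sigma1 * torch.randn(K, *mu1.shape)       # samples for input 1
    z2 = mu2 + sigma2 * torch.randn(K, *mu2.shape)       # samples for input 2
    d = torch.norm(z1[:, None] - z2[None, :], dim=-1)    # (K, K) pairwise distances
    return torch.sigmoid(-a * d + b).mean()              # marginalized match score
```

Training in the paper additionally regularizes the posteriors toward a unit-Gaussian prior with a KL term, per the variational information bottleneck; that term is omitted here.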
Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
Title | Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework |
Authors | Jie Chen, Cheen-Hau Tan, Junhui Hou, Lap-Pui Chau, He Li |
Abstract | Rain removal is important for improving the robustness of outdoor vision-based systems. Current rain removal methods show limitations either for complex dynamic scenes shot from fast-moving cameras, or under torrential rainfall with opaque occlusions. We propose a novel derain algorithm, which applies superpixel (SP) segmentation to decompose the scene into depth-consistent units. Alignment of scene content is done at the SP level, which proves to be robust to rain occlusion and fast camera motion. Two alignment output tensors, i.e., the optimal temporal match tensor and the sorted spatial-temporal match tensor, provide informative clues about rain streak locations and occluded background content, and are used to generate an intermediate derain output. These tensors are subsequently prepared as input features for a convolutional neural network that restores high-frequency details to the intermediate output, compensating for mis-alignment blur. Extensive evaluations show that up to a 5 dB reconstruction PSNR advantage is achieved over state-of-the-art methods. Visual inspection shows that much cleaner rain removal is achieved, especially for highly dynamic scenes with heavy and opaque rainfall shot from a fast-moving camera. |
Tasks | Rain Removal |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10433v1 |
PDF | http://arxiv.org/pdf/1803.10433v1.pdf |
PWC | https://paperswithcode.com/paper/robust-video-content-alignment-and-1 |
Repo | https://github.com/hotndy/SPAC-SupplementaryMaterials |
Framework | none |
Stochastic Adversarial Video Prediction
Title | Stochastic Adversarial Video Prediction |
Authors | Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine |
Abstract | Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging: the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) variational latent variable models that explicitly model underlying stochasticity and (b) adversarially trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary: combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these respects. |
Tasks | Representation Learning, Video Prediction |
Published | 2018-04-04 |
URL | http://arxiv.org/abs/1804.01523v1 |
PDF | http://arxiv.org/pdf/1804.01523v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-adversarial-video-prediction |
Repo | https://github.com/alexlee-gk/video_prediction |
Framework | tf |
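The complementarity argument boils down to a training objective with both a variational term (reconstruction plus KL, for stochasticity and coverage) and an adversarial term (for realism). A schematic of such a combined loss, with placeholder weights and inputs rather than the actual recurrent SAVP networks:

```python
# Schematic VAE-GAN-style loss: reconstruction + KL + adversarial term.
import torch
import torch.nn.functional as F

def savp_style_loss(x_real, x_pred, mu, logvar, d_fake_logits,
                    kl_weight=0.1, gan_weight=0.01):
    recon = F.l1_loss(x_pred, x_real)                                   # fit the observed frames
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())       # keep the latent informative
    gan = F.binary_cross_entropy_with_logits(                           # fool the discriminator
        d_fake_logits, torch.ones_like(d_fake_logits))
    return recon + kl_weight * kl + gan_weight * gan
```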
Rain Removal in Traffic Surveillance: Does it Matter?
Title | Rain Removal in Traffic Surveillance: Does it Matter? |
Authors | Chris H. Bahnsen, Thomas B. Moeslund |
Abstract | Varying weather conditions, including rainfall and snowfall, are generally regarded as a challenge for computer vision algorithms. One proposed solution to the challenges induced by rain and snowfall is to artificially remove the rain from images or video using rain removal algorithms. The promise of these algorithms is that the rain-removed image frames will improve the performance of subsequent segmentation and tracking algorithms. However, rain removal algorithms are typically evaluated only on their ability to remove synthetic rain from a small subset of images; their behavior on real-world videos, when integrated into a typical computer vision pipeline, is currently unknown. In this paper, we review the existing rain removal algorithms and propose a new dataset consisting of 22 traffic surveillance sequences under a broad variety of weather conditions, all of which include either rain or snowfall. We propose a new evaluation protocol that assesses rain removal algorithms by their ability to improve the performance of subsequent segmentation, instance segmentation, and feature tracking algorithms under rain and snow. If successful, the de-rained frames of a rain removal algorithm should improve segmentation performance and increase the number of accurately tracked features. The results show that a recent single-frame-based rain removal algorithm increases segmentation performance by 19.7% on our proposed dataset, but decreases feature tracking performance and shows mixed results with recent instance segmentation methods. The best video-based rain removal algorithm, however, improves feature tracking accuracy by 7.72%. |
Tasks | Instance Segmentation, Rain Removal, Semantic Segmentation |
Published | 2018-10-30 |
URL | http://arxiv.org/abs/1810.12574v1 |
PDF | http://arxiv.org/pdf/1810.12574v1.pdf |
PWC | https://paperswithcode.com/paper/rain-removal-in-traffic-surveillance-does-it |
Repo | https://github.com/chrisbahnsen/aau-rainsnow-eval |
Framework | none |
A Large Dataset for Improving Patch Matching
Title | A Large Dataset for Improving Patch Matching |
Authors | Rahul Mitra, Nehal Doiphode, Utkarsh Gautam, Sanath Narayan, Shuaib Ahmed, Sharat Chandran, Arjun Jain |
Abstract | We propose a new dataset for learning local image descriptors which can be used for significantly improved patch matching. Our proposed dataset consists of an order of magnitude more scenes, images, and positive and negative correspondences than the currently available Multi-View Stereo (MVS) dataset from Brown et al. The new dataset also has better coverage of viewpoint, scale, and lighting changes than the MVS dataset. Our dataset additionally provides supplementary information, such as RGB patches with scale and rotation values, and intrinsic and extrinsic camera parameters, which, as shown later, can be used to customize training data per application. We train an existing state-of-the-art model on our dataset and evaluate it on publicly available benchmarks such as the HPatches dataset and Strecha et al. \cite{strecha} to quantify image descriptor performance. Experimental evaluations show that the descriptors trained on our proposed dataset outperform the current state-of-the-art descriptors trained on MVS by 8%, 4%, and 10% on the matching, verification, and retrieval tasks, respectively, on the HPatches dataset. Similarly, on the Strecha dataset we see an improvement of 3-5% on the matching task in non-planar scenes. |
Tasks | |
Published | 2018-01-04 |
URL | http://arxiv.org/abs/1801.01466v3 |
PDF | http://arxiv.org/pdf/1801.01466v3.pdf |
PWC | https://paperswithcode.com/paper/a-large-dataset-for-improving-patch-matching |
Repo | https://github.com/rmitra/PS-Dataset |
Framework | none |
Bayesian Compression for Natural Language Processing
Title | Bayesian Compression for Natural Language Processing |
Authors | Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov |
Abstract | In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows proportionally to the vocabulary size. We propose a Bayesian sparsification technique for RNNs which allows compressing an RNN by dozens or hundreds of times without time-consuming hyperparameter tuning. We also generalize the model to vocabulary sparsification, filtering out unnecessary words and compressing the RNN even further. We show that the choice of retained words is interpretable. Code is available on GitHub: https://github.com/tipt0p/SparseBayesianRNN |
Tasks | |
Published | 2018-10-25 |
URL | http://arxiv.org/abs/1810.10927v2 |
PDF | http://arxiv.org/pdf/1810.10927v2.pdf |
PWC | https://paperswithcode.com/paper/bayesian-compression-for-natural-language |
Repo | https://github.com/ars-ashuha/variational-dropout-sparsifies-dnn |
Framework | tf |
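Bayesian sparsification of this kind typically builds on sparse variational dropout, where each weight carries a learned noise level and weights whose effective dropout rate saturates are pruned. A hedged sketch of such a layer, using the KL approximation of Molchanov et al. (2017); the paper's vocabulary-level sparsification for embeddings is not shown:

```python
# Sparse variational dropout linear layer (sketch, not the paper's RNN model).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseVDLinear(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_out, n_in) * 0.01)
        self.log_sigma2 = nn.Parameter(torch.full((n_out, n_in), -10.0))
        self.bias = nn.Parameter(torch.zeros(n_out))

    def _log_alpha(self):
        # log(dropout noise variance / weight^2), clipped for stability.
        return (self.log_sigma2 - 2 * torch.log(self.weight.abs() + 1e-8)).clamp(-10, 10)

    def forward(self, x, threshold=3.0):
        if self.training:
            # Local reparameterization: sample pre-activations directly.
            mean = F.linear(x, self.weight, self.bias)
            std = torch.sqrt(F.linear(x * x, self.log_sigma2.exp()) + 1e-8)
            return mean + std * torch.randn_like(std)
        # At test time, prune weights whose dropout rate is effectively 1.
        mask = (self._log_alpha() < threshold).float()
        return F.linear(x, self.weight * mask, self.bias)

    def kl(self):
        # Approximate KL term from Molchanov et al. (2017).
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        log_alpha = self._log_alpha()
        neg_kl = k1 * torch.sigmoid(k2 + k3 * log_alpha) - 0.5 * F.softplus(-log_alpha) - k1
        return -neg_kl.sum()
```

Adding `layer.kl()` (suitably weighted) to the task loss drives many `log_alpha` values above the threshold, which is what yields the tens-to-hundreds-fold compression the abstract reports.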
Inferencing Based on Unsupervised Learning of Disentangled Representations
Title | Inferencing Based on Unsupervised Learning of Disentangled Representations |
Authors | Tobias Hinz, Stefan Wermter |
Abstract | Combining Generative Adversarial Networks (GANs) with encoders that learn to encode data points has shown promising results in learning data representations in an unsupervised way. We propose a framework that combines an encoder and a generator to learn disentangled representations which encode meaningful information about the data distribution without the need for any labels. While current approaches focus mostly on the generative aspects of GANs, our framework can be used to perform inference on both real and generated data points. Experiments on several data sets show that the encoder learns interpretable, disentangled representations which encode descriptive properties and can be used to sample images that exhibit specific characteristics. |
Tasks | Representation Learning, Unsupervised Image Classification, Unsupervised MNIST, Unsupervised Representation Learning |
Published | 2018-03-07 |
URL | http://arxiv.org/abs/1803.02627v1 |
PDF | http://arxiv.org/pdf/1803.02627v1.pdf |
PWC | https://paperswithcode.com/paper/inferencing-based-on-unsupervised-learning-of |
Repo | https://github.com/tohinz/Bidirectional-InfoGAN |
Framework | tf |
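The encoder-generator pairing the abstract describes is in the spirit of BiGAN/ALI: a discriminator judges (image, code) pairs, so the encoder learns to invert the generator and inference works on real images too. A toy sketch with placeholder fully connected networks (the published model is a Bidirectional-InfoGAN with convolutional bodies):

```python
# Toy BiGAN-style setup: discriminator sees (x, z) pairs from both directions.
import torch
import torch.nn as nn

Z_DIM, X_DIM = 64, 784
G = nn.Sequential(nn.Linear(Z_DIM, X_DIM), nn.Tanh())   # z -> x (toy generator)
E = nn.Sequential(nn.Linear(X_DIM, Z_DIM))              # x -> z (toy encoder)
D = nn.Sequential(nn.Linear(X_DIM + Z_DIM, 1))          # judges (x, z) pairs

def bigan_logits(x_real):
    z_fake = torch.randn(x_real.size(0), Z_DIM)
    x_fake = G(z_fake)
    z_real = E(x_real)
    real_logits = D(torch.cat([x_real, z_real], dim=1))  # encoder pairs
    fake_logits = D(torch.cat([x_fake, z_fake], dim=1))  # generator pairs
    return real_logits, fake_logits
```

At convergence the encoder's codes `E(x)` serve as the unsupervised representation used for inference on real data points.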
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task
Title | Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task |
Authors | Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev |
Abstract | We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains. We define a new complex and cross-domain semantic parsing and text-to-SQL task where different complex SQL queries and databases appear in train and test sets. In this way, the task requires the model to generalize well to both new SQL queries and new database schemas. Spider is distinct from most of the previous semantic parsing tasks because they all use a single database and the exact same programs in the train set and the test set. We experiment with various state-of-the-art models and the best model achieves only 12.4% exact matching accuracy on a database split setting. This shows that Spider presents a strong challenge for future research. Our dataset and task are publicly available at https://yale-lily.github.io/spider |
Tasks | Semantic Parsing, Text-To-Sql |
Published | 2018-09-24 |
URL | http://arxiv.org/abs/1809.08887v5 |
PDF | http://arxiv.org/pdf/1809.08887v5.pdf |
PWC | https://paperswithcode.com/paper/spider-a-large-scale-human-labeled-dataset |
Repo | https://github.com/taoyds/spider |
Framework | tf |
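For readers who want to poke at the data, Spider's released JSON can be read directly. The field names below ("db_id", "question", "query") follow the repo's commonly documented schema, but check the repository for the authoritative format:

```python
# Reading a Spider example (field names assumed from the repo's documentation).
import json

with open("spider/train_spider.json") as f:
    examples = json.load(f)

ex = examples[0]
print(ex["db_id"])     # database the question is grounded in
print(ex["question"])  # natural-language question
print(ex["query"])     # gold SQL query
```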