Paper Group ANR 559
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
Title | Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning |
Authors | Srinivas Venkattaramanujam, Eric Crawford, Thang Doan, Doina Precup |
Abstract | Goal-conditioned policies are used to break down complex reinforcement learning (RL) problems by using subgoals, which can be defined either in state space or in a latent feature space. This can increase the efficiency of learning by using a curriculum, and also enables simultaneous learning and generalization across goals. A crucial requirement of goal-conditioned policies is to be able to determine whether the goal has been achieved. Having a notion of distance to a goal is thus a crucial component of this approach. However, it is not straightforward to come up with an appropriate distance, and in some tasks, the goal space may not even be known a priori. In this work we learn a distance-to-goal estimate, computed in terms of the number of actions that would need to be carried out, in a self-supervised approach. Our method solves complex tasks without prior domain knowledge in the online setting in three different scenarios in the context of goal-conditioned policies: a) the goal space is the same as the state space, b) the goal space is given but an appropriate distance is unknown, and c) the state space is accessible, but only a subset of the state space represents desired goals, and this subset is known a priori. We also propose a goal-generation mechanism as a secondary contribution. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02998v1 |
https://arxiv.org/pdf/1907.02998v1.pdf | |
PWC | https://paperswithcode.com/paper/self-supervised-learning-of-distance |
Repo | |
Framework | |
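The core idea above, estimating the distance to a goal as the number of actions separating two states visited on the same trajectory, can be sketched as a simple self-supervised regression problem. The snippet below is a minimal illustration, not the authors' implementation; the network architecture, the state-pair sampling scheme, and the trajectory representation are assumptions.

```python
import random
import torch
import torch.nn as nn

class DistanceNet(nn.Module):
    """Predicts the (action-count) distance between a state and a goal state."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1)).squeeze(-1)

def sample_pairs(trajectory, num_pairs):
    """Self-supervised labels: two states i <= j on one trajectory are
    (approximately) j - i actions apart."""
    pairs = []
    for _ in range(num_pairs):
        i = random.randrange(len(trajectory))
        j = random.randrange(i, len(trajectory))
        pairs.append((trajectory[i], trajectory[j], float(j - i)))
    return pairs

def train_step(model, optimizer, trajectory, num_pairs=64):
    """Regress the predicted distance onto the action-count label."""
    states, goals, dists = zip(*sample_pairs(trajectory, num_pairs))
    states, goals = torch.stack(states), torch.stack(goals)
    target = torch.tensor(dists)
    loss = nn.functional.mse_loss(model(states, goals), target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```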
Model Similarity Mitigates Test Set Overuse
Title | Model Similarity Mitigates Test Set Overuse |
Authors | Horia Mania, John Miller, Ludwig Schmidt, Moritz Hardt, Benjamin Recht |
Abstract | Excessive reuse of test data has become commonplace in today’s machine learning workflows. Popular benchmarks, competitions, industrial scale tuning, among other applications, all involve test data reuse beyond guidance by statistical confidence bounds. Nonetheless, recent replication studies give evidence that popular benchmarks continue to support progress despite years of extensive reuse. We proffer a new explanation for the apparent longevity of test data: Many proposed models are similar in their predictions and we prove that this similarity mitigates overfitting. Specifically, we show empirically that models proposed for the ImageNet ILSVRC benchmark agree in their predictions well beyond what we can conclude from their accuracy levels alone. Likewise, models created by large scale hyperparameter search enjoy high levels of similarity. Motivated by these empirical observations, we give a non-asymptotic generalization bound that takes similarity into account, leading to meaningful confidence bounds in practical settings. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12580v1 |
https://arxiv.org/pdf/1905.12580v1.pdf | |
PWC | https://paperswithcode.com/paper/model-similarity-mitigates-test-set-overuse |
Repo | |
Framework | |
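The empirical observation at the heart of the paper above, that proposed models agree with each other far more often than their accuracies alone would imply, is straightforward to measure from saved predictions. The sketch below computes pairwise agreement and a naive agreement level expected from accuracies alone; variable names and the baseline formula are illustrative, not taken from the paper.

```python
import numpy as np

def pairwise_agreement(preds_a, preds_b):
    """Fraction of test examples on which two models make the same prediction."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a == preds_b))

def independence_baseline(acc_a, acc_b):
    """Agreement explained by accuracy alone: if errors were independent, the
    two models would agree at least when both happen to be correct."""
    return acc_a * acc_b

# Illustrative usage with two models' label predictions on a shared test set.
labels  = np.array([0, 1, 2, 1, 0, 2, 1, 0])
model_a = np.array([0, 1, 2, 1, 0, 2, 0, 0])
model_b = np.array([0, 1, 2, 1, 0, 2, 0, 1])

acc_a = float(np.mean(model_a == labels))
acc_b = float(np.mean(model_b == labels))
print("observed agreement:", pairwise_agreement(model_a, model_b))
print("agreement expected from accuracies alone:", independence_baseline(acc_a, acc_b))
```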
Protecting GANs against privacy attacks by preventing overfitting
Title | Protecting GANs against privacy attacks by preventing overfitting |
Authors | Sumit Mukherjee, Yixi Xu, Anusua Trivedi, Juan Lavista Ferres |
Abstract | Generative Adversarial Networks (GANs) have made the release of synthetic images a viable approach to sharing data without releasing the original dataset. It has been shown that such synthetic data can be used for a variety of downstream tasks, such as training classifiers that would otherwise require the original dataset to be shared. However, recent work has shown that the GAN models and their synthetically generated data can be used to infer training set membership by an adversary who has access to the entire dataset and some auxiliary information. Here we develop a new GAN architecture (privGAN) which provides protection against this mode of attack while leading to negligible loss in downstream performance. Our architecture explicitly prevents overfitting to the training set, thereby providing implicit protection against white-box attacks. The main contributions of this paper are: i) we propose a novel GAN architecture that can generate synthetic data in a privacy-preserving manner and demonstrate the effectiveness of our model against white-box attacks on several benchmark datasets, ii) we provide a theoretical understanding of the optimal solution of the GAN loss function, iii) we demonstrate on two common benchmark datasets that synthetic images generated by privGAN lead to negligible loss in downstream performance when compared against non-private GANs. While we have focused on benchmarking privGAN exclusively on image datasets, the architecture of privGAN is not exclusive to image datasets and can easily be extended to other types of data. |
Tasks | |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/2001.00071v2 |
https://arxiv.org/pdf/2001.00071v2.pdf | |
PWC | https://paperswithcode.com/paper/protecting-gans-against-privacy-attacks-by |
Repo | |
Framework | |
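The threat model privGAN defends against uses access to a trained GAN to decide whether particular examples were in its training set. A common white-box baseline scores candidates with the discriminator, since an overfit discriminator tends to assign higher scores to training members. The sketch below illustrates that baseline attack only, not the privGAN defence itself; the `discriminator` callable and the threshold choice are assumptions.

```python
import numpy as np

def membership_scores(discriminator, candidates):
    """White-box membership-inference baseline: collect the discriminator's
    realness score for each candidate example."""
    return np.array([float(discriminator(x)) for x in candidates])

def infer_membership(discriminator, candidates, threshold):
    """Label a candidate as a 'training member' if its discriminator score
    exceeds a threshold chosen by the adversary (e.g. on auxiliary data)."""
    return membership_scores(discriminator, candidates) > threshold

# A defence that prevents overfitting (as privGAN aims to) shrinks the score gap
# between training members and non-members, pushing this attack toward chance.
```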
One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation
Title | One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation |
Authors | Hongsen Liu, Yang Cong, Yandong Tang |
Abstract | We propose a single-shot method for simultaneous 3D object segmentation and 6-DOF pose estimation in pure 3D point cloud scenes, based on the consensus that \emph{one point only belongs to one object}, i.e., each point has the potential to predict the 6-DOF pose of its corresponding object. Unlike recently proposed methods for this task, which rely on 2D detectors to predict the projections of the 3D corners of the bounding boxes so that the 6-DOF pose must then be estimated by a PnP-like spatial transformation method, ours is concise enough not to require an additional spatial transformation between different dimensions. Due to the lack of training data for many objects, recently proposed 2D detection methods generate training data with a rendering engine and achieve good results. However, rendering in 3D space along with 6-DOF poses is relatively difficult. Therefore, we propose an augmented reality technique to generate the training data in a semi-virtual 3D space. The key component of our method is a multi-task CNN architecture that can simultaneously predict 3D object segmentation and 6-DOF pose in pure 3D point clouds. For experimental evaluation, we generate expanded training data for two state-of-the-art 3D object datasets \cite{PLCHF}\cite{TLINEMOD} using Augmented Reality (AR) technology. We evaluate our proposed method on the two datasets. The results show that our method generalizes well to multiple scenarios and provides performance comparable to or better than the state of the art. |
Tasks | Pose Estimation, Semantic Segmentation |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12095v1 |
https://arxiv.org/pdf/1912.12095v1.pdf | |
PWC | https://paperswithcode.com/paper/one-point-one-object-simultaneous-3d-object |
Repo | |
Framework | |
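A minimal reading of the "one point, one object" idea above is a network that emits, for every input point, both a segmentation label and a 6-DOF pose hypothesis for the object that point belongs to, with per-point hypotheses later aggregated (e.g. by voting). The heads below are a schematic sketch; the backbone, feature size, and quaternion rotation parameterization are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PerPointHeads(nn.Module):
    """Per-point multi-task heads: each point predicts a class label and a
    6-DOF pose (3-D translation + rotation, here as a unit quaternion)."""
    def __init__(self, feat_dim=128, num_classes=13):
        super().__init__()
        self.seg_head = nn.Linear(feat_dim, num_classes)   # segmentation logits
        self.trans_head = nn.Linear(feat_dim, 3)            # object translation
        self.rot_head = nn.Linear(feat_dim, 4)              # quaternion rotation

    def forward(self, point_feats):                          # (B, N, feat_dim)
        seg_logits = self.seg_head(point_feats)              # (B, N, num_classes)
        translation = self.trans_head(point_feats)           # (B, N, 3)
        quat = nn.functional.normalize(self.rot_head(point_feats), dim=-1)  # (B, N, 4)
        return seg_logits, translation, quat

# Poses are predicted per point; points assigned to the same object can then be
# aggregated (e.g. averaged or voted) into a single 6-DOF estimate per object.
```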
Joint Adversarial Training: Incorporating both Spatial and Pixel Attacks
Title | Joint Adversarial Training: Incorporating both Spatial and Pixel Attacks |
Authors | Haichao Zhang, Jianyu Wang |
Abstract | Conventional adversarial training methods use attacks that manipulate pixel values directly and individually, leading to models that are less robust in the face of spatial transformation-based attacks. In this paper, we propose a joint adversarial training method that incorporates both spatial transformation-based and pixel-value-based attacks to improve model robustness. We introduce a spatial transformation-based attack with an explicit notion of budget and develop an algorithm for spatial attack generation. We further integrate both pixel and spatial attacks into one generation model and show how to leverage their complementary strengths during training to improve overall model robustness. Extensive experimental results on different benchmark datasets, compared with state-of-the-art methods, verify the effectiveness of the proposed method. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10737v2 |
https://arxiv.org/pdf/1907.10737v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-adversarial-training-incorporating-both |
Repo | |
Framework | |
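The joint scheme described above combines a budgeted spatial attack (a small geometric transformation of the image) with an ordinary pixel-space perturbation. The sketch below composes a random budgeted affine warp with a one-step FGSM pixel perturbation to illustrate the joint idea; the budgets, the single gradient step, and the random (rather than worst-case) spatial sampling are simplifications, not the paper's attack generator.

```python
import torch
import torch.nn.functional as F

def spatial_attack(images, max_translate=0.05, max_rotate=0.1):
    """Budgeted spatial attack: apply a small random rotation/translation.
    (The paper searches for the worst case; random sampling keeps this short.)"""
    b = images.size(0)
    angle = (torch.rand(b, device=images.device) * 2 - 1) * max_rotate
    tx = (torch.rand(b, device=images.device) * 2 - 1) * max_translate
    ty = (torch.rand(b, device=images.device) * 2 - 1) * max_translate
    theta = torch.zeros(b, 2, 3, device=images.device)
    theta[:, 0, 0] = torch.cos(angle); theta[:, 0, 1] = -torch.sin(angle); theta[:, 0, 2] = tx
    theta[:, 1, 0] = torch.sin(angle); theta[:, 1, 1] = torch.cos(angle);  theta[:, 1, 2] = ty
    grid = F.affine_grid(theta, images.shape, align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)

def pixel_attack(model, images, labels, eps=8 / 255):
    """One-step FGSM pixel attack within an L_inf budget."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    return (images + eps * grad.sign()).clamp(0, 1).detach()

def joint_adversarial_batch(model, images, labels):
    """Joint attack: spatial transformation followed by a pixel perturbation."""
    return pixel_attack(model, spatial_attack(images), labels)
```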
Is Pretraining Necessary for Hyperspectral Image Classification?
Title | Is Pretraining Necessary for Hyperspectral Image Classification? |
Authors | Hyungtae Lee, Sungmin Eum, Heesung Kwon |
Abstract | We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is pre-training effective in improving performance? To answer the first question, we have devised an approach that pre-trains a network on multiple source datasets that differ in their hyperspectral characteristics and fine-tunes it on a target dataset. This approach effectively resolves the architectural issue that arises when transferring meaningful information between the source and target networks. To answer the second question, we carried out several ablation experiments. Based on the experimental results, a network trained from scratch performs as well as a network fine-tuned from a pre-trained network. However, we observed that pre-training the network has its own advantage in achieving better performance when deeper networks are required. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08658v1 |
http://arxiv.org/pdf/1901.08658v1.pdf | |
PWC | https://paperswithcode.com/paper/is-pretraining-necessary-for-hyperspectral |
Repo | |
Framework | |
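One practical obstacle the abstract alludes to is that hyperspectral datasets differ in their number of spectral bands, so weights pre-trained on one source cannot be loaded directly onto a target with a different input depth. A common workaround is to reload every layer whose shape matches and reinitialize the band-dependent layers; the sketch below shows that generic pattern with a toy CNN and is not the authors' specific architecture or transfer scheme.

```python
import torch.nn as nn

def make_cnn(num_bands, num_classes):
    """Toy hyperspectral CNN: only the first conv depends on the number of bands."""
    return nn.Sequential(
        nn.Conv2d(num_bands, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, num_classes),
    )

def transfer_weights(pretrained, target):
    """Copy every pre-trained parameter whose name and shape match the target,
    leaving band-dependent layers (e.g. the first conv) at their fresh init."""
    src = pretrained.state_dict()
    dst = target.state_dict()
    dst.update({k: v for k, v in src.items() if k in dst and v.shape == dst[k].shape})
    target.load_state_dict(dst)
    return target

source_net = make_cnn(num_bands=103, num_classes=9)     # e.g. a pre-training source
target_net = transfer_weights(source_net, make_cnn(num_bands=200, num_classes=16))
```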
Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness
Title | Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness |
Authors | Magdalini Paschali, Walter Simson, Abhijit Guha Roy, Muhammad Ferjad Naeem, Rüdiger Göbl, Christian Wachinger, Nassir Navab |
Abstract | In this paper we propose a novel augmentation technique that improves not only the performance of deep neural networks on clean test data, but also significantly increases their robustness to random transformations, both affine and projective. Inspired by ManiFool, the augmentation is performed by a line-search manifold-exploration method that learns affine geometric transformations that lead to the misclassification of an image, while ensuring that it remains on the same manifold as the training data. This augmentation method populates any training dataset with images that lie on the border of the manifolds between two classes and maximizes the variance the network is exposed to during training. Our method was thoroughly evaluated on the challenging tasks of fine-grained skin lesion classification from limited data and breast tumor classification from mammograms. Compared with traditional augmentation methods, and with images synthesized by Generative Adversarial Networks, our method not only achieves state-of-the-art performance but also significantly improves the network's robustness. |
Tasks | Data Augmentation, Skin Lesion Classification |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.04420v1 |
http://arxiv.org/pdf/1901.04420v1.pdf | |
PWC | https://paperswithcode.com/paper/data-augmentation-with-manifold-exploring |
Repo | |
Framework | |
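The augmentation above is described as a line search along affine geometric transformations until the transformed image is misclassified. A minimal sketch of that idea: scale up an affine perturbation until the classifier's prediction flips, then keep the transformed image as an augmented sample. The transformation parameterization, step schedule, and absence of an explicit manifold check are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def apply_affine(image, theta):
    """Warp a single image (1, C, H, W) with a 2x3 affine matrix."""
    grid = F.affine_grid(theta.unsqueeze(0), image.shape, align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

def manifool_style_augment(model, image, label, direction, steps=10, max_scale=1.0):
    """Line search along one affine direction: increase its magnitude until the
    model misclassifies, then return that borderline image as an augmentation."""
    identity = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    for i in range(1, steps + 1):
        scale = max_scale * i / steps
        warped = apply_affine(image, identity + scale * direction)
        if model(warped).argmax(dim=1).item() != label:
            return warped            # first transformation that crosses the boundary
    return None                      # no misclassification within the budget

# `direction` is a small 2x3 tensor (e.g. a slight rotation/shear/translation);
# in the paper the search is performed on the manifold of geometric transformations.
```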
Unsupervised and Unregistered Hyperspectral Image Super-Resolution with Mutual Dirichlet-Net
Title | Unsupervised and Unregistered Hyperspectral Image Super-Resolution with Mutual Dirichlet-Net |
Authors | Ying Qu, Hairong Qi, Chiman Kwan |
Abstract | Hyperspectral images (HSI) provide rich spectral information that has contributed to successful performance improvements in numerous computer vision tasks. However, this comes at the expense of the images' spatial resolution. Hyperspectral image super-resolution (HSI-SR) addresses this problem by fusing a low resolution (LR) HSI with a multispectral image (MSI) carrying much higher spatial resolution (HR). All existing HSI-SR approaches require the LR HSI and HR MSI to be well registered, and the reconstruction accuracy of the HR HSI relies heavily on the registration accuracy of the different modalities. This paper exploits the uncharted problem domain of HSI-SR without the requirement of multi-modality registration. Given the unregistered LR HSI and HR MSI with overlapping regions, we design a unique unsupervised learning structure linking the two unregistered modalities by projecting them into the same statistical space through the same encoder. Mutual information (MI) is further adopted to capture the non-linear statistical dependencies between the representations from the two modalities (carrying spatial information) and their raw inputs. By maximizing the MI, spatial correlations between different modalities can be well characterized to further reduce spectral distortion. A collaborative $l_{2,1}$ norm is employed as the reconstruction error instead of the more common $l_2$ norm, so that individual pixels can be recovered as accurately as possible. With this design, the network is able to extract correlated spectral and spatial information from unregistered images while better preserving the spectral information. The proposed method is referred to as unregistered and unsupervised mutual Dirichlet-Net ($u^2$-MDN). Extensive experimental results using benchmark HSI datasets demonstrate the superior performance of $u^2$-MDN compared to the state-of-the-art. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-04-27 |
URL | https://arxiv.org/abs/1904.12175v2 |
https://arxiv.org/pdf/1904.12175v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-and-unregistered-hyperspectral |
Repo | |
Framework | |
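One concrete ingredient the abstract singles out is the collaborative $l_{2,1}$ reconstruction error: an L2 norm over the spectral bands of each pixel followed by an L1 sum over pixels, so that every pixel's spectral error contributes rather than a few pixels dominating. The sketch below implements that norm; the (pixels x bands) tensor layout is an assumption.

```python
import torch

def l21_reconstruction_error(x, x_hat):
    """Collaborative l_{2,1} error: L2 over the spectral bands of each pixel,
    summed (L1) over pixels. Shapes: (num_pixels, num_bands)."""
    per_pixel_l2 = torch.linalg.norm(x - x_hat, dim=1)   # (num_pixels,)
    return per_pixel_l2.sum()

# Compared with a plain squared-L2 loss, each pixel contributes its own spectral
# error magnitude, encouraging accurate recovery of individual pixels.
x = torch.rand(1000, 31)           # toy HSI: 1000 pixels, 31 bands
x_hat = x + 0.01 * torch.randn_like(x)
print(l21_reconstruction_error(x, x_hat))
```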
Session-Based Hotel Recommendations: Challenges and Future Directions
Title | Session-Based Hotel Recommendations: Challenges and Future Directions |
Authors | Jens Adamczak, Gerard-Paul Leyson, Peter Knees, Yashar Deldjoo, Farshad Bakhshandegan Moghaddam, Julia Neidhardt, Wolfgang Wörndl, Philipp Monreal |
Abstract | In the year 2019, the Recommender Systems Challenge deals with a real-world task from the area of e-tourism for the first time, namely the recommendation of hotels in booking sessions. In this context, this article aims at identifying and investigating what we believe are important domain-specific challenges recommendation systems research in hotel search is facing, from both academic and industry perspectives. We focus on three main challenges, namely dealing with (1) multiple stakeholders and value-awareness in recommendations, (2) sparsity of user data and the extensive cold-start problem, and (3) dynamic input data and computational requirements. To this end, we review the state of the art toward solving these challenges and discuss shortcomings. We detail possible future directions and visions we contemplate for the further evolution of the field. This article should, therefore, serve two purposes: giving the interested reader an overview of current challenges in the field and inspiring new approaches for the ACM Recommender Systems Challenge 2019 and beyond. |
Tasks | Recommendation Systems |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1908.00071v1 |
https://arxiv.org/pdf/1908.00071v1.pdf | |
PWC | https://paperswithcode.com/paper/session-based-hotel-recommendations |
Repo | |
Framework | |
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
Title | XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering |
Authors | Jasdeep Singh, Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher |
Abstract | While natural language processing systems often focus on a single language, multilingual transfer learning has the potential to improve performance, especially for low-resource languages. We introduce XLDA, cross-lingual data augmentation, a method that replaces a segment of the input text with its translation in another language. XLDA enhances performance of all 14 tested languages of the cross-lingual natural language inference (XNLI) benchmark. With improvements of up to 4.8%, training with XLDA achieves state-of-the-art performance for Greek, Turkish, and Urdu. XLDA is in contrast to, and performs markedly better than, a more naive approach that aggregates examples in various languages in a way that each example is solely in one language. On the SQuAD question answering task, we see that XLDA provides a 1.0% performance increase on the English evaluation set. Comprehensive experiments suggest that most languages are effective as cross-lingual augmentors, that XLDA is robust to a wide range of translation quality, and that XLDA is even more effective for randomly initialized models than for pretrained models. |
Tasks | Cross-Lingual Natural Language Inference, Data Augmentation, Natural Language Inference, Question Answering, Transfer Learning |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11471v1 |
https://arxiv.org/pdf/1905.11471v1.pdf | |
PWC | https://paperswithcode.com/paper/xlda-cross-lingual-data-augmentation-for |
Repo | |
Framework | |
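XLDA's core operation is simple: replace one segment of a training example (e.g. the premise of an NLI pair) with its translation into another language, keeping the rest of the example and its label unchanged. A minimal sketch is below, assuming a `translations` lookup keyed by (sentence, language); the paper itself uses machine-translated versions of the benchmark data, and the toy table here is purely illustrative.

```python
import random

def xlda_augment(premise, hypothesis, label, translations, languages):
    """Cross-lingual data augmentation: swap one segment of the input for its
    translation in a randomly chosen language; the label is unchanged."""
    lang = random.choice(languages)
    if random.random() < 0.5:
        premise = translations[(premise, lang)]
    else:
        hypothesis = translations[(hypothesis, lang)]
    return premise, hypothesis, label

# Illustrative usage with a toy translation table (assumed, not from the paper).
translations = {
    ("A man is playing a guitar.", "de"): "Ein Mann spielt Gitarre.",
    ("Someone is making music.", "de"): "Jemand macht Musik.",
}
print(xlda_augment("A man is playing a guitar.", "Someone is making music.",
                   "entailment", translations, languages=["de"]))
```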
Clustering by Optimizing the Average Silhouette Width
Title | Clustering by Optimizing the Average Silhouette Width |
Authors | Fatima Batool, Christian Hennig |
Abstract | In this paper, we propose a unified clustering approach that can estimate the number of clusters and produce a clustering for that number simultaneously. The average silhouette width (ASW) is a widely used standard cluster quality index. We define a distance-based objective function that optimizes the ASW for clustering. The proposed algorithm, named OSil, needs only the data observations as input, without any prior knowledge of the number of clusters. This work is a thorough investigation of the proposed methodology, its usefulness and its limitations. A vast spectrum of clustering structures was generated, and several well-known clustering methods, including partitioning, hierarchical, density-based, and spatial methods, were considered as competitors of the proposed methodology. Simulations reveal that the OSil algorithm shows superior performance in terms of clustering quality compared to all clustering methods included in the study. OSil can find well-separated, compact clusters and shows better performance in estimating the number of clusters than several methods. Apart from the proposal of the new methodology and its investigation, this paper offers a systematic analysis of the estimation of cluster indices, some of which have never appeared together in a comparative simulation setup before. The study offers many insightful findings useful for the selection of clustering methods and indices. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08644v1 |
https://arxiv.org/pdf/1910.08644v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-by-optimizing-the-average |
Repo | |
Framework | |
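Since the objective above is the average silhouette width itself, a minimal sketch is to evaluate candidate clusterings by their ASW and keep the best one, which also yields an estimate of the number of clusters. The snippet below uses k-means solutions over a range of k as the candidates; this is only an illustration of optimizing ASW, not the OSil algorithm, which optimizes the criterion directly over cluster assignments.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_clustering_by_asw(X, k_range=range(2, 11), random_state=0):
    """Pick the clustering (and hence the number of clusters) with the highest
    average silhouette width among k-means solutions for each k."""
    best = (-1.0, None, None)
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
        asw = silhouette_score(X, labels)
        if asw > best[0]:
            best = (asw, k, labels)
    return best  # (ASW, estimated number of clusters, labels)

X = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [6, 0], [0, 6])])
asw, k, labels = best_clustering_by_asw(X)
print(f"estimated clusters: {k}, ASW: {asw:.3f}")
```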
TextNAS: A Neural Architecture Search Space tailored for Text Representation
Title | TextNAS: A Neural Architecture Search Space tailored for Text Representation |
Authors | Yujing Wang, Yaming Yang, Yiren Chen, Jing Bai, Ce Zhang, Guinan Su, Xiaoyu Kou, Yunhai Tong, Mao Yang, Lidong Zhou |
Abstract | Learning text representation is crucial for text classification and other language-related tasks. There is a diverse set of text representation networks in the literature, and finding the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most existing works on NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior for the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets for text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatically discovered networks agree well with human intuition. |
Tasks | Natural Language Inference, Neural Architecture Search, Text Classification |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10729v1 |
https://arxiv.org/pdf/1912.10729v1.pdf | |
PWC | https://paperswithcode.com/paper/textnas-a-neural-architecture-search-space |
Repo | |
Framework | |
Interactive Topic Modeling with Anchor Words
Title | Interactive Topic Modeling with Anchor Words |
Authors | Sanjoy Dasgupta, Stefanos Poulis, Christopher Tosh |
Abstract | The formalism of anchor words has enabled the development of fast topic modeling algorithms with provable guarantees. In this paper, we introduce a protocol that allows users to interact with anchor words to build customized and interpretable topic models. Experimental evidence validating the usefulness of our approach is also presented. |
Tasks | Topic Models |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1907.04919v1 |
https://arxiv.org/pdf/1907.04919v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-topic-modeling-with-anchor-words |
Repo | |
Framework | |
Spatial-Temporal-Textual Point Processes with Applications in Crime Linkage Detection
Title | Spatial-Temporal-Textual Point Processes with Applications in Crime Linkage Detection |
Authors | Shixiang Zhu, Yao Xie |
Abstract | Crimes emerge out of complex interactions of human behaviors and situations. Linkages between crime events are highly complex. Detecting crime linkage given a set of events is a highly challenging task since we only have limited information, including text descriptions, event times, and locations. In practice, there are very few labels. We propose a new statistical modeling framework for spatio-temporal-textual data and demonstrate its usage on crime linkage detection. We capture linkages of crime incidents via multivariate marked spatio-temporal Hawkes processes and treat embedding vectors of the free-text as marks of incidents. This is inspired by the notion of modus operandi (M.O.) in crime analysis. We also reduce the implicit bias in text documents before embedding to remove any potential discrimination of our algorithm. Numerical results using real data demonstrate the good performance of our method. The proposed method can be widely used in other similar data in social networks, electronic health records, etc. |
Tasks | Point Processes |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00440v4 |
https://arxiv.org/pdf/1902.00440v4.pdf | |
PWC | https://paperswithcode.com/paper/crime-linkage-detection-by-spatio-temporal |
Repo | |
Framework | |
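The model above links incidents through a marked spatio-temporal Hawkes process in which the text embedding of an incident acts as its mark, so past incidents with a similar modus operandi raise the intensity of related future events. The sketch below writes down a simplified conditional intensity with exponential temporal decay, a Gaussian spatial kernel, and cosine similarity between text-embedding marks; the exact kernels and parameterization in the paper differ.

```python
import numpy as np

def intensity(t, s, m, history, mu=0.1, alpha=0.5, beta=1.0, sigma=1.0):
    """Simplified marked spatio-temporal Hawkes intensity at time t, location s
    (2-D), with text-embedding mark m. `history` is a list of (t_i, s_i, m_i)."""
    lam = mu                                                        # background rate
    for t_i, s_i, m_i in history:
        if t_i >= t:
            continue
        temporal = np.exp(-beta * (t - t_i))                        # exponential decay
        spatial = np.exp(-np.sum((s - s_i) ** 2) / (2 * sigma**2))  # Gaussian kernel
        mark_sim = float(m @ m_i / (np.linalg.norm(m) * np.linalg.norm(m_i) + 1e-12))
        lam += alpha * temporal * spatial * max(mark_sim, 0.0)      # similar M.O. excites
    return lam

history = [(0.0, np.array([0.0, 0.0]), np.array([1.0, 0.0])),
           (1.0, np.array([0.5, 0.2]), np.array([0.9, 0.1]))]
print(intensity(2.0, np.array([0.4, 0.1]), np.array([1.0, 0.05]), history))
```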
Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension
Title | Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension |
Authors | Daniel Andor, Luheng He, Kenton Lee, Emily Pitler |
Abstract | Reading comprehension models have been successfully applied to extractive text answers, but it is unclear how best to generalize these models to abstractive numerical answers. We enable a BERT-based reading comprehension model to perform lightweight numerical reasoning. We augment the model with a predefined set of executable ‘programs’ which encompass simple arithmetic as well as extraction. Rather than having to learn to manipulate numbers directly, the model can pick a program and execute it. On the recent Discrete Reasoning Over Passages (DROP) dataset, designed to challenge reading comprehension models, we show a 33% absolute improvement by adding shallow programs. The model can learn to predict new operations when appropriate in a math word problem setting (Roy and Roth, 2015) with very few training examples. |
Tasks | Reading Comprehension |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00109v2 |
https://arxiv.org/pdf/1909.00109v2.pdf | |
PWC | https://paperswithcode.com/paper/giving-bert-a-calculator-finding-operations |
Repo | |
Framework | |
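The key mechanism above is a small set of predefined executable 'programs' (span extraction plus simple arithmetic over numbers found in the passage) from which the model picks one and fills in the arguments, rather than generating the number itself. The sketch below shows what executing such a program might look like once the model has chosen an operation and its arguments; the program inventory and names are illustrative, not the paper's exact set.

```python
from typing import Callable, Dict

# A toy inventory of executable programs over numeric arguments (assumed names).
PROGRAMS: Dict[str, Callable[[float, float], float]] = {
    "span": lambda a, b: a,          # return an extracted value unchanged
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
}

def execute(program: str, arg1: float, arg2: float = 0.0) -> float:
    """Execute the program the reading-comprehension model selected, using the
    numeric arguments it pointed to in the passage."""
    return PROGRAMS[program](arg1, arg2)

# e.g. passage: "...scored 24 points in the first half and 17 in the second..."
# question: "How many more points were scored in the first half than the second?"
print(execute("subtract", 24, 17))   # -> 7
```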