Paper Group ANR 559
Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning
Title | Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning |
Authors | Srinivas Venkattaramanujam, Eric Crawford, Thang Doan, Doina Precup |
Abstract | Goal-conditioned policies are used to break down complex reinforcement learning (RL) problems by using subgoals, which can be defined either in state space or in a latent feature space. This can increase the efficiency of learning by using a curriculum, and also enables simultaneous learning and generalization across goals. A crucial requirement of goal-conditioned policies is to be able to determine whether the goal has been achieved. Having a notion of distance to a goal is thus a crucial component of this approach. However, it is not straightforward to come up with an appropriate distance, and in some tasks, the goal space may not even be known a priori. In this work we learn a distance-to-goal estimate, computed in terms of the number of actions that would need to be carried out, in a self-supervised approach. Our method solves complex tasks without prior domain knowledge in the online setting in three different scenarios in the context of goal-conditioned policies: a) the goal space is the same as the state space, b) the goal space is given but an appropriate distance is unknown, and c) the state space is accessible, but only a subset of the state space represents desired goals, and this subset is known a priori. We also propose a goal-generation mechanism as a secondary contribution. |
Tasks | |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02998v1 |
https://arxiv.org/pdf/1907.02998v1.pdf | |
PWC | https://paperswithcode.com/paper/self-supervised-learning-of-distance |
Repo | |
Framework | |
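The core idea above, estimating the distance to a goal as the number of actions separating two states visited on the same trajectory, can be sketched as a simple self-supervised regression problem. The snippet below is a minimal illustration, not the authors' implementation; the network architecture, the state-pair sampling scheme, and the trajectory representation are assumptions.

```python
import random
import torch
import torch.nn as nn

class DistanceNet(nn.Module):
    """Predicts the (action-count) distance between a state and a goal state."""
    def __init__(self, state_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1)).squeeze(-1)

def sample_pairs(trajectory, num_pairs):
    """Self-supervised labels: two states i <= j on one trajectory are
    (approximately) j - i actions apart."""
    pairs = []
    for _ in range(num_pairs):
        i = random.randrange(len(trajectory))
        j = random.randrange(i, len(trajectory))
        pairs.append((trajectory[i], trajectory[j], float(j - i)))
    return pairs

def train_step(model, optimizer, trajectory, num_pairs=64):
    """Regress the predicted distance onto the action-count label."""
    states, goals, dists = zip(*sample_pairs(trajectory, num_pairs))
    states, goals = torch.stack(states), torch.stack(goals)
    target = torch.tensor(dists)
    loss = nn.functional.mse_loss(model(states, goals), target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```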
Model Similarity Mitigates Test Set Overuse
Title | Model Similarity Mitigates Test Set Overuse |
Authors | Horia Mania, John Miller, Ludwig Schmidt, Moritz Hardt, Benjamin Recht |
Abstract | Excessive reuse of test data has become commonplace in today’s machine learning workflows. Popular benchmarks, competitions, industrial scale tuning, among other applications, all involve test data reuse beyond guidance by statistical confidence bounds. Nonetheless, recent replication studies give evidence that popular benchmarks continue to support progress despite years of extensive reuse. We proffer a new explanation for the apparent longevity of test data: Many proposed models are similar in their predictions and we prove that this similarity mitigates overfitting. Specifically, we show empirically that models proposed for the ImageNet ILSVRC benchmark agree in their predictions well beyond what we can conclude from their accuracy levels alone. Likewise, models created by large scale hyperparameter search enjoy high levels of similarity. Motivated by these empirical observations, we give a non-asymptotic generalization bound that takes similarity into account, leading to meaningful confidence bounds in practical settings. |
Tasks | |
Published | 2019-05-29 |
URL | https://arxiv.org/abs/1905.12580v1 |
https://arxiv.org/pdf/1905.12580v1.pdf | |
PWC | https://paperswithcode.com/paper/model-similarity-mitigates-test-set-overuse |
Repo | |
Framework | |
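The empirical observation at the heart of the paper above, that proposed models agree with each other far more often than their accuracies alone would imply, is straightforward to measure from saved predictions. The sketch below computes pairwise agreement and a naive agreement level expected from accuracies alone; variable names and the baseline formula are illustrative, not taken from the paper.

```python
import numpy as np

def pairwise_agreement(preds_a, preds_b):
    """Fraction of test examples on which two models make the same prediction."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float(np.mean(preds_a == preds_b))

def independence_baseline(acc_a, acc_b):
    """Agreement explained by accuracy alone: if errors were independent, the
    two models would agree at least when both happen to be correct."""
    return acc_a * acc_b

# Illustrative usage with two models' label predictions on a shared test set.
labels  = np.array([0, 1, 2, 1, 0, 2, 1, 0])
model_a = np.array([0, 1, 2, 1, 0, 2, 0, 0])
model_b = np.array([0, 1, 2, 1, 0, 2, 0, 1])

acc_a = float(np.mean(model_a == labels))
acc_b = float(np.mean(model_b == labels))
print("observed agreement:", pairwise_agreement(model_a, model_b))
print("agreement expected from accuracies alone:", independence_baseline(acc_a, acc_b))
```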
Protecting GANs against privacy attacks by preventing overfitting
Title | Protecting GANs against privacy attacks by preventing overfitting |
Authors | Sumit Mukherjee, Yixi Xu, Anusua Trivedi, Juan Lavista Ferres |
Abstract | Generative Adversarial Networks (GANs) have made the release of synthetic images a viable approach to sharing data without releasing the original dataset. It has been shown that such synthetic data can be used for a variety of downstream tasks, such as training classifiers that would otherwise require the original dataset to be shared. However, recent work has shown that the GAN models and their synthetically generated data can be used to infer training set membership by an adversary who has access to the entire dataset and some auxiliary information. Here we develop a new GAN architecture (privGAN) which provides protection against this mode of attack while leading to negligible loss in downstream performance. Our architecture explicitly prevents overfitting to the training set, thereby providing implicit protection against white-box attacks. The main contributions of this paper are: i) we propose a novel GAN architecture that can generate synthetic data in a privacy-preserving manner and demonstrate the effectiveness of our model against white-box attacks on several benchmark datasets, ii) we provide a theoretical understanding of the optimal solution of the GAN loss function, iii) we demonstrate on two common benchmark datasets that synthetic images generated by privGAN lead to negligible loss in downstream performance when compared against non-private GANs. While we have focused on benchmarking privGAN exclusively on image datasets, the architecture of privGAN is not exclusive to image datasets and can easily be extended to other types of data. |
Tasks | |
Published | 2019-12-31 |
URL | https://arxiv.org/abs/2001.00071v2 |
https://arxiv.org/pdf/2001.00071v2.pdf | |
PWC | https://paperswithcode.com/paper/protecting-gans-against-privacy-attacks-by |
Repo | |
Framework | |
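The threat model privGAN defends against uses access to a trained GAN to decide whether particular examples were in its training set. A common white-box baseline scores candidates with the discriminator, since an overfit discriminator tends to assign higher scores to training members. The sketch below illustrates that baseline attack only, not the privGAN defence itself; the `discriminator` callable and the threshold choice are assumptions.

```python
import numpy as np

def membership_scores(discriminator, candidates):
    """White-box membership-inference baseline: collect the discriminator's
    realness score for each candidate example."""
    return np.array([float(discriminator(x)) for x in candidates])

def infer_membership(discriminator, candidates, threshold):
    """Label a candidate as a 'training member' if its discriminator score
    exceeds a threshold chosen by the adversary (e.g. on auxiliary data)."""
    return membership_scores(discriminator, candidates) > threshold

# A defence that prevents overfitting (as privGAN aims to) shrinks the score gap
# between training members and non-members, pushing this attack toward chance.
```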
One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation
Title | One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation |
Authors | Hongsen Liu, Yang Cong, Yandong Tang |
Abstract | We propose a single-shot method for simultaneous 3D object segmentation and 6-DOF pose estimation in pure 3D point cloud scenes, based on the consensus that \emph{one point only belongs to one object}, i.e., each point has the potential to predict the 6-DOF pose of its corresponding object. Unlike recently proposed methods for this task, which rely on 2D detectors to predict the projections of the 3D corners of the bounding boxes so that the 6-DOF pose must then be estimated by a PnP-like spatial transformation method, ours is concise enough not to require an additional spatial transformation between different dimensions. Due to the lack of training data for many objects, recently proposed 2D detection methods generate training data with a rendering engine and achieve good results. However, rendering in 3D space along with 6-DOF poses is relatively difficult. Therefore, we propose an augmented reality technique to generate the training data in a semi-virtual 3D space. The key component of our method is a multi-task CNN architecture that can simultaneously predict 3D object segmentation and 6-DOF pose in pure 3D point clouds. For experimental evaluation, we generate expanded training data for two state-of-the-art 3D object datasets \cite{PLCHF}\cite{TLINEMOD} using Augmented Reality (AR) technology. We evaluate our proposed method on the two datasets. The results show that our method generalizes well to multiple scenarios and provides performance comparable to or better than the state of the art. |
Tasks | Pose Estimation, Semantic Segmentation |
Published | 2019-12-27 |
URL | https://arxiv.org/abs/1912.12095v1 |
https://arxiv.org/pdf/1912.12095v1.pdf | |
PWC | https://paperswithcode.com/paper/one-point-one-object-simultaneous-3d-object |
Repo | |
Framework | |
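A minimal reading of the "one point, one object" idea above is a network that emits, for every input point, both a segmentation label and a 6-DOF pose hypothesis for the object that point belongs to, with per-point hypotheses later aggregated (e.g. by voting). The heads below are a schematic sketch; the backbone, feature size, and quaternion rotation parameterization are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PerPointHeads(nn.Module):
    """Per-point multi-task heads: each point predicts a class label and a
    6-DOF pose (3-D translation + rotation, here as a unit quaternion)."""
    def __init__(self, feat_dim=128, num_classes=13):
        super().__init__()
        self.seg_head = nn.Linear(feat_dim, num_classes)   # segmentation logits
        self.trans_head = nn.Linear(feat_dim, 3)            # object translation
        self.rot_head = nn.Linear(feat_dim, 4)              # quaternion rotation

    def forward(self, point_feats):                          # (B, N, feat_dim)
        seg_logits = self.seg_head(point_feats)              # (B, N, num_classes)
        translation = self.trans_head(point_feats)           # (B, N, 3)
        quat = nn.functional.normalize(self.rot_head(point_feats), dim=-1)  # (B, N, 4)
        return seg_logits, translation, quat

# Poses are predicted per point; points assigned to the same object can then be
# aggregated (e.g. averaged or voted) into a single 6-DOF estimate per object.
```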
Joint Adversarial Training: Incorporating both Spatial and Pixel Attacks
Title | Joint Adversarial Training: Incorporating both Spatial and Pixel Attacks |
Authors | Haichao Zhang, Jianyu Wang |
Abstract | Conventional adversarial training methods use attacks that manipulate pixel values directly and individually, leading to models that are less robust in the face of spatial transformation-based attacks. In this paper, we propose a joint adversarial training method that incorporates both spatial transformation-based and pixel-value-based attacks to improve model robustness. We introduce a spatial transformation-based attack with an explicit notion of budget and develop an algorithm for spatial attack generation. We further integrate both pixel and spatial attacks into one generation model and show how to leverage their complementary strengths during training to improve overall model robustness. Extensive experimental results on different benchmark datasets, compared with state-of-the-art methods, verify the effectiveness of the proposed method. |
Tasks | |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10737v2 |
https://arxiv.org/pdf/1907.10737v2.pdf | |
PWC | https://paperswithcode.com/paper/joint-adversarial-training-incorporating-both |
Repo | |
Framework | |
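The joint scheme described above combines a budgeted spatial attack (a small geometric transformation of the image) with an ordinary pixel-space perturbation. The sketch below composes a random budgeted affine warp with a one-step FGSM pixel perturbation to illustrate the joint idea; the budgets, the single gradient step, and the random (rather than worst-case) spatial sampling are simplifications, not the paper's attack generator.

```python
import torch
import torch.nn.functional as F

def spatial_attack(images, max_translate=0.05, max_rotate=0.1):
    """Budgeted spatial attack: apply a small random rotation/translation.
    (The paper searches for the worst case; random sampling keeps this short.)"""
    b = images.size(0)
    angle = (torch.rand(b, device=images.device) * 2 - 1) * max_rotate
    tx = (torch.rand(b, device=images.device) * 2 - 1) * max_translate
    ty = (torch.rand(b, device=images.device) * 2 - 1) * max_translate
    theta = torch.zeros(b, 2, 3, device=images.device)
    theta[:, 0, 0] = torch.cos(angle); theta[:, 0, 1] = -torch.sin(angle); theta[:, 0, 2] = tx
    theta[:, 1, 0] = torch.sin(angle); theta[:, 1, 1] = torch.cos(angle);  theta[:, 1, 2] = ty
    grid = F.affine_grid(theta, images.shape, align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)

def pixel_attack(model, images, labels, eps=8 / 255):
    """One-step FGSM pixel attack within an L_inf budget."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    return (images + eps * grad.sign()).clamp(0, 1).detach()

def joint_adversarial_batch(model, images, labels):
    """Joint attack: spatial transformation followed by a pixel perturbation."""
    return pixel_attack(model, spatial_attack(images), labels)
```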
Is Pretraining Necessary for Hyperspectral Image Classification?
Title | Is Pretraining Necessary for Hyperspectral Image Classification? |
Authors | Hyungtae Lee, Sungmin Eum, Heesung Kwon |
Abstract | We address two questions for training a convolutional neural network (CNN) for hyperspectral image classification: i) is it possible to build a pre-trained network? and ii) is pre-training effective in improving performance? To answer the first question, we have devised an approach that pre-trains a network on multiple source datasets that differ in their hyperspectral characteristics and fine-tunes it on a target dataset. This approach effectively resolves the architectural issue that arises when transferring meaningful information between the source and target networks. To answer the second question, we carried out several ablation experiments. Based on the experimental results, a network trained from scratch performs as well as a network fine-tuned from a pre-trained network. However, we observed that pre-training the network has its own advantage in achieving better performance when deeper networks are required. |
Tasks | Hyperspectral Image Classification, Image Classification |
Published | 2019-01-24 |
URL | http://arxiv.org/abs/1901.08658v1 |
http://arxiv.org/pdf/1901.08658v1.pdf | |
PWC | https://paperswithcode.com/paper/is-pretraining-necessary-for-hyperspectral |
Repo | |
Framework | |
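One practical obstacle the abstract alludes to is that hyperspectral datasets differ in their number of spectral bands, so weights pre-trained on one source cannot be loaded directly onto a target with a different input depth. A common workaround is to reload every layer whose shape matches and reinitialize the band-dependent layers; the sketch below shows that generic pattern with a toy CNN and is not the authors' specific architecture or transfer scheme.

```python
import torch.nn as nn

def make_cnn(num_bands, num_classes):
    """Toy hyperspectral CNN: only the first conv depends on the number of bands."""
    return nn.Sequential(
        nn.Conv2d(num_bands, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, num_classes),
    )

def transfer_weights(pretrained, target):
    """Copy every pre-trained parameter whose name and shape match the target,
    leaving band-dependent layers (e.g. the first conv) at their fresh init."""
    src = pretrained.state_dict()
    dst = target.state_dict()
    dst.update({k: v for k, v in src.items() if k in dst and v.shape == dst[k].shape})
    target.load_state_dict(dst)
    return target

source_net = make_cnn(num_bands=103, num_classes=9)     # e.g. a pre-training source
target_net = transfer_weights(source_net, make_cnn(num_bands=200, num_classes=16))
```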
Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness
Title | Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness |
Authors | Magdalini Paschali, Walter Simson, Abhijit Guha Roy, Muhammad Ferjad Naeem, Rüdiger Göbl, Christian Wachinger, Nassir Navab |
Abstract | In this paper we propose a novel augmentation technique that improves not only the performance of deep neural networks on clean test data, but also significantly increases their robustness to random transformations, both affine and projective. Inspired by ManiFool, the augmentation is performed by a line-search manifold-exploration method that learns affine geometric transformations that lead to the misclassification of an image, while ensuring that it remains on the same manifold as the training data. This augmentation method populates any training dataset with images that lie on the border of the manifolds between two classes and maximizes the variance the network is exposed to during training. Our method was thoroughly evaluated on the challenging tasks of fine-grained skin lesion classification from limited data and breast tumor classification from mammograms. Compared with traditional augmentation methods, and with images synthesized by Generative Adversarial Networks, our method not only achieves state-of-the-art performance but also significantly improves the network's robustness. |
Tasks | Data Augmentation, Skin Lesion Classification |
Published | 2019-01-14 |
URL | http://arxiv.org/abs/1901.04420v1 |
http://arxiv.org/pdf/1901.04420v1.pdf | |
PWC | https://paperswithcode.com/paper/data-augmentation-with-manifold-exploring |
Repo | |
Framework | |
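The augmentation above is described as a line search along affine geometric transformations until the transformed image is misclassified. A minimal sketch of that idea: scale up an affine perturbation until the classifier's prediction flips, then keep the transformed image as an augmented sample. The transformation parameterization, step schedule, and absence of an explicit manifold check are simplifying assumptions.

```python
import torch
import torch.nn.functional as F

def apply_affine(image, theta):
    """Warp a single image (1, C, H, W) with a 2x3 affine matrix."""
    grid = F.affine_grid(theta.unsqueeze(0), image.shape, align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

def manifool_style_augment(model, image, label, direction, steps=10, max_scale=1.0):
    """Line search along one affine direction: increase its magnitude until the
    model misclassifies, then return that borderline image as an augmentation."""
    identity = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    for i in range(1, steps + 1):
        scale = max_scale * i / steps
        warped = apply_affine(image, identity + scale * direction)
        if model(warped).argmax(dim=1).item() != label:
            return warped            # first transformation that crosses the boundary
    return None                      # no misclassification within the budget

# `direction` is a small 2x3 tensor (e.g. a slight rotation/shear/translation);
# in the paper the search is performed on the manifold of geometric transformations.
```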
Unsupervised and Unregistered Hyperspectral Image Super-Resolution with Mutual Dirichlet-Net
Title | Unsupervised and Unregistered Hyperspectral Image Super-Resolution with Mutual Dirichlet-Net |
Authors | Ying Qu, Hairong Qi, Chiman Kwan |
Abstract | Hyperspectral images (HSI) provide rich spectral information that has contributed to successful performance improvements in numerous computer vision tasks. However, this comes at the expense of the images' spatial resolution. Hyperspectral image super-resolution (HSI-SR) addresses this problem by fusing a low resolution (LR) HSI with a multispectral image (MSI) carrying much higher spatial resolution (HR). All existing HSI-SR approaches require the LR HSI and HR MSI to be well registered, and the reconstruction accuracy of the HR HSI relies heavily on the registration accuracy of the different modalities. This paper exploits the uncharted problem domain of HSI-SR without the requirement of multi-modality registration. Given the unregistered LR HSI and HR MSI with overlapping regions, we design a unique unsupervised learning structure linking the two unregistered modalities by projecting them into the same statistical space through the same encoder. Mutual information (MI) is further adopted to capture the non-linear statistical dependencies between the representations from the two modalities (carrying spatial information) and their raw inputs. By maximizing the MI, spatial correlations between different modalities can be well characterized to further reduce spectral distortion. A collaborative $l_{2,1}$ norm is employed as the reconstruction error instead of the more common $l_2$ norm, so that individual pixels can be recovered as accurately as possible. With this design, the network is able to extract correlated spectral and spatial information from unregistered images while better preserving the spectral information. The proposed method is referred to as unregistered and unsupervised mutual Dirichlet-Net ($u^2$-MDN). Extensive experimental results using benchmark HSI datasets demonstrate the superior performance of $u^2$-MDN compared to the state-of-the-art. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2019-04-27 |
URL | https://arxiv.org/abs/1904.12175v2 |
https://arxiv.org/pdf/1904.12175v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-and-unregistered-hyperspectral |
Repo | |
Framework | |
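One concrete ingredient the abstract singles out is the collaborative $l_{2,1}$ reconstruction error: an L2 norm over the spectral bands of each pixel followed by an L1 sum over pixels, so that every pixel's spectral error contributes rather than a few pixels dominating. The sketch below implements that norm; the (pixels x bands) tensor layout is an assumption.

```python
import torch

def l21_reconstruction_error(x, x_hat):
    """Collaborative l_{2,1} error: L2 over the spectral bands of each pixel,
    summed (L1) over pixels. Shapes: (num_pixels, num_bands)."""
    per_pixel_l2 = torch.linalg.norm(x - x_hat, dim=1)   # (num_pixels,)
    return per_pixel_l2.sum()

# Compared with a plain squared-L2 loss, each pixel contributes its own spectral
# error magnitude, encouraging accurate recovery of individual pixels.
x = torch.rand(1000, 31)           # toy HSI: 1000 pixels, 31 bands
x_hat = x + 0.01 * torch.randn_like(x)
print(l21_reconstruction_error(x, x_hat))
```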
Session-Based Hotel Recommendations: Challenges and Future Directions
Title | Session-Based Hotel Recommendations: Challenges and Future Directions |
Authors | Jens Adamczak, Gerard-Paul Leyson, Peter Knees, Yashar Deldjoo, Farshad Bakhshandegan Moghaddam, Julia Neidhardt, Wolfgang Wörndl, Philipp Monreal |
Abstract | In the year 2019, the Recommender Systems Challenge deals with a real-world task from the area of e-tourism for the first time, namely the recommendation of hotels in booking sessions. In this context, this article aims at identifying and investigating what we believe are important domain-specific challenges recommendation systems research in hotel search is facing, from both academic and industry perspectives. We focus on three main challenges, namely dealing with (1) multiple stakeholders and value-awareness in recommendations, (2) sparsity of user data and the extensive cold-start problem, and (3) dynamic input data and computational requirements. To this end, we review the state of the art toward solving these challenges and discuss shortcomings. We detail possible future directions and visions we contemplate for the further evolution of the field. This article should, therefore, serve two purposes: giving the interested reader an overview of current challenges in the field and inspiring new approaches for the ACM Recommender Systems Challenge 2019 and beyond. |
Tasks | Recommendation Systems |
Published | 2019-07-31 |
URL | https://arxiv.org/abs/1908.00071v1 |
https://arxiv.org/pdf/1908.00071v1.pdf | |
PWC | https://paperswithcode.com/paper/session-based-hotel-recommendations |
Repo | |
Framework | |
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
Title | XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering |
Authors | Jasdeep Singh, Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher |
Abstract | While natural language processing systems often focus on a single language, multilingual transfer learning has the potential to improve performance, especially for low-resource languages. We introduce XLDA, cross-lingual data augmentation, a method that replaces a segment of the input text with its translation in another language. XLDA enhances performance of all 14 tested languages of the cross-lingual natural language inference (XNLI) benchmark. With improvements of up to 4.8%, training with XLDA achieves state-of-the-art performance for Greek, Turkish, and Urdu. XLDA is in contrast to, and performs markedly better than, a more naive approach that aggregates examples in various languages in a way that each example is solely in one language. On the SQuAD question answering task, we see that XLDA provides a 1.0% performance increase on the English evaluation set. Comprehensive experiments suggest that most languages are effective as cross-lingual augmentors, that XLDA is robust to a wide range of translation quality, and that XLDA is even more effective for randomly initialized models than for pretrained models. |
Tasks | Cross-Lingual Natural Language Inference, Data Augmentation, Natural Language Inference, Question Answering, Transfer Learning |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11471v1 |
https://arxiv.org/pdf/1905.11471v1.pdf | |
PWC | https://paperswithcode.com/paper/xlda-cross-lingual-data-augmentation-for |
Repo | |
Framework | |
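XLDA's core operation is simple: replace one segment of a training example (e.g. the premise of an NLI pair) with its translation into another language, keeping the rest of the example and its label unchanged. A minimal sketch is below, assuming a `translations` lookup keyed by (sentence, language); the paper itself uses machine-translated versions of the benchmark data, and the toy table here is purely illustrative.

```python
import random

def xlda_augment(premise, hypothesis, label, translations, languages):
    """Cross-lingual data augmentation: swap one segment of the input for its
    translation in a randomly chosen language; the label is unchanged."""
    lang = random.choice(languages)
    if random.random() < 0.5:
        premise = translations[(premise, lang)]
    else:
        hypothesis = translations[(hypothesis, lang)]
    return premise, hypothesis, label

# Illustrative usage with a toy translation table (assumed, not from the paper).
translations = {
    ("A man is playing a guitar.", "de"): "Ein Mann spielt Gitarre.",
    ("Someone is making music.", "de"): "Jemand macht Musik.",
}
print(xlda_augment("A man is playing a guitar.", "Someone is making music.",
                   "entailment", translations, languages=["de"]))
```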
Clustering by Optimizing the Average Silhouette Width
Title | Clustering by Optimizing the Average Silhouette Width |
Authors | Fatima Batool, Christian Hennig |
Abstract | In this paper, we propose a unified clustering approach that can estimate the number of clusters and produce a clustering for that number simultaneously. The average silhouette width (ASW) is a widely used standard cluster quality index. We define a distance-based objective function that optimizes the ASW for clustering. The proposed algorithm, named OSil, needs only the data observations as input, without any prior knowledge of the number of clusters. This work is a thorough investigation of the proposed methodology, its usefulness and its limitations. A vast spectrum of clustering structures was generated, and several well-known clustering methods, including partitioning, hierarchical, density-based, and spatial methods, were considered as competitors of the proposed methodology. Simulations reveal that the OSil algorithm shows superior performance in terms of clustering quality compared to all clustering methods included in the study. OSil can find well-separated, compact clusters and shows better performance in estimating the number of clusters than several methods. Apart from the proposal of the new methodology and its investigation, this paper offers a systematic analysis of the estimation of cluster indices, some of which have never appeared together in a comparative simulation setup before. The study offers many insightful findings useful for the selection of clustering methods and indices. |
Tasks | |
Published | 2019-10-18 |
URL | https://arxiv.org/abs/1910.08644v1 |
https://arxiv.org/pdf/1910.08644v1.pdf | |
PWC | https://paperswithcode.com/paper/clustering-by-optimizing-the-average |
Repo | |
Framework | |
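Since the objective above is the average silhouette width itself, a minimal sketch is to evaluate candidate clusterings by their ASW and keep the best one, which also yields an estimate of the number of clusters. The snippet below uses k-means solutions over a range of k as the candidates; this is only an illustration of optimizing ASW, not the OSil algorithm, which optimizes the criterion directly over cluster assignments.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_clustering_by_asw(X, k_range=range(2, 11), random_state=0):
    """Pick the clustering (and hence the number of clusters) with the highest
    average silhouette width among k-means solutions for each k."""
    best = (-1.0, None, None)
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit_predict(X)
        asw = silhouette_score(X, labels)
        if asw > best[0]:
            best = (asw, k, labels)
    return best  # (ASW, estimated number of clusters, labels)

X = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [6, 0], [0, 6])])
asw, k, labels = best_clustering_by_asw(X)
print(f"estimated clusters: {k}, ASW: {asw:.3f}")
```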
TextNAS: A Neural Architecture Search Space tailored for Text Representation
Title | TextNAS: A Neural Architecture Search Space tailored for Text Representation |
Authors | Yujing Wang, Yaming Yang, Yiren Chen, Jing Bai, Ce Zhang, Guinan Su, Xiaoyu Kou, Yunhai Tong, Mao Yang, Lidong Zhou |
Abstract | Learning text representation is crucial for text classification and other language-related tasks. There is a diverse set of text representation networks in the literature, and finding the optimal one is a non-trivial problem. Recently, the emerging Neural Architecture Search (NAS) techniques have demonstrated good potential to solve the problem. Nevertheless, most existing works on NAS focus on the search algorithms and pay little attention to the search space. In this paper, we argue that the search space is also an important human prior for the success of NAS in different applications. Thus, we propose a novel search space tailored for text representation. Through automatic search, the discovered network architecture outperforms state-of-the-art models on various public datasets for text classification and natural language inference tasks. Furthermore, some of the design principles found in the automatically discovered networks agree well with human intuition. |
Tasks | Natural Language Inference, Neural Architecture Search, Text Classification |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10729v1 |
https://arxiv.org/pdf/1912.10729v1.pdf | |
PWC | https://paperswithcode.com/paper/textnas-a-neural-architecture-search-space |
Repo | |
Framework | |
Interactive Topic Modeling with Anchor Words
Title | Interactive Topic Modeling with Anchor Words |
Authors | Sanjoy Dasgupta, Stefanos Poulis, Christopher Tosh |
Abstract | The formalism of anchor words has enabled the development of fast topic modeling algorithms with provable guarantees. In this paper, we introduce a protocol that allows users to interact with anchor words to build customized and interpretable topic models. Experimental evidence validating the usefulness of our approach is also presented. |
Tasks | Topic Models |
Published | 2019-06-18 |
URL | https://arxiv.org/abs/1907.04919v1 |
https://arxiv.org/pdf/1907.04919v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-topic-modeling-with-anchor-words |
Repo | |
Framework | |
Spatial-Temporal-Textual Point Processes with Applications in Crime Linkage Detection
Title | Spatial-Temporal-Textual Point Processes with Applications in Crime Linkage Detection |
Authors | Shixiang Zhu, Yao Xie |
Abstract | Crimes emerge out of complex interactions of human behaviors and situations. Linkages between crime events are highly complex. Detecting crime linkage given a set of events is a highly challenging task since we only have limited information, including text descriptions, event times, and locations. In practice, there are very few labels. We propose a new statistical modeling framework for spatio-temporal-textual data and demonstrate its usage on crime linkage detection. We capture linkages of crime incidents via multivariate marked spatio-temporal Hawkes processes and treat embedding vectors of the free-text as marks of incidents. This is inspired by the notion of modus operandi (M.O.) in crime analysis. We also reduce the implicit bias in text documents before embedding to remove any potential discrimination of our algorithm. Numerical results using real data demonstrate the good performance of our method. The proposed method can be widely used in other similar data in social networks, electronic health records, etc. |
Tasks | Point Processes |
Published | 2019-02-01 |
URL | https://arxiv.org/abs/1902.00440v4 |
https://arxiv.org/pdf/1902.00440v4.pdf | |
PWC | https://paperswithcode.com/paper/crime-linkage-detection-by-spatio-temporal |
Repo | |
Framework | |
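The model above links incidents through a marked spatio-temporal Hawkes process in which the text embedding of an incident acts as its mark, so past incidents with a similar modus operandi raise the intensity of related future events. The sketch below writes down a simplified conditional intensity with exponential temporal decay, a Gaussian spatial kernel, and cosine similarity between text-embedding marks; the exact kernels and parameterization in the paper differ.

```python
import numpy as np

def intensity(t, s, m, history, mu=0.1, alpha=0.5, beta=1.0, sigma=1.0):
    """Simplified marked spatio-temporal Hawkes intensity at time t, location s
    (2-D), with text-embedding mark m. `history` is a list of (t_i, s_i, m_i)."""
    lam = mu                                                        # background rate
    for t_i, s_i, m_i in history:
        if t_i >= t:
            continue
        temporal = np.exp(-beta * (t - t_i))                        # exponential decay
        spatial = np.exp(-np.sum((s - s_i) ** 2) / (2 * sigma**2))  # Gaussian kernel
        mark_sim = float(m @ m_i / (np.linalg.norm(m) * np.linalg.norm(m_i) + 1e-12))
        lam += alpha * temporal * spatial * max(mark_sim, 0.0)      # similar M.O. excites
    return lam

history = [(0.0, np.array([0.0, 0.0]), np.array([1.0, 0.0])),
           (1.0, np.array([0.5, 0.2]), np.array([0.9, 0.1]))]
print(intensity(2.0, np.array([0.4, 0.1]), np.array([1.0, 0.05]), history))
```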
Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension
Title | Giving BERT a Calculator: Finding Operations and Arguments with Reading Comprehension |
Authors | Daniel Andor, Luheng He, Kenton Lee, Emily Pitler |
Abstract | Reading comprehension models have been successfully applied to extractive text answers, but it is unclear how best to generalize these models to abstractive numerical answers. We enable a BERT-based reading comprehension model to perform lightweight numerical reasoning. We augment the model with a predefined set of executable ‘programs’ which encompass simple arithmetic as well as extraction. Rather than having to learn to manipulate numbers directly, the model can pick a program and execute it. On the recent Discrete Reasoning Over Passages (DROP) dataset, designed to challenge reading comprehension models, we show a 33% absolute improvement by adding shallow programs. The model can learn to predict new operations when appropriate in a math word problem setting (Roy and Roth, 2015) with very few training examples. |
Tasks | Reading Comprehension |
Published | 2019-08-31 |
URL | https://arxiv.org/abs/1909.00109v2 |
https://arxiv.org/pdf/1909.00109v2.pdf | |
PWC | https://paperswithcode.com/paper/giving-bert-a-calculator-finding-operations |
Repo | |
Framework | |
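The key mechanism above is a small set of predefined executable 'programs' (span extraction plus simple arithmetic over numbers found in the passage) from which the model picks one and fills in the arguments, rather than generating the number itself. The sketch below shows what executing such a program might look like once the model has chosen an operation and its arguments; the program inventory and names are illustrative, not the paper's exact set.

```python
from typing import Callable, Dict

# A toy inventory of executable programs over numeric arguments (assumed names).
PROGRAMS: Dict[str, Callable[[float, float], float]] = {
    "span": lambda a, b: a,          # return an extracted value unchanged
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
}

def execute(program: str, arg1: float, arg2: float = 0.0) -> float:
    """Execute the program the reading-comprehension model selected, using the
    numeric arguments it pointed to in the passage."""
    return PROGRAMS[program](arg1, arg2)

# e.g. passage: "...scored 24 points in the first half and 17 in the second..."
# question: "How many more points were scored in the first half than the second?"
print(execute("subtract", 24, 17))   # -> 7
```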