Paper Group AWR 453
On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference. Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology. Receding Horizon Curiosity. GLTR: Statistical Detection and Visualization of Generated Text. $\Sigma$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction …
On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference
Title | On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference |
Authors | Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush |
Abstract | Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via adversarial learning may be less biased, with only small drops in NLI accuracy. |
Tasks | Natural Language Inference |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04389v1 |
https://arxiv.org/pdf/1907.04389v1.pdf | |
PWC | https://paperswithcode.com/paper/on-adversarial-removal-of-hypothesis-only-1 |
Repo | https://github.com/azpoliak/robust-nli |
Framework | pytorch |
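The adversarial setup described in the abstract is commonly implemented with a gradient reversal layer: the encoder feeds both the main NLI classifier and a hypothesis-only adversary, and the adversary's gradient is negated before it reaches the encoder. Below is a minimal PyTorch sketch of that pattern; the linear heads and the lambda weighting are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negated, scaled gradient on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

dim, n_labels = 256, 3
nli_head = nn.Linear(dim, n_labels)   # main classifier (premise + hypothesis representation)
adv_head = nn.Linear(dim, n_labels)   # adversary: predicts the label from the hypothesis alone

def combined_loss(pair_repr, hyp_repr, labels, lam=1.0):
    main = F.cross_entropy(nli_head(pair_repr), labels)
    # The adversary trains to exploit hypothesis-only signal, while the
    # reversed gradient pushes the encoder to remove that signal.
    adv = F.cross_entropy(adv_head(GradientReversal.apply(hyp_repr, lam)), labels)
    return main + adv
```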
Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology
Title | Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology |
Authors | Nima Dehmamy, Albert-László Barabási, Rose Yu |
Abstract | To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding paths of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We analyze the expressiveness of GCNs theoretically, concluding that a modular GCN design using different propagation rules with residual connections can significantly improve the performance of GCNs. We demonstrate that such modular designs are capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that depth is much more influential than width, with deeper GCNs being more capable of learning higher order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs. |
Tasks | Graph Generation |
Published | 2019-07-11 |
URL | https://arxiv.org/abs/1907.05008v2 |
https://arxiv.org/pdf/1907.05008v2.pdf | |
PWC | https://paperswithcode.com/paper/understanding-the-representation-power-of |
Repo | https://github.com/nimadehmamy/Understanding-GCN |
Framework | none |
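For concreteness, the graph moments the abstract refers to can be computed as normalized traces of adjacency-matrix powers, which count closed walks of a given length. A minimal NumPy sketch (normalizing by the node count n is our choice):

```python
import numpy as np

def graph_moments(adj: np.ndarray, max_p: int = 4) -> np.ndarray:
    """Return [tr(A^1)/n, ..., tr(A^max_p)/n] for adjacency matrix A."""
    n = adj.shape[0]
    moments, power = [], np.eye(n)
    for _ in range(max_p):
        power = power @ adj          # A^p, whose trace counts closed walks of length p
        moments.append(np.trace(power) / n)
    return np.array(moments)

# Even on three nodes, moments separate a triangle from a path of the same size.
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
print(graph_moments(tri), graph_moments(path))  # the 3rd moment differs (triangles)
```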
Receding Horizon Curiosity
Title | Receding Horizon Curiosity |
Authors | Matthias Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters |
Abstract | Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion. A principled treatment of the problem of optimal input synthesis for system identification is provided within the framework of sequential Bayesian experimental design. In this paper, we present an effective trajectory-optimization-based approximate solution of this otherwise intractable problem that models optimal exploration in an unknown Markov decision process (MDP). By interleaving episodic exploration with Bayesian nonlinear system identification, our algorithm takes advantage of the inductive bias to explore in a directed manner, without assuming prior knowledge of the MDP. Empirical evaluations indicate a clear advantage of the proposed algorithm in terms of the rate of convergence and the final model fidelity when compared to intrinsic-motivation-based algorithms employing exploration bonuses such as prediction error and information gain. Moreover, our method maintains a computational advantage over a recent model-based active exploration (MAX) algorithm, by focusing on the information gain along trajectories instead of seeking a global exploration policy. A reference implementation of our algorithm and the conducted experiments is publicly available. |
Tasks | Efficient Exploration |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03620v1 |
https://arxiv.org/pdf/1910.03620v1.pdf | |
PWC | https://paperswithcode.com/paper/receding-horizon-curiosity |
Repo | https://github.com/mschulth/rhc |
Framework | none |
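The receding-horizon scheme in the abstract can be summarized as: plan an action sequence that maximizes expected information gain under the current Bayesian model, execute only the first action, update the posterior, and replan. A skeleton follows; the `expected_information_gain` and `update_posterior` methods and the random-shooting planner are hypothetical stand-ins for the paper's trajectory optimizer and system-identification machinery.

```python
import numpy as np

def plan(model, state, horizon, n_candidates=64, rng=np.random):
    """Random-shooting stand-in for the trajectory optimizer."""
    best_seq, best_gain = None, -np.inf
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=horizon)
        gain = model.expected_information_gain(state, seq)  # assumed method
        if gain > best_gain:
            best_seq, best_gain = seq, gain
    return best_seq

def explore(env, model, steps=100, horizon=10):
    state = env.reset()
    for _ in range(steps):
        action = plan(model, state, horizon)[0]   # receding horizon: first action only
        next_state = env.step(action)
        model.update_posterior(state, action, next_state)  # Bayesian system identification
        state = next_state
```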
GLTR: Statistical Detection and Visualization of Generated Text
Title | GLTR: Statistical Detection and Visualization of Generated Text |
Authors | Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush |
Abstract | The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection rate of fake text from 54% to 72% without any prior training. GLTR is open-source and publicly deployed, and has already been widely used to detect generated outputs. |
Tasks | Human Detection, Text Generation |
Published | 2019-06-10 |
URL | https://arxiv.org/abs/1906.04043v1 |
https://arxiv.org/pdf/1906.04043v1.pdf | |
PWC | https://paperswithcode.com/paper/gltr-statistical-detection-and-visualization |
Repo | https://github.com/HendrikStrobelt/detecting-fake-text |
Framework | none |
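The core statistic GLTR visualizes is the rank of each observed token under a language model's predictive distribution: machine-generated text tends to stay within the model's top guesses. A minimal sketch using the Hugging Face GPT-2 API; GLTR buckets ranks at 10/100/1000 (its green/yellow/red scheme), of which only the top-10 fraction is reproduced here.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def token_ranks(text: str):
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits[0]                # (seq_len, vocab)
    ranks = []
    for pos in range(ids.shape[1] - 1):
        order = torch.argsort(logits[pos], descending=True)
        next_id = ids[0, pos + 1]                 # the token that actually came next
        ranks.append((order == next_id).nonzero().item())
    return ranks  # rank 0 == the model's single most likely continuation

ranks = token_ranks("The quick brown fox jumps over the lazy dog.")
print(sum(r < 10 for r in ranks) / len(ranks))    # fraction of "top-10" tokens
```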
$\Sigma$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction
Title | $\Sigma$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction |
Authors | Kerstin Hammernik, Jo Schlemper, Chen Qin, Jinming Duan, Ronald M. Summers, Daniel Rueckert |
Abstract | Purpose: To systematically investigate the influence of various data consistency layers, (semi-)supervised learning and ensembling strategies, defined in a $\Sigma$-net, for accelerated parallel MR image reconstruction using deep learning. Theory and Methods: MR image reconstruction is formulated as a learned unrolled optimization scheme with a Down-Up network as regularization and varying data consistency layers. The different architectures are split into sensitivity networks, which rely on explicit coil sensitivity maps, and parallel coil networks, which learn the combination of coils implicitly. Different content and adversarial losses, a semi-supervised fine-tuning scheme and model ensembling are investigated. Results: Evaluated on the fastMRI multicoil validation set, architectures involving raw k-space data outperform image enhancement methods significantly. Semi-supervised fine-tuning adapts to new k-space data and provides, together with reconstructions based on adversarial training, the visually most appealing results, although quantitative quality metrics are reduced. The $\Sigma$-net ensembles the benefits of different models and achieves similar scores to the single state-of-the-art approaches. Conclusion: This work provides an open-source framework to perform a systematic wide-range comparison of state-of-the-art reconstruction approaches for parallel MR image reconstruction on the fastMRI knee dataset and explores the importance of data consistency. A suitable trade-off between perceptual image quality and quantitative scores is achieved with the ensembled $\Sigma$-net. |
Tasks | Image Enhancement, Image Reconstruction |
Published | 2019-12-18 |
URL | https://arxiv.org/abs/1912.09278v1 |
https://arxiv.org/pdf/1912.09278v1.pdf | |
PWC | https://paperswithcode.com/paper/-net-systematic-evaluation-of-iterative-deep |
Repo | https://github.com/khammernik/sigmanet |
Framework | pytorch |
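A data consistency layer, the building block these unrolled schemes interleave with a CNN regularizer, can be written as one gradient step on the k-space fidelity term, x ← x − λ F⁻¹M(Fx − y). The sketch below uses the single-coil form for clarity; the paper's sensitivity networks extend this with coil sensitivity maps.

```python
import torch

def data_consistency(x, y, mask, lam=1.0):
    """One gradient-descent data-consistency step.
    x: current image estimate (H, W), complex; y: measured k-space;
    mask: sampling mask (1 where k-space was acquired)."""
    k = torch.fft.fft2(x)
    residual = mask * (k - y)               # disagreement at sampled k-space locations
    return x - lam * torch.fft.ifft2(residual)

# One unrolled iteration then alternates a CNN regularization step with data consistency:
#   x = x - cnn(x); x = data_consistency(x, y, mask)
```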
Robust Parameter-Free Season Length Detection in Time Series
Title | Robust Parameter-Free Season Length Detection in Time Series |
Authors | Maximilian Toller, Roman Kern |
Abstract | The in-depth analysis of time series has gained a lot of research interest in recent years, with the identification of periodic patterns being one important aspect. Many of the methods for identifying periodic patterns require a time series’ season length as an input parameter. Only a few algorithms exist for automatic season length approximation, and many of these rely on simplifications such as data discretization and user-defined parameters. This paper presents an algorithm for season length detection that is designed to be sufficiently reliable for practical applications and requires no input other than the time series to be analyzed. The algorithm estimates a time series’ season length by interpolating, filtering and detrending the data, and then analyzing the distances between zeros in the corresponding autocorrelation function. Our algorithm was tested against a comparable algorithm and outperformed it by passing 122 out of 165 tests, while the existing algorithm passed 83 tests. The robustness of our method can be attributed jointly to the algorithmic approach and to design decisions taken at the implementation level. |
Tasks | Time Series |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.06015v1 |
https://arxiv.org/pdf/1911.06015v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-parameter-free-season-length-detection |
Repo | https://github.com/mtoller/autocorr_season_length_detection |
Framework | none |
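The estimator at the heart of the abstract, stripped of the robustness machinery, is: detrend the series, compute its autocorrelation function, and read the season length off the spacing of the ACF's zero crossings (two crossings per period). A bare NumPy sketch, with linear detrending as a simplifying assumption:

```python
import numpy as np

def estimate_season_length(x: np.ndarray) -> float:
    t = np.arange(len(x))
    x = x - np.polyval(np.polyfit(t, x, 1), t)            # remove linear trend
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]    # autocorrelation, lags >= 0
    acf /= acf[0]
    zero_crossings = np.where(np.diff(np.sign(acf)) != 0)[0]
    # The ACF of a periodic signal crosses zero twice per period.
    return 2.0 * np.median(np.diff(zero_crossings))

t = np.arange(400)
noisy = np.sin(2 * np.pi * t / 50) + 0.1 * np.random.randn(400)
print(estimate_season_length(noisy))   # approximately 50
```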
Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition
Title | Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition |
Authors | Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin |
Abstract | Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency. However, current methods cannot locate the semantic regions accurately due to the lack of part-level supervision or semantic guidance. Moreover, they cannot fully explore the mutual interactions among the semantic regions and do not explicitly model the label co-occurrence. To address these issues, we propose a Semantic-Specific Graph Representation Learning (SSGRL) framework that consists of two crucial modules: 1) a semantic decoupling module that incorporates category semantics to guide learning semantic-specific representations and 2) a semantic interaction module that correlates these representations with a graph built on the statistical label co-occurrence and explores their interactions via a graph propagation mechanism. Extensive experiments on public benchmarks show that our SSGRL framework outperforms current state-of-the-art methods by a sizable margin, e.g. with an mAP improvement of 2.5%, 2.6%, 6.7%, and 3.1% on the PASCAL VOC 2007 & 2012, Microsoft-COCO and Visual Genome benchmarks, respectively. Our codes and models are available at https://github.com/HCPLab-SYSU/SSGRL. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07325v1 |
https://arxiv.org/pdf/1908.07325v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-semantic-specific-graph |
Repo | https://github.com/HCPLab-SYSU/SSGRL |
Framework | pytorch |
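The semantic interaction module can be pictured as one graph-propagation step over a label co-occurrence graph: each category's semantic-specific feature vector is refined by its co-occurrence neighbors. The sketch below uses a generic residual GNN update in PyTorch rather than the authors' exact gated mechanism, and a random co-occurrence matrix as a stand-in for the dataset statistics.

```python
import torch
import torch.nn as nn

num_classes, dim = 80, 512
cooc = torch.rand(num_classes, num_classes)            # stand-in for P(label_j | label_i)
adj = cooc / cooc.sum(dim=1, keepdim=True)             # row-normalized co-occurrence graph

propagate = nn.Linear(dim, dim)
classifier = nn.Linear(dim, 1)                         # shared per-category scorer
features = torch.randn(num_classes, dim)               # semantic-decoupled features, one per category

# Each category representation is refined by its co-occurrence neighbors (residual update).
refined = torch.relu(propagate(adj @ features)) + features
logits = classifier(refined).squeeze(-1)               # one logit per label
```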
Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition
Title | Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition |
Authors | Martin Mundt, Sagnik Majumder, Iuliia Pliushch, Visvanathan Ramesh |
Abstract | We introduce a probabilistic approach to unify deep continual learning with open set recognition, based on variational Bayesian inference. Our single model combines a joint probabilistic encoder with a generative model and a linear classifier that are shared across sequentially arriving tasks. In order to successfully distinguish unseen unknown data from trained known tasks, we propose to bound the class-specific approximate posterior by fitting regions of high density on the basis of correctly classified data points. These bounds are further used to significantly alleviate catastrophic forgetting by avoiding samples from low-density areas in generative replay. Our approach requires neither storing of old data nor upfront knowledge of future data, and is empirically validated on visual and audio tasks in class-incremental, as well as cross-dataset, scenarios across modalities. |
Tasks | Audio Classification, Bayesian Inference, Continual Learning, Open Set Learning |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.12019v3 |
https://arxiv.org/pdf/1905.12019v3.pdf | |
PWC | https://paperswithcode.com/paper/unified-probabilistic-deep-continual-learning |
Repo | https://github.com/MrtnMndt/OCDVAE_ContinualLearning |
Framework | pytorch |
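The open-set mechanism sketched in the abstract, reduced to its essentials: per class, fit the distribution of latent distances for correctly classified training points, then declare a new input "unknown" if its latent code falls outside every class's high-density region. A percentile ball around the class mean stands in for the paper's exact bound:

```python
import numpy as np

def fit_class_bounds(latents: np.ndarray, labels: np.ndarray, percentile: float = 95):
    """latents: (N, d) codes of correctly classified training points."""
    bounds = {}
    for c in np.unique(labels):
        z = latents[labels == c]
        mu = z.mean(axis=0)
        dists = np.linalg.norm(z - mu, axis=1)
        bounds[c] = (mu, np.percentile(dists, percentile))  # center + radius of the ball
    return bounds

def is_known(z: np.ndarray, bounds) -> bool:
    """Accept z only if it falls inside some class's high-density region."""
    return any(np.linalg.norm(z - mu) <= r for mu, r in bounds.values())
```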
Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis
Title | Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis |
Authors | Katsuhiko Ishiguro, Shin-ichi Maeda, Masanori Koyama |
Abstract | The Graph Neural Network (GNN) is a popular architecture for the analysis of chemical molecules, with numerous applications in material and medicinal science. Current GNNs developed for molecular analysis, however, do not fit well on the training set, and their performance does not scale well with the complexity of the network. In this paper, we propose an auxiliary module to be attached to a GNN that boosts the representation power of the model without hindering the original GNN architecture. Our auxiliary module can be attached to a wide variety of GNNs, including those commonly used in biochemical applications. With our auxiliary architecture, the performance of many GNNs used in practice improves more consistently, achieving state-of-the-art performance on popular molecular graph datasets. |
Tasks | |
Published | 2019-02-04 |
URL | https://arxiv.org/abs/1902.01020v4 |
https://arxiv.org/pdf/1902.01020v4.pdf | |
PWC | https://paperswithcode.com/paper/graph-warp-module-an-auxiliary-module-for |
Repo | https://github.com/pfnet-research/chainer-chemistry |
Framework | none |
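One way to read the auxiliary-module idea is as a global "supernode" that pools information from all node states and broadcasts it back, giving any two nodes a two-hop communication path regardless of graph distance. The sketch below uses plain linear updates; the actual module employs gated (GRU-style) updates and attention.

```python
import torch
import torch.nn as nn

class SuperNodeModule(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.to_super = nn.Linear(dim, dim)    # nodes -> supernode
        self.to_nodes = nn.Linear(dim, dim)    # supernode -> nodes

    def forward(self, node_states, super_state):
        # Gather: the supernode absorbs a pooled summary of all node states.
        new_super = torch.tanh(super_state + self.to_super(node_states.mean(dim=0)))
        # Scatter: every node receives the (transformed) global state.
        new_nodes = node_states + torch.tanh(self.to_nodes(new_super))
        return new_nodes, new_super

module = SuperNodeModule(64)
nodes, sup = module(torch.randn(30, 64), torch.zeros(64))  # 30-node toy graph
```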
Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers
Title | Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers |
Authors | Daniel Liu, Ronald Yu, Hao Su |
Abstract | 3D object classification and segmentation using deep neural networks have been extremely successful. As the problem of identifying 3D objects has many safety-critical applications, neural networks have to be robust against adversarial changes to the input data. There is a growing body of research on generating human-imperceptible adversarial attacks and defenses against them in the 2D image classification domain. However, 3D objects differ from 2D images in many ways, and this specific domain has not been rigorously studied so far. We present a preliminary evaluation of adversarial attacks on deep 3D point cloud classifiers, namely PointNet and PointNet++, by evaluating both white-box and black-box adversarial attacks that were proposed for 2D images and extending those attacks to reduce the perceptibility of the perturbations in 3D space. We also show the high effectiveness of simple defenses against those attacks by proposing new defenses that exploit the unique structure of 3D point clouds. Finally, we attempt to explain the effectiveness of the defenses through the intrinsic structures of both the point clouds and the neural network architectures. Overall, we find that networks that process 3D point cloud data are vulnerable to adversarial attacks, but also more easily defended than 2D image classifiers. Our investigation provides the groundwork for future studies on improving the robustness of deep neural networks that handle 3D data. |
Tasks | 3D Object Classification, Image Classification, Object Classification |
Published | 2019-01-10 |
URL | https://arxiv.org/abs/1901.03006v4 |
https://arxiv.org/pdf/1901.03006v4.pdf | |
PWC | https://paperswithcode.com/paper/extending-adversarial-attacks-and-defenses-to |
Repo | https://github.com/Daniel-Liu-c0deb0t/3D-Neural-Network-Adversarial-Attacks |
Framework | tf |
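A representative example of the simple, structure-exploiting defenses the abstract mentions is statistical outlier removal: delete points whose mean distance to their k nearest neighbors is anomalously large, since adversarial perturbations often push points off the underlying surface. The mean + α·std threshold below is an assumed choice:

```python
import numpy as np

def remove_outliers(points: np.ndarray, k: int = 10, alpha: float = 1.5) -> np.ndarray:
    """points: (N, 3) point cloud; returns the filtered cloud."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    knn_mean = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)  # skip self-distance at index 0
    keep = knn_mean < knn_mean.mean() + alpha * knn_mean.std()  # drop statistical outliers
    return points[keep]

cloud = np.random.randn(1024, 3)
print(remove_outliers(cloud).shape)
```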
Unsupervised Sketch-to-Photo Synthesis
Title | Unsupervised Sketch-to-Photo Synthesis |
Authors | Runtao Liu, Qian Yu, Stella Yu |
Abstract | Humans can envision a realistic photo given a free-hand sketch that is not only spatially imprecise and geometrically distorted but also lacking in colors and visual details. We study unsupervised sketch-to-photo synthesis for the first time, learning from unpaired sketch-photo data where the target photo for a sketch is unknown during training. Existing works handle either style change or spatial deformation alone, synthesizing photos from edge-aligned line drawings or transforming shapes within the same modality, e.g., color images. Our key insight is to decompose unsupervised sketch-to-photo synthesis into a two-stage translation task: first shape translation from sketches to grayscale photos, then content enrichment from grayscale to color photos. We also incorporate a self-supervised denoising objective and an attention module to handle the abstraction and style variations that are inherent and specific to sketches. Our synthesis is sketch-faithful and photo-realistic, enabling sketch-based image retrieval in practice. An exciting corollary product is a universal and promising sketch generator that captures human visual perception beyond the edge map of a photo. |
Tasks | Colorization, Data Augmentation, Denoising, Image Generation, Image Retrieval, Sketch-Based Image Retrieval |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08313v3 |
https://arxiv.org/pdf/1909.08313v3.pdf | |
PWC | https://paperswithcode.com/paper/an-unpaired-sketch-to-photo-translation-model |
Repo | https://github.com/rt219/Unpaired-Sketch-to-Photo-Translation |
Framework | none |
Segmentation-Based Deep-Learning Approach for Surface-Defect Detection
Title | Segmentation-Based Deep-Learning Approach for Surface-Defect Detection |
Authors | Domen Tabernik, Samo Šela, Jure Skvarč, Danijel Skočaj |
Abstract | Automated surface-anomaly detection using machine learning has become an interesting and promising area of research, with a very high and direct impact on the application domain of visual inspection. Deep-learning methods have become the most suitable approaches for this task. They allow the inspection system to learn to detect the surface anomaly by simply showing it a number of exemplar images. This paper presents a segmentation-based deep-learning architecture that is designed for the detection and segmentation of surface anomalies and is demonstrated on a specific domain of surface-crack detection. The design of the architecture enables the model to be trained using a small number of samples, which is an important requirement for practical applications. The proposed model is compared with related deep-learning methods, including state-of-the-art commercial software, and outperforms them on the specific domain of surface-crack detection. The large number of experiments also sheds light on the required precision of the annotation, the number of required training samples and the required computational cost. Experiments are performed on a newly created dataset based on a real-world quality control case and demonstrate that the proposed approach is able to learn on a small number of defective surfaces, using only approximately 25-30 defective training samples instead of hundreds or thousands, which is usually the case in deep-learning applications. This makes the deep-learning method practical for use in industry, where the number of available defective samples is limited. The dataset is also made publicly available to encourage the development and evaluation of new methods for surface-defect detection. |
Tasks | Anomaly Detection |
Published | 2019-03-20 |
URL | https://arxiv.org/abs/1903.08536v3 |
https://arxiv.org/pdf/1903.08536v3.pdf | |
PWC | https://paperswithcode.com/paper/segmentation-based-deep-learning-approach-for |
Repo | https://github.com/seanXYZ/SegDecNet |
Framework | pytorch |
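The two-part design described above can be condensed as: a segmentation network produces a pixel-wise defect map, and a small decision network on top of that map yields the per-image defect verdict. A toy PyTorch rendering with illustrative layer sizes, not the paper's architecture:

```python
import torch
import torch.nn as nn

class SegDecNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.seg = nn.Sequential(                      # segmentation stage
            nn.Conv2d(1, 32, 5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 1, 1),                       # per-pixel defect score
        )
        self.dec = nn.Linear(2, 1)                     # decision stage

    def forward(self, x):
        mask = self.seg(x)
        # Global pooling of the defect map feeds the per-image decision.
        stats = torch.cat([mask.amax(dim=(2, 3)), mask.mean(dim=(2, 3))], dim=1)
        return mask, self.dec(stats)                   # pixel map + image-level logit

net = SegDecNet()
mask, logit = net(torch.randn(4, 1, 128, 128))         # batch of grayscale surfaces
```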
Image Inpainting with Learnable Bidirectional Attention Maps
Title | Image Inpainting with Learnable Bidirectional Attention Maps |
Authors | Chaohao Xie, Shaohui Liu, Chao Li, Ming-Ming Cheng, Wangmeng Zuo, Xiao Liu, Shilei Wen, Errui Ding |
Abstract | Most convolutional neural network (CNN)-based inpainting methods adopt standard convolutions that treat valid pixels and holes indistinguishably, making them limited in handling irregular holes and more likely to generate inpainting results with color discrepancy and blurriness. Partial convolution has been suggested to address this issue, but it adopts handcrafted feature re-normalization and only considers forward mask-updating. In this paper, we present a learnable attention map module for learning feature re-normalization and mask-updating in an end-to-end manner, which is effective in adapting to irregular holes and to the propagation through convolution layers. Furthermore, learnable reverse attention maps are introduced to allow the decoder of the U-Net to concentrate on filling in irregular holes instead of reconstructing both holes and known regions, resulting in our learnable bidirectional attention maps. Qualitative and quantitative experiments show that our method performs favorably against state-of-the-art methods in generating sharper, more coherent and visually plausible inpainting results. The source code and pre-trained models will be available. |
Tasks | Image Inpainting |
Published | 2019-09-03 |
URL | https://arxiv.org/abs/1909.00968v3 |
https://arxiv.org/pdf/1909.00968v3.pdf | |
PWC | https://paperswithcode.com/paper/image-inpainting-with-learnable-bidirectional |
Repo | https://github.com/Vious/LBAM_inpainting |
Framework | pytorch |
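Schematically, the learnable attention map replaces partial convolution's handcrafted re-normalization: features convolved over valid pixels are re-weighted by an attention map computed from the mask, and the mask itself is updated by a learned function. The activation and mask-update rule below are simplified assumptions:

```python
import torch
import torch.nn as nn

class LearnableAttentionConv(nn.Module):
    def __init__(self, cin: int, cout: int):
        super().__init__()
        self.feat_conv = nn.Conv2d(cin, cout, 3, padding=1)
        self.mask_conv = nn.Conv2d(1, cout, 3, padding=1, bias=False)

    def forward(self, x, mask):
        feat = self.feat_conv(x * mask)              # convolve only over valid pixels
        attn = torch.sigmoid(self.mask_conv(mask))   # learned re-normalization map
        new_mask = torch.clamp(self.mask_conv(mask), 0, 1).mean(1, keepdim=True)
        return feat * attn, new_mask                 # re-weighted features + updated mask

layer = LearnableAttentionConv(3, 16)
out, new_mask = layer(torch.randn(2, 3, 64, 64), torch.ones(2, 1, 64, 64))
```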
Event detection in Twitter: A keyword volume approach
Title | Event detection in Twitter: A keyword volume approach |
Authors | Ahmad Hany Hossny, Lewis Mitchell |
Abstract | Event detection using social media streams needs a set of informative features with strong signals that need minimal preprocessing and are highly associated with events of interest. Identifying these informative features as keywords from Twitter is challenging, as people use informal language to express their thoughts and feelings. This informality includes acronyms, misspelled words, synonyms, transliteration and ambiguous terms. In this paper, we propose an efficient method to select the keywords frequently used in Twitter that are mostly associated with events of interest such as protests. The volume of these keywords is tracked in real time to identify the events of interest in a binary classification scheme. We use keywords within word-pairs to capture the context. The proposed method is to binarize vectors of daily counts for each word-pair by applying a spike detection temporal filter, then use the Jaccard metric to measure the similarity of the binary vector for each word-pair with the binary vector describing event occurrence. The top n word-pairs are used as features to classify any day to be an event or non-event day. The selected features are tested using multiple classifiers such as Naive Bayes, SVM, Logistic Regression, KNN and decision trees. They all produced AUC ROC scores up to 0.91 and F1 scores up to 0.79. The experiment is performed using the English language in multiple cities such as Melbourne, Sydney and Brisbane as well as the Indonesian language in Jakarta. The two experiments, comprising different languages and locations, yielded similar results. |
Tasks | Transliteration |
Published | 2019-01-03 |
URL | http://arxiv.org/abs/1901.00570v1 |
http://arxiv.org/pdf/1901.00570v1.pdf | |
PWC | https://paperswithcode.com/paper/event-detection-in-twitter-a-keyword-volume |
Repo | https://github.com/vsatyav007/repo-eventdetection |
Framework | none |
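The feature-selection pipeline in the abstract translates almost directly into code: binarize each word-pair's daily count series with a spike filter, score it by Jaccard similarity against the binary event-day vector, and keep the top-n pairs as classifier features. The spike threshold (mean + 2·std) is our assumption:

```python
import numpy as np

def spike_binarize(counts: np.ndarray) -> np.ndarray:
    """counts: (n_days,) daily volume for one word-pair -> binary spike vector."""
    return (counts > counts.mean() + 2 * counts.std()).astype(int)

def jaccard(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def top_n_pairs(count_matrix: np.ndarray, event_days: np.ndarray, n: int = 50):
    """count_matrix: (n_pairs, n_days); event_days: binary (n_days,) ground truth."""
    scores = [jaccard(spike_binarize(row), event_days) for row in count_matrix]
    return np.argsort(scores)[::-1][:n]   # indices of the most event-aligned pairs

# The selected rows then feed a standard classifier (Naive Bayes, SVM, KNN, ...).
```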
Global Textual Relation Embedding for Relational Understanding
Title | Global Textual Relation Embedding for Relational Understanding |
Authors | Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su |
Abstract | Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks. In this work, we investigate how to learn a general-purpose embedding of textual relations, defined as the shortest dependency path between entities. Textual relation embedding provides a level of knowledge between word/phrase level and sentence level, and we show that it can facilitate downstream tasks requiring relational understanding of the text. To learn such an embedding, we create the largest distant supervision dataset by linking the entire English ClueWeb09 corpus to Freebase. We use global co-occurrence statistics between textual and knowledge base relations as the supervision signal to train the embedding. Evaluation on two relational understanding tasks demonstrates the usefulness of the learned textual relation embedding. The data and code can be found at https://github.com/czyssrs/GloREPlus |
Tasks | Sentence Embeddings, Word Embeddings |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00550v1 |
https://arxiv.org/pdf/1906.00550v1.pdf | |
PWC | https://paperswithcode.com/paper/190600550 |
Repo | https://github.com/czyssrs/GloREPlus |
Framework | tf |