January 31, 2020

Paper Group AWR 453

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference

Title On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference
Authors Yonatan Belinkov, Adam Poliak, Stuart M. Shieber, Benjamin Van Durme, Alexander M. Rush
Abstract Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via adversarial learning may be less biased, with only small drops in NLI accuracy.
Tasks Natural Language Inference
Published 2019-07-09
URL https://arxiv.org/abs/1907.04389v1
PDF https://arxiv.org/pdf/1907.04389v1.pdf
PWC https://paperswithcode.com/paper/on-adversarial-removal-of-hypothesis-only-1
Repo https://github.com/azpoliak/robust-nli
Framework pytorch

Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

Title Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology
Authors Nima Dehmamy, Albert-László Barabási, Rose Yu
Abstract To deepen our understanding of graph neural networks, we investigate the representation power of Graph Convolutional Networks (GCN) through the looking glass of graph moments, a key property of graph topology encoding paths of various lengths. We find that GCNs are rather restrictive in learning graph moments. Without careful design, GCNs can fail miserably even with multiple layers and nonlinear activation functions. We theoretically analyze the expressiveness of GCNs and conclude that a modular GCN design, using different propagation rules with residual connections, could significantly improve the performance of GCNs. We demonstrate that such modular designs are capable of distinguishing graphs from different graph generation models for surprisingly small graphs, a notoriously difficult problem in network science. Our investigation suggests that depth is much more influential than width, with deeper GCNs being more capable of learning higher-order graph moments. Additionally, combining GCN modules with different propagation rules is critical to the representation power of GCNs.
Tasks Graph Generation
Published 2019-07-11
URL https://arxiv.org/abs/1907.05008v2
PDF https://arxiv.org/pdf/1907.05008v2.pdf
PWC https://paperswithcode.com/paper/understanding-the-representation-power-of
Repo https://github.com/nimadehmamy/Understanding-GCN
Framework none
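
The abstract above describes graph moments as a property that encodes paths of various lengths. As a rough illustration only, the sketch below takes one common reading of that idea: the order-p moment of a node is the number of walks of length p starting at it, i.e. the row sums of the p-th power of the adjacency matrix. This definition and the helper names `matmul` and `graph_moment` are assumptions made for illustration, not code from the paper's repository.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
        for row in a
    ]


def graph_moment(adj, order):
    """Row sums of adj**order: walks of length `order` starting at each node.

    Assumed definition for illustration; the paper may use a more
    general form.
    """
    n = len(adj)
    # Start from the identity matrix (adj**0).
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        power = matmul(power, adj)
    return [sum(row) for row in power]


# Path graph 0 - 1 - 2 (hypothetical toy input).
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

for p in range(4):
    print(p, graph_moment(path, p))
```

Order 1 recovers the node degrees; higher orders mix in information from nodes further away, which is one intuition for why deeper networks would be needed to represent them.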

Receding Horizon Curiosity

Title Receding Horizon Curiosity
Authors Matthias Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters
Abstract Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion. A principled treatment of the problem of optimal input synthesis for system identification is provided within the framework of sequential Bayesian experimental design. In this paper, we present an effective trajectory-optimization-based approximate solution of this otherwise intractable problem that models optimal exploration in an unknown Markov decision process (MDP). By interleaving episodic exploration with Bayesian nonlinear system identification, our algorithm takes advantage of the inductive bias to explore in a directed manner, without assuming prior knowledge of the MDP. Empirical evaluations indicate a clear advantage of the proposed algorithm in terms of the rate of convergence and the final model fidelity when compared to intrinsic-motivation-based algorithms employing exploration bonuses such as prediction error and information gain. Moreover, our method maintains a computational advantage over a recent model-based active exploration (MAX) algorithm, by focusing on the information gain along trajectories instead of seeking a global exploration policy. A reference implementation of our algorithm and the conducted experiments is publicly available.
Tasks Efficient Exploration
Published 2019-10-08
URL https://arxiv.org/abs/1910.03620v1
PDF https://arxiv.org/pdf/1910.03620v1.pdf
PWC https://paperswithcode.com/paper/receding-horizon-curiosity
Repo https://github.com/mschulth/rhc
Framework none

GLTR: Statistical Detection and Visualization of Generated Text

Title GLTR: Statistical Detection and Visualization of Generated Text
Authors Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
Abstract The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection-rate of fake text from 54% to 72% without any prior training. GLTR is open-source and publicly deployed, and has already been widely used to detect generated outputs.
Tasks Human Detection, Text Generation
Published 2019-06-10
URL https://arxiv.org/abs/1906.04043v1
PDF https://arxiv.org/pdf/1906.04043v1.pdf
PWC https://paperswithcode.com/paper/gltr-statistical-detection-and-visualization
Repo https://github.com/HendrikStrobelt/detecting-fake-text
Framework none

$Σ$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction

Title $Σ$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction
Authors Kerstin Hammernik, Jo Schlemper, Chen Qin, Jinming Duan, Ronald M. Summers, Daniel Rueckert
Abstract Purpose: To systematically investigate the influence of various data consistency layers, (semi-)supervised learning and ensembling strategies, defined in a $\Sigma$-net, for accelerated parallel MR image reconstruction using deep learning. Theory and Methods: MR image reconstruction is formulated as a learned unrolled optimization scheme with a Down-Up network as regularization and varying data consistency layers. The different architectures are split into sensitivity networks, which rely on explicit coil sensitivity maps, and parallel coil networks, which learn the combination of coils implicitly. Different content and adversarial losses, a semi-supervised fine-tuning scheme and model ensembling are investigated. Results: Evaluated on the fastMRI multicoil validation set, architectures involving raw k-space data outperform image enhancement methods significantly. Semi-supervised fine-tuning adapts to new k-space data and provides, together with reconstructions based on adversarial training, the visually most appealing results although quantitative quality metrics are reduced. The $\Sigma$-net ensembles the benefits from different models and achieves similar scores compared to the single state-of-the-art approaches. Conclusion: This work provides an open-source framework to perform a systematic wide-range comparison of state-of-the-art reconstruction approaches for parallel MR image reconstruction on the fastMRI knee dataset and explores the importance of data consistency. A suitable trade-off between perceptual image quality and quantitative scores is achieved with the ensembled $\Sigma$-net.
Tasks Image Enhancement, Image Reconstruction
Published 2019-12-18
URL https://arxiv.org/abs/1912.09278v1
PDF https://arxiv.org/pdf/1912.09278v1.pdf
PWC https://paperswithcode.com/paper/-net-systematic-evaluation-of-iterative-deep
Repo https://github.com/khammernik/sigmanet
Framework pytorch

Robust Parameter-Free Season Length Detection in Time Series

Title Robust Parameter-Free Season Length Detection in Time Series
Authors Maximilian Toller, Roman Kern
Abstract The in-depth analysis of time series has gained a lot of research interest in recent years, with the identification of periodic patterns being one important aspect. Many of the methods for identifying periodic patterns require the time series’ season length as an input parameter. There exist only a few algorithms for automatic season length approximation. Many of these rely on simplifications such as data discretization and user-defined parameters. This paper presents an algorithm for season length detection that is designed to be sufficiently reliable to be used in practical applications and does not require any input other than the time series to be analyzed. The algorithm estimates a time series’ season length by interpolating, filtering and detrending the data. This is followed by analyzing the distances between zeros in the directly corresponding autocorrelation function. Our algorithm was tested against a comparable algorithm and outperformed it by passing 122 out of 165 tests, while the existing algorithm passed 83 tests. The robustness of our method can be jointly attributed to both the algorithmic approach and also to design decisions taken at the implementational level.
Tasks Time Series
Published 2019-11-14
URL https://arxiv.org/abs/1911.06015v1
PDF https://arxiv.org/pdf/1911.06015v1.pdf
PWC https://paperswithcode.com/paper/robust-parameter-free-season-length-detection
Repo https://github.com/mtoller/autocorr_season_length_detection
Framework none
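
The abstract above estimates the season length from the distances between zeros of the autocorrelation function. The following is a minimal sketch of that single step, not the authors' implementation: it skips the interpolation and filtering stages, removes only the mean rather than a trend, and assumes that successive zeros lie half a season apart (which holds for a clean sinusoid), so the estimate is twice the median gap. The function names are hypothetical.

```python
import math
from statistics import median


def autocorrelation(series):
    """Normalized autocorrelation for lags 0 .. len(series)//2 - 1."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    var = sum(c * c for c in centered)
    if var == 0:
        return []  # constant series: autocorrelation is undefined
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / var
        for lag in range(n // 2)
    ]


def zero_crossings(values):
    """Linearly interpolated positions where `values` changes sign."""
    crossings = []
    for i in range(len(values) - 1):
        a, b = values[i], values[i + 1]
        if a == 0.0:
            crossings.append(float(i))
        elif a * b < 0:
            crossings.append(i + a / (a - b))
    return crossings


def estimate_season_length(series):
    """Twice the median distance between autocorrelation zeros, or None."""
    zeros = zero_crossings(autocorrelation(series))
    if len(zeros) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(zeros, zeros[1:])]
    return round(2 * median(gaps))


# Hypothetical test signal: a sine wave with a period of 12 samples.
signal = [math.sin(2 * math.pi * t / 12) for t in range(240)]
print(estimate_season_length(signal))
```

Using the median of the gaps rather than the first gap keeps a single spurious zero crossing from dominating the estimate; real data would still need the preprocessing the abstract lists.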

Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition

Title Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition
Authors Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin
Abstract Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency. However, current methods cannot locate the semantic regions accurately due to the lack of part-level supervision or semantic guidance. Moreover, they cannot fully explore the mutual interactions among the semantic regions and do not explicitly model the label co-occurrence. To address these issues, we propose a Semantic-Specific Graph Representation Learning (SSGRL) framework that consists of two crucial modules: 1) a semantic decoupling module that incorporates category semantics to guide learning semantic-specific representations and 2) a semantic interaction module that correlates these representations with a graph built on the statistical label co-occurrence and explores their interactions via a graph propagation mechanism. Extensive experiments on public benchmarks show that our SSGRL framework outperforms current state-of-the-art methods by a sizable margin, e.g. with an mAP improvement of 2.5%, 2.6%, 6.7%, and 3.1% on the PASCAL VOC 2007 & 2012, Microsoft-COCO and Visual Genome benchmarks, respectively. Our codes and models are available at https://github.com/HCPLab-SYSU/SSGRL.
Tasks Graph Representation Learning, Representation Learning
Published 2019-08-20
URL https://arxiv.org/abs/1908.07325v1
PDF https://arxiv.org/pdf/1908.07325v1.pdf
PWC https://paperswithcode.com/paper/learning-semantic-specific-graph
Repo https://github.com/HCPLab-SYSU/SSGRL
Framework pytorch

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

Title Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition
Authors Martin Mundt, Sagnik Majumder, Iuliia Pliushch, Visvanathan Ramesh
Abstract We introduce a probabilistic approach to unify deep continual learning with open set recognition, based on variational Bayesian inference. Our single model combines a joint probabilistic encoder with a generative model and a linear classifier that get shared across sequentially arriving tasks. In order to successfully distinguish unseen unknown data from trained known tasks, we propose to bound the class-specific approximate posterior by fitting regions of high density on the basis of correctly classified data points. These bounds are further used to significantly alleviate catastrophic forgetting by avoiding samples from low-density areas in generative replay. Our approach requires neither storage of old data nor upfront knowledge of future data and is empirically validated on visual and audio tasks in class-incremental as well as cross-dataset scenarios across modalities.
Tasks Audio Classification, Bayesian Inference, Continual Learning, Open Set Learning
Published 2019-05-28
URL https://arxiv.org/abs/1905.12019v3
PDF https://arxiv.org/pdf/1905.12019v3.pdf
PWC https://paperswithcode.com/paper/unified-probabilistic-deep-continual-learning
Repo https://github.com/MrtnMndt/OCDVAE_ContinualLearning
Framework pytorch

Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis

Title Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis
Authors Katsuhiko Ishiguro, Shin-ichi Maeda, Masanori Koyama
Abstract Graph Neural Network (GNN) is a popular architecture for the analysis of chemical molecules, and it has numerous applications in material and medicinal science. Current lines of GNNs developed for molecular analysis, however, do not fit well on the training set, and their performance does not scale well with the complexity of the network. In this paper, we propose an auxiliary module to be attached to a GNN that can boost the representation power of the model without interfering with the original GNN architecture. Our auxiliary module can be attached to a wide variety of GNNs, including those that are used commonly in biochemical applications. With our auxiliary architecture, the performances of many GNNs used in practice improve more consistently, achieving the state-of-the-art performance on popular molecular graph datasets.
Tasks
Published 2019-02-04
URL https://arxiv.org/abs/1902.01020v4
PDF https://arxiv.org/pdf/1902.01020v4.pdf
PWC https://paperswithcode.com/paper/graph-warp-module-an-auxiliary-module-for
Repo https://github.com/pfnet-research/chainer-chemistry
Framework none

Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers

Title Extending Adversarial Attacks and Defenses to Deep 3D Point Cloud Classifiers
Authors Daniel Liu, Ronald Yu, Hao Su
Abstract 3D object classification and segmentation using deep neural networks has been extremely successful. As the problem of identifying 3D objects has many safety-critical applications, the neural networks have to be robust against adversarial changes to the input data set. There is a growing body of research on generating human-imperceptible adversarial attacks and defenses against them in the 2D image classification domain. However, 3D objects have various differences with 2D images, and this specific domain has not been rigorously studied so far. We present a preliminary evaluation of adversarial attacks on deep 3D point cloud classifiers, namely PointNet and PointNet++, by evaluating both white-box and black-box adversarial attacks that were proposed for 2D images and extending those attacks to reduce the perceptibility of the perturbations in 3D space. We also show the high effectiveness of simple defenses against those attacks by proposing new defenses that exploit the unique structure of 3D point clouds. Finally, we attempt to explain the effectiveness of the defenses through the intrinsic structures of both the point clouds and the neural network architectures. Overall, we find that networks that process 3D point cloud data are weak to adversarial attacks, but they are also more easily defensible compared to 2D image classifiers. Our investigation will provide the groundwork for future studies on improving the robustness of deep neural networks that handle 3D data.
Tasks 3D Object Classification, Image Classification, Object Classification
Published 2019-01-10
URL https://arxiv.org/abs/1901.03006v4
PDF https://arxiv.org/pdf/1901.03006v4.pdf
PWC https://paperswithcode.com/paper/extending-adversarial-attacks-and-defenses-to
Repo https://github.com/Daniel-Liu-c0deb0t/3D-Neural-Network-Adversarial-Attacks
Framework tf

Unsupervised Sketch-to-Photo Synthesis

Title Unsupervised Sketch-to-Photo Synthesis
Authors Runtao Liu, Qian Yu, Stella Yu
Abstract Humans can envision a realistic photo given a free-hand sketch that is not only spatially imprecise and geometrically distorted but also without colors and visual details. We study unsupervised sketch-to-photo synthesis for the first time, learning from unpaired sketch-photo data where the target photo for a sketch is unknown during training. Existing works only deal with style change or spatial deformation alone, synthesizing photos from edge-aligned line drawings or transforming shapes within the same modality, e.g., color images. Our key insight is to decompose unsupervised sketch-to-photo synthesis into a two-stage translation task: First shape translation from sketches to grayscale photos and then content enrichment from grayscale to color photos. We also incorporate a self-supervised denoising objective and an attention module to handle abstraction and style variations that are inherent and specific to sketches. Our synthesis is sketch-faithful and photo-realistic to enable sketch-based image retrieval in practice. An exciting corollary product is a universal and promising sketch generator that captures human visual perception beyond the edge map of a photo.
Tasks Colorization, Data Augmentation, Denoising, Image Generation, Image Retrieval, Sketch-Based Image Retrieval
Published 2019-09-18
URL https://arxiv.org/abs/1909.08313v3
PDF https://arxiv.org/pdf/1909.08313v3.pdf
PWC https://paperswithcode.com/paper/an-unpaired-sketch-to-photo-translation-model
Repo https://github.com/rt219/Unpaired-Sketch-to-Photo-Translation
Framework none

Segmentation-Based Deep-Learning Approach for Surface-Defect Detection

Title Segmentation-Based Deep-Learning Approach for Surface-Defect Detection
Authors Domen Tabernik, Samo Šela, Jure Skvarč, Danijel Skočaj
Abstract Automated surface-anomaly detection using machine learning has become an interesting and promising area of research, with a very high and direct impact on the application domain of visual inspection. Deep-learning methods have become the most suitable approaches for this task. They allow the inspection system to learn to detect the surface anomaly by simply showing it a number of exemplar images. This paper presents a segmentation-based deep-learning architecture that is designed for the detection and segmentation of surface anomalies and is demonstrated on a specific domain of surface-crack detection. The design of the architecture enables the model to be trained using a small number of samples, which is an important requirement for practical applications. The proposed model is compared with the related deep-learning methods, including the state-of-the-art commercial software, showing that the proposed approach outperforms the related methods on the specific domain of surface-crack detection. The large number of experiments also sheds light on the required precision of the annotation, the number of required training samples and the required computational cost. Experiments are performed on a newly created dataset based on a real-world quality control case and demonstrate that the proposed approach is able to learn on a small number of defective surfaces, using only approximately 25-30 defective training samples, instead of hundreds or thousands, which is usually the case in deep-learning applications. This makes the deep-learning method practical for use in industry where the number of available defective samples is limited. The dataset is also made publicly available to encourage the development and evaluation of new methods for surface-defect detection.
Tasks Anomaly Detection
Published 2019-03-20
URL https://arxiv.org/abs/1903.08536v3
PDF https://arxiv.org/pdf/1903.08536v3.pdf
PWC https://paperswithcode.com/paper/segmentation-based-deep-learning-approach-for
Repo https://github.com/seanXYZ/SegDecNet
Framework pytorch

Image Inpainting with Learnable Bidirectional Attention Maps

Title Image Inpainting with Learnable Bidirectional Attention Maps
Authors Chaohao Xie, Shaohui Liu, Chao Li, Ming-Ming Cheng, Wangmeng Zuo, Xiao Liu, Shilei Wen, Errui Ding
Abstract Most convolutional network (CNN)-based inpainting methods adopt standard convolution to indistinguishably treat valid pixels and holes, making them limited in handling irregular holes and more likely to generate inpainting results with color discrepancy and blurriness. Partial convolution has been suggested to address this issue, but it adopts handcrafted feature re-normalization, and only considers forward mask-updating. In this paper, we present a learnable attention map module for learning feature re-normalization and mask-updating in an end-to-end manner, which is effective in adapting to irregular holes and propagation of convolution layers. Furthermore, learnable reverse attention maps are introduced to allow the decoder of U-Net to concentrate on filling in irregular holes instead of reconstructing both holes and known regions, resulting in our learnable bidirectional attention maps. Qualitative and quantitative experiments show that our method performs favorably against state-of-the-art methods in generating sharper, more coherent and visually plausible inpainting results. The source code and pre-trained models will be available.
Tasks Image Inpainting
Published 2019-09-03
URL https://arxiv.org/abs/1909.00968v3
PDF https://arxiv.org/pdf/1909.00968v3.pdf
PWC https://paperswithcode.com/paper/image-inpainting-with-learnable-bidirectional
Repo https://github.com/Vious/LBAM_inpainting
Framework pytorch

Event detection in Twitter: A keyword volume approach

Title Event detection in Twitter: A keyword volume approach
Authors Ahmad Hany Hossny, Lewis Mitchell
Abstract Event detection using social media streams needs a set of informative features with strong signals that need minimal preprocessing and are highly associated with events of interest. Identifying these informative features as keywords from Twitter is challenging, as people use informal language to express their thoughts and feelings. This informality includes acronyms, misspelled words, synonyms, transliteration and ambiguous terms. In this paper, we propose an efficient method to select the keywords frequently used in Twitter that are mostly associated with events of interest such as protests. The volume of these keywords is tracked in real time to identify the events of interest in a binary classification scheme. We use keywords within word-pairs to capture the context. The proposed method is to binarize vectors of daily counts for each word-pair by applying a spike detection temporal filter, then use the Jaccard metric to measure the similarity of the binary vector for each word-pair with the binary vector describing event occurrence. The top n word-pairs are used as features to classify any day to be an event or non-event day. The selected features are tested using multiple classifiers such as Naive Bayes, SVM, Logistic Regression, KNN and decision trees. They all produced AUC ROC scores up to 0.91 and F1 scores up to 0.79. The experiment is performed using the English language in multiple cities such as Melbourne, Sydney and Brisbane as well as the Indonesian language in Jakarta. The two experiments, comprising different languages and locations, yielded similar results.
Tasks Transliteration
Published 2019-01-03
URL http://arxiv.org/abs/1901.00570v1
PDF http://arxiv.org/pdf/1901.00570v1.pdf
PWC https://paperswithcode.com/paper/event-detection-in-twitter-a-keyword-volume
Repo https://github.com/vsatyav007/repo-eventdetection
Framework none
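
The feature-selection step described in the abstract above (binarize each word-pair's daily counts with a spike filter, score it against the event vector with the Jaccard metric, keep the top n) can be sketched as follows. The abstract does not specify the spike filter, so the rule used here (a day is a spike when its count exceeds `factor` times the mean of the preceding `window` days) is an assumption, as are all names and the toy data.

```python
def binarize_spikes(counts, window=3, factor=2.0):
    """Flag days whose count exceeds `factor` x the mean of the previous days.

    Assumed spike rule for illustration; the paper's temporal filter
    may differ.
    """
    flags = []
    for day, count in enumerate(counts):
        history = counts[max(0, day - window):day]
        if not history:
            flags.append(0)  # no baseline available for the first day
            continue
        baseline = sum(history) / len(history)
        flags.append(1 if count > factor * baseline else 0)
    return flags


def jaccard(a, b):
    """Jaccard similarity of two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def top_word_pairs(daily_counts, event_days, n):
    """Rank word-pairs by how well their spikes match the event days."""
    scores = {
        pair: jaccard(binarize_spikes(counts), event_days)
        for pair, counts in daily_counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical toy data: ten days, events on days 3 and 7.
event_days = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
daily_counts = {
    ("protest", "march"): [2, 2, 2, 9, 2, 2, 2, 10, 2, 2],
    ("coffee", "morning"): [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
    ("rally", "city"): [1, 1, 1, 1, 1, 1, 1, 8, 1, 1],
}

print(top_word_pairs(daily_counts, event_days, n=2))
```

The selected word-pairs would then serve as binary features for a day-level classifier such as those named in the abstract.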

Global Textual Relation Embedding for Relational Understanding

Title Global Textual Relation Embedding for Relational Understanding
Authors Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan, Yu Su
Abstract Pre-trained embeddings such as word embeddings and sentence embeddings are fundamental tools facilitating a wide range of downstream NLP tasks. In this work, we investigate how to learn a general-purpose embedding of textual relations, defined as the shortest dependency path between entities. Textual relation embedding provides a level of knowledge between word/phrase level and sentence level, and we show that it can facilitate downstream tasks requiring relational understanding of the text. To learn such an embedding, we create the largest distant supervision dataset by linking the entire English ClueWeb09 corpus to Freebase. We use global co-occurrence statistics between textual and knowledge base relations as the supervision signal to train the embedding. Evaluation on two relational understanding tasks demonstrates the usefulness of the learned textual relation embedding. The data and code can be found at https://github.com/czyssrs/GloREPlus
Tasks Sentence Embeddings, Word Embeddings
Published 2019-06-03
URL https://arxiv.org/abs/1906.00550v1
PDF https://arxiv.org/pdf/1906.00550v1.pdf
PWC https://paperswithcode.com/paper/190600550
Repo https://github.com/czyssrs/GloREPlus
Framework tf