February 1, 2020

Paper Group AWR 286

Mean Spectral Normalization of Deep Neural Networks for Embedded Automation

Title Mean Spectral Normalization of Deep Neural Networks for Embedded Automation
Authors Anand Krishnamoorthy Subramanian, Nak Young Chong
Abstract Deep Neural Networks (DNNs) have begun to thrive in the field of automation systems, owing to the recent advancements in standardising various aspects such as architecture, optimization techniques, and regularization. In this paper, we take a step towards a better understanding of Spectral Normalization (SN) and its potential for standardizing regularization of a wider range of Deep Learning models, following an empirical approach. We conduct several experiments to study the training dynamics of SN in comparison with the ubiquitous Batch Normalization (BN), and show that SN increases the gradient sparsity and controls the gradient variance. Furthermore, we show that SN suffers from a phenomenon we call the mean-drift effect, which degrades its performance. We then propose a weight reparameterization called Mean Spectral Normalization (MSN) to resolve the mean drift, thereby significantly improving the network’s performance. Our model performs ~16% faster than BN in practice, and has fewer trainable parameters. We also show the performance of our MSN for small, medium, and large CNNs (a 3-layer CNN, VGG7 and DenseNet-BC, respectively) and for unsupervised image generation tasks using Generative Adversarial Networks (GANs), to evaluate its applicability for a broad range of embedded automation tasks.
Tasks Image Generation
Published 2019-07-09
URL https://arxiv.org/abs/1907.04003v1
PDF https://arxiv.org/pdf/1907.04003v1.pdf
PWC https://paperswithcode.com/paper/mean-spectral-normalization-of-deep-neural
Repo https://github.com/AntixK/mean-spectral-norm
Framework pytorch
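
To make the two ingredients named in the abstract concrete, here is a minimal pure-Python sketch, not the authors' implementation. It estimates the largest singular value of a weight matrix by power iteration (the usual way spectral normalization is computed), then centers the layer outputs over a batch. Reading "resolving the mean drift" as batch-mean subtraction plus a bias is an assumption made for illustration, and the names `spectral_norm` and `mean_spectral_norm_layer` are hypothetical.

```python
import math


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def spectral_norm(W, iters=50):
    """Estimate the largest singular value of a nonzero W by power iteration."""
    Wt = [list(col) for col in zip(*W)]
    v = [1.0] * len(W[0])
    for _ in range(iters):
        v = matvec(Wt, matvec(W, v))  # one step of power iteration on W^T W
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    u = matvec(W, v)
    return math.sqrt(sum(x * x for x in u))


def mean_spectral_norm_layer(W, batch, bias=0.0):
    """Apply W / sigma(W) to each input, then subtract the per-unit batch mean.

    Assumption: the mean correction is a plain batch-mean subtraction
    followed by a scalar bias; the paper's exact form may differ.
    """
    sigma = spectral_norm(W)
    W_sn = [[w / sigma for w in row] for row in W]
    outs = [matvec(W_sn, x) for x in batch]
    means = [sum(col) / len(outs) for col in zip(*outs)]
    return [[o - m + bias for o, m in zip(out, means)] for out in outs]
```

For example, `spectral_norm([[3.0, 0.0], [0.0, 1.0]])` converges to the largest singular value of that diagonal matrix, and the outputs of `mean_spectral_norm_layer` have zero mean per unit when `bias` is 0.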

Mask Embedding in conditional GAN for Guided Synthesis of High Resolution Images

Title Mask Embedding in conditional GAN for Guided Synthesis of High Resolution Images
Authors Yinhao Ren, Zhe Zhu, Yingzhou Li, Joseph Lo
Abstract Recent advancements in conditional Generative Adversarial Networks (cGANs) have shown promise in label guided image synthesis. Semantic masks, such as sketches and label maps, are another intuitive and effective form of guidance in image synthesis. Directly incorporating the semantic masks as constraints dramatically reduces the variability and quality of the synthesized results. We observe this is caused by the incompatibility of features from different inputs (such as mask image and latent vector) of the generator. To use semantic masks as guidance whilst providing realistic synthesized results with fine details, we propose a mask embedding mechanism that allows for a more efficient initial feature projection in the generator. We validate the effectiveness of our approach by training a mask guided face generator using the CELEBA-HQ dataset. We can generate realistic, high resolution facial images up to a resolution of 512×512 with mask guidance. Our code is publicly available.
Tasks Image Generation
Published 2019-07-03
URL https://arxiv.org/abs/1907.01710v1
PDF https://arxiv.org/pdf/1907.01710v1.pdf
PWC https://paperswithcode.com/paper/mask-embedding-in-conditional-gan-for-guided
Repo https://github.com/johnryh/Face_Embedding_GAN
Framework tf

Fully automatic computer-aided mass detection and segmentation via pseudo-color mammograms and Mask R-CNN

Title Fully automatic computer-aided mass detection and segmentation via pseudo-color mammograms and Mask R-CNN
Authors Hang Min, Devin Wilson, Yinhuang Huang, Siyu Liu, Stuart Crozier, Andrew P Bradley, Shekhar S. Chandra
Abstract Mammographic mass detection and segmentation are usually performed as serial and separate tasks, with segmentation often only performed on manually confirmed true positive detections in previous studies. We propose a fully-integrated computer-aided detection (CAD) system for simultaneous mammographic mass detection and segmentation without user intervention. The proposed CAD consists only of a pseudo-color image generation stage and a mass detection-segmentation stage based on Mask R-CNN. Grayscale mammograms are transformed into pseudo-color images based on multi-scale morphological sifting, where mass-like patterns are enhanced to improve the performance of Mask R-CNN. Transfer learning with the Mask R-CNN is then adopted to simultaneously detect and segment masses on the pseudo-color images. Evaluated on the public dataset INbreast, the method outperforms the state-of-the-art methods by achieving an average true positive rate of 0.90 at 0.9 false positives per image and an average Dice similarity index of 0.88 for mass segmentation.
Tasks Image Generation, Transfer Learning
Published 2019-06-28
URL https://arxiv.org/abs/1906.12118v2
PDF https://arxiv.org/pdf/1906.12118v2.pdf
PWC https://paperswithcode.com/paper/fully-automatic-computer-aided-mass-detection
Repo https://github.com/Holliemin9090/Mammographic-mass-CAD-via-pseudo-color-mammogram-and-Mask-R-CNN
Framework none
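
The Dice similarity index quoted in the abstract is the standard overlap score 2|A∩B| / (|A| + |B|) between a predicted mask A and a reference mask B. A small sketch follows, assuming masks are represented as sets of pixel coordinates; the function name `dice` and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the paper.

```python
def dice(pred, truth):
    """Dice similarity between two masks given as collections of pixel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Two 3-pixel masks sharing 2 pixels: 2 * 2 / (3 + 3).
predicted = {(0, 0), (0, 1), (1, 1)}
reference = {(0, 1), (1, 1), (2, 2)}
score = dice(predicted, reference)
```

A score of 1.0 means the masks coincide exactly, and 0.0 means they share no pixels.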

Interpretable and Steerable Sequence Learning via Prototypes

Title Interpretable and Steerable Sequence Learning via Prototypes
Authors Yao Ming, Panpan Xu, Huamin Qu, Liu Ren
Abstract One of the major challenges in machine learning nowadays is to provide predictions with not only high accuracy but also user-friendly explanations. Although in recent years we have witnessed increasingly popular use of deep neural networks for sequence modeling, it is still challenging to explain the rationales behind the model outputs, which is essential for building trust and supporting the domain experts to validate, critique and refine the model. We propose ProSeNet, an interpretable and steerable deep sequence model with natural explanations derived from case-based reasoning. The prediction is obtained by comparing the inputs to a few prototypes, which are exemplar cases in the problem domain. For better interpretability, we define several criteria for constructing the prototypes, including simplicity, diversity, and sparsity, and propose the learning objective and the optimization procedure. ProSeNet also provides a user-friendly approach to model steering: domain experts without any knowledge of the underlying model or parameters can easily incorporate their intuition and experience by manually refining the prototypes. We conduct experiments on a wide range of real-world applications, including predictive diagnostics for automobiles, ECG and protein sequence classification, and sentiment analysis on texts. The results show that ProSeNet can achieve accuracy on par with state-of-the-art deep learning models. We also evaluate the interpretability of the results with concrete case studies. Finally, through a user study on Amazon Mechanical Turk (MTurk), we demonstrate that the model selects high-quality prototypes which align well with human knowledge and can be interactively refined for better interpretability without loss of performance.
Tasks Sentiment Analysis
Published 2019-07-23
URL https://arxiv.org/abs/1907.09728v1
PDF https://arxiv.org/pdf/1907.09728v1.pdf
PWC https://paperswithcode.com/paper/interpretable-and-steerable-sequence-learning
Repo https://github.com/myaooo/ProSeNet
Framework none

Patent Citation Dynamics Modeling via Multi-Attention Recurrent Networks

Title Patent Citation Dynamics Modeling via Multi-Attention Recurrent Networks
Authors Taoran Ji, Zhiqian Chen, Nathan Self, Kaiqun Fu, Chang-Tien Lu, Naren Ramakrishnan
Abstract Modeling and forecasting forward citations to a patent is a central task for the discovery of emerging technologies and for measuring the pulse of inventive progress. Conventional methods for forecasting these forward citations cast the problem as analysis of temporal point processes which rely on the conditional intensity of previously received citations. Recent approaches model the conditional intensity as a chain of recurrent neural networks to capture memory dependency in hopes of reducing the restrictions of the parametric form of the intensity function. For the problem of patent citations, we observe that forecasting a patent’s chain of citations benefits from not only the patent’s history itself but also from the historical citations of assignees and inventors associated with that patent. In this paper, we propose a sequence-to-sequence model which employs an attention-of-attention mechanism to capture the dependencies of these multiple time sequences. Furthermore, the proposed model is able to forecast both the timestamp and the category of a patent’s next citation. Extensive experiments on a large patent citation dataset collected from USPTO demonstrate that the proposed model outperforms state-of-the-art models at forward citation forecasting.
Tasks Point Processes
Published 2019-05-22
URL https://arxiv.org/abs/1905.10022v1
PDF https://arxiv.org/pdf/1905.10022v1.pdf
PWC https://paperswithcode.com/paper/patent-citation-dynamics-modeling-via-multi
Repo https://github.com/TaoranJ/PC-RNN
Framework pytorch

Neural Shuffle-Exchange Networks – Sequence Processing in O(n log n) Time

Title Neural Shuffle-Exchange Networks – Sequence Processing in O(n log n) Time
Authors Kārlis Freivalds, Emīls Ozoliņš, Agris Šostaks
Abstract A key requirement in sequence to sequence processing is the modeling of long range dependencies. To this end, the vast majority of state-of-the-art models use an attention mechanism, whose O($n^2$) complexity leads to slow execution for long sequences. We introduce a new Shuffle-Exchange neural network model for sequence to sequence tasks which has O(log n) depth and O(n log n) total complexity. We show that this model is powerful enough to infer efficient algorithms for common algorithmic benchmarks including sorting, addition and multiplication. We evaluate our architecture on the challenging LAMBADA question answering dataset and compare it with the state-of-the-art models which use attention. Our model achieves competitive accuracy and scales to sequences with more than a hundred thousand elements. We are confident that the proposed model has the potential for building more efficient architectures for processing large interrelated data in language modeling, music generation and other application domains.
Tasks Language Modelling, Music Generation, Question Answering
Published 2019-07-18
URL https://arxiv.org/abs/1907.07897v3
PDF https://arxiv.org/pdf/1907.07897v3.pdf
PWC https://paperswithcode.com/paper/neural-shuffle-exchange-networks-sequence
Repo https://github.com/LUMII-Syslab/shuffle-exchange
Framework tf
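
The O(log n) depth and O(n log n) cost can be illustrated with the classical shuffle-exchange wiring on which the model's name is based. The sketch below is an assumption-laden toy, not the paper's architecture: the exchange unit is a hand-written `broadcast_max` stand-in (hypothetical) where the paper uses a learned unit, and the layer count is taken as exactly log2(n). It shows that after log2(n) shuffle-plus-exchange layers every position can depend on every input, at a cost of n/2 pairwise operations per layer.

```python
def perfect_shuffle(seq):
    """Interleave the first and second halves of seq (a perfect shuffle)."""
    half = len(seq) // 2
    out = []
    for a, b in zip(seq[:half], seq[half:]):
        out.extend((a, b))
    return out


def exchange(seq, unit):
    """Apply a two-input, two-output unit to each adjacent pair."""
    out = []
    for i in range(0, len(seq), 2):
        out.extend(unit(seq[i], seq[i + 1]))
    return out


def shuffle_exchange(seq, unit):
    """Run log2(n) shuffle+exchange layers; n must be a power of two >= 2."""
    n = len(seq)
    depth = n.bit_length() - 1
    assert n >= 2 and 2 ** depth == n, "length must be a power of two"
    pair_ops = 0
    for _ in range(depth):
        seq = exchange(perfect_shuffle(seq), unit)
        pair_ops += n // 2  # each layer touches n/2 pairs
    return seq, depth, pair_ops


def broadcast_max(a, b):
    """Illustrative stand-in for a learned unit: both outputs get the larger input."""
    m = max(a, b)
    return m, m


out, depth, pair_ops = shuffle_exchange([3, 1, 4, 1, 5, 9, 2, 6], broadcast_max)
```

With 8 inputs the run uses 3 layers and 12 pairwise operations, and every output position holds the global maximum, which is only possible if information from each input has reached each output.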

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

Title C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection
Authors Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye
Abstract Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors. Many WSOD approaches adopt multiple instance learning (MIL) and have non-convex loss functions which are prone to getting stuck in local minima (falsely localizing object parts) while missing the full object extent during training. In this paper, we introduce a continuation optimization method into MIL, thereby creating continuation multiple instance learning (C-MIL), with the intention of alleviating the non-convexity problem in a systematic way. We partition instances into spatially related and class related subsets, and approximate the original loss function with a series of smoothed loss functions defined within the subsets. Optimizing smoothed loss functions prevents the training procedure from falling prematurely into local minima and facilitates the discovery of Stable Semantic Extremal Regions (SSERs) which indicate full object extent. On the PASCAL VOC 2007 and 2012 datasets, C-MIL improves the state-of-the-art of weakly supervised object detection and weakly supervised object localization by large margins.
Tasks Multiple Instance Learning, Object Detection, Object Localization, Weakly Supervised Object Detection, Weakly-Supervised Object Localization
Published 2019-04-11
URL http://arxiv.org/abs/1904.05647v1
PDF http://arxiv.org/pdf/1904.05647v1.pdf
PWC https://paperswithcode.com/paper/c-mil-continuation-multiple-instance-learning
Repo https://github.com/Winfrand/C-MIL
Framework pytorch

Exploring Generative Physics Models with Scientific Priors in Inertial Confinement Fusion

Title Exploring Generative Physics Models with Scientific Priors in Inertial Confinement Fusion
Authors Rushil Anirudh, Jayaraman J. Thiagarajan, Shusen Liu, Peer-Timo Bremer, Brian K. Spears
Abstract There is significant interest in using modern neural networks for scientific applications due to their effectiveness in modeling highly complex, non-linear problems in a data-driven fashion. However, a common challenge is to verify the scientific plausibility or validity of outputs predicted by a neural network. This work advocates the use of known scientific constraints as a lens into evaluating, exploring, and understanding such predictions for the problem of inertial confinement fusion.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01666v1
PDF https://arxiv.org/pdf/1910.01666v1.pdf
PWC https://paperswithcode.com/paper/exploring-generative-physics-models-with
Repo https://github.com/rushilanirudh/macc
Framework tf

Craquelure as a Graph: Application of Image Processing and Graph Neural Networks to the Description of Fracture Patterns

Title Craquelure as a Graph: Application of Image Processing and Graph Neural Networks to the Description of Fracture Patterns
Authors Oleksii Sidorov, Jon Yngve Hardeberg
Abstract Cracks on a painting are not a defect but an inimitable signature of an artwork, which can be used for origin examination, aging monitoring, damage identification, and even forgery detection. This work presents the development of a new methodology and corresponding toolbox for the extraction and characterization of information from an image of a craquelure pattern. The proposed approach processes the craquelure network as a graph. The graph representation captures the network structure via the mutual organization of junctions and fractures. Furthermore, it is invariant to any geometrical distortions. At the same time, our tool extracts the properties of each node and edge individually, which allows the pattern to be characterized statistically. We illustrate the benefits of the graph representation and of the statistical features individually, using a novel Graph Neural Network and hand-crafted descriptors, respectively. However, we also show that the best performance is achieved when both techniques are merged into one framework. We perform experiments on the dataset for paintings’ origin classification and demonstrate that our approach outperforms existing techniques by a large margin.
Tasks
Published 2019-05-13
URL https://arxiv.org/abs/1905.05010v2
PDF https://arxiv.org/pdf/1905.05010v2.pdf
PWC https://paperswithcode.com/paper/the-cracks-that-wanted-to-be-a-graph
Repo https://github.com/acecreamu/craquelure-graphs
Framework pytorch

Learning Combinatorial Embedding Networks for Deep Graph Matching

Title Learning Combinatorial Embedding Networks for Deep Graph Matching
Authors Runzhong Wang, Junchi Yan, Xiaokang Yang
Abstract Graph matching refers to finding node correspondence between graphs, such that the corresponding node and edge affinity can be maximized. In addition to its NP-complete nature, another important challenge is effective modeling of the node-wise and structure-wise affinity across graphs and the resulting objective, to guide the matching procedure in effectively finding the true matching in the presence of noise. To this end, this paper devises an end-to-end differentiable deep network pipeline to learn the affinity for graph matching. It involves a supervised permutation loss on node correspondence to capture the combinatorial nature of graph matching. Meanwhile, deep graph embedding models are adopted to parameterize both intra-graph and cross-graph affinity functions, instead of the traditional shallow and simple parametric forms, e.g. a Gaussian kernel. The embedding can also effectively capture the higher-order structure beyond second-order edges. The permutation loss model is agnostic to the number of nodes, and the embedding model is shared among nodes, such that the network allows for varying numbers of nodes in graphs for training and inference. Moreover, our network is class-agnostic with some generalization capability across different categories. All these features are welcome for real-world applications. Experiments show its superiority against state-of-the-art graph matching learning methods.
Tasks Graph Embedding, Graph Matching
Published 2019-04-01
URL https://arxiv.org/abs/1904.00597v3
PDF https://arxiv.org/pdf/1904.00597v3.pdf
PWC https://paperswithcode.com/paper/learning-combinatorial-embedding-networks-for
Repo https://github.com/Thinklab-SJTU/PCA-GM
Framework pytorch

Learning Task-specific Representation for Novel Words in Sequence Labeling

Title Learning Task-specific Representation for Novel Words in Sequence Labeling
Authors Minlong Peng, Qi Zhang, Xiaoyu Xing, Tao Gui, Jinlan Fu, Xuanjing Huang
Abstract Word representation is a key component in neural-network-based sequence labeling systems. However, representations of unseen or rare words trained on the end task are usually too poor to yield appreciable performance. This is commonly referred to as the out-of-vocabulary (OOV) problem. In this work, we address the OOV problem in sequence labeling using only training data of the task. To this end, we propose a novel method to predict representations for OOV words from their surface-forms (e.g., character sequence) and contexts. The method is specifically designed to avoid the error propagation problem suffered by existing approaches in the same paradigm. To evaluate its effectiveness, we performed extensive empirical studies on four part-of-speech tagging (POS) tasks and four named entity recognition (NER) tasks. Experimental results show that the proposed method can achieve better or competitive performance on the OOV problem compared with existing state-of-the-art methods.
Tasks Named Entity Recognition, Part-Of-Speech Tagging
Published 2019-05-29
URL https://arxiv.org/abs/1905.12277v1
PDF https://arxiv.org/pdf/1905.12277v1.pdf
PWC https://paperswithcode.com/paper/learning-task-specific-representation-for
Repo https://github.com/v-mipeng/TaskOOV
Framework none

ERNIE: Enhanced Representation through Knowledge Integration

Title ERNIE: Enhanced Representation through Knowledge Integration
Authors Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu
Abstract We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration). Inspired by the masking strategy of BERT, ERNIE is designed to learn language representation enhanced by knowledge masking strategies, which includes entity-level masking and phrase-level masking. Entity-level strategy masks entities which are usually composed of multiple words. Phrase-level strategy masks the whole phrase which is composed of several words standing together as a conceptual unit. Experimental results show that ERNIE outperforms other baseline methods, achieving new state-of-the-art results on five Chinese natural language processing tasks including natural language inference, semantic similarity, named entity recognition, sentiment analysis and question answering. We also demonstrate that ERNIE has more powerful knowledge inference capacity on a cloze test.
Tasks Named Entity Recognition, Natural Language Inference, Question Answering, Semantic Similarity, Semantic Textual Similarity, Sentiment Analysis
Published 2019-04-19
URL http://arxiv.org/abs/1904.09223v1
PDF http://arxiv.org/pdf/1904.09223v1.pdf
PWC https://paperswithcode.com/paper/ernie-enhanced-representation-through
Repo https://github.com/lonePatient/ERNIE-text-classification-pytorch
Framework pytorch
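
The difference between masking individual tokens and masking whole entities or phrases can be sketched as below. This is an illustrative toy and not ERNIE's preprocessing code: the function `mask_spans`, its parameters, the `[MASK]` string, and the example sentence with hand-marked spans are all assumptions. The point it shows is that a sampled span is hidden as one unit, so the model cannot recover part of an entity from its remaining words.

```python
import random


def mask_spans(tokens, spans, mask_rate, rng, mask_token="[MASK]"):
    """Mask whole spans (entities or phrases) rather than single tokens.

    spans: list of (start, end) index pairs, end exclusive.
    Each span is masked in full with probability mask_rate.
    """
    out = list(tokens)
    for start, end in spans:
        if rng.random() < mask_rate:
            for i in range(start, end):
                out[i] = mask_token
    return out


tokens = ["Harry", "Potter", "was", "written", "by", "J.", "K.", "Rowling"]
entity_spans = [(0, 2), (5, 8)]  # hand-marked for illustration
masked = mask_spans(tokens, entity_spans, mask_rate=1.0, rng=random.Random(0))
```

With `mask_rate=1.0` both entities are replaced in full while the words between them are left untouched; lower rates sample which spans to hide.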

Explaining Convolutional Neural Networks using Softmax Gradient Layer-wise Relevance Propagation

Title Explaining Convolutional Neural Networks using Softmax Gradient Layer-wise Relevance Propagation
Authors Brian Kenji Iwana, Ryohei Kuroki, Seiichi Uchida
Abstract Convolutional Neural Networks (CNNs) have become state-of-the-art in the field of image classification. However, not everything is understood about their inner representations. This paper tackles the interpretability and explainability of the predictions of CNNs for multi-class classification problems. Specifically, we propose a novel visualization method of pixel-wise input attribution called Softmax-Gradient Layer-wise Relevance Propagation (SGLRP). The proposed model is a class-discriminative extension to Deep Taylor Decomposition (DTD) using the gradient of softmax to back propagate the relevance of the output probability to the input image. Through qualitative and quantitative analysis, we demonstrate that SGLRP can successfully localize and attribute the regions on input images which contribute to a target object’s classification. We show that the proposed method excels at discriminating the target object’s class from the other possible objects in the images. We confirm that SGLRP performs better than existing Layer-wise Relevance Propagation (LRP) based methods and can help in the understanding of the decision process of CNNs.
Tasks Image Classification
Published 2019-08-06
URL https://arxiv.org/abs/1908.04351v3
PDF https://arxiv.org/pdf/1908.04351v3.pdf
PWC https://paperswithcode.com/paper/explaining-convolutional-neural-networks
Repo https://github.com/uchidalab/softmaxgradient-lrp
Framework tf
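
The "gradient of softmax" that the abstract refers to has a simple closed form: for output probabilities y and a target class t, the derivative of y_t with respect to logit n is y_t(1 − y_t) when n = t and −y_t·y_n otherwise. The sketch below computes it in plain Python. That the method uses exactly this vector as its starting relevance is an assumption here, since the abstract does not spell out the rule; the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_gradient(logits, target):
    """Derivative of the target-class probability with respect to each logit."""
    y = softmax(logits)
    return [
        y[target] * ((1.0 if n == target else 0.0) - y[n])
        for n in range(len(y))
    ]


grad = softmax_gradient([2.0, 1.0, 0.1], target=0)
```

The target entry is positive and every other entry is negative, which is what makes the signal class-discriminative: evidence for competing classes counts against the target. The entries also sum to zero.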

Spectral Regularization for Combating Mode Collapse in GANs

Title Spectral Regularization for Combating Mode Collapse in GANs
Authors Kanglin Liu, Wenming Tang, Fei Zhou, Guoping Qiu
Abstract Despite excellent progress in recent years, mode collapse remains a major unsolved problem in generative adversarial networks (GANs). In this paper, we present spectral regularization for GANs (SR-GANs), a new and robust method for combating the mode collapse problem in GANs. Theoretical analysis shows that the optimal solution to the discriminator has a strong relationship to the spectral distributions of the weight matrix. Therefore, we monitor the spectral distribution in the discriminator of spectral normalized GANs (SN-GANs), and discover a phenomenon which we refer to as spectral collapse, where a large number of singular values of the weight matrices drop dramatically when mode collapse occurs. We show that there is strong evidence linking mode collapse to spectral collapse; and based on this link, we set out to tackle spectral collapse as a surrogate of mode collapse. We have developed a spectral regularization method where we compensate the spectral distributions of the weight matrices to prevent them from collapsing, which in turn successfully prevents mode collapse in GANs. We provide theoretical explanations for why SR-GANs are more stable and can provide better performance than SN-GANs. We also present extensive experimental results and analysis to show that SR-GANs not only always outperform SN-GANs but also always succeed in combating mode collapse where SN-GANs fail. The code is available at https://github.com/max-liu-112/SRGANs-Spectral-Regularization-GANs-.
Tasks
Published 2019-08-29
URL https://arxiv.org/abs/1908.10999v3
PDF https://arxiv.org/pdf/1908.10999v3.pdf
PWC https://paperswithcode.com/paper/spectral-regularization-for-combating-mode
Repo https://github.com/max-liu-112/SRGANs
Framework none

Mo’ States Mo’ Problems: Emergency Stop Mechanisms from Observation

Title Mo’ States Mo’ Problems: Emergency Stop Mechanisms from Observation
Authors Samuel Ainsworth, Matt Barnes, Siddhartha Srinivasa
Abstract In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
Tasks
Published 2019-12-03
URL https://arxiv.org/abs/1912.01649v1
PDF https://arxiv.org/pdf/1912.01649v1.pdf
PWC https://paperswithcode.com/paper/mo-states-mo-problems-emergency-stop-1
Repo https://github.com/samuela/e-stops
Framework jax
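
The e-stop idea can be sketched as an environment wrapper that ends an episode as soon as the agent leaves a permitted set of states, so that exploration is confined to the part of the state space that matters. Everything in the sketch is hypothetical: `ChainEnv` is a toy environment invented for illustration, the `reset`/`step` interface is a simplified convention rather than any library's API, and supplying the permitted set directly (rather than deriving it from observed trajectories, as the title suggests) is an assumption.

```python
class ChainEnv:
    """Hypothetical toy task: walk along integers 0..size-1, goal at size-1."""

    def __init__(self, size=10, start=5):
        self.size, self.start = size, start
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        """action is -1 or +1; returns (state, reward, done)."""
        self.state = min(max(self.state + action, 0), self.size - 1)
        done = self.state == self.size - 1
        return self.state, (1.0 if done else 0.0), done


class EStopWrapper:
    """End the episode whenever the wrapped environment leaves allowed_states."""

    def __init__(self, env, allowed_states, stop_reward=0.0):
        self.env = env
        self.allowed = set(allowed_states)
        self.stop_reward = stop_reward

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done = self.env.step(action)
        if state not in self.allowed:
            # Emergency stop: terminate instead of letting the agent wander.
            return state, self.stop_reward, True
        return state, reward, done


env = EStopWrapper(ChainEnv(size=10, start=5), allowed_states=range(5, 10))
env.reset()
stopped = env.step(-1)  # stepping left leaves the allowed set
```

In this toy the agent never spends time in states 0 to 4, since the first step into that region ends the episode, while the path to the goal on the right is unaffected.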