February 1, 2020

Paper Group AWR 186

CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding. Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes. Neural Reverse Engineering of Stripped Binaries. Face Manifold: Manifold Learning for Synthetic Face Generation. Electro-Magnetic Side-Channel Attack Through Lea …

CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding

Title CM-Net: A Novel Collaborative Memory Network for Spoken Language Understanding
Authors Yijin Liu, Fandong Meng, Jinchao Zhang, Jie Zhou, Yufeng Chen, Jinan Xu
Abstract Spoken Language Understanding (SLU) mainly involves two tasks, intent detection and slot filling, which are generally modeled jointly in existing works. However, most existing models fail to fully utilize co-occurrence relations between slots and intents, which restricts their potential performance. To address this issue, in this paper we propose a novel Collaborative Memory Network (CM-Net) based on the well-designed block, named CM-block. The CM-block firstly captures slot-specific and intent-specific features from memories in a collaborative manner, and then uses these enriched features to enhance local context representations, based on which the sequential information flow leads to more specific (slot and intent) global utterance representations. Through stacking multiple CM-blocks, our CM-Net is able to alternately perform information exchange among specific memories, local contexts and the global utterance, and thus incrementally enriches each other. We evaluate the CM-Net on two standard benchmarks (ATIS and SNIPS) and a self-collected corpus (CAIS). Experimental results show that the CM-Net achieves the state-of-the-art results on the ATIS and SNIPS in most of criteria, and significantly outperforms the baseline models on the CAIS. Additionally, we make the CAIS dataset publicly available for the research community.
Tasks Intent Detection, Slot Filling, Spoken Language Understanding
Published 2019-09-16
URL https://arxiv.org/abs/1909.06937v1
PDF https://arxiv.org/pdf/1909.06937v1.pdf
PWC https://paperswithcode.com/paper/cm-net-a-novel-collaborative-memory-network
Repo https://github.com/Adaxry/CM-Net
Framework none

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes

Title Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes
Authors Greg Yang
Abstract Wide neural networks with random weights and biases are Gaussian processes, as originally observed by Neal (1995) and more recently by Lee et al. (2018) and Matthews et al. (2018) for deep fully-connected networks, as well as by Novak et al. (2019) and Garriga-Alonso et al. (2019) for deep convolutional networks. We show that this Neural Network-Gaussian Process correspondence surprisingly extends to all modern feedforward or recurrent neural networks composed of multilayer perceptron, RNNs (e.g. LSTMs, GRUs), (nD or graph) convolution, pooling, skip connection, attention, batch normalization, and/or layer normalization. More generally, we introduce a language for expressing neural network computations, and our result encompasses all such expressible neural networks. This work serves as a tutorial on the tensor programs technique formulated in Yang (2019) and elucidates the Gaussian Process results obtained there. We provide open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network at github.com/thegregyang/GP4A.
Tasks Gaussian Processes
Published 2019-10-28
URL https://arxiv.org/abs/1910.12478v2
PDF https://arxiv.org/pdf/1910.12478v2.pdf
PWC https://paperswithcode.com/paper/tensor-programs-i-wide-feedforward-or
Repo https://github.com/thegregyang/GP4A
Framework none
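
As a rough illustration of the correspondence the abstract starts from, the sketch below computes the Gaussian Process kernel of an infinitely wide fully-connected ReLU network, the case attributed above to Lee et al. (2018). It is not the tensor programs technique and not code from the GP4A repository. The function name, the `sigma_w2`/`sigma_b2` parameters, and the 1/d input normalization are assumptions made for this example, and inputs are assumed to be nonzero vectors.

```python
import math

def relu_nngp_kernel(x, y, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Kernel value k(x, y) for a wide ReLU MLP with `depth` hidden layers.

    Hypothetical helper: sigma_w2 and sigma_b2 are the assumed weight and
    bias variances; inputs must be nonzero vectors of equal length.
    """
    d = len(x)
    # Kernel of the first pre-activation layer (scaled inner products).
    kxx = sigma_b2 + sigma_w2 * sum(a * a for a in x) / d
    kyy = sigma_b2 + sigma_w2 * sum(b * b for b in y) / d
    kxy = sigma_b2 + sigma_w2 * sum(a * b for a, b in zip(x, y)) / d
    for _ in range(depth):
        norm = math.sqrt(kxx * kyy)
        cos_theta = max(-1.0, min(1.0, kxy / norm))  # clamp rounding error
        theta = math.acos(cos_theta)
        # Closed form of E[relu(u) * relu(v)] for jointly Gaussian (u, v).
        exy = norm * (math.sin(theta) + (math.pi - theta) * cos_theta) / (2 * math.pi)
        kxy = sigma_b2 + sigma_w2 * exy
        # E[relu(u)^2] = Var(u) / 2 for a zero-mean Gaussian u.
        kxx = sigma_b2 + sigma_w2 * kxx / 2
        kyy = sigma_b2 + sigma_w2 * kyy / 2
    return kxy
```

With the assumed defaults (`sigma_w2=2.0`, `sigma_b2=0.0`) the diagonal of the kernel is preserved from layer to layer, which makes the recursion easy to check by hand.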

Neural Reverse Engineering of Stripped Binaries

Title Neural Reverse Engineering of Stripped Binaries
Authors Yaniv David, Uri Alon, Eran Yahav
Abstract We address the problem of reverse engineering of stripped executables which contain no debug information. This is a challenging problem because of the low amount of syntactic information available in stripped executables, and due to the diverse assembly code patterns arising from compiler optimizations. We present a novel approach for predicting procedure names in stripped executables. Our approach combines static analysis with encoder-decoder-based models. The main idea is to use static analysis to obtain enriched representations of API call sites; encode a set of sequences of these call sites by traversing the Control-Flow Graph; and finally, attend to the encoded sequences while decoding the target name. Our evaluation shows that our model performs predictions that are difficult and time consuming for humans, while improving on the state-of-the-art by 20%.
Tasks
Published 2019-02-25
URL https://arxiv.org/abs/1902.09122v2
PDF https://arxiv.org/pdf/1902.09122v2.pdf
PWC https://paperswithcode.com/paper/neural-reverse-engineering-of-stripped
Repo https://github.com/tech-srl/code2vec
Framework tf

Face Manifold: Manifold Learning for Synthetic Face Generation

Title Face Manifold: Manifold Learning for Synthetic Face Generation
Authors Kimia Dinashi, Ramin Toosi, Mohammad Ali Akhaee
Abstract Face is one of the most important things for communication with the world around us. It also forms our identity and expressions. Estimating the face structure is a fundamental task in computer vision with applications in different areas such as face recognition and medical surgeries. Recently, deep learning techniques achieved significant results for 3D face reconstruction from flat images. The main challenge of such techniques is a vital need for large 3D face datasets. Usually, this challenge is handled by synthetic face generation. However, synthetic datasets suffer from the existence of non-possible faces. Here, we propose a face manifold learning method for synthetic diverse face dataset generation. First, the face structure is divided into the shape and expression groups. Then, a fully convolutional autoencoder network is exploited to deal with the non-possible faces, and, simultaneously, preserving the dataset diversity. Simulation results show that the proposed method is capable of denoising highly corrupted faces. The diversity of the generated dataset is evaluated qualitatively and quantitatively and compared to the existing methods. Experiments show that our manifold learning method outperforms the state of the art methods significantly.
Tasks 3D Face Reconstruction, Denoising, Face Generation, Face Recognition, Face Reconstruction
Published 2019-10-03
URL https://arxiv.org/abs/1910.01403v2
PDF https://arxiv.org/pdf/1910.01403v2.pdf
PWC https://paperswithcode.com/paper/face-manifold-manifold-learning-for-synthetic
Repo https://github.com/SCL-UT/face-manifold
Framework pytorch

Electro-Magnetic Side-Channel Attack Through Learned Denoising and Classification

Title Electro-Magnetic Side-Channel Attack Through Learned Denoising and Classification
Authors Florian Lemarchand, Cyril Marlin, Florent Montreuil, Erwan Nogues, Maxime Pelcat
Abstract This paper proposes an upgraded electro-magnetic side-channel attack that automatically reconstructs the intercepted data. A novel system is introduced, running in parallel with leakage signal interception and catching compromising data in real-time. Based on deep learning and character recognition the proposed system retrieves more than 57% of characters present in intercepted signals regardless of signal type: analog or digital. The approach is also extended to a protection system that triggers an alarm if the system is compromised, demonstrating a success rate over 95%. Based on software-defined radio and graphics processing unit architectures, this solution can be easily deployed onto existing information systems where information shall be kept secret.
Tasks Denoising
Published 2019-10-16
URL https://arxiv.org/abs/1910.07201v1
PDF https://arxiv.org/pdf/1910.07201v1.pdf
PWC https://paperswithcode.com/paper/electro-magnetic-side-channel-attack-through
Repo https://github.com/opendenoising/interception_dataset
Framework none

ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing

Title ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
Authors Mark Neumann, Daniel King, Iz Beltagy, Waleed Ammar
Abstract Despite recent advances in natural language processing, many statistical models for processing text perform extremely poorly under domain shift. Processing biomedical and clinical text is a critically important application area of natural language processing, for which there are few robust, practical, publicly available models. This paper describes scispaCy, a new tool for practical biomedical/scientific text processing, which heavily leverages the spaCy library. We detail the performance of two packages of models released in scispaCy and demonstrate their robustness on several tasks and datasets. Models and code are available at https://allenai.github.io/scispacy/
Tasks
Published 2019-02-20
URL https://arxiv.org/abs/1902.07669v3
PDF https://arxiv.org/pdf/1902.07669v3.pdf
PWC https://paperswithcode.com/paper/scispacy-fast-and-robust-models-for
Repo https://github.com/allenai/scispacy
Framework none

Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning

Title Rogue-Gym: A New Challenge for Generalization in Reinforcement Learning
Authors Yuji Kanagawa, Tomoyuki Kaneko
Abstract In this paper, we propose Rogue-Gym, a simple and classic style roguelike game built for evaluating generalization in reinforcement learning (RL). Combined with the recent progress of deep neural networks, RL has successfully trained human-level agents without human knowledge in many games such as those for Atari 2600. However, it has been pointed out that agents trained with RL methods often overfit the training environment, and they work poorly in slightly different environments. To investigate this problem, some research environments with procedural content generation have been proposed. Following these studies, we propose the use of roguelikes as a benchmark for evaluating the generalization ability of RL agents. In our Rogue-Gym, agents need to explore dungeons that are structured differently each time they start a new game. Thanks to the very diverse structures of the dungeons, we believe that the generalization benchmark of Rogue-Gym is sufficiently fair. In our experiments, we evaluate a standard reinforcement learning method, PPO, with and without enhancements for generalization. The results show that some enhancements believed to be effective fail to mitigate the overfitting in Rogue-Gym, although others slightly improve the generalization ability.
Tasks
Published 2019-04-17
URL https://arxiv.org/abs/1904.08129v2
PDF https://arxiv.org/pdf/1904.08129v2.pdf
PWC https://paperswithcode.com/paper/rogue-gym-a-new-challenge-for-generalization
Repo https://github.com/kngwyu/rogue-gym-agents-cog19
Framework pytorch

Neutron Transmission Strain Tomography for Non-Constant Stress-Free Lattice Spacing

Title Neutron Transmission Strain Tomography for Non-Constant Stress-Free Lattice Spacing
Authors J. N. Hendriks, C. Jidling, T. B. Schön, A. Wills, C. M. Wensrich, E. H. Kisi
Abstract Recently, several algorithms for strain tomography from energy-resolved neutron transmission measurements have been proposed. These methods assume that the stress-free lattice spacing $d_0$ is a known constant limiting their application to the study of stresses generated by manufacturing and loading methods that do not alter this parameter. In this paper, we consider the more general problem of jointly reconstructing the strain and $d_0$ fields. A method for solving this inherently non-linear problem is presented that ensures the estimated strain field satisfies equilibrium and can include knowledge of boundary conditions. This method is tested on a simulated data set with realistic noise levels, demonstrating that it is possible to jointly reconstruct $d_0$ and the strain field.
Tasks
Published 2019-05-15
URL https://arxiv.org/abs/1905.06854v2
PDF https://arxiv.org/pdf/1905.06854v2.pdf
PWC https://paperswithcode.com/paper/neutron-transmission-strain-tomography-for
Repo https://github.com/jnh277/Joint_strain_d0_tomography
Framework none

Compressive Closeness in Networks

Title Compressive Closeness in Networks
Authors Hamidreza Mahyar, Rouzbeh Hasheminezhad, H Eugene Stanley
Abstract Distributed algorithms for network science applications are of great importance due to today’s large real-world networks. In such algorithms, a node is allowed only to have local interactions with its immediate neighbors. This is because the whole network topological structure is often unknown to each node. Recently, distributed detection of central nodes, concerning different notions of importance, within a network has received much attention. Closeness centrality is a prominent measure to evaluate the importance (influence) of nodes, based on their accessibility, in a given network. In this paper, first, we introduce a local (ego-centric) metric that correlates well with the global closeness centrality; however, it has very low computational complexity. Second, we propose a compressive sensing (CS)-based framework to accurately recover high closeness centrality nodes in the network utilizing the proposed local metric. Both ego-centric metric computation and its aggregation via CS are efficient and distributed, using only local interactions between neighboring nodes. Finally, we evaluate the performance of the proposed method through extensive experiments on various synthetic and real-world networks. The results show that the proposed local metric correlates with the global closeness centrality, better than the current local metrics. Moreover, the results demonstrate that the proposed CS-based method outperforms the state-of-the-art methods with notable improvement.
Tasks Compressive Sensing
Published 2019-06-19
URL https://arxiv.org/abs/1906.08335v1
PDF https://arxiv.org/pdf/1906.08335v1.pdf
PWC https://paperswithcode.com/paper/compressive-closeness-in-networks
Repo https://github.com/hamidreza-mahyar/CS-HiClose
Framework none
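
For reference, the global measure that the paper's local metric is compared against can be sketched as below: closeness centrality of one node in an unweighted graph, computed with a breadth-first search. This is the standard global definition, not the paper's ego-centric metric or its compressive sensing framework. The function name and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque

def closeness_centrality(adj, source):
    """Closeness of `source` in an unweighted graph.

    `adj` maps each node to an iterable of its neighbours (assumed format).
    Only nodes reachable from `source` are counted.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1  # first visit gives the shortest hop count
                queue.append(v)
    total = sum(dist.values())
    # (number of reachable nodes) / (sum of shortest-path distances)
    return (len(dist) - 1) / total if total > 0 else 0.0
```

Computing this for every node needs one full traversal per node, which is the kind of global cost a local, distributed metric is meant to avoid.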

Screening Sinkhorn Algorithm for Regularized Optimal Transport

Title Screening Sinkhorn Algorithm for Regularized Optimal Transport
Authors Mokhtar Z. Alaya, Maxime Bérar, Gilles Gasso, Alain Rakotomamonjy
Abstract We introduce in this paper a novel strategy for efficiently approximating the Sinkhorn distance between two discrete measures. After identifying neglectable components of the dual solution of the regularized Sinkhorn problem, we propose to screen those components by directly setting them at that value before entering the Sinkhorn problem. This allows us to solve a smaller Sinkhorn problem while ensuring approximation with provable guarantees. More formally, the approach is based on a new formulation of dual of Sinkhorn divergence problem and on the KKT optimality conditions of this problem, which enable identification of dual components to be screened. This new analysis leads to the Screenkhorn algorithm. We illustrate the efficiency of Screenkhorn on complex tasks such as dimensionality reduction and domain adaptation involving regularized optimal transport.
Tasks Dimensionality Reduction, Domain Adaptation
Published 2019-06-20
URL https://arxiv.org/abs/1906.08540v3
PDF https://arxiv.org/pdf/1906.08540v3.pdf
PWC https://paperswithcode.com/paper/screening-sinkhorn-algorithm-for-regularized
Repo https://github.com/mzalaya/screenkhorn
Framework none
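
As background for the abstract above, here is a minimal sketch of the plain Sinkhorn iteration for entropy-regularized optimal transport between two discrete measures. It is the baseline problem that Screenkhorn shrinks, not the screening procedure itself and not code from the linked repository. The function name, argument names, and fixed iteration count are assumptions made for this example.

```python
import math

def sinkhorn(a, b, cost, reg, n_iters=200):
    """Approximate transport plan between histograms `a` and `b`.

    `cost[i][j]` is the ground cost, `reg` the entropic regularization
    strength (assumed parameter names). Returns the plan as a list of rows.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Plan P = diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
```

The scaling vectors `u` and `v` are the exponentiated dual variables; the abstract's screening idea fixes some of their components in advance so that a smaller problem remains.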

Combination of multiple Deep Learning architectures for Offensive Language Detection in Tweets

Title Combination of multiple Deep Learning architectures for Offensive Language Detection in Tweets
Authors Nicolò Frisiani, Alexis Laignelet, Batuhan Güler
Abstract This report contains the details regarding our submission to the OffensEval 2019 (SemEval 2019 - Task 6). The competition was based on the Offensive Language Identification Dataset. We first discuss the details of the classifier implemented and the type of input data used and pre-processing performed. We then move onto critically evaluating our performance. We have achieved a macro-average F1-score of 0.76, 0.68, 0.54, respectively for Task a, Task b, and Task c, which we believe reflects on the level of sophistication of the models implemented. Finally, we will be discussing the difficulties encountered and possible improvements for the future.
Tasks Language Identification
Published 2019-03-16
URL http://arxiv.org/abs/1903.08734v2
PDF http://arxiv.org/pdf/1903.08734v2.pdf
PWC https://paperswithcode.com/paper/combination-of-multiple-deep-learning
Repo https://github.com/alaignelet/nlp-sem-eval-2019
Framework none
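
The macro-average F1-score quoted in the abstract is the unweighted mean of the per-class F1 scores. The sketch below shows that calculation on made-up labels; the function name and the toy data are assumptions for illustration and do not reproduce the reported results.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy example with two illustrative classes.
y_true = ["OFF", "OFF", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "NOT", "NOT"]
score = macro_f1(y_true, y_pred)
```

Because every class counts equally, a rare class with poor recall pulls the macro average down more than it would affect plain accuracy.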

Attention is not not Explanation

Title Attention is not not Explanation
Authors Sarah Wiegreffe, Yuval Pinter
Abstract Attention mechanisms play a central role in NLP systems, especially within recurrent neural network (RNN) models. Recently, there has been increasing interest in whether or not the intermediate representations offered by these modules may be used to explain the reasoning for a model’s prediction, and consequently reach insights regarding the model’s decision-making process. A recent paper claims that ‘Attention is not Explanation’ (Jain and Wallace, 2019). We challenge many of the assumptions underlying this work, arguing that such a claim depends on one’s definition of explanation, and that testing it needs to take into account all elements of the model, using a rigorous experimental design. We propose four alternative tests to determine when/whether attention can be used as explanation: a simple uniform-weights baseline; a variance calibration based on multiple random seed runs; a diagnostic framework using frozen weights from pretrained models; and an end-to-end adversarial attention training protocol. Each allows for meaningful interpretation of attention mechanisms in RNN models. We show that even when reliable adversarial distributions can be found, they don’t perform well on the simple diagnostic, indicating that prior work does not disprove the usefulness of attention mechanisms for explainability.
Tasks Calibration, Decision Making
Published 2019-08-13
URL https://arxiv.org/abs/1908.04626v2
PDF https://arxiv.org/pdf/1908.04626v2.pdf
PWC https://paperswithcode.com/paper/attention-is-not-not-explanation
Repo https://github.com/sarahwie/attention
Framework pytorch
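
The first of the four tests, the uniform-weights baseline, can be pictured as swapping a learned attention distribution for equal weights when pooling hidden states. The sketch below shows only that pooling step under assumed names (`softmax`, `pool`, `uniform_weights`); it is not taken from the linked repository and leaves out the model and training loop.

```python
import math

def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def uniform_weights(n):
    """Baseline distribution: every time step gets weight 1/n."""
    return [1.0 / n] * n

def pool(hidden_states, weights):
    """Weighted sum of per-token hidden vectors (the context vector)."""
    dim = len(hidden_states[0])
    return [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)]

# Hypothetical hidden states for a two-token input.
states = [[1.0, 0.0], [3.0, 2.0]]
attended = pool(states, softmax([0.1, 2.0]))
baseline = pool(states, uniform_weights(len(states)))
```

If a model fed the `baseline` context performs about as well as one fed the `attended` context, attention is contributing little on that task, so its weights say little about the prediction.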

Out-of-Sample Testing for GANs

Title Out-of-Sample Testing for GANs
Authors Pablo Sánchez-Martín, Pablo M. Olmos, Fernando Pérez-Cruz
Abstract We propose a new method to evaluate GANs, namely EvalGAN. EvalGAN relies on a test set to directly measure the reconstruction quality in the original sample space (no auxiliary networks are necessary), and it also computes the (log)likelihood for the reconstructed samples in the test set. Further, EvalGAN is agnostic to the GAN algorithm and the dataset. We decided to test it on three state-of-the-art GANs over the well-known CIFAR-10 and CelebA datasets.
Tasks
Published 2019-01-28
URL http://arxiv.org/abs/1901.09557v1
PDF http://arxiv.org/pdf/1901.09557v1.pdf
PWC https://paperswithcode.com/paper/out-of-sample-testing-for-gans
Repo https://github.com/psanch21/EvalGAN
Framework tf

A Unified MRC Framework for Named Entity Recognition

Title A Unified MRC Framework for Named Entity Recognition
Authors Xiaoya Li, Jingrong Feng, Yuxian Meng, Qinghong Han, Fei Wu, Jiwei Li
Abstract The task of named entity recognition (NER) is normally divided into nested NER and flat NER depending on whether named entities are nested or not. Models are usually separately developed for the two tasks, since sequence labeling models, the most widely used backbone for flat NER, are only able to assign a single label to a particular token, which is unsuitable for nested NER where a token may be assigned several labels. In this paper, we propose a unified framework that is capable of handling both flat and nested NER tasks. Instead of treating the task of NER as a sequence labeling problem, we propose to formulate it as a machine reading comprehension (MRC) task. For example, extracting entities with the PER label is formalized as extracting answer spans to the question “which person is mentioned in the text?”. This formulation naturally tackles the entity overlapping issue in nested NER: the extraction of two overlapping entities for different categories requires answering two independent questions. Additionally, since the query encodes informative prior knowledge, this strategy facilitates the process of entity extraction, leading to better performances for not only nested NER, but flat NER. We conduct experiments on both nested and flat NER datasets. Experimental results demonstrate the effectiveness of the proposed formulation. We are able to achieve vast amount of performance boost over current SOTA models on nested NER datasets, i.e., +1.28, +2.55, +5.44, +6.37, respectively on ACE04, ACE05, GENIA and KBP17, along with SOTA results on flat NER datasets, i.e., +0.24, +1.95, +0.21, +1.49 respectively on English CoNLL 2003, English OntoNotes 5.0, Chinese MSRA, Chinese OntoNotes 4.0.
Tasks Chinese Named Entity Recognition, Entity Extraction, Machine Reading Comprehension, Named Entity Recognition, Nested Mention Recognition, Nested Named Entity Recognition, Reading Comprehension
Published 2019-10-25
URL https://arxiv.org/abs/1910.11476v2
PDF https://arxiv.org/pdf/1910.11476v2.pdf
PWC https://paperswithcode.com/paper/a-unified-mrc-framework-for-named-entity
Repo https://github.com/ShannonAI/mrc-for-flat-nested-ner
Framework pytorch
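
The reformulation in the abstract can be pictured as turning one annotated sentence into one question-answering example per entity type, so that overlapping entities end up as answers to different questions. The sketch below shows only this data preparation step. The PER question is quoted from the abstract; the other questions, the function name, and the dictionary layout are hypothetical and are not taken from the linked repository.

```python
# One natural-language query per entity type. Only the PER query comes from
# the abstract; the ORG and GPE queries are made up for illustration.
QUERIES = {
    "PER": "which person is mentioned in the text?",
    "ORG": "which organization is mentioned in the text?",
    "GPE": "which geopolitical entity is mentioned in the text?",
}

def build_mrc_examples(tokens, entities):
    """Create one (query, context, answer spans) example per entity type.

    `entities` is a list of (label, start, end) token spans with inclusive
    ends (assumed format); spans of different labels may overlap.
    """
    examples = []
    for label, query in QUERIES.items():
        answers = [(start, end) for ent_label, start, end in entities if ent_label == label]
        examples.append({"label": label, "query": query, "context": tokens, "answers": answers})
    return examples

# Nested case: a GPE mention inside an ORG mention.
tokens = ["Bank", "of", "China"]
entities = [("ORG", 0, 2), ("GPE", 2, 2)]
examples = build_mrc_examples(tokens, entities)
```

A sequence labeler would have to give the token "China" a single tag, whereas here it is simply part of the answer to two independent questions.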

Hacking Neural Networks: A Short Introduction

Title Hacking Neural Networks: A Short Introduction
Authors Michael Kissner
Abstract A large chunk of research on the security issues of neural networks is focused on adversarial attacks. However, there exists a vast sea of simpler attacks one can perform both against and with neural networks. In this article, we give a quick introduction on how deep learning in security works and explore the basic methods of exploitation, but also look at the offensive capabilities deep learning enabled tools provide. All presented attacks, such as backdooring, GPU-based buffer overflows or automated bug hunting, are accompanied by short open-source exercises for anyone to try out.
Tasks Neural Network Security
Published 2019-11-18
URL https://arxiv.org/abs/1911.07658v2
PDF https://arxiv.org/pdf/1911.07658v2.pdf
PWC https://paperswithcode.com/paper/hacking-neural-networks-a-short-introduction
Repo https://github.com/Kayzaks/HackingNeuralNetworks
Framework tf