April 2, 2020

3550 words · 17 mins read

Paper Group ANR 260



Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning

Title Representations of molecules and materials for interpolation of quantum-mechanical simulations via machine learning
Authors Marcel F. Langer, Alex Goeßmann, Matthias Rupp
Abstract Computational study of molecules and materials from first principles is a cornerstone of physics, chemistry and materials science, but limited by the cost of accurate and precise simulations. In settings involving many simulations, machine learning can reduce these costs, sometimes by orders of magnitude, by interpolating between reference simulations. This requires representations that describe any molecule or material and support interpolation. We review, discuss and benchmark state-of-the-art representations and relations between them, including smooth overlap of atomic positions, many-body tensor representation, and symmetry functions. For this, we use a unified mathematical framework based on many-body functions, group averaging and tensor products, and compare energy predictions for organic molecules, binary alloys and Al-Ga-In sesquioxides in numerical experiments controlled for data distribution, regression method and hyper-parameter optimization.
Published 2020-03-26
URL https://arxiv.org/abs/2003.12081v1
PDF https://arxiv.org/pdf/2003.12081v1.pdf
PWC https://paperswithcode.com/paper/representations-of-molecules-and-materials

Multilevel Acyclic Hypergraph Partitioning

Title Multilevel Acyclic Hypergraph Partitioning
Authors Merten Popp, Sebastian Schlag, Christian Schulz, Daniel Seemaier
Abstract A directed acyclic hypergraph is a generalized concept of a directed acyclic graph, where each hyperedge can contain an arbitrary number of tails and heads. Directed hypergraphs can be used to model data flow and execution dependencies in streaming applications. Thus, hypergraph partitioning algorithms can be used to obtain efficient parallelizations for multiprocessor architectures. However, an acyclicity constraint on the partition is necessary when mapping streaming applications to embedded multiprocessors due to resource restrictions on this type of hardware. The acyclic hypergraph partitioning problem is to partition the hypernodes of a directed acyclic hypergraph into a given number of blocks of roughly equal size such that the corresponding quotient graph is acyclic while minimizing an objective function on the partition. Here, we contribute the first n-level algorithm for the acyclic hypergraph partitioning problem. Our focus is on acyclic hypergraphs where hyperedges can have one head and arbitrary many tails. Based on this, we engineer a memetic algorithm to further reduce communication cost, as well as to improve scheduling makespan on embedded multiprocessor architectures. Experiments indicate that our algorithm outperforms previous algorithms that focus on the directed acyclic graph case which have previously been employed in the application domain. Moreover, our experiments indicate that using the directed hypergraph model for this type of application yields a significantly smaller makespan.
Tasks hypergraph partitioning
Published 2020-02-06
URL https://arxiv.org/abs/2002.02962v1
PDF https://arxiv.org/pdf/2002.02962v1.pdf
PWC https://paperswithcode.com/paper/multilevel-acyclic-hypergraph-partitioning
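The acyclicity constraint that distinguishes this problem from ordinary hypergraph partitioning can be checked cheaply: contract each block of the partition to a single node and test the resulting quotient graph for cycles. The sketch below (not from the paper; `edges` and `block` are illustrative names, and hyperedges are flattened to tail/head pairs) does this with Kahn's algorithm:

```python
from collections import defaultdict, deque

def quotient_is_acyclic(edges, block):
    """Check whether the quotient graph induced by a partition is acyclic.

    edges: iterable of (tail, head) pairs of the directed graph.
    block: dict mapping each node to its block id.
    """
    # Build the quotient graph: one node per block, edges between blocks.
    succ = defaultdict(set)
    for u, v in edges:
        bu, bv = block[u], block[v]
        if bu != bv:
            succ[bu].add(bv)
    blocks = set(block.values())
    indeg = {b: 0 for b in blocks}
    for b in succ:
        for c in succ[b]:
            indeg[c] += 1
    # Kahn's algorithm: the quotient is acyclic iff every block can be
    # emitted in a topological order.
    queue = deque(b for b in blocks if indeg[b] == 0)
    seen = 0
    while queue:
        b = queue.popleft()
        seen += 1
        for c in succ[b]:
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    return seen == len(blocks)
```

A multilevel partitioner would run a check like this (or maintain it incrementally) after every move of a hypernode between blocks.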

Undersensitivity in Neural Reading Comprehension

Title Undersensitivity in Neural Reading Comprehension
Authors Johannes Welbl, Pasquale Minervini, Max Bartolo, Pontus Stenetorp, Sebastian Riedel
Abstract Current reading comprehension models generalise well to in-distribution test sets, yet perform poorly on adversarially selected inputs. Most prior work on adversarial inputs studies oversensitivity: semantically invariant text perturbations that cause a model’s prediction to change when it should not. In this work we focus on the complementary problem: excessive prediction undersensitivity, where input text is meaningfully changed but the model’s prediction does not, even though it should. We formulate a noisy adversarial attack which searches among semantic variations of the question for which a model erroneously predicts the same answer, and with even higher probability. Despite comprising unanswerable questions, both SQuAD2.0 and NewsQA models are vulnerable to this attack. This indicates that although accurate, models tend to rely on spurious patterns and do not fully consider the information specified in a question. We experiment with data augmentation and adversarial training as defences, and find that both substantially decrease vulnerability to attacks on held out data, as well as held out attack spaces. Addressing undersensitivity also improves results on AddSent and AddOneSent, and models furthermore generalise better when facing train/evaluation distribution mismatch: they are less prone to overly rely on predictive cues present only in the training set, and outperform a conventional model by as much as 10.9% F1.
Tasks Adversarial Attack, Data Augmentation, Reading Comprehension
Published 2020-02-15
URL https://arxiv.org/abs/2003.04808v1
PDF https://arxiv.org/pdf/2003.04808v1.pdf
PWC https://paperswithcode.com/paper/undersensitivity-in-neural-reading-1
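The attack described in the abstract reduces to a search loop: among semantic variants of a question, keep any for which the model returns the same answer with even higher confidence. The sketch below illustrates that loop against a hypothetical `model_prob(question) -> (answer, probability)` interface; it is a schematic of the idea, not the authors' code, which searches a structured space of entity substitutions.

```python
def undersensitivity_attack(model_prob, question, candidates):
    """Search perturbed questions on which the model keeps its original
    answer with higher confidence -- an undersensitivity failure, since a
    meaningful change to the question should change the prediction.

    model_prob(q) -> (answer, probability) is an assumed interface.
    """
    answer, p = model_prob(question)
    best, best_p = None, p
    for q in candidates:
        a, pa = model_prob(q)
        if a == answer and pa > best_p:
            best, best_p = q, pa
    return best, best_p
```

If `best` is not `None`, the model is provably undersensitive on this example: a semantically different question received the same answer with strictly higher probability.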

Over-the-Air Adversarial Attacks on Deep Learning Based Modulation Classifier over Wireless Channels

Title Over-the-Air Adversarial Attacks on Deep Learning Based Modulation Classifier over Wireless Channels
Authors Brian Kim, Yalin E. Sagduyu, Kemal Davaslioglu, Tugba Erpek, Sennur Ulukus
Abstract We consider a wireless communication system that consists of a transmitter, a receiver, and an adversary. The transmitter transmits signals with different modulation types, while the receiver classifies its received signals to modulation types using a deep learning-based classifier. In the meantime, the adversary makes over-the-air transmissions that are received as superimposed with the transmitter’s signals to fool the classifier at the receiver into making errors. While this evasion attack has received growing interest recently, the channel effects from the adversary to the receiver have been ignored so far, such that the previous attack mechanisms cannot be applied under realistic channel effects. In this paper, we show how to launch a realistic evasion attack by considering channels from the adversary to the receiver. Our results show that modulation classification is vulnerable to an adversarial attack over a wireless channel that is modeled as Rayleigh fading with path loss and shadowing. We present various adversarial attacks with respect to the availability of information about the channel, transmitter input, and classifier architecture. First, we present two types of adversarial attacks, namely a targeted attack (with minimum power) and a non-targeted attack, which aim to change the classification to a target label or to any label other than the true label, respectively. Both are white-box attacks that are transmitter input-specific and use channel information. Then we introduce an algorithm to generate adversarial attacks using limited channel information, where the adversary only knows the channel distribution. Finally, we present a black-box universal adversarial perturbation (UAP) attack where the adversary has limited knowledge about both channel and transmitter input.
Tasks Adversarial Attack
Published 2020-02-05
URL https://arxiv.org/abs/2002.02400v2
PDF https://arxiv.org/pdf/2002.02400v2.pdf
PWC https://paperswithcode.com/paper/over-the-air-adversarial-attacks-on-deep
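The channel model the paper considers, Rayleigh fading combined with path loss and log-normal shadowing, can be sampled in a few lines, and the receiver's observation is simply the superposition of the transmitter's signal and the adversary's perturbation, each passed through its own channel. The sketch below is a minimal illustration of that setup, not the paper's simulation code; the parameter names and values (`gamma`, `sigma_shadow_db`, the noise level) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rayleigh_channel(n, d, gamma=2.7, sigma_shadow_db=6.0):
    """Sample n complex channel taps: Rayleigh fading x path loss x
    log-normal shadowing. Parameter values are illustrative only."""
    fading = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    path_loss = d ** (-gamma / 2)                          # amplitude path loss
    shadowing = 10.0 ** (rng.normal(0.0, sigma_shadow_db, n) / 20.0)
    return fading * path_loss * shadowing

def received(x, delta, h_tx, h_adv, noise_std=0.05):
    """The receiver observes the transmitter signal x and the adversary's
    perturbation delta superimposed, each through its own channel."""
    n = len(x)
    noise = noise_std * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    return h_tx * x + h_adv * delta + noise
```

The key point the paper makes is visible here: the adversary controls `delta` but the classifier sees `h_adv * delta`, so an attack crafted without accounting for `h_adv` may arrive distorted beyond usefulness.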

Practical Fast Gradient Sign Attack against Mammographic Image Classifier

Title Practical Fast Gradient Sign Attack against Mammographic Image Classifier
Authors Ibrahim Yilmaz
Abstract Artificial intelligence (AI) has been a major research topic for many years, and with the emergence of deep neural networks (DNNs) these studies have been tremendously successful. Today machines are capable of making faster, more accurate decisions than humans. Thanks to the great development of machine learning (ML) techniques, ML has been applied in many different fields such as education, medicine, malware detection, and autonomous driving. Despite this degree of interest and much successful research, ML models are still vulnerable to adversarial attacks: attackers can manipulate clean data in order to fool ML classifiers into producing a desired target. For instance, a benign sample can be modified to appear malicious, or a malicious one altered to appear benign, without the modification being recognizable to a human observer. This can lead to financial losses, serious injuries, and even deaths. The motivation behind this paper is to emphasize this issue and raise awareness. We therefore demonstrate the security gap of a mammographic image classifier against adversarial attack. We train our model on mammographic images and evaluate its performance in terms of accuracy. We then poison the original dataset and generate adversarial samples that are misclassified by the model, and analyze the similarity between clean and adversarial images using the structural similarity index (SSIM). Finally, we show how successfully the model can be misled under different poisoning factors.
Tasks Adversarial Attack, Malware Detection
Published 2020-01-27
URL https://arxiv.org/abs/2001.09610v1
PDF https://arxiv.org/pdf/2001.09610v1.pdf
PWC https://paperswithcode.com/paper/practical-fast-gradient-sign-attack-against
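The fast gradient sign method at the heart of this paper is a one-step perturbation: move the input by `eps` times the sign of the gradient of the loss with respect to that input. The sketch below demonstrates it on a logistic-regression stand-in for the paper's mammography classifier (an assumption made so the input gradient can be written analytically, with no deep learning framework):

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """Fast Gradient Sign Method: a single step of size eps along the
    sign of the loss gradient with respect to the input."""
    return x + eps * np.sign(grad)

def logistic_loss_grad(w, b, x, y):
    """Analytic input-gradient of binary cross-entropy for a logistic
    'classifier' (a stand-in for the CNN in the paper).
    dL/dx = (p - y) * w, where p = sigmoid(w.x + b)."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    return (p - y) * w
```

Stepping along the gradient's sign increases the loss, so for a small `eps` the perturbed input is visually close to the original (which is what the paper's SSIM analysis quantifies) yet more likely to be misclassified.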

Adversarial Attack on Community Detection by Hiding Individuals

Title Adversarial Attack on Community Detection by Hiding Individuals
Authors Jia Li, Honglei Zhang, Zhichao Han, Yu Rong, Hong Cheng, Junzhou Huang
Abstract It has been demonstrated that adversarial graphs, i.e., graphs with imperceptible perturbations added, can cause deep graph models to fail on node/graph classification tasks. In this paper, we extend adversarial graphs to the problem of community detection which is much more difficult. We focus on black-box attack and aim to hide targeted individuals from the detection of deep graph community detection models, which has many applications in real-world scenarios, for example, protecting personal privacy in social networks and understanding camouflage patterns in transaction networks. We propose an iterative learning framework that takes turns to update two modules: one working as the constrained graph generator and the other as the surrogate community detection model. We also find that the adversarial graphs generated by our method can be transferred to other learning based community detection models.
Tasks Adversarial Attack, Community Detection, Graph Classification
Published 2020-01-22
URL https://arxiv.org/abs/2001.07933v1
PDF https://arxiv.org/pdf/2001.07933v1.pdf
PWC https://paperswithcode.com/paper/adversarial-attack-on-community-detection-by

Vocabulary-based Method for Quantifying Controversy in Social Media

Title Vocabulary-based Method for Quantifying Controversy in Social Media
Authors Juan Manuel Ortiz de Zarate, Esteban Feuerstein
Abstract Identifying controversial topics is not only interesting from a social point of view; it also enables the application of methods to avoid information segregation, creating better discussion contexts and, in the best cases, reaching agreements. In this paper we develop a systematic method for controversy detection based primarily on the jargon used by communities in social media. Our method dispenses with domain-specific knowledge, is language-agnostic, efficient, and easy to apply. We perform an extensive set of experiments across many languages, regions and contexts, covering both controversial and non-controversial topics. We find that our vocabulary-based measure performs better than state-of-the-art measures that are based only on the community graph structure. Moreover, we show that it is possible to detect polarization through text analysis.
Published 2020-01-14
URL https://arxiv.org/abs/2001.09899v1
PDF https://arxiv.org/pdf/2001.09899v1.pdf
PWC https://paperswithcode.com/paper/vocabulary-based-method-for-quantifying

Super Resolution for Root Imaging

Title Super Resolution for Root Imaging
Authors Jose F. Ruiz-Munoz, Alina Zare, Jyothier K. Nimmagadda, Shuang Cui, James E. Baciak
Abstract High-resolution cameras have become very helpful for plant phenotyping by providing a mechanism for tasks such as target versus background discrimination, and the measurement and analysis of fine above-ground plant attributes, e.g., the venation network of leaves. However, the acquisition of high-resolution (HR) imagery of roots in situ remains a challenge. We apply super-resolution (SR) convolutional neural networks (CNNs) to boost the resolution capability of a backscatter X-ray system designed to image buried roots. To overcome the limited backscatter X-ray data available for training, we compare three training alternatives: i) non-plant-root images, ii) plant-root images, and iii) pretraining the model with non-plant-root images and fine-tuning with plant-root images; and two deep learning approaches: i) the Fast Super Resolution Convolutional Neural Network and ii) the Super Resolution Generative Adversarial Network. We evaluate SR performance using the signal-to-noise ratio (SNR) and the intersection over union (IoU) metric when segmenting the SR images. In our experiments, we observe that the studied SR models improve the quality of low-resolution (LR) images of plant roots from an unseen dataset in terms of SNR. Likewise, we demonstrate that SR pre-processing boosts the performance of a machine learning system trained to separate plant roots from their background. In addition, we show examples of backscatter X-ray images upscaled by the SR model. The current technology for non-intrusive root imaging acquires noisy, low-resolution images; in this study, we show that this issue can be tackled by incorporating a deep-learning-based SR model into the image formation process.
Tasks Super-Resolution
Published 2020-03-30
URL https://arxiv.org/abs/2003.13537v1
PDF https://arxiv.org/pdf/2003.13537v1.pdf
PWC https://paperswithcode.com/paper/super-resolution-for-root-imaging
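Of the two evaluation metrics the paper uses, intersection over union between a predicted and a reference segmentation mask is the simpler one, and worth pinning down since conventions vary on empty masks. A minimal sketch (the empty-union convention of returning 1.0 is our assumption, not necessarily the paper's):

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two boolean segmentation masks:
    |A ∩ B| / |A ∪ B|. Returns 1.0 when both masks are empty."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0
```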

When to Use Convolutional Neural Networks for Inverse Problems

Title When to Use Convolutional Neural Networks for Inverse Problems
Authors Nathaniel Chodosh, Simon Lucey
Abstract Reconstruction tasks in computer vision aim fundamentally to recover an undetermined signal from a set of noisy measurements. Examples include super-resolution, image denoising, and non-rigid structure from motion, all of which have seen recent advancements through deep learning. However, earlier work made extensive use of sparse signal reconstruction frameworks (e.g., convolutional sparse coding). While this work was ultimately surpassed by deep learning, it rested on a much more developed theoretical framework. Recent work by Papyan et al. provides a bridge between the two approaches by showing how a convolutional neural network (CNN) can be viewed as an approximate solution to a convolutional sparse coding (CSC) problem. In this work we argue that for some types of inverse problems the CNN approximation breaks down, leading to poor performance. We argue that for these types of problems the CSC approach should be used instead, and validate this argument with empirical evidence. Specifically, we identify JPEG artifact reduction and non-rigid trajectory reconstruction as challenging inverse problems for CNNs and demonstrate state-of-the-art performance on them using a CSC method. Furthermore, we offer some practical improvements to this model and its application, and also show how insights from the CSC model can be used to make CNNs effective in tasks where their naive application fails.
Tasks Denoising, Image Denoising, Super-Resolution
Published 2020-03-30
URL https://arxiv.org/abs/2003.13820v1
PDF https://arxiv.org/pdf/2003.13820v1.pdf
PWC https://paperswithcode.com/paper/when-to-use-convolutional-neural-networks-for

Real-time 3D object proposal generation and classification under limited processing resources

Title Real-time 3D object proposal generation and classification under limited processing resources
Authors Xuesong Li, Jose Guivant, Subhan Khan
Abstract The task of detecting 3D objects is important to various robotic applications. Existing deep learning-based detection techniques have achieved impressive performance. However, these techniques require a graphics processing unit (GPU) to run in a real-time environment. To achieve real-time 3D object detection with limited computational resources for robots, we propose an efficient detection method consisting of 3D proposal generation and classification. The proposal generation is mainly based on point segmentation, while the proposal classification is performed by a lightweight convolutional neural network (CNN) model. To validate our method, the KITTI datasets are utilized. The experimental results demonstrate the capability of the proposed method for real-time 3D object detection from point clouds, with competitive object recall and classification performance.
Tasks 3D Object Detection, Object Detection, Object Proposal Generation
Published 2020-03-24
URL https://arxiv.org/abs/2003.10670v1
PDF https://arxiv.org/pdf/2003.10670v1.pdf
PWC https://paperswithcode.com/paper/real-time-3d-object-proposal-generation-and

Assessing Robustness to Noise: Low-Cost Head CT Triage

Title Assessing Robustness to Noise: Low-Cost Head CT Triage
Authors Sarah M. Hooper, Jared A. Dunnmon, Matthew P. Lungren, Sanjiv Sam Gambhir, Christopher Ré, Adam S. Wang, Bhavik N. Patel
Abstract Automated medical image classification with convolutional neural networks (CNNs) has great potential to impact healthcare, particularly in resource-constrained healthcare systems where fewer trained radiologists are available. However, little is known about how well a trained CNN can perform on images with the increased noise levels, different acquisition protocols, or additional artifacts that may arise when using low-cost scanners, which can be underrepresented in datasets collected from well-funded hospitals. In this work, we investigate how a model trained to triage head computed tomography (CT) scans performs on images acquired with reduced x-ray tube current, fewer projections per gantry rotation, and limited angle scans. These changes can reduce the cost of the scanner and demands on electrical power but come at the expense of increased image noise and artifacts. We first develop a model to triage head CTs and report an area under the receiver operating characteristic curve (AUROC) of 0.77. We then show that the trained model is robust to reduced tube current and fewer projections, with the AUROC dropping only 0.65% for images acquired with a 16x reduction in tube current and 0.22% for images acquired with 8x fewer projections. Finally, for significantly degraded images acquired by a limited angle scan, we show that a model trained specifically to classify such images can overcome the technological limitations to reconstruction and maintain an AUROC within 0.09% of the original model.
Tasks Computed Tomography (CT), Image Classification
Published 2020-03-17
URL https://arxiv.org/abs/2003.07977v2
PDF https://arxiv.org/pdf/2003.07977v2.pdf
PWC https://paperswithcode.com/paper/assessing-robustness-to-noise-low-cost-head
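The AUROC figures quoted above have a convenient probabilistic reading: the probability that a randomly chosen positive scan receives a higher triage score than a randomly chosen negative one. That identity gives a tiny reference implementation (a generic sketch, not the authors' evaluation code; it is O(P·N) and meant for clarity, not scale):

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) identity: the probability
    that a random positive outscores a random negative, ties counting half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))
```

Under this reading, the reported 0.65% drop at 16x reduced tube current means the probability of correctly ranking a positive above a negative falls by well under one percentage point.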

Universal Equivariant Multilayer Perceptrons

Title Universal Equivariant Multilayer Perceptrons
Authors Siamak Ravanbakhsh
Abstract Group invariant and equivariant Multilayer Perceptrons (MLP), also known as Equivariant Networks, have achieved remarkable success in learning on a variety of data structures, such as sequences, images, sets, and graphs. Using tools from group theory, this paper proves the universality of a broad class of equivariant MLPs with a single hidden layer. In particular, it is shown that having a hidden layer on which the group acts regularly is sufficient for universal equivariance. Next, Burnside’s table of marks is used to decompose product spaces. It is shown that the product of two G-sets always contains an orbit larger than the input orbits. Therefore high-order hidden layers inevitably contain a regular orbit, leading to the universality of the corresponding MLP. It is shown that with an order larger than the logarithm of the size of the stabilizer group, a high-order equivariant MLP is a universal equivariant approximator.
Published 2020-02-07
URL https://arxiv.org/abs/2002.02912v1
PDF https://arxiv.org/pdf/2002.02912v1.pdf
PWC https://paperswithcode.com/paper/universal-equivariant-multilayer-perceptrons
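For readers who want a concrete anchor for "equivariant MLP layer": the classic special case for the symmetric group (the DeepSets result, much simpler than the general construction in this paper) says that the only linear maps on a set of feature rows commuting with all row permutations are combinations of the identity and mean-pooling. A sketch, with `lam` and `gam` as the two learnable scalars per channel:

```python
import numpy as np

def perm_equivariant_layer(X, lam, gam):
    """A permutation-equivariant linear layer on a set of feature rows:
    lam * X + gam * (row-mean of X broadcast back). Permuting the rows of
    the input permutes the rows of the output identically."""
    return lam * X + gam * X.mean(axis=0, keepdims=True)
```

Equivariance here is easy to verify directly: applying the layer after a row permutation gives the same result as permuting the layer's output.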

Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels

Title Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels
Authors Donghyun Kim, Kuniaki Saito, Tae-Hyun Oh, Bryan A. Plummer, Stan Sclaroff, Kate Saenko
Abstract Existing unsupervised domain adaptation methods aim to transfer knowledge from a label-rich source domain to an unlabeled target domain. However, obtaining labels for some source domains may be very expensive, making complete labeling as used in prior work impractical. In this work, we investigate a new domain adaptation scenario with sparsely labeled source data, where only a few examples in the source domain have been labeled, while the target domain is unlabeled. We show that when labeled source examples are limited, existing methods often fail to learn discriminative features applicable for both source and target domains. We propose a novel Cross-Domain Self-supervised (CDS) learning approach for domain adaptation, which learns features that are not only domain-invariant but also class-discriminative. Our self-supervised learning method captures apparent visual similarity with in-domain self-supervision in a domain adaptive manner and performs cross-domain feature matching with across-domain self-supervision. In extensive experiments with three standard benchmark datasets, our method significantly boosts performance of target accuracy in the new target domain with few source labels and is even helpful on classical domain adaptation scenarios.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2020-03-18
URL https://arxiv.org/abs/2003.08264v1
PDF https://arxiv.org/pdf/2003.08264v1.pdf
PWC https://paperswithcode.com/paper/cross-domain-self-supervised-learning-for

Systematic Evaluation of Privacy Risks of Machine Learning Models

Title Systematic Evaluation of Privacy Risks of Machine Learning Models
Authors Liwei Song, Prateek Mittal
Abstract Machine learning models are prone to memorizing sensitive data, making them vulnerable to membership inference attacks in which an adversary aims to guess if an input sample was used to train the model. In this paper, we show that prior work on membership inference attacks may severely underestimate the privacy risks by relying solely on training custom neural network classifiers to perform attacks and focusing only on the aggregate results over data samples, such as the attack accuracy. To overcome these limitations, we first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks and proposing a new inference attack method based on a modification of prediction entropy. We also propose benchmarks for defense mechanisms by accounting for adaptive adversaries with knowledge of the defense and also accounting for the trade-off between model accuracy and privacy risks. Using our benchmark attacks, we demonstrate that existing defense approaches are not as effective as previously reported. Next, we introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score. Our privacy risk score metric measures an individual sample’s likelihood of being a training member, which allows an adversary to perform membership inference attacks with high confidence. We experimentally validate the effectiveness of the privacy risk score metric and demonstrate that the distribution of the privacy risk score across individual samples is heterogeneous. Finally, we perform an in-depth investigation for understanding why certain samples have high privacy risk scores, including correlations with model sensitivity, generalization error, and feature embeddings. Our work emphasizes the importance of a systematic and rigorous evaluation of privacy risks of machine learning models.
Tasks Inference Attack
Published 2020-03-24
URL https://arxiv.org/abs/2003.10595v1
PDF https://arxiv.org/pdf/2003.10595v1.pdf
PWC https://paperswithcode.com/paper/systematic-evaluation-of-privacy-risks-of
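The "modification of prediction entropy" mentioned in the abstract fixes a blind spot of plain entropy: a confidently *wrong* prediction has low entropy just like a confidently right one. The sketch below follows our reading of the paper's modified metric, in which the true-class probability enters with weight (1 - p_y) and the other classes with log(1 - p_i), making the score monotonically decreasing in p_y; treat the exact form, and the single global threshold, as assumptions.

```python
import numpy as np

def modified_entropy(probs, y, eps=1e-12):
    """Modified prediction entropy: low values indicate a confident,
    correct prediction, which is the signature of a training member."""
    p = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    others = np.delete(p, y)
    return -(1.0 - p[y]) * np.log(p[y]) - np.sum(others * np.log(1.0 - others))

def infer_member(probs, y, threshold):
    """Membership guess: 'member' when modified entropy falls below a
    threshold (per-class thresholds in the paper; global here)."""
    return modified_entropy(probs, y) < threshold
```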

Multimodal Transformer with Pointer Network for the DSTC8 AVSD Challenge

Title Multimodal Transformer with Pointer Network for the DSTC8 AVSD Challenge
Authors Hung Le, Nancy F. Chen
Abstract Audio-Visual Scene-Aware Dialog (AVSD) is an extension from Video Question Answering (QA) whereby the dialogue agent is required to generate natural language responses to address user queries and carry on conversations. This is a challenging task as it consists of video features of multiple modalities, including text, visual, and audio features. The agent also needs to learn semantic dependencies among user utterances and system responses to make coherent conversations with humans. In this work, we describe our submission to the AVSD track of the 8th Dialogue System Technology Challenge. We adopt dot-product attention to combine text and non-text features of input video. We further enhance the generation capability of the dialogue agent by adopting pointer networks to point to tokens from multiple source sequences in each generation step. Our systems achieve high performance in automatic metrics and obtain 5th and 6th place in human evaluation among all submissions.
Tasks Question Answering, Video Question Answering
Published 2020-02-25
URL https://arxiv.org/abs/2002.10695v1
PDF https://arxiv.org/pdf/2002.10695v1.pdf
PWC https://paperswithcode.com/paper/multimodal-transformer-with-pointer-network
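The dot-product attention the authors use to fuse text and non-text video features is the standard scaled form: queries from one modality score the keys of another, and the softmax-weighted values are returned. A minimal single-head sketch (the multi-head, multi-modal wiring of the submission is not reproduced here):

```python
import numpy as np

def dot_product_attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v.
    Each query row attends over all key/value rows -- e.g. text-feature
    queries attending over visual or audio features."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

A quick sanity check on the mechanics: when all keys are identical, every query attends uniformly, so the output is just the mean of the value rows.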