January 31, 2020


Paper Group AWR 411


Adversarial Examples Are Not Bugs, They Are Features

Title Adversarial Examples Are Not Bugs, They Are Features
Authors Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry
Abstract Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.
Tasks
Published 2019-05-06
URL https://arxiv.org/abs/1905.02175v4
PDF https://arxiv.org/pdf/1905.02175v4.pdf
PWC https://paperswithcode.com/paper/adversarial-examples-are-not-bugs-they-are
Repo https://github.com/MadryLab/constructed-datasets
Framework pytorch
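The linked repo distributes the constructed datasets themselves rather than the construction code. To make the idea concrete, here is a minimal sketch of the targeted L2 PGD step from which such relabeled datasets can be built; the function name and hyperparameters are ours, not the repo's:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y_target, eps=0.5, alpha=0.1, steps=20):
    """Targeted L2-bounded PGD on image batches of shape (B, C, H, W):
    perturb x toward y_target, then relabel the result as y_target to
    form a 'non-robust' training pair (a sketch, not the authors' code)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # normalized gradient step, descending toward the target class
            g = grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
            x_adv = x_adv - alpha * g
            # project back onto the L2 ball of radius eps around x
            delta = x_adv - x
            norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
            delta = delta * (eps / norm).clamp(max=1.0)
            x_adv = (x + delta).clamp(0, 1)
    return x_adv.detach()
```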

Routing Networks and the Challenges of Modular and Compositional Computation

Title Routing Networks and the Challenges of Modular and Compositional Computation
Authors Clemens Rosenbaum, Ignacio Cases, Matthew Riemer, Tim Klinger
Abstract Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including multi-task learning, language modeling, visual question answering, machine comprehension, and others. However, such models present unique challenges during training when both the module parameters and their composition must be learned jointly. In this paper, we identify several of these issues and analyze their underlying causes. Our discussion focuses on routing networks, a general approach to this problem, and examines empirically the interplay of these challenges and a variety of design decisions. In particular, we consider the effect of how the algorithm decides on module composition, how the algorithm updates the modules, and if the algorithm uses regularization.
Tasks Language Modelling, Multi-Task Learning, Question Answering, Reading Comprehension, Visual Question Answering
Published 2019-04-29
URL http://arxiv.org/abs/1904.12774v1
PDF http://arxiv.org/pdf/1904.12774v1.pdf
PWC https://paperswithcode.com/paper/routing-networks-and-the-challenges-of
Repo https://github.com/cle-ros/RoutingNetworks
Framework pytorch
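As a reference point for what a routing network is, the sketch below shows one minimal design: a learned router selects one expert module per example via straight-through Gumbel-softmax. This is only one of the composition and update choices the paper compares, and all names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RoutedLayer(nn.Module):
    """Minimal routing layer: a router picks one of several expert
    modules per example. Hard routing trained with RL is among the
    alternatives the paper analyzes; this sketch uses Gumbel-softmax."""
    def __init__(self, dim, n_modules=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(n_modules))
        self.router = nn.Linear(dim, n_modules)

    def forward(self, x, tau=1.0):
        # one-hot routing decision, differentiable via straight-through
        probs = F.gumbel_softmax(self.router(x), tau=tau, hard=True)
        outs = torch.stack([m(x) for m in self.experts], dim=1)  # (B, M, D)
        return (probs.unsqueeze(-1) * outs).sum(dim=1)
```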

Is Sampling Heuristics Necessary in Training Deep Object Detectors?

Title Is Sampling Heuristics Necessary in Training Deep Object Detectors?
Authors Joya Chen, Dong Liu, Tong Xu, Shilong Zhang, Shiwei Wu, Bin Luo, Xuezheng Peng, Enhong Chen
Abstract To address the imbalance between foreground and background, various heuristic methods, such as OHEM, Focal Loss, GHM, have been proposed for biased sampling or weighting when training deep object detectors. We challenge this paradigm by discarding the sampling heuristics and focusing on other settings for training. Our empirical study reveals that the weight of classification loss and the initialization strategy have a big impact on the training stability and the final accuracy. Thus, we propose the \emph{Sampling-Free} mechanism, including three key ingredients: optimal bias initialization, guided loss weights, and class-adaptive threshold, for training deep detectors without sampling heuristics. Compared with the sampling heuristics, our Sampling-Free mechanism is fully data diagnostic and thus avoids the laborious tuning of sampling hyper-parameters. Our extensive experimental results demonstrate that the Sampling-Free mechanism can be used for one-stage, two-stage, and anchor-free object detectors, where it always achieves higher accuracy on the challenging COCO benchmark. The mechanism is also useful for the instance segmentation task. Code is at https://github.com/ChenJoya/sampling-free.
Tasks Instance Segmentation, Semantic Segmentation
Published 2019-09-11
URL https://arxiv.org/abs/1909.04868v5
PDF https://arxiv.org/pdf/1909.04868v5.pdf
PWC https://paperswithcode.com/paper/revisiting-foreground-background-imbalance-in
Repo https://github.com/ChenJoya/objnessdet
Framework pytorch
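Two of the three ingredients are easy to illustrate. Below is a hedged sketch of the optimal bias initialization and a guided loss weight, based on our reading of the abstract; the paper's exact derivation may differ:

```python
import math
import torch.nn as nn

def init_cls_bias(cls_head: nn.Conv2d, prior: float = 0.01):
    """Initialize the classification bias so every anchor starts with
    foreground probability `prior`, preventing the sea of background
    anchors from swamping the initial loss (sketch; the prior value is
    an assumption, not the paper's derived optimum)."""
    nn.init.normal_(cls_head.weight, std=0.01)
    nn.init.constant_(cls_head.bias, -math.log((1 - prior) / prior))

def guided_loss(loss_cls, loss_reg):
    """'Guided' weighting in sketch form: rescale the classification
    term so neither loss dominates; detaching keeps the weight itself
    out of the gradient."""
    w = (loss_reg / loss_cls).detach()
    return w * loss_cls + loss_reg
```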

Are Sixteen Heads Really Better than One?

Title Are Sixteen Heads Really Better than One?
Authors Paul Michel, Omer Levy, Graham Neubig
Abstract Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions. In particular, multi-headed attention is a driving force behind many recent state-of-the-art NLP models such as Transformer-based MT models and BERT. These models apply multiple attention mechanisms in parallel, with each attention “head” potentially focusing on different parts of the input, which makes it possible to express sophisticated functions beyond the simple weighted average. In this paper we make the surprising observation that even if models have been trained using multiple heads, in practice, a large percentage of attention heads can be removed at test time without significantly impacting performance. In fact, some layers can even be reduced to a single head. We further examine greedy algorithms for pruning down models, and the potential speed, memory efficiency, and accuracy improvements obtainable therefrom. Finally, we analyze the results with respect to which parts of the model are more reliant on having multiple heads, and provide precursory evidence that training dynamics play a role in the gains provided by multi-head attention.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10650v3
PDF https://arxiv.org/pdf/1905.10650v3.pdf
PWC https://paperswithcode.com/paper/are-sixteen-heads-really-better-than-one
Repo https://github.com/pmichel31415/are-16-heads-really-better-than-1
Framework pytorch
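A minimal sketch of the pruning signal: score each head by the absolute gradient of the loss with respect to a per-head gate, then greedily remove the lowest-scoring heads. It assumes a model that accepts a head_mask tensor, as HuggingFace BERT-style models do; adapt otherwise:

```python
import torch

def head_importance(model, loader, n_layers, n_heads, device="cpu"):
    """Estimate per-head importance as |dL/d(gate)|, the sensitivity
    proxy used for greedy pruning. Assumes model(**batch, head_mask=m)
    returns an object with a .loss, HuggingFace-style (an assumption)."""
    mask = torch.ones(n_layers, n_heads, device=device, requires_grad=True)
    importance = torch.zeros(n_layers, n_heads, device=device)
    for batch in loader:
        loss = model(**batch, head_mask=mask).loss
        grad, = torch.autograd.grad(loss, mask)
        importance += grad.abs().detach()
    return importance  # prune the heads with the smallest scores first
```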

We Need No Pixels: Video Manipulation Detection Using Stream Descriptors

Title We Need No Pixels: Video Manipulation Detection Using Stream Descriptors
Authors David Güera, Sriram Baireddy, Paolo Bestagini, Stefano Tubaro, Edward J. Delp
Abstract Manipulating video content is easier than ever. Due to the misuse potential of manipulated content, multiple detection techniques that analyze the pixel data from the videos have been proposed. However, clever manipulators should also carefully forge the metadata and auxiliary header information, which is harder to do for videos than images. In this paper, we propose to identify forged videos by analyzing their multimedia stream descriptors with simple binary classifiers, completely avoiding the pixel space. Using well-known datasets, our results show that this scalable approach can achieve a high manipulation detection score if the manipulators have not done a careful data sanitization of the multimedia stream descriptors.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08743v1
PDF https://arxiv.org/pdf/1906.08743v1.pdf
PWC https://paperswithcode.com/paper/we-need-no-pixels-video-manipulation
Repo https://github.com/dguera/fake-video-detection-without-pixels
Framework none
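In sketch form the pipeline is: dump container and stream metadata with ffprobe, vectorize a few fields, and fit an off-the-shelf classifier. The feature subset below is illustrative; the paper uses a much richer descriptor vector:

```python
import json
import subprocess

def stream_descriptors(path):
    """Extract container/stream metadata with ffprobe -- no pixel
    decoding at all. Only a handful of illustrative fields are kept."""
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, check=True).stdout
    info = json.loads(out)
    video = next(s for s in info["streams"] if s["codec_type"] == "video")
    return [float(info["format"]["bit_rate"]),
            float(info["format"]["duration"]),
            float(video["width"]), float(video["height"]),
            frame_rate(video["avg_frame_rate"])]

def frame_rate(s):  # "30000/1001" -> 29.97
    num, den = s.split("/")
    return float(num) / float(den)

# X = [stream_descriptors(p) for p in video_paths]; y = 0/1 labels;
# then any simple binary classifier, e.g. sklearn's RandomForestClassifier.
```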

Self-Correction for Human Parsing

Title Self-Correction for Human Parsing
Authors Peike Li, Yunqiu Xu, Yunchao Wei, Yi Yang
Abstract Labeling pixel-level masks for fine-grained semantic segmentation tasks, e.g. human parsing, remains challenging. Ambiguous boundaries between semantic parts, and categories with similar appearance, are easily confused, leading to unexpected noise in ground-truth masks. To tackle the problem of learning with label noise, this work introduces a purification strategy, called Self-Correction for Human Parsing (SCHP), that progressively improves the reliability of both the supervised labels and the learned models. In particular, starting from a model trained on the inaccurate annotations as initialization, we design a cyclical learning scheduler that infers more reliable pseudo-masks by iteratively aggregating the current model with the former optimal one in an online manner. The correspondingly corrected labels can in turn further boost model performance. In this way, the models and the labels reciprocally become more robust and accurate over the self-correction learning cycles. Benefiting from SCHP, we achieve the best performance on two popular single-person human parsing benchmarks, the LIP and Pascal-Person-Part datasets. Our overall system ranks 1st in the CVPR2019 LIP Challenge. Code is available at https://github.com/PeikeLi/Self-Correction-Human-Parsing.
Tasks Human Parsing, Human Part Segmentation, Semantic Segmentation
Published 2019-10-22
URL https://arxiv.org/abs/1910.09777v1
PDF https://arxiv.org/pdf/1910.09777v1.pdf
PWC https://paperswithcode.com/paper/self-correction-for-human-parsing
Repo https://github.com/PeikeLi/Self-Correction-Human-Parsing
Framework pytorch
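The core aggregation step is a running average of model weights across self-correction cycles. A minimal sketch of that step (the paper aggregates the predicted pseudo-masks in the same spirit):

```python
import torch

@torch.no_grad()
def aggregate_weights(avg_model, new_model, cycle):
    """Online aggregation of the current model with the running
    average of former optima: after cycle c, avg = (c*avg + new)/(c+1).
    A sketch of the cyclical self-correction update, not the repo code."""
    for p_avg, p_new in zip(avg_model.parameters(), new_model.parameters()):
        p_avg.mul_(cycle / (cycle + 1)).add_(p_new, alpha=1 / (cycle + 1))
    # pseudo-masks for the next cycle are then re-inferred with avg_model
```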

ASD-DiagNet: A hybrid learning approach for detection of Autism Spectrum Disorder using fMRI data

Title ASD-DiagNet: A hybrid learning approach for detection of Autism Spectrum Disorder using fMRI data
Authors Taban Eslami, Vahid Mirjalili, Alvis Fong, Angela Laird, Fahad Saeed
Abstract Mental disorders such as Autism Spectrum Disorder (ASD) are heterogeneous and notoriously difficult to diagnose, especially in children. The current psychiatric diagnostic process is based purely on behavioural observation of symptomology (DSM-5/ICD-10) and may be prone to over-prescribing of drugs due to misdiagnosis. To move the field in a more quantitative direction, we need advanced and scalable machine learning infrastructure that allows us to identify reliable biomarkers of mental health disorders. In this paper, we propose a framework called ASD-DiagNet for classifying subjects with ASD versus healthy subjects using only fMRI data. We design and implement a joint learning procedure using an autoencoder and a single-layer perceptron, which improves the quality of extracted features and optimizes the model parameters. Further, we design and implement a data augmentation strategy, based on linear interpolation of available feature vectors, that produces the synthetic datasets needed for training machine learning models. The proposed approach is evaluated on a public dataset provided by the Autism Brain Imaging Data Exchange, comprising 1035 subjects from 17 brain imaging centers. Our model outperforms other state-of-the-art methods on 13 imaging centers, with an increase in classification accuracy of up to 20% and a maximum accuracy of 80%. In addition to better accuracy, the technique offers large advantages in execution time (40 minutes vs. 6 hours for other methods). The code is available under the GPL license on our lab's GitHub (https://github.com/pcdslab/ASD-DiagNet).
Tasks Data Augmentation
Published 2019-04-16
URL http://arxiv.org/abs/1904.07577v1
PDF http://arxiv.org/pdf/1904.07577v1.pdf
PWC https://paperswithcode.com/paper/asd-diagnet-a-hybrid-learning-approach-for
Repo https://github.com/pcdslab/ASD-DiagNet
Framework pytorch
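A compact sketch of the two ingredients: an autoencoder with a single-layer perceptron on the bottleneck, trained jointly, plus interpolation-based augmentation. Layer sizes, activations, and the loss weighting are our assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASDDiagNetSketch(nn.Module):
    """Autoencoder + single-layer perceptron on the bottleneck,
    trained jointly on reconstruction and classification (sketch)."""
    def __init__(self, in_dim, hid_dim=500):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid_dim)
        self.dec = nn.Linear(hid_dim, in_dim)
        self.slp = nn.Linear(hid_dim, 1)

    def forward(self, x):
        z = torch.tanh(self.enc(x))
        return self.dec(z), torch.sigmoid(self.slp(z)).squeeze(-1)

def joint_loss(x, y, recon, prob, lam=1.0):
    # lam balances the two objectives (an assumed weighting)
    return F.mse_loss(recon, x) + lam * F.binary_cross_entropy(prob, y)

def interpolate_augment(x1, x2, max_lam=0.5):
    """Linear interpolation between two same-class feature vectors,
    the augmentation strategy in sketch form."""
    lam = torch.rand(1).item() * max_lam
    return (1 - lam) * x1 + lam * x2
```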

Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations

Title Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations
Authors Yifan Xue, Michael Ding, Xinghua Lu
Abstract Learning interpretable representations of data remains a central challenge in deep learning. When training a deep generative model, the observed data are often associated with certain categorical labels, and, in parallel with learning to regenerate data and simulate new data, learning an interpretable representation of each class of data is also a process of acquiring knowledge. Here, we present a novel generative model, referred to as the Supervised Vector Quantized Variational AutoEncoder (S-VQ-VAE), which combines the power of supervised and unsupervised learning to obtain a unique, interpretable global representation for each class of data. Compared with conventional generative models, our model has three key advantages: first, it is an integrative model that can simultaneously learn a feature representation for each individual data point and a global representation for each class of data; second, the learning of global representations with embedding codes is guided by supervised information, which clearly defines the interpretation of each code; and third, the global representations capture crucial characteristics of different classes, revealing similarities and differences among the statistical structures underlying different groups of data. We evaluated the utility of S-VQ-VAE on a machine learning benchmark, the MNIST dataset, and on gene expression data from the Library of Integrated Network-Based Cellular Signatures (LINCS). We show that S-VQ-VAE was able to learn the global genetic characteristics of samples perturbed by the same class of perturbagen (PCL), and further revealed mechanistic correlations between PCLs. Such knowledge is crucial for promoting new drug development for complex diseases like cancer.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.11124v2
PDF https://arxiv.org/pdf/1909.11124v2.pdf
PWC https://paperswithcode.com/paper/supervised-vector-quantized-variational
Repo https://github.com/evasnow1992/S-VQ-VAE
Framework pytorch
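A condensed sketch of our reading of the model: a VQ-VAE whose codebook has one entry per class, with the true label selecting the code during training so that each code becomes that class's global representation. Loss coefficients follow the standard VQ-VAE recipe and are assumptions here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVQVAESketch(nn.Module):
    """VQ-VAE variant with one codebook entry per class; the label
    supervises which code quantizes the latent (sketch of our reading,
    not the authors' exact architecture)."""
    def __init__(self, in_dim, z_dim, n_classes):
        super().__init__()
        self.enc = nn.Linear(in_dim, z_dim)
        self.dec = nn.Linear(z_dim, in_dim)
        self.codebook = nn.Embedding(n_classes, z_dim)

    def forward(self, x, y):
        z_e = self.enc(x)
        e = self.codebook(y)                  # code of the true class
        z_q = z_e + (e - z_e).detach()        # straight-through estimator
        x_hat = self.dec(z_q)
        loss = (F.mse_loss(x_hat, x)
                + F.mse_loss(e, z_e.detach())         # pull code to encoder
                + 0.25 * F.mse_loss(z_e, e.detach())) # commitment term
        return x_hat, loss
```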

CoulGAT: An Experiment on Interpretability of Graph Attention Networks

Title CoulGAT: An Experiment on Interpretability of Graph Attention Networks
Authors Burc Gokden
Abstract We present an attention mechanism inspired by the definition of the screened Coulomb potential. This attention mechanism was used to interpret the layers and training dataset of Graph Attention (GAT) models via a flexible and scalable framework (CoulGAT) developed for this purpose. Using CoulGAT, a forest of plain and ResNet models was trained and characterized with this attention mechanism on the CHAMPS dataset. The learnable variables of the attention mechanism are used to extract node-node and node-feature interactions to define an empirical standard model of the graph structure and hidden layers. This representation of the graph and hidden layers can be used as a tool to compare different models, optimize hidden layers, and extract a compact definition of the dataset's graph structure.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08409v1
PDF https://arxiv.org/pdf/1912.08409v1.pdf
PWC https://paperswithcode.com/paper/coulgat-an-experiment-on-interpretability-of
Repo https://github.com/burcgokden/CoulGAT-Graph-Attention-Interpretability
Framework none
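The abstract alone does not pin down the exact attention formula, so the following is a loose, assumption-laden sketch of what a screened-Coulomb (Yukawa-form) damping of GAT attention could look like; consult the repo for the actual CoulGAT layer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScreenedAttentionSketch(nn.Module):
    """GAT-style layer whose attention logits are damped by a learnable
    screened-Coulomb factor exp(-alpha*d)/d. Our invention for
    illustration, NOT the paper's formulation. Assumes adj includes
    self-loops so no softmax row is empty."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)
        self.alpha = nn.Parameter(torch.tensor(1.0))  # learnable screening

    def forward(self, h, adj):                 # h: (N, F), adj: (N, N)
        z = self.W(h)
        n = z.size(0)
        pair = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                          z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair).squeeze(-1))   # standard GAT score
        d = torch.cdist(z, z) + 1.0                  # offset keeps self term finite
        screen = torch.exp(-self.alpha.abs() * d) / d
        logits = (e + torch.log(screen)).masked_fill(adj == 0, float("-inf"))
        return torch.softmax(logits, dim=-1) @ z
```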

ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

Title ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels
Authors Angus Dempster, François Petitjean, Geoffrey I. Webb
Abstract Most methods for time series classification that attain state-of-the-art accuracy have high computational complexity, requiring significant training time even for smaller datasets, and are intractable for larger datasets. Additionally, many existing methods focus on a single type of feature such as shape or frequency. Building on the recent success of convolutional neural networks for time series classification, we show that simple linear classifiers using random convolutional kernels achieve state-of-the-art accuracy with a fraction of the computational expense of existing methods.
Tasks Time Series, Time Series Classification
Published 2019-10-29
URL https://arxiv.org/abs/1910.13051v1
PDF https://arxiv.org/pdf/1910.13051v1.pdf
PWC https://paperswithcode.com/paper/rocket-exceptionally-fast-and-accurate-time
Repo https://github.com/timeseriesAI/timeseriesAI
Framework pytorch
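The method is simple enough to sketch in a few lines: thousands of random dilated kernels, two features per kernel (the maximum and the proportion of positive values), and a ridge classifier on top. The version below fixes the kernel length at 9 and omits the paper's random padding choice:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

def rocket_features(X, n_kernels=1000, seed=0):
    """Simplified ROCKET transform for X of shape (n_series, length):
    random mean-centered kernels with random dilation; max and PPV
    (proportion of positive values) per kernel."""
    rng = np.random.default_rng(seed)
    n, L = X.shape
    feats = np.empty((n, 2 * n_kernels))
    for k in range(n_kernels):
        klen = 9
        w = rng.normal(size=klen)
        w -= w.mean()
        b = rng.uniform(-1, 1)
        max_dil = max(1, int(np.log2((L - 1) / (klen - 1))))
        d = 2 ** rng.integers(0, max_dil + 1)
        idx = np.arange(klen) * d
        if idx[-1] >= L:                     # dilation too wide: fall back
            idx = np.arange(klen)
        win = idx[-1] + 1
        conv = np.lib.stride_tricks.sliding_window_view(
            X, win, axis=1)[:, :, idx] @ w + b
        feats[:, 2 * k] = conv.max(axis=1)
        feats[:, 2 * k + 1] = (conv > 0).mean(axis=1)
    return feats

# clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
# clf.fit(rocket_features(X_train), y_train)
```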

Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers

Title Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers
Authors Guang-He Lee, Yang Yuan, Shiyu Chang, Tommi S. Jaakkola
Abstract Strong theoretical guarantees of robustness can be given for ensembles of classifiers generated by input randomization. Specifically, an $\ell_2$ bounded adversary cannot alter the ensemble prediction generated by an additive isotropic Gaussian noise, where the radius for the adversary depends on both the variance of the distribution as well as the ensemble margin at the point of interest. We build on and considerably expand this work across broad classes of distributions. In particular, we offer adversarial robustness guarantees and associated algorithms for the discrete case where the adversary is $\ell_0$ bounded. Moreover, we exemplify how the guarantees can be tightened with specific assumptions about the function class of the classifier such as a decision tree. We empirically illustrate these results with and without functional restrictions across image and molecule datasets.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.04948v3
PDF https://arxiv.org/pdf/1906.04948v3.pdf
PWC https://paperswithcode.com/paper/a-stratified-approach-to-robustness-for
Repo https://github.com/guanghelee/Randomized_Smoothing
Framework pytorch
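As context, the Gaussian l2 baseline that this paper generalizes (Cohen et al.-style smoothing) can be sketched as follows; a faithful implementation would lower-bound the top-class probability with a confidence interval rather than the point estimate used here:

```python
import torch
from scipy.stats import norm

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n=1000, n_classes=10):
    """Predict with the Gaussian-smoothed classifier and return a
    certified l2 radius sigma * Phi^{-1}(p_A). Point-estimate sketch of
    the l2 baseline, not this paper's l0/discrete certificates."""
    noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
    counts = torch.bincount(model(noisy).argmax(dim=1), minlength=n_classes)
    top = counts.argmax().item()
    p_a = min(counts[top].item() / n, 1 - 1e-6)  # keep Phi^{-1} finite
    radius = sigma * norm.ppf(p_a) if p_a > 0.5 else 0.0
    return top, radius
```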

Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees

Title Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
Authors Ruqi Zhang, Christopher De Sa
Abstract Gibbs sampling is a Markov chain Monte Carlo method that is often used for learning and inference on graphical models. Minibatching, in which a small random subset of the graph is used at each iteration, can help make Gibbs sampling scale to large graphical models by reducing its computational cost. In this paper, we propose a new auxiliary-variable minibatched Gibbs sampling method, {\it Poisson-minibatching Gibbs}, which both produces unbiased samples and has a theoretical guarantee on its convergence rate. In comparison to previous minibatched Gibbs algorithms, Poisson-minibatching Gibbs supports fast sampling from continuous state spaces and avoids the need for a Metropolis-Hastings correction on discrete state spaces. We demonstrate the effectiveness of our method on multiple applications and in comparison with both plain Gibbs and previous minibatched methods.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09771v1
PDF https://arxiv.org/pdf/1911.09771v1.pdf
PWC https://paperswithcode.com/paper/poisson-minibatching-for-gibbs-sampling-with-1
Repo https://github.com/ruqizhang/poisson-gibbs
Framework none
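For readers new to the area, here is plain Gibbs sampling on a 2D Ising model, the kind of baseline the paper accelerates. The Poisson auxiliary-variable minibatching itself is omitted, since it is the paper's contribution:

```python
import numpy as np

def gibbs_ising(n=32, beta=0.4, sweeps=200, seed=0):
    """Plain Gibbs sampler for an n x n Ising model on a torus: each
    spin is resampled from its conditional given its four neighbours.
    Poisson-minibatching would replace the full neighbour sum with a
    random Poisson-chosen subset of factors."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(n, n))
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                h = (s[(i + 1) % n, j] + s[(i - 1) % n, j]
                     + s[i, (j + 1) % n] + s[i, (j - 1) % n])
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
                s[i, j] = 1 if rng.random() < p_up else -1
    return s
```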

Out the Window: A Crowd-Sourced Dataset for Activity Classification in Security Video

Title Out the Window: A Crowd-Sourced Dataset for Activity Classification in Security Video
Authors Gregory Castanon, Nathan Shnidman, Tim Anderson, Jeffrey Byrne
Abstract The Out the Window (OTW) dataset is a crowdsourced activity dataset containing 5,668 instances of 17 activities from the NIST Activities in Extended Video (ActEV) challenge. These videos are crowdsourced from workers on Amazon Mechanical Turk using a novel scenario-acting strategy, which collects multiple instances of natural activities per scenario. Workers are instructed to lean their mobile device against an upper-story window overlooking an outdoor space, walk outside to perform a scenario involving people, vehicles and objects, and finally upload the video to us for annotation. Performance evaluation for activity classification on VIRAT Ground 2.0 shows that the OTW dataset provides an 8.3% improvement in mean classification accuracy, and a 12.5% improvement on the most challenging activities involving people with vehicles.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1908.10899v2
PDF https://arxiv.org/pdf/1908.10899v2.pdf
PWC https://paperswithcode.com/paper/out-the-window-a-crowd-sourced-dataset-for
Repo https://github.com/stresearch/otw
Framework none

Estimating multi-year 24/7 origin-destination demand using high-granular multi-source traffic data

Title Estimating multi-year 24/7 origin-destination demand using high-granular multi-source traffic data
Authors Wei Ma, Zhen Qian
Abstract Dynamic origin-destination (OD) demand is central to transportation system modeling and analysis. The dynamic OD demand estimation problem (DODE) has been studied for decades, most of which solve the DODE problem on a typical day or several typical hours. There is a lack of methods that estimate high-resolution dynamic OD demand for a sequence of many consecutive days over several years (referred to as 24/7 OD in this research). Having multi-year 24/7 OD demand would allow a better understanding of characteristics of dynamic OD demands and their evolution/trends over the past few years, a critical input for modeling transportation system evolution and reliability. This paper presents a data-driven framework that estimates day-to-day dynamic OD using high-granular traffic counts and speed data collected over many years. The proposed framework statistically clusters daily traffic data into typical traffic patterns using t-Distributed Stochastic Neighbor Embedding (t-SNE) and k-means methods. A GPU-based stochastic projected gradient descent method is proposed to efficiently solve the multi-year 24/7 DODE problem. It is demonstrated that the new method efficiently estimates the 5-minute dynamic OD demand for every single day from 2014 to 2016 on I-5 and SR-99 in the Sacramento region. The resultant multi-year 24/7 dynamic OD demand reveals the daily, weekly, monthly, seasonal and yearly change in travel demand in a region, implying intriguing demand characteristics over the years.
Tasks Traffic Prediction
Published 2019-01-26
URL http://arxiv.org/abs/1901.09266v1
PDF http://arxiv.org/pdf/1901.09266v1.pdf
PWC https://paperswithcode.com/paper/estimating-multi-year-247-origin-destination
Repo https://github.com/Lemma1/DPFE
Framework pytorch
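The first stage of the framework, clustering days into typical traffic patterns, is straightforward to sketch with scikit-learn; the cluster count and perplexity below are illustrative, and the GPU-based stochastic projected gradient DODE solver is the harder second stage not shown here:

```python
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def cluster_daily_patterns(daily_profiles, n_clusters=6):
    """Cluster day-level traffic profiles into typical patterns via
    t-SNE embedding followed by k-means. `daily_profiles` has shape
    (n_days, n_features), e.g. stacked 5-minute counts and speeds."""
    emb = TSNE(n_components=2, perplexity=30,
               random_state=0).fit_transform(daily_profiles)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(emb)
    return emb, labels  # one OD estimation run per typical pattern
```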

Dynamics are Important for the Recognition of Equine Pain in Video

Title Dynamics are Important for the Recognition of Equine Pain in Video
Authors Sofia Broomé, Karina Bech Gleerup, Pia Haubro Andersen, Hedvig Kjellström
Abstract A prerequisite to successfully alleviate pain in animals is to recognize it, which is a great challenge in non-verbal species. Furthermore, prey animals such as horses tend to hide their pain. In this study, we propose a deep recurrent two-stream architecture for the task of distinguishing pain from non-pain in videos of horses. Different models are evaluated on a unique dataset showing horses under controlled trials with moderate pain induction, which has been presented in earlier work. Sequential models are experimentally compared to single-frame models, showing the importance of the temporal dimension of the data, and are benchmarked against a veterinary expert classification of the data. We additionally perform baseline comparisons with generalized versions of state-of-the-art human pain recognition methods. While equine pain detection in machine learning is a novel field, our results surpass veterinary expert performance and outperform pain detection results reported for other larger non-human species.
Tasks
Published 2019-01-07
URL https://arxiv.org/abs/1901.02106v2
PDF https://arxiv.org/pdf/1901.02106v2.pdf
PWC https://paperswithcode.com/paper/dynamics-are-important-for-the-recognition-of
Repo https://github.com/sofiabroome/painface-recognition
Framework none
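A compact sketch of a deep recurrent two-stream classifier in the spirit of the paper: per-frame CNN features for RGB and optical-flow streams, an LSTM over time, and late fusion. All sizes are illustrative, and the actual models use convolutional LSTM variants:

```python
import torch
import torch.nn as nn

class TwoStreamRecurrentSketch(nn.Module):
    """Per-frame CNN encoders for RGB (3-ch) and optical flow (2-ch),
    late-fused into an LSTM over time, sigmoid pain/no-pain output.
    Illustrative stand-in for the paper's ConvLSTM architectures."""
    def __init__(self, feat_dim=128, hid=64):
        super().__init__()
        def cnn(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim))
        self.rgb_cnn, self.flow_cnn = cnn(3), cnn(2)
        self.lstm = nn.LSTM(2 * feat_dim, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)

    def forward(self, rgb, flow):  # rgb: (B,T,3,H,W), flow: (B,T,2,H,W)
        B, T = rgb.shape[:2]
        enc = lambda net, x: net(x.flatten(0, 1)).view(B, T, -1)
        seq = torch.cat([enc(self.rgb_cnn, rgb),
                         enc(self.flow_cnn, flow)], dim=-1)
        _, (h, _) = self.lstm(seq)
        return torch.sigmoid(self.head(h[-1]))  # P(pain) per clip
```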