January 31, 2020


Paper Group AWR 411


Adversarial Examples Are Not Bugs, They Are Features

Title Adversarial Examples Are Not Bugs, They Are Features
Authors Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry
Abstract Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data.
Tasks
Published 2019-05-06
URL https://arxiv.org/abs/1905.02175v4
PDF https://arxiv.org/pdf/1905.02175v4.pdf
PWC https://paperswithcode.com/paper/adversarial-examples-are-not-bugs-they-are
Repo https://github.com/MadryLab/constructed-datasets
Framework pytorch
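The linked repo distributes the constructed datasets themselves rather than the construction code. To make the idea concrete, here is a minimal sketch of the targeted L2 PGD step from which such relabeled datasets can be built; the function name and hyperparameters are ours, not the repo's:

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y_target, eps=0.5, alpha=0.1, steps=20):
    """Targeted L2-bounded PGD on image batches of shape (B, C, H, W):
    perturb x toward y_target, then relabel the result as y_target to
    form a 'non-robust' training pair (a sketch, not the authors' code)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # normalized gradient step, descending toward the target class
            g = grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
            x_adv = x_adv - alpha * g
            # project back onto the L2 ball of radius eps around x
            delta = x_adv - x
            norm = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
            delta = delta * (eps / norm).clamp(max=1.0)
            x_adv = (x + delta).clamp(0, 1)
    return x_adv.detach()
```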

Routing Networks and the Challenges of Modular and Compositional Computation

Title Routing Networks and the Challenges of Modular and Compositional Computation
Authors Clemens Rosenbaum, Ignacio Cases, Matthew Riemer, Tim Klinger
Abstract Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including multi-task learning, language modeling, visual question answering, machine comprehension, and others. However, such models present unique challenges during training when both the module parameters and their composition must be learned jointly. In this paper, we identify several of these issues and analyze their underlying causes. Our discussion focuses on routing networks, a general approach to this problem, and examines empirically the interplay of these challenges and a variety of design decisions. In particular, we consider the effect of how the algorithm decides on module composition, how the algorithm updates the modules, and if the algorithm uses regularization.
Tasks Language Modelling, Multi-Task Learning, Question Answering, Reading Comprehension, Visual Question Answering
Published 2019-04-29
URL http://arxiv.org/abs/1904.12774v1
PDF http://arxiv.org/pdf/1904.12774v1.pdf
PWC https://paperswithcode.com/paper/routing-networks-and-the-challenges-of
Repo https://github.com/cle-ros/RoutingNetworks
Framework pytorch
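As a reference point for what a routing network is, the sketch below shows one minimal design: a learned router selects one expert module per example via straight-through Gumbel-softmax. This is only one of the composition and update choices the paper compares, and all names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RoutedLayer(nn.Module):
    """Minimal routing layer: a router picks one of several expert
    modules per example. Hard routing trained with RL is among the
    alternatives the paper analyzes; this sketch uses Gumbel-softmax."""
    def __init__(self, dim, n_modules=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(n_modules))
        self.router = nn.Linear(dim, n_modules)

    def forward(self, x, tau=1.0):
        # one-hot routing decision, differentiable via straight-through
        probs = F.gumbel_softmax(self.router(x), tau=tau, hard=True)
        outs = torch.stack([m(x) for m in self.experts], dim=1)  # (B, M, D)
        return (probs.unsqueeze(-1) * outs).sum(dim=1)
```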

Is Sampling Heuristics Necessary in Training Deep Object Detectors?

Title Is Sampling Heuristics Necessary in Training Deep Object Detectors?
Authors Joya Chen, Dong Liu, Tong Xu, Shilong Zhang, Shiwei Wu, Bin Luo, Xuezheng Peng, Enhong Chen
Abstract To address the imbalance between foreground and background, various heuristic methods, such as OHEM, Focal Loss, GHM, have been proposed for biased sampling or weighting when training deep object detectors. We challenge this paradigm by discarding the sampling heuristics and focusing on other settings for training. Our empirical study reveals that the weight of classification loss and the initialization strategy have a big impact on the training stability and the final accuracy. Thus, we propose the \emph{Sampling-Free} mechanism, including three key ingredients: optimal bias initialization, guided loss weights, and class-adaptive threshold, for training deep detectors without sampling heuristics. Compared with the sampling heuristics, our Sampling-Free mechanism is fully data diagnostic and thus avoids the laborious tuning of sampling hyper-parameters. Our extensive experimental results demonstrate that the Sampling-Free mechanism can be used for one-stage, two-stage, and anchor-free object detectors, where it always achieves higher accuracy on the challenging COCO benchmark. The mechanism is also useful for the instance segmentation task. Code is at https://github.com/ChenJoya/sampling-free.
Tasks Instance Segmentation, Semantic Segmentation
Published 2019-09-11
URL https://arxiv.org/abs/1909.04868v5
PDF https://arxiv.org/pdf/1909.04868v5.pdf
PWC https://paperswithcode.com/paper/revisiting-foreground-background-imbalance-in
Repo https://github.com/ChenJoya/objnessdet
Framework pytorch
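Two of the three ingredients are easy to illustrate. Below is a hedged sketch of the optimal bias initialization and a guided loss weight, based on our reading of the abstract; the paper's exact derivation may differ:

```python
import math
import torch.nn as nn

def init_cls_bias(cls_head: nn.Conv2d, prior: float = 0.01):
    """Initialize the classification bias so every anchor starts with
    foreground probability `prior`, preventing the sea of background
    anchors from swamping the initial loss (sketch; the prior value is
    an assumption, not the paper's derived optimum)."""
    nn.init.normal_(cls_head.weight, std=0.01)
    nn.init.constant_(cls_head.bias, -math.log((1 - prior) / prior))

def guided_loss(loss_cls, loss_reg):
    """'Guided' weighting in sketch form: rescale the classification
    term so neither loss dominates; detaching keeps the weight itself
    out of the gradient."""
    w = (loss_reg / loss_cls).detach()
    return w * loss_cls + loss_reg
```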

Are Sixteen Heads Really Better than One?

Title Are Sixteen Heads Really Better than One?
Authors Paul Michel, Omer Levy, Graham Neubig
Abstract Attention is a powerful and ubiquitous mechanism for allowing neural models to focus on particular salient pieces of information by taking their weighted average when making predictions. In particular, multi-headed attention is a driving force behind many recent state-of-the-art NLP models such as Transformer-based MT models and BERT. These models apply multiple attention mechanisms in parallel, with each attention “head” potentially focusing on different parts of the input, which makes it possible to express sophisticated functions beyond the simple weighted average. In this paper we make the surprising observation that even if models have been trained using multiple heads, in practice, a large percentage of attention heads can be removed at test time without significantly impacting performance. In fact, some layers can even be reduced to a single head. We further examine greedy algorithms for pruning down models, and the potential speed, memory efficiency, and accuracy improvements obtainable therefrom. Finally, we analyze the results with respect to which parts of the model are more reliant on having multiple heads, and provide precursory evidence that training dynamics play a role in the gains provided by multi-head attention.
Tasks
Published 2019-05-25
URL https://arxiv.org/abs/1905.10650v3
PDF https://arxiv.org/pdf/1905.10650v3.pdf
PWC https://paperswithcode.com/paper/are-sixteen-heads-really-better-than-one
Repo https://github.com/pmichel31415/are-16-heads-really-better-than-1
Framework pytorch
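A minimal sketch of the pruning signal: score each head by the absolute gradient of the loss with respect to a per-head gate, then greedily remove the lowest-scoring heads. It assumes a model that accepts a head_mask tensor, as HuggingFace BERT-style models do; adapt otherwise:

```python
import torch

def head_importance(model, loader, n_layers, n_heads, device="cpu"):
    """Estimate per-head importance as |dL/d(gate)|, the sensitivity
    proxy used for greedy pruning. Assumes model(**batch, head_mask=m)
    returns an object with a .loss, HuggingFace-style (an assumption)."""
    mask = torch.ones(n_layers, n_heads, device=device, requires_grad=True)
    importance = torch.zeros(n_layers, n_heads, device=device)
    for batch in loader:
        loss = model(**batch, head_mask=mask).loss
        grad, = torch.autograd.grad(loss, mask)
        importance += grad.abs().detach()
    return importance  # prune the heads with the smallest scores first
```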

We Need No Pixels: Video Manipulation Detection Using Stream Descriptors

Title We Need No Pixels: Video Manipulation Detection Using Stream Descriptors
Authors David Güera, Sriram Baireddy, Paolo Bestagini, Stefano Tubaro, Edward J. Delp
Abstract Manipulating video content is easier than ever. Due to the misuse potential of manipulated content, multiple detection techniques that analyze the pixel data from the videos have been proposed. However, clever manipulators should also carefully forge the metadata and auxiliary header information, which is harder to do for videos than images. In this paper, we propose to identify forged videos by analyzing their multimedia stream descriptors with simple binary classifiers, completely avoiding the pixel space. Using well-known datasets, our results show that this scalable approach can achieve a high manipulation detection score if the manipulators have not done a careful data sanitization of the multimedia stream descriptors.
Tasks
Published 2019-06-20
URL https://arxiv.org/abs/1906.08743v1
PDF https://arxiv.org/pdf/1906.08743v1.pdf
PWC https://paperswithcode.com/paper/we-need-no-pixels-video-manipulation
Repo https://github.com/dguera/fake-video-detection-without-pixels
Framework none
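In sketch form the pipeline is: dump container and stream metadata with ffprobe, vectorize a few fields, and fit an off-the-shelf classifier. The feature subset below is illustrative; the paper uses a much richer descriptor vector:

```python
import json
import subprocess

def stream_descriptors(path):
    """Extract container/stream metadata with ffprobe -- no pixel
    decoding at all. Only a handful of illustrative fields are kept."""
    out = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, check=True).stdout
    info = json.loads(out)
    video = next(s for s in info["streams"] if s["codec_type"] == "video")
    return [float(info["format"]["bit_rate"]),
            float(info["format"]["duration"]),
            float(video["width"]), float(video["height"]),
            frame_rate(video["avg_frame_rate"])]

def frame_rate(s):  # "30000/1001" -> 29.97
    num, den = s.split("/")
    return float(num) / float(den)

# X = [stream_descriptors(p) for p in video_paths]; y = 0/1 labels;
# then any simple binary classifier, e.g. sklearn's RandomForestClassifier.
```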

Self-Correction for Human Parsing

Title Self-Correction for Human Parsing
Authors Peike Li, Yunqiu Xu, Yunchao Wei, Yi Yang
Abstract Labeling pixel-level masks for fine-grained semantic segmentation tasks, e.g. human parsing, remains challenging. Ambiguous boundaries between semantic parts, and categories with similar appearance, are easily confused, leading to unexpected noise in ground-truth masks. To tackle the problem of learning with label noise, this work introduces a purification strategy, called Self-Correction for Human Parsing (SCHP), that progressively improves the reliability of both the supervised labels and the learned models. In particular, starting from a model trained on the inaccurate annotations as initialization, we design a cyclical learning scheduler that infers more reliable pseudo-masks by iteratively aggregating the current model with the former optimal one in an online manner. The correspondingly corrected labels can in turn further boost model performance. In this way, the models and the labels reciprocally become more robust and accurate over the self-correction learning cycles. Benefiting from SCHP, we achieve the best performance on two popular single-person human parsing benchmarks, the LIP and Pascal-Person-Part datasets. Our overall system ranks 1st in the CVPR2019 LIP Challenge. Code is available at https://github.com/PeikeLi/Self-Correction-Human-Parsing.
Tasks Human Parsing, Human Part Segmentation, Semantic Segmentation
Published 2019-10-22
URL https://arxiv.org/abs/1910.09777v1
PDF https://arxiv.org/pdf/1910.09777v1.pdf
PWC https://paperswithcode.com/paper/self-correction-for-human-parsing
Repo https://github.com/PeikeLi/Self-Correction-Human-Parsing
Framework pytorch
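The core aggregation step is a running average of model weights across self-correction cycles. A minimal sketch of that step (the paper aggregates the predicted pseudo-masks in the same spirit):

```python
import torch

@torch.no_grad()
def aggregate_weights(avg_model, new_model, cycle):
    """Online aggregation of the current model with the running
    average of former optima: after cycle c, avg = (c*avg + new)/(c+1).
    A sketch of the cyclical self-correction update, not the repo code."""
    for p_avg, p_new in zip(avg_model.parameters(), new_model.parameters()):
        p_avg.mul_(cycle / (cycle + 1)).add_(p_new, alpha=1 / (cycle + 1))
    # pseudo-masks for the next cycle are then re-inferred with avg_model
```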

ASD-DiagNet: A hybrid learning approach for detection of Autism Spectrum Disorder using fMRI data

Title ASD-DiagNet: A hybrid learning approach for detection of Autism Spectrum Disorder using fMRI data
Authors Taban Eslami, Vahid Mirjalili, Alvis Fong, Angela Laird, Fahad Saeed
Abstract Mental disorders such as Autism Spectrum Disorder (ASD) are heterogeneous and notoriously difficult to diagnose, especially in children. The current psychiatric diagnostic process is based purely on behavioural observation of symptomology (DSM-5/ICD-10) and may be prone to over-prescribing of drugs due to misdiagnosis. To move the field in a more quantitative direction, we need advanced and scalable machine learning infrastructure that allows us to identify reliable biomarkers of mental health disorders. In this paper, we propose a framework called ASD-DiagNet for classifying subjects with ASD versus healthy subjects using only fMRI data. We design and implement a joint learning procedure using an autoencoder and a single-layer perceptron, which improves the quality of extracted features and optimizes the model parameters. Further, we design and implement a data augmentation strategy, based on linear interpolation of available feature vectors, that produces the synthetic datasets needed for training machine learning models. The proposed approach is evaluated on a public dataset provided by the Autism Brain Imaging Data Exchange, comprising 1035 subjects from 17 brain imaging centers. Our model outperforms other state-of-the-art methods on 13 imaging centers, with an increase in classification accuracy of up to 20% and a maximum accuracy of 80%. In addition to better accuracy, the technique offers large advantages in execution time (40 minutes vs. 6 hours for other methods). The code is available under the GPL license on our lab's GitHub (https://github.com/pcdslab/ASD-DiagNet).
Tasks Data Augmentation
Published 2019-04-16
URL http://arxiv.org/abs/1904.07577v1
PDF http://arxiv.org/pdf/1904.07577v1.pdf
PWC https://paperswithcode.com/paper/asd-diagnet-a-hybrid-learning-approach-for
Repo https://github.com/pcdslab/ASD-DiagNet
Framework pytorch
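A compact sketch of the two ingredients: an autoencoder with a single-layer perceptron on the bottleneck, trained jointly, plus interpolation-based augmentation. Layer sizes, activations, and the loss weighting are our assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASDDiagNetSketch(nn.Module):
    """Autoencoder + single-layer perceptron on the bottleneck,
    trained jointly on reconstruction and classification (sketch)."""
    def __init__(self, in_dim, hid_dim=500):
        super().__init__()
        self.enc = nn.Linear(in_dim, hid_dim)
        self.dec = nn.Linear(hid_dim, in_dim)
        self.slp = nn.Linear(hid_dim, 1)

    def forward(self, x):
        z = torch.tanh(self.enc(x))
        return self.dec(z), torch.sigmoid(self.slp(z)).squeeze(-1)

def joint_loss(x, y, recon, prob, lam=1.0):
    # lam balances the two objectives (an assumed weighting)
    return F.mse_loss(recon, x) + lam * F.binary_cross_entropy(prob, y)

def interpolate_augment(x1, x2, max_lam=0.5):
    """Linear interpolation between two same-class feature vectors,
    the augmentation strategy in sketch form."""
    lam = torch.rand(1).item() * max_lam
    return (1 - lam) * x1 + lam * x2
```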

Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations

Title Supervised Vector Quantized Variational Autoencoder for Learning Interpretable Global Representations
Authors Yifan Xue, Michael Ding, Xinghua Lu
Abstract Learning interpretable representations of data remains a central challenge in deep learning. When training a deep generative model, the observed data are often associated with certain categorical labels, and, in parallel with learning to regenerate data and simulate new data, learning an interpretable representation of each class of data is also a process of acquiring knowledge. Here, we present a novel generative model, referred to as the Supervised Vector Quantized Variational AutoEncoder (S-VQ-VAE), which combines the power of supervised and unsupervised learning to obtain a unique, interpretable global representation for each class of data. Compared with conventional generative models, our model has three key advantages: first, it is an integrative model that can simultaneously learn a feature representation for each individual data point and a global representation for each class of data; second, the learning of global representations with embedding codes is guided by supervised information, which clearly defines the interpretation of each code; and third, the global representations capture crucial characteristics of different classes, revealing similarities and differences among the statistical structures underlying different groups of data. We evaluated the utility of S-VQ-VAE on a machine learning benchmark, the MNIST dataset, and on gene expression data from the Library of Integrated Network-Based Cellular Signatures (LINCS). We show that S-VQ-VAE was able to learn the global genetic characteristics of samples perturbed by the same class of perturbagen (PCL), and further revealed mechanistic correlations between PCLs. Such knowledge is crucial for promoting new drug development for complex diseases like cancer.
Tasks
Published 2019-09-24
URL https://arxiv.org/abs/1909.11124v2
PDF https://arxiv.org/pdf/1909.11124v2.pdf
PWC https://paperswithcode.com/paper/supervised-vector-quantized-variational
Repo https://github.com/evasnow1992/S-VQ-VAE
Framework pytorch
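A condensed sketch of our reading of the model: a VQ-VAE whose codebook has one entry per class, with the true label selecting the code during training so that each code becomes that class's global representation. Loss coefficients follow the standard VQ-VAE recipe and are assumptions here:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SVQVAESketch(nn.Module):
    """VQ-VAE variant with one codebook entry per class; the label
    supervises which code quantizes the latent (sketch of our reading,
    not the authors' exact architecture)."""
    def __init__(self, in_dim, z_dim, n_classes):
        super().__init__()
        self.enc = nn.Linear(in_dim, z_dim)
        self.dec = nn.Linear(z_dim, in_dim)
        self.codebook = nn.Embedding(n_classes, z_dim)

    def forward(self, x, y):
        z_e = self.enc(x)
        e = self.codebook(y)                  # code of the true class
        z_q = z_e + (e - z_e).detach()        # straight-through estimator
        x_hat = self.dec(z_q)
        loss = (F.mse_loss(x_hat, x)
                + F.mse_loss(e, z_e.detach())         # pull code to encoder
                + 0.25 * F.mse_loss(z_e, e.detach())) # commitment term
        return x_hat, loss
```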

CoulGAT: An Experiment on Interpretability of Graph Attention Networks

Title CoulGAT: An Experiment on Interpretability of Graph Attention Networks
Authors Burc Gokden
Abstract We present an attention mechanism inspired by the definition of the screened Coulomb potential. This attention mechanism was used to interpret the layers and training dataset of Graph Attention (GAT) models via a flexible and scalable framework (CoulGAT) developed for this purpose. Using CoulGAT, a forest of plain and ResNet models was trained and characterized with this attention mechanism on the CHAMPS dataset. The learnable variables of the attention mechanism are used to extract node-node and node-feature interactions to define an empirical standard model of the graph structure and hidden layers. This representation of the graph and hidden layers can be used as a tool to compare different models, optimize hidden layers, and extract a compact definition of the dataset's graph structure.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08409v1
PDF https://arxiv.org/pdf/1912.08409v1.pdf
PWC https://paperswithcode.com/paper/coulgat-an-experiment-on-interpretability-of
Repo https://github.com/burcgokden/CoulGAT-Graph-Attention-Interpretability
Framework none
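The abstract alone does not pin down the exact attention formula, so the following is a loose, assumption-laden sketch of what a screened-Coulomb (Yukawa-form) damping of GAT attention could look like; consult the repo for the actual CoulGAT layer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScreenedAttentionSketch(nn.Module):
    """GAT-style layer whose attention logits are damped by a learnable
    screened-Coulomb factor exp(-alpha*d)/d. Our invention for
    illustration, NOT the paper's formulation. Assumes adj includes
    self-loops so no softmax row is empty."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)
        self.alpha = nn.Parameter(torch.tensor(1.0))  # learnable screening

    def forward(self, h, adj):                 # h: (N, F), adj: (N, N)
        z = self.W(h)
        n = z.size(0)
        pair = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                          z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair).squeeze(-1))   # standard GAT score
        d = torch.cdist(z, z) + 1.0                  # offset keeps self term finite
        screen = torch.exp(-self.alpha.abs() * d) / d
        logits = (e + torch.log(screen)).masked_fill(adj == 0, float("-inf"))
        return torch.softmax(logits, dim=-1) @ z
```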

ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

Title ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels
Authors Angus Dempster, François Petitjean, Geoffrey I. Webb
Abstract Most methods for time series classification that attain state-of-the-art accuracy have high computational complexity, requiring significant training time even for smaller datasets, and are intractable for larger datasets. Additionally, many existing methods focus on a single type of feature such as shape or frequency. Building on the recent success of convolutional neural networks for time series classification, we show that simple linear classifiers using random convolutional kernels achieve state-of-the-art accuracy with a fraction of the computational expense of existing methods.
Tasks Time Series, Time Series Classification
Published 2019-10-29
URL https://arxiv.org/abs/1910.13051v1
PDF https://arxiv.org/pdf/1910.13051v1.pdf
PWC https://paperswithcode.com/paper/rocket-exceptionally-fast-and-accurate-time
Repo https://github.com/timeseriesAI/timeseriesAI
Framework pytorch
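The method is simple enough to sketch in a few lines: thousands of random dilated kernels, two features per kernel (the maximum and the proportion of positive values), and a ridge classifier on top. The version below fixes the kernel length at 9 and omits the paper's random padding choice:

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV

def rocket_features(X, n_kernels=1000, seed=0):
    """Simplified ROCKET transform for X of shape (n_series, length):
    random mean-centered kernels with random dilation; max and PPV
    (proportion of positive values) per kernel."""
    rng = np.random.default_rng(seed)
    n, L = X.shape
    feats = np.empty((n, 2 * n_kernels))
    for k in range(n_kernels):
        klen = 9
        w = rng.normal(size=klen)
        w -= w.mean()
        b = rng.uniform(-1, 1)
        max_dil = max(1, int(np.log2((L - 1) / (klen - 1))))
        d = 2 ** rng.integers(0, max_dil + 1)
        idx = np.arange(klen) * d
        if idx[-1] >= L:                     # dilation too wide: fall back
            idx = np.arange(klen)
        win = idx[-1] + 1
        conv = np.lib.stride_tricks.sliding_window_view(
            X, win, axis=1)[:, :, idx] @ w + b
        feats[:, 2 * k] = conv.max(axis=1)
        feats[:, 2 * k + 1] = (conv > 0).mean(axis=1)
    return feats

# clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))
# clf.fit(rocket_features(X_train), y_train)
```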

Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers

Title Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers
Authors Guang-He Lee, Yang Yuan, Shiyu Chang, Tommi S. Jaakkola
Abstract Strong theoretical guarantees of robustness can be given for ensembles of classifiers generated by input randomization. Specifically, an $\ell_2$ bounded adversary cannot alter the ensemble prediction generated by an additive isotropic Gaussian noise, where the radius for the adversary depends on both the variance of the distribution as well as the ensemble margin at the point of interest. We build on and considerably expand this work across broad classes of distributions. In particular, we offer adversarial robustness guarantees and associated algorithms for the discrete case where the adversary is $\ell_0$ bounded. Moreover, we exemplify how the guarantees can be tightened with specific assumptions about the function class of the classifier such as a decision tree. We empirically illustrate these results with and without functional restrictions across image and molecule datasets.
Tasks
Published 2019-06-12
URL https://arxiv.org/abs/1906.04948v3
PDF https://arxiv.org/pdf/1906.04948v3.pdf
PWC https://paperswithcode.com/paper/a-stratified-approach-to-robustness-for
Repo https://github.com/guanghelee/Randomized_Smoothing
Framework pytorch
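As context, the Gaussian l2 baseline that this paper generalizes (Cohen et al.-style smoothing) can be sketched as follows; a faithful implementation would lower-bound the top-class probability with a confidence interval rather than the point estimate used here:

```python
import torch
from scipy.stats import norm

@torch.no_grad()
def smoothed_predict(model, x, sigma=0.25, n=1000, n_classes=10):
    """Predict with the Gaussian-smoothed classifier and return a
    certified l2 radius sigma * Phi^{-1}(p_A). Point-estimate sketch of
    the l2 baseline, not this paper's l0/discrete certificates."""
    noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
    counts = torch.bincount(model(noisy).argmax(dim=1), minlength=n_classes)
    top = counts.argmax().item()
    p_a = min(counts[top].item() / n, 1 - 1e-6)  # keep Phi^{-1} finite
    radius = sigma * norm.ppf(p_a) if p_a > 0.5 else 0.0
    return top, radius
```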

Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees

Title Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees
Authors Ruqi Zhang, Christopher De Sa
Abstract Gibbs sampling is a Markov chain Monte Carlo method that is often used for learning and inference on graphical models. Minibatching, in which a small random subset of the graph is used at each iteration, can help make Gibbs sampling scale to large graphical models by reducing its computational cost. In this paper, we propose a new auxiliary-variable minibatched Gibbs sampling method, {\it Poisson-minibatching Gibbs}, which both produces unbiased samples and has a theoretical guarantee on its convergence rate. In comparison to previous minibatched Gibbs algorithms, Poisson-minibatching Gibbs supports fast sampling from continuous state spaces and avoids the need for a Metropolis-Hastings correction on discrete state spaces. We demonstrate the effectiveness of our method on multiple applications and in comparison with both plain Gibbs and previous minibatched methods.
Tasks
Published 2019-11-21
URL https://arxiv.org/abs/1911.09771v1
PDF https://arxiv.org/pdf/1911.09771v1.pdf
PWC https://paperswithcode.com/paper/poisson-minibatching-for-gibbs-sampling-with-1
Repo https://github.com/ruqizhang/poisson-gibbs
Framework none
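For readers new to the area, here is plain Gibbs sampling on a 2D Ising model, the kind of baseline the paper accelerates. The Poisson auxiliary-variable minibatching itself is omitted, since it is the paper's contribution:

```python
import numpy as np

def gibbs_ising(n=32, beta=0.4, sweeps=200, seed=0):
    """Plain Gibbs sampler for an n x n Ising model on a torus: each
    spin is resampled from its conditional given its four neighbours.
    Poisson-minibatching would replace the full neighbour sum with a
    random Poisson-chosen subset of factors."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(n, n))
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                h = (s[(i + 1) % n, j] + s[(i - 1) % n, j]
                     + s[i, (j + 1) % n] + s[i, (j - 1) % n])
                p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
                s[i, j] = 1 if rng.random() < p_up else -1
    return s
```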

Out the Window: A Crowd-Sourced Dataset for Activity Classification in Security Video

Title Out the Window: A Crowd-Sourced Dataset for Activity Classification in Security Video
Authors Gregory Castanon, Nathan Shnidman, Tim Anderson, Jeffrey Byrne
Abstract The Out the Window (OTW) dataset is a crowdsourced activity dataset containing 5,668 instances of 17 activities from the NIST Activities in Extended Video (ActEV) challenge. These videos are crowdsourced from workers on Amazon Mechanical Turk using a novel scenario-acting strategy, which collects multiple instances of natural activities per scenario. Workers are instructed to lean their mobile device against an upper-story window overlooking an outdoor space, walk outside to perform a scenario involving people, vehicles and objects, and finally upload the video to us for annotation. Performance evaluation for activity classification on VIRAT Ground 2.0 shows that the OTW dataset provides an 8.3% improvement in mean classification accuracy, and a 12.5% improvement on the most challenging activities involving people with vehicles.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1908.10899v2
PDF https://arxiv.org/pdf/1908.10899v2.pdf
PWC https://paperswithcode.com/paper/out-the-window-a-crowd-sourced-dataset-for
Repo https://github.com/stresearch/otw
Framework none

Estimating multi-year 24/7 origin-destination demand using high-granular multi-source traffic data

Title Estimating multi-year 24/7 origin-destination demand using high-granular multi-source traffic data
Authors Wei Ma, Zhen Qian
Abstract Dynamic origin-destination (OD) demand is central to transportation system modeling and analysis. The dynamic OD demand estimation problem (DODE) has been studied for decades, most of which solve the DODE problem on a typical day or several typical hours. There is a lack of methods that estimate high-resolution dynamic OD demand for a sequence of many consecutive days over several years (referred to as 24/7 OD in this research). Having multi-year 24/7 OD demand would allow a better understanding of characteristics of dynamic OD demands and their evolution/trends over the past few years, a critical input for modeling transportation system evolution and reliability. This paper presents a data-driven framework that estimates day-to-day dynamic OD using high-granular traffic counts and speed data collected over many years. The proposed framework statistically clusters daily traffic data into typical traffic patterns using t-Distributed Stochastic Neighbor Embedding (t-SNE) and k-means methods. A GPU-based stochastic projected gradient descent method is proposed to efficiently solve the multi-year 24/7 DODE problem. It is demonstrated that the new method efficiently estimates the 5-minute dynamic OD demand for every single day from 2014 to 2016 on I-5 and SR-99 in the Sacramento region. The resultant multi-year 24/7 dynamic OD demand reveals the daily, weekly, monthly, seasonal and yearly change in travel demand in a region, implying intriguing demand characteristics over the years.
Tasks Traffic Prediction
Published 2019-01-26
URL http://arxiv.org/abs/1901.09266v1
PDF http://arxiv.org/pdf/1901.09266v1.pdf
PWC https://paperswithcode.com/paper/estimating-multi-year-247-origin-destination
Repo https://github.com/Lemma1/DPFE
Framework pytorch
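The first stage of the framework, clustering days into typical traffic patterns, is straightforward to sketch with scikit-learn; the cluster count and perplexity below are illustrative, and the GPU-based stochastic projected gradient DODE solver is the harder second stage not shown here:

```python
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def cluster_daily_patterns(daily_profiles, n_clusters=6):
    """Cluster day-level traffic profiles into typical patterns via
    t-SNE embedding followed by k-means. `daily_profiles` has shape
    (n_days, n_features), e.g. stacked 5-minute counts and speeds."""
    emb = TSNE(n_components=2, perplexity=30,
               random_state=0).fit_transform(daily_profiles)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(emb)
    return emb, labels  # one OD estimation run per typical pattern
```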

Dynamics are Important for the Recognition of Equine Pain in Video

Title Dynamics are Important for the Recognition of Equine Pain in Video
Authors Sofia Broomé, Karina Bech Gleerup, Pia Haubro Andersen, Hedvig Kjellström
Abstract A prerequisite to successfully alleviate pain in animals is to recognize it, which is a great challenge in non-verbal species. Furthermore, prey animals such as horses tend to hide their pain. In this study, we propose a deep recurrent two-stream architecture for the task of distinguishing pain from non-pain in videos of horses. Different models are evaluated on a unique dataset showing horses under controlled trials with moderate pain induction, which has been presented in earlier work. Sequential models are experimentally compared to single-frame models, showing the importance of the temporal dimension of the data, and are benchmarked against a veterinary expert classification of the data. We additionally perform baseline comparisons with generalized versions of state-of-the-art human pain recognition methods. While equine pain detection in machine learning is a novel field, our results surpass veterinary expert performance and outperform pain detection results reported for other larger non-human species.
Tasks
Published 2019-01-07
URL https://arxiv.org/abs/1901.02106v2
PDF https://arxiv.org/pdf/1901.02106v2.pdf
PWC https://paperswithcode.com/paper/dynamics-are-important-for-the-recognition-of
Repo https://github.com/sofiabroome/painface-recognition
Framework none
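A compact sketch of a deep recurrent two-stream classifier in the spirit of the paper: per-frame CNN features for RGB and optical-flow streams, an LSTM over time, and late fusion. All sizes are illustrative, and the actual models use convolutional LSTM variants:

```python
import torch
import torch.nn as nn

class TwoStreamRecurrentSketch(nn.Module):
    """Per-frame CNN encoders for RGB (3-ch) and optical flow (2-ch),
    late-fused into an LSTM over time, sigmoid pain/no-pain output.
    Illustrative stand-in for the paper's ConvLSTM architectures."""
    def __init__(self, feat_dim=128, hid=64):
        super().__init__()
        def cnn(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat_dim))
        self.rgb_cnn, self.flow_cnn = cnn(3), cnn(2)
        self.lstm = nn.LSTM(2 * feat_dim, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)

    def forward(self, rgb, flow):  # rgb: (B,T,3,H,W), flow: (B,T,2,H,W)
        B, T = rgb.shape[:2]
        enc = lambda net, x: net(x.flatten(0, 1)).view(B, T, -1)
        seq = torch.cat([enc(self.rgb_cnn, rgb),
                         enc(self.flow_cnn, flow)], dim=-1)
        _, (h, _) = self.lstm(seq)
        return torch.sigmoid(self.head(h[-1]))  # P(pain) per clip
```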