April 3, 2020

3306 words 16 mins read

Paper Group AWR 18

MOGPTK: The Multi-Output Gaussian Process Toolkit. Active Bayesian Assessment for Black-Box Classifiers. PDE-NetGen 1.0: from symbolic PDE representations of physical processes to trainable neural network representations. Unsupervised Learning of Intrinsic Structural Representation Points. Graph Neighborhood Attentive Pooling. Non-Autoregressive Dialog State Tracking …

MOGPTK: The Multi-Output Gaussian Process Toolkit

Title MOGPTK: The Multi-Output Gaussian Process Toolkit
Authors Taco de Wolff, Alejandro Cuevas, Felipe Tobar
Abstract We present MOGPTK, a Python package for multi-channel data modelling using Gaussian processes (GP). The aim of this toolkit is to make multi-output GP (MOGP) models accessible to researchers, data scientists, and practitioners alike. MOGPTK uses a Python front-end, relies on the GPflow suite and is built on a TensorFlow back-end, thus enabling GPU-accelerated training. The toolkit facilitates implementing the entire pipeline of GP modelling, from data loading, parameter initialization, model learning and parameter interpretation through to data imputation and extrapolation. MOGPTK implements the main multi-output covariance kernels from the literature, as well as spectral-based parameter initialization strategies. The source code, tutorials and examples in the form of Jupyter notebooks, together with the API documentation, can be found at http://github.com/GAMES-UChile/mogptk
Tasks Gaussian Processes, Imputation
Published 2020-02-09
URL https://arxiv.org/abs/2002.03471v1
PDF https://arxiv.org/pdf/2002.03471v1.pdf
PWC https://paperswithcode.com/paper/mogptk-the-multi-output-gaussian-process
Repo https://github.com/GAMES-UChile/mogptk
Framework tf
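
Below is a minimal usage sketch of the pipeline the abstract describes: load multi-channel data, pick a multi-output kernel, initialize spectrally, train, and impute. The class and method names are assumptions modeled on the project's documentation, not a verified API reference.

```python
# Hypothetical MOGPTK pipeline sketch; all names below (DataSet, Data, MOSM,
# init_parameters, train, predict) are assumed, not verified against the API.
import numpy as np
import mogptk

t = np.linspace(0.0, 10.0, 300)
y1 = np.sin(np.pi * t) + 0.1 * np.random.randn(300)
y2 = np.cos(np.pi * t) + 0.1 * np.random.randn(300)

dataset = mogptk.DataSet(mogptk.Data(t, y1, name='channel-1'),
                         mogptk.Data(t, y2, name='channel-2'))
dataset[0].remove_range(4.0, 6.0)   # simulate a gap to be imputed

model = mogptk.MOSM(dataset, Q=2)   # multi-output spectral mixture kernel
model.init_parameters()             # spectral-based initialization
model.train(iters=500)              # GPU-accelerated via the TensorFlow back-end
mean, var = model.predict()         # imputation and extrapolation
```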

Active Bayesian Assessment for Black-Box Classifiers

Title Active Bayesian Assessment for Black-Box Classifiers
Authors Disi Ji, Robert L. Logan IV, Padhraic Smyth, Mark Steyvers
Abstract Recent advances in machine learning have led to increased deployment of black-box classifiers across a wide variety of applications. In many such situations there is a crucial need to assess the performance of these pre-trained models, for instance to ensure sufficient predictive accuracy, or that class probabilities are well-calibrated. Furthermore, since labeled data may be scarce or costly to collect, it is desirable for such assessment to be performed in an efficient manner. In this paper, we introduce a Bayesian approach for model assessment that satisfies these desiderata. We develop inference strategies to quantify uncertainty for common assessment metrics (accuracy, misclassification cost, expected calibration error), and propose a framework for active assessment using this uncertainty to guide efficient selection of instances for labeling. We illustrate the benefits of our approach in experiments assessing the performance of modern neural classifiers (e.g., ResNet and BERT) on several standard image and text classification datasets.
Tasks Calibration, Text Classification
Published 2020-02-16
URL https://arxiv.org/abs/2002.06532v1
PDF https://arxiv.org/pdf/2002.06532v1.pdf
PWC https://paperswithcode.com/paper/active-bayesian-assessment-for-black-box
Repo https://github.com/disiji/bayesian-blackbox
Framework pytorch
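
The core idea can be illustrated with a Beta-Bernoulli posterior over accuracy plus uncertainty-guided label selection. This is a simplified sketch of the general technique, not the paper's exact algorithm; the group structure and selection rule are illustrative assumptions.

```python
# Sketch: Bayesian assessment of per-group accuracy with active label selection.
import numpy as np

rng = np.random.default_rng(0)
n_groups = 5
true_acc = rng.uniform(0.6, 0.95, n_groups)  # unknown ground-truth accuracies
alpha = np.ones(n_groups)                    # Beta prior: correct counts
beta = np.ones(n_groups)                     # Beta prior: incorrect counts

for _ in range(200):
    theta = rng.beta(alpha, beta)         # Thompson sample of each group's accuracy
    g = int(np.argmin(theta))             # actively label where accuracy looks worst
    correct = rng.random() < true_acc[g]  # oracle provides one label
    alpha[g] += correct
    beta[g] += 1 - correct

posterior_mean = alpha / (alpha + beta)  # assessed accuracy with quantified uncertainty
```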

PDE-NetGen 1.0: from symbolic PDE representations of physical processes to trainable neural network representations

Title PDE-NetGen 1.0: from symbolic PDE representations of physical processes to trainable neural network representations
Authors Olivier Pannekoucke, Ronan Fablet
Abstract Bridging physics and deep learning is a topical challenge. While deep learning frameworks open avenues in physical science, the design of physically-consistent deep neural network architectures is an open issue. In the spirit of physics-informed NNs, the PDE-NetGen package provides new means to automatically translate physical equations, given as PDEs, into neural network architectures. PDE-NetGen combines symbolic calculus and a neural network generator. The latter exploits NN-based implementations of PDE solvers using Keras. With some knowledge of a problem, PDE-NetGen is a plug-and-play tool to generate physics-informed NN architectures. These provide computationally-efficient yet compact representations to address a variety of issues, including, among others, adjoint derivation, model calibration, forecasting, data assimilation and uncertainty quantification. As an illustration, the workflow is first presented for the 2D diffusion equation, then applied to the data-driven and physics-informed identification of uncertainty dynamics for the Burgers equation.
Tasks Calibration
Published 2020-02-03
URL https://arxiv.org/abs/2002.01029v1
PDF https://arxiv.org/pdf/2002.01029v1.pdf
PWC https://paperswithcode.com/paper/pde-netgen-10-from-symbolic-pde
Repo https://github.com/opannekoucke/pdenetgen
Framework none
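
The core translation the abstract describes, from a PDE to a trainable layer, can be illustrated with the 2D diffusion example: a finite-difference stencil becomes a (fixed or trainable) convolution kernel. The sketch below is a plain NumPy illustration of that correspondence, not code generated by PDE-NetGen.

```python
# One explicit time step of the 2D diffusion equation u_t = kappa * (u_xx + u_yy):
# the 5-point Laplacian stencil is exactly the kernel a generated conv layer would hold.
import numpy as np

def diffusion_step(u, kappa=0.1, dx=1.0, dt=0.1):
    lap = (np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
           + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1) - 4.0 * u) / dx**2
    return u + dt * kappa * lap  # forward-Euler update, stable for dt*kappa/dx**2 <= 0.25
```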

Unsupervised Learning of Intrinsic Structural Representation Points

Title Unsupervised Learning of Intrinsic Structural Representation Points
Authors Nenglun Chen, Lingjie Liu, Zhiming Cui, Runnan Chen, Duygu Ceylan, Changhe Tu, Wenping Wang
Abstract Learning structures of 3D shapes is a fundamental problem in the field of computer graphics and geometry processing. We present a simple yet interpretable unsupervised method for learning a new structural representation in the form of 3D structure points. The 3D structure points produced by our method encode the shape structure intrinsically and exhibit semantic consistency across all the shape instances with similar structures. This is a challenging goal that has not been fully achieved by other methods. Specifically, our method takes a 3D point cloud as input and encodes it as a set of local features. The local features are then passed through a novel point integration module to produce a set of 3D structure points. The Chamfer distance is used as the reconstruction loss to ensure the structure points lie close to the input point cloud. Extensive experiments have shown that our method outperforms the state-of-the-art on the semantic shape correspondence task and achieves comparable performance with the state-of-the-art on the segmentation label transfer task. Moreover, the PCA-based shape embedding built upon consistent structure points demonstrates good performance in preserving the shape structures. Code is available at https://github.com/NolenChen/3DStructurePoints
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01661v2
PDF https://arxiv.org/pdf/2003.01661v2.pdf
PWC https://paperswithcode.com/paper/unsupervised-learning-of-intrinsic-structural
Repo https://github.com/NolenChen/3DStructurePoints
Framework pytorch
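
The reconstruction loss named in the abstract, the Chamfer distance between the structure points and the input cloud, is easy to state concretely. A minimal PyTorch sketch (batched shapes assumed for illustration), using the standard formulation rather than the authors' exact code:

```python
# Chamfer distance between two batched point sets.
import torch

def chamfer_distance(p, q):
    """p: (B, N, 3) predicted structure points; q: (B, M, 3) input cloud."""
    d = torch.cdist(p, q)                  # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()
```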

Graph Neighborhood Attentive Pooling

Title Graph Neighborhood Attentive Pooling
Authors Zekarias T. Kefato, Sarunas Girdzijauskas
Abstract Network representation learning (NRL) is a powerful technique for learning low-dimensional vector representations of high-dimensional and sparse graphs. Most studies explore the structure and metadata associated with the graph using random walks and employ unsupervised or semi-supervised learning schemes. Learning in these methods is context-free, because only a single representation per node is learned. Recent studies have questioned the sufficiency of a single representation and proposed a context-sensitive approach that has proved highly effective in applications such as link prediction and ranking. However, most of these methods rely on additional textual features that require RNNs or CNNs to capture high-level features, or rely on a community detection algorithm to identify multiple contexts of a node. In this study, without requiring additional features or a community detection algorithm, we propose a novel context-sensitive algorithm called GAP that learns to attend to different parts of a node’s neighborhood using attentive pooling networks. We show the efficacy of GAP using three real-world datasets on link prediction and node clustering tasks and compare it against 10 popular and state-of-the-art (SOTA) baselines. GAP consistently outperforms them and achieves up to ~9% and ~20% gain over the best-performing methods on link prediction and clustering tasks, respectively.
Tasks Community Detection, Link Prediction, Representation Learning
Published 2020-01-28
URL https://arxiv.org/abs/2001.10394v2
PDF https://arxiv.org/pdf/2001.10394v2.pdf
PWC https://paperswithcode.com/paper/graph-neighborhood-attentive-pooling-1
Repo https://github.com/zekarias-tilahun/GAP
Framework pytorch
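
The attentive pooling mechanism GAP builds on (dos Santos et al.'s attentive pooling networks) computes a soft alignment between two neighborhoods and pools each against the other. A minimal sketch under assumed shapes, not the authors' implementation:

```python
# Attentive pooling between the neighborhood embeddings of two nodes:
# a shared bilinear alignment, max-pooled into attention weights per side.
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.U = nn.Parameter(0.01 * torch.randn(dim, dim))  # bilinear alignment

    def forward(self, a, b):
        """a: (n, d), b: (m, d) neighborhood embeddings of two nodes."""
        G = torch.tanh(a @ self.U @ b.T)            # (n, m) soft alignment
        wa = torch.softmax(G.max(dim=1).values, 0)  # attention over a's members
        wb = torch.softmax(G.max(dim=0).values, 0)  # attention over b's members
        return wa @ a, wb @ b                       # context-sensitive summaries
```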

Non-Autoregressive Dialog State Tracking

Title Non-Autoregressive Dialog State Tracking
Authors Hung Le, Richard Socher, Steven C. H. Hoi
Abstract Recent efforts in Dialogue State Tracking (DST) for task-oriented dialogues have progressed toward open-vocabulary or generation-based approaches where the models can generate slot value candidates from the dialogue history itself. These approaches have shown good performance gains, especially in complicated dialogue domains with dynamic slot values. However, they fall short in two aspects: (1) they do not allow models to explicitly learn signals across domains and slots to detect potential dependencies among (domain, slot) pairs; and (2) existing models follow auto-regressive approaches which incur high time cost when the dialogue evolves over multiple domains and multiple turns. In this paper, we propose a novel framework of Non-Autoregressive Dialog State Tracking (NADST) which can factor in potential dependencies among domains and slots to optimize the models towards better prediction of dialogue states as a complete set rather than separate slots. In particular, the non-autoregressive nature of our method not only enables decoding in parallel to significantly reduce the latency of DST for real-time dialogue response generation, but also detects dependencies among slots at the token level, in addition to the slot and domain levels. Our empirical results show that our model achieves state-of-the-art joint accuracy across all domains on the MultiWOZ 2.1 corpus, and that the latency of our model is an order of magnitude lower than the previous state of the art as the dialogue history extends over time.
Tasks Dialogue State Tracking
Published 2020-02-19
URL https://arxiv.org/abs/2002.08024v1
PDF https://arxiv.org/pdf/2002.08024v1.pdf
PWC https://paperswithcode.com/paper/non-autoregressive-dialog-state-tracking-1
Repo https://github.com/henryhungle/NADST
Framework pytorch
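
The latency argument in the abstract comes down to one property: all slot-value tokens are emitted in a single parallel forward pass rather than one per decoding step. A schematic sketch of that property (not the NADST architecture, which additionally predicts fertilities and models slot/domain dependencies):

```python
# Non-autoregressive decoding in miniature: one forward pass, no causal mask,
# every position decoded simultaneously. Schematic only.
import torch
import torch.nn as nn

d_model, vocab, n_positions = 64, 1000, 30
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
decoder = nn.TransformerEncoder(layer, num_layers=2)
to_vocab = nn.Linear(d_model, vocab)

queries = torch.randn(1, n_positions, d_model)  # one query per (domain, slot, token)
tokens = to_vocab(decoder(queries)).argmax(-1)  # (1, 30): all tokens at once
```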

DC-WCNN: A deep cascade of wavelet based convolutional neural networks for MR Image Reconstruction

Title DC-WCNN: A deep cascade of wavelet based convolutional neural networks for MR Image Reconstruction
Authors Sriprabha Ramanarayanan, Balamurali Murugesan, Keerthi Ram, Mohanasankar Sivaprakasam
Abstract Several variants of Convolutional Neural Networks (CNN) have been developed for Magnetic Resonance (MR) image reconstruction. Among them, U-Net has been shown to be the baseline architecture for MR image reconstruction. However, sub-sampling is performed by its pooling layers, causing information loss which in turn leads to blur and missing fine details in the reconstructed image. We propose a modification to the U-Net architecture to recover fine structures. The proposed network is a wavelet packet transform based encoder-decoder CNN with residual learning called WCNN. The proposed WCNN uses the discrete wavelet transform in place of pooling layers, the inverse wavelet transform in place of unpooling layers, and adds residual connections. We also propose a deep cascaded framework (DC-WCNN) which consists of cascades of WCNN and k-space data fidelity units to achieve high-quality MR reconstruction. Experimental results show that WCNN and DC-WCNN give promising results in terms of evaluation metrics and better recovery of fine details as compared to other methods.
Tasks Image Reconstruction
Published 2020-01-08
URL https://arxiv.org/abs/2001.02397v1
PDF https://arxiv.org/pdf/2001.02397v1.pdf
PWC https://paperswithcode.com/paper/dc-wcnn-a-deep-cascade-of-wavelet-based
Repo https://github.com/sriprabhar/DC-WCNN
Framework pytorch
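
Of the two components the abstract names, the k-space data-fidelity unit is the simpler to spell out: after each WCNN pass, the predicted k-space values are overwritten with the measured samples wherever they exist. An illustrative NumPy sketch, not the authors' implementation:

```python
# k-space data fidelity: trust the scanner's measured samples, keep the
# network's predictions only where no measurement exists.
import numpy as np

def data_fidelity(x_rec, k_measured, mask):
    """x_rec: reconstructed image; k_measured: undersampled k-space;
    mask: boolean, True at sampled k-space locations."""
    k_rec = np.fft.fft2(x_rec)
    k_dc = np.where(mask, k_measured, k_rec)
    return np.abs(np.fft.ifft2(k_dc))
```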

An Adversarial Approach for the Robust Classification of Pneumonia from Chest Radiographs

Title An Adversarial Approach for the Robust Classification of Pneumonia from Chest Radiographs
Authors Joseph D. Janizek, Gabriel Erion, Alex J. DeGrave, Su-In Lee
Abstract While deep learning has shown promise in the domain of disease classification from medical images, models based on state-of-the-art convolutional neural network architectures often exhibit performance loss due to dataset shift. Models trained using data from one hospital system achieve high predictive performance when tested on data from the same hospital, but perform significantly worse when they are tested in different hospital systems. Furthermore, even within a given hospital system, deep learning models have been shown to depend on hospital- and patient-level confounders rather than meaningful pathology to make classifications. In order for these models to be safely deployed, we would like to ensure that they do not use confounding variables to make their classification, and that they will work well even when tested on images from hospitals that were not included in the training data. We attempt to address this problem in the context of pneumonia classification from chest radiographs. We propose an approach based on adversarial optimization, which allows us to learn more robust models that do not depend on confounders. Specifically, we demonstrate improved out-of-hospital generalization performance of a pneumonia classifier by training a model that is invariant to the view position of chest radiographs (anterior-posterior vs. posterior-anterior). Our approach leads to better predictive performance on external hospital data than both a standard baseline and previously proposed methods to handle confounding, and also suggests a method for identifying models that may rely on confounders. Code available at https://github.com/suinleelab/cxr_adv.
Tasks
Published 2020-01-13
URL https://arxiv.org/abs/2001.04051v1
PDF https://arxiv.org/pdf/2001.04051v1.pdf
PWC https://paperswithcode.com/paper/an-adversarial-approach-for-the-robust
Repo https://github.com/suinleelab/cxr_adv
Framework pytorch
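
The adversarial optimization the abstract describes can be sketched as a two-player loop: an adversary tries to predict the confounder (view position) from the classifier's output, and the classifier is penalized when it succeeds. The sketch below is a schematic of that pattern with assumed shapes and a hypothetical weighting `lam`, not the authors' training code:

```python
# Adversarial deconfounding sketch: the classifier fits pneumonia labels while
# making the radiograph view position unpredictable from its output.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 1))
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_c = torch.optim.Adam(classifier.parameters(), lr=1e-4)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # hypothetical invariance weight

def train_step(features, y_pneumonia, y_view):
    logits = classifier(features)
    # 1) adversary learns to recover the view from the classifier's score
    opt_a.zero_grad()
    bce(adversary(logits.detach()), y_view).backward()
    opt_a.step()
    # 2) classifier fits the labels while fooling the adversary
    opt_c.zero_grad()
    (bce(logits, y_pneumonia) - lam * bce(adversary(logits), y_view)).backward()
    opt_c.step()
```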

Patternless Adversarial Attacks on Video Recognition Networks

Title Patternless Adversarial Attacks on Video Recognition Networks
Authors Itay Naeh, Roi Pony, Shie Mannor
Abstract Deep neural networks for video classification, just like image classification networks, may be subjected to adversarial manipulation. The main difference between image classifiers and video classifiers is that the latter usually use temporal information contained within the video, either in the form of optical flow or implicitly through various differences between adjacent frames. In this work we present a manipulation scheme for fooling video classifiers by introducing a spatially patternless temporal perturbation that is practically unnoticed by human observers and undetectable by leading image adversarial pattern detection algorithms. After demonstrating the manipulation of action classification for single videos, we generalize the procedure to create adversarial patterns with temporal invariance that generalize across different classes for both targeted and untargeted attacks.
Tasks Action Classification, Image Classification, Optical Flow Estimation, Video Recognition
Published 2020-02-12
URL https://arxiv.org/abs/2002.05123v1
PDF https://arxiv.org/pdf/2002.05123v1.pdf
PWC https://paperswithcode.com/paper/patternless-adversarial-attacks-on-video
Repo https://github.com/roipony/Patternless_Adversarial_Video
Framework none
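
One concrete reading of a "spatially patternless temporal perturbation" is a single RGB offset per frame, constant across all pixels, so no spatial pattern exists for an image-based detector to find. The sketch below optimizes such a flicker against a generic video classifier; it is a conceptual illustration, not the authors' attack:

```python
# Untargeted flicker attack sketch: one color offset per frame, broadcast
# over all pixels, bounded by eps to stay unnoticeable.
import torch
import torch.nn.functional as F

def flicker_attack(model, video, label, steps=100, eps=0.03, lr=0.01):
    """video: (T, C, H, W) in [0, 1]; label: (1,) long tensor."""
    delta = torch.zeros(video.shape[0], video.shape[1], 1, 1, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model((video + delta).clamp(0, 1).unsqueeze(0))
        loss = -F.cross_entropy(logits, label)  # push away from the true class
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)             # keep the flicker imperceptible
    return (video + delta).clamp(0, 1).detach()
```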

Embodied Synaptic Plasticity with Online Reinforcement learning

Title Embodied Synaptic Plasticity with Online Reinforcement learning
Authors Jacques Kaiser, Michael Hoff, Andreas Konle, J. Camilo Vasquez Tieck, David Kappel, Daniel Reichard, Anand Subramoney, Robert Legenstein, Arne Roennau, Wolfgang Maass, Rudiger Dillmann
Abstract The endeavor to understand the brain involves multiple collaborating research fields. Classically, synaptic plasticity rules derived by theoretical neuroscientists are evaluated in isolation on pattern classification tasks. This contrasts with the biological brain, whose purpose is to control a body in closed loop. This paper contributes to bringing the fields of computational neuroscience and robotics closer together by integrating open-source software components from these two fields. The resulting framework makes it possible to evaluate the validity of biologically-plausible plasticity models in closed-loop robotics environments. We use this framework to evaluate Synaptic Plasticity with Online REinforcement learning (SPORE), a reward-learning rule based on synaptic sampling, on two visuomotor tasks: reaching and lane following. We show that SPORE is capable of learning to perform policies within the course of simulated hours for both tasks. Provisional parameter explorations indicate that the learning rate and the temperature driving the stochastic processes that govern synaptic learning dynamics need to be regulated for performance improvements to be retained. We conclude by discussing recent deep reinforcement learning techniques that could be beneficial for increasing the functionality of SPORE on visuomotor tasks.
Tasks
Published 2020-03-03
URL https://arxiv.org/abs/2003.01431v1
PDF https://arxiv.org/pdf/2003.01431v1.pdf
PWC https://paperswithcode.com/paper/embodied-synaptic-plasticity-with-online
Repo https://github.com/IGITUGraz/spore-nest-module
Framework none
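
The abstract's remark about the learning rate and temperature can be made concrete with a schematic reward-modulated, Langevin-style update in the spirit of synaptic sampling: a drift term scaled by reward plus temperature-scaled noise. This is a loose sketch of the idea; SPORE's actual rule involves spiking dynamics, a prior, and eligibility traces that are omitted here:

```python
# Schematic reward-modulated synaptic-sampling step (not the SPORE rule):
# reward scales the drift, temperature scales the stochastic exploration.
import numpy as np

def synaptic_sampling_step(theta, grad_log_policy, reward,
                           lr=1e-3, temperature=0.1,
                           rng=np.random.default_rng()):
    drift = lr * reward * grad_log_policy
    noise = np.sqrt(2.0 * lr * temperature) * rng.standard_normal(theta.shape)
    return theta + drift + noise
```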

Recognizing Video Events with Varying Rhythms

Title Recognizing Video Events with Varying Rhythms
Authors Yikang Li, Tianshu Yu, Baoxin Li
Abstract Recognizing video events in long, complex videos with multiple sub-activities has received persistent attention recently. This task is more challenging than traditional action recognition with short, relatively homogeneous video clips. In this paper, we investigate the problem of recognizing long and complex events with varying action rhythms, which has not been considered in the literature but is a practical challenge. Our work is inspired in part by how humans identify events with varying rhythms: quickly catching the frames that contribute most to a specific event. We propose a two-stage end-to-end framework, in which the first stage selects the most significant frames while the second stage recognizes the event using the selected frames. Our model needs only event-level labels in the training stage, and thus is more practical when the sub-activity labels are missing or difficult to obtain. The results of extensive experiments show that our model can achieve significant improvement in event recognition from long videos while maintaining high accuracy even if the test videos suffer from severe rhythm changes. This demonstrates the potential of our method for real-world video-based applications, where test and training videos can differ drastically in the rhythms of sub-activities.
Tasks
Published 2020-01-14
URL https://arxiv.org/abs/2001.05060v1
PDF https://arxiv.org/pdf/2001.05060v1.pdf
PWC https://paperswithcode.com/paper/recognizing-video-events-with-varying-rhythms
Repo https://github.com/yikangli/video-rhythm
Framework pytorch
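
The two-stage design in the abstract, select significant frames and then classify from only those, reduces to a score-and-top-k pattern. A schematic sketch with assumed dimensions, not the authors' model:

```python
# Stage 1 scores frames; stage 2 classifies the event from the top-k frames.
import torch
import torch.nn as nn

d, k, n_classes = 256, 8, 10
scorer = nn.Linear(d, 1)              # per-frame significance
classifier = nn.Linear(d, n_classes)  # event recognition head

frames = torch.randn(1, 120, d)       # (batch, time, feature)
scores = scorer(frames).squeeze(-1)   # (1, 120)
idx = scores.topk(k, dim=1).indices   # the k most significant frames
selected = frames.gather(1, idx.unsqueeze(-1).expand(-1, -1, d))
logits = classifier(selected.mean(dim=1))  # pool the key frames, then classify
```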

Simple and Effective Graph Autoencoders with One-Hop Linear Models

Title Simple and Effective Graph Autoencoders with One-Hop Linear Models
Authors Guillaume Salha, Romain Hennequin, Michalis Vazirgiannis
Abstract Over the last few years, graph autoencoders (AE) and variational autoencoders (VAE) have emerged as powerful node embedding methods, with promising performance on challenging tasks such as link prediction and node clustering. Graph AE, VAE and most of their extensions rely on multi-layer graph convolutional network (GCN) encoders to learn vector space representations of nodes. In this paper, we show that GCN encoders are actually unnecessarily complex for many applications. We propose to replace them with significantly simpler and more interpretable linear models w.r.t. the direct neighborhood (one-hop) adjacency matrix of the graph, involving fewer operations, fewer parameters and no activation function. For the two aforementioned tasks, we show that this simpler approach consistently reaches competitive performance w.r.t. GCN-based graph AE and VAE for numerous real-world graphs, including all the benchmark datasets commonly used to evaluate graph AE and VAE. Based on these results, we also question the relevance of repeatedly using these datasets to compare complex graph AE and VAE.
Tasks Link Prediction
Published 2020-01-21
URL https://arxiv.org/abs/2001.07614v2
PDF https://arxiv.org/pdf/2001.07614v2.pdf
PWC https://paperswithcode.com/paper/simple-and-effective-graph-autoencoders-with
Repo https://github.com/deezer/linear_graph_autoencoders
Framework tf
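
The model the abstract proposes is compact enough to state in a few lines: the encoder is a single linear hop over the normalized adjacency matrix, and the decoder is the usual inner-product edge reconstruction. A minimal NumPy sketch of that formulation:

```python
# One-hop linear graph autoencoder: Z = A_norm @ X @ W (no activation),
# edges reconstructed as sigmoid(Z @ Z.T).
import numpy as np

def normalize_adjacency(A):
    # D^{-1/2} (A + I) D^{-1/2}, the standard symmetric normalization
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(A, X, W):
    return normalize_adjacency(A) @ X @ W   # one hop, linear, no activation

def decode(Z):
    return 1.0 / (1.0 + np.exp(-Z @ Z.T))   # edge probabilities
```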

KaoKore: A Pre-modern Japanese Art Facial Expression Dataset

Title KaoKore: A Pre-modern Japanese Art Facial Expression Dataset
Authors Yingtao Tian, Chikahiko Suzuki, Tarin Clanuwat, Mikel Bober-Irizar, Alex Lamb, Asanobu Kitamoto
Abstract From classifying handwritten digits to generating strings of text, the datasets which have received long-time focus from the machine learning community vary greatly in their subject matter. This has motivated a renewed interest in building datasets which are socially and culturally relevant, so that algorithmic research may have a more direct and immediate impact on society. One such area is in history and the humanities, where better and relevant machine learning models can accelerate research across various fields. To this end, newly released benchmarks and models have been proposed for transcribing historical Japanese cursive writing, yet the use of machine learning for historical Japanese artworks as a whole remains largely uncharted. To bridge this gap, in this work we propose a new dataset, KaoKore, which consists of faces extracted from pre-modern Japanese artwork. We demonstrate its value both as a dataset for image classification and as a creative and artistic dataset, which we explore using generative models. Dataset available at https://github.com/rois-codh/kaokore
Tasks Image Classification
Published 2020-02-20
URL https://arxiv.org/abs/2002.08595v1
PDF https://arxiv.org/pdf/2002.08595v1.pdf
PWC https://paperswithcode.com/paper/kaokore-a-pre-modern-japanese-art-facial
Repo https://github.com/rois-codh/kaokore
Framework pytorch

Pop Music Transformer: Generating Music with Rhythm and Harmony

Title Pop Music Transformer: Generating Music with Rhythm and Harmony
Authors Yu-Siang Huang, Yi-Hsuan Yang
Abstract The task of automatic music composition entails generative modeling of music in symbolic formats such as musical scores. By serializing a score as a sequence of MIDI-like events, recent work has demonstrated that state-of-the-art sequence models with self-attention work nicely for this task, especially for composing music with long-range coherence. In this paper, we show that sequence models can do even better when we improve the way a musical score is converted into events. The new event set, dubbed “REMI” (REvamped MIDI-derived events), provides sequence models with a metric context for modeling the rhythmic patterns of music, while allowing for local tempo changes. Moreover, it explicitly sets up a harmonic structure and makes chord progression controllable. It also facilitates coordinating different tracks of a musical piece, such as the piano, bass and drums. With this new approach, we build a Pop Music Transformer that composes Pop piano music with a more plausible rhythmic structure than prior art does. The code, data and pre-trained model are publicly available at https://github.com/YatingMusic/remi
Tasks
Published 2020-02-01
URL https://arxiv.org/abs/2002.00212v1
PDF https://arxiv.org/pdf/2002.00212v1.pdf
PWC https://paperswithcode.com/paper/pop-music-transformer-generating-music-with
Repo https://github.com/YatingMusic/remi
Framework tf
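
The flavor of the REMI event set can be sketched as below. The event names and bin sizes are assumptions reconstructed from the abstract (bar and position events for the metric grid, tempo and chord events for local tempo and harmony); consult the repository for the actual vocabulary:

```python
# Hypothetical REMI-style event sequence for a single note; names and bins are
# assumed from the abstract, not taken from the repository.
remi_events = [
    "Bar",               # bar line: anchors the metric context
    "Position_1/16",     # position on the within-bar beat grid
    "Tempo_120",         # local tempo change
    "Chord_C:maj",       # explicit harmonic structure
    "Note_Velocity_20",  # dynamics
    "Note_On_60",        # MIDI pitch 60 (middle C)
    "Note_Duration_4",   # duration in grid units
]
```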

Analysis of Gender Inequality In Face Recognition Accuracy

Title Analysis of Gender Inequality In Face Recognition Accuracy
Authors Vítor Albiero, Krishnapriya K. S., Kushal Vangara, Kai Zhang, Michael C. King, Kevin W. Bowyer
Abstract We present a comprehensive analysis of how and why face recognition accuracy differs between men and women. We show that accuracy is lower for women due to the combination of (1) the impostor distribution for women having a skew toward higher similarity scores, and (2) the genuine distribution for women having a skew toward lower similarity scores. We show that this phenomenon of the impostor and genuine distributions for women shifting closer towards each other is general across datasets of African-American, Caucasian, and Asian faces. We show that the distribution of facial expressions may differ between male/female, but that the accuracy difference persists for image subsets rated confidently as neutral expression. The accuracy difference also persists for image subsets rated as close to zero pitch angle. Even when removing images with forehead partially occluded by hair/hat, the same impostor/genuine accuracy difference persists. We show that the female genuine distribution improves when only female images without facial cosmetics are used, but that the female impostor distribution also degrades at the same time. Lastly, we show that the accuracy difference persists even if a state-of-the-art deep learning method is trained from scratch using training data explicitly balanced between male and female images and subjects.
Tasks Face Recognition
Published 2020-01-31
URL https://arxiv.org/abs/2002.00065v1
PDF https://arxiv.org/pdf/2002.00065v1.pdf
PWC https://paperswithcode.com/paper/analysis-of-gender-inequality-in-face
Repo https://github.com/vitoralbiero/afd_dataset_cleaned
Framework none
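
The paper's central observation, impostor and genuine score distributions shifting toward each other for women, is commonly summarized by the d-prime separation between the two distributions. A small sketch of that standard statistic (not the paper's code):

```python
# d-prime separation between genuine and impostor similarity scores;
# a smaller value means more overlap, i.e. lower recognition accuracy.
import numpy as np

def d_prime(genuine, impostor):
    mu_g, mu_i = genuine.mean(), impostor.mean()
    var_g, var_i = genuine.var(), impostor.var()
    return abs(mu_g - mu_i) / np.sqrt(0.5 * (var_g + var_i))
```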