May 7, 2019

3105 words 15 mins read

Paper Group AWR 52

Incorporating long-range consistency in CNN-based texture generation. GRAM: Graph-based Attention Model for Healthcare Representation Learning. LOFS: Library of Online Streaming Feature Selection. Learning Robust Features using Deep Learning for Automatic Seizure Detection. Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks …

Incorporating long-range consistency in CNN-based texture generation

Title Incorporating long-range consistency in CNN-based texture generation
Authors G. Berger, R. Memisevic
Abstract Gatys et al. (2015) showed that pair-wise products of features in a convolutional network are a very effective representation of image textures. We propose a simple modification to that representation which makes it possible to incorporate long-range structure into image generation, and to render images that satisfy various symmetry constraints. We show how this can greatly improve rendering of regular textures and of images that contain other kinds of symmetric structure. We also present applications to inpainting and season transfer.
Tasks Image Generation, Texture Synthesis
Published 2016-06-03
URL http://arxiv.org/abs/1606.01286v2
PDF http://arxiv.org/pdf/1606.01286v2.pdf
PWC https://paperswithcode.com/paper/incorporating-long-range-consistency-in-cnn
Repo https://github.com/guillaumebrg/texture_generation
Framework none
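
The long-range modification above lends itself to a compact sketch: alongside the standard Gram matrix of Gatys et al., one also matches cross-correlations between feature maps and spatially shifted copies of themselves. Below is a minimal numpy sketch of both statistics; the shift offset `delta` and the normalization are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def gram_matrix(F):
    # F: feature maps of shape (channels, height, width)
    C, H, W = F.shape
    f = F.reshape(C, -1)
    return f @ f.T / (H * W)

def shifted_gram_matrix(F, delta):
    # Cross-correlate features with a horizontally shifted copy of
    # themselves, capturing long-range structure at offset `delta`.
    C, H, W = F.shape
    a = F[:, :, :-delta].reshape(C, -1)
    b = F[:, :, delta:].reshape(C, -1)
    return a @ b.T / (H * (W - delta))

def texture_loss(F_gen, F_ref, delta=8):
    # Match both the standard Gram statistics (Gatys et al.) and the
    # shifted variant that enforces long-range consistency.
    loss = np.sum((gram_matrix(F_gen) - gram_matrix(F_ref)) ** 2)
    loss += np.sum((shifted_gram_matrix(F_gen, delta)
                    - shifted_gram_matrix(F_ref, delta)) ** 2)
    return loss
```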

GRAM: Graph-based Attention Model for Healthcare Representation Learning

Title GRAM: Graph-based Attention Model for Healthcare Representation Learning
Authors Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F. Stewart, Jimeng Sun
Abstract Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain: (1) Data insufficiency: often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results. (2) Interpretation: the representations learned by deep learning methods should align with medical knowledge. To address these challenges, we propose GRAM, a GRaph-based Attention Model that supplements electronic health records (EHR) with the hierarchical information inherent to medical ontologies. Based on the data volume and the ontology structure, GRAM represents a medical concept as a combination of its ancestors in the ontology via an attention mechanism. We compared the predictive performance (i.e. accuracy, data needs, interpretability) of GRAM to various methods, including the recurrent neural network (RNN), in two sequential diagnosis prediction tasks and one heart failure prediction task. Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and a 3% improvement in area under the ROC curve for predicting heart failure using an order of magnitude less training data. Additionally, unlike other methods, the medical concept representations learned by GRAM are well aligned with the medical ontology. Finally, GRAM exhibits intuitive attention behaviors by adaptively generalizing to higher-level concepts when facing data insufficiency at the lower-level concepts.
Tasks Representation Learning
Published 2016-11-21
URL http://arxiv.org/abs/1611.07012v3
PDF http://arxiv.org/pdf/1611.07012v3.pdf
PWC https://paperswithcode.com/paper/gram-graph-based-attention-model-for
Repo https://github.com/mp2893/gram
Framework none
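
The core attention mechanism can be made concrete: a leaf concept's final embedding is a convex combination of the basic embeddings of the concept itself and its ontology ancestors. In the sketch below, `E` is a matrix of basic embeddings and the attention scores come from a simple dot product; the paper uses a small MLP over pairs of embeddings, so the scorer here is a stand-in.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gram_embedding(concept_id, ancestors, E):
    # Represent a leaf medical concept as an attention-weighted
    # combination of its own basic embedding and those of its
    # ontology ancestors.
    ids = [concept_id] + list(ancestors)   # self plus ancestors
    vecs = E[ids]                          # (k, d) basic embeddings
    # Illustrative scorer: dot product with the leaf embedding.
    # (GRAM learns an MLP over concatenated embedding pairs instead.)
    scores = vecs @ E[concept_id]
    alpha = softmax(scores)                # attention over ancestors
    return alpha @ vecs                    # (d,) final representation
```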

LOFS: Library of Online Streaming Feature Selection

Title LOFS: Library of Online Streaming Feature Selection
Authors Kui Yu, Wei Ding, Xindong Wu
Abstract As an emerging research direction, online streaming feature selection deals with dimensions added sequentially to a feature space while the number of data instances is fixed. It provides a new, complementary algorithmic methodology that enriches online feature selection, especially targeting the high dimensionality found in big data analytics. This paper introduces the first comprehensive open-source library for MATLAB that implements state-of-the-art online streaming feature selection algorithms. The library is designed to facilitate the development of new algorithms in this exciting research direction and to make comparisons between new and existing methods straightforward.
Tasks Feature Selection
Published 2016-03-02
URL http://arxiv.org/abs/1603.00531v1
PDF http://arxiv.org/pdf/1603.00531v1.pdf
PWC https://paperswithcode.com/paper/lofs-library-of-online-streaming-feature
Repo https://github.com/kuiy/LOFS
Framework none
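
As a rough illustration of the streaming setting the library targets, the toy loop below processes features one at a time with a fixed set of instances, keeping a feature only if it passes a relevance check and is not redundant with those already kept. This shows the streaming pattern only; the actual algorithms in LOFS (e.g. OSFS, Alpha-investing) rely on principled statistical tests rather than these ad hoc thresholds.

```python
import numpy as np

def stream_select(feature_stream, y, rel_thresh=0.1, red_thresh=0.95):
    # Features arrive one at a time while the instances are fixed.
    # Keep a feature if it correlates with the label and is not
    # redundant with features already selected.
    selected = []                                # list of (index, column)
    for j, x in feature_stream:
        if abs(np.corrcoef(x, y)[0, 1]) < rel_thresh:
            continue                             # irrelevant: discard
        if any(abs(np.corrcoef(x, s)[0, 1]) > red_thresh
               for _, s in selected):
            continue                             # redundant: discard
        selected.append((j, x))
    return [j for j, _ in selected]
```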

Learning Robust Features using Deep Learning for Automatic Seizure Detection

Title Learning Robust Features using Deep Learning for Automatic Seizure Detection
Authors Pierre Thodoroff, Joelle Pineau, Andrew Lim
Abstract We present and evaluate the capacity of a deep neural network to learn robust features from EEG for automatic seizure detection. This is a challenging problem because seizure manifestations on EEG are extremely variable both inter- and intra-patient. By simultaneously capturing spectral, temporal, and spatial information, our recurrent convolutional neural network learns a general, spatially invariant representation of a seizure. The proposed approach significantly exceeds previous results obtained with cross-patient classifiers in terms of both sensitivity and false positive rate. Furthermore, our model proves robust to missing channels and variable electrode montages.
Tasks EEG, Seizure Detection
Published 2016-07-31
URL http://arxiv.org/abs/1608.00220v1
PDF http://arxiv.org/pdf/1608.00220v1.pdf
PWC https://paperswithcode.com/paper/learning-robust-features-using-deep-learning
Repo https://github.com/Sharad24/Epileptic-Seizure-Detection
Framework none
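
A minimal PyTorch sketch of the recurrent convolutional idea: a small CNN encodes each EEG window (a spectral-by-spatial image) and an LSTM aggregates the windows over time. All layer sizes and the input layout are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class RecurrentCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        # Per-window encoder over single-channel EEG "images".
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        # Temporal aggregation across windows.
        self.rnn = nn.LSTM(32 * 16, 64, batch_first=True)
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):                 # x: (batch, time, 1, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(feats)
        return self.head(out[:, -1])      # classify from last time step
```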

Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks

Title Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks
Authors Philipp Gysel
Abstract Convolutional neural networks (CNN) have achieved major breakthroughs in recent years. Their performance in computer vision has matched, and in some areas even surpassed, human capabilities. Deep neural networks can capture complex non-linear features; however, this ability comes at the cost of high computational and memory requirements. State-of-the-art networks require billions of arithmetic operations and millions of parameters. To bring the astonishing power of deep learning to embedded devices such as smartphones, Google Glass, and monitoring cameras, dedicated hardware accelerators can be used to decrease both execution time and power consumption. In applications where a fast connection to the cloud is not guaranteed or where privacy is important, computation needs to be done locally. Many hardware accelerators for deep neural networks have been proposed recently. An important first step of accelerator design is hardware-oriented approximation of deep networks, which enables energy-efficient inference. We present Ristretto, a fast and automated framework for CNN approximation. Ristretto simulates the hardware arithmetic of a custom hardware accelerator. The framework reduces the bit-width of network parameters and of the outputs of resource-intensive layers, which significantly reduces the chip area needed for multiplication units. Alternatively, Ristretto can remove the need for multipliers altogether, resulting in adder-only arithmetic. The tool fine-tunes trimmed networks to achieve high classification accuracy. Since training deep neural networks can be time-consuming, Ristretto uses highly optimized routines that run on the GPU, enabling fast compression of any given network. Given a maximum tolerance of 1%, Ristretto can successfully condense CaffeNet and SqueezeNet to 8 bits. The code for Ristretto is available.
Tasks
Published 2016-05-20
URL http://arxiv.org/abs/1605.06402v1
PDF http://arxiv.org/pdf/1605.06402v1.pdf
PWC https://paperswithcode.com/paper/ristretto-hardware-oriented-approximation-of
Repo https://github.com/DeepScale/SqueezeNet
Framework pytorch
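
The bit-width reduction at the heart of the tool can be illustrated with a dynamic fixed-point rounding sketch: pick enough integer bits to cover the layer's value range, spend the remaining bits on the fraction, then round to the nearest representable step. The bit-allocation rule below is a plausible simplification, not Ristretto's exact procedure.

```python
import numpy as np

def quantize_fixed_point(w, total_bits=8):
    # Dynamic fixed-point quantization: the integer/fractional split
    # adapts to the range of the given weights or activations.
    il = int(np.ceil(np.log2(np.abs(w).max() + 1e-12))) + 1  # integer bits (incl. sign)
    fl = total_bits - il                                     # fractional bits
    step = 2.0 ** (-fl)
    lo, hi = -2.0 ** (il - 1), 2.0 ** (il - 1) - step
    # Round to the nearest step and saturate at the representable range.
    return np.clip(np.round(w / step) * step, lo, hi)
```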

Scalable Bayesian Rule Lists

Title Scalable Bayesian Rule Lists
Authors Hongyu Yang, Cynthia Rudin, Margo Seltzer
Abstract We present an algorithm for building probabilistic rule lists that is two orders of magnitude faster than previous work. Rule list algorithms are competitors for decision tree algorithms. They are associative classifiers, in that they are built from pre-mined association rules. They have a logical structure that is a sequence of IF-THEN rules, identical to a decision list or one-sided decision tree. Instead of using greedy splitting and pruning like decision tree algorithms, we fully optimize over rule lists, striking a practical balance between accuracy, interpretability, and computational speed. The algorithm presented here uses a mixture of theoretical bounds (tight enough to have practical implications as a screening or bounding procedure), computational reuse, and highly tuned language libraries to achieve computational efficiency. Currently, for many practical problems, this method achieves better accuracy and sparsity than decision trees; further, in many cases, the computational time is practical and often less than that of decision trees. The result is a probabilistic classifier (which estimates P(y = 1 | x) for each x) that optimizes the posterior of a Bayesian hierarchical model over rule lists.
Tasks
Published 2016-02-27
URL http://arxiv.org/abs/1602.08610v2
PDF http://arxiv.org/pdf/1602.08610v2.pdf
PWC https://paperswithcode.com/paper/scalable-bayesian-rule-lists
Repo https://github.com/nlarusstone/corels
Framework none
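
The logical structure is easy to make concrete: prediction walks the ordered rules and returns the probability estimate attached to the first antecedent that fires, falling back to a default. The toy rules and probabilities below are invented for illustration; learning the list and its posterior is the hard part the paper addresses.

```python
def predict_rule_list(rule_list, default_p, x):
    # A rule list is an ordered sequence of IF-THEN rules; the first
    # rule whose antecedent holds fires, otherwise the default applies.
    # Each rule is (antecedent_fn, p), where p estimates P(y = 1 | x).
    for antecedent, p in rule_list:
        if antecedent(x):
            return p
    return default_p

# Example: a toy two-rule list over dict-valued records.
rules = [
    (lambda x: x["age"] > 60 and x["bp"] == "high", 0.85),
    (lambda x: x["smoker"], 0.60),
]
print(predict_rule_list(rules, 0.15,
                        {"age": 70, "bp": "high", "smoker": False}))  # 0.85
```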

Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose

Title Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose
Authors Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis
Abstract This paper addresses the challenge of 3D human pose estimation from a single color image. Despite the general success of the end-to-end learning paradigm, top performing approaches employ a two-step solution consisting of a Convolutional Network (ConvNet) for 2D joint localization and a subsequent optimization step to recover 3D pose. In this paper, we identify the representation of 3D pose as a critical issue with current ConvNet approaches and make two important contributions towards validating the value of end-to-end learning for this task. First, we propose a fine discretization of the 3D space around the subject and train a ConvNet to predict per voxel likelihoods for each joint. This creates a natural representation for 3D pose and greatly improves performance over the direct regression of joint coordinates. Second, to further improve upon initial estimates, we employ a coarse-to-fine prediction scheme. This step addresses the large dimensionality increase and enables iterative refinement and repeated processing of the image features. The proposed approach outperforms all state-of-the-art methods on standard benchmarks achieving a relative error reduction greater than 30% on average. Additionally, we investigate using our volumetric representation in a related architecture which is suboptimal compared to our end-to-end approach, but is of practical interest, since it enables training when no image with corresponding 3D groundtruth is available, and allows us to present compelling results for in-the-wild images.
Tasks 3D Human Pose Estimation, Pose Estimation
Published 2016-11-23
URL http://arxiv.org/abs/1611.07828v2
PDF http://arxiv.org/pdf/1611.07828v2.pdf
PWC https://paperswithcode.com/paper/coarse-to-fine-volumetric-prediction-for
Repo https://github.com/strawberryfg/c2f-3dhm-human-caffe
Framework torch
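
Decoding the volumetric output is worth making concrete: given per-joint voxel likelihoods over a discretized cube around the subject, take the most likely voxel per joint and map its index back to metric coordinates. A hard argmax is used below for simplicity, and the grid bounds are assumed inputs.

```python
import numpy as np

def voxels_to_coords(vol, grid_min, grid_max):
    # vol: per-joint voxel likelihoods of shape (J, D, D, D).
    J, D = vol.shape[0], vol.shape[1]
    flat = vol.reshape(J, -1).argmax(axis=1)                   # best voxel per joint
    idx = np.stack(np.unravel_index(flat, (D, D, D)), axis=1)  # (J, 3) indices
    # Map voxel centers back to metric coordinates in the cube.
    return grid_min + (idx + 0.5) / D * (grid_max - grid_min)
```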

Generative Multi-Adversarial Networks

Title Generative Multi-Adversarial Networks
Authors Ishan Durugkar, Ian Gemp, Sridhar Mahadevan
Abstract Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game. In this paper, we propose the \emph{Generative Multi-Adversarial Network} (GMAN), a framework that extends GANs to multiple discriminators. In previous work, the successful training of GANs requires modifying the minimax objective to accelerate training early on. In contrast, GMAN can be reliably trained with the original, untampered objective. We explore a number of design perspectives with the discriminator role ranging from formidable adversary to forgiving teacher. Image generation tasks comparing the proposed framework to standard GANs demonstrate GMAN produces higher quality samples in a fraction of the iterations when measured by a pairwise GAM-type metric.
Tasks Image Generation
Published 2016-11-05
URL http://arxiv.org/abs/1611.01673v3
PDF http://arxiv.org/pdf/1611.01673v3.pdf
PWC https://paperswithcode.com/paper/generative-multi-adversarial-networks
Repo https://github.com/torchgan/model-zoo
Framework pytorch
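
One of the aggregation schemes the paper explores can be sketched in a few lines: the generator's per-discriminator losses are combined with a softmax weighting whose temperature interpolates between averaging over all discriminators and training against only the harshest one. The non-saturating log loss and the temperature `lam` below are illustrative choices.

```python
import torch

def gman_generator_loss(d_outputs, lam=1.0):
    # d_outputs: list of discriminator scores D_i(G(z)) in (0, 1).
    # Per-discriminator generator losses (non-saturating form).
    losses = torch.stack([-torch.log(d + 1e-8).mean() for d in d_outputs])
    # Softmax over losses: lam -> 0 recovers the mean over discriminators;
    # lam -> inf focuses on the single harshest discriminator.
    weights = torch.softmax(lam * losses, dim=0)
    return (weights * losses).sum()
```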

How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs

Title How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs
Authors Rico Sennrich
Abstract Analysing translation quality with regard to specific linguistic phenomena has historically been difficult and time-consuming. Neural machine translation has the attractive property that it can produce scores for arbitrary translations, and we propose a novel method to assess how well NMT systems model specific linguistic phenomena such as agreement over long distances, the production of novel words, and the faithful translation of polarity. The core idea is to measure whether a reference translation is more probable under an NMT model than a contrastive translation which introduces a specific type of error. We present LingEval97, a large-scale data set of 97,000 contrastive translation pairs based on the WMT English->German translation task, with errors created automatically with simple rules. We report results for a number of systems, and find that recently introduced character-level NMT systems perform better at transliteration than models with byte-pair encoding (BPE) segmentation, but worse at morphosyntactic agreement and at translating discontiguous units of meaning.
Tasks Machine Translation, Transliteration
Published 2016-12-14
URL http://arxiv.org/abs/1612.04629v3
PDF http://arxiv.org/pdf/1612.04629v3.pdf
PWC https://paperswithcode.com/paper/how-grammatical-is-character-level-neural
Repo https://github.com/rsennrich/lingeval97
Framework none
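
The evaluation protocol reduces to a probability comparison, sketched below: a system passes a contrastive pair if it scores the reference above the minimally corrupted variant. `model_logprob` is an assumed wrapper around whatever NMT scorer is being evaluated.

```python
def prefers_reference(model_logprob, src, ref, contrastive):
    # A system "passes" a contrastive pair if the reference is more
    # probable under the model than the corrupted translation.
    # model_logprob(src, tgt) returns log P(tgt | src).
    return model_logprob(src, ref) > model_logprob(src, contrastive)

def accuracy(model_logprob, pairs):
    # pairs: list of (source, reference, contrastive) triples.
    hits = sum(prefers_reference(model_logprob, s, r, c)
               for s, r, c in pairs)
    return hits / len(pairs)
```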

Exploring the Political Agenda of the European Parliament Using a Dynamic Topic Modeling Approach

Title Exploring the Political Agenda of the European Parliament Using a Dynamic Topic Modeling Approach
Authors Derek Greene, James P. Cross
Abstract This study analyzes the political agenda of the European Parliament (EP) plenary, how it has evolved over time, and the manner in which Members of the European Parliament (MEPs) have reacted to external and internal stimuli when making plenary speeches. To unveil the plenary agenda and detect latent themes in legislative speeches over time, MEP speech content is analyzed using a new dynamic topic modeling method based on two layers of Non-negative Matrix Factorization (NMF). This method is applied to a new corpus of all English-language legislative speeches in the EP plenary from the period 1999-2014. Our findings suggest that two-layer NMF is a valuable alternative to existing dynamic topic modeling approaches found in the literature, and can unveil niche topics and associated vocabularies not captured by existing methods. Substantively, our findings suggest that the political agenda of the EP evolves significantly over time and reacts to exogenous events such as EU Treaty referenda and the emergence of the Euro-crisis. MEP contributions to the plenary agenda are also found to be shaped by voting behaviour and the committee structure of the Parliament.
Tasks
Published 2016-07-11
URL http://arxiv.org/abs/1607.03055v1
PDF http://arxiv.org/pdf/1607.03055v1.pdf
PWC https://paperswithcode.com/paper/exploring-the-political-agenda-of-the
Repo https://github.com/derekgreene/dynamic-nmf
Framework tf
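
The two-layer scheme can be sketched with scikit-learn: fit NMF separately on each time window's document-term matrix, stack the window-level topic-term vectors, and factorize the stack again so that dynamic topics emerge as groupings of window topics. The component counts and init choice below are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

def dynamic_topics(window_dtms, k_window=10, k_dynamic=8):
    # Layer 1: per-window topics from each document-term matrix.
    window_topics = []
    for dtm in window_dtms:                      # each: (docs, terms)
        m = NMF(n_components=k_window, init="nndsvd")
        m.fit(dtm)
        window_topics.append(m.components_)      # (k_window, terms)
    # Layer 2: factorize the stacked window topics to find dynamic
    # topics that persist across time windows.
    stacked = np.vstack(window_topics)
    m2 = NMF(n_components=k_dynamic, init="nndsvd")
    W = m2.fit_transform(stacked)                # window topic -> dynamic topic
    return m2.components_, W
```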

RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy

Title RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy
Authors Guntis Barzdins, Didzis Gosko
Abstract Two extensions to the AMR smatch scoring script are presented. The first extension combines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the error-pattern frequencies observed in the scored AMR graphs. This first extension yields a 4% gain over the state-of-the-art CAMR baseline parser by adding to it a manually crafted wrapper that fixes the identified CAMR parser errors. The second extension combines a per-sentence smatch with an ensemble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification automatically yields a further 0.4% gain when applied to the outputs of two nondeterministic AMR parsers: a CAMR+wrapper parser and a novel character-level neural translation AMR parser. For the AMR parsing task, character-level neural translation attains a surprising 7% gain over carefully optimized word-level neural translation. Overall, we achieve smatch F1=62% on the SemEval-2016 official scoring set and F1=67% on the LDC2015E86 test set.
Tasks Amr Parsing
Published 2016-04-05
URL http://arxiv.org/abs/1604.01278v1
PDF http://arxiv.org/pdf/1604.01278v1.pdf
PWC https://paperswithcode.com/paper/riga-at-semeval-2016-task-8-impact-of-smatch
Repo https://github.com/didzis/tensorflowAMR
Framework tf
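
The second extension suggests a simple consensus heuristic, sketched below: among the AMR graphs that several nondeterministic parsers produce for one sentence, pick the graph that agrees most with the others under per-sentence smatch. `smatch_f1` is an assumed wrapper around the smatch scorer, and the mean-pairwise-agreement rule is one plausible reading of the ensemble method, not its verified details.

```python
def pick_consensus_graph(candidates, smatch_f1):
    # candidates: two or more AMR graphs for the same sentence.
    # Select the graph with the highest mean pairwise smatch F1
    # against the other candidates.
    def mean_agreement(g):
        others = [c for c in candidates if c is not g]
        return sum(smatch_f1(g, o) for o in others) / len(others)
    return max(candidates, key=mean_agreement)
```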

Safety Verification of Deep Neural Networks

Title Safety Verification of Deep Neural Networks
Authors Xiaowei Huang, Marta Kwiatkowska, Sen Wang, Min Wu
Abstract Deep neural networks have achieved impressive experimental results in image classification, but can surprisingly be unstable with respect to adversarial perturbations, that is, minimal changes to the input image that cause the network to misclassify it. With potential applications including perception modules and end-to-end controllers for self-driving cars, this raises concerns about their safety. We develop a novel automated verification framework for feed-forward multi-layer neural networks based on Satisfiability Modulo Theory (SMT). We focus on safety of image classification decisions with respect to image manipulations, such as scratches or changes to camera angle or lighting conditions that would result in the same class being assigned by a human, and define safety for an individual decision in terms of invariance of the classification within a small neighbourhood of the original image. We enable exhaustive search of the region by employing discretisation, and propagate the analysis layer by layer. Our method works directly with the network code and, in contrast to existing methods, can guarantee that adversarial examples, if they exist, are found for the given region and family of manipulations. If found, adversarial examples can be shown to human testers and/or used to fine-tune the network. We implement the techniques using Z3 and evaluate them on state-of-the-art networks, including regularised and deep learning networks. We also compare against existing techniques to search for adversarial examples and estimate network robustness.
Tasks Adversarial Attack, Adversarial Defense, Image Classification, Self-Driving Cars
Published 2016-10-21
URL http://arxiv.org/abs/1610.06940v3
PDF http://arxiv.org/pdf/1610.06940v3.pdf
PWC https://paperswithcode.com/paper/safety-verification-of-deep-neural-networks
Repo https://github.com/VeriDeep/DLV
Framework none
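
The exhaustive-search guarantee is the key property; the brute-force sketch below conveys it on a tiny region by enumerating every combination of discretized intensity changes on a few pixels and reporting the first manipulation that flips the classification. The paper instead encodes this search symbolically with an SMT solver (Z3) and propagates the analysis layer by layer, so this loop is only a conceptual stand-in.

```python
import itertools
import numpy as np

def find_adversarial(classify, image, pixels, steps=(-8, 0, 8)):
    # classify: function from image to predicted label.
    # pixels: list of (row, col) coordinates allowed to change.
    original = classify(image)
    for deltas in itertools.product(steps, repeat=len(pixels)):
        candidate = image.copy()
        for (r, c), d in zip(pixels, deltas):
            candidate[r, c] = np.clip(candidate[r, c] + d, 0, 255)
        if classify(candidate) != original:
            return candidate          # adversarial example found
    return None  # region verified safe for this family of manipulations
```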

Tutorial on Variational Autoencoders

Title Tutorial on Variational Autoencoders
Authors Carl Doersch
Abstract In just three years, Variational Autoencoders (VAEs) have emerged as one of the most popular approaches to unsupervised learning of complicated distributions. VAEs are appealing because they are built on top of standard function approximators (neural networks), and can be trained with stochastic gradient descent. VAEs have already shown promise in generating many kinds of complicated data, including handwritten digits, faces, house numbers, CIFAR images, physical models of scenes, segmentation, and predicting the future from static images. This tutorial introduces the intuitions behind VAEs, explains the mathematics behind them, and describes some empirical behavior. No prior knowledge of variational Bayesian methods is assumed.
Tasks
Published 2016-06-19
URL http://arxiv.org/abs/1606.05908v2
PDF http://arxiv.org/pdf/1606.05908v2.pdf
PWC https://paperswithcode.com/paper/tutorial-on-variational-autoencoders
Repo https://github.com/cdoersch/vae_tutorial
Framework caffe2
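
The training objective the tutorial derives is compact enough to state as code: the negative ELBO is a reconstruction term plus the closed-form KL divergence between the Gaussian encoder and a standard-normal prior. A Bernoulli (binary cross-entropy) reconstruction term is assumed below, as is common for MNIST-style data.

```python
import torch

def vae_loss(x, x_recon, mu, logvar):
    # Negative ELBO for encoder q(z|x) = N(mu, exp(logvar)) and prior
    # p(z) = N(0, I): reconstruction plus closed-form KL divergence.
    recon = torch.nn.functional.binary_cross_entropy(
        x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```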

Using Centroidal Voronoi Tessellations to Scale Up the Multi-dimensional Archive of Phenotypic Elites Algorithm

Title Using Centroidal Voronoi Tessellations to Scale Up the Multi-dimensional Archive of Phenotypic Elites Algorithm
Authors Vassilis Vassiliades, Konstantinos Chatzilygeroudis, Jean-Baptiste Mouret
Abstract The recently introduced Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) is an evolutionary algorithm capable of producing a large archive of diverse, high-performing solutions in a single run. It works by discretizing a continuous feature space into unique regions according to the desired discretization per dimension. While simple, this algorithm has a main drawback: it cannot scale to high-dimensional feature spaces, since the number of regions increases exponentially with the number of dimensions. In this paper, we address this limitation by introducing a simple extension of MAP-Elites that has a constant, pre-defined number of regions irrespective of the dimensionality of the feature space. Our main insight is that methods from computational geometry can partition a high-dimensional space into well-spread geometric regions. In particular, our algorithm uses a centroidal Voronoi tessellation (CVT) to divide the feature space into a desired number of regions; it then places every generated individual in its closest region, replacing a less fit one if the region is already occupied. We demonstrate the effectiveness of the new “CVT-MAP-Elites” algorithm in high-dimensional feature spaces through comparisons against MAP-Elites in maze navigation and hexapod locomotion tasks.
Tasks
Published 2016-10-18
URL http://arxiv.org/abs/1610.05729v2
PDF http://arxiv.org/pdf/1610.05729v2.pdf
PWC https://paperswithcode.com/paper/using-centroidal-voronoi-tessellations-to
Repo https://github.com/resibots/vassiliades_2017_cvt_map_elites
Framework none
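
Both halves of the algorithm fit in a short sketch: approximate the centroidal Voronoi tessellation by running k-means on uniform random samples of the (normalized) feature space, then insert each individual into its nearest region, replacing a less fit occupant. Using scikit-learn's KMeans here is a convenience assumption; the authors' implementation may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

def cvt_centroids(n_regions, dim, n_samples=25000, seed=0):
    # Approximate a CVT of the unit cube: k-means centroids of many
    # uniform random samples are well-spread region centers.
    rng = np.random.default_rng(seed)
    pts = rng.random((n_samples, dim))
    km = KMeans(n_clusters=n_regions, n_init=1, random_state=seed).fit(pts)
    return km.cluster_centers_

def insert(archive, centroids, descriptor, fitness, solution):
    # Place an individual in its closest region; keep it only if the
    # region is empty or the individual beats the current occupant.
    region = int(np.argmin(np.linalg.norm(centroids - descriptor, axis=1)))
    if region not in archive or fitness > archive[region][0]:
        archive[region] = (fitness, solution)
```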

Multi-View 3D Object Detection Network for Autonomous Driving

Title Multi-View 3D Object Detection Network for Autonomous Driving
Authors Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia
Abstract This paper aims at high-accuracy 3D object detection in the autonomous driving scenario. We propose Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LIDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes. We encode the sparse 3D point cloud with a compact multi-view representation. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird’s-eye-view representation of the 3D point cloud. We design a deep fusion scheme to combine region-wise features from multiple views and enable interactions between intermediate layers of different paths. Experiments on the challenging KITTI benchmark show that our approach outperforms the state-of-the-art by around 25% and 30% AP on the tasks of 3D localization and 3D detection, respectively. In addition, for 2D detection, our approach obtains 10.3% higher AP than the state-of-the-art among LIDAR-based methods on the hard data.
Tasks 3D Object Detection, Autonomous Driving, Object Detection, Object Proposal Generation
Published 2016-11-23
URL http://arxiv.org/abs/1611.07759v3
PDF http://arxiv.org/pdf/1611.07759v3.pdf
PWC https://paperswithcode.com/paper/multi-view-3d-object-detection-network-for
Repo https://github.com/bostondiditeam/MV3D
Framework tf
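
The bird’s-eye-view encoding that feeds the proposal network can be illustrated with one of its channels, a height map: discretize the point cloud onto a ground-plane grid and keep the maximum point height per cell. The ranges and resolution below follow common KITTI conventions and are assumptions, as is reducing the encoding to a single channel (MV3D uses several height slices plus density and intensity).

```python
import numpy as np

def birds_eye_height_map(points, x_range=(0, 70), y_range=(-40, 40), res=0.1):
    # points: (N, 3) LIDAR points as (x forward, y left, z up).
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    xi = ((pts[:, 0] - x_range[0]) / res).astype(int)
    yi = ((pts[:, 1] - y_range[0]) / res).astype(int)
    # Cells with no points stay at 0 (a simplification).
    H = np.zeros((int((x_range[1] - x_range[0]) / res),
                  int((y_range[1] - y_range[0]) / res)))
    np.maximum.at(H, (xi, yi), pts[:, 2])   # max height per grid cell
    return H
```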