October 20, 2019

3273 words 16 mins read

Paper Group AWR 181

A Fully Progressive Approach to Single-Image Super-Resolution. AdGraph: A Graph-Based Approach to Ad and Tracker Blocking. Sliced Recurrent Neural Networks. KS(conf ): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications. ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification. End2You – The Imperial Toolkit f …

A Fully Progressive Approach to Single-Image Super-Resolution

Title A Fully Progressive Approach to Single-Image Super-Resolution
Authors Yifan Wang, Federico Perazzi, Brian McWilliams, Alexander Sorkine-Hornung, Olga Sorkine-Hornung, Christopher Schroers
Abstract Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high quality results for large upsampling factors. To this end, we propose a method (ProSR) that is progressive both in architecture and training: the network upsamples an image in intermediate steps, while the learning process is organized from easy to hard, as is done in curriculum learning. To obtain more photorealistic results, we design a generative adversarial network (GAN), named ProGanSR, that follows the same progressive multi-scale design principle. This not only allows the model to scale well to high upsampling factors (e.g., 8x) but also constitutes a principled multi-scale approach that increases the reconstruction quality for all upsampling factors simultaneously. In particular, ProSR ranks 2nd in terms of SSIM and 4th in terms of PSNR in the NTIRE2018 SISR challenge [34]. Compared to the top-ranking team, our model scores marginally lower, but runs 5 times faster.
Tasks Image Super-Resolution, Super-Resolution
Published 2018-04-09
URL http://arxiv.org/abs/1804.02900v2
PDF http://arxiv.org/pdf/1804.02900v2.pdf
PWC https://paperswithcode.com/paper/a-fully-progressive-approach-to-single-image
Repo https://github.com/fperazzi/proSR
Framework pytorch
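
A minimal PyTorch sketch of the two ideas that can be read off the abstract: upsampling in intermediate x2 stages, and a curriculum that trains the easy (low-factor) stages before the hard ones. The module layout, channel counts, and training schedule below are illustrative assumptions, not the released ProSR architecture (see the repo above for that).

```python
# Toy sketch (not the authors' ProSR code): progressive 2x upsampling stages
# trained from easy (2x) to hard (8x), in the spirit of curriculum learning.
import torch
import torch.nn as nn

class UpStage(nn.Module):
    """One 2x upsampling stage: two convs followed by pixel shuffle."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 4, 3, padding=1),
            nn.PixelShuffle(2),                     # doubles spatial resolution
        )
    def forward(self, x):
        return self.body(x)

class ProgressiveSR(nn.Module):
    """Stacks 2x stages; `levels` selects the active upsampling factor (2**levels)."""
    def __init__(self, ch=64, max_levels=3):        # 3 levels -> up to 8x
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)
        self.stages = nn.ModuleList(UpStage(ch) for _ in range(max_levels))
        self.tails = nn.ModuleList(nn.Conv2d(ch, 3, 3, padding=1) for _ in range(max_levels))
    def forward(self, lr, levels=1):
        feat = self.head(lr)
        for stage in self.stages[:levels]:
            feat = stage(feat)
        return self.tails[levels - 1](feat)

# Curriculum-style schedule: train the 2x path first, then grow to 4x and 8x.
model = ProgressiveSR()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for levels in (1, 2, 3):                            # easy -> hard
    lr_img = torch.rand(4, 3, 32, 32)               # dummy low-resolution batch
    hr_img = torch.rand(4, 3, 32 * 2 ** levels, 32 * 2 ** levels)
    loss = nn.functional.l1_loss(model(lr_img, levels=levels), hr_img)
    opt.zero_grad(); loss.backward(); opt.step()
```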

AdGraph: A Graph-Based Approach to Ad and Tracker Blocking

Title AdGraph: A Graph-Based Approach to Ad and Tracker Blocking
Authors Umar Iqbal, Peter Snyder, Shitong Zhu, Benjamin Livshits, Zhiyun Qian, Zubair Shafiq
Abstract User demand for blocking advertising and tracking online is large and growing. Existing tools, both deployed and described in research, have proven useful, but lack either the completeness or robustness needed for a general solution. Existing detection approaches generally focus on only one aspect of advertising or tracking (e.g. URL patterns, code structure), making existing approaches susceptible to evasion. In this work we present AdGraph, a novel graph-based machine learning approach for detecting advertising and tracking resources on the web. AdGraph differs from existing approaches by building a graph representation of the HTML structure, network requests, and JavaScript behavior of a webpage, and using this unique representation to train a classifier for identifying advertising and tracking resources. Because AdGraph considers many aspects of the context a network request takes place in, it is less susceptible to the single-factor evasion techniques that flummox existing approaches. We evaluate AdGraph on the Alexa top-10K websites, and find that it is highly accurate, able to replicate the labels of human-generated filter lists with 95.33% accuracy, and can even identify many mistakes in filter lists. We implement AdGraph as a modification to Chromium. AdGraph adds only minor overhead to page loading and execution, and is actually faster than stock Chromium on 42% of websites and AdBlock Plus on 78% of websites. Overall, we conclude that AdGraph is both accurate enough and performant enough for online use, breaking comparable or fewer websites than popular filter list based approaches.
Tasks
Published 2018-05-22
URL https://arxiv.org/abs/1805.09155v2
PDF https://arxiv.org/pdf/1805.09155v2.pdf
PWC https://paperswithcode.com/paper/adgraph-a-machine-learning-approach-to
Repo https://github.com/brandoningli/cs-455-project
Framework none
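
The pipeline in the abstract (build a combined HTML/network/JavaScript graph of the page, derive per-request features from it, and train a classifier against filter-list labels) can be caricatured in a few lines. Everything below, including the node naming, the feature set, and the random forest, is an illustrative assumption rather than AdGraph's actual Chromium instrumentation or feature definition.

```python
# Illustrative sketch only: a toy page graph and a classifier over per-request
# structural features, labelled by a filter list (1 = ad/tracker, 0 = benign).
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

g = nx.DiGraph()
g.add_edge("html", "div#content")                          # HTML structure
g.add_edge("div#content", "script:tracker.js")             # script inclusion
g.add_edge("script:tracker.js", "req:ads.example/pixel")   # request made by script
g.add_edge("html", "img:logo.png")
g.add_edge("img:logo.png", "req:cdn.example/logo.png")

def node_features(graph, node):
    """Simple structural/contextual features for one node (assumed, not AdGraph's)."""
    return [
        graph.in_degree(node),
        graph.out_degree(node),
        len(nx.ancestors(graph, node)),   # how much page context led to this node
        int("script" in node),            # was it created by a script?
    ]

requests = ["req:ads.example/pixel", "req:cdn.example/logo.png"]
X = [node_features(g, n) for n in requests]
y = [1, 0]                                # labels taken from a filter list
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
print(clf.predict(X))
```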

Sliced Recurrent Neural Networks

Title Sliced Recurrent Neural Networks
Authors Zeping Yu, Gongshen Liu
Abstract Recurrent neural networks have achieved great success in many NLP tasks. However, they have difficulty in parallelization because of their recurrent structure, so training RNNs takes much time. In this paper, we introduce sliced recurrent neural networks (SRNNs), which can be parallelized by slicing the sequences into many subsequences. SRNNs have the ability to obtain high-level information through multiple layers with few extra parameters. We prove that the standard RNN is a special case of the SRNN when we use linear activation functions. Without changing the recurrent units, SRNNs are 136 times as fast as standard RNNs and could be even faster when we train longer sequences. Experiments on six large-scale sentiment analysis datasets show that SRNNs achieve better performance than standard RNNs.
Tasks Sentiment Analysis
Published 2018-07-06
URL http://arxiv.org/abs/1807.02291v1
PDF http://arxiv.org/pdf/1807.02291v1.pdf
PWC https://paperswithcode.com/paper/sliced-recurrent-neural-networks
Repo https://github.com/zepingyu0512/srnn
Framework tf
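
The slicing trick itself is easy to sketch: cut each sequence into equal subsequences, run a shared low-level RNN over all slices in parallel, then run a higher-level RNN over the per-slice summaries. The PyTorch toy below (the released code is Keras/TensorFlow) uses assumed sizes and a single slicing level.

```python
# Toy sketch of sliced recurrence: slices are processed as a larger batch of
# short sequences, so the low-level GRU can run over them in parallel.
import torch
import torch.nn as nn

class SlicedRNN(nn.Module):
    def __init__(self, input_size=32, hidden=64, num_slices=8):
        super().__init__()
        self.num_slices = num_slices
        self.low = nn.GRU(input_size, hidden, batch_first=True)   # shared across slices
        self.high = nn.GRU(hidden, hidden, batch_first=True)      # over slice summaries
    def forward(self, x):                        # x: (batch, seq_len, input_size)
        b, t, d = x.shape
        s = self.num_slices
        x = x.reshape(b * s, t // s, d)          # each slice becomes its own sequence
        _, h = self.low(x)                       # h: (1, b*s, hidden)
        slices = h.squeeze(0).reshape(b, s, -1)  # one summary vector per slice
        _, h_top = self.high(slices)
        return h_top.squeeze(0)                  # (batch, hidden) sequence representation

out = SlicedRNN()(torch.rand(4, 256, 32))        # 256 steps = 8 slices of length 32
print(out.shape)                                 # torch.Size([4, 64])
```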

KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications

Title KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications
Authors Rémy Sun, Christoph H. Lampert
Abstract Computer vision systems for automatic image categorization have become accurate and reliable enough that they can run continuously for days or even years as components of real-world commercial applications. A major open problem in this context, however, is quality control. Good classification performance can only be expected if systems run under the specific conditions, in particular data distributions, that they were trained for. Surprisingly, none of the currently used deep network architectures has a built-in functionality that could detect if a network operates on data from a distribution that it was not trained for and potentially trigger a warning to the human users. In this work, we describe KS(conf), a procedure for detecting such outside-of-specifications operation. Building on statistical insights, its main step is the application of a classical Kolmogorov-Smirnov test to the distribution of predicted confidence values. We show by extensive experiments using ImageNet, AwA2 and DAVIS data on a variety of ConvNet architectures that KS(conf) reliably detects out-of-specs situations. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with all networks, including pretrained ones, and requires no a priori knowledge about how the data distribution could change.
Tasks Image Categorization
Published 2018-04-11
URL http://arxiv.org/abs/1804.04171v1
PDF http://arxiv.org/pdf/1804.04171v1.pdf
PWC https://paperswithcode.com/paper/ksconf-a-light-weight-test-if-a-convnet
Repo https://github.com/ISTAustria-CVML/KSconf
Framework none
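
The core step, comparing the distribution of predicted confidence values observed at test time against the distribution seen on in-specs validation data with a Kolmogorov-Smirnov test, can be sketched with SciPy. The two-sample variant and the synthetic Beta-distributed confidences below are assumptions for illustration, not the paper's exact protocol.

```python
# Minimal sketch: flag a batch of images as out-of-specs if its top-confidence
# distribution differs significantly from a reference sample of in-specs confidences.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.beta(8, 2, size=5000)        # stand-in for validation confidences
window_ok = rng.beta(8, 2, size=200)         # test batch from the same distribution
window_bad = rng.beta(2, 2, size=200)        # out-of-specs data: confidences drop

def out_of_specs(window, reference, alpha=0.01):
    """True if the confidence distribution shifted (KS p-value below alpha)."""
    stat, p = ks_2samp(window, reference)
    return p < alpha

print(out_of_specs(window_ok, reference))    # expected: False
print(out_of_specs(window_bad, reference))   # expected: True
```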

ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification

Title ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification
Authors Subhrajit Roy, Isabell Kiral-Kornek, Stefan Harrer
Abstract Brain-related disorders such as epilepsy can be diagnosed by analyzing electroencephalograms (EEG). However, manual analysis of EEG data requires highly trained clinicians, and is a procedure that is known to have relatively low inter-rater agreement (IRA). Moreover, the volume of the data and the rate at which new data becomes available make manual interpretation a time-consuming, resource-hungry, and expensive process. In contrast, automated analysis of EEG data offers the potential to improve the quality of patient care by shortening the time to diagnosis and reducing manual error. In this paper, we focus on one of the first steps in interpreting an EEG session - identifying whether the brain activity is abnormal or normal. To solve this task, we propose a novel recurrent neural network (RNN) architecture termed ChronoNet which is inspired by recent developments from the field of image classification and designed to work efficiently with EEG data. ChronoNet is formed by stacking multiple 1D convolution layers followed by deep gated recurrent unit (GRU) layers where each 1D convolution layer uses multiple filters of exponentially varying lengths and the stacked GRU layers are densely connected in a feed-forward manner. We used the recently released TUH Abnormal EEG Corpus dataset for evaluating the performance of ChronoNet. Unlike previous studies using this dataset, ChronoNet directly takes time-series EEG as input and learns meaningful representations of brain activity patterns. ChronoNet outperforms the previously reported best results by 7.79% thereby setting a new benchmark for this dataset. Furthermore, we demonstrate the domain-independent nature of ChronoNet by successfully applying it to classify speech commands.
Tasks EEG, Image Classification, Time Series
Published 2018-01-30
URL http://arxiv.org/abs/1802.00308v2
PDF http://arxiv.org/pdf/1802.00308v2.pdf
PWC https://paperswithcode.com/paper/chrononet-a-deep-recurrent-neural-network-for
Repo https://github.com/Sharad24/Epileptic-Seizure-Detection
Framework none
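
A toy PyTorch rendering of the two architectural ingredients named in the abstract: 1D convolutions with exponentially varying filter lengths whose outputs are concatenated, and stacked GRU layers that are densely connected in a feed-forward manner. Channel counts, kernel sizes, and the input shape are assumptions, not the published configuration.

```python
# Illustrative sketch of ChronoNet-style building blocks, not the paper's exact model.
import torch
import torch.nn as nn

class MultiScaleConv(nn.Module):
    """Parallel 1D convs with exponentially varying kernel sizes, outputs concatenated."""
    def __init__(self, in_ch, out_ch, kernels=(2, 4, 8)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(in_ch, out_ch, k, stride=2, padding=k // 2) for k in kernels
        )
    def forward(self, x):                      # x: (batch, channels, time)
        outs = [c(x) for c in self.convs]
        t = min(o.shape[-1] for o in outs)     # align lengths before concatenating
        return torch.cat([o[..., :t] for o in outs], dim=1)

class DenseGRUStack(nn.Module):
    """Each GRU layer receives the concatenation of all previous layers' outputs."""
    def __init__(self, in_size, hidden=32, layers=3):
        super().__init__()
        self.grus, size = nn.ModuleList(), in_size
        for _ in range(layers):
            self.grus.append(nn.GRU(size, hidden, batch_first=True))
            size += hidden                     # next layer also sees this layer's output
    def forward(self, x):                      # x: (batch, time, features)
        feats = [x]
        for gru in self.grus:
            out, _ = gru(torch.cat(feats, dim=-1))
            feats.append(out)
        return out[:, -1]                      # last time step as the recording summary

eeg = torch.rand(2, 22, 1024)                  # (batch, EEG channels, samples)
h = MultiScaleConv(22, 16)(eeg).transpose(1, 2)   # -> (batch, time, 48 features)
print(DenseGRUStack(48)(h).shape)              # torch.Size([2, 32])
```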

End2You – The Imperial Toolkit for Multimodal Profiling by End-to-End Learning

Title End2You – The Imperial Toolkit for Multimodal Profiling by End-to-End Learning
Authors Panagiotis Tzirakis, Stefanos Zafeiriou, Bjorn W. Schuller
Abstract We introduce End2You – the Imperial College London toolkit for multimodal profiling by end-to-end deep learning. End2You is an open-source toolkit implemented in Python and based on TensorFlow. It provides capabilities to train and evaluate models in an end-to-end manner, i.e., using raw input. It supports input from raw audio, visual, physiological or other types of information, or a combination of those, and the output can be of an arbitrary representation, for either classification or regression tasks. To our knowledge, this is the first toolkit that provides generic end-to-end learning for profiling capabilities in either unimodal or multimodal cases. To test our toolkit, we utilise the RECOLA database as used in the AVEC 2016 challenge. Experimental results indicate that End2You can provide results comparable to state-of-the-art methods despite requiring no expert-crafted feature representations, instead learning these from the data “end to end”.
Tasks
Published 2018-02-04
URL http://arxiv.org/abs/1802.01115v1
PDF http://arxiv.org/pdf/1802.01115v1.pdf
PWC https://paperswithcode.com/paper/end2you-the-imperial-toolkit-for-multimodal
Repo https://github.com/end2you/end2you
Framework tf

ANNETT-O: An Ontology for Describing Artificial Neural Network Evaluation, Topology and Training

Title ANNETT-O: An Ontology for Describing Artificial Neural Network Evaluation, Topology and Training
Authors Iraklis A. Klampanos, Athanasios Davvetas, Antonis Koukourikos, Vangelis Karkaletsis
Abstract Deep learning models, while effective and versatile, are becoming increasingly complex, often including multiple overlapping networks of arbitrary depths, multiple objectives and non-intuitive training methodologies. This makes it increasingly difficult for researchers and practitioners to design, train and understand them. In this paper we present ANNETT-O, a much-needed, generic and computer-actionable vocabulary for researchers and practitioners to describe their deep learning configurations, training procedures and experiments. The proposed ontology focuses on topological, training and evaluation aspects of complex deep neural configurations, while keeping peripheral entities more succinct. Knowledge bases implementing ANNETT-O can support a wide variety of queries, providing relevant insights to users. In addition to a detailed description of the ontology, we demonstrate its suitability to the task via a number of hypothetical use-cases of increasing complexity.
Tasks
Published 2018-04-07
URL http://arxiv.org/abs/1804.02528v2
PDF http://arxiv.org/pdf/1804.02528v2.pdf
PWC https://paperswithcode.com/paper/annett-o-an-ontology-for-describing
Repo https://github.com/davidath/evitrac
Framework tf

Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets

Title Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets
Authors Tiago Cunha, Carlos Soares, André C. P. L. F. de Carvalho
Abstract To select the best algorithm for a new problem is an expensive and difficult task. However, there are automatic solutions to address this problem: using Metalearning, which takes advantage of problem characteristics (i.e. metafeatures), one is able to predict the relative performance of algorithms. In the Collaborative Filtering scope, recent works have proposed diverse metafeatures describing several dimensions of this problem. Despite interesting and effective findings, it is still unknown whether these are the most effective metafeatures. Hence, this work proposes a new set of graph metafeatures, which approach the Collaborative Filtering problem from a Graph Theory perspective. Furthermore, in order to understand whether metafeatures from multiple dimensions are a better fit, we investigate the effects of comprehensive metafeatures. These metafeatures are a selection of the best metafeatures from all existing Collaborative Filtering metafeatures. The impact of the most representative metafeatures is investigated in a controlled experimental setup. Another contribution we present is the use of a Pareto-efficient ranking procedure to create multicriteria metatargets. These new rankings of algorithms, which take into account multiple evaluation measures, allow us to explore the algorithm selection problem in a fairer and more detailed way. According to the experimental results, the graph metafeatures are a good alternative to the metafeatures from related work. However, the results show that the feature selection procedure used to create the comprehensive metafeatures is not effective, since there is no gain in predictive performance. Finally, an extensive metaknowledge analysis was conducted to identify the most influential metafeatures.
Tasks Feature Selection
Published 2018-07-23
URL http://arxiv.org/abs/1807.09097v1
PDF http://arxiv.org/pdf/1807.09097v1.pdf
PWC https://paperswithcode.com/paper/algorithm-selection-for-collaborative
Repo https://github.com/tiagodscunha/cf_metafeatures
Framework none
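
The multicriteria metatargets rest on a Pareto-efficient ranking of algorithms over several evaluation measures. A generic way to compute such a ranking is by successive non-dominated fronts; the measures and scores below are made up for illustration and are not the paper's experimental data.

```python
# Sketch: rank algorithms by Pareto fronts over multiple measures (higher is better).
import numpy as np

def pareto_front(scores):
    """Indices of rows not dominated by any other row."""
    front = []
    for i in range(scores.shape[0]):
        dominated = any(
            np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i])
            for j in range(scores.shape[0]) if j != i
        )
        if not dominated:
            front.append(i)
    return front

def pareto_rank(scores):
    """Rank 0 = first Pareto front, rank 1 = front after removing rank 0, and so on."""
    ranks = np.full(scores.shape[0], -1)
    remaining, rank = list(range(scores.shape[0])), 0
    while remaining:
        front = pareto_front(scores[remaining])
        for idx in front:
            ranks[remaining[idx]] = rank
        remaining = [r for k, r in enumerate(remaining) if k not in front]
        rank += 1
    return ranks

# rows = CF algorithms, columns = two evaluation measures, both to be maximised
scores = np.array([[0.80, 0.30], [0.70, 0.60], [0.60, 0.20], [0.85, 0.25]])
print(pareto_rank(scores))    # [0 0 1 0]: three algorithms share the first front
```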

Fully Statistical Neural Belief Tracking

Title Fully Statistical Neural Belief Tracking
Authors Nikola Mrkšić, Ivan Vulić
Abstract This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST). The existing NBT model uses a hand-crafted belief state update mechanism which involves an expensive manual retuning step whenever the model is deployed to a new dialogue domain. We show that this update mechanism can be learned jointly with the semantic decoding and context modelling parts of the NBT model, eliminating the last rule-based module from this DST framework. We propose two different statistical update mechanisms and show that dialogue dynamics can be modelled with a very small number of additional model parameters. In our DST evaluation over three languages, we show that this model achieves competitive performance and provides a robust framework for building resource-light DST models.
Tasks Dialogue State Tracking
Published 2018-05-29
URL http://arxiv.org/abs/1805.11350v1
PDF http://arxiv.org/pdf/1805.11350v1.pdf
PWC https://paperswithcode.com/paper/fully-statistical-neural-belief-tracking
Repo https://github.com/nmrksic/neural-belief-tracker
Framework tf

3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation

Title 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation
Authors Angela Dai, Matthias Nießner
Abstract We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that either use geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. Rather than simply projecting color data into a volumetric grid and operating solely in 3D – which would result in insufficient detail – we first extract feature maps from associated RGB images. These features are then mapped into the volumetric feature grid of a 3D network using a differentiable backprojection layer. Since our target is 3D scanning scenarios with possibly many frames, we use a multi-view pooling approach in order to handle a varying number of RGB input views. This learned combination of RGB and geometric features with our joint 2D-3D architecture achieves significantly better results than existing baselines. For instance, our final result on the ScanNet 3D segmentation benchmark increases from 52.8% to 75% accuracy compared to existing volumetric architectures.
Tasks Scene Segmentation, Semantic Segmentation
Published 2018-03-28
URL http://arxiv.org/abs/1803.10409v1
PDF http://arxiv.org/pdf/1803.10409v1.pdf
PWC https://paperswithcode.com/paper/3dmv-joint-3d-multi-view-prediction-for-3d
Repo https://github.com/angeladai/3DMV
Framework pytorch
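
Of the components described in the abstract, the multi-view pooling is the easiest to illustrate: once per-view feature volumes have been backprojected into a common voxel grid, an element-wise max over the view dimension handles any number of input views. The sketch below assumes the differentiable backprojection has already happened and uses made-up shapes; it is not the 3DMV implementation.

```python
# Toy sketch of multi-view max pooling over backprojected feature volumes.
import torch

def pool_views(per_view_volumes):
    """per_view_volumes: (num_views, feat, X, Y, Z) -> (feat, X, Y, Z)."""
    return per_view_volumes.max(dim=0).values

three_views = torch.rand(3, 32, 16, 16, 16)   # 3 RGB views projected into the grid
five_views = torch.rand(5, 32, 16, 16, 16)    # the same pooling handles 5 views
print(pool_views(three_views).shape, pool_views(five_views).shape)
```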

Hubless keypoint-based 3D deformable groupwise registration

Title Hubless keypoint-based 3D deformable groupwise registration
Authors Rémi Agier, Sébastien Valette, Razmig Kéchichian, Laurent Fanton, Rémy Prost
Abstract We present a novel algorithm for Fast Registration Of image Groups (FROG), applied to large 3D image groups. Our approach extracts 3D SURF keypoints from images, computes matched pairs of keypoints and registers the group by minimizing pair distances in a hubless way, i.e., without computing any central mean image. Using keypoints significantly reduces the problem complexity compared to voxel-based approaches, and enables us to provide an in-core global optimization, similar to the Bundle Adjustment used for 3D reconstruction. As we aim to register images of different patients, the matching step yields many outliers. We therefore propose a new EM-weighting algorithm which efficiently discards outliers. Global optimization is carried out with a fast gradient descent algorithm. This allows our approach to robustly register large datasets. The result is a set of diffeomorphic half transforms which link the volumes together and can be subsequently exploited for computational anatomy and landmark detection. We show experimental results on whole-body CT scans, with groups of up to 103 volumes. On a benchmark based on anatomical landmarks, our algorithm compares favorably with the star-groupwise voxel-based ANTs and NiftyReg approaches while being much faster. We also discuss the limitations of our approach for lower resolution images such as brain MRI.
Tasks 3D Reconstruction, Semantic Segmentation
Published 2018-09-11
URL https://arxiv.org/abs/1809.03951v3
PDF https://arxiv.org/pdf/1809.03951v3.pdf
PWC https://paperswithcode.com/paper/hubless-keypoint-based-3d-deformable
Repo https://github.com/valette/frog
Framework none
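
The EM-weighting step can be sketched generically as a two-component mixture over match residuals, a Gaussian for inliers and a uniform distribution for outliers, whose responsibilities become per-match weights in the pair-distance loss. The specific mixture, its parameters, and the synthetic residuals below are assumptions; the paper's formulation may differ.

```python
# Generic sketch of EM-style down-weighting of outlier keypoint matches.
import numpy as np

def em_inlier_weights(dist, d_max, iters=20):
    sigma, mix = dist.std() + 1e-6, 0.5          # initial spread and inlier fraction
    for _ in range(iters):
        # E-step: responsibility of the inlier (Gaussian) component per match
        inlier = mix * np.exp(-0.5 * (dist / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        outlier = (1 - mix) / d_max               # uniform outlier density on [0, d_max]
        w = inlier / (inlier + outlier)
        # M-step: refit the inlier spread and mixing proportion
        sigma = np.sqrt(np.sum(w * dist ** 2) / (np.sum(w) + 1e-12)) + 1e-6
        mix = w.mean()
    return w                                      # weights for the pair-distance loss

rng = np.random.default_rng(0)
dist = np.concatenate([np.abs(rng.normal(0, 2, 300)),    # residuals of good matches
                       rng.uniform(0, 100, 60)])         # gross outliers
w = em_inlier_weights(dist, d_max=100.0)
print(w[:300].mean(), w[300:].mean())             # high weights vs. near-zero weights
```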

User Constrained Thumbnail Generation using Adaptive Convolutions

Title User Constrained Thumbnail Generation using Adaptive Convolutions
Authors Perla Sai Raj Kishore, Ayan Kumar Bhunia, Shuvozit Ghose, Partha Pratim Roy
Abstract Thumbnails are widely used all over the world as a preview for digital images. In this work we propose a deep neural framework to generate thumbnails of any size and aspect ratio, even for unseen values during training, with high accuracy and precision. We use Global Context Aggregation (GCA) and a modified Region Proposal Network (RPN) with adaptive convolutions to generate thumbnails in real time. GCA is used to selectively attend and aggregate the global context information from the entire image while the RPN is used to predict candidate bounding boxes for the thumbnail image. Adaptive convolution eliminates the problem of generating thumbnails of various aspect ratios by using filter weights dynamically generated from the aspect ratio information. The experimental results indicate the superior performance of the proposed model over existing state-of-the-art techniques.
Tasks User Constrained Thumbnail Generation
Published 2018-10-31
URL http://arxiv.org/abs/1810.13054v3
PDF http://arxiv.org/pdf/1810.13054v3.pdf
PWC https://paperswithcode.com/paper/user-constrained-thumbnail-generation-using
Repo https://github.com/Aiyoj/Thumbnail-Generation
Framework tf
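
The adaptive-convolution idea, filter weights generated on the fly from the requested aspect ratio so that unseen ratios still yield valid filters, can be sketched in PyTorch as follows. The weight-generating MLP, all sizes, and where such a layer would sit in the network are assumptions for illustration, not the paper's architecture (whose released code is TensorFlow).

```python
# Toy sketch: a small MLP maps the target aspect ratio to convolution weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectAdaptiveConv(nn.Module):
    def __init__(self, in_ch=32, out_ch=32, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.gen = nn.Sequential(                  # aspect ratio -> filter weights
            nn.Linear(1, 128), nn.ReLU(),
            nn.Linear(128, out_ch * in_ch * k * k),
        )
    def forward(self, feat, aspect_ratio):         # feat: (1, in_ch, H, W)
        w = self.gen(torch.tensor([[aspect_ratio]]))
        w = w.view(self.out_ch, self.in_ch, self.k, self.k)
        return F.conv2d(feat, w, padding=self.k // 2)

layer = AspectAdaptiveConv()
feat = torch.rand(1, 32, 56, 56)                   # features from an image encoder
print(layer(feat, 16 / 9).shape, layer(feat, 1.0).shape)   # same layer, two ratios
```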

WarpGAN: Automatic Caricature Generation

Title WarpGAN: Automatic Caricature Generation
Authors Yichun Shi, Debayan Deb, Anil K. Jain
Abstract We propose WarpGAN, a fully automatic network that can generate caricatures given an input face photo. Besides transferring rich texture styles, WarpGAN learns to automatically predict a set of control points that can warp the photo into a caricature, while preserving identity. We introduce an identity-preserving adversarial loss that aids the discriminator in distinguishing between different subjects. Moreover, WarpGAN allows customization of the generated caricatures by controlling the exaggeration extent and the visual styles. Experimental results on a public domain dataset, WebCaricature, show that WarpGAN is capable of generating a diverse set of caricatures while preserving the identities. Five caricature experts suggest that caricatures generated by WarpGAN are visually similar to hand-drawn ones and only prominent facial features are exaggerated.
Tasks Photo-To-Caricature Translation
Published 2018-11-25
URL http://arxiv.org/abs/1811.10100v3
PDF http://arxiv.org/pdf/1811.10100v3.pdf
PWC https://paperswithcode.com/paper/warpgan-automatic-caricature-generation
Repo https://github.com/seasonSH/WarpGAN
Framework tf

Probabilistic Object Detection: Definition and Evaluation

Title Probabilistic Object Detection: Definition and Evaluation
Authors David Hall, Feras Dayoub, John Skinner, Haoyang Zhang, Dimity Miller, Peter Corke, Gustavo Carneiro, Anelia Angelova, Niko Sünderhauf
Abstract We introduce Probabilistic Object Detection, the task of detecting objects in images and accurately quantifying the spatial and semantic uncertainties of the detections. Given the lack of methods capable of assessing such probabilistic object detections, we present the new Probability-based Detection Quality measure (PDQ). Unlike AP-based measures, PDQ has no arbitrary thresholds and rewards spatial and label quality, and foreground/background separation quality while explicitly penalising false positive and false negative detections. We contrast PDQ with existing mAP and moLRP measures by evaluating state-of-the-art detectors and a Bayesian object detector based on Monte Carlo Dropout. Our experiments indicate that conventional object detectors tend to be spatially overconfident and thus perform poorly on the task of probabilistic object detection. Our paper aims to encourage the development of new object detection approaches that provide detections with accurately estimated spatial and label uncertainties and are of critical importance for deployment on robots and embodied AI systems in the real world.
Tasks Object Detection
Published 2018-11-27
URL https://arxiv.org/abs/1811.10800v4
PDF https://arxiv.org/pdf/1811.10800v4.pdf
PWC https://paperswithcode.com/paper/probability-based-detection-quality-pdq-a
Repo https://github.com/jskinn/rvchallenge-evaluation
Framework none

PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors

Title PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors
Authors Haowen Deng, Tolga Birdal, Slobodan Ilic
Abstract We present PPF-FoldNet for unsupervised learning of 3D local descriptors on pure point cloud geometry. Based on the folding-based auto-encoding of well known point pair features, PPF-FoldNet offers many desirable properties: it necessitates neither supervision, nor a sensitive local reference frame, benefits from point-set sparsity, is end-to-end, fast, and can extract powerful rotation invariant descriptors. Thanks to a novel feature visualization, its evolution can be monitored to provide interpretable insights. Our extensive experiments demonstrate that despite having six degree-of-freedom invariance and lack of training labels, our network achieves state of the art results in standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present. PPF-FoldNet achieves 9% higher recall on standard benchmarks, 23% higher recall when rotations are introduced into the same datasets and finally, a margin of >35% is attained when point density is significantly decreased.
Tasks
Published 2018-08-30
URL http://arxiv.org/abs/1808.10322v1
PDF http://arxiv.org/pdf/1808.10322v1.pdf
PWC https://paperswithcode.com/paper/ppf-foldnet-unsupervised-learning-of-rotation
Repo https://github.com/XuyangBai/PPF-FoldNet
Framework pytorch
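
The point pair features that the folding auto-encoder consumes follow the standard four-dimensional definition: the pair distance plus three angles between the two normals and the connecting vector, which is rotation invariant by construction. A minimal NumPy version is below; the encoder/decoder themselves are not reproduced here.

```python
# Sketch of the classic 4D point pair feature (PPF) for two oriented points.
import numpy as np

def angle(a, b):
    """Angle between two vectors, robust to numerical round-off."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    d = p2 - p1
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

p1, n1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
p2, n2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
print(point_pair_feature(p1, n1, p2, n2))   # unchanged if both points rotate together
```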