Paper Group AWR 181
A Fully Progressive Approach to Single-Image Super-Resolution
Title | A Fully Progressive Approach to Single-Image Super-Resolution |
Authors | Yifan Wang, Federico Perazzi, Brian McWilliams, Alexander Sorkine-Hornung, Olga Sorkine-Hornung, Christopher Schroers |
Abstract | Recent deep learning approaches to single image super-resolution have achieved impressive results in terms of traditional error measures and perceptual quality. However, in each case it remains challenging to achieve high-quality results for large upsampling factors. To this end, we propose a method (ProSR) that is progressive both in architecture and training: the network upsamples an image in intermediate steps, while the learning process is organized from easy to hard, as is done in curriculum learning. To obtain more photorealistic results, we design a generative adversarial network (GAN), named ProGanSR, that follows the same progressive multi-scale design principle. This not only allows the model to scale well to high upsampling factors (e.g., 8x) but also constitutes a principled multi-scale approach that increases the reconstruction quality for all upsampling factors simultaneously. In particular, ProSR ranks 2nd in terms of SSIM and 4th in terms of PSNR in the NTIRE2018 SISR challenge [34]. Compared to the top-ranking team, our model scores marginally lower, but runs 5 times faster. |
Tasks | Image Super-Resolution, Super-Resolution |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.02900v2 |
http://arxiv.org/pdf/1804.02900v2.pdf | |
PWC | https://paperswithcode.com/paper/a-fully-progressive-approach-to-single-image |
Repo | https://github.com/fperazzi/proSR |
Framework | pytorch |
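The progressive design is easiest to see as a stack of 2x upsampling stages with intermediate outputs. Below is a minimal, hedged PyTorch sketch of that idea; the module names, channel counts and pixel-shuffle stages are illustrative assumptions, not the authors' exact ProSR/ProGanSR architecture.

```python
# Minimal sketch of progressive upsampling: each stage doubles the resolution,
# so an 8x model is the composition of three 2x stages trained easy-to-hard.
import torch
import torch.nn as nn

class UpsampleStage(nn.Module):
    """One 2x stage: two conv layers followed by pixel-shuffle upsampling."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels * 4, 3, padding=1),
            nn.PixelShuffle(2),                                  # doubles H and W
        )

    def forward(self, x):
        return self.body(x)

class ProgressiveSR(nn.Module):
    """Stacks 2x stages; intermediate outputs support easy-to-hard training."""
    def __init__(self, num_stages=3, channels=64):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.stages = nn.ModuleList([UpsampleStage(channels) for _ in range(num_stages)])
        self.to_rgb = nn.ModuleList([nn.Conv2d(channels, 3, 3, padding=1)
                                     for _ in range(num_stages)])

    def forward(self, x, num_active_stages=None):
        feats = self.head(x)
        outputs = []
        for stage, to_rgb in zip(self.stages, self.to_rgb):
            feats = stage(feats)
            outputs.append(to_rgb(feats))                        # 2x, 4x, 8x predictions
            if num_active_stages and len(outputs) == num_active_stages:
                break                                            # curriculum: train shallow first
        return outputs

lr = torch.randn(1, 3, 32, 32)
print([o.shape for o in ProgressiveSR()(lr)])                    # 64, 128 and 256 pixel outputs
```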
AdGraph: A Graph-Based Approach to Ad and Tracker Blocking
Title | AdGraph: A Graph-Based Approach to Ad and Tracker Blocking |
Authors | Umar Iqbal, Peter Snyder, Shitong Zhu, Benjamin Livshits, Zhiyun Qian, Zubair Shafiq |
Abstract | User demand for blocking advertising and tracking online is large and growing. Existing tools, both deployed and described in research, have proven useful, but lack either the completeness or robustness needed for a general solution. Existing detection approaches generally focus on only one aspect of advertising or tracking (e.g. URL patterns, code structure), making existing approaches susceptible to evasion. In this work we present AdGraph, a novel graph-based machine learning approach for detecting advertising and tracking resources on the web. AdGraph differs from existing approaches by building a graph representation of the HTML structure, network requests, and JavaScript behavior of a webpage, and using this unique representation to train a classifier for identifying advertising and tracking resources. Because AdGraph considers many aspects of the context a network request takes place in, it is less susceptible to the single-factor evasion techniques that flummox existing approaches. We evaluate AdGraph on the Alexa top-10K websites, and find that it is highly accurate, able to replicate the labels of human-generated filter lists with 95.33% accuracy, and can even identify many mistakes in filter lists. We implement AdGraph as a modification to Chromium. AdGraph adds only minor overhead to page loading and execution, and is actually faster than stock Chromium on 42% of websites and AdBlock Plus on 78% of websites. Overall, we conclude that AdGraph is both accurate enough and performant enough for online use, breaking comparable or fewer websites than popular filter list based approaches. |
Tasks | |
Published | 2018-05-22 |
URL | https://arxiv.org/abs/1805.09155v2 |
https://arxiv.org/pdf/1805.09155v2.pdf | |
PWC | https://paperswithcode.com/paper/adgraph-a-machine-learning-approach-to |
Repo | https://github.com/brandoningli/cs-455-project |
Framework | none |
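To make the graph-plus-classifier pipeline concrete, here is a small hedged sketch: a toy page graph in networkx, a handful of structural features per request node, and a random-forest classifier trained against filter-list labels. The node/edge schema and feature set are illustrative assumptions, not the paper's Chromium instrumentation or feature list.

```python
# Hedged sketch of the AdGraph idea: represent a page load as a graph of DOM
# nodes, scripts and network requests, derive structural features for each
# request node, and train a classifier against filter-list labels.
import networkx as nx
from sklearn.ensemble import RandomForestClassifier

def request_features(graph, node):
    parents = list(graph.predecessors(node))
    return [
        graph.in_degree(node),                          # how many elements/scripts triggered it
        graph.out_degree(node),                         # what it goes on to create
        sum(1 for p in parents if graph.nodes[p].get("type") == "script"),
        nx.shortest_path_length(graph, "document", node) if nx.has_path(graph, "document", node) else -1,
        len(graph.nodes[node].get("url", "")),
    ]

# Toy page graph: document -> script -> image request
g = nx.DiGraph()
g.add_node("document", type="html")
g.add_node("tracker.js", type="script", url="https://cdn.example/tracker.js")
g.add_node("pixel.png", type="request", url="https://ads.example/pixel.png")
g.add_edges_from([("document", "tracker.js"), ("tracker.js", "pixel.png")])

X = [request_features(g, "pixel.png")]
y = [1]                                 # 1 = labelled ad/tracker by a filter list (toy label)
clf = RandomForestClassifier(n_estimators=10).fit(X, y)
print(clf.predict(X))
```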
Sliced Recurrent Neural Networks
Title | Sliced Recurrent Neural Networks |
Authors | Zeping Yu, Gongshen Liu |
Abstract | Recurrent neural networks have achieved great success in many NLP tasks. However, they are difficult to parallelize because of their recurrent structure, so training RNNs takes a long time. In this paper, we introduce sliced recurrent neural networks (SRNNs), which can be parallelized by slicing the sequences into many subsequences. SRNNs have the ability to obtain high-level information through multiple layers with few extra parameters. We prove that the standard RNN is a special case of the SRNN when we use linear activation functions. Without changing the recurrent units, SRNNs are 136 times as fast as standard RNNs and could be even faster when we train longer sequences. Experiments on six large-scale sentiment analysis datasets show that SRNNs achieve better performance than standard RNNs. |
Tasks | Sentiment Analysis |
Published | 2018-07-06 |
URL | http://arxiv.org/abs/1807.02291v1 |
http://arxiv.org/pdf/1807.02291v1.pdf | |
PWC | https://paperswithcode.com/paper/sliced-recurrent-neural-networks |
Repo | https://github.com/zepingyu0512/srnn |
Framework | tf |
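A minimal Keras sketch of the slicing idea follows: the sequence is reshaped into equal subsequences, a low-level GRU summarises each slice (the slices can be processed in parallel), and a higher-level GRU runs over the short sequence of slice summaries. Layer sizes and the two-level depth are illustrative assumptions rather than the exact SRNN configuration.

```python
# Sliced-RNN idea with Keras: parallel low-level GRUs over slices, then a
# high-level GRU over the slice summaries.
import tensorflow as tf
from tensorflow.keras import layers, Model

seq_len, num_slices, embed_dim = 256, 8, 64
slice_len = seq_len // num_slices

inputs = layers.Input(shape=(seq_len, embed_dim))
# (batch, 256, 64) -> (batch, 8, 32, 64): eight independent subsequences
sliced = layers.Reshape((num_slices, slice_len, embed_dim))(inputs)
# Low-level GRU applied to every slice; slices share weights and can run in parallel
slice_summary = layers.TimeDistributed(layers.GRU(64))(sliced)    # (batch, 8, 64)
# High-level GRU over the much shorter sequence of slice summaries
doc_vector = layers.GRU(64)(slice_summary)                         # (batch, 64)
output = layers.Dense(1, activation="sigmoid")(doc_vector)

model = Model(inputs, output)
model.summary()
```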
KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications
Title | KS(conf): A Light-Weight Test if a ConvNet Operates Outside of Its Specifications |
Authors | Rémy Sun, Christoph H. Lampert |
Abstract | Computer vision systems for automatic image categorization have become accurate and reliable enough that they can run continuously for days or even years as components of real-world commercial applications. A major open problem in this context, however, is quality control. Good classification performance can only be expected if systems run under the specific conditions, in particular data distributions, that they were trained for. Surprisingly, none of the currently used deep network architectures has a built-in functionality that could detect if a network operates on data from a distribution that it was not trained for and potentially trigger a warning to the human users. In this work, we describe KS(conf), a procedure for detecting such outside-of-specifications operation. Building on statistical insights, its main step is the application of a classical Kolmogorov-Smirnov test to the distribution of predicted confidence values. We show by extensive experiments using ImageNet, AwA2 and DAVIS data on a variety of ConvNet architectures that KS(conf) reliably detects out-of-specs situations. It furthermore has a number of properties that make it an excellent candidate for practical deployment: it is easy to implement, adds almost no overhead to the system, works with all networks, including pretrained ones, and requires no a priori knowledge about how the data distribution could change. |
Tasks | Image Categorization |
Published | 2018-04-11 |
URL | http://arxiv.org/abs/1804.04171v1 |
http://arxiv.org/pdf/1804.04171v1.pdf | |
PWC | https://paperswithcode.com/paper/ksconf-a-light-weight-test-if-a-convnet |
Repo | https://github.com/ISTAustria-CVML/KSconf |
Framework | none |
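Since the core of KS(conf) is a Kolmogorov-Smirnov test on predicted confidence values, a small SciPy sketch conveys the mechanism; the threshold calibration and batching protocol from the paper are omitted here, and the beta-distributed confidences are toy data.

```python
# Hedged sketch of the KS(conf) test: compare the distribution of top-class
# confidences seen at deployment time against a reference sample recorded
# under in-spec conditions, using a two-sample Kolmogorov-Smirnov test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Reference confidences collected while the network ran within spec (toy data)
reference_conf = rng.beta(8, 2, size=5000)           # typically high confidences

def ks_conf_alarm(batch_confidences, reference, alpha=0.01):
    """Return (alarm, statistic, p-value); alarm=True means likely out-of-spec."""
    stat, p_value = ks_2samp(batch_confidences, reference)
    return p_value < alpha, stat, p_value

in_spec = rng.beta(8, 2, size=500)                    # same distribution as the reference
out_of_spec = rng.beta(2, 2, size=500)                # flatter confidence distribution

print(ks_conf_alarm(in_spec, reference_conf))         # usually no alarm
print(ks_conf_alarm(out_of_spec, reference_conf))     # alarm
```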
ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification
Title | ChronoNet: A Deep Recurrent Neural Network for Abnormal EEG Identification |
Authors | Subhrajit Roy, Isabell Kiral-Kornek, Stefan Harrer |
Abstract | Brain-related disorders such as epilepsy can be diagnosed by analyzing electroencephalograms (EEG). However, manual analysis of EEG data requires highly trained clinicians, and is a procedure that is known to have relatively low inter-rater agreement (IRA). Moreover, the volume of the data and the rate at which new data becomes available make manual interpretation a time-consuming, resource-hungry, and expensive process. In contrast, automated analysis of EEG data offers the potential to improve the quality of patient care by shortening the time to diagnosis and reducing manual error. In this paper, we focus on one of the first steps in interpreting an EEG session - identifying whether the brain activity is abnormal or normal. To solve this task, we propose a novel recurrent neural network (RNN) architecture termed ChronoNet which is inspired by recent developments from the field of image classification and designed to work efficiently with EEG data. ChronoNet is formed by stacking multiple 1D convolution layers followed by deep gated recurrent unit (GRU) layers where each 1D convolution layer uses multiple filters of exponentially varying lengths and the stacked GRU layers are densely connected in a feed-forward manner. We used the recently released TUH Abnormal EEG Corpus dataset for evaluating the performance of ChronoNet. Unlike previous studies using this dataset, ChronoNet directly takes time-series EEG as input and learns meaningful representations of brain activity patterns. ChronoNet outperforms the previously reported best results by 7.79% thereby setting a new benchmark for this dataset. Furthermore, we demonstrate the domain-independent nature of ChronoNet by successfully applying it to classify speech commands. |
Tasks | EEG, Image Classification, Time Series |
Published | 2018-01-30 |
URL | http://arxiv.org/abs/1802.00308v2 |
http://arxiv.org/pdf/1802.00308v2.pdf | |
PWC | https://paperswithcode.com/paper/chrononet-a-deep-recurrent-neural-network-for |
Repo | https://github.com/Sharad24/Epileptic-Seizure-Detection |
Framework | none |
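The two architectural ingredients described above, parallel 1D convolutions with exponentially growing filter lengths and densely connected GRU layers, can be sketched in a few lines of PyTorch. Channel counts, kernel sizes and the EEG input shape below are illustrative assumptions, not the published ChronoNet configuration.

```python
# Simplified sketch of ChronoNet-style building blocks.
import torch
import torch.nn as nn

class MultiScaleConv1d(nn.Module):
    """Parallel 1D convolutions with exponentially growing kernel lengths."""
    def __init__(self, in_ch, out_ch_per_branch=32, kernel_sizes=(2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch_per_branch, k, stride=2, padding=k // 2)
            for k in kernel_sizes
        ])

    def forward(self, x):                       # x: (batch, channels, time)
        outs = [b(x) for b in self.branches]
        t = min(o.shape[-1] for o in outs)      # align lengths before concatenation
        return torch.cat([o[..., :t] for o in outs], dim=1)

class DenseGRUStack(nn.Module):
    """Stacked GRUs whose inputs are dense concatenations of all earlier outputs."""
    def __init__(self, in_dim, hidden=32, num_layers=3):
        super().__init__()
        self.grus = nn.ModuleList()
        dim = in_dim
        for _ in range(num_layers):
            self.grus.append(nn.GRU(dim, hidden, batch_first=True))
            dim += hidden                       # next GRU sees all previous outputs

    def forward(self, x):                       # x: (batch, time, features)
        feats = [x]
        for gru in self.grus:
            out, _ = gru(torch.cat(feats, dim=-1))
            feats.append(out)
        return feats[-1][:, -1]                 # last hidden state of the last GRU

conv = MultiScaleConv1d(in_ch=22)               # e.g. 22 EEG channels
grus = DenseGRUStack(in_dim=96)
x = torch.randn(4, 22, 1024)                    # (batch, channels, samples)
h = grus(conv(x).transpose(1, 2))
print(h.shape)                                  # (4, 32)
```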
End2You – The Imperial Toolkit for Multimodal Profiling by End-to-End Learning
Title | End2You – The Imperial Toolkit for Multimodal Profiling by End-to-End Learning |
Authors | Panagiotis Tzirakis, Stefanos Zafeiriou, Bjorn W. Schuller |
Abstract | We introduce End2You – the Imperial College London toolkit for multimodal profiling by end-to-end deep learning. End2You is an open-source toolkit implemented in Python and based on TensorFlow. It provides capabilities to train and evaluate models in an end-to-end manner, i.e., using raw input. It supports input from raw audio, visual, physiological or other types of information, or combinations of those, and the output can be of an arbitrary representation, for either classification or regression tasks. To our knowledge, this is the first toolkit that provides generic end-to-end learning for profiling capabilities in either unimodal or multimodal cases. To test our toolkit, we utilise the RECOLA database as used in the AVEC 2016 challenge. Experimental results indicate that End2You can provide results comparable to state-of-the-art methods without requiring expert-designed feature representations, instead learning these from the data “end to end”. |
Tasks | |
Published | 2018-02-04 |
URL | http://arxiv.org/abs/1802.01115v1 |
http://arxiv.org/pdf/1802.01115v1.pdf | |
PWC | https://paperswithcode.com/paper/end2you-the-imperial-toolkit-for-multimodal |
Repo | https://github.com/end2you/end2you |
Framework | tf |
ANNETT-O: An Ontology for Describing Artificial Neural Network Evaluation, Topology and Training
Title | ANNETT-O: An Ontology for Describing Artificial Neural Network Evaluation, Topology and Training |
Authors | Iraklis A. Klampanos, Athanasios Davvetas, Antonis Koukourikos, Vangelis Karkaletsis |
Abstract | Deep learning models, while effective and versatile, are becoming increasingly complex, often including multiple overlapping networks of arbitrary depths, multiple objectives and non-intuitive training methodologies. This makes it increasingly difficult for researchers and practitioners to design, train and understand them. In this paper we present ANNETT-O, a much-needed, generic and computer-actionable vocabulary for researchers and practitioners to describe their deep learning configurations, training procedures and experiments. The proposed ontology focuses on topological, training and evaluation aspects of complex deep neural configurations, while keeping peripheral entities more succinct. Knowledge bases implementing ANNETT-O can support a wide variety of queries, providing relevant insights to users. In addition to a detailed description of the ontology, we demonstrate its suitability to the task via a number of hypothetical use-cases of increasing complexity. |
Tasks | |
Published | 2018-04-07 |
URL | http://arxiv.org/abs/1804.02528v2 |
http://arxiv.org/pdf/1804.02528v2.pdf | |
PWC | https://paperswithcode.com/paper/annett-o-an-ontology-for-describing |
Repo | https://github.com/davidath/evitrac |
Framework | tf |
Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets
Title | Algorithm Selection for Collaborative Filtering: the influence of graph metafeatures and multicriteria metatargets |
Authors | Tiago Cunha, Carlos Soares, André C. P. L. F. de Carvalho |
Abstract | To select the best algorithm for a new problem is an expensive and difficult task. However, there are automatic solutions to address this problem: using Metalearning, which takes advantage of problem characteristics (i.e. metafeatures), one is able to predict the relative performance of algorithms. In the Collaborative Filtering scope, recent works have proposed diverse metafeatures describing several dimensions of this problem. Despite interesting and effective findings, it is still unknown whether these are the most effective metafeatures. Hence, this work proposes a new set of graph metafeatures, which approach the Collaborative Filtering problem from a Graph Theory perspective. Furthermore, in order to understand whether metafeatures from multiple dimensions are a better fit, we investigate the effects of comprehensive metafeatures. These metafeatures are a selection of the best metafeatures from all existing Collaborative Filtering metafeatures. The impact of the most representative metafeatures is investigated in a controlled experimental setup. Another contribution we present is the use of a Pareto-Efficient ranking procedure to create multicriteria metatargets. These new rankings of algorithms, which take into account multiple evaluation measures, allow the algorithm selection problem to be explored in a fairer and more detailed way. According to the experimental results, the graph metafeatures are a good alternative to related work metafeatures. However, the results have shown that the feature selection procedure used to create the comprehensive metafeatures is not effective, since there is no gain in predictive performance. Finally, an extensive metaknowledge analysis was conducted to identify the most influential metafeatures. |
Tasks | Feature Selection |
Published | 2018-07-23 |
URL | http://arxiv.org/abs/1807.09097v1 |
http://arxiv.org/pdf/1807.09097v1.pdf | |
PWC | https://paperswithcode.com/paper/algorithm-selection-for-collaborative |
Repo | https://github.com/tiagodscunha/cf_metafeatures |
Framework | none |
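The multicriteria metatarget idea can be illustrated with a small Pareto-efficient ranking routine: algorithms are ranked by peeling off successive non-dominated fronts over several (higher-is-better) evaluation measures. The exact ranking rule and the measures used in the paper may differ; the scores below are toy data.

```python
# Hedged sketch of a Pareto-efficient ranking for multicriteria metatargets.
import numpy as np

def pareto_front(scores):
    """Indices of rows not dominated by any other row (higher is better)."""
    n = scores.shape[0]
    dominated = np.zeros(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(scores[j] >= scores[i]) and np.any(scores[j] > scores[i]):
                dominated[i] = True
                break
    return np.where(~dominated)[0]

def pareto_ranking(scores):
    """Rank 1 = first Pareto front, rank 2 = next front after removal, etc."""
    ranks = np.zeros(scores.shape[0], dtype=int)
    remaining = np.arange(scores.shape[0])
    rank = 1
    while remaining.size:
        front = remaining[pareto_front(scores[remaining])]
        ranks[front] = rank
        remaining = np.setdiff1d(remaining, front)
        rank += 1
    return ranks

# Toy metatarget: 4 CF algorithms scored on (NDCG, 1 - normalised RMSE)
scores = np.array([[0.90, 0.70],
                   [0.80, 0.85],
                   [0.75, 0.60],
                   [0.60, 0.55]])
print(pareto_ranking(scores))   # [1 1 2 3]: two non-dominated algorithms tie for rank 1
```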
Fully Statistical Neural Belief Tracking
Title | Fully Statistical Neural Belief Tracking |
Authors | Nikola Mrkšić, Ivan Vulić |
Abstract | This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST). The existing NBT model uses a hand-crafted belief state update mechanism which involves an expensive manual retuning step whenever the model is deployed to a new dialogue domain. We show that this update mechanism can be learned jointly with the semantic decoding and context modelling parts of the NBT model, eliminating the last rule-based module from this DST framework. We propose two different statistical update mechanisms and show that dialogue dynamics can be modelled with a very small number of additional model parameters. In our DST evaluation over three languages, we show that this model achieves competitive performance and provides a robust framework for building resource-light DST models. |
Tasks | Dialogue State Tracking |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11350v1 |
http://arxiv.org/pdf/1805.11350v1.pdf | |
PWC | https://paperswithcode.com/paper/fully-statistical-neural-belief-tracking |
Repo | https://github.com/nmrksic/neural-belief-tracker |
Framework | tf |
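As a rough illustration of replacing a rule-based belief update with a learned one, the sketch below parameterises the update with just two scalar mixing weights and a softmax, in the spirit of the abstract's claim that dialogue dynamics can be modelled with very few additional parameters. It is a simplified stand-in, not the paper's actual update mechanisms.

```python
# Hedged sketch of a learned (statistical) belief update: the new belief over
# slot values is a softmax over a learned combination of the previous belief
# and the current turn's evidence.
import torch
import torch.nn as nn

class LearnedBeliefUpdate(nn.Module):
    """Softmax over a learned mix of the previous belief and turn evidence."""
    def __init__(self):
        super().__init__()
        # One scalar mixing weight per term keeps the parameter count tiny.
        self.w_prev = nn.Parameter(torch.tensor(1.0))
        self.w_turn = nn.Parameter(torch.tensor(1.0))

    def forward(self, prev_belief, turn_scores):
        # prev_belief: (batch, num_values) probabilities from the previous turn
        # turn_scores: (batch, num_values) evidence extracted from this turn
        logits = self.w_prev * prev_belief + self.w_turn * turn_scores
        return torch.softmax(logits, dim=-1)

update = LearnedBeliefUpdate()
prev = torch.tensor([[0.7, 0.1, 0.1, 0.1]])
turn = torch.tensor([[0.0, 3.0, 0.0, 0.0]])   # strong evidence for the second value
print(update(prev, turn))                      # belief mass shifts to the second value
```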
3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation
Title | 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation |
Authors | Angela Dai, Matthias Nießner |
Abstract | We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that either use geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. Rather than simply projecting color data into a volumetric grid and operating solely in 3D – which would result in insufficient detail – we first extract feature maps from associated RGB images. These features are then mapped into the volumetric feature grid of a 3D network using a differentiable backprojection layer. Since our target is 3D scanning scenarios with possibly many frames, we use a multi-view pooling approach in order to handle a varying number of RGB input views. This learned combination of RGB and geometric features with our joint 2D-3D architecture achieves significantly better results than existing baselines. For instance, our final result on the ScanNet 3D segmentation benchmark increases from 52.8% to 75% accuracy compared to existing volumetric architectures. |
Tasks | Scene Segmentation, Semantic Segmentation |
Published | 2018-03-28 |
URL | http://arxiv.org/abs/1803.10409v1 |
http://arxiv.org/pdf/1803.10409v1.pdf | |
PWC | https://paperswithcode.com/paper/3dmv-joint-3d-multi-view-prediction-for-3d |
Repo | https://github.com/angeladai/3DMV |
Framework | pytorch |
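The differentiable backprojection layer can be approximated by a simple gather-and-scatter: for each voxel visible in a view, copy the feature vector of the pixel it projects to into the voxel's cell of the 3D grid. The sketch below assumes projection correspondences are already computed and ignores multi-view pooling; it only shows that gradients reach the 2D features.

```python
# Hedged sketch of a differentiable 2D-to-3D backprojection step (simplified
# relative to the paper's layer).
import torch

def backproject(feat_2d, voxel_idx, pixel_idx, grid_shape):
    """
    feat_2d   : (C, H, W) feature map from the image network
    voxel_idx : (N, 3) integer voxel coordinates visible in this view
    pixel_idx : (N, 2) integer (row, col) pixel each voxel projects to
    grid_shape: (X, Y, Z) size of the volumetric grid
    """
    C = feat_2d.shape[0]
    grid = feat_2d.new_zeros((C, *grid_shape))
    gathered = feat_2d[:, pixel_idx[:, 0], pixel_idx[:, 1]]                # (C, N)
    grid[:, voxel_idx[:, 0], voxel_idx[:, 1], voxel_idx[:, 2]] = gathered  # scatter into 3D
    return grid

feat_2d = torch.randn(32, 60, 80, requires_grad=True)
voxel_idx = torch.randint(0, 31, (100, 3))
pixel_idx = torch.stack([torch.randint(0, 60, (100,)),
                         torch.randint(0, 80, (100,))], dim=1)
grid = backproject(feat_2d, voxel_idx, pixel_idx, (31, 31, 62))
grid.sum().backward()
print(feat_2d.grad.abs().sum() > 0)            # gradients reach the 2D features
```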
Hubless keypoint-based 3D deformable groupwise registration
Title | Hubless keypoint-based 3D deformable groupwise registration |
Authors | Rémi Agier, Sébastien Valette, Razmig Kéchichian, Laurent Fanton, Rémy Prost |
Abstract | We present a novel algorithm for Fast Registration Of image Groups (FROG), applied to large 3D image groups. Our approach extracts 3D SURF keypoints from images, computes matched pairs of keypoints and registers the group by minimizing pair distances in a hubless way, i.e. without computing any central mean image. Using keypoints significantly reduces the problem complexity compared to voxel-based approaches, and enables us to provide an in-core global optimization, similar to the Bundle Adjustment for 3D reconstruction. As we aim to register images of different patients, the matching step yields many outliers. We therefore propose a new EM-weighting algorithm which efficiently discards these outliers. Global optimization is carried out with a fast gradient descent algorithm. This allows our approach to robustly register large datasets. The result is a set of diffeomorphic half transforms which link the volumes together and can be subsequently exploited for computational anatomy and landmark detection. We show experimental results on whole-body CT scans, with groups of up to 103 volumes. On a benchmark based on anatomical landmarks, our algorithm compares favorably with the star-groupwise voxel-based ANTs and NiftyReg approaches while being much faster. We also discuss the limitations of our approach for lower-resolution images such as brain MRI. |
Tasks | 3D Reconstruction, Semantic Segmentation |
Published | 2018-09-11 |
URL | https://arxiv.org/abs/1809.03951v3 |
https://arxiv.org/pdf/1809.03951v3.pdf | |
PWC | https://paperswithcode.com/paper/hubless-keypoint-based-3d-deformable |
Repo | https://github.com/valette/frog |
Framework | none |
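The EM-weighting step can be illustrated on a toy problem: model match residuals as a mixture of a Gaussian inlier component and a broad uniform outlier component, then alternate between computing inlier responsibilities and re-estimating the transform from the weighted matches. Here the transform is just a 3D translation, which is a simplification of the paper's deformable groupwise setting.

```python
# Hedged sketch of EM-style outlier weighting for keypoint matches.
import numpy as np

rng = np.random.default_rng(0)
true_shift = np.array([5.0, -3.0, 2.0])
src = rng.normal(size=(200, 3)) * 50
dst = src + true_shift + rng.normal(scale=0.5, size=src.shape)
dst[:40] += rng.uniform(-80, 80, size=(40, 3))            # 20% gross outliers

shift = np.zeros(3)
sigma2, outlier_density = 25.0, 1e-6
for _ in range(20):
    residuals = dst - (src + shift)
    d2 = np.sum(residuals ** 2, axis=1)
    # E-step: responsibility of the Gaussian inlier component for each match
    inlier_lik = np.exp(-d2 / (2 * sigma2)) / (2 * np.pi * sigma2) ** 1.5
    w = inlier_lik / (inlier_lik + outlier_density)
    # M-step: weighted re-estimation of the shift and the inlier variance
    shift = np.sum(w[:, None] * (dst - src), axis=0) / np.sum(w)
    sigma2 = np.sum(w * d2) / (3 * np.sum(w)) + 1e-9

print(np.round(shift, 2))        # close to [ 5. -3.  2.] despite the outliers
```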
User Constrained Thumbnail Generation using Adaptive Convolutions
Title | User Constrained Thumbnail Generation using Adaptive Convolutions |
Authors | Perla Sai Raj Kishore, Ayan Kumar Bhunia, Shuvozit Ghose, Partha Pratim Roy |
Abstract | Thumbnails are widely used all over the world as a preview for digital images. In this work we propose a deep neural framework to generate thumbnails of any size and aspect ratio, even for unseen values during training, with high accuracy and precision. We use Global Context Aggregation (GCA) and a modified Region Proposal Network (RPN) with adaptive convolutions to generate thumbnails in real time. GCA is used to selectively attend and aggregate the global context information from the entire image while the RPN is used to predict candidate bounding boxes for the thumbnail image. Adaptive convolution eliminates the problem of generating thumbnails of various aspect ratios by using filter weights dynamically generated from the aspect ratio information. The experimental results indicate the superior performance of the proposed model over existing state-of-the-art techniques. |
Tasks | User Constrained Thumbnail Generation |
Published | 2018-10-31 |
URL | http://arxiv.org/abs/1810.13054v3 |
http://arxiv.org/pdf/1810.13054v3.pdf | |
PWC | https://paperswithcode.com/paper/user-constrained-thumbnail-generation-using |
Repo | https://github.com/Aiyoj/Thumbnail-Generation |
Framework | tf |
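The adaptive-convolution idea, with filter weights generated on the fly from the requested aspect ratio, can be sketched as a small MLP feeding F.conv2d. The layer below is illustrative only; the real model conditions a modified RPN this way and also uses global context aggregation, which is not shown.

```python
# Hedged sketch of an adaptive convolution conditioned on the target aspect ratio.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AspectAdaptiveConv(nn.Module):
    def __init__(self, in_ch=64, out_ch=64, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        # Small MLP that maps the scalar aspect ratio to a full set of filter weights
        self.weight_gen = nn.Sequential(
            nn.Linear(1, 128), nn.ReLU(),
            nn.Linear(128, out_ch * in_ch * k * k),
        )

    def forward(self, feats, aspect_ratio):
        # feats: (1, in_ch, H, W); aspect_ratio: tensor like tensor([1.78]) = width/height
        w = self.weight_gen(aspect_ratio.view(1, 1))
        w = w.view(self.out_ch, self.in_ch, self.k, self.k)
        return F.conv2d(feats, w, padding=self.k // 2)

layer = AspectAdaptiveConv()
feats = torch.randn(1, 64, 32, 32)
out_wide = layer(feats, torch.tensor([16 / 9]))
out_square = layer(feats, torch.tensor([1.0]))
print(out_wide.shape, torch.allclose(out_wide, out_square))   # same shape, different filters
```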
WarpGAN: Automatic Caricature Generation
Title | WarpGAN: Automatic Caricature Generation |
Authors | Yichun Shi, Debayan Deb, Anil K. Jain |
Abstract | We propose, WarpGAN, a fully automatic network that can generate caricatures given an input face photo. Besides transferring rich texture styles, WarpGAN learns to automatically predict a set of control points that can warp the photo into a caricature, while preserving identity. We introduce an identity-preserving adversarial loss that aids the discriminator to distinguish between different subjects. Moreover, WarpGAN allows customization of the generated caricatures by controlling the exaggeration extent and the visual styles. Experimental results on a public domain dataset, WebCaricature, show that WarpGAN is capable of generating a diverse set of caricatures while preserving the identities. Five caricature experts suggest that caricatures generated by WarpGAN are visually similar to hand-drawn ones and only prominent facial features are exaggerated. |
Tasks | Photo-To-Caricature Translation |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10100v3 |
http://arxiv.org/pdf/1811.10100v3.pdf | |
PWC | https://paperswithcode.com/paper/warpgan-automatic-caricature-generation |
Repo | https://github.com/seasonSH/WarpGAN |
Framework | tf |
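One hedged reading of the identity-preserving adversarial loss is a discriminator that classifies images into subject identities plus an extra "generated" class, with the generator rewarded when its caricature is classified as the source subject. The sketch below shows only this loss structure; the warping module, style transfer and discriminator architecture are not represented.

```python
# Hedged, simplified sketch of an identity-preserving adversarial loss.
import torch
import torch.nn.functional as F

num_identities = 10
fake_class = num_identities                       # index M is the "generated" class

def discriminator_loss(d_logits_real, real_ids, d_logits_fake):
    # Real caricatures must be classified as their subject, fakes as "generated".
    loss_real = F.cross_entropy(d_logits_real, real_ids)
    fake_targets = torch.full((d_logits_fake.size(0),), fake_class, dtype=torch.long)
    loss_fake = F.cross_entropy(d_logits_fake, fake_targets)
    return loss_real + loss_fake

def generator_loss(d_logits_fake, source_ids):
    # The generated caricature should look like a real caricature of its subject.
    return F.cross_entropy(d_logits_fake, source_ids)

batch = 4
d_real = torch.randn(batch, num_identities + 1)
d_fake = torch.randn(batch, num_identities + 1, requires_grad=True)
ids = torch.randint(0, num_identities, (batch,))
print(discriminator_loss(d_real, ids, d_fake).item(), generator_loss(d_fake, ids).item())
```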
Probabilistic Object Detection: Definition and Evaluation
Title | Probabilistic Object Detection: Definition and Evaluation |
Authors | David Hall, Feras Dayoub, John Skinner, Haoyang Zhang, Dimity Miller, Peter Corke, Gustavo Carneiro, Anelia Angelova, Niko Sünderhauf |
Abstract | We introduce Probabilistic Object Detection, the task of detecting objects in images and accurately quantifying the spatial and semantic uncertainties of the detections. Given the lack of methods capable of assessing such probabilistic object detections, we present the new Probability-based Detection Quality measure (PDQ). Unlike AP-based measures, PDQ has no arbitrary thresholds and rewards spatial quality, label quality, and foreground/background separation quality while explicitly penalising false positive and false negative detections. We contrast PDQ with existing mAP and moLRP measures by evaluating state-of-the-art detectors and a Bayesian object detector based on Monte Carlo Dropout. Our experiments indicate that conventional object detectors tend to be spatially overconfident and thus perform poorly on the task of probabilistic object detection. Our paper aims to encourage the development of new object detection approaches that provide detections with accurately estimated spatial and label uncertainties, which are of critical importance for deployment on robots and embodied AI systems in the real world. |
Tasks | Object Detection |
Published | 2018-11-27 |
URL | https://arxiv.org/abs/1811.10800v4 |
https://arxiv.org/pdf/1811.10800v4.pdf | |
PWC | https://paperswithcode.com/paper/probability-based-detection-quality-pdq-a |
Repo | https://github.com/jskinn/rvchallenge-evaluation |
Framework | none |
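A heavily simplified sketch of the pairwise quality behind PDQ: combine a label quality (probability assigned to the true class) with a spatial quality derived from per-pixel foreground/background probabilities, via a geometric mean. The official PDQ definition and evaluation code differ in detail; the masks and probabilities below are toy data.

```python
# Heavily simplified sketch of a PDQ-style pairwise detection quality.
import numpy as np

def pairwise_quality(prob_map, gt_mask, class_probs, true_class, eps=1e-14):
    fg = np.clip(prob_map[gt_mask], eps, 1.0)             # should be high inside the object
    bg = np.clip(1.0 - prob_map[~gt_mask], eps, 1.0)      # should also be high outside it
    spatial_quality = np.exp((np.log(fg).sum() + np.log(bg).sum()) / gt_mask.sum())
    label_quality = class_probs[true_class]
    return np.sqrt(spatial_quality * label_quality)        # geometric mean

gt_mask = np.zeros((20, 20), dtype=bool)
gt_mask[5:15, 5:15] = True

confident = np.where(gt_mask, 0.95, 0.02)                  # sharp, well-placed detection
overconfident = np.zeros((20, 20))                         # hard 0/1 mask, spatially shifted
overconfident[4:16, 8:18] = 1.0

probs = np.array([0.1, 0.85, 0.05])
print(pairwise_quality(confident, gt_mask, probs, true_class=1))
print(pairwise_quality(overconfident, gt_mask, probs, true_class=1))   # misplaced certainty is punished
```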
PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors
Title | PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors |
Authors | Haowen Deng, Tolga Birdal, Slobodan Ilic |
Abstract | We present PPF-FoldNet for unsupervised learning of 3D local descriptors on pure point cloud geometry. Based on the folding-based auto-encoding of well-known point pair features, PPF-FoldNet offers many desirable properties: it requires neither supervision nor a sensitive local reference frame, benefits from point-set sparsity, is end-to-end, fast, and can extract powerful rotation invariant descriptors. Thanks to a novel feature visualization, its evolution can be monitored to provide interpretable insights. Our extensive experiments demonstrate that despite having six degree-of-freedom invariance and lack of training labels, our network achieves state-of-the-art results on standard benchmark datasets and outperforms its competitors when rotations and varying point densities are present. PPF-FoldNet achieves 9% higher recall on standard benchmarks, 23% higher recall when rotations are introduced into the same datasets, and a margin of >35% when point density is significantly decreased. |
Tasks | |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10322v1 |
http://arxiv.org/pdf/1808.10322v1.pdf | |
PWC | https://paperswithcode.com/paper/ppf-foldnet-unsupervised-learning-of-rotation |
Repo | https://github.com/XuyangBai/PPF-FoldNet |
Framework | pytorch |
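The point pair features that PPF-FoldNet auto-encodes are the classical 4-dimensional features: the pair distance plus the three angles between the normals and the difference vector, all of which are rotation invariant. A small NumPy sketch of that feature (the folding-based autoencoder itself is not shown):

```python
# Classical 4D point pair feature for two oriented points, with a rotation
# invariance check.
import numpy as np
from scipy.spatial.transform import Rotation

def angle(v1, v2, eps=1e-12):
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    d = p2 - p1
    return np.array([np.linalg.norm(d),       # ||d||
                     angle(n1, d),            # angle(n1, d)
                     angle(n2, d),            # angle(n2, d)
                     angle(n1, n2)])          # angle(n1, n2)

rng = np.random.default_rng(0)
p1, p2 = rng.normal(size=3), rng.normal(size=3)
n1, n2 = rng.normal(size=3), rng.normal(size=3)

# Rotation invariance check: apply the same random rotation to everything
R = Rotation.random(random_state=0).as_matrix()
f_before = point_pair_feature(p1, n1, p2, n2)
f_after = point_pair_feature(R @ p1, R @ n1, R @ p2, R @ n2)
print(np.allclose(f_before, f_after))         # True
```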