February 1, 2020

3134 words 15 mins read

Paper Group AWR 121

Deep End-to-End Alignment and Refinement for Time-of-Flight RGB-D Module. Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction. Safe Policy Improvement with Soft Baseline Bootstrapping. Dense Scale Network for Crowd Counting. Learning Dynamic Author Representations w …

Deep End-to-End Alignment and Refinement for Time-of-Flight RGB-D Module

Title Deep End-to-End Alignment and Refinement for Time-of-Flight RGB-D Module
Authors Di Qiu, Jiahao Pang, Wenxiu Sun, Chengxi Yang
Abstract Recently, it has become increasingly popular to equip mobile RGB cameras with Time-of-Flight (ToF) sensors for active depth sensing. However, for off-the-shelf ToF sensors, one must tackle two problems in order to obtain high-quality depth with respect to the RGB camera, namely 1) online calibration and alignment; and 2) complicated error correction for ToF depth sensing. In this work, we propose a deep learning framework for joint alignment and refinement. First, a cross-modal optical flow between the RGB image and the ToF amplitude image is estimated for alignment. The aligned depth is then refined via an improved kernel predicting network that performs kernel normalization and applies the bias prior to the dynamic convolution. To enrich our data for end-to-end training, we have also synthesized a dataset using tools from computer graphics. Experimental results demonstrate the effectiveness of our approach, achieving state-of-the-art performance for ToF refinement.
Tasks Calibration, Optical Flow Estimation
Published 2019-09-17
URL https://arxiv.org/abs/1909.07623v1
PDF https://arxiv.org/pdf/1909.07623v1.pdf
PWC https://paperswithcode.com/paper/deep-end-to-end-alignment-and-refinement-for
Repo https://github.com/sylqiu/tof_rgbd_processing
Framework tf
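
The refinement step described in the abstract can be pictured with a minimal kernel-predicting layer: per-pixel kernels are normalized (softmax over the kernel entries) and applied to the aligned depth via a dynamic, per-pixel convolution. The sketch below is only an assumption about the general mechanism (including how the bias is applied), not the authors' exact implementation; the network that predicts the kernels and bias is omitted.

```python
import torch
import torch.nn.functional as F

def dynamic_depth_refinement(depth, kernels, bias, k=3):
    """Apply a per-pixel (dynamic) convolution to a depth map.

    depth:   (B, 1, H, W) aligned ToF depth
    kernels: (B, k*k, H, W) per-pixel kernel logits predicted by a network
    bias:    (B, 1, H, W) per-pixel bias predicted by a network
    """
    b, _, h, w = depth.shape
    # Kernel normalization: softmax over the k*k kernel entries of each pixel.
    weights = F.softmax(kernels, dim=1)
    # One possible reading of "applies the bias prior to the dynamic convolution":
    # add the predicted bias to the depth before filtering.
    patches = F.unfold(depth + bias, kernel_size=k, padding=k // 2)  # (B, k*k, H*W)
    patches = patches.view(b, k * k, h, w)
    refined = (weights * patches).sum(dim=1, keepdim=True)           # (B, 1, H, W)
    return refined

# toy usage with random tensors standing in for network outputs
depth = torch.rand(2, 1, 64, 64)
kernels = torch.randn(2, 9, 64, 64)
bias = torch.zeros(2, 1, 64, 64)
print(dynamic_depth_refinement(depth, kernels, bias).shape)  # torch.Size([2, 1, 64, 64])
```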

Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction

Title Validating the Validation: Reanalyzing a large-scale comparison of Deep Learning and Machine Learning models for bioactivity prediction
Authors Matthew C. Robinson, Robert C. Glen, Alpha A. Lee
Abstract Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of area under the receiver operating characteristic curve as a metric in virtual screening, and instead suggest that area under the precision-recall curve should be used in conjunction with the receiver operating characteristic. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross validation.
Tasks Drug Discovery
Published 2019-05-28
URL https://arxiv.org/abs/1905.11681v2
PDF https://arxiv.org/pdf/1905.11681v2.pdf
PWC https://paperswithcode.com/paper/validating-the-validation-reanalyzing-a-large
Repo https://github.com/mc-robinson/mayr_reanalysis_supp_info
Framework none
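
The abstract's point about ROC AUC versus precision-recall AUC is easy to reproduce numerically: under the heavy class imbalance typical of virtual screening, the two metrics can rank models in opposite order, because average precision rewards early recognition of actives while ROC AUC does not. The snippet below is a generic illustration with synthetic scores, not the authors' reanalysis.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
n_active, n_decoy = 100, 10_000          # ~1% actives, as in a screening library

y = np.concatenate([np.ones(n_active), np.zeros(n_decoy)])

# Model A ranks a small fraction of actives very highly; Model B ranks actives
# moderately well overall but never at the very top of the list.
scores_a = np.concatenate([np.where(rng.random(n_active) < 0.2, 5.0, 0.0)
                           + rng.normal(0, 1, n_active),
                           rng.normal(0, 1, n_decoy)])
scores_b = np.concatenate([rng.normal(1.0, 1, n_active),
                           rng.normal(0, 1, n_decoy)])

for name, s in [("A (early recognition)", scores_a), ("B (uniform shift)", scores_b)]:
    print(name,
          "ROC AUC = %.3f" % roc_auc_score(y, s),
          "PR AUC = %.3f" % average_precision_score(y, s))
```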

Safe Policy Improvement with Soft Baseline Bootstrapping

Title Safe Policy Improvement with Soft Baseline Bootstrapping
Authors Kimia Nadjahi, Romain Laroche, Rémi Tachet des Combes
Abstract Batch Reinforcement Learning (Batch RL) consists of training a policy using trajectories collected with another policy, called the behavioural policy. Safe policy improvement (SPI) provides guarantees with high probability that the trained policy performs better than the behavioural policy, also called the baseline in this setting. Previous work shows that the SPI objective improves mean performance as compared to using the basic RL objective, which boils down to solving the MDP with maximum likelihood. Here, we build on that work and, more specifically, improve the SPI with Baseline Bootstrapping algorithm (SPIBB) by allowing the policy search over a wider set of policies. Instead of binarily classifying the state-action pairs into two sets (the uncertain and the safe-to-train-on ones), we adopt a softer strategy that controls the error in the value estimates by constraining the policy change according to the local model uncertainty. The method can take more risks on uncertain actions while remaining provably safe, and is therefore less conservative than the state-of-the-art methods. We propose two algorithms (one optimal and one approximate) to solve this constrained optimization problem and empirically show a significant improvement over existing SPI algorithms both on finite MDPs and on infinite MDPs with neural network function approximation.
Tasks
Published 2019-07-11
URL https://arxiv.org/abs/1907.05079v1
PDF https://arxiv.org/pdf/1907.05079v1.pdf
PWC https://paperswithcode.com/paper/safe-policy-improvement-with-soft-baseline
Repo https://github.com/RomainLaroche/SPIBB
Framework none
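
The soft constraint described in the abstract can be illustrated on a single state: probability mass is moved from low-value to high-value actions, but each move is priced by per-action error bounds (e.g. derived from visit counts) and capped by a budget. This greedy, one-state sketch with made-up inputs is only meant to convey the flavour of the approximate algorithm; it is not the paper's exact procedure.

```python
import numpy as np

def soft_constrained_improvement(pi_b, q, err, eps):
    """Greedy one-state improvement under sum_a |pi - pi_b| * err <= eps.

    pi_b: baseline action probabilities, q: estimated action values,
    err:  per-action error bounds (e.g. ~ 1/sqrt(N(s, a))), eps: budget.
    """
    pi = pi_b.astype(float).copy()
    budget = eps
    receiver = int(np.argmax(q))              # best action receives mass
    for a in np.argsort(q):                   # worst-valued actions give mass first
        if a == receiver or budget <= 0:
            continue
        # moving m from a to receiver costs m * (err[a] + err[receiver])
        cost_per_unit = err[a] + err[receiver]
        m = min(pi[a], budget / cost_per_unit)
        pi[a] -= m
        pi[receiver] += m
        budget -= m * cost_per_unit
    return pi

pi_b = np.array([0.25, 0.25, 0.25, 0.25])
q = np.array([1.0, 0.2, 0.5, 0.1])
err = np.array([0.1, 0.5, 0.2, 1.0])          # action 3 is poorly estimated
print(soft_constrained_improvement(pi_b, q, err, eps=0.3))
```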

Dense Scale Network for Crowd Counting

Title Dense Scale Network for Crowd Counting
Authors Feng Dai, Hao Liu, Yike Ma, Juan Cao, Qiang Zhao, Yongdong Zhang
Abstract Crowd counting has been widely studied by the computer vision community in recent years. Due to large scale variation, it remains a challenging task. Previous methods adopt either a multi-column CNN or a single-column CNN with multiple branches to deal with this problem. However, restricted by the number of columns or branches, these methods can only capture a few different scales and have limited capability. In this paper, we propose a simple but effective network called DSNet for crowd counting, which can be easily trained in an end-to-end fashion. The key component of our network is the dense dilated convolution block, in which each dilation layer is densely connected with the others to preserve information from continuously varied scales. The dilation rates in the dilation layers are carefully selected to prevent the block from producing gridding artifacts. To further enlarge the range of scales covered by the network, we cascade three blocks and link them with dense residual connections. We also introduce a novel multi-scale density level consistency loss for performance improvement. To evaluate our method, we compare it with state-of-the-art algorithms on four crowd counting datasets (ShanghaiTech, UCF-QNRF, UCF_CC_50 and UCSD). Experimental results demonstrate that DSNet achieves the best performance and makes significant improvements on all four datasets (30% on UCF-QNRF and UCF_CC_50, and 20% on the others).
Tasks Crowd Counting
Published 2019-06-24
URL https://arxiv.org/abs/1906.09707v1
PDF https://arxiv.org/pdf/1906.09707v1.pdf
PWC https://paperswithcode.com/paper/dense-scale-network-for-crowd-counting
Repo https://github.com/rongliangzi/Dense-Scale-Network-for-Crowd-Counting
Framework pytorch
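
The dense dilated convolution block described above can be sketched in a few lines of PyTorch: each dilated layer receives the concatenation of the block input and all previous layer outputs. The dilation rates 1, 2, 3 and the channel layout here are assumptions based on the abstract's remark about avoiding gridding artifacts; the real DSNet block may differ.

```python
import torch
import torch.nn as nn

class DenseDilatedBlock(nn.Module):
    """Dilated 3x3 conv layers (rates 1, 2, 3) with dense connections between them."""

    def __init__(self, channels):
        super().__init__()
        self.layers = nn.ModuleList()
        for i, rate in enumerate([1, 2, 3]):
            in_ch = channels * (i + 1)            # concatenation of input + previous outputs
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, channels, 3, padding=rate, dilation=rate),
                nn.ReLU(inplace=True)))

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return features[-1]

block = DenseDilatedBlock(64)
print(block(torch.randn(1, 64, 32, 32)).shape)    # torch.Size([1, 64, 32, 32])
```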

Learning Dynamic Author Representations with Temporal Language Models

Title Learning Dynamic Author Representations with Temporal Language Models
Authors Edouard Delasalles, Sylvain Lamprier, Ludovic Denoyer
Abstract Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variables that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors’ identities, as well as publication dates, are seldom considered. We propose a neural model, based on recurrent language modeling, which aims at capturing language diffusion tendencies in author communities through time. By conditioning language models with author and temporal vector states, we are able to leverage the latent dependencies between the text contexts. This allows us to beat several temporal and non-temporal language baselines on two real-world corpora, and to learn meaningful author representations that vary through time.
Tasks Information Retrieval, Language Modelling
Published 2019-09-11
URL https://arxiv.org/abs/1909.04985v1
PDF https://arxiv.org/pdf/1909.04985v1.pdf
PWC https://paperswithcode.com/paper/learning-dynamic-author-representations-with
Repo https://github.com/edouardelasalles/dar
Framework pytorch
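
A minimal way to picture the conditioning idea in the abstract: feed author and time embeddings to a recurrent language model at every step, so the predicted word distribution depends on who is writing and when. The sketch below simply concatenates the two vectors with the word embedding; the paper's dynamic author states evolve over time and are richer than this, so treat the class and its hyperparameters as illustrative assumptions.

```python
import torch
import torch.nn as nn

class AuthorTemporalLM(nn.Module):
    """Word-level LSTM language model conditioned on author and time-step embeddings."""

    def __init__(self, vocab, n_authors, n_steps, d=128):
        super().__init__()
        self.word = nn.Embedding(vocab, d)
        self.author = nn.Embedding(n_authors, d)
        self.time = nn.Embedding(n_steps, d)
        self.lstm = nn.LSTM(3 * d, d, batch_first=True)
        self.out = nn.Linear(d, vocab)

    def forward(self, tokens, author_id, time_id):
        b, t = tokens.shape
        ctx = torch.cat([self.author(author_id), self.time(time_id)], dim=-1)      # (B, 2d)
        x = torch.cat([self.word(tokens), ctx.unsqueeze(1).expand(b, t, -1)], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)                         # next-word logits, (B, T, vocab)

lm = AuthorTemporalLM(vocab=1000, n_authors=50, n_steps=10)
logits = lm(torch.randint(0, 1000, (4, 12)),
            torch.tensor([1, 2, 3, 4]), torch.tensor([0, 3, 5, 9]))
print(logits.shape)                                # torch.Size([4, 12, 1000])
```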

BULNER: BUg Localization with word embeddings and NEtwork Regularization

Title BULNER: BUg Localization with word embeddings and NEtwork Regularization
Authors Jacson Rodrigues Barbosa, Ricardo Marcondes Marcacini, Ricardo Britto, Frederico Soares, Solange Rezende, Auri M. R. Vincenzi, Marcio E. Delamaro
Abstract Bug localization (BL) from bug reports is a strategic activity in the software maintenance process. Because BL is a costly and tedious activity, information retrieval-based and machine learning-based BL techniques could aid software engineers. We propose a method for BUg Localization with word embeddings and Network Regularization (BULNER). The preliminary results suggest that BULNER has better performance than two state-of-the-art methods.
Tasks Information Retrieval, Word Embeddings
Published 2019-08-26
URL https://arxiv.org/abs/1908.09876v1
PDF https://arxiv.org/pdf/1908.09876v1.pdf
PWC https://paperswithcode.com/paper/bulner-bug-localization-with-word-embeddings
Repo https://github.com/jacsonrbinf/bulner
Framework none

Multilingual and Multi-Aspect Hate Speech Analysis

Title Multilingual and Multi-Aspect Hate Speech Analysis
Authors Nedjma Ousidhoum, Zizheng Lin, Hongming Zhang, Yangqiu Song, Dit-Yan Yeung
Abstract Current research on hate speech analysis is typically oriented towards monolingual and single classification tasks. In this paper, we present a new multilingual multi-aspect hate speech analysis dataset and use it to test the current state-of-the-art multilingual multitask learning approaches. We evaluate our dataset in various classification settings, then we discuss how to leverage our annotations in order to improve hate speech detection and classification in general.
Tasks Hate Speech Detection
Published 2019-08-29
URL https://arxiv.org/abs/1908.11049v1
PDF https://arxiv.org/pdf/1908.11049v1.pdf
PWC https://paperswithcode.com/paper/multilingual-and-multi-aspect-hate-speech
Repo https://github.com/HKUST-KnowComp/MLMA_hate_speech
Framework none

Weakly Supervised Person Re-ID: Differentiable Graphical Learning and A New Benchmark

Title Weakly Supervised Person Re-ID: Differentiable Graphical Learning and A New Benchmark
Authors Guangrun Wang, Guangcong Wang, Xujie Zhang, Jianhuang Lai, Liang Lin
Abstract Person re-identification (Re-ID) benefits greatly from the accurate annotations of existing datasets (e.g., CUHK03 and Market-1501), which are quite expensive because each image in these datasets has to be assigned a proper label. In this work, we ease the annotation of Re-ID by replacing the accurate annotation with inaccurate annotation, i.e., we group the images into bags in terms of time and assign a bag-level label to each bag. This greatly reduces the annotation effort and leads to the creation of a large-scale Re-ID benchmark called SYSU-30k. The new benchmark contains 30k categories of persons, which is about 20 times larger than CUHK03 (1.3k categories) and Market-1501 (1.5k categories), and 30 times larger than ImageNet (1k categories). It sums up to 29,606,918 images. Learning a Re-ID model with bag-level annotation is called the weakly supervised Re-ID problem. To solve this problem, we introduce a differentiable graphical model to capture the dependencies from all images in a bag and generate a reliable pseudo label for each person image. The pseudo label is further used to supervise the learning of the Re-ID model. When compared with fully supervised Re-ID models, our method achieves state-of-the-art performance on SYSU-30k and other datasets. The code, dataset, and pretrained model will be available at https://github.com/wanggrun/SYSU-30k.
Tasks Person Re-Identification
Published 2019-04-08
URL https://arxiv.org/abs/1904.03845v2
PDF https://arxiv.org/pdf/1904.03845v2.pdf
PWC https://paperswithcode.com/paper/weakly-supervised-person-re-identification
Repo https://github.com/wanggrun/SYSU-30k
Framework none
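
A minimal way to picture bag-level supervision: per-image identity scores are masked so that each image can only take a pseudo label from its bag's label set, and the most confident allowed identity supervises training. The paper's differentiable graphical model additionally captures dependencies among the images in a bag; this sketch deliberately ignores those dependencies and is not the authors' method.

```python
import torch

def bag_constrained_pseudo_labels(logits, bag_labels):
    """logits: (N, C) per-image identity scores for the N images of one bag.
    bag_labels: identity indices known to appear in the bag."""
    mask = torch.full_like(logits, float("-inf"))
    mask[:, bag_labels] = 0.0                      # only identities in the bag are allowed
    probs = torch.softmax(logits + mask, dim=1)
    return probs.argmax(dim=1), probs              # pseudo label + confidence per image

logits = torch.randn(6, 100)                       # 6 images, 100 identities overall
labels, probs = bag_constrained_pseudo_labels(logits, bag_labels=[3, 17, 42])
print(labels)                                      # every entry is one of 3, 17, 42
```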

On the Relationship between Self-Attention and Convolutional Layers

Title On the Relationship between Self-Attention and Convolutional Layers
Authors Jean-Baptiste Cordonnier, Andreas Loukas, Martin Jaggi
Abstract Recent trends of incorporating attention mechanisms in vision have led researchers to reconsider the supremacy of convolutional layers as a primary building block. Beyond helping CNNs to handle long-range dependencies, Ramachandran et al. (2019) showed that attention can completely replace convolution and achieve state-of-the-art performance on vision tasks. This raises the question: do learned attention layers operate similarly to convolutional layers? This work provides evidence that attention layers can perform convolution and, indeed, they often learn to do so in practice. Specifically, we prove that a multi-head self-attention layer with a sufficient number of heads is at least as expressive as any convolutional layer. Our numerical experiments then show that self-attention layers attend to pixel-grid patterns similarly to CNN layers, corroborating our analysis. Our code is publicly available.
Tasks Image Classification
Published 2019-11-08
URL https://arxiv.org/abs/1911.03584v2
PDF https://arxiv.org/pdf/1911.03584v2.pdf
PWC https://paperswithcode.com/paper/on-the-relationship-between-self-attention-1
Repo https://github.com/epfml/attention-cnn
Framework pytorch
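
The expressiveness claim can be checked numerically in a stripped-down setting: give each attention head a one-hot attention pattern that looks at a fixed pixel offset, then mix the heads with scalar weights. The result is exactly a 3x3 convolution whose kernel consists of those weights. This toy check uses a single channel and hard-coded (not learned) attention; the paper's construction uses relative positional encodings to realize such patterns.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 8, 8
image = rng.normal(size=(H, W))
kernel = rng.normal(size=(3, 3))
offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]   # one "head" per offset

padded = np.pad(image, 1)

# "Attention" output: head (dy, dx) copies the pixel at that offset (one-hot attention),
# and the heads are aggregated with the kernel entries as mixing weights.
attention_out = sum(kernel[dy + 1, dx + 1] * padded[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
                    for dy, dx in offsets)

# Direct 3x3 cross-correlation for comparison.
conv_out = np.zeros((H, W))
for y in range(H):
    for x in range(W):
        conv_out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * kernel)

print(np.allclose(attention_out, conv_out))   # True
```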

Dynamic Environment Prediction in Urban Scenes using Recurrent Representation Learning

Title Dynamic Environment Prediction in Urban Scenes using Recurrent Representation Learning
Authors Masha Itkina, Katherine Driggs-Campbell, Mykel J. Kochenderfer
Abstract A key challenge for autonomous driving is safe trajectory planning in cluttered, urban environments with dynamic obstacles, such as pedestrians, bicyclists, and other vehicles. A reliable prediction of the future environment, including the behavior of dynamic agents, would allow planning algorithms to proactively generate a trajectory in response to a rapidly changing environment. We present a novel framework that predicts the future occupancy state of the local environment surrounding an autonomous agent by learning a motion model from occupancy grid data using a neural network. We take advantage of the temporal structure of the grid data by utilizing a convolutional long short-term memory network in the form of the PredNet architecture. This method is validated on the KITTI dataset and demonstrates higher accuracy and better predictive power than baseline methods.
Tasks Autonomous Driving, Representation Learning
Published 2019-04-28
URL https://arxiv.org/abs/1904.12374v2
PDF https://arxiv.org/pdf/1904.12374v2.pdf
PWC https://paperswithcode.com/paper/dynamic-environment-prediction-in-urban
Repo https://github.com/mitkina/EnvironmentPrediction
Framework tf
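
The building block referenced above, a convolutional LSTM over occupancy grids, can be written compactly: all four gates come from a single convolution over the concatenated input grid and hidden state. This is a generic ConvLSTM cell with assumed channel sizes, not the PredNet architecture used in the paper.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch, hidden_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g                      # update cell state
        h = o * torch.tanh(c)                  # new hidden state (spatial feature map)
        return h, c

# roll the cell over a sequence of occupancy grids (B, T, 1, H, W)
grids = torch.rand(2, 5, 1, 64, 64)
cell = ConvLSTMCell(in_ch=1, hidden_ch=16)
h = c = torch.zeros(2, 16, 64, 64)
for t in range(grids.shape[1]):
    h, c = cell(grids[:, t], (h, c))
print(h.shape)   # torch.Size([2, 16, 64, 64]); h can be decoded into a predicted grid
```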

Predictive Multiplicity in Classification

Title Predictive Multiplicity in Classification
Authors Charles T. Marx, Flavio du Pin Calmon, Berk Ustun
Abstract In the context of machine learning, a prediction problem exhibits predictive multiplicity if there exist several “good” models that attain identical or near-identical performance (i.e., accuracy, AUC, etc.). In this paper, we study the effects of multiplicity in human-facing applications, such as credit scoring and recidivism prediction. We introduce a specific notion of multiplicity – predictive multiplicity – to describe the existence of good models that output conflicting predictions. Unlike existing notions of multiplicity (e.g., the Rashomon effect), predictive multiplicity reflects irreconcilable differences in the predictions of models with comparable performance, and presents new challenges for common practices such as model selection and local explanation. We propose measures to evaluate the predictive multiplicity in classification problems. We present integer programming methods to compute these measures for a given dataset by solving empirical risk minimization problems with discrete constraints. We demonstrate how these tools can inform stakeholders on a large collection of recidivism prediction problems. Our results show that real-world prediction problems often admit many good models that output wildly conflicting predictions, and support the need to report predictive multiplicity in model development.
Tasks Model Selection
Published 2019-09-14
URL https://arxiv.org/abs/1909.06677v2
PDF https://arxiv.org/pdf/1909.06677v2.pdf
PWC https://paperswithcode.com/paper/predictive-multiplicity-in-classification
Repo https://github.com/charliemarx/pmtools
Framework none
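
A crude empirical version of the multiplicity measures described above: fit several classifiers, keep the ones whose accuracy is within a tolerance epsilon of the best, and count the fraction of examples on which some near-optimal model disagrees with the best one. The paper computes such measures exactly with integer programming; this sketch only approximates the idea by enumerating a small, arbitrarily chosen model pool.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# A small pool of competing models (different regularization strengths here).
models = [LogisticRegression(C=c, max_iter=1000).fit(X, y) for c in (0.01, 0.1, 1.0, 10.0)]
accs = [m.score(X, y) for m in models]
best = models[int(np.argmax(accs))]

eps = 0.01                                        # tolerance on training accuracy
near_optimal = [m for m, a in zip(models, accs) if a >= max(accs) - eps]

base_pred = best.predict(X)
conflict = np.zeros(len(y), dtype=bool)
for m in near_optimal:
    conflict |= m.predict(X) != base_pred          # any near-optimal model disagrees?

print("ambiguity: %.1f%% of examples get conflicting predictions" % (100 * conflict.mean()))
```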

How To Train Your Deep Multi-Object Tracker

Title How To Train Your Deep Multi-Object Tracker
Authors Yihong Xu, Aljosa Osep, Yutong Ban, Radu Horaud, Laura Leal-Taixe, Xavier Alameda-Pineda
Abstract The recent trend in vision-based multi-object tracking (MOT) is heading towards leveraging the representational power of deep learning to jointly learn to detect and track objects. However, existing methods train only certain sub-modules using loss functions that often do not correlate with established tracking evaluation measures such as Multi-Object Tracking Accuracy (MOTA) and Precision (MOTP). As these measures are not differentiable, the choice of appropriate loss functions for end-to-end training of multi-object tracking methods is still an open research problem. In this paper, we bridge this gap by proposing a differentiable proxy of MOTA and MOTP, which we combine in a loss function suitable for end-to-end training of deep multi-object trackers. As a key ingredient, we propose a Deep Hungarian Net (DHN) module that approximates the Hungarian matching algorithm. DHN allows us to estimate the correspondence between object tracks and ground-truth objects to compute differentiable proxies of MOTA and MOTP, which are in turn used to optimize deep trackers directly. We experimentally demonstrate that the proposed differentiable framework improves the performance of existing multi-object trackers, and we establish a new state-of-the-art on the MOTChallenge benchmark. Our code is publicly available at https://github.com/yihongXU/deepMOT.
Tasks Multi-Object Tracking, Multiple Object Tracking, Object Tracking
Published 2019-06-15
URL https://arxiv.org/abs/1906.06618v2
PDF https://arxiv.org/pdf/1906.06618v2.pdf
PWC https://paperswithcode.com/paper/deepmot-a-differentiable-framework-for
Repo https://github.com/yihongXU/deepMOT
Framework pytorch
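
To see why a differentiable stand-in for the Hungarian algorithm is useful, consider the sketch below: a Sinkhorn-style normalization turns a track-to-object distance matrix into a soft assignment matrix through which gradients can flow. Note that the paper's DHN is a learned bi-directional recurrent network, not Sinkhorn; this is only an illustration of the differentiable-matching idea, with made-up sizes and a temperature parameter tau.

```python
import torch

def soft_assignment(dist, n_iters=50, tau=0.1):
    """Differentiable soft assignment from a track-to-object distance matrix."""
    log_alpha = -dist / tau
    for _ in range(n_iters):                       # alternate row / column normalization
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=1, keepdim=True)
        log_alpha = log_alpha - torch.logsumexp(log_alpha, dim=0, keepdim=True)
    return log_alpha.exp()

dist = torch.rand(4, 5, requires_grad=True)        # 4 predicted tracks, 5 ground-truth boxes
A = soft_assignment(dist)
cost = (A * dist).sum()                            # soft matching cost; MOTA / MOTP proxies
cost.backward()                                    # can be built from A in a similar way
print(A.shape, dist.grad.shape)
```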

Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection

Title Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection
Authors Fan Yang, Lei Zhang, Sijia Yu, Danil Prokhorov, Xue Mei, Haibin Ling
Abstract Pavement crack detection is a critical task for ensuring road safety. Manual crack detection is extremely time-consuming, so an automatic road crack detection method is required to accelerate this process. However, it remains a challenging task due to the intensity inhomogeneity of cracks and the complexity of the background, e.g., the low contrast with surrounding pavement and possible shadows with similar intensity. Inspired by recent advances of deep learning in computer vision, we propose a novel network architecture, named Feature Pyramid and Hierarchical Boosting Network (FPHBN), for pavement crack detection. The proposed network integrates semantic information into low-level features for crack detection in a feature pyramid way. It also balances the contribution of both easy and hard samples to the loss by nested sample reweighting in a hierarchical way. To demonstrate the superiority and generality of the proposed method, we evaluate it on five crack datasets and compare it with state-of-the-art crack detection, edge detection, and semantic segmentation methods. Extensive experiments show that the proposed method outperforms these state-of-the-art methods in terms of accuracy and generality.
Tasks Edge Detection, Semantic Segmentation
Published 2019-01-18
URL http://arxiv.org/abs/1901.06340v2
PDF http://arxiv.org/pdf/1901.06340v2.pdf
PWC https://paperswithcode.com/paper/feature-pyramid-and-hierarchical-boosting
Repo https://github.com/fyangneil/pavement-crack-detection
Framework none
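
The "nested sample reweighting" idea above can be sketched as a loss: each side output's binary cross-entropy is weighted per pixel by the error of the previous side output, so pixels that the shallower level got wrong receive more attention at the next level. The 1 + error weighting and the ordering of side outputs here are guesses at the general mechanism, not FPHBN's exact scheme.

```python
import torch
import torch.nn.functional as F

def hierarchically_boosted_bce(side_logits, target):
    """side_logits: list of crack logits ordered from the top of the pyramid down,
    each of shape (B, 1, H, W); target: binary crack mask of the same shape."""
    total = 0.0
    weights = torch.ones_like(target)                      # top level: uniform weights
    for logits in side_logits:
        total = total + F.binary_cross_entropy_with_logits(
            logits, target, weight=weights, reduction="mean")
        with torch.no_grad():                              # next level focuses on current errors
            weights = 1.0 + (torch.sigmoid(logits) - target).abs()
    return total

target = (torch.rand(2, 1, 32, 32) > 0.9).float()          # sparse crack pixels
side_logits = [torch.randn(2, 1, 32, 32, requires_grad=True) for _ in range(3)]
loss = hierarchically_boosted_bce(side_logits, target)
loss.backward()
print(float(loss))
```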

Deep Stereo using Adaptive Thin Volume Representation with Uncertainty Awareness

Title Deep Stereo using Adaptive Thin Volume Representation with Uncertainty Awareness
Authors Shuo Cheng, Zexiang Xu, Shilin Zhu, Zhuwen Li, Li Erran Li, Ravi Ramamoorthi, Hao Su
Abstract We present Uncertainty-aware Cascaded Stereo Network (UCS-Net) for 3D reconstruction from multiple RGB images. Multi-view stereo (MVS) aims to reconstruct fine-grained scene geometry from multi-view images. Previous learning-based MVS methods estimate per-view depth using plane sweep volumes with a fixed depth hypothesis at each plane; this generally requires densely sampled planes for desired accuracy, and it is very hard to achieve high-resolution depth. In contrast, we propose adaptive thin volumes (ATVs); in an ATV, the depth hypothesis of each plane is spatially varying, which adapts to the uncertainties of previous per-pixel depth predictions. Our UCS-Net has three stages: the first stage processes a small standard plane sweep volume to predict low-resolution depth; two ATVs are then used in the following stages to refine the depth with higher resolution and higher accuracy. Our ATV consists of only a small number of planes; yet, it efficiently partitions local depth ranges within learned small intervals. In particular, we propose to use variance-based uncertainty estimates to adaptively construct ATVs; this differentiable process introduces reasonable and fine-grained spatial partitioning. Our multi-stage framework progressively subdivides the vast scene space with increasing depth resolution and precision, which enables scene reconstruction with high completeness and accuracy in a coarse-to-fine fashion. We demonstrate that our method achieves superior performance compared with state-of-the-art benchmarks on various challenging datasets.
Tasks 3D Reconstruction
Published 2019-11-27
URL https://arxiv.org/abs/1911.12012v1
PDF https://arxiv.org/pdf/1911.12012v1.pdf
PWC https://paperswithcode.com/paper/deep-stereo-using-adaptive-thin-volume
Repo https://github.com/touristCheng/UCSNet
Framework none
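
The adaptive thin volume construction described above boils down to picking a handful of per-pixel depth hypotheses inside an interval set by the previous stage's prediction and its variance-based uncertainty. The sketch below shows that step in isolation, with an assumed scale factor lam; the cost-volume construction and depth regression that surround it in UCS-Net are omitted.

```python
import torch

def adaptive_depth_hypotheses(depth, variance, n_planes=8, lam=1.5):
    """Per-pixel depth hypotheses for the next stage of a cascaded MVS network.

    depth, variance: (B, 1, H, W) prediction and uncertainty from the previous stage.
    Returns: (B, n_planes, H, W) spatially varying depth hypotheses.
    """
    sigma = variance.clamp_min(1e-6).sqrt()
    low = depth - lam * sigma                      # the interval shrinks where the
    high = depth + lam * sigma                     # previous stage was confident
    steps = torch.linspace(0, 1, n_planes, device=depth.device).view(1, n_planes, 1, 1)
    return low + (high - low) * steps

depth = torch.rand(1, 1, 16, 16) * 5 + 1
variance = torch.rand(1, 1, 16, 16) * 0.2
print(adaptive_depth_hypotheses(depth, variance).shape)   # torch.Size([1, 8, 16, 16])
```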

Causality-based Feature Selection: Methods and Evaluations

Title Causality-based Feature Selection: Methods and Evaluations
Authors Kui Yu, Xianjie Guo, Lin Liu, Jiuyong Li, Hao Wang, Zhaolong Ling, Xindong Wu
Abstract Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that knowledge about the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has gradually attracted increasing attention and many algorithms have been proposed. In this paper, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in this research area and to make it easy to compare new methods with existing ones, we develop the first open-source package, called CausalFS, which consists of most of the representative causality-based feature selection algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms on both synthetic and real-world data sets. Finally, we discuss some challenging problems to be tackled in future causality-based feature selection research.
Tasks Feature Selection
Published 2019-11-17
URL https://arxiv.org/abs/1911.07147v1
PDF https://arxiv.org/pdf/1911.07147v1.pdf
PWC https://paperswithcode.com/paper/causality-based-feature-selection-methods-and
Repo https://github.com/kuiy/CausalFS
Framework none
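
As a concrete, if much simplified, example of the algorithm family surveyed above, the snippet below runs the growing phase of an IAMB-style Markov blanket search: it repeatedly adds the feature most strongly associated with the target conditional on the current blanket, using a Fisher-z partial correlation test (Gaussian assumption). CausalFS implements many such algorithms far more carefully, with a shrinking phase and discrete tests; treat this purely as a sketch.

```python
import numpy as np
from scipy import stats

def partial_corr(x, y, Z):
    """Correlation between x and y after regressing out the columns of Z."""
    A = np.column_stack([Z, np.ones(len(x))]) if Z.size else np.ones((len(x), 1))
    rx = x - A @ np.linalg.lstsq(A, x, rcond=None)[0]
    ry = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

def independent(x, y, Z, alpha=0.05):
    """Fisher-z conditional independence test."""
    r = np.clip(partial_corr(x, y, Z), -0.9999, 0.9999)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(len(x) - Z.shape[1] - 3)
    return 2 * (1 - stats.norm.cdf(abs(z))) > alpha

def grow_markov_blanket(X, y, alpha=0.05):
    mb = []
    while True:
        rest = [j for j in range(X.shape[1]) if j not in mb]
        if not rest:
            return mb
        # candidate most associated with y given the current blanket
        j = max(rest, key=lambda j: abs(partial_corr(X[:, j], y, X[:, mb])))
        if independent(X[:, j], y, X[:, mb], alpha):
            return mb
        mb.append(j)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = 2 * X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=2000)   # true parents: 0 and 3
print(grow_markov_blanket(X, y))                               # expected: [0, 3]
```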