February 1, 2020

Paper Group AWR 120

Locality-constrained Spatial Transformer Network for Video Crowd Counting. Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks. Adversarial Defense by Suppressing High-frequency Components. Personalizing Graph Neural Networks with Attention Mechanism for Session-based Recommendation. C^3 Framework: An Open-sour …

Locality-constrained Spatial Transformer Network for Video Crowd Counting

Title Locality-constrained Spatial Transformer Network for Video Crowd Counting
Authors Yanyan Fang, Biyun Zhan, Wandi Cai, Shenghua Gao, Bo Hu
Abstract Compared with single-image crowd counting, video provides spatial-temporal information about the crowd that can help improve the robustness of crowd counting. But translation, rotation and scaling of people lead to changes in the density map of heads between neighbouring frames. Meanwhile, people walking in/out or being occluded in dynamic scenes lead to changes in head counts. To alleviate these issues in video crowd counting, a Locality-constrained Spatial Transformer Network (LSTN) is proposed. Specifically, we first leverage a Convolutional Neural Network to estimate the density map for each frame. Then, to relate the density maps between neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame. To facilitate the performance evaluation, a large-scale video crowd counting dataset is collected, which contains 15K frames with about 394K annotated heads captured from 13 different scenes. As far as we know, it is the largest video crowd counting dataset. Extensive experiments on our dataset and other crowd counting datasets validate the effectiveness of our LSTN for crowd counting.
Tasks Crowd Counting
Published 2019-07-18
URL https://arxiv.org/abs/1907.07911v1
PDF https://arxiv.org/pdf/1907.07911v1.pdf
PWC https://paperswithcode.com/paper/locality-constrained-spatial-transformer
Repo https://github.com/sweetyy83/Lstn_fdst_dataset
Framework none
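
The abstract above works with per-frame density maps of heads. As a rough illustration of why a density map encodes a count, the sketch below builds one by placing a normalised Gaussian on each annotated head, so that summing the map recovers the number of heads. This is a common convention in crowd counting rather than anything stated in the abstract; the function names, the grid size and the `sigma` value are all assumptions chosen for the example.

```python
import math


def density_map(heads, height, width, sigma=1.5):
    """Build a density map whose sum equals the number of annotated heads.

    `heads` is a list of (row, col) annotations. Names and defaults are
    illustrative assumptions, not taken from the LSTN paper.
    """
    density = [[0.0] * width for _ in range(height)]
    for hy, hx in heads:
        # Unnormalised Gaussian centred on the head annotation.
        kernel = [
            [math.exp(-((y - hy) ** 2 + (x - hx) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, kernel))
        # Normalise so each head contributes exactly 1 to the map.
        for y in range(height):
            for x in range(width):
                density[y][x] += kernel[y][x] / total
    return density


def count_from_density(density):
    """The estimated crowd count is the sum over all pixels."""
    return sum(map(sum, density))
```

With this construction, a person leaving the frame removes one unit of mass from the map, which is the kind of frame-to-frame change the abstract describes.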

Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks

Title Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks
Authors Sascha Saralajew, Lars Holdijk, Maike Rees, Thomas Villmann
Abstract Adversarial attacks and the development of (deep) neural networks robust against them are currently two widely researched topics. The robustness of Learning Vector Quantization (LVQ) models against adversarial attacks has however not yet been studied to the same extent. We therefore present an extensive evaluation of three LVQ models: Generalized LVQ, Generalized Matrix LVQ and Generalized Tangent LVQ. The evaluation suggests that both Generalized LVQ and Generalized Tangent LVQ have a high base robustness, on par with the current state-of-the-art in robust neural network methods. In contrast to this, Generalized Matrix LVQ shows a high susceptibility to adversarial attacks, scoring consistently behind all other models. Additionally, our numerical evaluation indicates that increasing the number of prototypes per class improves the robustness of the models.
Tasks Quantization
Published 2019-02-01
URL http://arxiv.org/abs/1902.00577v2
PDF http://arxiv.org/pdf/1902.00577v2.pdf
PWC https://paperswithcode.com/paper/robustness-of-generalized-learning-vector
Repo https://github.com/LarsHoldijk/robust_LVQ_models
Framework tf
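
For readers unfamiliar with LVQ, the following is a minimal sketch of nearest-prototype classification and of the relative distance that Generalized LVQ is usually described as optimising, mu = (d+ - d-) / (d+ + d-), where d+ is the distance to the closest prototype of the correct class and d- the distance to the closest prototype of any other class. The formula comes from the standard GLVQ formulation, not from the abstract, and the function names and the squared Euclidean distance are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def glvq_relative_distance(x, label, prototypes, proto_labels):
    """Return mu in [-1, 1]; negative values mean x is classified correctly.

    Assumes at least one prototype with `label`, at least one with another
    label, and that x does not coincide with prototypes of both kinds.
    """
    d_plus = min(squared_distance(x, p)
                 for p, c in zip(prototypes, proto_labels) if c == label)
    d_minus = min(squared_distance(x, p)
                  for p, c in zip(prototypes, proto_labels) if c != label)
    return (d_plus - d_minus) / (d_plus + d_minus)


def predict(x, prototypes, proto_labels):
    """Assign x the label of its nearest prototype."""
    best = min(range(len(prototypes)),
               key=lambda i: squared_distance(x, prototypes[i]))
    return proto_labels[best]
```

Several prototypes may share a class label, which is the "prototypes per class" setting the abstract refers to.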

Adversarial Defense by Suppressing High-frequency Components

Title Adversarial Defense by Suppressing High-frequency Components
Authors Zhendong Zhang, Cheolkon Jung, Xiaolong Liang
Abstract Recent works show that deep neural networks trained on image classification datasets are biased towards textures. Those models are easily fooled by applying small high-frequency perturbations to clean images. In this paper, we learn robust image classification models by removing high-frequency components. Specifically, we develop a differentiable high-frequency suppression module based on the discrete Fourier transform (DFT). Combining this with adversarial training, we won 5th place in the IJCAI-2019 Alibaba Adversarial AI Challenge. Our code is available online.
Tasks Adversarial Defense, Image Classification
Published 2019-08-19
URL https://arxiv.org/abs/1908.06566v3
PDF https://arxiv.org/pdf/1908.06566v3.pdf
PWC https://paperswithcode.com/paper/adversarial-defense-by-suppressing-high
Repo https://github.com/zzd1992/Adversarial-Defense-by-Suppressing-High-Frequencies
Framework pytorch
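
To make the idea of suppressing high-frequency components concrete, here is a small sketch that low-pass filters a signal through the DFT. It is a simplification in several ways, all of which are assumptions of this example: it works on a 1-D list instead of an image, uses a naive plain-Python DFT instead of a differentiable module, and applies a hard cutoff (`keep`) whose exact form is not given in the abstract.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def suppress_high_frequencies(signal, keep):
    """Zero every DFT bin whose frequency index exceeds `keep`.

    Bin k and bin n - k carry the same frequency for a real signal,
    so the frequency index of bin k is min(k, n - k).
    """
    n = len(signal)
    spectrum = dft(signal)
    filtered = [c if min(k, n - k) <= keep else 0j
                for k, c in enumerate(spectrum)]
    # The input is real and the mask is symmetric, so the result is real
    # up to rounding error; drop the negligible imaginary part.
    return [v.real for v in idft(filtered)]
```

A smooth signal plus a small fast oscillation comes back as the smooth signal alone once the cutoff sits between the two frequencies, which mirrors how a high-frequency perturbation would be removed.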

Personalizing Graph Neural Networks with Attention Mechanism for Session-based Recommendation

Title Personalizing Graph Neural Networks with Attention Mechanism for Session-based Recommendation
Authors Shu Wu, Mengqi Zhang, Xin Jiang, Xu Ke, Liang Wang
Abstract The problem of personalized session-based recommendation aims to predict users’ next click based on their sequential behaviors. Existing session-based recommendation methods consider all sessions of a user only as a single sequence, ignoring the relationships among sessions. In addition, most of them neglect complex transitions of items and the collaborative relationship between users and items. To this end, we propose a novel method, named Personalizing Graph Neural Networks with Attention Mechanism, A-PGNN for brevity. A-PGNN mainly consists of two components. One is the Personalizing Graph Neural Network (PGNN), which is used to capture complex transitions in the user session sequence. Compared with the traditional Graph Neural Network (GNN) model, it also considers the role of users in the sequence. The other is the Dot-Product Attention mechanism, which draws on the attention mechanism in machine translation to explicitly model the effect of historical sessions on the current session. These two parts make it possible to learn the multi-level transition relationships between items and sessions in a user-specific fashion. Extensive experiments conducted on two real-world data sets show that A-PGNN consistently and significantly outperforms the state-of-the-art personalizing session-based recommendation methods.
Tasks Machine Translation, Session-Based Recommendations
Published 2019-10-20
URL https://arxiv.org/abs/1910.08887v2
PDF https://arxiv.org/pdf/1910.08887v2.pdf
PWC https://paperswithcode.com/paper/personalizing-graph-neural-networks-with
Repo https://github.com/CRIPAC-DIG/A-PGNN
Framework tf
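
The dot-product attention mentioned in the abstract can be sketched generically as follows: a query vector (here standing in for the current session) is scored against key vectors (standing in for historical sessions), the scores are passed through a softmax, and the values are averaged with those weights. The abstract does not give the exact formulation, so the absence of scaling or learned projections, and every name below, are assumptions of this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (context, weights) for unscaled dot-product attention.

    `query` is one vector; `keys` and `values` are equal-length lists of
    vectors, one pair per historical session in this illustration.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(dim)]
    return context, weights
```

Historical sessions whose keys align with the current session receive larger weights and therefore contribute more to the context vector.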

C^3 Framework: An Open-source PyTorch Code for Crowd Counting

Title C^3 Framework: An Open-source PyTorch Code for Crowd Counting
Authors Junyu Gao, Wei Lin, Bin Zhao, Dong Wang, Chenyu Gao, Jun Wen
Abstract This technical report attempts to provide efficient and solid kits for the field of crowd counting, denoted as the Crowd Counting Code Framework (C^3F). The contributions of C^3F are threefold: 1) Some solid baseline networks are presented, which have achieved state-of-the-art results. 2) Some flexible parameter setting strategies are provided to further improve performance. 3) A powerful logging system is developed to record the experiment process, which can enhance the reproducibility of each experiment. Our code is made publicly available at https://github.com/gjy3035/C-3-Framework. Furthermore, we also post a Chinese blog (https://zhuanlan.zhihu.com/p/65650998) to describe the details and insights of crowd counting.
Tasks Crowd Counting
Published 2019-07-05
URL https://arxiv.org/abs/1907.02724v1
PDF https://arxiv.org/pdf/1907.02724v1.pdf
PWC https://paperswithcode.com/paper/c3-framework-an-open-source-pytorch-code-for
Repo https://github.com/surajdakua/Crowd-Counting-Using-Pytorch
Framework pytorch

Attention Is (not) All You Need for Commonsense Reasoning

Title Attention Is (not) All You Need for Commonsense Reasoning
Authors Tassilo Klein, Moin Nabi
Abstract The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well in all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora.
Tasks
Published 2019-05-31
URL https://arxiv.org/abs/1905.13497v1
PDF https://arxiv.org/pdf/1905.13497v1.pdf
PWC https://paperswithcode.com/paper/attention-is-not-all-you-need-for-commonsense
Repo https://github.com/SAP-samples/acl2019-commonsense-reasoning
Framework pytorch

Fast 3D Line Segment Detection From Unorganized Point Cloud

Title Fast 3D Line Segment Detection From Unorganized Point Cloud
Authors Xiaohu Lu, Yahui Liu, Kai Li
Abstract This paper presents a very simple but efficient algorithm for 3D line segment detection from large scale unorganized point cloud. Unlike traditional methods which usually extract 3D edge points first and then link them to fit for 3D line segments, we propose a very simple 3D line segment detection algorithm based on point cloud segmentation and 2D line detection. Given the input unorganized point cloud, three steps are performed to detect 3D line segments. Firstly, the point cloud is segmented into 3D planes via region growing and region merging. Secondly, for each 3D plane, all the points belonging to it are projected onto the plane itself to form a 2D image, which is followed by 2D contour extraction and Least Square Fitting to get the 2D line segments. Those 2D line segments are then re-projected onto the 3D plane to get the corresponding 3D line segments. Finally, a post-processing procedure is proposed to eliminate outliers and merge adjacent 3D line segments. Experiments on several public datasets demonstrate the efficiency and robustness of our method. More results and the C++ source code of the proposed algorithm are publicly available at https://github.com/xiaohulugo/3DLineDetection.
Tasks Line Segment Detection
Published 2019-01-08
URL http://arxiv.org/abs/1901.02532v1
PDF http://arxiv.org/pdf/1901.02532v1.pdf
PWC https://paperswithcode.com/paper/fast-3d-line-segment-detection-from
Repo https://github.com/xiaohulugo/3DLineDetection
Framework none
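
The second step of the abstract, projecting the points of a 3D plane into a 2D image and later re-projecting 2D line segments back onto the plane, rests on a simple change of coordinates. The sketch below shows only that change of coordinates, assuming the plane is already known as an origin point and a normal vector; region growing, contour extraction and least-squares line fitting are not shown, and all function names are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(dot(a, a))
    return scale(a, 1.0 / length)


def plane_basis(normal):
    """Return two orthonormal in-plane axes (u, v) for a plane normal."""
    n = normalize(normal)
    # Pick a helper axis that is not (nearly) parallel to the normal.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))
    v = cross(n, u)
    return u, v


def to_plane_2d(point, origin, u, v):
    """Orthogonally project a 3D point onto the plane, in (u, v) coordinates."""
    d = sub(point, origin)
    return (dot(d, u), dot(d, v))


def to_3d(q, origin, u, v):
    """Map 2D plane coordinates back to a 3D point on the plane."""
    return add(origin, add(scale(u, q[0]), scale(v, q[1])))
```

Applying `to_3d` to the two endpoints of a detected 2D segment gives the corresponding 3D segment on the plane.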

W-PoseNet: Dense Correspondence Regularized Pixel Pair Pose Regression

Title W-PoseNet: Dense Correspondence Regularized Pixel Pair Pose Regression
Authors Zelin Xu, Ke Chen, Kui Jia
Abstract 6D pose estimation is non-trivial because it must cope with intrinsic appearance and shape variation and severe inter-object occlusion, and it is made more challenging by extrinsic large illumination changes and the low quality of data acquired in uncontrolled environments. This paper introduces a novel pose estimation algorithm, W-PoseNet, which densely regresses from input data to 6D pose and also 3D coordinates in model space. In other words, local features learned for pose regression in our deep network are regularized by explicitly learning pixel-wise correspondence mapping onto 3D pose-sensitive coordinates as an auxiliary task. Moreover, a sparse pair combination of pixel-wise features and soft voting on pixel-pair pose predictions are designed to improve robustness to inconsistent and sparse local features. Experimental results on the popular YCB-Video and LineMOD benchmarks show that the proposed W-PoseNet consistently achieves superior performance to the state-of-the-art algorithms.
Tasks 6D Pose Estimation, Pose Estimation
Published 2019-12-26
URL https://arxiv.org/abs/1912.11888v1
PDF https://arxiv.org/pdf/1912.11888v1.pdf
PWC https://paperswithcode.com/paper/w-posenet-dense-correspondence-regularized
Repo https://github.com/xzlscut/W-PoseNet
Framework none

STELA: A Real-Time Scene Text Detector with Learned Anchor

Title STELA: A Real-Time Scene Text Detector with Learned Anchor
Authors Linjie Deng, Yanxiang Gong, Xinchen Lu, Yi Lin, Zheng Ma, Mei Xie
Abstract To achieve high coverage of target boxes, a normal strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks. In this work, we present a simple and intuitive method for multi-oriented text detection where each location of the feature maps is associated with only one reference box. The idea is inspired by the two-stage R-CNN framework, which can estimate the location of objects with any shape by using learned proposals. The aim of our method is to integrate this mechanism into a one-stage detector and employ the learned anchor, which is obtained through a regression operation, to replace the original one in the final predictions. Based on RetinaNet, our method achieves competitive performance on several public benchmarks with real-time efficiency (26.5 fps at 800p), which surpasses all anchor-based scene text detectors. In addition, with less attention on anchor design, we believe our method is easy to apply to other analogous detection tasks. The code will be publicly available at https://github.com/xhzdeng/stela.
Tasks Scene Text Detection
Published 2019-09-17
URL https://arxiv.org/abs/1909.07549v2
PDF https://arxiv.org/pdf/1909.07549v2.pdf
PWC https://paperswithcode.com/paper/stela-a-real-time-scene-text-detector-with
Repo https://github.com/xhzdeng/stela
Framework pytorch

Practical Calculation of Gittins Indices for Multi-armed Bandits

Title Practical Calculation of Gittins Indices for Multi-armed Bandits
Authors James Edwards
Abstract Gittins indices provide an optimal solution to the classical multi-armed bandit problem. An obstacle to their use has been the common perception that their computation is very difficult. This paper demonstrates an accessible general methodology for calculating Gittins indices for the multi-armed bandit, with a detailed study of the cases of Bernoulli and Gaussian rewards. With accompanying easy-to-use open source software, this work removes computation as a barrier to using Gittins indices in these commonly found settings.
Tasks Multi-Armed Bandits
Published 2019-09-11
URL https://arxiv.org/abs/1909.05075v1
PDF https://arxiv.org/pdf/1909.05075v1.pdf
PWC https://paperswithcode.com/paper/practical-calculation-of-gittins-indices-for
Repo https://github.com/jedwards24/gittins
Framework none

Toward Unsupervised Text Content Manipulation

Title Toward Unsupervised Text Content Manipulation
Authors Wentao Wang, Zhiting Hu, Zichao Yang, Haoran Shi, Frank Xu, Eric Xing
Abstract Controlled generation of text is of high practical use. Recent efforts have made impressive progress in generating or editing sentences with given textual attributes (e.g., sentiment). This work studies a new practical setting of text content manipulation. Given a structured record, such as "(PLAYER: Lebron, POINTS: 20, ASSISTS: 10)", and a reference sentence, such as "Kobe easily dropped 30 points", we aim to generate a sentence that accurately describes the full content in the record, with the same writing style (e.g., wording, transitions) as the reference. The problem is unsupervised due to the lack of parallel data in practice, and it is challenging to manipulate the text minimally yet effectively (by rewriting/adding/deleting text portions) to ensure fidelity to the structured content. We derive a dataset from a basketball game report corpus as our testbed, and develop a neural method with unsupervised competing objectives and explicit content coverage constraints. Automatic and human evaluations show superiority of our approach over competitive methods including a strong rule-based baseline and prior approaches designed for style transfer.
Tasks Style Transfer
Published 2019-01-28
URL http://arxiv.org/abs/1901.09501v2
PDF http://arxiv.org/pdf/1901.09501v2.pdf
PWC https://paperswithcode.com/paper/toward-unsupervised-text-content-manipulation
Repo https://github.com/ZhitingHu/text_content_manipulation
Framework none

Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks

Title Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks
Authors Héctor Andrade-Loarca, Gitta Kutyniok, Ozan Öktem, Philipp Petersen
Abstract Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominantly in imaging sciences. Of particular importance is the analysis of wavefront sets and their correct extraction. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images, which combines data-based and model-based methods. Based on a celebrated property of the shearlet transform to unravel information on the wavefront set, we extract the wavefront set of an image by first applying a discrete shearlet transform and then feeding local patches of this transform to a deep convolutional neural network trained on labeled data. The resulting algorithm outperforms all competing algorithms in edge-orientation and ramp-orientation detection.
Tasks
Published 2019-01-05
URL https://arxiv.org/abs/1901.01388v2
PDF https://arxiv.org/pdf/1901.01388v2.pdf
PWC https://paperswithcode.com/paper/extraction-of-digital-wavefront-sets-using
Repo https://github.com/arsenal9971/DeNSE
Framework none

Are Graph Neural Networks Miscalibrated?

Title Are Graph Neural Networks Miscalibrated?
Authors Leonardo Teixeira, Brian Jalaian, Bruno Ribeiro
Abstract Graph Neural Networks (GNNs) have proven to be successful in many classification tasks, outperforming previous state-of-the-art methods in terms of accuracy. However, accuracy alone is not enough for high-stakes decision making. Decision makers want to know the likelihood that a specific GNN prediction is correct. For this purpose, obtaining calibrated models is essential. In this work, we perform an empirical evaluation of the calibration of state-of-the-art GNNs on multiple datasets. Our experiments show that GNNs can be calibrated in some datasets but also badly miscalibrated in others, and that state-of-the-art calibration methods are helpful but do not fix the problem.
Tasks Calibration, Decision Making
Published 2019-05-07
URL https://arxiv.org/abs/1905.02296v2
PDF https://arxiv.org/pdf/1905.02296v2.pdf
PWC https://paperswithcode.com/paper/are-graph-neural-networks-miscalibrated
Repo https://github.com/PurdueMINDS/GNNsMiscalibrated
Framework pytorch
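
Calibration, as used in the abstract, asks whether a model's stated confidence matches how often it is right. One widely used summary is the expected calibration error (ECE), sketched below with equal-width confidence bins. The abstract does not say which metrics the authors report, so treating ECE as the measure, the bin count and the function name are all assumptions of this example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    `confidences` are predicted probabilities in [0, 1] for the chosen
    class; `correct` holds a bool per prediction.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than past it.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece
```

A model that says 0.75 and is right three times out of four scores zero, while a model that says 0.9 and is always wrong scores 0.9.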

A comparison of some conformal quantile regression methods

Title A comparison of some conformal quantile regression methods
Authors Matteo Sesia, Emmanuel J. Candès
Abstract We compare two recently proposed methods that combine ideas from conformal inference and quantile regression to produce locally adaptive and marginally valid prediction intervals under sample exchangeability (Romano et al., 2019; Kivaranovic et al., 2019). First, we prove that these two approaches are asymptotically efficient in large samples, under some additional assumptions. Then we compare them empirically on simulated and real data. Our results demonstrate that the method in Romano et al. (2019) typically yields tighter prediction intervals in finite samples. Finally, we discuss how to tune these procedures by fixing the relative proportions of observations used for training and conformalization.
Tasks
Published 2019-09-12
URL https://arxiv.org/abs/1909.05433v1
PDF https://arxiv.org/pdf/1909.05433v1.pdf
PWC https://paperswithcode.com/paper/a-comparison-of-some-conformal-quantile
Repo https://github.com/msesia/cqr-comparison
Framework pytorch
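
The conformalization step behind conformal quantile regression, as described by Romano et al. (2019), can be sketched in a few lines: on a held-out calibration set, measure how far each response falls outside the fitted quantile band, take a suitably inflated empirical quantile of those scores, and widen the band by that amount. The sketch assumes the lower and upper quantile predictions are already available as plain lists; the quantile regression model itself is not shown and the function names are hypothetical.

```python
import math


def conformity_scores(lower, upper, y):
    """Signed distance by which each y falls outside [lower, upper].

    Negative scores mean y lies strictly inside the band.
    """
    return [max(lo - yi, yi - hi) for lo, hi, yi in zip(lower, upper, y)]


def conformal_correction(scores, alpha):
    """The ceil((n + 1)(1 - alpha))-th smallest calibration score.

    Returns infinity when the calibration set is too small for the
    requested coverage level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def conformalize(lo, hi, correction):
    """Widen (or shrink, if the correction is negative) a predicted band."""
    return lo - correction, hi + correction
```

The split between data used to fit the quantile model and data used to compute these scores is the tuning question raised at the end of the abstract.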

Efficient Pose Selection for Interactive Camera Calibration

Title Efficient Pose Selection for Interactive Camera Calibration
Authors Pavel Rojtberg, Arjan Kuijper
Abstract The choice of poses for camera calibration with planar patterns is only rarely considered, yet the calibration precision heavily depends on it. This work presents a pose selection method that finds a compact and robust set of calibration poses and is suitable for interactive calibration. Consequently, singular poses that would lead to an unreliable solution are avoided explicitly, while poses reducing the uncertainty of the calibration are favoured. For this, we use uncertainty propagation. Our method takes advantage of a self-identifying calibration pattern to track the camera pose in real-time. This makes it possible to iteratively guide the user to the target poses until the desired quality level is reached. Therefore, only a sparse set of key-frames is needed for calibration. The method is evaluated on separate training and testing sets, as well as on synthetic data. Our approach performs better than comparable solutions while requiring 30% fewer calibration frames.
Tasks Calibration
Published 2019-07-09
URL https://arxiv.org/abs/1907.04096v1
PDF https://arxiv.org/pdf/1907.04096v1.pdf
PWC https://paperswithcode.com/paper/efficient-pose-selection-for-interactive
Repo https://github.com/paroj/pose_calib
Framework none