Paper Group AWR 120
Locality-constrained Spatial Transformer Network for Video Crowd Counting
Title | Locality-constrained Spatial Transformer Network for Video Crowd Counting |
Authors | Yanyan Fang, Biyun Zhan, Wandi Cai, Shenghua Gao, Bo Hu |
Abstract | Compared with single-image crowd counting, video provides spatial-temporal information about the crowd that can help improve the robustness of crowd counting. But translation, rotation and scaling of people lead to changes in the density map of heads between neighbouring frames. Meanwhile, people walking in/out or being occluded in dynamic scenes lead to changes in head counts. To alleviate these issues in video crowd counting, a Locality-constrained Spatial Transformer Network (LSTN) is proposed. Specifically, we first leverage a Convolutional Neural Network to estimate the density map for each frame. Then, to relate the density maps between neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame. To facilitate performance evaluation, a large-scale video crowd counting dataset is collected, which contains 15K frames with about 394K annotated heads captured from 13 different scenes. As far as we know, it is the largest video crowd counting dataset. Extensive experiments on our dataset and other crowd counting datasets validate the effectiveness of our LSTN for crowd counting. |
Tasks | Crowd Counting |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.07911v1 |
https://arxiv.org/pdf/1907.07911v1.pdf | |
PWC | https://paperswithcode.com/paper/locality-constrained-spatial-transformer |
Repo | https://github.com/sweetyy83/Lstn_fdst_dataset |
Framework | none |
Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks
Title | Robustness of Generalized Learning Vector Quantization Models against Adversarial Attacks |
Authors | Sascha Saralajew, Lars Holdijk, Maike Rees, Thomas Villmann |
Abstract | Adversarial attacks and the development of (deep) neural networks robust against them are currently two widely researched topics. The robustness of Learning Vector Quantization (LVQ) models against adversarial attacks has however not yet been studied to the same extent. We therefore present an extensive evaluation of three LVQ models: Generalized LVQ, Generalized Matrix LVQ and Generalized Tangent LVQ. The evaluation suggests that both Generalized LVQ and Generalized Tangent LVQ have a high base robustness, on par with the current state-of-the-art in robust neural network methods. In contrast to this, Generalized Matrix LVQ shows a high susceptibility to adversarial attacks, scoring consistently behind all other models. Additionally, our numerical evaluation indicates that increasing the number of prototypes per class improves the robustness of the models. |
Tasks | Quantization |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00577v2 |
http://arxiv.org/pdf/1902.00577v2.pdf | |
PWC | https://paperswithcode.com/paper/robustness-of-generalized-learning-vector |
Repo | https://github.com/LarsHoldijk/robust_LVQ_models |
Framework | tf |
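For the GLVQ entry above, the quantity that attacks try to flip can be sketched in a few lines. The abstract does not spell out the model, so this assumes the standard GLVQ relative distance, mu(x) = (d+ - d-) / (d+ + d-), with squared Euclidean distance; the function names and the `(vector, label)` prototype layout are hypothetical, chosen only for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def glvq_relative_distance(x, label, prototypes):
    """Return mu(x) in [-1, 1]; negative means x is classified as `label`.

    `prototypes` is a list of (vector, class_label) pairs (assumed layout).
    d_plus is the distance to the closest prototype of the given class,
    d_minus the distance to the closest prototype of any other class.
    """
    d_plus = min(squared_distance(x, p) for p, c in prototypes if c == label)
    d_minus = min(squared_distance(x, p) for p, c in prototypes if c != label)
    return (d_plus - d_minus) / (d_plus + d_minus)


prototypes = [([0.0, 0.0], "a"), ([2.0, 0.0], "b")]
mu = glvq_relative_distance([0.5, 0.0], "a", prototypes)
print(mu)  # negative: the sample lies closer to the class "a" prototype
```

The magnitude of mu acts as a margin: the closer it is to zero, the smaller the perturbation needed to change the predicted class. Adding more prototypes per class, as the abstract discusses, only changes the contents of the `prototypes` list.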
Adversarial Defense by Suppressing High-frequency Components
Title | Adversarial Defense by Suppressing High-frequency Components |
Authors | Zhendong Zhang, Cheolkon Jung, Xiaolong Liang |
Abstract | Recent works show that deep neural networks trained on image classification datasets are biased towards textures. Those models are easily fooled by applying small high-frequency perturbations to clean images. In this paper, we learn robust image classification models by removing high-frequency components. Specifically, we develop a differentiable high-frequency suppression module based on the discrete Fourier transform (DFT). Combining it with adversarial training, we won 5th place in the IJCAI-2019 Alibaba Adversarial AI Challenge. Our code is available online. |
Tasks | Adversarial Defense, Image Classification |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.06566v3 |
https://arxiv.org/pdf/1908.06566v3.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-defense-by-suppressing-high |
Repo | https://github.com/zzd1992/Adversarial-Defense-by-Suppressing-High-Frequencies |
Framework | pytorch |
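The idea of suppressing high-frequency components with a DFT, as described in the entry above, can be illustrated on a 1D signal. This is a minimal sketch, not the authors' module: the paper works on images with a differentiable layer, whereas this uses a naive O(n^2) DFT and a hard cutoff. The function names and the `cutoff` parameter are assumptions made for the example.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def suppress_high_frequencies(signal, cutoff):
    """Zero every DFT bin whose frequency index exceeds `cutoff`."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bins k and n - k carry the same frequency magnitude.
        if min(k, n - k) > cutoff:
            spectrum[k] = 0
    return [value.real for value in idft(spectrum)]


# A constant signal plus a small perturbation at the highest frequency.
noisy = [1.0 + 0.5 * (-1) ** t for t in range(8)]
print(suppress_high_frequencies(noisy, cutoff=1))
```

With `cutoff=1` the alternating perturbation is removed and only the constant component remains, up to floating-point error.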
Personalizing Graph Neural Networks with Attention Mechanism for Session-based Recommendation
Title | Personalizing Graph Neural Networks with Attention Mechanism for Session-based Recommendation |
Authors | Shu Wu, Mengqi Zhang, Xin Jiang, Xu Ke, Liang Wang |
Abstract | The problem of personalized session-based recommendation aims to predict users’ next click based on their sequential behaviors. Existing session-based recommendation methods only consider all sessions of a user as a single sequence, ignoring the relationships among sessions. Moreover, most of them neglect complex transitions of items and the collaborative relationship between users and items. To this end, we propose a novel method, named Personalizing Graph Neural Networks with Attention Mechanism, A-PGNN for brevity. A-PGNN mainly consists of two components. One is the Personalizing Graph Neural Network (PGNN), which is used to capture complex transitions in the user session sequence. Compared with the traditional Graph Neural Network (GNN) model, it also considers the role of users in the sequence. The other is the Dot-Product Attention mechanism, which draws on the attention mechanism in machine translation to explicitly model the effect of historical sessions on the current session. These two parts make it possible to learn the multi-level transition relationships between items and sessions in a user-specific fashion. Extensive experiments conducted on two real-world data sets show that A-PGNN consistently and significantly outperforms the state-of-the-art personalized session-based recommendation methods. |
Tasks | Machine Translation, Session-Based Recommendations |
Published | 2019-10-20 |
URL | https://arxiv.org/abs/1910.08887v2 |
https://arxiv.org/pdf/1910.08887v2.pdf | |
PWC | https://paperswithcode.com/paper/personalizing-graph-neural-networks-with |
Repo | https://github.com/CRIPAC-DIG/A-PGNN |
Framework | tf |
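The dot-product attention step mentioned in the A-PGNN entry above can be sketched generically. The abstract does not give the exact formulation (scaling, learned projections), so this is plain unscaled dot-product attention. Treating the current session as the query and historical sessions as keys and values is an assumption made for illustration.

```python
import math


def dot_product_attention(query, keys, values):
    """Return (weights, context) for one query over a list of key/value vectors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Softmax with the maximum subtracted for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


current_session = [1.0, 0.0]                    # hypothetical query embedding
historical_sessions = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical key/value embeddings
weights, context = dot_product_attention(
    current_session, historical_sessions, historical_sessions
)
print(weights, context)
```

The historical session most similar to the current one receives the largest weight, so it contributes most to the context vector.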
C^3 Framework: An Open-source PyTorch Code for Crowd Counting
Title | C^3 Framework: An Open-source PyTorch Code for Crowd Counting |
Authors | Junyu Gao, Wei Lin, Bin Zhao, Dong Wang, Chenyu Gao, Jun Wen |
Abstract | This technical report attempts to provide an efficient and solid toolkit for the field of crowd counting, denoted the Crowd Counting Code Framework (C^3F). The contributions of C^3F are threefold: 1) Some solid baseline networks are presented, which achieve state-of-the-art results. 2) Some flexible parameter setting strategies are provided to further improve performance. 3) A powerful log system is developed to record the experiment process, which can enhance the reproducibility of each experiment. Our code is made publicly available at https://github.com/gjy3035/C-3-Framework. Furthermore, we also post a Chinese blog (https://zhuanlan.zhihu.com/p/65650998) to describe the details and insights of crowd counting. |
Tasks | Crowd Counting |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02724v1 |
https://arxiv.org/pdf/1907.02724v1.pdf | |
PWC | https://paperswithcode.com/paper/c3-framework-an-open-source-pytorch-code-for |
Repo | https://github.com/surajdakua/Crowd-Counting-Using-Pytorch |
Framework | pytorch |
Attention Is (not) All You Need for Commonsense Reasoning
Title | Attention Is (not) All You Need for Commonsense Reasoning |
Authors | Tassilo Klein, Moin Nabi |
Abstract | The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well in all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora. |
Tasks | |
Published | 2019-05-31 |
URL | https://arxiv.org/abs/1905.13497v1 |
https://arxiv.org/pdf/1905.13497v1.pdf | |
PWC | https://paperswithcode.com/paper/attention-is-not-all-you-need-for-commonsense |
Repo | https://github.com/SAP-samples/acl2019-commonsense-reasoning |
Framework | pytorch |
Fast 3D Line Segment Detection From Unorganized Point Cloud
Title | Fast 3D Line Segment Detection From Unorganized Point Cloud |
Authors | Xiaohu Lu, Yahui Liu, Kai Li |
Abstract | This paper presents a very simple but efficient algorithm for 3D line segment detection from large scale unorganized point cloud. Unlike traditional methods which usually extract 3D edge points first and then link them to fit for 3D line segments, we propose a very simple 3D line segment detection algorithm based on point cloud segmentation and 2D line detection. Given the input unorganized point cloud, three steps are performed to detect 3D line segments. Firstly, the point cloud is segmented into 3D planes via region growing and region merging. Secondly, for each 3D plane, all the points belonging to it are projected onto the plane itself to form a 2D image, which is followed by 2D contour extraction and Least Square Fitting to get the 2D line segments. Those 2D line segments are then re-projected onto the 3D plane to get the corresponding 3D line segments. Finally, a post-processing procedure is proposed to eliminate outliers and merge adjacent 3D line segments. Experiments on several public datasets demonstrate the efficiency and robustness of our method. More results and the C++ source code of the proposed algorithm are publicly available at https://github.com/xiaohulugo/3DLineDetection. |
Tasks | Line Segment Detection |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02532v1 |
http://arxiv.org/pdf/1901.02532v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-3d-line-segment-detection-from |
Repo | https://github.com/xiaohulugo/3DLineDetection |
Framework | none |
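The second step of the 3D line detection entry above projects each plane's points into a 2D frame and later lifts detected 2D segments back to 3D. A minimal sketch of that round trip follows. The helper names and the way the in-plane basis is chosen are assumptions made for illustration. The authors' C++ implementation rasterises the projected points into an image, which is omitted here.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def plane_basis(normal):
    """Return two orthonormal vectors (u, v) spanning the plane with this normal."""
    n = normalize(normal)
    # Pick a helper axis that is not nearly parallel to the normal.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))
    v = cross(n, u)
    return u, v


def project_to_plane_2d(points, origin, normal):
    """Express 3D points in 2D plane coordinates (out-of-plane offset is dropped)."""
    u, v = plane_basis(normal)
    coords = []
    for p in points:
        d = tuple(pi - oi for pi, oi in zip(p, origin))
        coords.append((dot(d, u), dot(d, v)))
    return coords


def lift_to_3d(coords, origin, normal):
    """Map 2D plane coordinates back to 3D points lying on the plane."""
    u, v = plane_basis(normal)
    return [tuple(o + a * ui + b * vi for o, ui, vi in zip(origin, u, v))
            for a, b in coords]


flat = project_to_plane_2d([(3.0, 4.0, 7.0)], (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(lift_to_3d(flat, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)))
```

Lifting the projected point returns its orthogonal projection onto the plane, which is why 2D segment endpoints found in the plane frame map back to 3D segments on the plane.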
W-PoseNet: Dense Correspondence Regularized Pixel Pair Pose Regression
Title | W-PoseNet: Dense Correspondence Regularized Pixel Pair Pose Regression |
Authors | Zelin Xu, Ke Chen, Kui Jia |
Abstract | 6D pose estimation is non-trivial because it must cope with intrinsic appearance and shape variation and severe inter-object occlusion, and it is made more challenging by large extrinsic illumination changes and the low quality of data acquired in uncontrolled environments. This paper introduces a novel pose estimation algorithm, W-PoseNet, which densely regresses from input data to 6D pose and also to 3D coordinates in model space. In other words, local features learned for pose regression in our deep network are regularized by explicitly learning a pixel-wise correspondence mapping onto 3D pose-sensitive coordinates as an auxiliary task. Moreover, a sparse pair combination of pixel-wise features and soft voting on pixel-pair pose predictions are designed to improve robustness to inconsistent and sparse local features. Experimental results on the popular YCB-Video and LineMOD benchmarks show that the proposed W-PoseNet consistently achieves superior performance to state-of-the-art algorithms. |
Tasks | 6D Pose Estimation, Pose Estimation |
Published | 2019-12-26 |
URL | https://arxiv.org/abs/1912.11888v1 |
https://arxiv.org/pdf/1912.11888v1.pdf | |
PWC | https://paperswithcode.com/paper/w-posenet-dense-correspondence-regularized |
Repo | https://github.com/xzlscut/W-PoseNet |
Framework | none |
STELA: A Real-Time Scene Text Detector with Learned Anchor
Title | STELA: A Real-Time Scene Text Detector with Learned Anchor |
Authors | Linjie Deng, Yanxiang Gong, Xinchen Lu, Yi Lin, Zheng Ma, Mei Xie |
Abstract | To achieve high coverage of target boxes, a common strategy of conventional one-stage anchor-based detectors is to utilize multiple priors at each spatial position, especially in scene text detection tasks. In this work, we present a simple and intuitive method for multi-oriented text detection where each location of the feature maps is associated with only one reference box. The idea is inspired by the two-stage R-CNN framework, which can estimate the location of objects of any shape by using learned proposals. The aim of our method is to integrate this mechanism into a one-stage detector and to use the learned anchor, obtained through a regression operation, in place of the original one in the final predictions. Based on RetinaNet, our method achieves competitive performance on several public benchmarks with real-time efficiency (26.5 fps at 800p), which surpasses all anchor-based scene text detectors. In addition, because it requires less attention to anchor design, we believe our method is easy to apply to other analogous detection tasks. The code will be publicly available at https://github.com/xhzdeng/stela. |
Tasks | Scene Text Detection |
Published | 2019-09-17 |
URL | https://arxiv.org/abs/1909.07549v2 |
https://arxiv.org/pdf/1909.07549v2.pdf | |
PWC | https://paperswithcode.com/paper/stela-a-real-time-scene-text-detector-with |
Repo | https://github.com/xhzdeng/stela |
Framework | pytorch |
Practical Calculation of Gittins Indices for Multi-armed Bandits
Title | Practical Calculation of Gittins Indices for Multi-armed Bandits |
Authors | James Edwards |
Abstract | Gittins indices provide an optimal solution to the classical multi-armed bandit problem. An obstacle to their use has been the common perception that their computation is very difficult. This paper demonstrates an accessible general methodology for calculating Gittins indices for the multi-armed bandit, with a detailed study of the cases of Bernoulli and Gaussian rewards. With accompanying easy-to-use open source software, this work removes computation as a barrier to using Gittins indices in these commonly found settings. |
Tasks | Multi-Armed Bandits |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05075v1 |
https://arxiv.org/pdf/1909.05075v1.pdf | |
PWC | https://paperswithcode.com/paper/practical-calculation-of-gittins-indices-for |
Repo | https://github.com/jedwards24/gittins |
Framework | none |
Toward Unsupervised Text Content Manipulation
Title | Toward Unsupervised Text Content Manipulation |
Authors | Wentao Wang, Zhiting Hu, Zichao Yang, Haoran Shi, Frank Xu, Eric Xing |
Abstract | Controlled generation of text is of high practical use. Recent efforts have made impressive progress in generating or editing sentences with given textual attributes (e.g., sentiment). This work studies a new practical setting of text content manipulation. Given a structured record, such as '(PLAYER: Lebron, POINTS: 20, ASSISTS: 10)', and a reference sentence, such as 'Kobe easily dropped 30 points', we aim to generate a sentence that accurately describes the full content in the record, with the same writing style (e.g., wording, transitions) as the reference. The problem is unsupervised due to the lack of parallel data in practice, and it is challenging to manipulate the text minimally yet effectively (by rewriting/adding/deleting text portions) to ensure fidelity to the structured content. We derive a dataset from a basketball game report corpus as our testbed, and develop a neural method with unsupervised competing objectives and explicit content coverage constraints. Automatic and human evaluations show the superiority of our approach over competitive methods, including a strong rule-based baseline and prior approaches designed for style transfer. |
Tasks | Style Transfer |
Published | 2019-01-28 |
URL | http://arxiv.org/abs/1901.09501v2 |
http://arxiv.org/pdf/1901.09501v2.pdf | |
PWC | https://paperswithcode.com/paper/toward-unsupervised-text-content-manipulation |
Repo | https://github.com/ZhitingHu/text_content_manipulation |
Framework | none |
Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks
Title | Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks |
Authors | Héctor Andrade-Loarca, Gitta Kutyniok, Ozan Öktem, Philipp Petersen |
Abstract | Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominantly in imaging sciences. Of particular importance is the analysis of wavefront sets and their correct extraction. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images, which combines data-based and model-based methods. Based on a celebrated property of the shearlet transform to unravel information on the wavefront set, we extract the wavefront set of an image by first applying a discrete shearlet transform and then feeding local patches of this transform to a deep convolutional neural network trained on labeled data. The resulting algorithm outperforms all competing algorithms in edge-orientation and ramp-orientation detection. |
Tasks | |
Published | 2019-01-05 |
URL | https://arxiv.org/abs/1901.01388v2 |
https://arxiv.org/pdf/1901.01388v2.pdf | |
PWC | https://paperswithcode.com/paper/extraction-of-digital-wavefront-sets-using |
Repo | https://github.com/arsenal9971/DeNSE |
Framework | none |
Are Graph Neural Networks Miscalibrated?
Title | Are Graph Neural Networks Miscalibrated? |
Authors | Leonardo Teixeira, Brian Jalaian, Bruno Ribeiro |
Abstract | Graph Neural Networks (GNNs) have proven to be successful in many classification tasks, outperforming previous state-of-the-art methods in terms of accuracy. However, accuracy alone is not enough for high-stakes decision making. Decision makers want to know the likelihood that a specific GNN prediction is correct. For this purpose, obtaining calibrated models is essential. In this work, we perform an empirical evaluation of the calibration of state-of-the-art GNNs on multiple datasets. Our experiments show that GNNs can be calibrated in some datasets but also badly miscalibrated in others, and that state-of-the-art calibration methods are helpful but do not fix the problem. |
Tasks | Calibration, Decision Making |
Published | 2019-05-07 |
URL | https://arxiv.org/abs/1905.02296v2 |
https://arxiv.org/pdf/1905.02296v2.pdf | |
PWC | https://paperswithcode.com/paper/are-graph-neural-networks-miscalibrated |
Repo | https://github.com/PurdueMINDS/GNNsMiscalibrated |
Framework | pytorch |
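To make "calibrated" in the entry above concrete, one commonly used summary is the expected calibration error (ECE): predictions are binned by confidence, and the gap between mean confidence and accuracy is averaged over bins, weighted by bin size. The abstract does not say which metric the authors report, so this is an illustrative sketch with hypothetical names and a default of 10 equal-width bins.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over equal-width bins.

    `confidences` are predicted probabilities in [0, 1] for the chosen class;
    `correct` are booleans saying whether each prediction was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(mean_conf - accuracy)
    return ece


# Over-confident model: claims 95% confidence but is right only half the time.
print(expected_calibration_error([0.95] * 4, [True, False, True, False]))
```

A perfectly calibrated model scores 0; the over-confident example above scores roughly 0.45.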
A comparison of some conformal quantile regression methods
Title | A comparison of some conformal quantile regression methods |
Authors | Matteo Sesia, Emmanuel J. Candès |
Abstract | We compare two recently proposed methods that combine ideas from conformal inference and quantile regression to produce locally adaptive and marginally valid prediction intervals under sample exchangeability (Romano et al., 2019; Kivaranovic et al., 2019). First, we prove that these two approaches are asymptotically efficient in large samples, under some additional assumptions. Then we compare them empirically on simulated and real data. Our results demonstrate that the method in Romano et al. (2019) typically yields tighter prediction intervals in finite samples. Finally, we discuss how to tune these procedures by fixing the relative proportions of observations used for training and conformalization. |
Tasks | |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05433v1 |
https://arxiv.org/pdf/1909.05433v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparison-of-some-conformal-quantile |
Repo | https://github.com/msesia/cqr-comparison |
Framework | pytorch |
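The conformalization step shared by the methods compared in the entry above can be sketched as follows. This follows the conformalized quantile regression recipe of Romano et al. (2019) as generally described. The fitted lower and upper quantile predictions are taken as given inputs, and all function and variable names are hypothetical.

```python
import math


def conformal_margin(cal_lower, cal_upper, cal_y, alpha):
    """Margin to add to the quantile band, computed on a held-out calibration set.

    Each conformity score is positive when y falls outside [lower, upper]
    and negative when it falls inside.
    """
    scores = sorted(max(lo - y, y - hi)
                    for lo, hi, y in zip(cal_lower, cal_upper, cal_y))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return scores[rank - 1]


def prediction_interval(lower, upper, margin):
    """Widen (or shrink, if the margin is negative) a predicted quantile band."""
    return lower - margin, upper + margin


cal_y = [5, 11, 12, 13, 14, -1, -2, 5, 5]
margin = conformal_margin([0] * 9, [10] * 9, cal_y, alpha=0.5)
print(prediction_interval(2, 8, margin))
```

The split between data used to fit the quantile model and data used for `conformal_margin` is the tuning choice the abstract refers to.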
Efficient Pose Selection for Interactive Camera Calibration
Title | Efficient Pose Selection for Interactive Camera Calibration |
Authors | Pavel Rojtberg, Arjan Kuijper |
Abstract | The choice of poses for camera calibration with planar patterns is only rarely considered, yet the calibration precision heavily depends on it. This work presents a pose selection method that finds a compact and robust set of calibration poses and is suitable for interactive calibration. Singular poses that would lead to an unreliable solution are avoided explicitly, while poses reducing the uncertainty of the calibration are favoured. For this, we use uncertainty propagation. Our method takes advantage of a self-identifying calibration pattern to track the camera pose in real time. This allows the user to be guided iteratively to the target poses until the desired quality level is reached. Therefore, only a sparse set of key-frames is needed for calibration. The method is evaluated on separate training and testing sets, as well as on synthetic data. Our approach performs better than comparable solutions while requiring 30% fewer calibration frames. |
Tasks | Calibration |
Published | 2019-07-09 |
URL | https://arxiv.org/abs/1907.04096v1 |
https://arxiv.org/pdf/1907.04096v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-pose-selection-for-interactive |
Repo | https://github.com/paroj/pose_calib |
Framework | none |