Paper Group AWR 399
Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting
Title | Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting |
Authors | Chenfeng Xu, Kai Qiu, Jianlong Fu, Song Bai, Yongchao Xu, Xiang Bai |
Abstract | Dense crowd counting aims to predict thousands of human instances from an image, by calculating integrals of a density map over image pixels. Existing approaches mainly suffer from extreme density variances. Such density pattern shift poses challenges even for multi-scale model ensembling. In this paper, we propose a simple yet effective approach to tackle this problem. First, a patch-level density map is extracted by a density estimation model and further grouped into several density levels which are determined over the full dataset. Second, each patch density map is automatically normalized by an online center learning strategy with a multipolar center loss. Such a design can significantly condense the density distribution into several clusters, and enables the density variance to be learned by a single model. Extensive experiments demonstrate the superiority of the proposed method. Our work outperforms the state-of-the-art by 4.2%, 14.3%, 27.1% and 20.1% in MAE, on ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF-QNRF datasets, respectively. |
Tasks | Crowd Counting, Density Estimation |
Published | 2019-07-29 |
URL | https://arxiv.org/abs/1907.12428v2 (PDF: https://arxiv.org/pdf/1907.12428v2.pdf) |
PWC | https://paperswithcode.com/paper/learn-to-scale-generating-multipolar |
Repo | https://github.com/zhousy1993/paper |
Framework | none |
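The two ideas in the abstract lend themselves to a short illustration: the count is the integral (sum) of the predicted density map, and a multipolar center loss pulls patch features toward the nearest of several learned density-level centers. The sketch below is a hedged reading of those two pieces, not the authors' code; the module name, `num_levels`, and the nearest-center formulation are illustrative assumptions.

```python
import torch
import torch.nn as nn

def crowd_count(density_map: torch.Tensor) -> torch.Tensor:
    # Predicted count = integral of the density map over image pixels,
    # i.e. a sum over the spatial dimensions.
    return density_map.sum(dim=(-2, -1))

class MultipolarCenterLoss(nn.Module):
    """Sketch of a center loss with multiple poles (density levels)."""

    def __init__(self, num_levels: int, feat_dim: int):
        super().__init__()
        # One learnable center per density level.
        self.centers = nn.Parameter(torch.randn(num_levels, feat_dim))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Squared distance from each patch feature to every center (N, K);
        # penalizing only the nearest center condenses the density
        # distribution into a few clusters.
        d2 = torch.cdist(feats, self.centers).pow(2)
        return d2.min(dim=1).values.mean()
```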
Deep Double Descent: Where Bigger Models and More Data Hurt
Title | Deep Double Descent: Where Bigger Models and More Data Hurt |
Authors | Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever |
Abstract | We show that a variety of modern deep learning tasks exhibit a “double-descent” phenomenon where, as we increase model size, performance first gets worse and then gets better. Moreover, we show that double descent occurs not just as a function of model size, but also as a function of the number of training epochs. We unify the above phenomena by defining a new complexity measure we call the effective model complexity and conjecture a generalized double descent with respect to this measure. Furthermore, our notion of model complexity allows us to identify certain regimes where increasing (even quadrupling) the number of train samples actually hurts test performance. |
Tasks | |
Published | 2019-12-04 |
URL | https://arxiv.org/abs/1912.02292v1 (PDF: https://arxiv.org/pdf/1912.02292v1.pdf) |
PWC | https://paperswithcode.com/paper/deep-double-descent-where-bigger-models-and-1 |
Repo | https://github.com/mbpereira49/inferenceattacks |
Framework | pytorch |
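The model-wise double descent the abstract describes can be probed with a simple width sweep: train a family of models of increasing capacity to low train error and record test error for each. The scaffold below is a toy sketch under those assumptions; `train_fn`, the MLP family, and the widths are placeholders, not the paper's ResNet/CNN setup.

```python
import torch
import torch.nn as nn

def make_mlp(width: int, in_dim: int = 784, n_classes: int = 10) -> nn.Module:
    # A one-hidden-layer family whose capacity grows with `width`.
    return nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(),
                         nn.Linear(width, n_classes))

@torch.no_grad()
def test_error(model: nn.Module, loader) -> float:
    model.eval()
    wrong = total = 0
    for x, y in loader:
        wrong += (model(x.flatten(1)).argmax(1) != y).sum().item()
        total += y.numel()
    return wrong / total

def width_sweep(widths, train_fn, test_loader):
    # Double descent predicts test error falls, spikes near the
    # interpolation threshold, then falls again as width keeps growing.
    return {w: test_error(train_fn(make_mlp(w)), test_loader) for w in widths}
```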
Harvey Mudd College at SemEval-2019 Task 4: The Clint Buchanan Hyperpartisan News Detector
Title | Harvey Mudd College at SemEval-2019 Task 4: The Clint Buchanan Hyperpartisan News Detector |
Authors | Mehdi Drissi, Pedro Sandoval, Vivaswat Ojha, Julie Medero |
Abstract | We investigate the recently developed Bidirectional Encoder Representations from Transformers (BERT) model for the hyperpartisan news detection task. Using a subset of hand-labeled articles from SemEval as a validation set, we test the performance of different parameters for BERT models. We find that accuracy from two different BERT models using different proportions of the articles is consistently high, with our best-performing model on the validation set achieving 85% accuracy and the best-performing model on the test set achieving 77%. We further determined that our model exhibits strong consistency, labeling independent slices of the same article identically. Finally, we find that randomizing the order of word pieces dramatically reduces validation accuracy (to approximately 60%), but that shuffling groups of four or more word pieces maintains an accuracy of about 80%, indicating the model mainly gains value from local context. |
Tasks | |
Published | 2019-04-10 |
URL | http://arxiv.org/abs/1905.01962v1 (PDF: http://arxiv.org/pdf/1905.01962v1.pdf) |
PWC | https://paperswithcode.com/paper/190501962 |
Repo | https://github.com/hmc-cs159-fall2018/final-project-team-mvp-10000 |
Framework | pytorch |
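The word-piece shuffling ablation in the abstract is easy to reproduce in outline: split the token sequence into contiguous groups of size k and shuffle the groups, so k = 1 fully randomizes order while larger k preserves local context. A minimal sketch, with the function name and seeding as assumptions:

```python
import random

def shuffle_in_groups(word_pieces, k, seed=None):
    # Partition into contiguous groups of k word pieces ...
    groups = [word_pieces[i:i + k] for i in range(0, len(word_pieces), k)]
    # ... then shuffle the groups, keeping order within each group.
    random.Random(seed).shuffle(groups)
    return [piece for group in groups for piece in group]

# e.g. shuffle_in_groups(tokenizer.tokenize(article), k=4) keeps the
# local context the model reportedly relies on; k=1 destroys it.
```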
PolSAR Image Classification Based on Dilated Convolution and Pixel-Refining Parallel Mapping network in the Complex Domain
Title | PolSAR Image Classification Based on Dilated Convolution and Pixel-Refining Parallel Mapping network in the Complex Domain |
Authors | Dongling Xiao, Chang Liu, Qi Wang, Chao Wang, Xin Zhang |
Abstract | Efficient and accurate polarimetric synthetic aperture radar (PolSAR) image classification with a limited number of prior labels remains challenging. For general supervised deep learning classification algorithms, the pixel-by-pixel algorithm achieves precise yet inefficient classification with a small number of labeled pixels, whereas the pixel mapping algorithm achieves efficient yet edge-rough classification but requires more prior labels. To take efficiency, accuracy and prior labels into account, we propose a novel pixel-refining parallel mapping network in the complex domain named CRPM-Net and the corresponding training algorithm for PolSAR image classification. CRPM-Net consists of two parallel sub-networks: a) a transfer dilated convolution mapping network in the complex domain (C-Dilated CNN) activated by a complex cross-convolution neural network (Cs-CNN), which aims at precise localization, high efficiency and full use of phase information; b) a complex-domain encoder-decoder network connected in parallel with the C-Dilated CNN, which extracts more contextual semantic features. Finally, we design a two-step algorithm to train the Cs-CNN and CRPM-Net with a small number of labeled pixels for higher accuracy by refining misclassified labeled pixels. We verify the proposed method on AIRSAR and E-SAR datasets. The experimental results demonstrate that CRPM-Net achieves the best classification results and substantially outperforms some of the latest state-of-the-art approaches in both efficiency and accuracy for PolSAR image classification. The source code and trained models for CRPM-Net are available at: https://github.com/PROoshio/CRPM-Net. |
Tasks | Image Classification |
Published | 2019-09-24 |
URL | https://arxiv.org/abs/1909.10783v2 (PDF: https://arxiv.org/pdf/1909.10783v2.pdf) |
PWC | https://paperswithcode.com/paper/polsar-image-classification-based-on-dilated |
Repo | https://github.com/PROoshio/CRPM-Net |
Framework | tf |
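One building block named in the abstract, a dilated convolution in the complex domain, can be sketched as two real convolutions combined by the complex product (W_r + iW_i)(x_r + ix_i). This illustrates the complex-domain idea under that standard construction only; it is not the CRPM-Net architecture, and the channel counts and dilation rate are placeholders.

```python
import torch
import torch.nn as nn

class ComplexDilatedConv2d(nn.Module):
    """Complex-valued dilated conv built from two real convolutions."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3,
                 dilation: int = 2):
        super().__init__()
        pad = dilation * (kernel_size - 1) // 2  # keep spatial size
        self.conv_r = nn.Conv2d(in_ch, out_ch, kernel_size,
                                padding=pad, dilation=dilation)
        self.conv_i = nn.Conv2d(in_ch, out_ch, kernel_size,
                                padding=pad, dilation=dilation)

    def forward(self, x_r: torch.Tensor, x_i: torch.Tensor):
        # (W_r + iW_i) * (x_r + ix_i)
        #   = (W_r*x_r - W_i*x_i) + i(W_r*x_i + W_i*x_r)
        real = self.conv_r(x_r) - self.conv_i(x_i)
        imag = self.conv_r(x_i) + self.conv_i(x_r)
        return real, imag
```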
A Strong Feature Representation for Siamese Network Tracker
Title | A Strong Feature Representation for Siamese Network Tracker |
Authors | Zhipeng Zhou, Rui Zhang, Dong Yin |
Abstract | Object tracking has important applications in assistive technologies for personalized monitoring. Recent trackers that choose AlexNet as the backbone for feature extraction have achieved great success. However, AlexNet is too shallow to form a strong feature representation, so trackers based on the Siamese network have an accuracy gap compared with state-of-the-art algorithms. To solve this problem, this paper proposes a tracker called SiamPF. Firstly, a modified pre-trained VGG16 network is fine-tuned as the backbone. Secondly, an AlexNet-like branch is added after the third convolutional layer and merged with the response map of the backbone network to form a preliminary strong feature representation. Then, a channel attention block is designed to adaptively select the contributing features. Finally, the APCE is modified to process the response map to reduce interference and focus the tracker on the target. Our SiamPF uses only ILSVRC2015-VID for training, yet achieves excellent performance on OTB-2013 / OTB-2015 / VOT2015 / VOT2017 while maintaining real-time performance of 41 FPS on a GTX 1080Ti. |
Tasks | Object Tracking |
Published | 2019-07-18 |
URL | https://arxiv.org/abs/1907.07880v1 (PDF: https://arxiv.org/pdf/1907.07880v1.pdf) |
PWC | https://paperswithcode.com/paper/a-strong-feature-representation-for-siamese |
Repo | https://github.com/zzpustc/SiamPF |
Framework | none |
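The channel attention block mentioned in the abstract can be sketched in a squeeze-and-excitation style: pool each channel to a scalar, pass it through a small bottleneck, and gate the channels with the resulting weights. SiamPF's exact block may differ; the reduction ratio and layer sizes below are assumptions.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style gate over feature channels."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: global context
        self.fc = nn.Sequential(              # excite: per-channel gate
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # adaptively select the contributing channels
```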
Shakeout: A New Approach to Regularized Deep Neural Network Training
Title | Shakeout: A New Approach to Regularized Deep Neural Network Training |
Authors | Guoliang Kang, Jun Li, Dacheng Tao |
Abstract | Recent years have witnessed the success of deep neural networks in dealing with a variety of practical problems. Dropout has played an essential role in many successful deep neural networks by inducing regularization in model training. In this paper, we present a new regularized training approach: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, Shakeout randomly chooses to enhance or reverse each unit's contribution to the next layer. This minor modification of Dropout has a notable statistical trait: the regularizer induced by Shakeout adaptively combines $L_0$, $L_1$ and $L_2$ regularization terms. Our classification experiments with representative deep architectures on the image datasets MNIST, CIFAR-10 and ImageNet show that Shakeout deals with over-fitting effectively and outperforms Dropout. We empirically demonstrate that Shakeout leads to sparser weights under both unsupervised and supervised settings. Shakeout also leads to a grouping effect among the input units of a layer. Considering that the weights reflect the importance of connections, Shakeout is superior to Dropout, which is valuable for deep model compression. Moreover, we demonstrate that Shakeout can effectively reduce the instability of the training process of deep architectures. |
Tasks | Model Compression |
Published | 2019-04-13 |
URL | http://arxiv.org/abs/1904.06593v1 (PDF: http://arxiv.org/pdf/1904.06593v1.pdf) |
PWC | https://paperswithcode.com/paper/shakeout-a-new-approach-to-regularized-deep |
Repo | https://github.com/kgl-prml/shakeout-for-caffe |
Framework | none |
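A hedged sketch of the Shakeout perturbation as the abstract describes it: during training, a unit's contribution is randomly reversed toward -c·sign(w) with some probability τ, and otherwise enhanced so that the expectation recovers the original weight. For simplicity the draw below is per weight rather than per input unit, and the exact rule is reconstructed from the "enhance or reverse, unbiased in expectation" reading; consult the paper for the precise formulation.

```python
import torch

def shakeout(w: torch.Tensor, tau: float, c: float) -> torch.Tensor:
    # With probability tau, reverse the contribution to -c*sign(w);
    # otherwise enhance it so that E[w_tilde] = w (unbiased):
    #   tau*(-c*s) + (1 - tau)*(w/(1 - tau) + c*tau*s/(1 - tau)) = w.
    s = torch.sign(w)
    reverse = torch.rand_like(w) < tau
    enhanced = w / (1 - tau) + (c * tau / (1 - tau)) * s
    return torch.where(reverse, -c * s, enhanced)

# c = 0 reduces to (inverted) dropout applied to weights, and tau = 0
# recovers standard training, matching the "modification of Dropout" view.
```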
PolSF: PolSAR image dataset on San Francisco
Title | PolSF: PolSAR image dataset on San Francisco |
Authors | Xu Liu, Licheng Jiao, Fang Liu |
Abstract | Polarimetric SAR data can be acquired in all weather conditions and at all times, and is widely used in many fields. However, annotated data is relatively scarce, which hinders research. In this paper, we have collected five open polarimetric SAR images of the San Francisco area. These five images were acquired by different satellites at different times, which gives them great scientific research value. We annotate the collected images at the pixel level for image classification and segmentation. For the convenience of researchers, the annotated data is open-sourced at https://github.com/liuxuvip/PolSF. |
Tasks | Image Classification |
Published | 2019-12-16 |
URL | https://arxiv.org/abs/1912.07259v1 (PDF: https://arxiv.org/pdf/1912.07259v1.pdf) |
PWC | https://paperswithcode.com/paper/polsf-polsar-image-dataset-on-san-francisco |
Repo | https://github.com/liuxuvip/PolSF |
Framework | none |
Curriculum based Dropout Discriminator for Domain Adaptation
Title | Curriculum based Dropout Discriminator for Domain Adaptation |
Authors | Vinod Kumar Kurmi, Vipul Bajaj, Venkatesh K Subramanian, Vinay P Namboodiri |
Abstract | Domain adaptation is essential to enable wide usage of deep learning based networks trained using large labeled datasets. Adversarial learning based techniques have shown their utility towards solving this problem using a discriminator that ensures source and target distributions are close. However, here we suggest that rather than using a point estimate, it would be useful if a distribution based discriminator could be used to bridge this gap. This could be achieved using multiple classifiers or using traditional ensemble methods. In contrast, we suggest that a Monte Carlo dropout based ensemble discriminator could suffice to obtain the distribution based discriminator. Specifically, we propose a curriculum based dropout discriminator that gradually increases the variance of the sample based distribution and the corresponding reverse gradients are used to align the source and target feature representations. The detailed results and thorough ablation analysis show that our model outperforms state-of-the-art results. |
Tasks | Domain Adaptation |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10628v2 (PDF: https://arxiv.org/pdf/1907.10628v2.pdf) |
PWC | https://paperswithcode.com/paper/curriculum-based-dropout-discriminator-for |
Repo | https://github.com/DelTA-Lab-IITK/CD3A |
Framework | pytorch |
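The core device in the abstract, a Monte Carlo dropout discriminator, can be sketched directly: keep dropout active at inference and run multiple stochastic forward passes, giving a sample-based distribution over domain predictions instead of a point estimate. The network shape, dropout rate, and `n_samples` below are illustrative; the paper's curriculum gradually increases the variance of this distribution, which this sketch does not model.

```python
import torch
import torch.nn as nn

class MCDropoutDiscriminator(nn.Module):
    """Dropout-as-ensemble domain discriminator (sketch)."""

    def __init__(self, feat_dim: int, p: float = 0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Dropout(p),
            nn.Linear(256, 1))  # source-vs-target logit

    def forward(self, feats: torch.Tensor, n_samples: int = 8):
        # Keep dropout stochastic: each pass is one member of the
        # implicit ensemble over dropped-out sub-networks.
        self.train()
        logits = torch.stack([self.net(feats) for _ in range(n_samples)])
        return logits.mean(0), logits.var(0)  # distribution, not a point
```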
Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction
Title | Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction |
Authors | Jingwen Wang, Lin Ma, Wenhao Jiang |
Abstract | The task of temporally grounding language queries in videos is to temporally localize the best matched video segment corresponding to a given language query (sentence). It requires models to simultaneously perform visual and linguistic understanding. Previous work predominantly ignores the precision of segment localization. Sliding window based methods use predefined search window sizes, which suffer from redundant computation, while existing anchor-based approaches fail to yield precise localization. We address this issue by proposing an end-to-end boundary-aware model, which uses a lightweight branch to predict semantic boundaries corresponding to the given linguistic information. To better detect semantic boundaries, we propose to aggregate contextual information by explicitly modeling the relationship between the current element and its neighbors. The most confident segments are subsequently selected based on both anchor and boundary predictions at the testing stage. The proposed model, dubbed Contextual Boundary-aware Prediction (CBP), outperforms its competitors with a clear margin on three public datasets. All code is available at https://github.com/JaywongWang/CBP . |
Tasks | |
Published | 2019-09-11 |
URL | https://arxiv.org/abs/1909.05010v2 (PDF: https://arxiv.org/pdf/1909.05010v2.pdf) |
PWC | https://paperswithcode.com/paper/temporally-grounding-language-queries-in |
Repo | https://github.com/JaywongWang/CBP |
Framework | tf |
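The boundary branch the abstract describes can be caricatured as a per-time-step head that mixes each element's feature with an aggregate of its neighbors before scoring it as a semantic boundary. The window-averaged context below is a simple stand-in for the paper's learned contextual modeling, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class BoundaryHead(nn.Module):
    """Score each time step as a boundary using neighbor context."""

    def __init__(self, dim: int, window: int = 2):
        super().__init__()
        self.window = window
        self.score = nn.Linear(2 * dim, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (T, D) fused video-language features per time step.
        T, _ = feats.shape
        w = self.window
        padded = nn.functional.pad(feats.t(), (w, w)).t()  # (T + 2w, D)
        # Average each element's 2w+1 neighborhood as its context.
        ctx = torch.stack([padded[i:i + 2 * w + 1].mean(0) for i in range(T)])
        return self.score(torch.cat([feats, ctx], -1)).squeeze(-1)  # (T,)
```

At test time these boundary scores would be combined with anchor confidences to rank candidate segments, per the abstract.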
Discriminating Spatial and Temporal Relevance in Deep Taylor Decompositions for Explainable Activity Recognition
Title | Discriminating Spatial and Temporal Relevance in Deep Taylor Decompositions for Explainable Activity Recognition |
Authors | Liam Hiley, Alun Preece, Yulia Hicks, David Marshall, Harrison Taylor |
Abstract | Current techniques for explainable AI have been applied with some success to image processing. The recent rise of research in video processing has called for similar work in deconstructing and explaining spatio-temporal models. While many techniques are designed for 2D convolutional models, others are inherently applicable to any input domain. One such body of work, deep Taylor decomposition, propagates relevance from the model output distributively onto its input and thus is not restricted to image processing models. However, by exploiting a simple technique that removes motion information, we show that this technique is not effective as-is for representing relevance in non-image tasks. We instead propose a discriminative method that produces a naïve representation of both the spatial and temporal relevance of a frame as two separate objects. This new discriminative relevance model exposes the relevance in the frame attributable to motion, which was previously ambiguous in the original explanation. We observe the effectiveness of this technique on a range of samples from the UCF-101 action recognition dataset, two of which are demonstrated in this paper. |
Tasks | Activity Recognition |
Published | 2019-08-05 |
URL | https://arxiv.org/abs/1908.01536v2 (PDF: https://arxiv.org/pdf/1908.01536v2.pdf) |
PWC | https://paperswithcode.com/paper/discriminating-spatial-and-temporal-relevance |
Repo | https://github.com/dais-ita/vadr |
Framework | pytorch |
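One plausible reading of the discriminative idea in the abstract: explain a motion-free copy of the clip (every frame replaced by one frame) to isolate spatial relevance, and attribute the residual to motion. Both the frame-freezing mechanism and the subtraction below are assumptions for illustration; `explain` stands in for any deep-Taylor/LRP-style relevance function.

```python
import torch

def spatial_temporal_relevance(explain, clip: torch.Tensor):
    # clip: (C, T, H, W). Remove motion by repeating the middle frame
    # across time, so only appearance information remains.
    t = clip.shape[1] // 2
    frozen = clip[:, t:t + 1].expand_as(clip).contiguous()
    r_full = explain(clip)       # spatial + temporal relevance
    r_spatial = explain(frozen)  # motion removed -> spatial only
    r_temporal = r_full - r_spatial  # residual attributed to motion
    return r_spatial, r_temporal
```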
Mindful Active Learning
Title | Mindful Active Learning |
Authors | Zhila Esna Ashari, Hassan Ghasemzadeh |
Abstract | We propose a novel active learning framework for activity recognition using wearable sensors. Our work is unique in that it takes the physical and cognitive limitations of the oracle into account when selecting sensor data to be annotated by the oracle. Our approach is inspired by human beings' limited capacity to respond to external stimuli such as a prompt on their mobile devices. This capacity constraint is manifested not only in the number of queries that a person can respond to in a given time-frame but also in the lag between the time a query is made and when it is responded to. We introduce the notion of mindful active learning and propose a computational framework, called EMMA, to maximize active learning performance taking informativeness of sensor data, query budget, and human memory into account. We formulate this optimization problem, propose an approach to model memory retention, discuss the complexity of the problem, and propose a greedy heuristic to solve it. We demonstrate the effectiveness of our approach on three publicly available datasets and by simulating oracles with various memory strengths. We show that activity recognition accuracy ranges from 21% to 97% depending on memory strength, query budget, and difficulty of the machine learning task. Our results also indicate that EMMA achieves an accuracy level that is, on average, 13.5% higher than the case when only informativeness of the sensor data is considered for active learning. Additionally, we show that the performance of our approach is at most 20% below the experimental upper bound and up to 80% above the experimental lower bound. We observe that mindful active learning is most beneficial when the query budget is small and/or the oracle's memory is weak, emphasizing the contributions of our work in human-centered mobile health settings and for the elderly with cognitive impairments. |
Tasks | Active Learning, Activity Recognition |
Published | 2019-07-28 |
URL | https://arxiv.org/abs/1907.12003v1 (PDF: https://arxiv.org/pdf/1907.12003v1.pdf) |
PWC | https://paperswithcode.com/paper/mindful-active-learning |
Repo | https://github.com/zhesna/EMMA |
Framework | none |
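The abstract names two ingredients that combine naturally: a memory retention model for the oracle and a greedy heuristic over a query budget. The sketch below assumes an Ebbinghaus-style exponential forgetting curve and a simple informativeness-times-recall score; both are illustrative assumptions, not EMMA's actual formulation.

```python
import math

def retention(elapsed: float, strength: float) -> float:
    # Exponential forgetting: recall probability decays with the time
    # since the observation, more slowly for a stronger memory.
    return math.exp(-elapsed / strength)

def greedy_select(candidates, budget: int, strength: float):
    # candidates: (sample_id, informativeness, expected_response_lag).
    # Weight each sample's informativeness by the oracle's expected
    # recall at response time, then take the top `budget` samples.
    scored = [(info * retention(lag, strength), sid)
              for sid, info, lag in candidates]
    return [sid for _, sid in sorted(scored, reverse=True)[:budget]]
```

This makes the paper's headline observation concrete: with a weak memory (small `strength`), recall decays fast, so mindful selection diverges most from pure informativeness-based active learning.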
Effective Attention Modeling for Neural Relation Extraction
Title | Effective Attention Modeling for Neural Relation Extraction |
Authors | Tapas Nayak, Hwee Tou Ng |
Abstract | Relation extraction is the task of determining the relation between two entities in a sentence. Distantly-supervised models are popular for this task. However, sentences can be long and two entities can be located far from each other in a sentence. The pieces of evidence supporting the presence of a relation between two entities may not be very direct, since the entities may be connected via some indirect links such as a third entity or via co-reference. Relation extraction in such scenarios becomes more challenging as we need to capture the long-distance interactions among the entities and other words in the sentence. Also, the words in a sentence do not contribute equally in identifying the relation between the two entities. To address this issue, we propose a novel and effective attention model which incorporates syntactic information of the sentence and a multi-factor attention mechanism. Experiments on the New York Times corpus show that our proposed model outperforms prior state-of-the-art models. |
Tasks | Relation Extraction |
Published | 2019-12-09 |
URL | https://arxiv.org/abs/1912.03832v1 (PDF: https://arxiv.org/pdf/1912.03832v1.pdf) |
PWC | https://paperswithcode.com/paper/effective-attention-modeling-for-neural-1 |
Repo | https://github.com/nusnlp/MFA4RE |
Framework | pytorch |
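The multi-factor attention mechanism the abstract mentions can be caricatured as m independent attention "factors", each querying the sentence with respect to the entity pair and producing its own weighted summary, so different factors can capture different (possibly indirect) interaction patterns. The scoring and shapes below are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class MultiFactorAttention(nn.Module):
    """m attention factors over a sentence, conditioned on an entity pair."""

    def __init__(self, hidden: int, factors: int = 4):
        super().__init__()
        # One query per factor, derived from the two entity vectors.
        self.query = nn.Linear(2 * hidden, factors * hidden)

    def forward(self, tokens: torch.Tensor, e1: torch.Tensor,
                e2: torch.Tensor) -> torch.Tensor:
        # tokens: (L, H) word representations; e1, e2: (H,) entities.
        _, H = tokens.shape
        q = self.query(torch.cat([e1, e2])).view(-1, H)  # (m, H) queries
        attn = torch.softmax(q @ tokens.t(), dim=-1)     # (m, L) weights
        return attn @ tokens                             # (m, H) summaries
```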
AlphaStar: An Evolutionary Computation Perspective
Title | AlphaStar: An Evolutionary Computation Perspective |
Authors | Kai Arulkumaran, Antoine Cully, Julian Togelius |
Abstract | In January 2019, DeepMind revealed AlphaStar to the world: the first artificial intelligence (AI) system to beat a professional player at the game of StarCraft II, representing a milestone in the progress of AI. AlphaStar draws on many areas of AI research, including deep learning, reinforcement learning, game theory, and evolutionary computation (EC). In this paper we analyze AlphaStar primarily through the lens of EC, presenting a new look at the system and relating it to many concepts in the field. We highlight some of its most interesting aspects: the use of Lamarckian evolution, competitive co-evolution, and quality diversity. In doing so, we hope to provide a bridge between the wider EC community and one of the most significant AI systems developed in recent times. |
Tasks | Starcraft, Starcraft II |
Published | 2019-02-05 |
URL | https://arxiv.org/abs/1902.01724v3 (PDF: https://arxiv.org/pdf/1902.01724v3.pdf) |
PWC | https://paperswithcode.com/paper/alphastar-an-evolutionary-computation |
Repo | https://github.com/SpinazieSin/scared_citizen_simulation |
Framework | none |
A Smoother Way to Train Structured Prediction Models
Title | A Smoother Way to Train Structured Prediction Models |
Authors | Krishna Pillutla, Vincent Roulet, Sham M. Kakade, Zaid Harchaoui |
Abstract | We present a framework to train a structured prediction model by performing smoothing on the inference algorithm it builds upon. Smoothing overcomes the non-smoothness inherent to the maximum margin structured prediction objective, and paves the way for the use of fast primal gradient-based optimization algorithms. We illustrate the proposed framework by developing a novel primal incremental optimization algorithm for the structural support vector machine. The proposed algorithm blends an extrapolation scheme for acceleration and an adaptive smoothing scheme and builds upon the stochastic variance-reduced gradient algorithm. We establish its worst-case global complexity bound and study several practical variants, including extensions to deep structured prediction. We present experimental results on two real-world problems, namely named entity recognition and visual object localization. The experimental results show that the proposed framework allows us to build upon efficient inference algorithms to develop large-scale optimization algorithms for structured prediction which can achieve competitive performance on the two real-world problems. |
Tasks | Named Entity Recognition, Object Localization, Structured Prediction |
Published | 2019-02-08 |
URL | http://arxiv.org/abs/1902.03228v1 (PDF: http://arxiv.org/pdf/1902.03228v1.pdf) |
PWC | https://paperswithcode.com/paper/a-smoother-way-to-train-structured-prediction |
Repo | https://github.com/krishnap25/casimir |
Framework | none |
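The core smoothing idea in the abstract has a standard concrete form: replace the non-smooth max in the max-margin objective with an entropy-smoothed (log-sum-exp) max, which is differentiable and sandwiched between the true max and the max plus mu*log(n). A minimal worked sketch of that substitution (the acceleration and variance-reduction machinery is omitted):

```python
import torch

def smooth_max(scores: torch.Tensor, mu: float) -> torch.Tensor:
    # max_i z_i <= mu * logsumexp(z / mu) <= max_i z_i + mu * log(n),
    # so the smoothed max converges to the true max as mu -> 0 while
    # staying differentiable, enabling primal gradient-based methods.
    return mu * torch.logsumexp(scores / mu, dim=-1)

z = torch.tensor([1.0, 2.0, 3.5])
print(smooth_max(z, mu=1.0))   # ~3.77: loose but smooth
print(smooth_max(z, mu=0.01))  # ~3.50: nearly the exact max
```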
Self-attention with Functional Time Representation Learning
Title | Self-attention with Functional Time Representation Learning |
Authors | Da Xu, Chuanwei Ruan, Sushant Kumar, Evren Korpeoglu, Kannan Achan |
Abstract | Sequential modelling with self-attention has achieved cutting-edge performance in natural language processing. With advantages in model flexibility, computational complexity and interpretability, self-attention is gradually becoming a key component in event sequence models. However, like most other sequence models, self-attention does not account for the time span between events and thus captures sequential signals rather than temporal patterns. Without relying on recurrent network structures, self-attention recognizes event orderings via positional encoding. To bridge the gap between modelling time-independent and time-dependent event sequences, we introduce a functional feature map that embeds time spans into high-dimensional spaces. By constructing the associated translation-invariant time kernel function, we reveal the functional forms of the feature map under classic functional analysis results, namely Bochner's Theorem and Mercer's Theorem. We propose several models to learn the functional time representation and its interactions with event representations. These methods are evaluated on real-world datasets under various continuous-time event sequence prediction tasks. The experiments reveal that the proposed methods compare favorably to baseline models while also capturing useful time-event interactions. |
Tasks | Representation Learning |
Published | 2019-11-28 |
URL | https://arxiv.org/abs/1911.12864v1 (PDF: https://arxiv.org/pdf/1911.12864v1.pdf) |
PWC | https://paperswithcode.com/paper/self-attention-with-functional-time-1 |
Repo | https://github.com/StatsDLMathsRecomSys/Self-attention-with-Functional-Time-Representation-Learning |
Framework | tf |
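The Bochner-style construction the abstract invokes has a compact form: a translation-invariant kernel k(t1, t2) = k(t1 - t2) is realized by frequencies w_i with feature map phi(t) = [cos(w_i t), sin(w_i t)], since phi(t1) . phi(t2) = sum_i cos(w_i (t1 - t2)) depends only on the time difference. A minimal sketch with learnable frequencies (the dimension and initialization are assumptions):

```python
import torch
import torch.nn as nn

class BochnerTimeEncoding(nn.Module):
    """Functional time feature map phi(t) = [cos(w t), sin(w t)]."""

    def __init__(self, dim: int):
        super().__init__()
        assert dim % 2 == 0
        self.freqs = nn.Parameter(torch.randn(dim // 2))  # learnable w_i

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (...,) time spans -> (..., dim) embedding whose inner
        # products are translation-invariant in t, per Bochner's Theorem.
        phase = t.unsqueeze(-1) * self.freqs
        return torch.cat([torch.cos(phase), torch.sin(phase)], dim=-1)
```

These embeddings can then be fed to self-attention in place of (or alongside) positional encodings, which is the gap-bridging role the abstract describes.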