Paper Group AWR 181
A Survey on Rain Removal from Video and Single Image. Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction. Privacy-Preserving Gradient Boosting Decision Trees. Real-time Person Re-identification at the Edge: A Mixed Precision Approach. Person Re-identification in Aerial Imagery. PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation. …
A Survey on Rain Removal from Video and Single Image
Title | A Survey on Rain Removal from Video and Single Image |
Authors | Hong Wang, Yichen Wu, Minghan Li, Qian Zhao, Deyu Meng |
Abstract | Rain streaks can severely degrade the performance of video/image processing tasks. Rain removal from video or a single image has thus attracted much research attention in the field of computer vision and pattern recognition, and various methods have been proposed for this task in recent years. However, there is still no comprehensive survey paper that summarizes current rain removal methods and fairly compares their generalization performance, and in particular no off-the-shelf toolkit that accumulates recent representative methods for easy performance comparison and capability evaluation. To this end, in this study we present a comprehensive review of current rain removal methods for video and a single image. Specifically, these methods are categorized into model-driven and data-driven approaches, and the more elaborate branches of each approach are further introduced. The intrinsic capabilities, especially generalization, of representative state-of-the-art methods of each approach are evaluated and analyzed through experiments on synthetic and real data, both visually and quantitatively. Furthermore, we release a comprehensive repository, including direct links to 74 rain removal papers, source code for 9 video rain removal methods and 20 single-image rain removal methods, 19 related project pages, 6 synthetic datasets and 4 real ones, and 4 commonly used image quality metrics, to facilitate reproduction and performance comparison of existing methods for general users. Limitations and research issues worth further investigation are also discussed to guide future work in this direction. |
Tasks | Rain Removal |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08326v2 |
https://arxiv.org/pdf/1909.08326v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-rain-removal-from-video-and |
Repo | https://github.com/hongwang01/Video-and-Single-Image-Deraining |
Framework | none |
Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction
Title | Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction |
Authors | Sainath Adapa |
Abstract | Recommender systems play an essential role in music streaming services, most prominently in the form of personalized playlists. Exploring user interactions within these listening sessions can be beneficial for understanding user preferences in the context of a single session. In the ‘Spotify Sequential Skip Prediction Challenge’, WSDM and Spotify are challenging people to understand the way users sequentially interact with music. We describe our solution approach in this paper and also state proposals for further improvements to the model. The proposed model initially generates a fixed vector representation of the session, and this additional information is incorporated into an Encoder-Decoder style architecture. This method achieved seventh place in the competition, with a mean average accuracy of 0.604 on the test set. The solution code is available at https://github.com/sainathadapa/spotify-sequential-skip-prediction. |
Tasks | Recommendation Systems |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10273v1 |
http://arxiv.org/pdf/1904.10273v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-modeling-of-sessions-using |
Repo | https://github.com/sainathadapa/spotify-sequential-skip-prediction |
Framework | none |
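A minimal sketch of the idea summarized above: a session-level summary vector produced from the first (observed) half of a listening session is appended to every decoder input, and per-track skip probabilities are emitted for the second half. All module names, feature sizes, and wiring here are illustrative assumptions, not the author's exact architecture; see the linked repository for the real implementation.

```python
import torch
import torch.nn as nn

class SkipPredictor(nn.Module):
    # Illustrative encoder-decoder for sequential skip prediction.
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(feat_dim + 1, hidden, batch_first=True)   # +1 for the observed skip flag
        self.session_vec = nn.Linear(hidden, hidden)                     # fixed session representation
        self.decoder = nn.GRU(feat_dim + hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, first_half, first_skips, second_half):
        # first_half: (B, T1, F) track features with known skip labels first_skips: (B, T1)
        # second_half: (B, T2, F) track features whose skips must be predicted
        enc_in = torch.cat([first_half, first_skips.unsqueeze(-1)], dim=-1)
        _, h = self.encoder(enc_in)                          # h: (1, B, hidden)
        sess = torch.tanh(self.session_vec(h[-1]))           # fixed session vector, (B, hidden)
        sess_rep = sess.unsqueeze(1).expand(-1, second_half.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([second_half, sess_rep], dim=-1), h)
        return torch.sigmoid(self.out(dec_out)).squeeze(-1)  # per-track skip probability, (B, T2)
```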
Privacy-Preserving Gradient Boosting Decision Trees
Title | Privacy-Preserving Gradient Boosting Decision Trees |
Authors | Qinbin Li, Zhaomin Wu, Zeyi Wen, Bingsheng He |
Abstract | The Gradient Boosting Decision Tree (GBDT) has been a popular machine learning model for various tasks in recent years. In this paper, we study how to improve the model accuracy of GBDT while preserving the strong guarantee of differential privacy. Sensitivity and privacy budget are two key design aspects for the effectiveness of differentially private models. Existing solutions for GBDT with differential privacy suffer from significant accuracy loss due to overly loose sensitivity bounds and ineffective privacy budget allocations (especially across different trees in the GBDT model). Loose sensitivity bounds require injecting more noise to achieve a fixed privacy level, and ineffective privacy budget allocations worsen the accuracy loss, especially when the number of trees is large. Therefore, we propose a new GBDT training algorithm that achieves tighter sensitivity bounds and more effective noise allocations. Specifically, by investigating the properties of the gradients and the contribution of each tree in GBDTs, we propose to adaptively control the gradients of the training data at each iteration, together with leaf-node clipping, in order to tighten the sensitivity bounds. Furthermore, we design a novel boosting framework to allocate the privacy budget between trees so that the accuracy loss can be further reduced. Our experiments show that our approach achieves much better model accuracy than other baselines. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04209v2 |
https://arxiv.org/pdf/1911.04209v2.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-gradient-boosting-decision |
Repo | https://github.com/Xtra-Computing/PrivML |
Framework | none |
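To make the sensitivity discussion above concrete, here is a toy sketch of the two mechanisms the abstract mentions: clipping per-example gradients so a single example can only shift a leaf value by a bounded amount, and adding Laplace noise calibrated to that bound. The clipping bound, leaf formula, and sensitivity expression are illustrative assumptions, not the paper's exact analysis or its per-tree budget allocation.

```python
import numpy as np

def private_leaf_value(gradients, clip_bound, epsilon, lam=1.0, rng=np.random):
    # Clip per-example gradients to [-clip_bound, clip_bound] so the influence of any
    # single example on the leaf value is bounded (tighter sensitivity), then add
    # Laplace noise scaled to that bound to satisfy epsilon-differential privacy
    # for this leaf. The formulas are illustrative, not the paper's exact bounds.
    g = np.clip(np.asarray(gradients, dtype=float), -clip_bound, clip_bound)
    leaf = -g.sum() / (len(g) + lam)              # standard GBDT-style leaf value
    sensitivity = clip_bound / (1.0 + lam)        # assumed worst-case per-example change
    return leaf + rng.laplace(scale=sensitivity / epsilon)
```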
Real-time Person Re-identification at the Edge: A Mixed Precision Approach
Title | Real-time Person Re-identification at the Edge: A Mixed Precision Approach |
Authors | Mohammadreza Baharani, Shrey Mohan, Hamed Tabkhi |
Abstract | A critical part of multi-person multi-camera tracking is the person re-identification (re-ID) algorithm, which recognizes and retains the identities of all detected unknown people throughout the video stream. Many re-ID algorithms today achieve state-of-the-art results, but little work has been done to explore the deployment of such algorithms in computation- and power-constrained real-time scenarios. In this paper, we study the effect of using a lightweight model, MobileNet-v2, for re-ID and investigate the impact of single (FP32) versus half (FP16) precision for training on the server and inference on edge nodes. We further compare the results with a baseline model that uses ResNet-50 on state-of-the-art benchmarks, including CUHK03, Market-1501, and Duke-MTMC. Mixed-precision training of MobileNet-V2 improves inference throughput on the edge node by $3.25\times$ (reaching 27.77 fps) and training time on the server by $1.75\times$, and decreases power consumption on the edge node by $1.45\times$, while degrading accuracy by only 5.6% relative to single-precision ResNet-50, on average over three different datasets. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid. |
Tasks | Person Re-Identification |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07842v1 |
https://arxiv.org/pdf/1908.07842v1.pdf | |
PWC | https://paperswithcode.com/paper/190807842 |
Repo | https://github.com/TeCSAR-UNCC/person-reid |
Framework | pytorch |
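The half-precision training described above can be reproduced generically with PyTorch's automatic mixed precision. The snippet below is a standard AMP loop, not the authors' exact training script; their models, losses, and hyperparameters are in the linked repository.

```python
import torch

def train_mixed_precision(model, loader, optimizer, criterion, device="cuda"):
    # Generic FP16/FP32 mixed-precision loop: forward passes run under autocast,
    # and GradScaler rescales the loss to avoid FP16 gradient underflow.
    scaler = torch.cuda.amp.GradScaler()
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```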
Person Re-identification in Aerial Imagery
Title | Person Re-identification in Aerial Imagery |
Authors | Shizhou Zhang, Qi Zhang, Yifei Yang, Xing Wei, Peng Wang, Bingliang Jiao, Yanning Zhang |
Abstract | Nowadays, with the rapid development of consumer Unmanned Aerial Vehicles (UAVs), visual surveillance from UAV platforms has become very attractive. Most research on UAV-captured visual data has focused on the tasks of object detection and tracking, while limited attention has been paid to person re-identification (ReID), which has been widely studied for ordinary surveillance cameras with fixed emplacements. In this paper, to facilitate research on person ReID in aerial imagery, we collect a large-scale airborne person ReID dataset named Person ReID for Aerial Imagery (PRAI-1581), which consists of 39,461 images of 1,581 person identities. The images of the dataset are shot by two DJI consumer UAVs flying at altitudes ranging from 20 to 60 meters above the ground, which covers most real UAV surveillance scenarios. In addition, we propose to utilize subspace pooling of convolution feature maps to represent the input person images. Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained end-to-end efficiently. We conduct extensive experiments on the proposed dataset, and the experimental results demonstrate that re-identifying persons in aerial imagery is a challenging problem, on which our method performs favorably against the state of the art. |
Tasks | Object Detection, Person Re-Identification |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05024v2 |
https://arxiv.org/pdf/1908.05024v2.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-in-aerial-imagery |
Repo | https://github.com/stormyoung/PRAI-1581 |
Framework | none |
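One plausible reading of the "subspace pooling of convolution feature maps" mentioned above, sketched below: flatten the spatial dimensions of a feature map and keep its top singular vectors as an orthonormal basis of the dominant channel subspace. The value of k and any normalization or matching details are assumptions of this sketch; the paper defines the exact formulation.

```python
import torch

def subspace_pooling(feat, k=4):
    # feat: (C, H, W) convolutional feature map of one person image.
    # Reshape to a C x (H*W) matrix and keep its top-k left singular vectors,
    # i.e. an orthonormal basis of the dominant feature subspace, as a compact
    # descriptor for re-identification matching.
    C, H, W = feat.shape
    X = feat.reshape(C, H * W)
    U, S, Vh = torch.linalg.svd(X, full_matrices=False)
    return U[:, :k]                     # (C, k) descriptor
```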
PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation
Title | PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation |
Authors | Ruixuan Luo, Jingjing Xu, Yi Zhang, Xuancheng Ren, Xu Sun |
Abstract | Chinese word segmentation (CWS) is a fundamental step in Chinese natural language processing. In this paper, we build a new toolkit, named PKUSEG, for multi-domain word segmentation. Unlike existing single-model toolkits, PKUSEG targets multi-domain word segmentation and provides separate models for different domains, such as web, medicine, and tourism. The new toolkit also supports POS tagging and model training to adapt to various application scenarios. Experiments show that PKUSEG achieves high performance on multiple domains. The toolkit is now freely and publicly available for research and industrial use. |
Tasks | Chinese Word Segmentation |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11455v2 |
https://arxiv.org/pdf/1906.11455v2.pdf | |
PWC | https://paperswithcode.com/paper/pkuseg-a-toolkit-for-multi-domain-chinese |
Repo | https://github.com/lancopku/pkuseg-python |
Framework | none |
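A minimal usage example for the toolkit, following the README of the linked repository; the exact argument names (the domain model selector and the POS-tagging flag) should be checked against the current release.

```python
import pkuseg

# Default (multi-domain mixed) model
seg = pkuseg.pkuseg()
print(seg.cut("我爱北京天安门"))          # -> ['我', '爱', '北京', '天安门']

# Domain-specific model with POS tagging (the model is downloaded on first use)
seg_med = pkuseg.pkuseg(model_name="medicine", postag=True)
print(seg_med.cut("患者出现发热和咳嗽症状"))
```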
Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks
Title | Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks |
Authors | Zhonghui You, Kun Yan, Jinmian Ye, Meng Ma, Ping Wang |
Abstract | Filter pruning is one of the most effective ways to accelerate and compress convolutional neural networks (CNNs). In this work, we propose a global filter pruning algorithm called Gate Decorator, which transforms a vanilla CNN module by multiplying its output by channel-wise scaling factors, i.e., gates. When a scaling factor is set to zero, it is equivalent to removing the corresponding filter. We use a Taylor expansion to estimate the change in the loss function caused by setting a scaling factor to zero and use this estimate for the global filter importance ranking. We then prune the network by removing the unimportant filters. After pruning, we merge all the scaling factors back into their original modules, so no special operations or structures are introduced. Moreover, we propose an iterative pruning framework called Tick-Tock to improve pruning accuracy. Extensive experiments demonstrate the effectiveness of our approach. For example, we achieve a state-of-the-art pruning ratio on ResNet-56 by reducing FLOPs by 70% without noticeable loss in accuracy. For ResNet-50 on ImageNet, our pruned model with a 40% FLOPs reduction outperforms the baseline model by 0.31% in top-1 accuracy. Various datasets are used, including CIFAR-10, CIFAR-100, CUB-200, ImageNet ILSVRC-12, and PASCAL VOC 2011. Code is available at github.com/youzhonghui/gate-decorator-pruning |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08174v1 |
https://arxiv.org/pdf/1909.08174v1.pdf | |
PWC | https://paperswithcode.com/paper/gate-decorator-global-filter-pruning-method |
Repo | https://github.com/youzhonghui/gate-decorator-pruning |
Framework | pytorch |
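A minimal sketch of the gate mechanism described above: a channel-wise scaling module whose first-order Taylor term, |gate * dL/dgate|, serves as the global importance score of the corresponding filter. Names and structure here are illustrative; the Tick-Tock pruning schedule and the gate-merging step are in the linked repository.

```python
import torch
import torch.nn as nn

class GateDecorator(nn.Module):
    # Multiplies a layer's output by learnable channel-wise gates; setting a gate
    # to zero is equivalent to removing the corresponding filter.
    def __init__(self, num_channels):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x):                       # x: (N, C, H, W)
        return x * self.gate.view(1, -1, 1, 1)

    def importance(self):
        # First-order Taylor estimate of the loss change from zeroing each gate,
        # |g * dL/dg|, read off after loss.backward() on a calibration batch.
        return (self.gate.detach() * self.gate.grad.detach()).abs()
```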
$L_0$-ARM: Network Sparsification via Stochastic Binary Optimization
Title | $L_0$-ARM: Network Sparsification via Stochastic Binary Optimization |
Authors | Yang Li, Shihao Ji |
Abstract | We consider network sparsification as an $L_0$-norm regularized binary optimization problem, where each unit of a neural network (e.g., weight, neuron, or channel) is attached to a stochastic binary gate whose parameters are jointly optimized with the original network parameters. The Augment-Reinforce-Merge (ARM) estimator, a recently proposed unbiased gradient estimator, is investigated for this binary optimization problem. Compared to the hard concrete gradient estimator of Louizos et al., ARM demonstrates superior performance in pruning network architectures while retaining almost the same accuracy as the baseline methods. Similar to the hard concrete estimator, ARM also enables conditional computation during model training, but with improved effectiveness due to its exact binary stochasticity. Thanks to the flexibility of ARM, many smooth or non-smooth parametric functions, such as the scaled sigmoid or hard sigmoid, can be used to parameterize this binary optimization problem while the unbiasedness of the ARM estimator is retained, whereas the hard concrete estimator has to rely on the hard sigmoid function to achieve conditional computation and thus accelerated training. Extensive experiments on multiple public datasets demonstrate state-of-the-art pruning rates with almost the same accuracy as the baseline methods. The resulting algorithm, $L_0$-ARM, sparsifies the Wide-ResNet models on CIFAR-10 and CIFAR-100, which the hard concrete estimator cannot do. The code is publicly available at https://github.com/leo-yangli/l0-arm. |
Tasks | |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.04432v3 |
https://arxiv.org/pdf/1904.04432v3.pdf | |
PWC | https://paperswithcode.com/paper/l_0-arm-network-sparsification-via-stochastic |
Repo | https://github.com/leo-yangli/l0-arm |
Framework | pytorch |
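The core ARM estimator referenced above, in self-contained form: an unbiased gradient of $E_{z\sim\mathrm{Bernoulli}(\sigma(\phi))}[f(z)]$ with respect to the gate logits $\phi$, built from two antithetic forward evaluations of $f$ with no backpropagation through the binary gates. Wiring this into the full $L_0$ sparsification objective (one gate per weight/neuron/channel plus the $L_0$ penalty) follows the paper and repository.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def arm_gradient(f, phi, rng=np.random):
    # Augment-Reinforce-Merge estimator:
    #   d/dphi E_{z ~ Bernoulli(sigmoid(phi))}[f(z)]
    #     = E_{u ~ Uniform(0,1)}[(f(1[u > sigmoid(-phi)]) - f(1[u < sigmoid(phi)])) * (u - 1/2)]
    # f maps a binary gate vector to a scalar loss; phi holds the gate logits.
    u = rng.uniform(size=np.shape(phi))
    z_aug = (u > sigmoid(-phi)).astype(float)   # "augment" arm
    z_ref = (u < sigmoid(phi)).astype(float)    # "reinforce" arm
    return (f(z_aug) - f(z_ref)) * (u - 0.5)    # "merge"
```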
Computation of Circular Area and Spherical Volume Invariants via Boundary Integrals
Title | Computation of Circular Area and Spherical Volume Invariants via Boundary Integrals |
Authors | Riley O’Neill, Pedro Angulo-Umana, Jeff Calder, Bo Hessburg, Peter J. Olver, Chehrzad Shakiban, Katrina Yezzi-Woodley |
Abstract | We show how to compute the circular area invariant of planar curves, and the spherical volume invariant of surfaces, in terms of line and surface integrals, respectively. We use the Divergence Theorem to express the area and volume integrals as line and surface integrals, respectively, against particular kernels; our results also extend to higher dimensional hypersurfaces. The resulting surface integrals are computable analytically on a triangulated mesh. This gives a simple computational algorithm for computing the spherical volume invariant for triangulated surfaces that does not involve discretizing the ambient space. We discuss potential applications to feature detection on broken bone fragments of interest in anthropology. |
Tasks | |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02176v1 |
https://arxiv.org/pdf/1905.02176v1.pdf | |
PWC | https://paperswithcode.com/paper/computation-of-circular-area-and-spherical |
Repo | https://github.com/jwcalder/Spherical-Volume-Invariant |
Framework | none |
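The first step of the construction described above can be stated compactly. For a planar region $\Omega$ with boundary curve $\partial\Omega$, the circular area invariant at a boundary point $x$ is the area $|B_r(x)\cap\Omega|$; choosing the vector field $F(y)=\tfrac12(y-x)$, which has unit divergence, the Divergence Theorem converts this area into a boundary integral (the paper then reduces it further to an integral over $\partial\Omega$ alone against a particular kernel, and treats the 3D spherical volume invariant analogously):

$$
|B_r(x)\cap\Omega| \;=\; \int_{B_r(x)\cap\Omega} \nabla\cdot F \, dA
\;=\; \frac{1}{2}\int_{\partial\left(B_r(x)\cap\Omega\right)} (y-x)\cdot n(y)\, ds(y).
$$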
Feature2Vec: Distributional semantic modelling of human property knowledge
Title | Feature2Vec: Distributional semantic modelling of human property knowledge |
Authors | Steven Derby, Paul Miller, Barry Devereux |
Abstract | Feature norm datasets of human conceptual knowledge, collected in surveys of human volunteers, yield highly interpretable models of word meaning and play an important role in neurolinguistic research on semantic cognition. However, these datasets are limited in size due to practical obstacles associated with exhaustively listing properties for a large number of words. In contrast, the development of distributional modelling techniques and the availability of vast text corpora have allowed researchers to construct effective vector space models of word meaning over large lexicons. However, this comes at the cost of interpretable, human-like information about word meaning. We propose a method for mapping human property knowledge onto a distributional semantic space, which adapts the word2vec architecture to the task of modelling concept features. Our approach gives a measure of concept and feature affinity in a single semantic space, which makes for easy and efficient ranking of candidate human-derived semantic properties for arbitrary words. We compare our model with a previous approach, and show that it performs better on several evaluation tasks. Finally, we discuss how our method could be used to develop efficient sampling techniques to extend existing feature norm datasets in a reliable way. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11439v1 |
https://arxiv.org/pdf/1908.11439v1.pdf | |
PWC | https://paperswithcode.com/paper/feature2vec-distributional-semantic-modelling |
Repo | https://github.com/stevend94/Feature2Vec |
Framework | tf |
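A rough sketch of the mapping idea above: hold pretrained concept (word) vectors fixed and learn one vector per property-norm feature in the same space with a skip-gram-style negative-sampling objective, so that concept-feature affinity becomes a dot product. The update rule, sampling scheme, and hyperparameters below are assumptions for illustration; the paper's training setup may differ.

```python
import numpy as np

def train_feature_vectors(pairs, concept_vecs, dim, n_neg=5, lr=0.05, epochs=5, rng=np.random):
    # pairs: list of (concept, feature) tuples from a property-norm dataset.
    # concept_vecs: dict mapping concept -> pretrained (frozen) word embedding, shape (dim,).
    # Returns a learned vector per feature, living in the same space as the concepts.
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    features = sorted({f for _, f in pairs})
    fvecs = {f: rng.normal(scale=0.1, size=dim) for f in features}
    concepts = list(concept_vecs)
    for _ in range(epochs):
        for concept, feat in pairs:
            c = concept_vecs[concept]
            fvecs[feat] += lr * (1.0 - sigmoid(c @ fvecs[feat])) * c          # positive pair
            for _ in range(n_neg):                                            # negative samples
                cn = concept_vecs[rng.choice(concepts)]
                fvecs[feat] -= lr * sigmoid(cn @ fvecs[feat]) * cn
    return fvecs
```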
Rewarding Coreference Resolvers for Being Consistent with World Knowledge
Title | Rewarding Coreference Resolvers for Being Consistent with World Knowledge |
Authors | Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Hershcovich, Chen Qiu, Anders Sandholm, Michael Ringaard, Anders Søgaard |
Abstract | Unresolved coreference is a bottleneck for relation extraction, and high-quality coreference resolvers may produce an output that makes it much easier to extract knowledge triples. We show how to improve coreference resolvers by forwarding their input to a relation extraction system and rewarding the resolvers for producing triples that are found in knowledge bases. Since relation extraction systems can rely on different forms of supervision and be biased in different ways, we obtain the best performance, improving over the state of the art, using multi-task reinforcement learning. |
Tasks | Relation Extraction |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02392v2 |
https://arxiv.org/pdf/1909.02392v2.pdf | |
PWC | https://paperswithcode.com/paper/rewarding-coreference-resolvers-for-being |
Repo | https://github.com/rahular/coref-rl |
Framework | tf |
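The reward loop described above, reduced to a pseudocode-style sketch: sample a clustering from the resolver, run relation extraction on the resolved document, and use the number of extracted triples found in the knowledge base as a REINFORCE reward. The `resolver`, `relation_extractor`, and `knowledge_base` interfaces are hypothetical placeholders, not the authors' API.

```python
def reinforce_step(resolver, relation_extractor, knowledge_base, document, optimizer, baseline=0.0):
    # Policy-gradient update: the coreference resolver is rewarded when the
    # downstream relation extractor produces triples that exist in the KB.
    clusters, log_prob = resolver.sample(document)      # stochastic clustering and its log-probability
    triples = relation_extractor(document, clusters)
    reward = sum(1.0 for t in triples if t in knowledge_base)
    loss = -(reward - baseline) * log_prob              # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```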
Kernel Mode Decomposition and programmable/interpretable regression networks
Title | Kernel Mode Decomposition and programmable/interpretable regression networks |
Authors | Houman Owhadi, Clint Scovel, Gene Ryan Yoo |
Abstract | Mode decomposition is a prototypical pattern recognition problem that can be addressed from the (a priori distinct) perspectives of numerical approximation, statistical inference and deep learning. Could its analysis through these combined perspectives be used as a Rosetta stone for deciphering mechanisms at play in deep learning? Motivated by this question we introduce programmable and interpretable regression networks for pattern recognition and address mode decomposition as a prototypical problem. The programming of these networks is achieved by assembling elementary modules decomposing and recomposing kernels and data. These elementary steps are repeated across levels of abstraction and interpreted from the equivalent perspectives of optimal recovery, game theory and Gaussian process regression (GPR). The prototypical mode/kernel decomposition module produces an optimal approximation $(w_1,w_2,\ldots,w_m)$ of an element $(v_1,v_2,\ldots,v_m)$ of a product of Hilbert subspaces of a common Hilbert space from the observation of the sum $v:=v_1+\cdots+v_m$. The prototypical mode/kernel recomposition module performs partial sums of the recovered modes $w_i$ based on the alignment between each recovered mode $w_i$ and the data $v$. We illustrate the proposed framework by programming regression networks approximating the modes $v_i= a_i(t)y_i\big(\theta_i(t)\big)$ of a (possibly noisy) signal $\sum_i v_i$ when the amplitudes $a_i$, instantaneous phases $\theta_i$ and periodic waveforms $y_i$ may all be unknown, and show near-machine-precision recovery under regularity and separation assumptions on the instantaneous amplitudes $a_i$ and frequencies $\dot{\theta}_i$. The structure of some of these networks shares intriguing similarities with convolutional neural networks while being interpretable, programmable and amenable to theoretical analysis. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08592v1 |
https://arxiv.org/pdf/1907.08592v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-mode-decomposition-and |
Repo | https://github.com/kernelmodedec/Kernel-Mode-Decomposition-1D |
Framework | tf |
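The "prototypical mode/kernel decomposition module" described above has a closed form under the GPR interpretation: if the modes are modelled as independent centred Gaussian processes $\xi_i\sim\mathcal{N}(0,K_i)$ and only their sum $v = v_1+\cdots+v_m$ is observed, the optimal-recovery / posterior-mean estimate of each mode is

$$
w_i \;=\; K_i\Big(\sum_{j=1}^{m} K_j\Big)^{-1} v, \qquad i=1,\dots,m,
$$

i.e. each kernel explains the share of the observed signal aligned with it; the recomposition module then partially sums the recovered $w_i$ according to their alignment with $v$. This is the standard minimum-norm/GPR solution; the hierarchical assembly of such modules across levels of abstraction is the paper's contribution.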
Treant: Training Evasion-Aware Decision Trees
Title | Treant: Training Evasion-Aware Decision Trees |
Authors | Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando |
Abstract | Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and interpretable, decision tree ensembles have received only limited attention by the security and machine learning communities so far, leading to a sub-optimal state of the art for adversarial learning techniques. We thus propose Treant, a novel decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. Treant is based on two key technical ingredients: robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on three publicly available datasets show that Treant is able to generate decision tree ensembles that are at the same time accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01197v2 |
https://arxiv.org/pdf/1907.01197v2.pdf | |
PWC | https://paperswithcode.com/paper/treant-training-evasion-aware-decision-trees |
Repo | https://github.com/gtolomei/treant |
Framework | none |
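A deliberately simplified illustration of the "robust splitting" idea above: when scoring a candidate threshold, instances the attacker can push across it are charged to whichever side of the split increases the loss more. This greedy, per-instance worst case is only a caricature of Treant's evasion-aware loss minimization, which optimizes the attacker's placement jointly under a formal threat model; treat it as intuition, not the algorithm.

```python
def robust_split_loss(values, labels, threshold, attack_radius, leaf_loss):
    # values/labels: 1-D feature values and labels of the instances reaching this node.
    # Instances within attack_radius of the threshold are "attackable": the adversary
    # may move them to either side, so each is charged, one at a time, to the side
    # where it increases the loss the most.
    left, right, attackable = [], [], []
    for v, y in zip(values, labels):
        if abs(v - threshold) <= attack_radius:
            attackable.append(y)
        elif v <= threshold:
            left.append(y)
        else:
            right.append(y)
    loss = leaf_loss(left) + leaf_loss(right)
    for y in attackable:
        loss += max(leaf_loss(left + [y]) - leaf_loss(left),
                    leaf_loss(right + [y]) - leaf_loss(right))
    return loss
```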
Head-Driven Phrase Structure Grammar Parsing on Penn Treebank
Title | Head-Driven Phrase Structure Grammar Parsing on Penn Treebank |
Authors | Junru Zhou, Hai Zhao |
Abstract | Head-driven phrase structure grammar (HPSG) enjoys a uniform formalism representing rich contextual syntactic and even semantic meanings. This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into head-driven phrase structure. Two parsing algorithms are then proposed for the two converted tree representations, division span and joint span. As HPSG encodes both constituent and dependency structure information, the proposed HPSG parsers may be regarded as a kind of joint decoder for both types of structures, and they are thus evaluated in terms of extracted or converted constituent and dependency parse trees. Our parser achieves new state-of-the-art performance for both parsing tasks on the Penn Treebank (PTB) and the Chinese Penn Treebank, verifying the effectiveness of jointly learning constituent and dependency structures. In detail, we report 96.33 F1 for constituent parsing and 97.20% UAS for dependency parsing on PTB. |
Tasks | Dependency Parsing |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02684v3 |
https://arxiv.org/pdf/1907.02684v3.pdf | |
PWC | https://paperswithcode.com/paper/head-driven-phrase-structure-grammar-parsing |
Repo | https://github.com/DoodleJZ/HPSG-Neural-Parser |
Framework | pytorch |
Modeling Major Transitions in Evolution with the Game of Life
Title | Modeling Major Transitions in Evolution with the Game of Life |
Authors | Peter D. Turney |
Abstract | Maynard Smith and Szathmáry’s book, The Major Transitions in Evolution, describes eight major events in the evolution of life on Earth and identifies a common theme that unites these events. In each event, smaller entities came together to form larger entities, which can be described as symbiosis or cooperation. Here we present a computational simulation of evolving entities that includes symbiosis with shifting levels of selection. In the simulation, the fitness of an entity is measured by a series of one-on-one competitions in the Immigration Game, a two-player variation of Conway’s Game of Life. Mutation, reproduction, and symbiosis are implemented as operations that are external to the Immigration Game. Because these operations are external to the game, we are able to freely manipulate the operations and observe the effects of the manipulations. The simulation is composed of four layers, each layer building on the previous layer. The first layer implements a simple form of asexual reproduction, the second layer introduces a more sophisticated form of asexual reproduction, the third layer adds sexual reproduction, and the fourth layer adds symbiosis. The experiments show that a small amount of symbiosis, added to the other layers, significantly increases the fitness of the population. We suggest that, in addition to providing new insights into biological and cultural evolution, this model of symbiosis may have practical applications in evolutionary computation, such as in the task of learning deep neural network models. |
Tasks | |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07034v1 |
https://arxiv.org/pdf/1908.07034v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-major-transitions-in-evolution-with |
Repo | https://github.com/pdturney/modeling-major-transitions |
Framework | none |
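For reference, one generation of the Immigration Game used as the fitness substrate above: standard Conway birth/survival rules, with a newborn cell taking the colour of the majority of its three live parent neighbours (with exactly three parents there is never a tie). The grid encoding (0 = dead, 1/2 = the two colours) and the zero-padded boundary are assumptions of this sketch; the simulation layers, fitness tournaments, and symbiosis operators are in the linked repository.

```python
import numpy as np

def immigration_step(grid):
    # grid: 2-D int array, 0 = dead, 1 or 2 = live cell of either colour.
    alive = (grid > 0).astype(int)
    c1 = (grid == 1).astype(int)
    c2 = (grid == 2).astype(int)

    def nsum(a):
        # 8-neighbour sum with a dead (zero) boundary.
        p = np.pad(a, 1)
        return sum(np.roll(np.roll(p, i, 0), j, 1)
                   for i in (-1, 0, 1) for j in (-1, 0, 1)
                   if (i, j) != (0, 0))[1:-1, 1:-1]

    n_alive, n1, n2 = nsum(alive), nsum(c1), nsum(c2)
    survive = (alive == 1) & ((n_alive == 2) | (n_alive == 3))
    born = (alive == 0) & (n_alive == 3)
    new = np.where(survive, grid, 0)          # survivors keep their colour
    new = np.where(born & (n1 > n2), 1, new)  # births take the majority parent colour
    new = np.where(born & (n2 > n1), 2, new)
    return new
```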