Paper Group AWR 181
A Survey on Rain Removal from Video and Single Image. Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction. Privacy-Preserving Gradient Boosting Decision Trees. Real-time Person Re-identification at the Edge: A Mixed Precision Approach. Person Re-identification in Aerial Imagery. PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation. …
A Survey on Rain Removal from Video and Single Image
Title | A Survey on Rain Removal from Video and Single Image |
Authors | Hong Wang, Yichen Wu, Minghan Li, Qian Zhao, Deyu Meng |
Abstract | Rain streaks can severely degrade the performance of video/image processing tasks. Rain removal from video or a single image has thus attracted much research attention in the field of computer vision and pattern recognition, and various methods have been proposed for this task in recent years. However, there is still no comprehensive survey paper that summarizes current rain removal methods and fairly compares their generalization performance, and in particular no off-the-shelf toolkit that accumulates recent representative methods for easy performance comparison and capability evaluation. To this end, in this study we present a comprehensive review of current rain removal methods for video and a single image. Specifically, these methods are categorized into model-driven and data-driven approaches, and the more elaborate branches of each approach are further introduced. The intrinsic capabilities, especially generalization, of representative state-of-the-art methods of each approach are evaluated and analyzed through experiments on synthetic and real data, both visually and quantitatively. Furthermore, we release a comprehensive repository, including direct links to 74 rain removal papers, source code for 9 video rain removal methods and 20 single-image rain removal methods, 19 related project pages, 6 synthetic datasets and 4 real ones, and 4 commonly used image quality metrics, to facilitate reproduction and performance comparison of existing methods for general users. Limitations and research issues worth further investigation are also discussed to guide future work in this direction. |
Tasks | Rain Removal |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08326v2 |
https://arxiv.org/pdf/1909.08326v2.pdf | |
PWC | https://paperswithcode.com/paper/a-survey-on-rain-removal-from-video-and |
Repo | https://github.com/hongwang01/Video-and-Single-Image-Deraining |
Framework | none |
Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction
Title | Sequential modeling of Sessions using Recurrent Neural Networks for Skip Prediction |
Authors | Sainath Adapa |
Abstract | Recommender systems play an essential role in music streaming services, most prominently in the form of personalized playlists. Exploring user interactions within these listening sessions can be beneficial for understanding user preferences in the context of a single session. In the ‘Spotify Sequential Skip Prediction Challenge’, WSDM and Spotify are challenging people to understand the way users sequentially interact with music. We describe our solution approach in this paper and also state proposals for further improvements to the model. The proposed model initially generates a fixed vector representation of the session, and this additional information is incorporated into an Encoder-Decoder style architecture. This method achieved seventh place in the competition, with a mean average accuracy of 0.604 on the test set. The solution code is available at https://github.com/sainathadapa/spotify-sequential-skip-prediction. |
Tasks | Recommendation Systems |
Published | 2019-04-23 |
URL | http://arxiv.org/abs/1904.10273v1 |
http://arxiv.org/pdf/1904.10273v1.pdf | |
PWC | https://paperswithcode.com/paper/sequential-modeling-of-sessions-using |
Repo | https://github.com/sainathadapa/spotify-sequential-skip-prediction |
Framework | none |
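A minimal sketch of the idea summarized above: a session-level summary vector produced from the first (observed) half of a listening session is appended to every decoder input, and per-track skip probabilities are emitted for the second half. All module names, feature sizes, and wiring here are illustrative assumptions, not the author's exact architecture; see the linked repository for the real implementation.

```python
import torch
import torch.nn as nn

class SkipPredictor(nn.Module):
    # Illustrative encoder-decoder for sequential skip prediction.
    def __init__(self, feat_dim, hidden=64):
        super().__init__()
        self.encoder = nn.GRU(feat_dim + 1, hidden, batch_first=True)   # +1 for the observed skip flag
        self.session_vec = nn.Linear(hidden, hidden)                     # fixed session representation
        self.decoder = nn.GRU(feat_dim + hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, first_half, first_skips, second_half):
        # first_half: (B, T1, F) track features with known skip labels first_skips: (B, T1)
        # second_half: (B, T2, F) track features whose skips must be predicted
        enc_in = torch.cat([first_half, first_skips.unsqueeze(-1)], dim=-1)
        _, h = self.encoder(enc_in)                          # h: (1, B, hidden)
        sess = torch.tanh(self.session_vec(h[-1]))           # fixed session vector, (B, hidden)
        sess_rep = sess.unsqueeze(1).expand(-1, second_half.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([second_half, sess_rep], dim=-1), h)
        return torch.sigmoid(self.out(dec_out)).squeeze(-1)  # per-track skip probability, (B, T2)
```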
Privacy-Preserving Gradient Boosting Decision Trees
Title | Privacy-Preserving Gradient Boosting Decision Trees |
Authors | Qinbin Li, Zhaomin Wu, Zeyi Wen, Bingsheng He |
Abstract | The Gradient Boosting Decision Tree (GBDT) has been a popular machine learning model for various tasks in recent years. In this paper, we study how to improve the model accuracy of GBDT while preserving the strong guarantee of differential privacy. Sensitivity and privacy budget are two key design aspects for the effectiveness of differentially private models. Existing solutions for GBDT with differential privacy suffer from significant accuracy loss due to overly loose sensitivity bounds and ineffective privacy budget allocations (especially across different trees in the GBDT model). Loose sensitivity bounds require injecting more noise to achieve a fixed privacy level, and ineffective privacy budget allocations worsen the accuracy loss, especially when the number of trees is large. Therefore, we propose a new GBDT training algorithm that achieves tighter sensitivity bounds and more effective noise allocations. Specifically, by investigating the properties of the gradients and the contribution of each tree in GBDTs, we propose to adaptively control the gradients of the training data at each iteration, together with leaf-node clipping, in order to tighten the sensitivity bounds. Furthermore, we design a novel boosting framework to allocate the privacy budget between trees so that the accuracy loss can be further reduced. Our experiments show that our approach achieves much better model accuracy than other baselines. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04209v2 |
https://arxiv.org/pdf/1911.04209v2.pdf | |
PWC | https://paperswithcode.com/paper/privacy-preserving-gradient-boosting-decision |
Repo | https://github.com/Xtra-Computing/PrivML |
Framework | none |
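To make the sensitivity discussion above concrete, here is a toy sketch of the two mechanisms the abstract mentions: clipping per-example gradients so a single example can only shift a leaf value by a bounded amount, and adding Laplace noise calibrated to that bound. The clipping bound, leaf formula, and sensitivity expression are illustrative assumptions, not the paper's exact analysis or its per-tree budget allocation.

```python
import numpy as np

def private_leaf_value(gradients, clip_bound, epsilon, lam=1.0, rng=np.random):
    # Clip per-example gradients to [-clip_bound, clip_bound] so the influence of any
    # single example on the leaf value is bounded (tighter sensitivity), then add
    # Laplace noise scaled to that bound to satisfy epsilon-differential privacy
    # for this leaf. The formulas are illustrative, not the paper's exact bounds.
    g = np.clip(np.asarray(gradients, dtype=float), -clip_bound, clip_bound)
    leaf = -g.sum() / (len(g) + lam)              # standard GBDT-style leaf value
    sensitivity = clip_bound / (1.0 + lam)        # assumed worst-case per-example change
    return leaf + rng.laplace(scale=sensitivity / epsilon)
```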
Real-time Person Re-identification at the Edge: A Mixed Precision Approach
Title | Real-time Person Re-identification at the Edge: A Mixed Precision Approach |
Authors | Mohammadreza Baharani, Shrey Mohan, Hamed Tabkhi |
Abstract | A critical part of multi-person multi-camera tracking is the person re-identification (re-ID) algorithm, which recognizes and retains the identities of all detected unknown people throughout the video stream. Many re-ID algorithms today achieve state-of-the-art results, but little work has been done to explore the deployment of such algorithms in computation- and power-constrained real-time scenarios. In this paper, we study the effect of using a lightweight model, MobileNet-v2, for re-ID and investigate the impact of single (FP32) versus half (FP16) precision for training on the server and inference on edge nodes. We further compare the results with a baseline model that uses ResNet-50 on state-of-the-art benchmarks, including CUHK03, Market-1501, and Duke-MTMC. Mixed-precision training of MobileNet-V2 improves inference throughput on the edge node by $3.25\times$ (reaching 27.77 fps) and training time on the server by $1.75\times$, and decreases power consumption on the edge node by $1.45\times$, while degrading accuracy by only 5.6% relative to single-precision ResNet-50, on average over three different datasets. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid. |
Tasks | Person Re-Identification |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07842v1 |
https://arxiv.org/pdf/1908.07842v1.pdf | |
PWC | https://paperswithcode.com/paper/190807842 |
Repo | https://github.com/TeCSAR-UNCC/person-reid |
Framework | pytorch |
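The half-precision training described above can be reproduced generically with PyTorch's automatic mixed precision. The snippet below is a standard AMP loop, not the authors' exact training script; their models, losses, and hyperparameters are in the linked repository.

```python
import torch

def train_mixed_precision(model, loader, optimizer, criterion, device="cuda"):
    # Generic FP16/FP32 mixed-precision loop: forward passes run under autocast,
    # and GradScaler rescales the loss to avoid FP16 gradient underflow.
    scaler = torch.cuda.amp.GradScaler()
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```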
Person Re-identification in Aerial Imagery
Title | Person Re-identification in Aerial Imagery |
Authors | Shizhou Zhang, Qi Zhang, Yifei Yang, Xing Wei, Peng Wang, Bingliang Jiao, Yanning Zhang |
Abstract | Nowadays, with the rapid development of consumer Unmanned Aerial Vehicles (UAVs), visual surveillance from UAV platforms has become very attractive. Most research on UAV-captured visual data has focused on the tasks of object detection and tracking, while limited attention has been paid to person re-identification (ReID), which has been widely studied for ordinary surveillance cameras with fixed emplacements. In this paper, to facilitate research on person ReID in aerial imagery, we collect a large-scale airborne person ReID dataset named Person ReID for Aerial Imagery (PRAI-1581), which consists of 39,461 images of 1,581 person identities. The images of the dataset are shot by two DJI consumer UAVs flying at altitudes ranging from 20 to 60 meters above the ground, which covers most real UAV surveillance scenarios. In addition, we propose to utilize subspace pooling of convolution feature maps to represent the input person images. Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained end-to-end efficiently. We conduct extensive experiments on the proposed dataset, and the experimental results demonstrate that re-identifying persons in aerial imagery is a challenging problem, on which our method performs favorably against the state of the art. |
Tasks | Object Detection, Person Re-Identification |
Published | 2019-08-14 |
URL | https://arxiv.org/abs/1908.05024v2 |
https://arxiv.org/pdf/1908.05024v2.pdf | |
PWC | https://paperswithcode.com/paper/person-re-identification-in-aerial-imagery |
Repo | https://github.com/stormyoung/PRAI-1581 |
Framework | none |
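One plausible reading of the "subspace pooling of convolution feature maps" mentioned above, sketched below: flatten the spatial dimensions of a feature map and keep its top singular vectors as an orthonormal basis of the dominant channel subspace. The value of k and any normalization or matching details are assumptions of this sketch; the paper defines the exact formulation.

```python
import torch

def subspace_pooling(feat, k=4):
    # feat: (C, H, W) convolutional feature map of one person image.
    # Reshape to a C x (H*W) matrix and keep its top-k left singular vectors,
    # i.e. an orthonormal basis of the dominant feature subspace, as a compact
    # descriptor for re-identification matching.
    C, H, W = feat.shape
    X = feat.reshape(C, H * W)
    U, S, Vh = torch.linalg.svd(X, full_matrices=False)
    return U[:, :k]                     # (C, k) descriptor
```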
PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation
Title | PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation |
Authors | Ruixuan Luo, Jingjing Xu, Yi Zhang, Xuancheng Ren, Xu Sun |
Abstract | Chinese word segmentation (CWS) is a fundamental step in Chinese natural language processing. In this paper, we build a new toolkit, named PKUSEG, for multi-domain word segmentation. Unlike existing single-model toolkits, PKUSEG targets multi-domain word segmentation and provides separate models for different domains, such as web, medicine, and tourism. The new toolkit also supports POS tagging and model training to adapt to various application scenarios. Experiments show that PKUSEG achieves high performance on multiple domains. The toolkit is now freely and publicly available for research and industrial use. |
Tasks | Chinese Word Segmentation |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11455v2 |
https://arxiv.org/pdf/1906.11455v2.pdf | |
PWC | https://paperswithcode.com/paper/pkuseg-a-toolkit-for-multi-domain-chinese |
Repo | https://github.com/lancopku/pkuseg-python |
Framework | none |
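A minimal usage example for the toolkit, following the README of the linked repository; the exact argument names (the domain model selector and the POS-tagging flag) should be checked against the current release.

```python
import pkuseg

# Default (multi-domain mixed) model
seg = pkuseg.pkuseg()
print(seg.cut("我爱北京天安门"))          # -> ['我', '爱', '北京', '天安门']

# Domain-specific model with POS tagging (the model is downloaded on first use)
seg_med = pkuseg.pkuseg(model_name="medicine", postag=True)
print(seg_med.cut("患者出现发热和咳嗽症状"))
```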
Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks
Title | Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks |
Authors | Zhonghui You, Kun Yan, Jinmian Ye, Meng Ma, Ping Wang |
Abstract | Filter pruning is one of the most effective ways to accelerate and compress convolutional neural networks (CNNs). In this work, we propose a global filter pruning algorithm called Gate Decorator, which transforms a vanilla CNN module by multiplying its output by channel-wise scaling factors, i.e., gates. When a scaling factor is set to zero, it is equivalent to removing the corresponding filter. We use a Taylor expansion to estimate the change in the loss function caused by setting a scaling factor to zero and use this estimate for the global filter importance ranking. We then prune the network by removing the unimportant filters. After pruning, we merge all the scaling factors back into their original modules, so no special operations or structures are introduced. Moreover, we propose an iterative pruning framework called Tick-Tock to improve pruning accuracy. Extensive experiments demonstrate the effectiveness of our approach. For example, we achieve a state-of-the-art pruning ratio on ResNet-56 by reducing FLOPs by 70% without noticeable loss in accuracy. For ResNet-50 on ImageNet, our pruned model with a 40% FLOPs reduction outperforms the baseline model by 0.31% in top-1 accuracy. Various datasets are used, including CIFAR-10, CIFAR-100, CUB-200, ImageNet ILSVRC-12, and PASCAL VOC 2011. Code is available at github.com/youzhonghui/gate-decorator-pruning |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08174v1 |
https://arxiv.org/pdf/1909.08174v1.pdf | |
PWC | https://paperswithcode.com/paper/gate-decorator-global-filter-pruning-method |
Repo | https://github.com/youzhonghui/gate-decorator-pruning |
Framework | pytorch |
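A minimal sketch of the gate mechanism described above: a channel-wise scaling module whose first-order Taylor term, |gate * dL/dgate|, serves as the global importance score of the corresponding filter. Names and structure here are illustrative; the Tick-Tock pruning schedule and the gate-merging step are in the linked repository.

```python
import torch
import torch.nn as nn

class GateDecorator(nn.Module):
    # Multiplies a layer's output by learnable channel-wise gates; setting a gate
    # to zero is equivalent to removing the corresponding filter.
    def __init__(self, num_channels):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_channels))

    def forward(self, x):                       # x: (N, C, H, W)
        return x * self.gate.view(1, -1, 1, 1)

    def importance(self):
        # First-order Taylor estimate of the loss change from zeroing each gate,
        # |g * dL/dg|, read off after loss.backward() on a calibration batch.
        return (self.gate.detach() * self.gate.grad.detach()).abs()
```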
$L_0$-ARM: Network Sparsification via Stochastic Binary Optimization
Title | $L_0$-ARM: Network Sparsification via Stochastic Binary Optimization |
Authors | Yang Li, Shihao Ji |
Abstract | We consider network sparsification as an $L_0$-norm regularized binary optimization problem, where each unit of a neural network (e.g., weight, neuron, or channel) is attached to a stochastic binary gate whose parameters are jointly optimized with the original network parameters. The Augment-Reinforce-Merge (ARM) estimator, a recently proposed unbiased gradient estimator, is investigated for this binary optimization problem. Compared to the hard concrete gradient estimator of Louizos et al., ARM demonstrates superior performance in pruning network architectures while retaining almost the same accuracy as the baseline methods. Similar to the hard concrete estimator, ARM also enables conditional computation during model training, but with improved effectiveness due to its exact binary stochasticity. Thanks to the flexibility of ARM, many smooth or non-smooth parametric functions, such as the scaled sigmoid or hard sigmoid, can be used to parameterize this binary optimization problem while the unbiasedness of the ARM estimator is retained, whereas the hard concrete estimator has to rely on the hard sigmoid function to achieve conditional computation and thus accelerated training. Extensive experiments on multiple public datasets demonstrate state-of-the-art pruning rates with almost the same accuracy as the baseline methods. The resulting algorithm, $L_0$-ARM, sparsifies the Wide-ResNet models on CIFAR-10 and CIFAR-100, which the hard concrete estimator cannot do. The code is publicly available at https://github.com/leo-yangli/l0-arm. |
Tasks | |
Published | 2019-04-09 |
URL | https://arxiv.org/abs/1904.04432v3 |
https://arxiv.org/pdf/1904.04432v3.pdf | |
PWC | https://paperswithcode.com/paper/l_0-arm-network-sparsification-via-stochastic |
Repo | https://github.com/leo-yangli/l0-arm |
Framework | pytorch |
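The core ARM estimator referenced above, in self-contained form: an unbiased gradient of $E_{z\sim\mathrm{Bernoulli}(\sigma(\phi))}[f(z)]$ with respect to the gate logits $\phi$, built from two antithetic forward evaluations of $f$ with no backpropagation through the binary gates. Wiring this into the full $L_0$ sparsification objective (one gate per weight/neuron/channel plus the $L_0$ penalty) follows the paper and repository.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def arm_gradient(f, phi, rng=np.random):
    # Augment-Reinforce-Merge estimator:
    #   d/dphi E_{z ~ Bernoulli(sigmoid(phi))}[f(z)]
    #     = E_{u ~ Uniform(0,1)}[(f(1[u > sigmoid(-phi)]) - f(1[u < sigmoid(phi)])) * (u - 1/2)]
    # f maps a binary gate vector to a scalar loss; phi holds the gate logits.
    u = rng.uniform(size=np.shape(phi))
    z_aug = (u > sigmoid(-phi)).astype(float)   # "augment" arm
    z_ref = (u < sigmoid(phi)).astype(float)    # "reinforce" arm
    return (f(z_aug) - f(z_ref)) * (u - 0.5)    # "merge"
```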
Computation of Circular Area and Spherical Volume Invariants via Boundary Integrals
Title | Computation of Circular Area and Spherical Volume Invariants via Boundary Integrals |
Authors | Riley O’Neill, Pedro Angulo-Umana, Jeff Calder, Bo Hessburg, Peter J. Olver, Chehrzad Shakiban, Katrina Yezzi-Woodley |
Abstract | We show how to compute the circular area invariant of planar curves, and the spherical volume invariant of surfaces, in terms of line and surface integrals, respectively. We use the Divergence Theorem to express the area and volume integrals as line and surface integrals, respectively, against particular kernels; our results also extend to higher dimensional hypersurfaces. The resulting surface integrals are computable analytically on a triangulated mesh. This gives a simple computational algorithm for computing the spherical volume invariant for triangulated surfaces that does not involve discretizing the ambient space. We discuss potential applications to feature detection on broken bone fragments of interest in anthropology. |
Tasks | |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02176v1 |
https://arxiv.org/pdf/1905.02176v1.pdf | |
PWC | https://paperswithcode.com/paper/computation-of-circular-area-and-spherical |
Repo | https://github.com/jwcalder/Spherical-Volume-Invariant |
Framework | none |
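The first step of the construction described above can be stated compactly. For a planar region $\Omega$ with boundary curve $\partial\Omega$, the circular area invariant at a boundary point $x$ is the area $|B_r(x)\cap\Omega|$; choosing the vector field $F(y)=\tfrac12(y-x)$, which has unit divergence, the Divergence Theorem converts this area into a boundary integral (the paper then reduces it further to an integral over $\partial\Omega$ alone against a particular kernel, and treats the 3D spherical volume invariant analogously):

$$
|B_r(x)\cap\Omega| \;=\; \int_{B_r(x)\cap\Omega} \nabla\cdot F \, dA
\;=\; \frac{1}{2}\int_{\partial\left(B_r(x)\cap\Omega\right)} (y-x)\cdot n(y)\, ds(y).
$$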
Feature2Vec: Distributional semantic modelling of human property knowledge
Title | Feature2Vec: Distributional semantic modelling of human property knowledge |
Authors | Steven Derby, Paul Miller, Barry Devereux |
Abstract | Feature norm datasets of human conceptual knowledge, collected in surveys of human volunteers, yield highly interpretable models of word meaning and play an important role in neurolinguistic research on semantic cognition. However, these datasets are limited in size due to practical obstacles associated with exhaustively listing properties for a large number of words. In contrast, the development of distributional modelling techniques and the availability of vast text corpora have allowed researchers to construct effective vector space models of word meaning over large lexicons. However, this comes at the cost of interpretable, human-like information about word meaning. We propose a method for mapping human property knowledge onto a distributional semantic space, which adapts the word2vec architecture to the task of modelling concept features. Our approach gives a measure of concept and feature affinity in a single semantic space, which makes for easy and efficient ranking of candidate human-derived semantic properties for arbitrary words. We compare our model with a previous approach, and show that it performs better on several evaluation tasks. Finally, we discuss how our method could be used to develop efficient sampling techniques to extend existing feature norm datasets in a reliable way. |
Tasks | |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11439v1 |
https://arxiv.org/pdf/1908.11439v1.pdf | |
PWC | https://paperswithcode.com/paper/feature2vec-distributional-semantic-modelling |
Repo | https://github.com/stevend94/Feature2Vec |
Framework | tf |
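A rough sketch of the mapping idea above: hold pretrained concept (word) vectors fixed and learn one vector per property-norm feature in the same space with a skip-gram-style negative-sampling objective, so that concept-feature affinity becomes a dot product. The update rule, sampling scheme, and hyperparameters below are assumptions for illustration; the paper's training setup may differ.

```python
import numpy as np

def train_feature_vectors(pairs, concept_vecs, dim, n_neg=5, lr=0.05, epochs=5, rng=np.random):
    # pairs: list of (concept, feature) tuples from a property-norm dataset.
    # concept_vecs: dict mapping concept -> pretrained (frozen) word embedding, shape (dim,).
    # Returns a learned vector per feature, living in the same space as the concepts.
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    features = sorted({f for _, f in pairs})
    fvecs = {f: rng.normal(scale=0.1, size=dim) for f in features}
    concepts = list(concept_vecs)
    for _ in range(epochs):
        for concept, feat in pairs:
            c = concept_vecs[concept]
            fvecs[feat] += lr * (1.0 - sigmoid(c @ fvecs[feat])) * c          # positive pair
            for _ in range(n_neg):                                            # negative samples
                cn = concept_vecs[rng.choice(concepts)]
                fvecs[feat] -= lr * sigmoid(cn @ fvecs[feat]) * cn
    return fvecs
```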
Rewarding Coreference Resolvers for Being Consistent with World Knowledge
Title | Rewarding Coreference Resolvers for Being Consistent with World Knowledge |
Authors | Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Hershcovich, Chen Qiu, Anders Sandholm, Michael Ringaard, Anders Søgaard |
Abstract | Unresolved coreference is a bottleneck for relation extraction, and high-quality coreference resolvers may produce an output that makes it much easier to extract knowledge triples. We show how to improve coreference resolvers by forwarding their input to a relation extraction system and rewarding the resolvers for producing triples that are found in knowledge bases. Since relation extraction systems can rely on different forms of supervision and be biased in different ways, we obtain the best performance, improving over the state of the art, using multi-task reinforcement learning. |
Tasks | Relation Extraction |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02392v2 |
https://arxiv.org/pdf/1909.02392v2.pdf | |
PWC | https://paperswithcode.com/paper/rewarding-coreference-resolvers-for-being |
Repo | https://github.com/rahular/coref-rl |
Framework | tf |
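The reward loop described above, reduced to a pseudocode-style sketch: sample a clustering from the resolver, run relation extraction on the resolved document, and use the number of extracted triples found in the knowledge base as a REINFORCE reward. The `resolver`, `relation_extractor`, and `knowledge_base` interfaces are hypothetical placeholders, not the authors' API.

```python
def reinforce_step(resolver, relation_extractor, knowledge_base, document, optimizer, baseline=0.0):
    # Policy-gradient update: the coreference resolver is rewarded when the
    # downstream relation extractor produces triples that exist in the KB.
    clusters, log_prob = resolver.sample(document)      # stochastic clustering and its log-probability
    triples = relation_extractor(document, clusters)
    reward = sum(1.0 for t in triples if t in knowledge_base)
    loss = -(reward - baseline) * log_prob              # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```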
Kernel Mode Decomposition and programmable/interpretable regression networks
Title | Kernel Mode Decomposition and programmable/interpretable regression networks |
Authors | Houman Owhadi, Clint Scovel, Gene Ryan Yoo |
Abstract | Mode decomposition is a prototypical pattern recognition problem that can be addressed from the (a priori distinct) perspectives of numerical approximation, statistical inference and deep learning. Could its analysis through these combined perspectives be used as a Rosetta stone for deciphering mechanisms at play in deep learning? Motivated by this question we introduce programmable and interpretable regression networks for pattern recognition and address mode decomposition as a prototypical problem. The programming of these networks is achieved by assembling elementary modules decomposing and recomposing kernels and data. These elementary steps are repeated across levels of abstraction and interpreted from the equivalent perspectives of optimal recovery, game theory and Gaussian process regression (GPR). The prototypical mode/kernel decomposition module produces an optimal approximation $(w_1,w_2,\ldots,w_m)$ of an element $(v_1,v_2,\ldots,v_m)$ of a product of Hilbert subspaces of a common Hilbert space from the observation of the sum $v:=v_1+\cdots+v_m$. The prototypical mode/kernel recomposition module performs partial sums of the recovered modes $w_i$ based on the alignment between each recovered mode $w_i$ and the data $v$. We illustrate the proposed framework by programming regression networks approximating the modes $v_i= a_i(t)y_i\big(\theta_i(t)\big)$ of a (possibly noisy) signal $\sum_i v_i$ when the amplitudes $a_i$, instantaneous phases $\theta_i$ and periodic waveforms $y_i$ may all be unknown, and show near-machine-precision recovery under regularity and separation assumptions on the instantaneous amplitudes $a_i$ and frequencies $\dot{\theta}_i$. The structure of some of these networks shares intriguing similarities with convolutional neural networks while being interpretable, programmable and amenable to theoretical analysis. |
Tasks | |
Published | 2019-07-19 |
URL | https://arxiv.org/abs/1907.08592v1 |
https://arxiv.org/pdf/1907.08592v1.pdf | |
PWC | https://paperswithcode.com/paper/kernel-mode-decomposition-and |
Repo | https://github.com/kernelmodedec/Kernel-Mode-Decomposition-1D |
Framework | tf |
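The "prototypical mode/kernel decomposition module" described above has a closed form under the GPR interpretation: if the modes are modelled as independent centred Gaussian processes $\xi_i\sim\mathcal{N}(0,K_i)$ and only their sum $v = v_1+\cdots+v_m$ is observed, the optimal-recovery / posterior-mean estimate of each mode is

$$
w_i \;=\; K_i\Big(\sum_{j=1}^{m} K_j\Big)^{-1} v, \qquad i=1,\dots,m,
$$

i.e. each kernel explains the share of the observed signal aligned with it; the recomposition module then partially sums the recovered $w_i$ according to their alignment with $v$. This is the standard minimum-norm/GPR solution; the hierarchical assembly of such modules across levels of abstraction is the paper's contribution.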
Treant: Training Evasion-Aware Decision Trees
Title | Treant: Training Evasion-Aware Decision Trees |
Authors | Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando |
Abstract | Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and interpretable, decision tree ensembles have received only limited attention by the security and machine learning communities so far, leading to a sub-optimal state of the art for adversarial learning techniques. We thus propose Treant, a novel decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. Treant is based on two key technical ingredients: robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on three publicly available datasets show that Treant is able to generate decision tree ensembles that are at the same time accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01197v2 |
https://arxiv.org/pdf/1907.01197v2.pdf | |
PWC | https://paperswithcode.com/paper/treant-training-evasion-aware-decision-trees |
Repo | https://github.com/gtolomei/treant |
Framework | none |
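A deliberately simplified illustration of the "robust splitting" idea above: when scoring a candidate threshold, instances the attacker can push across it are charged to whichever side of the split increases the loss more. This greedy, per-instance worst case is only a caricature of Treant's evasion-aware loss minimization, which optimizes the attacker's placement jointly under a formal threat model; treat it as intuition, not the algorithm.

```python
def robust_split_loss(values, labels, threshold, attack_radius, leaf_loss):
    # values/labels: 1-D feature values and labels of the instances reaching this node.
    # Instances within attack_radius of the threshold are "attackable": the adversary
    # may move them to either side, so each is charged, one at a time, to the side
    # where it increases the loss the most.
    left, right, attackable = [], [], []
    for v, y in zip(values, labels):
        if abs(v - threshold) <= attack_radius:
            attackable.append(y)
        elif v <= threshold:
            left.append(y)
        else:
            right.append(y)
    loss = leaf_loss(left) + leaf_loss(right)
    for y in attackable:
        loss += max(leaf_loss(left + [y]) - leaf_loss(left),
                    leaf_loss(right + [y]) - leaf_loss(right))
    return loss
```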
Head-Driven Phrase Structure Grammar Parsing on Penn Treebank
Title | Head-Driven Phrase Structure Grammar Parsing on Penn Treebank |
Authors | Junru Zhou, Hai Zhao |
Abstract | Head-driven phrase structure grammar (HPSG) enjoys a uniform formalism representing rich contextual syntactic and even semantic meanings. This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into head-driven phrase structure. Two parsing algorithms are then proposed for the two converted tree representations, division span and joint span. As HPSG encodes both constituent and dependency structure information, the proposed HPSG parsers may be regarded as a kind of joint decoder for both types of structures, and they are thus evaluated in terms of extracted or converted constituent and dependency parse trees. Our parser achieves new state-of-the-art performance for both parsing tasks on the Penn Treebank (PTB) and the Chinese Penn Treebank, verifying the effectiveness of jointly learning constituent and dependency structures. In detail, we report 96.33 F1 for constituent parsing and 97.20% UAS for dependency parsing on PTB. |
Tasks | Dependency Parsing |
Published | 2019-07-05 |
URL | https://arxiv.org/abs/1907.02684v3 |
https://arxiv.org/pdf/1907.02684v3.pdf | |
PWC | https://paperswithcode.com/paper/head-driven-phrase-structure-grammar-parsing |
Repo | https://github.com/DoodleJZ/HPSG-Neural-Parser |
Framework | pytorch |
Modeling Major Transitions in Evolution with the Game of Life
Title | Modeling Major Transitions in Evolution with the Game of Life |
Authors | Peter D. Turney |
Abstract | Maynard Smith and Szathmáry’s book, The Major Transitions in Evolution, describes eight major events in the evolution of life on Earth and identifies a common theme that unites these events. In each event, smaller entities came together to form larger entities, which can be described as symbiosis or cooperation. Here we present a computational simulation of evolving entities that includes symbiosis with shifting levels of selection. In the simulation, the fitness of an entity is measured by a series of one-on-one competitions in the Immigration Game, a two-player variation of Conway’s Game of Life. Mutation, reproduction, and symbiosis are implemented as operations that are external to the Immigration Game. Because these operations are external to the game, we are able to freely manipulate the operations and observe the effects of the manipulations. The simulation is composed of four layers, each layer building on the previous layer. The first layer implements a simple form of asexual reproduction, the second layer introduces a more sophisticated form of asexual reproduction, the third layer adds sexual reproduction, and the fourth layer adds symbiosis. The experiments show that a small amount of symbiosis, added to the other layers, significantly increases the fitness of the population. We suggest that, in addition to providing new insights into biological and cultural evolution, this model of symbiosis may have practical applications in evolutionary computation, such as in the task of learning deep neural network models. |
Tasks | |
Published | 2019-08-19 |
URL | https://arxiv.org/abs/1908.07034v1 |
https://arxiv.org/pdf/1908.07034v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-major-transitions-in-evolution-with |
Repo | https://github.com/pdturney/modeling-major-transitions |
Framework | none |
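For reference, one generation of the Immigration Game used as the fitness substrate above: standard Conway birth/survival rules, with a newborn cell taking the colour of the majority of its three live parent neighbours (with exactly three parents there is never a tie). The grid encoding (0 = dead, 1/2 = the two colours) and the zero-padded boundary are assumptions of this sketch; the simulation layers, fitness tournaments, and symbiosis operators are in the linked repository.

```python
import numpy as np

def immigration_step(grid):
    # grid: 2-D int array, 0 = dead, 1 or 2 = live cell of either colour.
    alive = (grid > 0).astype(int)
    c1 = (grid == 1).astype(int)
    c2 = (grid == 2).astype(int)

    def nsum(a):
        # 8-neighbour sum with a dead (zero) boundary.
        p = np.pad(a, 1)
        return sum(np.roll(np.roll(p, i, 0), j, 1)
                   for i in (-1, 0, 1) for j in (-1, 0, 1)
                   if (i, j) != (0, 0))[1:-1, 1:-1]

    n_alive, n1, n2 = nsum(alive), nsum(c1), nsum(c2)
    survive = (alive == 1) & ((n_alive == 2) | (n_alive == 3))
    born = (alive == 0) & (n_alive == 3)
    new = np.where(survive, grid, 0)          # survivors keep their colour
    new = np.where(born & (n1 > n2), 1, new)  # births take the majority parent colour
    new = np.where(born & (n2 > n1), 2, new)
    return new
```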