February 1, 2020

Paper Group AWR 103

Learning to Traverse Latent Spaces for Musical Score Inpainting

Title Learning to Traverse Latent Spaces for Musical Score Inpainting
Authors Ashis Pati, Alexander Lerch, Gaëtan Hadjeres
Abstract Music Inpainting is the task of filling in missing or lost information in a piece of music. We investigate this task from an interactive music creation perspective. To this end, a novel deep learning-based approach for musical score inpainting is proposed. The designed model takes both past and future musical context into account and is capable of suggesting ways to connect them in a musically meaningful manner. To achieve this, we leverage the representational power of the latent space of a Variational Auto-Encoder and train a Recurrent Neural Network which learns to traverse this latent space conditioned on the past and future musical contexts. Consequently, the designed model is capable of generating several measures of music to connect two musical excerpts. The capabilities and performance of the model are showcased by comparison with competitive baselines using several objective and subjective evaluation methods. The results show that the model generates meaningful inpaintings and can be used in interactive music creation applications. Overall, the method demonstrates the merit of learning complex trajectories in the latent spaces of deep generative models.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.01164v1
PDF https://arxiv.org/pdf/1907.01164v1.pdf
PWC https://paperswithcode.com/paper/learning-to-traverse-latent-spaces-for
Repo https://github.com/ashispati/InpaintNet
Framework pytorch

KPConv: Flexible and Deformable Convolution for Point Clouds

Title KPConv: Flexible and Deformable Convolution for Point Clouds
Authors Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas
Abstract We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPConv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv.
Tasks Semantic Segmentation
Published 2019-04-18
URL https://arxiv.org/abs/1904.08889v2
PDF https://arxiv.org/pdf/1904.08889v2.pdf
PWC https://paperswithcode.com/paper/kpconv-flexible-and-deformable-convolution
Repo https://github.com/HuguesTHOMAS/KPConv
Framework tf

FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing

Title FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing
Authors Peng Zhang, Fuhao Zou, Zhiwen Wu, Nengli Dai, Skarpness Mark, Michael Fu, Juan Zhao, Kai Li
Abstract Face anti-spoofing has recently gained increased attention in both academic and industrial fields. With the emergence of various CNN-based solutions, multi-modal (RGB, depth and IR) CNN-based methods have shown better performance than single-modal classifiers. However, there is a need for improving the performance and reducing the complexity. Therefore, an extremely light network architecture (FeatherNet A/B) is proposed with a streaming module which fixes the weakness of Global Average Pooling and uses fewer parameters. Our single FeatherNet, trained on depth images only, provides a higher baseline with 0.00168 ACER, 0.35M parameters and 83M FLOPS. Furthermore, a novel fusion procedure with an "ensemble + cascade" structure is presented to satisfy performance-preferred use cases. Meanwhile, the MMFD dataset is collected to provide more attacks and diversity to gain better generalization. We use the fusion method in the Face Anti-spoofing Attack Detection Challenge@CVPR2019 and got the result of 0.0013 (ACER), 0.999 (TPR@FPR=10e-2), 0.998 (TPR@FPR=10e-3) and 0.9814 (TPR@FPR=10e-4).
Tasks Face Anti-Spoofing
Published 2019-04-22
URL http://arxiv.org/abs/1904.09290v1
PDF http://arxiv.org/pdf/1904.09290v1.pdf
PWC https://paperswithcode.com/paper/190409290
Repo https://github.com/SoftwareGift/FeatheNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019
Framework pytorch

Cross-Domain Transferability of Adversarial Perturbations

Title Cross-Domain Transferability of Adversarial Perturbations
Authors Muzammal Naseer, Salman H. Khan, Harris Khan, Fahad Shahbaz Khan, Fatih Porikli
Abstract Adversarial examples reveal the blind spots of deep neural networks (DNNs) and represent a major concern for security-critical applications. The transferability of adversarial examples makes real-world attacks possible in black-box settings, where the attacker is forbidden to access the internal parameters of the model. The underlying assumption in most adversary generation methods, whether learning an instance-specific or an instance-agnostic perturbation, is the direct or indirect reliance on the original domain-specific data distribution. In this work, for the first time, we demonstrate the existence of domain-invariant adversaries, thereby showing common adversarial space among different datasets and models. To this end, we propose a framework capable of launching highly transferable attacks that crafts adversarial patterns to mislead networks trained on wholly different domains. For instance, an adversarial function learned on Paintings, Cartoons or Medical images can successfully perturb ImageNet samples to fool the classifier, with success rates as high as $\sim$99% ($\ell_{\infty} \le 10$). The core of our proposed adversarial function is a generative network that is trained using a relativistic supervisory signal that enables domain-invariant perturbations. Our approach sets the new state-of-the-art for fooling rates, both under the white-box and black-box scenarios. Furthermore, despite being an instance-agnostic perturbation function, our attack outperforms the conventionally much stronger instance-specific attack methods.
Tasks
Published 2019-05-28
URL https://arxiv.org/abs/1905.11736v5
PDF https://arxiv.org/pdf/1905.11736v5.pdf
PWC https://paperswithcode.com/paper/cross-domain-transferability-of-adversarial
Repo https://github.com/Muzammal-Naseer/Cross-domain-perturbations
Framework pytorch

Soft Anchor-Point Object Detection

Title Soft Anchor-Point Object Detection
Authors Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides
Abstract Recently, anchor-free detectors have shown great potential to outperform anchor-based detectors in terms of both accuracy and speed. In this work, we aim at finding a new balance of speed and accuracy for anchor-free detectors. Two questions are studied: 1) how to make the anchor-free detection head better? 2) how to utilize the power of the feature pyramid better? We identify attention bias and feature selection as the main issues for these two questions respectively. We propose to address these issues with a novel training strategy that has two softened optimization techniques, i.e. soft-weighted anchor points and soft-selected pyramid levels. To evaluate the effectiveness, we train a single-stage anchor-free detector called Soft Anchor-Point Detector (SAPD). Experiments show that our concise SAPD pushes the envelope of speed/accuracy trade-off to a new level, outperforming recent state-of-the-art anchor-based and anchor-free, single-stage and multi-stage detectors. Without bells and whistles, our best model can achieve a single-model single-scale AP of 47.4% on COCO. Our fastest version can run up to 5x faster than other detectors with comparable accuracy.
Tasks Feature Selection, Object Detection
Published 2019-11-27
URL https://arxiv.org/abs/1911.12448v1
PDF https://arxiv.org/pdf/1911.12448v1.pdf
PWC https://paperswithcode.com/paper/soft-anchor-point-object-detection
Repo https://github.com/xuannianz/SAPD
Framework tf

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features

Title CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Authors Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo
Abstract Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend to less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads to information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms the state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix improves the model robustness against input corruptions and its out-of-distribution detection performances. Source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch.
Tasks Image Captioning, Image Classification, Object Localization, Out-of-Distribution Detection
Published 2019-05-13
URL https://arxiv.org/abs/1905.04899v2
PDF https://arxiv.org/pdf/1905.04899v2.pdf
PWC https://paperswithcode.com/paper/cutmix-regularization-strategy-to-train
Repo https://github.com/DevBruce/CutMixImageDataGenerator_For_Keras
Framework none
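
The cut-and-paste step described in the CutMix abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `cutmix`, the nested-list image format, the caller-supplied mixing weight `lam`, and the choice to scale the box sides by sqrt(1 - lam) are all assumptions made for illustration. The only property taken from the abstract is that the labels are mixed in proportion to the pasted area.

```python
import math
import random


def cutmix(img_a, img_b, label_a, label_b, lam, rng=random):
    """Paste a random box from img_b into a copy of img_a and mix the labels.

    Images are H x W nested lists; labels are one-hot lists of equal length.
    `lam` is the intended share of img_a that survives (hypothetical parameter).
    """
    h, w = len(img_a), len(img_a[0])
    # Assumed box sizing: sides scale with sqrt(1 - lam), so the box
    # covers roughly (1 - lam) of the image before clipping.
    cut_h = int(h * math.sqrt(1.0 - lam))
    cut_w = int(w * math.sqrt(1.0 - lam))
    # Pick a random centre, then clip the box to the image borders.
    cy, cx = rng.randrange(h), rng.randrange(w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    # Copy img_a row by row so the input is left untouched.
    mixed = [row[:] for row in img_a]
    for y in range(y1, y2):
        mixed[y][x1:x2] = img_b[y][x1:x2]

    # Recompute the weight from the clipped box so the labels match
    # the area that was actually pasted.
    lam_eff = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    label = [lam_eff * a + (1.0 - lam_eff) * b for a, b in zip(label_a, label_b)]
    return mixed, label, lam_eff
```

Recomputing `lam_eff` after clipping keeps the label weights consistent with the pixels that were really replaced, which matters when the random box falls partly outside the image.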

Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER

Title Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER
Authors Peng-Hsuan Li, Tsu-Jui Fu, Wei-Yun Ma
Abstract BiLSTM has been prevalently used as a core module for NER in a sequence-labeling setup. State-of-the-art approaches use BiLSTM with additional resources such as gazetteers, language-modeling, or multi-task supervision to further improve NER. This paper instead takes a step back and focuses on analyzing problems of BiLSTM itself and how exactly self-attention can bring improvements. We formally show the limitation of (CRF-)BiLSTM in modeling cross-context patterns for each word – the XOR limitation. Then, we show that two types of simple cross-structures – self-attention and Cross-BiLSTM – can effectively remedy the problem. We further test them on real-world NER datasets, OntoNotes 5.0 and WNUT 2017, with clear and consistent improvements over the baseline, up to 8.7% on some of the multi-token entity mentions. We give in-depth analyses of the improvements across several aspects of NER, especially the identification of multi-token mentions. This study should lay a sound foundation for future improvements on sequence-labeling NER (https://github.com/jacobvsdanniel/cross_ner).
Tasks Named Entity Recognition
Published 2019-08-29
URL https://arxiv.org/abs/1908.11046v2
PDF https://arxiv.org/pdf/1908.11046v2.pdf
PWC https://paperswithcode.com/paper/remedying-bilstm-cnn-deficiency-in-modeling
Repo https://github.com/ckiplab/ckiptagger
Framework tf

Adaptive and Iteratively Improving Recurrent Lateral Connections

Title Adaptive and Iteratively Improving Recurrent Lateral Connections
Authors Barak Battash, Lior Wolf
Abstract The current leading computer vision models are typically feed forward neural models, in which the output of one computational block is passed to the next one sequentially. This is in sharp contrast to the organization of the primate visual cortex, in which feedback and lateral connections are abundant. In this work, we propose a computational model for the role of lateral connections in a given block, in which the weights of the block vary dynamically as a function of its activations, and the input from the upstream blocks is iteratively reintroduced. We demonstrate how this novel architectural modification can lead to sizable gains in performance, when applied to visual action recognition without pretraining and that it outperforms the literature architectures with recurrent feedback processing on ImageNet.
Tasks
Published 2019-10-16
URL https://arxiv.org/abs/1910.11105v1
PDF https://arxiv.org/pdf/1910.11105v1.pdf
PWC https://paperswithcode.com/paper/adaptive-and-iteratively-improving-recurrent
Repo https://github.com/BattashB/Adaptive-and-Iteratively-Improving-Recurrent-Lateral-Connections
Framework pytorch

Block Neural Autoregressive Flow

Title Block Neural Autoregressive Flow
Authors Nicola De Cao, Ivan Titov, Wilker Aziz
Abstract Normalising flows (NFs) map two density functions via a differentiable bijection whose Jacobian determinant can be computed efficiently. Recently, as an alternative to hand-crafted bijections, Huang et al. (2018) proposed neural autoregressive flow (NAF) which is a universal approximator for density functions. Their flow is a neural network (NN) whose parameters are predicted by another NN. The latter grows quadratically with the size of the former and thus an efficient technique for parametrization is needed. We propose block neural autoregressive flow (B-NAF), a much more compact universal approximator of density functions, where we model a bijection directly using a single feed-forward network. Invertibility is ensured by carefully designing each affine transformation with block matrices that make the flow autoregressive and (strictly) monotone. We compare B-NAF to NAF and other established flows on density estimation and approximate inference for latent variable models. Our proposed flow is competitive across datasets while using orders of magnitude fewer parameters.
Tasks Density Estimation, Latent Variable Models, Normalising Flows
Published 2019-04-09
URL http://arxiv.org/abs/1904.04676v1
PDF http://arxiv.org/pdf/1904.04676v1.pdf
PWC https://paperswithcode.com/paper/block-neural-autoregressive-flow
Repo https://github.com/nicola-decao/BNAF
Framework pytorch

Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms

Title Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms
Authors Daeryong Kim, Bongwon Suh
Abstract Neural network based models for collaborative filtering have started to gain attention recently. One branch of research is based on using deep generative models to model user preferences where variational autoencoders were shown to produce state-of-the-art results. However, there are some potentially problematic characteristics of the current variational autoencoder for CF. The first is the too simplistic prior that VAEs incorporate for learning the latent representations of user preference. The other is the model’s inability to learn deeper representations with more than one hidden layer for each network. Our goal is to incorporate appropriate techniques to mitigate the aforementioned problems of variational autoencoder CF and further improve the recommendation performance. Our work is the first to apply flexible priors to collaborative filtering and show that simple priors (in original VAEs) may be too restrictive to fully model user preferences and setting a more flexible prior gives significant gains. We experiment with the VampPrior, originally proposed for image generation, to examine the effect of flexible priors in CF. We also show that VampPriors coupled with gating mechanisms outperform SOTA results including the Variational Autoencoder for Collaborative Filtering by meaningful margins on 2 popular benchmark datasets (MovieLens & Netflix).
Tasks Image Generation
Published 2019-11-03
URL https://arxiv.org/abs/1911.00936v1
PDF https://arxiv.org/pdf/1911.00936v1.pdf
PWC https://paperswithcode.com/paper/enhancing-vaes-for-collaborative-filtering
Repo https://github.com/psywaves/EVCF
Framework pytorch

Improving Neural Networks by Adopting Amplifying and Attenuating Neurons

Title Improving Neural Networks by Adopting Amplifying and Attenuating Neurons
Authors Seongmun Jung, Oh Joon Kwon
Abstract In the present study, an amplifying neuron and attenuating neuron, which can be easily implemented into neural networks without any significant additional computational effort, are proposed. The activated output value is squared for the amplifying neuron, while the value becomes its reciprocal for the attenuating one. Theoretically, the order of neural networks increases when the amplifying neuron is placed in the hidden layer. The performance assessments of neural networks were conducted to verify that the amplifying and attenuating neurons enhance the performance of neural networks. From the numerical experiments, it was revealed that the neural networks that contain the amplifying and attenuating neurons yield more accurate results, compared to those without them.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09574v2
PDF https://arxiv.org/pdf/1905.09574v2.pdf
PWC https://paperswithcode.com/paper/improving-neural-networks-by-adopting
Repo https://github.com/TheWinterSky/Amplifying-and-attenuating-neurons
Framework none
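
The two neuron types in this abstract differ from an ordinary neuron only in what happens after the activation, which a small sketch can make concrete. The function name `neuron`, the `kind` argument, and the choice of a sigmoid activation are assumptions for illustration; the sigmoid is used here because its output is strictly positive for moderate inputs, so the reciprocal is defined. The abstract does not say which activation the authors use.

```python
import math


def sigmoid(z):
    """Logistic activation; intended for moderate |z| in this sketch."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, kind="plain"):
    """Compute one neuron's output.

    kind="amplifying"  -> the activated value is squared
    kind="attenuating" -> the activated value is replaced by its reciprocal
    kind="plain"       -> the activated value is returned unchanged
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    a = sigmoid(z)
    if kind == "amplifying":
        return a * a
    if kind == "attenuating":
        return 1.0 / a
    return a
```

For a pre-activation of zero the sigmoid gives 0.5, so the amplifying variant returns 0.25 and the attenuating variant returns 2.0.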

Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm

Title Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm
Authors Hung Duy Le, Huynh Van Luong, Nikos Deligiannis
Abstract We propose a new deep recurrent neural network (RNN) architecture for sequential signal reconstruction. Our network is designed by unfolding the iterations of the proximal gradient method that solves the l1-l1 minimization problem. As such, our network leverages by design that signals have a sparse representation and that the difference between consecutive signal representations is also sparse. We evaluate the proposed model in the task of reconstructing video frames from compressive measurements and show that it outperforms several state-of-the-art RNN models.
Tasks
Published 2019-02-18
URL http://arxiv.org/abs/1902.06522v1
PDF http://arxiv.org/pdf/1902.06522v1.pdf
PWC https://paperswithcode.com/paper/designing-recurrent-neural-networks-by
Repo https://github.com/dhungle/L1-L1-RNN
Framework pytorch
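
To show what "unfolding the iterations of the proximal gradient method" refers to, here is a sketch of the iteration for a simplified problem with a single l1 term, min_x 0.5 * ||A x - y||^2 + lam * ||x||_1. This is a deliberate simplification and not the paper's l1-l1 objective, which according to the abstract also exploits sparsity of the difference between consecutive signal representations. The names `ista` and `soft_threshold`, and the fixed `step` and `lam` values, are assumptions for illustration.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, n_iter=50):
    """Proximal gradient for min_x 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is an m x n nested list, y has length m. The step should not exceed
    1 / L, where L is the largest eigenvalue of A^T A.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iter):
        # Residual r = A x - y.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the smooth term: A^T r.
        grad = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the l1 proximal step.
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x
```

In unfolding approaches generally, each pass through the loop body is treated as one network layer and quantities such as the step size and threshold become learnable parameters; how the paper parameterizes its layers is not stated in the abstract.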

Preserving physically important variables in optimal event selections: A case study in Higgs physics

Title Preserving physically important variables in optimal event selections: A case study in Higgs physics
Authors Philipp Windischhofer, Miha Zgubic, Daniela Bortoletto
Abstract Analyses of collider data, often assisted by modern Machine Learning methods, condense a number of observables into a few powerful discriminants for the separation of the targeted signal process from the contributing backgrounds. These discriminants are highly correlated with important physical observables; using them in the event selection thus leads to the distortion of physically relevant distributions. We present an alternative event selection strategy, based on adversarially trained classifiers, that exploits the discriminating power contained in many event variables, but preserves the distributions of selected observables. This method is implemented and evaluated for the case of a Standard Model Higgs boson decaying into a pair of bottom quarks. Compared to a cut-based approach, it leads to a significant improvement in analysis sensitivity and retains the shapes of the relevant distributions to a greater extent.
Tasks
Published 2019-07-03
URL https://arxiv.org/abs/1907.02098v1
PDF https://arxiv.org/pdf/1907.02098v1.pdf
PWC https://paperswithcode.com/paper/preserving-physically-important-variables-in
Repo https://github.com/philippwindischhofer/HiggsPivoting
Framework tf

LFZip: Lossy compression of multivariate floating-point time series data via improved prediction

Title LFZip: Lossy compression of multivariate floating-point time series data via improved prediction
Authors Shubham Chandak, Kedar Tatwawadi, Chengtao Wen, Lingyun Wang, Juan Aparicio, Tsachy Weissman
Abstract Time series data compression is emerging as an important problem with the growth in IoT devices and sensors. Due to the presence of noise in these datasets, lossy compression can often provide significant compression gains without impacting the performance of downstream applications. In this work, we propose an error-bounded lossy compressor, LFZip, for multivariate floating-point time series data that provides guaranteed reconstruction up to user-specified maximum absolute error. The compressor is based on the prediction-quantization-entropy coder framework and benefits from improved prediction using linear models and neural networks. We evaluate the compressor on several time series datasets where it outperforms the existing state-of-the-art error-bounded lossy compressors. The code and data are available at https://github.com/shubhamchandak94/LFZip
Tasks Quantization, Time Series
Published 2019-11-01
URL https://arxiv.org/abs/1911.00208v2
PDF https://arxiv.org/pdf/1911.00208v2.pdf
PWC https://paperswithcode.com/paper/lfzip-lossy-compression-of-multivariate
Repo https://github.com/shubhamchandak94/LFZip
Framework tf
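
The error guarantee mentioned in the LFZip abstract follows from quantizing the prediction residual with a uniform step of twice the error bound. The sketch below illustrates that idea only: it assumes a trivial "previous reconstructed value" predictor in place of the linear models and neural networks the abstract mentions, omits the entropy coder, and uses hypothetical function names `compress` and `decompress`.

```python
def compress(series, eps):
    """Return integer codes whose reconstruction stays within eps of the input.

    Assumed predictor: the previous reconstructed value (0.0 at the start).
    """
    step = 2.0 * eps
    codes, prev = [], 0.0
    for x in series:
        # Quantize the prediction residual to the nearest multiple of step.
        q = round((x - prev) / step)
        codes.append(q)
        # Track what the decoder will reconstruct, so that encoder and
        # decoder make the same prediction for the next sample.
        prev = prev + q * step
    return codes


def decompress(codes, eps):
    """Rebuild the series from the integer codes."""
    step = 2.0 * eps
    out, prev = [], 0.0
    for q in codes:
        prev = prev + q * step
        out.append(prev)
    return out
```

Because rounding to the nearest multiple of `2 * eps` moves the residual by at most `eps`, each reconstructed sample differs from the original by at most `eps`, up to floating-point rounding. Predicting from the reconstructed value rather than the original one keeps the error from accumulating across samples.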

Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation

Title Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation
Authors Seungmin Lee, Dongwan Kim, Namil Kim, Seong-Gyun Jeong
Abstract Recent works on domain adaptation exploit adversarial training to obtain domain-invariant feature representations from the joint learning of feature extractor and domain discriminator networks. However, domain adversarial methods render suboptimal performances since they attempt to match the distributions among the domains without considering the task at hand. We propose Drop to Adapt (DTA), which leverages adversarial dropout to learn strongly discriminative features by enforcing the cluster assumption. Accordingly, we design objective functions to support robust domain adaptation. We demonstrate efficacy of the proposed method on various experiments and achieve consistent improvements in both image classification and semantic segmentation tasks. Our source code is available at https://github.com/postBG/DTA.pytorch.
Tasks Domain Adaptation, Image Classification, Semantic Segmentation, Unsupervised Domain Adaptation
Published 2019-10-12
URL https://arxiv.org/abs/1910.05562v1
PDF https://arxiv.org/pdf/1910.05562v1.pdf
PWC https://paperswithcode.com/paper/drop-to-adapt-learning-discriminative
Repo https://github.com/postBG/DTA.pytorch
Framework pytorch