February 1, 2020


Paper Group AWR 103


Learning to Traverse Latent Spaces for Musical Score Inpainting


Title	Learning to Traverse Latent Spaces for Musical Score Inpainting
Authors	Ashis Pati, Alexander Lerch, Gaëtan Hadjeres
Abstract	Music Inpainting is the task of filling in missing or lost information in a piece of music. We investigate this task from an interactive music creation perspective. To this end, a novel deep learning-based approach for musical score inpainting is proposed. The designed model takes both past and future musical context into account and is capable of suggesting ways to connect them in a musically meaningful manner. To achieve this, we leverage the representational power of the latent space of a Variational Auto-Encoder and train a Recurrent Neural Network which learns to traverse this latent space conditioned on the past and future musical contexts. Consequently, the designed model is capable of generating several measures of music to connect two musical excerpts. The capabilities and performance of the model are showcased by comparison with competitive baselines using several objective and subjective evaluation methods. The results show that the model generates meaningful inpaintings and can be used in interactive music creation applications. Overall, the method demonstrates the merit of learning complex trajectories in the latent spaces of deep generative models.
Tasks
Published	2019-07-02
URL	https://arxiv.org/abs/1907.01164v1
PDF	https://arxiv.org/pdf/1907.01164v1.pdf
PWC	https://paperswithcode.com/paper/learning-to-traverse-latent-spaces-for
Repo	https://github.com/ashispati/InpaintNet
Framework	pytorch

KPConv: Flexible and Deformable Convolution for Point Clouds


Title	KPConv: Flexible and Deformable Convolution for Point Clouds
Authors	Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas
Abstract	We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPConv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv.
Tasks	Semantic Segmentation
Published	2019-04-18
URL	https://arxiv.org/abs/1904.08889v2
PDF	https://arxiv.org/pdf/1904.08889v2.pdf
PWC	https://paperswithcode.com/paper/kpconv-flexible-and-deformable-convolution
Repo	https://github.com/HuguesTHOMAS/KPConv
Framework	tf
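
The abstract describes weights that sit on kernel points in Euclidean space and act on the input points close to them. A minimal sketch of that idea for a single output location follows. It assumes a linear falloff of influence with distance to each kernel point (the abstract does not specify the falloff), and the function name `kpconv_at` and its parameters `radius` and `sigma` are hypothetical, not taken from the linked repository.

```python
import math


def kpconv_at(center, points, feats, kernel_pts, weights, radius, sigma):
    """Kernel-point convolution evaluated at one location.

    center     -- coordinates of the output location
    points     -- coordinates of the input points
    feats      -- one input feature vector per input point
    kernel_pts -- kernel point positions, relative to the center
    weights    -- one [d_in][d_out] weight matrix per kernel point
    radius     -- neighbourhood radius around the center
    sigma      -- influence distance of each kernel point (assumed linear falloff)
    """
    d_out = len(weights[0][0])
    out = [0.0] * d_out
    for p, f in zip(points, feats):
        rel = [pi - ci for pi, ci in zip(p, center)]
        if math.hypot(*rel) > radius:
            continue  # point lies outside the neighbourhood
        for kp, w in zip(kernel_pts, weights):
            # Influence of this kernel point on the neighbour, in [0, 1].
            h = max(0.0, 1.0 - math.dist(rel, kp) / sigma)
            if h == 0.0:
                continue
            for j in range(d_out):
                out[j] += h * sum(f[i] * w[i][j] for i in range(len(f)))
    return out
```

In this sketch the kernel points are fixed inputs (the rigid case). The deformable variant mentioned in the abstract would add learned offsets to `kernel_pts` before the correlation is computed; that step is not shown here.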

FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing


Title	FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing
Authors	Peng Zhang, Fuhao Zou, Zhiwen Wu, Nengli Dai, Skarpness Mark, Michael Fu, Juan Zhao, Kai Li
Abstract	Face anti-spoofing has recently gained increased attention in both academic and industrial fields. With the emergence of various CNN-based solutions, multi-modal (RGB, depth and IR) CNN-based methods have shown better performance than single-modal classifiers. However, there is a need for improving the performance and reducing the complexity. Therefore, an extremely light network architecture (FeatherNet A/B) is proposed with a streaming module, which fixes the weakness of Global Average Pooling and uses fewer parameters. Our single FeatherNet, trained on depth images only, provides a higher baseline with 0.00168 ACER, 0.35M parameters and 83M FLOPS. Furthermore, a novel fusion procedure with an "ensemble + cascade" structure is presented to satisfy performance-preferred use cases. Meanwhile, the MMFD dataset is collected to provide more attacks and diversity to gain better generalization. We used the fusion method in the Face Anti-spoofing Attack Detection Challenge@CVPR2019 and obtained results of 0.0013 (ACER), 0.999 (TPR@FPR=10e-2), 0.998 (TPR@FPR=10e-3) and 0.9814 (TPR@FPR=10e-4).
Tasks	Face Anti-Spoofing
Published	2019-04-22
URL	http://arxiv.org/abs/1904.09290v1
PDF	http://arxiv.org/pdf/1904.09290v1.pdf
PWC	https://paperswithcode.com/paper/190409290
Repo	https://github.com/SoftwareGift/FeatheNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019
Framework	pytorch

Cross-Domain Transferability of Adversarial Perturbations


Title	Cross-Domain Transferability of Adversarial Perturbations
Authors	Muzammal Naseer, Salman H. Khan, Harris Khan, Fahad Shahbaz Khan, Fatih Porikli
Abstract	Adversarial examples reveal the blind spots of deep neural networks (DNNs) and represent a major concern for security-critical applications. The transferability of adversarial examples makes real-world attacks possible in black-box settings, where the attacker is forbidden to access the internal parameters of the model. The underlying assumption in most adversary generation methods, whether learning an instance-specific or an instance-agnostic perturbation, is the direct or indirect reliance on the original domain-specific data distribution. In this work, for the first time, we demonstrate the existence of domain-invariant adversaries, thereby showing common adversarial space among different datasets and models. To this end, we propose a framework capable of launching highly transferable attacks that crafts adversarial patterns to mislead networks trained on wholly different domains. For instance, an adversarial function learned on Paintings, Cartoons or Medical images can successfully perturb ImageNet samples to fool the classifier, with success rates as high as $\sim$99% ($\ell_{\infty} \le 10$). The core of our proposed adversarial function is a generative network that is trained using a relativistic supervisory signal that enables domain-invariant perturbations. Our approach sets the new state-of-the-art for fooling rates, both under the white-box and black-box scenarios. Furthermore, despite being an instance-agnostic perturbation function, our attack outperforms the conventionally much stronger instance-specific attack methods.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11736v5
PDF	https://arxiv.org/pdf/1905.11736v5.pdf
PWC	https://paperswithcode.com/paper/cross-domain-transferability-of-adversarial
Repo	https://github.com/Muzammal-Naseer/Cross-domain-perturbations
Framework	pytorch

Soft Anchor-Point Object Detection


Title	Soft Anchor-Point Object Detection
Authors	Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides
Abstract	Recently, anchor-free detectors have shown great potential to outperform anchor-based detectors in terms of both accuracy and speed. In this work, we aim at finding a new balance of speed and accuracy for anchor-free detectors. Two questions are studied: 1) how to make the anchor-free detection head better? 2) how to utilize the power of feature pyramid better? We identify attention bias and feature selection as the main issues for these two questions respectively. We propose to address these issues with a novel training strategy that has two softened optimization techniques, i.e. soft-weighted anchor points and soft-selected pyramid levels. To evaluate the effectiveness, we train a single-stage anchor-free detector called Soft Anchor-Point Detector (SAPD). Experiments show that our concise SAPD pushes the envelope of speed/accuracy trade-off to a new level, outperforming recent state-of-the-art anchor-based and anchor-free, single-stage and multi-stage detectors. Without bells and whistles, our best model can achieve a single-model single-scale AP of 47.4% on COCO. Our fastest version can run up to 5x faster than other detectors with comparable accuracy.
Tasks	Feature Selection, Object Detection
Published	2019-11-27
URL	https://arxiv.org/abs/1911.12448v1
PDF	https://arxiv.org/pdf/1911.12448v1.pdf
PWC	https://paperswithcode.com/paper/soft-anchor-point-object-detection
Repo	https://github.com/xuannianz/SAPD
Framework	tf

CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features


Title	CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Authors	Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo
Abstract	Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend on less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads to information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms the state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix improves the model robustness against input corruptions and its out-of-distribution detection performances. Source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch.
Tasks	Image Captioning, Image Classification, Object Localization, Out-of-Distribution Detection
Published	2019-05-13
URL	https://arxiv.org/abs/1905.04899v2
PDF	https://arxiv.org/pdf/1905.04899v2.pdf
PWC	https://paperswithcode.com/paper/cutmix-regularization-strategy-to-train
Repo	https://github.com/DevBruce/CutMixImageDataGenerator_For_Keras
Framework	none
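
The cut-and-paste step and the area-proportional label mixing described in the abstract can be sketched in a few lines. This is an illustration on plain nested lists rather than tensors. The box-sampling scheme (side lengths scaled by the square root of `1 - lam`, box clipped at the border) and the name `cutmix` are assumptions made for the example, not code from the linked repositories.

```python
import math
import random


def cutmix(img_a, img_b, label_a, label_b, lam, rng=random):
    """Paste a random box from img_b into img_a and mix the labels by area.

    Images are H x W nested lists of equal size, labels are one-hot lists,
    and lam in [0, 1] is the target share of the image kept from img_a.
    """
    h, w = len(img_a), len(img_a[0])
    # Box side lengths chosen so the box covers about (1 - lam) of the image.
    cut_h = int(h * math.sqrt(1.0 - lam))
    cut_w = int(w * math.sqrt(1.0 - lam))
    # Random box centre; the box is clipped at the image border.
    cy, cx = rng.randrange(h), rng.randrange(w)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    mixed = [row[:] for row in img_a]
    for y in range(top, bottom):
        mixed[y][left:right] = img_b[y][left:right]

    # Recompute the ratio from the clipped box so the labels match the pixels.
    pasted = (bottom - top) * (right - left) / (h * w)
    mixed_label = [(1.0 - pasted) * a + pasted * b
                   for a, b in zip(label_a, label_b)]
    return mixed, mixed_label
```

A training loop would typically draw `lam` at random for each batch, for example with `random.betavariate(1.0, 1.0)`; the choice of distribution is likewise an assumption here, since the abstract does not state it.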

Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER


Title	Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER
Authors	Peng-Hsuan Li, Tsu-Jui Fu, Wei-Yun Ma
Abstract	BiLSTM has been prevalently used as a core module for NER in a sequence-labeling setup. State-of-the-art approaches use BiLSTM with additional resources such as gazetteers, language-modeling, or multi-task supervision to further improve NER. This paper instead takes a step back and focuses on analyzing problems of BiLSTM itself and how exactly self-attention can bring improvements. We formally show the limitation of (CRF-)BiLSTM in modeling cross-context patterns for each word – the XOR limitation. Then, we show that two types of simple cross-structures – self-attention and Cross-BiLSTM – can effectively remedy the problem. We further test them on real-world NER datasets, OntoNotes 5.0 and WNUT 2017, with clear and consistent improvements over the baseline, up to 8.7% on some of the multi-token entity mentions. We give in-depth analyses of the improvements across several aspects of NER, especially the identification of multi-token mentions. This study should lay a sound foundation for future improvements on sequence-labeling NER (code: https://github.com/jacobvsdanniel/cross_ner).
Tasks	Named Entity Recognition
Published	2019-08-29
URL	https://arxiv.org/abs/1908.11046v2
PDF	https://arxiv.org/pdf/1908.11046v2.pdf
PWC	https://paperswithcode.com/paper/remedying-bilstm-cnn-deficiency-in-modeling
Repo	https://github.com/ckiplab/ckiptagger
Framework	tf

Adaptive and Iteratively Improving Recurrent Lateral Connections


Title	Adaptive and Iteratively Improving Recurrent Lateral Connections
Authors	Barak Battash, Lior Wolf
Abstract	The current leading computer vision models are typically feed forward neural models, in which the output of one computational block is passed to the next one sequentially. This is in sharp contrast to the organization of the primate visual cortex, in which feedback and lateral connections are abundant. In this work, we propose a computational model for the role of lateral connections in a given block, in which the weights of the block vary dynamically as a function of its activations, and the input from the upstream blocks is iteratively reintroduced. We demonstrate how this novel architectural modification can lead to sizable gains in performance, when applied to visual action recognition without pretraining and that it outperforms the literature architectures with recurrent feedback processing on ImageNet.
Tasks
Published	2019-10-16
URL	https://arxiv.org/abs/1910.11105v1
PDF	https://arxiv.org/pdf/1910.11105v1.pdf
PWC	https://paperswithcode.com/paper/adaptive-and-iteratively-improving-recurrent
Repo	https://github.com/BattashB/Adaptive-and-Iteratively-Improving-Recurrent-Lateral-Connections
Framework	pytorch

Block Neural Autoregressive Flow


Title	Block Neural Autoregressive Flow
Authors	Nicola De Cao, Ivan Titov, Wilker Aziz
Abstract	Normalising flows (NFs) map two density functions via a differentiable bijection whose Jacobian determinant can be computed efficiently. Recently, as an alternative to hand-crafted bijections, Huang et al. (2018) proposed neural autoregressive flow (NAF) which is a universal approximator for density functions. Their flow is a neural network (NN) whose parameters are predicted by another NN. The latter grows quadratically with the size of the former and thus an efficient technique for parametrization is needed. We propose block neural autoregressive flow (B-NAF), a much more compact universal approximator of density functions, where we model a bijection directly using a single feed-forward network. Invertibility is ensured by carefully designing each affine transformation with block matrices that make the flow autoregressive and (strictly) monotone. We compare B-NAF to NAF and other established flows on density estimation and approximate inference for latent variable models. Our proposed flow is competitive across datasets while using orders of magnitude fewer parameters.
Tasks	Density Estimation, Latent Variable Models, Normalising Flows
Published	2019-04-09
URL	http://arxiv.org/abs/1904.04676v1
PDF	http://arxiv.org/pdf/1904.04676v1.pdf
PWC	https://paperswithcode.com/paper/block-neural-autoregressive-flow
Repo	https://github.com/nicola-decao/BNAF
Framework	pytorch

Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms


Title	Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms
Authors	Daeryong Kim, Bongwon Suh
Abstract	Neural network based models for collaborative filtering have started to gain attention recently. One branch of research is based on using deep generative models to model user preferences where variational autoencoders were shown to produce state-of-the-art results. However, there are some potentially problematic characteristics of the current variational autoencoder for CF. The first is the too simplistic prior that VAEs incorporate for learning the latent representations of user preference. The other is the model’s inability to learn deeper representations with more than one hidden layer for each network. Our goal is to incorporate appropriate techniques to mitigate the aforementioned problems of variational autoencoder CF and further improve the recommendation performance. Our work is the first to apply flexible priors to collaborative filtering and show that simple priors (in original VAEs) may be too restrictive to fully model user preferences and setting a more flexible prior gives significant gains. We experiment with the VampPrior, originally proposed for image generation, to examine the effect of flexible priors in CF. We also show that VampPriors coupled with gating mechanisms outperform SOTA results including the Variational Autoencoder for Collaborative Filtering by meaningful margins on 2 popular benchmark datasets (MovieLens & Netflix).
Tasks	Image Generation
Published	2019-11-03
URL	https://arxiv.org/abs/1911.00936v1
PDF	https://arxiv.org/pdf/1911.00936v1.pdf
PWC	https://paperswithcode.com/paper/enhancing-vaes-for-collaborative-filtering
Repo	https://github.com/psywaves/EVCF
Framework	pytorch

Improving Neural Networks by Adopting Amplifying and Attenuating Neurons


Title	Improving Neural Networks by Adopting Amplifying and Attenuating Neurons
Authors	Seongmun Jung, Oh Joon Kwon
Abstract	In the present study, an amplifying neuron and attenuating neuron, which can be easily implemented into neural networks without any significant additional computational effort, are proposed. The activated output value is squared for the amplifying neuron, while the value becomes its reciprocal for the attenuating one. Theoretically, the order of neural networks increases when the amplifying neuron is placed in the hidden layer. The performance assessments of neural networks were conducted to verify that the amplifying and attenuating neurons enhance the performance of neural networks. From the numerical experiments, it was revealed that the neural networks that contain the amplifying and attenuating neurons yield more accurate results, compared to those without them.
Tasks
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09574v2
PDF	https://arxiv.org/pdf/1905.09574v2.pdf
PWC	https://paperswithcode.com/paper/improving-neural-networks-by-adopting
Repo	https://github.com/TheWinterSky/Amplifying-and-attenuating-neurons
Framework	none
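
The two neuron types described in the abstract amount to one extra operation after the activation: squaring for the amplifying neuron and taking the reciprocal for the attenuating one. The sketch below illustrates this with a sigmoid activation, which is an assumption chosen because its output is never zero for moderate inputs, so the reciprocal stays defined. The names `amplifying`, `attenuating` and `mixed_layer` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation (an assumed choice; the abstract names none)."""
    return 1.0 / (1.0 + math.exp(-z))


def amplifying(z, act=sigmoid):
    """Amplifying neuron: the activated output is squared."""
    return act(z) ** 2


def attenuating(z, act=sigmoid):
    """Attenuating neuron: the activated output is replaced by its reciprocal."""
    return 1.0 / act(z)


def mixed_layer(inputs, weights, biases, kinds):
    """Dense layer whose neurons are 'amplify', 'attenuate' or plain."""
    outputs = []
    for w_row, b, kind in zip(weights, biases, kinds):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        if kind == "amplify":
            outputs.append(amplifying(z))
        elif kind == "attenuate":
            outputs.append(attenuating(z))
        else:
            outputs.append(sigmoid(z))
    return outputs
```

With a pre-activation of zero the sigmoid gives 0.5, so `amplifying(0.0)` returns 0.25 and `attenuating(0.0)` returns 2.0.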

Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm


Title	Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm
Authors	Hung Duy Le, Huynh Van Luong, Nikos Deligiannis
Abstract	We propose a new deep recurrent neural network (RNN) architecture for sequential signal reconstruction. Our network is designed by unfolding the iterations of the proximal gradient method that solves the l1-l1 minimization problem. As such, our network leverages by design that signals have a sparse representation and that the difference between consecutive signal representations is also sparse. We evaluate the proposed model in the task of reconstructing video frames from compressive measurements and show that it outperforms several state-of-the-art RNN models.
Tasks
Published	2019-02-18
URL	http://arxiv.org/abs/1902.06522v1
PDF	http://arxiv.org/pdf/1902.06522v1.pdf
PWC	https://paperswithcode.com/paper/designing-recurrent-neural-networks-by
Repo	https://github.com/dhungle/L1-L1-RNN
Framework	pytorch

Preserving physically important variables in optimal event selections: A case study in Higgs physics


Title	Preserving physically important variables in optimal event selections: A case study in Higgs physics
Authors	Philipp Windischhofer, Miha Zgubic, Daniela Bortoletto
Abstract	Analyses of collider data, often assisted by modern Machine Learning methods, condense a number of observables into a few powerful discriminants for the separation of the targeted signal process from the contributing backgrounds. These discriminants are highly correlated with important physical observables; using them in the event selection thus leads to the distortion of physically relevant distributions. We present an alternative event selection strategy, based on adversarially trained classifiers, that exploits the discriminating power contained in many event variables, but preserves the distributions of selected observables. This method is implemented and evaluated for the case of a Standard Model Higgs boson decaying into a pair of bottom quarks. Compared to a cut-based approach, it leads to a significant improvement in analysis sensitivity and retains the shapes of the relevant distributions to a greater extent.
Tasks
Published	2019-07-03
URL	https://arxiv.org/abs/1907.02098v1
PDF	https://arxiv.org/pdf/1907.02098v1.pdf
PWC	https://paperswithcode.com/paper/preserving-physically-important-variables-in
Repo	https://github.com/philippwindischhofer/HiggsPivoting
Framework	tf

LFZip: Lossy compression of multivariate floating-point time series data via improved prediction


Title	LFZip: Lossy compression of multivariate floating-point time series data via improved prediction
Authors	Shubham Chandak, Kedar Tatwawadi, Chengtao Wen, Lingyun Wang, Juan Aparicio, Tsachy Weissman
Abstract	Time series data compression is emerging as an important problem with the growth in IoT devices and sensors. Due to the presence of noise in these datasets, lossy compression can often provide significant compression gains without impacting the performance of downstream applications. In this work, we propose an error-bounded lossy compressor, LFZip, for multivariate floating-point time series data that provides guaranteed reconstruction up to user-specified maximum absolute error. The compressor is based on the prediction-quantization-entropy coder framework and benefits from improved prediction using linear models and neural networks. We evaluate the compressor on several time series datasets where it outperforms the existing state-of-the-art error-bounded lossy compressors. The code and data are available at https://github.com/shubhamchandak94/LFZip
Tasks	Quantization, Time Series
Published	2019-11-01
URL	https://arxiv.org/abs/1911.00208v2
PDF	https://arxiv.org/pdf/1911.00208v2.pdf
PWC	https://paperswithcode.com/paper/lfzip-lossy-compression-of-multivariate
Repo	https://github.com/shubhamchandak94/LFZip
Framework	tf
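
The prediction-quantization part of the framework named in the abstract can be sketched for a single variable. The example assumes the simplest possible predictor (the previous reconstructed value) in place of the linear models and neural networks the abstract mentions, and it omits the entropy coder entirely. The function names `compress` and `decompress` are hypothetical and are not LFZip's API.

```python
def compress(series, max_error):
    """Turn a series into integer symbols with a bounded reconstruction error.

    Each residual is quantized with a uniform step of 2 * max_error, so the
    reconstruction differs from the input by at most max_error (up to
    floating-point rounding). The symbols would then go to an entropy coder.
    """
    step = 2.0 * max_error
    prediction = 0.0  # assumed predictor: previous reconstructed value
    symbols = []
    for x in series:
        q = round((x - prediction) / step)
        symbols.append(q)
        # Track the decoder's reconstruction so both sides stay in sync.
        prediction = prediction + q * step
    return symbols


def decompress(symbols, max_error):
    """Rebuild the series from the integer symbols."""
    step = 2.0 * max_error
    prediction = 0.0
    series = []
    for q in symbols:
        prediction = prediction + q * step
        series.append(prediction)
    return series
```

Predicting from reconstructed values rather than from the original inputs is the design choice that keeps the error from accumulating: the encoder only ever uses information the decoder also has.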

Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation


Title	Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation
Authors	Seungmin Lee, Dongwan Kim, Namil Kim, Seong-Gyun Jeong
Abstract	Recent works on domain adaptation exploit adversarial training to obtain domain-invariant feature representations from the joint learning of feature extractor and domain discriminator networks. However, domain adversarial methods render suboptimal performances since they attempt to match the distributions among the domains without considering the task at hand. We propose Drop to Adapt (DTA), which leverages adversarial dropout to learn strongly discriminative features by enforcing the cluster assumption. Accordingly, we design objective functions to support robust domain adaptation. We demonstrate efficacy of the proposed method on various experiments and achieve consistent improvements in both image classification and semantic segmentation tasks. Our source code is available at https://github.com/postBG/DTA.pytorch.
Tasks	Domain Adaptation, Image Classification, Semantic Segmentation, Unsupervised Domain Adaptation
Published	2019-10-12
URL	https://arxiv.org/abs/1910.05562v1
PDF	https://arxiv.org/pdf/1910.05562v1.pdf
PWC	https://paperswithcode.com/paper/drop-to-adapt-learning-discriminative
Repo	https://github.com/postBG/DTA.pytorch
Framework	pytorch