Paper Group AWR 103
Learning to Traverse Latent Spaces for Musical Score Inpainting
Title | Learning to Traverse Latent Spaces for Musical Score Inpainting |
Authors | Ashis Pati, Alexander Lerch, Gaëtan Hadjeres |
Abstract | Music Inpainting is the task of filling in missing or lost information in a piece of music. We investigate this task from an interactive music creation perspective. To this end, a novel deep learning-based approach for musical score inpainting is proposed. The designed model takes both past and future musical context into account and is capable of suggesting ways to connect them in a musically meaningful manner. To achieve this, we leverage the representational power of the latent space of a Variational Auto-Encoder and train a Recurrent Neural Network which learns to traverse this latent space conditioned on the past and future musical contexts. Consequently, the designed model is capable of generating several measures of music to connect two musical excerpts. The capabilities and performance of the model are showcased by comparison with competitive baselines using several objective and subjective evaluation methods. The results show that the model generates meaningful inpaintings and can be used in interactive music creation applications. Overall, the method demonstrates the merit of learning complex trajectories in the latent spaces of deep generative models. |
Tasks | |
Published | 2019-07-02 |
URL | https://arxiv.org/abs/1907.01164v1 |
https://arxiv.org/pdf/1907.01164v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-traverse-latent-spaces-for |
Repo | https://github.com/ashispati/InpaintNet |
Framework | pytorch |
KPConv: Flexible and Deformable Convolution for Point Clouds
Title | KPConv: Flexible and Deformable Convolution for Point Clouds |
Authors | Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas |
Abstract | We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. one that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPConv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv. |
Tasks | Semantic Segmentation |
Published | 2019-04-18 |
URL | https://arxiv.org/abs/1904.08889v2 |
https://arxiv.org/pdf/1904.08889v2.pdf | |
PWC | https://paperswithcode.com/paper/kpconv-flexible-and-deformable-convolution |
Repo | https://github.com/HuguesTHOMAS/KPConv |
Framework | tf |
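The KPConv abstract above says that convolution weights are attached to kernel points in Euclidean space and applied to the input points close to them. The following is a minimal sketch of that idea for a single output location, not the authors' implementation. It assumes scalar features and scalar kernel weights (the real layer uses weight matrices), a linear influence function `max(0, 1 - d / sigma)`, and no neighbourhood radius cut-off; the function name `kpconv_at` and all parameter names are hypothetical.

```python
import math


def kpconv_at(center, points, features, kernel_points, kernel_weights, sigma):
    """Evaluate a simplified kernel point convolution at `center`.

    Assumptions (illustrative only): features and kernel weights are scalars,
    and a kernel point's influence decays linearly to zero at distance `sigma`.
    """
    out = 0.0
    for point, feature in zip(points, features):
        # Position of the input point relative to the convolution center.
        rel = [p - c for p, c in zip(point, center)]
        for k_pos, k_weight in zip(kernel_points, kernel_weights):
            dist = math.dist(rel, k_pos)
            # Kernel points closer to the input point contribute more.
            influence = max(0.0, 1.0 - dist / sigma)
            out += influence * k_weight * feature
    return out


if __name__ == "__main__":
    value = kpconv_at(
        center=(0.0, 0.0, 0.0),
        points=[(1.0, 0.0, 0.0)],
        features=[2.0],
        kernel_points=[(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
        kernel_weights=[3.0, 5.0],
        sigma=2.0,
    )
    print(value)
```

In this sketch a deformable variant would add a learned offset to each entry of `kernel_points` before the distances are computed; that step is omitted here.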
FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing
Title | FeatherNets: Convolutional Neural Networks as Light as Feather for Face Anti-spoofing |
Authors | Peng Zhang, Fuhao Zou, Zhiwen Wu, Nengli Dai, Skarpness Mark, Michael Fu, Juan Zhao, Kai Li |
Abstract | Face anti-spoofing has recently gained increased attention in both academic and industrial fields. With the emergence of various CNN-based solutions, multi-modal (RGB, depth and IR) CNN-based methods have shown better performance than single-modal classifiers. However, there is a need to improve performance and reduce complexity. Therefore, an extremely light network architecture (FeatherNet A/B) is proposed with a streaming module that fixes the weakness of Global Average Pooling and uses fewer parameters. Our single FeatherNet, trained on depth images only, provides a higher baseline with 0.00168 ACER, 0.35M parameters and 83M FLOPS. Furthermore, a novel fusion procedure with an "ensemble + cascade" structure is presented to satisfy performance-preferred use cases. Meanwhile, the MMFD dataset is collected to provide more attacks and diversity to gain better generalization. We used the fusion method in the Face Anti-spoofing Attack Detection Challenge@CVPR2019 and obtained 0.0013 (ACER), 0.999 (TPR@FPR=10e-2), 0.998 (TPR@FPR=10e-3) and 0.9814 (TPR@FPR=10e-4). |
Tasks | Face Anti-Spoofing |
Published | 2019-04-22 |
URL | http://arxiv.org/abs/1904.09290v1 |
http://arxiv.org/pdf/1904.09290v1.pdf | |
PWC | https://paperswithcode.com/paper/190409290 |
Repo | https://github.com/SoftwareGift/FeatheNets_Face-Anti-spoofing-Attack-Detection-Challenge-CVPR2019 |
Framework | pytorch |
Cross-Domain Transferability of Adversarial Perturbations
Title | Cross-Domain Transferability of Adversarial Perturbations |
Authors | Muzammal Naseer, Salman H. Khan, Harris Khan, Fahad Shahbaz Khan, Fatih Porikli |
Abstract | Adversarial examples reveal the blind spots of deep neural networks (DNNs) and represent a major concern for security-critical applications. The transferability of adversarial examples makes real-world attacks possible in black-box settings, where the attacker is forbidden to access the internal parameters of the model. The underlying assumption in most adversary generation methods, whether learning an instance-specific or an instance-agnostic perturbation, is the direct or indirect reliance on the original domain-specific data distribution. In this work, for the first time, we demonstrate the existence of domain-invariant adversaries, thereby showing common adversarial space among different datasets and models. To this end, we propose a framework capable of launching highly transferable attacks that crafts adversarial patterns to mislead networks trained on wholly different domains. For instance, an adversarial function learned on Paintings, Cartoons or Medical images can successfully perturb ImageNet samples to fool the classifier, with success rates as high as $\sim$99% ($\ell_{\infty} \le 10$). The core of our proposed adversarial function is a generative network that is trained using a relativistic supervisory signal that enables domain-invariant perturbations. Our approach sets the new state-of-the-art for fooling rates, both under the white-box and black-box scenarios. Furthermore, despite being an instance-agnostic perturbation function, our attack outperforms the conventionally much stronger instance-specific attack methods. |
Tasks | |
Published | 2019-05-28 |
URL | https://arxiv.org/abs/1905.11736v5 |
https://arxiv.org/pdf/1905.11736v5.pdf | |
PWC | https://paperswithcode.com/paper/cross-domain-transferability-of-adversarial |
Repo | https://github.com/Muzammal-Naseer/Cross-domain-perturbations |
Framework | pytorch |
Soft Anchor-Point Object Detection
Title | Soft Anchor-Point Object Detection |
Authors | Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides |
Abstract | Recently, anchor-free detectors have shown great potential to outperform anchor-based detectors in terms of both accuracy and speed. In this work, we aim at finding a new balance of speed and accuracy for anchor-free detectors. Two questions are studied: 1) how to make the anchor-free detection head better? 2) how to utilize the power of feature pyramid better? We identify attention bias and feature selection as the main issues for these two questions respectively. We propose to address these issues with a novel training strategy that has two softened optimization techniques, i.e. soft-weighted anchor points and soft-selected pyramid levels. To evaluate the effectiveness, we train a single-stage anchor-free detector called Soft Anchor-Point Detector (SAPD). Experiments show that our concise SAPD pushes the envelope of speed/accuracy trade-off to a new level, outperforming recent state-of-the-art anchor-based and anchor-free, single-stage and multi-stage detectors. Without bells and whistles, our best model can achieve a single-model single-scale AP of 47.4% on COCO. Our fastest version can run up to 5x faster than other detectors with comparable accuracy. |
Tasks | Feature Selection, Object Detection |
Published | 2019-11-27 |
URL | https://arxiv.org/abs/1911.12448v1 |
https://arxiv.org/pdf/1911.12448v1.pdf | |
PWC | https://paperswithcode.com/paper/soft-anchor-point-object-detection |
Repo | https://github.com/xuannianz/SAPD |
Framework | tf |
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
Title | CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features |
Authors | Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo |
Abstract | Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend on less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it leads to information loss and inefficiency during training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms the state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix improves the model robustness against input corruptions and its out-of-distribution detection performances. Source code and pretrained models are available at https://github.com/clovaai/CutMix-PyTorch. |
Tasks | Image Captioning, Image Classification, Object Localization, Out-of-Distribution Detection |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04899v2 |
https://arxiv.org/pdf/1905.04899v2.pdf | |
PWC | https://paperswithcode.com/paper/cutmix-regularization-strategy-to-train |
Repo | https://github.com/DevBruce/CutMixImageDataGenerator_For_Keras |
Framework | none |
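The CutMix abstract above describes cutting a patch from one training image, pasting it into another, and mixing the labels in proportion to the patch area. The sketch below illustrates that recipe on single-channel images stored as nested lists; it is an illustration under stated assumptions, not the authors' code. Drawing the mixing ratio from a Beta(alpha, alpha) distribution and sizing the box with `sqrt(1 - lam)` are assumptions here, and the names `sample_box` and `cutmix` are hypothetical.

```python
import math
import random


def sample_box(height, width, lam, rng):
    """Pick a rectangle covering roughly (1 - lam) of the image area."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    cy, cx = rng.randrange(height), rng.randrange(width)
    # Clip the box to the image borders.
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    return top, bottom, left, right


def cutmix(img_a, img_b, label_a, label_b, alpha=1.0, rng=None):
    """Paste a patch of img_b into a copy of img_a and mix the one-hot labels."""
    rng = rng or random.Random()
    height, width = len(img_a), len(img_a[0])
    lam = rng.betavariate(alpha, alpha)  # assumed sampling scheme
    top, bottom, left, right = sample_box(height, width, lam, rng)

    mixed = [row[:] for row in img_a]  # leave the inputs untouched
    for y in range(top, bottom):
        mixed[y][left:right] = img_b[y][left:right]

    # Recompute lam from the clipped box so labels match the pasted area exactly.
    patch_area = (bottom - top) * (right - left)
    lam = 1.0 - patch_area / (height * width)
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label, lam


if __name__ == "__main__":
    zeros = [[0] * 8 for _ in range(8)]
    ones = [[1] * 8 for _ in range(8)]
    image, label, lam = cutmix(zeros, ones, [1.0, 0.0], [0.0, 1.0], rng=random.Random(0))
    print(label, lam)
```

Recomputing `lam` after clipping matters because a box near the border is smaller than requested; without that step the label weights would no longer equal the true pixel proportions.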
Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER
Title | Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER |
Authors | Peng-Hsuan Li, Tsu-Jui Fu, Wei-Yun Ma |
Abstract | BiLSTM has been prevalently used as a core module for NER in a sequence-labeling setup. State-of-the-art approaches use BiLSTM with additional resources such as gazetteers, language-modeling, or multi-task supervision to further improve NER. This paper instead takes a step back and focuses on analyzing problems of BiLSTM itself and how exactly self-attention can bring improvements. We formally show the limitation of (CRF-)BiLSTM in modeling cross-context patterns for each word – the XOR limitation. Then, we show that two types of simple cross-structures – self-attention and Cross-BiLSTM – can effectively remedy the problem. We further test them on real-world NER datasets, OntoNotes 5.0 and WNUT 2017, with clear and consistent improvements over the baseline, up to 8.7% on some of the multi-token entity mentions. We give in-depth analyses of the improvements across several aspects of NER, especially the identification of multi-token mentions. This study should lay a sound foundation for future improvements on sequence-labeling NER (code: https://github.com/jacobvsdanniel/cross_ner). |
Tasks | Named Entity Recognition |
Published | 2019-08-29 |
URL | https://arxiv.org/abs/1908.11046v2 |
https://arxiv.org/pdf/1908.11046v2.pdf | |
PWC | https://paperswithcode.com/paper/remedying-bilstm-cnn-deficiency-in-modeling |
Repo | https://github.com/ckiplab/ckiptagger |
Framework | tf |
Adaptive and Iteratively Improving Recurrent Lateral Connections
Title | Adaptive and Iteratively Improving Recurrent Lateral Connections |
Authors | Barak Battash, Lior Wolf |
Abstract | The current leading computer vision models are typically feed forward neural models, in which the output of one computational block is passed to the next one sequentially. This is in sharp contrast to the organization of the primate visual cortex, in which feedback and lateral connections are abundant. In this work, we propose a computational model for the role of lateral connections in a given block, in which the weights of the block vary dynamically as a function of its activations, and the input from the upstream blocks is iteratively reintroduced. We demonstrate how this novel architectural modification can lead to sizable gains in performance, when applied to visual action recognition without pretraining and that it outperforms the literature architectures with recurrent feedback processing on ImageNet. |
Tasks | |
Published | 2019-10-16 |
URL | https://arxiv.org/abs/1910.11105v1 |
https://arxiv.org/pdf/1910.11105v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-and-iteratively-improving-recurrent |
Repo | https://github.com/BattashB/Adaptive-and-Iteratively-Improving-Recurrent-Lateral-Connections |
Framework | pytorch |
Block Neural Autoregressive Flow
Title | Block Neural Autoregressive Flow |
Authors | Nicola De Cao, Ivan Titov, Wilker Aziz |
Abstract | Normalising flows (NFs) map two density functions via a differentiable bijection whose Jacobian determinant can be computed efficiently. Recently, as an alternative to hand-crafted bijections, Huang et al. (2018) proposed neural autoregressive flow (NAF) which is a universal approximator for density functions. Their flow is a neural network (NN) whose parameters are predicted by another NN. The latter grows quadratically with the size of the former and thus an efficient technique for parametrization is needed. We propose block neural autoregressive flow (B-NAF), a much more compact universal approximator of density functions, where we model a bijection directly using a single feed-forward network. Invertibility is ensured by carefully designing each affine transformation with block matrices that make the flow autoregressive and (strictly) monotone. We compare B-NAF to NAF and other established flows on density estimation and approximate inference for latent variable models. Our proposed flow is competitive across datasets while using orders of magnitude fewer parameters. |
Tasks | Density Estimation, Latent Variable Models, Normalising Flows |
Published | 2019-04-09 |
URL | http://arxiv.org/abs/1904.04676v1 |
http://arxiv.org/pdf/1904.04676v1.pdf | |
PWC | https://paperswithcode.com/paper/block-neural-autoregressive-flow |
Repo | https://github.com/nicola-decao/BNAF |
Framework | pytorch |
Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms
Title | Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms |
Authors | Daeryong Kim, Bongwon Suh |
Abstract | Neural network based models for collaborative filtering have started to gain attention recently. One branch of research is based on using deep generative models to model user preferences where variational autoencoders were shown to produce state-of-the-art results. However, there are some potentially problematic characteristics of the current variational autoencoder for CF. The first is the too simplistic prior that VAEs incorporate for learning the latent representations of user preference. The other is the model’s inability to learn deeper representations with more than one hidden layer for each network. Our goal is to incorporate appropriate techniques to mitigate the aforementioned problems of variational autoencoder CF and further improve the recommendation performance. Our work is the first to apply flexible priors to collaborative filtering and show that simple priors (in original VAEs) may be too restrictive to fully model user preferences and setting a more flexible prior gives significant gains. We experiment with the VampPrior, originally proposed for image generation, to examine the effect of flexible priors in CF. We also show that VampPriors coupled with gating mechanisms outperform SOTA results including the Variational Autoencoder for Collaborative Filtering by meaningful margins on 2 popular benchmark datasets (MovieLens & Netflix). |
Tasks | Image Generation |
Published | 2019-11-03 |
URL | https://arxiv.org/abs/1911.00936v1 |
https://arxiv.org/pdf/1911.00936v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-vaes-for-collaborative-filtering |
Repo | https://github.com/psywaves/EVCF |
Framework | pytorch |
Improving Neural Networks by Adopting Amplifying and Attenuating Neurons
Title | Improving Neural Networks by Adopting Amplifying and Attenuating Neurons |
Authors | Seongmun Jung, Oh Joon Kwon |
Abstract | In the present study, an amplifying neuron and attenuating neuron, which can be easily implemented into neural networks without any significant additional computational effort, are proposed. The activated output value is squared for the amplifying neuron, while the value becomes its reciprocal for the attenuating one. Theoretically, the order of neural networks increases when the amplifying neuron is placed in the hidden layer. The performance assessments of neural networks were conducted to verify that the amplifying and attenuating neurons enhance the performance of neural networks. From the numerical experiments, it was revealed that the neural networks that contain the amplifying and attenuating neurons yield more accurate results, compared to those without them. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09574v2 |
https://arxiv.org/pdf/1905.09574v2.pdf | |
PWC | https://paperswithcode.com/paper/improving-neural-networks-by-adopting |
Repo | https://github.com/TheWinterSky/Amplifying-and-attenuating-neurons |
Framework | none |
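The abstract above defines the two neuron types in one sentence: the activated output is squared for the amplifying neuron and replaced by its reciprocal for the attenuating one. A minimal sketch of a single neuron follows. The sigmoid activation is an assumption chosen so that the activated value is strictly positive and the reciprocal is always defined; the abstract does not say which activation the authors use, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic activation; its output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, kind="plain"):
    """Forward pass of one neuron with an optional post-activation transform.

    kind="amplifying"  -> square the activated value
    kind="attenuating" -> take the reciprocal of the activated value
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    a = sigmoid(z)
    if kind == "amplifying":
        return a * a
    if kind == "attenuating":
        return 1.0 / a
    if kind == "plain":
        return a
    raise ValueError(f"unknown neuron kind: {kind}")


if __name__ == "__main__":
    x, w, b = [0.0, 0.0], [1.0, 1.0], 0.0
    for kind in ("plain", "amplifying", "attenuating"):
        print(kind, neuron(x, w, b, kind))
```

Both transforms are applied after the activation, so they add one multiplication or division per neuron, which is consistent with the abstract's claim of no significant extra computational effort.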
Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm
Title | Designing recurrent neural networks by unfolding an l1-l1 minimization algorithm |
Authors | Hung Duy Le, Huynh Van Luong, Nikos Deligiannis |
Abstract | We propose a new deep recurrent neural network (RNN) architecture for sequential signal reconstruction. Our network is designed by unfolding the iterations of the proximal gradient method that solves the l1-l1 minimization problem. As such, our network leverages by design that signals have a sparse representation and that the difference between consecutive signal representations is also sparse. We evaluate the proposed model in the task of reconstructing video frames from compressive measurements and show that it outperforms several state-of-the-art RNN models. |
Tasks | |
Published | 2019-02-18 |
URL | http://arxiv.org/abs/1902.06522v1 |
http://arxiv.org/pdf/1902.06522v1.pdf | |
PWC | https://paperswithcode.com/paper/designing-recurrent-neural-networks-by |
Repo | https://github.com/dhungle/L1-L1-RNN |
Framework | pytorch |
Preserving physically important variables in optimal event selections: A case study in Higgs physics
Title | Preserving physically important variables in optimal event selections: A case study in Higgs physics |
Authors | Philipp Windischhofer, Miha Zgubic, Daniela Bortoletto |
Abstract | Analyses of collider data, often assisted by modern Machine Learning methods, condense a number of observables into a few powerful discriminants for the separation of the targeted signal process from the contributing backgrounds. These discriminants are highly correlated with important physical observables; using them in the event selection thus leads to the distortion of physically relevant distributions. We present an alternative event selection strategy, based on adversarially trained classifiers, that exploits the discriminating power contained in many event variables, but preserves the distributions of selected observables. This method is implemented and evaluated for the case of a Standard Model Higgs boson decaying into a pair of bottom quarks. Compared to a cut-based approach, it leads to a significant improvement in analysis sensitivity and retains the shapes of the relevant distributions to a greater extent. |
Tasks | |
Published | 2019-07-03 |
URL | https://arxiv.org/abs/1907.02098v1 |
https://arxiv.org/pdf/1907.02098v1.pdf | |
PWC | https://paperswithcode.com/paper/preserving-physically-important-variables-in |
Repo | https://github.com/philippwindischhofer/HiggsPivoting |
Framework | tf |
LFZip: Lossy compression of multivariate floating-point time series data via improved prediction
Title | LFZip: Lossy compression of multivariate floating-point time series data via improved prediction |
Authors | Shubham Chandak, Kedar Tatwawadi, Chengtao Wen, Lingyun Wang, Juan Aparicio, Tsachy Weissman |
Abstract | Time series data compression is emerging as an important problem with the growth in IoT devices and sensors. Due to the presence of noise in these datasets, lossy compression can often provide significant compression gains without impacting the performance of downstream applications. In this work, we propose an error-bounded lossy compressor, LFZip, for multivariate floating-point time series data that provides guaranteed reconstruction up to user-specified maximum absolute error. The compressor is based on the prediction-quantization-entropy coder framework and benefits from improved prediction using linear models and neural networks. We evaluate the compressor on several time series datasets where it outperforms the existing state-of-the-art error-bounded lossy compressors. The code and data are available at https://github.com/shubhamchandak94/LFZip |
Tasks | Quantization, Time Series |
Published | 2019-11-01 |
URL | https://arxiv.org/abs/1911.00208v2 |
https://arxiv.org/pdf/1911.00208v2.pdf | |
PWC | https://paperswithcode.com/paper/lfzip-lossy-compression-of-multivariate |
Repo | https://github.com/shubhamchandak94/LFZip |
Framework | tf |
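The LFZip abstract above describes a prediction-quantization-entropy coder pipeline with a guaranteed maximum absolute reconstruction error. The sketch below shows why quantizing the prediction residual with a step of twice the error bound gives that guarantee. It is a simplified illustration, not the LFZip code: the predictor is a trivial stand-in (the previous reconstructed value) rather than the linear or neural predictors mentioned in the abstract, the entropy coding stage is omitted, and the names `encode_residuals` and `decode_residuals` are hypothetical.

```python
def encode_residuals(series, max_error):
    """Map each sample to an integer index of its quantized prediction residual."""
    step = 2.0 * max_error
    indices = []
    prev_reconstructed = 0.0
    for x in series:
        # Stand-in predictor: reuse the previous *reconstructed* value, so the
        # decoder can form exactly the same prediction.
        prediction = prev_reconstructed
        index = round((x - prediction) / step)
        indices.append(index)
        prev_reconstructed = prediction + index * step
    return indices


def decode_residuals(indices, max_error):
    """Rebuild the series from the quantization indices."""
    step = 2.0 * max_error
    reconstructed = []
    prev_reconstructed = 0.0
    for index in indices:
        prediction = prev_reconstructed
        prev_reconstructed = prediction + index * step
        reconstructed.append(prev_reconstructed)
    return reconstructed


if __name__ == "__main__":
    data = [0.1, 0.25, 0.3, 1.7, -2.4]
    bound = 0.05
    codes = encode_residuals(data, bound)
    restored = decode_residuals(codes, bound)
    worst = max(abs(a - b) for a, b in zip(data, restored))
    print(codes, worst)
```

Rounding a residual to the nearest multiple of `2 * max_error` leaves an error of at most `max_error` (up to floating-point rounding), and because the encoder predicts from reconstructed rather than original values, the error does not accumulate over time. A better predictor makes the indices smaller and therefore cheaper to entropy-code, without changing the error bound.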
Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation
Title | Drop to Adapt: Learning Discriminative Features for Unsupervised Domain Adaptation |
Authors | Seungmin Lee, Dongwan Kim, Namil Kim, Seong-Gyun Jeong |
Abstract | Recent works on domain adaptation exploit adversarial training to obtain domain-invariant feature representations from the joint learning of feature extractor and domain discriminator networks. However, domain adversarial methods render suboptimal performances since they attempt to match the distributions among the domains without considering the task at hand. We propose Drop to Adapt (DTA), which leverages adversarial dropout to learn strongly discriminative features by enforcing the cluster assumption. Accordingly, we design objective functions to support robust domain adaptation. We demonstrate efficacy of the proposed method on various experiments and achieve consistent improvements in both image classification and semantic segmentation tasks. Our source code is available at https://github.com/postBG/DTA.pytorch. |
Tasks | Domain Adaptation, Image Classification, Semantic Segmentation, Unsupervised Domain Adaptation |
Published | 2019-10-12 |
URL | https://arxiv.org/abs/1910.05562v1 |
https://arxiv.org/pdf/1910.05562v1.pdf | |
PWC | https://paperswithcode.com/paper/drop-to-adapt-learning-discriminative |
Repo | https://github.com/postBG/DTA.pytorch |
Framework | pytorch |