October 20, 2019

Paper Group AWR 233

Skip-gram word embeddings in hyperbolic space

Title Skip-gram word embeddings in hyperbolic space
Authors Matthias Leimeister, Benjamin J. Wilson
Abstract Recent work has demonstrated that embeddings of tree-like graphs in hyperbolic space surpass their Euclidean counterparts in performance by a large margin. Inspired by these results and scale-free structure in the word co-occurrence graph, we present an algorithm for learning word embeddings in hyperbolic space from free text. An objective function based on the hyperbolic distance is derived and included in the skip-gram negative-sampling architecture of word2vec. The hyperbolic word embeddings are then evaluated on word similarity and analogy benchmarks. The results demonstrate the potential of hyperbolic word embeddings, particularly in low dimensions, though without clear superiority over their Euclidean counterparts. We further discuss subtleties in the formulation of the analogy task in curved spaces.
Tasks Learning Word Embeddings, Word Embeddings
Published 2018-08-30
URL https://arxiv.org/abs/1809.01498v2
PDF https://arxiv.org/pdf/1809.01498v2.pdf
PWC https://paperswithcode.com/paper/skip-gram-word-embeddings-in-hyperbolic-space
Repo https://github.com/mtbarta/hyperbolic
Framework tf
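
The abstract above refers to an objective built on the hyperbolic distance. As a minimal sketch of what such a distance looks like, the snippet below computes the geodesic distance in the Poincaré ball model. The choice of model is an assumption made here for illustration (the abstract does not say which model of hyperbolic space the authors use), and the function name `poincare_distance` is hypothetical.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Assumption: Poincare ball model, where
    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2))).
    """
    sq_norm_u = sum(a * a for a in u)
    sq_norm_v = sum(b * b for b in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)
```

Distances grow without bound as a point approaches the boundary of the ball, which is one reason tree-like structures can fit in few dimensions.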

PCN: Point Completion Network

Title PCN: Point Completion Network
Authors Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert
Abstract Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications. In this work, we propose Point Completion Network (PCN), a novel learning-based approach for shape completion. Unlike existing shape completion methods, PCN directly operates on raw point clouds without any structural assumption (e.g. symmetry) or annotation (e.g. semantic class) about the underlying shape. It features a decoder design that enables the generation of fine-grained completions while maintaining a small number of parameters. Our experiments show that PCN produces dense, complete point clouds with realistic structures in the missing regions on inputs with various levels of incompleteness and noise, including cars from LiDAR scans in the KITTI dataset.
Tasks Generating 3D Point Clouds
Published 2018-08-02
URL https://arxiv.org/abs/1808.00671v3
PDF https://arxiv.org/pdf/1808.00671v3.pdf
PWC https://paperswithcode.com/paper/pcn-point-completion-network
Repo https://github.com/TonythePlaneswalker/pcn
Framework tf

On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing

Title On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing
Authors Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, Nanyun Peng
Abstract Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. The former relies on sequential information while the latter is more flexible at modeling word order. Rigorous experiments and detailed analysis show that RNN-based architectures transfer well to languages that are close to English, while self-attentive models have better overall cross-lingual transferability and perform especially well on distant languages.
Tasks Cross-Lingual Transfer, Dependency Parsing
Published 2018-11-01
URL http://arxiv.org/abs/1811.00570v3
PDF http://arxiv.org/pdf/1811.00570v3.pdf
PWC https://paperswithcode.com/paper/on-difficulties-of-cross-lingual-transfer
Repo https://github.com/uclanlp/CrossLingualDepParser
Framework pytorch

Discriminability objective for training descriptive captions

Title Discriminability objective for training descriptive captions
Authors Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich
Abstract One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them. We propose a way to improve this aspect of caption generation. By incorporating into the captioning training objective a loss component directly related to the ability (by a machine) to disambiguate image/caption matches, we obtain systems that produce much more discriminative captions, according to human evaluation. Remarkably, our approach leads to improvement in other aspects of generated captions, reflected by a battery of standard scores such as BLEU, SPICE etc. Our approach is modular and can be applied to a variety of model/loss combinations commonly proposed for image captioning.
Tasks Image Captioning
Published 2018-03-12
URL http://arxiv.org/abs/1803.04376v2
PDF http://arxiv.org/pdf/1803.04376v2.pdf
PWC https://paperswithcode.com/paper/discriminability-objective-for-training
Repo https://github.com/pdaicode/ImageCaptioning
Framework none

Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples

Title Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
Authors Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang
Abstract Adversarial sample attacks perturb benign inputs to induce DNN misbehaviors. Recent research has demonstrated the widespread presence and the devastating consequences of such attacks. Existing defense techniques either assume prior knowledge of specific attacks or may not work well on complex models due to their underlying assumptions. We argue that adversarial sample attacks are deeply entangled with interpretability of DNN models: while classification results on benign inputs can be reasoned based on the human perceptible features/attributes, results on adversarial samples can hardly be explained. Therefore, we propose a novel adversarial sample detection technique for face recognition models, based on interpretability. It features a novel bi-directional correspondence inference between attributes and internal neurons to identify neurons critical for individual attributes. The activation values of critical neurons are enhanced to amplify the reasoning part of the computation and the values of other neurons are weakened to suppress the uninterpretable part. The classification results after such transformation are compared with those of the original model to detect adversaries. Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs. In contrast, a state-of-the-art feature squeezing technique can only achieve 55% accuracy with 23.3% false positives.
Tasks Face Recognition
Published 2018-10-27
URL http://arxiv.org/abs/1810.11580v1
PDF http://arxiv.org/pdf/1810.11580v1.pdf
PWC https://paperswithcode.com/paper/attacks-meet-interpretability-attribute
Repo https://github.com/AmIAttribute/AmI
Framework none

Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition

Title Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition
Authors Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy
Abstract Face recognition has witnessed great progress in recent years, mainly attributed to the high-capacity model designed and the abundant labeled data collected. However, it becomes more and more prohibitive to scale up the current million-level identity annotations. In this work, we show that unlabeled face data can be as effective as the labeled ones. Here, we consider a setting closely mimicking the real-world scenario, where the unlabeled data are collected from unconstrained environments and their identities are exclusive from the labeled ones. Our main insight is that although the class information is not available, we can still faithfully approximate these semantic relationships by constructing a relational graph in a bottom-up manner. We propose Consensus-Driven Propagation (CDP) to tackle this challenging problem with two modules, the “committee” and the “mediator”, which select positive face pairs robustly by carefully aggregating multi-view information. Extensive experiments validate the effectiveness of both modules to discard outliers and mine hard positives. With CDP, we achieve a compelling accuracy of 78.18% on the MegaFace identification challenge by using only 9% of the labels, compared to 61.78% when no unlabeled data are used and 78.52% when all labels are employed.
Tasks Face Recognition
Published 2018-09-05
URL http://arxiv.org/abs/1809.01407v2
PDF http://arxiv.org/pdf/1809.01407v2.pdf
PWC https://paperswithcode.com/paper/consensus-driven-propagation-in-massive
Repo https://github.com/XiaohangZhan/cdp
Framework pytorch

Deep Cross Modal Learning for Caricature Verification and Identification(CaVINet)

Title Deep Cross Modal Learning for Caricature Verification and Identification(CaVINet)
Authors Jatin Garg, Skand Vishwanath Peri, Himanshu Tolani, Narayanan C Krishnan
Abstract Learning from different modalities is a challenging task. In this paper, we look at the challenging problem of cross modal face verification and recognition between caricature and visual image modalities. Caricatures exaggerate the facial features of a person. Due to the significant variations in the caricatures, building vision models for recognizing and verifying data from this modality is an extremely challenging task. Visual images, with significantly fewer distortions, can act as a bridge for the analysis of the caricature modality. We introduce a publicly available large Caricature-VIsual dataset [CaVI] with images from both modalities that captures the rich variations in the caricature of an identity. This paper presents the first cross modal architecture that handles extreme distortions of caricatures using a deep learning network that learns similar representations across the modalities. We use two convolutional networks along with transformations that are subjected to orthogonality constraints to capture the shared and modality specific representations. In contrast to prior research, our approach neither depends on manually extracted facial landmarks for learning the representations, nor on the identities of the person for performing verification. The learned shared representation achieves 91% accuracy for verifying unseen images and 75% accuracy on unseen identities. Further, recognizing the identity in the image by knowledge transfer using a combination of shared and modality specific representations resulted in an unprecedented performance of 85% rank-1 accuracy for caricatures and 95% rank-1 accuracy for visual images.
Tasks Caricature, Face Recognition, Face Verification, Transfer Learning
Published 2018-07-31
URL http://arxiv.org/abs/1807.11688v1
PDF http://arxiv.org/pdf/1807.11688v1.pdf
PWC https://paperswithcode.com/paper/deep-cross-modal-learning-for-caricature
Repo https://github.com/lsaiml/CaVINet
Framework tf

Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data

Title Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data
Authors Shehroz S. Khan, Amir Ahmad, Alex Mihailidis
Abstract The presence of missing values in a dataset can adversely affect the performance of a classifier. Single and Multiple Imputation are normally performed to fill in the missing values. In this paper, we present several variants of combining single and multiple imputation with bootstrapping to create ensembles that can model uncertainty and diversity in the data, and that are robust to high missingness in the data. We present three ensemble strategies: bootstrapping on incomplete data followed by (i) single imputation and (ii) multiple imputation, and (iii) multiple imputation ensemble without bootstrapping. We perform an extensive evaluation of the performance of these ensemble strategies on 8 datasets by varying the missingness ratio. Our results show that bootstrapping followed by multiple imputation using expectation maximization is the most robust method even at high missingness ratio (up to 30%). For small missingness ratio (up to 10%) most of the ensemble methods perform equivalently but better than single imputation. Kappa-error plots suggest that accurate classifiers with reasonable diversity are the reason for this behaviour. A consistent observation in all the datasets suggests that for small missingness (up to 10%), bootstrapping on incomplete data without any imputation produces equivalent results to other ensemble methods.
Tasks Imputation
Published 2018-02-01
URL https://arxiv.org/abs/1802.00154v5
PDF https://arxiv.org/pdf/1802.00154v5.pdf
PWC https://paperswithcode.com/paper/bootstrapping-and-multiple-imputation
Repo https://github.com/titubeta/EnsembleImputation
Framework none
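
Strategy (i) from the abstract above, bootstrapping on incomplete data followed by single imputation, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' implementation: mean imputation stands in for whichever single-imputation method they use, missing values are represented as `None`, and the helper names (`column_means`, `mean_impute`, `bootstrap_imputed_replicas`) are hypothetical.

```python
import random

def column_means(rows):
    """Mean of each column, ignoring None; 0.0 if a column has no observed value."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means

def mean_impute(rows, fallback_means):
    """Fill None entries with the column mean of `rows`.

    If a column is entirely missing in `rows` (possible in a bootstrap
    sample), fall back to the mean computed on the full dataset.
    """
    n_cols = len(rows[0])
    fill = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        fill.append(sum(observed) / len(observed) if observed else fallback_means[j])
    return [[fill[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def bootstrap_imputed_replicas(rows, n_replicas, seed=0):
    """Resample the incomplete rows with replacement, then impute each resample."""
    rng = random.Random(seed)
    global_means = column_means(rows)
    replicas = []
    for _ in range(n_replicas):
        sample = [rng.choice(rows) for _ in rows]
        replicas.append(mean_impute(sample, global_means))
    return replicas
```

Each replica would then train one member of the ensemble; because imputation happens after resampling, the filled-in values differ between replicas, which is one source of the diversity the abstract mentions.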

Pose-Robust Face Recognition via Deep Residual Equivariant Mapping

Title Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Authors Kaidi Cao, Yu Rong, Cheng Li, Xiaoou Tang, Chen Change Loy
Abstract Face recognition achieves exceptional success thanks to the emergence of deep learning. However, many contemporary face recognition models still perform relatively poorly in processing profile faces compared to frontal faces. A key reason is that the numbers of frontal and profile training faces are highly imbalanced - there are far more frontal training samples than profile ones. In addition, it is intrinsically hard to learn a deep representation that is geometrically invariant to large pose variations. In this study, we hypothesize that there is an inherent mapping between frontal and profile faces, and consequently, their discrepancy in the deep representation space can be bridged by an equivariant mapping. To exploit this mapping, we formulate a novel Deep Residual EquivAriant Mapping (DREAM) block, which is capable of adaptively adding residuals to the input deep representation to transform a profile face representation to a canonical pose that simplifies recognition. The DREAM block consistently enhances the performance of profile face recognition for many strong deep networks, including ResNet models, without deliberately augmenting training data of profile faces. The block is easy to use, light-weight, and can be implemented with a negligible computational overhead.
Tasks Face Identification, Face Recognition, Face Verification, Robust Face Recognition
Published 2018-03-02
URL http://arxiv.org/abs/1803.00839v1
PDF http://arxiv.org/pdf/1803.00839v1.pdf
PWC https://paperswithcode.com/paper/pose-robust-face-recognition-via-deep
Repo https://github.com/penincillin/DREAM
Framework pytorch

Variational Bayesian Monte Carlo

Title Variational Bayesian Monte Carlo
Authors Luigi Acerbi
Abstract Many probabilistic models of interest in scientific computing and machine learning have expensive, black-box likelihoods that prevent the application of standard techniques for Bayesian inference, such as MCMC, which would require access to the gradient or a large number of likelihood evaluations. We introduce here a novel sample-efficient inference framework, Variational Bayesian Monte Carlo (VBMC). VBMC combines variational inference with Gaussian-process based, active-sampling Bayesian quadrature, using the latter to efficiently approximate the intractable integral in the variational objective. Our method produces both a nonparametric approximation of the posterior distribution and an approximate lower bound of the model evidence, useful for model selection. We demonstrate VBMC both on several synthetic likelihoods and on a neuronal model with data from real neurons. Across all tested problems and dimensions (up to $D = 10$), VBMC performs consistently well in reconstructing the posterior and the model evidence with a limited budget of likelihood evaluations, unlike other methods that work only in very low dimensions. Our framework shows great promise as a novel tool for posterior and model inference with expensive, black-box likelihoods.
Tasks Bayesian Inference, Model Selection
Published 2018-10-12
URL http://arxiv.org/abs/1810.05558v2
PDF http://arxiv.org/pdf/1810.05558v2.pdf
PWC https://paperswithcode.com/paper/variational-bayesian-monte-carlo
Repo https://github.com/lacerbi/vbmc
Framework none

Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?

Title Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
Authors Nitin Bansal, Xiaohan Chen, Zhangyang Wang
Abstract This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations on training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle. We then benchmark their effects on state-of-the-art models: ResNet, WideResNet, and ResNeXt, on several of the most popular computer vision datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. We observe consistent performance gains after applying those proposed regularizations, in terms of both the final accuracies achieved, and faster and more stable convergence. We have made our codes and pre-trained models publicly available: https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality.
Tasks
Published 2018-10-22
URL http://arxiv.org/abs/1810.09102v1
PDF http://arxiv.org/pdf/1810.09102v1.pdf
PWC https://paperswithcode.com/paper/can-we-gain-more-from-orthogonality
Repo https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality
Framework pytorch
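
To make "orthogonality regularization" concrete, here is a minimal sketch of the simplest penalty of this kind, the squared Frobenius norm of W^T W - I. This is a generic baseline form chosen for illustration; it is not one of the mutual-coherence or restricted-isometry regularizers the abstract describes, and the function name is hypothetical. The weight matrix is given as a plain list of rows.

```python
def soft_orthogonality_penalty(W):
    """Squared Frobenius norm of (W^T W - I) for a matrix given as a list of rows.

    The penalty is zero exactly when the columns of W are orthonormal.
    """
    n_rows, n_cols = len(W), len(W[0])
    penalty = 0.0
    for i in range(n_cols):
        for j in range(n_cols):
            # Entry (i, j) of the Gram matrix W^T W.
            gram_ij = sum(W[k][i] * W[k][j] for k in range(n_rows))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty
```

In training, a term like this would be scaled by a small coefficient and added to the task loss for each layer's (reshaped) weight matrix.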

As you like it: Localization via paired comparisons

Title As you like it: Localization via paired comparisons
Authors Andrew K. Massimino, Mark A. Davenport
Abstract Suppose that we wish to estimate a vector $\mathbf{x}$ from a set of binary paired comparisons of the form “$\mathbf{x}$ is closer to $\mathbf{p}$ than to $\mathbf{q}$” for various choices of vectors $\mathbf{p}$ and $\mathbf{q}$. The problem of estimating $\mathbf{x}$ from this type of observation arises in a variety of contexts, including nonmetric multidimensional scaling, “unfolding,” and ranking problems, often because it provides a powerful and flexible model of preference. We describe theoretical bounds for how well we can expect to estimate $\mathbf{x}$ under a randomized model for $\mathbf{p}$ and $\mathbf{q}$. We also present results for the case where the comparisons are noisy and subject to some degree of error. Additionally, we show that under a randomized model for $\mathbf{p}$ and $\mathbf{q}$, a suitable number of binary paired comparisons yield a stable embedding of the space of target vectors. Finally, we also show that we can achieve significant gains by adaptively changing the distribution for choosing $\mathbf{p}$ and $\mathbf{q}$.
Tasks
Published 2018-02-19
URL http://arxiv.org/abs/1802.10489v1
PDF http://arxiv.org/pdf/1802.10489v1.pdf
PWC https://paperswithcode.com/paper/as-you-like-it-localization-via-paired
Repo https://github.com/siplab-gt/pairsearch
Framework none
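
A comparison of the form "x is closer to p than to q" is equivalent to asking which side of a hyperplane x lies on, since expanding both squared distances gives 2 (q - p)·x < |q|^2 - |p|^2. The sketch below shows both forms side by side, assuming Euclidean distance and noiseless comparisons; the function names are hypothetical and the snippet is not taken from the paper's code.

```python
def closer_to_p(x, p, q):
    """Direct form: is x strictly closer to p than to q (Euclidean distance)?"""
    dist_p = sum((a - b) ** 2 for a, b in zip(x, p))
    dist_q = sum((a - b) ** 2 for a, b in zip(x, q))
    return dist_p < dist_q

def closer_to_p_halfspace(x, p, q):
    """Equivalent half-space form: 2 (q - p).x < |q|^2 - |p|^2."""
    lhs = 2.0 * sum((b - a) * c for a, b, c in zip(p, q, x))
    rhs = sum(b * b for b in q) - sum(a * a for a in p)
    return lhs < rhs
```

Seen this way, each comparison contributes one linear inequality on x, so a set of comparisons confines x to an intersection of half-spaces.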

Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences

Title Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences
Authors Marcel Trotzek, Sven Koitka, Christoph M. Friedrich
Abstract Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well.
Tasks Word Embeddings
Published 2018-04-19
URL http://arxiv.org/abs/1804.07000v3
PDF http://arxiv.org/pdf/1804.07000v3.pdf
PWC https://paperswithcode.com/paper/utilizing-neural-networks-and-linguistic
Repo https://github.com/serenera/Serenera
Framework tf
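
For readers unfamiliar with the ERDE score mentioned in the abstract above, the following is a sketch of the original metric as it is commonly stated (following Losada and Crestani's definition for the early risk detection task), not the modified metric this paper proposes. The formula is not given in the abstract and is reproduced here from memory of that definition; the cost values and function names are placeholders chosen for illustration.

```python
import math

def latency_cost(k, o):
    """lc_o(k) = 1 - 1 / (1 + exp(k - o)), written in a numerically stable form.

    k is the number of messages read before deciding; o is the point at
    which the cost reaches 0.5.
    """
    z = k - o
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def erde(predicted_positive, truly_positive, k, o, c_fp, c_fn, c_tp=1.0):
    """Per-user early risk detection error.

    False positives and false negatives incur fixed costs; a true positive
    is penalised by how late (k) the decision was made; a true negative
    costs nothing.
    """
    if predicted_positive and not truly_positive:
        return c_fp
    if truly_positive and not predicted_positive:
        return c_fn
    if predicted_positive and truly_positive:
        return latency_cost(k, o) * c_tp
    return 0.0
```

Because the latency cost saturates quickly once k exceeds o, a correct decision made only slightly late is penalised almost as heavily as a missed case, which is the kind of behaviour a critique of the metric would need to examine.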

Transferring Rich Deep Features for Facial Beauty Prediction

Title Transferring Rich Deep Features for Facial Beauty Prediction
Authors Lu Xu, Jinhai Xiang, Xiaohui Yuan
Abstract Feature extraction plays a significant part in computer vision tasks. In this paper, we propose a method which transfers rich deep features from a model pretrained on a face verification task and feeds the features into a Bayesian ridge regression algorithm for facial beauty prediction. We leverage deep neural networks that extract more abstract features from stacked layers. Through a simple but effective feature fusion strategy, our method achieves improved or comparable performance on the SCUT-FBP and ECCV HotOrNot datasets. Our experiments demonstrate the effectiveness of the proposed method and clarify the inner interpretability of facial beauty perception.
Tasks Face Verification, Facial Beauty Prediction
Published 2018-03-20
URL http://arxiv.org/abs/1803.07253v1
PDF http://arxiv.org/pdf/1803.07253v1.pdf
PWC https://paperswithcode.com/paper/transferring-rich-deep-features-for-facial
Repo https://github.com/lucasxlu/TransFBP
Framework tf

Hyperprior Induced Unsupervised Disentanglement of Latent Representations

Title Hyperprior Induced Unsupervised Disentanglement of Latent Representations
Authors Abdul Fatir Ansari, Harold Soh
Abstract We address the problem of unsupervised disentanglement of latent representations learnt via deep generative models. In contrast to current approaches that operate on the evidence lower bound (ELBO), we argue that statistical independence in the latent space of VAEs can be enforced in a principled hierarchical Bayesian manner. To this effect, we augment the standard VAE with an inverse-Wishart (IW) prior on the covariance matrix of the latent code. By tuning the IW parameters, we are able to encourage (or discourage) independence in the learnt latent dimensions. Extensive experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and CelebA) show that our approach outperforms the $\beta$-VAE and is competitive with the state-of-the-art FactorVAE. Our approach achieves significantly better disentanglement and reconstruction on a new dataset (CorrelatedEllipses) which introduces correlations between the factors of variation.
Tasks
Published 2018-09-12
URL http://arxiv.org/abs/1809.04497v3
PDF http://arxiv.org/pdf/1809.04497v3.pdf
PWC https://paperswithcode.com/paper/hyperprior-induced-unsupervised
Repo https://github.com/crslab/correlated-ellipses
Framework none