Paper Group AWR 233
Skip-gram word embeddings in hyperbolic space. PCN: Point Completion Network. On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing. Discriminability objective for training descriptive captions. Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples. Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition. Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet). Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data. Pose-Robust Face Recognition via Deep Residual Equivariant Mapping. Variational Bayesian Monte Carlo. Can We Gain More from Orthogonality Regularizations in Training Deep CNNs? As you like it: Localization via paired comparisons. Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences. Transferring Rich Deep Features for Facial Beauty Prediction. Hyperprior Induced Unsupervised Disentanglement of Latent Representations.
Skip-gram word embeddings in hyperbolic space
Title | Skip-gram word embeddings in hyperbolic space |
Authors | Matthias Leimeister, Benjamin J. Wilson |
Abstract | Recent work has demonstrated that embeddings of tree-like graphs in hyperbolic space surpass their Euclidean counterparts in performance by a large margin. Inspired by these results and scale-free structure in the word co-occurrence graph, we present an algorithm for learning word embeddings in hyperbolic space from free text. An objective function based on the hyperbolic distance is derived and included in the skip-gram negative-sampling architecture of word2vec. The hyperbolic word embeddings are then evaluated on word similarity and analogy benchmarks. The results demonstrate the potential of hyperbolic word embeddings, particularly in low dimensions, though without clear superiority over their Euclidean counterparts. We further discuss subtleties in the formulation of the analogy task in curved spaces. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2018-08-30 |
URL | https://arxiv.org/abs/1809.01498v2 |
https://arxiv.org/pdf/1809.01498v2.pdf | |
PWC | https://paperswithcode.com/paper/skip-gram-word-embeddings-in-hyperbolic-space |
Repo | https://github.com/mtbarta/hyperbolic |
Framework | tf |
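The core ingredient is the Poincaré-ball distance that replaces the Euclidean dot product in the skip-gram objective. A minimal numpy sketch, assuming the standard Poincaré distance formula; the sigmoid squashing and threshold below are illustrative only, since the paper derives its own score function from this distance:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Distance in the Poincare ball model of hyperbolic space."""
    duv = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v)) + eps
    return np.arccosh(1.0 + 2.0 * duv / denom)

# Score a (word, context) pair by squashing the negative distance, in the
# spirit of word2vec's sigmoid-of-dot-product; the paper's derived objective
# differs in its exact form.
w = np.array([0.10, 0.20])   # word vector inside the unit ball
c = np.array([0.15, 0.10])   # context vector inside the unit ball
score = 1.0 / (1.0 + np.exp(poincare_distance(w, c) - 1.0))
print(score)
```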
PCN: Point Completion Network
Title | PCN: Point Completion Network |
Authors | Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert |
Abstract | Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications. In this work, we propose Point Completion Network (PCN), a novel learning-based approach for shape completion. Unlike existing shape completion methods, PCN directly operates on raw point clouds without any structural assumption (e.g. symmetry) or annotation (e.g. semantic class) about the underlying shape. It features a decoder design that enables the generation of fine-grained completions while maintaining a small number of parameters. Our experiments show that PCN produces dense, complete point clouds with realistic structures in the missing regions on inputs with various levels of incompleteness and noise, including cars from LiDAR scans in the KITTI dataset. |
Tasks | Generating 3D Point Clouds |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00671v3 |
https://arxiv.org/pdf/1808.00671v3.pdf | |
PWC | https://paperswithcode.com/paper/pcn-point-completion-network |
Repo | https://github.com/TonythePlaneswalker/pcn |
Framework | tf |
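Completion quality in this line of work is measured with a permutation-invariant set distance, since point clouds have no canonical ordering. A small numpy sketch of the symmetric Chamfer distance, one of the losses used for point completion; PCN's full training setup (coarse-plus-fine outputs) is not shown:

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between point sets P (n, 3) and Q (m, 3):
    average nearest-neighbor distance in both directions."""
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)  # (n, m) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

partial  = np.random.rand(128, 3)   # stand-in for a partial input scan
complete = np.random.rand(512, 3)   # stand-in for a predicted completion
print(chamfer_distance(complete, partial))
```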
On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing
Title | On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing |
Authors | Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, Nanyun Peng |
Abstract | Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. The former relies on sequential information while the latter is more flexible at modeling word order. Rigorous experiments and detailed analysis show that RNN-based architectures transfer well to languages that are close to English, while self-attentive models have better overall cross-lingual transferability and perform especially well on distant languages. |
Tasks | Cross-Lingual Transfer, Dependency Parsing |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00570v3 |
http://arxiv.org/pdf/1811.00570v3.pdf | |
PWC | https://paperswithcode.com/paper/on-difficulties-of-cross-lingual-transfer |
Repo | https://github.com/uclanlp/CrossLingualDepParser |
Framework | pytorch |
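The paper's hypothesis — that sequential encoders bake in source-language word order while self-attention need not — can be illustrated with a toy experiment. A sketch assuming PyTorch; the actual models use biaffine parsers and relative position representations, so the GRU and the position-free attention pooling below only demonstrate the order-(in)sensitivity contrast:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(10, 8)
rnn = nn.GRU(8, 8, batch_first=True)

sent = torch.tensor([[1, 2, 3, 4]])   # a toy sentence
perm = torch.tensor([[4, 3, 2, 1]])   # same words, reversed order

# An RNN encoder's final state depends on word order.
_, h1 = rnn(emb(sent))
_, h2 = rnn(emb(perm))
print("RNN states differ:", not torch.allclose(h1, h2))

# Self-attention with no position encodings is permutation-equivariant,
# so mean-pooling its outputs gives an order-invariant representation.
def attn_pool(x):
    a = torch.softmax(x @ x.transpose(1, 2), dim=-1)
    return (a @ x).mean(dim=1)

print("Attention pools match:",
      torch.allclose(attn_pool(emb(sent)), attn_pool(emb(perm)), atol=1e-6))
```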
Discriminability objective for training descriptive captions
Title | Discriminability objective for training descriptive captions |
Authors | Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich |
Abstract | One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them. We propose a way to improve this aspect of caption generation. By incorporating into the captioning training objective a loss component directly related to the ability (of a machine) to disambiguate image/caption matches, we obtain systems that produce much more discriminative captions, according to human evaluation. Remarkably, our approach leads to improvements in other aspects of generated captions, reflected by a battery of standard scores such as BLEU and SPICE. Our approach is modular and can be applied to a variety of model/loss combinations commonly proposed for image captioning. |
Tasks | Image Captioning |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04376v2 |
http://arxiv.org/pdf/1803.04376v2.pdf | |
PWC | https://paperswithcode.com/paper/discriminability-objective-for-training |
Repo | https://github.com/pdaicode/ImageCaptioning |
Framework | none |
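A machine-discriminability term of the kind the abstract describes is usually a contrastive retrieval loss: the generated caption must match its own image better than the other images in the batch. A hedged PyTorch sketch; the margin, the similarity space, and the batch-wise formulation are illustrative stand-ins for the paper's retrieval model:

```python
import torch
import torch.nn.functional as F

def discriminability_loss(cap_emb, img_emb, margin=0.2):
    """Batch-wise hinge retrieval loss: each caption should score higher with
    its own image (diagonal) than with any other image, and vice versa."""
    s = cap_emb @ img_emb.t()                     # (B, B) similarity matrix
    pos = s.diag().unsqueeze(1)                   # matched-pair scores
    cost_c = (margin + s - pos).clamp(min=0)      # caption vs. wrong images
    cost_i = (margin + s - pos.t()).clamp(min=0)  # image vs. wrong captions
    mask = torch.eye(s.size(0), dtype=torch.bool)
    return cost_c.masked_fill(mask, 0).mean() + cost_i.masked_fill(mask, 0).mean()

caps = F.normalize(torch.randn(4, 16), dim=1)     # caption embeddings
imgs = F.normalize(torch.randn(4, 16), dim=1)     # image embeddings
# Added to the usual captioning loss with some weight lambda.
print(discriminability_loss(caps, imgs).item())
```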
Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
Title | Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples |
Authors | Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang |
Abstract | Adversarial sample attacks perturb benign inputs to induce DNN misbehaviors. Recent research has demonstrated the widespread presence and the devastating consequences of such attacks. Existing defense techniques either assume prior knowledge of specific attacks or may not work well on complex models due to their underlying assumptions. We argue that adversarial sample attacks are deeply entangled with interpretability of DNN models: while classification results on benign inputs can be reasoned based on the human perceptible features/attributes, results on adversarial samples can hardly be explained. Therefore, we propose a novel adversarial sample detection technique for face recognition models, based on interpretability. It features a novel bi-directional correspondence inference between attributes and internal neurons to identify neurons critical for individual attributes. The activation values of critical neurons are enhanced to amplify the reasoning part of the computation and the values of other neurons are weakened to suppress the uninterpretable part. The classification results after such transformation are compared with those of the original model to detect adversaries. Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs. In contrast, a state-of-the-art feature squeezing technique can only achieve 55% accuracy with 23.3% false positives. |
Tasks | Face Recognition |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11580v1 |
http://arxiv.org/pdf/1810.11580v1.pdf | |
PWC | https://paperswithcode.com/paper/attacks-meet-interpretability-attribute |
Repo | https://github.com/AmIAttribute/AmI |
Framework | none |
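The detection recipe from the abstract — amplify attribute-critical neurons, suppress the rest, and flag inputs whose prediction flips — can be sketched in a few lines of numpy. The scaling factors and the linear read-out below are assumptions, not the paper's model-specific transforms:

```python
import numpy as np

def attribute_steered_predict(acts, critical_mask, W, alpha=2.0, beta=0.5):
    """Strengthen attribute-critical neurons, weaken the rest, and classify
    both the original and the steered activations."""
    steered = np.where(critical_mask, alpha * acts, beta * acts)
    return np.argmax(acts @ W), np.argmax(steered @ W)

rng = np.random.default_rng(0)
acts = rng.random(64)                  # activations of one internal layer
mask = rng.random(64) < 0.2            # neurons found critical for attributes
W = rng.standard_normal((64, 10))      # stand-in linear classifier head
orig, steered = attribute_steered_predict(acts, mask, W)
print("flag as adversarial:", orig != steered)
```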
Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition
Title | Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition |
Authors | Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy |
Abstract | Face recognition has witnessed great progress in recent years, mainly attributed to high-capacity models and abundant labeled data. However, it becomes more and more prohibitive to scale up the current million-level identity annotations. In this work, we show that unlabeled face data can be as effective as the labeled ones. Here, we consider a setting closely mimicking the real-world scenario, where the unlabeled data are collected from unconstrained environments and their identities are exclusive from the labeled ones. Our main insight is that although the class information is not available, we can still faithfully approximate these semantic relationships by constructing a relational graph in a bottom-up manner. We propose Consensus-Driven Propagation (CDP) to tackle this challenging problem with two modules, the “committee” and the “mediator”, which select positive face pairs robustly by carefully aggregating multi-view information. Extensive experiments validate the effectiveness of both modules to discard outliers and mine hard positives. With CDP, we achieve a compelling accuracy of 78.18% on the MegaFace identification challenge by using only 9% of the labels, compared to 61.78% when no unlabeled data are used and 78.52% when all labels are employed. |
Tasks | Face Recognition |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01407v2 |
http://arxiv.org/pdf/1809.01407v2.pdf | |
PWC | https://paperswithcode.com/paper/consensus-driven-propagation-in-massive |
Repo | https://github.com/XiaohangZhan/cdp |
Framework | pytorch |
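A stripped-down illustration of the committee idea: a candidate pair mined from nearest neighbors is kept only when the base model and enough auxiliary models agree it is a positive. The threshold-and-vote rule below is a simplification; CDP's mediator aggregates richer multi-view statistics than a plain vote:

```python
import numpy as np

def committee_select(base_sim, committee_sims, candidate_pairs, thr=0.7, votes=2):
    """Keep a candidate pair only when the base model and enough committee
    members consider it a positive."""
    keep = []
    for i, j in candidate_pairs:
        agree = sum(sim[i, j] > thr for sim in committee_sims)
        if base_sim[i, j] > thr and agree >= votes:
            keep.append((i, j))
    return keep

def cosine(f):
    n = np.linalg.norm(f, axis=1, keepdims=True)
    return (f / n) @ (f / n).T

rng = np.random.default_rng(0)
sims = [cosine(rng.random((6, 4))) for _ in range(4)]   # base + 3 committee models
pairs = [(i, j) for i in range(6) for j in range(i + 1, 6)]
print(committee_select(sims[0], sims[1:], pairs))
```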
Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet)
Title | Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet) |
Authors | Jatin Garg, Skand Vishwanath Peri, Himanshu Tolani, Narayanan C Krishnan |
Abstract | Learning from different modalities is a challenging task. In this paper, we look at the challenging problem of cross modal face verification and recognition between caricature and visual image modalities. Caricatures exaggerate the facial features of a person. Due to the significant variations in the caricatures, building vision models for recognizing and verifying data from this modality is an extremely challenging task. Visual images, with significantly fewer distortions, can act as a bridge for the analysis of the caricature modality. We introduce a publicly available large Caricature-VIsual dataset [CaVI] with images from both modalities that captures the rich variations in the caricature of an identity. This paper presents the first cross modal architecture that handles extreme distortions of caricatures using a deep learning network that learns similar representations across the modalities. We use two convolutional networks along with transformations that are subjected to orthogonality constraints to capture the shared and modality-specific representations. In contrast to prior research, our approach neither depends on manually extracted facial landmarks for learning the representations, nor on the identities of the person for performing verification. The learned shared representation achieves 91% accuracy for verifying unseen images and 75% accuracy on unseen identities. Further, recognizing the identity in the image by knowledge transfer, using a combination of shared and modality-specific representations, resulted in an unprecedented performance of 85% rank-1 accuracy for caricatures and 95% rank-1 accuracy for visual images. |
Tasks | Caricature, Face Recognition, Face Verification, Transfer Learning |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11688v1 |
http://arxiv.org/pdf/1807.11688v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-cross-modal-learning-for-caricature |
Repo | https://github.com/lsaiml/CaVINet |
Framework | tf |
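The orthogonality constraint mentioned in the abstract can be realized as a penalty that decorrelates the shared and modality-specific projection subspaces. A PyTorch sketch under that assumption; the paper's exact constraint formulation may differ:

```python
import torch

def decorrelation_penalty(W_shared, W_specific):
    """Frobenius penalty pushing the shared and modality-specific projection
    subspaces toward mutual orthogonality."""
    return (W_shared.t() @ W_specific).pow(2).sum()

W_s = torch.randn(128, 64, requires_grad=True)  # shared projection
W_m = torch.randn(128, 64, requires_grad=True)  # caricature-specific projection
loss = decorrelation_penalty(W_s, W_m)          # added to the verification loss
loss.backward()
print(loss.item())
```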
Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data
Title | Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data |
Authors | Shehroz S. Khan, Amir Ahmad, Alex Mihailidis |
Abstract | Presence of missing values in a dataset can adversely affect the performance of a classifier. Single and multiple imputation are normally performed to fill in the missing values. In this paper, we present several variants of combining single and multiple imputation with bootstrapping to create ensembles that can model uncertainty and diversity in the data, and that are robust to high missingness in the data. We present three ensemble strategies: bootstrapping on incomplete data followed by (i) single imputation and (ii) multiple imputation, and (iii) multiple imputation ensemble without bootstrapping. We perform an extensive evaluation of the performance of these ensemble strategies on 8 datasets by varying the missingness ratio. Our results show that bootstrapping followed by multiple imputation using expectation maximization is the most robust method, even at a high missingness ratio (up to 30%). For small missingness ratios (up to 10%), most of the ensemble methods perform equivalently, but better than single imputation. Kappa-error plots suggest that accurate classifiers with reasonable diversity are the reason for this behaviour. A consistent observation across all the datasets is that for small missingness (up to 10%), bootstrapping on incomplete data without any imputation produces results equivalent to the other ensemble methods. |
Tasks | Imputation |
Published | 2018-02-01 |
URL | https://arxiv.org/abs/1802.00154v5 |
https://arxiv.org/pdf/1802.00154v5.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-and-multiple-imputation |
Repo | https://github.com/titubeta/EnsembleImputation |
Framework | none |
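Strategy (i) — bootstrap the incomplete data, then impute each replicate — is easy to sketch with scikit-learn. Mean imputation below stands in for the paper's imputation methods (their most robust variant uses EM-based multiple imputation), and the tree classifier and vote rule are illustrative:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

def bootstrap_impute_ensemble(X, y, n_members=10, seed=0):
    """Bootstrap the incomplete data first, impute each replicate separately,
    and train one classifier per replicate."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), len(X))          # bootstrap resample
        imp = SimpleImputer(strategy="mean").fit(X[idx])
        clf = DecisionTreeClassifier(random_state=0).fit(imp.transform(X[idx]), y[idx])
        members.append((imp, clf))
    return members

def vote(members, X):
    preds = np.stack([clf.predict(imp.transform(X)) for imp, clf in members])
    return np.round(preds.mean(axis=0))                # majority vote, 0/1 labels

X = np.random.rand(100, 5)
X[np.random.rand(100, 5) < 0.2] = np.nan               # 20% missingness
y = (np.nansum(X, axis=1) > 2.0).astype(int)
members = bootstrap_impute_ensemble(X, y)
print(vote(members, X)[:10])
```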
Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Title | Pose-Robust Face Recognition via Deep Residual Equivariant Mapping |
Authors | Kaidi Cao, Yu Rong, Cheng Li, Xiaoou Tang, Chen Change Loy |
Abstract | Face recognition achieves exceptional success thanks to the emergence of deep learning. However, many contemporary face recognition models still perform relatively poorly in processing profile faces compared to frontal faces. A key reason is that the numbers of frontal and profile training faces are highly imbalanced - there are extensively more frontal training samples than profile ones. In addition, it is intrinsically hard to learn a deep representation that is geometrically invariant to large pose variations. In this study, we hypothesize that there is an inherent mapping between frontal and profile faces, and consequently, their discrepancy in the deep representation space can be bridged by an equivariant mapping. To exploit this mapping, we formulate a novel Deep Residual EquivAriant Mapping (DREAM) block, which is capable of adaptively adding residuals to the input deep representation to transform a profile face representation to a canonical pose that simplifies recognition. The DREAM block consistently enhances the performance of profile face recognition for many strong deep networks, including ResNet models, without deliberately augmenting training data of profile faces. The block is easy to use, lightweight, and can be implemented with negligible computational overhead. |
Tasks | Face Identification, Face Recognition, Face Verification, Robust Face Recognition |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00839v1 |
http://arxiv.org/pdf/1803.00839v1.pdf | |
PWC | https://paperswithcode.com/paper/pose-robust-face-recognition-via-deep |
Repo | https://github.com/penincillin/DREAM |
Framework | pytorch |
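The DREAM block admits a compact sketch: a residual branch whose contribution is gated by a pose coefficient, so frontal faces pass through nearly unchanged while profile features get pulled toward the frontal representation space. The branch architecture and the pose-to-coefficient mapping below are assumptions:

```python
import torch
import torch.nn as nn

class DreamBlock(nn.Module):
    """Residual branch gated by a yaw coefficient: ~0 for frontal faces
    (feature passes through unchanged), ~1 for full profile."""
    def __init__(self, dim=256):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feat, yaw_coeff):
        return feat + yaw_coeff.unsqueeze(1) * self.residual(feat)

block = DreamBlock()
feat = torch.randn(4, 256)                   # deep features from a face CNN
yaw = torch.tensor([0.0, 0.3, 0.7, 1.0])     # estimated pose coefficients
print(block(feat, yaw).shape)                # torch.Size([4, 256])
```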
Variational Bayesian Monte Carlo
Title | Variational Bayesian Monte Carlo |
Authors | Luigi Acerbi |
Abstract | Many probabilistic models of interest in scientific computing and machine learning have expensive, black-box likelihoods that prevent the application of standard techniques for Bayesian inference, such as MCMC, which would require access to the gradient or a large number of likelihood evaluations. We introduce here a novel sample-efficient inference framework, Variational Bayesian Monte Carlo (VBMC). VBMC combines variational inference with Gaussian-process based, active-sampling Bayesian quadrature, using the latter to efficiently approximate the intractable integral in the variational objective. Our method produces both a nonparametric approximation of the posterior distribution and an approximate lower bound of the model evidence, useful for model selection. We demonstrate VBMC both on several synthetic likelihoods and on a neuronal model with data from real neurons. Across all tested problems and dimensions (up to $D = 10$), VBMC performs consistently well in reconstructing the posterior and the model evidence with a limited budget of likelihood evaluations, unlike other methods that work only in very low dimensions. Our framework shows great promise as a novel tool for posterior and model inference with expensive, black-box likelihoods. |
Tasks | Bayesian Inference, Model Selection |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05558v2 |
http://arxiv.org/pdf/1810.05558v2.pdf | |
PWC | https://paperswithcode.com/paper/variational-bayesian-monte-carlo |
Repo | https://github.com/lacerbi/vbmc |
Framework | none |
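A drastically simplified 1-D caricature of the VBMC idea, assuming scikit-learn: model the expensive log-joint with a GP surrogate and evaluate the variational objective against the surrogate instead of the true likelihood. VBMC proper uses Bayesian quadrature (closed-form integration of the GP) and active sampling, neither of which is shown:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_log_joint(theta):            # stand-in for a black-box target
    return -0.5 * (theta - 1.0) ** 2

# Fit a GP surrogate to a small budget of log-joint evaluations.
thetas = np.linspace(-3, 3, 8)[:, None]
gp = GaussianProcessRegressor().fit(thetas, expensive_log_joint(thetas[:, 0]))

# Evaluate the variational objective E_q[f] + H[q] for a Gaussian q against
# the surrogate mean, via Monte Carlo rather than Bayesian quadrature.
mu, sigma = 0.5, 1.0
samples = np.random.default_rng(0).normal(mu, sigma, size=(500, 1))
elbo = gp.predict(samples).mean() + 0.5 * np.log(2 * np.pi * np.e * sigma**2)
print(elbo)
```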
Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
Title | Can We Gain More from Orthogonality Regularizations in Training Deep CNNs? |
Authors | Nitin Bansal, Xiaohan Chen, Zhangyang Wang |
Abstract | This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations for training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and the restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle. We then benchmark their effects on state-of-the-art models: ResNet, WideResNet, and ResNeXt, on several of the most popular computer vision datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. We observe consistent performance gains after applying those proposed regularizations, in terms of both the final accuracies achieved and faster, more stable convergence. We have made our codes and pre-trained models publicly available: https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09102v1 |
http://arxiv.org/pdf/1810.09102v1.pdf | |
PWC | https://paperswithcode.com/paper/can-we-gain-more-from-orthogonality |
Repo | https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality |
Framework | pytorch |
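The simplest of the proposed regularizers, soft orthogonality (SO), penalizes the Frobenius distance between a layer's Gram matrix and the identity. A PyTorch sketch; the reshaping convention for conv kernels and the weight 1e-4 are illustrative, and the paper's strongest variant (SRIP) replaces the Frobenius norm with a spectral norm:

```python
import torch

def soft_orthogonality(W):
    """SO regularizer: squared Frobenius distance between the filters' Gram
    matrix and the identity, with conv kernels flattened one row per filter."""
    Wm = W.reshape(W.size(0), -1)
    gram = Wm @ Wm.t()
    return ((gram - torch.eye(gram.size(0))) ** 2).sum()

conv_w = torch.randn(64, 3, 3, 3, requires_grad=True)  # a conv layer's kernel
loss = 1e-4 * soft_orthogonality(conv_w)               # added to the task loss
loss.backward()
print(loss.item())
```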
As you like it: Localization via paired comparisons
Title | As you like it: Localization via paired comparisons |
Authors | Andrew K. Massimino, Mark A. Davenport |
Abstract | Suppose that we wish to estimate a vector $\mathbf{x}$ from a set of binary paired comparisons of the form “$\mathbf{x}$ is closer to $\mathbf{p}$ than to $\mathbf{q}$” for various choices of vectors $\mathbf{p}$ and $\mathbf{q}$. The problem of estimating $\mathbf{x}$ from this type of observation arises in a variety of contexts, including nonmetric multidimensional scaling, “unfolding,” and ranking problems, often because it provides a powerful and flexible model of preference. We describe theoretical bounds for how well we can expect to estimate $\mathbf{x}$ under a randomized model for $\mathbf{p}$ and $\mathbf{q}$. We also present results for the case where the comparisons are noisy and subject to some degree of error. Additionally, we show that under a randomized model for $\mathbf{p}$ and $\mathbf{q}$, a suitable number of binary paired comparisons yield a stable embedding of the space of target vectors. Finally, we also show that we can achieve significant gains by adaptively changing the distribution for choosing $\mathbf{p}$ and $\mathbf{q}$. |
Tasks | |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.10489v1 |
http://arxiv.org/pdf/1802.10489v1.pdf | |
PWC | https://paperswithcode.com/paper/as-you-like-it-localization-via-paired |
Repo | https://github.com/siplab-gt/pairsearch |
Framework | none |
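Each comparison “$\mathbf{x}$ is closer to $\mathbf{p}$ than to $\mathbf{q}$” is a halfspace constraint, $2(\mathbf{q}-\mathbf{p})^\top \mathbf{x} \le \|\mathbf{q}\|^2 - \|\mathbf{p}\|^2$, so noiseless localization is a feasibility problem. A numpy sketch using alternating projection onto violated halfspaces — a simple solver, not the estimator the paper analyzes:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 2, 200
x_true = rng.normal(size=d)

# Build m noiseless comparisons as halfspaces 2(q - p)^T x <= ||q||^2 - ||p||^2.
P, Q = rng.normal(size=(m, d)), rng.normal(size=(m, d))
A = 2 * (Q - P)
b = (Q ** 2).sum(1) - (P ** 2).sum(1)
flip = A @ x_true > b                 # orient each so x_true satisfies it
A[flip], b[flip] = -A[flip], -b[flip]

# Project onto the most violated halfspace until all constraints hold.
x = np.zeros(d)
for _ in range(2000):
    viol = A @ x - b
    i = int(np.argmax(viol))
    if viol[i] <= 0:
        break
    x -= viol[i] * A[i] / (A[i] @ A[i])
print(np.linalg.norm(x - x_true))     # small: the feasible cell shrinks with m
```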
Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences
Title | Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences |
Authors | Marcel Trotzek, Sven Koitka, Christoph M. Friedrich |
Abstract | Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as a metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well. |
Tasks | Word Embeddings |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07000v3 |
http://arxiv.org/pdf/1804.07000v3.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-neural-networks-and-linguistic |
Repo | https://github.com/serenera/Serenera |
Framework | tf |
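The ERDE metric the paper critiques scores each subject by decision type, with true positives discounted by a sigmoid latency cost, so a correct but late detection is penalized almost like a miss. A sketch following the shared-task definition; the cost constants below are illustrative (c_fp is often set to the positive-class prevalence):

```python
import numpy as np

def erde(decision, truth, delay, o=5, c_fp=0.1296, c_fn=1.0, c_tp=1.0):
    """ERDE_o for one subject: fixed costs for false positives/negatives,
    latency-discounted cost lc_o(k) = 1 - 1/(1 + exp(k - o)) for true
    positives seen after k messages, zero for true negatives."""
    if decision and not truth:
        return c_fp
    if not decision and truth:
        return c_fn
    if decision and truth:
        return (1.0 - 1.0 / (1.0 + np.exp(delay - o))) * c_tp
    return 0.0

# A correct decision made after 20 messages costs far more than one after 2.
print(erde(True, True, delay=2), erde(True, True, delay=20))
```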
Transferring Rich Deep Features for Facial Beauty Prediction
Title | Transferring Rich Deep Features for Facial Beauty Prediction |
Authors | Lu Xu, Jinhai Xiang, Xiaohui Yuan |
Abstract | Feature extraction plays a significant part in computer vision tasks. In this paper, we propose a method that transfers rich deep features from a model pretrained on a face verification task and feeds the features into a Bayesian ridge regression algorithm for facial beauty prediction. We leverage deep neural networks, which extract more abstract features from stacked layers. Through a simple but effective feature fusion strategy, our method achieves improved or comparable performance on the SCUT-FBP dataset and the ECCV HotOrNot dataset. Our experiments demonstrate the effectiveness of the proposed method and shed light on the interpretability of facial beauty perception. |
Tasks | Face Verification, Facial Beauty Prediction |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07253v1 |
http://arxiv.org/pdf/1803.07253v1.pdf | |
PWC | https://paperswithcode.com/paper/transferring-rich-deep-features-for-facial |
Repo | https://github.com/lucasxlu/TransFBP |
Framework | tf |
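The pipeline reduces to: extract transferred deep features, fuse them by concatenation, and fit Bayesian ridge regression. A scikit-learn sketch with random arrays standing in for the CNN features; the layer choices and dimensions are assumptions:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

n = 50
feat_a = np.random.rand(n, 512)             # e.g. penultimate-layer features
feat_b = np.random.rand(n, 256)             # e.g. an earlier layer
X = np.hstack([feat_a, feat_b])             # simple concatenation fusion
y = np.random.uniform(1, 5, n)              # beauty ratings (SCUT-FBP uses 1-5)

reg = BayesianRidge().fit(X, y)
print(reg.predict(X[:3]))
```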
Hyperprior Induced Unsupervised Disentanglement of Latent Representations
Title | Hyperprior Induced Unsupervised Disentanglement of Latent Representations |
Authors | Abdul Fatir Ansari, Harold Soh |
Abstract | We address the problem of unsupervised disentanglement of latent representations learnt via deep generative models. In contrast to current approaches that operate on the evidence lower bound (ELBO), we argue that statistical independence in the latent space of VAEs can be enforced in a principled hierarchical Bayesian manner. To this effect, we augment the standard VAE with an inverse-Wishart (IW) prior on the covariance matrix of the latent code. By tuning the IW parameters, we are able to encourage (or discourage) independence in the learnt latent dimensions. Extensive experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and CelebA) show that our approach outperforms the $\beta$-VAE and is competitive with the state-of-the-art FactorVAE. Our approach achieves significantly better disentanglement and reconstruction on a new dataset (CorrelatedEllipses) which introduces correlations between the factors of variation. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04497v3 |
http://arxiv.org/pdf/1809.04497v3.pdf | |
PWC | https://paperswithcode.com/paper/hyperprior-induced-unsupervised |
Repo | https://github.com/crslab/correlated-ellipses |
Framework | none |
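One way to see the intended effect: the IW hyperprior controls how strongly the latent covariance is pushed toward (or away from) diagonality. The batch-level off-diagonal penalty below is a crude surrogate for that effect, not the paper's method, which modifies the VAE's probabilistic model itself:

```python
import torch

def offdiag_covariance_penalty(z):
    """Squared off-diagonal entries of the batch covariance of latent codes;
    a larger weight on this term plays the role of IW parameters that
    encourage independence more strongly."""
    zc = z - z.mean(dim=0, keepdim=True)
    cov = zc.t() @ zc / (z.size(0) - 1)
    off = cov - torch.diag(torch.diagonal(cov))
    return off.pow(2).sum()

z = torch.randn(128, 10)   # a batch of latent codes from the encoder
print(offdiag_covariance_penalty(z).item())
```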