Paper Group AWR 233
Skip-gram word embeddings in hyperbolic space. PCN: Point Completion Network. On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing. Discriminability objective for training descriptive captions. Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples. Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition. Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet). Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data. Pose-Robust Face Recognition via Deep Residual Equivariant Mapping. Variational Bayesian Monte Carlo. Can We Gain More from Orthogonality Regularizations in Training Deep CNNs? As you like it: Localization via paired comparisons. Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences. Transferring Rich Deep Features for Facial Beauty Prediction. Hyperprior Induced Unsupervised Disentanglement of Latent Representations.
Skip-gram word embeddings in hyperbolic space
Title | Skip-gram word embeddings in hyperbolic space |
Authors | Matthias Leimeister, Benjamin J. Wilson |
Abstract | Recent work has demonstrated that embeddings of tree-like graphs in hyperbolic space surpass their Euclidean counterparts in performance by a large margin. Inspired by these results and scale-free structure in the word co-occurrence graph, we present an algorithm for learning word embeddings in hyperbolic space from free text. An objective function based on the hyperbolic distance is derived and included in the skip-gram negative-sampling architecture of word2vec. The hyperbolic word embeddings are then evaluated on word similarity and analogy benchmarks. The results demonstrate the potential of hyperbolic word embeddings, particularly in low dimensions, though without clear superiority over their Euclidean counterparts. We further discuss subtleties in the formulation of the analogy task in curved spaces. |
Tasks | Learning Word Embeddings, Word Embeddings |
Published | 2018-08-30 |
URL | https://arxiv.org/abs/1809.01498v2 |
https://arxiv.org/pdf/1809.01498v2.pdf | |
PWC | https://paperswithcode.com/paper/skip-gram-word-embeddings-in-hyperbolic-space |
Repo | https://github.com/mtbarta/hyperbolic |
Framework | tf |
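The core ingredient is the Poincaré-ball distance that replaces the Euclidean dot product in the skip-gram objective. A minimal numpy sketch, assuming the standard Poincaré distance formula; the sigmoid squashing and threshold below are illustrative only, since the paper derives its own score function from this distance:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Distance in the Poincare ball model of hyperbolic space."""
    duv = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v)) + eps
    return np.arccosh(1.0 + 2.0 * duv / denom)

# Score a (word, context) pair by squashing the negative distance, in the
# spirit of word2vec's sigmoid-of-dot-product; the paper's derived objective
# differs in its exact form.
w = np.array([0.10, 0.20])   # word vector inside the unit ball
c = np.array([0.15, 0.10])   # context vector inside the unit ball
score = 1.0 / (1.0 + np.exp(poincare_distance(w, c) - 1.0))
print(score)
```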
PCN: Point Completion Network
Title | PCN: Point Completion Network |
Authors | Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, Martial Hebert |
Abstract | Shape completion, the problem of estimating the complete geometry of objects from partial observations, lies at the core of many vision and robotics applications. In this work, we propose Point Completion Network (PCN), a novel learning-based approach for shape completion. Unlike existing shape completion methods, PCN directly operates on raw point clouds without any structural assumption (e.g. symmetry) or annotation (e.g. semantic class) about the underlying shape. It features a decoder design that enables the generation of fine-grained completions while maintaining a small number of parameters. Our experiments show that PCN produces dense, complete point clouds with realistic structures in the missing regions on inputs with various levels of incompleteness and noise, including cars from LiDAR scans in the KITTI dataset. |
Tasks | Generating 3D Point Clouds |
Published | 2018-08-02 |
URL | https://arxiv.org/abs/1808.00671v3 |
https://arxiv.org/pdf/1808.00671v3.pdf | |
PWC | https://paperswithcode.com/paper/pcn-point-completion-network |
Repo | https://github.com/TonythePlaneswalker/pcn |
Framework | tf |
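Completion quality in this line of work is measured with a permutation-invariant set distance, since point clouds have no canonical ordering. A small numpy sketch of the symmetric Chamfer distance, one of the losses used for point completion; PCN's full training setup (coarse-plus-fine outputs) is not shown:

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance between point sets P (n, 3) and Q (m, 3):
    average nearest-neighbor distance in both directions."""
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)  # (n, m) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

partial  = np.random.rand(128, 3)   # stand-in for a partial input scan
complete = np.random.rand(512, 3)   # stand-in for a predicted completion
print(chamfer_distance(complete, partial))
```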
On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing
Title | On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing |
Authors | Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, Nanyun Peng |
Abstract | Different languages might have different word orders. In this paper, we investigate cross-lingual transfer and posit that an order-agnostic model will perform better when transferring to distant foreign languages. To test our hypothesis, we train dependency parsers on an English corpus and evaluate their transfer performance on 30 other languages. Specifically, we compare encoders and decoders based on Recurrent Neural Networks (RNNs) and modified self-attentive architectures. The former relies on sequential information while the latter is more flexible at modeling word order. Rigorous experiments and detailed analysis show that RNN-based architectures transfer well to languages that are close to English, while self-attentive models have better overall cross-lingual transferability and perform especially well on distant languages. |
Tasks | Cross-Lingual Transfer, Dependency Parsing |
Published | 2018-11-01 |
URL | http://arxiv.org/abs/1811.00570v3 |
http://arxiv.org/pdf/1811.00570v3.pdf | |
PWC | https://paperswithcode.com/paper/on-difficulties-of-cross-lingual-transfer |
Repo | https://github.com/uclanlp/CrossLingualDepParser |
Framework | pytorch |
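The paper's hypothesis — that sequential encoders bake in source-language word order while self-attention need not — can be illustrated with a toy experiment. A sketch assuming PyTorch; the actual models use biaffine parsers and relative position representations, so the GRU and the position-free attention pooling below only demonstrate the order-(in)sensitivity contrast:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
emb = nn.Embedding(10, 8)
rnn = nn.GRU(8, 8, batch_first=True)

sent = torch.tensor([[1, 2, 3, 4]])   # a toy sentence
perm = torch.tensor([[4, 3, 2, 1]])   # same words, reversed order

# An RNN encoder's final state depends on word order.
_, h1 = rnn(emb(sent))
_, h2 = rnn(emb(perm))
print("RNN states differ:", not torch.allclose(h1, h2))

# Self-attention with no position encodings is permutation-equivariant,
# so mean-pooling its outputs gives an order-invariant representation.
def attn_pool(x):
    a = torch.softmax(x @ x.transpose(1, 2), dim=-1)
    return (a @ x).mean(dim=1)

print("Attention pools match:",
      torch.allclose(attn_pool(emb(sent)), attn_pool(emb(perm)), atol=1e-6))
```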
Discriminability objective for training descriptive captions
Title | Discriminability objective for training descriptive captions |
Authors | Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich |
Abstract | One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them. We propose a way to improve this aspect of caption generation. By incorporating into the captioning training objective a loss component directly related to the ability (of a machine) to disambiguate image/caption matches, we obtain systems that produce much more discriminative captions, according to human evaluation. Remarkably, our approach leads to improvements in other aspects of generated captions, reflected by a battery of standard scores such as BLEU and SPICE. Our approach is modular and can be applied to a variety of model/loss combinations commonly proposed for image captioning. |
Tasks | Image Captioning |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04376v2 |
http://arxiv.org/pdf/1803.04376v2.pdf | |
PWC | https://paperswithcode.com/paper/discriminability-objective-for-training |
Repo | https://github.com/pdaicode/ImageCaptioning |
Framework | none |
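A machine-discriminability term of the kind the abstract describes is usually a contrastive retrieval loss: the generated caption must match its own image better than the other images in the batch. A hedged PyTorch sketch; the margin, the similarity space, and the batch-wise formulation are illustrative stand-ins for the paper's retrieval model:

```python
import torch
import torch.nn.functional as F

def discriminability_loss(cap_emb, img_emb, margin=0.2):
    """Batch-wise hinge retrieval loss: each caption should score higher with
    its own image (diagonal) than with any other image, and vice versa."""
    s = cap_emb @ img_emb.t()                     # (B, B) similarity matrix
    pos = s.diag().unsqueeze(1)                   # matched-pair scores
    cost_c = (margin + s - pos).clamp(min=0)      # caption vs. wrong images
    cost_i = (margin + s - pos.t()).clamp(min=0)  # image vs. wrong captions
    mask = torch.eye(s.size(0), dtype=torch.bool)
    return cost_c.masked_fill(mask, 0).mean() + cost_i.masked_fill(mask, 0).mean()

caps = F.normalize(torch.randn(4, 16), dim=1)     # caption embeddings
imgs = F.normalize(torch.randn(4, 16), dim=1)     # image embeddings
# Added to the usual captioning loss with some weight lambda.
print(discriminability_loss(caps, imgs).item())
```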
Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples
Title | Attacks Meet Interpretability: Attribute-steered Detection of Adversarial Samples |
Authors | Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang |
Abstract | Adversarial sample attacks perturb benign inputs to induce DNN misbehaviors. Recent research has demonstrated the widespread presence and the devastating consequences of such attacks. Existing defense techniques either assume prior knowledge of specific attacks or may not work well on complex models due to their underlying assumptions. We argue that adversarial sample attacks are deeply entangled with interpretability of DNN models: while classification results on benign inputs can be reasoned based on the human perceptible features/attributes, results on adversarial samples can hardly be explained. Therefore, we propose a novel adversarial sample detection technique for face recognition models, based on interpretability. It features a novel bi-directional correspondence inference between attributes and internal neurons to identify neurons critical for individual attributes. The activation values of critical neurons are enhanced to amplify the reasoning part of the computation and the values of other neurons are weakened to suppress the uninterpretable part. The classification results after such transformation are compared with those of the original model to detect adversaries. Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs. In contrast, a state-of-the-art feature squeezing technique can only achieve 55% accuracy with 23.3% false positives. |
Tasks | Face Recognition |
Published | 2018-10-27 |
URL | http://arxiv.org/abs/1810.11580v1 |
http://arxiv.org/pdf/1810.11580v1.pdf | |
PWC | https://paperswithcode.com/paper/attacks-meet-interpretability-attribute |
Repo | https://github.com/AmIAttribute/AmI |
Framework | none |
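The detection recipe from the abstract — amplify attribute-critical neurons, suppress the rest, and flag inputs whose prediction flips — can be sketched in a few lines of numpy. The scaling factors and the linear read-out below are assumptions, not the paper's model-specific transforms:

```python
import numpy as np

def attribute_steered_predict(acts, critical_mask, W, alpha=2.0, beta=0.5):
    """Strengthen attribute-critical neurons, weaken the rest, and classify
    both the original and the steered activations."""
    steered = np.where(critical_mask, alpha * acts, beta * acts)
    return np.argmax(acts @ W), np.argmax(steered @ W)

rng = np.random.default_rng(0)
acts = rng.random(64)                  # activations of one internal layer
mask = rng.random(64) < 0.2            # neurons found critical for attributes
W = rng.standard_normal((64, 10))      # stand-in linear classifier head
orig, steered = attribute_steered_predict(acts, mask, W)
print("flag as adversarial:", orig != steered)
```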
Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition
Title | Consensus-Driven Propagation in Massive Unlabeled Data for Face Recognition |
Authors | Xiaohang Zhan, Ziwei Liu, Junjie Yan, Dahua Lin, Chen Change Loy |
Abstract | Face recognition has witnessed great progress in recent years, mainly attributed to high-capacity models and abundant labeled data. However, it becomes more and more prohibitive to scale up the current million-level identity annotations. In this work, we show that unlabeled face data can be as effective as the labeled ones. Here, we consider a setting closely mimicking the real-world scenario, where the unlabeled data are collected from unconstrained environments and their identities are exclusive from the labeled ones. Our main insight is that although the class information is not available, we can still faithfully approximate these semantic relationships by constructing a relational graph in a bottom-up manner. We propose Consensus-Driven Propagation (CDP) to tackle this challenging problem with two modules, the “committee” and the “mediator”, which select positive face pairs robustly by carefully aggregating multi-view information. Extensive experiments validate the effectiveness of both modules to discard outliers and mine hard positives. With CDP, we achieve a compelling accuracy of 78.18% on the MegaFace identification challenge by using only 9% of the labels, compared to 61.78% when no unlabeled data are used and 78.52% when all labels are employed. |
Tasks | Face Recognition |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01407v2 |
http://arxiv.org/pdf/1809.01407v2.pdf | |
PWC | https://paperswithcode.com/paper/consensus-driven-propagation-in-massive |
Repo | https://github.com/XiaohangZhan/cdp |
Framework | pytorch |
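A stripped-down illustration of the committee idea: a candidate pair mined from nearest neighbors is kept only when the base model and enough auxiliary models agree it is a positive. The threshold-and-vote rule below is a simplification; CDP's mediator aggregates richer multi-view statistics than a plain vote:

```python
import numpy as np

def committee_select(base_sim, committee_sims, candidate_pairs, thr=0.7, votes=2):
    """Keep a candidate pair only when the base model and enough committee
    members consider it a positive."""
    keep = []
    for i, j in candidate_pairs:
        agree = sum(sim[i, j] > thr for sim in committee_sims)
        if base_sim[i, j] > thr and agree >= votes:
            keep.append((i, j))
    return keep

def cosine(f):
    n = np.linalg.norm(f, axis=1, keepdims=True)
    return (f / n) @ (f / n).T

rng = np.random.default_rng(0)
sims = [cosine(rng.random((6, 4))) for _ in range(4)]   # base + 3 committee models
pairs = [(i, j) for i in range(6) for j in range(i + 1, 6)]
print(committee_select(sims[0], sims[1:], pairs))
```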
Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet)
Title | Deep Cross Modal Learning for Caricature Verification and Identification (CaVINet) |
Authors | Jatin Garg, Skand Vishwanath Peri, Himanshu Tolani, Narayanan C Krishnan |
Abstract | Learning from different modalities is a challenging task. In this paper, we look at the challenging problem of cross modal face verification and recognition between caricature and visual image modalities. Caricatures exaggerate the facial features of a person. Due to the significant variations in the caricatures, building vision models for recognizing and verifying data from this modality is an extremely challenging task. Visual images, with significantly fewer distortions, can act as a bridge for the analysis of the caricature modality. We introduce a publicly available large Caricature-VIsual dataset [CaVI] with images from both modalities that captures the rich variations in the caricature of an identity. This paper presents the first cross modal architecture that handles extreme distortions of caricatures using a deep learning network that learns similar representations across the modalities. We use two convolutional networks along with transformations that are subjected to orthogonality constraints to capture the shared and modality-specific representations. In contrast to prior research, our approach neither depends on manually extracted facial landmarks for learning the representations, nor on the identities of the person for performing verification. The learned shared representation achieves 91% accuracy for verifying unseen images and 75% accuracy on unseen identities. Further, recognizing the identity in the image by knowledge transfer, using a combination of shared and modality-specific representations, resulted in an unprecedented performance of 85% rank-1 accuracy for caricatures and 95% rank-1 accuracy for visual images. |
Tasks | Caricature, Face Recognition, Face Verification, Transfer Learning |
Published | 2018-07-31 |
URL | http://arxiv.org/abs/1807.11688v1 |
http://arxiv.org/pdf/1807.11688v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-cross-modal-learning-for-caricature |
Repo | https://github.com/lsaiml/CaVINet |
Framework | tf |
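The orthogonality constraint mentioned in the abstract can be realized as a penalty that decorrelates the shared and modality-specific projection subspaces. A PyTorch sketch under that assumption; the paper's exact constraint formulation may differ:

```python
import torch

def decorrelation_penalty(W_shared, W_specific):
    """Frobenius penalty pushing the shared and modality-specific projection
    subspaces toward mutual orthogonality."""
    return (W_shared.t() @ W_specific).pow(2).sum()

W_s = torch.randn(128, 64, requires_grad=True)  # shared projection
W_m = torch.randn(128, 64, requires_grad=True)  # caricature-specific projection
loss = decorrelation_penalty(W_s, W_m)          # added to the verification loss
loss.backward()
print(loss.item())
```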
Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data
Title | Bootstrapping and Multiple Imputation Ensemble Approaches for Missing Data |
Authors | Shehroz S. Khan, Amir Ahmad, Alex Mihailidis |
Abstract | Presence of missing values in a dataset can adversely affect the performance of a classifier. Single and multiple imputation are normally performed to fill in the missing values. In this paper, we present several variants of combining single and multiple imputation with bootstrapping to create ensembles that can model uncertainty and diversity in the data, and that are robust to high missingness in the data. We present three ensemble strategies: bootstrapping on incomplete data followed by (i) single imputation and (ii) multiple imputation, and (iii) multiple imputation ensemble without bootstrapping. We perform an extensive evaluation of the performance of these ensemble strategies on 8 datasets by varying the missingness ratio. Our results show that bootstrapping followed by multiple imputation using expectation maximization is the most robust method, even at a high missingness ratio (up to 30%). For small missingness ratios (up to 10%), most of the ensemble methods perform equivalently, but better than single imputation. Kappa-error plots suggest that accurate classifiers with reasonable diversity are the reason for this behaviour. A consistent observation across all the datasets is that for small missingness (up to 10%), bootstrapping on incomplete data without any imputation produces results equivalent to the other ensemble methods. |
Tasks | Imputation |
Published | 2018-02-01 |
URL | https://arxiv.org/abs/1802.00154v5 |
https://arxiv.org/pdf/1802.00154v5.pdf | |
PWC | https://paperswithcode.com/paper/bootstrapping-and-multiple-imputation |
Repo | https://github.com/titubeta/EnsembleImputation |
Framework | none |
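Strategy (i) — bootstrap the incomplete data, then impute each replicate — is easy to sketch with scikit-learn. Mean imputation below stands in for the paper's imputation methods (their most robust variant uses EM-based multiple imputation), and the tree classifier and vote rule are illustrative:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier

def bootstrap_impute_ensemble(X, y, n_members=10, seed=0):
    """Bootstrap the incomplete data first, impute each replicate separately,
    and train one classifier per replicate."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), len(X))          # bootstrap resample
        imp = SimpleImputer(strategy="mean").fit(X[idx])
        clf = DecisionTreeClassifier(random_state=0).fit(imp.transform(X[idx]), y[idx])
        members.append((imp, clf))
    return members

def vote(members, X):
    preds = np.stack([clf.predict(imp.transform(X)) for imp, clf in members])
    return np.round(preds.mean(axis=0))                # majority vote, 0/1 labels

X = np.random.rand(100, 5)
X[np.random.rand(100, 5) < 0.2] = np.nan               # 20% missingness
y = (np.nansum(X, axis=1) > 2.0).astype(int)
members = bootstrap_impute_ensemble(X, y)
print(vote(members, X)[:10])
```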
Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Title | Pose-Robust Face Recognition via Deep Residual Equivariant Mapping |
Authors | Kaidi Cao, Yu Rong, Cheng Li, Xiaoou Tang, Chen Change Loy |
Abstract | Face recognition achieves exceptional success thanks to the emergence of deep learning. However, many contemporary face recognition models still perform relatively poorly in processing profile faces compared to frontal faces. A key reason is that the numbers of frontal and profile training faces are highly imbalanced - there are extensively more frontal training samples than profile ones. In addition, it is intrinsically hard to learn a deep representation that is geometrically invariant to large pose variations. In this study, we hypothesize that there is an inherent mapping between frontal and profile faces, and consequently, their discrepancy in the deep representation space can be bridged by an equivariant mapping. To exploit this mapping, we formulate a novel Deep Residual EquivAriant Mapping (DREAM) block, which is capable of adaptively adding residuals to the input deep representation to transform a profile face representation to a canonical pose that simplifies recognition. The DREAM block consistently enhances the performance of profile face recognition for many strong deep networks, including ResNet models, without deliberately augmenting training data of profile faces. The block is easy to use, lightweight, and can be implemented with negligible computational overhead. |
Tasks | Face Identification, Face Recognition, Face Verification, Robust Face Recognition |
Published | 2018-03-02 |
URL | http://arxiv.org/abs/1803.00839v1 |
http://arxiv.org/pdf/1803.00839v1.pdf | |
PWC | https://paperswithcode.com/paper/pose-robust-face-recognition-via-deep |
Repo | https://github.com/penincillin/DREAM |
Framework | pytorch |
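The DREAM block admits a compact sketch: a residual branch whose contribution is gated by a pose coefficient, so frontal faces pass through nearly unchanged while profile features get pulled toward the frontal representation space. The branch architecture and the pose-to-coefficient mapping below are assumptions:

```python
import torch
import torch.nn as nn

class DreamBlock(nn.Module):
    """Residual branch gated by a yaw coefficient: ~0 for frontal faces
    (feature passes through unchanged), ~1 for full profile."""
    def __init__(self, dim=256):
        super().__init__()
        self.residual = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feat, yaw_coeff):
        return feat + yaw_coeff.unsqueeze(1) * self.residual(feat)

block = DreamBlock()
feat = torch.randn(4, 256)                   # deep features from a face CNN
yaw = torch.tensor([0.0, 0.3, 0.7, 1.0])     # estimated pose coefficients
print(block(feat, yaw).shape)                # torch.Size([4, 256])
```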
Variational Bayesian Monte Carlo
Title | Variational Bayesian Monte Carlo |
Authors | Luigi Acerbi |
Abstract | Many probabilistic models of interest in scientific computing and machine learning have expensive, black-box likelihoods that prevent the application of standard techniques for Bayesian inference, such as MCMC, which would require access to the gradient or a large number of likelihood evaluations. We introduce here a novel sample-efficient inference framework, Variational Bayesian Monte Carlo (VBMC). VBMC combines variational inference with Gaussian-process based, active-sampling Bayesian quadrature, using the latter to efficiently approximate the intractable integral in the variational objective. Our method produces both a nonparametric approximation of the posterior distribution and an approximate lower bound of the model evidence, useful for model selection. We demonstrate VBMC both on several synthetic likelihoods and on a neuronal model with data from real neurons. Across all tested problems and dimensions (up to $D = 10$), VBMC performs consistently well in reconstructing the posterior and the model evidence with a limited budget of likelihood evaluations, unlike other methods that work only in very low dimensions. Our framework shows great promise as a novel tool for posterior and model inference with expensive, black-box likelihoods. |
Tasks | Bayesian Inference, Model Selection |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05558v2 |
http://arxiv.org/pdf/1810.05558v2.pdf | |
PWC | https://paperswithcode.com/paper/variational-bayesian-monte-carlo |
Repo | https://github.com/lacerbi/vbmc |
Framework | none |
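A drastically simplified 1-D caricature of the VBMC idea, assuming scikit-learn: model the expensive log-joint with a GP surrogate and evaluate the variational objective against the surrogate instead of the true likelihood. VBMC proper uses Bayesian quadrature (closed-form integration of the GP) and active sampling, neither of which is shown:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_log_joint(theta):            # stand-in for a black-box target
    return -0.5 * (theta - 1.0) ** 2

# Fit a GP surrogate to a small budget of log-joint evaluations.
thetas = np.linspace(-3, 3, 8)[:, None]
gp = GaussianProcessRegressor().fit(thetas, expensive_log_joint(thetas[:, 0]))

# Evaluate the variational objective E_q[f] + H[q] for a Gaussian q against
# the surrogate mean, via Monte Carlo rather than Bayesian quadrature.
mu, sigma = 0.5, 1.0
samples = np.random.default_rng(0).normal(mu, sigma, size=(500, 1))
elbo = gp.predict(samples).mean() + 0.5 * np.log(2 * np.pi * np.e * sigma**2)
print(elbo)
```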
Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
Title | Can We Gain More from Orthogonality Regularizations in Training Deep CNNs? |
Authors | Nitin Bansal, Xiaohan Chen, Zhangyang Wang |
Abstract | This paper seeks to answer the question: as the (near-) orthogonality of weights is found to be a favorable property for training deep convolutional neural networks, how can we enforce it in more effective and easy-to-use ways? We develop novel orthogonality regularizations for training deep CNNs, utilizing various advanced analytical tools such as mutual coherence and the restricted isometry property. These plug-and-play regularizations can be conveniently incorporated into training almost any CNN without extra hassle. We then benchmark their effects on state-of-the-art models: ResNet, WideResNet, and ResNeXt, on several of the most popular computer vision datasets: CIFAR-10, CIFAR-100, SVHN and ImageNet. We observe consistent performance gains after applying those proposed regularizations, in terms of both the final accuracies achieved and faster, more stable convergence. We have made our codes and pre-trained models publicly available: https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality. |
Tasks | |
Published | 2018-10-22 |
URL | http://arxiv.org/abs/1810.09102v1 |
http://arxiv.org/pdf/1810.09102v1.pdf | |
PWC | https://paperswithcode.com/paper/can-we-gain-more-from-orthogonality |
Repo | https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality |
Framework | pytorch |
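The simplest of the proposed regularizers, soft orthogonality (SO), penalizes the Frobenius distance between a layer's Gram matrix and the identity. A PyTorch sketch; the reshaping convention for conv kernels and the weight 1e-4 are illustrative, and the paper's strongest variant (SRIP) replaces the Frobenius norm with a spectral norm:

```python
import torch

def soft_orthogonality(W):
    """SO regularizer: squared Frobenius distance between the filters' Gram
    matrix and the identity, with conv kernels flattened one row per filter."""
    Wm = W.reshape(W.size(0), -1)
    gram = Wm @ Wm.t()
    return ((gram - torch.eye(gram.size(0))) ** 2).sum()

conv_w = torch.randn(64, 3, 3, 3, requires_grad=True)  # a conv layer's kernel
loss = 1e-4 * soft_orthogonality(conv_w)               # added to the task loss
loss.backward()
print(loss.item())
```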
As you like it: Localization via paired comparisons
Title | As you like it: Localization via paired comparisons |
Authors | Andrew K. Massimino, Mark A. Davenport |
Abstract | Suppose that we wish to estimate a vector $\mathbf{x}$ from a set of binary paired comparisons of the form “$\mathbf{x}$ is closer to $\mathbf{p}$ than to $\mathbf{q}$” for various choices of vectors $\mathbf{p}$ and $\mathbf{q}$. The problem of estimating $\mathbf{x}$ from this type of observation arises in a variety of contexts, including nonmetric multidimensional scaling, “unfolding,” and ranking problems, often because it provides a powerful and flexible model of preference. We describe theoretical bounds for how well we can expect to estimate $\mathbf{x}$ under a randomized model for $\mathbf{p}$ and $\mathbf{q}$. We also present results for the case where the comparisons are noisy and subject to some degree of error. Additionally, we show that under a randomized model for $\mathbf{p}$ and $\mathbf{q}$, a suitable number of binary paired comparisons yield a stable embedding of the space of target vectors. Finally, we also show that we can achieve significant gains by adaptively changing the distribution for choosing $\mathbf{p}$ and $\mathbf{q}$. |
Tasks | |
Published | 2018-02-19 |
URL | http://arxiv.org/abs/1802.10489v1 |
http://arxiv.org/pdf/1802.10489v1.pdf | |
PWC | https://paperswithcode.com/paper/as-you-like-it-localization-via-paired |
Repo | https://github.com/siplab-gt/pairsearch |
Framework | none |
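Each comparison “$\mathbf{x}$ is closer to $\mathbf{p}$ than to $\mathbf{q}$” is a halfspace constraint, $2(\mathbf{q}-\mathbf{p})^\top \mathbf{x} \le \|\mathbf{q}\|^2 - \|\mathbf{p}\|^2$, so noiseless localization is a feasibility problem. A numpy sketch using alternating projection onto violated halfspaces — a simple solver, not the estimator the paper analyzes:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 2, 200
x_true = rng.normal(size=d)

# Build m noiseless comparisons as halfspaces 2(q - p)^T x <= ||q||^2 - ||p||^2.
P, Q = rng.normal(size=(m, d)), rng.normal(size=(m, d))
A = 2 * (Q - P)
b = (Q ** 2).sum(1) - (P ** 2).sum(1)
flip = A @ x_true > b                 # orient each so x_true satisfies it
A[flip], b[flip] = -A[flip], -b[flip]

# Project onto the most violated halfspace until all constraints hold.
x = np.zeros(d)
for _ in range(2000):
    viol = A @ x - b
    i = int(np.argmax(viol))
    if viol[i] <= 0:
        break
    x -= viol[i] * A[i] / (A[i] @ A[i])
print(np.linalg.norm(x - x_true))     # small: the feasible cell shrinks with m
```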
Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences
Title | Utilizing Neural Networks and Linguistic Metadata for Early Detection of Depression Indications in Text Sequences |
Authors | Marcel Trotzek, Sven Koitka, Christoph M. Friedrich |
Abstract | Depression is ranked as the largest contributor to global disability and is also a major reason for suicide. Still, many individuals suffering from forms of depression are not treated for various reasons. Previous studies have shown that depression also has an effect on language usage and that many depressed individuals use social media platforms or the internet in general to get information or discuss their problems. This paper addresses the early detection of depression using machine learning models based on messages on a social platform. In particular, a convolutional neural network based on different word embeddings is evaluated and compared to a classification based on user-level linguistic metadata. An ensemble of both approaches is shown to achieve state-of-the-art results in a current early detection task. Furthermore, the currently popular ERDE score as a metric for early detection systems is examined in detail and its drawbacks in the context of shared tasks are illustrated. A slightly modified metric is proposed and compared to the original score. Finally, a new word embedding was trained on a large corpus of the same domain as the described task and is evaluated as well. |
Tasks | Word Embeddings |
Published | 2018-04-19 |
URL | http://arxiv.org/abs/1804.07000v3 |
http://arxiv.org/pdf/1804.07000v3.pdf | |
PWC | https://paperswithcode.com/paper/utilizing-neural-networks-and-linguistic |
Repo | https://github.com/serenera/Serenera |
Framework | tf |
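The ERDE metric the paper critiques scores each subject by decision type, with true positives discounted by a sigmoid latency cost, so a correct but late detection is penalized almost like a miss. A sketch following the shared-task definition; the cost constants below are illustrative (c_fp is often set to the positive-class prevalence):

```python
import numpy as np

def erde(decision, truth, delay, o=5, c_fp=0.1296, c_fn=1.0, c_tp=1.0):
    """ERDE_o for one subject: fixed costs for false positives/negatives,
    latency-discounted cost lc_o(k) = 1 - 1/(1 + exp(k - o)) for true
    positives seen after k messages, zero for true negatives."""
    if decision and not truth:
        return c_fp
    if not decision and truth:
        return c_fn
    if decision and truth:
        return (1.0 - 1.0 / (1.0 + np.exp(delay - o))) * c_tp
    return 0.0

# A correct decision made after 20 messages costs far more than one after 2.
print(erde(True, True, delay=2), erde(True, True, delay=20))
```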
Transferring Rich Deep Features for Facial Beauty Prediction
Title | Transferring Rich Deep Features for Facial Beauty Prediction |
Authors | Lu Xu, Jinhai Xiang, Xiaohui Yuan |
Abstract | Feature extraction plays a significant part in computer vision tasks. In this paper, we propose a method that transfers rich deep features from a model pretrained on a face verification task and feeds the features into a Bayesian ridge regression algorithm for facial beauty prediction. We leverage deep neural networks, which extract more abstract features from stacked layers. Through a simple but effective feature fusion strategy, our method achieves improved or comparable performance on the SCUT-FBP dataset and the ECCV HotOrNot dataset. Our experiments demonstrate the effectiveness of the proposed method and shed light on the interpretability of facial beauty perception. |
Tasks | Face Verification, Facial Beauty Prediction |
Published | 2018-03-20 |
URL | http://arxiv.org/abs/1803.07253v1 |
http://arxiv.org/pdf/1803.07253v1.pdf | |
PWC | https://paperswithcode.com/paper/transferring-rich-deep-features-for-facial |
Repo | https://github.com/lucasxlu/TransFBP |
Framework | tf |
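The pipeline reduces to: extract transferred deep features, fuse them by concatenation, and fit Bayesian ridge regression. A scikit-learn sketch with random arrays standing in for the CNN features; the layer choices and dimensions are assumptions:

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

n = 50
feat_a = np.random.rand(n, 512)             # e.g. penultimate-layer features
feat_b = np.random.rand(n, 256)             # e.g. an earlier layer
X = np.hstack([feat_a, feat_b])             # simple concatenation fusion
y = np.random.uniform(1, 5, n)              # beauty ratings (SCUT-FBP uses 1-5)

reg = BayesianRidge().fit(X, y)
print(reg.predict(X[:3]))
```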
Hyperprior Induced Unsupervised Disentanglement of Latent Representations
Title | Hyperprior Induced Unsupervised Disentanglement of Latent Representations |
Authors | Abdul Fatir Ansari, Harold Soh |
Abstract | We address the problem of unsupervised disentanglement of latent representations learnt via deep generative models. In contrast to current approaches that operate on the evidence lower bound (ELBO), we argue that statistical independence in the latent space of VAEs can be enforced in a principled hierarchical Bayesian manner. To this effect, we augment the standard VAE with an inverse-Wishart (IW) prior on the covariance matrix of the latent code. By tuning the IW parameters, we are able to encourage (or discourage) independence in the learnt latent dimensions. Extensive experimental results on a range of datasets (2DShapes, 3DChairs, 3DFaces and CelebA) show that our approach outperforms the $\beta$-VAE and is competitive with the state-of-the-art FactorVAE. Our approach achieves significantly better disentanglement and reconstruction on a new dataset (CorrelatedEllipses) which introduces correlations between the factors of variation. |
Tasks | |
Published | 2018-09-12 |
URL | http://arxiv.org/abs/1809.04497v3 |
http://arxiv.org/pdf/1809.04497v3.pdf | |
PWC | https://paperswithcode.com/paper/hyperprior-induced-unsupervised |
Repo | https://github.com/crslab/correlated-ellipses |
Framework | none |
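One way to see the intended effect: the IW hyperprior controls how strongly the latent covariance is pushed toward (or away from) diagonality. The batch-level off-diagonal penalty below is a crude surrogate for that effect, not the paper's method, which modifies the VAE's probabilistic model itself:

```python
import torch

def offdiag_covariance_penalty(z):
    """Squared off-diagonal entries of the batch covariance of latent codes;
    a larger weight on this term plays the role of IW parameters that
    encourage independence more strongly."""
    zc = z - z.mean(dim=0, keepdim=True)
    cov = zc.t() @ zc / (z.size(0) - 1)
    off = cov - torch.diag(torch.diagonal(cov))
    return off.pow(2).sum()

z = torch.randn(128, 10)   # a batch of latent codes from the encoder
print(offdiag_covariance_penalty(z).item())
```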