Paper Group AWR 153
Jointly Learning Entity and Relation Representations for Entity Alignment
Title | Jointly Learning Entity and Relation Representations for Entity Alignment |
Authors | Yuting Wu, Xiao Liu, Yansong Feng, Zheng Wang, Dongyan Zhao |
Abstract | Entity alignment is a viable means for integrating heterogeneous knowledge among different knowledge graphs (KGs). Recent developments in the field often take an embedding-based approach to model the structural information of KGs so that entity alignment can be easily performed in the embedding space. However, most existing works do not explicitly utilize useful relation representations to assist in entity alignment, which, as we will show in the paper, is a simple yet effective way for improving entity alignment. This paper presents a novel joint learning framework for entity alignment. At the core of our approach is a Graph Convolutional Network (GCN) based framework for learning both entity and relation representations. Rather than relying on pre-aligned relation seeds to learn relation representations, we first approximate them using entity embeddings learned by the GCN. We then incorporate the relation approximation into entities to iteratively learn better representations for both. Experiments performed on three real-world cross-lingual datasets show that our approach substantially outperforms state-of-the-art entity alignment methods. |
Tasks | Entity Alignment, Entity Embeddings, Knowledge Graphs |
Published | 2019-09-20 |
URL | https://arxiv.org/abs/1909.09317v1 |
https://arxiv.org/pdf/1909.09317v1.pdf | |
PWC | https://paperswithcode.com/paper/jointly-learning-entity-and-relation |
Repo | https://github.com/StephanieWyt/HGCN-JE-JR |
Framework | tf |
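The key step described in the abstract — approximating relation representations from GCN entity embeddings instead of relying on pre-aligned relation seeds — can be sketched roughly as follows. This is an illustrative reading of that idea, not the authors' exact formulation; all function and variable names are hypothetical.

```python
import numpy as np

def approximate_relation_embeddings(entity_emb, triples, num_relations):
    """Approximate each relation by pooling the GCN embeddings of its head
    and tail entities (concatenated means), using entity embeddings only.

    entity_emb : (num_entities, d) array of GCN entity embeddings
    triples    : list of (head, relation, tail) index triples
    """
    d = entity_emb.shape[1]
    head_sum = np.zeros((num_relations, d))
    tail_sum = np.zeros((num_relations, d))
    count = np.zeros((num_relations, 1))
    for h, r, t in triples:
        head_sum[r] += entity_emb[h]
        tail_sum[r] += entity_emb[t]
        count[r] += 1
    count = np.maximum(count, 1)  # avoid division by zero for unseen relations
    # relation representation = [mean head embedding ; mean tail embedding]
    return np.concatenate([head_sum / count, tail_sum / count], axis=1)

# toy usage: 5 entities, 4-dim embeddings, 2 relations
emb = np.random.randn(5, 4)
rel = approximate_relation_embeddings(emb, [(0, 0, 1), (2, 0, 3), (1, 1, 4)], 2)
print(rel.shape)  # (2, 8)
```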
Joint 3D Face Reconstruction and Dense Face Alignment from A Single Image with 2D-Assisted Self-Supervised Learning
Title | Joint 3D Face Reconstruction and Dense Face Alignment from A Single Image with 2D-Assisted Self-Supervised Learning |
Authors | Xiaoguang Tu, Jian Zhao, Zihang Jiang, Yao Luo, Mei Xie, Yang Zhao, Linxiao He, Zheng Ma, Jiashi Feng |
Abstract | 3D face reconstruction from a single 2D image is a challenging problem with broad applications. Recent methods typically aim to learn a CNN-based 3D face model that regresses coefficients of a 3D Morphable Model (3DMM) from 2D images to render 3D face reconstruction or dense face alignment. However, the shortage of training data with 3D annotations considerably limits the performance of those methods. To alleviate this issue, we propose a novel 2D-assisted self-supervised learning (2DASL) method that can effectively use “in-the-wild” 2D face images with noisy landmark information to substantially improve 3D face model learning. Specifically, taking the sparse 2D facial landmarks as additional information, 2DASL introduces four novel self-supervision schemes that view the 2D landmark and 3D landmark prediction as a self-mapping process, including the 2D and 3D landmark self-prediction consistency, cycle-consistency over the 2D landmark prediction and self-critic over the predicted 3DMM coefficients based on landmark predictions. Using these four self-supervision schemes, the 2DASL method significantly relieves demands on the conventional paired 2D-to-3D annotations and gives much higher-quality 3D face models without requiring any additional 3D annotations. Experiments on multiple challenging datasets show that our method outperforms state-of-the-art methods for both 3D face reconstruction and dense face alignment by a large margin. |
Tasks | 3D Face Reconstruction, Face Alignment, Face Reconstruction |
Published | 2019-03-22 |
URL | http://arxiv.org/abs/1903.09359v1 |
http://arxiv.org/pdf/1903.09359v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-3d-face-reconstruction-and-dense-face |
Repo | https://github.com/XgTu/2DASL-CNN |
Framework | pytorch |
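A heavily hedged sketch of two of the self-supervision terms the abstract mentions (2D/3D landmark self-prediction consistency and cycle-consistency); the tensor names and shapes are assumptions, and the self-critic term over 3DMM coefficients is omitted.

```python
import torch

def landmark_self_supervision(pred_2d, proj_3d_to_2d, cycled_2d):
    """pred_2d:       2D landmarks predicted directly from the image, (N, 68, 2)
    proj_3d_to_2d: 2D projection of the landmarks of the predicted 3D face, (N, 68, 2)
    cycled_2d:     2D landmarks recovered after a 2D -> 3D -> 2D round trip, (N, 68, 2)
    Returns the 2D/3D agreement term and the cycle-consistency term."""
    consistency = torch.mean(torch.norm(pred_2d - proj_3d_to_2d, dim=-1))
    cycle = torch.mean(torch.norm(pred_2d - cycled_2d, dim=-1))
    return consistency, cycle
```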
Multi-view Knowledge Graph Embedding for Entity Alignment
Title | Multi-view Knowledge Graph Embedding for Entity Alignment |
Authors | Qingheng Zhang, Zequn Sun, Wei Hu, Muhao Chen, Lingbing Guo, Yuzhong Qu |
Abstract | We study the problem of embedding-based entity alignment between knowledge graphs (KGs). Previous works mainly focus on the relational structure of entities. Some further incorporate other types of features, such as attributes, for refinement. However, a vast number of entity features are still unexplored or not treated equally, which impairs the accuracy and robustness of embedding-based entity alignment. In this paper, we propose a novel framework that unifies multiple views of entities to learn embeddings for entity alignment. Specifically, we embed entities based on the views of entity names, relations and attributes, with several combination strategies. Furthermore, we design some cross-KG inference methods to enhance the alignment between two KGs. Our experiments on real-world datasets show that the proposed framework significantly outperforms the state-of-the-art embedding-based entity alignment methods. The selected views, cross-KG inference and combination strategies all contribute to the performance improvement. |
Tasks | Entity Alignment, Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02390v1 |
https://arxiv.org/pdf/1906.02390v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-view-knowledge-graph-embedding-for |
Repo | https://github.com/nju-websoft/MultiKE |
Framework | tf |
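A minimal sketch of the general pattern the abstract describes — combining per-view embeddings (names, relations, attributes) into one representation and then aligning entities across two KGs by nearest neighbour. The combination rule (weighted average of normalized views) is one of several plausible strategies, not necessarily the one the authors use.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def combine_views(view_embs, weights=None):
    """Combine per-view entity embeddings into a single representation by a
    weighted average of L2-normalized views.
    view_embs : list of (num_entities, d) arrays, one per view."""
    weights = weights or [1.0 / len(view_embs)] * len(view_embs)
    combined = sum(w * l2_normalize(v) for w, v in zip(weights, view_embs))
    return l2_normalize(combined)

def align_by_cosine(emb_kg1, emb_kg2):
    """Greedy alignment: each entity in KG1 is matched to its nearest
    neighbour in KG2 under cosine similarity."""
    sim = l2_normalize(emb_kg1) @ l2_normalize(emb_kg2).T
    return np.argmax(sim, axis=1)
```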
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
Title | Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model |
Authors | Alex X. Lee, Anusha Nagabandi, Pieter Abbeel, Sergey Levine |
Abstract | Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these kinds of observation spaces present a number of challenges in practice, since the policy must now solve two problems: a representation learning problem, and a task learning problem. In this paper, we aim to explicitly learn representations that can accelerate reinforcement learning from images. We propose the stochastic latent actor-critic (SLAC) algorithm: a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs. SLAC learns a compact latent representation space using a stochastic sequential latent variable model, and then learns a critic model within this latent space. By learning a critic within a compact state space, SLAC can learn much more efficiently than standard RL methods. The proposed model improves performance substantially over alternative representations as well, such as variational autoencoders. In fact, our experimental evaluation demonstrates that the sample efficiency of our resulting method is comparable to that of model-based RL methods that directly use a similar type of model for control. Furthermore, our method outperforms both model-free and model-based alternatives in terms of final performance and sample efficiency, on a range of difficult image-based control tasks. Our code and videos of our results are available at our website. |
Tasks | Continuous Control, Representation Learning |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00953v2 |
https://arxiv.org/pdf/1907.00953v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-latent-actor-critic-deep |
Repo | https://github.com/alexlee-gk/slac |
Framework | tf |
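The "stochastic sequential latent variable model" in the abstract is, in its generic form, trained by maximizing a sequential evidence lower bound of roughly the following shape (a standard formulation for latent-variable models over image sequences, not necessarily the paper's exact objective); the critic and actor are then trained on the inferred latents z_t rather than on raw images:

$$
\log p(x_{1:T}\mid a_{1:T-1}) \;\ge\; \mathbb{E}_{q}\left[\sum_{t=1}^{T} \log p(x_t\mid z_t) \;-\; D_{\mathrm{KL}}\big(q(z_t\mid x_{\le t}, a_{<t})\,\|\,p(z_t\mid z_{t-1}, a_{t-1})\big)\right]
$$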
MOROCO: The Moldavian and Romanian Dialectal Corpus
Title | MOROCO: The Moldavian and Romanian Dialectal Corpus |
Authors | Andrei M. Butnaru, Radu Tudor Ionescu |
Abstract | In this work, we introduce the MOldavian and ROmanian Dialectal COrpus (MOROCO), which is freely available for download at https://github.com/butnaruandrei/MOROCO. The corpus contains 33564 samples of text (with over 10 million tokens) collected from the news domain. The samples belong to one of the following six topics: culture, finance, politics, science, sports and tech. The data set is divided into 21719 samples for training, 5921 samples for validation and another 5924 samples for testing. For each sample, we provide corresponding dialectal and category labels. This allows us to perform empirical studies on several classification tasks such as (i) binary discrimination of Moldavian versus Romanian text samples, (ii) intra-dialect multi-class categorization by topic and (iii) cross-dialect multi-class categorization by topic. We perform experiments using a shallow approach based on string kernels, as well as a novel deep approach based on character-level convolutional neural networks containing Squeeze-and-Excitation blocks. We also present and analyze the most discriminative features of our best performing model, before and after named entity removal. |
Tasks | |
Published | 2019-01-19 |
URL | https://arxiv.org/abs/1901.06543v2 |
https://arxiv.org/pdf/1901.06543v2.pdf | |
PWC | https://paperswithcode.com/paper/moroco-the-moldavian-and-romanian-dialectal |
Repo | https://github.com/butnaruandrei/MOROCO |
Framework | none |
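The abstract's deep baseline is a character-level CNN with Squeeze-and-Excitation (SE) blocks. Below is a minimal SE block for 1D (character-level) feature maps in the standard form; the channel count and reduction ratio are placeholders, not the authors' configuration.

```python
import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    """Squeeze-and-Excitation block for 1D feature maps: channel descriptors
    are squeezed by global average pooling over the sequence and used to
    re-weight the channels."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):           # x: (batch, channels, length)
        s = x.mean(dim=2)           # squeeze: global average pool over length
        w = self.fc(s)              # excitation: per-channel weights in (0, 1)
        return x * w.unsqueeze(2)   # re-scale the feature map channel-wise

# toy usage: 8 texts, 128 conv channels, 256 character positions
se = SEBlock1d(128)
print(se(torch.randn(8, 128, 256)).shape)  # torch.Size([8, 128, 256])
```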
Subspace Inference for Bayesian Deep Learning
Title | Subspace Inference for Bayesian Deep Learning |
Authors | Pavel Izmailov, Wesley J. Maddox, Polina Kirichenko, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson |
Abstract | Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, which contain diverse sets of high performing models. In these subspaces, we are able to apply elliptical slice sampling and variational inference, which struggle in the full parameter space. We show that Bayesian model averaging over the induced posterior in these subspaces produces accurate predictions and well calibrated predictive uncertainty for both regression and image classification. |
Tasks | Bayesian Inference, Image Classification |
Published | 2019-07-17 |
URL | https://arxiv.org/abs/1907.07504v1 |
https://arxiv.org/pdf/1907.07504v1.pdf | |
PWC | https://paperswithcode.com/paper/subspace-inference-for-bayesian-deep-learning |
Repo | https://github.com/wjmaddox/drbayes |
Framework | pytorch |
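The subspace construction described in the abstract — "first principal components of the SGD trajectory" — reduces inference to a handful of dimensions. A minimal sketch of that construction and the map back to full weight space; checkpoint collection and the sampler (elliptical slice sampling or VI) are omitted, and names are illustrative.

```python
import numpy as np

def build_subspace(sgd_iterates, k=5):
    """Construct a k-dimensional affine subspace of weight space from a set
    of SGD iterates (one flattened weight vector per row).
    Returns the mean weight vector and a (k, D) projection basis."""
    W = np.asarray(sgd_iterates)             # (T, D)
    w_mean = W.mean(axis=0)
    deviations = W - w_mean
    # top-k right singular vectors span the subspace
    _, _, Vt = np.linalg.svd(deviations, full_matrices=False)
    return w_mean, Vt[:k]                    # (D,), (k, D)

def subspace_to_weights(z, w_mean, basis):
    """Map a low-dimensional point z of shape (k,) back to full weight space."""
    return w_mean + z @ basis

# toy usage: 20 'checkpoints' of a 1000-parameter model, 5-dim subspace
ckpts = np.random.randn(20, 1000)
w_mean, basis = build_subspace(ckpts, k=5)
w = subspace_to_weights(np.random.randn(5), w_mean, basis)
print(w.shape)  # (1000,)
```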
Towards Debugging Deep Neural Networks by Generating Speech Utterances
Title | Towards Debugging Deep Neural Networks by Generating Speech Utterances |
Authors | Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki |
Abstract | Deep neural networks (DNNs) are able to successfully process and classify speech utterances. However, understanding the reason behind a classification by a DNN is difficult. One such debugging method used with image classification DNNs is activation maximization, which generates example images that are classified as one of the classes. In this work, we evaluate the applicability of this method to speech utterance classifiers as a means of understanding what the DNN “listens to”. We train a classifier using the speech command corpus and then use activation maximization to pull samples from the trained model. We then synthesize audio from the features using a WaveNet vocoder for subjective analysis. We measure the quality of generated samples with objective measurements and crowd-sourced human evaluations. Results show that, when combined with the prior of natural speech, activation maximization can be used to generate examples of different classes. Based on these results, activation maximization can be used to start opening up the DNN black box in speech tasks. |
Tasks | Image Classification |
Published | 2019-07-06 |
URL | https://arxiv.org/abs/1907.03164v1 |
https://arxiv.org/pdf/1907.03164v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-debugging-deep-neural-networks-by |
Repo | https://github.com/bilalsoomro/debugging-deep-neural-networks |
Framework | pytorch |
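Activation maximization itself is the standard gradient-ascent procedure below; the classifier, feature shape, and hyper-parameters are stand-ins, and the natural-speech prior and WaveNet vocoder used in the paper are omitted.

```python
import torch

def activation_maximization(model, target_class, input_shape, steps=200, lr=0.05):
    """Gradient ascent on the input features to maximize the logit of
    `target_class` of a trained classifier."""
    x = torch.zeros(1, *input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(x)[0, target_class]
        (-score).backward()      # ascend the class logit
        optimizer.step()
    return x.detach()

# toy usage with a stand-in classifier over 40x101 log-mel-style features
toy_model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(40 * 101, 12))
features = activation_maximization(toy_model, target_class=3, input_shape=(40, 101))
print(features.shape)  # torch.Size([1, 40, 101])
```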
A Semi-Supervised Approach for Low-Resourced Text Generation
Title | A Semi-Supervised Approach for Low-Resourced Text Generation |
Authors | Hongyu Zang, Xiaojun Wan |
Abstract | Recently, encoder-decoder neural models have achieved great success on text generation tasks. However, one problem with this kind of model is that its performance is usually limited by the scale of well-labeled data, which is very expensive to obtain. The low-resource (labeled data) problem is quite common across text generation tasks, but unlabeled data are usually abundant. In this paper, we propose a method that makes use of unlabeled data to improve the performance of such models in low-resource circumstances. We use a denoising auto-encoder (DAE) and language model (LM) based reinforcement learning (RL) to enhance the training of the encoder and decoder with unlabeled data. Our method shows adaptability across different text generation tasks, and makes significant improvements over basic text generation models. |
Tasks | Denoising, Language Modelling, Text Generation |
Published | 2019-06-03 |
URL | https://arxiv.org/abs/1906.00584v1 |
https://arxiv.org/pdf/1906.00584v1.pdf | |
PWC | https://paperswithcode.com/paper/190600584 |
Repo | https://github.com/zhyack/UDRG |
Framework | tf |
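The DAE component needs a corruption function so the model can be trained to reconstruct clean text from noisy input on unlabeled data. A common choice is word dropout plus local shuffling, sketched below; the exact noise model the authors use may differ.

```python
import random

def corrupt(tokens, drop_prob=0.1, max_shuffle_dist=3):
    """Noise function for training a denoising auto-encoder on text:
    randomly drop words, then locally shuffle the rest within a small window."""
    kept = [t for t in tokens if random.random() > drop_prob]
    # local shuffle: sort by position plus a bounded random offset
    keys = [i + random.uniform(0, max_shuffle_dist) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

print(corrupt("the model is trained on unlabeled data".split()))
```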
Word Embeddings for the Armenian Language: Intrinsic and Extrinsic Evaluation
Title | Word Embeddings for the Armenian Language: Intrinsic and Extrinsic Evaluation |
Authors | Karen Avetisyan, Tsolak Ghukasyan |
Abstract | In this work, we intrinsically and extrinsically evaluate and compare existing word embedding models for the Armenian language. Alongside, we present new embeddings trained using the GloVe, fastText, CBOW and SkipGram algorithms. We adapt and use the word analogy task for the intrinsic evaluation of embeddings. For extrinsic evaluation, two tasks are employed: morphological tagging and text classification. Tagging is performed with a deep neural network, using the ArmTDP v2.3 dataset. For text classification, we propose a corpus of news articles categorized into 7 classes. The datasets are made public to serve as benchmarks for future models. |
Tasks | Morphological Tagging, Text Classification, Word Embeddings |
Published | 2019-06-07 |
URL | https://arxiv.org/abs/1906.03134v1 |
https://arxiv.org/pdf/1906.03134v1.pdf | |
PWC | https://paperswithcode.com/paper/word-embeddings-for-the-armenian-language |
Repo | https://github.com/ispras-texterra/word-embeddings-eval-hy |
Framework | none |
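The word analogy task used for intrinsic evaluation is the standard 3CosAdd query ("a is to b as c is to ?"). A minimal sketch, assuming `emb` maps words to unit-normalized vectors; the query words are excluded from the candidate set, as is conventional.

```python
import numpy as np

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' by 3CosAdd over a dict of
    unit-normalized word vectors."""
    target = emb[b] - emb[a] + emb[c]
    target /= np.linalg.norm(target)
    best, best_sim = None, -np.inf
    for w, v in emb.items():
        if w in (a, b, c):
            continue
        sim = float(np.dot(v, target))
        if sim > best_sim:
            best, best_sim = w, sim
    return best
```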
Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications
Title | Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications |
Authors | Wei Zhao, Haiyun Peng, Steffen Eger, Erik Cambria, Min Yang |
Abstract | Obstacles hindering the development of capsule networks for challenging NLP applications include poor scalability to large output spaces and less reliable routing processes. In this paper, we introduce: 1) an agreement score to evaluate the performance of routing processes at the instance level; 2) an adaptive optimizer to enhance the reliability of routing; 3) capsule compression and partial routing to improve the scalability of capsule networks. We validate our approach on two NLP tasks, namely multi-label text classification and question answering. Experimental results show that our approach considerably improves over strong competitors on both tasks. In addition, we obtain the best results in low-resource settings with few training instances. |
Tasks | Multi-Label Text Classification, Question Answering, Text Classification |
Published | 2019-06-06 |
URL | https://arxiv.org/abs/1906.02829v1 |
https://arxiv.org/pdf/1906.02829v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-scalable-and-reliable-capsule |
Repo | https://github.com/andyweizhao/NLP-Capsule |
Framework | pytorch |
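For context, the sketch below shows standard routing-by-agreement with a top-k restriction per input capsule — one simplified reading of the "partial routing" idea in the abstract; the authors' exact scheme, compression step, and adaptive optimizer are not reproduced here.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    n2 = np.sum(s ** 2, axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def partial_routing(u_hat, k=3, iters=3):
    """Routing-by-agreement restricted to each input capsule's top-k output
    capsules (selected by prediction-vector norm).
    u_hat : (num_in, num_out, d) prediction vectors."""
    num_in, num_out, d = u_hat.shape
    b = np.zeros((num_in, num_out))
    # mask: each input capsule only routes to its k strongest predictions
    norms = np.linalg.norm(u_hat, axis=-1)
    mask = np.zeros_like(norms)
    topk = np.argsort(-norms, axis=1)[:, :k]
    np.put_along_axis(mask, topk, 1.0, axis=1)
    for _ in range(iters):
        logits = np.where(mask > 0, b, -1e9)
        c = np.exp(logits - logits.max(axis=1, keepdims=True))
        c /= c.sum(axis=1, keepdims=True)
        s = np.einsum('io,iod->od', c, u_hat)     # weighted sum per output capsule
        v = squash(s)                              # (num_out, d) output capsules
        b += np.einsum('iod,od->io', u_hat, v)     # agreement update
    return v

print(partial_routing(np.random.randn(10, 6, 8), k=3).shape)  # (6, 8)
```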
Learning Embeddings into Entropic Wasserstein Spaces
Title | Learning Embeddings into Entropic Wasserstein Spaces |
Authors | Charlie Frogner, Farzaneh Mirzazadeh, Justin Solomon |
Abstract | Euclidean embeddings of data are fundamentally limited in their ability to capture latent semantic structures, which need not conform to Euclidean spatial assumptions. Here we consider an alternative, which embeds data as discrete probability distributions in a Wasserstein space, endowed with an optimal transport metric. Wasserstein spaces are much larger and more flexible than Euclidean spaces, in that they can successfully embed a wider variety of metric structures. We exploit this flexibility by learning an embedding that captures semantic information in the Wasserstein distance between embedded distributions. We examine empirically the representational capacity of our learned Wasserstein embeddings, showing that they can embed a wide variety of metric structures with smaller distortion than an equivalent Euclidean embedding. We also investigate an application to word embedding, demonstrating a unique advantage of Wasserstein embeddings: We can visualize the high-dimensional embedding directly, since it is a probability distribution on a low-dimensional space. This obviates the need for dimensionality reduction techniques like t-SNE for visualization. |
Tasks | Dimensionality Reduction |
Published | 2019-05-08 |
URL | https://arxiv.org/abs/1905.03329v1 |
https://arxiv.org/pdf/1905.03329v1.pdf | |
PWC | https://paperswithcode.com/paper/190503329 |
Repo | https://github.com/gabsens/Learning-Embeddings-into-Entropic-Wasserstein-Spaces-ENSAE |
Framework | none |
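The entropic Wasserstein distance between two embedded discrete distributions is computed with Sinkhorn iterations. A self-contained sketch with a squared-Euclidean ground cost; the regularization strength and iteration count are illustrative, not the paper's settings.

```python
import numpy as np

def sinkhorn_distance(a, b, X, Y, reg=0.1, iters=200):
    """Entropy-regularized optimal transport cost between discrete
    distributions: weights a over support points X and weights b over Y."""
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** 2  # cost matrix
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]       # transport plan
    return float(np.sum(P * C))

# toy usage: two uniform 5-point distributions in 2-D
a = np.full(5, 0.2); b = np.full(5, 0.2)
print(sinkhorn_distance(a, b, np.random.randn(5, 2), np.random.randn(5, 2)))
```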
Translate-to-Recognize Networks for RGB-D Scene Recognition
Title | Translate-to-Recognize Networks for RGB-D Scene Recognition |
Authors | Dapeng Du, Limin Wang, Huiling Wang, Kai Zhao, Gangshan Wu |
Abstract | Cross-modal transfer is helpful to enhance modality-specific discriminative power for scene recognition. To this end, this paper presents a unified framework that integrates the tasks of cross-modal translation and modality-specific recognition, termed the Translate-to-Recognize Network (TRecgNet). Specifically, both the translation and recognition tasks share the same encoder network, which allows us to explicitly regularize the training of the recognition task with the help of translation, and thus improve its final generalization ability. For the translation task, we place a decoder module on top of the encoder network, optimized with a new layer-wise semantic loss, while for the recognition task, we use a linear classifier based on the feature embedding from the encoder, trained with the standard cross-entropy loss. In addition, our TRecgNet allows us to exploit large amounts of unlabeled RGB-D data to train the translation task and thus improve the representation power of the encoder network. Empirically, we verify that this new semi-supervised setting is able to further enhance the performance of the recognition network. We perform experiments on two RGB-D scene recognition benchmarks: NYU Depth v2 and SUN RGB-D, demonstrating that TRecgNet achieves superior performance to existing state-of-the-art methods, especially for recognition solely based on a single modality. |
Tasks | Scene Recognition |
Published | 2019-04-28 |
URL | http://arxiv.org/abs/1904.12254v1 |
http://arxiv.org/pdf/1904.12254v1.pdf | |
PWC | https://paperswithcode.com/paper/translate-to-recognize-networks-for-rgb-d |
Repo | https://github.com/ownstyledu/Translate-to-Recognize-Networks |
Framework | pytorch |
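A schematic of the shared-encoder design the abstract describes: one encoder feeds both a translation decoder (e.g. RGB to depth) and a linear classifier, and the two losses are summed. Layer sizes, the number of classes, and the simple L1 translation loss (standing in for the paper's layer-wise semantic loss) are placeholders, not the authors' architecture.

```python
import torch
import torch.nn as nn

class TwoHeadNet(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                                     nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                                     nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1))
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        feat = self.encoder(x)                           # shared representation
        translated = self.decoder(feat)                  # cross-modal translation head
        logits = self.classifier(feat.mean(dim=(2, 3)))  # global pool + linear classifier
        return translated, logits

# joint objective: recognition loss + translation loss
net = TwoHeadNet()
rgb, depth, label = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64), torch.tensor([0, 3])
translated, logits = net(rgb)
loss = nn.functional.cross_entropy(logits, label) + nn.functional.l1_loss(translated, depth)
```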
Look globally, age locally: Face aging with an attention mechanism
Title | Look globally, age locally: Face aging with an attention mechanism |
Authors | Haiping Zhu, Zhizhong Huang, Hongming Shan, Junping Zhang |
Abstract | Face aging is of great importance for cross-age recognition and entertainment-related applications. Recently, conditional generative adversarial networks (cGANs) have achieved impressive results for face aging. Existing cGAN-based methods usually require a pixel-wise loss to keep the identity and background consistent. However, minimizing the pixel-wise loss between the input and synthesized images likely results in a ghosted or blurry face. To address this deficiency, this paper introduces an Attention Conditional GANs (AcGANs) approach for face aging, which utilizes an attention mechanism to alter only the regions relevant to face aging. In doing so, the synthesized face can well preserve the background information and personal identity without using the pixel-wise loss, and ghost artifacts and blurriness can be significantly reduced. On the benchmark Morph dataset, both qualitative and quantitative experimental results demonstrate superior performance over existing algorithms in terms of image quality, personal identity, and age accuracy. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.12771v1 |
https://arxiv.org/pdf/1910.12771v1.pdf | |
PWC | https://paperswithcode.com/paper/look-globally-age-locally-face-aging-with-an |
Repo | https://github.com/JensonZhu14/AcGAN |
Framework | pytorch |
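The compositing step implied by the abstract follows the usual attention-GAN pattern: the generator outputs an aged face and a soft attention mask, and only the attended regions are altered while the rest is copied from the input. A minimal sketch with assumed tensor shapes; the full generator and discriminator are not shown.

```python
import torch

def attention_composite(input_face, generated, attention_mask):
    """input_face, generated: (N, 3, H, W); attention_mask: (N, 1, H, W) in (0, 1).
    Regions with mask near 1 take the generated (aged) pixels; regions with
    mask near 0 keep the original input, preserving background and identity."""
    return attention_mask * generated + (1.0 - attention_mask) * input_face
```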
Probabilistic Relational Agent-based Models
Title | Probabilistic Relational Agent-based Models |
Authors | Paul Cohen |
Abstract | PRAM puts agent-based models on a sound probabilistic footing as a basis for integrating agent-based and probabilistic models. It extends the themes of probabilistic relational models and lifted inference to incorporate dynamical models and simulation. It can also be much more efficient than agent-based simulation. |
Tasks | |
Published | 2019-02-15 |
URL | http://arxiv.org/abs/1902.05677v1 |
http://arxiv.org/pdf/1902.05677v1.pdf | |
PWC | https://paperswithcode.com/paper/probabilistic-relational-agent-based-models |
Repo | https://github.com/momacs/pram |
Framework | none |
Multi-fidelity classification using Gaussian processes: accelerating the prediction of large-scale computational models
Title | Multi-fidelity classification using Gaussian processes: accelerating the prediction of large-scale computational models |
Authors | Francisco Sahli Costabal, Paris Perdikaris, Ellen Kuhl, Daniel E. Hurtado |
Abstract | Machine learning techniques typically rely on large datasets to create accurate classifiers. However, there are situations when data is scarce and expensive to acquire. This is the case for studies that rely on state-of-the-art computational models, which typically take days to run, thus hindering the potential of machine learning tools. In this work, we present a novel classifier that takes advantage of lower-fidelity models and inexpensive approximations to predict the binary output of expensive computer simulations. We postulate an autoregressive model between the different levels of fidelity with Gaussian process priors. We adopt a fully Bayesian treatment for the hyper-parameters and use Markov chain Monte Carlo samplers. We take advantage of the probabilistic nature of the classifier to implement active learning strategies. We also introduce a sparse approximation to enhance the ability of the multi-fidelity classifier to handle large datasets. We test these multi-fidelity classifiers against their single-fidelity counterpart with synthetic data, showing a median computational cost reduction of 23% for a target accuracy of 90%. In an application to cardiac electrophysiology, the multi-fidelity classifier achieves an F1 score, the harmonic mean of precision and recall, of 99.6% compared to 74.1% for a single-fidelity classifier when both are trained with 50 samples. In general, our results show that the multi-fidelity classifiers outperform their single-fidelity counterparts in terms of accuracy in all cases. We envision that this new tool will enable researchers to study classification problems that would otherwise be prohibitively expensive. Source code is available at https://github.com/fsahli/MFclass. |
Tasks | Active Learning, Gaussian Processes |
Published | 2019-05-09 |
URL | https://arxiv.org/abs/1905.03406v1 |
https://arxiv.org/pdf/1905.03406v1.pdf | |
PWC | https://paperswithcode.com/paper/190503406 |
Repo | https://github.com/fsahli/MFclass |
Framework | none |
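The autoregressive model between fidelity levels mentioned in the abstract is commonly written in the following Kennedy-O'Hagan style (a generic formulation; for classification the paper places this structure on a latent function that is then passed through a link function to produce binary outputs):

$$
f_{\mathrm{high}}(x) \;=\; \rho\, f_{\mathrm{low}}(x) \;+\; \delta(x), \qquad f_{\mathrm{low}} \sim \mathcal{GP}\big(0, k_{\mathrm{low}}\big), \quad \delta \sim \mathcal{GP}\big(0, k_{\delta}\big)
$$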