January 31, 2020

Paper Group AWR 371

Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?


Title	Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
Authors	Rameen Abdal, Yipeng Qin, Peter Wonka
Abstract	We propose an efficient algorithm to embed a given image into the latent space of StyleGAN. This embedding enables semantic image editing operations that can be applied to existing photographs. Taking the StyleGAN trained on the FFHQ dataset as an example, we show results for image morphing, style transfer, and expression transfer. Studying the results of the embedding algorithm provides valuable insights into the structure of the StyleGAN latent space. We propose a set of experiments to test what class of images can be embedded, how they are embedded, what latent space is suitable for embedding, and if the embedding is semantically meaningful.
Tasks	Image Morphing, Style Transfer
Published	2019-04-05
URL	https://arxiv.org/abs/1904.03189v2
PDF	https://arxiv.org/pdf/1904.03189v2.pdf
PWC	https://paperswithcode.com/paper/image2stylegan-how-to-embed-images-into-the
Repo	https://github.com/pacifinapacific/StyleGAN_LatentEditor
Framework	pytorch
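
The abstract lists image morphing among the edits enabled by the embedding but does not say how it is computed. One common approach, assumed here for illustration, is to interpolate linearly between the latent codes of two embedded images and render each intermediate code with the generator. The sketch below covers only the interpolation step; the names `interpolate_latents` and `morph_path` are hypothetical, and plain Python lists stand in for latent tensors.

```python
def interpolate_latents(w_a, w_b, lam):
    """Blend two latent codes; lam=0 returns w_a, lam=1 returns w_b."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(w_a) != len(w_b):
        raise ValueError("latent codes must have the same length")
    return [(1.0 - lam) * a + lam * b for a, b in zip(w_a, w_b)]


def morph_path(w_a, w_b, steps):
    """Return `steps` evenly spaced codes from w_a to w_b (steps >= 2)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [interpolate_latents(w_a, w_b, i / (steps - 1)) for i in range(steps)]


if __name__ == "__main__":
    # Toy 2-dimensional codes; real StyleGAN codes are much larger.
    for code in morph_path([0.0, 2.0], [4.0, 6.0], steps=3):
        print(code)
```

Each code along the path would then be passed to the generator to produce one frame of the morph; that step depends on the specific StyleGAN implementation and is left out.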

E-LSTM-D: A Deep Learning Framework for Dynamic Network Link Prediction


Title	E-LSTM-D: A Deep Learning Framework for Dynamic Network Link Prediction
Authors	Jinyin Chen, Jian Zhang, Xuanheng Xu, Chengbo Fu, Dan Zhang, Qingpeng Zhang, Qi Xuan
Abstract	Predicting the potential relations between nodes in networks, known as link prediction, has long been a challenge in network science. However, most studies have focused on link prediction in static networks, while real-world networks always evolve over time with the occurrence and vanishing of nodes and links. Dynamic network link prediction has thus been attracting more and more attention since it can better capture the evolving nature of networks, but most algorithms still fail to achieve satisfactory prediction accuracy. Motivated by the excellent performance of Long Short-Term Memory (LSTM) in processing time series, in this paper, we propose a novel Encoder-LSTM-Decoder (E-LSTM-D) deep learning model to predict dynamic links end to end. It can handle long-term prediction problems, and suits networks of different scales with a fine-tuned structure. To the best of our knowledge, it is the first time that LSTM, together with an encoder-decoder architecture, is applied to link prediction in dynamic networks. This new model is able to automatically learn structural and temporal features in a unified framework, which can predict links that have never appeared in the network before. Extensive experiments show that our E-LSTM-D model significantly outperforms newly proposed dynamic network link prediction methods and obtains state-of-the-art results.
Tasks	Link Prediction, Time Series
Published	2019-02-22
URL	http://arxiv.org/abs/1902.08329v1
PDF	http://arxiv.org/pdf/1902.08329v1.pdf
PWC	https://paperswithcode.com/paper/e-lstm-d-a-deep-learning-framework-for
Repo	https://github.com/jianz94/e-lstm-d
Framework	tf

Multiple Generative Models Ensemble for Knowledge-Driven Proactive Human-Computer Dialogue Agent


Title	Multiple Generative Models Ensemble for Knowledge-Driven Proactive Human-Computer Dialogue Agent
Authors	Zelin Dai, Weitang Liu, Hao Zhang, Minghao Zhu, Long Wang
Abstract	Multiple sequence-to-sequence models were used to establish an end-to-end multi-turn proactive dialogue generation agent, with the aid of data augmentation techniques and variant encoder-decoder structure designs. A rank-based ensemble approach was developed for boosting performance. Results indicate that our single model, on average, clearly improves over the baseline by 18.67% in terms of F1-score and BLEU on the DuConv dataset. In particular, the ensemble methods further outperform the baseline significantly, by 35.85%.
Tasks	Data Augmentation, Dialogue Generation
Published	2019-07-08
URL	https://arxiv.org/abs/1907.03590v1
PDF	https://arxiv.org/pdf/1907.03590v1.pdf
PWC	https://paperswithcode.com/paper/multiple-generative-models-ensemble-for
Repo	https://github.com/lonePatient/knowledge-driven-dialogue-lic2019-rank5
Framework	pytorch

Residual or Gate? Towards Deeper Graph Neural Networks for Inductive Graph Representation Learning


Title	Residual or Gate? Towards Deeper Graph Neural Networks for Inductive Graph Representation Learning
Authors	Binxuan Huang, Kathleen M. Carley
Abstract	In this paper, we study the problem of node representation learning with graph neural networks. We present a graph neural network class named recurrent graph neural network (RGNN) that addresses the shortcomings of prior methods. By using recurrent units to capture the long-term dependency across layers, our methods can successfully identify important information during recursive neighborhood expansion. In our experiments, we show that our model class achieves state-of-the-art results on three benchmarks: the Pubmed, Reddit, and PPI network datasets. Our in-depth analyses also demonstrate that incorporating recurrent units is a simple yet effective method to prevent noisy information in graphs, which enables a deeper graph neural network.
Tasks	Graph Representation Learning, Representation Learning
Published	2019-04-17
URL	https://arxiv.org/abs/1904.08035v3
PDF	https://arxiv.org/pdf/1904.08035v3.pdf
PWC	https://paperswithcode.com/paper/inductive-graph-representation-learning-with
Repo	https://github.com/binxuan/Recurrent-Graph-Neural-Network
Framework	none

LMLFM: Longitudinal Multi-Level Factorization Machine


Title	LMLFM: Longitudinal Multi-Level Factorization Machine
Authors	Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar
Abstract	We consider the problem of learning predictive models from longitudinal data, consisting of irregularly repeated, sparse observations from a set of individuals over time. Such data often exhibit longitudinal correlation (LC) (correlations among observations for each individual over time), cluster correlation (CC) (correlations among individuals that have similar characteristics), or both. These correlations are often accounted for using mixed effects models that include fixed effects and random effects, where the fixed effects capture the regression parameters that are shared by all individuals, whereas random effects capture those parameters that vary across individuals. However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables while accounting for complex correlation structure in the data and non-linear interactions among the variables. We propose the Longitudinal Multi-Level Factorization Machine (LMLFM), to the best of our knowledge the first model to address these challenges in learning predictive models from longitudinal data. We establish the convergence properties, and analyze the computational complexity, of LMLFM. We present results of experiments with both simulated and real-world longitudinal data which show that LMLFM outperforms the state-of-the-art methods in terms of predictive accuracy, variable selection ability, and scalability to data with a large number of variables. The code and supplemental material are available at https://github.com/junjieliang672/LMLFM.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04062v2
PDF	https://arxiv.org/pdf/1911.04062v2.pdf
PWC	https://paperswithcode.com/paper/lmlfm-longitudinal-multi-level-factorization
Repo	https://github.com/junjieliang672/LMLFM
Framework	pytorch
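
For readers unfamiliar with the mixed effects models the abstract refers to, the textbook linear form is sketched below. This is the standard formulation, not the LMLFM model itself, and the symbols ($y_{it}$, $x_{it}$, $z_{it}$, $\beta$, $b_i$, $\Sigma$, $\sigma^2$) are notation assumed here rather than taken from the paper.

```latex
% Observation t of individual i:
%   x_{it}: covariates with fixed effects, z_{it}: covariates with random effects
\begin{equation}
  y_{it} = x_{it}^{\top} \beta + z_{it}^{\top} b_i + \varepsilon_{it},
  \qquad b_i \sim \mathcal{N}(0, \Sigma),
  \qquad \varepsilon_{it} \sim \mathcal{N}(0, \sigma^{2})
\end{equation}
```

Here $\beta$ is shared by all individuals (the fixed effects), while each individual $i$ has its own $b_i$ (the random effects). Because every observation of individual $i$ shares the same $b_i$, the model induces correlation among that individual's observations over time.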

An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription


Title	An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription
Authors	Catalin Zorila, Christoph Boeddeker, Rama Doddipatla, Reinhold Haeb-Umbach
Abstract	Despite the strong modeling power of neural network acoustic models, speech enhancement has been shown to deliver additional word error rate improvements if multi-channel data is available. However, there has been a longstanding debate whether enhancement should also be carried out on the ASR training data. In an extensive experimental evaluation on the acoustically very challenging CHiME-5 dinner party data we show that: (i) cleaning up the training data can lead to substantial error rate reductions, and (ii) enhancement in training is advisable as long as enhancement in test is at least as strong as in training. This approach stands in contrast to, and delivers larger gains than, the common strategy reported in the literature of augmenting the training database with additional artificially degraded speech. Together with an acoustic model topology consisting of initial CNN layers followed by factorized TDNN layers, we achieve 41.6% and 43.2% WER on the DEV and EVAL test sets, respectively, a new single-system state-of-the-art result on the CHiME-5 data. This is an 8% relative improvement compared to the best word error rate published so far for a speech recognizer without system combination.
Tasks	Speech Enhancement
Published	2019-09-26
URL	https://arxiv.org/abs/1909.12208v1
PDF	https://arxiv.org/pdf/1909.12208v1.pdf
PWC	https://paperswithcode.com/paper/an-investigation-into-the-effectiveness-of
Repo	https://github.com/fgnt/pb_chime5
Framework	none

Predicting Model Failure using Saliency Maps in Autonomous Driving Systems


Title	Predicting Model Failure using Saliency Maps in Autonomous Driving Systems
Authors	Sina Mohseni, Akshay Jagadeesh, Zhangyang Wang
Abstract	While machine learning systems show a high success rate in many complex tasks, research shows they can also fail in very unexpected situations. The rise of machine learning products in safety-critical industries has increased attention to evaluating model robustness and estimating failure probability in machine learning systems. In this work, we propose a design to train a student model – a failure predictor – to predict the main model’s error for input instances based on their saliency map. We implement and review the preliminary results of our failure predictor model on an autonomous vehicle steering control system as an example of safety-critical applications.
Tasks	Autonomous Driving, Steering Control
Published	2019-05-19
URL	https://arxiv.org/abs/1905.07679v1
PDF	https://arxiv.org/pdf/1905.07679v1.pdf
PWC	https://paperswithcode.com/paper/predicting-model-failure-using-saliency-maps
Repo	https://github.com/SinaMohseni/Saliency-Based-Failure-prediction-for-Autonomous-Vehicle
Framework	tf

Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion


Title	Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion
Authors	Philipp Christmann, Rishiraj Saha Roy, Abdalghani Abujabal, Jyotsna Singh, Gerhard Weikum
Abstract	Fact-centric information needs are rarely one-shot; users typically ask follow-up questions to explore a topic. In such a conversational setting, the user’s inputs are often incomplete, with entities or predicates left out, and ungrammatical phrases. This poses a huge challenge to question answering (QA) systems that typically rely on cues in full-fledged interrogative sentences. As a solution, we develop CONVEX: an unsupervised method that can answer incomplete questions over a knowledge graph (KG) by maintaining conversation context using entities and predicates seen so far and automatically inferring missing or ambiguous pieces for follow-up questions. The core of our method is a graph exploration algorithm that judiciously expands a frontier to find candidate answers for the current question. To evaluate CONVEX, we release ConvQuestions, a crowdsourced benchmark with 11,200 distinct conversations from five different domains. We show that CONVEX: (i) adds conversational support to any stand-alone QA system, and (ii) outperforms state-of-the-art baselines and question completion strategies.
Tasks	Knowledge Graphs, Question Answering
Published	2019-10-08
URL	https://arxiv.org/abs/1910.03262v3
PDF	https://arxiv.org/pdf/1910.03262v3.pdf
PWC	https://paperswithcode.com/paper/look-before-you-hop-conversational-question
Repo	https://github.com/PhilippChr/CONVEX
Framework	none

SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images


Title	SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images
Authors	Haifeng Li, Kaijian Qiu, Li Chen, Xiaoming Mei, Liang Hong, Chao Tao
Abstract	High-resolution remote sensing images (HRRSIs) contain substantial ground object information, such as texture, shape, and spatial location. Semantic segmentation, which is an important method for element extraction, has been widely used in processing mass HRRSIs. However, HRRSIs often exhibit large intraclass variance and small interclass variance due to the diversity and complexity of ground objects, thereby bringing great challenges to a semantic segmentation task. In this study, we propose a new end-to-end semantic segmentation network, which integrates two lightweight attention mechanisms that can refine features adaptively. We compare our method with several previous advanced networks on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method can achieve better semantic segmentation results compared with other works. The source codes are available at https://github.com/lehaifeng/SCAttNet.
Tasks	Semantic Segmentation
Published	2019-12-19
URL	https://arxiv.org/abs/1912.09121v1
PDF	https://arxiv.org/pdf/1912.09121v1.pdf
PWC	https://paperswithcode.com/paper/scattnet-semantic-segmentation-network-with
Repo	https://github.com/lehaifeng/SCAttNet
Framework	tf

Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control


Title	Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control
Authors	Mo Yu, Shiyu Chang, Yang Zhang, Tommi S. Jaakkola
Abstract	Selective rationalization has become a common mechanism to ensure that predictive models reveal how they use any available features. The selection may be soft or hard, and identifies a subset of input features relevant for prediction. The setup can be viewed as a co-operative game between the selector (aka rationale generator) and the predictor making use of only the selected features. The co-operative setting may, however, be compromised for two reasons. First, the generator typically has no direct access to the outcome it aims to justify, resulting in poor performance. Second, there’s typically no control exerted on the information left outside the selection. We revise the overall co-operative framework to address these challenges. We introduce an introspective model which explicitly predicts and incorporates the outcome into the selection process. Moreover, we explicitly control the rationale complement via an adversary so as not to leave any useful information out of the selection. We show that the two complementary mechanisms both maintain high predictive accuracy and lead to comprehensive rationales.
Tasks
Published	2019-10-29
URL	https://arxiv.org/abs/1910.13294v2
PDF	https://arxiv.org/pdf/1910.13294v2.pdf
PWC	https://paperswithcode.com/paper/191013294
Repo	https://github.com/Gorov/three_player_for_emnlp
Framework	pytorch

Capsule Routing via Variational Bayes


Title	Capsule Routing via Variational Bayes
Authors	Fabio De Sousa Ribeiro, Georgios Leontidis, Stefanos Kollias
Abstract	Capsule networks are a recently proposed type of neural network shown to outperform alternatives in challenging shape recognition tasks. In capsule networks, scalar neurons are replaced with capsule vectors or matrices, whose entries represent different properties of objects. The relationships between objects and their parts are learned via trainable viewpoint-invariant transformation matrices, and the presence of a given object is decided by the level of agreement among votes from its parts. This interaction occurs between capsule layers and is a process called routing-by-agreement. In this paper, we propose a new capsule routing algorithm derived from Variational Bayes for fitting a mixture of transforming Gaussians, and show it is possible to transform our capsule network into a Capsule-VAE. Our Bayesian approach addresses some of the inherent weaknesses of MLE-based models, such as variance collapse, by modelling uncertainty over capsule pose parameters. We outperform the state-of-the-art on smallNORB using 50% fewer capsules than previously reported, achieve competitive performances on CIFAR-10, Fashion-MNIST, SVHN, and demonstrate significant improvement in MNIST to affNIST generalisation over previous works.
Tasks
Published	2019-05-27
URL	https://arxiv.org/abs/1905.11455v3
PDF	https://arxiv.org/pdf/1905.11455v3.pdf
PWC	https://paperswithcode.com/paper/capsule-routing-via-variational-bayes
Repo	https://github.com/fabio-deep/Variational-Capsule-Routing
Framework	pytorch

Occ-Traj120: Occupancy Maps with Associated Trajectories


Title	Occ-Traj120: Occupancy Maps with Associated Trajectories
Authors	Tin Lai, Weiming Zhi, Fabio Ramos
Abstract	Trajectory modelling has been the principal research area for understanding and anticipating human behaviour. Predicting the dynamic path by observing the agent and its surrounding environment is essential for applications such as autonomous driving and indoor navigation suggestions. However, despite the numerous studies that have been presented, most available datasets do not contain any information on environmental factors, such as the occupancy representation of the map, which arguably play a significant role in how an agent chooses its trajectory. We present a trajectory dataset with the corresponding occupancy representations of different local maps. The dataset contains more than 120 locally-structured maps with occupancy representation and more than 110K trajectories in total. Each map has a few hundred corresponding simulated trajectories that navigate from a spatial location of a room to another point. The dataset is freely available online.
Tasks	Autonomous Driving
Published	2019-09-05
URL	https://arxiv.org/abs/1909.02333v2
PDF	https://arxiv.org/pdf/1909.02333v2.pdf
PWC	https://paperswithcode.com/paper/occ-traj120-occupancy-maps-with-associated
Repo	https://github.com/soraxas/Occ-Traj120
Framework	none

Mogrifier LSTM


Title	Mogrifier LSTM
Authors	Gábor Melis, Tomáš Kočiský, Phil Blunsom
Abstract	Many advances in Natural Language Processing have been based upon more expressive models for how inputs interact with the context in which they occur. Recurrent networks, which have enjoyed a modicum of success, still lack the generalization and systematicity ultimately required for modelling language. In this work, we propose an extension to the venerable Long Short-Term Memory in the form of mutual gating of the current input and the previous output. This mechanism affords the modelling of a richer space of interactions between inputs and their context. Equivalently, our model can be viewed as making the transition function given by the LSTM context-dependent. Experiments demonstrate markedly improved generalization on language modelling in the range of 3-4 perplexity points on Penn Treebank and Wikitext-2, and 0.01-0.05 bpc on four character-based datasets. We establish a new state of the art on all datasets with the exception of Enwik8, where we close a large gap between the LSTM and Transformer models.
Tasks	Language Modelling
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01792v2
PDF	https://arxiv.org/pdf/1909.01792v2.pdf
PWC	https://paperswithcode.com/paper/mogrifier-lstm
Repo	https://github.com/deepmind/lamb
Framework	tf
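
The "mutual gating of the current input and the previous output" described in the abstract can be sketched as a few alternating rounds in which the previous output rescales the input and the input then rescales the previous output, before both are handed to an ordinary LSTM cell. The code below is a minimal illustration under assumptions, not the reference implementation from the linked repo: the function name `mogrify`, the weight lists `q_mats` and `r_mats`, and the factor 2 on the sigmoid (chosen so that zero weights leave the vectors unchanged) are choices made for this sketch, and plain lists stand in for tensors.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def mogrify(x, h, q_mats, r_mats):
    """Alternately gate x by h (using q_mats) and h by x (using r_mats).

    q_mats[k] has shape (len(x), len(h)); r_mats[k] has shape (len(h), len(x)).
    Rounds alternate q, r, q, r, ..., so q_mats may hold at most one more
    matrix than r_mats.
    """
    if len(q_mats) - len(r_mats) not in (0, 1):
        raise ValueError("need len(q_mats) == len(r_mats) or len(r_mats) + 1")
    x, h = list(x), list(h)
    for i in range(len(q_mats) + len(r_mats)):
        if i % 2 == 0:
            # Even round: the previous output h modulates the input x.
            gate = matvec(q_mats[i // 2], h)
            x = [2.0 * sigmoid(g) * xi for g, xi in zip(gate, x)]
        else:
            # Odd round: the freshly gated input x modulates h.
            gate = matvec(r_mats[i // 2], x)
            h = [2.0 * sigmoid(g) * hi for g, hi in zip(gate, h)]
    return x, h


if __name__ == "__main__":
    x0, h0 = [1.0, -2.0], [0.5, 0.25, 3.0]
    q = [[[0.1, 0.0, -0.2], [0.3, 0.1, 0.0]]]    # one 2x3 matrix
    r = [[[0.2, 0.0], [0.0, -0.1], [0.1, 0.1]]]  # one 3x2 matrix
    print(mogrify(x0, h0, q, r))
```

The gated pair `(x, h)` would then replace the raw input and previous output in a standard LSTM step. With all-zero weight matrices every gate equals 1, so the sketch reduces to a plain LSTM input.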

Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization


Title	Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization
Authors	Chufeng Tang, Lu Sheng, Zhaoxiang Zhang, Xiaolin Hu
Abstract	Pedestrian attribute recognition has been an emerging research topic in the area of video surveillance. To predict the existence of a particular attribute, it is necessary to localize the regions related to the attribute. However, in this task, the region annotations are not available. How to carve out these attribute-related regions remains challenging. Existing methods applied attribute-agnostic visual attention or heuristic body-part localization mechanisms to enhance the local feature representations, while neglecting to employ attributes to define local feature areas. We propose a flexible Attribute Localization Module (ALM) to adaptively discover the most discriminative regions and learn the regional features for each attribute at multiple levels. Moreover, a feature pyramid architecture is also introduced to enhance the attribute-specific localization at low levels with high-level semantic guidance. The proposed framework does not require additional region annotations and can be trained end-to-end with multi-level deep supervision. Extensive experiments show that the proposed method achieves state-of-the-art results on three pedestrian attribute datasets, including PETA, RAP, and PA-100K.
Tasks	Pedestrian Attribute Recognition
Published	2019-10-10
URL	https://arxiv.org/abs/1910.04562v1
PDF	https://arxiv.org/pdf/1910.04562v1.pdf
PWC	https://paperswithcode.com/paper/improving-pedestrian-attribute-recognition
Repo	https://github.com/chufengt/iccv19_attribute
Framework	pytorch

DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning


Title	DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning
Authors	Frederik Harder, Jonas Köhler, Max Welling, Mijung Park
Abstract	Developing a differentially private deep learning algorithm is challenging, due to the difficulty in analyzing the sensitivity of objective functions that are typically used to train deep neural networks. Many existing methods resort to the stochastic gradient descent algorithm and apply a pre-defined sensitivity to the gradients for privatizing weights. However, their slow convergence typically yields a high cumulative privacy loss. Here, we take a different route by employing the method of auxiliary coordinates, which allows us to independently update the weights per layer by optimizing a per-layer objective function. This objective function can be well approximated by a low-order Taylor’s expansion, in which sensitivity analysis becomes tractable. We perturb the coefficients of the expansion for privacy, which we optimize using more advanced optimization routines than SGD for faster convergence. We empirically show that our algorithm provides a decent trained model quality under a modest privacy budget.
Tasks
Published	2019-10-15
URL	https://arxiv.org/abs/1910.06924v1
PDF	https://arxiv.org/pdf/1910.06924v1.pdf
PWC	https://paperswithcode.com/paper/dp-mac-the-differentially-private-method-of
Repo	https://github.com/mijungi/dp_mac
Framework	tf