Paper Group AWR 371
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
Title | Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? |
Authors | Rameen Abdal, Yipeng Qin, Peter Wonka |
Abstract | We propose an efficient algorithm to embed a given image into the latent space of StyleGAN. This embedding enables semantic image editing operations that can be applied to existing photographs. Taking the StyleGAN trained on the FFHQ dataset as an example, we show results for image morphing, style transfer, and expression transfer. Studying the results of the embedding algorithm provides valuable insights into the structure of the StyleGAN latent space. We propose a set of experiments to test what class of images can be embedded, how they are embedded, what latent space is suitable for embedding, and if the embedding is semantically meaningful. |
Tasks | Image Morphing, Style Transfer |
Published | 2019-04-05 |
URL | https://arxiv.org/abs/1904.03189v2 |
https://arxiv.org/pdf/1904.03189v2.pdf | |
PWC | https://paperswithcode.com/paper/image2stylegan-how-to-embed-images-into-the |
Repo | https://github.com/pacifinapacific/StyleGAN_LatentEditor |
Framework | pytorch |
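The abstract does not spell out the embedding algorithm. One common way to embed an image into a generator's latent space is to optimize the latent code by gradient descent until the generated image matches the target. The sketch below illustrates that idea only: the 3-pixel linear `generate` function, the matrix `A`, the plain MSE loss, and the step size are all assumptions standing in for StyleGAN and whatever loss the paper uses. `morph` assumes that morphing is a linear interpolation between two embedded codes.

```python
# Toy stand-in for a generator: a fixed linear map from a 2-D latent code
# to a 3-pixel "image". This is an assumption for illustration, not StyleGAN.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]


def generate(w):
    """Map a latent code w to an image with the toy linear generator."""
    return [sum(a * wi for a, wi in zip(row, w)) for row in A]


def mse(img, target):
    """Mean squared error between two images of equal length."""
    return sum((p - t) ** 2 for p, t in zip(img, target)) / len(target)


def embed(target, steps=200, lr=0.5):
    """Find a latent code whose generated image matches target (gradient descent on MSE)."""
    w = [0.0] * len(A[0])
    n = len(target)
    for _ in range(steps):
        residual = [p - t for p, t in zip(generate(w), target)]
        # Gradient of the MSE with respect to each latent coordinate.
        grad = [2.0 / n * sum(A[k][j] * residual[k] for k in range(n))
                for j in range(len(w))]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def morph(w1, w2, lam):
    """Linearly interpolate between two latent codes (lam=1 gives w1)."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(w1, w2)]
```

With a real generator the gradient would come from automatic differentiation rather than the closed form used here.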
E-LSTM-D: A Deep Learning Framework for Dynamic Network Link Prediction
Title | E-LSTM-D: A Deep Learning Framework for Dynamic Network Link Prediction |
Authors | Jinyin Chen, Jian Zhang, Xuanheng Xu, Chengbo Fu, Dan Zhang, Qingpeng Zhang, Qi Xuan |
Abstract | Predicting the potential relations between nodes in networks, known as link prediction, has long been a challenge in network science. However, most studies have focused on link prediction in static networks, while real-world networks always evolve over time with the occurrence and vanishing of nodes and links. Dynamic network link prediction has thus been attracting more and more attention since it can better capture the evolving nature of networks, but most algorithms still fail to achieve satisfactory prediction accuracy. Motivated by the excellent performance of Long Short-Term Memory (LSTM) in processing time series, in this paper we propose a novel Encoder-LSTM-Decoder (E-LSTM-D) deep learning model to predict dynamic links end to end. It can handle long-term prediction problems, and suits networks of different scales with a fine-tuned structure. To the best of our knowledge, this is the first time that LSTM, together with an encoder-decoder architecture, has been applied to link prediction in dynamic networks. The new model is able to automatically learn structural and temporal features in a unified framework, and can predict links that have never appeared in the network before. Extensive experiments show that our E-LSTM-D model significantly outperforms newly proposed dynamic network link prediction methods and obtains state-of-the-art results. |
Tasks | Link Prediction, Time Series |
Published | 2019-02-22 |
URL | http://arxiv.org/abs/1902.08329v1 |
http://arxiv.org/pdf/1902.08329v1.pdf | |
PWC | https://paperswithcode.com/paper/e-lstm-d-a-deep-learning-framework-for |
Repo | https://github.com/jianz94/e-lstm-d |
Framework | tf |
Multiple Generative Models Ensemble for Knowledge-Driven Proactive Human-Computer Dialogue Agent
Title | Multiple Generative Models Ensemble for Knowledge-Driven Proactive Human-Computer Dialogue Agent |
Authors | Zelin Dai, Weitang Liu, Hao Zhang, Minghao Zhu, Long Wang |
Abstract | Multiple sequence-to-sequence models were used to establish an end-to-end multi-turn proactive dialogue generation agent, with the aid of data augmentation techniques and variant encoder-decoder structure designs. A rank-based ensemble approach was developed to boost performance. Results indicate that our single model, on average, improves F1-score and BLEU over the baseline by 18.67% on the DuConv dataset. In particular, the ensemble methods further outperform the baseline significantly, by 35.85%. |
Tasks | Data Augmentation, Dialogue Generation |
Published | 2019-07-08 |
URL | https://arxiv.org/abs/1907.03590v1 |
https://arxiv.org/pdf/1907.03590v1.pdf | |
PWC | https://paperswithcode.com/paper/multiple-generative-models-ensemble-for |
Repo | https://github.com/lonePatient/knowledge-driven-dialogue-lic2019-rank5 |
Framework | pytorch |
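The abstract names a rank-based ensemble without describing it. The sketch below shows one simple way such an ensemble could work, purely as an assumption: every model scores the same candidate responses, each score list is converted to ranks, and the candidate with the lowest summed rank wins. The function names and the scores in the usage note are hypothetical and are not taken from the paper or its repository.

```python
def rank_positions(scores):
    """Return the rank of each candidate under one model (0 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, idx in enumerate(order):
        ranks[idx] = position
    return ranks


def rank_ensemble(score_table):
    """Pick the candidate with the lowest summed rank across all models.

    score_table holds one list of candidate scores per model.
    Returns (index of the winning candidate, summed ranks per candidate).
    """
    n = len(score_table[0])
    totals = [0] * n
    for scores in score_table:
        for idx, rank in enumerate(rank_positions(scores)):
            totals[idx] += rank
    best = min(range(n), key=lambda i: totals[i])
    return best, totals
```

For example, with three models scoring three candidates as `[[0.9, 0.5, 0.1], [0.2, 0.8, 0.7], [0.3, 0.9, 0.4]]`, the second candidate is selected because two of the three models rank it first. Summing ranks rather than raw scores avoids having to calibrate scores across models.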
Residual or Gate? Towards Deeper Graph Neural Networks for Inductive Graph Representation Learning
Title | Residual or Gate? Towards Deeper Graph Neural Networks for Inductive Graph Representation Learning |
Authors | Binxuan Huang, Kathleen M. Carley |
Abstract | In this paper, we study the problem of node representation learning with graph neural networks. We present a graph neural network class named recurrent graph neural network (RGNN) that addresses the shortcomings of prior methods. By using recurrent units to capture the long-term dependency across layers, our methods can successfully identify important information during recursive neighborhood expansion. In our experiments, we show that our model class achieves state-of-the-art results on three benchmarks: the Pubmed, Reddit, and PPI network datasets. Our in-depth analyses also demonstrate that incorporating recurrent units is a simple yet effective method to prevent noisy information in graphs, which enables a deeper graph neural network. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-04-17 |
URL | https://arxiv.org/abs/1904.08035v3 |
https://arxiv.org/pdf/1904.08035v3.pdf | |
PWC | https://paperswithcode.com/paper/inductive-graph-representation-learning-with |
Repo | https://github.com/binxuan/Recurrent-Graph-Neural-Network |
Framework | none |
LMLFM: Longitudinal Multi-Level Factorization Machine
Title | LMLFM: Longitudinal Multi-Level Factorization Machine |
Authors | Junjie Liang, Dongkuan Xu, Yiwei Sun, Vasant Honavar |
Abstract | We consider the problem of learning predictive models from longitudinal data, consisting of irregularly repeated, sparse observations from a set of individuals over time. Such data often exhibit longitudinal correlation (LC) (correlations among observations for each individual over time), cluster correlation (CC) (correlations among individuals that have similar characteristics), or both. These correlations are often accounted for using mixed effects models that include fixed effects and random effects, where the fixed effects capture the regression parameters that are shared by all individuals, whereas random effects capture those parameters that vary across individuals. However, the current state-of-the-art methods are unable to select the most predictive fixed effects and random effects from a large number of variables, while accounting for complex correlation structure in the data and non-linear interactions among the variables. We propose Longitudinal Multi-Level Factorization Machine (LMLFM), to the best of our knowledge, the first model to address these challenges in learning predictive models from longitudinal data. We establish the convergence properties, and analyze the computational complexity, of LMLFM. We present results of experiments with both simulated and real-world longitudinal data which show that LMLFM outperforms the state-of-the-art methods in terms of predictive accuracy, variable selection ability, and scalability to data with large number of variables. The code and supplemental material is available at https://github.com/junjieliang672/LMLFM. |
Tasks | |
Published | 2019-11-11 |
URL | https://arxiv.org/abs/1911.04062v2 |
https://arxiv.org/pdf/1911.04062v2.pdf | |
PWC | https://paperswithcode.com/paper/lmlfm-longitudinal-multi-level-factorization |
Repo | https://github.com/junjieliang672/LMLFM |
Framework | pytorch |
An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription
Title | An Investigation into the Effectiveness of Enhancement in ASR Training and Test for CHiME-5 Dinner Party Transcription |
Authors | Catalin Zorila, Christoph Boeddeker, Rama Doddipatla, Reinhold Haeb-Umbach |
Abstract | Despite the strong modeling power of neural network acoustic models, speech enhancement has been shown to deliver additional word error rate improvements if multi-channel data is available. However, there has been a longstanding debate whether enhancement should also be carried out on the ASR training data. In an extensive experimental evaluation on the acoustically very challenging CHiME-5 dinner party data we show that: (i) cleaning up the training data can lead to substantial error rate reductions, and (ii) enhancement in training is advisable as long as enhancement in test is at least as strong as in training. This approach stands in contrast to, and delivers larger gains than, the common strategy reported in the literature of augmenting the training database with additional artificially degraded speech. Together with an acoustic model topology consisting of initial CNN layers followed by factorized TDNN layers, we achieve 41.6% and 43.2% WER on the DEV and EVAL test sets, respectively, a new single-system state-of-the-art result on the CHiME-5 data. This is an 8% relative improvement compared to the best word error rate published so far for a speech recognizer without system combination. |
Tasks | Speech Enhancement |
Published | 2019-09-26 |
URL | https://arxiv.org/abs/1909.12208v1 |
https://arxiv.org/pdf/1909.12208v1.pdf | |
PWC | https://paperswithcode.com/paper/an-investigation-into-the-effectiveness-of |
Repo | https://github.com/fgnt/pb_chime5 |
Framework | none |
Predicting Model Failure using Saliency Maps in Autonomous Driving Systems
Title | Predicting Model Failure using Saliency Maps in Autonomous Driving Systems |
Authors | Sina Mohseni, Akshay Jagadeesh, Zhangyang Wang |
Abstract | While machine learning systems show high success rates in many complex tasks, research shows they can also fail in very unexpected situations. The rise of machine learning products in safety-critical industries has increased attention to evaluating model robustness and estimating failure probability in machine learning systems. In this work, we propose a design to train a student model, a failure predictor, to predict the main model’s error for input instances based on their saliency map. We implement and review the preliminary results of our failure predictor model on an autonomous vehicle steering control system as an example of safety-critical applications. |
Tasks | Autonomous Driving, Steering Control |
Published | 2019-05-19 |
URL | https://arxiv.org/abs/1905.07679v1 |
https://arxiv.org/pdf/1905.07679v1.pdf | |
PWC | https://paperswithcode.com/paper/predicting-model-failure-using-saliency-maps |
Repo | https://github.com/SinaMohseni/Saliency-Based-Failure-prediction-for-Autonomous-Vehicle |
Framework | tf |
Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion
Title | Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion |
Authors | Philipp Christmann, Rishiraj Saha Roy, Abdalghani Abujabal, Jyotsna Singh, Gerhard Weikum |
Abstract | Fact-centric information needs are rarely one-shot; users typically ask follow-up questions to explore a topic. In such a conversational setting, the user’s inputs are often incomplete, with entities or predicates left out, and ungrammatical phrases. This poses a huge challenge to question answering (QA) systems that typically rely on cues in full-fledged interrogative sentences. As a solution, we develop CONVEX: an unsupervised method that can answer incomplete questions over a knowledge graph (KG) by maintaining conversation context using entities and predicates seen so far and automatically inferring missing or ambiguous pieces for follow-up questions. The core of our method is a graph exploration algorithm that judiciously expands a frontier to find candidate answers for the current question. To evaluate CONVEX, we release ConvQuestions, a crowdsourced benchmark with 11,200 distinct conversations from five different domains. We show that CONVEX: (i) adds conversational support to any stand-alone QA system, and (ii) outperforms state-of-the-art baselines and question completion strategies. |
Tasks | Knowledge Graphs, Question Answering |
Published | 2019-10-08 |
URL | https://arxiv.org/abs/1910.03262v3 |
https://arxiv.org/pdf/1910.03262v3.pdf | |
PWC | https://paperswithcode.com/paper/look-before-you-hop-conversational-question |
Repo | https://github.com/PhilippChr/CONVEX |
Framework | none |
SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images
Title | SCAttNet: Semantic Segmentation Network with Spatial and Channel Attention Mechanism for High-Resolution Remote Sensing Images |
Authors | Haifeng Li, Kaijian Qiu, Li Chen, Xiaoming Mei, Liang Hong, Chao Tao |
Abstract | High-resolution remote sensing images (HRRSIs) contain substantial ground object information, such as texture, shape, and spatial location. Semantic segmentation, which is an important method for element extraction, has been widely used in processing mass HRRSIs. However, HRRSIs often exhibit large intraclass variance and small interclass variance due to the diversity and complexity of ground objects, thereby bringing great challenges to a semantic segmentation task. In this study, we propose a new end-to-end semantic segmentation network, which integrates two lightweight attention mechanisms that can refine features adaptively. We compare our method with several previous advanced networks on the ISPRS Vaihingen and Potsdam datasets. Experimental results show that our method can achieve better semantic segmentation results compared with other works. The source codes are available at https://github.com/lehaifeng/SCAttNet. |
Tasks | Semantic Segmentation |
Published | 2019-12-19 |
URL | https://arxiv.org/abs/1912.09121v1 |
https://arxiv.org/pdf/1912.09121v1.pdf | |
PWC | https://paperswithcode.com/paper/scattnet-semantic-segmentation-network-with |
Repo | https://github.com/lehaifeng/SCAttNet |
Framework | tf |
Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control
Title | Rethinking Cooperative Rationalization: Introspective Extraction and Complement Control |
Authors | Mo Yu, Shiyu Chang, Yang Zhang, Tommi S. Jaakkola |
Abstract | Selective rationalization has become a common mechanism to ensure that predictive models reveal how they use any available features. The selection may be soft or hard, and identifies a subset of input features relevant for prediction. The setup can be viewed as a co-operative game between the selector (aka rationale generator) and the predictor making use of only the selected features. The co-operative setting may, however, be compromised for two reasons. First, the generator typically has no direct access to the outcome it aims to justify, resulting in poor performance. Second, there’s typically no control exerted on the information left outside the selection. We revise the overall co-operative framework to address these challenges. We introduce an introspective model which explicitly predicts and incorporates the outcome into the selection process. Moreover, we explicitly control the rationale complement via an adversary so as not to leave any useful information out of the selection. We show that the two complementary mechanisms maintain high predictive accuracy and lead to comprehensive rationales. |
Tasks | |
Published | 2019-10-29 |
URL | https://arxiv.org/abs/1910.13294v2 |
https://arxiv.org/pdf/1910.13294v2.pdf | |
PWC | https://paperswithcode.com/paper/191013294 |
Repo | https://github.com/Gorov/three_player_for_emnlp |
Framework | pytorch |
Capsule Routing via Variational Bayes
Title | Capsule Routing via Variational Bayes |
Authors | Fabio De Sousa Ribeiro, Georgios Leontidis, Stefanos Kollias |
Abstract | Capsule networks are a recently proposed type of neural network shown to outperform alternatives in challenging shape recognition tasks. In capsule networks, scalar neurons are replaced with capsule vectors or matrices, whose entries represent different properties of objects. The relationships between objects and their parts are learned via trainable viewpoint-invariant transformation matrices, and the presence of a given object is decided by the level of agreement among votes from its parts. This interaction occurs between capsule layers and is a process called routing-by-agreement. In this paper, we propose a new capsule routing algorithm derived from Variational Bayes for fitting a mixture of transforming Gaussians, and show it is possible to transform our capsule network into a Capsule-VAE. Our Bayesian approach addresses some of the inherent weaknesses of MLE-based models, such as variance collapse, by modelling uncertainty over capsule pose parameters. We outperform the state-of-the-art on smallNORB using 50% fewer capsules than previously reported, achieve competitive performances on CIFAR-10, Fashion-MNIST, SVHN, and demonstrate significant improvement in MNIST to affNIST generalisation over previous works. |
Tasks | |
Published | 2019-05-27 |
URL | https://arxiv.org/abs/1905.11455v3 |
https://arxiv.org/pdf/1905.11455v3.pdf | |
PWC | https://paperswithcode.com/paper/capsule-routing-via-variational-bayes |
Repo | https://github.com/fabio-deep/Variational-Capsule-Routing |
Framework | pytorch |
Occ-Traj120: Occupancy Maps with Associated Trajectories
Title | Occ-Traj120: Occupancy Maps with Associated Trajectories |
Authors | Tin Lai, Weiming Zhi, Fabio Ramos |
Abstract | Trajectory modelling has been the principal research area for understanding and anticipating human behaviour. Predicting the dynamic path by observing the agent and its surrounding environment is essential for applications such as autonomous driving and indoor navigation suggestions. However, despite the numerous studies that have been presented, most available datasets do not contain any information on environmental factors, such as the occupancy representation of the map, which arguably play a significant role in how an agent chooses its trajectory. We present a trajectory dataset with the corresponding occupancy representations of different local maps. The dataset contains more than 120 locally-structured maps with occupancy representation and more than 110K trajectories in total. Each map has a few hundred corresponding simulated trajectories that navigate from one spatial location in a room to another point. The dataset is freely available online. |
Tasks | Autonomous Driving |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02333v2 |
https://arxiv.org/pdf/1909.02333v2.pdf | |
PWC | https://paperswithcode.com/paper/occ-traj120-occupancy-maps-with-associated |
Repo | https://github.com/soraxas/Occ-Traj120 |
Framework | none |
Mogrifier LSTM
Title | Mogrifier LSTM |
Authors | Gábor Melis, Tomáš Kočiský, Phil Blunsom |
Abstract | Many advances in Natural Language Processing have been based upon more expressive models for how inputs interact with the context in which they occur. Recurrent networks, which have enjoyed a modicum of success, still lack the generalization and systematicity ultimately required for modelling language. In this work, we propose an extension to the venerable Long Short-Term Memory in the form of mutual gating of the current input and the previous output. This mechanism affords the modelling of a richer space of interactions between inputs and their context. Equivalently, our model can be viewed as making the transition function given by the LSTM context-dependent. Experiments demonstrate markedly improved generalization on language modelling in the range of 3-4 perplexity points on Penn Treebank and Wikitext-2, and 0.01-0.05 bpc on four character-based datasets. We establish a new state of the art on all datasets with the exception of Enwik8, where we close a large gap between the LSTM and Transformer models. |
Tasks | Language Modelling |
Published | 2019-09-04 |
URL | https://arxiv.org/abs/1909.01792v2 |
https://arxiv.org/pdf/1909.01792v2.pdf | |
PWC | https://paperswithcode.com/paper/mogrifier-lstm |
Repo | https://github.com/deepmind/lamb |
Framework | tf |
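The mutual gating described in the abstract can be sketched as alternating rounds in which the previous output rescales the current input and the input then rescales the previous output, before both are passed to an ordinary LSTM cell. The code below is a minimal stdlib sketch under stated assumptions: the alternation order, the factor of 2 in front of the sigmoid (so that zero weights leave both vectors unchanged), and the names `mogrify`, `q_mats`, and `r_mats` are illustrative and do not come from the linked repository.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def mogrify(x, h, q_mats, r_mats, rounds):
    """Mutually gate input x and previous output h for a number of rounds.

    Odd rounds rescale x with a gate computed from h; even rounds rescale h
    with a gate computed from x. q_mats[k] has shape len(x) x len(h) and
    r_mats[k] has shape len(h) x len(x).
    """
    x, h = list(x), list(h)
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            gate = [2.0 * sigmoid(z) for z in matvec(q_mats[(i - 1) // 2], h)]
            x = [g * v for g, v in zip(gate, x)]
        else:
            gate = [2.0 * sigmoid(z) for z in matvec(r_mats[i // 2 - 1], x)]
            h = [g * v for g, v in zip(gate, h)]
    return x, h
```

The gated `x` and `h` would then replace the raw input and previous output in a standard LSTM update. Each gate lies strictly between 0 and 2, so a round can shrink or amplify a coordinate but never flips its sign.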
Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization
Title | Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization |
Authors | Chufeng Tang, Lu Sheng, Zhaoxiang Zhang, Xiaolin Hu |
Abstract | Pedestrian attribute recognition has been an emerging research topic in the area of video surveillance. To predict the existence of a particular attribute, it is necessary to localize the regions related to the attribute. However, in this task, region annotations are not available, and how to carve out these attribute-related regions remains challenging. Existing methods apply attribute-agnostic visual attention or heuristic body-part localization mechanisms to enhance the local feature representations, while neglecting to employ attributes to define local feature areas. We propose a flexible Attribute Localization Module (ALM) to adaptively discover the most discriminative regions and learn the regional features for each attribute at multiple levels. Moreover, a feature pyramid architecture is introduced to enhance the attribute-specific localization at low levels with high-level semantic guidance. The proposed framework does not require additional region annotations and can be trained end-to-end with multi-level deep supervision. Extensive experiments show that the proposed method achieves state-of-the-art results on three pedestrian attribute datasets, including PETA, RAP, and PA-100K. |
Tasks | Pedestrian Attribute Recognition |
Published | 2019-10-10 |
URL | https://arxiv.org/abs/1910.04562v1 |
https://arxiv.org/pdf/1910.04562v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-pedestrian-attribute-recognition |
Repo | https://github.com/chufengt/iccv19_attribute |
Framework | pytorch |
DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning
Title | DP-MAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning |
Authors | Frederik Harder, Jonas Köhler, Max Welling, Mijung Park |
Abstract | Developing a differentially private deep learning algorithm is challenging, due to the difficulty in analyzing the sensitivity of objective functions that are typically used to train deep neural networks. Many existing methods resort to the stochastic gradient descent algorithm and apply a pre-defined sensitivity to the gradients for privatizing weights. However, their slow convergence typically yields a high cumulative privacy loss. Here, we take a different route by employing the method of auxiliary coordinates, which allows us to independently update the weights per layer by optimizing a per-layer objective function. This objective function can be well approximated by a low-order Taylor’s expansion, in which sensitivity analysis becomes tractable. We perturb the coefficients of the expansion for privacy, which we optimize using more advanced optimization routines than SGD for faster convergence. We empirically show that our algorithm provides a decent trained model quality under a modest privacy budget. |
Tasks | |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.06924v1 |
https://arxiv.org/pdf/1910.06924v1.pdf | |
PWC | https://paperswithcode.com/paper/dp-mac-the-differentially-private-method-of |
Repo | https://github.com/mijungi/dp_mac |
Framework | tf |
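Perturbing coefficients for privacy, as the abstract describes, is commonly done by bounding how much one training example can change the coefficients and then adding calibrated noise. The sketch below uses the classic Gaussian mechanism as a stand-in. It assumes the coefficients are a sum of per-example vectors, that each vector is clipped to an L2 norm of at most `bound`, and that neighbouring datasets differ by adding or removing one example, so the sensitivity equals `bound`. None of this is confirmed to match the paper's own sensitivity analysis, and the noise formula is the textbook calibration, which is only valid for 0 < epsilon < 1.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (assumes 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def clip_l2(vec, bound):
    """Scale vec down so that its L2 norm is at most bound."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, bound / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def private_sum(per_example, bound, epsilon, delta, rng):
    """Sum clipped per-example vectors and add Gaussian noise to every entry."""
    total = [0.0] * len(per_example[0])
    for vec in per_example:
        for j, v in enumerate(clip_l2(vec, bound)):
            total[j] += v
    # Adding or removing one clipped example changes the sum by at most bound.
    sigma = gaussian_sigma(bound, epsilon, delta)
    return [t + rng.gauss(0.0, sigma) for t in total]
```

A caller would pass a seeded `random.Random` instance as `rng` for reproducible experiments, and would still need a separate accountant to track the cumulative privacy loss over repeated releases.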