February 1, 2020

Paper Group AWR 311

A Graph Theoretic Additive Approximation of Optimal Transport


Title	A Graph Theoretic Additive Approximation of Optimal Transport
Authors	Nathaniel Lahn, Deepika Mulchandani, Sharath Raghvendra
Abstract	Transportation cost is an attractive similarity measure between probability distributions due to its many useful theoretical properties. However, solving optimal transport exactly can be prohibitively expensive. Therefore, there has been significant effort towards the design of scalable approximation algorithms. Previous combinatorial results [Sharathkumar, Agarwal STOC '12; Agarwal, Sharathkumar STOC '14] have focused primarily on the design of near-linear time multiplicative approximation algorithms. There has also been an effort to design approximate solutions with additive errors [Cuturi NIPS '13; Altschuler et al. NIPS '17; Dvurechensky et al. ICML '18; Quanrud SOSA '19] within a time bound that is linear in the size of the cost matrix and polynomial in $C/\delta$; here $C$ is the largest value in the cost matrix and $\delta$ is the additive error. We present an adaptation of the classical graph algorithm of Gabow and Tarjan and provide a novel analysis of this algorithm that bounds its execution time by $O(\frac{n^2 C}{\delta}+ \frac{nC^2}{\delta^2})$. Our algorithm is extremely simple and executes, for an arbitrarily small constant $\varepsilon$, only $\lfloor \frac{2C}{(1-\varepsilon)\delta}\rfloor + 1$ iterations, where each iteration consists only of a Dijkstra-type search followed by a depth-first search. We also provide empirical results that suggest our algorithm is competitive with respect to a sequential implementation of the Sinkhorn algorithm in execution time. Moreover, our algorithm quickly computes a solution for very small values of $\delta$, whereas the Sinkhorn algorithm slows down due to numerical instability.
Tasks
Published	2019-05-28
URL	https://arxiv.org/abs/1905.11830v3
PDF	https://arxiv.org/pdf/1905.11830v3.pdf
PWC	https://paperswithcode.com/paper/a-graph-theoretic-additive-approximation-of
Repo	https://github.com/nathaniellahn/CombinatorialOptimalTransport
Framework	none
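
To make the bounds quoted in the abstract concrete, the following minimal sketch evaluates the iteration count $\lfloor \frac{2C}{(1-\varepsilon)\delta}\rfloor + 1$ and the two terms of the $O(\frac{n^2 C}{\delta}+ \frac{nC^2}{\delta^2})$ running time. The function names are hypothetical and the constants hidden by the big-O are simply dropped; this is not code from the authors' repository.

```python
import math


def iteration_count(C: float, delta: float, eps: float) -> int:
    """Iteration bound floor(2C / ((1 - eps) * delta)) + 1 quoted in the abstract.

    C is the largest entry of the cost matrix, delta the additive error,
    eps an arbitrarily small constant in (0, 1).
    """
    if not (0 < eps < 1) or delta <= 0 or C < 0:
        raise ValueError("need 0 < eps < 1, delta > 0 and C >= 0")
    return math.floor(2 * C / ((1 - eps) * delta)) + 1


def runtime_terms(n: int, C: float, delta: float) -> tuple:
    """The two terms n^2 C / delta and n C^2 / delta^2 (hidden constants dropped)."""
    return n * n * C / delta, n * C * C / (delta * delta)


if __name__ == "__main__":
    print(iteration_count(C=10, delta=1, eps=0.5))
    print(runtime_terms(n=100, C=10, delta=1))
```

Halving $\delta$ roughly doubles the iteration count and the first running-time term, and quadruples the second term.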

Physics-Informed Neural Networks for Power Systems


Title	Physics-Informed Neural Networks for Power Systems
Authors	George S. Misyris, Andreas Venzke, Spyros Chatzivasileiadis
Abstract	This paper introduces for the first time, to our knowledge, a framework for physics-informed neural networks in power system applications. Exploiting the underlying physical laws governing power systems, and inspired by recent developments in the field of machine learning, this paper proposes a neural network training procedure that can make use of the wide range of mathematical models describing power system behavior, both in steady-state and in dynamics. Physics-informed neural networks require substantially less training data and can result in simpler neural network structures, while achieving high accuracy. This work unlocks a range of opportunities in power systems, being able to determine dynamic states, such as rotor angles and frequency, and uncertain parameters such as inertia and damping at a fraction of the computational time required by conventional methods. This paper focuses on introducing the framework and showcases its potential using a single-machine infinite bus system as a guiding example. Physics-informed neural networks are shown to accurately determine rotor angle and frequency up to 87 times faster than conventional methods.
Tasks
Published	2019-11-09
URL	https://arxiv.org/abs/1911.03737v3
PDF	https://arxiv.org/pdf/1911.03737v3.pdf
PWC	https://paperswithcode.com/paper/physics-informed-neural-networks-for-power
Repo	https://github.com/gmisy/Phycics-informed-NN-for-Power-Systems
Framework	tf
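
As an illustration of what "physics-informed" can mean for a single-machine infinite bus system, the sketch below penalises violations of the standard swing equation $m\ddot{\delta} + d\dot{\delta} + B\sin\delta - P = 0$. Everything here is an assumption for illustration: the abstract does not state the equation, the parameter names `m`, `d`, `b`, `p` are hypothetical, and finite differences stand in for the automatic differentiation a neural network framework would normally provide.

```python
import math


def swing_residual(delta, dt, m, d, b, p):
    """Residual m*delta'' + d*delta' + b*sin(delta) - p at interior time steps.

    delta: list of rotor angles sampled every dt seconds (assumed layout).
    Derivatives are approximated with central differences.
    """
    residuals = []
    for k in range(1, len(delta) - 1):
        first = (delta[k + 1] - delta[k - 1]) / (2 * dt)
        second = (delta[k + 1] - 2 * delta[k] + delta[k - 1]) / (dt * dt)
        residuals.append(m * second + d * first + b * math.sin(delta[k]) - p)
    return residuals


def physics_loss(delta, dt, m, d, b, p):
    """Mean squared residual; a physics term that could be added to a data loss."""
    r = swing_residual(delta, dt, m, d, b, p)
    return sum(x * x for x in r) / len(r)
```

A trajectory that sits at the equilibrium angle $\arcsin(P/B)$ gives a loss of approximately zero, while any trajectory that violates the assumed dynamics is penalised.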

Air Learning: An AI Research Platform for Algorithm-Hardware Benchmarking of Autonomous Aerial Robots


Title	Air Learning: An AI Research Platform for Algorithm-Hardware Benchmarking of Autonomous Aerial Robots
Authors	Srivatsan Krishnan, Behzad Borojerdian, William Fu, Aleksandra Faust, Vijay Janapa Reddi
Abstract	We introduce Air Learning, an AI research platform for benchmarking algorithm-hardware performance and energy efficiency trade-offs. We focus in particular on deep reinforcement learning (RL) interactions in autonomous unmanned aerial vehicles (UAVs). Equipped with a random environment generator, Air Learning exposes a UAV to a diverse set of challenging scenarios. Users can specify a task, train different RL policies and evaluate their performance and energy efficiency on a variety of hardware platforms. To show how Air Learning can be used, we seed it with Deep Q Networks (DQN) and Proximal Policy Optimization (PPO) to solve a point-to-point obstacle avoidance task in three different environments, generated using our configurable environment generator. We train the two algorithms using curriculum learning and non-curriculum-learning. Air Learning assesses the trained policies’ performance, under a variety of quality-of-flight (QoF) metrics, such as the energy consumed, endurance and the average trajectory length, on resource-constrained embedded platforms like a Ras-Pi. We find that the trajectories on an embedded Ras-Pi are vastly different from those predicted on a high-end desktop system, resulting in up to 79.43% longer trajectories in one of the environments. To understand the source of such differences, we use Air Learning to artificially degrade desktop performance to mimic what happens on a low-end embedded system. QoF metrics with hardware-in-the-loop characterize those differences and expose how the choice of onboard compute affects the aerial robot’s performance. We also conduct reliability studies to demonstrate how Air Learning can help understand how sensor failures affect the learned policies. All put together, Air Learning enables a broad class of RL studies on UAVs. More information and code for Air Learning can be found here: http://bit.ly/2JNAVb6.
Tasks
Published	2019-06-02
URL	https://arxiv.org/abs/1906.00421v2
PDF	https://arxiv.org/pdf/1906.00421v2.pdf
PWC	https://paperswithcode.com/paper/190600421
Repo	https://github.com/harvard-edge/airlearning
Framework	none

Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders


Title	Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Authors	Andy T. Liu, Shu-wen Yang, Po-Han Chi, Po-chun Hsu, Hung-yi Lee
Abstract	We present Mockingjay as a new speech representation learning approach, where bidirectional Transformer encoders are pre-trained on a large amount of unlabeled speech. Previous speech representation methods learn through conditioning on past frames and predicting information about future frames, whereas Mockingjay is designed to predict the current frame through jointly conditioning on both past and future contexts. The Mockingjay representation improves performance for a wide range of downstream tasks, including phoneme classification, speaker recognition, and sentiment classification on spoken content, while outperforming other approaches. Mockingjay is empirically powerful and can be fine-tuned with downstream models; with only 2 epochs of fine-tuning we further improve performance dramatically. In a low resource setting with only 0.1% of labeled data, we outperform the result of Mel-features that uses all 100% labeled data.
Tasks	Representation Learning, Sentiment Analysis, Speaker Recognition
Published	2019-10-25
URL	https://arxiv.org/abs/1910.12638v2
PDF	https://arxiv.org/pdf/1910.12638v2.pdf
PWC	https://paperswithcode.com/paper/mockingjay-unsupervised-speech-representation
Repo	https://github.com/samirsahoo007/Audio-and-Speech-Processing
Framework	pytorch
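
Predicting the current frame from both past and future context can be pictured as a masked-reconstruction objective. The sketch below is only an illustration under stated assumptions: the 15% masking ratio, zero-masking, and the L1 reconstruction loss are choices made here for the example, and `mask_frames` and `l1_on_masked` are hypothetical names, not functions from the linked repository.

```python
import random


def mask_frames(frames, ratio=0.15, seed=0):
    """Zero out a random subset of frames; return the masked copy and the indices.

    frames: list of equal-length feature vectors, one per time step.
    ratio:  fraction of frames to mask (assumed value, not from the abstract).
    """
    rng = random.Random(seed)
    n = len(frames)
    k = max(1, round(n * ratio))
    masked_idx = sorted(rng.sample(range(n), k))
    masked = [list(f) for f in frames]  # copy so the input is not mutated
    for i in masked_idx:
        masked[i] = [0.0] * len(frames[i])
    return masked, masked_idx


def l1_on_masked(pred, target, masked_idx):
    """Mean absolute error computed only over the masked frames."""
    total, count = 0.0, 0
    for i in masked_idx:
        for a, b in zip(pred[i], target[i]):
            total += abs(a - b)
            count += 1
    return total / count
```

An encoder would receive the masked sequence, see frames on both sides of each gap, and be trained to minimise `l1_on_masked` between its output and the original frames.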

Robust Conditional GAN from Uncertainty-Aware Pairwise Comparisons


Title	Robust Conditional GAN from Uncertainty-Aware Pairwise Comparisons
Authors	Ligong Han, Ruijiang Gao, Mun Kim, Xin Tao, Bo Liu, Dimitris Metaxas
Abstract	Conditional generative adversarial networks have shown exceptional generation performance over the past few years. However, they require large numbers of annotations. To address this problem, we propose a novel generative adversarial network utilizing weak supervision in the form of pairwise comparisons (PC-GAN) for image attribute editing. In the light of Bayesian uncertainty estimation and noise-tolerant adversarial training, PC-GAN can estimate attribute rating efficiently and demonstrate robust performance in noise resistance. Through extensive experiments, we show both qualitatively and quantitatively that PC-GAN performs comparably with fully-supervised methods and outperforms unsupervised baselines.
Tasks
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09298v2
PDF	https://arxiv.org/pdf/1911.09298v2.pdf
PWC	https://paperswithcode.com/paper/robust-conditional-gan-from-uncertainty-aware
Repo	https://github.com/phymhan/pc-gan
Framework	pytorch

Aiding Intra-Text Representations with Visual Context for Multimodal Named Entity Recognition


Title	Aiding Intra-Text Representations with Visual Context for Multimodal Named Entity Recognition
Authors	Omer Arshad, Ignazio Gallo, Shah Nawaz, Alessandro Calefati
Abstract	With the massive explosion of social media such as Twitter and Instagram, people share billions of multimedia posts daily, containing images and text. Typically, text in these posts is short, informal and noisy, leading to ambiguities which can be resolved using images. In this paper we explore the text-centric Named Entity Recognition task on these multimedia posts. We propose an end-to-end model which learns a joint representation of a text and an image. Our model extends the multi-dimensional self-attention technique, where the image now helps to enhance the relationship between words. Experiments show that our model is capable of capturing both textual and visual contexts with greater accuracy, achieving state-of-the-art results on the Twitter multimodal Named Entity Recognition dataset.
Tasks	Named Entity Recognition
Published	2019-04-02
URL	http://arxiv.org/abs/1904.01356v1
PDF	http://arxiv.org/pdf/1904.01356v1.pdf
PWC	https://paperswithcode.com/paper/aiding-intra-text-representations-with-visual
Repo	https://github.com/omerarshad/MultiModalNER
Framework	tf

Augmenting Neural Machine Translation with Knowledge Graphs


Title	Augmenting Neural Machine Translation with Knowledge Graphs
Authors	Diego Moussallem, Mihael Arčan, Axel-Cyrille Ngonga Ngomo, Paul Buitelaar
Abstract	While neural networks have been used extensively to make substantial progress in the machine translation task, they are known for being heavily dependent on the availability of large amounts of training data. Recent efforts have tried to alleviate the data sparsity problem by augmenting the training data using different strategies, such as back-translation. Along with the data scarcity, the out-of-vocabulary words, mostly entities and terminological expressions, pose a difficult challenge to Neural Machine Translation systems. In this paper, we hypothesize that knowledge graphs enhance the semantic feature extraction of neural models, thus optimizing the translation of entities and terminological expressions in texts and consequently leading to a better translation quality. We hence investigate two different strategies for incorporating knowledge graphs into neural models without modifying the neural network architectures. We also examine the effectiveness of our augmentation method to recurrent and non-recurrent (self-attentional) neural architectures. Our knowledge graph augmented neural translation model, dubbed KG-NMT, achieves significant and consistent improvements of +3 BLEU, METEOR and chrF3 on average on the newstest datasets between 2014 and 2018 for WMT English-German translation task.
Tasks	Knowledge Graphs, Machine Translation
Published	2019-02-23
URL	http://arxiv.org/abs/1902.08816v1
PDF	http://arxiv.org/pdf/1902.08816v1.pdf
PWC	https://paperswithcode.com/paper/augmenting-neural-machine-translation-with
Repo	https://github.com/dice-group/KG-NMT
Framework	tf

Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks


Title	Hierarchical Stochastic Block Model for Community Detection in Multiplex Networks
Authors	Marina S. Paez, Arash A. Amini, Lizhen Lin
Abstract	Multiplex networks have become increasingly prevalent in many fields, and have emerged as a powerful tool for modeling the complexity of real networks. There is a critical need for developing inference models for multiplex networks that can take into account potential dependencies across different layers, particularly when the aim is community detection. We add to a limited literature by proposing a novel and efficient Bayesian model for community detection in multiplex networks. A key feature of our approach is the ability to model varying communities at different network layers. In contrast, many existing models assume the same communities for all layers. Moreover, our model automatically picks up the necessary number of communities at each layer (as validated by real data examples). This is appealing, since deciding the number of communities is a challenging aspect of community detection, and especially so in the multiplex setting, if one allows the communities to change across layers. Borrowing ideas from hierarchical Bayesian modeling, we use a hierarchical Dirichlet prior to model community labels across layers, allowing dependency in their structure. Given the community labels, a stochastic block model (SBM) is assumed for each layer. We develop an efficient slice sampler for sampling the posterior distribution of the community labels as well as the link probabilities between communities. In doing so, we address some unique challenges posed by coupling the complex likelihood of SBM with the hierarchical nature of the prior on the labels. An extensive empirical validation is performed on simulated and real data, demonstrating the superior performance of the model over single-layer alternatives, as well as the ability to uncover interesting structures in real networks.
Tasks	Community Detection
Published	2019-03-30
URL	http://arxiv.org/abs/1904.05330v1
PDF	http://arxiv.org/pdf/1904.05330v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-stochastic-block-model-for
Repo	https://github.com/aaamini/hsbm
Framework	none
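
The generative step "given the community labels, a stochastic block model is assumed for each layer" can be sketched as follows. This is a minimal illustration of a plain undirected SBM only; it does not implement the hierarchical Dirichlet prior or the slice sampler from the paper, and `sample_sbm_layer` is a hypothetical name.

```python
import random


def sample_sbm_layer(labels, link_prob, seed=0):
    """Sample one undirected network layer from a stochastic block model.

    labels:    community index of each node in this layer.
    link_prob: link_prob[a][b] is the edge probability between communities a and b.
    Returns a symmetric 0/1 adjacency matrix with an empty diagonal.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < link_prob[labels[i]][labels[j]]:
                adj[i][j] = adj[j][i] = 1
    return adj


if __name__ == "__main__":
    # Two layers over the same four nodes, with different community labels per layer.
    probs = [[0.9, 0.1], [0.1, 0.9]]
    layer_a = sample_sbm_layer([0, 0, 1, 1], probs, seed=1)
    layer_b = sample_sbm_layer([0, 1, 0, 1], probs, seed=2)
    print(layer_a)
    print(layer_b)
```

Letting the label vector differ between layers, as in the usage example, mirrors the abstract's point that communities may vary across layers of a multiplex network.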

Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension


Title	Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
Authors	Kai Sun, Dian Yu, Dong Yu, Claire Cardie
Abstract	Machine reading comprehension tasks require a machine reader to answer questions relevant to the given document. In this paper, we present the first free-form multiple-Choice Chinese machine reading Comprehension dataset (C^3), containing 13,369 documents (dialogues or more formally written mixed-genre texts) and their associated 19,577 multiple-choice free-form questions collected from Chinese-as-a-second-language examinations. We present a comprehensive analysis of the prior knowledge (i.e., linguistic, domain-specific, and general world knowledge) needed for these real-world problems. We implement rule-based and popular neural methods and find that there is still a significant performance gap between the best performing model (68.5%) and human readers (96.0%), especially on problems that require prior knowledge. We further study the effects of distractor plausibility and data augmentation based on translated relevant datasets for English on model performance. We expect C^3 to present great challenges to existing systems as answering 86.8% of questions requires both knowledge within and beyond the accompanying document, and we hope that C^3 can serve as a platform to study how to leverage various kinds of prior knowledge to better understand a given written or orally oriented text. C^3 is available at https://dataset.org/c3/.
Tasks	Data Augmentation, Language Modelling, Machine Reading Comprehension, Reading Comprehension
Published	2019-04-21
URL	https://arxiv.org/abs/1904.09679v3
PDF	https://arxiv.org/pdf/1904.09679v3.pdf
PWC	https://paperswithcode.com/paper/probing-prior-knowledge-needed-in-challenging
Repo	https://github.com/nlpdata/c3
Framework	pytorch

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit


Title	ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Authors	Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan
Abstract	This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit supports state-of-the-art E2E-TTS models, including Tacotron 2, Transformer TTS, and FastSpeech, and also provides recipes inspired by the Kaldi automatic speech recognition (ASR) toolkit. The recipes are based on the design unified with the ESPnet ASR recipe, providing high reproducibility. The toolkit also provides pre-trained models and samples of all of the recipes so that users can use it as a baseline. Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models. This paper describes the design of the toolkit and experimental evaluation in comparison with other toolkits. The experimental results show that our models can achieve state-of-the-art performance comparable to the other latest toolkits, resulting in a mean opinion score (MOS) of 4.25 on the LJSpeech dataset. The toolkit is publicly available at https://github.com/espnet/espnet.
Tasks	Speech Recognition
Published	2019-10-24
URL	https://arxiv.org/abs/1910.10909v2
PDF	https://arxiv.org/pdf/1910.10909v2.pdf
PWC	https://paperswithcode.com/paper/espnet-tts-unified-reproducible-and
Repo	https://github.com/espnet/espnet
Framework	pytorch

Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images


Title	Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images
Authors	Wuyang Chen, Ziyu Jiang, Zhangyang Wang, Kexin Cui, Xiaoning Qian
Abstract	Segmentation of ultra-high resolution images is increasingly demanded, yet poses significant challenges for algorithm efficiency, in particular considering the (GPU) memory limits. Current approaches either downsample an ultra-high resolution image or crop it into small patches for separate processing. Either way, the loss of local fine details or global contextual information results in limited segmentation accuracy. We propose collaborative Global-Local Networks (GLNet) to effectively preserve both global and local information in a highly memory-efficient manner. GLNet is composed of a global branch and a local branch, taking the downsampled entire image and its cropped local patches as respective inputs. For segmentation, GLNet deeply fuses feature maps from two branches, capturing both the high-resolution fine structures from zoomed-in local patches and the contextual dependency from the downsampled input. To further resolve the potential class imbalance problem between background and foreground regions, we present a coarse-to-fine variant of GLNet, also being memory-efficient. Extensive experiments and analyses have been performed on three real-world ultra-high aerial and medical image datasets (resolution up to 30 million pixels). With only one single 1080Ti GPU and less than 2GB memory used, our GLNet yields high-quality segmentation results and achieves much more competitive accuracy-memory usage trade-offs compared to state-of-the-art methods.
Tasks
Published	2019-05-15
URL	https://arxiv.org/abs/1905.06368v2
PDF	https://arxiv.org/pdf/1905.06368v2.pdf
PWC	https://paperswithcode.com/paper/190506368
Repo	https://github.com/chenwydj/ultra_high_resolution_segmentation
Framework	pytorch
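
The two inputs described in the abstract, a downsampled whole image for the global branch and cropped patches for the local branch, can be sketched on a toy image as below. This is an illustration only: strided subsampling stands in for whatever resampling the authors use, the feature fusion between branches is not shown, and the function names are hypothetical.

```python
def downsample(image, factor):
    """Keep every `factor`-th pixel in both directions (input for the global branch)."""
    return [row[::factor] for row in image[::factor]]


def crop_patches(image, size):
    """Split the full-resolution image into non-overlapping size x size patches
    (inputs for the local branch), in row-major order."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            patches.append([row[left:left + size] for row in image[top:top + size]])
    return patches


if __name__ == "__main__":
    # A 4x4 toy "image" whose pixel values encode their position.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(downsample(image, 2))
    print(crop_patches(image, 2))
```

Both inputs are far smaller than the original image, which is the source of the memory savings; the global view loses fine detail and each patch loses context, which is why the two branches are combined.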

Self-supervised Learning for Video Correspondence Flow


Title	Self-supervised Learning for Video Correspondence Flow
Authors	Zihang Lai, Weidi Xie
Abstract	The objective of this paper is self-supervised learning of feature embeddings that are suitable for matching correspondences along the videos, which we term correspondence flow. By leveraging the natural spatial-temporal coherence in videos, we propose to train a "pointer" that reconstructs a target frame by copying pixels from a reference frame. We make the following contributions: First, we introduce a simple information bottleneck that forces the model to learn robust features for correspondence matching, and prevent it from learning trivial solutions, e.g. matching based on low-level colour information. Second, to tackle the challenges from tracker drifting, due to complex object deformations, illumination changes and occlusions, we propose to train a recursive model over long temporal windows with scheduled sampling and cycle consistency. Third, we achieve state-of-the-art performance on DAVIS 2017 video segmentation and JHMDB keypoint tracking tasks, outperforming all previous self-supervised learning approaches by a significant margin. Fourth, in order to shed light on the potential of self-supervised learning on the task of video correspondence flow, we probe the upper bound by training on additional data, i.e. more diverse videos, further demonstrating significant improvements on video segmentation.
Tasks	Video Correspondence Flow, Video Semantic Segmentation
Published	2019-05-02
URL	https://arxiv.org/abs/1905.00875v5
PDF	https://arxiv.org/pdf/1905.00875v5.pdf
PWC	https://paperswithcode.com/paper/self-supervised-learning-for-video
Repo	https://github.com/zlai0/CorrFlow
Framework	pytorch
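
The "pointer" that reconstructs a target frame by copying pixels from a reference frame can be pictured as a softmax-weighted copy driven by feature similarity. The sketch below is a simplified assumption of that idea on flattened pixel lists: dot-product similarity, a global softmax, and the name `soft_copy` are choices made here, and the information bottleneck, scheduled sampling, and cycle consistency from the abstract are not shown.

```python
import math


def soft_copy(target_feats, ref_feats, ref_pixels):
    """Reconstruct each target pixel as a similarity-weighted copy of reference pixels.

    target_feats: one feature vector per target pixel.
    ref_feats:    one feature vector per reference pixel.
    ref_pixels:   scalar value (e.g. intensity) of each reference pixel.
    """
    recon = []
    for t in target_feats:
        scores = [sum(a * b for a, b in zip(t, r)) for r in ref_feats]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        recon.append(sum(w * p for w, p in zip(weights, ref_pixels)) / norm)
    return recon
```

Training would compare the reconstruction with the true target frame, so good features are those whose similarities point at the right reference pixels.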

Drug-Drug Interaction Prediction Based on Knowledge Graph Embeddings and Convolutional-LSTM Network


Title	Drug-Drug Interaction Prediction Based on Knowledge Graph Embeddings and Convolutional-LSTM Network
Authors	Md. Rezaul Karim, Michael Cochez, Joao Bosco Jares, Mamtaz Uddin, Oya Beyan, Stefan Decker
Abstract	Interference between pharmacological substances can cause serious medical injuries. Correctly predicting so-called drug-drug interactions (DDI) does not only reduce these cases but can also result in a reduction of drug development cost. Presently, most drug-related knowledge is the result of clinical evaluations and post-marketing surveillance, resulting in a limited amount of information. Existing data-driven prediction approaches for DDIs typically rely on a single source of information, while using information from multiple sources would help improve predictions. Machine learning (ML) techniques are used, but the techniques are often unable to deal with skewness in the data. Hence, we propose a new ML approach for predicting DDIs based on multiple data sources. For this task, we use 12,000 drug features from DrugBank, PharmGKB, and KEGG drugs, which are integrated using Knowledge Graphs (KGs). To train our prediction model, we first embed the nodes in the graph using various embedding approaches. We found that the best performing combination was a ComplEx embedding method created using PyTorch-BigGraph (PBG) with a Convolutional-LSTM network and classic machine learning-based prediction models. The model averaging ensemble method of three best classifiers yields up to 0.94, 0.92, 0.80 for AUPR, F1-score, and MCC, respectively, during 5-fold cross-validation tests.
Tasks	Knowledge Graph Embeddings, Knowledge Graphs
Published	2019-08-04
URL	https://arxiv.org/abs/1908.01288v1
PDF	https://arxiv.org/pdf/1908.01288v1.pdf
PWC	https://paperswithcode.com/paper/drug-drug-interaction-prediction-based-on
Repo	https://github.com/rezacsedu/DDI-prediction-KG-embeddings-Conv-LSTM
Framework	tf
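
A model averaging ensemble, as mentioned in the abstract, combines classifiers by averaging their predicted probabilities. The sketch below shows that step in isolation; the equal weighting, the 0.5 decision threshold, and the function names are assumptions for illustration, and the probabilities in the usage example are made up rather than taken from the paper.

```python
def average_ensemble(prob_lists):
    """Average per-sample interaction probabilities across classifiers.

    prob_lists: one list of probabilities per classifier, all of equal length.
    """
    n = len(prob_lists[0])
    if any(len(p) != n for p in prob_lists):
        raise ValueError("all classifiers must score the same samples")
    return [sum(column) / len(prob_lists) for column in zip(*prob_lists)]


def predict_interactions(prob_lists, threshold=0.5):
    """Label a drug pair as interacting when the averaged probability reaches the threshold."""
    return [p >= threshold for p in average_ensemble(prob_lists)]


if __name__ == "__main__":
    # Made-up scores from three hypothetical classifiers for two drug pairs.
    scores = [[0.9, 0.2], [0.8, 0.4], [0.7, 0.0]]
    print(average_ensemble(scores))
    print(predict_interactions(scores))
```

Averaging tends to smooth out the errors of individual classifiers, which is one common motivation for ensembling on skewed data.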

Neural Variational Inference For Estimating Uncertainty in Knowledge Graph Embeddings


Title	Neural Variational Inference For Estimating Uncertainty in Knowledge Graph Embeddings
Authors	Alexander I. Cowen-Rivers, Pasquale Minervini, Tim Rocktaschel, Matko Bosnjak, Sebastian Riedel, Jun Wang
Abstract	Recent advances in Neural Variational Inference allowed for a renaissance in latent variable models in a variety of domains involving high-dimensional data. While traditional variational methods derive an analytical approximation for the intractable distribution over the latent variables, here we construct an inference network conditioned on the symbolic representation of entities and relation types in the Knowledge Graph, to provide the variational distributions. The new framework results in a highly-scalable method. Under a Bernoulli sampling framework, we provide an alternative justification for commonly used techniques in large-scale stochastic variational inference, which drastically reduce training time at a cost of an additional approximation to the variational lower bound. We introduce two models from this highly scalable probabilistic framework, namely the Latent Information and Latent Fact models, for reasoning over knowledge graph-based representations. Our Latent Information and Latent Fact models improve upon baseline performance under certain conditions. We use the learnt embedding variance to estimate predictive uncertainty during link prediction, and discuss the quality of these learnt uncertainty estimates. Our source code and datasets are publicly available online at https://github.com/alexanderimanicowenrivers/Neural-Variational-Knowledge-Graphs.
Tasks	Knowledge Graph Embeddings, Knowledge Graphs, Latent Variable Models, Link Prediction
Published	2019-06-12
URL	https://arxiv.org/abs/1906.04985v2
PDF	https://arxiv.org/pdf/1906.04985v2.pdf
PWC	https://paperswithcode.com/paper/neural-variational-inference-for-estimating
Repo	https://github.com/alexanderimanicowenrivers/Neural-Variational-Knowledge-Graphs
Framework	tf

Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results


Title	Neural-IR-Explorer: A Content-Focused Tool to Explore Neural Re-Ranking Results
Authors	Sebastian Hofstätter, Markus Zlabinger, Allan Hanbury
Abstract	In this paper we look beyond metrics-based evaluation of Information Retrieval systems, to explore the reasons behind ranking results. We present the content-focused Neural-IR-Explorer, which empowers users to browse through retrieval results and inspect the inner workings and fine-grained results of neural re-ranking models. The explorer includes a categorized overview of the available queries, as well as an individual query result view with various options to highlight semantic connections between query-document pairs. The Neural-IR-Explorer is available at: https://neural-ir-explorer.ec.tuwien.ac.at/
Tasks	Information Retrieval
Published	2019-12-10
URL	https://arxiv.org/abs/1912.04713v1
PDF	https://arxiv.org/pdf/1912.04713v1.pdf
PWC	https://paperswithcode.com/paper/neural-ir-explorer-a-content-focused-tool-to
Repo	https://github.com/sebastian-hofstaetter/neural-ir-explorer
Framework	none