April 3, 2020

# Paper Group ANR 39

Physics Informed Deep Learning for Transport in Porous Media. Buckley Leverett Problem. Super-efficiency of automatic differentiation for functions defined as a minimum. End-to-end Emotion-Cause Pair Extraction via Learning to Link. Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data Modelling. Interpretable and Fair Compa …

#### Physics Informed Deep Learning for Transport in Porous Media. Buckley Leverett Problem

Title Physics Informed Deep Learning for Transport in Porous Media. Buckley Leverett Problem
Authors Cedric G. Fraces, Adrien Papaioannou, Hamdi Tchelepi
Abstract We present a new hybrid physics-based machine-learning approach to reservoir modeling. The methodology relies on a series of deep adversarial neural network architectures with physics-based regularization. The network is used to simulate the dynamic behavior of physical quantities (i.e. saturation) subject to a set of governing laws (e.g. mass conservation) and corresponding boundary and initial conditions. A residual equation is formed from the governing partial-differential equation and used as part of the training. Derivatives of the estimated physical quantities are computed using automatic differentiation algorithms. This allows the model to avoid overfitting by reducing the variance, and permits extrapolation beyond the range of the training data, including uncertainty implicitly derived from the distribution output of the generative adversarial networks. The approach is used to simulate a two-phase immiscible transport problem (Buckley Leverett). From a very limited dataset, the model learns the parameters of the governing equation and is able to provide an accurate physical solution, both in terms of shock and rarefaction. We demonstrate how this method can be applied in the context of a forward simulation for continuous problems. The use of these models for the inverse problem is also presented, where the model simultaneously learns the physical laws and determines key uncertainty subsurface parameters. The proposed methodology is a simple and elegant way to instill physical knowledge into machine-learning algorithms. This alleviates the two most significant shortcomings of machine-learning algorithms: the requirement for large datasets and the reliability of extrapolation. The principles presented in this paper can be generalized in innumerable ways in the future and should lead to a new class of algorithms to solve both forward and inverse physical problems.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05172v1
PDF https://arxiv.org/pdf/2001.05172v1.pdf
PWC https://paperswithcode.com/paper/physics-informed-deep-learning-for-transport
Repo
Framework
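
The residual idea in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the textbook Buckley-Leverett form ∂S/∂t + ∂f(S)/∂x = 0 with quadratic relative permeabilities, and it replaces the paper's automatic differentiation with central finite differences on a grid. The function names and the `mobility_ratio` parameter are hypothetical and not taken from the paper.

```python
def fractional_flow(s, mobility_ratio=2.0):
    """Fractional flow f(S) = S^2 / (S^2 + (1 - S)^2 / M), assuming quadratic relative permeabilities."""
    return s * s / (s * s + (1.0 - s) ** 2 / mobility_ratio)


def pde_residual(saturation, x, t, i, n, mobility_ratio=2.0):
    """Central-difference estimate of dS/dt + df(S)/dx at interior node (t[n], x[i]).

    saturation[n][i] holds S at time t[n] and position x[i].
    Finite differences stand in for automatic differentiation here.
    """
    ds_dt = (saturation[n + 1][i] - saturation[n - 1][i]) / (t[n + 1] - t[n - 1])
    df_dx = (
        fractional_flow(saturation[n][i + 1], mobility_ratio)
        - fractional_flow(saturation[n][i - 1], mobility_ratio)
    ) / (x[i + 1] - x[i - 1])
    return ds_dt + df_dx


def physics_loss(saturation, x, t, mobility_ratio=2.0):
    """Mean squared PDE residual over interior nodes, usable as a regularization term."""
    residuals = [
        pde_residual(saturation, x, t, i, n, mobility_ratio)
        for n in range(1, len(t) - 1)
        for i in range(1, len(x) - 1)
    ]
    return sum(r * r for r in residuals) / len(residuals)


if __name__ == "__main__":
    x = [0.0, 0.25, 0.5, 0.75, 1.0]
    t = [0.0, 0.1, 0.2]
    constant_field = [[0.3] * len(x) for _ in t]   # satisfies the PDE trivially
    ramp_field = [list(x) for _ in t]              # S = x at all times, violates it
    print(physics_loss(constant_field, x, t), physics_loss(ramp_field, x, t))
```

In a physics-informed setup, a term like `physics_loss` would be added to the data-fitting loss so that candidate saturation fields violating mass conservation are penalized.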

#### Super-efficiency of automatic differentiation for functions defined as a minimum

Title Super-efficiency of automatic differentiation for functions defined as a minimum
Authors Pierre Ablin, Gabriel Peyré, Thomas Moreau
Abstract In min-min optimization or max-min optimization, one has to compute the gradient of a function defined as a minimum. In most cases, the minimum has no closed form, and an approximation is obtained via an iterative algorithm. There are two usual ways of estimating the gradient of the function: using either an analytic formula obtained by assuming exactness of the approximation, or automatic differentiation through the algorithm. In this paper, we study the asymptotic error made by these estimators as a function of the optimization error. We find that the error of the automatic estimator is close to the square of the error of the analytic estimator, reflecting a super-efficiency phenomenon. The convergence of the automatic estimator greatly depends on the convergence of the Jacobian of the algorithm. We analyze it for gradient descent and stochastic gradient descent and derive convergence rates for the estimators in these cases. Our analysis is backed by numerical experiments on toy problems and on Wasserstein barycenter computation. Finally, we discuss the computational complexity of these estimators and give practical guidelines to choose between them.
Published 2020-02-10
URL https://arxiv.org/abs/2002.03722v1
PDF https://arxiv.org/pdf/2002.03722v1.pdf
PWC https://paperswithcode.com/paper/super-efficiency-of-automatic-differentiation
Repo
Framework
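
The two estimators compared in the abstract above can be illustrated on a toy problem. This is a minimal sketch, not the authors' code: the quadratic objective f(z, x) = ½z² + ½(z − x)² is an assumption chosen so that L(x) = min_z f(z, x) = x²/4 is known in closed form, and the derivative of the gradient-descent iterates is propagated by hand (forward mode) in place of an automatic differentiation library.

```python
def inner_objective_grads(z, x):
    """Partial derivatives of the toy objective f(z, x) = 0.5*z**2 + 0.5*(z - x)**2."""
    df_dz = 2.0 * z - x
    df_dx = x - z
    return df_dz, df_dx


def gradient_estimators(x, steps, step_size=0.25, z0=0.0):
    """Run gradient descent on z and return (analytic, automatic) estimates of dL/dx.

    analytic : partial derivative of f in x at the last iterate, treating it as the exact minimizer.
    automatic: total derivative of f(z_t(x), x), differentiating through the iterations.
    """
    z, jac = z0, 0.0  # jac tracks dz_t/dx
    for _ in range(steps):
        df_dz, _ = inner_objective_grads(z, x)
        z = z - step_size * df_dz
        # Derivative of the update z - step_size * (2z - x) with respect to x.
        jac = jac - step_size * (2.0 * jac - 1.0)
    df_dz, df_dx = inner_objective_grads(z, x)
    analytic = df_dx
    automatic = df_dx + df_dz * jac
    return analytic, automatic


if __name__ == "__main__":
    x = 2.0
    true_gradient = x / 2.0  # derivative of L(x) = x**2 / 4
    for steps in (2, 5, 10):
        analytic, automatic = gradient_estimators(x, steps)
        print(steps, abs(analytic - true_gradient), abs(automatic - true_gradient))
```

On this toy problem the error of the automatic estimate shrinks like the square of the error of the analytic estimate, which is the behaviour the abstract calls super-efficiency.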

#### End-to-end Emotion-Cause Pair Extraction via Learning to Link

Title End-to-end Emotion-Cause Pair Extraction via Learning to Link
Authors Haolin Song, Chen Zhang, Qiuchi Li, Dawei Song
Abstract Emotion-cause pair extraction (ECPE), as an emergent natural language processing task, aims at jointly investigating emotions and their underlying causes in documents. It extends the previous emotion cause extraction (ECE) task, yet without requiring a set of pre-given emotion clauses as in ECE. Existing approaches to ECPE generally adopt a two-stage method, i.e., (1) emotion and cause detection, and then (2) pairing the detected emotions and causes. Such a pipeline method, while intuitive, suffers from two critical issues: error propagation across stages that may hinder the effectiveness, and high computational cost that would limit the practical application of the method. To tackle these issues, we propose a multi-task learning model that can extract emotions, causes and emotion-cause pairs simultaneously in an end-to-end manner. Specifically, our model regards pair extraction as a link prediction task, and learns to link from emotion clauses to cause clauses, i.e., the links are directional. Emotion extraction and cause extraction are incorporated into the model as auxiliary tasks, which further boost the pair extraction. Experiments are conducted on an ECPE benchmarking dataset. The results show that our proposed model outperforms a range of state-of-the-art approaches in terms of both effectiveness and efficiency.
Published 2020-02-25
URL https://arxiv.org/abs/2002.10710v1
PDF https://arxiv.org/pdf/2002.10710v1.pdf
PWC https://paperswithcode.com/paper/end-to-end-emotion-cause-pair-extraction-via
Repo
Framework

#### Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data Modelling

Title Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data Modelling
Authors Yaqiong Li, Xuhui Fan, Ling Chen, Bin Li, Scott A. Sisson
Abstract The Dirichlet Belief Network (DirBN) has been recently proposed as a promising approach in learning interpretable deep latent representations for objects. In this work, we leverage its interpretable modelling architecture and propose a deep dynamic probabilistic framework – the Recurrent Dirichlet Belief Network (Recurrent-DBN) – to study interpretable hidden structures from dynamic relational data. The proposed Recurrent-DBN has the following merits: (1) it infers interpretable and organised hierarchical latent structures for objects within and across time steps; (2) it enables recurrent long-term temporal dependence modelling, which outperforms the one-order Markov descriptions in most of the dynamic probabilistic frameworks. In addition, we develop a new inference strategy, which first upward-and-backward propagates latent counts and then downward-and-forward samples variables, to enable efficient Gibbs sampling for the Recurrent-DBN. We apply the Recurrent-DBN to dynamic relational data problems. The extensive experiment results on real-world data validate the advantages of the Recurrent-DBN over the state-of-the-art models in interpretable latent structure discovery and improved link prediction performance.
Published 2020-02-24
URL https://arxiv.org/abs/2002.10235v1
PDF https://arxiv.org/pdf/2002.10235v1.pdf
PWC https://paperswithcode.com/paper/recurrent-dirichlet-belief-networks-for
Repo
Framework
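
#### Interpretable and Fair Comparison of Link Prediction or Entity Alignment Methods with Adjusted Mean Rank

Title Interpretable and Fair Comparison of Link Prediction or Entity Alignment Methods with Adjusted Mean Rank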
Authors Max Berrendorf, Evgeniy Faerman, Laurent Vermue, Volker Tresp
Abstract In this work, we take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment. In the current experimental setting, multiple different scores are employed to assess different aspects of model performance. We analyze the informative value of these evaluation measures and identify several shortcomings. In particular, we demonstrate that all existing scores can hardly be used to compare results across different datasets. Moreover, this problem may also arise when comparing different train/test splits for the same dataset. We show that this leads to various problems in the interpretation of results, which may support misleading conclusions. Therefore, we propose a different evaluation and demonstrate empirically how this helps for fair, comparable and interpretable assessment of model performance.
Published 2020-02-17
URL https://arxiv.org/abs/2002.06914v1
PDF https://arxiv.org/pdf/2002.06914v1.pdf
Repo
Framework
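
The abstract above does not spell out the proposed measure, so the following is only a sketch of one plausible reading of "adjusted mean rank": the observed mean rank divided by the mean rank expected from a uniformly random ordering of the candidates, which is (n + 1)/2 for a query with n candidates. The function names are hypothetical and the exact definition used in the paper may differ.

```python
def mean_rank(ranks):
    """Plain mean rank over all evaluated queries (1 is the best rank)."""
    return sum(ranks) / len(ranks)


def expected_mean_rank(num_candidates):
    """Mean rank expected under a uniformly random ordering: (n + 1) / 2 per query."""
    return sum((n + 1) / 2 for n in num_candidates) / len(num_candidates)


def adjusted_mean_rank(ranks, num_candidates):
    """Mean rank normalized by its chance level; values near 1 indicate random performance."""
    return mean_rank(ranks) / expected_mean_rank(num_candidates)


if __name__ == "__main__":
    # Same relative quality (rank in the middle of the candidate list) on two dataset sizes.
    small = adjusted_mean_rank([50, 50], [100, 100])
    large = adjusted_mean_rank([5000, 5000], [10000, 10000])
    print(mean_rank([50, 50]), mean_rank([5000, 5000]))  # raw mean ranks differ by 100x
    print(small, large)                                  # adjusted values are both close to 1
```

The raw mean rank grows with the number of candidates, which is the cross-dataset comparability problem the abstract describes; normalizing by the chance level removes that dependence.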

#### Scalable Dyadic Independence Models with Local and Global Constraints

Title Scalable Dyadic Independence Models with Local and Global Constraints
Authors Florian Adriaens, Alexandru Mara, Jefrey Lijffijt, Tijl De Bie
Abstract An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large networks. By utilizing matrix block-approximation techniques, we propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) models, while being able to meaningfully model local information (degrees) as well as global information (clustering coefficient, assortativity, etc.) if desired. This allows one to efficiently generate random networks with similar properties as an observed network, scalable up to sparse graphs consisting of millions of nodes. Empirical evaluation demonstrates its competitiveness in terms of accuracy with state-of-the-art methods for link prediction and network reconstruction.
Published 2020-02-14
URL https://arxiv.org/abs/2002.07076v1
PDF https://arxiv.org/pdf/2002.07076v1.pdf
Repo
Framework

#### A Closer Look at Small-loss Bounds for Bandits with Graph Feedback

Title A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
Authors Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang
Abstract We study small-loss bounds for the adversarial multi-armed bandits problem with graph feedback, that is, adaptive regret bounds that depend on the loss of the best arm or related quantities, instead of the total number of rounds. We derive the first small-loss bound for general strongly observable graphs, resolving an open problem proposed in (Lykouris et al., 2018). Specifically, we develop an algorithm with regret $\tilde{\mathcal{O}}(\sqrt{\kappa L_*})$ where $\kappa$ is the clique partition number and $L_*$ is the loss of the best arm, and for the special case where every arm has a self-loop, we improve the regret to $\tilde{\mathcal{O}}(\min\{\sqrt{\alpha T}, \sqrt{\kappa L_*}\})$ where $\alpha \leq \kappa$ is the independence number. Our results significantly improve and extend those by Lykouris et al. (2018) who only consider self-aware undirected graphs. Furthermore, we also take the first attempt at deriving small-loss bounds for weakly observable graphs. We first prove that no typical small-loss bounds are achievable in this case, and then propose algorithms with alternative small-loss bounds in terms of the loss of some specific subset of arms. A surprising side result is that $\tilde{\mathcal{O}}(\sqrt{T})$ regret is achievable even for weakly observable graphs as long as the best arm has a self-loop. Our algorithms are based on the Online Mirror Descent framework but require a suite of novel techniques that might be of independent interest. Moreover, all our algorithms can be made parameter-free without the knowledge of the environment.
Published 2020-02-02
URL https://arxiv.org/abs/2002.00315v1
PDF https://arxiv.org/pdf/2002.00315v1.pdf
PWC https://paperswithcode.com/paper/a-closer-look-at-small-loss-bounds-for
Repo
Framework

#### Incentivising Exploration and Recommendations for Contextual Bandits with Payments

Title Incentivising Exploration and Recommendations for Contextual Bandits with Payments
Authors Priyank Agrawal, Theja Tulabandhula
Abstract We propose a contextual bandit based model to capture the learning and social welfare goals of a web platform in the presence of myopic users. By using payments to incentivize these agents to explore different items/recommendations, we show how the platform can learn the inherent attributes of items and achieve a sublinear regret while maximizing cumulative social welfare. We also calculate theoretical bounds on the cumulative costs of incentivization to the platform. Unlike previous works in this domain, we consider contexts to be completely adversarial, and the behavior of the adversary is unknown to the platform. Our approach can improve various engagement metrics of users on e-commerce stores, recommendation engines and matching platforms.
Published 2020-01-22
URL https://arxiv.org/abs/2001.07853v1
PDF https://arxiv.org/pdf/2001.07853v1.pdf
PWC https://paperswithcode.com/paper/incentivising-exploration-and-recommendations
Repo
Framework

#### Graph Enhanced Representation Learning for News Recommendation

Title Graph Enhanced Representation Learning for News Recommendation
Authors Suyu Ge, Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
Abstract With the explosion of online news, personalized news recommendation becomes increasingly important for online news platforms to help their users find interesting information. Existing news recommendation methods achieve personalization by building accurate news representations from news content and user representations from their direct interactions with news (e.g., click), while ignoring the high-order relatedness between users and news. Here we propose a news recommendation method which can enhance the representation learning of users and news by modeling their relatedness in a graph setting. In our method, users and news are both viewed as nodes in a bipartite graph constructed from historical user click behaviors. For news representations, a transformer architecture is first exploited to build news semantic representations. Then we combine it with the information from neighbor news in the graph via a graph attention network. For user representations, we not only represent users from their historically clicked news, but also attentively incorporate the representations of their neighbor users in the graph. Improved performances on a large-scale real-world dataset validate the effectiveness of our proposed method.
Published 2020-03-31
URL https://arxiv.org/abs/2003.14292v1
PDF https://arxiv.org/pdf/2003.14292v1.pdf
PWC https://paperswithcode.com/paper/graph-enhanced-representation-learning-for
Repo
Framework

#### CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency

Title CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency
Authors Yun-Chun Chen, Yen-Yu Lin, Ming-Hsuan Yang, Jia-Bin Huang
Abstract Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another (e.g., synthetic to real images). The adapted representations often do not capture pixel-level domain shifts that are crucial for dense prediction tasks (e.g., semantic segmentation). In this paper, we present a novel pixel-wise adversarial domain adaptation algorithm. By leveraging image-to-image translation methods for data augmentation, our key insight is that while the translated images between domains may differ in styles, their predictions for the task should be consistent. We exploit this property and introduce a cross-domain consistency loss that enforces our adapted model to produce consistent predictions. Through extensive experimental results, we show that our method compares favorably against the state-of-the-art on a wide variety of unsupervised domain adaptation tasks.
Published 2020-01-09
URL https://arxiv.org/abs/2001.03182v1
PDF https://arxiv.org/pdf/2001.03182v1.pdf
PWC https://paperswithcode.com/paper/crdoco-pixel-level-domain-transfer-with-cross-1
Repo
Framework

#### Centimeter-Level Indoor Localization using Channel State Information with Recurrent Neural Networks

Title Centimeter-Level Indoor Localization using Channel State Information with Recurrent Neural Networks
Authors Jianyuan Yu, R. Michael Buehrer
Abstract Modern techniques in the Internet of Things or autonomous driving require more accurate positioning than ever. Classic location techniques are mainly suited to outdoor scenarios and do not meet the requirements of indoor cases with multiple paths. Meanwhile, as a feature robust to noise and time variations, Channel State Information (CSI) has shown its advantages over the Received Signal Strength Indicator (RSSI) for more accurate positioning. To this end, this paper proposes a neural network method to estimate centimeter-level indoor positions with real CSI data collected from linear antennas. It utilizes the amplitude of the channel response or a correlation matrix as the input, which can greatly reduce the data size and suppress the noise. It also makes use of the consistency in the user motion trajectory via a Recurrent Neural Network (RNN) and of signal-to-noise ratio (SNR) information, which can further improve the estimation accuracy, especially when learning from small datasets. These contributions all benefit the efficiency of the neural network, based on comparisons with other classic supervised learning methods.
Published 2020-02-04
URL https://arxiv.org/abs/2002.01411v1
PDF https://arxiv.org/pdf/2002.01411v1.pdf
PWC https://paperswithcode.com/paper/centimeter-level-indoor-localization-using
Repo
Framework

#### Supervised Categorical Metric Learning with Schatten p-Norms

Title Supervised Categorical Metric Learning with Schatten p-Norms
Authors Xuhui Fan, Eric Gaussier
Abstract Metric learning has been successful in learning new metrics adapted to numerical datasets. However, its development on categorical data still needs further exploration. In this paper, we propose a method, called CPML for categorical projected metric learning, that tries to efficiently (i.e., with less computational time and better prediction accuracy) address the problem of metric learning in categorical data. We make use of the Value Distance Metric to represent our data and propose new distances based on this representation. We then show how to efficiently learn new metrics. We also generalize several previous regularizers through the Schatten $p$-norm and provide a generalization bound for it that complements the standard generalization bound for metric learning. Experimental results show that our method provides
Published 2020-02-26
URL https://arxiv.org/abs/2002.11246v1
PDF https://arxiv.org/pdf/2002.11246v1.pdf
PWC https://paperswithcode.com/paper/supervised-categorical-metric-learning-with
Repo
Framework

#### Revisiting Training Strategies and Generalization Performance in Deep Metric Learning

Title Revisiting Training Strategies and Generalization Performance in Deep Metric Learning
Authors Karsten Roth, Timo Milbich, Samarth Sinha, Prateek Gupta, Björn Ommer, Joseph Paul Cohen
Abstract Deep Metric Learning (DML) is arguably one of the most influential lines of research for learning visual similarities with many proposed approaches every year. Although the field benefits from the rapid progress, the divergence in training protocols, architectures, and parameter choices makes an unbiased comparison difficult. To provide a consistent reference point, we revisit the most widely used DML objective functions and conduct a study of the crucial parameter choices as well as the commonly neglected mini-batch sampling process. Based on our analysis, we uncover a correlation between the embedding space compression and the generalization performance of DML models. Exploiting these insights, we propose a simple, yet effective, training regularization to reliably boost the performance of ranking-based DML models on various standard benchmark datasets.
Published 2020-02-19
URL https://arxiv.org/abs/2002.08473v4
PDF https://arxiv.org/pdf/2002.08473v4.pdf
PWC https://paperswithcode.com/paper/revisiting-training-strategies-and
Repo
Framework

#### An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality

Title An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality
Authors Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
Abstract Distances are pervasive in machine learning. They serve as similarity measures, loss functions, and learning targets; it is said that a good distance measure solves a task. When defining distances, the triangle inequality has proven to be a useful constraint, both theoretically–to prove convergence and optimality guarantees–and empirically–as an inductive bias. Deep metric learning architectures that respect the triangle inequality rely, almost exclusively, on Euclidean distance in the latent space. Though effective, this fails to model two broad classes of subadditive distances, common in graphs and reinforcement learning: asymmetric metrics, and metrics that cannot be embedded into Euclidean space. To address these problems, we introduce novel architectures that are guaranteed to satisfy the triangle inequality. We prove our architectures universally approximate norm-induced metrics on $\mathbb{R}^n$, and present a similar result for modified Input Convex Neural Networks. We show that our architectures outperform existing metric approaches when modeling graph distances and have a better inductive bias than non-metric approaches when training data is limited in the multi-goal reinforcement learning setting.
Tasks Metric Learning, Multi-Goal Reinforcement Learning
Published 2020-02-14
URL https://arxiv.org/abs/2002.05825v1
PDF https://arxiv.org/pdf/2002.05825v1.pdf
PWC https://paperswithcode.com/paper/an-inductive-bias-for-distances-neural-nets-1
Repo
Framework
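
To make the notion of an asymmetric distance that still respects the triangle inequality concrete, here is a small sketch. The distance d(x, y) = Σᵢ max(yᵢ − xᵢ, 0) is an illustrative choice and is not one of the architectures proposed in the paper; the helper names are hypothetical. The checker simply tests every triple of a finite point set.

```python
from itertools import product


def asymmetric_distance(x, y):
    """d(x, y) = sum_i max(y_i - x_i, 0): zero on the diagonal, subadditive, not symmetric."""
    return sum(max(b - a, 0.0) for a, b in zip(x, y))


def squared_euclidean(x, y):
    """Squared Euclidean distance, a common dissimilarity that violates the triangle inequality."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def satisfies_triangle_inequality(points, distance, tol=1e-12):
    """Check d(a, c) <= d(a, b) + d(b, c) for every ordered triple of points."""
    return all(
        distance(a, c) <= distance(a, b) + distance(b, c) + tol
        for a, b, c in product(points, repeat=3)
    )


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 2.0), (3.0, -1.0), (-2.0, 0.5)]
    print(asymmetric_distance(points[0], points[1]), asymmetric_distance(points[1], points[0]))
    print(satisfies_triangle_inequality(points, asymmetric_distance))
    print(satisfies_triangle_inequality([(0.0,), (1.0,), (2.0,)], squared_euclidean))
```

A symmetric Euclidean embedding cannot represent the first distance, since d(x, y) and d(y, x) differ, which is the kind of limitation the abstract points to.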

#### Exploiting the Matching Information in the Support Set for Few Shot Event Classification

Title Exploiting the Matching Information in the Support Set for Few Shot Event Classification
Authors Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen
Abstract The existing event classification (EC) work primarily focuses on the traditional supervised learning setting in which models are unable to extract event mentions of new/unseen event types. Few-shot learning has not been investigated in this area although it enables EC models to extend their operation to unobserved event types. To fill in this gap, in this work, we investigate event classification under the few-shot learning setting. We propose a novel training method for this problem that extensively exploits the support set during the training process of a few-shot learning model. In particular, in addition to matching the query example with those in the support set for training, we seek to further match the examples within the support set themselves. This method provides more training signals for the models and can be applied to every metric-learning-based few-shot learning method. Our extensive experiments on two benchmark EC datasets show that the proposed method can improve the best reported few-shot learning models by up to 10% on accuracy for event classification