February 1, 2020

Paper Group AWR 325

Re-Ranking Words to Improve Interpretability of Automatically Generated Topics

Title Re-Ranking Words to Improve Interpretability of Automatically Generated Topics
Authors Areej Alokaili, Nikolaos Aletras, Mark Stevenson
Abstract Topic models, such as LDA, are widely used in Natural Language Processing. Making their output interpretable is an important area of research with applications to areas such as the enhancement of exploratory search interfaces and the development of interpretable machine learning models. Conventionally, topics are represented by their n most probable words; however, these representations are often difficult for humans to interpret. This paper explores the re-ranking of topic words to generate more interpretable topic representations. A range of approaches are compared and evaluated in two experiments. The first uses crowdworkers to associate topics represented by different word rankings with related documents. The second experiment is an automatic approach based on a document retrieval task applied on multiple domains. Results in both experiments demonstrate that re-ranking words improves topic interpretability and that the most effective re-ranking schemes were those which combine information about the importance of words both within topics and their relative frequency in the entire corpus. In addition, close correlation between the results of the two evaluation approaches suggests that the automatic method proposed here could be used to evaluate re-ranking methods without the need for human judgements.
Tasks Interpretable Machine Learning
Published 2019-03-29
URL http://arxiv.org/abs/1903.12542v1
PDF http://arxiv.org/pdf/1903.12542v1.pdf
PWC https://paperswithcode.com/paper/re-ranking-words-to-improve-interpretability
Repo https://github.com/areejokaili/topic_reranking
Framework none
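
The abstract above reports that the most effective schemes combine a word's importance within its topic with its relative frequency in the whole corpus, but it does not give a formula. As a minimal sketch of that general idea, and not the authors' scoring function, the following re-ranks a topic's words by topic probability weighted by the log-ratio to corpus probability. The function name, the scoring formula, and the toy numbers are all assumptions made for illustration.

```python
import math


def rerank_topic_words(topic_probs, corpus_probs, n=3):
    """Return the n words with the highest re-ranked score.

    topic_probs:  word -> P(word | topic)
    corpus_probs: word -> P(word) over the whole corpus
    Score (hypothetical): P(w|t) * log(P(w|t) / P(w)).
    """
    scores = {
        word: p * math.log(p / corpus_probs[word])
        for word, p in topic_probs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy numbers, invented for illustration only.
topic = {"the": 0.30, "game": 0.20, "team": 0.15, "season": 0.10}
corpus = {"the": 0.30, "game": 0.02, "team": 0.01, "season": 0.01}

# Conventional representation: the n most probable words.
by_probability = sorted(topic, key=topic.get, reverse=True)[:3]
# Re-ranked representation: corpus-wide common words are pushed down.
by_rerank = rerank_topic_words(topic, corpus, n=3)
```

In this toy case the word "the" leads the probability ranking but drops out after re-ranking, because it is no more likely in the topic than in the corpus as a whole.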

Medical device surveillance with electronic health records

Title Medical device surveillance with electronic health records
Authors Alison Callahan, Jason A Fries, Christopher Ré, James I Huddleston III, Nicholas J Giori, Scott Delp, Nigam H Shah
Abstract Post-market medical device surveillance is a challenge facing manufacturers, regulatory agencies, and health care providers. Electronic health records are valuable sources of real world evidence to assess device safety and track device-related patient outcomes over time. However, distilling this evidence remains challenging, as information is fractured across clinical notes and structured records. Modern machine learning methods for machine reading promise to unlock increasingly complex information from text, but face barriers due to their reliance on large and expensive hand-labeled training sets. To address these challenges, we developed and validated state-of-the-art deep learning methods that identify patient outcomes from clinical notes without requiring hand-labeled training data. Using hip replacements as a test case, our methods accurately extracted implant details and reports of complications and pain from electronic health records with up to 96.3% precision, 98.5% recall, and 97.4% F1, improved classification performance by 12.7-53.0% over rule-based methods, and detected over 6 times as many complication events compared to using structured data alone. Using these events to assess complication-free survivorship of different implant systems, we found significant variation between implants, including for risk of revision surgery, which could not be detected using coded data alone. Patients with revision surgeries had more hip pain mentions in the post-hip replacement, pre-revision period compared to patients with no evidence of revision surgery (mean hip pain mentions 4.97 vs. 3.23; t = 5.14; p < 0.001). Some implant models were associated with higher or lower rates of hip pain mentions. Our methods complement existing surveillance mechanisms by requiring orders of magnitude less hand-labeled training data, offering a scalable solution for national medical device surveillance.
Tasks Reading Comprehension
Published 2019-04-03
URL http://arxiv.org/abs/1904.07640v1
PDF http://arxiv.org/pdf/1904.07640v1.pdf
PWC https://paperswithcode.com/paper/190407640
Repo https://github.com/som-shahlab/ehr-rwe
Framework pytorch

Go-Explore: a New Approach for Hard-Exploration Problems

Title Go-Explore: a New Approach for Hard-Exploration Problems
Authors Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune
Abstract A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma’s Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. It exploits the following principles: (1) remember previously visited states, (2) first return to a promising state (without exploration), then explore from it, and (3) solve simulated environments through any available means (including by introducing determinism), then robustify via imitation learning. The combined effect of these principles is a dramatic performance improvement on hard-exploration problems. On Montezuma’s Revenge, Go-Explore scores a mean of over 43k points, almost 4 times the previous state of the art. Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma’s Revenge. Its max performance of nearly 18 million surpasses the human world record, meeting even the strictest definition of “superhuman” performance. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero. Its mean score of almost 60k points exceeds expert human performance. Because Go-Explore produces high-performing demonstrations automatically and cheaply, it also outperforms imitation learning work where humans provide solution demonstrations. Go-Explore opens up many new research directions into improving it and weaving its insights into current RL algorithms. It may also enable progress on previously unsolvable hard-exploration problems in many domains, especially those that harness a simulator during training (e.g. robotics).
Tasks Atari Games, Imitation Learning, Montezuma’s Revenge
Published 2019-01-30
URL https://arxiv.org/abs/1901.10995v2
PDF https://arxiv.org/pdf/1901.10995v2.pdf
PWC https://paperswithcode.com/paper/go-explore-a-new-approach-for-hard
Repo https://github.com/DanieleGravina/divergence-and-quality-diversity
Framework none
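
Principles (1) and (2) from the abstract above, remembering visited states and returning to one before exploring, can be sketched on a toy problem. This is a simplified illustration, not the authors' implementation: cells are picked uniformly at random, "returning" is done by restoring a saved state of a deterministic simulator, and the robustification phase via imitation learning is not shown. The corridor environment, `GOAL`, and all function names are hypothetical.

```python
import random


def go_explore(step, start, cell_of, actions, iterations=500,
               explore_steps=10, seed=0):
    """Toy exploration phase: keep an archive of cells and explore from them.

    step(state, action) -> (next_state, reward) must be deterministic, so
    restoring a saved state stands in for returning to it.
    """
    rng = random.Random(seed)
    # cell -> (state, total reward, action trajectory that reached it)
    archive = {cell_of(start): (start, 0.0, [])}
    for _ in range(iterations):
        # Pick a remembered cell and return to it without exploring.
        state, score, traj = archive[rng.choice(list(archive))]
        traj = list(traj)
        # Then explore from it with random actions.
        for _ in range(explore_steps):
            action = rng.choice(actions)
            state, reward = step(state, action)
            score += reward
            traj.append(action)
            cell = cell_of(state)
            best = archive.get(cell)
            # Keep a cell if it is new, scores higher, or is reached faster.
            if (best is None or score > best[1]
                    or (score == best[1] and len(traj) < len(best[2]))):
                archive[cell] = (state, score, list(traj))
    return archive


GOAL = 10  # hypothetical sparse-reward corridor: reward only at the far end


def corridor_step(pos, action):
    if pos == GOAL:  # the goal is absorbing
        return pos, 0.0
    nxt = max(0, pos + action)
    return nxt, (1.0 if nxt == GOAL else 0.0)


archive = go_explore(corridor_step, start=0, cell_of=lambda s: s,
                     actions=[-1, 1])
```

Each archive entry stores the action sequence that reached its cell, so any entry can be replayed from the start state. In the paper's setting such trajectories would serve as the demonstrations that the later imitation-learning phase makes robust.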

Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation

Title Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation
Authors Hongwei Wang, Fuzheng Zhang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo
Abstract Collaborative filtering often suffers from sparsity and cold start problems in real recommendation scenarios; therefore, researchers and engineers usually use side information to address these issues and improve the performance of recommender systems. In this paper, we consider knowledge graphs as the source of side information. We propose MKR, a Multi-task feature learning approach for Knowledge graph enhanced Recommendation. MKR is a deep end-to-end framework that utilizes a knowledge graph embedding task to assist the recommendation task. The two tasks are associated by cross&compress units, which automatically share latent features and learn high-order interactions between items in recommender systems and entities in the knowledge graph. We prove that cross&compress units have sufficient capability of polynomial approximation, and show that MKR is a generalized framework over several representative methods of recommender systems and multi-task learning. Through extensive experiments on real-world datasets, we demonstrate that MKR achieves substantial gains in movie, book, music, and news recommendation, over state-of-the-art baselines. MKR is also shown to be able to maintain a decent performance even if user-item interactions are sparse.
Tasks Graph Embedding, Knowledge Graph Embedding, Knowledge Graphs, Multi-Task Learning, Recommendation Systems
Published 2019-01-23
URL http://arxiv.org/abs/1901.08907v1
PDF http://arxiv.org/pdf/1901.08907v1.pdf
PWC https://paperswithcode.com/paper/multi-task-feature-learning-for-knowledge
Repo https://github.com/hwwang55/MKR
Framework tf
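
The abstract names cross&compress units without defining them. One plausible reading, offered here as an assumption rather than the paper's definition, is that the item vector and the entity vector are first crossed into an outer-product matrix and then compressed back into two vectors by learned projections. The weight names below (`w_vv`, `w_ev`, `w_ve`, `w_ee`, `b_v`, `b_e`) and the exact form are hypothetical; the paper and the linked repository are the reference for the real unit.

```python
def cross_compress(v, e, w_vv, w_ev, w_ve, w_ee, b_v, b_e):
    """One cross&compress step on an item vector v and an entity vector e."""
    d = len(v)
    # Cross: every pairwise interaction, C[i][j] = v[i] * e[j].
    c = [[v[i] * e[j] for j in range(d)] for i in range(d)]

    def c_times(w):  # C @ w
        return [sum(c[i][j] * w[j] for j in range(d)) for i in range(d)]

    def ct_times(w):  # C^T @ w
        return [sum(c[j][i] * w[j] for j in range(d)) for i in range(d)]

    # Compress: project C and C^T back onto d-dimensional vectors.
    v_next = [a + b + bias
              for a, b, bias in zip(c_times(w_vv), ct_times(w_ev), b_v)]
    e_next = [a + b + bias
              for a, b, bias in zip(c_times(w_ve), ct_times(w_ee), b_e)]
    return v_next, e_next


# Toy integer inputs, invented for illustration only.
v_next, e_next = cross_compress(
    v=[1, 2], e=[3, 4],
    w_vv=[1, 0], w_ev=[0, 1], w_ve=[1, 1], w_ee=[0, 0],
    b_v=[0, 0], b_e=[1, 1],
)
```

Both outputs depend on both inputs, which is how a unit of this shape would let the recommendation task and the knowledge graph embedding task share latent features.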

Is a Single Embedding Enough? Learning Node Representations that Capture Multiple Social Contexts

Title Is a Single Embedding Enough? Learning Node Representations that Capture Multiple Social Contexts
Authors Alessandro Epasto, Bryan Perozzi
Abstract Recent interest in graph embedding methods has focused on learning a single representation for each node in the graph. But can nodes really be best described by a single vector representation? In this work, we propose a method for learning multiple representations of the nodes in a graph (e.g., the users of a social network). Based on a principled decomposition of the ego-network, each representation encodes the role of the node in a different local community in which the nodes participate. These representations allow for improved reconstruction of the nuanced relationships that occur in the graph – a phenomenon that we illustrate through state-of-the-art results on link prediction tasks on a variety of graphs, reducing the error by up to 90%. In addition, we show that these embeddings allow for effective visual analysis of the learned community structure.
Tasks Graph Embedding, Link Prediction
Published 2019-05-06
URL https://arxiv.org/abs/1905.02138v1
PDF https://arxiv.org/pdf/1905.02138v1.pdf
PWC https://paperswithcode.com/paper/is-a-single-embedding-enough-learning-node
Repo https://github.com/benedekrozemberczki/Splitter
Framework pytorch
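
The ego-network decomposition mentioned in the abstract can be illustrated with a small sketch. It assumes the simplest possible decomposition, connected components of a node's neighbourhood once the node itself is removed, as a stand-in for whatever clustering the authors use. Each resulting group would correspond to one local community and so to one representation of the node; learning the embeddings themselves is not shown. The function name and the toy graph are hypothetical.

```python
def ego_net_components(graph, node):
    """Group a node's neighbours by connectivity that does not pass through it.

    graph: dict mapping each node to a list of its neighbours (undirected).
    Returns a list of sorted neighbour groups, one per local community.
    """
    neighbours = set(graph[node])
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first search restricted to the ego-network of `node`.
        group, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in group:
                continue
            group.add(u)
            stack.extend(w for w in graph[u]
                         if w in neighbours and w not in group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Toy social graph: "u" has two friend circles that do not know each other.
graph = {
    "u": ["a", "b", "c", "d"],
    "a": ["u", "b"],
    "b": ["u", "a"],
    "c": ["u", "d"],
    "d": ["u", "c"],
}
personas = ego_net_components(graph, "u")
```

Here the node "u" takes part in two separate circles, so a single vector would have to average over both, whereas two representations can each describe one role.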

Pairwise Comparisons with Flexible Time-Dynamics

Title Pairwise Comparisons with Flexible Time-Dynamics
Authors Lucas Maystre, Victor Kristof, Matthias Grossglauser
Abstract Inspired by applications in sports where the skill of players or teams competing against each other varies over time, we propose a probabilistic model of pairwise-comparison outcomes that can capture a wide range of time dynamics. We achieve this by replacing the static parameters of a class of popular pairwise-comparison models by continuous-time Gaussian processes; the covariance function of these processes enables expressive dynamics. We develop an efficient inference algorithm that computes an approximate Bayesian posterior distribution. Despite the flexibility of our model, our inference algorithm requires only a few linear-time iterations over the data and can take advantage of modern multiprocessor computer architectures. We apply our model to several historical databases of sports outcomes and find that our approach outperforms competing approaches in terms of predictive performance, scales to millions of observations, and generates compelling visualizations that help in understanding and interpreting the data.
Tasks Bayesian Inference, Gaussian Processes
Published 2019-03-18
URL https://arxiv.org/abs/1903.07746v2
PDF https://arxiv.org/pdf/1903.07746v2.pdf
PWC https://paperswithcode.com/paper/linear-time-inference-for-pairwise
Repo https://github.com/lucasmaystre/kickscore-kdd19
Framework none
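
The core modelling step in the abstract, replacing static skill parameters with functions of time, can be sketched as follows. The sketch assumes a Bradley-Terry-style logistic model as one example of the "popular pairwise-comparison models" and a squared-exponential kernel as one example of a covariance function; the abstract names neither, and the inference algorithm is not shown. The skill curves and all function names are hypothetical.

```python
import math


def win_probability(skill_i, skill_j):
    """Bradley-Terry-style probability that i beats j, given their skills."""
    return 1.0 / (1.0 + math.exp(-(skill_i - skill_j)))


def squared_exponential(t1, t2, variance=1.0, lengthscale=1.0):
    """One possible covariance function: skills at nearby times are similar."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)


# Hypothetical skill curves: team A improves steadily, team B is constant.
def skill_a(t):
    return 0.5 * t


def skill_b(t):
    return 1.0


early = win_probability(skill_a(0.0), skill_b(0.0))
level = win_probability(skill_a(2.0), skill_b(2.0))
late = win_probability(skill_a(4.0), skill_b(4.0))
```

In a static model the outcome probability for a given pair never changes. Here it moves from favouring B, through an even match at t = 2, to favouring A. In the Gaussian-process setting the skill curves would be inferred from match outcomes rather than written by hand, with the covariance function controlling how quickly skills may drift.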

The Natural Selection of Words: Finding the Features of Fitness

Title The Natural Selection of Words: Finding the Features of Fitness
Authors Peter D. Turney, Saif M. Mohammad
Abstract We introduce a dataset for studying the evolution of words, constructed from WordNet and the Google Books Ngram Corpus. The dataset tracks the evolution of 4,000 synonym sets (synsets), containing 9,000 English words, from 1800 AD to 2000 AD. We present a supervised learning algorithm that is able to predict the future leader of a synset: the word in the synset that will have the highest frequency. The algorithm uses features based on a word’s length, the characters in the word, and the historical frequencies of the word. It can predict change of leadership (including the identity of the new leader) fifty years in the future, with an F-score considerably above random guessing. Analysis of the learned models provides insight into the causes of change in the leader of a synset. The algorithm confirms observations linguists have made, such as the trend to replace the -ise suffix with -ize, the rivalry between the -ity and -ness suffixes, and the struggle between economy (shorter words are easier to remember and to write) and clarity (longer words are more distinctive and less likely to be confused with one another). The results indicate that integration of the Google Books Ngram Corpus with WordNet has significant potential for improving our understanding of how language evolves.
Tasks
Published 2019-08-19
URL https://arxiv.org/abs/1908.07013v1
PDF https://arxiv.org/pdf/1908.07013v1.pdf
PWC https://paperswithcode.com/paper/the-natural-selection-of-words-finding-the
Repo https://github.com/pdturney/natural-selection-of-words
Framework none

Stein’s Lemma for the Reparameterization Trick with Exponential Family Mixtures

Title Stein’s Lemma for the Reparameterization Trick with Exponential Family Mixtures
Authors Wu Lin, Mohammad Emtiyaz Khan, Mark Schmidt
Abstract Stein’s method (Stein, 1973; 1981) is a powerful tool for statistical applications, and has had a significant impact in machine learning. Stein’s lemma plays an essential role in Stein’s method. Previous applications of Stein’s lemma either required strong technical assumptions or were limited to Gaussian distributions with restricted covariance structures. In this work, we extend Stein’s lemma to exponential-family mixture distributions including Gaussian distributions with full covariance structures. Our generalization enables us to establish a connection between Stein’s lemma and the reparameterization trick to derive gradients of expectations of a large class of functions under weak assumptions. Using this connection, we can derive many new reparameterizable gradient-identities that go beyond the reach of existing works. For example, we give gradient identities when expectation is taken with respect to Student’s t-distribution, skew Gaussian, exponentially modified Gaussian, and normal inverse Gaussian.
Tasks
Published 2019-10-29
URL https://arxiv.org/abs/1910.13398v1
PDF https://arxiv.org/pdf/1910.13398v1.pdf
PWC https://paperswithcode.com/paper/191013398
Repo https://github.com/yorkerlin/VB-MixEF
Framework none
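
For orientation, the classical Gaussian form of Stein's lemma, the starting point that the abstract says is being generalized, can be written as below. This is the standard textbook statement and not the paper's extended result; it assumes h is differentiable and that the expectations exist.

```latex
% Classical Stein's lemma for x ~ N(mu, Sigma), with h differentiable:
\mathbb{E}_{\mathcal{N}(x \mid \mu, \Sigma)}\big[ (x - \mu)\, h(x) \big]
  = \Sigma \, \mathbb{E}_{\mathcal{N}(x \mid \mu, \Sigma)}\big[ \nabla_x h(x) \big]

% A consequence often used alongside the reparameterization trick
% (Bonnet's theorem):
\nabla_\mu \, \mathbb{E}_{\mathcal{N}(x \mid \mu, \Sigma)}\big[ h(x) \big]
  = \mathbb{E}_{\mathcal{N}(x \mid \mu, \Sigma)}\big[ \nabla_x h(x) \big]
```

The second identity moves the gradient with respect to a distribution parameter inside the expectation as a gradient of h, which is the kind of reparameterizable gradient identity the abstract describes for a wider family of distributions.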

Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models

Title Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models
Authors Oren Melamud, Chaitanya Shivade
Abstract Large-scale clinical data is invaluable to driving many computational scientific advances today. However, understandable concerns regarding patient privacy hinder the open dissemination of such data and give rise to suboptimal siloed research. De-identification methods attempt to address these concerns but were shown to be susceptible to adversarial attacks. In this work, we focus on the vast amounts of unstructured natural language data stored in clinical notes and propose to automatically generate synthetic clinical notes that are more amenable to sharing using generative models trained on real de-identified records. To evaluate the merit of such notes, we measure both their privacy preservation properties as well as utility in training clinical NLP models. Experiments using neural language models yield notes whose utility is close to that of the real ones in some clinical NLP tasks, yet leave ample room for future improvements.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.07002v2
PDF https://arxiv.org/pdf/1905.07002v2.pdf
PWC https://paperswithcode.com/paper/towards-automatic-generation-of-shareable
Repo https://github.com/orenmel/synth-clinical-notes
Framework pytorch

Graph Dynamical Networks for Unsupervised Learning of Atomic Scale Dynamics in Materials

Title Graph Dynamical Networks for Unsupervised Learning of Atomic Scale Dynamics in Materials
Authors Tian Xie, Arthur France-Lanord, Yanming Wang, Yang Shao-Horn, Jeffrey C. Grossman
Abstract Understanding the dynamical processes that govern the performance of functional materials is essential for the design of next generation materials to tackle global energy and environmental challenges. Many of these processes involve the dynamics of individual atoms or small molecules in condensed phases, e.g. lithium ions in electrolytes, water molecules in membranes, molten atoms at interfaces, etc., which are difficult to understand due to the complexity of local environments. In this work, we develop graph dynamical networks, an unsupervised learning approach for understanding atomic scale dynamics in arbitrary phases and environments from molecular dynamics simulations. We show that important dynamical information can be learned for various multi-component amorphous material systems, which is difficult to obtain otherwise. With the large amounts of molecular dynamics data generated every day in nearly every aspect of materials design, this approach provides a broadly useful, automated tool to understand atomic scale dynamics in material systems.
Tasks
Published 2019-02-18
URL https://arxiv.org/abs/1902.06836v2
PDF https://arxiv.org/pdf/1902.06836v2.pdf
PWC https://paperswithcode.com/paper/graph-dynamical-networks-unsupervised
Repo https://github.com/txie-93/gdynet
Framework tf

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

Title GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations
Authors Xiang Gao, Wei Hu, Guo-Jun Qi
Abstract Recent advances in Graph Convolutional Neural Networks (GCNNs) have shown their efficiency for non-Euclidean data on graphs, which often require a large amount of labeled data with high cost. It is thus critical to learn graph feature representations in an unsupervised manner in practice. To this end, we propose a novel unsupervised learning of Graph Transformation Equivariant Representations (GraphTER), aiming to capture intrinsic patterns of graph structure under both global and local transformations. Specifically, we allow different groups of nodes to be sampled from a graph and then transformed node-wise isotropically or anisotropically. Then, we self-train a representation encoder to capture the graph structures by reconstructing these node-wise transformations from the feature representations of the original and transformed graphs. In experiments, we apply the learned GraphTER to graphs of 3D point cloud data, and results on point cloud segmentation/classification show that GraphTER significantly outperforms state-of-the-art unsupervised approaches and pushes greatly closer towards the upper bound set by the fully supervised counterparts. The code is available at: https://github.com/gyshgx868/graph-ter.
Tasks
Published 2019-11-19
URL https://arxiv.org/abs/1911.08142v2
PDF https://arxiv.org/pdf/1911.08142v2.pdf
PWC https://paperswithcode.com/paper/graphter-unsupervised-learning-of-graph
Repo https://github.com/gyshgx868/graph-ter
Framework pytorch

MINA: Multilevel Knowledge-Guided Attention for Modeling Electrocardiography Signals

Title MINA: Multilevel Knowledge-Guided Attention for Modeling Electrocardiography Signals
Authors Shenda Hong, Cao Xiao, Tengfei Ma, Hongyan Li, Jimeng Sun
Abstract Electrocardiography (ECG) signals are commonly used to diagnose various cardiac abnormalities. Recently, deep learning models showed initial success on modeling ECG data; however, they are mostly black boxes and thus lack the interpretability needed for clinical usage. In this work, we propose MultIlevel kNowledge-guided Attention networks (MINA) that predict heart diseases from ECG signals with intuitive explanation aligned with medical knowledge. By extracting multilevel (beat-, rhythm- and frequency-level) domain knowledge features separately, MINA combines the medical knowledge and ECG data via a multilevel attention model, making the learned models highly interpretable. Our experiments showed MINA achieved PR-AUC 0.9436 (outperforming the best baseline by 5.51%) on a real-world ECG dataset. Finally, MINA also demonstrated robust performance and strong interpretability against signal distortion and noise contamination.
Tasks Electrocardiography (ECG)
Published 2019-05-27
URL https://arxiv.org/abs/1905.11333v3
PDF https://arxiv.org/pdf/1905.11333v3.pdf
PWC https://paperswithcode.com/paper/mina-multilevel-knowledge-guided-attention
Repo https://github.com/hsd1503/MINA
Framework pytorch

Distilling Structured Knowledge into Embeddings for Explainable and Accurate Recommendation

Title Distilling Structured Knowledge into Embeddings for Explainable and Accurate Recommendation
Authors Yuan Zhang, Xiaoran Xu, Hanning Zhou, Yan Zhang
Abstract Recently, the embedding-based recommendation models (e.g., matrix factorization and deep models) have been prevalent in both academia and industry due to their effectiveness and flexibility. However, they also have such intrinsic limitations as lacking explainability and suffering from data sparsity. In this paper, we propose an end-to-end joint learning framework to get around these limitations without introducing any extra overhead by distilling structured knowledge from a differentiable path-based recommendation model. Through extensive experiments, we show that our proposed framework can achieve state-of-the-art recommendation performance and meanwhile provide interpretable recommendation reasons.
Tasks
Published 2019-12-18
URL https://arxiv.org/abs/1912.08422v1
PDF https://arxiv.org/pdf/1912.08422v1.pdf
PWC https://paperswithcode.com/paper/distilling-structured-knowledge-into
Repo https://github.com/yuan-pku/Distilling-Structured-Knowledge-into-Embeddings-for-Explainable-and-Accurate-Recommendation
Framework none

Facial age estimation by deep residual decision making

Title Facial age estimation by deep residual decision making
Authors Shichao Li, Kwang-Ting Cheng
Abstract Residual representation learning simplifies the optimization problem of learning complex functions and has been widely used by traditional convolutional neural networks. However, it has not been applied to deep neural decision forest (NDF). In this paper we incorporate residual learning into NDF and the resulting model achieves state-of-the-art level accuracy on three public age estimation benchmarks while requiring less memory and computation. We further employ a gradient-based technique to visualize the decision-making process of NDF and understand how it is influenced by facial image inputs. The code and pre-trained models will be available at https://github.com/Nicholasli1995/VisualizingNDF.
Tasks Age Estimation, Decision Making, Representation Learning
Published 2019-08-28
URL https://arxiv.org/abs/1908.10737v1
PDF https://arxiv.org/pdf/1908.10737v1.pdf
PWC https://paperswithcode.com/paper/facial-age-estimation-by-deep-residual
Repo https://github.com/Nicholasli1995/VisualizingNDF
Framework pytorch

Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks

Title Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
Authors Yi Tay, Aston Zhang, Luu Anh Tuan, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui
Abstract Many state-of-the-art neural models for NLP are heavily parameterized and thus memory inefficient. This paper proposes a series of lightweight and memory efficient neural architectures for a potpourri of natural language processing (NLP) tasks. To this end, our models exploit computation using Quaternion algebra and hypercomplex spaces, enabling not only expressive inter-component interactions but also significantly (75%) reduced parameter size due to fewer degrees of freedom in the Hamilton product. We propose Quaternion variants of models, giving rise to new architectures such as the Quaternion attention Model and Quaternion Transformer. Extensive experiments on a battery of NLP tasks demonstrate the utility of proposed Quaternion-inspired models, enabling up to 75% reduction in parameter size without significant loss in performance.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04393v1
PDF https://arxiv.org/pdf/1906.04393v1.pdf
PWC https://paperswithcode.com/paper/lightweight-and-efficient-neural-natural
Repo https://github.com/vanzytay/QuaternionTransformers
Framework tf
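
The 75% figure in the abstract follows from how the Hamilton product reuses weights, which a short sketch can make concrete. The product below is the standard quaternion multiplication. The parameter count is a simplified illustration that ignores biases and compares a real-valued dense layer with a quaternion one over the same number of real components; the function names and layer sizes are hypothetical and not taken from the paper's code.

```python
def hamilton_product(q, p):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def parameter_counts(n_in, n_out):
    """Weights needed to map n_in quaternions to n_out quaternions."""
    # Real-valued layer: every one of 4*n_in inputs meets 4*n_out outputs.
    real = (4 * n_in) * (4 * n_out)
    # Quaternion layer: one 4-component weight per input/output pair.
    quaternion = 4 * n_in * n_out
    return real, quaternion


i, j = (0, 1, 0, 0), (0, 0, 1, 0)
ij = hamilton_product(i, j)   # k
ji = hamilton_product(j, i)   # -k, the product is not commutative

real, quat = parameter_counts(64, 64)
saving = 1 - quat / real
```

A single quaternion weight mixes all four components of its input while storing only 4 numbers, where an unconstrained real map on the same 4 components would store 16. That ratio of 4 to 16 is where the saving of three quarters comes from.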