Paper Group ANR 441
Deep Euler method: solving ODEs by approximating the local truncation error of the Euler method
Title | Deep Euler method: solving ODEs by approximating the local truncation error of the Euler method |
Authors | Xing Shen, Xiaoliang Cheng, Kewei Liang |
Abstract | In this paper, we propose a deep learning-based method, the deep Euler method (DEM), to solve ordinary differential equations. DEM significantly improves the accuracy of the Euler method by approximating the local truncation error with deep neural networks, which allows a high-precision solution to be obtained with a large step size. The deep neural network in DEM is mesh-free during training and shows good generalization in unmeasured regions. DEM can easily be combined with other numerical schemes, such as the Runge-Kutta method, to obtain better solutions. Furthermore, the error bound and stability of DEM are discussed. |
Tasks | |
Published | 2020-03-21 |
URL | https://arxiv.org/abs/2003.09573v1 |
PDF | https://arxiv.org/pdf/2003.09573v1.pdf |
PWC | https://paperswithcode.com/paper/deep-euler-method-solving-odes-by |
Repo | |
Framework | |
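A minimal sketch of the update rule described in the abstract above, assuming a trained network `err_net` (a hypothetical interface) that approximates the Euler method's local truncation error; the h² scaling reflects the order of that error and is one plausible parameterization, not necessarily the paper's exact form.

```python
import numpy as np

def dem_step(f, err_net, t, y, h):
    # Classic Euler update plus a learned correction standing in for the
    # local truncation error (which is O(h^2) for the Euler method).
    return y + h * f(t, y) + h ** 2 * err_net(t, y, h)

# Usage on y' = -y; with a zero "network" this reduces to plain Euler.
f = lambda t, y: -y
err_net = lambda t, y, h: np.zeros_like(y)  # stand-in for the trained DNN
t, y, h = 0.0, np.array([1.0]), 0.1
for _ in range(10):
    y = dem_step(f, err_net, t, y, h)
    t += h
print(y)  # approximate solution at t = 1.0
```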
Reward-rational (implicit) choice: A unifying formalism for reward learning
Title | Reward-rational (implicit) choice: A unifying formalism for reward learning |
Authors | Hong Jun Jeon, Smitha Milli, Anca D. Dragan |
Abstract | It is often difficult to hand-specify what the correct reward function is for a task, so researchers have instead aimed to learn reward functions from human behavior or feedback. The types of behavior interpreted as evidence of the reward function have expanded greatly in recent years. We’ve gone from demonstrations, to comparisons, to reading into the information leaked when the human is pushing the robot away or turning it off. And surely, there is more to come. How will a robot make sense of all these diverse types of behavior? Our key insight is that different types of behavior can be interpreted in a single unifying formalism - as a reward-rational choice that the human is making, often implicitly. The formalism offers both a unifying lens with which to view past work, as well as a recipe for interpreting new sources of information that are yet to be uncovered. We provide two examples to showcase this: interpreting a new feedback type, and reading into how the choice of feedback itself leaks information about the reward. |
Tasks | |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04833v1 |
PDF | https://arxiv.org/pdf/2002.04833v1.pdf |
PWC | https://paperswithcode.com/paper/reward-rational-implicit-choice-a-unifying |
Repo | |
Framework | |
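A small sketch of the common instantiation of this formalism as Boltzmann-rational choice: the human picks an option from a set with probability proportional to the exponentiated reward. The linear reward, the feature vectors, and the rationality coefficient `beta` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def choice_likelihood(features, theta, beta=1.0):
    # P(c | theta) is proportional to exp(beta * theta . phi(c)) over the option set.
    logits = beta * (features @ theta)
    logits -= logits.max()            # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Three candidate options described by 2-d features; likelihood of each
# choice under a candidate reward parameter theta.
phi = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
theta = np.array([0.2, 0.8])
print(choice_likelihood(phi, theta))
```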
Improving Molecular Design by Stochastic Iterative Target Augmentation
Title | Improving Molecular Design by Stochastic Iterative Target Augmentation |
Authors | Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola |
Abstract | Generative models in molecular design tend to be richly parameterized, data-hungry neural models, as they must create complex structured objects as outputs. Estimating such models from data may be challenging due to the lack of sufficient training data. In this paper, we propose a surprisingly effective self-training approach for iteratively creating additional molecular targets. We first pre-train the generative model together with a simple property predictor. The property predictor is then used as a likelihood model for filtering candidate structures from the generative model. Additional targets are iteratively produced and used in the course of stochastic EM iterations to maximize the log-likelihood that the candidate structures are accepted. A simple rejection (re-weighting) sampler suffices to draw posterior samples since the generative model is already reasonable after pre-training. We demonstrate significant gains over strong baselines for both unconditional and conditional molecular design. In particular, our approach outperforms the previous state-of-the-art in conditional molecular design by over 10% in absolute gain. |
Tasks | |
Published | 2020-02-11 |
URL | https://arxiv.org/abs/2002.04720v1 |
PDF | https://arxiv.org/pdf/2002.04720v1.pdf |
PWC | https://paperswithcode.com/paper/improving-molecular-design-by-stochastic |
Repo | |
Framework | |
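A hedged sketch of the self-training loop described above: sample candidate targets from the generator, keep those the property predictor accepts, and retrain on the augmented pairs. `gen.sample`, `gen.train`, `predictor.prob_accept`, and the dataset fields are hypothetical interfaces used only to show the control flow.

```python
def iterative_target_augmentation(gen, predictor, dataset,
                                  rounds=5, k=20, threshold=0.5):
    for _ in range(rounds):
        new_pairs = []
        for x in dataset.inputs:
            candidates = gen.sample(x, k)               # propose candidate molecules
            accepted = [c for c in candidates           # rejection filtering by the
                        if predictor.prob_accept(x, c) > threshold]  # property predictor
            new_pairs += [(x, c) for c in accepted]
        gen.train(dataset.pairs + new_pairs)            # retrain on augmented targets
    return gen
```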
Automatic Differentiation and Continuous Sensitivity Analysis of Rigid Body Dynamics
Title | Automatic Differentiation and Continuous Sensitivity Analysis of Rigid Body Dynamics |
Authors | David Millard, Eric Heiden, Shubham Agrawal, Gaurav S. Sukhatme |
Abstract | A key ingredient to achieving intelligent behavior is physical understanding that equips robots with the ability to reason about the effects of their actions in a dynamic environment. Several methods have been proposed to learn dynamics models from data that inform model-based control algorithms. While such learning-based approaches can model locally observed behaviors, they fail to generalize to more complex dynamics and under long time horizons. In this work, we introduce a differentiable physics simulator for rigid body dynamics. Leveraging various techniques for differential equation integration and gradient calculation, we compare different methods for parameter estimation that allow us to infer the simulation parameters that are relevant to estimation and control of physical systems. In the context of trajectory optimization, we introduce a closed-loop model-predictive control algorithm that infers the simulation parameters through experience while achieving cost-minimizing performance. |
Tasks | |
Published | 2020-01-22 |
URL | https://arxiv.org/abs/2001.08539v1 |
PDF | https://arxiv.org/pdf/2001.08539v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-differentiation-and-continuous |
Repo | |
Framework | |
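A toy illustration of the underlying idea (not the paper's simulator): roll out simple damped-spring dynamics with a Euler-type integrator in PyTorch and backpropagate through the rollout to recover a physical parameter from an observed trajectory. All constants are illustrative.

```python
import torch

def rollout(damping, x0, v0, dt=0.01, steps=200):
    x, v, xs = x0, v0, []
    for _ in range(steps):
        a = -4.0 * x - damping * v     # spring + damping, both differentiable
        v = v + dt * a
        x = x + dt * v
        xs.append(x)
    return torch.stack(xs)

target = rollout(torch.tensor(0.5), torch.tensor(1.0), torch.tensor(0.0))
damping = torch.tensor(0.1, requires_grad=True)
opt = torch.optim.Adam([damping], lr=0.05)
for _ in range(200):
    opt.zero_grad()
    loss = ((rollout(damping, torch.tensor(1.0), torch.tensor(0.0)) - target) ** 2).mean()
    loss.backward()
    opt.step()
print(damping.item())                  # should move towards the true value 0.5
```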
Adequate and fair explanations
Title | Adequate and fair explanations |
Authors | Nicholas Asher, Soumya Paul, Chris Russell |
Abstract | Explaining sophisticated machine-learning-based systems is an important issue at the foundations of AI. Recent efforts have shown various methods for providing explanations. These approaches can be broadly divided into two schools: those that provide a local and human-interpretable approximation of a machine learning algorithm, and logical approaches that exactly characterise one aspect of the decision. In this paper we focus upon the second school of exact explanations with a rigorous logical foundation. There is an epistemological problem with these exact methods. While they can furnish complete explanations, such explanations may be too complex for humans to understand or even to write down in human-readable form. Interpretability requires epistemically accessible explanations, explanations humans can grasp. Yet what counts as a sufficiently complete, epistemically accessible explanation still needs clarification. We do this here in terms of counterfactuals, following [Wachter et al., 2017]. With counterfactual explanations, many of the assumptions needed to provide a complete explanation are left implicit. To do so, counterfactual explanations exploit the properties of a particular data point or sample, and as such are also local as well as partial explanations. We explore how to move from local partial explanations to what we call complete local explanations and then to global ones. But to preserve accessibility we argue for the need for partiality. This partiality makes it possible to hide explicit biases present in the algorithm that may be injurious or unfair. We investigate how easy it is to uncover these biases in providing complete and fair explanations by exploiting the structure of the set of counterfactuals providing a complete local explanation. |
Tasks | |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07578v1 |
PDF | https://arxiv.org/pdf/2001.07578v1.pdf |
PWC | https://paperswithcode.com/paper/adequate-and-fair-explanations |
Repo | |
Framework | |
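A minimal sketch of the counterfactual notion the paper builds on (in the spirit of Wachter et al., 2017): among candidate points that the model classifies differently from x, return the closest one. The candidate set, distance, and classifier are assumed inputs.

```python
import numpy as np

def nearest_counterfactual(x, predict, candidates):
    # Keep candidates whose prediction differs from that of x, then pick
    # the one closest to x under the Euclidean distance.
    y = predict(x)
    flipped = [c for c in candidates if predict(c) != y]
    return min(flipped, key=lambda c: np.linalg.norm(c - x)) if flipped else None
```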
Towards Learning Representations of Binary Executable Files for Security Tasks
Title | Towards Learning Representations of Binary Executable Files for Security Tasks |
Authors | Shushan Arakelyan, Christophe Hauser, Erik Kline, Aram Galstyan |
Abstract | Tackling binary analysis problems has traditionally implied manually defining rules and heuristics. As an alternative, we suggest using machine learning models to learn distributed representations of binaries that can be applicable for a number of downstream tasks. We construct a computational graph from the binary executable and use it with a graph convolutional neural network to learn a high-dimensional representation of the program. We show the versatility of this approach by using our representations to solve two semantically different binary analysis tasks – algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as published results and demonstrate improvement over state-of-the-art methods for both tasks. |
Tasks | |
Published | 2020-02-09 |
URL | https://arxiv.org/abs/2002.03388v1 |
PDF | https://arxiv.org/pdf/2002.03388v1.pdf |
PWC | https://paperswithcode.com/paper/towards-learning-representations-of-binary |
Repo | |
Framework | |
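A generic sketch of the propagation step a graph convolutional network applies to such a program graph (self-loops, symmetric normalisation, ReLU); the adjacency matrix, features, and weights are illustrative, and this is not the authors' exact architecture.

```python
import numpy as np

def gcn_layer(A, X, W):
    # A: adjacency of the program graph, X: node features, W: learned weights.
    A_hat = A + np.eye(A.shape[0])                          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))  # symmetric normalisation
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # toy 3-node graph
H = gcn_layer(A, np.random.rand(3, 8), np.random.rand(8, 4))
print(H.shape)  # (3, 4) node representations
```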
Graph Representation Learning via Graphical Mutual Information Maximization
Title | Graph Representation Learning via Graphical Mutual Information Maximization |
Authors | Zhen Peng, Wenbing Huang, Minnan Luo, Qinghua Zheng, Yu Rong, Tingyang Xu, Junzhou Huang |
Abstract | The richness in the content of various information networks such as social networks and communication networks provides the unprecedented potential for learning high-quality expressive representations without external supervision. This paper investigates how to preserve and extract the abundant information from graph-structured data into embedding space in an unsupervised manner. To this end, we propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations. GMI generalizes the idea of conventional mutual information computations from vector space to the graph domain, where measuring mutual information from the two aspects of node features and topological structure is indispensable. GMI exhibits several benefits: First, it is invariant to the isomorphic transformation of input graphs—an inevitable constraint in many existing graph representation learning algorithms; second, it can be efficiently estimated and maximized by current mutual information estimation methods such as MINE; finally, our theoretical analysis confirms its correctness and rationality. With the aid of GMI, we develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder. Extensive experiments on transductive as well as inductive node classification and link prediction demonstrate that our method outperforms state-of-the-art unsupervised counterparts, and even sometimes exceeds the performance of supervised ones. |
Tasks | Graph Representation Learning, Link Prediction, Node Classification, Representation Learning |
Published | 2020-02-04 |
URL | https://arxiv.org/abs/2002.01169v1 |
PDF | https://arxiv.org/pdf/2002.01169v1.pdf |
PWC | https://paperswithcode.com/paper/graph-representation-learning-via-graphical |
Repo | |
Framework | |
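A short sketch of the Donsker–Varadhan lower bound that MINE-style estimators (mentioned in the abstract) maximise; the critic `T` scoring (input, representation) pairs is an assumed small network, and this shows only the generic bound, not GMI's graph-specific decomposition.

```python
import torch

def dv_lower_bound(T, x, y_joint, y_marginal):
    # I(X; Y) >= E_joint[T(x, y)] - log E_marginal[exp T(x, y)]
    joint = T(x, y_joint).mean()
    n = float(y_marginal.shape[0])
    marginal = torch.logsumexp(T(x, y_marginal), dim=0) - torch.log(torch.tensor(n))
    return joint - marginal

# Toy critic: dot product of the two vectors; correlated pairs vs. shuffled ones.
T = lambda a, b: (a * b).sum(dim=1)
x = torch.randn(64, 16)
print(dv_lower_bound(T, x, x + 0.1 * torch.randn(64, 16), torch.randn(64, 16)))
```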
A Framework for Generating Explanations from Temporal Personal Health Data
Title | A Framework for Generating Explanations from Temporal Personal Health Data |
Authors | Jonathan J. Harris, Ching-Hua Chen, Mohammed J. Zaki |
Abstract | While it has become easier for individuals to track their personal health data (e.g., heart rate, step count, food log), there is still a wide chasm between the collection of data and the generation of meaningful explanations to help users better understand what their data means to them. With an increased comprehension of their data, users will be able to act upon the newfound information and work towards their health goals. We aim to bridge the gap between data collection and explanation generation by mining the data for interesting behavioral findings that may provide hints about a user’s tendencies. Our focus is on improving the explainability of temporal personal health data via a set of informative summary templates, or “protoforms.” These protoforms span both evaluation-based summaries that help users evaluate their health goals and pattern-based summaries that explain their implicit behaviors. In addition to individual users, the protoforms we use are also designed for population-level summaries. We apply our approach to generate summaries (both univariate and multivariate) from real user data and show that our system can generate interesting and useful explanations. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09530v1 |
PDF | https://arxiv.org/pdf/2003.09530v1.pdf |
PWC | https://paperswithcode.com/paper/a-framework-for-generating-explanations-from |
Repo | |
Framework | |
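A tiny sketch of an evaluation-based protoform of the form "On &lt;quantifier&gt; days, your &lt;attribute&gt; was &lt;descriptor&gt;": compute the fraction of days satisfying a predicate and pick the strongest matching quantifier. The quantifier cut-offs and the step-count example are illustrative assumptions, not the paper's templates.

```python
def protoform_summary(values, predicate, attribute, descriptor, quantifiers):
    frac = sum(predicate(v) for v in values) / len(values)
    for label, cutoff in quantifiers:          # ordered from strongest to weakest
        if frac >= cutoff:
            return f"On {label} days, your {attribute} was {descriptor} ({frac:.0%})."
    return f"Your {attribute} was rarely {descriptor}."

steps = [9500, 12000, 4000, 11000, 10400, 9800, 7000]
quantifiers = [("almost all", 0.85), ("most", 0.6), ("some", 0.3)]
print(protoform_summary(steps, lambda s: s >= 9000, "step count", "high", quantifiers))
```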
Tri-graph Information Propagation for Polypharmacy Side Effect Prediction
Title | Tri-graph Information Propagation for Polypharmacy Side Effect Prediction |
Authors | Hao Xu, Shengqi Sang, Haiping Lu |
Abstract | The use of drug combinations often leads to polypharmacy side effects (POSE). A recent method formulates POSE prediction as a link prediction problem on a graph of drugs and proteins, and solves it with Graph Convolutional Networks (GCNs). However, due to the complex relationships in POSE, this method has high computational cost and memory demand. This paper proposes a flexible Tri-graph Information Propagation (TIP) model that operates on three subgraphs to learn representations progressively by propagation from the protein-protein graph to the drug-drug graph via the protein-drug graph. Experiments show that TIP improves accuracy by 7%+, time efficiency by 83×, and space efficiency by 3×. |
Tasks | Link Prediction, Pose Prediction |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10516v1 |
PDF | https://arxiv.org/pdf/2001.10516v1.pdf |
PWC | https://paperswithcode.com/paper/tri-graph-information-propagation-for |
Repo | |
Framework | |
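A rough sketch of the propagation order described in the abstract: update protein embeddings on the protein-protein graph, push them to drugs through the protein-drug graph, then refine drug embeddings on the drug-drug graph. The adjacency matrices, weights, and simple ReLU updates are illustrative stand-ins for the paper's layers.

```python
import numpy as np

def tip_propagation(App, Apd, Add, Hp, Hd, Wp, Wpd, Wd):
    Hp = np.maximum(App @ Hp @ Wp, 0.0)            # protein-protein step
    Hd = np.maximum(Hd + Apd.T @ Hp @ Wpd, 0.0)    # protein -> drug step
    Hd = np.maximum(Add @ Hd @ Wd, 0.0)            # drug-drug step
    return Hp, Hd

P, D, dp, dd = 5, 3, 8, 8                          # toy sizes
Hp, Hd = tip_propagation(np.eye(P), np.ones((P, D)), np.eye(D),
                         np.random.rand(P, dp), np.random.rand(D, dd),
                         np.random.rand(dp, dp), np.random.rand(dp, dd),
                         np.random.rand(dd, dd))
print(Hp.shape, Hd.shape)                          # (5, 8) (3, 8)
```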
Linking Bank Clients using Graph Neural Networks Powered by Rich Transactional Data
Title | Linking Bank Clients using Graph Neural Networks Powered by Rich Transactional Data |
Authors | Valentina Shumovskaia, Kirill Fedyanin, Ivan Sukharev, Dmitry Berestnev, Maxim Panov |
Abstract | Financial institutions obtain enormous amounts of data about user transactions and money transfers, which can be considered as a large graph dynamically changing in time. In this work, we focus on the task of predicting new interactions in the network of bank clients and treat it as a link prediction problem. We propose a new graph neural network model, which uses not only the topological structure of the network but also the rich time-series data available for the graph nodes and edges. We evaluate the developed method using data provided by a large European bank over several years. The proposed model outperforms existing approaches, including other neural network models, by a significant margin in ROC AUC on the link prediction problem, and also improves the quality of credit scoring. |
Tasks | Link Prediction, Time Series |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08427v1 |
PDF | https://arxiv.org/pdf/2001.08427v1.pdf |
PWC | https://paperswithcode.com/paper/linking-bank-clients-using-graph-neural |
Repo | |
Framework | |
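A pared-down sketch of one ingredient of the approach: encode each client's transaction time series with a recurrent network and score a potential client-client link by the similarity of the two embeddings. The paper combines such sequence features with graph convolutions; the GRU choice, dimensions, and dot-product scorer here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ClientLinkScorer(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.encoder = nn.GRU(n_features, hidden, batch_first=True)

    def embed(self, series):                # series: (batch, time, features)
        _, h = self.encoder(series)
        return h.squeeze(0)                 # (batch, hidden) client embeddings

    def score(self, a, b):                  # dot-product link score
        return (self.embed(a) * self.embed(b)).sum(-1)

model = ClientLinkScorer(n_features=8)
a = torch.randn(4, 30, 8)                   # 4 clients, 30 days, 8 features each
b = torch.randn(4, 30, 8)
print(torch.sigmoid(model.score(a, b)))     # probabilities of a link
```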
Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data
Title | Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data |
Authors | Ethan Steinberg, Ken Jung, Jason A. Fries, Conor K. Corbin, Stephen R. Pfohl, Nigam H. Shah |
Abstract | Widespread adoption of electronic health records (EHRs) has fueled the development of clinical outcome models using machine learning. However, patient EHR data are complex, and how to optimally represent them is an open question. This complexity and the often small training sets available for these clinical outcome models are two core challenges for training high-quality models. In this paper, we demonstrate that learning generic representations from the data of all the patients in the EHR enables better-performing prediction models for clinical outcomes, allowing these challenges to be overcome. We adapt common representation learning techniques used in other domains and find that representations inspired by language models enable a 3.5% mean improvement in AUROC on five clinical outcomes compared to standard baselines, with the average improvement rising to 19% when only a small number of patients are available for training a prediction model for a given clinical outcome. |
Tasks | Representation Learning |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.05295v1 |
PDF | https://arxiv.org/pdf/2001.05295v1.pdf |
PWC | https://paperswithcode.com/paper/language-models-are-an-effective-patient |
Repo | |
Framework | |
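A compact sketch of the language-model idea behind the result: pretrain a next-code predictor over each patient's timeline of medical codes and reuse the final hidden state as a patient representation for downstream outcome models. The vocabulary size, dimensions, and GRU backbone are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CodeLM(nn.Module):
    # Language-model-style encoder over a patient's sequence of medical codes;
    # the final hidden state doubles as the patient representation.
    def __init__(self, n_codes=5000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_codes, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, n_codes)

    def forward(self, codes):                    # codes: (batch, time)
        h, last = self.rnn(self.emb(codes))
        return self.head(h), last.squeeze(0)     # next-code logits, patient vectors

model = CodeLM()
codes = torch.randint(0, 5000, (2, 50))          # two synthetic code timelines
logits, patient_repr = model(codes)
loss = F.cross_entropy(logits[:, :-1].reshape(-1, 5000), codes[:, 1:].reshape(-1))
```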
A deep learning approach for lower back-pain risk prediction during manual lifting
Title | A deep learning approach for lower back-pain risk prediction during manual lifting |
Authors | Kristian Snyder, Brennan Thomas, Ming-Lun Lu, Rashmi Jha, Menekse S. Barim, Marie Hayden, Dwight Werren |
Abstract | Occupationally-induced back pain is a leading cause of reduced productivity in industry. Detecting when a worker is lifting incorrectly and at increased risk of back injury presents significant possible benefits. These include increased quality of life for the worker due to lower rates of back injury and fewer workers’ compensation claims and missed time for the employer. However, recognizing lifting risk provides a challenge due to typically small datasets and subtle underlying features in accelerometer and gyroscope data. A novel method to classify a lifting dataset using a 2D convolutional neural network (CNN) and no manual feature extraction is proposed in this paper; the dataset consisted of 10 subjects lifting at various relative distances from the body with 720 total trials. The proposed deep CNN displayed greater accuracy (90.6%) compared to an alternative CNN and multilayer perceptron (MLP). A deep CNN could be adapted to classify many other activities that traditionally pose greater challenges in industrial environments due to their size and complexity. |
Tasks | |
Published | 2020-03-20 |
URL | https://arxiv.org/abs/2003.09521v1 |
PDF | https://arxiv.org/pdf/2003.09521v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-learning-approach-for-lower-back-pain |
Repo | |
Framework | |
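A small sketch of a 2-D CNN over a window of inertial-sensor data (sensor axes by time, treated as a one-channel image) classifying a lift as low- or high-risk; layer sizes and input dimensions are illustrative, not the paper's exact model.

```python
import torch
import torch.nn as nn

class LiftRiskCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):                       # x: (batch, 1, sensor_axes, time)
        return self.classifier(self.features(x).flatten(1))

model = LiftRiskCNN()
window = torch.randn(8, 1, 6, 128)              # 8 trials, 6 sensor axes, 128 samples
print(model(window).shape)                      # (8, 2) class logits
```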
GameWikiSum: a Novel Large Multi-Document Summarization Dataset
Title | GameWikiSum: a Novel Large Multi-Document Summarization Dataset |
Authors | Diego Antognini, Boi Faltings |
Abstract | Today’s research progress in the field of multi-document summarization is obstructed by the small number of available datasets. Since the acquisition of reference summaries is costly, existing datasets contain only hundreds of samples at most, resulting in heavy reliance on hand-crafted features or necessitating additional, manually annotated data. The lack of large corpora therefore hinders the development of sophisticated models. Additionally, most publicly available multi-document summarization corpora are in the news domain, and no analogous dataset exists in the video game domain. In this paper, we propose GameWikiSum, a new domain-specific dataset for multi-document summarization, which is one hundred times larger than commonly used datasets and in a domain other than news. Input documents consist of long professional video game reviews as well as references of their gameplay sections in Wikipedia pages. We analyze the proposed dataset and show that both abstractive and extractive models can be trained on it. We release GameWikiSum for further research: https://github.com/Diego999/GameWikiSum. |
Tasks | Document Summarization, Multi-Document Summarization |
Published | 2020-02-17 |
URL | https://arxiv.org/abs/2002.06851v1 |
PDF | https://arxiv.org/pdf/2002.06851v1.pdf |
PWC | https://paperswithcode.com/paper/gamewikisum-a-novel-large-multi-document |
Repo | |
Framework | |
Knowledge Integration of Collaborative Product Design Using Cloud Computing Infrastructure
Title | Knowledge Integration of Collaborative Product Design Using Cloud Computing Infrastructure |
Authors | Mahdi Bohlouli, Alexander Holland, Madjid Fathi |
Abstract | The pivotal key to the success of manufacturing enterprises is sustainable and innovative product design and development. In collaborative design, stakeholders are heterogeneously distributed in a chain-like structure. Due to the growing volume of data and knowledge, effective management of the knowledge acquired during product design and development is one of the key challenges facing most manufacturing enterprises. Opportunities for improving the efficiency and performance of IT-based product design applications through centralization of resources such as knowledge and computation have increased in the last few years with the maturation of technologies such as SOA, virtualization, grid computing, and cloud computing. The main focus of this paper is ongoing research in providing a knowledge integration service for collaborative product design and development using cloud computing infrastructure. The potential of cloud computing to support knowledge integration as a service, by providing functionalities such as knowledge mapping, merging, searching, and transferring in the product design procedure, is described in this paper. The proposed knowledge integration services support users by giving real-time access to knowledge resources. The framework has the advantages of availability, efficiency, cost reduction, shorter time to results, and scalability. |
Tasks | |
Published | 2020-01-16 |
URL | https://arxiv.org/abs/2001.09796v1 |
PDF | https://arxiv.org/pdf/2001.09796v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-integration-of-collaborative |
Repo | |
Framework | |
Analyzing CNN Based Behavioural Malware Detection Techniques on Cloud IaaS
Title | Analyzing CNN Based Behavioural Malware Detection Techniques on Cloud IaaS |
Authors | Andrew McDole, Mahmoud Abdelsalam, Maanak Gupta, Sudip Mittal |
Abstract | Cloud Infrastructure as a Service (IaaS) is vulnerable to malware due to its exposure to external adversaries, making it a lucrative attack vector for malicious actors. A datacenter infected with malware can cause data loss and/or major disruptions to service for its users. This paper analyzes and compares various Convolutional Neural Networks (CNNs) for online detection of malware in cloud IaaS. The detection is performed based on behavioural data using process-level performance metrics including CPU usage, memory usage, and disk usage. We use state-of-the-art DenseNets and ResNets to effectively detect malware in online cloud systems. The CNNs are designed to extract features from data gathered from live malware running in a real cloud environment. Experiments are performed on an OpenStack (a cloud IaaS software) testbed designed to replicate a typical 3-tier web architecture. A comparative analysis is performed across different metrics for the CNN models used in this research. |
Tasks | Malware Detection |
Published | 2020-02-15 |
URL | https://arxiv.org/abs/2002.06383v1 |
PDF | https://arxiv.org/pdf/2002.06383v1.pdf |
PWC | https://paperswithcode.com/paper/analyzing-cnn-based-behavioural-malware |
Repo | |
Framework | |
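A tiny sketch of the general setup under stated assumptions: a window of process-level performance metrics is laid out as a single-channel image, repeated across three channels, and scored by an off-the-shelf ResNet-18 with a two-class head (benign vs. malware). The input size and architecture choice are illustrative, not the exact models compared in the paper.

```python
import torch
import torchvision.models as models

# Off-the-shelf ResNet-18 with a binary classification head.
resnet = models.resnet18()
resnet.fc = torch.nn.Linear(resnet.fc.in_features, 2)

# Each sample: a 64x64 "image" of process-level metrics (processes x time).
window = torch.rand(8, 1, 64, 64)
logits = resnet(window.repeat(1, 3, 1, 1))   # repeat channel to fit RGB input
print(logits.shape)                          # (8, 2) class logits
```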