April 1, 2020

Paper Group ANR 499

Leveraging Prior Knowledge for Protein-Protein Interaction Extraction with Memory Network. From Patterson Maps to Atomic Coordinates: Training a Deep Neural Network to Solve the Phase Problem for a Simplified Case. Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research. Unsupervised …

Leveraging Prior Knowledge for Protein-Protein Interaction Extraction with Memory Network

Title Leveraging Prior Knowledge for Protein-Protein Interaction Extraction with Memory Network
Authors Huiwei Zhou, Zhuang Liu, Shixian Ning, Yunlong Yang, Chengkun Lang, Yingyu Lin, Kun Ma
Abstract Automatically extracting Protein-Protein Interactions (PPI) from biomedical literature provides additional support for precision medicine efforts. This paper proposes a novel memory network-based model (MNM) for PPI extraction, which leverages prior knowledge about protein-protein pairs with memory networks. The proposed MNM captures important context clues related to knowledge representations learned from knowledge bases. Both entity embeddings and relation embeddings of prior knowledge are effective in improving the PPI extraction model, leading to a new state-of-the-art performance on the BioCreative VI PPI dataset. The paper also shows that multiple computational layers over an external memory are superior to long short-term memory networks with the local memories.
Tasks Entity Embeddings
Published 2020-01-07
URL https://arxiv.org/abs/2001.02107v1
PDF https://arxiv.org/pdf/2001.02107v1.pdf
PWC https://paperswithcode.com/paper/leveraging-prior-knowledge-for-protein

From Patterson Maps to Atomic Coordinates: Training a Deep Neural Network to Solve the Phase Problem for a Simplified Case

Title From Patterson Maps to Atomic Coordinates: Training a Deep Neural Network to Solve the Phase Problem for a Simplified Case
Authors David Hurwitz
Abstract This work demonstrates that, for a simple case of 10 randomly positioned atoms, a neural network can be trained to infer atomic coordinates from Patterson maps. The network was trained entirely on synthetic data. For the training set, the network outputs were 3D maps of randomly positioned atoms. From each output map, a Patterson map was generated and used as input to the network. The network generalized to cases not in the test set, inferring atom positions from Patterson maps. A key finding in this work is that the Patterson maps presented to the network input during training must uniquely describe the atomic coordinates they are paired with on the network output or the network will not train and it will not generalize. The network cannot train on conflicting data. Avoiding conflicts is handled in 3 ways: 1. Patterson maps are invariant to translation. To remove this degree of freedom, output maps are centered on the average of their atom positions. 2. Patterson maps are invariant to centrosymmetric inversion. This conflict is removed by presenting the network output with both the atoms used to make the Patterson Map and their centrosymmetry-related counterparts simultaneously. 3. The Patterson map does not uniquely describe a set of coordinates because the origin for each vector in the Patterson map is ambiguous. By adding empty space around the atoms in the output map, this ambiguity is removed. Forcing output atoms to be closer than half the output box edge dimension means the origin of each peak in the Patterson map must be the origin to which it is closest.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13767v1
PDF https://arxiv.org/pdf/2003.13767v1.pdf
PWC https://paperswithcode.com/paper/from-patterson-maps-to-atomic-coordinates
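
The first two invariances listed in the abstract can be checked on a toy case. The sketch below is not the author's code: it assumes point atoms on an integer grid with periodic boundaries and represents the Patterson map simply as the multiset of interatomic difference vectors. The names `patterson_peaks`, `translate`, and `invert` are hypothetical.

```python
from collections import Counter

def patterson_peaks(atoms, box):
    """Multiset of interatomic difference vectors (mod box) for point atoms.

    Assumption: for point atoms, the Patterson map has a peak at every
    difference vector a - b, including the origin peak where a == b.
    """
    peaks = Counter()
    for a in atoms:
        for b in atoms:
            diff = tuple((ai - bi) % box for ai, bi in zip(a, b))
            peaks[diff] += 1
    return peaks

def translate(atoms, shift, box):
    """Shift every atom by the same vector, wrapping at the box edge."""
    return [tuple((c + s) % box for c, s in zip(a, shift)) for a in atoms]

def invert(atoms, box):
    """Centrosymmetric inversion through the origin, wrapping at the box edge."""
    return [tuple((-c) % box for c in a) for a in atoms]

box = 16
atoms = [(1, 2, 3), (4, 0, 5), (6, 6, 1)]
reference = patterson_peaks(atoms, box)

# Both transformed structures produce the same set of peaks, so a network
# given only the Patterson map cannot tell the structures apart.
same_after_shift = patterson_peaks(translate(atoms, (3, 5, 7), box), box) == reference
same_after_inversion = patterson_peaks(invert(atoms, box), box) == reference
```

Because both comparisons hold, several different coordinate sets map to one input. This is the kind of conflicting training pair the abstract says must be removed, by centering the output and by presenting the inverted atoms alongside the originals.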

Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research

Title Model-assisted cohort selection with bias analysis for generating large-scale cohorts from the EHR for oncology research
Authors Benjamin Birnbaum, Nathan Nussbaum, Katharina Seidl-Rathkopf, Monica Agrawal, Melissa Estevez, Evan Estola, Joshua Haimson, Lucy He, Peter Larson, Paul Richardson
Abstract Objective: Electronic health records (EHRs) are a promising source of data for health outcomes research in oncology. A challenge in using EHR data is that selecting cohorts of patients often requires information in unstructured parts of the record. Machine learning has been used to address this, but even high-performing algorithms may select patients in a non-random manner and bias the resulting cohort. To improve the efficiency of cohort selection while measuring potential bias, we introduce a technique called Model-Assisted Cohort Selection (MACS) with Bias Analysis and apply it to the selection of metastatic breast cancer (mBC) patients. Materials and Methods: We trained a model on 17,263 patients using term-frequency inverse-document-frequency (TF-IDF) and logistic regression. We used a test set of 17,292 patients to measure algorithm performance and perform Bias Analysis. We compared the cohort generated by MACS to the cohort that would have been generated without MACS as reference standard, first by comparing distributions of an extensive set of clinical and demographic variables and then by comparing the results of two analyses addressing existing example research questions. Results: Our algorithm had an area under the curve (AUC) of 0.976, a sensitivity of 96.0%, and an abstraction efficiency gain of 77.9%. During Bias Analysis, we found no large differences in baseline characteristics and no differences in the example analyses. Conclusion: MACS with bias analysis can significantly improve the efficiency of cohort selection on EHR data while instilling confidence that outcomes research performed on the resulting cohort will not be biased.
Published 2020-01-13
URL https://arxiv.org/abs/2001.09765v1
PDF https://arxiv.org/pdf/2001.09765v1.pdf
PWC https://paperswithcode.com/paper/model-assisted-cohort-selection-with-bias

Unsupervised domain adaptation with exploring more statistics and discriminative information

Title Unsupervised domain adaptation with exploring more statistics and discriminative information
Authors Yuntao Du, Ruiting Zhang, Yikang Cao, Xiaowen Zhang, Zhiwen Tan, Chongjun Wang
Abstract Unsupervised domain adaptation aims at transferring knowledge from the labeled source domain to the unlabeled target domain. Previous methods mainly learn a domain-invariant feature transformation, where the cross-domain discrepancy can be reduced. Maximum Mean Discrepancy (MMD) is the most popular statistic to measure domain discrepancy. However, these methods may suffer from two challenges. 1) MMD-based methods only measure the first-order statistic information across domains, while other useful information such as second-order statistic information has been ignored. 2) The classifier trained on the source domain may fail to distinguish the correct class from a similar class, a phenomenon called class confusion. In this paper, we propose a method called Unsupervised domain adaptation with exploring more statistics and discriminative information (MSDI), which tackles these two problems under the principle of structural risk minimization. We adopt the recently proposed statistic called MMCD to measure domain discrepancy, which can capture both first-order and second-order statistics simultaneously in RKHS. Besides, we propose to learn more discriminative features to avoid class confusion, where the inner product of the classifier predictions with their transposes is used to reflect the confusion relationship between different classes. Moreover, we minimize the source empirical risk and adopt manifold regularization to explore geometry information in the target domain. MSDI learns a domain-invariant classifier in a unified learning framework incorporating the above objectives. We conduct comprehensive experiments on five real-world datasets and the results verify the effectiveness of the proposed method.
Tasks Domain Adaptation, Unsupervised Domain Adaptation
Published 2020-03-26
URL https://arxiv.org/abs/2003.11723v1
PDF https://arxiv.org/pdf/2003.11723v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-domain-adaptation-with-exploring
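
The class-confusion idea in the abstract (multiplying the classifier predictions by their own transpose) can be illustrated with a small sketch. This is not the MSDI objective itself: the function names and the averaging of the off-diagonal mass are assumptions made for illustration.

```python
def class_correlation(probs):
    """Return P^T P for a list of per-sample class-probability rows P.

    Entry (i, j) accumulates p[i] * p[j] over samples, so off-diagonal
    entries grow when two classes receive probability mass together.
    """
    n_classes = len(probs[0])
    matrix = [[0.0] * n_classes for _ in range(n_classes)]
    for p in probs:
        for i in range(n_classes):
            for j in range(n_classes):
                matrix[i][j] += p[i] * p[j]
    return matrix

def confusion_score(probs):
    """Average off-diagonal mass per sample (a hypothetical summary statistic)."""
    matrix = class_correlation(probs)
    off_diagonal = sum(
        matrix[i][j]
        for i in range(len(matrix))
        for j in range(len(matrix))
        if i != j
    )
    return off_diagonal / len(probs)

confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # one class per sample
ambiguous = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0]]   # classes 0 and 1 confused

confident_score = confusion_score(confident)
ambiguous_score = confusion_score(ambiguous)
```

Confident predictions leave the off-diagonal entries at zero, while predictions split between two classes raise them, which is why a term of this kind can serve as a penalty on confusion.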

Adaptive Reward-Poisoning Attacks against Reinforcement Learning

Title Adaptive Reward-Poisoning Attacks against Reinforcement Learning
Authors Xuezhou Zhang, Yuzhe Ma, Adish Singla, Xiaojin Zhu
Abstract In reward-poisoning attacks against reinforcement learning (RL), an attacker can perturb the environment reward $r_t$ into $r_t+\delta_t$ at each step, with the goal of forcing the RL agent to learn a nefarious policy. We categorize such attacks by the infinity-norm constraint on $\delta_t$: We provide a lower threshold below which reward-poisoning attack is infeasible and RL is certified to be safe; we provide a corresponding upper threshold above which the attack is feasible. Feasible attacks can be further categorized as non-adaptive where $\delta_t$ depends only on $(s_t,a_t, s_{t+1})$, or adaptive where $\delta_t$ depends further on the RL agent’s learning process at time $t$. Non-adaptive attacks have been the focus of prior works. However, we show that under mild conditions, adaptive attacks can achieve the nefarious policy in steps polynomial in state-space size $S$, whereas non-adaptive attacks require exponential steps. We provide a constructive proof that a Fast Adaptive Attack strategy achieves the polynomial rate. Finally, we show that empirically an attacker can find effective reward-poisoning attacks using state-of-the-art deep RL techniques.
Published 2020-03-27
URL https://arxiv.org/abs/2003.12613v1
PDF https://arxiv.org/pdf/2003.12613v1.pdf
PWC https://paperswithcode.com/paper/adaptive-reward-poisoning-attacks-against
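
To make the perturbation $r_t+\delta_t$ and its infinity-norm budget concrete, here is a toy sketch. It is not the paper's Fast Adaptive Attack: it assumes a single-state, two-action problem, tabular Q-learning with deterministic sweeps, and a simple non-adaptive attacker. All function names and parameter values are hypothetical.

```python
def poisoned_reward(reward, delta, budget):
    """Apply a perturbation delta, clipped so that |delta| <= budget."""
    delta = max(-budget, min(budget, delta))
    return reward + delta

def non_adaptive_delta(action, target_action, budget):
    """Attacker that looks only at the action taken, not at the learner."""
    return budget if action == target_action else -budget

def train(budget, true_rewards=(1.0, 0.5), target_action=1,
          sweeps=500, alpha=0.5, gamma=0.9):
    """Q-learning on a one-state problem, visiting every action each sweep."""
    q = [0.0 for _ in true_rewards]
    for _ in range(sweeps):
        for action, reward in enumerate(true_rewards):
            delta = non_adaptive_delta(action, target_action, budget)
            r = poisoned_reward(reward, delta, budget)
            q[action] += alpha * (r + gamma * max(q) - q[action])
    return q

def greedy(q):
    """Index of the action with the highest Q-value."""
    return max(range(len(q)), key=lambda a: q[a])

clean_policy = greedy(train(budget=0.0))   # learner prefers action 0
weak_attack = greedy(train(budget=0.2))    # budget too small to flip the policy
strong_attack = greedy(train(budget=0.4))  # budget large enough to flip it
```

In this toy problem the reward gap is 0.5, so a budget below 0.25 cannot change the greedy action, which mirrors the abstract's distinction between infeasible and feasible attack budgets. An adaptive attacker would additionally take the learner's current `q` as input.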

Value of Information Analysis via Active Learning and Knowledge Sharing in Error-Controlled Adaptive Kriging

Title Value of Information Analysis via Active Learning and Knowledge Sharing in Error-Controlled Adaptive Kriging
Authors Chi Zhang, Zeyu Wang, Abdollah Shafieezadeh
Abstract Large uncertainties in many phenomena have challenged decision making. Collecting additional information to better characterize reducible uncertainties is among the decision alternatives. Value of information (VoI) analysis is a mathematical decision framework that quantifies expected potential benefits of new data and assists with optimal allocation of resources for information collection. However, analysis of VoI is computationally very costly because of the underlying Bayesian inference, especially for equality-type information. This paper proposes the first surrogate-based framework for VoI analysis. Instead of modeling the limit state functions describing events of interest for decision making, which is commonly pursued in surrogate model-based reliability methods, the proposed framework models system responses. This approach affords sharing equality-type information from observations among surrogate models to update likelihoods of multiple events of interest. Moreover, two knowledge sharing schemes called model and training points sharing are proposed to most effectively take advantage of the knowledge offered by costly model evaluations. Both schemes are integrated with an error rate-based adaptive training approach to efficiently generate accurate Kriging surrogate models. The proposed VoI analysis framework is applied to an optimal decision-making problem involving load testing of a truss bridge. While state-of-the-art methods based on importance sampling and adaptive Kriging Monte Carlo simulation are unable to solve this problem, the proposed method is shown to offer accurate and robust estimates of VoI with a limited number of model evaluations. Therefore, the proposed method facilitates the application of VoI for complex decision problems.
Tasks Active Learning, Bayesian Inference, Decision Making
Published 2020-02-06
URL https://arxiv.org/abs/2002.02354v2
PDF https://arxiv.org/pdf/2002.02354v2.pdf
PWC https://paperswithcode.com/paper/value-of-information-analysis-via-active
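
As a reminder of what VoI measures, the sketch below computes the value of perfect information for a two-state, two-action decision. It is a textbook simplification and does not use the paper's Kriging-based framework: the states, actions, prior, and costs are invented for illustration.

```python
def expected_cost(action_costs, prior):
    """Expected cost of one action under the prior over states."""
    return sum(p * c for p, c in zip(prior, action_costs))

def value_of_perfect_information(costs, prior):
    """Expected cost saved by learning the true state before acting."""
    # Without new information: commit to the single best action under the prior.
    cost_without = min(expected_cost(c, prior) for c in costs.values())
    # With perfect information: pick the best action separately in each state.
    cost_with = sum(
        p * min(c[state] for c in costs.values())
        for state, p in enumerate(prior)
    )
    return cost_without - cost_with

# Hypothetical numbers: states are (sound, deficient) for a structure.
prior = [0.7, 0.3]
costs = {
    "do_nothing": [0.0, 100.0],  # cheap if sound, expensive if deficient
    "repair": [10.0, 10.0],      # fixed cost in either state
}
voi = value_of_perfect_information(costs, prior)
```

With these numbers the best prior decision is to repair at cost 10, while knowing the state first would cost 3 in expectation, so a test is worth at most 7. Real observations such as a load test are imperfect, which is where the Bayesian updating described in the abstract comes in.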

Ontology Extraction and Usage in the Scholarly Knowledge Domain

Title Ontology Extraction and Usage in the Scholarly Knowledge Domain
Authors Angelo A. Salatino, Francesco Osborne, Enrico Motta
Abstract Ontologies of research areas have proven useful in many applications for analysing and making sense of scholarly data. In this chapter, we present the Computer Science Ontology (CSO), which is the largest ontology of research areas in the field of Computer Science, and discuss a number of applications that build on CSO to support high-level tasks, such as topic classification, metadata extraction, and recommendation of books.
Published 2020-03-27
URL https://arxiv.org/abs/2003.12611v1
PDF https://arxiv.org/pdf/2003.12611v1.pdf
PWC https://paperswithcode.com/paper/ontology-extraction-and-usage-in-the

CPFed: Communication-Efficient and Privacy-Preserving Federated Learning

Title CPFed: Communication-Efficient and Privacy-Preserving Federated Learning
Authors Rui Hu, Yanmin Gong, Yuanxiong Guo
Abstract Federated learning is a machine learning setting where a set of edge devices iteratively train a model under the orchestration of a central server, while keeping all data locally on edge devices. In each iteration of federated learning, edge devices perform computation with their local data, and the local computation results are then uploaded to the server for model update. During this process, the challenges of privacy leakage and communication overhead arise due to the extensive information exchange between edge devices and the server. In this paper, we develop CPFed, a communication-efficient and privacy-preserving federated learning method, to solve the above challenges. CPFed integrates three key components: (1) periodic averaging where local computation results at edge devices are only periodically averaged at the server; (2) Gaussian mechanism where edge devices randomly perturb their local computation results before sending the results to the server; and (3) secure aggregation where the perturbed local computation results are homomorphically encrypted before being sent to the server. CPFed can address both the communication efficiency and privacy leakage challenges in federated learning while achieving high model accuracy. We provide an end-to-end privacy guarantee of CPFed and analyze its theoretical convergence rates for both convex and non-convex models. Through extensive numerical experiments on real-world datasets, we demonstrate the effectiveness and efficiency of our proposed method.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13761v1
PDF https://arxiv.org/pdf/2003.13761v1.pdf
PWC https://paperswithcode.com/paper/cpfed-communication-efficient-and-privacy
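
Two of the three components named in the abstract, periodic averaging and the Gaussian mechanism, can be sketched on a one-parameter model. This is not the CPFed implementation: the quadratic local losses, learning rate, and noise scale are assumptions, no privacy calibration is attempted, and the homomorphic encryption step is omitted entirely.

```python
import random

def local_update(w, target, lr, steps):
    """Run several local gradient steps on the loss 0.5 * (w - target)**2."""
    for _ in range(steps):
        w -= lr * (w - target)
    return w

def federated_round(w_global, targets, lr, local_steps, sigma, rng):
    """One communication round: local training, Gaussian perturbation, averaging."""
    uploads = []
    for target in targets:
        w_local = local_update(w_global, target, lr, local_steps)
        # Each device perturbs its result before sending it to the server.
        uploads.append(w_local + rng.gauss(0.0, sigma))
    return sum(uploads) / len(uploads)

def train(targets, rounds=50, lr=0.1, local_steps=5, sigma=0.0, seed=0):
    """Devices communicate only once every `local_steps` gradient steps."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(rounds):
        w = federated_round(w, targets, lr, local_steps, sigma, rng)
    return w

device_targets = [1.0, 2.0, 6.0]   # hypothetical per-device optima
w_exact = train(device_targets, sigma=0.0)
w_noisy = train(device_targets, sigma=0.05)
```

Without noise the global parameter settles at the mean of the device optima. With noise it fluctuates around that value, which hints at the trade-off between privacy and accuracy that the paper analyzes.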

Towards Semantic Noise Cleansing of Categorical Data based on Semantic Infusion

Title Towards Semantic Noise Cleansing of Categorical Data based on Semantic Infusion
Authors Rishabh Gupta, Rajesh N Rao
Abstract Semantic Noise affects text analytics activities for the domain-specific industries significantly. It impedes the text understanding which holds prime importance in the critical decision making tasks. In this work, we formalize semantic noise as a sequence of terms that do not contribute to the narrative of the text. We look beyond the notion of standard statistically-based stop words and consider the semantics of terms to exclude the semantic noise. We present a novel Semantic Infusion technique to associate meta-data with the categorical corpus text and demonstrate its near-lossless nature. Based on this technique, we propose an unsupervised text-preprocessing framework to filter the semantic noise using the context of the terms. Later we present the evaluation results of the proposed framework using a web forum dataset from the automobile-domain.
Tasks Decision Making
Published 2020-02-06
URL https://arxiv.org/abs/2002.02238v1
PDF https://arxiv.org/pdf/2002.02238v1.pdf
PWC https://paperswithcode.com/paper/towards-semantic-noise-cleansing-of

From Data to Actions in Intelligent Transportation Systems: a Prescription of Functional Requirements for Model Actionability

Title From Data to Actions in Intelligent Transportation Systems: a Prescription of Functional Requirements for Model Actionability
Authors Ibai Lana, Javier J. Sanchez-Medina, Eleni I. Vlahogianni, Javier Del Ser
Abstract Advances in Data Science are lately permeating every field of Transportation Science and Engineering, making it straightforward to imagine that developments in the transportation sector will be data-driven. Nowadays, Intelligent Transportation Systems (ITS) could arguably be approached as a “story” intensively producing and consuming large amounts of data. A diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers’ personal devices act as sources of data flows that are eventually fed to software running on automatic devices, actuators or control systems producing, in turn, complex information flows between users, traffic managers, data analysts, traffic modeling scientists, etc. These information flows provide enormous opportunities to improve model development and decision-making. The present work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in other words, for data-based models to fully become actionable. Grounded on this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and challenges intrinsic to its three constituent stages, namely, data fusion, adaptive learning and model evaluation. We deliberately generalize model learning to be adaptive, since at the core of our paper is the firm conviction that most learners will have to adapt to the ever-changing scenario underlying the majority of ITS applications. Finally, we provide a prospect of current research lines within the Data Science realm that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.
Tasks Decision Making
Published 2020-02-06
URL https://arxiv.org/abs/2002.02210v1
PDF https://arxiv.org/pdf/2002.02210v1.pdf
PWC https://paperswithcode.com/paper/from-data-to-actions-in-intelligent

CHAIN: Concept-harmonized Hierarchical Inference Interpretation of Deep Convolutional Neural Networks

Title CHAIN: Concept-harmonized Hierarchical Inference Interpretation of Deep Convolutional Neural Networks
Authors Dan Wang, Xinrui Cui, Z. Jane Wang
Abstract With the great success of networks, there is an increasing demand for interpreting the internal network mechanism, especially the net decision-making logic. To tackle the challenge, the Concept-harmonized HierArchical INference (CHAIN) is proposed to interpret the net decision-making process. For the net decisions being interpreted, the proposed method presents the CHAIN interpretation in which the net decision can be hierarchically deduced into visual concepts from high to low semantic levels. To achieve it, we propose three models sequentially, i.e., the concept harmonizing model, the hierarchical inference model, and the concept-harmonized hierarchical inference model. Firstly, in the concept harmonizing model, visual concepts from high to low semantic levels are aligned with net-units from deep to shallow layers. Secondly, in the hierarchical inference model, the concept in a deep layer is disassembled into units in shallow layers. Finally, in the concept-harmonized hierarchical inference model, a deep-layer concept is inferred from its shallow-layer concepts. After several rounds, the concept-harmonized hierarchical inference is conducted backward from the highest semantic level to the lowest semantic level. As a result, net decision-making is explained as a form of concept-harmonized hierarchical inference, which is comparable to human decision-making. Meanwhile, the net layer structure for feature learning can be explained based on the hierarchical visual concepts. In quantitative and qualitative experiments, we demonstrate the effectiveness of CHAIN at the instance and class levels.
Tasks Decision Making
Published 2020-02-05
URL https://arxiv.org/abs/2002.01660v1
PDF https://arxiv.org/pdf/2002.01660v1.pdf
PWC https://paperswithcode.com/paper/chain-concept-harmonized-hierarchical

Actor-Transformers for Group Activity Recognition

Title Actor-Transformers for Group Activity Recognition
Authors Kirill Gavrilyuk, Ryan Sanford, Mehrsan Javan, Cees G. M. Snoek
Abstract This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on location of individual actors, we propose an actor-transformer model able to learn and selectively extract information relevant for group activity recognition. We feed the transformer with rich actor-specific static and dynamic representations expressed by features from a 2D pose network and 3D CNN, respectively. We empirically study different ways to combine these representations and show their complementary benefits. Experiments show what is important to transform and how it should be transformed. What is more, actor-transformers achieve state-of-the-art results on two publicly available benchmarks for group activity recognition, outperforming the previous best published results by a considerable margin.
Tasks Activity Recognition, Group Activity Recognition
Published 2020-03-28
URL https://arxiv.org/abs/2003.12737v1
PDF https://arxiv.org/pdf/2003.12737v1.pdf
PWC https://paperswithcode.com/paper/actor-transformers-for-group-activity

On Biased Random Walks, Corrupted Intervals, and Learning Under Adversarial Design

Title On Biased Random Walks, Corrupted Intervals, and Learning Under Adversarial Design
Authors Daniel Berend, Aryeh Kontorovich, Lev Reyzin, Thomas Robinson
Abstract We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design.
Published 2020-03-30
URL https://arxiv.org/abs/2003.13561v1
PDF https://arxiv.org/pdf/2003.13561v1.pdf
PWC https://paperswithcode.com/paper/on-biased-random-walks-corrupted-intervals

Human Activity Recognition from Wearable Sensor Data Using Self-Attention

Title Human Activity Recognition from Wearable Sensor Data Using Self-Attention
Authors Saif Mahmud, M Tanjid Hasan Tonmoy, Kishor Kumar Bhaumik, A K M Mahbubur Rahman, M Ashraful Amin, Mohammad Shoyaib, Muhammad Asif Hossain Khan, Amin Ahsan Ali
Abstract Human Activity Recognition from body-worn sensor data poses an inherent challenge in capturing spatial and temporal dependencies of time-series signals. In this regard, the existing recurrent, convolutional, and hybrid models for activity recognition struggle to capture spatio-temporal context from the feature space of the sensor reading sequence. To address this complex problem, we propose a self-attention based neural network model that forgoes recurrent architectures and utilizes different types of attention mechanisms to generate a higher-dimensional feature representation used for classification. We performed extensive experiments on four popular publicly available HAR datasets: PAMAP2, Opportunity, Skoda and USC-HAD. Our model achieves significant performance improvement over recent state-of-the-art models in both benchmark test subjects and leave-one-subject-out evaluation. We also observe that the sensor attention maps produced by our model are able to capture the importance of the modality and placement of the sensors in predicting the different activity classes.
Tasks Activity Recognition, Human Activity Recognition, Time Series
Published 2020-03-17
URL https://arxiv.org/abs/2003.09018v1
PDF https://arxiv.org/pdf/2003.09018v1.pdf
PWC https://paperswithcode.com/paper/human-activity-recognition-from-wearable
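
For readers unfamiliar with the mechanism, the following sketch shows plain scaled dot-product self-attention over a short window of sensor readings. It is not the paper's architecture: it assumes identity query, key, and value projections (a trained model would learn these), and the three-axis readings are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(sequence):
    """Scaled dot-product self-attention with identity projections.

    Each time step attends to every time step in the window; the output
    for a step is the attention-weighted average of all readings.
    """
    dim = len(sequence[0])
    scale = math.sqrt(dim)
    weights = []
    for query in sequence:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in sequence
        ]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[d] for w, value in zip(row, sequence)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights

# Hypothetical accelerometer window: three time steps, three axes each.
window = [[0.1, 0.0, 9.8], [0.2, 0.1, 9.7], [3.0, 2.5, 4.0]]
outputs, weights = self_attention(window)
```

Each row of `weights` is a distribution over time steps, which is the quantity an attention map visualizes. No recurrence is involved, so every step can look at every other step directly.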

Adversarial Stress Testing of Lifetime Distributions

Title Adversarial Stress Testing of Lifetime Distributions
Authors Nozer Singpurwalla
Abstract In this paper we put forward the viewpoint that the notion of stress testing financial institutions and engineered systems can also be made viable apropos of stress testing an individual’s strength of conviction in a probability distribution. The difference is one of interpretation and perspective. To make our case we consider a game-theoretic setup entailing two players, an adversarial C and an amicable M. The underlying metrics entail a de Finetti style two-sided bet with asymmetric payoffs as a way to give meaning to lifetime distributions, an adversarial stress testing function, and a maximization of the expected utility of betting scores via the Kullback-Leibler discrimination.
Published 2020-03-27
URL https://arxiv.org/abs/2003.12587v1
PDF https://arxiv.org/pdf/2003.12587v1.pdf
PWC https://paperswithcode.com/paper/adversarial-stress-testing-of-lifetime