January 26, 2020

Paper Group ANR 1546

Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering


Title	Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering
Authors	Shangwen Lv, Daya Guo, Jingjing Xu, Duyu Tang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Songlin Hu
Abstract	Commonsense question answering aims to answer questions which require background knowledge that is not explicitly expressed in the question. The key challenge is how to obtain evidence from external knowledge and make predictions based on the evidence. Recent works either learn to generate evidence from human-annotated evidence, which is expensive to collect, or extract evidence from either structured or unstructured knowledge bases, which fails to take advantage of both sources. In this work, we propose to automatically extract evidence from heterogeneous knowledge sources, and answer questions based on the extracted evidence. Specifically, we extract evidence from both a structured knowledge base (i.e., ConceptNet) and Wikipedia plain texts. We construct graphs for both sources to obtain the relational structures of evidence. Based on these graphs, we propose a graph-based approach consisting of a graph-based contextual word representation learning module and a graph-based inference module. The first module utilizes graph structural information to re-define the distance between words for learning better contextual word representations. The second module adopts a graph convolutional network to encode neighbor information into the representations of nodes, and aggregates evidence with a graph attention mechanism for predicting the final answer. Experimental results on the CommonsenseQA dataset illustrate that our graph-based approach over both knowledge sources brings improvements over strong baselines. Our approach achieves the state-of-the-art accuracy (75.3%) on the CommonsenseQA leaderboard.
Tasks	Question Answering, Representation Learning
Published	2019-09-09
URL	https://arxiv.org/abs/1909.05311v1
PDF	https://arxiv.org/pdf/1909.05311v1.pdf
PWC	https://paperswithcode.com/paper/graph-based-reasoning-over-heterogeneous
Repo
Framework
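
The abstract above says the inference module uses a graph convolutional network to encode neighbor information into node representations. The following is a minimal sketch of one such layer, assuming the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W); the abstract does not say which variant the authors use, and the names `matmul` and `gcn_layer` are illustrative only.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj: Matrix, features: Matrix, weight: Matrix) -> Matrix:
    """One graph convolution step: normalize, aggregate neighbors, project, ReLU."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization: entry (i, j) is divided by sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    aggregated = matmul(norm, features)
    projected = matmul(aggregated, weight)
    return [[max(0.0, v) for v in row] for row in projected]


# Two connected nodes with one feature each and an identity weight.
hidden = gcn_layer([[0.0, 1.0], [1.0, 0.0]], [[1.0], [3.0]], [[1.0]])
```

In this toy graph each node ends up with the average of its own and its neighbor's feature, which is the "encode neighbor information" effect the abstract refers to.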

Disease Identification From Unstructured User Input


Title	Disease Identification From Unstructured User Input
Authors	Fahim Faisal, Shafkat Ahmed Bhuiyan, Dr. Abu Raihan Mostofa Kamal
Abstract	A method to identify probable diseases from unstructured textual input (e.g., health forum posts) by incorporating a two-phase text classification module based on lexicographic and semantic features and a similarity measurement module based on symptom-disease correlation. One notable aspect of my approach was to develop a competent algorithm to extract all inherent features from the data source to make a better decision.
Tasks	Question Answering, Text Classification
Published	2019-05-01
URL	https://arxiv.org/abs/1905.01987v2
PDF	https://arxiv.org/pdf/1905.01987v2.pdf
PWC	https://paperswithcode.com/paper/disease-identification-from-unstructured-user
Repo
Framework

Designing over uncertain outcomes with stochastic sampling Bayesian optimization


Title	Designing over uncertain outcomes with stochastic sampling Bayesian optimization
Authors	Peter D. Tonner, Daniel V. Samarov, A. Gilad Kusne
Abstract	Optimization is becoming increasingly common in scientific and engineering domains. Oftentimes, these problems involve various levels of stochasticity or uncertainty in generating proposed solutions. Therefore, optimization in these scenarios must consider this stochasticity to properly guide the design of future experiments. Here, we adapt Bayesian optimization to handle uncertain outcomes, proposing a new framework called stochastic sampling Bayesian optimization (SSBO). We show that the bounds on expected regret for an upper confidence bound search in SSBO resemble those of earlier Bayesian optimization approaches, with added penalties due to the stochastic generation of inputs. Additionally, we adapt existing batch optimization techniques to properly limit the myopic decision making that can arise when selecting multiple instances before feedback. Finally, we show that SSBO techniques properly optimize a set of standard optimization problems as well as an applied problem inspired by bioengineering.
Tasks	Decision Making
Published	2019-11-05
URL	https://arxiv.org/abs/1911.02106v1
PDF	https://arxiv.org/pdf/1911.02106v1.pdf
PWC	https://paperswithcode.com/paper/designing-over-uncertain-outcomes-with
Repo
Framework
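
The abstract above refers to an upper confidence bound (UCB) search. As a rough illustration of that selection rule only, the sketch below picks the candidate with the highest `mean + beta * std`. It assumes the posterior means and standard deviations are already available from some surrogate model; it does not model the stochastic generation of inputs that SSBO adds, and `ucb_select` and `beta` are illustrative names, not the authors' API.

```python
from typing import List, Sequence


def ucb_scores(means: Sequence[float], stds: Sequence[float], beta: float) -> List[float]:
    """Upper confidence bound per candidate: optimistic estimate of its value."""
    if len(means) != len(stds):
        raise ValueError("means and stds must have the same length")
    return [m + beta * s for m, s in zip(means, stds)]


def ucb_select(means: Sequence[float], stds: Sequence[float], beta: float = 2.0) -> int:
    """Return the index of the candidate with the largest UCB score."""
    scores = ucb_scores(means, stds, beta)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical posterior summaries for three candidate designs.
means = [0.2, 0.5, 0.4]
stds = [0.1, 0.05, 0.3]
explore_choice = ucb_select(means, stds, beta=2.0)  # uncertainty is rewarded
exploit_choice = ucb_select(means, stds, beta=0.0)  # pure exploitation of the mean
```

A larger `beta` favors uncertain candidates, while `beta = 0` reduces the rule to picking the best posterior mean.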

Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle


Title	Modeling and Optimization of Human-machine Interaction Processes via the Maximum Entropy Principle
Authors	Jiaxiao Zheng, Gustavo de Veciana
Abstract	We propose a data-driven framework to enable the modeling and optimization of human-machine interaction processes, e.g., systems aimed at assisting humans in decision-making or learning, work-load allocation, and interactive advertising. This is a challenging problem for several reasons. First, humans’ behavior is hard to model or infer, as it may reflect biases, long term memory, and sensitivity to sequencing, i.e., transience and exponential complexity in the length of the interaction. Second, due to the interactive nature of such processes, the machine policy used to engage with a human may bias possible data-driven inferences. Finally, in choosing machine policies that optimize interaction rewards, one must, on the one hand, avoid being overly sensitive to error/variability in the estimated human model, and, on the other, avoid being overly deterministic/predictable, which may result in poor human ‘engagement’ in the interaction. To meet these challenges, we propose a robust approach, based on the maximum entropy principle, which iteratively estimates human behavior and optimizes the machine policy: the Alternating Entropy-Reward Ascent (AREA) algorithm. We characterize AREA in terms of its space and time complexity and convergence. We also provide an initial validation based on synthetic data generated by an established noisy nonlinear model for human decision-making.
Tasks	Decision Making
Published	2019-03-17
URL	http://arxiv.org/abs/1903.07157v1
PDF	http://arxiv.org/pdf/1903.07157v1.pdf
PWC	https://paperswithcode.com/paper/modeling-and-optimization-of-human-machine
Repo
Framework

TUNet: Incorporating segmentation maps to improve classification


Title	TUNet: Incorporating segmentation maps to improve classification
Authors	Yijun Tian
Abstract	Determining the localization of specific proteins in human cells is important for understanding cellular functions and biological processes of underlying diseases. Among imaging techniques, high-throughput fluorescence microscopy imaging is an efficient biotechnology to stain the protein of interest in a cell. In this work, we present a novel classification model, Twin U-Net (TUNet), for processing the Atlas images and classifying the localization of proteins in them. Several notable deep learning models, including GoogLeNet and ResNet, have been employed for comparison. Results show that our system obtains competitive performance.
Tasks
Published	2019-01-27
URL	http://arxiv.org/abs/1901.11379v1
PDF	http://arxiv.org/pdf/1901.11379v1.pdf
PWC	https://paperswithcode.com/paper/tunet-incorporating-segmentation-maps-to
Repo
Framework

Introducing MathQA – A Math-Aware Question Answering System


Title	Introducing MathQA – A Math-Aware Question Answering System
Authors	Moritz Schubotz, Philipp Scharpf, Kaushal Dudhat, Yash Nagar, Felix Hamborg, Bela Gipp
Abstract	We present an open source math-aware Question Answering System based on Ask Platypus. Our system returns a single mathematical formula for a natural language question in English or Hindi. These formulae originate from the knowledge-base Wikidata. We translate these formulae to computable data by integrating the calculation engine sympy into our system. This way, users can enter numeric values for the variables occurring in the formula. Moreover, the system loads numeric values for constants occurring in the formula from Wikidata. In a user study, our system outperformed a commercial computational mathematical knowledge engine by 13%. However, the performance of our system heavily depends on the size and quality of the formula data available in Wikidata. Since only a few items in Wikidata contained formulae when we started the project, we facilitated the import process by suggesting formula edits to Wikidata editors. With the simple heuristic that the first formula is significant for the article, 80% of the suggestions were correct.
Tasks	Question Answering
Published	2019-06-28
URL	https://arxiv.org/abs/1907.01642v1
PDF	https://arxiv.org/pdf/1907.01642v1.pdf
PWC	https://paperswithcode.com/paper/introducing-mathqa-a-math-aware-question
Repo
Framework

Global Health Monitor: A Web-based System for Detecting and Mapping Infectious Diseases


Title	Global Health Monitor: A Web-based System for Detecting and Mapping Infectious Diseases
Authors	Son Doan, Quoc-Hung Ngo, Ai Kawazoe, Nigel Collier
Abstract	We present the Global Health Monitor, an online Web-based system for detecting and mapping infectious disease outbreaks that appear in news stories. The system analyzes English news stories from news feed providers, classifies them for topical relevance and plots them onto a Google map using geo-coding information, helping public health workers to monitor the spread of diseases in a geo-temporal context. The background knowledge for the system is contained in the BioCaster ontology (BCO) (Collier et al., 2007a), which includes information on both infectious diseases and geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and visualization. Evaluation of the system shows that it achieved high accuracy on a gold standard corpus. The system is now in practical use. Running on a cluster computer, it monitors more than 1500 news feeds 24/7, updating the map every hour.
Tasks	Named Entity Recognition
Published	2019-11-21
URL	https://arxiv.org/abs/1911.09735v1
PDF	https://arxiv.org/pdf/1911.09735v1.pdf
PWC	https://paperswithcode.com/paper/global-health-monitor-a-web-based-system-for
Repo
Framework

Graph Optimized Convolutional Networks


Title	Graph Optimized Convolutional Networks
Authors	Bo Jiang, Ziyan Zhang, Jin Tang, Bin Luo
Abstract	Graph Convolutional Networks (GCNs) have been widely studied for graph data representation and learning tasks. Existing GCNs generally use a single fixed graph, which may be suboptimal for data representation/learning, and they also have difficulty dealing with multiple graphs. To address these issues, we propose a novel Graph Optimized Convolutional Network (GOCN) for graph data representation and learning. Our GOCN is motivated by our re-interpretation of graph convolution from a regularization/optimization framework. The core idea of GOCN is to formulate graph optimization and graph convolutional representation in a unified framework and thus conduct both of them cooperatively to boost their respective performance in the GCN learning scheme. Moreover, based on the proposed unified graph optimization-convolution framework, we propose a novel Multiple Graph Optimized Convolutional Network (M-GOCN) to naturally address data with multiple graphs. Experimental results demonstrate the effectiveness and benefit of the proposed GOCN and M-GOCN.
Tasks	Node Classification, Representation Learning
Published	2019-04-26
URL	http://arxiv.org/abs/1904.11883v1
PDF	http://arxiv.org/pdf/1904.11883v1.pdf
PWC	https://paperswithcode.com/paper/graph-optimized-convolutional-networks
Repo
Framework

Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation


Title	Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation
Authors	Dibyendu Das, Shahid Asghar Ahmad, Kumar Venkataramanan
Abstract	Register allocation, which is a crucial phase of a good optimizing compiler, relies on graph coloring. Hence, an efficient graph coloring algorithm is of paramount importance. In this work we try to learn a good heuristic for coloring interference graphs that are used in the register allocation phase. We aim to handle moderate sized interference graphs which have 100 nodes or fewer. For such graphs we can get the optimal allocation of colors to the nodes. Such optimal coloring is then used to train our Deep Learning network, which is based on several layers of LSTM that output a color for each node of the graph. However, the current network may allocate the same color to nodes connected by an edge, resulting in an invalid coloring of the interference graph. Since it is difficult to encode constraints in an LSTM to avoid invalid coloring, we augment our deep learning network with a color correction phase that runs after the colors have been allocated by the network. Thus, our algorithm is hybrid in nature, consisting of a deep learning algorithm followed by a more traditional correction phase. We have trained our network using several thousand random graphs of varying sparsity. On application of our hybrid algorithm to various popular graphs found in the literature, we see that our algorithm does very well when compared to the optimal coloring of these graphs. We have also run our algorithm against LLVM's popular greedy register allocator for several SPEC CPU 2017 benchmarks and notice that the hybrid algorithm performs on par with or better than such a well-tuned allocator for most of these benchmarks.
Tasks
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03700v1
PDF	https://arxiv.org/pdf/1912.03700v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-based-hybrid-graph-coloring
Repo
Framework
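
The abstract above describes a correction phase that repairs colorings in which two adjacent nodes received the same color. The abstract does not give the authors' procedure, so the sketch below is one simple possibility: a single greedy pass that recolors each conflicting node with the smallest color unused by its neighbors. It assumes an undirected graph stored as a symmetric adjacency dictionary without self-loops; `correct_coloring` and `is_valid` are hypothetical names.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def is_valid(graph: Graph, colors: Dict[Hashable, int]) -> bool:
    """True when no edge joins two nodes of the same color."""
    return all(colors[u] != colors[v] for u in graph for v in graph[u])


def correct_coloring(graph: Graph, colors: Dict[Hashable, int]) -> Dict[Hashable, int]:
    """Return a valid coloring, changing only nodes that conflict with a neighbor."""
    fixed = dict(colors)  # leave the predicted coloring untouched
    for node in graph:
        neighbor_colors = {fixed[n] for n in graph[node]}
        if fixed[node] in neighbor_colors:
            # Pick the smallest color that no neighbor currently uses.
            candidate = 0
            while candidate in neighbor_colors:
                candidate += 1
            fixed[node] = candidate
    return fixed


# A triangle where a (hypothetical) network gave nodes "a" and "b" the same color.
triangle: Graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
predicted = {"a": 0, "b": 0, "c": 1}
repaired = correct_coloring(triangle, predicted)
```

One pass suffices here because every recoloring avoids all current neighbor colors, so a node that has been checked can never be put back into conflict. The pass may use more colors than the optimum, which in register allocation would translate into extra registers or spills.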

What I See Is What You See: Joint Attention Learning for First and Third Person Video Co-analysis


Title	What I See Is What You See: Joint Attention Learning for First and Third Person Video Co-analysis
Authors	Huangyue Yu, Minjie Cai, Yunfei Liu, Feng Lu
Abstract	In recent years, more and more videos are captured from the first-person viewpoint by wearable cameras. Such first-person video provides additional information besides the traditional third-person video, and thus has a wide range of applications. However, techniques for analyzing the first-person video can be fundamentally different from those for the third-person video, and it is even more difficult to explore the shared information from both viewpoints. In this paper, we propose a novel method for first- and third-person video co-analysis. At the core of our method is the notion of “joint attention”, indicating the learnable representation that corresponds to the shared attention regions in different viewpoints and thus links the two viewpoints. To this end, we develop a multi-branch deep network with a triplet loss to extract the joint attention from the first- and third-person videos via self-supervised learning. We evaluate our method on the public dataset with cross-viewpoint video matching tasks. Our method outperforms the state-of-the-art both qualitatively and quantitatively. We also demonstrate how the learned joint attention can benefit various applications through a set of additional experiments.
Tasks
Published	2019-04-16
URL	http://arxiv.org/abs/1904.07424v1
PDF	http://arxiv.org/pdf/1904.07424v1.pdf
PWC	https://paperswithcode.com/paper/what-i-see-is-what-you-see-joint-attention
Repo
Framework
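
The abstract above mentions training with a triplet loss. The exact formulation in the paper is not given here, so the following is a sketch of the standard margin-based triplet loss on plain feature vectors, using Euclidean distance as an assumption. The function names and the `margin` default are illustrative.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Zero once the positive is closer to the anchor than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings: the anchor could be a first-person clip, the positive
# the matching third-person clip, and the negative a non-matching clip.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

The loss only produces a training signal for triplets that violate the margin, which pushes matching views together and non-matching views apart.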

Voxel2Mesh: 3D Mesh Model Generation from Volumetric Data


Title	Voxel2Mesh: 3D Mesh Model Generation from Volumetric Data
Authors	Udaranga Wickramasinghe, Edoardo Remelli, Graham Knott, Pascal Fua
Abstract	CNN-based volumetric methods that label individual voxels now dominate the field of biomedical segmentation. However, 3D surface representations are often required for proper analysis. They can be obtained by post-processing the labeled volumes which typically introduces artifacts and prevents end-to-end training. In this paper, we therefore introduce a novel architecture that goes directly from 3D image volumes to 3D surfaces without post-processing and with better accuracy than current methods. We evaluate it on Electron Microscopy and MRI brain images as well as CT liver scans. We will show that it outperforms state-of-the-art segmentation methods.
Tasks
Published	2019-12-08
URL	https://arxiv.org/abs/1912.03681v2
PDF	https://arxiv.org/pdf/1912.03681v2.pdf
PWC	https://paperswithcode.com/paper/vm-net-mesh-modeling-to-assist-segmentation
Repo
Framework

Comment on “AndrODet: An adaptive Android obfuscation detector”


Title	Comment on “AndrODet: An adaptive Android obfuscation detector”
Authors	Alireza Mohammadinodooshan, Ulf Kargén, Nahid Shahmehri
Abstract	We have identified a methodological problem in the empirical evaluation of the string encryption detection capabilities of the AndrODet system described by Mirzaei et al. in the recent paper “AndrODet: An adaptive Android obfuscation detector”. The accuracy of string encryption detection is evaluated using samples from the AMD and PraGuard malware datasets. However, the authors failed to account for the fact that many of the AMD samples are highly similar due to the fact that they come from the same malware family. This introduces a risk that a machine learning system trained on these samples could fail to learn a generalizable model for string encryption detection, and might instead learn to classify samples based on characteristics of each malware family. Our own evaluation strongly indicates that the reported high accuracy of AndrODet’s string encryption detection is indeed due to this phenomenon. When we evaluated AndrODet, we found that when we ensured that samples from the same family never appeared in both training and testing data, the accuracy dropped to around 50%. Moreover, the PraGuard dataset is not suitable for evaluating a static string encryption detector such as AndrODet, since the particular obfuscation tool used to produce the dataset effectively makes it impossible to extract meaningful features of static strings in Android apps.
Tasks
Published	2019-10-14
URL	https://arxiv.org/abs/1910.06192v2
PDF	https://arxiv.org/pdf/1910.06192v2.pdf
PWC	https://paperswithcode.com/paper/comment-on-androdet-an-adaptive-android
Repo
Framework
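
The evaluation fix described in the abstract above is to ensure that samples from the same malware family never appear in both training and testing data. A minimal sketch of such a family-aware split follows, assuming each sample is given as a `(sample_id, family)` pair; the function name, the `test_fraction` parameter, and the sample identifiers are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple


def family_split(
    samples: List[Tuple[Hashable, Hashable]],
    test_fraction: float = 0.3,
    seed: int = 0,
) -> Tuple[List[Hashable], List[Hashable], Set[Hashable]]:
    """Split by family so that no family contributes to both train and test."""
    by_family: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for sample_id, family in samples:
        by_family[family].append(sample_id)

    families = sorted(by_family)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(families)

    # The fraction applies to the number of families, not the number of samples.
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [s for f in families if f not in test_families for s in by_family[f]]
    test = [s for f in families if f in test_families for s in by_family[f]]
    return train, test, test_families


# Hypothetical sample identifiers and family labels.
samples = [
    ("s1", "famA"), ("s2", "famA"), ("s3", "famB"),
    ("s4", "famB"), ("s5", "famC"), ("s6", "famD"),
]
train_ids, test_ids, held_out = family_split(samples, test_fraction=0.5)
```

With a plain random split over samples, near-duplicate members of one family can land on both sides, which is the leakage the comment paper warns about.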

A Replication Strategy for Mobile Opportunistic Networks based on Utility Clustering


Title	A Replication Strategy for Mobile Opportunistic Networks based on Utility Clustering
Authors	Evangelos Papapetrou, Aristidis Likas
Abstract	Dynamic replication is a wide-spread multi-copy routing approach for efficiently coping with the intermittent connectivity in mobile opportunistic networks. According to it, a node forwards a message replica to an encountered node based on a utility value that captures the latter’s fitness for delivering the message to the destination. The popularity of the approach stems from its flexibility to effectively operate in networks with diverse characteristics without requiring special customization. Nonetheless, its drawback is the tendency to produce a high number of replicas that consume limited resources such as energy and storage. To tackle the problem we make the observation that network nodes can be grouped, based on their utility values, into clusters that portray different delivery capabilities. We exploit this finding to transform the basic forwarding strategy, which is to move a packet using nodes of increasing utility, and actually forward it through clusters of increasing delivery capability. The new strategy works in synergy with the basic dynamic replication algorithms and is fully configurable, in the sense that it can be used with virtually any utility function. We also extend our approach to work with two utility functions at the same time, a feature that is especially efficient in mobile networks that exhibit social characteristics. By conducting experiments in a wide set of real-life networks, we empirically show that our method is robust in reducing the overall number of replicas in networks with diverse connectivity characteristics without at the same time hindering delivery efficiency.
Tasks
Published	2019-12-23
URL	https://arxiv.org/abs/1912.11146v1
PDF	https://arxiv.org/pdf/1912.11146v1.pdf
PWC	https://paperswithcode.com/paper/a-replication-strategy-for-mobile
Repo
Framework

Explicit topological priors for deep-learning based image segmentation using persistent homology


Title	Explicit topological priors for deep-learning based image segmentation using persistent homology
Authors	James R. Clough, Ilkay Oksuz, Nicholas Byrne, Julia A. Schnabel, Andrew P. King
Abstract	We present a novel method to explicitly incorporate topological prior knowledge into deep learning based segmentation, which is, to our knowledge, the first work to do so. Our method uses the concept of persistent homology, a tool from topological data analysis, to capture high-level topological characteristics of segmentation results in a way which is differentiable with respect to the pixelwise probability of being assigned to a given class. The topological prior knowledge consists of the sequence of desired Betti numbers of the segmentation. As a proof-of-concept we demonstrate our approach by applying it to the problem of left-ventricle segmentation of cardiac MR images of 500 subjects from the UK Biobank dataset, where we show that it improves segmentation performance in terms of topological correctness without sacrificing pixelwise accuracy.
Tasks	Semantic Segmentation, Topological Data Analysis
Published	2019-01-29
URL	http://arxiv.org/abs/1901.10244v1
PDF	http://arxiv.org/pdf/1901.10244v1.pdf
PWC	https://paperswithcode.com/paper/explicit-topological-priors-for-deep-learning
Repo
Framework
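
The abstract above specifies the topological prior as a sequence of desired Betti numbers. To make that concrete, the sketch below counts the first two Betti numbers of a thresholded 2D binary mask: the number of connected components and the number of holes. This is only the discrete, non-differentiable case; the paper's persistent-homology formulation over pixelwise probabilities is not reproduced. The choice of 8-connectivity for foreground and 4-connectivity for background is an assumption, and the function names are illustrative.

```python
from typing import List, Sequence, Tuple

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def count_components(grid: List[List[int]], value: int, neighbors) -> int:
    """Count connected regions of cells equal to `value` using a flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or (r, c) in seen:
                continue
            count += 1
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count


def betti_numbers(mask: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Return (components, holes) for a non-empty binary mask of 0s and 1s."""
    cols = len(mask[0])
    # Pad with background so the outside region forms exactly one component.
    padded = [[0] * (cols + 2)]
    padded += [[0] + list(row) + [0] for row in mask]
    padded += [[0] * (cols + 2)]
    b0 = count_components(padded, 1, N8)
    b1 = count_components(padded, 0, N4) - 1  # subtract the outside region
    return b0, b1


ring = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
two_blobs = [[1, 0, 1], [0, 0, 0], [0, 0, 0]]
ring_betti = betti_numbers(ring)
blob_betti = betti_numbers(two_blobs)
```

A segmentation whose Betti numbers differ from the desired ones, for example a ring broken into two pieces, is topologically wrong even if most pixels are labeled correctly.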

Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models


Title	Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models
Authors	Carlos Fernandez-Lozano, Ruben F. Cuinas, Jose A. Seoane, Enrique Fernandez-Blanco, Julian Dorado, Cristian R. Munteanu
Abstract	Signaling proteins are an important topic in drug development due to the increased importance of finding fast, accurate and cheap methods to evaluate new molecular targets involved in specific diseases. The complexity of the protein structure hinders the direct association of the signaling activity with the molecular structure. Therefore, the proposed solution involves the use of protein star graphs for the peptide sequence information encoding into specific topological indices calculated with S2SNet tool. The Quantitative Structure-Activity Relationship classification model obtained with Machine Learning techniques is able to predict new signaling peptides. The best classification model is the first signaling prediction model, which is based on eleven descriptors and it was obtained using the Support Vector Machines - Recursive Feature Elimination (SVM-RFE) technique with the Laplacian kernel (RFE-LAP) and an AUROC of 0.961. Testing a set of 3114 proteins of unknown function from the PDB database assessed the prediction performance of the model. Important signaling pathways are presented for three UniprotIDs (34 PDBs) with a signaling prediction greater than 98.0%.
Tasks
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05052v1
PDF	http://arxiv.org/pdf/1904.05052v1.pdf
PWC	https://paperswithcode.com/paper/classification-of-signaling-proteins-based-on
Repo
Framework