January 29, 2020

3157 words 15 mins read

Paper Group ANR 712

PET/CT Radiomic Sequencer for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients. Explainable Deep Learning for Augmentation of sRNA Expression Profiles. Measure Contribution of Participants in Federated Learning. NESTA: Hamming Weight Compression-Based Neural Proc. Engine. Batch Uniformization for Minimizing Maximum Anomaly Score of DNN …

PET/CT Radiomic Sequencer for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients

Title PET/CT Radiomic Sequencer for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients
Authors Isaac Shiri, Hassan Maleki, Ghasem Hajianfar, Hamid Abdollahi, Saeed Ashrafinia, Mathieu Hatt, Mehrdad Oveisi, Arman Rahmim
Abstract The aim of this study was to develop radiomic models using PET/CT radiomic features with different machine learning approaches to find the best predictors of epidermal growth factor receptor (EGFR) and Kirsten rat sarcoma viral oncogene (KRAS) mutation status. Patients' images, including PET and CT [diagnostic (CTD) and low-dose CT (CTA)], were pre-processed using wavelet (WAV), Laplacian of Gaussian (LOG) and 64-bin discretization (BIN) filters (alone or in combination), and several features were extracted from the images. The prediction performance of each model was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). Results showed a wide range of radiomic model AUC performances, up to 0.75, in prediction of EGFR and KRAS mutation status. Combining the K-Best and variance-threshold feature selectors with a logistic regression (LREG) classifier on diagnostic CT scans led to the best performance for EGFR (CTD-BIN+B-KB+LREG, AUC: mean 0.75, sd 0.10) and KRAS (CTD-BIN-LOG-WAV+B-VT+LREG, AUC: mean 0.75, sd 0.07), respectively. Additionally, incorporating PET kept AUC values at ~0.74. When considering conventional features only, the highest predictive performance was achieved by PET SUVpeak (AUC: 0.69) for EGFR and by PET MTV (AUC: 0.55) for KRAS. In comparison with conventional PET parameters such as standard uptake value, radiomic models were found to be more predictive. Our findings demonstrate that non-invasive and reliable radiomics analysis can be successfully used to predict EGFR and KRAS mutation status in NSCLC patients.
Tasks
Published 2019-06-15
URL https://arxiv.org/abs/1906.06623v1
PDF https://arxiv.org/pdf/1906.06623v1.pdf
PWC https://paperswithcode.com/paper/petct-radiomic-sequencer-for-prediction-of
Repo
Framework
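The feature-selection-plus-classifier pipeline described in the entry above (variance-threshold and K-Best selection feeding a logistic regression classifier, scored by cross-validated AUC) can be approximated with standard scikit-learn components. The sketch below is illustrative only: the radiomic feature matrix, label vector, and hyperparameters are hypothetical placeholders, not the study's data or settings.

```python
# Minimal sketch of a variance-threshold + K-Best + logistic-regression pipeline,
# evaluated with cross-validated ROC AUC. X and y are placeholder stand-ins for a
# radiomic feature matrix and binary EGFR/KRAS mutation labels.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import VarianceThreshold, SelectKBest, f_classif
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 400))           # placeholder radiomic features
y = rng.integers(0, 2, size=120)          # placeholder mutation labels

pipeline = Pipeline([
    ("variance", VarianceThreshold(threshold=0.0)),  # drop constant features
    ("scale", StandardScaler()),
    ("kbest", SelectKBest(score_func=f_classif, k=20)),
    ("clf", LogisticRegression(max_iter=1000)),
])

aucs = cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc")
print(f"AUC: mean {aucs.mean():.2f} sd {aucs.std():.2f}")
```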

Explainable Deep Learning for Augmentation of sRNA Expression Profiles

Title Explainable Deep Learning for Augmentation of sRNA Expression Profiles
Authors Jelena Fiosina, Maksims Fiosins, Stefan Bonn
Abstract The lack of well-structured metadata annotations complicates the re-usability and interpretation of the growing amount of publicly available RNA expression data. The machine learning-based prediction of metadata (data augmentation) can considerably improve the quality of expression data annotation. In this study, we systematically benchmark deep learning (DL) and random forest (RF)-based metadata augmentation of tissue, age, and sex using small RNA (sRNA) expression profiles. We use 4243 annotated sRNA-Seq samples from the small RNA expression atlas (SEA) database to train and test the augmentation performance. In general, the DL machine learner outperforms the RF method in almost all tested cases. The average cross-validated prediction accuracy of the DL algorithm for tissues is 96.5%, for sex is 77%, and for age is 77.2%. The average tissue prediction accuracy for a completely new dataset is 83.1% (DL) and 80.8% (RF). To understand which sRNAs influence DL predictions, we employ backpropagation-based feature importance scores using the DeepLIFT method, which enable us to obtain information on the biological relevance of sRNAs.
Tasks Data Augmentation, Feature Importance
Published 2019-09-26
URL https://arxiv.org/abs/1909.11956v1
PDF https://arxiv.org/pdf/1909.11956v1.pdf
PWC https://paperswithcode.com/paper/explainable-deep-learning-for-augmentation-of
Repo
Framework
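The feature-importance step named in the abstract above (backpropagation-based scores via DeepLIFT) can be reproduced in spirit with the Captum library on any PyTorch classifier. The sketch below is a generic illustration under assumed dimensions (a small multilayer perceptron over hypothetical sRNA expression features and tissue classes), not the authors' model.

```python
# Illustrative DeepLIFT-style attribution for a small PyTorch classifier,
# assuming hypothetical sRNA expression inputs; not the paper's actual network.
import torch
import torch.nn as nn
from captum.attr import DeepLift

model = nn.Sequential(               # placeholder metadata-prediction network
    nn.Linear(500, 64), nn.ReLU(),
    nn.Linear(64, 10),               # e.g. 10 tissue classes (assumed)
)
model.eval()

inputs = torch.rand(8, 500)          # placeholder expression profiles
baseline = torch.zeros_like(inputs)  # all-zero reference profile

explainer = DeepLift(model)
attributions = explainer.attribute(inputs, baselines=baseline, target=0)
print(attributions.shape)            # per-feature importance scores, (8, 500)
```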

Measure Contribution of Participants in Federated Learning

Title Measure Contribution of Participants in Federated Learning
Authors Guan Wang, Charlie Xiaoqian Dang, Ziye Zhou
Abstract Federated Machine Learning (FML) creates an ecosystem for multiple parties to collaborate on building models while protecting data privacy for the participants. A measure of the contribution of each party in FML enables fair credit allocation. In this paper we develop simple but powerful techniques to fairly calculate the contributions of multiple parties in FML, in the context of both horizontal FML and vertical FML. For horizontal FML we use a deletion method to calculate the grouped instance influence. For vertical FML we use Shapley values to calculate the grouped feature importance. Our methods open the door for research in model contribution and credit allocation in the context of federated machine learning.
Tasks Feature Importance
Published 2019-09-17
URL https://arxiv.org/abs/1909.08525v1
PDF https://arxiv.org/pdf/1909.08525v1.pdf
PWC https://paperswithcode.com/paper/measure-contribution-of-participants-in
Repo
Framework
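For the horizontal case, the deletion method mentioned above amounts to retraining the model with each party's instances removed and attributing to that party the resulting performance drop. The sketch below illustrates the idea with scikit-learn on placeholder data; the party splits, the model, and the metric are assumptions, not the paper's setup.

```python
# Sketch of deletion-based contribution measurement for horizontal FL:
# a party's contribution is the accuracy drop when its data is left out.
# Data splits, model, and metric are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
parties = {f"party_{i}": (rng.normal(size=(200, 10)), rng.integers(0, 2, 200))
           for i in range(3)}
X_test, y_test = rng.normal(size=(300, 10)), rng.integers(0, 2, 300)

def train_and_score(excluded=None):
    Xs = [X for name, (X, _) in parties.items() if name != excluded]
    ys = [y for name, (_, y) in parties.items() if name != excluded]
    model = LogisticRegression(max_iter=1000).fit(np.vstack(Xs), np.concatenate(ys))
    return accuracy_score(y_test, model.predict(X_test))

baseline = train_and_score()
for name in parties:
    contribution = baseline - train_and_score(excluded=name)
    print(f"{name}: contribution {contribution:+.3f}")
```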

NESTA: Hamming Weight Compression-Based Neural Proc. Engine

Title NESTA: Hamming Weight Compression-Based Neural Proc. Engine
Authors Ali Mirzaeian, Houman Homayoun, Avesta Sasan
Abstract In this paper, we present NESTA, a specialized neural engine that significantly accelerates the computation of convolution layers in a deep convolutional neural network while reducing the computational energy. NESTA reformats convolutions into $3 \times 3$ batches and uses a hierarchy of Hamming weight compressors to process each batch. Moreover, when processing a convolution across multiple channels, NESTA, rather than computing the precise result of the convolution per channel, quickly computes an approximation of its partial sum and a residual value such that, when added to the approximate partial sum, it yields the accurate output. Then, instead of immediately adding the residual, NESTA uses (consumes) the residual when processing the next batch in the Hamming weight compressors with available capacity. This mechanism shortens the critical path by avoiding the need to propagate carry signals during each round of computation and speeds up the convolution of each channel. In the last stage of computation, when the partial sum of the last channel is computed, NESTA terminates by adding the residual bits to the approximate output to generate the correct result.
Tasks
Published 2019-10-01
URL https://arxiv.org/abs/1910.00700v1
PDF https://arxiv.org/pdf/1910.00700v1.pdf
PWC https://paperswithcode.com/paper/nesta-hamming-weight-compression-based-neural
Repo
Framework
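The deferred-residual idea described above (keep an approximate sum and a separate carry/residual term, and only fold the residual in at the very end) can be illustrated with a plain carry-save accumulator. The sketch below is a simplified software analogy of that mechanism, not NESTA's hardware compressor hierarchy.

```python
# Simplified carry-save accumulation: the running "sum" is approximate and a
# separate "carry" residual is deferred until the final step, analogous to
# NESTA's approximate partial sums plus residual bits. Illustrative only.
def carry_save_accumulate(values):
    s, c = 0, 0                        # approximate sum and deferred residual
    for x in values:
        # 3:2 compression step: combine s, c, x without propagating carries
        new_s = s ^ c ^ x
        new_c = ((s & c) | (s & x) | (c & x)) << 1
        s, c = new_s, new_c
    return s + c                       # fold the residual in once, at the end

values = [13, 7, 42, 5, 19]
assert carry_save_accumulate(values) == sum(values)
print(carry_save_accumulate(values))   # 86
```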

Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds

Title Batch Uniformization for Minimizing Maximum Anomaly Score of DNN-based Anomaly Detection in Sounds
Authors Yuma Koizumi, Shoichiro Saito, Masataka Yamaguchi, Shin Murata, Noboru Harada
Abstract Use of an autoencoder (AE) as a normal model is a state-of-the-art technique for unsupervised anomaly detection in sounds (ADS). The AE is trained to minimize the sample mean of the anomaly score of normal sounds in a mini-batch. One problem with this approach is that the anomaly score of rare-normal sounds becomes higher than that of frequent-normal sounds, because the sample mean is strongly affected by frequent-normal samples, resulting in preferentially decreasing the anomaly score of frequent-normal samples. To decrease anomaly scores for both frequent- and rare-normal sounds, we propose batch uniformization, a training method for unsupervised ADS that minimizes a weighted average of the anomaly scores of the samples in a mini-batch. We use the reciprocal of the probability density of each sample as the weight; intuitively, a large weight is given to rare-normal sounds. Such a weight works to give a constant anomaly score for both frequent- and rare-normal sounds. Since the probability density is unknown, we estimate it using kernel density estimation on each training mini-batch. Verification and objective experiments show that the proposed batch uniformization improves the performance of unsupervised ADS.
Tasks Anomaly Detection, Density Estimation, Unsupervised Anomaly Detection
Published 2019-07-19
URL https://arxiv.org/abs/1907.08338v1
PDF https://arxiv.org/pdf/1907.08338v1.pdf
PWC https://paperswithcode.com/paper/batch-uniformization-for-minimizing-maximum
Repo
Framework
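The weighting scheme described above (per-sample weights proportional to the reciprocal of a density estimated on the mini-batch) can be sketched with a kernel density estimator and a weighted anomaly-score average. The code below is a schematic illustration on placeholder features and scores, not the authors' implementation.

```python
# Sketch of batch uniformization weights: fit a KDE on the mini-batch,
# then weight each sample's anomaly (reconstruction) score by 1 / density.
# Batch features, bandwidth, and reconstruction errors are placeholders.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
batch = rng.normal(size=(64, 8))                  # placeholder feature vectors
recon_error = rng.gamma(2.0, 1.0, size=64)        # placeholder anomaly scores

kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(batch)
density = np.exp(kde.score_samples(batch))        # estimated p(x) per sample

weights = 1.0 / np.maximum(density, 1e-12)        # rare samples get large weight
weights /= weights.sum()                          # normalize over the mini-batch

uniformized_loss = np.sum(weights * recon_error)  # weighted average of scores
print(uniformized_loss)
```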

Predicting city safety perception based on visual image content

Title Predicting city safety perception based on visual image content
Authors Sergio Acosta, Jorge E. Camargo
Abstract Safety perception measurement has been a subject of interest in many cities of the world. This is due to its social relevance and to its effect on some local economic activities. Even though people's safety perception is subjective, it is sometimes possible to identify common patterns within a restricted geographical and sociocultural context. This paper presents an approach that uses image processing and machine learning techniques to detect, with high accuracy, urban environment patterns that could affect citizens' safety perception.
Tasks
Published 2019-02-19
URL http://arxiv.org/abs/1902.06871v1
PDF http://arxiv.org/pdf/1902.06871v1.pdf
PWC https://paperswithcode.com/paper/predicting-city-safety-perception-based-on
Repo
Framework

K-Means Clustering on Noisy Intermediate Scale Quantum Computers

Title K-Means Clustering on Noisy Intermediate Scale Quantum Computers
Authors Sumsam Ullah Khan, Ahsan Javed Awan, Gemma Vall-Llosera
Abstract Real-time clustering of big performance data generated by telecommunication networks requires domain-specific high-performance compute infrastructure to detect anomalies. In this paper, we evaluate noisy intermediate-scale quantum (NISQ) computers, characterized by low decoherence times, for K-means clustering and propose three strategies to generate the shorter-depth quantum circuits needed to overcome the limitation of NISQ computers. The strategies are based on exploiting: i) quantum interference, ii) negative rotations, and iii) destructive interference. By comparing our implementations on the IBMQX2 machine for representative data sets, we show that NISQ computers can solve the K-means clustering problem with the same level of accuracy as classical computers.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12183v1
PDF https://arxiv.org/pdf/1909.12183v1.pdf
PWC https://paperswithcode.com/paper/k-means-clustering-on-noisy-intermediate
Repo
Framework

Guiding Neuroevolution with Structural Objectives

Title Guiding Neuroevolution with Structural Objectives
Authors Kai Olav Ellefsen, Joost Huizinga, Jim Torresen
Abstract The structure and performance of neural networks are intimately connected, and by use of evolutionary algorithms, neural network structures optimally adapted to a given task can be explored. Guiding such neuroevolution with additional objectives related to network structure has been shown to improve performance in some cases, especially when modular neural networks are beneficial. However, apart from objectives aiming to make networks more modular, such structural objectives have not been widely explored. We propose two new structural objectives and test their ability to guide evolving neural networks on two problems which can benefit from decomposition into subtasks. The first structural objective guides evolution to align neural networks with a user-recommended decomposition pattern. Intuitively, this should be a powerful guiding target for problems where human users can easily identify a structure. The second structural objective guides evolution towards a population with a high diversity in decomposition patterns. This results in exploration of many different ways to decompose a problem, allowing evolution to find good decompositions faster. Tests on our target problems reveal that both methods perform well on a problem with a very clear and decomposable structure. However, on a problem where the optimal decomposition is less obvious, the structural diversity objective is found to outcompete other structural objectives – and this technique can even increase performance on problems without any decomposable structure at all.
Tasks
Published 2019-02-12
URL http://arxiv.org/abs/1902.04346v3
PDF http://arxiv.org/pdf/1902.04346v3.pdf
PWC https://paperswithcode.com/paper/guiding-neuroevolution-with-structural
Repo
Framework

The Principle of Unchanged Optimality in Reinforcement Learning Generalization

Title The Principle of Unchanged Optimality in Reinforcement Learning Generalization
Authors Alex Irpan, Xingyou Song
Abstract Several recent papers have examined generalization in reinforcement learning (RL) by proposing new environments or ways to add noise to existing environments, then benchmarking algorithms and model architectures on those environments. We discuss subtle conceptual properties of RL benchmarks that are not required in supervised learning (SL), as well as properties that an RL benchmark should possess. Chief among them is one we call the principle of unchanged optimality: there should exist a single $\pi$ that is optimal across all train and test tasks. In this work, we argue why this principle is important and describe ways it can be broken or satisfied due to subtle choices in state representation or model architecture. We conclude by discussing challenges and future lines of research in theoretically analyzing generalization benchmarks.
Tasks
Published 2019-06-02
URL https://arxiv.org/abs/1906.00336v1
PDF https://arxiv.org/pdf/1906.00336v1.pdf
PWC https://paperswithcode.com/paper/190600336
Repo
Framework

Few-Shot Sequence Labeling with Label Dependency Transfer and Pair-wise Embedding

Title Few-Shot Sequence Labeling with Label Dependency Transfer and Pair-wise Embedding
Authors Yutai Hou, Zhihan Zhou, Yijia Liu, Ning Wang, Wanxiang Che, Han Liu, Ting Liu
Abstract While few-shot classification has been widely explored with similarity-based methods, few-shot sequence labeling poses a unique challenge because it also calls for modeling label dependencies. To consider both item similarity and label dependency, we propose to leverage conditional random fields (CRFs) in few-shot sequence labeling. Our model calculates emission scores with similarity-based methods and obtains transition scores with a specially designed transfer mechanism. When applying CRFs in few-shot scenarios, the discrepancy of label sets among different domains makes it hard to reuse the label dependencies learned in prior domains. To tackle this, we introduce a dependency transfer mechanism that transfers abstract label transition patterns. In addition, similarity methods rely on high-quality sample representations, which are challenging to obtain for sequence labeling because the sense of a word differs when measuring its similarity to words in different sentences. To remedy this, we take advantage of recent contextual embedding techniques and further propose a pair-wise embedder, which provides additional certainty about word sense by embedding query and support sentences pairwise. Experimental results on slot tagging and named entity recognition show that our model significantly outperforms the strongest few-shot learning baseline by 11.76 (21.2%) and 12.18 (97.7%) F1 scores, respectively, in the one-shot setting.
Tasks Few-Shot Learning, Named Entity Recognition
Published 2019-06-20
URL https://arxiv.org/abs/1906.08711v3
PDF https://arxiv.org/pdf/1906.08711v3.pdf
PWC https://paperswithcode.com/paper/few-shot-sequence-labeling-with-label
Repo
Framework
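At inference time, the combination described above (similarity-based emission scores plus transferred transition scores in a CRF) reduces to Viterbi decoding over an emission matrix and a transition matrix. The sketch below shows that decoding step only, with random scores standing in for the similarity and transfer mechanisms.

```python
# Viterbi decoding over similarity-based emission scores and a label-transition
# matrix, as used at inference time in a CRF-style sequence labeler.
# The emission and transition scores here are illustrative placeholders.
import numpy as np

def viterbi(emission, transition):
    """emission: (T, L) per-token label scores; transition: (L, L) scores."""
    T, L = emission.shape
    score = emission[0].copy()
    backptr = np.zeros((T, L), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transition + emission[t][None, :]  # (L, L)
        backptr[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t][path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
emission = rng.normal(size=(6, 4))    # e.g. similarity of tokens to label prototypes
transition = rng.normal(size=(4, 4))  # e.g. transferred label-transition scores
print(viterbi(emission, transition))  # best label index per token
```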

Nested Variational Autoencoder for Topic Modeling on Microtexts with Word Vectors

Title Nested Variational Autoencoder for Topic Modeling on Microtexts with Word Vectors
Authors Trung Trinh, Tho Quan, Trung Mai
Abstract Most of the information on the Internet is represented in the form of microtexts, which are short text snippets such as news headlines or tweets. These sources of information are abundant, and mining these data could uncover meaningful insights. Topic modeling is one of the popular methods for extracting knowledge from a collection of documents; however, conventional topic models such as latent Dirichlet allocation (LDA) are unable to perform well on short documents, mostly due to the scarcity of word co-occurrence statistics embedded in the data. The objective of our research is to create a topic model that achieves strong performance on microtexts while requiring a small runtime for scalability to large datasets. To address the lack of information in microtexts, we allow our method to take advantage of word embeddings for additional knowledge of the relationships between words. For speed and scalability, we apply autoencoding variational Bayes, an algorithm that can perform efficient black-box inference in probabilistic models. The result of our work is a novel topic model called the nested variational autoencoder, which is a distribution that takes into account word vectors and is parameterized by a neural network architecture. For optimization, the model is trained to approximate the posterior distribution of the original LDA model. Experiments show the improvements of our model on microtexts as well as its runtime advantage.
Tasks Topic Models, Word Embeddings
Published 2019-05-01
URL https://arxiv.org/abs/1905.00195v3
PDF https://arxiv.org/pdf/1905.00195v3.pdf
PWC https://paperswithcode.com/paper/nested-variational-autoencoder-for-topic
Repo
Framework
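The core machinery named above, autoencoding variational Bayes applied to topic modeling, can be illustrated with a minimal logistic-normal neural topic model. The sketch below deliberately omits the paper's nested structure and word-vector conditioning and uses a generic bag-of-words encoder/decoder; it is an assumption-laden illustration, not the proposed model.

```python
# Minimal neural topic model trained with autoencoding variational Bayes:
# a bag-of-words encoder produces a logistic-normal document-topic vector,
# and a linear decoder reconstructs word probabilities. Generic sketch only;
# it omits the nested/word-vector components of the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniTopicVAE(nn.Module):
    def __init__(self, vocab_size=2000, n_topics=20, hidden=200):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.to_mu = nn.Linear(hidden, n_topics)
        self.to_logvar = nn.Linear(hidden, n_topics)
        self.topic_word = nn.Linear(n_topics, vocab_size, bias=False)  # topic-word weights

    def forward(self, bow):
        h = self.encoder(bow)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        theta = torch.softmax(z, dim=-1)                           # document-topic mixture
        log_word_probs = F.log_softmax(self.topic_word(theta), dim=-1)
        recon = -(bow * log_word_probs).sum(dim=-1)                # reconstruction NLL
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)
        return (recon + kl).mean()                                 # negative ELBO

model = MiniTopicVAE()
bow = torch.randint(0, 3, (32, 2000)).float()   # placeholder word-count vectors
loss = model(bow)
loss.backward()
print(float(loss))
```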

Learning to Collaborate in Markov Decision Processes

Title Learning to Collaborate in Markov Decision Processes
Authors Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla
Abstract We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting. We study the problem of designing a learning algorithm for the first agent (A1) that facilitates successful collaboration even when the second agent (A2) is adapting its policy in an unknown way. The key challenge in our setting is that the first agent faces non-stationarity in rewards and transitions because of the adaptive behavior of the second agent. We design novel online learning algorithms for agent A1 whose regret decays as $O(T^{\max\{1-\frac{3}{7}\alpha,\ \frac{1}{4}\}})$ with $T$ learning episodes, provided that the magnitude of agent A2's policy changes between any two consecutive episodes is upper bounded by $O(T^{-\alpha})$. Here, the parameter $\alpha$ is assumed to be strictly greater than $0$, and we show that this assumption is necessary provided that the learning parity with noise problem is computationally hard. We show that sub-linear regret of agent A1 further implies near-optimality of the agents' joint return for MDPs that manifest the properties of a smooth game.
Tasks
Published 2019-01-23
URL https://arxiv.org/abs/1901.08029v2
PDF https://arxiv.org/pdf/1901.08029v2.pdf
PWC https://paperswithcode.com/paper/learning-to-collaborate-in-markov-decision
Repo
Framework

Design space exploration of Ferroelectric FET based Processing-in-Memory DNN Accelerator

Title Design space exploration of Ferroelectric FET based Processing-in-Memory DNN Accelerator
Authors Insik Yoon, Matthew Jerry, Suman Datta, Arijit Raychowdhury
Abstract In this letter, we quantify the impact of device limitations on the classification accuracy of an artificial neural network whose synaptic weights are implemented in a Ferroelectric FET (FeFET) based in-memory processing architecture. We explore a design space consisting of the resolution of the analog-to-digital converter, the number of bits per FeFET cell, and the neural network depth. We show how the system architecture, training models, and overparametrization can address some of these device limitations.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.07942v1
PDF https://arxiv.org/pdf/1908.07942v1.pdf
PWC https://paperswithcode.com/paper/design-space-exploration-of-ferroelectric-fet
Repo
Framework
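The design-space axes named above (bits per memory cell and ADC resolution) can be explored in simulation by quantizing both the stored weights and the analog accumulation result. The sketch below is a generic numerical illustration of that kind of sweep, not the authors' FeFET device model.

```python
# Sketch of a design-space sweep: quantize weights to b bits per cell and the
# analog dot-product output to an r-bit ADC, then measure the error against a
# full-precision matrix-vector product. Generic illustration, no device model.
import numpy as np

def quantize(x, bits):
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    step = (hi - lo) / levels
    return lo + np.round((x - lo) / step) * step

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 64))          # placeholder synaptic weights
x = rng.normal(size=64)                 # placeholder input activations
exact = W @ x

for cell_bits in (1, 2, 4):
    for adc_bits in (4, 6, 8):
        y = quantize(W, cell_bits) @ x  # weights limited by bits per FeFET cell
        y = quantize(y, adc_bits)       # accumulation limited by ADC resolution
        err = np.linalg.norm(y - exact) / np.linalg.norm(exact)
        print(f"cell_bits={cell_bits} adc_bits={adc_bits} rel_error={err:.3f}")
```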

A Universal Density Matrix Functional from Molecular Orbital-Based Machine Learning: Transferability across Organic Molecules

Title A Universal Density Matrix Functional from Molecular Orbital-Based Machine Learning: Transferability across Organic Molecules
Authors Lixue Cheng, Matthew Welborn, Anders S. Christensen, Thomas F. Miller III
Abstract We address the degree to which machine learning can be used to accurately and transferably predict post-Hartree-Fock correlation energies. Refined strategies for feature design and selection are presented, and the molecular-orbital-based machine learning (MOB-ML) method is applied to several test systems. Strikingly, for the MP2, CCSD, and CCSD(T) levels of theory, it is shown that the thermally accessible (350 K) potential energy surface for a single water molecule can be described to within 1 millihartree using a model that is trained from only a single reference calculation at a randomized geometry. To explore the breadth of chemical diversity that can be described, MOB-ML is also applied to a new dataset of thermalized (350 K) geometries of 7211 organic molecules with up to seven heavy atoms. In comparison with the previously reported $\Delta$-ML method, MOB-ML is shown to reach chemical accuracy with three-fold fewer training geometries. Finally, a transferability test in which models trained for seven-heavy-atom systems are used to predict energies for thirteen-heavy-atom systems reveals that MOB-ML reaches chemical accuracy with 36-fold fewer training calculations than $\Delta$-ML (140 versus 5000 training calculations).
Tasks
Published 2019-01-10
URL http://arxiv.org/abs/1901.03309v3
PDF http://arxiv.org/pdf/1901.03309v3.pdf
PWC https://paperswithcode.com/paper/a-universal-density-matrix-functional-from
Repo
Framework
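The regression task described above, predicting post-Hartree-Fock correlation energies from molecular-orbital-based features, can be illustrated generically with Gaussian process regression on placeholder features; the feature construction, kernel, and data below are assumptions, not the MOB-ML recipe.

```python
# Generic regression sketch: learn correlation energies from placeholder
# orbital-based features with Gaussian process regression. The features,
# kernel, and synthetic targets are illustrative assumptions only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))              # placeholder MO-based features
y_train = X_train @ rng.normal(size=12) + 0.01 * rng.normal(size=200)
X_test = rng.normal(size=(50, 12))

kernel = Matern(length_scale=1.0, nu=2.5) + WhiteKernel(noise_level=1e-4)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)
pred, std = gpr.predict(X_test, return_std=True)  # predicted energies with uncertainty
print(pred[:3], std[:3])
```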

Learning Tractable Probabilistic Models in Open Worlds

Title Learning Tractable Probabilistic Models in Open Worlds
Authors Amelie Levray, Vaishak Belle
Abstract Large-scale probabilistic representations, including statistical knowledge bases and graphical models, are increasingly in demand. They are built by mining massive sources of structured and unstructured data, the latter often derived from natural language processing techniques. The very nature of the enterprise makes the extracted representations probabilistic. In particular, inducing relations and facts from noisy and incomplete sources via statistical machine learning models means that the labels are either already probabilistic, or that probabilities approximate confidence. While the progress is impressive, extracted representations essentially enforce the closed-world assumption (CWA), which means that all facts in the database are accorded the corresponding probability, but all other facts have probability zero. The CWA is deeply problematic in most machine learning contexts. A principled solution is needed for representing incomplete and indeterminate knowledge in such models, with imprecise probability models such as credal networks being an example. In this work, we are interested in the foundational problem of learning such open-world probabilistic models. However, since exact inference in probabilistic graphical models is intractable, the paradigm of tractable learning has emerged to learn data structures (such as arithmetic circuits) that support efficient probabilistic querying. We show here how the computational machinery underlying tractable learning has to be generalized for imprecise probabilities. Our empirical evaluations demonstrate that our regime is also effective.
Tasks
Published 2019-01-17
URL http://arxiv.org/abs/1901.05847v1
PDF http://arxiv.org/pdf/1901.05847v1.pdf
PWC https://paperswithcode.com/paper/learning-tractable-probabilistic-models-in
Repo
Framework