October 17, 2019

2934 words 14 mins read

Paper Group ANR 692


Accounting for hidden common causes when inferring cause and effect from observational data

Title Accounting for hidden common causes when inferring cause and effect from observational data
Authors David Heckerman
Abstract Identifying causal relationships from observational data is difficult, in large part, due to the presence of hidden common causes. In some cases, where just the right patterns of conditional independence and dependence lie in the data—for example, Y-structures—it is possible to identify cause and effect. In other cases, the analyst deliberately makes an uncertain assumption that hidden common causes are absent, and infers putative causal relationships to be tested in a randomized trial. Here, we consider a third approach, where there are sufficient clues in the data such that hidden common causes can be inferred.
Tasks
Published 2018-01-02
URL http://arxiv.org/abs/1801.00727v2
PDF http://arxiv.org/pdf/1801.00727v2.pdf
PWC https://paperswithcode.com/paper/accounting-for-hidden-common-causes-when
Repo
Framework
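
The Y-structure argument in the abstract can be made concrete with a short simulation. The sketch below (illustrative only, not the paper's code; variable names and thresholds are made up) generates linear-Gaussian data from X1 → W ← X2, W → Y and checks the (in)dependence pattern that licenses reading W → Y causally: if a hidden common cause of W and Y were present, conditioning on the collider W would typically make X1 and Y dependent.

```python
# Illustrative sketch (not from the paper): checking the (in)dependence
# pattern of a Y-structure X1 -> W <- X2, W -> Y with partial correlations
# on simulated linear-Gaussian data.
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
w = 0.8 * x1 + 0.8 * x2 + rng.normal(size=n)
y = 0.9 * w + rng.normal(size=n)

def partial_corr(a, b, given=None):
    """Correlation of a and b after regressing out `given` (if any)."""
    if given is not None:
        Z = np.column_stack([np.ones_like(given), given])
        a = a - Z @ np.linalg.lstsq(Z, a, rcond=None)[0]
        b = b - Z @ np.linalg.lstsq(Z, b, rcond=None)[0]
    return np.corrcoef(a, b)[0, 1]

# Pattern suggesting W -> Y without a hidden common cause of W and Y:
print("X1 _||_ X2      :", abs(partial_corr(x1, x2)) < 0.02)    # independent
print("X1 not _||_ W   :", abs(partial_corr(x1, w)) > 0.1)      # dependent
print("X1 _||_ Y | W   :", abs(partial_corr(x1, y, w)) < 0.02)  # independent
print("X2 _||_ Y | W   :", abs(partial_corr(x2, y, w)) < 0.02)  # independent
```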

Learning Patient Representations from Text

Title Learning Patient Representations from Text
Authors Dmitriy Dligach, Timothy Miller
Abstract Mining electronic health records for patients who satisfy a set of predefined criteria is known in medical informatics as phenotyping. Phenotyping has numerous applications such as outcome prediction, clinical trial recruitment, and retrospective studies. Supervised machine learning for phenotyping typically relies on sparse patient representations such as bag-of-words. We consider an alternative that involves learning patient representations. We develop a neural network model for learning patient representations and show that the learned representations are general enough to obtain state-of-the-art performance on a standard comorbidity detection task.
Tasks
Published 2018-05-05
URL http://arxiv.org/abs/1805.02096v1
PDF http://arxiv.org/pdf/1805.02096v1.pdf
PWC https://paperswithcode.com/paper/learning-patient-representations-from-text
Repo
Framework
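
As a rough illustration of what "learning patient representations" can mean in code, the hedged sketch below encodes a patient as a bag of clinical-concept IDs, averages their embeddings into a dense vector, and attaches a supervised head; the learned vector can then be reused for downstream phenotyping tasks. The architecture, dimensions, and supervision signal here are simplifications, not the model from the paper.

```python
# Minimal sketch (assumptions: a patient is a bag of clinical-concept IDs;
# the encoder simply averages concept embeddings, which is a simplification
# of the paper's architecture). PyTorch.
import torch
import torch.nn as nn

class PatientEncoder(nn.Module):
    def __init__(self, vocab_size=10000, dim=128, n_labels=10):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim, mode="mean")  # bag of concepts -> dense vector
        self.head = nn.Linear(dim, n_labels)                      # supervision signal, e.g. billing codes

    def forward(self, concept_ids, offsets):
        patient_vec = self.emb(concept_ids, offsets)  # the learned patient representation
        return self.head(patient_vec), patient_vec

model = PatientEncoder()
# Two toy patients: concept IDs concatenated, offsets mark where each patient starts.
concepts = torch.tensor([3, 17, 42, 7, 99])
offsets = torch.tensor([0, 3])
logits, reps = model(concepts, offsets)
print(reps.shape)  # torch.Size([2, 128]) -- reusable for comorbidity detection and other tasks
```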

Learning Neural Emotion Analysis from 100 Observations: The Surprising Effectiveness of Pre-Trained Word Representations

Title Learning Neural Emotion Analysis from 100 Observations: The Surprising Effectiveness of Pre-Trained Word Representations
Authors Sven Buechel, João Sedoc, H. Andrew Schwartz, Lyle Ungar
Abstract Deep Learning has drastically reshaped virtually all areas of NLP. Yet on the downside, it is commonly thought to be dependent on vast amounts of training data. As such, these techniques appear ill-suited for areas where annotated data is limited, like emotion analysis, with its many nuanced and hard-to-acquire annotation formats, or other low-data scenarios encountered in under-resourced languages. In contrast to this popular notion, we provide empirical evidence from three typologically diverse languages that today’s favorite neural architectures can be trained on a few hundred observations only. Our results suggest that high-quality, pre-trained word embeddings are crucial for achieving high performance despite such strong data limitations.
Tasks Emotion Recognition, Word Embeddings
Published 2018-10-25
URL http://arxiv.org/abs/1810.10949v1
PDF http://arxiv.org/pdf/1810.10949v1.pdf
PWC https://paperswithcode.com/paper/learning-neural-emotion-analysis-from-100
Repo
Framework
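
The recipe the abstract points to can be sketched in a few lines: represent each text as the mean of pre-trained word vectors and fit a small regressor on the order of a hundred labeled examples. In the hedged sketch below the "pre-trained" vectors are a random stand-in (real experiments would load fastText or word2vec vectors), and the texts and valence scores are toy data.

```python
# Hedged sketch: average (pre-trained) word vectors per text, then fit a
# small regressor for emotion scores on ~100 examples. The `pretrained`
# dict is a random stand-in for real embeddings.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
dim = 50
vocab = "i am so happy sad angry calm today very not".split()
pretrained = {w: rng.normal(size=dim) for w in vocab}  # stand-in for fastText/word2vec vectors

def embed(text):
    vecs = [pretrained[w] for w in text.lower().split() if w in pretrained]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

texts = ["i am so happy today", "i am very sad", "so angry", "very calm today"] * 25
scores = [0.9, 0.1, 0.2, 0.7] * 25  # e.g. valence in [0, 1]

X = np.stack([embed(t) for t in texts])
y = np.array(scores)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=3000, random_state=0)
model.fit(X, y)  # tiny training set; the heavy lifting came from the embeddings
print(round(model.predict(embed("i am happy today").reshape(1, -1))[0], 2))
```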

Determining the best classifier for predicting the value of a boolean field on a blood donor database using genetic algorithms

Title Determining the best classifier for predicting the value of a boolean field on a blood donor database using genetic algorithms
Authors Ritabrata Maiti
Abstract Motivation: Thanks to digitization, we often have access to large databases consisting of various fields of information, ranging from numbers to texts and even boolean values. Such databases lend themselves especially well to machine learning, classification and big data analysis tasks. We are able to train classifiers using already existing data and use them to predict the values of a certain field, given that we have information regarding the other fields. More specifically, in this study, we look at the Electronic Health Records (EHRs) that are compiled by hospitals. These EHRs are a convenient means of accessing data of individual patients, but their processing as a whole still remains a challenge. However, EHRs that are composed of coherent, well-tabulated structures lend themselves quite well to the application of machine learning via classifiers. In this study, we look at the Blood Transfusion Service Center Data Set (data taken from the Blood Transfusion Service Center in Hsin-Chu City, Taiwan). We use the scikit-learn machine learning library in Python. From Support Vector Machines (SVM), we use Support Vector Classification (SVC); from the linear models, we use the Perceptron. We also use the KNeighborsClassifier and decision tree classifiers. Furthermore, we use the TPOT library to find an optimized pipeline using genetic algorithms. We score each of the above classifiers using k-fold cross-validation. Contact: ritabratamaiti@hiretrex.com GitHub Repository: https://github.com/ritabratamaiti/Blooddonorprediction
Tasks
Published 2018-02-21
URL http://arxiv.org/abs/1802.07756v4
PDF http://arxiv.org/pdf/1802.07756v4.pdf
PWC https://paperswithcode.com/paper/determining-the-best-classifier-for
Repo
Framework
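
A minimal sketch of the pipeline the abstract describes: compare SVC, Perceptron, KNeighborsClassifier, and a decision tree with k-fold cross-validation in scikit-learn, and optionally let TPOT search for a pipeline with genetic programming. Synthetic data stands in for the UCI blood transfusion dataset here; the feature names are assumptions.

```python
# Sketch of the comparison described in the abstract: several scikit-learn
# classifiers scored with 5-fold cross-validation, plus an optional TPOT
# genetic-programming pipeline search. Synthetic stand-in data is used in
# place of the Blood Transfusion Service Center dataset (4 numeric features,
# boolean "donated" target).
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import Perceptron
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(748, 4))                             # recency, frequency, monetary, time (assumed)
y = (X[:, 1] + rng.normal(size=748) > 0.5).astype(int)    # stand-in for the boolean donation field

classifiers = {
    "SVC": SVC(),
    "Perceptron": Perceptron(),
    "KNN": KNeighborsClassifier(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)             # k-fold cross-validation
    print(f"{name:12s} mean accuracy = {scores.mean():.3f}")

# Optional: genetic-algorithm pipeline search (requires the tpot package).
# from tpot import TPOTClassifier
# TPOTClassifier(generations=5, population_size=20).fit(X, y)
```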

Explainable Security

Title Explainable Security
Authors Luca Viganò, Daniele Magazzeni
Abstract The Defense Advanced Research Projects Agency (DARPA) recently launched the Explainable Artificial Intelligence (XAI) program that aims to create a suite of new AI techniques that enable end users to understand, appropriately trust, and effectively manage the emerging generation of AI systems. In this paper, inspired by DARPA’s XAI program, we propose a new paradigm in security research: Explainable Security (XSec). We discuss the “Six Ws” of XSec (Who? What? Where? When? Why? and How?) and argue that XSec has unique and complex characteristics: XSec involves several different stakeholders (i.e., the system’s developers, analysts, users and attackers) and is multi-faceted by nature (as it requires reasoning about system model, threat model and properties of security, privacy and trust as well as about concrete attacks, vulnerabilities and countermeasures). We define a roadmap for XSec that identifies several possible research directions.
Tasks
Published 2018-07-11
URL http://arxiv.org/abs/1807.04178v1
PDF http://arxiv.org/pdf/1807.04178v1.pdf
PWC https://paperswithcode.com/paper/explainable-security
Repo
Framework

Hierarchical Attention Networks for Knowledge Base Completion via Joint Adversarial Training

Title Hierarchical Attention Networks for Knowledge Base Completion via Joint Adversarial Training
Authors Chen Li, Xutan Peng, Shanghang Zhang, Jianxin Li, Lihong Wang
Abstract Knowledge Base (KB) completion, which aims to determine missing relations between entities, has raised increasing attention in recent years. Most existing methods either focus on the positional relationship between an entity pair and a single relation (1-hop path) in semantic space or concentrate on the joint probability of Random Walks on multi-hop paths among entities. However, they do not fully consider the intrinsic relationships of all the links among entities. Observing that the single relation and multi-hop paths between the same entity pair generally contain shared/similar semantic information, this paper proposes a novel method to capture the shared features between them as the basis for inferring missing relations. To capture the shared features jointly, we develop Hierarchical Attention Networks (HANs) to automatically encode the inputs into low-dimensional vectors, and exploit two partial parameter-shared components, one for feature source discrimination and the other for determining missing relations. By jointly training the entire model with Adversarial Training (AT), our method minimizes the classification error of missing relations and, at the same time, ensures that the source of the shared features is difficult to discriminate. The AT mechanism encourages our model to extract features that are both discriminative for missing relation prediction and shareable between single relations and multi-hop paths. We extensively evaluate our method on several large-scale KBs for relation completion. Experimental results show that our method consistently outperforms the baseline approaches. In addition, the hierarchical attention mechanism and the feature extractor in our model can be well interpreted and utilized in the related downstream tasks.
Tasks Knowledge Base Completion
Published 2018-10-14
URL http://arxiv.org/abs/1810.06033v1
PDF http://arxiv.org/pdf/1810.06033v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-attention-networks-for-knowledge
Repo
Framework
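
"Hierarchical attention" in this setting can be pictured as two stacked attention poolings: one over the hops inside each multi-hop path and one over the candidate paths, yielding a single shared feature vector per entity pair. The sketch below is a loose illustration of that idea with made-up dimensions, not the paper's HAN architecture.

```python
# Illustrative two-level ("hierarchical") attention pooling, loosely in the
# spirit of the abstract: attend over the relations inside each multi-hop
# path, then attend over the paths themselves, to obtain one shared feature
# vector per entity pair. Dimensions and details are made up for the sketch.
import torch
import torch.nn as nn

class HierarchicalAttentionPooling(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.step_query = nn.Linear(dim, 1)  # scores each hop inside a path
        self.path_query = nn.Linear(dim, 1)  # scores each pooled path

    def forward(self, paths):
        # paths: (n_paths, max_hops, dim) embeddings of the relations along each path
        step_w = torch.softmax(self.step_query(paths), dim=1)       # (n_paths, max_hops, 1)
        path_vecs = (step_w * paths).sum(dim=1)                     # (n_paths, dim)
        path_w = torch.softmax(self.path_query(path_vecs), dim=0)   # (n_paths, 1)
        return (path_w * path_vecs).sum(dim=0)                      # (dim,) shared features

pool = HierarchicalAttentionPooling()
paths = torch.randn(3, 4, 64)   # 3 candidate paths, up to 4 hops each
print(pool(paths).shape)        # torch.Size([64])
```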

Learning Robotic Assembly from CAD

Title Learning Robotic Assembly from CAD
Authors Garrett Thomas, Melissa Chien, Aviv Tamar, Juan Aparicio Ojea, Pieter Abbeel
Abstract In this work, motivated by recent manufacturing trends, we investigate autonomous robotic assembly. Industrial assembly tasks require contact-rich manipulation skills, which are challenging to acquire using classical control and motion planning approaches. Consequently, robot controllers for assembly domains are presently engineered to solve a particular task, and cannot easily handle variations in the product or environment. Reinforcement learning (RL) is a promising approach for autonomously acquiring robot skills that involve contact-rich dynamics. However, RL relies on random exploration for learning a control policy, which requires many robot executions, and often gets trapped in locally suboptimal solutions. Instead, we posit that prior knowledge, when available, can improve RL performance. We exploit the fact that in modern assembly domains, geometric information about the task is readily available via the CAD design files. We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data. We show that our approach effectively improves over traditional control approaches for tracking the motion plan, and can solve assembly tasks that require high precision, even without accurate state estimation. In addition, we propose a neural network architecture that can learn to track the motion plan, and generalize the assembly controller to changes in the object positions.
Tasks Motion Planning
Published 2018-03-20
URL http://arxiv.org/abs/1803.07635v2
PDF http://arxiv.org/pdf/1803.07635v2.pdf
PWC https://paperswithcode.com/paper/learning-robotic-assembly-from-cad
Repo
Framework
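
The core idea, stripped to a toy: a motion plan derived from CAD supplies waypoints, a simple tracking controller follows them, and the RL policy only has to learn a residual correction on top. In the hedged sketch below the residual is a zero placeholder and the dynamics are a trivial point mass; nothing here reflects the paper's actual controller or network.

```python
# Toy sketch of the core idea (not the paper's implementation): a motion plan
# extracted from CAD gives waypoints, a simple tracking controller follows
# them, and an RL policy would only need to learn a small residual correction.
# The residual is a zero placeholder here; all numbers are made up.
import numpy as np

waypoints = np.array([[0.0, 0.0], [0.5, 0.2], [1.0, 0.2], [1.0, 0.0]])  # CAD-derived plan

def tracking_action(state, target, gain=0.5):
    return gain * (target - state)      # nominal P-controller toward the next waypoint

def residual_policy(state):
    return np.zeros_like(state)         # placeholder for the learned RL correction

state = np.array([0.0, 0.05])
for target in waypoints:
    for _ in range(20):
        action = tracking_action(state, target) + residual_policy(state)
        state = state + action          # trivial point-mass dynamics
    print("reached", np.round(state, 3), "target", target)
```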

Determinantal Point Processes for Coresets

Title Determinantal Point Processes for Coresets
Authors Nicolas Tremblay, Simon Barthelmé, Pierre-Olivier Amblard
Abstract When faced with a data set too large to be processed all at once, an obvious solution is to retain only part of it. In practice this takes a wide variety of different forms, and among them “coresets” are especially appealing. A coreset is a (small) weighted sample of the original data that comes with the following guarantee: a cost function can be evaluated on the smaller set instead of the larger one, with low relative error. For some classes of problems, and via a careful choice of sampling distribution (based on the so-called “sensitivity” metric), iid random sampling has turned out to be one of the most successful methods for building coresets efficiently. However, independent samples are sometimes overly redundant, and one could hope that enforcing diversity would lead to better performance. The difficulty lies in proving coreset properties in non-iid samples. We show that the coreset property holds for samples formed with determinantal point processes (DPP). DPPs are interesting because they are a rare example of repulsive point processes with tractable theoretical properties, enabling us to prove general coreset theorems. We apply our results to both the k-means and the linear regression problems, and give extensive empirical evidence that the small additional computational cost of DPP sampling comes with superior performance over its iid counterpart. Of independent interest, we also provide analytical formulas for the sensitivity in the linear regression and 1-means cases.
Tasks Point Processes
Published 2018-03-23
URL https://arxiv.org/abs/1803.08700v3
PDF https://arxiv.org/pdf/1803.08700v3.pdf
PWC https://paperswithcode.com/paper/determinantal-point-processes-for-coresets
Repo
Framework
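
For contrast with the DPP sampler, the iid baseline the paper builds on looks roughly like this: sample points with probability proportional to a sensitivity bound and reweight them by 1/(m·q_i) so the coreset cost estimate is unbiased. The sketch below uses the textbook 1-means sensitivity bound, which is not necessarily the exact formula derived in the paper, and only evaluates the cost at the full-data mean.

```python
# Hedged sketch of sensitivity-based iid coreset construction for 1-means.
# The paper's contribution would replace the iid draw with a repulsive DPP
# sample; the sensitivity formula below is the standard 1-means bound, not
# necessarily the one derived in the paper.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2)) + np.array([2.0, -1.0])
n, m = len(X), 200

mu = X.mean(axis=0)
dists = ((X - mu) ** 2).sum(axis=1)
sens = 1.0 / n + dists / dists.sum()       # sensitivity bound for 1-means
q = sens / sens.sum()                      # sampling distribution

idx = rng.choice(n, size=m, replace=True, p=q)
weights = 1.0 / (m * q[idx])               # importance weights make the cost estimate unbiased

def one_means_cost(points, center, w=None):
    w = np.ones(len(points)) if w is None else w
    return (w * ((points - center) ** 2).sum(axis=1)).sum()

full = one_means_cost(X, mu)
core = one_means_cost(X[idx], mu, weights)
print(f"relative error at the optimum: {abs(core - full) / full:.3%}")
```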

Semantically Enhanced Models for Commonsense Knowledge Acquisition

Title Semantically Enhanced Models for Commonsense Knowledge Acquisition
Authors Ikhlas Alhussien, Erik Cambria, Zhang NengSheng
Abstract Commonsense knowledge is paramount to enable intelligent systems. Typically, it is characterized as being implicit and ambiguous, thereby hindering the automation of its acquisition. To address these challenges, this paper presents semantically enhanced models to enable reasoning through resolving part of commonsense ambiguity. The proposed models are integrated into a knowledge graph embedding (KGE) framework for knowledge base completion. Experimental results show the effectiveness of the new semantic models in commonsense reasoning.
Tasks Graph Embedding, Knowledge Base Completion, Knowledge Graph Embedding
Published 2018-09-12
URL http://arxiv.org/abs/1809.04708v2
PDF http://arxiv.org/pdf/1809.04708v2.pdf
PWC https://paperswithcode.com/paper/semantically-enhanced-models-for-commonsense
Repo
Framework

DBSCAN++: Towards fast and scalable density clustering

Title DBSCAN++: Towards fast and scalable density clustering
Authors Jennifer Jang, Heinrich Jiang
Abstract DBSCAN is a classical density-based clustering procedure with tremendous practical relevance. However, DBSCAN implicitly needs to compute the empirical density for each sample point, leading to a quadratic worst-case time complexity, which is too slow on large datasets. We propose DBSCAN++, a simple modification of DBSCAN which only requires computing the densities for a chosen subset of points. We show empirically that, compared to traditional DBSCAN, DBSCAN++ can provide not only competitive performance but also added robustness in the bandwidth hyperparameter while taking a fraction of the runtime. We also present statistical consistency guarantees showing the trade-off between computational cost and estimation rates. Surprisingly, up to a certain point, we can enjoy the same estimation rates while lowering computational cost, showing that DBSCAN++ is a sub-quadratic algorithm that attains minimax optimal rates for level-set estimation, a quality that may be of independent interest.
Tasks
Published 2018-10-31
URL https://arxiv.org/abs/1810.13105v3
PDF https://arxiv.org/pdf/1810.13105v3.pdf
PWC https://paperswithcode.com/paper/dbscan-towards-fast-and-scalable-density
Repo
Framework
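
The abstract's description of DBSCAN++ translates into a short procedure: compute eps-neighborhood counts only for a sampled subset, keep the dense subset points as core points, connect core points within eps, and attach the remaining points to a nearby core point. The hedged sketch below follows that outline; subset selection and edge-case handling may differ from the paper.

```python
# Hedged sketch of the DBSCAN++ idea as described in the abstract: densities
# (eps-neighborhood counts) are computed only for a uniformly chosen subset
# of points, the dense ones become core points, core points within eps of
# each other are merged, and everything else attaches to a nearby core point.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def dbscan_pp(X, eps=0.3, min_pts=10, subset_frac=0.2, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    subset = rng.choice(n, size=max(1, int(subset_frac * n)), replace=False)

    nn = NearestNeighbors(radius=eps).fit(X)
    counts = np.array([len(i) for i in nn.radius_neighbors(X[subset], return_distance=False)])
    core = X[subset[counts >= min_pts]]          # densities computed only on the subset

    # Connect core points within eps of each other via a simple BFS.
    core_nn = NearestNeighbors(radius=eps).fit(core)
    adj = core_nn.radius_neighbors(core, return_distance=False)
    labels_core = -np.ones(len(core), dtype=int)
    c = 0
    for i in range(len(core)):
        if labels_core[i] >= 0:
            continue
        stack = [i]
        while stack:
            j = stack.pop()
            if labels_core[j] >= 0:
                continue
            labels_core[j] = c
            stack.extend(adj[j])
        c += 1

    # Assign every point to the cluster of its nearest core point (noise if too far).
    dist, nearest = NearestNeighbors(n_neighbors=1).fit(core).kneighbors(X)
    return np.where(dist[:, 0] <= eps, labels_core[nearest[:, 0]], -1)

X = np.concatenate([np.random.default_rng(1).normal(c, 0.1, size=(300, 2)) for c in (0.0, 1.0)])
print(np.unique(dbscan_pp(X), return_counts=True))
```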

Customer Sharing in Economic Networks with Costs

Title Customer Sharing in Economic Networks with Costs
Authors Bin Li, Dong Hao, Dengji Zhao, Tao Zhou
Abstract In an economic market, sellers, infomediaries and customers constitute an economic network. Each seller has her own customer group and the seller’s private customers are unobservable to other sellers. Therefore, a seller can only sell commodities among her own customers unless other sellers or infomediaries share her sale information with their customer groups. However, a seller is not incentivized to share others’ sale information by default, which leads to inefficient resource allocation and limited revenue for the sale. To tackle this problem, we develop a novel mechanism called the customer sharing mechanism (CSM) which incentivizes all sellers to share each other’s sale information with their private customer groups. Furthermore, CSM also incentivizes all customers to truthfully participate in the sale. In the end, CSM not only allocates the commodities efficiently but also optimizes the seller’s revenue.
Tasks
Published 2018-07-18
URL http://arxiv.org/abs/1807.06822v1
PDF http://arxiv.org/pdf/1807.06822v1.pdf
PWC https://paperswithcode.com/paper/customer-sharing-in-economic-networks-with
Repo
Framework

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

Title Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Authors Tal Ben-Nun, Torsten Hoefler
Abstract Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning.
Tasks Neural Architecture Search, Stochastic Optimization
Published 2018-02-26
URL http://arxiv.org/abs/1802.09941v2
PDF http://arxiv.org/pdf/1802.09941v2.pdf
PWC https://paperswithcode.com/paper/demystifying-parallel-and-distributed-deep
Repo
Framework
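
One of the concurrency patterns the survey analyzes, synchronous data parallelism, fits in a few lines: each worker computes gradients on its own shard of the mini-batch, the gradients are averaged (the job an all-reduce does in a real system), and every replica applies the same update. The NumPy sketch below simulates this in a single process for a linear regression; it is an illustration, not a distributed implementation.

```python
# Minimal single-process simulation of synchronous data-parallel SGD:
# per-worker gradients on disjoint shards, averaged ("all-reduce"), then one
# shared parameter update. Plain NumPy, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.01 * rng.normal(size=512)

w = np.zeros(10)
n_workers, lr = 4, 0.1
for step in range(200):
    shards = np.array_split(rng.permutation(512), n_workers)
    grads = []
    for shard in shards:                      # each worker's local backward pass
        Xb, yb = X[shard], y[shard]
        grads.append(2 * Xb.T @ (Xb @ w - yb) / len(shard))
    w -= lr * np.mean(grads, axis=0)          # "all-reduce": average gradients, then update
print("parameter error:", np.linalg.norm(w - true_w))
```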

Understanding Generalization and Optimization Performance of Deep CNNs

Title Understanding Generalization and Optimization Performance of Deep CNNs
Authors Pan Zhou, Jiashi Feng
Abstract This work aims to provide understanding of the remarkable success of deep convolutional neural networks (CNNs) by theoretically analyzing their generalization performance and establishing optimization guarantees for gradient descent based training algorithms. Specifically, for a CNN model consisting of $l$ convolutional layers and one fully connected layer, we prove that its generalization error is bounded by $\mathcal{O}(\sqrt{\theta\widetilde{\varrho}/n})$, where $\theta$ denotes the degrees of freedom of the network parameters and $\widetilde{\varrho}=\mathcal{O}(\log(\prod_{i=1}^{l}r_i(k_i-s_i+1)/p)+\log(r_{l+1}))$ encapsulates architecture parameters including the kernel size $k_i$, stride $s_i$, pooling size $p$ and parameter magnitude $r_i$ of layer $i$ (with $r_{l+1}$ the magnitude of the fully connected layer). To the best of our knowledge, this is the first generalization bound that only depends on $\mathcal{O}(\log(\prod_{i=1}^{l+1}r_i))$, tighter than existing ones that all involve an exponential term like $\mathcal{O}(\prod_{i=1}^{l+1}r_i)$. Besides, we prove that for an arbitrary gradient descent algorithm, the approximate stationary point computed by minimizing the empirical risk is also an approximate stationary point of the population risk. This explains why gradient descent training algorithms usually perform sufficiently well in practice. Furthermore, we prove one-to-one correspondence and convergence guarantees for the non-degenerate stationary points between the empirical and population risks. This implies that a computed local minimum of the empirical risk is also close to a local minimum of the population risk, thus ensuring the good generalization performance of CNNs.
Tasks
Published 2018-05-28
URL http://arxiv.org/abs/1805.10767v1
PDF http://arxiv.org/pdf/1805.10767v1.pdf
PWC https://paperswithcode.com/paper/understanding-generalization-and-optimization
Repo
Framework

Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion

Title Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion
Authors Hitoshi Manabe, Katsuhiko Hayashi, Masashi Shimbo
Abstract Embedding-based methods for knowledge base completion (KBC) learn representations of entities and relations in a vector space, along with the scoring function to estimate the likelihood of relations between entities. The learnable class of scoring functions is designed to be expressive enough to cover a variety of real-world relations, but this expressiveness comes at the cost of an increased number of parameters. In particular, parameters in these methods are superfluous for relations that are either symmetric or antisymmetric. To mitigate this problem, we propose a new L1 regularizer for Complex Embeddings, which is one of the state-of-the-art embedding-based methods for KBC. This regularizer promotes symmetry or antisymmetry of the scoring function on a relation-by-relation basis, in accordance with the observed data. Our empirical evaluation shows that the proposed method outperforms the original Complex Embeddings and other baseline methods on the FB15k dataset.
Tasks Knowledge Base Completion
Published 2018-08-25
URL http://arxiv.org/abs/1808.08361v1
PDF http://arxiv.org/pdf/1808.08361v1.pdf
PWC https://paperswithcode.com/paper/data-dependent-learning-of
Repo
Framework
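
The intuition behind the regularizer can be checked numerically: in ComplEx, a relation embedding with zero imaginary part yields a symmetric score in (subject, object), and one with zero real part yields an antisymmetric score, so an L1 penalty on those parts nudges each relation toward (anti)symmetry as the data dictates. The sketch below verifies the symmetry facts and shows a toy penalty; the paper's exact regularizer may differ from this illustration.

```python
# Hedged sketch of the idea behind the abstract: the ComplEx score is
# symmetric in (subject, object) when the relation embedding is purely real
# and antisymmetric when it is purely imaginary, so an L1 penalty on (parts
# of) each relation vector can encourage symmetry or antisymmetry.
import numpy as np

def complex_score(e_s, w_r, e_o):
    """ComplEx scoring function Re(<e_s, w_r, conj(e_o)>)."""
    return np.real(np.sum(e_s * w_r * np.conj(e_o)))

rng = np.random.default_rng(0)
e_s = rng.normal(size=4) + 1j * rng.normal(size=4)
e_o = rng.normal(size=4) + 1j * rng.normal(size=4)

w_sym = rng.normal(size=4) + 0j             # purely real relation -> symmetric score
print(np.isclose(complex_score(e_s, w_sym, e_o), complex_score(e_o, w_sym, e_s)))    # True

w_anti = 1j * rng.normal(size=4)            # purely imaginary relation -> antisymmetric score
print(np.isclose(complex_score(e_s, w_anti, e_o), -complex_score(e_o, w_anti, e_s)))  # True

def l1_symmetry_penalty(w_r, lam=0.1):
    # Penalizing the imaginary part drives the relation toward symmetry;
    # penalizing the real part instead would drive it toward antisymmetry.
    return lam * np.sum(np.abs(np.imag(w_r)))
```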

Domain Adversarial Training for Accented Speech Recognition

Title Domain Adversarial Training for Accented Speech Recognition
Authors Sining Sun, Ching-Feng Yeh, Mei-Yuh Hwang, Mari Ostendorf, Lei Xie
Abstract In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem. In order to reduce the mismatch between labeled source domain data (“standard” accent) and unlabeled target domain data (with heavy accents), we augment the learning objective for a Kaldi TDNN network with a domain adversarial training (DAT) objective to encourage the model to learn accent-invariant features. In experiments with three Mandarin accents, we show that DAT yields up to 7.45% relative character error rate reduction when we do not have transcriptions of the accented speech, compared with the baseline trained on standard accent data only. We also find a benefit from DAT when used in combination with training from automatic transcriptions on the accented data. Furthermore, we find that DAT is superior to multi-task learning for accented speech recognition.
Tasks Accented Speech Recognition, Multi-Task Learning, Speech Recognition
Published 2018-06-07
URL http://arxiv.org/abs/1806.02786v1
PDF http://arxiv.org/pdf/1806.02786v1.pdf
PWC https://paperswithcode.com/paper/domain-adversarial-training-for-accented
Repo
Framework
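
The domain-adversarial ingredient is usually implemented with a gradient reversal layer feeding a domain classifier. The PyTorch sketch below shows that objective on toy tensors; the paper itself works inside a Kaldi TDNN recipe, so this is only an illustration of the loss structure, with made-up dimensions for the encoder, senone set, and batch.

```python
# Minimal sketch of domain adversarial training (DAT) with a gradient
# reversal layer, assuming a PyTorch re-implementation; dimensions and data
# are toys, not the paper's Kaldi TDNN setup.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None   # flip the gradient seen by the encoder

encoder = nn.Sequential(nn.Linear(40, 256), nn.ReLU())   # stands in for the TDNN layers
senone_head = nn.Linear(256, 500)                        # senone (ASR target) classifier
domain_head = nn.Linear(256, 2)                          # standard vs. accented speech

feats = torch.randn(8, 40)                               # toy batch of acoustic features
senones = torch.randint(0, 500, (8,))
domains = torch.randint(0, 2, (8,))

h = encoder(feats)
loss_asr = nn.functional.cross_entropy(senone_head(h), senones)
loss_dom = nn.functional.cross_entropy(domain_head(GradReverse.apply(h, 0.1)), domains)
(loss_asr + loss_dom).backward()  # pushes the encoder toward accent-invariant features
```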