October 17, 2019


Paper Group ANR 692

Accounting for hidden common causes when inferring cause and effect from observational data. Learning Patient Representations from Text. Learning Neural Emotion Analysis from 100 Observations: The Surprising Effectiveness of Pre-Trained Word Representations. Determining the best classifier for predicting the value of a boolean field on a blood dono …

Accounting for hidden common causes when inferring cause and effect from observational data


Title	Accounting for hidden common causes when inferring cause and effect from observational data
Authors	David Heckerman
Abstract	Identifying causal relationships from observational data is difficult, in large part due to the presence of hidden common causes. In some cases, where just the right patterns of conditional independence and dependence lie in the data (for example, Y-structures), it is possible to identify cause and effect. In other cases, the analyst deliberately makes an uncertain assumption that hidden common causes are absent, and infers putative causal relationships to be tested in a randomized trial. Here, we consider a third approach, where there are sufficient clues in the data such that hidden common causes can be inferred.
Tasks
Published	2018-01-02
URL	http://arxiv.org/abs/1801.00727v2
PDF	http://arxiv.org/pdf/1801.00727v2.pdf
PWC	https://paperswithcode.com/paper/accounting-for-hidden-common-causes-when
Repo
Framework

Learning Patient Representations from Text


Title	Learning Patient Representations from Text
Authors	Dmitriy Dligach, Timothy Miller
Abstract	Mining electronic health records for patients who satisfy a set of predefined criteria is known in medical informatics as phenotyping. Phenotyping has numerous applications such as outcome prediction, clinical trial recruitment, and retrospective studies. Supervised machine learning for phenotyping typically relies on sparse patient representations such as bag-of-words. We consider an alternative that involves learning patient representations. We develop a neural network model for learning patient representations and show that the learned representations are general enough to obtain state-of-the-art performance on a standard comorbidity detection task.
Tasks
Published	2018-05-05
URL	http://arxiv.org/abs/1805.02096v1
PDF	http://arxiv.org/pdf/1805.02096v1.pdf
PWC	https://paperswithcode.com/paper/learning-patient-representations-from-text
Repo
Framework

Learning Neural Emotion Analysis from 100 Observations: The Surprising Effectiveness of Pre-Trained Word Representations


Title	Learning Neural Emotion Analysis from 100 Observations: The Surprising Effectiveness of Pre-Trained Word Representations
Authors	Sven Buechel, João Sedoc, H. Andrew Schwartz, Lyle Ungar
Abstract	Deep Learning has drastically reshaped virtually all areas of NLP. Yet on the downside, it is commonly thought to be dependent on vast amounts of training data. As such, these techniques appear ill-suited for areas where annotated data is limited, like emotion analysis, with its many nuanced and hard-to-acquire annotation formats, or other low-data scenarios encountered in under-resourced languages. In contrast to this popular notion, we provide empirical evidence from three typologically diverse languages that today’s favorite neural architectures can be trained on a few hundred observations only. Our results suggest that high-quality, pre-trained word embeddings are crucial for achieving high performance despite such strong data limitations.
Tasks	Emotion Recognition, Word Embeddings
Published	2018-10-25
URL	http://arxiv.org/abs/1810.10949v1
PDF	http://arxiv.org/pdf/1810.10949v1.pdf
PWC	https://paperswithcode.com/paper/learning-neural-emotion-analysis-from-100
Repo
Framework

Determining the best classifier for predicting the value of a boolean field on a blood donor database using genetic algorithms


Title	Determining the best classifier for predicting the value of a boolean field on a blood donor database using genetic algorithms
Authors	Ritabrata Maiti
Abstract	Motivation: Thanks to digitization, we often have access to large databases consisting of various fields of information, ranging from numbers to text and even boolean values. Such databases lend themselves especially well to machine learning, classification and big data analysis tasks. We are able to train classifiers on already existing data and use them to predict the value of a certain field, given information about the other fields. More specifically, in this study, we look at the Electronic Health Records (EHRs) that are compiled by hospitals. These EHRs are a convenient means of accessing the data of individual patients, but their processing as a whole still remains a challenge. However, EHRs that are composed of coherent, well-tabulated structures lend themselves quite well to the application of machine learning via classifiers. In this study, we look at a Blood Transfusion Service Center Data Set (data taken from the Blood Transfusion Service Center in Hsin-Chu City in Taiwan). We use the scikit-learn machine learning library in Python. From Support Vector Machines (SVM), we use Support Vector Classification (SVC); from the linear model module, we import Perceptron. We also use KNeighborsClassifier and the decision tree classifiers. Furthermore, we use the TPOT library to find an optimized pipeline using genetic algorithms. We score each of the above classifiers using k-fold cross-validation. Contact: ritabratamaiti@hiretrex.com GitHub Repository: https://github.com/ritabratamaiti/Blooddonorprediction
Tasks
Published	2018-02-21
URL	http://arxiv.org/abs/1802.07756v4
PDF	http://arxiv.org/pdf/1802.07756v4.pdf
PWC	https://paperswithcode.com/paper/determining-the-best-classifier-for
Repo
Framework
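
The k-fold scoring step described in the abstract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the author's pipeline: the data is made up, the only classifier is a hand-written nearest-neighbour vote, and the helper names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`) are hypothetical. In practice scikit-learn's `cross_val_score` would do the same job for any of the classifiers named in the abstract.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Majority vote among the n_neighbors closest training points."""
    by_distance = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = [train_y[i] for i in by_distance[:n_neighbors]]
    return max(set(votes), key=votes.count)

def cross_val_accuracy(X, y, k=5, n_neighbors=3):
    """Return one held-out accuracy per fold."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i] for i in fold
        )
        scores.append(correct / len(fold))
    return scores

# Toy, made-up data: two well-separated groups with boolean labels.
rng = random.Random(1)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
X += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]
y = [False] * 30 + [True] * 30

scores = cross_val_accuracy(X, y, k=5)
print(scores, sum(scores) / len(scores))
```

Each sample is held out exactly once, so the mean of the per-fold scores estimates accuracy on unseen data without a separate test set.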

Explainable Security


Title	Explainable Security
Authors	Luca Viganò, Daniele Magazzeni
Abstract	The Defense Advanced Research Projects Agency (DARPA) recently launched the Explainable Artificial Intelligence (XAI) program that aims to create a suite of new AI techniques that enable end users to understand, appropriately trust, and effectively manage the emerging generation of AI systems. In this paper, inspired by DARPA’s XAI program, we propose a new paradigm in security research: Explainable Security (XSec). We discuss the “Six Ws” of XSec (Who? What? Where? When? Why? and How?) and argue that XSec has unique and complex characteristics: XSec involves several different stakeholders (i.e., the system’s developers, analysts, users and attackers) and is multi-faceted by nature (as it requires reasoning about system model, threat model and properties of security, privacy and trust as well as about concrete attacks, vulnerabilities and countermeasures). We define a roadmap for XSec that identifies several possible research directions.
Tasks
Published	2018-07-11
URL	http://arxiv.org/abs/1807.04178v1
PDF	http://arxiv.org/pdf/1807.04178v1.pdf
PWC	https://paperswithcode.com/paper/explainable-security
Repo
Framework

Hierarchical Attention Networks for Knowledge Base Completion via Joint Adversarial Training


Title	Hierarchical Attention Networks for Knowledge Base Completion via Joint Adversarial Training
Authors	Chen Li, Xutan Peng, Shanghang Zhang, Jianxin Li, Lihong Wang
Abstract	Knowledge Base (KB) completion, which aims to determine missing relations between entities, has attracted increasing attention in recent years. Most existing methods either focus on the positional relationship between an entity pair and a single relation (1-hop path) in semantic space or concentrate on the joint probability of random walks on multi-hop paths among entities. However, they do not fully consider the intrinsic relationships of all the links among entities. By observing that the single relation and multi-hop paths between the same entity pair generally contain shared or similar semantic information, this paper proposes a novel method to capture the shared features between them as the basis for inferring missing relations. To capture the shared features jointly, we develop Hierarchical Attention Networks (HANs) to automatically encode the inputs into low-dimensional vectors, and exploit two components with partially shared parameters, one for feature source discrimination and the other for determining missing relations. By jointly training the entire model with Adversarial Training (AT), our method minimizes the classification error of missing relations while ensuring that the source of shared features is difficult to discriminate. The AT mechanism encourages our model to extract features that are both discriminative for missing relation prediction and shareable between single relation and multi-hop paths. We extensively evaluate our method on several large-scale KBs for relation completion. Experimental results show that our method consistently outperforms the baseline approaches. In addition, the hierarchical attention mechanism and the feature extractor in our model can be well interpreted and utilized in related downstream tasks.
Tasks	Knowledge Base Completion
Published	2018-10-14
URL	http://arxiv.org/abs/1810.06033v1
PDF	http://arxiv.org/pdf/1810.06033v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-attention-networks-for-knowledge
Repo
Framework

Learning Robotic Assembly from CAD


Title	Learning Robotic Assembly from CAD
Authors	Garrett Thomas, Melissa Chien, Aviv Tamar, Juan Aparicio Ojea, Pieter Abbeel
Abstract	In this work, motivated by recent manufacturing trends, we investigate autonomous robotic assembly. Industrial assembly tasks require contact-rich manipulation skills, which are challenging to acquire using classical control and motion planning approaches. Consequently, robot controllers for assembly domains are presently engineered to solve a particular task, and cannot easily handle variations in the product or environment. Reinforcement learning (RL) is a promising approach for autonomously acquiring robot skills that involve contact-rich dynamics. However, RL relies on random exploration for learning a control policy, which requires many robot executions, and often gets trapped in locally suboptimal solutions. Instead, we posit that prior knowledge, when available, can improve RL performance. We exploit the fact that in modern assembly domains, geometric information about the task is readily available via the CAD design files. We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data. We show that our approach effectively improves over traditional control approaches for tracking the motion plan, and can solve assembly tasks that require high precision, even without accurate state estimation. In addition, we propose a neural network architecture that can learn to track the motion plan, and generalize the assembly controller to changes in the object positions.
Tasks	Motion Planning
Published	2018-03-20
URL	http://arxiv.org/abs/1803.07635v2
PDF	http://arxiv.org/pdf/1803.07635v2.pdf
PWC	https://paperswithcode.com/paper/learning-robotic-assembly-from-cad
Repo
Framework

Determinantal Point Processes for Coresets


Title	Determinantal Point Processes for Coresets
Authors	Nicolas Tremblay, Simon Barthelmé, Pierre-Olivier Amblard
Abstract	When faced with a data set too large to be processed all at once, an obvious solution is to retain only part of it. In practice this takes a wide variety of different forms, and among them “coresets” are especially appealing. A coreset is a (small) weighted sample of the original data that comes with the following guarantee: a cost function can be evaluated on the smaller set instead of the larger one, with low relative error. For some classes of problems, and via a careful choice of sampling distribution (based on the so-called “sensitivity” metric), iid random sampling has turned out to be one of the most successful methods for building coresets efficiently. However, independent samples are sometimes overly redundant, and one could hope that enforcing diversity would lead to better performance. The difficulty lies in proving coreset properties for non-iid samples. We show that the coreset property holds for samples formed with determinantal point processes (DPPs). DPPs are interesting because they are a rare example of repulsive point processes with tractable theoretical properties, enabling us to prove general coreset theorems. We apply our results to both the k-means and the linear regression problems, and give extensive empirical evidence that the small additional computational cost of DPP sampling comes with superior performance over its iid counterpart. Of independent interest, we also provide analytical formulas for the sensitivity in the linear regression and 1-means cases.
Tasks	Point Processes
Published	2018-03-23
URL	https://arxiv.org/abs/1803.08700v3
PDF	https://arxiv.org/pdf/1803.08700v3.pdf
PWC	https://paperswithcode.com/paper/determinantal-point-processes-for-coresets
Repo
Framework

Semantically Enhanced Models for Commonsense Knowledge Acquisition


Title	Semantically Enhanced Models for Commonsense Knowledge Acquisition
Authors	Ikhlas Alhussien, Erik Cambria, Zhang NengSheng
Abstract	Commonsense knowledge is paramount to enable intelligent systems. Typically, it is characterized as being implicit and ambiguous, thereby hindering the automation of its acquisition. To address these challenges, this paper presents semantically enhanced models that enable reasoning by resolving part of the commonsense ambiguity. The proposed models enhance a knowledge graph embedding (KGE) framework for knowledge base completion. Experimental results show the effectiveness of the new semantic models in commonsense reasoning.
Tasks	Graph Embedding, Knowledge Base Completion, Knowledge Graph Embedding
Published	2018-09-12
URL	http://arxiv.org/abs/1809.04708v2
PDF	http://arxiv.org/pdf/1809.04708v2.pdf
PWC	https://paperswithcode.com/paper/semantically-enhanced-models-for-commonsense
Repo
Framework

DBSCAN++: Towards fast and scalable density clustering


Title	DBSCAN++: Towards fast and scalable density clustering
Authors	Jennifer Jang, Heinrich Jiang
Abstract	DBSCAN is a classical density-based clustering procedure with tremendous practical relevance. However, DBSCAN implicitly needs to compute the empirical density for each sample point, leading to a quadratic worst-case time complexity, which is too slow on large datasets. We propose DBSCAN++, a simple modification of DBSCAN which only requires computing the densities for a chosen subset of points. We show empirically that, compared to traditional DBSCAN, DBSCAN++ can provide not only competitive performance but also added robustness in the bandwidth hyperparameter while taking a fraction of the runtime. We also present statistical consistency guarantees showing the trade-off between computational cost and estimation rates. Surprisingly, up to a certain point, we can enjoy the same estimation rates while lowering computational cost, showing that DBSCAN++ is a sub-quadratic algorithm that attains minimax optimal rates for level-set estimation, a quality that may be of independent interest.
Tasks
Published	2018-10-31
URL	https://arxiv.org/abs/1810.13105v3
PDF	https://arxiv.org/pdf/1810.13105v3.pdf
PWC	https://paperswithcode.com/paper/dbscan-towards-fast-and-scalable-density
Repo
Framework
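
The idea of computing densities only for a chosen subset of points can be sketched as follows. This is a simplified reading of the abstract above, not the authors' implementation: it assumes the subset is drawn uniformly at random, that core points are linked when they lie within `eps` of each other, and that every remaining point joins its nearest core point if that point is within `eps`. The function name `dbscan_pp_sketch` and its parameters are hypothetical.

```python
import math
import random

def dbscan_pp_sketch(points, eps, min_pts, n_sampled, seed=0):
    """Return one cluster label per point (-1 means noise)."""
    rng = random.Random(seed)
    sampled = rng.sample(range(len(points)), n_sampled)

    # Density is estimated only for the sampled points, against the full data.
    core = [
        i for i in sampled
        if sum(math.dist(points[i], q) <= eps for q in points) >= min_pts
    ]

    # Link core points closer than eps and label the connected components.
    labels = [-1] * len(points)
    cluster = 0
    for start in core:
        if labels[start] != -1:
            continue
        labels[start] = cluster
        stack = [start]
        while stack:
            i = stack.pop()
            for j in core:
                if labels[j] == -1 and math.dist(points[i], points[j]) <= eps:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1

    # Every other point joins the cluster of its nearest core point if within eps.
    core_set = set(core)
    for i, p in enumerate(points):
        if i in core_set or not core:
            continue
        nearest = min(core, key=lambda j: math.dist(p, points[j]))
        if math.dist(p, points[nearest]) <= eps:
            labels[i] = labels[nearest]
    return labels

# Toy data: two 5x5 grids of points, far apart from each other.
blob_a = [(0.1 * x, 0.1 * y) for x in range(5) for y in range(5)]
blob_b = [(10 + 0.1 * x, 10 + 0.1 * y) for x in range(5) for y in range(5)]
labels = dbscan_pp_sketch(blob_a + blob_b, eps=0.6, min_pts=4, n_sampled=20)
print(labels)
```

Only `n_sampled` range queries are issued instead of one per point, which is where the runtime saving in this sketch comes from.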


Customer Sharing in Economic Networks with Costs


Title	Customer Sharing in Economic Networks with Costs
Authors	Bin Li, Dong Hao, Dengji Zhao, Tao Zhou
Abstract	In an economic market, sellers, infomediaries and customers constitute an economic network. Each seller has her own customer group and the seller’s private customers are unobservable to other sellers. Therefore, a seller can only sell commodities among her own customers unless other sellers or infomediaries share her sale information to their customer groups. However, a seller is not incentivized to share others’ sale information by default, which leads to inefficient resource allocation and limited revenue for the sale. To tackle this problem, we develop a novel mechanism called customer sharing mechanism (CSM) which incentivizes all sellers to share each other’s sale information to their private customer groups. Furthermore, CSM also incentivizes all customers to truthfully participate in the sale. In the end, CSM not only allocates the commodities efficiently but also optimizes the seller’s revenue.
Tasks
Published	2018-07-18
URL	http://arxiv.org/abs/1807.06822v1
PDF	http://arxiv.org/pdf/1807.06822v1.pdf
PWC	https://paperswithcode.com/paper/customer-sharing-in-economic-networks-with
Repo
Framework

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis


Title	Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Authors	Tal Ben-Nun, Torsten Hoefler
Abstract	Deep Neural Networks (DNNs) are becoming an important tool in modern computing applications. Accelerating their training is a major challenge and techniques range from distributed algorithms to low-level circuit design. In this survey, we describe the problem from a theoretical perspective, followed by approaches for its parallelization. We present trends in DNN architectures and the resulting implications on parallelization strategies. We then review and model the different types of concurrency in DNNs: from the single operator, through parallelism in network inference and training, to distributed deep learning. We discuss asynchronous stochastic optimization, distributed system architectures, communication schemes, and neural architecture search. Based on those approaches, we extrapolate potential directions for parallelism in deep learning.
Tasks	Neural Architecture Search, Stochastic Optimization
Published	2018-02-26
URL	http://arxiv.org/abs/1802.09941v2
PDF	http://arxiv.org/pdf/1802.09941v2.pdf
PWC	https://paperswithcode.com/paper/demystifying-parallel-and-distributed-deep
Repo
Framework

Understanding Generalization and Optimization Performance of Deep CNNs


Title	Understanding Generalization and Optimization Performance of Deep CNNs
Authors	Pan Zhou, Jiashi Feng
Abstract	This work aims to explain the remarkable success of deep convolutional neural networks (CNNs) by theoretically analyzing their generalization performance and establishing optimization guarantees for gradient descent based training algorithms. Specifically, for a CNN model consisting of $l$ convolutional layers and one fully connected layer, we prove that its generalization error is bounded by $\mathcal{O}(\sqrt{\theta\widetilde{\varrho}/n})$, where $\theta$ denotes the degrees of freedom of the network parameters and $\widetilde{\varrho}=\mathcal{O}(\log(\prod_{i=1}^{l} b_{i}(k_{i}-s_{i}+1)/p)+\log(b_{l+1}))$ encapsulates architecture parameters including the kernel size $k_{i}$, stride $s_{i}$, pooling size $p$ and parameter magnitude $b_{i}$. To the best of our knowledge, this is the first generalization bound that depends only on $\mathcal{O}(\log(\prod_{i=1}^{l+1} b_{i}))$, tighter than existing ones that all involve an exponential term like $\mathcal{O}(\prod_{i=1}^{l+1} b_{i})$. Besides, we prove that for an arbitrary gradient descent algorithm, the approximate stationary point computed by minimizing the empirical risk is also an approximate stationary point of the population risk. This explains why gradient descent training algorithms usually perform sufficiently well in practice. Furthermore, we prove the one-to-one correspondence and convergence guarantees for the non-degenerate stationary points between the empirical and population risks. It implies that the computed local minimum for the empirical risk is also close to a local minimum for the population risk, thus ensuring the good generalization performance of CNNs.
Tasks
Published	2018-05-28
URL	http://arxiv.org/abs/1805.10767v1
PDF	http://arxiv.org/pdf/1805.10767v1.pdf
PWC	https://paperswithcode.com/paper/understanding-generalization-and-optimization
Repo
Framework
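
To see why a logarithmic dependence on the parameter magnitudes is tighter than a product, write $b_i$ for the magnitude term of layer $i$ (a notation assumed here) and take the purely illustrative case $b_i = b > 1$ for every layer:

$$\log\Big(\prod_{i=1}^{l+1} b_i\Big) = \sum_{i=1}^{l+1} \log b_i = (l+1)\log b, \qquad \prod_{i=1}^{l+1} b_i = b^{\,l+1}.$$

Under that assumption the first quantity grows linearly with the depth $l$, while the second grows exponentially.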

Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion


Title	Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion
Authors	Hitoshi Manabe, Katsuhiko Hayashi, Masashi Shimbo
Abstract	Embedding-based methods for knowledge base completion (KBC) learn representations of entities and relations in a vector space, along with the scoring function to estimate the likelihood of relations between entities. The learnable class of scoring functions is designed to be expressive enough to cover a variety of real-world relations, but this expressiveness comes at the cost of an increased number of parameters. In particular, parameters in these methods are superfluous for relations that are either symmetric or antisymmetric. To mitigate this problem, we propose a new L1 regularizer for Complex Embeddings, which is one of the state-of-the-art embedding-based methods for KBC. This regularizer promotes symmetry or antisymmetry of the scoring function on a relation-by-relation basis, in accordance with the observed data. Our empirical evaluation shows that the proposed method outperforms the original Complex Embeddings and other baseline methods on the FB15k dataset.
Tasks	Knowledge Base Completion
Published	2018-08-25
URL	http://arxiv.org/abs/1808.08361v1
PDF	http://arxiv.org/pdf/1808.08361v1.pdf
PWC	https://paperswithcode.com/paper/data-dependent-learning-of
Repo
Framework
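
The link between relation parameters and symmetry can be checked numerically. The sketch below assumes the usual Complex Embeddings scoring function, the real part of a trilinear product with the object embedding conjugated; the embeddings are made-up numbers, and `mixed_component_penalty` is a hypothetical L1-style penalty chosen for illustration, not necessarily the regularizer proposed in the paper.

```python
def complex_score(relation, subject, obj):
    """Real part of sum_k relation[k] * subject[k] * conj(obj[k])."""
    total = sum(w * s * o.conjugate() for w, s, o in zip(relation, subject, obj))
    return total.real

def mixed_component_penalty(relation):
    """Hypothetical penalty: zero only when every component of the relation
    vector is purely real or purely imaginary."""
    return sum(abs(w.real) * abs(w.imag) for w in relation)

# Made-up two-dimensional complex embeddings for two entities.
alice = [1 + 2j, 0.5 - 1j]
bob = [-1 + 1j, 2 + 0.5j]

symmetric_rel = [2 + 0j, -1 + 0j]      # purely real relation vector
antisymmetric_rel = [0 + 3j, 0 - 1j]   # purely imaginary relation vector

# Swapping subject and object leaves a real relation's score unchanged
# and flips the sign of an imaginary relation's score.
print(complex_score(symmetric_rel, alice, bob),
      complex_score(symmetric_rel, bob, alice))
print(complex_score(antisymmetric_rel, alice, bob),
      complex_score(antisymmetric_rel, bob, alice))
```

A relation vector with both real and imaginary parts in the same component is neither symmetric nor antisymmetric, which is the case such a penalty would discourage.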

Domain Adversarial Training for Accented Speech Recognition


Title	Domain Adversarial Training for Accented Speech Recognition
Authors	Sining Sun, Ching-Feng Yeh, Mei-Yuh Hwang, Mari Ostendorf, Lei Xie
Abstract	In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem. In order to reduce the mismatch between labeled source domain data (“standard” accent) and unlabeled target domain data (with heavy accents), we augment the learning objective for a Kaldi TDNN network with a domain adversarial training (DAT) objective to encourage the model to learn accent-invariant features. In experiments with three Mandarin accents, we show that DAT yields up to 7.45% relative character error rate reduction when we do not have transcriptions of the accented speech, compared with the baseline trained on standard accent data only. We also find a benefit from DAT when used in combination with training from automatic transcriptions on the accented data. Furthermore, we find that DAT is superior to multi-task learning for accented speech recognition.
Tasks	Accented Speech Recognition, Multi-Task Learning, Speech Recognition
Published	2018-06-07
URL	http://arxiv.org/abs/1806.02786v1
PDF	http://arxiv.org/pdf/1806.02786v1.pdf
PWC	https://paperswithcode.com/paper/domain-adversarial-training-for-accented
Repo
Framework