Paper Group ANR 1062
Cautious Deep Learning
Title | Cautious Deep Learning |
Authors | Yotam Hechtlinger, Barnabás Póczos, Larry Wasserman |
Abstract | Most classifiers operate by selecting the maximum of an estimate of the conditional distribution $p(y \mid x)$, where $x$ stands for the features of the instance to be classified and $y$ denotes its label. This often results in a {\em hubristic bias}: overconfidence in the assignment of a definite label. Usually, the observations are concentrated on a small volume but the classifier provides definite predictions for the entire space. We propose constructing conformal prediction sets which contain a set of labels rather than a single label. These conformal prediction sets contain the true label with probability $1-\alpha$. Our construction is based on $p(x \mid y)$ rather than $p(y \mid x)$, which results in a classifier that is very cautious: it outputs the null set — meaning “I don’t know” — when the object does not resemble the training examples. An important property of our approach is that adversarial attacks are likely to be predicted as the null set or to include the true label as well. We demonstrate the performance on the ImageNet ILSVRC dataset and the CelebA and IMDB-Wiki facial datasets using high-dimensional features obtained from state-of-the-art convolutional neural networks. |
Tasks | |
Published | 2018-05-24 |
URL | http://arxiv.org/abs/1805.09460v2 |
http://arxiv.org/pdf/1805.09460v2.pdf | |
PWC | https://paperswithcode.com/paper/cautious-deep-learning |
Repo | |
Framework | |
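A minimal sketch of the general recipe, not the authors' exact conformity score: estimate a class-conditional density $\hat p(x \mid y)$ on deep features and include label $y$ in the prediction set only if the test point's density under class $y$ clears the empirical $\alpha$-quantile of that class's training densities. The use of `KernelDensity` and the `bandwidth` value are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def fit_conformal_sets(features, labels, alpha=0.05, bandwidth=1.0):
    """Per class: fit a density on training features and store the
    empirical alpha-quantile of the training log-densities."""
    models, thresholds = {}, {}
    for y in np.unique(labels):
        Xy = features[labels == y]
        kde = KernelDensity(bandwidth=bandwidth).fit(Xy)
        scores = kde.score_samples(Xy)        # log p_hat(x | y) on class y
        models[y] = kde
        thresholds[y] = np.quantile(scores, alpha)
    return models, thresholds

def predict_set(x, models, thresholds):
    """Return every label whose class-conditional density of x clears the
    class threshold; an empty set reads as "I don't know"."""
    x = np.atleast_2d(x)
    return {y for y, kde in models.items()
            if kde.score_samples(x)[0] >= thresholds[y]}
```

An input far from every class then falls below all thresholds and receives the empty set, which is the cautious behaviour the abstract emphasises.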
A Variational Dirichlet Framework for Out-of-Distribution Detection
Title | A Variational Dirichlet Framework for Out-of-Distribution Detection |
Authors | Wenhu Chen, Yilin Shen, Hongxia Jin, William Wang |
Abstract | With the recent rapid development of deep learning, deep neural networks have been widely adopted in many real-life applications. However, deep neural networks are also known to have very little control over their uncertainty for unseen examples, which can cause harmful consequences in practical scenarios. In this paper, we are particularly interested in designing a higher-order uncertainty metric for deep neural networks and investigate its effectiveness under the out-of-distribution detection task proposed by~\cite{hendrycks2016baseline}. Our method first assumes there exists an underlying higher-order distribution $\mathbb{P}(z)$, which controls the label-wise categorical distribution $\mathbb{P}(y)$ over classes on the $K$-dimensional simplex, then approximates this higher-order distribution via a parameterized posterior $p_{\theta}(z \mid x)$ within a variational inference framework, and finally uses the entropy of the learned posterior distribution $p_{\theta}(z \mid x)$ as the uncertainty measure to detect out-of-distribution examples. Further, we propose an auxiliary objective function that discriminates against synthesized adversarial examples to further increase the robustness of the proposed uncertainty measure. Through comprehensive experiments on various datasets, our proposed framework is demonstrated to consistently outperform competing algorithms. |
Tasks | Out-of-Distribution Detection |
Published | 2018-11-18 |
URL | http://arxiv.org/abs/1811.07308v4 |
http://arxiv.org/pdf/1811.07308v4.pdf | |
PWC | https://paperswithcode.com/paper/a-variational-dirichlet-framework-for-out-of |
Repo | |
Framework | |
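The uncertainty score here is the entropy of a learned Dirichlet posterior over the probability simplex. The sketch below computes only that quantity; how the network maps an input $x$ to the concentration parameters $\alpha$ is the paper's contribution and is not shown.

```python
import numpy as np
from scipy.special import digamma, gammaln

def dirichlet_entropy(alpha):
    """Entropy of Dir(alpha); higher entropy = more uncertain input."""
    alpha = np.asarray(alpha, dtype=float)
    a0, K = alpha.sum(), alpha.size
    log_B = gammaln(alpha).sum() - gammaln(a0)
    return log_B + (a0 - K) * digamma(a0) - ((alpha - 1.0) * digamma(alpha)).sum()

# Toy usage: a confident (peaked) posterior vs. a nearly flat one.
print(dirichlet_entropy([50.0, 1.0, 1.0]))   # peaked  -> low entropy
print(dirichlet_entropy([1.1, 1.1, 1.1]))    # flat    -> higher entropy, OOD-like
```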
Improving the Expressiveness of Deep Learning Frameworks with Recursion
Title | Improving the Expressiveness of Deep Learning Frameworks with Recursion |
Authors | Eunji Jeong, Joo Seong Jeong, Soojeong Kim, Gyeong-In Yu, Byung-Gon Chun |
Abstract | Recursive neural networks have widely been used by researchers to handle applications with recursively or hierarchically structured data. However, embedded control flow deep learning frameworks such as TensorFlow, Theano, Caffe2, and MXNet fail to efficiently represent and execute such neural networks, due to lack of support for recursion. In this paper, we add recursion to the programming model of existing frameworks by complementing their design with recursive execution of dataflow graphs as well as additional APIs for recursive definitions. Unlike iterative implementations, which can only understand the topological index of each node in recursive data structures, our recursive implementation is able to exploit the recursive relationships between nodes for efficient execution based on parallel computation. We present an implementation on TensorFlow and evaluation results with various recursive neural network models, showing that our recursive implementation not only conveys the recursive nature of recursive neural networks better than other implementations, but also uses given resources more effectively to reduce training and inference time. |
Tasks | |
Published | 2018-09-04 |
URL | http://arxiv.org/abs/1809.00832v1 |
http://arxiv.org/pdf/1809.00832v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-the-expressiveness-of-deep-learning |
Repo | |
Framework | |
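The contribution here is framework-level (recursive execution of dataflow graphs plus APIs for recursive definitions), so it cannot be reproduced in a few lines. For context, the sketch below shows the kind of recursively defined model such support targets, written as plain eager-mode PyTorch recursion; it is an illustrative stand-in, not the proposed TensorFlow API.

```python
import torch
import torch.nn as nn

class TreeRNN(nn.Module):
    """Recursively composes child representations into a parent vector."""
    def __init__(self, dim):
        super().__init__()
        self.leaf = nn.Linear(dim, dim)
        self.compose = nn.Linear(2 * dim, dim)

    def forward(self, node):
        # A node is either a leaf feature tensor or a (left, right) pair.
        if isinstance(node, torch.Tensor):
            return torch.tanh(self.leaf(node))
        left, right = node
        children = torch.cat([self.forward(left), self.forward(right)], dim=-1)
        return torch.tanh(self.compose(children))

model = TreeRNN(dim=4)
leaf = lambda: torch.randn(4)
print(model(((leaf(), leaf()), leaf())).shape)   # torch.Size([4])
```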
Deep Gaussian Processes with Convolutional Kernels
Title | Deep Gaussian Processes with Convolutional Kernels |
Authors | Vinayak Kumar, Vaibhav Singh, P. K. Srijith, Andreas Damianou |
Abstract | Deep Gaussian processes (DGPs) provide a Bayesian non-parametric alternative to standard parametric deep learning models. A DGP is formed by stacking multiple GPs resulting in a well-regularized composition of functions. The Bayesian framework equips the model with attractive properties, such as implicit capacity control and predictive uncertainty, but at the same time makes it challenging to combine with a convolutional structure. This has hindered the application of DGPs in computer vision tasks, an area where deep parametric models (i.e. CNNs) have made breakthroughs. Standard kernels used in DGPs such as radial basis functions (RBFs) are insufficient for handling pixel variability in raw images. In this paper, we build on the recent convolutional GP to develop Convolutional DGP (CDGP) models which effectively capture image-level features through the use of convolution kernels, therefore opening up the way for applying DGPs to computer vision tasks. Our model learns local spatial influence and outperforms strong GP-based baselines on multi-class image classification. We also consider various constructions of the convolution kernel over image patches, analyze the computational trade-offs and provide an efficient framework for convolutional DGP models. The experimental results on image data such as MNIST, rectangles-image, CIFAR10 and Caltech101 demonstrate the effectiveness of the proposed approaches. |
Tasks | Gaussian Processes, Image Classification |
Published | 2018-06-05 |
URL | http://arxiv.org/abs/1806.01655v1 |
http://arxiv.org/pdf/1806.01655v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-gaussian-processes-with-convolutional |
Repo | |
Framework | |
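The convolutional GP kernel this work builds on averages a base kernel over all pairs of image patches, so similarity between two images is the mean similarity of their local patches. A minimal numpy sketch; the RBF base kernel, patch size, and lengthscale are illustrative choices.

```python
import numpy as np

def patches(img, size=3):
    """All size x size patches of a 2-D image, flattened to vectors."""
    H, W = img.shape
    return np.array([img[i:i + size, j:j + size].ravel()
                     for i in range(H - size + 1)
                     for j in range(W - size + 1)])

def conv_kernel(x1, x2, size=3, lengthscale=1.0):
    """k(x1, x2) = average RBF kernel over all pairs of patches."""
    P1, P2 = patches(x1, size), patches(x2, size)
    sq = ((P1[:, None, :] - P2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale ** 2).mean()

a, b = np.random.rand(8, 8), np.random.rand(8, 8)
print(conv_kernel(a, a), conv_kernel(a, b))   # self-similarity vs. cross
```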
How game complexity affects the playing behavior of synthetic agents
Title | How game complexity affects the playing behavior of synthetic agents |
Authors | Chairi Kiourt, Dimitris Kalles, Panagiotis Kanellopoulos |
Abstract | Agent-based simulation of social organizations, via the investigation of agents’ training and learning tactics and strategies, has been inspired by the ability of humans to learn from social environments which are rich in agents, interactions and partial or hidden information. Such richness is a source of complexity that an effective learner has to be able to navigate. This paper focuses on investigating the impact of environmental complexity on the game playing-and-learning behavior of synthetic agents. We demonstrate our approach using two independent turn-based zero-sum games as the basis of forming social events which are characterized by both competition and cooperation. The paper’s key highlight is that as the complexity of a social environment changes, an effective player has to adapt its learning and playing profile to maintain a given performance profile. |
Tasks | |
Published | 2018-07-07 |
URL | http://arxiv.org/abs/1807.02648v1 |
http://arxiv.org/pdf/1807.02648v1.pdf | |
PWC | https://paperswithcode.com/paper/how-game-complexity-affects-the-playing |
Repo | |
Framework | |
ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos
Title | ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos |
Authors | Ruichi Yu, Hongcheng Wang, Larry S. Davis |
Abstract | This paper addresses the problem of detecting relevant motion caused by objects of interest (e.g., persons and vehicles) in large-scale home surveillance videos. The traditional method usually consists of two separate steps, i.e., detecting moving objects with background subtraction running on the camera, and filtering out nuisance motion events (e.g., trees, cloud, shadow, rain/snow, flag) with deep-learning-based object detection and tracking running in the cloud. The method is extremely slow and therefore not cost-effective, and does not fully leverage the spatial-temporal redundancies with a pre-trained off-the-shelf object detector. To dramatically speed up relevant motion event detection and improve its performance, we propose a novel network for relevant motion event detection, ReMotENet, which is a unified, end-to-end data-driven method using spatial-temporal attention-based 3D ConvNets to jointly model the appearance and motion of objects-of-interest in a video. ReMotENet parses an entire video clip in one forward pass of a neural network to achieve significant speedup. Meanwhile, it exploits the properties of home surveillance videos, e.g., relevant motion is sparse both spatially and temporally, and enhances 3D ConvNets with a spatial-temporal attention model and reference-frame subtraction to encourage the network to focus on the relevant moving objects. Experiments demonstrate that our method can achieve comparable or even better performance than the object-detection-based method but with three to four orders of magnitude speedup (up to 20k times) on GPU devices. Our network is efficient, compact and lightweight. It can detect relevant motion on a 15s surveillance video clip within 4-8 milliseconds on a GPU and a fraction of a second (0.17-0.39 s) on a CPU with a model size of less than 1MB. |
Tasks | Object Detection |
Published | 2018-01-06 |
URL | http://arxiv.org/abs/1801.02031v1 |
http://arxiv.org/pdf/1801.02031v1.pdf | |
PWC | https://paperswithcode.com/paper/remotenet-efficient-relevant-motion-event |
Repo | |
Framework | |
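Two ingredients named in the abstract, reference-frame subtraction and a small 3D ConvNet that scores a whole clip in one forward pass, can be sketched as follows. Layer sizes and depths are placeholders, not the published architecture, and the spatial-temporal attention module is omitted.

```python
import torch
import torch.nn as nn

def subtract_reference(clip):
    """clip: (N, C, T, H, W). Subtract the first frame along the time axis so
    static background cancels and moving regions carry most of the signal."""
    return clip - clip[:, :, :1]

class TinyMotionNet(nn.Module):
    """Placeholder 3D ConvNet that scores a whole clip in one forward pass."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))
        self.head = nn.Linear(8, 1)          # relevant-motion score

    def forward(self, clip):                 # clip: (N, C, T, H, W)
        return torch.sigmoid(self.head(self.features(clip).flatten(1)))

clip = torch.randn(1, 3, 16, 64, 64)
print(TinyMotionNet()(subtract_reference(clip)).shape)   # torch.Size([1, 1])
```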
Quantum Structures in Human Decision-making: Towards Quantum Expected Utility
Title | Quantum Structures in Human Decision-making: Towards Quantum Expected Utility |
Authors | Sandro Sozzo |
Abstract | {\it Ellsberg thought experiments} and empirical confirmation of Ellsberg preferences pose serious challenges to {\it subjective expected utility theory} (SEUT). We have recently elaborated a quantum-theoretic framework for human decisions under uncertainty which satisfactorily copes with the Ellsberg paradox and other puzzles of SEUT. We apply here the quantum-theoretic framework to the {\it Ellsberg two-urn example}, showing that the paradox can be explained by assuming a state change of the conceptual entity that is the object of the decision ({\it decision-making}, or {\it DM}, {\it entity}) and representing subjective probabilities by quantum probabilities. We also model the empirical data we collected in a DM test on human participants within the theoretical framework above. The obtained results are relevant, as they provide a way to model real-life decisions, e.g., financial and medical ones, that show the same empirical patterns as the two-urn experiment. |
Tasks | Decision Making |
Published | 2018-10-29 |
URL | https://arxiv.org/abs/1811.00875v1 |
https://arxiv.org/pdf/1811.00875v1.pdf | |
PWC | https://paperswithcode.com/paper/quantum-structures-in-human-decision-making |
Repo | |
Framework | |
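The framework's core move, replacing classical subjective probabilities with quantum ones and letting the decision change the state of the DM entity, rests on two standard ingredients of quantum probability theory. The generic textbook forms are shown below; they are not the paper's specific two-urn representation.

```latex
% Born rule: probability of event A when the DM entity is in state \psi
\mu_{\psi}(A) = \langle \psi | P_A | \psi \rangle ,
\qquad P_A \ \text{an orthogonal projector}, \quad \|\psi\| = 1 .

% Luders rule: the state change induced by actualizing event A
\psi \;\longmapsto\; \frac{P_A \psi}{\| P_A \psi \|} .
```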
AdapterNet - learning input transformation for domain adaptation
Title | AdapterNet - learning input transformation for domain adaptation |
Authors | Alon Hazan, Yoel Shoshan, Daniel Khapun, Roy Aladjem, Vadim Ratner |
Abstract | Deep neural networks have demonstrated impressive performance in various machine learning tasks. However, they are notoriously sensitive to changes in data distribution. Often, even a slight change in the distribution can lead to drastic performance reduction. Artificially augmenting the data may help to some extent, but in most cases, fails to achieve model invariance to the data distribution. Some examples where this sub-class of domain adaptation can be valuable are various imaging modalities such as thermal imaging, X-ray, ultrasound, and MRI, where changes in acquisition parameters or acquisition device manufacturer will result in a different representation of the same input. Our work shows that standard fine-tuning fails to adapt the model in certain important cases. We propose a novel method of adapting to a new data source, and demonstrate near perfect adaptation on a customized ImageNet benchmark. Moreover, our method does not require any samples from the original data set, it is completely explainable and can be tailored to the task. |
Tasks | Domain Adaptation |
Published | 2018-05-29 |
URL | http://arxiv.org/abs/1805.11601v2 |
http://arxiv.org/pdf/1805.11601v2.pdf | |
PWC | https://paperswithcode.com/paper/adapternet-learning-input-transformation-for |
Repo | |
Framework | |
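The idea of learning an input transformation while leaving the original network untouched can be sketched as a small residual adapter prepended to a frozen classifier. The adapter architecture below is a placeholder, and `frozen_model` stands in for whatever pretrained network is being adapted.

```python
import torch
import torch.nn as nn

class InputAdapter(nn.Module):
    """Learned image-to-image transform prepended to a frozen classifier."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1))

    def forward(self, x):
        return x + self.net(x)            # residual: starts close to identity

def build(frozen_model):
    for p in frozen_model.parameters():   # the original network stays fixed
        p.requires_grad = False
    adapter = InputAdapter()
    model = nn.Sequential(adapter, frozen_model)
    optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)
    return model, optimizer
```

Only the adapter's parameters receive gradients, so the pretrained classifier itself is never modified.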
Verb Argument Structure Alternations in Word and Sentence Embeddings
Title | Verb Argument Structure Alternations in Word and Sentence Embeddings |
Authors | Katharina Kann, Alex Warstadt, Adina Williams, Samuel R. Bowman |
Abstract | Verbs occur in different syntactic environments, or frames. We investigate whether artificial neural networks encode grammatical distinctions necessary for inferring the idiosyncratic frame-selectional properties of verbs. We introduce five datasets, collectively called FAVA, containing in aggregate nearly 10k sentences labeled for grammatical acceptability, illustrating different verbal argument structure alternations. We then test whether models can distinguish acceptable English verb-frame combinations from unacceptable ones using a sentence embedding alone. For converging evidence, we further construct LaVA, a corresponding word-level dataset, and investigate whether the same syntactic features can be extracted from word embeddings. Our models perform reliable classifications for some verbal alternations but not others, suggesting that while these representations do encode fine-grained lexical information, it is incomplete or can be hard to extract. Further, differences between the word- and sentence-level models show that some information present in word embeddings is not passed on to the down-stream sentence embeddings. |
Tasks | Sentence Embedding, Sentence Embeddings, Word Embeddings |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.10773v1 |
http://arxiv.org/pdf/1811.10773v1.pdf | |
PWC | https://paperswithcode.com/paper/verb-argument-structure-alternations-in-word |
Repo | |
Framework | |
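The sentence-level experiments amount to probing whether a fixed sentence embedding supports an acceptability classifier. A minimal sketch of such a probe; the random arrays stand in for embeddings produced by whichever encoder is under study, and the choice of a logistic-regression probe is an assumption of this sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Placeholders: precomputed sentence embeddings and 0/1 acceptability labels.
X_train, y_train = np.random.randn(200, 512), np.random.randint(0, 2, 200)
X_test,  y_test  = np.random.randn(50, 512),  np.random.randint(0, 2, 50)

# Linear probe: can acceptability be read off the sentence embedding alone?
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, probe.predict(X_test)))
```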
Latent Geometry Inspired Graph Dissimilarities Enhance Affinity Propagation Community Detection in Complex Networks
Title | Latent Geometry Inspired Graph Dissimilarities Enhance Affinity Propagation Community Detection in Complex Networks |
Authors | Carlo Vittorio Cannistraci, Alessandro Muscoloni |
Abstract | Affinity propagation is one of the most effective unsupervised pattern recognition algorithms for data clustering in high-dimensional feature space. However, the numerous attempts to test its performance for community detection in complex networks have attained results far from those of state-of-the-art methods such as Infomap and Louvain. Yet, all these studies agreed that the crucial problem is to convert the unweighted network topology into a ‘smart-enough’ node dissimilarity matrix that can properly drive the message-passing procedure behind affinity propagation clustering. Here we introduce a conceptual innovation and we discuss how to leverage network latent geometry notions in order to design dissimilarity matrices for affinity propagation community detection. Our results demonstrate that the latent-geometry-inspired dissimilarity measures we design bring affinity propagation to equal or outperform current state-of-the-art methods for community detection. These findings are solidly supported on both synthetic ‘realistic’ networks (with known ground-truth communities) and real networks (with community metadata), even when the data structure is corrupted by artificially induced noise in the form of missing or spurious connectivity. |
Tasks | Community Detection |
Published | 2018-04-12 |
URL | http://arxiv.org/abs/1804.04566v2 |
http://arxiv.org/pdf/1804.04566v2.pdf | |
PWC | https://paperswithcode.com/paper/latent-geometry-inspired-graph |
Repo | |
Framework | |
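The pipeline described above, building a node dissimilarity matrix from the network topology and handing it to affinity propagation, can be sketched as follows. The plain shortest-path dissimilarity used here is a simple stand-in for the latent-geometry-inspired measures the paper designs.

```python
import numpy as np
import networkx as nx
from sklearn.cluster import AffinityPropagation

G = nx.karate_club_graph()
n = G.number_of_nodes()

# Stand-in dissimilarity: shortest-path distance between every pair of nodes.
D = np.zeros((n, n))
for i, lengths in nx.all_pairs_shortest_path_length(G):
    for j, d in lengths.items():
        D[i, j] = d

# Affinity propagation expects similarities, so negate the dissimilarity.
labels = AffinityPropagation(affinity="precomputed", random_state=0).fit(-D).labels_
print("communities found:", len(set(labels)))
```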
Physics-Informed CoKriging: A Gaussian-Process-Regression-Based Multifidelity Method for Data-Model Convergence
Title | Physics-Informed CoKriging: A Gaussian-Process-Regression-Based Multifidelity Method for Data-Model Convergence |
Authors | Xiu Yang, David Barajas-Solano, Guzel Tartakovsky, Alexandre Tartakovsky |
Abstract | In this work, we propose a new Gaussian process regression (GPR)-based multifidelity method: physics-informed CoKriging (CoPhIK). In CoKriging-based multifidelity methods, the quantities of interest are modeled as linear combinations of multiple parameterized stationary Gaussian processes (GPs), and the hyperparameters of these GPs are estimated from data via optimization. In CoPhIK, we construct a GP representing low-fidelity data using physics-informed Kriging (PhIK), and model the discrepancy between low- and high-fidelity data using a parameterized GP with hyperparameters identified via optimization. Our approach reduces the cost of optimization for inferring hyperparameters by incorporating partial physical knowledge. We prove that the physical constraints in the form of deterministic linear operators are satisfied up to an error bound. Furthermore, we combine CoPhIK with a greedy active learning algorithm for guiding the selection of additional observation locations. The efficiency and accuracy of CoPhIK are demonstrated for reconstructing the partially observed modified Branin function, reconstructing the sparsely observed state of a steady state heat transport problem, and learning a conservative tracer distribution from sparse tracer concentration measurements. |
Tasks | Active Learning, Gaussian Processes |
Published | 2018-11-24 |
URL | http://arxiv.org/abs/1811.09757v1 |
http://arxiv.org/pdf/1811.09757v1.pdf | |
PWC | https://paperswithcode.com/paper/physics-informed-cokriging-a-gaussian-process |
Repo | |
Framework | |
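The multifidelity structure, one GP for the low-fidelity data plus a second GP for the discrepancy with the high-fidelity data, can be sketched with standard GP regression. In CoPhIK the low-fidelity GP comes from physics-informed Kriging rather than being fit to data, so the version below is a generic residual-cokriging baseline on toy functions, not the paper's method.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f_low  = lambda x: np.sin(8 * x)                   # cheap, biased model
f_high = lambda x: np.sin(8 * x) + 0.3 * x         # expensive "truth"

X_low  = np.linspace(0, 1, 30)[:, None]            # many cheap samples
X_high = np.linspace(0, 1, 6)[:, None]             # few expensive samples

gp_low = GaussianProcessRegressor(RBF(0.1)).fit(X_low, f_low(X_low).ravel())

# A second GP models the discrepancy between fidelities at the expensive points.
resid = f_high(X_high).ravel() - gp_low.predict(X_high)
gp_delta = GaussianProcessRegressor(RBF(0.2)).fit(X_high, resid)

X_test = np.linspace(0, 1, 5)[:, None]
pred = gp_low.predict(X_test) + gp_delta.predict(X_test)
print(np.abs(pred - f_high(X_test).ravel()))        # multifidelity error
```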
Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering
Title | Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering |
Authors | Daniil Sorokin, Iryna Gurevych |
Abstract | Most approaches to Knowledge Base Question Answering are based on semantic parsing. In this paper, we address the problem of learning vector representations for complex semantic parses that consist of multiple entities and relations. Previous work largely focused on selecting the correct semantic relations for a question and disregarded the structure of the semantic parse: the connections between entities and the directions of the relations. We propose to use Gated Graph Neural Networks to encode the graph structure of the semantic parse. We show on two data sets that the graph networks outperform all baseline models that do not explicitly model the structure. The error analysis confirms that our approach can successfully process complex semantic parses. |
Tasks | Knowledge Base Question Answering, Question Answering, Semantic Parsing |
Published | 2018-08-13 |
URL | http://arxiv.org/abs/1808.04126v1 |
http://arxiv.org/pdf/1808.04126v1.pdf | |
PWC | https://paperswithcode.com/paper/modeling-semantics-with-gated-graph-neural |
Repo | |
Framework | |
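At the heart of a gated graph neural network is a GRU-style node update driven by messages aggregated from neighbours. The sketch below shows that propagation step in isolation; the semantic-parse graph construction, edge-type weights, and question encoder are out of scope.

```python
import torch
import torch.nn as nn

class GGNNLayer(nn.Module):
    """One round of gated graph propagation: message passing + GRU update."""
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, h, adj):
        # h: (num_nodes, dim); adj: (num_nodes, num_nodes) adjacency matrix.
        m = adj @ self.msg(h)          # sum of transformed neighbour states
        return self.gru(m, h)          # gated update of every node state

layer = GGNNLayer(16)
h = torch.randn(5, 16)                 # entities/relations of a semantic parse
adj = (torch.rand(5, 5) > 0.6).float()
for _ in range(3):                     # a few propagation steps
    h = layer(h, adj)
print(h.shape)                         # torch.Size([5, 16])
```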
Learning Domain-Sensitive and Sentiment-Aware Word Embeddings
Title | Learning Domain-Sensitive and Sentiment-Aware Word Embeddings |
Authors | Bei Shi, Zihao Fu, Lidong Bing, Wai Lam |
Abstract | Word embeddings have been widely used in sentiment classification because of their efficacy for semantic representations of words. Given reviews from different domains, some existing methods for word embeddings exploit sentiment information, but they cannot produce domain-sensitive embeddings. On the other hand, some other existing methods can generate domain-sensitive word embeddings, but they cannot distinguish words with similar contexts but opposite sentiment polarity. We propose a new method for learning domain-sensitive and sentiment-aware embeddings that simultaneously capture sentiment semantics and the domain sensitivity of individual words. Our method can automatically determine and produce domain-common embeddings and domain-specific embeddings. The differentiation of domain-common and domain-specific words enables the model to pool common semantics across multiple domains, as a form of data augmentation, while capturing the varied semantics of domain-specific words in each domain. Experimental results show that our model provides an effective way to learn domain-sensitive and sentiment-aware word embeddings which benefit sentiment classification at both the sentence level and the lexicon term level. |
Tasks | Data Augmentation, Sentiment Analysis, Word Embeddings |
Published | 2018-05-10 |
URL | http://arxiv.org/abs/1805.03801v1 |
http://arxiv.org/pdf/1805.03801v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-domain-sensitive-and-sentiment-aware |
Repo | |
Framework | |
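One way to picture the output described above is a per-word gate that mixes a domain-common vector with a domain-specific one. This gating form is an illustrative reading of "automatically determine and produce domain-common embeddings and domain-specific embeddings", not the paper's exact model, and all vectors below are toy values.

```python
import numpy as np

def combined_embedding(word, domain, common, specific, gate):
    """Mix a domain-common vector with a domain-specific one.

    common:   {word: vector shared across domains}
    specific: {(word, domain): vector for that domain}
    gate:     {word: probability in [0, 1] that the word is domain-common}
    """
    g = gate[word]
    return g * common[word] + (1.0 - g) * specific[(word, domain)]

# Toy example: "lightweight" is positive for laptops, often negative for books.
common   = {"lightweight": np.array([0.1, 0.2])}
specific = {("lightweight", "laptop"): np.array([0.9, 0.1]),
            ("lightweight", "book"):   np.array([-0.8, 0.3])}
gate     = {"lightweight": 0.3}
print(combined_embedding("lightweight", "laptop", common, specific, gate))
```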
A Lifelong Learning Approach to Brain MR Segmentation Across Scanners and Protocols
Title | A Lifelong Learning Approach to Brain MR Segmentation Across Scanners and Protocols |
Authors | Neerav Karani, Krishna Chaitanya, Christian Baumgartner, Ender Konukoglu |
Abstract | Convolutional neural networks (CNNs) have shown promising results on several segmentation tasks in magnetic resonance (MR) images. However, the accuracy of CNNs may degrade severely when segmenting images acquired with different scanners and/or protocols as compared to the training data, thus limiting their practical utility. We address this shortcoming in a lifelong multi-domain learning setting by treating images acquired with different scanners or protocols as samples from different, but related domains. Our solution is a single CNN with shared convolutional filters and domain-specific batch normalization layers, which can be tuned to new domains with only a few ($\approx$ 4) labelled images. Importantly, this is achieved while retaining performance on the older domains whose training data may no longer be available. We evaluate the method for brain structure segmentation in MR images. Results demonstrate that the proposed method largely closes the gap to the benchmark, which is training a dedicated CNN for each scanner. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10170v1 |
http://arxiv.org/pdf/1805.10170v1.pdf | |
PWC | https://paperswithcode.com/paper/a-lifelong-learning-approach-to-brain-mr |
Repo | |
Framework | |
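The architectural idea, convolutional filters shared across domains with one set of batch-normalization parameters per scanner or protocol, can be sketched directly. The tiny two-layer network below is a placeholder for the actual segmentation CNN.

```python
import torch
import torch.nn as nn

class SharedConvDomainBN(nn.Module):
    """Convolutions shared across domains; BatchNorm chosen per domain."""
    def __init__(self, domains, channels=16):
        super().__init__()
        self.conv = nn.Conv2d(1, channels, 3, padding=1)       # shared
        self.bn = nn.ModuleDict({d: nn.BatchNorm2d(channels)   # per domain
                                 for d in domains})
        self.head = nn.Conv2d(channels, 4, 1)                  # 4 structures

    def forward(self, x, domain):
        return self.head(torch.relu(self.bn[domain](self.conv(x))))

net = SharedConvDomainBN(domains=["scanner_a", "scanner_b"])
x = torch.randn(2, 1, 64, 64)
print(net(x, "scanner_b").shape)    # torch.Size([2, 4, 64, 64])
```

Adapting to a new scanner then amounts to adding one more entry to the `ModuleDict` and fine-tuning only those normalization parameters on a handful of labelled images.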
Exploiting Sentence Embedding for Medical Question Answering
Title | Exploiting Sentence Embedding for Medical Question Answering |
Authors | Yu Hao, Xien Liu, Ji Wu, Ping Lv |
Abstract | Despite the great success of word embedding, sentence embedding remains far from solved. In this paper, we present a supervised learning framework to exploit sentence embedding for the medical question answering task. The learning framework consists of two main parts: 1) a sentence embedding module, and 2) a scoring module. The former is developed with contextual self-attention and multi-scale techniques to encode a sentence into an embedding tensor; this module is called Contextual self-Attention Multi-scale Sentence Embedding (CAMSE) for short. The latter employs two scoring strategies: Semantic Matching Scoring (SMS) and Semantic Association Scoring (SAS). SMS measures similarity while SAS captures association between sentence pairs: a medical question concatenated with a candidate choice, and a piece of corresponding supportive evidence. The proposed framework is evaluated on two Medical Question Answering (MedicalQA) datasets collected from real-world applications: medical exams and clinical diagnosis based on electronic medical records (EMR). The comparison results show that our proposed framework achieves significant improvements over competitive baseline approaches. Additionally, a series of controlled experiments illustrate that the multi-scale strategy and the contextual self-attention layer play important roles in producing effective sentence embeddings, and that the two scoring strategies are highly complementary for question answering problems. |
Tasks | Question Answering, Sentence Embedding |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06156v1 |
http://arxiv.org/pdf/1811.06156v1.pdf | |
PWC | https://paperswithcode.com/paper/exploiting-sentence-embedding-for-medical |
Repo | |
Framework | |
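The Semantic Matching Scoring (SMS) part compares the embedding of a question concatenated with a candidate choice against the embedding of a piece of evidence. A minimal sketch with cosine similarity, where `embed` is a random-vector placeholder for the CAMSE encoder.

```python
import torch
import torch.nn.functional as F

def embed(sentence: str, dim: int = 128) -> torch.Tensor:
    """Placeholder for the CAMSE sentence encoder (here: a random vector)."""
    g = torch.Generator().manual_seed(hash(sentence) % (2 ** 31))
    return torch.randn(dim, generator=g)

def semantic_matching_score(question: str, choice: str, evidence: str) -> float:
    """Similarity between (question + candidate choice) and the evidence."""
    q = embed(question + " " + choice)
    e = embed(evidence)
    return F.cosine_similarity(q, e, dim=0).item()

best = max(["choice A", "choice B"],
           key=lambda c: semantic_matching_score("question text", c, "evidence text"))
print(best)
```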