Paper Group ANR 771
Learning to Reason
Title | Learning to Reason |
Authors | Brian Groenke |
Abstract | Automated theorem proving has long been a key task of artificial intelligence. Proofs form the bedrock of rigorous scientific inquiry, and many tools for both partially and fully automating their derivation have been developed over the past half-century. Some examples of state-of-the-art provers are E (Schulz, 2013), VAMPIRE (Kovács & Voronkov, 2013), and Prover9 (McCune, 2005-2010). Newer theorem provers, such as E, use superposition calculus in place of more traditional resolution- and tableau-based methods. There have also been a number of past attempts to apply machine learning methods to guiding proof search. Suttner & Ertel proposed a multilayer-perceptron-based method using hand-engineered features as far back as 1990; Urban et al. (2011) apply machine learning to tableau calculus; and Loos et al. (2017) recently proposed a method for guiding the E theorem prover using deep neural networks. All of this prior work, however, shares one common limitation: it relies on the axioms of classical first-order logic. Very little attention has been paid to automated theorem proving for non-classical logics. One of the only recent examples is McLaughlin & Pfenning (2008), who applied the polarized inverse method to intuitionistic propositional logic. The literature is otherwise mostly silent. This is truly unfortunate, as there are many reasons to prefer non-classical proofs over classical ones. Constructive/intuitionistic proofs should be of particular interest to computer scientists thanks to the well-known Curry-Howard correspondence (Howard, 1980), which tells us that all terminating programs correspond to a proof in intuitionistic logic and vice versa. This work explores using Q-learning (Watkins, 1989) to inform proof search for a specific non-classical logic called Core Logic (Tennant, 2017). |
Tasks | Automated Theorem Proving, Q-Learning |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05315v1 |
http://arxiv.org/pdf/1810.05315v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-reason |
Repo | |
Framework | |
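To make the Q-learning connection concrete, here is a minimal tabular sketch of the Watkins (1989) update as it might drive proof search. The state/action encoding is a hypothetical stand-in, not the paper's actual representation: a "state" would describe the current sequent/goal and an "action" one of the applicable Core Logic inference rules.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
Q = defaultdict(float)  # (state, action) -> estimated value

def choose_action(state, actions):
    """Epsilon-greedy choice among the applicable inference rules."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One-step Q-learning backup toward reward + discounted best value."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```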
Classification Uncertainty of Deep Neural Networks Based on Gradient Information
Title | Classification Uncertainty of Deep Neural Networks Based on Gradient Information |
Authors | Philipp Oberdiek, Matthias Rottmann, Hanno Gottschalk |
Abstract | We study the quantification of uncertainty of Convolutional Neural Networks (CNNs) based on gradient metrics. Unlike the classical softmax entropy, such metrics gather information from all layers of the CNN. We show for the EMNIST digits data set that for several such metrics we achieve the same meta classification accuracy – i.e. the task of classifying predictions as correct or incorrect without knowing the actual label – as for entropy thresholding. We apply meta classification to unknown concepts (out-of-distribution samples) – EMNIST/Omniglot letters, CIFAR10 and noise – and demonstrate that meta classification rates for unknown concepts can be increased when using entropy together with several gradient based metrics as input quantities for a meta classifier. Meta classifiers only trained on the uncertainty metrics of known concepts, i.e. EMNIST digits, usually do not perform equally well for all unknown concepts. If we however allow the meta classifier to be trained on uncertainty metrics for some out-of-distribution samples, meta classification for concepts remote from EMNIST digits (then termed known unknowns) can be improved considerably. |
Tasks | Omniglot |
Published | 2018-05-22 |
URL | http://arxiv.org/abs/1805.08440v2 |
http://arxiv.org/pdf/1805.08440v2.pdf | |
PWC | https://paperswithcode.com/paper/classification-uncertainty-of-deep-neural |
Repo | |
Framework | |
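A minimal PyTorch sketch of the core idea: backpropagate the loss for the network's own predicted class (no ground-truth label needed) and collect per-layer gradient norms alongside the softmax entropy as inputs for a meta classifier. The paper's exact metric definitions may differ.

```python
import torch
import torch.nn.functional as F

def gradient_uncertainty(model, x):
    """x: a single input batch (1, C, H, W). Returns (entropy, grad_norms)
    as candidate features for a downstream meta classifier."""
    model.zero_grad()
    logits = model(x)
    pred = logits.argmax(dim=1)              # self-assigned pseudo-label
    F.cross_entropy(logits, pred).backward()
    grad_norms = [p.grad.norm().item()
                  for p in model.parameters() if p.grad is not None]
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum().item()
    return entropy, grad_norms
```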
Goal-oriented Trajectories for Efficient Exploration
Title | Goal-oriented Trajectories for Efficient Exploration |
Authors | Fabio Pardo, Vitaly Levdik, Petar Kormushev |
Abstract | Exploration is a difficult challenge in reinforcement learning, and even recent state-of-the-art curiosity-based methods rely on the simple epsilon-greedy strategy to generate novelty. We argue that pure random walks fail to properly expand the exploration area in most environments, and propose to replace single random action choices with random goal selection followed by several steps in the goal's direction. This approach is compatible with any curiosity-based exploration method and any off-policy reinforcement learning agent, and generates longer and safer trajectories than individual random actions. To illustrate this, we present a task-independent agent that learns to reach coordinates in screen frames and demonstrate its ability to explore in the game Super Mario Bros., significantly improving the score of a baseline DQN agent. |
Tasks | Efficient Exploration |
Published | 2018-07-05 |
URL | http://arxiv.org/abs/1807.02078v1 |
http://arxiv.org/pdf/1807.02078v1.pdf | |
PWC | https://paperswithcode.com/paper/goal-oriented-trajectories-for-efficient |
Repo | |
Framework | |
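A sketch of the proposed exploration scheme: instead of one random action, sample a random screen coordinate and pursue it for several steps with a learned goal-reaching policy, storing transitions for any off-policy learner. The Gym-style `env`/`agent` interfaces and the `width`/`height` attributes are illustrative assumptions.

```python
import random

def goal_directed_rollout(env, agent, goal_policy, obs, n_steps=20):
    """One exploration 'macro-step': pursue a randomly sampled goal."""
    goal = (random.randrange(env.width), random.randrange(env.height))
    for _ in range(n_steps):
        action = goal_policy(obs, goal)          # act toward the goal
        next_obs, reward, done, _ = env.step(action)
        agent.store_transition(obs, action, reward, next_obs, done)
        obs = next_obs
        if done:
            break
    return obs
```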
Stochastic Block Model for Hypergraphs: Statistical limits and a semidefinite programming approach
Title | Stochastic Block Model for Hypergraphs: Statistical limits and a semidefinite programming approach |
Authors | Chiheon Kim, Afonso S. Bandeira, Michel X. Goemans |
Abstract | We study the problem of community detection in a random hypergraph model which we call the stochastic block model for $k$-uniform hypergraphs ($k$-SBM). We investigate the exact recovery problem in $k$-SBM and show that a sharp phase transition occurs around a threshold: below the threshold it is impossible to recover the communities with non-vanishing probability, yet above the threshold there is an estimator which recovers the communities asymptotically almost surely. We also consider a simple, efficient algorithm for the exact recovery problem based on a semidefinite relaxation technique. |
Tasks | Community Detection |
Published | 2018-07-08 |
URL | http://arxiv.org/abs/1807.02884v1 |
http://arxiv.org/pdf/1807.02884v1.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-block-model-for-hypergraphs |
Repo | |
Framework | |
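A sketch of a semidefinite relaxation for two balanced communities, using cvxpy. Collapsing the $k$-uniform hypergraph into a pairwise co-occurrence matrix is our simplification for illustration; the paper's relaxation is formulated on the hypergraph model itself.

```python
import numpy as np
import cvxpy as cp

def sdp_recover(hyperedges, n):
    """Solve max <W, X> s.t. X PSD, diag(X) = 1, <J, X> = 0, where W
    counts pairwise co-occurrences in hyperedges (tuples of k vertices)."""
    W = np.zeros((n, n))
    for e in hyperedges:
        for i in e:
            for j in e:
                if i != j:
                    W[i, j] += 1.0
    X = cp.Variable((n, n), symmetric=True)
    cons = [X >> 0, cp.diag(X) == 1, cp.sum(X) == 0]
    cp.Problem(cp.Maximize(cp.trace(W @ X)), cons).solve()
    _, vecs = np.linalg.eigh(X.value)   # round via the top eigenvector
    return np.sign(vecs[:, -1])
```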
Physics-Informed Kriging: A Physics-Informed Gaussian Process Regression Method for Data-Model Convergence
Title | Physics-Informed Kriging: A Physics-Informed Gaussian Process Regression Method for Data-Model Convergence |
Authors | Xiu Yang, Guzel Tartakovsky, Alexandre Tartakovsky |
Abstract | In this work, we propose a new Gaussian process regression (GPR) method: physics-informed Kriging (PhIK). In standard data-driven Kriging, the unknown function of interest is usually treated as a Gaussian process with an assumed stationary covariance whose hyperparameters are estimated from data. In PhIK, we compute the mean and covariance function from realizations of available stochastic models, e.g., from realizations of solutions of the governing stochastic partial differential equations. The Gaussian process constructed this way is generally non-stationary and does not assume a specific form of the covariance function. Our approach avoids the costly optimization step in data-driven GPR methods to identify the hyperparameters. More importantly, we prove that physical constraints in the form of a deterministic linear operator are guaranteed in the resulting prediction. We also provide an error estimate for preserving the physical constraints when errors are included in the stochastic model realizations. To reduce the computational cost of obtaining stochastic model realizations, we propose a multilevel Monte Carlo estimate of the mean and covariance functions. Further, we present an active learning algorithm that guides the selection of additional observation locations. The efficiency and accuracy of PhIK are demonstrated for reconstructing a partially known modified Branin function and learning a conservative tracer distribution from sparse concentration measurements. |
Tasks | Active Learning |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03461v2 |
http://arxiv.org/pdf/1809.03461v2.pdf | |
PWC | https://paperswithcode.com/paper/physics-informed-kriging-a-physics-informed |
Repo | |
Framework | |
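A numpy sketch of the PhIK construction: estimate the GP mean and (non-stationary) covariance from an ensemble of stochastic-model realizations, then condition on observations with the standard Kriging formulas. The multilevel Monte Carlo and active learning components are omitted.

```python
import numpy as np

def phik_predict(realizations, obs_idx, y_obs, nugget=1e-8):
    """realizations: (n_samples, n_points) ensemble of model solutions.
    Mean/covariance come from the ensemble (no kernel fitting);
    prediction is plain GP conditioning on the observed indices."""
    mu = realizations.mean(axis=0)
    C = np.cov(realizations, rowvar=False)
    Koo = C[np.ix_(obs_idx, obs_idx)] + nugget * np.eye(len(obs_idx))
    Kxo = C[:, obs_idx]
    mean = mu + Kxo @ np.linalg.solve(Koo, y_obs - mu[obs_idx])
    cov = C - Kxo @ np.linalg.solve(Koo, Kxo.T)
    return mean, cov  # posterior variance can guide active learning
```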
Memory Matching Networks for One-Shot Image Recognition
Title | Memory Matching Networks for One-Shot Image Recognition |
Authors | Qi Cai, Yingwei Pan, Ting Yao, Chenggang Yan, Tao Mei |
Abstract | In this paper, we introduce the new ideas of augmenting Convolutional Neural Networks (CNNs) with memory and of learning to learn the network parameters for unlabelled images on the fly in one-shot learning. Specifically, we present Memory Matching Networks (MM-Net), a novel deep architecture that explores the training procedure, following the philosophy that training and test conditions must match. Technically, MM-Net writes the features of a set of labelled images (the support set) into memory and reads from memory when performing inference, to holistically leverage the knowledge in the set. Meanwhile, a Contextual Learner employs the memory slots in a sequential manner to predict the parameters of CNNs for unlabelled images. The whole architecture is trained by showing only a few examples per class at a time and switching the learning from minibatch to minibatch, which is tailored for one-shot learning when presented with a few examples of new categories at test time. Unlike conventional one-shot learning approaches, MM-Net outputs one unified model irrespective of the number of shots and categories. Extensive experiments are conducted on two public datasets, Omniglot and miniImageNet, and superior results are reported compared to state-of-the-art approaches. Most remarkably, MM-Net improves one-shot accuracy from 98.95% to 99.28% on Omniglot and from 49.21% to 53.37% on miniImageNet. |
Tasks | Omniglot, One-Shot Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08281v1 |
http://arxiv.org/pdf/1804.08281v1.pdf | |
PWC | https://paperswithcode.com/paper/memory-matching-networks-for-one-shot-image |
Repo | |
Framework | |
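A minimal sketch of the memory read that MM-Net-style matching relies on: support-set features occupy memory slots, and a query feature retrieves a context vector by cosine attention. The Contextual Learner and parameter prediction are omitted, and the slot layout is our assumption.

```python
import torch
import torch.nn.functional as F

def memory_read(memory, query):
    """memory: (n_slots, d) support-set features written into slots;
    query: (d,) feature of an unlabelled image."""
    sims = F.cosine_similarity(memory, query.unsqueeze(0), dim=1)
    attn = F.softmax(sims, dim=0)      # attention weights over slots
    return attn @ memory               # (d,) holistic support context
```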
One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach
Title | One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach |
Authors | Decebal Constantin Mocanu, Elena Mocanu |
Abstract | Deep learning, even though very successful nowadays, traditionally needs very large amounts of labeled data to perform well on classification tasks. In an attempt to solve this problem, the one-shot learning paradigm, which makes use of just one labeled sample per class plus prior knowledge, is becoming increasingly important. In this paper, we propose a new one-shot learning method, dubbed MoVAE (Mixture of Variational AutoEncoders), to perform classification. Complementary to prior studies, MoVAE represents a paradigm shift in comparison with the usual one-shot learning methods, as it does not use any prior knowledge. Instead, it starts from zero knowledge and one labeled sample per class. Afterward, by using unlabeled data and the concept of generalization learning (in a way, more as humans do), it is capable of gradually improving its performance by itself. Moreover, even if no unlabeled data are available, MoVAE can still perform well in one-shot classification. We demonstrate empirically the efficiency of our proposed approach on three datasets, i.e. handwritten digits (MNIST), fashion products (Fashion-MNIST), and handwritten characters (Omniglot), showing that MoVAE outperforms state-of-the-art one-shot learning algorithms. |
Tasks | Omniglot, One-Shot Learning |
Published | 2018-04-18 |
URL | http://arxiv.org/abs/1804.07645v1 |
http://arxiv.org/pdf/1804.07645v1.pdf | |
PWC | https://paperswithcode.com/paper/one-shot-learning-using-mixture-of |
Repo | |
Framework | |
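A sketch of how a mixture of per-class VAEs can classify: assign an input to the class whose VAE reconstructs it best. Using reconstruction error as the decision score, and the `(recon, mu, logvar)` forward signature, are our assumptions, not the paper's exact rule.

```python
import torch

def movae_classify(x, vaes):
    """vaes: one trained VAE per class. Returns the index of the class
    whose VAE reconstructs x with the smallest squared error."""
    errors = []
    for vae in vaes:
        recon, _mu, _logvar = vae(x)
        errors.append(torch.mean((recon - x) ** 2).item())
    return min(range(len(errors)), key=errors.__getitem__)
```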
Dropping Networks for Transfer Learning
Title | Dropping Networks for Transfer Learning |
Authors | James O'Neill, Danushka Bollegala |
Abstract | Many tasks in natural language understanding require learning relationships between two sequences, such as natural language inference, paraphrasing, and entailment. These tasks are similar in nature, yet they are often modeled individually. Knowledge transfer can be effective for closely related tasks, but transferring all knowledge, some of which is irrelevant to the target task, can lead to sub-optimal results due to negative transfer. Hence, this paper focuses on the transferability of both instances and parameters across natural language understanding tasks by proposing an ensemble-based transfer learning method. The primary contribution of this paper is the combination of both Dropout and Bagging for improved transferability in neural networks, referred to herein as Dropping. We present a straightforward yet novel approach for incorporating source Dropping networks into a target task for few-shot learning that mitigates negative transfer. This is achieved by using a decaying parameter chosen according to the slope changes of a smoothed spline error curve at sub-intervals during training. We compare the proposed approach against hard and soft parameter sharing transfer methods in the few-shot learning case, and also against models fully trained on the target task in the standard supervised learning setup. The aforementioned adjustment leads to improved transfer learning performance and results comparable to the current state of the art using only a fraction of the data from the target task. |
Tasks | Few-Shot Learning, Natural Language Inference, Transfer Learning |
Published | 2018-04-23 |
URL | http://arxiv.org/abs/1804.08501v3 |
http://arxiv.org/pdf/1804.08501v3.pdf | |
PWC | https://paperswithcode.com/paper/dropping-networks-for-transfer-learning |
Repo | |
Framework | |
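A sketch of the "Dropping" prediction step under stated assumptions: source networks trained with dropout on bootstrap resamples (bagging) are averaged and blended with the target model via a decaying weight. The paper derives the decay schedule from a smoothed spline error curve, which we leave abstract, and the `predict_proba` interface is illustrative.

```python
import numpy as np

def dropping_predict(source_nets, target_net, x, decay):
    """decay in [0, 1] down-weights the source ensemble as the
    target model improves on the target task."""
    source = np.mean([net.predict_proba(x) for net in source_nets], axis=0)
    target = target_net.predict_proba(x)
    return decay * source + (1.0 - decay) * target
```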
Hypernyms Through Intra-Article Organization in Wikipedia
Title | Hypernyms Through Intra-Article Organization in Wikipedia |
Authors | Disha Shrivastava, Sreyash Kenkre, Santosh Penubothula |
Abstract | We introduce a new measure for unsupervised hypernym detection and directionality. The motivation is to keep the measure computationally light and portable across languages. We show that the relative physical location of words in explanatory articles captures the directionality property, while the phrases in section titles of articles about the word capture the semantic similarity needed for the hypernym detection task. We show experimentally that the combination of features from these two simple measures suffices to produce results comparable with the best unsupervised measures in terms of average precision. |
Tasks | Semantic Similarity, Semantic Textual Similarity |
Published | 2018-09-03 |
URL | http://arxiv.org/abs/1809.00414v1 |
http://arxiv.org/pdf/1809.00414v1.pdf | |
PWC | https://paperswithcode.com/paper/hypernyms-through-intra-article-organization |
Repo | |
Framework | |
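A toy sketch of the positional directionality cue, assuming plain-text explanatory articles: if the candidate hypernym appears early in the article explaining the other word but not vice versa, it is the likelier hypernym. The paper's exact scoring may differ.

```python
def directionality_score(article_a, article_b, word_a, word_b):
    """Positive score suggests word_b is the hypernym of word_a: the more
    general term tends to appear early in the article explaining the
    more specific one."""
    def first_rel_pos(article, word):
        tokens = article.lower().split()
        return tokens.index(word) / len(tokens) if word in tokens else 1.0
    return first_rel_pos(article_b, word_a) - first_rel_pos(article_a, word_b)
```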
Xu: An Automated Query Expansion and Optimization Tool
Title | Xu: An Automated Query Expansion and Optimization Tool |
Authors | Morgan Gallant, Haruna Isah, Farhana Zulkernine, Shahzad Khan |
Abstract | The exponential growth of information on the Internet is a big challenge for information retrieval systems in generating relevant results. Novel approaches are required to reformulate or expand user queries to generate a satisfactory response and increase recall and precision. Query expansion (QE) is a technique that broadens a user's query by introducing additional tokens or phrases based on semantic similarity metrics. The tradeoff is the added computational complexity of finding semantically similar words and a possible increase in noise during retrieval. Despite several research efforts on this topic, QE has not yet been explored enough, and more work is needed on similarity matching and composition of query terms with the objective of retrieving a small set of the most appropriate responses. QE should be scalable, fast, and robust in handling complex queries, with a good response time and noise ceiling. In this paper, we propose Xu, an automated QE technique that uses high-dimensional clustering of word vectors and the Datamuse API, an open-source query engine, to find semantically similar words. We implemented Xu as a command line tool and evaluated its performance using datasets containing news articles and human-generated QEs. The evaluation results show that Xu outperformed Datamuse, achieving about 88% accuracy with reference to the human-generated QE. |
Tasks | Information Retrieval, Semantic Similarity, Semantic Textual Similarity |
Published | 2018-08-28 |
URL | https://arxiv.org/abs/1808.09353v2 |
https://arxiv.org/pdf/1808.09353v2.pdf | |
PWC | https://paperswithcode.com/paper/automated-query-expansion-using-high |
Repo | |
Framework | |
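The Datamuse API is public, so the expansion step can be sketched directly; Xu's additional word-vector clustering and candidate filtering are omitted here.

```python
import requests

def expand_query(term, max_terms=10):
    """Fetch 'means-like' suggestions for a query term from the public
    Datamuse API (https://api.datamuse.com)."""
    resp = requests.get("https://api.datamuse.com/words",
                        params={"ml": term, "max": max_terms}, timeout=10)
    resp.raise_for_status()
    return [item["word"] for item in resp.json()]
```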
Unsupervised Learning for Large-Scale Fiber Detection and Tracking in Microscopic Material Images
Title | Unsupervised Learning for Large-Scale Fiber Detection and Tracking in Microscopic Material Images |
Authors | Hongkai Yu, Dazhou Guo, Zhipeng Yan, Wei Liu, Jeff Simmons, Craig P. Przybyla, Song Wang |
Abstract | Constructing 3D structures from serial section data is a long-standing problem in microscopy. The structure of a fiber-reinforced composite material can be reconstructed using a tracking-by-detection model. Tracking-by-detection algorithms rely heavily on detection accuracy, especially recall. State-of-the-art fiber detection algorithms perform well under ideal conditions, but are inaccurate where image quality degrades locally due to contaminants on the material surface and/or defocus blur. Convolutional Neural Networks (CNNs) could be used for this problem, but would require a large number of manually annotated fibers, which are not available. We propose an unsupervised learning method to accurately detect fibers at large scale that is robust against local degradations of image quality. The proposed method does not require manual annotations, but uses fiber shape/size priors and spatio-temporal consistency in tracking to simulate supervision when training the CNN. Experiments show significant improvements over state-of-the-art fiber detection algorithms together with advanced tracking performance. |
Tasks | |
Published | 2018-05-25 |
URL | http://arxiv.org/abs/1805.10256v1 |
http://arxiv.org/pdf/1805.10256v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-for-large-scale-fiber |
Repo | |
Framework | |
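One way to realize the shape-prior pseudo-labelling, sketched with a classical Hough circle detector as a stand-in (fiber cross-sections are near-circular); the paper's actual prior-driven detector and the tracking-consistency filtering may differ.

```python
import cv2
import numpy as np

def fiber_pseudo_labels(gray, r_min=5, r_max=15):
    """gray: 8-bit grayscale section image. Noisy circle detections can
    serve as pseudo-annotations for training a CNN, to be filtered
    further by tracking consistency across sections."""
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, 1.2, 2 * r_min,
                               param1=100, param2=20,
                               minRadius=r_min, maxRadius=r_max)
    return np.empty((0, 3)) if circles is None else circles[0]  # rows: x, y, r
```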
A Methodology for Search Space Reduction in QoS Aware Semantic Web Service Composition
Title | A Methodology for Search Space Reduction in QoS Aware Semantic Web Service Composition |
Authors | Soumi Chattopadhyay, Ansuman Banerjee |
Abstract | Semantic information regulates the expressiveness of a web service. State-of-the-art approaches in web services research have used the semantics of a web service for different purposes, mainly service discovery, composition, and execution. In this paper, our main focus is on semantics-driven, Quality of Service (QoS) aware service composition. Most contemporary approaches to service composition use semantic information to combine services appropriately when generating the composition solution. In this paper, however, our intention is to use the semantic information to expedite the composition algorithm itself. We present a service composition framework that uses the semantic information of web services to generate clusters in which the services are semantically related. Our final aim is to construct a composition solution using these clusters that can efficiently scale to large service spaces while ensuring solution quality. Experimental results demonstrate the efficiency of our proposed method. |
Tasks | |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07045v2 |
http://arxiv.org/pdf/1809.07045v2.pdf | |
PWC | https://paperswithcode.com/paper/a-methodology-for-search-space-reduction-in |
Repo | |
Framework | |
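A minimal sketch of the clustering idea: group services by a shared semantic concept so that composition search operates over clusters rather than the full service space. The `concept_of` mapping (e.g., the ontology class of a service's output) is our placeholder.

```python
from collections import defaultdict

def cluster_services(services, concept_of):
    """Group semantically related services so composition search can
    prune at the cluster level instead of enumerating all services."""
    clusters = defaultdict(list)
    for service in services:
        clusters[concept_of(service)].append(service)
    return clusters
```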
Quasi-Monte Carlo Variational Inference
Title | Quasi-Monte Carlo Variational Inference |
Authors | Alexander Buchholz, Florian Wenzel, Stephan Mandt |
Abstract | Many machine learning problems involve Monte Carlo gradient estimators. As a prominent example, we focus on Monte Carlo variational inference (MCVI) in this paper. The performance of MCVI crucially depends on the variance of its stochastic gradients. We propose variance reduction by means of Quasi-Monte Carlo (QMC) sampling. QMC replaces N i.i.d. samples from a uniform probability distribution by a deterministic sequence of samples of length N. This sequence covers the underlying random variable space more evenly than i.i.d. draws, reducing the variance of the gradient estimator. With our novel approach, both the score function and the reparameterization gradient estimators lead to much faster convergence. We also propose a new algorithm for Monte Carlo objectives, where we operate with a constant learning rate and increase the number of QMC samples per iteration. We prove that this way, our algorithm can converge asymptotically at a faster rate than SGD. We furthermore provide theoretical guarantees on QMC for Monte Carlo objectives that go beyond MCVI, and support our findings by several experiments on large-scale data sets from various domains. |
Tasks | |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01604v1 |
http://arxiv.org/pdf/1807.01604v1.pdf | |
PWC | https://paperswithcode.com/paper/quasi-monte-carlo-variational-inference |
Repo | |
Framework | |
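A sketch of the QMC substitution for a diagonal Gaussian variational family, using SciPy's scrambled Sobol sampler: low-discrepancy uniforms are pushed through the inverse normal CDF and reparameterized, in place of i.i.d. draws.

```python
import numpy as np
from scipy.stats import norm, qmc

def qmc_reparam_samples(mu, log_sigma, n):
    """Draw n quasi-random samples from a diagonal Gaussian q(z).
    n should preferably be a power of two for Sobol balance."""
    d = len(mu)
    u = qmc.Sobol(d=d, scramble=True).random(n)        # (n, d) uniforms
    eps = norm.ppf(np.clip(u, 1e-9, 1 - 1e-9))         # avoid +/- inf
    return mu + np.exp(log_sigma) * eps                # z samples, (n, d)
```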
Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks
Title | Concept Learning through Deep Reinforcement Learning with Memory-Augmented Neural Networks |
Authors | Jing Shi, Jiaming Xu, Yiqun Yao, Bo Xu |
Abstract | Deep neural networks have shown superior performance in many regimes at remembering familiar patterns from large amounts of data. However, the standard supervised deep learning paradigm is still limited when it comes to learning new concepts efficiently from scarce data. In this paper, we present a memory-augmented neural network motivated by the process of human concept learning. The training procedure, imitating the human course of concept formation, learns how to distinguish samples from different classes and to aggregate samples of the same kind. To better exploit the advantages of this human-inspired behavior, we propose a sequential process during which the network decides how to remember each sample at every step. In this sequential process, a stable and interactive memory serves as an important module. We validate our model on several typical one-shot learning tasks and on an exploratory outlier detection problem. In all experiments, our model is highly competitive, matching or outperforming strong baselines. |
Tasks | One-Shot Learning, Outlier Detection |
Published | 2018-11-15 |
URL | http://arxiv.org/abs/1811.06145v1 |
http://arxiv.org/pdf/1811.06145v1.pdf | |
PWC | https://paperswithcode.com/paper/concept-learning-through-deep-reinforcement |
Repo | |
Framework | |
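A sketch of the per-sample "how to remember" decision, with a fixed similarity threshold standing in for the learned (RL-trained) controller: merge a new feature into its nearest slot when it looks like the same concept, otherwise allocate a fresh slot.

```python
import torch
import torch.nn.functional as F

def write_memory(memory, feature, threshold=0.5):
    """memory: (n_slots, d); feature: (d,). Merge into the nearest slot
    when similar enough, else open a new slot."""
    if memory.shape[0] == 0:
        return feature.unsqueeze(0)
    sims = F.cosine_similarity(memory, feature.unsqueeze(0), dim=1)
    best = int(torch.argmax(sims))
    if sims[best] > threshold:
        memory[best] = (memory[best] + feature) / 2   # aggregate concept
        return memory
    return torch.cat([memory, feature.unsqueeze(0)], dim=0)
```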
Outlier Detection using Generative Models with Theoretical Performance Guarantees
Title | Outlier Detection using Generative Models with Theoretical Performance Guarantees |
Authors | Jirong Yi, Anh Duc Le, Tianming Wang, Xiaodong Wu, Weiyu Xu |
Abstract | This paper considers the problem of recovering signals from compressed measurements contaminated with sparse outliers, which arises in many applications. We propose a generative-model neural network approach for reconstructing the ground truth signals under sparse outliers. We propose an iterative alternating direction method of multipliers (ADMM) algorithm for solving the outlier detection problem via $\ell_1$ norm minimization, and a gradient descent algorithm for solving it via squared $\ell_1$ norm minimization. We establish recovery guarantees for the reconstruction of signals using generative models in the presence of outliers, and give an upper bound on the number of outliers allowed for recovery. Our results apply to both linear and nonlinear generator neural networks with an arbitrary number of layers. We conduct extensive experiments using a variational auto-encoder and deep convolutional generative adversarial networks, and the experimental results show that signals can be successfully reconstructed under outliers using our approach, which outperforms the traditional Lasso and $\ell_2$ minimization approaches. |
Tasks | Outlier Detection |
Published | 2018-10-26 |
URL | http://arxiv.org/abs/1810.11335v1 |
http://arxiv.org/pdf/1810.11335v1.pdf | |
PWC | https://paperswithcode.com/paper/outlier-detection-using-generative-models |
Repo | |
Framework | |
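A PyTorch sketch of the $\ell_1$ formulation solved by first-order optimization: search the generator's latent space for a code whose measurements match $y$ under the outlier-robust $\ell_1$ norm. `G` and `A` are assumed given (a trained generator and a measurement matrix), and Adam stands in for the paper's gradient-descent/ADMM solvers.

```python
import torch

def recover_signal(G, A, y, z_dim, steps=2000, lr=1e-2):
    """Minimize || A @ G(z) - y ||_1 over the latent code z."""
    z = torch.zeros(z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.norm(A @ G(z) - y, p=1)   # l1 residual
        loss.backward()
        opt.step()
    return G(z).detach()
```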