January 27, 2020

3146 words 15 mins read

Paper Group ANR 1069

A novel active learning-based Gaussian process metamodelling strategy for estimating the full probability distribution in forward UQ analysis. Machine learning for subgroup discovery under treatment effect. Unsupervised Domain Adaptation using Graph Transduction Games. ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Autom …

A novel active learning-based Gaussian process metamodelling strategy for estimating the full probability distribution in forward UQ analysis

Title A novel active learning-based Gaussian process metamodelling strategy for estimating the full probability distribution in forward UQ analysis
Authors Ziqi Wang, Marco Broccardo
Abstract This paper proposes an active learning-based Gaussian process (AL-GP) metamodelling method to estimate the cumulative as well as complementary cumulative distribution function (CDF/CCDF) for forward uncertainty quantification (UQ) problems. Within the field of UQ, previous studies focused on developing AL-GP approaches for reliability (rare event probability) analysis of expensive black-box solvers. A naive iteration of these algorithms with respect to different CDF/CCDF threshold values would yield a discretized CDF/CCDF. However, this approach inevitably leads to a trade-off between accuracy and computational efficiency, since both depend (in opposite ways) on the selected discretization. In this study, a specialized error measure and a learning function are developed such that the resulting AL-GP method is able to efficiently estimate the CDF/CCDF for a specified range of interest without an explicit dependency on the discretization. In particular, the proposed AL-GP method is able to simultaneously provide accurate CDF and CCDF estimates in their median-to-low probability regions. Three numerical examples are introduced to test and verify the proposed method.
Tasks Active Learning
Published 2019-08-27
URL https://arxiv.org/abs/1908.10341v1
PDF https://arxiv.org/pdf/1908.10341v1.pdf
PWC https://paperswithcode.com/paper/a-novel-active-learning-based-gaussian
Repo
Framework
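
The paper's specific error measure and learning function are not given in the abstract, but the overall loop can be sketched with a generic U-function-style acquisition rule applied over a range of thresholds. The following Python sketch (with a toy `solver` standing in for the expensive black-box model, and hyper-parameters chosen arbitrarily) illustrates the general AL-GP idea, not the method proposed in the paper.

```python
# Generic active-learning GP surrogate for CDF estimation over a threshold range.
# The acquisition rule is a U-function-style stand-in, NOT the paper's criterion.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def solver(x):                                   # hypothetical expensive black-box model
    return np.sum(x ** 2, axis=1)

d = 2
thresholds = np.linspace(1.0, 8.0, 50)           # CDF/CCDF range of interest
X_pool = rng.normal(size=(2000, d))              # Monte Carlo population of inputs

idx = rng.choice(len(X_pool), size=10, replace=False)
X_train, y_train = X_pool[idx], solver(X_pool[idx])

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6, normalize_y=True)
for _ in range(30):                              # active-learning iterations
    gp.fit(X_train, y_train)
    mu, sigma = gp.predict(X_pool, return_std=True)
    # A point is informative if the sign of (model - threshold) is uncertain
    # for some threshold in the range of interest.
    U = np.min(np.abs(mu[:, None] - thresholds[None, :]), axis=1) / (sigma + 1e-12)
    x_new = X_pool[np.argmin(U)]
    X_train = np.vstack([X_train, x_new])
    y_train = np.append(y_train, solver(x_new[None, :]))

gp.fit(X_train, y_train)
mu = gp.predict(X_pool)                          # surrogate mean over the population
cdf = np.array([(mu <= t).mean() for t in thresholds])
print(np.round(cdf[::10], 3))                    # CDF estimate on the range of interest
```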

Machine learning for subgroup discovery under treatment effect

Title Machine learning for subgroup discovery under treatment effect
Authors Aleksey Buzmakov
Abstract In many practical tasks it is necessary to estimate the effect of a treatment at the individual level. For example, in medicine it is essential to determine which patients would benefit from a certain medication. In marketing, knowing which people are likely to buy a new product would reduce the amount of spam. In this chapter, we review methods for estimating an individual treatment effect from a randomized trial, i.e., an experiment in which some individuals receive a new treatment while the others do not. Finally, it is shown that new, more efficient methods are needed in this domain.
Tasks
Published 2019-02-27
URL http://arxiv.org/abs/1902.10327v1
PDF http://arxiv.org/pdf/1902.10327v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-for-subgroup-discovery-under
Repo
Framework
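
As one concrete example of the kind of estimator such a review covers, the sketch below implements the standard two-model ("T-learner") approach on synthetic randomized-trial data; it is a generic baseline shown only to illustrate the setting, not necessarily one of the chapter's recommended methods.

```python
# T-learner sketch for individual treatment effect (ITE) estimation from a
# randomized trial: fit one outcome model per arm and take their difference.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n, d = 5000, 5
X = rng.normal(size=(n, d))
T = rng.integers(0, 2, size=n)                   # randomized treatment assignment
true_ite = 2.0 * (X[:, 0] > 0)                   # synthetic heterogeneous effect
y = X[:, 1] + T * true_ite + rng.normal(scale=0.5, size=n)

model_treated = GradientBoostingRegressor().fit(X[T == 1], y[T == 1])
model_control = GradientBoostingRegressor().fit(X[T == 0], y[T == 0])

ite_hat = model_treated.predict(X) - model_control.predict(X)
print("mean absolute ITE error:", np.abs(ite_hat - true_ite).mean())

# Subgroup discovery use-case: target the individuals with the largest predicted uplift.
top_decile = np.argsort(-ite_hat)[: n // 10]
```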

Unsupervised Domain Adaptation using Graph Transduction Games

Title Unsupervised Domain Adaptation using Graph Transduction Games
Authors Sebastiano Vascon, Sinem Aslan, Alessandro Torcinovich, Twan van Laarhoven, Elena Marchiori, Marcello Pelillo
Abstract Unsupervised domain adaptation (UDA) amounts to assigning class labels to the unlabeled instances of a dataset from a target domain, using labeled instances of a dataset from a related source domain. In this paper, we propose to cast this problem in a game-theoretic setting as a non-cooperative game and introduce a fully automated iterative algorithm for UDA based on graph transduction games (GTG). The main advantages of this approach are its principled foundation, the guaranteed termination of the iterative algorithm at a Nash equilibrium (which corresponds to a consistent labeling condition), and soft labels quantifying the uncertainty of the label assignment process. We also investigate the beneficial effect of using pseudo-labels from linear classifiers to initialize the iterative process. The performance of the resulting methods is assessed on publicly available object recognition benchmark datasets involving both shallow and deep features. The experimental results demonstrate the suitability of the proposed game-theoretic approach for solving UDA tasks.
Tasks Domain Adaptation, Object Recognition, Unsupervised Domain Adaptation
Published 2019-05-06
URL https://arxiv.org/abs/1905.02036v1
PDF https://arxiv.org/pdf/1905.02036v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-domain-adaptation-using-graph
Repo
Framework
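
A common way to run a graph transduction game is via replicator dynamics over a similarity graph, with labeled (source) points clamped and unlabeled (target) points initialized from classifier pseudo-labels, as the abstract mentions. The sketch below follows that generic recipe; the similarity kernel, clamping scheme, and stopping rule are illustrative assumptions rather than the exact procedure of the paper.

```python
# Generic graph-transduction-game sketch: replicator dynamics on a Gaussian
# similarity graph, with source points clamped to their labels.
import numpy as np

def gtg_label_propagation(feats, probs, labeled_mask, labels, n_iter=100, sigma=1.0):
    """feats: (n, d) features; probs: (n, c) initial soft labels (e.g. from a
    linear classifier); labeled points are clamped to one-hot labels."""
    n, c = probs.shape
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    P = probs.copy()
    P[labeled_mask] = np.eye(c)[labels[labeled_mask]]        # clamp source labels
    for _ in range(n_iter):
        payoff = W @ P                        # expected payoff of each class (strategy)
        P = P * payoff                        # replicator update
        P /= P.sum(axis=1, keepdims=True) + 1e-12
        P[labeled_mask] = np.eye(c)[labels[labeled_mask]]
    return P                                  # soft labels for target instances

# Toy example: 3 labeled source points, 2 unlabeled target points, 2 classes.
feats = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [0.2, 0.5], [4.8, 5.1]])
labels = np.array([0, 0, 1, -1, -1])
labeled = labels >= 0
init = np.full((5, 2), 0.5)
print(gtg_label_propagation(feats, init, labeled, labels).round(2))
```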

ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair

Title ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair
Authors Thibaud Lutellier, Lawrence Pang, Viet Hung Pham, Moshi Wei, Lin Tan
Abstract Automated generate-and-validate (G&V) program repair techniques typically rely on hard-coded rules, only fix bugs following specific patterns, and are hard to adapt to different programming languages. We propose ENCORE, a new G&V technique, which uses ensemble learning on convolutional neural machine translation (NMT) models to automatically fix bugs in multiple programming languages. We take advantage of the randomness in hyper-parameter tuning to build multiple models that fix different bugs and combine them using ensemble learning. This new convolutional NMT approach outperforms the standard long short-term memory (LSTM) approach used in previous work, as it better captures both local and long-distance connections between tokens. Our evaluation on two popular benchmarks, Defects4J and QuixBugs, shows that ENCORE fixed 42 bugs, including 16 that have not been fixed by existing techniques. In addition, ENCORE is the first G&V repair technique to be applied to four popular programming languages (Java, C++, Python, and JavaScript), fixing a total of 67 bugs across five benchmarks.
Tasks Machine Translation
Published 2019-06-20
URL https://arxiv.org/abs/1906.08691v1
PDF https://arxiv.org/pdf/1906.08691v1.pdf
PWC https://paperswithcode.com/paper/encore-ensemble-learning-using-convolution
Repo
Framework
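
The ensemble side of a generate-and-validate pipeline can be sketched independently of the underlying translation models: collect candidate patches from several models, de-duplicate them, and keep only those that pass the test suite. In the sketch below the patch generators and test harness are hypothetical stubs; ENCORE's convolutional NMT models and benchmarks are not reproduced.

```python
# Generate-and-validate ensemble skeleton in the spirit of ENCORE. The patch
# generators and test harness are hypothetical stubs standing in for trained
# convolutional NMT models and a project's test suite.
from typing import Callable, List

def make_stub_model(tag: str) -> Callable[[str], List[str]]:
    # Stand-in for "translate buggy line -> ranked candidate patches".
    return lambda buggy_line: [buggy_line.replace("<", "<=") + tag]

def run_tests(patched_line: str) -> bool:
    # Stand-in for compiling the project and running its test suite.
    return "<=" in patched_line

def ensemble_repair(buggy_line: str, models: List[Callable[[str], List[str]]]) -> List[str]:
    candidates: List[str] = []
    for model in models:                      # each model tends to fix different bugs
        for patch in model(buggy_line):
            if patch not in candidates:       # de-duplicate across the ensemble
                candidates.append(patch)
    return [p for p in candidates if run_tests(p)]   # keep only validated patches

models = [make_stub_model(f"  // model {i}") for i in range(3)]
print(ensemble_repair("if (i < n)", models))
```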

GroSS: Group-Size Series Decomposition for Grouped Architecture Search

Title GroSS: Group-Size Series Decomposition for Grouped Architecture Search
Authors Henry Howard-Jenkins, Yiwen Li, Victor A. Prisacariu
Abstract We present a novel approach which is able to explore the configuration of grouped convolutions within neural networks. Group-size Series (GroSS) decomposition is a mathematical formulation of tensor factorisation into a series of approximations of increasing rank terms. GroSS allows for dynamic and differentiable selection of the factorisation rank, which is analogous to a grouped convolution. Therefore, to the best of our knowledge, GroSS is the first method to enable the simultaneous training of differing numbers of groups within a single layer, as well as of all possible combinations between layers. In doing so, GroSS is able to train an entire grouped-convolution architecture search space concurrently. We demonstrate this through architecture searches with performance objectives and evaluate its performance against conventional Block Term Decomposition. GroSS enables a more effective and efficient search for grouped convolutional architectures.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00673v2
PDF https://arxiv.org/pdf/1912.00673v2.pdf
PWC https://paperswithcode.com/paper/gross-group-size-series-decomposition-for
Repo
Framework
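
A loose way to picture "differentiable selection of the group size" is a layer that evaluates the same convolution at several group counts and mixes the outputs with learnable weights, as in the PyTorch sketch below. This is only an analogue under that assumption; GroSS itself derives the group-size series from a Block Term Decomposition, which is not reproduced here.

```python
# Loose PyTorch analogue of differentiable group-size selection: run a layer at
# several group counts and mix the outputs with learnable softmax weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupSizeMixture(nn.Module):
    def __init__(self, in_ch=32, out_ch=32, k=3, group_options=(1, 2, 4, 8)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2, groups=g) for g in group_options
        )
        self.logits = nn.Parameter(torch.zeros(len(group_options)))

    def forward(self, x):
        w = F.softmax(self.logits, dim=0)               # differentiable selection weights
        return sum(wi * conv(x) for wi, conv in zip(w, self.convs))

layer = GroupSizeMixture()
out = layer(torch.randn(1, 32, 16, 16))
print(out.shape)   # torch.Size([1, 32, 16, 16])
```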

Joint Learning of Unsupervised Object-Based Perception and Control

Title Joint Learning of Unsupervised Object-Based Perception and Control
Authors Minne Li, Pranav Nashikkar, Jun Wang
Abstract This paper is concerned with object-based perception control (OPC), which allows for joint optimization of hierarchical object-based perception and decision making. We define the OPC framework by extending the Bayesian brain hypothesis to support object-based latent representations and propose an unsupervised end-to-end solution method. We develop a practical algorithm and analyze the convergence of the perception model update. Experiments on a high-dimensional pixel environment justify the learning effectiveness of our object-based perception control approach.
Tasks Decision Making
Published 2019-03-04
URL https://arxiv.org/abs/1903.01385v2
PDF https://arxiv.org/pdf/1903.01385v2.pdf
PWC https://paperswithcode.com/paper/optimizing-object-based-perception-and
Repo
Framework

The Power of Graph Convolutional Networks to Distinguish Random Graph Models

Title The Power of Graph Convolutional Networks to Distinguish Random Graph Models
Authors Abram Magner, Mayank Baranwal, Alfred O. Hero III
Abstract Graph convolutional networks (GCNs) are a widely used method for graph representation learning. We investigate the power of GCNs, as a function of their number of layers, to distinguish between different random graph models on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We exhibit an infinite class of graphons that are well-separated in terms of cut distance and are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph, and furthermore show that, for this application, ReLU activation functions and non-identity weight matrices with non-negative entries do not help in terms of distinguishing power. These results theoretically match empirical observations of several prior works. Finally, we show that for pairs of graphons satisfying a degree profile separation property, a very simple GCN architecture suffices for distinguishability. To prove our results, we exploit a connection to random walks on graphs.
Tasks Graph Representation Learning, Representation Learning
Published 2019-10-28
URL https://arxiv.org/abs/1910.12954v1
PDF https://arxiv.org/pdf/1910.12954v1.pdf
PWC https://paperswithcode.com/paper/the-power-of-graph-convolutional-networks-to
Repo
Framework
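
For reference, the GCN propagation rule the analysis concerns is $H_{l+1} = \mathrm{ReLU}(\hat{A} H_l W_l)$ with $\hat{A} = D^{-1/2}(A+I)D^{-1/2}$. The sketch below applies it to graphs sampled from two toy graphons; the well-separated graphon class constructed in the paper is not reproduced.

```python
# Standard GCN propagation applied to graphs sampled from two toy graphons.
# Illustrative only; not the graphon construction from the paper.
import numpy as np

rng = np.random.default_rng(0)

def sample_graph(graphon, n):
    u = rng.uniform(size=n)
    P = graphon(u[:, None], u[None, :])
    A = (rng.uniform(size=(n, n)) < P).astype(float)
    A = np.triu(A, 1)
    return A + A.T

def gcn_embedding(A, n_layers=3, width=16):
    n = A.shape[0]
    A_hat = A + np.eye(n)
    d = A_hat.sum(1)
    A_norm = A_hat / np.sqrt(np.outer(d, d))              # D^{-1/2}(A+I)D^{-1/2}
    H = np.ones((n, 1))                                   # constant input features
    for _ in range(n_layers):
        W = rng.normal(size=(H.shape[1], width)) / np.sqrt(H.shape[1])
        H = np.maximum(A_norm @ H @ W, 0.0)               # ReLU(A_norm H W)
    return H.mean(axis=0)                                 # graph-level readout

erdos = lambda x, y: 0.3 * np.ones_like(x * y)            # constant graphon
blocks = lambda x, y: 0.6 * ((x < 0.5) ^ (y < 0.5))       # two-block graphon
print(gcn_embedding(sample_graph(erdos, 200))[:4])
print(gcn_embedding(sample_graph(blocks, 200))[:4])
```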

Collaborative Metric Learning with Memory Network for Multi-Relational Recommender Systems

Title Collaborative Metric Learning with Memory Network for Multi-Relational Recommender Systems
Authors Xiao Zhou, Danyang Liu, Jianxun Lian, Xing Xie
Abstract The success of recommender systems in modern online platforms is inseparable from the accurate capture of users’ personal tastes. In everyday life, large amounts of user feedback data are created along with user-item online interactions in a variety of ways, such as browsing, purchasing, and sharing. These multiple types of user feedback provide us with tremendous opportunities to detect individuals’ fine-grained preferences. Different from most existing recommender systems that rely on a single type of feedback, we advocate incorporating multiple types of user-item interactions for better recommendations. Based on the observation that the underlying spectrum of user preferences is reflected in various types of interactions with items and can be uncovered by latent relational learning in metric space, we propose a unified neural learning framework, named Multi-Relational Memory Network (MRMN). It can not only model fine-grained user-item relations but also enable us to discriminate between feedback types in terms of the strength and diversity of user preferences. Extensive experiments show that the proposed MRMN model outperforms competitive state-of-the-art algorithms in a wide range of scenarios, including e-commerce, local services, and job recommendations.
Tasks Metric Learning, Recommendation Systems, Relational Reasoning
Published 2019-06-24
URL https://arxiv.org/abs/1906.09882v1
PDF https://arxiv.org/pdf/1906.09882v1.pdf
PWC https://paperswithcode.com/paper/collaborative-metric-learning-with-memory
Repo
Framework

Driving Datasets Literature Review

Title Driving Datasets Literature Review
Authors Charles-Éric Noël Laflamme, François Pomerleau, Philippe Giguère
Abstract This report is a survey of the autonomous driving datasets that have been published to date. The first section introduces the many sensor types used in autonomous driving datasets. The second section investigates the calibration and synchronization procedures required to generate accurate data. The third section describes the diverse driving tasks explored by the datasets. Finally, the fourth section provides comprehensive lists of datasets, mainly in the form of tables.
Tasks Autonomous Driving, Calibration
Published 2019-10-26
URL https://arxiv.org/abs/1910.11968v1
PDF https://arxiv.org/pdf/1910.11968v1.pdf
PWC https://paperswithcode.com/paper/driving-datasets-literature-review
Repo
Framework

3D Instance Segmentation via Multi-Task Metric Learning

Title 3D Instance Segmentation via Multi-Task Metric Learning
Authors Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald
Abstract We propose a novel method for instance label segmentation of dense 3D voxel grids. We target volumetric scene representations, which have been acquired with depth sensors or multi-view stereo methods and which have been processed with semantic 3D reconstruction or scene completion methods. The main task is to learn shape information about individual object instances in order to accurately separate them, including connected and incompletely scanned objects. We solve the 3D instance-labeling problem with a multi-task learning strategy. The first goal is to learn an abstract feature embedding, which groups voxels with the same instance label close to each other while separating clusters with different instance labels from each other. The second goal is to learn instance information by densely estimating directional information of the instance’s center of mass for each voxel. This is particularly useful to find instance boundaries in the clustering post-processing step, as well as for scoring the segmentation quality for the first goal. Both synthetic and real-world experiments demonstrate the viability and merits of our approach. In fact, it achieves state-of-the-art performance on the ScanNet 3D instance segmentation benchmark.
Tasks 3D Instance Segmentation, 3D Reconstruction, Instance Segmentation, Metric Learning, Multi-Task Learning, Semantic Segmentation
Published 2019-06-20
URL https://arxiv.org/abs/1906.08650v2
PDF https://arxiv.org/pdf/1906.08650v2.pdf
PWC https://paperswithcode.com/paper/3d-instance-segmentation-via-multi-task
Repo
Framework
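
The first goal, an embedding that groups voxels of the same instance and separates different instances, is commonly written as a discriminative pull/push loss. The sketch below is a generic version of that idea with arbitrary margins; it is not the authors' exact loss, and the second goal (regressing directions to each instance's center of mass) is not shown.

```python
# Generic discriminative embedding loss: pull voxels toward their instance mean,
# push instance means apart. Margins and weighting are illustrative.
import torch

def discriminative_loss(embeddings, instance_ids, delta_pull=0.5, delta_push=1.5):
    """embeddings: (n_voxels, d); instance_ids: (n_voxels,) integer labels."""
    ids = instance_ids.unique()
    means = torch.stack([embeddings[instance_ids == i].mean(0) for i in ids])

    pull = 0.0
    for mean, i in zip(means, ids):
        dist = (embeddings[instance_ids == i] - mean).norm(dim=1)
        pull = pull + torch.clamp(dist - delta_pull, min=0).pow(2).mean()
    pull = pull / len(ids)

    push = 0.0
    if len(ids) > 1:
        pair_dist = torch.cdist(means, means)
        mask = ~torch.eye(len(ids), dtype=torch.bool)
        push = torch.clamp(2 * delta_push - pair_dist[mask], min=0).pow(2).mean()
    return pull + push

emb = torch.randn(1000, 8, requires_grad=True)      # toy voxel embeddings
ids = torch.randint(0, 5, (1000,))                  # toy instance labels
print(discriminative_loss(emb, ids))
```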

Adversarial Mahalanobis Distance-based Attentive Song Recommender for Automatic Playlist Continuation

Title Adversarial Mahalanobis Distance-based Attentive Song Recommender for Automatic Playlist Continuation
Authors Thanh Tran, Renee Sweeney, Kyumin Lee
Abstract In this paper, we aim to solve the automatic playlist continuation (APC) problem by modeling complex interactions among users, playlists, and songs using only their interaction data. Prior methods mainly rely on the dot product to measure similarity, which is not ideal because the dot product is not a proper metric and does not satisfy the triangle inequality. Based on this observation, we propose three novel deep learning approaches that utilize the Mahalanobis distance. Our first approach uses user-playlist-song interactions, and combines Mahalanobis distance scores between (i) a target user and a target song, and (ii) a target playlist and the target song, to account for both the user’s preference and the playlist’s theme. Our second approach measures song-song similarities by considering Mahalanobis distance scores between the target song and each member song (i.e., existing song) in the target playlist. The contribution of each distance score is weighted by our proposed memory metric-based attention mechanism. In the third approach, we fuse the two previous models into a unified model to further enhance their performance. In addition, we adopt and customize Adversarial Personalized Ranking (APR) for our three approaches to further improve their robustness and predictive capabilities. Through extensive experiments, we show that our proposed models outperform eight state-of-the-art models on two large-scale real-world datasets.
Tasks Metric Learning
Published 2019-06-08
URL https://arxiv.org/abs/1906.03450v1
PDF https://arxiv.org/pdf/1906.03450v1.pdf
PWC https://paperswithcode.com/paper/adversarial-mahalanobis-distance-based
Repo
Framework
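
The basic scoring idea, replacing dot products with Mahalanobis distances between user, playlist, and song embeddings, can be sketched in a few lines. The diagonal metric, embedding shapes, and combination rule below are illustrative assumptions; the memory-based attention over member songs and the APR training are omitted.

```python
# Sketch of Mahalanobis-distance scoring between (user, playlist, song)
# embeddings with a learned diagonal metric. Shapes and names are assumptions.
import torch

d = 32
user = torch.randn(d)                             # toy user embedding
playlist = torch.randn(d)                         # toy playlist embedding
songs = torch.randn(100, d)                       # candidate song embeddings
metric_diag = torch.nn.Parameter(torch.ones(d))   # learnable diagonal metric

def mahalanobis(a, b):
    diff = a - b
    return (diff * metric_diag.clamp(min=0) * diff).sum(-1)

# Lower combined distance == better candidate for this user's playlist.
score = mahalanobis(user, songs) + mahalanobis(playlist, songs)
top10 = torch.topk(-score, k=10).indices
print(top10)
```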

Comments on “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?”

Title Comments on “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?”
Authors Talha Cihad Gulcu, Alper Gungor
Abstract In a recently published paper [1], it is shown that deep neural networks (DNNs) with random Gaussian weights preserve the metric structure of the data, with the property that the distance shrinks more when the angle between the two data points is smaller. We agree that the random projection setup considered in [1] preserves distances with high probability. However, in our view, the relation between the angle of the data points and the output distances is quite the opposite, i.e., smaller angles result in weaker distance shrinkage. This leads us to conclude that Theorem 3 and Figure 5 in [1] are not accurate. Hence the use of random Gaussian weights in DNNs cannot provide the ability to perform universal classification or to treat in-class and out-of-class data separately. Consequently, the behavior of networks consisting only of random Gaussian weights is not sufficient to explain how DNNs achieve state-of-the-art results in a large variety of problems.
Tasks
Published 2019-01-08
URL http://arxiv.org/abs/1901.02182v2
PDF http://arxiv.org/pdf/1901.02182v2.pdf
PWC https://paperswithcode.com/paper/comments-on-deep-neural-networks-with-random
Repo
Framework
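
The quantity under dispute, how much a random Gaussian layer followed by a ReLU shrinks the distance between two inputs as a function of the angle between them, can be measured directly in a small numerical experiment. The sketch below only computes that ratio for a few angles; it does not reproduce the proofs of either paper.

```python
# Small numerical check: output/input distance ratio after one random-Gaussian
# + ReLU layer, for pairs of unit vectors at different angles.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, trials = 256, 1024, 50

for angle in (np.pi / 16, np.pi / 4, np.pi / 2):
    x = np.zeros(d_in); x[0] = 1.0
    y = np.zeros(d_in); y[0] = np.cos(angle); y[1] = np.sin(angle)
    ratios = []
    for _ in range(trials):
        W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_out)
        fx, fy = np.maximum(W @ x, 0), np.maximum(W @ y, 0)
        ratios.append(np.linalg.norm(fx - fy) / np.linalg.norm(x - y))
    print(f"angle={angle:.2f}  mean output/input distance ratio={np.mean(ratios):.3f}")
```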

A Theory of Selective Prediction

Title A Theory of Selective Prediction
Authors Mingda Qiao, Gregory Valiant
Abstract We consider a model of selective prediction, where the prediction algorithm is given a data sequence in an online fashion and asked to predict a pre-specified statistic of the upcoming data points. The algorithm is allowed to choose when to make the prediction as well as the length of the prediction window, possibly depending on the observations so far. We prove that, even without any distributional assumption on the input data stream, a large family of statistics can be estimated to non-trivial accuracy. To give one concrete example, suppose that we are given access to an arbitrary binary sequence $x_1, \ldots, x_n$ of length $n$. Our goal is to accurately predict the average observation, and we are allowed to choose the window over which the prediction is made: for some $t < n$ and $m \le n - t$, after seeing $t$ observations we predict the average of $x_{t+1}, \ldots, x_{t+m}$. This particular problem was first studied in Drucker (2013) and referred to as the “density prediction game”. We show that the expected squared error of our prediction can be bounded by $O(\frac{1}{\log n})$ and prove a matching lower bound, which resolves an open question raised in Drucker (2013). This result holds for any sequence (that is not adaptive to when the prediction is made, or the predicted value), and the expectation of the error is with respect to the randomness of the prediction algorithm. Our results apply to more general statistics of a sequence of observations, and we highlight several open directions for future work.
Tasks
Published 2019-02-12
URL https://arxiv.org/abs/1902.04256v3
PDF https://arxiv.org/pdf/1902.04256v3.pdf
PWC https://paperswithcode.com/paper/a-theory-of-selective-prediction
Repo
Framework
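
The density prediction game mentioned in the abstract is easy to simulate: fix a binary sequence, choose when to predict and the window length, and measure the squared error of predicting the window average. The strategy in the sketch below (pick a random dyadic window and predict the mean of the window just observed) is a simple baseline for illustration, not the algorithm that achieves the $O(\frac{1}{\log n})$ bound.

```python
# Toy simulation of the density prediction game with a naive randomized strategy.
import numpy as np

rng = np.random.default_rng(0)
n = 2 ** 12
x = (np.arange(n) % 3 == 0).astype(float)        # any fixed binary sequence

errors = []
for _ in range(2000):
    k = rng.integers(1, int(np.log2(n)))         # random dyadic scale
    m = 2 ** k                                   # prediction window length
    t = m                                        # observe one window of length m first
    prediction = x[:t].mean()                    # predict the mean just observed
    truth = x[t:t + m].mean()                    # actual average of the next m points
    errors.append((prediction - truth) ** 2)
print("expected squared error:", np.mean(errors))
```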

Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories

Title Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories
Authors Francesco Alemanno, Martino Centonze, Alberto Fachechi
Abstract Recently, Hopfield and Krotov introduced the concept of dense associative memories (DAM), close to spin glasses with $P$-wise interactions in disordered statistical-mechanics jargon: they proved a number of remarkable properties of these networks and suggested their use to (partially) explain the success of the new generation of Artificial Intelligence. Thanks to a remarkable ante-litteram analysis by Baldi & Venkatesh, it is known that these networks can store a maximal number of patterns $K$ scaling as $K \sim N^{P-1}$. In this paper, after introducing a minimal dense associative network as one of the most elementary cost functions in this class of DAM, we sacrifice this high-load regime, forcing the storage of only a linear number of patterns, i.e. $K = \alpha N$ (with $\alpha>0$), to prove that in this regime these networks can correctly perform pattern recognition even when the pattern signal is $O(1)$ and is embedded in a sea of noise of order $O(\sqrt{N})$, also in the large-$N$ limit. To prove this statement, we extremize the quenched free energy of the model over its natural order parameters (the various magnetizations and overlaps) and derive its phase diagram at the replica-symmetric level of description and in the thermodynamic limit; as a sideline, aiming at cross-fertilization among disciplines, we follow two standard routes in the statistical mechanics of spin glasses, namely the replica trick and the interpolation technique. Both approaches reach the same conclusion: there is a non-empty region, in the noise-$T$ versus load-$\alpha$ phase-diagram plane, where these networks can actually work in this challenging regime; in particular, we obtain a rather high critical (linear) load in the (fast-)noiseless case, $\lim_{\beta \to \infty}\alpha_c(\beta)=0.65$.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.00666v1
PDF https://arxiv.org/pdf/1912.00666v1.pdf
PWC https://paperswithcode.com/paper/interpolating-between-boolean-and-extremely
Repo
Framework
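
For context, a dense associative memory with $P$-wise interactions can be taken to have energy $E(\sigma) = -\sum_{\mu}(\xi^{\mu}\cdot\sigma)^{P}$ (the Krotov-Hopfield form), and recall amounts to descending this energy from a corrupted pattern. The numpy sketch below illustrates that mechanism only; the paper's low-load ($K=\alpha N$), high-noise analysis and the replica/interpolation computations are not reproduced.

```python
# Minimal dense-associative-memory recall: P-wise energy minimized by
# asynchronous spin flips, starting from a corrupted stored pattern.
import numpy as np

rng = np.random.default_rng(0)
N, K, P = 200, 20, 3                              # neurons, patterns, interaction order
xi = rng.choice([-1.0, 1.0], size=(K, N))         # stored patterns

def energy(sigma):
    return -np.sum((xi @ sigma) ** P)

sigma = xi[0].copy()
flip = rng.choice(N, size=N // 4, replace=False)  # corrupt 25% of the pattern
sigma[flip] *= -1

for _ in range(5):                                # asynchronous recall sweeps
    for i in rng.permutation(N):
        sigma[i] = 1.0
        e_plus = energy(sigma)
        sigma[i] = -1.0
        e_minus = energy(sigma)
        sigma[i] = 1.0 if e_plus <= e_minus else -1.0

print("overlap with stored pattern:", (sigma @ xi[0]) / N)
```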

Infant-Prints: Fingerprints for Reducing Infant Mortality

Title Infant-Prints: Fingerprints for Reducing Infant Mortality
Authors Joshua J. Engelsma, Debayan Deb, Anil K. Jain, Prem S. Sudhish, Anjoo Bhatnager
Abstract In developing countries around the world, a multitude of infants continue to suffer and die from vaccine-preventable diseases and malnutrition. Lamentably, the lack of any official identification documentation makes it exceedingly difficult to prevent these infant deaths. To address this global crisis, we propose Infant-Prints, which comprises (i) a custom, compact, low-cost (85 USD), high-resolution (1,900 ppi) fingerprint reader, (ii) a high-resolution fingerprint matcher, and (iii) a mobile application for infant fingerprint search and verification. Using Infant-Prints, we have collected a longitudinal database of infant fingerprints and demonstrate its ability to perform accurate and reliable recognition of infants enrolled at ages 0-3 months, in time for effective delivery of critical vaccinations and nutritional supplements (TAR=90% @ FAR=0.1% for infants older than 8 weeks).
Tasks
Published 2019-04-01
URL http://arxiv.org/abs/1904.01091v1
PDF http://arxiv.org/pdf/1904.01091v1.pdf
PWC https://paperswithcode.com/paper/infant-prints-fingerprints-for-reducing
Repo
Framework
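
The reported operating point (TAR = 90% at FAR = 0.1%) is read off a pair of score distributions in the standard way: set the match threshold so that only 0.1% of impostor comparisons pass, then measure the fraction of genuine comparisons that pass. The sketch below does exactly that on synthetic scores; the actual Infant-Prints matcher and data are not involved.

```python
# Reading a TAR @ FAR operating point from genuine/impostor match scores.
# The score distributions here are synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(0)
genuine = rng.normal(0.7, 0.1, size=5000)        # same-infant comparison scores
impostor = rng.normal(0.3, 0.1, size=50000)      # different-infant comparison scores

far_target = 0.001
threshold = np.quantile(impostor, 1 - far_target)   # only 0.1% of impostors pass
tar = (genuine >= threshold).mean()                 # fraction of genuines that pass
print(f"threshold={threshold:.3f}  TAR at FAR={far_target:.1%}: {tar:.1%}")
```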