Paper Group ANR 1069
A novel active learning-based Gaussian process metamodelling strategy for estimating the full probability distribution in forward UQ analysis
Title | A novel active learning-based Gaussian process metamodelling strategy for estimating the full probability distribution in forward UQ analysis |
Authors | Ziqi Wang, Marco Broccardo |
Abstract | This paper proposes an active learning-based Gaussian process (AL-GP) metamodelling method to estimate the cumulative as well as the complementary cumulative distribution function (CDF/CCDF) for forward uncertainty quantification (UQ) problems. Within the field of UQ, previous studies focused on developing AL-GP approaches for reliability (rare event probability) analysis of expensive black-box solvers. A naive iteration of these algorithms with respect to different CDF/CCDF threshold values would yield a discretized CDF/CCDF. However, this approach inevitably leads to a trade-off between accuracy and computational efficiency, since both depend (in opposite ways) on the selected discretization. In this study, a specialized error measure and a learning function are developed such that the resulting AL-GP method is able to efficiently estimate the CDF/CCDF for a specified range of interest without an explicit dependency on discretization. In particular, the proposed AL-GP method is able to simultaneously provide accurate CDF and CCDF estimates in their median-to-low probability regions. Three numerical examples are introduced to test and verify the proposed method. |
Tasks | Active Learning |
Published | 2019-08-27 |
URL | https://arxiv.org/abs/1908.10341v1 |
https://arxiv.org/pdf/1908.10341v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-active-learning-based-gaussian |
Repo | |
Framework | |
Machine learning for subgroup discovery under treatment effect
Title | Machine learning for subgroup discovery under treatment effect |
Authors | Aleksey Buzmakov |
Abstract | Many practical tasks require estimating the effect of a treatment at the individual level. For example, in medicine it is essential to determine which patients would benefit from a certain medicament. In marketing, knowing which people are likely to buy a new product would reduce the amount of spam. In this chapter, we review methods for estimating an individual treatment effect from a randomized trial, i.e., an experiment in which some individuals receive a new treatment while the others do not. Finally, it is shown that new, efficient methods are needed in this domain. |
Tasks | |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10327v1 |
http://arxiv.org/pdf/1902.10327v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-for-subgroup-discovery-under |
Repo | |
Framework | |
Unsupervised Domain Adaptation using Graph Transduction Games
Title | Unsupervised Domain Adaptation using Graph Transduction Games |
Authors | Sebastiano Vascon, Sinem Aslan, Alessandro Torcinovich, Twan van Laarhoven, Elena Marchiori, Marcello Pelillo |
Abstract | Unsupervised domain adaptation (UDA) amounts to assigning class labels to the unlabeled instances of a dataset from a target domain, using labeled instances of a dataset from a related source domain. In this paper, we propose to cast this problem in a game-theoretic setting as a non-cooperative game and introduce a fully automatic iterative algorithm for UDA based on graph transduction games (GTG). The main advantages of this approach are its principled foundation, the guaranteed convergence of the iterative algorithm to a Nash equilibrium (which corresponds to a consistent labeling condition), and soft labels quantifying the uncertainty of the label assignment process. We also investigate the beneficial effect of using pseudo-labels from linear classifiers to initialize the iterative process. The performance of the resulting methods is assessed on publicly available object recognition benchmark datasets involving both shallow and deep features. The experimental results demonstrate the suitability of the proposed game-theoretic approach for solving UDA tasks. |
Tasks | Domain Adaptation, Object Recognition, Unsupervised Domain Adaptation |
Published | 2019-05-06 |
URL | https://arxiv.org/abs/1905.02036v1 |
https://arxiv.org/pdf/1905.02036v1.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-domain-adaptation-using-graph |
Repo | |
Framework | |
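The iterative algorithm behind GTG is the replicator dynamics: each unlabeled instance (player) multiplies its label probabilities by the payoff it receives from its neighbors and renormalizes, while labeled instances stay clamped. A minimal sketch, with a hand-built block similarity matrix as an assumed toy graph (the paper's graph construction and pseudo-label initialization are not reproduced):

```python
import numpy as np

def gtg_labels(W, P0, labeled, n_iter=200):
    """Graph transduction via replicator dynamics.

    W: (n, n) non-negative similarity matrix; P0: (n, c) initial soft labels.
    labeled: indices whose rows of P0 stay fixed (one-hot source labels).
    Each step multiplies a player's label probabilities by its payoffs
    (W @ P) and renormalizes, driving the game toward a Nash equilibrium,
    i.e. a consistent labeling.
    """
    P = P0.copy()
    for _ in range(n_iter):
        P = P * (W @ P)                       # payoff-weighted update
        P /= P.sum(axis=1, keepdims=True)
        P[labeled] = P0[labeled]              # clamp labeled instances
    return P

# Toy example: two clusters of three nodes, one labeled node per cluster.
W = np.full((6, 6), 0.05)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
P0 = np.full((6, 2), 0.5)                     # unlabeled: uniform soft labels
P0[0] = [1.0, 0.0]                            # labeled: node 0 -> class 0
P0[3] = [0.0, 1.0]                            # labeled: node 3 -> class 1
P = gtg_labels(W, P0, labeled=[0, 3])
```

The returned rows of `P` are exactly the soft labels the abstract mentions: their entropy quantifies how uncertain each assignment is.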
ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair
Title | ENCORE: Ensemble Learning using Convolution Neural Machine Translation for Automatic Program Repair |
Authors | Thibaud Lutellier, Lawrence Pang, Viet Hung Pham, Moshi Wei, Lin Tan |
Abstract | Automated generate-and-validate (G&V) program repair techniques typically rely on hard-coded rules, only fix bugs following specific patterns, and are hard to adapt to different programming languages. We propose ENCORE, a new G&V technique, which uses ensemble learning on convolutional neural machine translation (NMT) models to automatically fix bugs in multiple programming languages. We take advantage of the randomness in hyper-parameter tuning to build multiple models that fix different bugs and combine them using ensemble learning. This new convolutional NMT approach outperforms the standard long short-term memory (LSTM) approach used in previous work, as it better captures both local and long-distance connections between tokens. Our evaluation on two popular benchmarks, Defects4J and QuixBugs, shows that ENCORE fixed 42 bugs, including 16 that have not been fixed by existing techniques. In addition, ENCORE is the first G&V repair technique to be applied to four popular programming languages (Java, C++, Python, and JavaScript), fixing a total of 67 bugs across five benchmarks. |
Tasks | Machine Translation |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08691v1 |
https://arxiv.org/pdf/1906.08691v1.pdf | |
PWC | https://paperswithcode.com/paper/encore-ensemble-learning-using-convolution |
Repo | |
Framework | |
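A generic sketch of the ensemble step in a generate-and-validate pipeline: several independently tuned models each propose a ranked list of candidate patches, the lists are pooled, and candidates are validated in pooled order. The reciprocal-rank pooling and the stub `validates` oracle are assumptions for illustration, not ENCORE's actual combination rule or test harness:

```python
from collections import defaultdict

def ensemble_rank(model_outputs):
    """Pool ranked candidate-patch lists from several models.

    model_outputs: one list per model, best candidate first. Each candidate
    earns reciprocal-rank credit from every model that proposed it, so
    patches that many models rank highly come out on top.
    """
    score = defaultdict(float)
    for candidates in model_outputs:
        for rank, patch in enumerate(candidates):
            score[patch] += 1.0 / (rank + 1)
    return sorted(score, key=score.get, reverse=True)

def repair(model_outputs, validates):
    """Generate-and-validate: return the first pooled patch passing the tests."""
    for patch in ensemble_rank(model_outputs):
        if validates(patch):
            return patch
    return None

# Toy run: three "models"; validation (e.g. the project's test suite)
# accepts only the patch "p2".
outputs = [["p1", "p2"], ["p2", "p3"], ["p1", "p3", "p2"]]
fix = repair(outputs, validates=lambda p: p == "p2")
```

The point of pooling is exactly what the abstract argues: models trained with different hyper-parameters fix different bugs, so the union of their candidate lists covers more bugs than any single model.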
GroSS: Group-Size Series Decomposition for Grouped Architecture Search
Title | GroSS: Group-Size Series Decomposition for Grouped Architecture Search |
Authors | Henry Howard-Jenkins, Yiwen Li, Victor A. Prisacariu |
Abstract | We present a novel approach for exploring the configuration of grouped convolutions within neural networks. Group-size Series (GroSS) decomposition is a mathematical formulation of tensor factorisation into a series of approximations of increasing rank. GroSS allows for dynamic and differentiable selection of the factorisation rank, which is analogous to choosing the number of groups in a grouped convolution. Therefore, to the best of our knowledge, GroSS is the first method that enables the simultaneous training of differing numbers of groups within a single layer, as well as of all possible combinations between layers. In doing so, GroSS is able to train an entire grouped-convolution architecture search space concurrently. We demonstrate this through architecture searches with performance objectives, and evaluate GroSS against conventional Block Term Decomposition. GroSS enables a more effective and efficient search for grouped convolutional architectures. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00673v2 |
https://arxiv.org/pdf/1912.00673v2.pdf | |
PWC | https://paperswithcode.com/paper/gross-group-size-series-decomposition-for |
Repo | |
Framework | |
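For context on the search space: a grouped convolution constrains channel mixing to a block-diagonal structure, and the group count is the knob that GroSS makes selectable. A minimal sketch with 1x1 kernels (the rank-series decomposition itself is not reproduced here; shapes and names are illustrative):

```python
import numpy as np

def grouped_conv1x1(x, W, groups):
    """1x1 grouped convolution: input channels are split into `groups` blocks
    and each output block mixes only its own input block, i.e. the overall
    channel-mixing matrix is block-diagonal. groups=1 recovers a full
    convolution; groups == C_in gives a depthwise one.

    x: (C_in, H, W) feature map; W: (groups, C_out//groups, C_in//groups).
    """
    xs = np.split(x, groups, axis=0)
    return np.concatenate(
        [np.einsum("oc,chw->ohw", W[g], xs[g]) for g in range(groups)], axis=0)

x = np.arange(16, dtype=float).reshape(4, 2, 2)   # 4 channels, 2x2 spatial
Wg = np.ones((2, 2, 2))                           # 2 groups, each mixing 2 -> 2
y = grouped_conv1x1(x, Wg, groups=2)
```

Fewer groups means a denser mixing matrix (more parameters, more expressive); more groups means a sparser one. GroSS searches over that trade-off per layer.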
Joint Learning of Unsupervised Object-Based Perception and Control
Title | Joint Learning of Unsupervised Object-Based Perception and Control |
Authors | Minne Li, Pranav Nashikkar, Jun Wang |
Abstract | This paper is concerned with object-based perception control (OPC), which allows for joint optimization of hierarchical object-based perception and decision making. We define the OPC framework by extending the Bayesian brain hypothesis to support object-based latent representations and propose an unsupervised end-to-end solution method. We develop a practical algorithm and analyze the convergence of the perception model update. Experiments on a high-dimensional pixel environment justify the learning effectiveness of our object-based perception control approach. |
Tasks | Decision Making |
Published | 2019-03-04 |
URL | https://arxiv.org/abs/1903.01385v2 |
https://arxiv.org/pdf/1903.01385v2.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-object-based-perception-and |
Repo | |
Framework | |
The Power of Graph Convolutional Networks to Distinguish Random Graph Models
Title | The Power of Graph Convolutional Networks to Distinguish Random Graph Models |
Authors | Abram Magner, Mayank Baranwal, Alfred O. Hero III |
Abstract | Graph convolutional networks (GCNs) are a widely used method for graph representation learning. We investigate the power of GCNs, as a function of their number of layers, to distinguish between different random graph models on the basis of the embeddings of their sample graphs. In particular, the graph models that we consider arise from graphons, which are the most general possible parameterizations of infinite exchangeable graph models and which are the central objects of study in the theory of dense graph limits. We exhibit an infinite class of graphons that are well-separated in terms of cut distance and are indistinguishable by a GCN with nonlinear activation functions coming from a certain broad class if its depth is at least logarithmic in the size of the sample graph, and furthermore show that, for this application, ReLU activation functions and non-identity weight matrices with non-negative entries do not help in terms of distinguishing power. These results theoretically match empirical observations of several prior works. Finally, we show that for pairs of graphons satisfying a degree profile separation property, a very simple GCN architecture suffices for distinguishability. To prove our results, we exploit a connection to random walks on graphs. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12954v1 |
https://arxiv.org/pdf/1910.12954v1.pdf | |
PWC | https://paperswithcode.com/paper/the-power-of-graph-convolutional-networks-to |
Repo | |
Framework | |
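The pipeline the abstract studies (sample a graph from a random graph model, run a GCN, pool node embeddings, compare embeddings across models) can be sketched as follows. This uses non-negative random weights and an Erdős–Rényi pair as an easy, clearly separated case; the graphon constructions, depth thresholds, and indistinguishability results of the paper are not reproduced:

```python
import numpy as np

def gcn_embed(A, depth=2, width=4, seed=0):
    """Untrained GCN on one sample graph: each layer averages features over
    neighbors (one random-walk step, the connection exploited in the paper),
    applies a random non-negative weight matrix, then ReLU; node embeddings
    are mean-pooled into a graph-level embedding."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    A_hat = A + np.eye(n)
    A_hat /= A_hat.sum(axis=1, keepdims=True)     # row-stochastic: random walk
    H = A.sum(axis=1, keepdims=True) / n          # normalized degrees as input
    for _ in range(depth):
        W = np.abs(rng.standard_normal((H.shape[1], width)))
        H = np.maximum(A_hat @ H @ W, 0.0)
    return H.mean(axis=0)

def er_graph(n, p, seed):
    rng = np.random.default_rng(seed)
    U = np.triu((rng.random((n, n)) < p).astype(float), 1)
    return U + U.T

emb_sparse = gcn_embed(er_graph(200, 0.1, seed=1))
emb_dense = gcn_embed(er_graph(200, 0.5, seed=1))
```

These two models have well-separated degree profiles, so, consistent with the paper's positive result, even this shallow untrained GCN distinguishes them.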
Collaborative Metric Learning with Memory Network for Multi-Relational Recommender Systems
Title | Collaborative Metric Learning with Memory Network for Multi-Relational Recommender Systems |
Authors | Xiao Zhou, Danyang Liu, Jianxun Lian, Xing Xie |
Abstract | The success of recommender systems in modern online platforms is inseparable from the accurate capture of users’ personal tastes. In everyday life, large amounts of user feedback data are created along with user-item online interactions in a variety of ways, such as browsing, purchasing, and sharing. These multiple types of user feedback provide us with tremendous opportunities to detect individuals’ fine-grained preferences. Different from most existing recommender systems that rely on a single type of feedback, we advocate incorporating multiple types of user-item interactions for better recommendations. Based on the observation that the underlying spectrum of user preferences is reflected in various types of interactions with items and can be uncovered by latent relational learning in metric space, we propose a unified neural learning framework, named Multi-Relational Memory Network (MRMN). It can not only model fine-grained user-item relations but also enable us to discriminate between feedback types in terms of the strength and diversity of user preferences. Extensive experiments show that the proposed MRMN model outperforms competitive state-of-the-art algorithms in a wide range of scenarios, including e-commerce, local services, and job recommendations. |
Tasks | Metric Learning, Recommendation Systems, Relational Reasoning |
Published | 2019-06-24 |
URL | https://arxiv.org/abs/1906.09882v1 |
https://arxiv.org/pdf/1906.09882v1.pdf | |
PWC | https://paperswithcode.com/paper/collaborative-metric-learning-with-memory |
Repo | |
Framework | |
Driving Datasets Literature Review
Title | Driving Datasets Literature Review |
Authors | Charles-Éric Noël Laflamme, François Pomerleau, Philippe Giguère |
Abstract | This report is a survey of the autonomous driving datasets published to date. The first section introduces the many sensor types used in autonomous driving datasets. The second section investigates the calibration and synchronization procedures required to generate accurate data. The third section describes the diverse driving tasks explored by the datasets. Finally, the fourth section provides comprehensive lists of the datasets, mainly in the form of tables. |
Tasks | Autonomous Driving, Calibration |
Published | 2019-10-26 |
URL | https://arxiv.org/abs/1910.11968v1 |
https://arxiv.org/pdf/1910.11968v1.pdf | |
PWC | https://paperswithcode.com/paper/driving-datasets-literature-review |
Repo | |
Framework | |
3D Instance Segmentation via Multi-Task Metric Learning
Title | 3D Instance Segmentation via Multi-Task Metric Learning |
Authors | Jean Lahoud, Bernard Ghanem, Marc Pollefeys, Martin R. Oswald |
Abstract | We propose a novel method for instance label segmentation of dense 3D voxel grids. We target volumetric scene representations, which have been acquired with depth sensors or multi-view stereo methods and which have been processed with semantic 3D reconstruction or scene completion methods. The main task is to learn shape information about individual object instances in order to accurately separate them, including connected and incompletely scanned objects. We solve the 3D instance-labeling problem with a multi-task learning strategy. The first goal is to learn an abstract feature embedding, which groups voxels with the same instance label close to each other while separating clusters with different instance labels from each other. The second goal is to learn instance information by densely estimating directional information of the instance’s center of mass for each voxel. This is particularly useful for finding instance boundaries in the clustering post-processing step, as well as for scoring the segmentation quality for the first goal. Both synthetic and real-world experiments demonstrate the viability and merits of our approach. In fact, it achieves state-of-the-art performance on the ScanNet 3D instance segmentation benchmark. |
Tasks | 3D Instance Segmentation, 3D Reconstruction, Instance Segmentation, Metric Learning, Multi-Task Learning, Semantic Segmentation |
Published | 2019-06-20 |
URL | https://arxiv.org/abs/1906.08650v2 |
https://arxiv.org/pdf/1906.08650v2.pdf | |
PWC | https://paperswithcode.com/paper/3d-instance-segmentation-via-multi-task |
Repo | |
Framework | |
Adversarial Mahalanobis Distance-based Attentive Song Recommender for Automatic Playlist Continuation
Title | Adversarial Mahalanobis Distance-based Attentive Song Recommender for Automatic Playlist Continuation |
Authors | Thanh Tran, Renee Sweeney, Kyumin Lee |
Abstract | In this paper, we aim to solve the automatic playlist continuation (APC) problem by modeling complex interactions among users, playlists, and songs using only their interaction data. Prior methods mainly rely on the dot product to account for similarity, which is not ideal because the dot product is not a metric and therefore does not satisfy the triangle inequality. Based on this observation, we propose three novel deep learning approaches that utilize the Mahalanobis distance. Our first approach uses user-playlist-song interactions and combines Mahalanobis distance scores between (i) a target user and a target song, and (ii) a target playlist and the target song, to account for both the user’s preference and the playlist’s theme. Our second approach measures song-song similarities by considering Mahalanobis distance scores between the target song and each member song (i.e., existing song) in the target playlist. The contribution of each distance score is weighted by our proposed memory metric-based attention mechanism. In the third approach, we fuse the two previous models into a unified model to further enhance their performance. In addition, we adopt and customize Adversarial Personalized Ranking (APR) for all three approaches to further improve their robustness and predictive capabilities. Through extensive experiments, we show that our proposed models outperform eight state-of-the-art models on two large-scale real-world datasets. |
Tasks | Metric Learning |
Published | 2019-06-08 |
URL | https://arxiv.org/abs/1906.03450v1 |
https://arxiv.org/pdf/1906.03450v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-mahalanobis-distance-based |
Repo | |
Framework | |
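The scoring idea in the first approach can be sketched directly. The embeddings and the metric below are placeholders (in the paper both are learned, and the attention mechanism and adversarial training are omitted here); the key detail shown is parameterizing the metric as `M = L @ L.T` so it stays positive semidefinite:

```python
import numpy as np

def mahalanobis_score(a, b, L):
    """Negative Mahalanobis distance between two embeddings, with the metric
    parameterized as M = L @ L.T so it is positive semidefinite by
    construction: d_M(a, b)^2 = (a - b) M (a - b)^T = ||(a - b) L||^2."""
    d = (a - b) @ L
    return -np.sqrt(d @ d)

def apc_score(user, playlist, song, L):
    """Combine user-song and playlist-song distances so a candidate song must
    match both the user's taste and the playlist's theme."""
    return mahalanobis_score(user, song, L) + mahalanobis_score(playlist, song, L)

L = np.eye(3)                                  # in practice L would be learned
user = np.array([1.0, 0.0, 0.0])
playlist = np.array([1.0, 0.2, 0.0])
on_theme = np.array([0.9, 0.1, 0.0])           # close to both user and playlist
off_theme = np.array([-1.0, 1.0, 1.0])         # far from both
```

Unlike a dot product, this score is built from a true distance, so rankings inherit the triangle inequality the abstract argues for.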
Comments on “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?”
Title | Comments on “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?” |
Authors | Talha Cihad Gulcu, Alper Gungor |
Abstract | In a recently published paper [1], it is shown that deep neural networks (DNNs) with random Gaussian weights preserve the metric structure of the data, with the property that the distance shrinks more when the angle between the two data points is smaller. We agree that the random projection setup considered in [1] preserves distances with high probability. In our analysis, however, the relation between the angle of the data points and the output distances is quite the opposite: smaller angles result in weaker distance shrinkage. This leads us to conclude that Theorem 3 and Figure 5 in [1] are not accurate. Hence, the use of random Gaussian weights in DNNs cannot provide universal classification or treat in-class and out-of-class data separately. Consequently, the behavior of networks consisting solely of random Gaussian weights is not useful for explaining how DNNs achieve state-of-the-art results in a large variety of problems. |
Tasks | |
Published | 2019-01-08 |
URL | http://arxiv.org/abs/1901.02182v2 |
http://arxiv.org/pdf/1901.02182v2.pdf | |
PWC | https://paperswithcode.com/paper/comments-on-deep-neural-networks-with-random |
Repo | |
Framework | |
A Theory of Selective Prediction
Title | A Theory of Selective Prediction |
Authors | Mingda Qiao, Gregory Valiant |
Abstract | We consider a model of selective prediction, where the prediction algorithm is given a data sequence in an online fashion and asked to predict a pre-specified statistic of the upcoming data points. The algorithm is allowed to choose when to make the prediction as well as the length of the prediction window, possibly depending on the observations so far. We prove that, even without any distributional assumption on the input data stream, a large family of statistics can be estimated to non-trivial accuracy. To give one concrete example, suppose that we are given access to an arbitrary binary sequence $x_1, \ldots, x_n$ of length $n$. Our goal is to accurately predict the average observation, and we are allowed to choose the window over which the prediction is made: for some $t < n$ and $m \le n - t$, after seeing $t$ observations we predict the average of $x_{t+1}, \ldots, x_{t+m}$. This particular problem was first studied in Drucker (2013) and referred to as the “density prediction game”. We show that the expected squared error of our prediction can be bounded by $O(\frac{1}{\log n})$ and prove a matching lower bound, which resolves an open question raised in Drucker (2013). This result holds for any sequence (that is not adaptive to when the prediction is made, or the predicted value), and the expectation of the error is with respect to the randomness of the prediction algorithm. Our results apply to more general statistics of a sequence of observations, and we highlight several open directions for future work. |
Tasks | |
Published | 2019-02-12 |
URL | https://arxiv.org/abs/1902.04256v3 |
https://arxiv.org/pdf/1902.04256v3.pdf | |
PWC | https://paperswithcode.com/paper/a-theory-of-selective-prediction |
Repo | |
Framework | |
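The density prediction game in the abstract is easy to simulate. The strategy below (pick a random dyadic window length, predict the mean of the last such window for the next one) is a simple randomized strategy in the same spirit, not the paper's rate-optimal procedure, and the fixed periodic sequence is an arbitrary choice for illustration:

```python
import numpy as np

def density_prediction(x, rng):
    """One play of the density prediction game on a binary sequence x:
    choose a random dyadic window size m = 2^k and a prediction time t,
    predict the mean of the next m values as the mean of the last m
    observed values, and return the squared error of that prediction."""
    n = len(x)
    k = rng.integers(1, int(np.log2(n)))      # random dyadic scale
    m = 2 ** k
    t = rng.integers(m, n - m + 1)            # need m past and m future values
    pred = x[t - m:t].mean()
    truth = x[t:t + m].mean()
    return (pred - truth) ** 2

rng = np.random.default_rng(0)
x = (np.arange(4096) % 3 == 0).astype(float)  # an arbitrary fixed binary sequence
err = np.mean([density_prediction(x, rng) for _ in range(2000)])
```

Note that all the randomness is in the algorithm's choice of window, exactly as in the setting of the paper: the sequence itself is fixed and makes no distributional assumption.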
Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories
Title | Interpolating between boolean and extremely high noisy patterns through Minimal Dense Associative Memories |
Authors | Francesco Alemanno, Martino Centonze, Alberto Fachechi |
Abstract | Recently, Hopfield and Krotov introduced the concept of {\em dense associative memories} [DAM] (close to spin glasses with $P$-wise interactions in disordered statistical-mechanics jargon): they proved a number of remarkable features these networks share and suggested their use to (partially) explain the success of the new generation of Artificial Intelligence. Thanks to a remarkable ante-litteram analysis by Baldi & Venkatesh, it is known that, among these properties, such networks can handle a maximal amount of stored patterns $K$ scaling as $K \sim N^{P-1}$. In this paper, after introducing a {\em minimal dense associative network} as one of the most elementary cost functions falling in this class of DAM, we sacrifice this high-load regime, namely we force the storage of {\em solely} a linear amount of patterns, i.e. $K = \alpha N$ (with $\alpha>0$), to prove that, in this regime, these networks can correctly perform pattern recognition even if the pattern signal is $O(1)$ and embedded in a sea of noise of order $O(\sqrt{N})$, also in the large-$N$ limit. To prove this statement, by extremizing the quenched free energy of the model over its natural order parameters (the various magnetizations and overlaps), we derive its phase diagram at the replica-symmetric level of description and in the thermodynamic limit; as a sideline, aiming at cross-fertilization among disciplines, we follow the two dominant routes in the statistical mechanics of spin glasses, namely the replica trick and the interpolation technique. Both approaches reach the same conclusion: there is a non-empty region, in the noise-$T$ vs. load-$\alpha$ phase-diagram plane, where these networks can actually work in this challenging regime; in particular, we obtain a rather high critical (linear) load in the (fast) noiseless case, resulting in $\lim_{\beta \to \infty}\alpha_c(\beta)=0.65$. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00666v1 |
https://arxiv.org/pdf/1912.00666v1.pdf | |
PWC | https://paperswithcode.com/paper/interpolating-between-boolean-and-extremely |
Repo | |
Framework | |
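A toy instance of the $P$-wise energy the abstract refers to, with greedy zero-temperature dynamics. This illustrates only the dense-associative-memory retrieval mechanism; the paper's minimal-DAM cost function, its $O(\sqrt{N})$-noise regime, and the replica/interpolation analysis are not reproduced, and all sizes below are arbitrary:

```python
import numpy as np

def dam_energy(sigma, Xi, P=4):
    """Dense associative memory energy: -sum_mu (xi_mu . sigma)^P.
    P = 2 recovers the classical (pairwise) Hopfield network."""
    return -np.sum((Xi @ sigma).astype(float) ** P)

def recall(sigma, Xi, P=4, sweeps=5):
    """Zero-temperature dynamics: flip each spin iff it lowers the energy."""
    sigma = sigma.copy()
    for _ in range(sweeps):
        for i in range(len(sigma)):
            trial = sigma.copy()
            trial[i] = -trial[i]
            if dam_energy(trial, Xi, P) < dam_energy(sigma, Xi, P):
                sigma = trial
    return sigma

rng = np.random.default_rng(1)
N, K = 60, 3                                   # linear load K = alpha*N, alpha small
Xi = rng.choice([-1, 1], size=(K, N))          # stored +/-1 patterns
noisy = Xi[0].copy()
noisy[:5] *= -1                                # corrupt 5 spins of pattern 0
recovered = recall(noisy, Xi)
```

Because the $P$-th power sharpens the energy landscape around each stored pattern, the corrupted input is driven back to the nearest stored pattern.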
Infant-Prints: Fingerprints for Reducing Infant Mortality
Title | Infant-Prints: Fingerprints for Reducing Infant Mortality |
Authors | Joshua J. Engelsma, Debayan Deb, Anil K. Jain, Prem S. Sudhish, Anjoo Bhatnager |
Abstract | In developing countries around the world, a multitude of infants continue to suffer and die from vaccine-preventable diseases and malnutrition. Lamentably, the lack of any official identification documentation makes it exceedingly difficult to prevent these infant deaths. To address this global crisis, we propose Infant-Prints, which comprises (i) a custom, compact, low-cost (85 USD), high-resolution (1,900 ppi) fingerprint reader, (ii) a high-resolution fingerprint matcher, and (iii) a mobile application for searching and verifying infant fingerprints. Using Infant-Prints, we have collected a longitudinal database of infant fingerprints and demonstrate its ability to perform accurate and reliable recognition of infants enrolled at ages 0-3 months, in time for effective delivery of critical vaccinations and nutritional supplements (TAR = 90% @ FAR = 0.1% for infants older than 8 weeks). |
Tasks | |
Published | 2019-04-01 |
URL | http://arxiv.org/abs/1904.01091v1 |
http://arxiv.org/pdf/1904.01091v1.pdf | |
PWC | https://paperswithcode.com/paper/infant-prints-fingerprints-for-reducing |
Repo | |
Framework | |