Paper Group ANR 693
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation. Better accuracy with quantified privacy: representations learned via reconstructive adversarial network. Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation. Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment …
Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
Title | Optimal Sketching for Kronecker Product Regression and Low Rank Approximation |
Authors | Huaian Diao, Rajesh Jayaram, Zhao Song, Wen Sun, David P. Woodruff |
Abstract | We study the Kronecker product regression problem, in which the design matrix is a Kronecker product of two or more matrices. Given $A_i \in \mathbb{R}^{n_i \times d_i}$ for $i=1,2,\dots,q$ where $n_i \gg d_i$ for each $i$, and $b \in \mathbb{R}^{n_1 n_2 \cdots n_q}$, let $\mathcal{A} = A_1 \otimes A_2 \otimes \cdots \otimes A_q$. Then for $p \in [1,2]$, the goal is to find $x \in \mathbb{R}^{d_1 \cdots d_q}$ that approximately minimizes $\|\mathcal{A}x - b\|_p$. Recently, Diao, Song, Sun, and Woodruff (AISTATS, 2018) gave an algorithm which is faster than forming the Kronecker product $\mathcal{A}$. Specifically, for $p=2$ their running time is $O(\sum_{i=1}^q \text{nnz}(A_i) + \text{nnz}(b))$, where nnz$(A_i)$ is the number of non-zero entries in $A_i$. Note that nnz$(b)$ can be as large as $n_1 \cdots n_q$. For $p=1$, $q=2$ and $n_1 = n_2$, they achieve a worse bound of $O(n_1^{3/2} \text{poly}(d_1 d_2) + \text{nnz}(b))$. In this work, we provide significantly faster algorithms. For $p=2$, our running time is $O(\sum_{i=1}^q \text{nnz}(A_i))$, which has no dependence on nnz$(b)$. For $p<2$, our running time is $O(\sum_{i=1}^q \text{nnz}(A_i) + \text{nnz}(b))$, which matches the prior best running time for $p=2$. We also consider the related all-pairs regression problem, where given $A \in \mathbb{R}^{n \times d}, b \in \mathbb{R}^n$, we want to solve $\min_{x} \|\bar{A}x - \bar{b}\|_p$, where $\bar{A} \in \mathbb{R}^{n^2 \times d}, \bar{b} \in \mathbb{R}^{n^2}$ consist of all pairwise differences of the rows of $A,b$. We give an $O(\text{nnz}(A))$ time algorithm for $p \in [1,2]$, improving on the $\Omega(n^2)$ time needed to form $\bar{A}$. Finally, we initiate the study of Kronecker product low rank and low $t$-rank approximation. For input $\mathcal{A}$ as above, we give $O(\sum_{i=1}^q \text{nnz}(A_i))$ time algorithms, which is much faster than computing $\mathcal{A}$. |
Tasks | |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13384v1 |
https://arxiv.org/pdf/1909.13384v1.pdf | |
PWC | https://paperswithcode.com/paper/optimal-sketching-for-kronecker-product |
Repo | |
Framework | |
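The structure the abstract exploits can be seen in the classical exact $p=2$ case: a Kronecker least-squares problem is solvable without ever materializing $\mathcal{A}$. Below is a minimal numpy sketch of that textbook vec-trick baseline for $q=2$ — it is not the paper's sketching algorithm, only the identity that makes avoiding $\mathcal{A}$ possible.

```python
import numpy as np

# Classical structure exploitation for p=2 Kronecker regression (baseline,
# not the paper's sketching algorithm): with row-major reshapes,
# (A1 kron A2) x corresponds to A1 @ X @ A2.T for X = x.reshape(d1, d2),
# so argmin_x ||(A1 kron A2) x - b||_2 is A1^+ B (A2^+)^T, B = b.reshape(n1, n2).
def kron_lstsq(A1, A2, b):
    n1, d1 = A1.shape
    n2, d2 = A2.shape
    B = b.reshape(n1, n2)
    X = np.linalg.pinv(A1) @ B @ np.linalg.pinv(A2).T   # (d1, d2)
    return X.reshape(d1 * d2)

rng = np.random.default_rng(0)
A1, A2 = rng.normal(size=(50, 3)), rng.normal(size=(40, 4))
b = rng.normal(size=50 * 40)
x = kron_lstsq(A1, A2, b)
x_naive = np.linalg.lstsq(np.kron(A1, A2), b, rcond=None)[0]
assert np.allclose(x, x_naive)   # matches the naive solution exactly
```

For full-column-rank factors this touches only the small factor matrices, which is the starting point the paper's sketching techniques then accelerate further.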
Better accuracy with quantified privacy: representations learned via reconstructive adversarial network
Title | Better accuracy with quantified privacy: representations learned via reconstructive adversarial network |
Authors | Sicong Liu, Anshumali Shrivastava, Junzhao Du, Lin Zhong |
Abstract | The remarkable success of machine learning, especially deep learning, has produced a variety of cloud-based services for mobile users. Such services require an end user to send data to the service provider, which presents a serious challenge to end-user privacy. To address this concern, prior works either add noise to the data or send features extracted from the raw data. They struggle to balance utility and privacy because added noise reduces utility and raw data can be reconstructed from extracted features. This work represents a methodical departure from prior works: we balance a measure of privacy against a measure of utility by leveraging adversarial learning to find a better tradeoff. We design an encoder that optimizes against the reconstruction error (a measure of privacy), computed adversarially by a decoder, and for the inference accuracy (a measure of utility), computed by a classifier. The result is RAN, a novel deep model with a new training algorithm that automatically extracts features for classification that are both private and useful. It turns out that adversarially forcing the extracted features to convey only the information required for classification acts as an implicit regularization, leading to better classification accuracy than the original model, which completely ignores privacy. Thus, we achieve better privacy with better utility, a surprising possibility in machine learning! We conducted extensive experiments on five popular datasets over four training schemes, and demonstrate the superiority of RAN compared with existing alternatives. |
Tasks | |
Published | 2019-01-25 |
URL | http://arxiv.org/abs/1901.08730v1 |
http://arxiv.org/pdf/1901.08730v1.pdf | |
PWC | https://paperswithcode.com/paper/better-accuracy-with-quantified-privacy |
Repo | |
Framework | |
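The encoder/decoder/classifier interplay described above maps naturally onto alternating adversarial updates. A minimal PyTorch sketch of that loop follows; the layer sizes, privacy weight of 0.1, and random stand-in batch are our assumptions, not the authors' RAN architecture or training schedule.

```python
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
dec = nn.Sequential(nn.Linear(16, 784))   # adversary: tries to reconstruct x
clf = nn.Sequential(nn.Linear(16, 10))    # utility head: predicts the label

opt_enc = torch.optim.Adam(list(enc.parameters()) + list(clf.parameters()), lr=1e-3)
opt_dec = torch.optim.Adam(dec.parameters(), lr=1e-3)
mse, xent = nn.MSELoss(), nn.CrossEntropyLoss()

x = torch.rand(32, 784)                   # stand-in batch
y = torch.randint(0, 10, (32,))

for step in range(100):
    # 1) adversary step: decoder learns to reconstruct x from frozen features
    z = enc(x).detach()
    opt_dec.zero_grad()
    loss_dec = mse(dec(z), x)
    loss_dec.backward()
    opt_dec.step()

    # 2) encoder step: keep features useful (low classification loss)
    #    but private (high reconstruction error for the decoder)
    opt_enc.zero_grad()
    z = enc(x)
    loss_enc = xent(clf(z), y) - 0.1 * mse(dec(z), x)
    loss_enc.backward()
    opt_enc.step()
```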
Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
Title | Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation |
Authors | Suraj Nair, Chelsea Finn |
Abstract | Video prediction models combined with planning algorithms have shown promise in enabling robots to learn to perform many vision-based tasks through only self-supervision, reaching novel goals in cluttered scenes with unseen objects. However, due to the compounding uncertainty in long-horizon video prediction and the poor scalability of sampling-based planning optimizers, these approaches are significantly limited in their ability to plan over long horizons to reach distant goals. To that end, we propose a framework for subgoal generation and planning, hierarchical visual foresight (HVF), which generates subgoal images conditioned on a goal image, and uses them for planning. The subgoal images are directly optimized to decompose the task into easy-to-plan segments, and as a result, we observe that the method naturally identifies semantically meaningful states as subgoals. Across three out of four simulated vision-based manipulation tasks, we find that our method achieves nearly a 200% performance improvement over planning without subgoals and over model-free RL approaches. Further, our experiments illustrate that our approach extends to real, cluttered visual scenes. Project page: https://sites.google.com/stanford.edu/hvf |
Tasks | Video Prediction |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05829v1 |
https://arxiv.org/pdf/1909.05829v1.pdf | |
PWC | https://paperswithcode.com/paper/hierarchical-foresight-self-supervised |
Repo | |
Framework | |
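The core idea — decompose a hard long-horizon problem into segments whose individual planning costs are low — can be illustrated in a toy setting. Here is a sketch using the cross-entropy method over a single 2D subgoal with a hypothetical superlinear segment cost; HVF instead optimizes subgoal *images* under a learned video prediction model.

```python
import numpy as np

rng = np.random.default_rng(0)
start, goal = np.zeros(2), np.array([10.0, 10.0])

def segment_cost(a, b):
    # stand-in for planning cost under a predictive model: longer
    # (harder) segments are superlinearly more expensive
    return np.linalg.norm(b - a) ** 2

# cross-entropy method over one subgoal g: minimize the summed cost of
# the two resulting easy-to-plan segments
mu, sigma = np.zeros(2), 5.0 * np.ones(2)
for _ in range(20):
    cands = rng.normal(mu, sigma, size=(64, 2))
    costs = [segment_cost(start, g) + segment_cost(g, goal) for g in cands]
    elites = cands[np.argsort(costs)[:8]]          # keep the best samples
    mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3

print(mu)   # converges near the midpoint (5, 5), the easiest decomposition
```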
Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment
Title | Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment |
Authors | Pedro Miraldo, Surojit Saha, Srikumar Ramalingam |
Abstract | 3D scan registration is a classical yet highly useful problem in the context of 3D sensors such as Kinect and Velodyne. While there are several existing methods, the techniques are usually incremental: adjacent scans are registered first to obtain the initial poses, followed by motion averaging and bundle-adjustment refinement. In this paper, we take a different approach and develop minimal solvers for jointly computing the initial poses of cameras in small loops such as 3-, 4-, and 5-cycles. Note that the classical registration of 2 scans can be done using a minimum of 3 point matches to compute 6 degrees of relative motion. On the other hand, to jointly compute the 3D registrations in n-cycles, we take 2 point matches between the first n-1 consecutive pairs (i.e., Scan 1 & Scan 2, … , and Scan n-1 & Scan n) and 1 or 2 point matches between Scan 1 and Scan n. Overall, we use 5, 7, and 10 point matches for 3-, 4-, and 5-cycles, and recover 12, 18, and 24 degrees of transformation variables, respectively. Using simulations and real data, we show that 3D registration using mini n-cycles is computationally efficient, and can provide alternate and better initial poses compared to standard pairwise methods. |
Tasks | |
Published | 2019-04-08 |
URL | http://arxiv.org/abs/1904.03941v1 |
http://arxiv.org/pdf/1904.03941v1.pdf | |
PWC | https://paperswithcode.com/paper/minimal-solvers-for-mini-loop-closures-in-3d |
Repo | |
Framework | |
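The two-scan baseline the abstract references — recovering a 6-DoF rigid motion from 3 point matches — is the classical Kabsch/Procrustes solution. A minimal numpy sketch of that baseline is below; the paper's joint n-cycle solvers are not reproduced here.

```python
import numpy as np

def rigid_from_matches(P, Q):
    """Kabsch: find R, t with Q ~= R @ P + t, for P, Q of shape (3, n), n >= 3."""
    cP, cQ = P.mean(axis=1, keepdims=True), Q.mean(axis=1, keepdims=True)
    U, _, Vt = np.linalg.svd((Q - cQ) @ (P - cP).T)   # cross-covariance SVD
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])    # guard against reflections
    R = U @ D @ Vt
    t = cQ - R @ cP
    return R, t.ravel()

rng = np.random.default_rng(1)
P = rng.normal(size=(3, 3))                           # 3 matched points (columns)
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Q = R_true @ P + np.array([[1.0], [2.0], [3.0]])
R, t = rigid_from_matches(P, Q)
assert np.allclose(R, R_true)                         # exact recovery from 3 matches
```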
IIT (BHU) Varanasi at MSR-SRST 2018: A Language Model Based Approach for Natural Language Generation
Title | IIT (BHU) Varanasi at MSR-SRST 2018: A Language Model Based Approach for Natural Language Generation |
Authors | Shreyansh Singh, Avi Chawla, Ayush Sharma, Anil Kumar Singh |
Abstract | This paper describes our submission system for the Shallow Track of the Surface Realization Shared Task 2018 (SRST’18). The task was to convert genuine UD structures, from which word order information had been removed and the tokens had been lemmatized, into their correct sentential form. We divide the problem statement into two parts: word reinflection and correct word order prediction. For the first sub-problem, we use a Long Short-Term Memory (LSTM) based encoder-decoder approach. For the second sub-problem, we present a Language Model (LM) based approach. We apply two different sub-approaches in the LM-based approach, and the combined result of these two approaches is considered the final output of the system. |
Tasks | Language Modelling, Text Generation |
Published | 2019-04-12 |
URL | http://arxiv.org/abs/1904.06234v1 |
http://arxiv.org/pdf/1904.06234v1.pdf | |
PWC | https://paperswithcode.com/paper/iit-bhu-varanasi-at-msr-srst-2018-a-language-1 |
Repo | |
Framework | |
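The word-order sub-problem reduces to scoring candidate orderings with a language model and keeping the highest-scoring one. A toy sketch with a hand-specified bigram model and exhaustive permutations follows; the submission uses a trained LM and does not enumerate all orderings.

```python
import math
from itertools import permutations

# toy bigram log-probabilities (stand-in for a trained language model)
bigram_logp = {
    ("<s>", "the"): -0.5, ("the", "cat"): -1.0, ("cat", "sleeps"): -1.2,
    ("sleeps", "</s>"): -0.3,
}

def score(words):
    toks = ["<s>"] + list(words) + ["</s>"]
    return sum(bigram_logp.get(bg, -10.0)          # penalty for unseen bigrams
               for bg in zip(toks, toks[1:]))

words = ["sleeps", "the", "cat"]                   # lemmas with order removed
best = max(permutations(words), key=score)
print(best)                                        # ('the', 'cat', 'sleeps')
```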
ChainNet: Learning on Blockchain Graphs with Topological Features
Title | ChainNet: Learning on Blockchain Graphs with Topological Features |
Authors | Nazmiye Ceren Abay, Cuneyt Gurcan Akcora, Yulia R. Gel, Umar D. Islambekov, Murat Kantarcioglu, Yahui Tian, Bhavani Thuraisingham |
Abstract | With the emergence of blockchain technologies and the associated cryptocurrencies, such as Bitcoin, understanding the network dynamics behind Blockchain graphs has become a rapidly evolving research direction. Unlike other financial networks, such as stock and currency trading, blockchain-based cryptocurrencies have the entire transaction graph accessible to the public (i.e., all transactions can be downloaded and analyzed). A natural question, then, is whether the dynamics of the transaction graph impact the price of the underlying cryptocurrency. We show that standard graph features such as the degree distribution of the transaction graph may not be sufficient to capture network dynamics and their potential impact on fluctuations of the Bitcoin price. In contrast, new topological features of the graph, computed using the tools of persistent homology, are found to exhibit a high utility for predicting Bitcoin price dynamics. Using the proposed persistent homology-based techniques, we offer a new elegant, easily extendable and computationally light approach for graph representation learning on Blockchain. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-08-18 |
URL | https://arxiv.org/abs/1908.06971v1 |
https://arxiv.org/pdf/1908.06971v1.pdf | |
PWC | https://paperswithcode.com/paper/chainnet-learning-on-blockchain-graphs-with |
Repo | |
Framework | |
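Persistent homology on a weighted graph can be made concrete in its simplest (0-dimensional) form: track connected components as edges enter the filtration in order of weight. A self-contained union-find sketch is below; ChainNet's actual features are richer than this, and the convention that all vertices are born at filtration value 0 is our assumption.

```python
def persistence_0d(n_nodes, weighted_edges):
    """Return (birth, death) bars of connected components as edges enter
    the filtration in order of increasing weight."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    bars = []
    for w, u, v in sorted(weighted_edges):  # sweep the filtration
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            bars.append((0.0, w))           # a component dies when merged
    return bars

edges = [(1.0, 0, 1), (2.0, 1, 2), (5.0, 2, 3), (0.5, 3, 4)]
print(persistence_0d(5, edges))  # [(0.0, 0.5), (0.0, 1.0), (0.0, 2.0), (0.0, 5.0)]
```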
Hybrid Low-order and Higher-order Graph Convolutional Networks
Title | Hybrid Low-order and Higher-order Graph Convolutional Networks |
Authors | FangYuan Lei, Xun Liu, QingYun Dai, Bingo Wing-Kuen Ling, Huimin Zhao, Yan Liu |
Abstract | With higher-order neighborhood information of a graph network, the accuracy of classification in graph representation learning can be significantly improved. However, current higher-order graph convolutional networks have a large number of parameters and high computational complexity. Therefore, we propose a Hybrid Low-order and Higher-order Graph convolutional networks (HLHG) learning model, which uses a weight-sharing mechanism to reduce the number of network parameters. To reduce computational complexity, we propose a novel fusion pooling layer to combine the high-order and low-order neighborhood information. Theoretically, we compare the model complexity of the proposed model with that of other state-of-the-art models. Experimentally, we verify the proposed model on large-scale text network datasets by supervised learning, and on citation network datasets by semi-supervised learning. The experimental results show that the proposed model achieves the highest classification accuracy with a small set of trainable weight parameters. |
Tasks | Graph Representation Learning, Representation Learning |
Published | 2019-08-02 |
URL | https://arxiv.org/abs/1908.00673v1 |
https://arxiv.org/pdf/1908.00673v1.pdf | |
PWC | https://paperswithcode.com/paper/hybrid-low-order-and-higher-order-graph |
Repo | |
Framework | |
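A single forward pass of the idea — shared weights applied to successively higher powers of the normalized adjacency, combined by an element-wise fusion pooling — fits in a few lines. The numpy sketch below uses our own assumptions about the layer (max as the fusion operator, symmetric normalization); the paper's exact choices may differ.

```python
import numpy as np

def hlhg_layer(A_hat, X, W, max_order=3):
    outs, H = [], X
    for _ in range(max_order):
        H = A_hat @ H                 # one more hop of neighborhood information
        outs.append(H @ W)            # the SAME weights W serve every order
    fused = np.maximum.reduce(outs)   # fusion pooling across orders
    return np.maximum(fused, 0.0)     # ReLU

n, f_in, f_out = 5, 8, 4
rng = np.random.default_rng(0)
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.minimum(A + A.T + np.eye(n), 1.0)             # symmetric + self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt                  # normalized adjacency
X, W = rng.normal(size=(n, f_in)), rng.normal(size=(f_in, f_out))
print(hlhg_layer(A_hat, X, W).shape)                 # (5, 4)
```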
Behavior Pattern and Compiled Information Based Performance Prediction in MOOCs
Title | Behavior Pattern and Compiled Information Based Performance Prediction in MOOCs |
Authors | Shaojie Qu, Kan Li, Zheyi Fan, Sisi Wu, Xinyi Liu, Zhiguo Huang |
Abstract | With the development of MOOCs (massive open online courses), increasingly many subjects can be studied online. Researchers currently show growing interest in the field of MOOCs, including dropout prediction, cheating detection and achievement prediction. Previous studies on achievement prediction mainly focused on students’ video and forum behaviors, and few researchers have considered how well students perform their assignments. In this paper, we choose a C programming course, which involved 1528 students, as the experimental subject. We focus on the students’ accomplishment behaviors in programming assignments and the compiled information from those assignments. Feature sequences are extracted from the logs according to submission times, submission order and plagiarism. The experimental results show that the students who did not pass the exam had obvious sequence patterns, while the students who passed did not. We then extract 23 features from the compiled information of students’ programming assignments and select the most distinguishing features to predict the students’ performances. The experimental results show that we can obtain an accuracy rate of 0.7049 for predicting students’ performances. |
Tasks | |
Published | 2019-08-04 |
URL | https://arxiv.org/abs/1908.01304v1 |
https://arxiv.org/pdf/1908.01304v1.pdf | |
PWC | https://paperswithcode.com/paper/behavior-pattern-and-compiled-information |
Repo | |
Framework | |
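The prediction step — select the most distinguishing of the 23 features and fit a classifier — has a standard shape. Here is a generic scikit-learn sketch on synthetic stand-in data; the paper's actual features, feature selector, and classifier are not specified here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(1528, 23))                  # 23 per-student features
y = (X[:, 0] + 0.5 * rng.normal(size=1528)) > 0  # stand-in pass/fail label

model = make_pipeline(
    SelectKBest(f_classif, k=10),                # keep the most distinguishing
    RandomForestClassifier(n_estimators=100, random_state=0),
)
print(cross_val_score(model, X, y, cv=5).mean())
```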
Lie on the Fly: Strategic Voting in an Iterative Preference Elicitation Process
Title | Lie on the Fly: Strategic Voting in an Iterative Preference Elicitation Process |
Authors | Lihi Dery, Svetlana Obraztsova, Zinovi Rabinovich, Meir Kalech |
Abstract | A voting center is in charge of collecting and aggregating voter preferences. In an iterative process, the center sends comparison queries to voters, requesting them to submit their preference between two items. Voters might discuss the candidates among themselves, figuring out during the elicitation process which candidates stand a chance of winning and which do not. Consequently, strategic voters might attempt to manipulate by deviating from their true preferences, submitting a different response in an attempt to maximize their profit. We provide a practical algorithm for strategic voters which computes the best manipulative vote and maximizes the voter’s selfish outcome when such a vote exists. We also provide a careful voting center which is aware of the possible manipulations and avoids manipulative queries when possible. In an empirical study on four real-world domains, we show that in practice manipulation occurs in a low percentage of settings and has a low impact on the final outcome. The careful voting center reduces manipulation even further, thus allowing for a non-distorted group decision process to take place. We thus provide a core technology study of a voting process that can be adopted in opinion or information aggregation systems and in crowdsourcing applications, e.g., peer grading in Massive Open Online Courses (MOOCs). |
Tasks | |
Published | 2019-05-13 |
URL | https://arxiv.org/abs/1905.04933v1 |
https://arxiv.org/pdf/1905.04933v1.pdf | |
PWC | https://paperswithcode.com/paper/lie-on-the-fly-strategic-voting-in-an |
Repo | |
Framework | |
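The manipulative step can be made concrete in a toy form: when queried on a pair, simulate both possible answers and report the one whose resulting winner the voter prefers. The sketch below uses a hypothetical aggregation rule; the paper's algorithm reasons over the center's full iterative process rather than a single myopic step.

```python
from collections import Counter

others = [("B", "C"), ("A", "C")]               # pairwise wins already recorded

def leader(pairwise_wins):
    # stand-in aggregation: candidate with the most recorded pairwise wins
    score = Counter(w for w, _ in pairwise_wins)
    return max(score, key=score.get)

utilities = {"A": 0, "B": 2, "C": 1}            # manipulator secretly prefers B

def strategic_answer(query, recorded):
    a, b = query
    # simulate both possible responses; report the one whose winner
    # the manipulator likes best
    return max([(a, b), (b, a)],
               key=lambda r: utilities[leader(recorded + [r])])

print(strategic_answer(("A", "B"), others))     # reports ('B', 'A')
```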
The Stabilized Explicit Variable-Load Solver with Machine Learning Acceleration for the Rapid Solution of Stiff Chemical Kinetics
Title | The Stabilized Explicit Variable-Load Solver with Machine Learning Acceleration for the Rapid Solution of Stiff Chemical Kinetics |
Authors | Kyle Buchheit, Opeoluwa Owoyele, Terry Jordan, Dirk Van Essendelft |
Abstract | In this study, a fast and stable machine-learned hybrid algorithm, implemented in TensorFlow, for the integration of stiff chemical kinetics is introduced. Numerical solutions to differential equations are at the core of computational fluid dynamics calculations. As the size and complexity of the simulations grow, so does the need for computational power and time. Many efforts have been made to implement stiff chemistry solvers on GPUs, but they have not been highly successful because of the logical divergence in traditional stiff solver algorithms. Because of these constraints, a novel Stabilized Explicit Variable-load (STEV) solver has been developed. Overstepping due to the relatively large time steps is prevented by introducing limits on the maximum changes of chemical species per time step. To prevent oscillations, a discrete Fourier transform is introduced to dampen ringing. In contrast to conventional explicit approaches, a variable-load approach is used, where each cell in the computational domain is advanced with its own unique time step. This approach allows cells to be integrated simultaneously while maintaining warp convergence, yet finish at different iterations and be removed from the workload. To improve the computational performance of the introduced solver, specific thermodynamic quantities of interest were estimated using shallow neural networks in place of polynomial fits, leading to an additional 10% savings in clock time with minimal training and implementation requirements. However, ML-specific hardware could increase the time savings to as much as 28%. While the complexity of these particular machine learning models is not high by modern standards, the impact on computational efficiency should not be ignored. The results show a dramatic decrease in total chemistry solution time (over 200 times) while maintaining a similar degree of accuracy. |
Tasks | |
Published | 2019-05-21 |
URL | https://arxiv.org/abs/1905.09395v3 |
https://arxiv.org/pdf/1905.09395v3.pdf | |
PWC | https://paperswithcode.com/paper/the-stabilized-explicit-variable-load-solver |
Repo | |
Framework | |
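The stabilizing trick described above — capping how much each species may change per explicit step — is easy to demonstrate. Below is a toy numpy sketch on a stiff linear decay where plain explicit Euler would blow up; the 20% cap, the one-species-per-cell system, and the fixed step are our assumptions, and the DFT damping and per-cell variable stepping are not shown.

```python
import numpy as np

def limited_explicit_steps(y0, rate, dt, n_steps, max_rel_change=0.2):
    y = np.array(y0, dtype=float)        # one entry per cell
    for _ in range(n_steps):
        dy = -rate * y * dt              # stiff linear decay dy/dt = -k y
        cap = max_rel_change * np.abs(y) + 1e-30
        y = y + np.clip(dy, -cap, cap)   # limit the per-step change
    return y

# cell 2 is stiff: plain Euler with dt=0.01 and k=500 would oscillate and
# diverge (1 - k*dt = -4); the cap keeps it stably decaying instead
y = limited_explicit_steps([1.0, 1.0], rate=np.array([1.0, 500.0]),
                           dt=0.01, n_steps=1000)
print(y)                                 # both cells decay toward 0
```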
Candidate Generation with Binary Codes for Large-Scale Top-N Recommendation
Title | Candidate Generation with Binary Codes for Large-Scale Top-N Recommendation |
Authors | Wang-Cheng Kang, Julian McAuley |
Abstract | Generating the Top-N recommendations from a large corpus is computationally expensive to perform at scale. Candidate generation and re-ranking based approaches are often adopted in industrial settings to alleviate efficiency problems. However, it remains to be fully studied how well such schemes approximate complete rankings (or how many candidates are required to achieve a good approximation), and systematic approaches to generating high-quality candidates efficiently have yet to be developed. In this paper, we seek to investigate these questions by proposing a candidate generation and re-ranking based framework (CIGAR), which first learns a preference-preserving binary embedding for building a hash table to retrieve candidates, and then learns to re-rank the candidates using real-valued ranking models with a candidate-oriented objective. We perform a comprehensive study on several large-scale real-world datasets consisting of millions of users/items and hundreds of millions of interactions. Our results show that CIGAR significantly boosts the Top-N accuracy against state-of-the-art recommendation models, while reducing the query time by orders of magnitude. We hope that this work could draw more attention to the candidate generation problem in recommender systems. |
Tasks | Recommendation Systems |
Published | 2019-09-12 |
URL | https://arxiv.org/abs/1909.05475v1 |
https://arxiv.org/pdf/1909.05475v1.pdf | |
PWC | https://paperswithcode.com/paper/candidate-generation-with-binary-codes-for |
Repo | |
Framework | |
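The two-stage shape — binary codes for cheap candidate retrieval, then exact scores on the shortlist only — can be sketched with random projections standing in for the learned embedding. CIGAR learns both the codes and a candidate-oriented re-ranker, and a real system would use a hash table rather than the linear Hamming scan below.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, dim, n_bits = 10000, 32, 16
item_emb = rng.normal(size=(n_items, dim))
proj = rng.normal(size=(dim, n_bits))            # stand-in for learned hashing
item_codes = item_emb @ proj > 0                 # binary item codes

def recommend(user_emb, k=10, n_candidates=200):
    code = user_emb @ proj > 0
    # candidate generation: items with smallest Hamming distance to the user
    ham = (item_codes != code).sum(axis=1)
    cands = np.argsort(ham)[:n_candidates]
    # re-ranking: exact real-valued scores on the shortlist only
    scores = item_emb[cands] @ user_emb
    return cands[np.argsort(-scores)[:k]]

print(recommend(rng.normal(size=dim)))
```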
Machine Learning of Time Series Using Time-delay Embedding and Precision Annealing
Title | Machine Learning of Time Series Using Time-delay Embedding and Precision Annealing |
Authors | Alexander J. A. Ty, Zheng Fang, Rivver A. Gonzalez, Paul J. Rozdeba, Henry D. I. Abarbanel |
Abstract | Tasking machine learning to predict segments of a time series requires estimating the parameters of an ML model with input/output pairs from the time series. Using the equivalence between statistical data assimilation and supervised machine learning, we revisit this task. The training method for the machine utilizes a precision annealing approach to identifying the global minimum of the action ($-\log[P]$). In this way we are able to identify the number of training pairs required to produce good generalizations (predictions) for the time series. We proceed from a scalar time series $s(t_n)$, $t_n = t_0 + n \Delta t$, and using methods of nonlinear time series analysis show how to produce a $D_E > 1$ dimensional time-delay embedding space in which the time series has no false neighbors, unlike the observed $s(t_n)$ time series. In that $D_E$-dimensional space we explore the use of feed-forward multi-layer perceptrons as network models operating on $D_E$-dimensional inputs and producing $D_E$-dimensional outputs. |
Tasks | Time Series, Time Series Analysis |
Published | 2019-02-12 |
URL | https://arxiv.org/abs/1902.05062v2 |
https://arxiv.org/pdf/1902.05062v2.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-of-time-series-using-time |
Repo | |
Framework | |
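The embedding construction itself is classical: stack delayed copies of the scalar series into $D_E$-dimensional vectors. A minimal numpy sketch follows; choosing $D_E$ via false-nearest-neighbor tests and the subsequent MLP training are not shown.

```python
import numpy as np

def delay_embed(s, dim, tau):
    """Time-delay embedding: row i is [s(t_i), s(t_i + tau), ..., s(t_i + (dim-1) tau)]."""
    n = len(s) - (dim - 1) * tau
    return np.stack([s[i * tau : i * tau + n] for i in range(dim)], axis=1)

t = np.arange(0, 20, 0.05)
s = np.sin(t) + 0.5 * np.sin(2.1 * t)            # stand-in scalar series
Y = delay_embed(s, dim=3, tau=4)                 # D_E = 3, delay of 4 samples
print(Y.shape)                                   # (392, 3)
```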
A Stealthy Hardware Trojan Exploiting the Architectural Vulnerability of Deep Learning Architectures: Input Interception Attack (IIA)
Title | A Stealthy Hardware Trojan Exploiting the Architectural Vulnerability of Deep Learning Architectures: Input Interception Attack (IIA) |
Authors | Tolulope A. Odetola, Hawzhin Raoof Mohammed, Syed Rafay Hasan |
Abstract | Deep learning architectures (DLA) have shown impressive performance in computer vision, natural language processing and so on. Many DLA make use of cloud computing to achieve classification due to their high computation and memory requirements. Privacy and latency concerns resulting from cloud computing have inspired the deployment of DLA on embedded hardware accelerators. To achieve short time-to-market and gain access to global experts, state-of-the-art techniques for DLA deployment on hardware accelerators are outsourced to untrusted third parties. This outsourcing raises security concerns, as hardware Trojans can be inserted into the hardware design of the mapped DLA on the hardware accelerator. We argue that existing hardware Trojan attacks highlighted in the literature provide no quantitative measure of how definite the triggering of the Trojan is. Also, most inserted Trojans show an obvious spike in the number of hardware resources utilized on the accelerator at the time the Trojan is triggered or when the payload is active. In this paper, we propose a hardware Trojan attack called the Input Interception Attack (IIA). In this attack, we make use of the statistical properties of layer-by-layer outputs to make sure that, aside from being stealthy, our IIA triggers with some measure of definiteness. The IIA attack is tested on DLA used to classify the MNIST and Cifar-10 data sets. The attacked design utilizes up to approximately 2% more LUTs compared to the uncompromised designs. This paper also discusses potential defensive mechanisms that could be used to combat such hardware Trojan based attacks in hardware accelerators for DLA. |
Tasks | |
Published | 2019-11-02 |
URL | https://arxiv.org/abs/1911.00783v1 |
https://arxiv.org/pdf/1911.00783v1.pdf | |
PWC | https://paperswithcode.com/paper/a-stealthy-hardware-trojan-exploiting-the |
Repo | |
Framework | |
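A software-level toy of the trigger concept — firing only when layer-output statistics land in a rare, attacker-chosen band — is sketched below. The thresholds and layer are purely hypothetical, and the actual IIA lives in the accelerator's hardware design, not in Python inference code.

```python
import numpy as np

def trojan_layer(x, w):
    out = np.maximum(x @ w, 0.0)                 # an ordinary ReLU layer
    # trigger: fire only when the layer-output statistics fall in a rare,
    # attacker-chosen band, so activation is both stealthy and definite
    if 0.70 < out.mean() < 0.72:
        out = np.zeros_like(out)                 # payload: corrupt the features
    return out

rng = np.random.default_rng(0)
x, w = rng.random((1, 64)), rng.random((64, 32))
print(trojan_layer(x, w).mean())                 # behaves benignly on typical inputs
```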
HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks
Title | HAWQ-V2: Hessian Aware trace-Weighted Quantization of Neural Networks |
Authors | Zhen Dong, Zhewei Yao, Yaohui Cai, Daiyaan Arfeen, Amir Gholami, Michael W. Mahoney, Kurt Keutzer |
Abstract | Quantization is an effective method for reducing the memory footprint and inference time of neural networks, e.g., for efficient inference in the cloud, and especially at the edge. However, ultra-low-precision quantization can lead to significant degradation in model generalization. A promising method to address this is mixed-precision quantization, where more sensitive layers are kept at higher precision. However, the search space for mixed-precision quantization is exponential in the number of layers. Recent work proposed HAWQ, a novel Hessian-based framework, with the aim of reducing this exponential search space by using second-order information. While promising, this prior work has three major limitations: (i) HAWQ-V1 only uses the top Hessian eigenvalue as a measure of sensitivity and does not consider the rest of the Hessian spectrum; (ii) the HAWQ-V1 approach only provides the relative sensitivity of different layers and therefore requires a manual selection of the mixed-precision setting; and (iii) HAWQ-V1 does not consider mixed-precision activation quantization. Here, we present HAWQ-V2, which addresses these shortcomings. For (i), we perform a theoretical analysis showing that a better sensitivity metric is the average of all of the Hessian eigenvalues. For (ii), we develop a Pareto-frontier-based method for selecting the exact bit precision of different layers without any manual selection. For (iii), we extend the Hessian analysis to mixed-precision activation quantization, which we have found to be very beneficial for object detection. We show that HAWQ-V2 achieves new state-of-the-art results for a wide range of tasks. |
Tasks | Object Detection, Quantization |
Published | 2019-11-10 |
URL | https://arxiv.org/abs/1911.03852v1 |
https://arxiv.org/pdf/1911.03852v1.pdf | |
PWC | https://paperswithcode.com/paper/hawq-v2-hessian-aware-trace-weighted |
Repo | |
Framework | |
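The average Hessian eigenvalue equals $\mathrm{tr}(H)/n$, and the trace is cheap to estimate from Hessian-vector products alone via Hutchinson's method. A minimal numpy sketch with an explicit stand-in Hessian is below; in practice the Hessian-vector product would come from automatic differentiation rather than a stored matrix.

```python
import numpy as np

def hutchinson_trace(hvp, n, n_samples=200, rng=None):
    """Estimate tr(H) using only Hessian-vector products: E[v^T H v] = tr(H)
    for Rademacher vectors v."""
    rng = rng or np.random.default_rng(0)
    est = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe
        est += v @ hvp(v)
    return est / n_samples

H = np.diag([4.0, 1.0, 0.5, 0.5])             # stand-in layer Hessian
tr = hutchinson_trace(lambda v: H @ v, n=4)
print(tr, np.trace(H))                        # estimate ~ 6.0; avg eigenvalue = tr/4
```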
Deeper Connections between Neural Networks and Gaussian Processes Speed-up Active Learning
Title | Deeper Connections between Neural Networks and Gaussian Processes Speed-up Active Learning |
Authors | Evgenii Tsymbalov, Sergei Makarychev, Alexander Shapeev, Maxim Panov |
Abstract | Active learning methods for neural networks are usually based on greedy criteria which ultimately give a single new design point for evaluation. Such an approach requires either heuristics to sample a batch of design points in one active learning iteration, or retraining the neural network after adding each data point, which is computationally inefficient. Moreover, uncertainty estimates for neural networks are sometimes overconfident for points lying far from the training sample. In this work we propose to approximate Bayesian neural networks (BNN) by Gaussian processes, which allows us to update the uncertainty estimates of predictions efficiently without retraining the neural network, while avoiding overconfident uncertainty predictions for out-of-sample points. In a series of experiments on real-world data, including large-scale problems of chemical and physical modeling, we show the superiority of the proposed approach over state-of-the-art methods. |
Tasks | Active Learning, Gaussian Processes |
Published | 2019-02-27 |
URL | http://arxiv.org/abs/1902.10350v1 |
http://arxiv.org/pdf/1902.10350v1.pdf | |
PWC | https://paperswithcode.com/paper/deeper-connections-between-neural-networks |
Repo | |
Framework | |
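Once uncertainty comes from a GP, the query rule is simply "pick the pool point with the largest posterior variance". A minimal numpy sketch with an RBF kernel follows; the kernel, noise level, and the mapping from the BNN to this GP are assumptions of the sketch.

```python
import numpy as np

def rbf(A, B, ell=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def posterior_var(X_train, X_pool, noise=1e-2):
    # standard GP predictive variance: k** - k*^T (K + noise I)^{-1} k*
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_train, X_pool)
    Kss = rbf(X_pool, X_pool)
    return np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))

rng = np.random.default_rng(0)
X_train, X_pool = rng.normal(size=(20, 2)), rng.normal(size=(500, 2))
query = np.argmax(posterior_var(X_train, X_pool))   # most uncertain pool point
print(X_pool[query])
```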