Paper Group ANR 183
PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives
Title | PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives |
Authors | Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Gagandeep Goyal, Ramakrishna Upadrasta, Bharat Kaul |
Abstract | At the heart of deep learning training and inferencing are computationally intensive primitives such as convolutions, which form the building blocks of deep neural networks. Researchers have taken two distinct approaches to creating high-performance implementations of deep learning kernels: 1) library development, exemplified by Intel MKL-DNN for CPUs, and 2) automatic compilation, represented by the TensorFlow XLA compiler. Each approach has its drawbacks: even though a custom-built library can deliver very good performance, the cost and time of developing the library can be high, while automatic compilation of kernels is attractive but in practice, to date, automatically generated implementations lag expert-coded kernels in performance by orders of magnitude. In this paper, we develop a hybrid solution to the development of deep learning kernels that achieves the best of both worlds: expert-coded microkernels are utilized for the innermost loops of kernels, and we use advanced polyhedral technology to automatically tune the outer loops for performance. We design a novel polyhedral-model-based data reuse algorithm to optimize the outer loops of the kernel. Through experimental evaluation on an important class of deep learning primitives, namely convolutions, we demonstrate that our approach attains the same levels of performance as Intel MKL-DNN, a hand-coded deep learning library. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02145v1 |
PDF | https://arxiv.org/pdf/2002.02145v1.pdf |
PWC | https://paperswithcode.com/paper/polyscientist-automatic-loop-transformations |
Repo | |
Framework | |
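As a rough illustration of the hybrid recipe above (expert-coded microkernel inside, tunable outer loops outside), here is a toy Python sketch. The tile sizes, the im2col packing, and the `gemm_microkernel` stand-in are illustrative assumptions, not the paper's generated code.

```python
import numpy as np

def gemm_microkernel(a, b, c):
    # Stand-in for an expert-coded microkernel (e.g., a JIT-ed GEMM):
    # here it is just a dense matmul accumulated into c.
    c += a @ b

def conv2d_tiled(x, w, tile_h=4, tile_w=4):
    """Direct convolution with tiled outer loops around a GEMM microkernel.

    x: input (C_in, H, W); w: weights (C_out, C_in, KH, KW).
    The tile sizes are the knobs a polyhedral optimizer would tune for reuse.
    """
    c_in, h, w_in = x.shape
    c_out, _, kh, kw = w.shape
    oh, ow = h - kh + 1, w_in - kw + 1
    y = np.zeros((c_out, oh, ow))
    wmat = w.reshape(c_out, c_in * kh * kw)          # (C_out, K)
    for oh0 in range(0, oh, tile_h):                 # outer loops: tiled
        for ow0 in range(0, ow, tile_w):
            hs = slice(oh0, min(oh0 + tile_h, oh))
            ws = slice(ow0, min(ow0 + tile_w, ow))
            # Gather input patches for this output tile (im2col packing).
            patches = np.stack(
                [x[:, i:i + kh, j:j + kw].ravel()
                 for i in range(hs.start, hs.stop)
                 for j in range(ws.start, ws.stop)], axis=1)  # (K, T)
            tile = np.zeros((c_out, patches.shape[1]))
            gemm_microkernel(wmat, patches, tile)    # innermost: microkernel
            y[:, hs, ws] = tile.reshape(c_out, hs.stop - hs.start,
                                        ws.stop - ws.start)
    return y
```

For example, `conv2d_tiled(np.random.rand(3, 8, 8), np.random.rand(4, 3, 3, 3))` matches a direct convolution; the polyhedral layer's job is choosing `tile_h`/`tile_w` (and loop orders) to maximize data reuse.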
Privacy-Preserving Boosting in the Local Setting
Title | Privacy-Preserving Boosting in the Local Setting |
Authors | Sen Wang, J. Morris Chang |
Abstract | In machine learning, boosting is one of the most popular methods designed to combine multiple base learners into a superior one. The well-known Boosted Decision Tree classifier has been widely adopted in many areas. In the big data era, the data held by individuals and entities, like personal images, browsing history and census information, are more likely to contain sensitive information. Privacy concerns arise when such data leaves the hands of the owners and is further explored or mined. Such privacy issues demand that machine learning algorithms be privacy-aware. Recently, Local Differential Privacy has been proposed as an effective privacy protection approach, which offers a strong guarantee to data owners: the data is perturbed before any further usage, and the true values never leave the hands of the owners. Thus, machine learning over such private data instances is of great value and importance. In this paper, we are interested in developing a privacy-preserving boosting algorithm that allows a data user to build a classifier without knowing or deriving the exact value of each data sample. Our experiments demonstrate the effectiveness of the proposed boosting algorithm and the high utility of the learned classifiers. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02096v1 |
PDF | https://arxiv.org/pdf/2002.02096v1.pdf |
PWC | https://paperswithcode.com/paper/privacy-preserving-boosting-in-the-local |
Repo | |
Framework | |
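The abstract does not spell out the perturbation mechanism, so the sketch below shows generalized randomized response, the standard building block of local differential privacy, as an assumed stand-in rather than the paper's algorithm.

```python
import numpy as np

def randomized_response(value, domain, epsilon, rng=np.random.default_rng()):
    """Generalized randomized response: an epsilon-LDP report of `value`.

    With probability e^eps / (e^eps + |domain| - 1) the true value is sent;
    otherwise a uniformly random *other* value is sent. The data collector
    only ever sees the perturbed report.
    """
    k = len(domain)
    p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    others = [v for v in domain if v != value]
    return others[rng.integers(len(others))]

# A learner (e.g., a boosted-tree trainer) would fit on reports like these:
reports = [randomized_response(x, domain=[0, 1, 2, 3], epsilon=1.0)
           for x in [1, 1, 3, 0, 2]]
```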
DA4AD: End-to-end Deep Attention Aware Features Aided Visual Localization for Autonomous Driving
Title | DA4AD: End-to-end Deep Attention Aware Features Aided Visual Localization for Autonomous Driving |
Authors | Yao Zhou, Guowei Wan, Shenhua Hou, Li Yu, Gang Wang, Xiaofei Rui, Shiyu Song |
Abstract | We present a visual localization framework aided by novel deep attention-aware features for autonomous driving that achieves centimeter-level localization accuracy. Conventional approaches to the visual localization problem rely on handcrafted features or human-made objects on the road. They are known to be either prone to unstable matching caused by severe appearance or lighting changes, or too scarce to deliver constant and robust localization results in challenging scenarios. In this work, we seek to exploit the deep attention mechanism to search for salient, distinctive and stable features that are good for long-term matching in the scene, through a novel end-to-end deep neural network. Furthermore, our learned feature descriptors are shown to be capable of establishing robust matches and therefore successfully estimating the optimal camera poses with high precision. We comprehensively validate the effectiveness of our method using a freshly collected dataset with high-quality ground-truth trajectories and hardware synchronization between sensors. Results demonstrate that our method achieves localization accuracy competitive with LiDAR-based localization solutions under various challenging circumstances, making it a potential low-cost localization solution for autonomous driving. |
Tasks | Autonomous Driving, Deep Attention, Visual Localization |
Published | 2020-03-06 |
URL | https://arxiv.org/abs/2003.03026v1 |
PDF | https://arxiv.org/pdf/2003.03026v1.pdf |
PWC | https://paperswithcode.com/paper/da4ad-end-to-end-deep-attention-aware |
Repo | |
Framework | |
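For context, here is a sketch of the generic back-end such a localizer needs once descriptors are in hand: match query descriptors against a prior map, then solve a RANSAC PnP for the camera pose. The brute-force matcher and function shape are assumptions for illustration; the paper's contribution is the attention-aware network that produces the descriptors, not this back-end.

```python
import cv2
import numpy as np

def estimate_pose(query_desc, map_desc, map_points_3d, query_kp_2d, K):
    """Match learned descriptors against a prior map, then solve PnP.

    query_desc, map_desc: float32 descriptor arrays of shape (N, D);
    map_points_3d: 3D map points aligned with map_desc;
    query_kp_2d: keypoint pixel coordinates aligned with query_desc;
    K: 3x3 camera intrinsics matrix.
    """
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(query_desc, map_desc)
    obj = np.float32([map_points_3d[m.trainIdx] for m in matches])
    img = np.float32([query_kp_2d[m.queryIdx] for m in matches])
    # Robust pose from 2D-3D correspondences (RANSAC rejects bad matches).
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
    return (rvec, tvec) if ok else None
```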
Learning from Small Data Through Sampling an Implicit Conditional Generative Latent Optimization Model
Title | Learning from Small Data Through Sampling an Implicit Conditional Generative Latent Optimization Model |
Authors | Idan Azuri, Daphna Weinshall |
Abstract | We revisit the long-standing problem of learning from a small sample. In recent years, major efforts have been invested into the generation of new samples from a small set of training data points. Some approaches use classical transformations; others synthesize new examples. Ours belongs to the second category. We propose a new model based on conditional Generative Latent Optimization (cGLO). Our model learns to synthesize completely new samples for every class just by interpolating between samples in the latent space. The proposed method samples the learned latent space using spherical interpolation (slerp) and generates a new sample using the trained generator. Our empirical results show that the sampled set is diverse enough to improve image classification over the state of the art when trained on small samples of CIFAR-100 and CUB-200. |
Tasks | Image Classification |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14297v1 |
PDF | https://arxiv.org/pdf/2003.14297v1.pdf |
PWC | https://paperswithcode.com/paper/learning-from-small-data-through-sampling-an |
Repo | |
Framework | |
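A minimal sketch of the slerp sampling step described in the abstract, assuming latent codes are plain NumPy vectors and `generator` is the trained decoder (both assumptions here):

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical linear interpolation between two latent vectors."""
    z0n, z1n = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * z0 + t * z1        # vectors (nearly) parallel
    return (np.sin((1 - t) * omega) * z0 +
            np.sin(t * omega) * z1) / np.sin(omega)

# New samples for a class: interpolate between two of its latent codes
# and decode with the trained generator, e.g. x_new = generator(slerp(z_a, z_b, 0.3))
```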
Milking CowMask for Semi-Supervised Image Classification
Title | Milking CowMask for Semi-Supervised Image Classification |
Authors | Geoff French, Avital Oliver, Tim Salimans |
Abstract | Consistency regularization is a technique for semi-supervised learning that has recently been shown to yield strong results for classification with few labeled examples. The method works by perturbing input data using augmentation or adversarial examples, and encouraging the learned model to be robust to these perturbations on unlabeled data. Here, we evaluate the use of a recently proposed augmentation method, called CowMask, for this purpose. Using CowMask as the augmentation method in semi-supervised consistency regularization, we establish a new state-of-the-art result on ImageNet with 10% labeled data, with a top-5 error of 8.76% and top-1 error of 26.06%. Moreover, we do so with a method that is much simpler than alternative methods. We further investigate the behavior of CowMask for semi-supervised learning by running many smaller-scale experiments on the small image benchmarks SVHN, CIFAR-10 and CIFAR-100, where we achieve results competitive with the state of the art, and where we find evidence that the CowMask perturbation is widely applicable. We open-source our code at https://github.com/google-research/google-research/tree/master/milking_cowmask |
Tasks | Image Classification, Semi-Supervised Image Classification |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12022v1 |
PDF | https://arxiv.org/pdf/2003.12022v1.pdf |
PWC | https://paperswithcode.com/paper/milking-cowmask-for-semi-supervised-image |
Repo | |
Framework | |
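A common way to produce such cow-spot masks is to smooth Gaussian noise and threshold it so a chosen fraction of pixels is masked. The sketch below follows that recipe with SciPy; it is an approximation for illustration, not the released implementation linked above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def cow_mask(shape, sigma=8.0, p=0.5, rng=np.random.default_rng()):
    """Cow-spot-like binary mask: smooth Gaussian noise with a Gaussian
    filter, then threshold so a fraction `p` of pixels is set to 1."""
    noise = gaussian_filter(rng.standard_normal(shape), sigma=sigma)
    thresh = np.quantile(noise, 1.0 - p)
    return (noise > thresh).astype(np.float32)

# Mix two unlabeled images for consistency regularization:
# x_mix = m * x_a + (1 - m) * x_b, with m = cow_mask(x_a.shape[:2])
```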
HHH: An Online Medical Chatbot System based on Knowledge Graph and Hierarchical Bi-Directional Attention
Title | HHH: An Online Medical Chatbot System based on Knowledge Graph and Hierarchical Bi-Directional Attention |
Authors | Qiming Bao, Lin Ni, Jiamou Liu |
Abstract | This paper proposes a chatbot framework that adopts a hybrid model consisting of a knowledge graph and a text similarity model. Based on this chatbot framework, we build HHH, an online question-and-answer (QA) Healthcare Helper system for answering complex medical questions. HHH maintains a knowledge graph constructed from medical data collected from the Internet. HHH also implements a novel text representation and similarity deep learning model, the Hierarchical BiLSTM Attention Model (HBAM), to find the most similar question in a large QA dataset. We compare HBAM with other state-of-the-art language models such as Bidirectional Encoder Representations from Transformers (BERT) and the Manhattan LSTM Model (MaLSTM). We train and test the models on a subset of the Quora duplicate questions dataset in the medical area. The experimental results show that our model achieves superior performance compared with these existing methods. |
Tasks | Chatbot |
Published | 2020-02-08 |
URL | https://arxiv.org/abs/2002.03140v1 |
PDF | https://arxiv.org/pdf/2002.03140v1.pdf |
PWC | https://paperswithcode.com/paper/hhh-an-online-medical-chatbot-system-based-on-1 |
Repo | |
Framework | |
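The abstract names the architecture (Hierarchical BiLSTM Attention Model) but not its details, so the following PyTorch sketch is only a guess at the general shape: a BiLSTM encoder with additive attention pooling, scored with a Manhattan-style similarity as in MaLSTM. Layer sizes and the similarity head are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMAttentionEncoder(nn.Module):
    """Illustrative sentence encoder: BiLSTM + additive attention pooling."""
    def __init__(self, vocab_size, emb_dim=128, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)

    def forward(self, tokens):                    # tokens: (B, T) int ids
        h, _ = self.lstm(self.emb(tokens))        # (B, T, 2H)
        w = torch.softmax(self.att(h), dim=1)     # attention weights (B, T, 1)
        return (w * h).sum(dim=1)                 # pooled sentence vector

def manhattan_similarity(u, v):
    """MaLSTM-style score in (0, 1]: exp(-||u - v||_1)."""
    return torch.exp(-(u - v).abs().sum(dim=1))

# enc = BiLSTMAttentionEncoder(vocab_size=30000)
# score = manhattan_similarity(enc(q1_tokens), enc(q2_tokens))
```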
Can graph neural networks count substructures?
Title | Can graph neural networks count substructures? |
Authors | Zhengdao Chen, Lei Chen, Soledad Villar, Joan Bruna |
Abstract | The ability to detect and count certain substructures in graphs is important for solving many tasks on graph-structured data, especially in the contexts of computational chemistry and biology as well as social network analysis. Inspired by this, we propose to study the expressive power of graph neural networks (GNNs) via their ability to count attributed graph substructures, extending recent works that examine their power in graph isomorphism testing and function approximation. We distinguish between two types of substructure counting: matching-count and containment-count, and establish both positive and negative answers for popular GNN architectures. Specifically, we prove that Message Passing Neural Networks (MPNNs), 2-Weisfeiler-Lehman (2-WL) and 2-Invariant Graph Networks (2-IGNs) cannot perform matching-count of substructures consisting of 3 or more nodes, while they can perform containment-count of star-shaped substructures. We also prove positive results for k-WL and k-IGNs, as well as negative results for k-WL with a limited number of iterations. We then conduct experiments that support the theoretical results for MPNNs and 2-IGNs, and demonstrate that local relational pooling strategies inspired by Murphy et al. (2019) are more effective for substructure counting. In addition, as an intermediate step, we prove that 2-WL and 2-IGNs are equivalent in distinguishing non-isomorphic graphs, partly answering an open problem raised in Maron et al. (2019). |
Tasks | |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.04025v2 |
PDF | https://arxiv.org/pdf/2002.04025v2.pdf |
PWC | https://paperswithcode.com/paper/can-graph-neural-networks-count-substructures |
Repo | |
Framework | |
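One concrete reading of the two counting notions, with matching-count as induced sub-isomorphism counting and containment-count as non-induced counting (this mapping of terms is our assumption), can be checked with networkx:

```python
import networkx as nx
from networkx.algorithms import isomorphism

def count_mappings(G, pattern, induced=True):
    """Count sub-isomorphism mappings of `pattern` into G.

    induced=True also requires non-edges to match (one reading of
    "matching-count"); induced=False only requires pattern edges to be
    present ("containment-count"). This counts labeled mappings, so each
    unordered copy appears once per automorphism of the pattern.
    """
    gm = isomorphism.GraphMatcher(G, pattern)
    it = (gm.subgraph_isomorphisms_iter() if induced
          else gm.subgraph_monomorphisms_iter())
    return sum(1 for _ in it)

triangle = nx.cycle_graph(3)
G = nx.complete_graph(4)                       # K4 contains 4 triangles
print(count_mappings(G, triangle, induced=False) // 6)  # 24 maps / |Aut| = 6
```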
Replica Exchange for Non-Convex Optimization
Title | Replica Exchange for Non-Convex Optimization |
Authors | Jing Dong, Xin T. Tong |
Abstract | Gradient descent (GD) is known to converge quickly for convex objective functions, but it can be trapped at local minima. On the other hand, Langevin dynamics (LD) can explore the state space and find global minima, but in order to give accurate estimates, LD needs to run with a small discretization stepsize and weak stochastic force, which in general slows down its convergence. This paper shows that these two algorithms can “collaborate” through a simple exchange mechanism, in which they swap their current positions if LD yields a lower objective function. This idea can be seen as the singular limit of the replica exchange technique from the sampling literature. We show that this new algorithm converges to the global minimum linearly with high probability, assuming the objective function is strongly convex in a neighborhood of the unique global minimum. By replacing gradients with stochastic gradients, and adding a proper threshold to the exchange mechanism, our algorithm can also be used in online settings. We further verify our theoretical results through numerical experiments, and observe superior performance of the proposed algorithm over running GD or LD alone. |
Tasks | |
Published | 2020-01-23 |
URL | https://arxiv.org/abs/2001.08356v1 |
PDF | https://arxiv.org/pdf/2001.08356v1.pdf |
PWC | https://paperswithcode.com/paper/replica-exchange-for-non-convex-optimization |
Repo | |
Framework | |
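The exchange mechanism itself is simple enough to sketch directly from the abstract. Step size, temperature, and the test function below are illustrative choices, not the paper's settings:

```python
import numpy as np

def replica_exchange(f, grad, x0, steps=5000, lr=0.01, temp=1.0,
                     rng=np.random.default_rng()):
    """GD and LD replicas with the paper's simple swap rule (a sketch):
    if the Langevin replica finds a lower objective, the two swap positions.
    """
    x_gd, x_ld = np.array(x0, float), np.array(x0, float)
    for _ in range(steps):
        x_gd = x_gd - lr * grad(x_gd)                       # GD: exploit
        x_ld = (x_ld - lr * grad(x_ld)                      # LD: explore
                + np.sqrt(2 * lr * temp) * rng.standard_normal(x_ld.shape))
        if f(x_ld) < f(x_gd):                               # exchange
            x_gd, x_ld = x_ld.copy(), x_gd.copy()
    return x_gd

# Double-well example: GD alone stalls in the poor local minimum near x = 1.
f = lambda x: (x**2 - 1.0)**2 + 0.3 * x
g = lambda x: 4 * x * (x**2 - 1.0) + 0.3
print(replica_exchange(f, g, x0=np.array([1.2])))   # ~ global minimum near -1
```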
A Game-Theoretic Model of Human Driving and Application to Discretionary Lane-Changes
Title | A Game-Theoretic Model of Human Driving and Application to Discretionary Lane-Changes |
Authors | Jehong Yoo, Reza Langari |
Abstract | In this paper we consider the application of Stackelberg game theory to model discretionary lane-changing in a lightly congested highway setting. The fundamental intent of this model, which is parameterized to capture driver disposition (aggressiveness or inattentiveness), is to help with the development of decision-making strategies for autonomous vehicles in ways that are mindful of how human drivers perform the same function on the road (on which we have reported elsewhere). This paper, however, focuses only on the model development and the respective qualitative assessment. This is accomplished in unit-test simulations as well as in bulk mode (i.e., using the Monte Carlo methodology), via a limited traffic micro-simulation compared against the NHTSA 100-Car Naturalistic Driving Safety data. In particular, a qualitative comparison shows the relative consistency of the proposed model with human decision-making in terms of producing qualitatively similar proportions of crashes and near-crashes as a function of driver inattentiveness (or aggressiveness). While this result by itself does not offer a true quantitative validation of the proposed model, it does demonstrate the utility of the proposed approach in modeling discretionary lane-changing, and may therefore be of use in autonomous driving in a manner that is consistent with human decision making on the road. |
Tasks | Autonomous Driving, Autonomous Vehicles, Decision Making |
Published | 2020-03-22 |
URL | https://arxiv.org/abs/2003.09783v1 |
PDF | https://arxiv.org/pdf/2003.09783v1.pdf |
PWC | https://paperswithcode.com/paper/a-game-theoretic-model-of-human-driving-and |
Repo | |
Framework | |
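A minimal sketch of the Stackelberg structure the abstract refers to, over discrete action sets. The payoff matrices below are toy numbers; in the paper's model they would be parameterized by driver disposition:

```python
import numpy as np

def stackelberg(leader_payoff, follower_payoff):
    """Pure-strategy Stackelberg solution over discrete action grids.

    The leader (e.g., the lane-changing vehicle) commits to the action that
    maximizes its payoff given that the follower best-responds. Payoff
    matrices are indexed [leader_action, follower_action].
    """
    best = None
    for a in range(leader_payoff.shape[0]):
        br = int(np.argmax(follower_payoff[a]))      # follower's best response
        if best is None or leader_payoff[a, br] > best[2]:
            best = (a, br, leader_payoff[a, br])
    return best  # (leader action, follower response, leader payoff)

# Toy 2x2: leader actions {stay, change}, follower actions {yield, block}.
L = np.array([[1.0, 1.0], [3.0, -2.0]])
F = np.array([[0.0, 1.0], [2.0, -1.0]])
print(stackelberg(L, F))   # -> (1, 0, 3.0): change lane, follower yields
```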
Data-based computation of stabilizing minimum dwell times for discrete-time switched linear systems
Title | Data-based computation of stabilizing minimum dwell times for discrete-time switched linear systems |
Authors | Atreyee Kundu |
Abstract | We present an algorithm to compute stabilizing minimum dwell times for discrete-time switched linear systems without the explicit knowledge of state-space models of their subsystems. Given a set of finite traces of state trajectories of the subsystems that satisfies certain properties, our algorithm involves the following tasks: first, multiple Lyapunov functions are designed from the given data; second, a set of relevant scalars is computed from these functions; and third, a stabilizing minimum dwell time is determined as a function of these scalars. A numerical example is presented to demonstrate the proposed algorithm. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02087v2 |
PDF | https://arxiv.org/pdf/2002.02087v2.pdf |
PWC | https://paperswithcode.com/paper/data-based-computation-of-stabilizing-minimum |
Repo | |
Framework | |
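The abstract leaves the algorithm's internals to the paper; for orientation, the classical Lyapunov-based dwell-time bound that such scalars plug into looks like this. The decay factors `lmbda` and comparison constant `mu` are exactly the kind of quantities the paper estimates from data; the formula is the textbook bound, not necessarily the paper's:

```python
import numpy as np

def min_dwell_time(lmbda, mu):
    """Classical minimum-dwell-time bound for switched linear systems.

    Assumes each subsystem i has a Lyapunov function with one-step decay
    V_i(x+) <= lmbda[i] * V_i(x), lmbda[i] < 1, and the functions are
    mutually comparable: V_i <= mu * V_j for all i, j with mu >= 1. Then
    any dwell time tau > ln(mu) / ln(1 / max(lmbda)) guarantees stability.
    """
    lam_max = max(lmbda)
    assert lam_max < 1 and mu >= 1
    return int(np.ceil(np.log(mu) / np.log(1.0 / lam_max)))

print(min_dwell_time([0.9, 0.8], mu=2.0))   # -> 7 steps
```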
Minimizing Dynamic Regret and Adaptive Regret Simultaneously
Title | Minimizing Dynamic Regret and Adaptive Regret Simultaneously |
Authors | Lijun Zhang, Shiyin Lu, Tianbao Yang |
Abstract | Regret minimization is treated as the golden rule in the traditional study of online learning. However, regret-minimization algorithms tend to converge to the static optimum, and are thus suboptimal for changing environments. To address this limitation, new performance measures, including dynamic regret and adaptive regret, have been proposed to guide the design of online algorithms. The former aims to minimize the global regret with respect to a sequence of changing comparators, and the latter attempts to minimize every local regret with respect to a fixed comparator. Existing algorithms for dynamic regret and adaptive regret are developed independently, and each targets only one performance measure. In this paper, we bridge this gap by proposing novel online algorithms that minimize the dynamic regret and adaptive regret simultaneously. In fact, our theoretical guarantee is even stronger, in the sense that one algorithm is able to minimize the dynamic regret over any interval. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02085v1 |
PDF | https://arxiv.org/pdf/2002.02085v1.pdf |
PWC | https://paperswithcode.com/paper/minimizing-dynamic-regret-and-adaptive-regret |
Repo | |
Framework | |
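For reference, the two measures can be written as follows (standard definitions, not notation taken from the paper):

```latex
% Dynamic regret: measured against a changing comparator sequence u_1, ..., u_T
\text{D-Regret}(u_1,\dots,u_T) \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \sum_{t=1}^{T} f_t(u_t)

% Adaptive regret: the worst static regret over any interval [r, s]
\text{A-Regret}(T) \;=\; \max_{1 \le r \le s \le T} \Big( \sum_{t=r}^{s} f_t(x_t) \;-\; \min_{u} \sum_{t=r}^{s} f_t(u) \Big)
```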
How the Brain might use Division
Title | How the Brain might use Division |
Authors | Kieran Greer |
Abstract | One of the most fundamental questions in Biology or Artificial Intelligence is how the human brain performs mathematical functions. How does a neural architecture that may organise itself mostly through statistics know what to do? One possibility is to recast the problem as something more abstract. This becomes clear when thinking about how the brain handles large numbers, for example numbers raised to a power, when simply summing to an answer is not feasible. In this paper, the author suggests that the maths question can be answered more easily if the problem is changed into one of symbol manipulation and not just number counting. If symbols can be compared and manipulated, perhaps without completely understanding what they are, then the mathematical operations become relative and some of them might even be rote-learned. The proposed system may also be suggested as an alternative to the traditional computer binary system. Any of the actual maths still breaks down into binary operations, while a more symbolic level above that can manipulate the numbers and reduce the problem size, thus making the binary operations simpler. An interesting result of this view is the possibility of a new fractal equation resulting from division, which can be used as a measure of good fit and would help the brain decide how to solve something through self-replacement and a comparison with this good fit. |
Tasks | |
Published | 2020-03-11 |
URL | https://arxiv.org/abs/2003.05320v2 |
PDF | https://arxiv.org/pdf/2003.05320v2.pdf |
PWC | https://paperswithcode.com/paper/how-the-brain-might-use-division |
Repo | |
Framework | |
Designing GANs: A Likelihood Ratio Approach
Title | Designing GANs: A Likelihood Ratio Approach |
Authors | Kalliopi Basioti, George V. Moustakides |
Abstract | We are interested in the design of generative adversarial networks. The training of these mathematical structures requires the definition of proper min-max optimization problems. We propose a simple methodology for constructing such problems while assuring, at the same time, that they provide the correct answer. We give characteristic examples developed by our method, some of which can be recognized from other applications and some introduced for the first time. We compare various possibilities by applying them to well-known datasets using neural networks of different configurations and sizes. |
Tasks | |
Published | 2020-02-03 |
URL | https://arxiv.org/abs/2002.00865v2 |
PDF | https://arxiv.org/pdf/2002.00865v2.pdf |
PWC | https://paperswithcode.com/paper/designing-gans-a-likelihood-ratio-approach |
Repo | |
Framework | |
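As one well-known member of the family of min-max problems the paper studies, here is the standard cross-entropy GAN criterion in PyTorch. This is a generic illustration, not the authors' construction:

```python
import torch

def gan_minmax_losses(d_real, d_fake):
    """Standard cross-entropy GAN criterion. `d_real`/`d_fake` are
    discriminator outputs in (0, 1); at the discriminator's optimum,
    D(x) = p_data(x) / (p_data(x) + p_gen(x)), a likelihood-ratio quantity.
    """
    eps = 1e-8
    d_loss = -(torch.log(d_real + eps).mean()
               + torch.log(1 - d_fake + eps).mean())   # D maximizes
    g_loss = torch.log(1 - d_fake + eps).mean()        # G minimizes
    return d_loss, g_loss
```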
Residual-Recursion Autoencoder for Shape Illustration Images
Title | Residual-Recursion Autoencoder for Shape Illustration Images |
Authors | Qianwei Zhou, Peng Tao, Xiaoxin Li, Shengyong Chen, Fan Zhang, Haigen Hu |
Abstract | Shape illustration images (SIIs) are common and important in describing the cross-sections of industrial products. Like MNIST, the handwritten digit images, SIIs are grayscale or binary, containing shapes surrounded by large areas of blank space. In this work, the Residual-Recursion Autoencoder (RRAE) is proposed to extract low-dimensional features from SIIs while keeping reconstruction accuracy as high as possible. RRAE tries to reconstruct the original image several times, recursively filling the latest residual image into the reserved channel of the encoder’s input before the next reconstruction attempt. As a neural network training framework, RRAE can wrap other autoencoders and increase their performance. In our experiments, the reconstruction loss is decreased by 86.47% for a convolutional autoencoder with high-resolution SIIs, 10.77% for a variational autoencoder and 8.06% for a conditional variational autoencoder with MNIST. |
Tasks | |
Published | 2020-02-06 |
URL | https://arxiv.org/abs/2002.02063v1 |
PDF | https://arxiv.org/pdf/2002.02063v1.pdf |
PWC | https://paperswithcode.com/paper/residual-recursion-autoencoder-for-shape |
Repo | |
Framework | |
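The recursion described in the abstract is easy to sketch. The channel layout and loss weighting below are assumptions; `autoencoder` is any wrapped model mapping a two-channel input (image plus reserved residual channel) back to a one-channel reconstruction:

```python
import torch

def rrae_forward(autoencoder, x, recursions=3):
    """Residual-Recursion sketch: reconstruct several times, feeding the
    latest residual image into a reserved input channel each round.

    x: images of shape (B, 1, H, W); autoencoder: (B, 2, H, W) -> (B, 1, H, W),
    where channel 0 is the image and channel 1 is the reserved channel
    (zeros on the first pass).
    """
    residual = torch.zeros_like(x)
    losses = []
    for _ in range(recursions):
        recon = autoencoder(torch.cat([x, residual], dim=1))
        residual = x - recon            # latest residual, fed back next round
        losses.append((residual ** 2).mean())
    return recon, sum(losses)           # wrapped AE trains on the summed loss
```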
Informal Data Transformation Considered Harmful
Title | Informal Data Transformation Considered Harmful |
Authors | Eric Daimler, Ryan Wisnesky |
Abstract | In this paper we take the common position that AI systems are limited more by the integrity of the data they are learning from than by the sophistication of their algorithms, and we take the uncommon position that the solution to achieving better data integrity in the enterprise is not to clean and validate data ex post facto whenever needed (the so-called data lake approach to data management, which can lead to data scientists spending 80% of their time cleaning data), but rather to formally and automatically guarantee that data integrity is preserved as data is transformed (migrated, integrated, composed, queried, viewed, etc.) throughout the enterprise, so that data, and the programs that depend on it, need not constantly be re-validated for every particular use. |
Tasks | |
Published | 2020-01-02 |
URL | https://arxiv.org/abs/2001.00338v1 |
PDF | https://arxiv.org/pdf/2001.00338v1.pdf |
PWC | https://paperswithcode.com/paper/informal-data-transformation-considered |
Repo | |
Framework | |