Paper Group ANR 377
Regime Switching Bandits. Efficient Debiased Variational Bayes by Multilevel Monte Carlo Methods. Radial Based Analysis of GRNN in Non-Textured Image Inpainting. Optimizing Memory-Access Patterns for Deep Learning Accelerators. Symmetric Skip Connection Wasserstein GAN for High-Resolution Facial Image Inpainting. Towards Automating the AI Operation …
Regime Switching Bandits
Title | Regime Switching Bandits |
Authors | Xiang Zhou, Ningyuan Chen, Xuefeng Gao, Yi Xiong |
Abstract | We study a multi-armed bandit problem where the rewards exhibit regime switching. Specifically, the distributions of the random rewards generated from all arms depend on a common underlying state modeled as a finite-state Markov chain. The agent does not observe the underlying state and has to learn the unknown transition probability matrix as well as the reward distribution. We propose an efficient learning algorithm for this problem, building on spectral method-of-moments estimations for hidden Markov models and upper confidence bound methods for reinforcement learning. We also establish an $O(T^{2/3}\sqrt{\log T})$ bound on the regret of the proposed learning algorithm, where $T$ is the unknown horizon. Finally, we conduct numerical experiments to illustrate the effectiveness of the learning algorithm. (A toy sketch of the setting appears after this entry.) |
Tasks | |
Published | 2020-01-26 |
URL | https://arxiv.org/abs/2001.09390v1 |
https://arxiv.org/pdf/2001.09390v1.pdf | |
PWC | https://paperswithcode.com/paper/regime-switching-bandits |
Repo | |
Framework | |
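The paper's algorithm combines spectral estimation of the hidden Markov model with optimism-based exploration; that method is not reproduced here. The sketch below is purely illustrative (the transition matrix, reward means and horizon are made-up values): it sets up a two-state regime-switching bandit and runs a state-agnostic UCB1 baseline against it, which is the kind of learner the paper improves upon.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-state regime-switching bandit (not the paper's experiments).
P = np.array([[0.95, 0.05],      # hidden-state transition matrix
              [0.10, 0.90]])
means = np.array([[0.2, 0.8],    # mean reward of arm k in state s: means[s, k]
                  [0.7, 0.3]])

T = 5000
n_arms = means.shape[1]
state = 0

# Naive UCB1 baseline that ignores the hidden Markov state.
counts = np.zeros(n_arms)
sums = np.zeros(n_arms)
total_reward = 0.0
for t in range(1, T + 1):
    if t <= n_arms:
        arm = t - 1                                    # play each arm once
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))
    reward = rng.binomial(1, means[state, arm])        # Bernoulli reward from the current regime
    counts[arm] += 1
    sums[arm] += reward
    total_reward += reward
    state = rng.choice(2, p=P[state])                  # regime evolves as a Markov chain

print("average reward of state-agnostic UCB1:", total_reward / T)
```

A learner in the spirit of the paper would additionally estimate `P` and the per-state reward means from the observed reward sequence and act optimistically with respect to the resulting belief over regimes.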
Efficient Debiased Variational Bayes by Multilevel Monte Carlo Methods
Title | Efficient Debiased Variational Bayes by Multilevel Monte Carlo Methods |
Authors | Kei Ishikawa, Takashi Goda |
Abstract | Variational Bayes is a method to find a good approximation of the posterior probability distribution of latent variables from a parametric family of distributions. The evidence lower bound (ELBO), which is nothing but the model evidence minus the Kullback-Leibler divergence, has been commonly used as a quality measure in the optimization process. However, the model evidence itself has been considered computationally intractable since it is expressed as a nested expectation with an outer expectation with respect to the training dataset and an inner conditional expectation with respect to latent variables. Similarly, if the Kullback-Leibler divergence is replaced with another divergence metric, the corresponding lower bound on the model evidence is often given by such a nested expectation. The standard (nested) Monte Carlo method can be used to estimate such quantities, whereas the resulting estimate is biased and the variance is often quite large. Recently the authors provided an unbiased estimator of the model evidence with small variance by applying the idea from multilevel Monte Carlo (MLMC) methods. In this article, we give more examples involving nested expectations in the context of variational Bayes where MLMC methods can help construct low-variance unbiased estimators, and provide numerical results which demonstrate the effectiveness of our proposed estimators. (A small numerical sketch follows this entry.) |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04676v1 |
https://arxiv.org/pdf/2001.04676v1.pdf | |
PWC | https://paperswithcode.com/paper/efficient-debiased-variational-bayes-by |
Repo | |
Framework | |
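The core object in the abstract is a nested expectation of the form $E_Y[f(E_{Z|Y}[g(Y,Z)])]$. The sketch below is a minimal numerical illustration on a toy problem with a known answer, comparing plain nested Monte Carlo with an antithetic multilevel telescoping estimator. It is not the authors' estimator (which randomizes the level to remove the remaining bias entirely), and the test integrand and sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nested expectation with a known value:
#   I = E_Y[ log E_Z[ exp(Y + Z) ] ] = E_Y[ Y + 0.5 ] = 0.5,   Y, Z ~ N(0, 1).
def inner_samples(y, n):
    return np.exp(y + rng.standard_normal(n))

def naive_nested_mc(n_outer, n_inner):
    ys = rng.standard_normal(n_outer)
    return np.mean([np.log(inner_samples(y, n_inner).mean()) for y in ys])

def mlmc_correction(y, level):
    """Antithetic level-l correction for one outer sample y (2^(l+1) inner draws)."""
    n = 2 ** level
    z = inner_samples(y, 2 * n)
    fine = np.log(z.mean())
    coarse = 0.5 * (np.log(z[:n].mean()) + np.log(z[n:].mean()))
    return fine - coarse

def mlmc_nested(n_outer, max_level):
    ys = rng.standard_normal(n_outer)
    # Level 0 uses a single inner sample; higher levels add antithetic corrections.
    est = np.mean([np.log(inner_samples(y, 1).mean()) for y in ys])
    for level in range(max_level):
        est += np.mean([mlmc_correction(y, level) for y in ys])
    return est

print("exact value      :", 0.5)
print("naive nested MC  :", naive_nested_mc(2000, 8))    # visibly biased low
print("MLMC telescoping :", mlmc_nested(2000, 6))        # much closer on average
```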
Radial Based Analysis of GRNN in Non-Textured Image Inpainting
Title | Radial Based Analysis of GRNN in Non-Textured Image Inpainting |
Authors | Karthik R, Anvita Dwivedi, Haripriya M, Bharath K P, Rajesh Kumar M |
Abstract | Image inpainting algorithms restore damaged or missing regions of an image based on the surrounding information. The method proposed in this paper applies a radial-based analysis of image inpainting using GRNN. The damaged areas are first isolated from the rest of the image, arranged by size, and then inpainted using GRNN. The neural network is trained with different radii to achieve a better outcome. A comparative analysis is carried out for different regression-based algorithms, and the overall results are compared with those achieved by other algorithms such as LS-SVM with reference to the PSNR value. (An illustrative sketch follows this entry.) |
Tasks | Image Inpainting |
Published | 2020-01-13 |
URL | https://arxiv.org/abs/2001.04215v1 |
https://arxiv.org/pdf/2001.04215v1.pdf | |
PWC | https://paperswithcode.com/paper/radial-based-analysis-of-grnn-in-non-textured |
Repo | |
Framework | |
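A GRNN is, in essence, Nadaraya-Watson kernel regression with a Gaussian kernel whose spread parameter plays the role of the radius analysed in the paper. The sketch below is a toy illustration only: a synthetic gradient image with a square hole is inpainted by regressing pixel intensity on pixel coordinates for several radii, and the PSNR of the filled region is reported. The image, hole and radii are invented, and the paper's isolation and size-ordering of damaged regions is omitted.

```python
import numpy as np

def grnn_predict(train_x, train_y, query_x, sigma):
    """GRNN = Nadaraya-Watson kernel regression with a Gaussian kernel;
    `sigma` plays the role of the radius / spread parameter."""
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ train_y) / np.clip(w.sum(axis=1), 1e-12, None)

# Toy "image": a smooth gradient with a square hole punched in the middle.
h, w = 32, 32
yy, xx = np.mgrid[0:h, 0:w]
image = 0.5 * xx / w + 0.5 * yy / h
mask = np.zeros_like(image, dtype=bool)
mask[12:20, 12:20] = True                      # True = missing pixels

coords = np.stack([yy.ravel(), xx.ravel()], axis=1).astype(float)
known = ~mask.ravel()
for sigma in (1.0, 2.0, 4.0):                  # different radii, as in the paper's analysis
    pred = grnn_predict(coords[known], image.ravel()[known], coords[~known], sigma)
    mse = np.mean((pred - image.ravel()[~known]) ** 2)
    psnr = 10 * np.log10(1.0 / mse)
    print(f"sigma={sigma}: PSNR of the inpainted region = {psnr:.1f} dB")
```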
Optimizing Memory-Access Patterns for Deep Learning Accelerators
Title | Optimizing Memory-Access Patterns for Deep Learning Accelerators |
Authors | Hongbin Zheng, Sejong Oh, Huiqing Wang, Preston Briggs, Jiading Gai, Animesh Jain, Yizhi Liu, Rich Heaton, Randy Huang, Yida Wang |
Abstract | Deep learning (DL) workloads are moving towards accelerators for faster processing and lower cost. Modern DL accelerators are good at handling the large-scale multiply-accumulate operations that dominate DL workloads; however, it is challenging to make full use of the compute power of an accelerator since the data must be properly staged in a software-managed scratchpad memory. Failing to do so can result in significant performance loss. This paper proposes a systematic approach which leverages the polyhedral model to analyze all operators of a DL model together to minimize the number of memory accesses. Experiments show that our approach can substantially reduce the impact of memory accesses required by common neural-network models on a homegrown AWS machine-learning inference chip named Inferentia, which is available through Amazon EC2 Inf1 instances. (A hand-written tiling sketch follows this entry.) |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.12798v1 |
https://arxiv.org/pdf/2002.12798v1.pdf | |
PWC | https://paperswithcode.com/paper/optimizing-memory-access-patterns-for-deep |
Repo | |
Framework | |
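The paper's contribution is a polyhedral-model analysis inside the compiler for Inferentia; that machinery is not reproduced here. As a hand-written analogue of the kind of schedule it searches for, the sketch below compares the number of elements that must be staged into a scratchpad-like buffer for a naive versus a tiled matrix multiply; the tile size and matrix sizes are arbitrary.

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Matrix multiply that stages square output tiles, reusing each staged
    A-row block and B-column block across a whole tile of outputs, which is the
    kind of reuse a scratchpad-aware schedule tries to maximise."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    C = np.zeros((M, N), dtype=A.dtype)
    loads = 0                                  # elements copied into the "scratchpad"
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            a_blk = A[i0:i0 + tile, :]         # staged once per output tile
            b_blk = B[:, j0:j0 + tile]
            loads += a_blk.size + b_blk.size
            C[i0:i0 + tile, j0:j0 + tile] = a_blk @ b_blk
    return C, loads

rng = np.random.default_rng(0)
M = N = K = 256
A, B = rng.standard_normal((M, K)), rng.standard_normal((K, N))

C, tiled_loads = tiled_matmul(A, B, tile=32)
naive_loads = M * N * 2 * K                    # re-fetch a row of A and a column of B per output
print("matches np.dot:", np.allclose(C, A @ B))
print("naive loads :", naive_loads)
print("tiled loads :", tiled_loads, f"({naive_loads / tiled_loads:.0f}x fewer)")
```

The tiled variant touches main memory roughly `tile` times less often for the A and B operands, which is exactly the sort of reuse a scratchpad-aware schedule is meant to expose.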
Symmetric Skip Connection Wasserstein GAN for High-Resolution Facial Image Inpainting
Title | Symmetric Skip Connection Wasserstein GAN for High-Resolution Facial Image Inpainting |
Authors | Jireh Jam, Connah Kendrick, Vincent Drouard, Kevin Walker, Gee-Sern Hsu, Moi Hoon Yap |
Abstract | We propose a Symmetric Skip Connection Wasserstein Generative Adversarial Network (S-WGAN) for high-resolution facial image inpainting. The architecture is an encoder-decoder with convolutional blocks, linked by skip connections. The encoder is a feature extractor that captures data abstractions of an input image to learn an end-to-end mapping from an input (binary masked image) to the ground truth. The decoder uses the learned abstractions to reconstruct the image. With skip connections, S-WGAN transfers image details to the decoder. In addition, we propose a Wasserstein-Perceptual loss function to preserve colour and maintain realism in the reconstructed image. We evaluate our method and the state-of-the-art methods on the CelebA-HQ dataset. Our results show that S-WGAN produces sharper and more realistic images when visually compared with other methods. The quantitative measures show that our proposed S-WGAN achieves the best Structural Similarity Index Measure (SSIM) of 0.94. (A minimal architecture sketch follows this entry.) |
Tasks | Image Inpainting |
Published | 2020-01-11 |
URL | https://arxiv.org/abs/2001.03725v1 |
https://arxiv.org/pdf/2001.03725v1.pdf | |
PWC | https://paperswithcode.com/paper/symmetric-skip-connection-wasserstein-gan-for |
Repo | |
Framework | |
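The sketch below shows only the generator side of such a model: a small encoder-decoder with symmetric skip connections, written in PyTorch. Channel widths, depths and the input size are arbitrary, and the Wasserstein critic and the proposed Wasserstein-Perceptual loss are omitted; it is a structural sketch, not the authors' network.

```python
import torch
import torch.nn as nn

class SkipEncoderDecoder(nn.Module):
    """Tiny encoder-decoder with symmetric skip connections (generator only)."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(4, ch, 3, stride=2, padding=1), nn.ReLU())      # masked RGB + mask
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1), nn.ReLU())
        self.dec3 = nn.Sequential(nn.ConvTranspose2d(ch * 4, ch * 2, 4, stride=2, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(ch * 4, ch, 4, stride=2, padding=1), nn.ReLU())
        self.dec1 = nn.ConvTranspose2d(ch * 2, 3, 4, stride=2, padding=1)

    def forward(self, masked_rgb, mask):
        x = torch.cat([masked_rgb, mask], dim=1)
        e1 = self.enc1(x)                               # H/2
        e2 = self.enc2(e1)                              # H/4
        e3 = self.enc3(e2)                              # H/8
        d3 = self.dec3(e3)                              # H/4
        d2 = self.dec2(torch.cat([d3, e2], dim=1))      # skip connection
        out = self.dec1(torch.cat([d2, e1], dim=1))     # skip connection
        return torch.sigmoid(out)

# Shape check with a random 128x128 "masked face" batch.
net = SkipEncoderDecoder()
rgb, mask = torch.rand(2, 3, 128, 128), torch.rand(2, 1, 128, 128).round()
print(net(rgb * mask, mask).shape)                      # torch.Size([2, 3, 128, 128])
```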
Towards Automating the AI Operations Lifecycle
Title | Towards Automating the AI Operations Lifecycle |
Authors | Matthew Arnold, Jeffrey Boston, Michael Desmond, Evelyn Duesterwald, Benjamin Elder, Anupama Murthi, Jiri Navratil, Darrell Reimer |
Abstract | Today’s AI deployments often require significant human involvement and skill in the operational stages of the model lifecycle, including pre-release testing, monitoring, problem diagnosis and model improvements. We present a set of enabling technologies that can be used to increase the level of automation in AI operations, thus lowering the human effort required. Since a common source of human involvement is the need to assess the performance of deployed models, we focus on technologies for performance prediction and KPI analysis and show how they can be used to improve automation in the key stages of a typical AI operations pipeline. |
Tasks | |
Published | 2020-03-28 |
URL | https://arxiv.org/abs/2003.12808v1 |
https://arxiv.org/pdf/2003.12808v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-automating-the-ai-operations |
Repo | |
Framework | |
Random Machines Regression Approach: an ensemble support vector regression model with free kernel choice
Title | Random Machines Regression Approach: an ensemble support vector regression model with free kernel choice |
Authors | Anderson Ara, Mateus Maia, Samuel Macêdo, Francisco Louzada |
Abstract | Machine learning techniques always aim to reduce the generalized prediction error. Ensemble methods offer a good way to reduce it by combining several models, which results in greater forecasting capacity. Random Machines have already been demonstrated to be a strong technique, i.e., one with high predictive power, for classification tasks. In this article we propose a procedure that applies the bagged-weighted support vector model to regression problems. Simulation studies were carried out on artificial datasets as well as on real-data benchmarks. The results show good performance of Regression Random Machines, with lower generalization error and without the need to choose the best kernel function during the tuning process. (A simplified sketch follows this entry.) |
Tasks | |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12643v1 |
https://arxiv.org/pdf/2003.12643v1.pdf | |
PWC | https://paperswithcode.com/paper/random-machines-regression-approach-an |
Repo | |
Framework | |
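The sketch below is a simplified reading of the bagged-weighted idea using scikit-learn: kernels are sampled with probabilities derived from validation error, each bootstrap replicate trains an SVR with a sampled kernel, and predictions are combined with out-of-bag weights. The dataset, number of replicates and weighting formula are illustrative choices, not the authors' exact specification.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.datasets import make_friedman1

rng = np.random.default_rng(0)
X, y = make_friedman1(n_samples=400, noise=1.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

kernels = ["linear", "poly", "rbf", "sigmoid"]

# Kernel sampling probabilities from validation performance (inverse MSE),
# a simplified stand-in for the weighting scheme of Random Machines.
X_fit, X_val, y_fit, y_val = train_test_split(X_tr, y_tr, test_size=0.3, random_state=1)
inv_mse = np.array([
    1.0 / mean_squared_error(y_val, SVR(kernel=k).fit(X_fit, y_fit).predict(X_val))
    for k in kernels
])
probs = inv_mse / inv_mse.sum()

# Bagging: bootstrap the training set and draw a kernel for each base SVR.
B = 25
models, weights = [], []
for _ in range(B):
    idx = rng.integers(0, len(X_tr), len(X_tr))
    kernel = rng.choice(kernels, p=probs)
    m = SVR(kernel=kernel).fit(X_tr[idx], y_tr[idx])
    oob = np.setdiff1d(np.arange(len(X_tr)), idx)          # out-of-bag points weight the model
    w = 1.0 / (mean_squared_error(y_tr[oob], m.predict(X_tr[oob])) + 1e-12)
    models.append(m)
    weights.append(w)
weights = np.array(weights) / np.sum(weights)

pred = sum(w * m.predict(X_te) for w, m in zip(weights, models))
print("ensemble test MSE:", mean_squared_error(y_te, pred))
print("single RBF SVR   :", mean_squared_error(y_te, SVR(kernel="rbf").fit(X_tr, y_tr).predict(X_te)))
```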
Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization
Title | Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization |
Authors | Guojun Zhang, Pascal Poupart, Yaoliang Yu |
Abstract | Convergence to a saddle point for convex-concave functions has been studied for decades, while the last few years have seen a surge of interest in non-convex-non-concave min-max optimization due to the rise of deep learning. However, it remains an intriguing research challenge how local optimal points should be defined and which algorithms can converge to such points. We study definitions of “local min-max (max-min)” points and provide an elegant unification, with the corresponding first- and second-order necessary and sufficient conditions. Specifically, we show that quadratic games, which are often used as illustrative examples and approximations of smooth functions, are too special, both locally and globally. Lastly, we analyze the exact conditions for local convergence of several popular gradient algorithms near the “local min-max” points defined earlier, identify “valid” hyper-parameters and compare the respective stable sets. Our results offer insights into the necessity of two-time-scale algorithms and the limitation of the commonly used approach based on ordinary differential equations. (A toy dynamics sketch follows this entry.) |
Tasks | |
Published | 2020-02-27 |
URL | https://arxiv.org/abs/2002.11875v1 |
https://arxiv.org/pdf/2002.11875v1.pdf | |
PWC | https://paperswithcode.com/paper/optimality-and-stability-in-non-convex-non |
Repo | |
Framework | |
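To make the two-time-scale point concrete, the sketch below simulates simultaneous gradient descent-ascent on a deliberately simple quadratic game (which, as the abstract notes, is too special to capture the general theory): with equal step sizes the iterates spiral away from the min-max solution at the origin, while a slower minimizing player and a faster maximizing player converge. The objective and step sizes are illustrative only.

```python
import numpy as np

# f(x, y) = -x**2/2 + 2*x*y - y**2/2 : non-convex in x, concave in y,
# with the min-max solution (min over x of max over y) at the origin.
def gda(step_x, step_y, iters=400):
    x, y = 1.0, 1.0
    for _ in range(iters):
        gx = -x + 2 * y                                # df/dx
        gy = 2 * x - y                                 # df/dy
        x, y = x - step_x * gx, y + step_y * gy        # simultaneous descent-ascent update
    return float(np.hypot(x, y))

print("equal steps    (0.10, 0.10):", gda(0.10, 0.10))   # spirals away from the origin
print("two-time-scale (0.01, 0.50):", gda(0.01, 0.50))   # converges to the origin
```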
Joint Optimization of AI Fairness and Utility: A Human-Centered Approach
Title | Joint Optimization of AI Fairness and Utility: A Human-Centered Approach |
Authors | Yunfeng Zhang, Rachel K. E. Bellamy, Kush R. Varshney |
Abstract | Today, AI is increasingly being used in many high-stakes decision-making applications in which fairness is an important concern. Already, there are many examples of AI being biased and making questionable and unfair decisions. The AI research community has proposed many methods to measure and mitigate unwanted biases, but few of them involve inputs from human policy makers. We argue that because different fairness criteria sometimes cannot be simultaneously satisfied, and because achieving fairness often requires sacrificing other objectives such as model accuracy, it is key to acquire and adhere to human policy makers’ preferences on how to make the tradeoff among these objectives. In this paper, we propose a framework and some exemplar methods for eliciting such preferences and for optimizing an AI model according to these preferences. |
Tasks | Decision Making |
Published | 2020-02-05 |
URL | https://arxiv.org/abs/2002.01621v1 |
https://arxiv.org/pdf/2002.01621v1.pdf | |
PWC | https://paperswithcode.com/paper/joint-optimization-of-ai-fairness-and-utility |
Repo | |
Framework | |
Fast and Regularized Reconstruction of Building Façades from Street-View Images using Binary Integer Programming
Title | Fast and Regularized Reconstruction of Building Façades from Street-View Images using Binary Integer Programming |
Authors | Han Hu, Libin Wang, Yulin Ding, Qing Zhu |
Abstract | Regularized arrangement of primitives on building façades to aligned locations and consistent sizes is important for structured reconstruction of the urban environment. Mixed integer linear programming has been used to solve the problem; however, it is extremely time consuming even for state-of-the-art commercial solvers. To alleviate this issue, we cast the problem as binary integer programming, which omits the requirement for real-valued parameters and is more efficient to solve. Firstly, the bounding boxes of the primitives are detected using the YOLOv3 architecture in real time. Secondly, the coordinates of the upper-left corners and the sizes of the bounding boxes are automatically clustered in a binary integer programming optimization, which jointly considers geometric fitness, regularity and additional constraints; this step does not require a priori knowledge, such as the number of clusters or pre-defined grammars. Finally, the regularized bounding boxes can be directly used to guide the façade reconstruction in an interactive environment. Experimental evaluations have revealed that the accuracies for the extraction of primitives are above 0.85, which is sufficient for the subsequent 3D reconstruction. The proposed approach takes only about 10% to 20% of the runtime of the previous approach and reduces the diversity of the bounding boxes to about 20% to 50%. (A simplified formulation sketch follows this entry.) |
Tasks | 3D Reconstruction |
Published | 2020-02-20 |
URL | https://arxiv.org/abs/2002.08549v1 |
https://arxiv.org/pdf/2002.08549v1.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-regularized-reconstruction-of |
Repo | |
Framework | |
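The paper's exact binary formulation is not reproduced here, but the flavour of "jointly snap coordinates to a small set of positions" can be written as a tiny facility-location-style binary program. The sketch below uses PuLP (an assumed, commonly available MILP front-end with the bundled CBC solver); the box coordinates, candidate positions and regularity weight `lam` are made-up values.

```python
import pulp

# Observed left-edge x-coordinates of detected facade primitives (illustrative values).
xs = [10.2, 10.9, 11.3, 30.4, 29.8, 50.1, 49.6, 50.5]
candidates = sorted(set(round(v) for v in xs))        # candidate aligned positions
lam = 2.0                                             # regularity weight: cost per distinct position used

prob = pulp.LpProblem("facade_regularization", pulp.LpMinimize)
z = {(i, j): pulp.LpVariable(f"z_{i}_{j}", cat="Binary")      # box i snapped to candidate j
     for i in range(len(xs)) for j in range(len(candidates))}
u = {j: pulp.LpVariable(f"u_{j}", cat="Binary")               # candidate position j is used at all
     for j in range(len(candidates))}

# Objective: geometric fitness (snapping cost) + regularity (few distinct positions).
prob += (pulp.lpSum(abs(xs[i] - candidates[j]) * z[i, j] for i, j in z)
         + lam * pulp.lpSum(u[j] for j in u))

for i in range(len(xs)):                               # every box snaps to exactly one position
    prob += pulp.lpSum(z[i, j] for j in range(len(candidates))) == 1
for i, j in z:                                         # a position must be "open" to receive boxes
    prob += z[i, j] <= u[j]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
snapped = [candidates[next(j for j in range(len(candidates)) if z[i, j].value() > 0.5)]
           for i in range(len(xs))]
print("original :", xs)
print("snapped  :", snapped)
```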
GRATE: Granular Recovery of Aggregated Tensor Data by Example
Title | GRATE: Granular Recovery of Aggregated Tensor Data by Example |
Authors | Ahmed S. Zamzam, Bo Yang, Nicholas D. Sidiropoulos |
Abstract | In this paper, we address the challenge of recovering an accurate breakdown of aggregated tensor data using disaggregation examples. This problem is motivated by several applications. For example, given the breakdown of energy consumption at some homes, how can we disaggregate the total energy consumed during the same period at other homes? In order to address this challenge, we propose GRATE, a principled method that turns the ill-posed task at hand into a constrained tensor factorization problem. Then, this optimization problem is tackled using an alternating least-squares algorithm. GRATE has the ability to handle exact aggregated data as well as inexact aggregation where some unobserved quantities contribute to the aggregated data. Special emphasis is given to the energy disaggregation problem, where the goal is to provide an energy breakdown for consumers from their monthly aggregated consumption. Experiments on two real datasets show the efficacy of GRATE in recovering a more accurate disaggregation than state-of-the-art energy disaggregation methods. (A stripped-down sketch follows this entry.) |
Tasks | |
Published | 2020-03-27 |
URL | https://arxiv.org/abs/2003.12666v1 |
https://arxiv.org/pdf/2003.12666v1.pdf | |
PWC | https://paperswithcode.com/paper/grate-granular-recovery-of-aggregated-tensor |
Repo | |
Framework | |
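GRATE itself solves a constrained tensor factorization with alternating least squares; the sketch below is a much smaller matrix-only analogue of the same idea, in which per-appliance "signatures" (generated synthetically here, standing in for what would be learned from disaggregation examples) are combined with nonnegative least squares to split an observed aggregate. All quantities are toy values.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Toy setting: K appliance "signatures" over T months; example homes reveal the
# per-appliance breakdown, while the target home reveals only the monthly total.
T, K = 12, 3
signatures = np.abs(rng.standard_normal((T, K))) + 0.1     # would be learned from example homes

true_scale = np.array([2.0, 0.5, 1.0])                     # target home's appliance intensities
breakdown_true = signatures * true_scale                   # T x K per-appliance consumption
aggregate = breakdown_true.sum(axis=1)                     # only this is observed

# Disaggregate: find nonnegative appliance intensities whose mix reproduces the total,
# then split the total accordingly (a simplified, matrix-only analogue of GRATE).
scale_hat, _ = nnls(signatures, aggregate)
breakdown_hat = signatures * scale_hat

print("recovered intensities:", np.round(scale_hat, 3))
print("breakdown error      :", np.abs(breakdown_hat - breakdown_true).max())
```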
Are Labels Necessary for Neural Architecture Search?
Title | Are Labels Necessary for Neural Architecture Search? |
Authors | Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie |
Abstract | Existing neural network architectures in computer vision — whether designed by humans or by machines — were typically found using both images and their associated labels. In this paper, we ask the question: can we find high-quality neural architectures using only images, but no human-annotated labels? To answer this question, we first define a new setup called Unsupervised Neural Architecture Search (UnNAS). We then conduct two sets of experiments. In sample-based experiments, we train a large number (500) of diverse architectures with either supervised or unsupervised objectives, and find that the architecture rankings produced with and without labels are highly correlated. In search-based experiments, we run a well-established NAS algorithm (DARTS) using various unsupervised objectives, and report that the architectures searched without labels can be competitive to their counterparts searched with labels. Together, these results reveal the potentially surprising finding that labels are not necessary, and the image statistics alone may be sufficient to identify good neural architectures. (A small rank-correlation sketch follows this entry.) |
Tasks | Neural Architecture Search |
Published | 2020-03-26 |
URL | https://arxiv.org/abs/2003.12056v1 |
https://arxiv.org/pdf/2003.12056v1.pdf | |
PWC | https://paperswithcode.com/paper/are-labels-necessary-for-neural-architecture |
Repo | |
Framework | |
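The sample-based experiments boil down to asking how well two rankings of the same architectures agree. The sketch below shows only that measurement step, on synthetic scores (the latent "quality", noise level and number of architectures are invented), using the standard Spearman and Kendall rank-correlation statistics one would apply to real supervised and unsupervised proxy metrics.

```python
import numpy as np
from scipy.stats import spearmanr, kendalltau

rng = np.random.default_rng(0)

# Synthetic stand-in for the sample-based experiment: scores of N architectures
# evaluated with a supervised objective vs. an unsupervised (pretext-task) objective.
n_arch = 500
quality = rng.uniform(0, 1, n_arch)                    # latent "architecture quality"
supervised = quality + 0.05 * rng.standard_normal(n_arch)
unsupervised = quality + 0.05 * rng.standard_normal(n_arch)

rho, _ = spearmanr(supervised, unsupervised)           # rank correlation of the two rankings
tau, _ = kendalltau(supervised, unsupervised)
print(f"Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
```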
Embedded-physics machine learning for coarse-graining and collective variable discovery without data
Title | Embedded-physics machine learning for coarse-graining and collective variable discovery without data |
Authors | Markus Schöberl, Nicholas Zabaras, Phaedon-Stelios Koutsourelakis |
Abstract | We present a novel learning framework that consistently embeds underlying physics while bypassing a significant drawback of most modern, data-driven coarse-grained approaches in the context of molecular dynamics (MD), i.e., the need for big data. The generation of a sufficiently large training dataset poses a computationally demanding task, while complete coverage of the atomistic configuration space is not guaranteed. As a result, the explorative capabilities of data-driven coarse-grained models are limited and may yield biased “predictive” tools. We propose a novel objective based on reverse Kullback-Leibler divergence that fully incorporates the available physics in the form of the atomistic force field. Rather than separating model learning from the data-generation procedure - the latter relies on simulating atomistic motions governed by force fields - we query the atomistic force field at sample configurations proposed by the predictive coarse-grained model. Thus, learning relies on the evaluation of the force field but does not require any MD simulation. The resulting generative coarse-grained model serves as an efficient surrogate model for predicting atomistic configurations and estimating relevant observables. Beyond obtaining a predictive coarse-grained model, we demonstrate that in the discovered lower-dimensional representation, the collective variables (CVs) are related to physicochemical properties, which are essential for gaining understanding of unexplored complex systems. We demonstrate the algorithmic advances in terms of predictive ability and the physical meaning of the revealed CVs for a bimodal potential energy function and the alanine dipeptide. (A one-dimensional sketch follows this entry.) |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10148v1 |
https://arxiv.org/pdf/2002.10148v1.pdf | |
PWC | https://paperswithcode.com/paper/embedded-physics-machine-learning-for-coarse |
Repo | |
Framework | |
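The defining feature of the approach is that training queries the force field at configurations proposed by the coarse-grained model rather than consuming MD data. The sketch below shows this in one dimension with a Gaussian variational model and a double-well potential: the reverse-KL objective is minimized with reparameterized Monte Carlo gradients that only ever evaluate the derivative of the potential. The potential, learning rate and sample sizes are illustrative, and the real method uses far richer latent-variable models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target "atomistic" Boltzmann density p(x) ~ exp(-U(x)) for a bimodal double well.
def dU(x):                                   # the "force field" query: dU/dx = -force
    return 4.0 * x * (x ** 2 - 1.0)          # for U(x) = (x**2 - 1)**2

# Variational coarse-grained model q = N(mu, sigma^2), trained by minimizing the
# reverse KL divergence KL(q || p) = E_q[log q(x) + U(x)] + const.  With the
# reparameterization x = mu + sigma * eps this needs only force-field evaluations,
# never samples from p (i.e. no MD data).
mu, log_sigma = -2.0, 0.0
lr, n_mc = 0.02, 256
for step in range(2000):
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_mc)
    g = dU(mu + sigma * eps)
    grad_mu = g.mean()                                   # d/dmu       E[U(mu + sigma*eps)]
    grad_log_sigma = sigma * (eps * g).mean() - 1.0      # chain rule + entropy term d/dlog_sigma(-log sigma)
    mu -= lr * grad_mu
    log_sigma -= lr * grad_log_sigma

print(f"fitted q: mu = {mu:.3f}, sigma = {np.exp(log_sigma):.3f}")   # mode-seeking: locks onto one well near x = -1
```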
Macromolecule Classification Based on the Amino-acid Sequence
Title | Macromolecule Classification Based on the Amino-acid Sequence |
Authors | Faisal Ghaffar, Sarwar Khan, Gaddisa O., Chen Yu-jhen |
Abstract | Deep learning is playing a vital role in every field that involves data. It has emerged as a strong and efficient framework that can be applied to a broad spectrum of complex learning problems which were difficult to solve using traditional machine learning techniques in the past. In this study we focus on the classification of protein sequences with deep learning techniques. The study of amino-acid sequences is vital in the life sciences. We used different word-embedding techniques from natural language processing to represent the amino-acid sequences as vectors. Our main goal was to classify sequences into four classes: DNA, RNA, protein and hybrid. After several tests we achieved almost 99% train and test accuracy. We experimented with CNN, LSTM, bidirectional LSTM, and GRU architectures. (A minimal model sketch follows this entry.) |
Tasks | |
Published | 2020-01-06 |
URL | https://arxiv.org/abs/2001.01717v1 |
https://arxiv.org/pdf/2001.01717v1.pdf | |
PWC | https://paperswithcode.com/paper/macromolecule-classification-based-on-the |
Repo | |
Framework | |
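The sketch below illustrates the NLP-style treatment described in the abstract: sequences are tokenized into overlapping k-mers, mapped through a learned embedding, and fed to a small GRU classifier over the four classes. It is written in PyTorch with invented toy sequences and hyper-parameters, and performs a single forward/backward pass rather than the full training the paper reports.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence

# Tokenize a sequence into overlapping 3-mers ("words") and map them to integer ids,
# mimicking the NLP-style word-embedding treatment of nucleotide/amino-acid strings.
def kmer_ids(seq, vocab, k=3):
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    return torch.tensor([vocab.setdefault(km, len(vocab) + 1) for km in kmers])  # id 0 = padding

class SequenceClassifier(nn.Module):
    def __init__(self, vocab_size, n_classes=4, emb=16, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)        # DNA / RNA / protein / hybrid

    def forward(self, ids):                            # ids: (batch, seq_len)
        _, h = self.gru(self.embed(ids))
        return self.out(h[-1])                         # logits from the final hidden state

vocab = {}
examples = ["ATGCGTACGTTAGC", "AUGGCUUACGAUCG", "MKTAYIAKQRQISFVK"]   # toy DNA / RNA / protein strings
labels = torch.tensor([0, 1, 2])
batch = pad_sequence([kmer_ids(s, vocab) for s in examples], batch_first=True)

model = SequenceClassifier(vocab_size=len(vocab) + 1)
loss = nn.CrossEntropyLoss()(model(batch), labels)
loss.backward()                                        # one illustrative training step
print("batch shape:", tuple(batch.shape), "loss:", float(loss))
```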
Deep Non-Line-of-Sight Reconstruction
Title | Deep Non-Line-of-Sight Reconstruction |
Authors | Javier Grau Chopite, Matthias B. Hullin, Michael Wand, Julian Iseringhausen |
Abstract | Recent years have seen a surge of interest in methods for imaging beyond the direct line of sight. The most prominent techniques rely on time-resolved optical impulse responses, obtained by illuminating a diffuse wall with an ultrashort light pulse and observing multi-bounce indirect reflections with an ultrafast time-resolved imager. Reconstruction of geometry from such data, however, is a complex non-linear inverse problem that comes with substantial computational demands. In this paper, we employ convolutional feed-forward networks for solving the reconstruction problem efficiently while maintaining good reconstruction quality. Specifically, we devise a tailored autoencoder architecture, trained end-to-end, that maps transient images directly to a depth map representation. Training is done using an efficient transient renderer for diffuse three-bounce indirect light transport that enables the quick generation of large amounts of training data for the network. We examine the performance of our method on a variety of synthetic and experimental datasets and its dependency on the choice of training data and augmentation strategies, as well as architectural features. We demonstrate that our feed-forward network, even though it is trained solely on synthetic data, generalizes to measured data from SPAD sensors and is able to obtain results that are competitive with model-based reconstruction methods. (A toy architecture sketch follows this entry.) |
Tasks | |
Published | 2020-01-24 |
URL | https://arxiv.org/abs/2001.09067v2 |
https://arxiv.org/pdf/2001.09067v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-non-line-of-sight-reconstruction |
Repo | |
Framework | |
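The paper maps time-resolved transient measurements directly to depth with a tailored autoencoder. The sketch below is only a shape-level stand-in: a few 3D convolutions compress the temporal axis of a synthetic transient volume, and a 2D head decodes a depth map. Layer counts, kernel sizes and the input resolution are arbitrary and do not reflect the authors' architecture.

```python
import torch
import torch.nn as nn

class TransientToDepth(nn.Module):
    """Minimal feed-forward mapping from a transient measurement volume
    (time x height x width) to a depth map; a toy stand-in only."""
    def __init__(self):
        super().__init__()
        # 3D convolutions compress the temporal axis of the transient volume.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), stride=(4, 1, 1), padding=(3, 1, 1)), nn.ReLU(),
            nn.Conv3d(8, 16, kernel_size=(7, 3, 3), stride=(4, 1, 1), padding=(3, 1, 1)), nn.ReLU(),
        )
        # 2D convolutions decode the pooled features into a single-channel depth map.
        self.decoder = nn.Sequential(
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, transient):                      # (batch, 1, T, H, W)
        feats = self.encoder(transient)
        feats = feats.mean(dim=2)                      # pool the remaining temporal bins -> (batch, 16, H, W)
        return self.decoder(feats)                     # (batch, 1, H, W) depth map

net = TransientToDepth()
transient = torch.rand(2, 1, 128, 32, 32)              # synthetic transient histograms
print(net(transient).shape)                            # torch.Size([2, 1, 32, 32])
```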