Paper Group ANR 877
Convergence Guarantees for Adaptive Bayesian Quadrature Methods
Title | Convergence Guarantees for Adaptive Bayesian Quadrature Methods |
Authors | Motonobu Kanagawa, Philipp Hennig |
Abstract | Adaptive Bayesian quadrature (ABQ) is a powerful approach to numerical integration that empirically compares favorably with Monte Carlo integration on problems of medium dimensionality (where non-adaptive quadrature is not competitive). Its key ingredient is an acquisition function that changes as a function of previously collected values of the integrand. While this adaptivity appears to be empirically powerful, it complicates analysis. Consequently, there are no theoretical guarantees so far for this class of methods. In this work, for a broad class of adaptive Bayesian quadrature methods, we prove consistency, deriving non-tight but informative convergence rates. To do so we introduce a new concept we call weak adaptivity. Our results identify a large and flexible class of adaptive Bayesian quadrature rules as consistent, within which practitioners can develop empirically efficient methods. |
Tasks | |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10271v2 |
https://arxiv.org/pdf/1905.10271v2.pdf | |
PWC | https://paperswithcode.com/paper/convergence-guarantees-for-adaptive-bayesian |
Repo | |
Framework | |
Variance Reduction in Actor Critic Methods (ACM)
Title | Variance Reduction in Actor Critic Methods (ACM) |
Authors | Eric Benhamou |
Abstract | After presenting Actor Critic Methods (ACM), we show that ACM are control variate estimators. Using the projection theorem, we prove that the Q and Advantage Actor Critic (A2C) methods are optimal in the sense of the $L^2$ norm for the control variate estimators spanned by functions conditioned by the current state and action. This straightforward application of the Pythagorean theorem provides a theoretical justification of the strong performance of QAC and AAC, most often referred to as A2C methods, in deep policy gradient methods. This enables us to derive a new formulation for Advantage Actor Critic methods that has lower variance and improves on the traditional A2C method. |
Tasks | Policy Gradient Methods |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09765v1 |
https://arxiv.org/pdf/1907.09765v1.pdf | |
PWC | https://paperswithcode.com/paper/variance-reduction-in-actor-critic-methods |
Repo | |
Framework | |
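The control variate idea in the abstract above can be illustrated on a toy problem. This is a minimal sketch, not the paper's actor-critic formulation: it estimates E[exp(X)] for X uniform on [0, 1] and uses g(X) = X - 0.5 as a zero-mean control variate. The function name `control_variate_samples` and the toy integrand are assumptions made for illustration. The coefficient c = Cov(f, g) / Var(g) is the $L^2$ projection of f onto g, which is the sense of optimality the abstract refers to.

```python
import math
import random
import statistics


def control_variate_samples(n, seed=0):
    """Return plain samples f(X), control-variate-adjusted samples, and c.

    Toy setting (an assumption, not from the paper): X ~ U(0, 1),
    f(x) = exp(x), control variate g(x) = x - 0.5 with known mean 0.
    """
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    f = [math.exp(x) for x in xs]
    g = [x - 0.5 for x in xs]

    mean_f = statistics.fmean(f)
    mean_g = statistics.fmean(g)
    # Sample covariance of f and g.
    cov_fg = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g)) / (n - 1)
    # Variance-minimizing coefficient: the L2 projection of f onto g.
    c = cov_fg / statistics.variance(g)

    # Subtracting c * g leaves the mean unchanged (E[g] = 0) but lowers variance.
    adjusted = [a - c * b for a, b in zip(f, g)]
    return f, adjusted, c


plain, adjusted, coeff = control_variate_samples(5000)
print("variance without control variate:", statistics.variance(plain))
print("variance with control variate:   ", statistics.variance(adjusted))
```

In an actor-critic setting the role of g is played by a baseline or critic term conditioned on the current state and action; the toy example only shows why subtracting a correlated zero-mean term reduces variance.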
Evolutionary Cell Aided Design for Neural Network Architectures
Title | Evolutionary Cell Aided Design for Neural Network Architectures |
Authors | Philip Colangelo, Oren Segal, Alexander Speicher, Martin Margala |
Abstract | Mathematical theory shows us that multilayer feedforward Artificial Neural Networks (ANNs) are universal function approximators, capable of approximating any measurable function to any desired degree of accuracy. In practice, designing practical and efficient neural network architectures requires significant effort and expertise. We present a novel software framework called Evolutionary Cell Aided Design (ECAD) meant to aid in the exploration and design of efficient Neural Network Architectures (NNAs) for reconfigurable hardware. Given a general neural network structure and a set of constraints and fitness functions, the framework will explore both the space of possible NNAs and the space of possible hardware designs, using evolutionary algorithms, and attempt to find the fittest co-design solutions according to a predefined set of goals. We test the framework on an image classification task and use the MNIST data set of handwritten digits with an Intel Arria 10 GX 1150 device as our target platform. We design and implement a modular and scalable 2D systolic array with enhancements for machine learning that can be used by the framework for the hardware search space. Our results demonstrate the ability to pair neural network design and hardware development together using an evolutionary algorithm and removing traditional human-in-the-loop development tasks. By running various experiments of the fittest solutions for neural network and hardware searches, we demonstrate the full end-to-end capabilities of the ECAD framework. |
Tasks | Image Classification |
Published | 2019-03-06 |
URL | https://arxiv.org/abs/1903.02130v3 |
https://arxiv.org/pdf/1903.02130v3.pdf | |
PWC | https://paperswithcode.com/paper/evolutionary-cell-aided-design-for-neural |
Repo | |
Framework | |
A Scalable Predictive Maintenance Model for Detecting Wind Turbine Component Failures Based on SCADA Data
Title | A Scalable Predictive Maintenance Model for Detecting Wind Turbine Component Failures Based on SCADA Data |
Authors | Lorenzo Gigoni, Alessandro Betti, Mauro Tucci, Emanuele Crisostomi |
Abstract | In this work, a novel predictive maintenance system is presented and applied to the main components of wind turbines. The proposed model is based on machine learning and statistical process control tools applied to SCADA (Supervisory Control And Data Acquisition) data of critical components. The test campaign was divided into two stages: a two-year offline test, followed by a one-year real-time test. The offline test used historical faults from six wind farms located in Italy and Romania, corresponding to a total of 150 wind turbines and an overall installed nominal power of 283 MW. The results demonstrate outstanding capabilities of anomaly prediction up to 2 months before unscheduled device downtime. Furthermore, the 12-month real-time test confirms the ability of the proposed system to detect several anomalies, allowing the operators to identify the root causes and to schedule maintenance actions before reaching a catastrophic stage. |
Tasks | |
Published | 2019-10-22 |
URL | https://arxiv.org/abs/1910.09808v1 |
https://arxiv.org/pdf/1910.09808v1.pdf | |
PWC | https://paperswithcode.com/paper/a-scalable-predictive-maintenance-model-for |
Repo | |
Framework | |
Atari-fying the Vehicle Routing Problem with Stochastic Service Requests
Title | Atari-fying the Vehicle Routing Problem with Stochastic Service Requests |
Authors | Nicholas D. Kullman, Jorge E. Mendoza, Martin Cousineau, Justin C. Goodson |
Abstract | We present a new general approach to modeling research problems as Atari-like videogames to make them amenable to recent groundbreaking solution methods from the deep reinforcement learning community. The approach is flexible, applicable to a wide range of problems. We demonstrate its application on a well known vehicle routing problem. Our preliminary results on this problem, though not transformative, show signs of success and suggest that Atari-fication may be a useful modeling approach for researchers studying problems involving sequential decision making under uncertainty. |
Tasks | Decision Making, Decision Making Under Uncertainty |
Published | 2019-11-14 |
URL | https://arxiv.org/abs/1911.05922v1 |
https://arxiv.org/pdf/1911.05922v1.pdf | |
PWC | https://paperswithcode.com/paper/atari-fying-the-vehicle-routing-problem-with |
Repo | |
Framework | |
Relation Discovery with Out-of-Relation Knowledge Base as Supervision
Title | Relation Discovery with Out-of-Relation Knowledge Base as Supervision |
Authors | Yan Liang, Xin Liu, Jianwen Zhang, Yangqiu Song |
Abstract | Unsupervised relation discovery aims to discover new relations from a given text corpus without annotated data. However, it does not consider existing human-annotated knowledge bases even when they are relevant to the relations to be discovered. In this paper, we study the problem of how to use out-of-relation knowledge bases to supervise the discovery of unseen relations, where out-of-relation means that the relations to discover from the text corpus and those in the knowledge bases do not overlap. We construct a set of constraints between entity pairs based on the knowledge base embedding and then incorporate the constraints into relation discovery through a variational auto-encoder based algorithm. Experiments show that our new approach can improve the state-of-the-art relation discovery performance by a large margin. |
Tasks | |
Published | 2019-04-19 |
URL | http://arxiv.org/abs/1905.01959v1 |
http://arxiv.org/pdf/1905.01959v1.pdf | |
PWC | https://paperswithcode.com/paper/190501959 |
Repo | |
Framework | |
Single-frame Regularization for Temporally Stable CNNs
Title | Single-frame Regularization for Temporally Stable CNNs |
Authors | Gabriel Eilertsen, Rafał K. Mantiuk, Jonas Unger |
Abstract | Convolutional neural networks (CNNs) can model complicated non-linear relations between images. However, they are notoriously sensitive to small changes in the input. Most CNNs trained to describe image-to-image mappings generate temporally unstable results when applied to video sequences, leading to flickering artifacts and other inconsistencies over time. In order to use CNNs for video material, previous methods have relied on estimating dense frame-to-frame motion information (optical flow) in the training and/or the inference phase, or by exploring recurrent learning structures. We take a different approach to the problem, posing temporal stability as a regularization of the cost function. The regularization is formulated to account for different types of motion that can occur between frames, so that temporally stable CNNs can be trained without the need for video material or expensive motion estimation. The training can be performed as a fine-tuning operation, without architectural modifications of the CNN. Our evaluation shows that the training strategy leads to large improvements in temporal smoothness. Moreover, for small datasets the regularization can help in boosting the generalization performance to a much larger extent than what is possible with naïve augmentation strategies. |
Tasks | Motion Estimation, Optical Flow Estimation |
Published | 2019-02-27 |
URL | https://arxiv.org/abs/1902.10424v2 |
https://arxiv.org/pdf/1902.10424v2.pdf | |
PWC | https://paperswithcode.com/paper/single-frame-regularization-for-temporally |
Repo | |
Framework | |
LIP: Learning Instance Propagation for Video Object Segmentation
Title | LIP: Learning Instance Propagation for Video Object Segmentation |
Authors | Ye Lyu, George Vosselman, Gui-Song Xia, Michael Ying Yang |
Abstract | In recent years, the task of segmenting foreground objects from background in a video, i.e. video object segmentation (VOS), has received considerable attention. In this paper, we propose a single end-to-end trainable deep neural network, convolutional gated recurrent Mask-RCNN, for tackling the semi-supervised VOS task. We take advantage of both the instance segmentation network (Mask-RCNN) and the visual memory module (Conv-GRU) to tackle the VOS task. The instance segmentation network predicts masks for instances, while the visual memory module learns to selectively propagate information for multiple instances simultaneously, which handles appearance change, variation of scale and pose, and occlusions between objects. After offline and online training under purely instance segmentation losses, our approach is able to achieve satisfactory results without any post-processing or synthetic video data augmentation. Experimental results on the DAVIS 2016 and DAVIS 2017 datasets demonstrate the effectiveness of our method for the video object segmentation task. |
Tasks | Data Augmentation, Instance Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-09-30 |
URL | https://arxiv.org/abs/1910.00032v1 |
https://arxiv.org/pdf/1910.00032v1.pdf | |
PWC | https://paperswithcode.com/paper/lip-learning-instance-propagation-for-video |
Repo | |
Framework | |
Enhancing the Discriminative Feature Learning for Visible-Thermal Cross-Modality Person Re-Identification
Title | Enhancing the Discriminative Feature Learning for Visible-Thermal Cross-Modality Person Re-Identification |
Authors | Haijun Liu, Jian Cheng |
Abstract | Existing person re-identification has achieved great progress in the visible domain, capturing all the person images with visible cameras. However, in a 24-hour intelligent surveillance system, the visible cameras may be ineffective at night. In this situation, thermal cameras are the best supplemental components, since they capture images without depending on visible light. Therefore, in this paper, we investigate the visible-thermal cross-modality person re-identification (VT Re-ID) problem. In VT Re-ID, two difficult problems must be handled well: cross-modality discrepancy and intra-modality variations. To address these two issues, we propose enhancing the discriminative feature learning (EDFL) with two extremely simple means from two core aspects: (1) skip-connections for incorporating mid-level features, to make the person features more discriminative and robust, and (2) a dual-modality triplet loss to guide the training procedure by simultaneously considering the cross-modality discrepancy and intra-modality variations. Additionally, a two-stream CNN structure is adopted to learn the multi-modality sharable person features. The experimental results on two datasets show that our proposed EDFL approach distinctly outperforms state-of-the-art methods by large margins, demonstrating the effectiveness of EDFL in enhancing the discriminative feature learning for VT Re-ID. |
Tasks | Person Re-Identification |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09659v1 |
https://arxiv.org/pdf/1907.09659v1.pdf | |
PWC | https://paperswithcode.com/paper/enhancing-the-discriminative-feature-learning |
Repo | |
Framework | |
Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks
Title | Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks |
Authors | Mihir Kale, Aditya Siddhant, Sreyashi Nag, Radhika Parik, Matthias Grabmair, Anthony Tomasic |
Abstract | Pre-trained word embeddings are the primary method for transfer learning in several Natural Language Processing (NLP) tasks. Recent works have focused on using unsupervised techniques such as language modeling to obtain these embeddings. In contrast, this work focuses on extracting representations from multiple pre-trained supervised models, which enriches word embeddings with task and domain specific knowledge. Experiments performed in cross-task, cross-domain and cross-lingual settings indicate that such supervised embeddings are helpful, especially in the low-resource setting, but the extent of gains is dependent on the nature of the task and domain. We make our code publicly available. |
Tasks | Language Modelling, Transfer Learning, Word Embeddings |
Published | 2019-06-28 |
URL | https://arxiv.org/abs/1906.12039v1 |
https://arxiv.org/pdf/1906.12039v1.pdf | |
PWC | https://paperswithcode.com/paper/supervised-contextual-embeddings-for-transfer |
Repo | |
Framework | |
Stabilizing Inputs to Approximated Nonlinear Functions for Inference with Homomorphic Encryption in Deep Neural Networks
Title | Stabilizing Inputs to Approximated Nonlinear Functions for Inference with Homomorphic Encryption in Deep Neural Networks |
Authors | Moustafa AboulAtta, Matthias Ossadnik, Seyed-Ahmad Ahmadi |
Abstract | Leveled Homomorphic Encryption (LHE) offers a potential solution that could allow sectors with sensitive data to utilize the cloud and securely deploy their models for remote inference with Deep Neural Networks (DNN). However, this application faces several obstacles due to the limitations of LHE. One of the main problems is the incompatibility of commonly used nonlinear functions in DNN with the operations supported by LHE, i.e. addition and multiplication. As is common in LHE approaches, we train a model with a nonlinear function, and replace it with a low-degree polynomial approximation at inference time on private data. While this typically leads to approximation errors and loss in prediction accuracy, we propose a method that reduces this loss to small values or eliminates it entirely, depending on simple hyper-parameters. This is achieved by the introduction of a novel and elegantly simple Min-Max normalization scheme, which scales inputs to nonlinear functions into ranges with low approximation error. While it is intuitive in concept and trivial to implement, we empirically show that it offers a stable and effective approximation solution to nonlinear functions in DNN. In turn, this can enable deeper networks with LHE, and facilitate the development of security- and privacy-aware analytics applications. |
Tasks | |
Published | 2019-02-05 |
URL | http://arxiv.org/abs/1902.01870v1 |
http://arxiv.org/pdf/1902.01870v1.pdf | |
PWC | https://paperswithcode.com/paper/stabilizing-inputs-to-approximated-nonlinear |
Repo | |
Framework | |
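The effect described in the abstract above, that a low-degree polynomial is only accurate on a limited input range, can be seen in a small sketch. The code below is illustrative only and is not the authors' scheme: it uses the degree-3 Taylor expansion of the sigmoid around zero, 1/2 + x/4 - x^3/48, as the polynomial, and a generic min-max rescaling to [-1, 1]. The helper names and the sample inputs are assumptions. In an encrypted setting the minimum and maximum would presumably be fixed constants taken from training data, so that the rescaling itself needs only addition and multiplication.

```python
import math


def sigmoid(x):
    """Reference nonlinear function."""
    return 1.0 / (1.0 + math.exp(-x))


def poly_sigmoid(x):
    """Degree-3 Taylor approximation of the sigmoid around 0."""
    return 0.5 + x / 4.0 - x ** 3 / 48.0


def min_max_scale(values, lo=-1.0, hi=1.0):
    """Affinely map values so that min(values) -> lo and max(values) -> hi."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Degenerate case: all inputs identical, map to the range midpoint.
        return [(lo + hi) / 2.0 for _ in values]
    return [lo + (hi - lo) * (v - v_min) / span for v in values]


def max_abs_error(inputs):
    """Worst-case gap between the sigmoid and its polynomial on the inputs."""
    return max(abs(sigmoid(x) - poly_sigmoid(x)) for x in inputs)


raw = [-6.0, -3.5, -1.0, 0.0, 2.0, 4.5, 7.0]  # hypothetical pre-activations
scaled = min_max_scale(raw)

print("max error on raw inputs:   ", max_abs_error(raw))
print("max error on scaled inputs:", max_abs_error(scaled))
```

Note that rescaling changes the values fed to the nonlinearity, so a network would have to be trained with the same normalization in place; the sketch only compares the polynomial with the true sigmoid on identical inputs.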
Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition
Title | Analyzing Large Receptive Field Convolutional Networks for Distant Speech Recognition |
Authors | Salar Jafarlou, Soheil Khorram, Vinay Kothapally, John H. L. Hansen |
Abstract | Despite significant efforts over the last few years to build a robust automatic speech recognition (ASR) system for different acoustic settings, the performance of the current state-of-the-art technologies significantly degrades in noisy reverberant environments. Convolutional Neural Networks (CNNs) have been successfully used to achieve substantial improvements in many speech processing applications including distant speech recognition (DSR). However, standard CNN architectures are not efficient in capturing long-term speech dynamics, which are essential in the design of a robust DSR system. In the present study, we address this issue by investigating variants of large receptive field CNNs (LRF-CNNs) which include deeply recursive networks, dilated convolutional neural networks, and stacked hourglass networks. To compare the efficacy of the aforementioned architectures with the standard CNN on the Wall Street Journal (WSJ) corpus, we use a hybrid DNN-HMM based speech recognition system. We extend the study to evaluate the system performances for distant speech simulated using realistic room impulse responses (RIRs). Our experiments show that, with a fixed number of parameters across all architectures, the large receptive field networks show consistent improvements over the standard CNNs for distant speech. Amongst the explored LRF-CNNs, the stacked hourglass network shows an 8.9% relative reduction in word error rate (WER) and a 10.7% relative improvement in frame accuracy compared to the standard CNNs for distant simulated speech signals. |
Tasks | Distant Speech Recognition, Speech Recognition |
Published | 2019-10-15 |
URL | https://arxiv.org/abs/1910.07047v1 |
https://arxiv.org/pdf/1910.07047v1.pdf | |
PWC | https://paperswithcode.com/paper/analyzing-large-receptive-field-convolutional |
Repo | |
Framework | |
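Why dilation helps capture longer context at a fixed parameter count can be shown with the standard receptive field formula for a stack of stride-1 convolutions: each layer with kernel size k and dilation d adds (k - 1) * d to the receptive field. The sketch below is a generic calculation, not code from the paper, and the layer configurations are hypothetical.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field (in input frames) of stacked stride-1 1D convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    field = 1
    for k, d in zip(kernel_sizes, dilations):
        field += (k - 1) * d
    return field


# Three layers with kernel size 3: same number of weights per layer in both cases.
standard = receptive_field([3, 3, 3], [1, 1, 1])
dilated = receptive_field([3, 3, 3], [1, 2, 4])
print("standard CNN receptive field:", standard)
print("dilated CNN receptive field: ", dilated)
```

With identical kernel sizes, the dilated stack covers roughly twice as many input frames as the standard one in this configuration.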
Improving VAE generations of multimodal data through data-dependent conditional priors
Title | Improving VAE generations of multimodal data through data-dependent conditional priors |
Authors | Frantzeska Lavda, Magda Gregorová, Alexandros Kalousis |
Abstract | One of the major shortcomings of variational autoencoders is the inability to produce generations from the individual modalities of data originating from mixture distributions. This is primarily due to the use of a simple isotropic Gaussian as the prior for the latent code in the ancestral sampling procedure for the data generations. We propose a novel formulation of variational autoencoders, conditional prior VAE (CP-VAE), which learns to differentiate between the individual mixture components and therefore allows for generations from the distributional data clusters. We assume a two-level generative process with a continuous (Gaussian) latent variable sampled conditionally on a discrete (categorical) latent component. The new variational objective naturally couples the learning of the posterior and prior conditionals, and the learning of the latent categories encoding the multimodality of the original data in an unsupervised manner. The data-dependent conditional priors are then used to sample the continuous latent code when generating new samples from the individual mixture components corresponding to the multimodal structure of the original data. Our experimental results illustrate the generative performance of our new model compared to multiple baselines. |
Tasks | |
Published | 2019-11-25 |
URL | https://arxiv.org/abs/1911.10885v1 |
https://arxiv.org/pdf/1911.10885v1.pdf | |
PWC | https://paperswithcode.com/paper/improving-vae-generations-of-multimodal-data |
Repo | |
Framework | |
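The two-level generative process in the abstract above (a discrete component first, then a Gaussian latent code conditioned on it) can be sketched as plain ancestral sampling. This is a toy illustration under stated assumptions, not the CP-VAE model: the mixture weights, means, and standard deviations below are hypothetical fixed numbers, whereas in the paper the conditional priors are learned, and the decoder that maps the latent code to data is omitted.

```python
import random

# Hypothetical parameters for a 2-component prior over a 2D latent code.
WEIGHTS = [0.3, 0.7]
MEANS = [[-2.0, 0.0], [3.0, 1.0]]
STDS = [[0.5, 0.5], [0.5, 0.5]]


def sample_latent(component, means, stds, rng):
    """Draw a continuous latent code from the Gaussian prior of one component."""
    return [rng.gauss(m, s) for m, s in zip(means[component], stds[component])]


def sample_two_level(weights, means, stds, rng):
    """Ancestral sampling: categorical component first, then conditional Gaussian."""
    component = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return component, sample_latent(component, means, stds, rng)


rng = random.Random(0)
component, z = sample_two_level(WEIGHTS, MEANS, STDS, rng)
print("sampled component:", component, "latent code:", z)

# Fixing the component generates from a single mode, which a single
# isotropic Gaussian prior cannot do.
z_mode_1 = sample_latent(1, MEANS, STDS, rng)
print("latent code from component 1:", z_mode_1)
```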
Adaptive ROI Generation for Video Object Segmentation Using Reinforcement Learning
Title | Adaptive ROI Generation for Video Object Segmentation Using Reinforcement Learning |
Authors | Mingjie Sun, Jimin Xiao, Eng Gee Lim, Yanchu Xie, Jiashi Feng |
Abstract | In this paper, we aim to tackle the task of semi-supervised video object segmentation across a sequence of frames where only the ground-truth segmentation of the first frame is provided. The challenge lies in how to update the segmentation model, initialized from the first frame, online in an adaptive and accurate way, even in the presence of multiple confusing instances or large object motion. Existing approaches rely on selecting the region of interest for the model update, which, however, is rough and inflexible, leading to performance degradation. To overcome this limitation, we propose a novel approach which utilizes reinforcement learning to select optimal adaptation areas for each frame, based on the historical segmentation information. The RL model learns to take optimal actions to adjust the region of interest inferred from the previous frame for online model updating. To speed up the model adaptation, we further design a novel multi-branch tree based exploration method to quickly select the best state-action pairs. Our experiments show that our work improves the state of the art in mean region similarity on the DAVIS 2016 dataset to 87.1%. |
Tasks | Semantic Segmentation, Semi-supervised Video Object Segmentation, Video Object Segmentation, Video Semantic Segmentation |
Published | 2019-09-27 |
URL | https://arxiv.org/abs/1909.12482v1 |
https://arxiv.org/pdf/1909.12482v1.pdf | |
PWC | https://paperswithcode.com/paper/adaptive-roi-generation-for-video-object |
Repo | |
Framework | |
Primitive-based 3D Building Modeling, Sensor Simulation, and Estimation
Title | Primitive-based 3D Building Modeling, Sensor Simulation, and Estimation |
Authors | Xia Li, Yen-Liang Lin, James Miller, Alex Cheon, Walt Dixon |
Abstract | As we begin to consider modeling large, realistic 3D building scenes, it becomes necessary to consider a more compact representation than the polygonal mesh model. Because the large amounts of annotated training data required are costly to obtain, we leverage synthetic data to train our system for the satellite image domain. By utilizing the synthetic data, we formulate the building decomposition as an application of instance segmentation and primitive fitting to decompose a building into a set of primitive shapes. Experimental results on the WorldView-3 satellite image dataset demonstrate the effectiveness of our 3D building modeling approach. |
Tasks | Instance Segmentation, Semantic Segmentation |
Published | 2019-01-16 |
URL | http://arxiv.org/abs/1901.05554v1 |
http://arxiv.org/pdf/1901.05554v1.pdf | |
PWC | https://paperswithcode.com/paper/primitive-based-3d-building-modeling-sensor |
Repo | |
Framework | |