Paper Group ANR 657
Stochastic Variance Reduction Methods for Policy Evaluation. Visualizations for an Explainable Planning Agent. A Unified Multi-Faceted Video Summarization System. Building a Regular Decision Boundary with Deep Networks. Online Monotone Games. Bayesian inference on random simple graphs with power law degree distributions. Multi-Image Semantic Matching by Mining Consistent Features. A neural network system for transformation of regional cuisine style. Automatic Salient Object Detection for Panoramic Images Using Region Growing and Fixation Prediction Model. Multivariate Gaussian and Student$-t$ Process Regression for Multi-output Prediction. Compiling Deep Learning Models for Custom Hardware Accelerators. Learning Human Motion Models for Long-term Predictions. Anomaly Detection: Review and preliminary Entropy method tests. Adversarial Networks for the Detection of Aggressive Prostate Cancer. ORBIT: Ordering Based Information Transfer Across Space and Time for Global Surface Water Monitoring.
Stochastic Variance Reduction Methods for Policy Evaluation
Title | Stochastic Variance Reduction Methods for Policy Evaluation |
Authors | Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou |
Abstract | Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states’ long-term value under a given policy. In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset. We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then present a primal-dual batch gradient method, as well as two stochastic variance reduction methods for solving the problem. These algorithms scale linearly in both sample size and feature dimension. Moreover, they achieve linear convergence even when the saddle-point problem has only strong concavity in the dual variables but no strong convexity in the primal variables. Numerical experiments on benchmark problems demonstrate the effectiveness of our methods. |
Tasks | |
Published | 2017-02-25 |
URL | http://arxiv.org/abs/1702.07944v2 |
http://arxiv.org/pdf/1702.07944v2.pdf | |
PWC | https://paperswithcode.com/paper/stochastic-variance-reduction-methods-for |
Repo | |
Framework | |
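To make the saddle-point formulation in the abstract concrete, here is a minimal numpy sketch of a primal-dual batch gradient loop for linear policy evaluation. The feature matrices, reward vector, step sizes and iteration count are illustrative assumptions, not the authors' setup; the objective is the standard MSPBE saddle-point form min_theta max_w <b - A theta, w> - (1/2) w^T C w.

```python
# A minimal sketch (not the authors' code) of the primal-dual batch gradient
# method for the saddle-point form of policy evaluation with linear features.
# The toy data, step sizes and iteration count below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, gamma = 500, 10, 0.95

Phi = rng.normal(size=(n, d))          # features of current states
Phi_next = rng.normal(size=(n, d))     # features of next states
r = rng.normal(size=n)                 # observed rewards

A = Phi.T @ (Phi - gamma * Phi_next) / n
b = Phi.T @ r / n
C = Phi.T @ Phi / n

theta = np.zeros(d)                    # primal variable (value-function weights)
w = np.zeros(d)                        # dual variable
sigma, tau = 0.1, 0.1                  # dual / primal step sizes

for _ in range(2000):
    w = w + sigma * (b - A @ theta - C @ w)   # gradient ascent in the dual
    theta = theta + tau * (A.T @ w)           # gradient descent in the primal

# theta now approximately solves A @ theta = b (the MSPBE fixed point).
print(np.linalg.norm(A @ theta - b))
```

The SVRG- and SAGA-style variants in the paper replace these full-batch gradients with variance-reduced stochastic estimates, which this sketch omits.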
Visualizations for an Explainable Planning Agent
Title | Visualizations for an Explainable Planning Agent |
Authors | Tathagata Chakraborti, Kshitij P. Fadnis, Kartik Talamadupula, Mishal Dholakia, Biplav Srivastava, Jeffrey O. Kephart, Rachel K. E. Bellamy |
Abstract | In this paper, we report on the visualization capabilities of an Explainable AI Planning (XAIP) agent that can support human-in-the-loop decision making. Imposing transparency and explainability requirements on such agents is especially important in order to establish trust and common ground with the end-to-end automated planning system. Visualizing the agent’s internal decision-making processes is a crucial step towards achieving this. This may include externalizing the “brain” of the agent – starting from its sensory inputs, through the progressively higher-order decisions it makes to drive its planning components. We also show how the planner can bootstrap on the latest techniques in explainable planning to cast plan visualization as a plan explanation problem, and thus provide concise model-based visualization of its plans. We demonstrate these functionalities in the context of the automated planning components of a smart assistant in an instrumented meeting space. |
Tasks | Decision Making |
Published | 2017-09-13 |
URL | http://arxiv.org/abs/1709.04517v2 |
http://arxiv.org/pdf/1709.04517v2.pdf | |
PWC | https://paperswithcode.com/paper/visualizations-for-an-explainable-planning |
Repo | |
Framework | |
A Unified Multi-Faceted Video Summarization System
Title | A Unified Multi-Faceted Video Summarization System |
Authors | Anurag Sahoo, Vishal Kaushal, Khoshrav Doctor, Suyash Shetty, Rishabh Iyer, Ganesh Ramakrishnan |
Abstract | This paper addresses automatic summarization and search in visual data comprising videos, live streams and image collections in a unified manner. In particular, we propose a framework for multi-faceted summarization which extracts key-frames (image summaries), skims (video summaries) and entity summaries (summarization at the level of entities like objects, scenes, humans and faces in the video). The user can view these either as extractive summarization or as query-focused summarization. Our approach first pre-processes the video or image collection once, to extract all important visual features, following which we provide an interactive mechanism to the user to summarize the video based on their choice. We investigate several diversity, coverage and representation models for all these problems, and argue the utility of these different models depending on the application. While most of the prior work on submodular summarization approaches has focused on combining several models and learning weighted mixtures, we focus on the explainability of the different diversity, coverage and representation models and on their scalability. Most importantly, we also show that we can summarize hours of video data in a few seconds, and our system allows the user to generate summaries of various lengths and types interactively on the fly. |
Tasks | Video Summarization |
Published | 2017-04-04 |
URL | http://arxiv.org/abs/1704.01466v1 |
http://arxiv.org/pdf/1704.01466v1.pdf | |
PWC | https://paperswithcode.com/paper/a-unified-multi-faceted-video-summarization |
Repo | |
Framework | |
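As a flavour of the coverage models mentioned in the abstract, the sketch below greedily maximizes a facility-location function over frame similarities to pick key-frames. The frame features, similarity measure and budget are placeholder assumptions; the paper's system combines several such diversity, coverage and representation models.

```python
# A toy sketch (assumed, not the paper's system) of extractive key-frame
# selection by greedily maximizing a facility-location coverage function
# over frame-feature similarities.
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 64))          # one feature vector per frame (toy data)
features /= np.linalg.norm(features, axis=1, keepdims=True)
sim = features @ features.T                    # cosine similarity between frames

def greedy_facility_location(sim, budget):
    """Pick `budget` frames maximizing sum_i max_{j in S} sim[i, j]."""
    n = sim.shape[0]
    selected, best_cover = [], np.zeros(n)
    for _ in range(budget):
        gains = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gains[selected] = -np.inf                   # do not pick a frame twice
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, sim[:, j])
    return selected

print(greedy_facility_location(sim, budget=5))    # indices of chosen key-frames
```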
Building a Regular Decision Boundary with Deep Networks
Title | Building a Regular Decision Boundary with Deep Networks |
Authors | Edouard Oyallon |
Abstract | In this work, we build a generic architecture of Convolutional Neural Networks to discover empirical properties of neural networks. Our first contribution is to introduce a state-of-the-art framework that depends on few hyperparameters and to study the network when we vary them. It has no max pooling, no biases, only 13 layers, is purely convolutional and yields up to 95.4% and 79.6% accuracy respectively on CIFAR10 and CIFAR100. We show that the nonlinearity of a deep network does not need to be continuous, non-expansive or point-wise to achieve good performance. We show that increasing the width of our network makes it competitive with very deep networks. Our second contribution is an analysis of the contraction and separation properties of this network. Indeed, a 1-nearest neighbor classifier applied on deep features progressively improves with depth, which indicates that the representation is progressively more regular. In addition, we define and analyze local support vectors that separate classes locally. All our experiments are reproducible and code is available online, based on TensorFlow. |
Tasks | |
Published | 2017-03-06 |
URL | http://arxiv.org/abs/1703.01775v1 |
http://arxiv.org/pdf/1703.01775v1.pdf | |
PWC | https://paperswithcode.com/paper/building-a-regular-decision-boundary-with |
Repo | |
Framework | |
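The 1-nearest-neighbour probe described in the abstract can be reproduced schematically as below. The per-layer features are synthetic stand-ins (in practice they would be activations extracted from the trained network); only the probing procedure itself is the point.

```python
# A small sketch (my illustration, not the paper's code) of the 1-nearest-neighbour
# probe: measure how separable the classes are in the features of each layer.
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=400)
# Fake "layer" features that become more class-clustered with depth (toy stand-in).
features_per_layer = [
    rng.normal(size=(400, 32)) + depth * np.eye(10)[labels] @ rng.normal(size=(10, 32))
    for depth in range(5)
]

def one_nn_accuracy(x, y, n_train=300):
    train_x, test_x = x[:n_train], x[n_train:]
    train_y, test_y = y[:n_train], y[n_train:]
    d = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    pred = train_y[d.argmin(axis=1)]
    return (pred == test_y).mean()

for depth, feats in enumerate(features_per_layer):
    print(f"layer {depth}: 1-NN accuracy = {one_nn_accuracy(feats, labels):.2f}")
```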
Online Monotone Games
Title | Online Monotone Games |
Authors | Ian Gemp, Sridhar Mahadevan |
Abstract | Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games. Results from AGT find applications in domains such as online bidding auctions for web advertisements and network routing protocols. Monotone games are games where agent strategies naturally converge to an equilibrium state. Previous results in AGT have been obtained for convex, socially-convex, or smooth games, but not monotone games. Our primary theoretical contributions are defining the monotone game setting and its extension to the online setting, a new notion of regret for this setting, and accompanying algorithms that achieve sub-linear regret. We demonstrate the utility of online monotone game theory on a variety of problem domains including variational inequalities, reinforcement learning, and generative adversarial networks. |
Tasks | |
Published | 2017-10-19 |
URL | http://arxiv.org/abs/1710.07328v1 |
http://arxiv.org/pdf/1710.07328v1.pdf | |
PWC | https://paperswithcode.com/paper/online-monotone-games |
Repo | |
Framework | |
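For intuition about the monotone setting, the following sketch runs the extragradient method on the bilinear zero-sum game min_x max_y x^T M y, whose simultaneous gradient field is a monotone operator and where plain gradient play cycles. The game, step size and iteration count are illustrative assumptions, not the algorithms proposed in the paper.

```python
# A minimal sketch (illustrative, not the authors' algorithm) of the extragradient
# method on a monotone game operator.
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(5, 5))

def F(x, y):
    """Monotone game operator: descent direction for x, ascent for y."""
    return M @ y, -M.T @ x

x, y = rng.normal(size=5), rng.normal(size=5)
eta = 0.1
for _ in range(500):
    gx, gy = F(x, y)
    x_half, y_half = x - eta * gx, y - eta * gy        # look-ahead step
    gx, gy = F(x_half, y_half)
    x, y = x - eta * gx, y - eta * gy                  # corrected step

print(np.linalg.norm(x), np.linalg.norm(y))   # both shrink toward the equilibrium (0, 0)
```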
Bayesian inference on random simple graphs with power law degree distributions
Title | Bayesian inference on random simple graphs with power law degree distributions |
Authors | Juho Lee, Creighton Heaukulani, Zoubin Ghahramani, Lancelot F. James, Seungjin Choi |
Abstract | We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. Our construction readily extends to capture the structure of latent factors, similarly to stochastic blockmodels, while maintaining its power law degree distribution. The BFRY random variables are well approximated by gamma random variables in a variational Bayesian inference routine, which we apply to several network datasets for which power law degree distributions are a natural assumption. By learning the parameters of the BFRY distribution via probabilistic inference, we are able to automatically select the appropriate power law behavior from the data. In order to further scale our inference procedure, we adopt stochastic gradient ascent routines where the gradients are computed on minibatches (i.e., subsets) of the edges in the graph. |
Tasks | Bayesian Inference |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08239v2 |
http://arxiv.org/pdf/1702.08239v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-inference-on-random-simple-graphs |
Repo | |
Framework | |
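A heavily hedged sketch of the generative side: BFRY(alpha) weights can be sampled via the Gamma/Uniform representation G * U^(-1/alpha), and a weight-based link function then produces a simple graph with heavy-tailed degrees. The link function 1 - exp(-w_i w_j / n) used here is a generic assumption for illustration; the paper's exact edge-probability construction and its variational inference routine are not reproduced.

```python
# Sampling BFRY-distributed weights and wiring a random simple graph with a
# generic weight-based link function (the link function is an assumption).
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 300, 0.5

# BFRY(alpha) sample: G / U**(1/alpha) with G ~ Gamma(1 - alpha, 1), U ~ Uniform(0, 1).
w = rng.gamma(shape=1.0 - alpha, scale=1.0, size=n) / rng.uniform(size=n) ** (1.0 / alpha)

# Generic link function (an assumption): edge ij present w.p. 1 - exp(-w_i w_j / n).
p = 1.0 - np.exp(-np.outer(w, w) / n)
upper = np.triu(rng.uniform(size=(n, n)) < p, k=1)
adj = upper | upper.T                     # symmetric, no self-loops -> simple graph

degrees = adj.sum(axis=1)
print(degrees.max(), np.median(degrees))  # a few hubs, many low-degree nodes
```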
Multi-Image Semantic Matching by Mining Consistent Features
Title | Multi-Image Semantic Matching by Mining Consistent Features |
Authors | Qianqian Wang, Xiaowei Zhou, Kostas Daniilidis |
Abstract | This work proposes a multi-image matching method to estimate semantic correspondences across multiple images. In contrast to previous methods that optimize all pairwise correspondences, the proposed method identifies and matches only a sparse set of reliable features in the image collection. In this way, the proposed method is able to prune non-repeatable features and is highly scalable, handling thousands of images. We additionally propose a low-rank constraint to ensure the geometric consistency of feature correspondences over the whole image collection. Besides the competitive performance on multi-graph matching and semantic flow benchmarks, we also demonstrate the applicability of the proposed method for reconstructing object-class models and discovering object-class landmarks from images without using any annotation. |
Tasks | Graph Matching |
Published | 2017-11-21 |
URL | http://arxiv.org/abs/1711.07641v2 |
http://arxiv.org/pdf/1711.07641v2.pdf | |
PWC | https://paperswithcode.com/paper/multi-image-semantic-matching-by-mining |
Repo | |
Framework | |
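The low-rank and cycle-consistency structure exploited by the method can be illustrated with a small synthetic example: pairwise permutations induced by a shared universe of features compose consistently, and the stacked matching matrix has rank equal to the universe size. The construction below is my illustration, not the paper's optimization.

```python
# Cycle consistency and low rank of multi-image matchings on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
k, n_images = 6, 3                      # 6 universe features, 3 images

# Each image sees the universe features in its own random order.
perms = [rng.permutation(k) for _ in range(n_images)]
maps = [np.eye(k)[p] for p in perms]    # image-to-universe assignment matrices

def pairwise(i, j):
    """Permutation matching features of image i to features of image j."""
    return maps[i] @ maps[j].T

P12, P23, P13 = pairwise(0, 1), pairwise(1, 2), pairwise(0, 2)
print(np.allclose(P13, P12 @ P23))      # cycle consistency holds -> True

# Stacking all pairwise matchings gives a low-rank (rank-k) matrix.
big = np.block([[pairwise(i, j) for j in range(n_images)] for i in range(n_images)])
print(np.linalg.matrix_rank(big))       # == k
```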
A neural network system for transformation of regional cuisine style
Title | A neural network system for transformation of regional cuisine style |
Authors | Masahiro Kazama, Minami Sugimoto, Chizuru Hosokawa, Keisuke Matsushima, Lav R. Varshney, Yoshiki Ishikawa |
Abstract | We propose a novel system which can transform a recipe into any selected regional style (e.g., Japanese, Mediterranean, or Italian). This system has two characteristics. First, the system can identify the degree of regional cuisine style mixture of any selected recipe and visualize such regional cuisine style mixtures using barycentric Newton diagrams. Second, the system can suggest ingredient substitutions through an extended word2vec model, such that a recipe becomes more authentic for any selected regional cuisine style. Drawing on a large number of recipes from Yummly, an example shows how the proposed system can transform a traditional Japanese recipe, Sukiyaki, into French style. |
Tasks | |
Published | 2017-05-06 |
URL | http://arxiv.org/abs/1705.03487v2 |
http://arxiv.org/pdf/1705.03487v2.pdf | |
PWC | https://paperswithcode.com/paper/a-neural-network-system-for-transformation-of |
Repo | |
Framework | |
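The ingredient-substitution step can be sketched as word-vector arithmetic: ingredient minus source style plus target style, followed by a nearest-neighbour lookup. The vocabulary and embeddings below are random stand-ins; the paper learns such embeddings with an extended word2vec model over recipes from Yummly.

```python
# A toy sketch of ingredient substitution via embedding arithmetic.
# The embeddings are random placeholders, so the printed suggestion is arbitrary.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["soy_sauce", "mirin", "dashi", "olive_oil", "red_wine", "butter",
         "japanese", "french"]
emb = {w: rng.normal(size=16) for w in vocab}
for v in emb.values():
    v /= np.linalg.norm(v)

def substitute(ingredient, source_style, target_style):
    query = emb[ingredient] - emb[source_style] + emb[target_style]
    candidates = [w for w in vocab if w not in (ingredient, source_style, target_style)]
    scores = {w: float(query @ emb[w] / np.linalg.norm(query)) for w in candidates}
    return max(scores, key=scores.get)

# With trained embeddings this would suggest a French-style stand-in for dashi.
print(substitute("dashi", "japanese", "french"))
```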
Automatic Salient Object Detection for Panoramic Images Using Region Growing and Fixation Prediction Model
Title | Automatic Salient Object Detection for Panoramic Images Using Region Growing and Fixation Prediction Model |
Authors | Chunbiao Zhu, Kan Huang, Ge Li |
Abstract | Almost all previous work on saliency detection has been dedicated to conventional images; however, with the surge of panoramic images driven by the rapid development of VR and AR technology, extracting salient content from panoramic images is becoming both more challenging and more valuable. In this paper, we propose a novel bottom-up salient object detection framework for panoramic images. First, we employ a spatial density estimation method to roughly extract object proposal regions, with the help of a region growing algorithm. Meanwhile, an eye fixation model is utilized to predict visually attractive parts of the image from the perspective of the human visual search mechanism. Then, the two results are combined by maxima normalization to obtain the coarse saliency map. Finally, a refinement step based on geodesic distance is utilized for post-processing to derive the final saliency map. To fairly evaluate the performance of the proposed approach, we propose a high-quality dataset of panoramic images (SalPan). Extensive evaluations demonstrate the effectiveness of our proposed method on panoramic images and its superiority over other methods. |
Tasks | Density Estimation, Object Detection, Saliency Detection, Salient Object Detection |
Published | 2017-10-10 |
URL | http://arxiv.org/abs/1710.04071v6 |
http://arxiv.org/pdf/1710.04071v6.pdf | |
PWC | https://paperswithcode.com/paper/automatic-salient-object-detection-for |
Repo | |
Framework | |
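The combination step can be illustrated with a simple maxima-normalization operator in the spirit of Itti et al.; the exact operator, and the region-growing and fixation maps it combines, are assumptions here, not the authors' implementation.

```python
# An illustrative combination of two toy saliency maps by maxima normalization:
# maps dominated by a few strong peaks are up-weighted before summation.
import numpy as np

def maxima_normalize(sal):
    sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)  # rescale to [0, 1]
    weight = (sal.max() - sal.mean()) ** 2                    # peaked maps get larger weight
    return weight * sal

rng = np.random.default_rng(0)
region_map = rng.uniform(size=(64, 128))                 # toy region-growing proposal map
fixation_map = rng.uniform(size=(64, 128)) ** 4          # toy eye-fixation prediction map

coarse = maxima_normalize(region_map) + maxima_normalize(fixation_map)
coarse /= coarse.max()                                   # coarse saliency map in [0, 1]
print(coarse.shape, coarse.max())
```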
Multivariate Gaussian and Student$-t$ Process Regression for Multi-output Prediction
Title | Multivariate Gaussian and Student$-t$ Process Regression for Multi-output Prediction |
Authors | Zexun Chen, Bo Wang, Alexander N. Gorban |
Abstract | The Gaussian process model for vector-valued functions has been shown to be useful for multi-output prediction. The existing method for this model is to re-formulate the matrix-variate Gaussian distribution as a multivariate normal distribution. Although it is effective in many cases, this re-formulation is not always workable and is difficult to apply to other distributions, because not all matrix-variate distributions can be transformed into their respective multivariate distributions, as is the case for the matrix-variate Student$-t$ distribution. In this paper, we propose a unified framework which is used not only to introduce a novel multivariate Student$-t$ process regression model (MV-TPR) for multi-output prediction, but also to reformulate multivariate Gaussian process regression (MV-GPR) in a way that overcomes some limitations of the existing methods. Both MV-GPR and MV-TPR have closed-form expressions for the marginal likelihoods and predictive distributions under this unified framework and thus can adopt the same optimization approaches as used in conventional GPR. The usefulness of the proposed methods is illustrated through several simulated and real data examples. In particular, we verify empirically that MV-TPR has superiority on the datasets considered, including air quality prediction and bike rent prediction. Finally, the proposed methods are shown to produce profitable investment strategies in the stock markets. |
Tasks | |
Published | 2017-03-13 |
URL | http://arxiv.org/abs/1703.04455v6 |
http://arxiv.org/pdf/1703.04455v6.pdf | |
PWC | https://paperswithcode.com/paper/multivariate-gaussian-and-student-t-process |
Repo | |
Framework | |
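For reference, the closed-form predictive equations that MV-GPR and MV-TPR inherit from conventional GP regression look like the single-output sketch below; the matrix-variate generalization in the paper is not reproduced. The RBF kernel, noise level and toy data are assumptions.

```python
# Standard single-output GP regression with closed-form predictive mean and covariance.
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
X_star = np.linspace(-3, 3, 100)[:, None]

noise = 0.1 ** 2
K = rbf(X, X) + noise * np.eye(len(X))
K_s = rbf(X, X_star)
K_ss = rbf(X_star, X_star)

L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = K_s.T @ alpha                                    # predictive mean
v = np.linalg.solve(L, K_s)
cov = K_ss - v.T @ v                                    # predictive covariance
print(mean.shape, np.diag(cov).max())
```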
Compiling Deep Learning Models for Custom Hardware Accelerators
Title | Compiling Deep Learning Models for Custom Hardware Accelerators |
Authors | Andre Xian Ming Chang, Aliasger Zaidy, Vinayak Gokhale, Eugenio Culurciello |
Abstract | Convolutional neural networks (CNNs) are the core of most state-of-the-art deep learning algorithms specialized for object detection and classification. CNNs are both computationally complex and embarrassingly parallel, two properties that leave room for potential software and hardware optimizations for embedded systems. Given a programmable hardware accelerator with a CNN-oriented custom instruction set, the compiler’s task is to exploit the hardware’s full potential, while abiding by the hardware constraints and maintaining generality to run different CNN models with varying workload properties. Snowflake is an efficient and scalable hardware accelerator implemented on programmable logic devices. It implements a control pipeline for a custom instruction set. The goal of this paper is to present Snowflake’s compiler, which generates machine-level instructions from Torch7 model description files. The main software design points explored in this work are: model structure parsing, CNN workload breakdown, loop rearrangement for memory bandwidth optimizations and memory access balancing. The performance achieved by compiler-generated instructions matches hand-optimized code for convolution layers. Generated instructions also efficiently execute AlexNet and ResNet18 inference on Snowflake. Snowflake with $256$ processing units was synthesized on Xilinx’s Zynq XC7Z045 FPGA. At $250$ MHz, AlexNet achieved $93.6$ frames/s and $1.2$ GB/s of off-chip memory bandwidth, and ResNet18 achieved $21.4$ frames/s and $2.2$ GB/s. Total on-chip power is $5$ W. |
Tasks | Object Detection |
Published | 2017-08-01 |
URL | http://arxiv.org/abs/1708.00117v2 |
http://arxiv.org/pdf/1708.00117v2.pdf | |
PWC | https://paperswithcode.com/paper/compiling-deep-learning-models-for-custom |
Repo | |
Framework | |
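One of the compiler's design points, loop rearrangement for memory bandwidth, can be illustrated with a toy convolution written two ways: the same arithmetic, but with the output-channel loop hoisted so a weight tile is loaded once and reused across a whole output plane instead of being re-fetched per pixel. This is a schematic illustration, not Snowflake's instruction generation.

```python
# Two loop orderings of the same convolution: spatial-outermost vs. weight-stationary.
import numpy as np

rng = np.random.default_rng(0)
C_in, C_out, H, W, K = 3, 4, 8, 8, 3
x = rng.normal(size=(C_in, H, W))
w = rng.normal(size=(C_out, C_in, K, K))

def conv_reference(x, w):
    C_out, C_in, K, _ = w.shape
    H_out, W_out = x.shape[1] - K + 1, x.shape[2] - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for i in range(H_out):                # spatial loops outermost:
        for j in range(W_out):            # every pixel re-reads all weights
            for co in range(C_out):
                y[co, i, j] = (x[:, i:i+K, j:j+K] * w[co]).sum()
    return y

def conv_weight_stationary(x, w):
    C_out, C_in, K, _ = w.shape
    H_out, W_out = x.shape[1] - K + 1, x.shape[2] - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for co in range(C_out):               # weight tile loaded once per output channel
        w_tile = w[co]
        for i in range(H_out):
            for j in range(W_out):
                y[co, i, j] = (x[:, i:i+K, j:j+K] * w_tile).sum()
    return y

print(np.allclose(conv_reference(x, w), conv_weight_stationary(x, w)))   # same result
```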
Learning Human Motion Models for Long-term Predictions
Title | Learning Human Motion Models for Long-term Predictions |
Authors | Partha Ghosh, Jie Song, Emre Aksan, Otmar Hilliges |
Abstract | We propose a new architecture for the learning of predictive spatio-temporal motion models from data alone. Our approach, dubbed the Dropout Autoencoder LSTM, is capable of synthesizing natural looking motion sequences over long time horizons without catastrophic drift or motion degradation. The model consists of two components, a 3-layer recurrent neural network to model temporal aspects and a novel auto-encoder that is trained to implicitly recover the spatial structure of the human skeleton via randomly removing information about joints during training time. This Dropout Autoencoder (D-AE) is then used to filter each predicted pose of the LSTM, reducing accumulation of error and hence drift over time. Furthermore, we propose new evaluation protocols to assess the quality of synthetic motion sequences even for which no ground truth data exists. The proposed protocols can be used to assess generated sequences of arbitrary length. Finally, we evaluate our proposed method on two of the largest motion-capture datasets available to date and show that our model outperforms the state-of-the-art on a variety of actions, including cyclic and acyclic motion, and that it can produce natural looking sequences over longer time horizons than previous methods. |
Tasks | Motion Capture |
Published | 2017-04-10 |
URL | http://arxiv.org/abs/1704.02827v2 |
http://arxiv.org/pdf/1704.02827v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-human-motion-models-for-long-term |
Repo | |
Framework | |
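A schematic reading of the inference loop described in the abstract: an LSTM proposes the next pose and a denoising autoencoder, trained with joints randomly dropped, filters the prediction before it is fed back, limiting error accumulation over long horizons. The layer sizes, pose dimensionality and untrained weights below are assumptions; only the filter-then-feed-back structure is taken from the abstract.

```python
# A sketch of the Dropout-Autoencoder-filtered autoregressive rollout (untrained toy model).
import torch
import torch.nn as nn

pose_dim, hidden = 54, 128              # e.g. 18 joints x 3 values (an assumption)

rnn = nn.LSTM(pose_dim, hidden, num_layers=3, batch_first=True)
readout = nn.Linear(hidden, pose_dim)
dae = nn.Sequential(                    # denoising autoencoder over single poses
    nn.Linear(pose_dim, 64), nn.ReLU(), nn.Linear(64, pose_dim)
)

def predict(seed_poses, horizon):
    """Autoregressively roll out `horizon` future poses from a seed sequence."""
    with torch.no_grad():
        _, state = rnn(seed_poses)              # warm up on observed poses
        pose = seed_poses[:, -1:, :]
        outputs = []
        for _ in range(horizon):
            out, state = rnn(pose, state)
            pose = readout(out)                 # raw next-pose prediction
            pose = dae(pose)                    # D-AE filtering step before feedback
            outputs.append(pose)
        return torch.cat(outputs, dim=1)

seed = torch.randn(1, 50, pose_dim)             # random seed motion (toy input)
print(predict(seed, horizon=100).shape)         # torch.Size([1, 100, 54])
```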
Anomaly Detection: Review and preliminary Entropy method tests
Title | Anomaly Detection: Review and preliminary Entropy method tests |
Authors | Pelumi Oluwasanya |
Abstract | Anomalies are strange data points; they usually represent an unusual occurrence. Anomaly detection is presented here from the perspective of wireless sensor networks. Different approaches have been taken in the past, as we will see, not only to identify outliers, but also to establish the statistical properties of the different methods. The usual goal is to show that the approach is asymptotically efficient and that the metric used is unbiased or maybe biased. This project is based on work done in [1]. The approach is based on the principle that the entropy of the data is increased when an anomalous data point is measured. The entropy of the data set is thus to be estimated. In this report, however, preliminary efforts at confirming the results of [1] are presented. To estimate the entropy of the dataset, since no parametric form is assumed, the probability density function of the data set is first estimated using a data-split method. This estimated pdf value is then plugged into the entropy estimation formula to estimate the entropy of the dataset. The data (test signal) used in this report is Gaussian distributed with zero mean and variance 4. Results of pdf estimation using the k-nearest neighbour method on the entire dataset and on a data split are presented and compared based on how well they approximate the probability density function of a Gaussian with similar mean and variance. The number of nearest neighbours chosen for the purpose of this report is 8. This is arbitrary, but is reasonable since the number of anomalies introduced is expected to be less than this upon data split. The data-split method is preferred, and rightly so. |
Tasks | Anomaly Detection |
Published | 2017-08-29 |
URL | http://arxiv.org/abs/1708.08813v1 |
http://arxiv.org/pdf/1708.08813v1.pdf | |
PWC | https://paperswithcode.com/paper/anomaly-detection-review-and-preliminary |
Repo | |
Framework | |
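The plug-in estimator described in the abstract can be sketched directly: estimate the pdf with a k-NN estimator on one half of the data, evaluate it on the other half, and plug the values into H ≈ -mean(log p̂). The Gaussian test signal (zero mean, variance 4) and k = 8 follow the report; the split sizes are assumptions.

```python
# Data-split k-NN plug-in entropy estimate, compared with the true Gaussian entropy.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=2000)          # test signal: N(0, 4)
density_split, eval_split = x[:1000], x[1000:]
k = 8

def knn_density(points, query, k):
    """1-D k-NN density estimate: p(x) ~= k / (m * 2 * r_k(x))."""
    r_k = np.sort(np.abs(points[None, :] - query[:, None]), axis=1)[:, k - 1]
    return k / (len(points) * 2.0 * r_k)

p_hat = knn_density(density_split, eval_split, k)
h_hat = -np.mean(np.log(p_hat))
print(h_hat, 0.5 * np.log(2 * np.pi * np.e * 4.0))   # estimate vs. true Gaussian entropy
```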
Adversarial Networks for the Detection of Aggressive Prostate Cancer
Title | Adversarial Networks for the Detection of Aggressive Prostate Cancer |
Authors | Simon Kohl, David Bonekamp, Heinz-Peter Schlemmer, Kaneschka Yaqubi, Markus Hohenfellner, Boris Hadaschik, Jan-Philipp Radtke, Klaus Maier-Hein |
Abstract | Semantic segmentation constitutes an integral part of medical image analyses for which breakthroughs in the field of deep learning were of high relevance. The large number of trainable parameters of deep neural networks, however, renders them inherently data-hungry, a characteristic that heavily challenges the medical imaging community. Interestingly, because the de facto standard training of fully convolutional networks (FCNs) for semantic segmentation is agnostic towards the “structure” of the predicted label maps, valuable complementary information about the global quality of the segmentation lies idle. In order to tap into this potential, we propose utilizing an adversarial network which discriminates between expert and generated annotations in order to train FCNs for semantic segmentation. Because the adversary constitutes a learned parametrization of what makes a good segmentation at a global level, we hypothesize that the method holds particular advantages for segmentation tasks on complex structured, small datasets. This holds true in our experiments: we learn to segment aggressive prostate cancer utilizing MRI images of 152 patients and show that the proposed scheme is superior to the de facto standard in terms of detection sensitivity and Dice score for aggressive prostate cancer. The achieved relative gains are shown to be particularly pronounced in the small-dataset limit. |
Tasks | Semantic Segmentation |
Published | 2017-02-26 |
URL | http://arxiv.org/abs/1702.08014v1 |
http://arxiv.org/pdf/1702.08014v1.pdf | |
PWC | https://paperswithcode.com/paper/adversarial-networks-for-the-detection-of |
Repo | |
Framework | |
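A condensed sketch of the training signal: the segmenter minimizes a pixel-wise loss plus a term that asks a label-map discriminator to rate its outputs as expert-like, while the discriminator learns to tell generated masks from expert masks. The tiny networks, toy batch and loss weighting below are assumptions, not the authors' architecture.

```python
# One step each of discriminator and segmenter training with an adversarial term.
import torch
import torch.nn as nn
import torch.nn.functional as F

segmenter = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 3, padding=1))                 # tiny stand-in FCN
discriminator = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(),
                              nn.Flatten(), nn.Linear(8 * 15 * 15, 1))   # label-map critic

opt_s = torch.optim.Adam(segmenter.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
lam = 0.1                                                  # adversarial loss weight

image = torch.randn(4, 1, 32, 32)                          # toy image batch
target = (torch.rand(4, 1, 32, 32) > 0.8).float()          # toy expert masks

# Discriminator step: expert masks -> 1, generated masks -> 0.
pred = torch.sigmoid(segmenter(image)).detach()
d_loss = F.binary_cross_entropy_with_logits(discriminator(target), torch.ones(4, 1)) + \
         F.binary_cross_entropy_with_logits(discriminator(pred), torch.zeros(4, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Segmenter step: pixel-wise loss plus fooling term on the discriminator.
pred = torch.sigmoid(segmenter(image))
s_loss = F.binary_cross_entropy(pred, target) + \
         lam * F.binary_cross_entropy_with_logits(discriminator(pred), torch.ones(4, 1))
opt_s.zero_grad(); s_loss.backward(); opt_s.step()
print(float(d_loss), float(s_loss))
```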
ORBIT: Ordering Based Information Transfer Across Space and Time for Global Surface Water Monitoring
Title | ORBIT: Ordering Based Information Transfer Across Space and Time for Global Surface Water Monitoring |
Authors | Ankush Khandelwal, Anuj Karpatne, Vipin Kumar |
Abstract | Many earth science applications require data at both high spatial and high temporal resolution for effective monitoring of various ecosystem resources. Due to practical limitations in sensor design, there is often a trade-off between the different resolutions of spatio-temporal datasets, and hence a single sensor alone cannot provide all the required information. Various data fusion methods have been proposed in the literature that mainly rely on individual timesteps when both datasets are available to learn a mapping between feature values at different resolutions using local relationships between pixels. Earth observation data is often plagued with spatially and temporally correlated noise, outliers and missing data due to atmospheric disturbances, which poses a challenge in learning the mapping from a local neighborhood at individual timesteps. In this paper, we aim to exploit time-independent global relationships between pixels for robust transfer of information across different scales. Specifically, we propose a new framework, ORBIT (Ordering Based Information Transfer), that uses a relative ordering constraint among pixels to transfer information across both time and scales. The effectiveness of the framework is demonstrated for global surface water monitoring using both synthetic and real-world datasets. |
Tasks | |
Published | 2017-11-15 |
URL | http://arxiv.org/abs/1711.05799v1 |
http://arxiv.org/pdf/1711.05799v1.pdf | |
PWC | https://paperswithcode.com/paper/orbit-ordering-based-information-transfer |
Repo | |
Framework | |
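A very loose toy illustration of the ordering idea (not the ORBIT algorithm itself): if the relative ordering of pixels is stable over time, a ranking learned from clean timesteps can be used to repair a corrupted snapshot by re-assigning its sorted values according to that ranking. All data and the repair rule below are assumptions for illustration only.

```python
# Repairing a noisy snapshot with a time-independent pixel ordering (toy example).
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_timesteps = 500, 40
base = np.sort(rng.uniform(size=n_pixels))              # latent per-pixel "wetness" level
clean = base[None, :] * rng.uniform(0.5, 1.5, size=(n_timesteps, 1))   # scaled per timestep

ranking = np.argsort(clean.mean(axis=0))                # global ordering from clean data

noisy = clean[0] + rng.normal(scale=0.3, size=n_pixels)  # one corrupted snapshot
repaired = np.empty(n_pixels)
repaired[ranking] = np.sort(noisy)                       # impose the learned ordering

# The repaired error is typically smaller than the raw noisy error.
print(np.abs(noisy - clean[0]).mean(), np.abs(repaired - clean[0]).mean())
```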