January 30, 2020

3076 words 15 mins read

Paper Group ANR 274

Paper Group ANR 274

Extending and Automating Basic Probability Theory with Propositional Computability Logic. Array Codes with Local Properties. Smoothing Structured Decomposable Circuits. Option Encoder: A Framework for Discovering a Policy Basis in Reinforcement Learning. Scaling data-driven robotics with reward sketching and batch reinforcement learning. Sparse-GAN …

Extending and Automating Basic Probability Theory with Propositional Computability Logic

Title Extending and Automating Basic Probability Theory with Propositional Computability Logic
Authors Keehang Kwon
Abstract Classical probability theory is formulated using sets. In this paper, we extend classical probability theory with propositional computability logic. Unlike other formalisms, computability logic is built on the notion of events/games, which is central to probability theory. The probability theory based on CoL is therefore useful for {\it automating} uncertainty reasoning. We describe some basic properties of this new probability theory. We also discuss a novel isomorphism between the set operations and computability logic operations.
Tasks
Published 2019-09-16
URL https://arxiv.org/abs/1909.07375v2
PDF https://arxiv.org/pdf/1909.07375v2.pdf
PWC https://paperswithcode.com/paper/extending-and-automating-basic-probability
Repo
Framework

Array Codes with Local Properties

Title Array Codes with Local Properties
Authors Mario Blaum, Steven R. Hetzler
Abstract In general, array codes consist of $m\times n$ arrays and in many cases, the arrays satisfy parity constraints along lines of different slopes (generally with a toroidal topology). Such codes are useful for RAID type of architectures, since they allow to replace finite field operations by XORs. We present expansions to traditional array codes of this type, like Blaum-Roth (BR) and extended EVENODD codes, by adding parity on columns. This vertical parity allows for recovery of one or more symbols in a column locally, i.e., by using the remaining symbols in the column without invoking the rest of the array. Properties and applications of the new codes are discussed, in particular to Locally Recoverable (LRC) codes.
Tasks
Published 2019-06-27
URL https://arxiv.org/abs/1906.11731v2
PDF https://arxiv.org/pdf/1906.11731v2.pdf
PWC https://paperswithcode.com/paper/array-codes-with-local-properties
Repo
Framework

Smoothing Structured Decomposable Circuits

Title Smoothing Structured Decomposable Circuits
Authors Andy Shih, Guy Van den Broeck, Paul Beame, Antoine Amarilli
Abstract We study the task of smoothing a circuit, i.e., ensuring that all children of a plus-gate mention the same variables. Circuits serve as the building blocks of state-of-the-art inference algorithms on discrete probabilistic graphical models and probabilistic programs. They are also important for discrete density estimation algorithms. Many of these tasks require the input circuit to be smooth. However, smoothing has not been studied in its own right yet, and only a trivial quadratic algorithm is known. This paper studies efficient smoothing for structured decomposable circuits. We propose a near-linear time algorithm for this task and explore lower bounds for smoothing decomposable circuits, using existing results on range-sum queries. Further, for the important case of All-Marginals, we show a more efficient linear-time algorithm. We validate experimentally the performance of our methods.
Tasks Density Estimation
Published 2019-06-01
URL https://arxiv.org/abs/1906.00311v2
PDF https://arxiv.org/pdf/1906.00311v2.pdf
PWC https://paperswithcode.com/paper/190600311
Repo
Framework

Option Encoder: A Framework for Discovering a Policy Basis in Reinforcement Learning

Title Option Encoder: A Framework for Discovering a Policy Basis in Reinforcement Learning
Authors Arjun Manoharan, Rahul Ramesh, Balaraman Ravindran
Abstract Option discovery and skill acquisition frameworks are integral to the functioning of a Hierarchically organized Reinforcement learning agent. However, such techniques often yield a large number of options or skills, which can potentially be represented succinctly by filtering out any redundant information. Such a reduction can reduce the required computation while also improving the performance on a target task. In order to compress an array of option policies, we attempt to find a policy basis that accurately captures the set of all options. In this work, we propose Option Encoder, an auto-encoder based framework with intelligently constrained weights, that helps discover a collection of basis policies. The policy basis can be used as a proxy for the original set of skills in a suitable hierarchically organized framework. We demonstrate the efficacy of our method on a collection of grid-worlds and on the high-dimensional Fetch-Reach robotic manipulation task by evaluating the obtained policy basis on a set of downstream tasks.
Tasks
Published 2019-09-09
URL https://arxiv.org/abs/1909.04134v1
PDF https://arxiv.org/pdf/1909.04134v1.pdf
PWC https://paperswithcode.com/paper/option-encoder-a-framework-for-discovering-a
Repo
Framework

Scaling data-driven robotics with reward sketching and batch reinforcement learning

Title Scaling data-driven robotics with reward sketching and batch reinforcement learning
Authors Serkan Cabi, Sergio Gómez Colmenarejo, Alexander Novikov, Ksenia Konyushkova, Scott Reed, Rae Jeong, Konrad Zolna, Yusuf Aytar, David Budden, Mel Vecerik, Oleg Sushkov, David Barker, Jonathan Scholz, Misha Denil, Nando de Freitas, Ziyu Wang
Abstract We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions. We show how to apply this framework to accomplish three different object manipulation tasks on a real robot platform. Given demonstrations of a task together with task-agnostic recorded experience, we use a special form of human annotation as supervision to learn a reward function, which enables us to deal with real-world tasks where the reward signal cannot be acquired directly. Learned rewards are used in combination with a large dataset of experience from different tasks to learn a robot policy offline using batch RL. We show that using our approach it is possible to train agents to perform a variety of challenging manipulation tasks including stacking rigid objects and handling cloth.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.12200v2
PDF https://arxiv.org/pdf/1909.12200v2.pdf
PWC https://paperswithcode.com/paper/a-framework-for-data-driven-robotics
Repo
Framework

Sparse-GAN: Sparsity-constrained Generative Adversarial Network for Anomaly Detection in Retinal OCT Image

Title Sparse-GAN: Sparsity-constrained Generative Adversarial Network for Anomaly Detection in Retinal OCT Image
Authors Kang Zhou, Shenghua Gao, Jun Cheng, Zaiwang Gu, Huazhu Fu, Zhi Tu, Jianlong Yang, Yitian Zhao, Jiang Liu
Abstract With the development of convolutional neural network, deep learning has shown its success for retinal disease detection from optical coherence tomography (OCT) images. However, deep learning often relies on large scale labelled data for training, which is oftentimes challenging especially for disease with low occurrence. Moreover, a deep learning system trained from data-set with one or a few diseases is unable to detect other unseen diseases, which limits the practical usage of the system in disease screening. To address the limitation, we propose a novel anomaly detection framework termed Sparsity-constrained Generative Adversarial Network (Sparse-GAN) for disease screening where only healthy data are available in the training set. The contributions of Sparse-GAN are two-folds: 1) The proposed Sparse-GAN predicts the anomalies in latent space rather than image-level; 2) Sparse-GAN is constrained by a novel Sparsity Regularization Net. Furthermore, in light of the role of lesions for disease screening, we present to leverage on an anomaly activation map to show the heatmap of lesions. We evaluate our proposed Sparse-GAN on a publicly available dataset, and the results show that the proposed method outperforms the state-of-the-art methods.
Tasks Anomaly Detection
Published 2019-11-28
URL https://arxiv.org/abs/1911.12527v3
PDF https://arxiv.org/pdf/1911.12527v3.pdf
PWC https://paperswithcode.com/paper/sparse-gan-sparsity-constrained-generative
Repo
Framework

Online Inference for Advertising Auctions

Title Online Inference for Advertising Auctions
Authors Caio Waisman, Harikesh S. Nair, Carlos Carrion, Nan Xu
Abstract Advertisers that engage in real-time bidding (RTB) to display their ads commonly have two goals: learning their optimal bidding policy and estimating the expected effect of exposing users to their ads. Typical strategies to accomplish one of these goals tend to ignore the other, creating an apparent tension between the two. This paper exploits the economic structure of the bid optimization problem faced by advertisers to show that these two objectives can actually be perfectly aligned. By framing the advertiser’s problem as a multi-armed bandit (MAB) problem, we propose a modified Thompson Sampling (TS) algorithm that concurrently learns the optimal bidding policy and estimates the expected effect of displaying the ad while minimizing economic losses from potential sub-optimal bidding. Simulations show that not only the proposed method successfully accomplishes the advertiser’s goals, but also does so at a much lower cost than more conventional experimentation policies aimed at performing causal inference.
Tasks Causal Inference
Published 2019-08-22
URL https://arxiv.org/abs/1908.08600v1
PDF https://arxiv.org/pdf/1908.08600v1.pdf
PWC https://paperswithcode.com/paper/online-inference-for-advertising-auctions
Repo
Framework

Blocking Bandits

Title Blocking Bandits
Authors Soumya Basu, Rajat Sen, Sujay Sanghavi, Sanjay Shakkottai
Abstract We consider a novel stochastic multi-armed bandit setting, where playing an arm makes it unavailable for a fixed number of time slots thereafter. This models situations where reusing an arm too often is undesirable (e.g. making the same product recommendation repeatedly) or infeasible (e.g. compute job scheduling on machines). We show that with prior knowledge of the rewards and delays of all the arms, the problem of optimizing cumulative reward does not admit any pseudo-polynomial time algorithm (in the number of arms) unless randomized exponential time hypothesis is false, by mapping to the PINWHEEL scheduling problem. Subsequently, we show that a simple greedy algorithm that plays the available arm with the highest reward is asymptotically $(1-1/e)$ optimal. When the rewards are unknown, we design a UCB based algorithm which is shown to have $c \log T + o(\log T)$ cumulative regret against the greedy algorithm, leveraging the free exploration of arms due to the unavailability. Finally, when all the delays are equal the problem reduces to Combinatorial Semi-bandits providing us with a lower bound of $c’ \log T+ \omega(\log T)$.
Tasks Product Recommendation
Published 2019-07-27
URL https://arxiv.org/abs/1907.11975v1
PDF https://arxiv.org/pdf/1907.11975v1.pdf
PWC https://paperswithcode.com/paper/blocking-bandits
Repo
Framework

Generalized Policy Iteration for Optimal Control in Continuous Time

Title Generalized Policy Iteration for Optimal Control in Continuous Time
Authors Jingliang Duan, Shengbo Eben Li, Zhengyu Liu, Monimoy Bujarbaruah, Bo Cheng
Abstract This paper proposes the Deep Generalized Policy Iteration (DGPI) algorithm to find the infinite horizon optimal control policy for general nonlinear continuous-time systems with known dynamics. Unlike existing adaptive dynamic programming algorithms for continuous time systems, DGPI does not require the admissibility of initialized policy, and input-affine nature of controlled systems for convergence. Our algorithm employs the actor-critic architecture to approximate both policy and value functions with the purpose of iteratively solving the Hamilton-Jacobi-Bellman equation. Both the policy and value functions are approximated by deep neural networks. Given any arbitrary initial policy, the proposed DGPI algorithm can eventually converge to an admissible, and subsequently an optimal policy for an arbitrary nonlinear system. We also relax the update termination conditions of both the policy evaluation and improvement processes, which leads to a faster convergence speed than conventional Policy Iteration (PI) methods, for the same architecture of function approximators. We further prove the convergence and optimality of the algorithm with thorough Lyapunov analysis, and demonstrate its generality and efficacy using two detailed numerical examples.
Tasks
Published 2019-09-11
URL https://arxiv.org/abs/1909.05402v1
PDF https://arxiv.org/pdf/1909.05402v1.pdf
PWC https://paperswithcode.com/paper/generalized-policy-iteration-for-optimal
Repo
Framework

A supervised-learning-based strategy for optimal demand response of an HVAC System

Title A supervised-learning-based strategy for optimal demand response of an HVAC System
Authors Youngjin Kim
Abstract The large thermal capacity of buildings enables heating, ventilating, and air-conditioning (HVAC) systems to be exploited as demand response (DR) resources. Optimal DR of HVAC units is challenging, particularly for multi-zone buildings, because this requires detailed physics-based models of zonal temperature variations for HVAC system operation and building thermal conditions. This paper proposes a new strategy for optimal DR of an HVAC system in a multi-zone building, based on supervised learning (SL). Artificial neural networks (ANNs) are trained with data obtained under normal building operating conditions. The ANNs are replicated using piecewise linear equations, which are explicitly integrated into an optimal scheduling problem for price-based DR. The optimization problem is solved for various electricity prices and building thermal conditions. The solutions are further used to train a deep neural network (DNN) to directly determine the optimal DR schedule, referred to here as supervised-learning-aided meta-prediction (SLAMP). Case studies are performed using three different methods: explicit ANN replication (EAR), SLAMP, and physics-based modeling. The case study results verify the effectiveness of the proposed SL-based strategy, in terms of both practical applicability and computational time, while also ensuring the thermal comfort of occupants and cost-effective operation of the HVAC system.
Tasks
Published 2019-04-29
URL http://arxiv.org/abs/1904.13304v1
PDF http://arxiv.org/pdf/1904.13304v1.pdf
PWC https://paperswithcode.com/paper/a-supervised-learning-based-strategy-for
Repo
Framework

Approximation spaces of deep neural networks

Title Approximation spaces of deep neural networks
Authors Rémi Gribonval, Gitta Kutyniok, Morten Nielsen, Felix Voigtlaender
Abstract We study the expressivity of deep neural networks. Measuring a network’s complexity by its number of connections or by its number of neurons, we consider the class of functions for which the error of best approximation with networks of a given complexity decays at a certain rate when increasing the complexity budget. Using results from classical approximation theory, we show that this class can be endowed with a (quasi)-norm that makes it a linear function space, called approximation space. We establish that allowing the networks to have certain types of “skip connections” does not change the resulting approximation spaces. We also discuss the role of the network’s nonlinearity (also known as activation function) on the resulting spaces, as well as the role of depth. For the popular ReLU nonlinearity and its powers, we relate the newly constructed spaces to classical Besov spaces. The established embeddings highlight that some functions of very low Besov smoothness can nevertheless be well approximated by neural networks, if these networks are sufficiently deep.
Tasks
Published 2019-05-03
URL https://arxiv.org/abs/1905.01208v2
PDF https://arxiv.org/pdf/1905.01208v2.pdf
PWC https://paperswithcode.com/paper/approximation-spaces-of-deep-neural-networks
Repo
Framework

CaDIS: Cataract Dataset for Image Segmentation

Title CaDIS: Cataract Dataset for Image Segmentation
Authors Evangello Flouty, Abdolrahim Kadkhodamohammadi, Imanol Luengo, Felix Fuentes-Hurtado, Hinde Taleb, Santiago Barbarisi, Gwenole Quellec, Danail Stoyanov
Abstract Video signals provide a wealth of information about surgical procedures and are the main sensory cue for surgeons. Video processing and understanding can be used to empower computer assisted interventions (CAI) as well as the development of detailed post-operative analysis of the surgical intervention. A fundamental building block to such capabilities is the ability to understand and segment video into semantic labels that differentiate and localize tissue types and different instruments. Deep learning has advanced semantic segmentation techniques dramatically in recent years but is fundamentally reliant on the availability of labelled datasets used to train models. In this paper, we introduce a high quality dataset for semantic segmentation in Cataract surgery. We generated this dataset from the CATARACTS challenge dataset, which is publicly available. To the best of our knowledge, this dataset has the highest quality annotation in surgical data to date. We introduce the dataset and then show the automatic segmentation performance of state-of-the-art models on that dataset as a benchmark.
Tasks Semantic Segmentation
Published 2019-06-27
URL https://arxiv.org/abs/1906.11586v4
PDF https://arxiv.org/pdf/1906.11586v4.pdf
PWC https://paperswithcode.com/paper/cadss-cataract-dataset-for-semantic
Repo
Framework

Dual Adversarial Learning with Attention Mechanism for Fine-grained Medical Image Synthesis

Title Dual Adversarial Learning with Attention Mechanism for Fine-grained Medical Image Synthesis
Authors Dong Nie, Lei Xiang, Qian Wang, Dinggang Shen
Abstract Medical imaging plays a critical role in various clinical applications. However, due to multiple considerations such as cost and risk, the acquisition of certain image modalities could be limited. To address this issue, many cross-modality medical image synthesis methods have been proposed. However, the current methods cannot well model the hard-to-synthesis regions (e.g., tumor or lesion regions). To address this issue, we propose a simple but effective strategy, that is, we propose a dual-discriminator (dual-D) adversarial learning system, in which, a global-D is used to make an overall evaluation for the synthetic image, and a local-D is proposed to densely evaluate the local regions of the synthetic image. More importantly, we build an adversarial attention mechanism which targets at better modeling hard-to-synthesize regions (e.g., tumor or lesion regions) based on the local-D. Experimental results show the robustness and accuracy of our method in synthesizing fine-grained target images from the corresponding source images. In particular, we evaluate our method on two datasets, i.e., to address the tasks of generating T2 MRI from T1 MRI for the brain tumor images and generating MRI from CT. Our method outperforms the state-of-the-art methods under comparison in all datasets and tasks. And the proposed difficult-region-aware attention mechanism is also proved to be able to help generate more realistic images, especially for the hard-to-synthesize regions.
Tasks Image Generation
Published 2019-07-07
URL https://arxiv.org/abs/1907.03297v1
PDF https://arxiv.org/pdf/1907.03297v1.pdf
PWC https://paperswithcode.com/paper/dual-adversarial-learning-with-attention
Repo
Framework

Deep Learning Enables Automatic Detection and Segmentation of Brain Metastases on Multi-Sequence MRI

Title Deep Learning Enables Automatic Detection and Segmentation of Brain Metastases on Multi-Sequence MRI
Authors Endre Grøvik, Darvin Yi, Michael Iv, Elisabeth Tong, Daniel L. Rubin, Greg Zaharchuk
Abstract Detecting and segmenting brain metastases is a tedious and time-consuming task for many radiologists, particularly with the growing use of multi-sequence 3D imaging. This study demonstrates automated detection and segmentation of brain metastases on multi-sequence MRI using a deep learning approach based on a fully convolution neural network (CNN). In this retrospective study, a total of 156 patients with brain metastases from several primary cancers were included. Pre-therapy MR images (1.5T and 3T) included pre- and post-gadolinium T1-weighted 3D fast spin echo, post-gadolinium T1-weighted 3D axial IR-prepped FSPGR, and 3D fluid attenuated inversion recovery. The ground truth was established by manual delineation by two experienced neuroradiologists. CNN training/development was performed using 100 and 5 patients, respectively, with a 2.5D network based on a GoogLeNet architecture. The results were evaluated in 51 patients, equally separated into those with few (1-3), multiple (4-10), and many (>10) lesions. Network performance was evaluated using precision, recall, Dice/F1 score, and ROC-curve statistics. For an optimal probability threshold, detection and segmentation performance was assessed on a per metastasis basis. The area under the ROC-curve (AUC), averaged across all patients, was 0.98. The AUC in the subgroups was 0.99, 0.97, and 0.97 for patients having 1-3, 4-10, and >10 metastases, respectively. Using an average optimal probability threshold determined by the development set, precision, recall, and Dice-score were 0.79, 0.53, and 0.79, respectively. At the same probability threshold, the network showed an average false positive rate of 8.3/patient (no lesion-size limit) and 3.4/patient (10 mm3 lesion size limit). In conclusion, a deep learning approach using multi-sequence MRI can aid in the detection and segmentation of brain metastases.
Tasks
Published 2019-03-18
URL http://arxiv.org/abs/1903.07988v1
PDF http://arxiv.org/pdf/1903.07988v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-enables-automatic-detection-and
Repo
Framework

A Short Survey on Probabilistic Reinforcement Learning

Title A Short Survey on Probabilistic Reinforcement Learning
Authors Reazul Hasan Russel
Abstract A reinforcement learning agent tries to maximize its cumulative payoff by interacting in an unknown environment. It is important for the agent to explore suboptimal actions as well as to pick actions with highest known rewards. Yet, in sensitive domains, collecting more data with exploration is not always possible, but it is important to find a policy with a certain performance guaranty. In this paper, we present a brief survey of methods available in the literature for balancing exploration-exploitation trade off and computing robust solutions from fixed samples in reinforcement learning.
Tasks
Published 2019-01-21
URL http://arxiv.org/abs/1901.07010v1
PDF http://arxiv.org/pdf/1901.07010v1.pdf
PWC https://paperswithcode.com/paper/a-short-survey-on-probabilistic-reinforcement
Repo
Framework
comments powered by Disqus