October 17, 2019

3017 words 15 mins read

Paper Group ANR 730

Finite Query Answering in Expressive Description Logics with Transitive Roles. An Optimized Architecture for Unpaired Image-to-Image Translation. A Restricted-Domain Dual Formulation for Two-Phase Image Segmentation. Utilizing Semantic Visual Landmarks for Precise Vehicle Navigation. Superconducting Optoelectronic Neurons III: Synaptic Plasticity. …

Finite Query Answering in Expressive Description Logics with Transitive Roles

Title Finite Query Answering in Expressive Description Logics with Transitive Roles
Authors Tomasz Gogacz, Yazmin Ibáñez-García, Filip Murlak
Abstract We study the problem of finite ontology-mediated query answering (FOMQA), the variant of OMQA where the represented world is assumed to be finite, and thus only finite models of the ontology are considered. We adopt the most typical setting with unions of conjunctive queries and ontologies expressed in description logics (DLs). The study of FOMQA is relevant in settings that are not finitely controllable. This is the case not only for DLs without the finite model property, but also for those allowing transitive role declarations. When transitive roles are allowed, evaluating queries is challenging: FOMQA is undecidable for SHOIF and only known to be decidable for the Horn fragment of ALCIF. We show decidability of FOMQA for three proper fragments of SOIF: SOI, SOF, and SIF. Our approach is to characterise models relevant for deciding finite query entailment. Relying on a certain regularity of these models, we develop automata-based decision procedures with optimal complexity bounds.
Tasks
Published 2018-08-09
URL http://arxiv.org/abs/1808.03130v1
PDF http://arxiv.org/pdf/1808.03130v1.pdf
PWC https://paperswithcode.com/paper/finite-query-answering-in-expressive
Repo
Framework

An Optimized Architecture for Unpaired Image-to-Image Translation

Title An Optimized Architecture for Unpaired Image-to-Image Translation
Authors Mohan Nikam
Abstract Unpaired image-to-image translation aims to convert an image from one domain (input domain A) to another domain (target domain B) without providing paired examples for the training. The state-of-the-art Cycle-GAN demonstrated the power of Generative Adversarial Networks with Cycle-Consistency Loss. While its results are promising, there is scope for optimization in the training process. This paper introduces a new neural network architecture that only learns the translation from domain A to B and eliminates the need for reverse mapping (B to A) by introducing a new Deviation-loss term. Furthermore, a few other improvements to the Cycle-GAN are found and utilized in this new architecture, contributing to a significantly shorter training duration. (A hedged sketch of such a one-directional objective follows this entry.)
Tasks Image-to-Image Translation
Published 2018-02-13
URL http://arxiv.org/abs/1802.04467v1
PDF http://arxiv.org/pdf/1802.04467v1.pdf
PWC https://paperswithcode.com/paper/an-optimized-architecture-for-unpaired-image
Repo
Framework
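
The abstract does not spell out the Deviation-loss term, so the following PyTorch-style sketch is only a plausible reading: a one-directional adversarial objective with an assumed L1 deviation term that keeps the translated image close to its input. `G`, `D`, and the weight `lam` are illustrative placeholders, not the paper's code.

```python
import torch
import torch.nn.functional as F

def generator_loss(G, D, a, lam=10.0):
    """a: batch of domain-A images; G: A->B generator; D: domain-B discriminator."""
    fake_b = G(a)
    logits = D(fake_b)
    # adversarial term: make translated images look like domain B
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # assumed Deviation-loss: penalize straying too far from the input image
    deviation = F.l1_loss(fake_b, a)
    return adv + lam * deviation
```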

A Restricted-Domain Dual Formulation for Two-Phase Image Segmentation

Title A Restricted-Domain Dual Formulation for Two-Phase Image Segmentation
Authors Jack Spencer
Abstract In two-phase image segmentation, convex relaxation has allowed global minimisers to be computed for a variety of data fitting terms. Many efficient approaches exist to compute a solution quickly. However, we consider whether the nature of the data fitting in this formulation allows for reasonable assumptions to be made about the solution that can improve the computational performance further. In particular, we employ a well-known dual formulation of this problem and solve the corresponding equations in a restricted domain. We present experimental results that explore the dependence of the solution on this restriction and quantify improvements in the computational performance. This approach can be extended to analogous methods simply and could provide an efficient alternative for problems of this type. (An illustrative sketch of a restricted dual update follows this entry.)
Tasks Semantic Segmentation
Published 2018-07-30
URL http://arxiv.org/abs/1807.11534v1
PDF http://arxiv.org/pdf/1807.11534v1.pdf
PWC https://paperswithcode.com/paper/a-restricted-domain-dual-formulation-for-two
Repo
Framework
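
The details of the restricted-domain scheme are not given in the abstract, so the sketch below only illustrates the general idea under assumptions: a Chambolle-style projected-gradient update of the dual variable, applied solely inside a caller-supplied band `mask` around the current interface.

```python
import numpy as np

def dual_step_restricted(p, u, mask, tau=0.25):
    """p: dual field (2, H, W); u: primal estimate (H, W); mask: (H, W) bool band."""
    gx = np.diff(u, axis=1, append=u[:, -1:])   # forward differences, replicated edge
    gy = np.diff(u, axis=0, append=u[-1:, :])
    p_new = p + tau * np.stack([gy, gx])        # gradient ascent on the dual
    norm = np.maximum(1.0, np.sqrt((p_new ** 2).sum(axis=0)))
    p_new = p_new / norm                        # project onto the |p| <= 1 constraint
    return np.where(mask, p_new, p)             # update only the restricted band
```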

Utilizing Semantic Visual Landmarks for Precise Vehicle Navigation

Title Utilizing Semantic Visual Landmarks for Precise Vehicle Navigation
Authors Varun Murali, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
Abstract This paper presents a new approach for integrating semantic information for vision-based vehicle navigation. Although vision-based vehicle navigation systems using pre-mapped visual landmarks are capable of achieving submeter-level accuracy in large-scale urban environments, a typical error source in this type of system comes from the presence of visual landmarks or features from temporal objects in the environment, such as cars and pedestrians. We propose a gated factor graph framework that uses semantic information associated with visual features to make outlier/inlier decisions from three perspectives: the feature tracking process, the geo-referenced map building process, and the navigation system using pre-mapped landmarks. The class category that a visual feature belongs to is extracted from a deep learning network pre-trained for semantic segmentation. The feasibility and generality of our approach are demonstrated by our implementations on top of two vision-based navigation systems. Experimental evaluations validate that the injection of semantic information associated with visual landmarks using our approach achieves substantial improvements in accuracy in GPS-denied navigation solutions for large-scale urban scenarios. (A minimal sketch of the semantic gating idea follows this entry.)
Tasks Semantic Segmentation
Published 2018-01-02
URL http://arxiv.org/abs/1801.00858v1
PDF http://arxiv.org/pdf/1801.00858v1.pdf
PWC https://paperswithcode.com/paper/utilizing-semantic-visual-landmarks-for
Repo
Framework
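
The gating idea can be illustrated with a toy filter that discards features on potentially moving objects; the class list, the `Feature` type, and the function names below are assumptions for illustration, not the paper's API.

```python
from dataclasses import dataclass

# illustrative set of classes treated as transient (likely outlier sources)
TRANSIENT_CLASSES = {"car", "pedestrian", "bicycle", "bus"}

@dataclass
class Feature:
    u: float              # pixel coordinates of the tracked feature
    v: float
    semantic_class: str   # from a pretrained semantic segmentation network

def gate_features(features):
    """Keep only features anchored on static structure for mapping/navigation."""
    return [f for f in features if f.semantic_class not in TRANSIENT_CLASSES]
```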

Superconducting Optoelectronic Neurons III: Synaptic Plasticity

Title Superconducting Optoelectronic Neurons III: Synaptic Plasticity
Authors Jeffrey M. Shainline, Adam N. McCaughan, Sonia M. Buckley, Christine A. Donnelly, Manuel Castellanos-Beltran, Michael L. Schneider, Richard P. Mirin, Sae Woo Nam
Abstract As a means of dynamically reconfiguring the synaptic weight of a superconducting optoelectronic loop neuron, a superconducting flux storage loop is inductively coupled to the synaptic current bias of the neuron. A standard flux memory cell is used to achieve a binary synapse, and loops capable of storing many flux quanta are used to enact multi-stable synapses. Circuits are designed to implement supervised learning wherein current pulses add or remove flux from the loop to strengthen or weaken the synaptic weight. Designs are presented for circuits with hundreds of intermediate synaptic weights between minimum and maximum strengths. Circuits for implementing unsupervised learning are modeled using two photons to strengthen and two photons to weaken the synaptic weight via Hebbian and anti-Hebbian learning rules, and techniques are proposed to control the learning rate. Implementation of short-term plasticity, homeostatic plasticity, and metaplasticity in loop neurons is discussed. (A toy model of such a flux-quantum synapse follows this entry.)
Tasks
Published 2018-05-04
URL http://arxiv.org/abs/1805.01937v4
PDF http://arxiv.org/pdf/1805.01937v4.pdf
PWC https://paperswithcode.com/paper/superconducting-optoelectronic-neurons-iii
Repo
Framework
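
As a toy illustration only (a bounded counter, not a circuit simulation), the stored weight can be modeled as a number of flux quanta that supervised current pulses increment or decrement:

```python
class FluxLoopSynapse:
    """Toy model: synaptic weight = stored flux quanta, bounded in [0, n_max]."""

    def __init__(self, n_max=300):      # hundreds of intermediate weights
        self.n = n_max // 2             # start at mid-strength
        self.n_max = n_max

    def pulse(self, strengthen: bool):
        """A supervised current pulse adds or removes one flux quantum."""
        self.n = min(self.n_max, self.n + 1) if strengthen else max(0, self.n - 1)

    @property
    def weight(self) -> float:
        return self.n / self.n_max      # normalized synaptic weight
```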

Just-in-Time Reconstruction: Inpainting Sparse Maps using Single View Depth Predictors as Priors

Title Just-in-Time Reconstruction: Inpainting Sparse Maps using Single View Depth Predictors as Priors
Authors Chamara Saroj Weerasekera, Thanuja Dharmasiri, Ravi Garg, Tom Drummond, Ian Reid
Abstract We present "just-in-time reconstruction" as real-time image-guided inpainting of a map with arbitrary scale and sparsity to generate a fully dense depth map for the image. In particular, our goal is to inpaint a sparse map, obtained from either a monocular visual SLAM system or a sparse sensor, using a single-view depth prediction network as a virtual depth sensor. We adopt a fairly standard approach to data fusion, producing a fused depth map by performing inference over a novel fully-connected Conditional Random Field (CRF) which is parameterized by the input depth maps and their pixel-wise confidence weights. Crucially, we obtain the confidence weights that parameterize the CRF model in a data-dependent manner via Convolutional Neural Networks (CNNs) which are trained to model the conditional depth error distributions given each source of input depth map and the associated RGB image. Our CRF model penalises absolute depth error in its nodes and pairwise scale-invariant depth error in its edges, and the confidence-based fusion minimizes the impact of outlier input depth values on the fused result. We demonstrate the flexibility of our method by real-time inpainting of ORB-SLAM, Kinect, and LIDAR depth maps acquired both indoors and outdoors at arbitrary scale and varied amounts of irregular sparsity. (A simplified fusion sketch follows this entry.)
Tasks Depth Estimation
Published 2018-05-11
URL http://arxiv.org/abs/1805.04239v1
PDF http://arxiv.org/pdf/1805.04239v1.pdf
PWC https://paperswithcode.com/paper/just-in-time-reconstruction-inpainting-sparse
Repo
Framework
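
The full method performs CRF inference with learned pairwise terms; the sketch below captures only the unary intuition, assuming per-source confidence maps are given (e.g. as CNN outputs) and fusing by confidence-weighted averaging.

```python
import numpy as np

def fuse_depths(depths, confidences, eps=1e-6):
    """depths, confidences: lists of (H, W) float arrays; NaN marks missing depth."""
    num = np.zeros_like(depths[0])
    den = np.full_like(depths[0], eps)   # eps avoids division by zero
    for d, c in zip(depths, confidences):
        valid = ~np.isnan(d)             # only fuse where this source has depth
        num[valid] += c[valid] * d[valid]
        den[valid] += c[valid]
    return num / den                     # confidence-weighted average per pixel
```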

Object Localization with a Weakly Supervised CapsNet

Title Object Localization with a Weakly Supervised CapsNet
Authors Weitang Liu, Emad Barsoum, John D. Owens
Abstract Inspired by CapsNet's routing-by-agreement mechanism and its ability to learn object properties, we explore whether those properties in turn can determine new properties of the objects, such as their locations. We then propose a CapsNet architecture with object coordinate atoms and a modified routing-by-agreement algorithm with unevenly distributed initial routing probabilities. The model is based on CapsNet but uses a routing algorithm to find the objects' approximate positions in the image coordinate system. We also discuss how to derive the property of translation through coordinate atoms, and we show the importance of sparse representation. We train our model on the single moving MNIST dataset with class labels. Our model can learn and derive the coordinates of the digits better than its convolutional counterpart that lacks a routing-by-agreement algorithm, and it also performs well when tested on the multi-digit moving MNIST and KTH datasets. The results show our method reaches state-of-the-art performance on object localization without any extra localization techniques or modules as in prior work. (A sketch of the underlying routing pass follows this entry.)
Tasks Object Localization, Object Recognition, Transfer Learning
Published 2018-05-20
URL https://arxiv.org/abs/1805.07706v3
PDF https://arxiv.org/pdf/1805.07706v3.pdf
PWC https://paperswithcode.com/paper/object-localization-and-motion-transfer
Repo
Framework
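
For context, here is a NumPy sketch of one standard dynamic routing-by-agreement pass (after Sabour et al.). The paper's unevenly distributed initial routing probabilities are reflected only by accepting the initial logits `b_init` as an argument instead of starting from zeros; the coordinate-atom mechanics are not shown.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink vector length into [0, 1) while preserving direction."""
    n2 = (s ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def route(u_hat, b_init, iterations=3):
    """u_hat: (num_in, num_out, dim) predictions; b_init: (num_in, num_out) logits."""
    b = b_init.copy()
    for _ in range(iterations):
        e = np.exp(b - b.max(axis=1, keepdims=True))    # stable softmax over outputs
        c = e / e.sum(axis=1, keepdims=True)            # coupling coefficients
        v = squash((c[..., None] * u_hat).sum(axis=0))  # candidate output capsules
        b = b + (u_hat * v[None]).sum(axis=-1)          # agreement update
    return v
```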

Generalization Error in Deep Learning

Title Generalization Error in Deep Learning
Authors Daniel Jakubovitz, Raja Giryes, Miguel R. D. Rodrigues
Abstract Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, the source of their generalization ability is still generally unclear. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results. (A classical example bound follows this entry.)
Tasks Speech Recognition
Published 2018-08-03
URL http://arxiv.org/abs/1808.01174v3
PDF http://arxiv.org/pdf/1808.01174v3.pdf
PWC https://paperswithcode.com/paper/generalization-error-in-deep-learning
Repo
Framework
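
As one classical example of the kind of result such an overview covers (a standard VC-dimension bound, not a contribution of this paper): for a hypothesis class \(\mathcal{H}\) of VC dimension \(d\), true risk \(R(h)\), and empirical risk \(\hat{R}_m(h)\) over \(m\) i.i.d. samples, with probability at least \(1-\delta\),

```latex
\forall h \in \mathcal{H}:\quad
R(h) \le \hat{R}_m(h)
  + \sqrt{\frac{d\left(\ln\frac{2m}{d} + 1\right) + \ln\frac{4}{\delta}}{m}}
```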

Teaching Categories to Human Learners with Visual Explanations

Title Teaching Categories to Human Learners with Visual Explanations
Authors Oisin Mac Aodha, Shihan Su, Yuxin Chen, Pietro Perona, Yisong Yue
Abstract We study the problem of computer-assisted teaching with explanations. Conventional approaches for machine teaching typically only provide feedback at the instance level, e.g., the category or label of the instance. However, it is intuitive that clear explanations from a knowledgeable teacher can significantly improve a student's ability to learn a new concept. To address these limitations, we propose a teaching framework that provides interpretable explanations as feedback and models how the learner incorporates this additional information. In the case of images, we show that we can automatically generate explanations that highlight the parts of the image that are responsible for the class label. Experiments with human learners illustrate that, on average, participants achieve better test-set performance on challenging categorization tasks when taught with our interpretable approach compared to existing methods. (A generic teaching-loop sketch follows this entry.)
Tasks
Published 2018-02-20
URL http://arxiv.org/abs/1802.06924v1
PDF http://arxiv.org/pdf/1802.06924v1.pdf
PWC https://paperswithcode.com/paper/teaching-categories-to-human-learners-with
Repo
Framework
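
A generic outline of such a teach-with-explanations loop; every name here (`expected_gain`, `explain`, `learner.update`) is a placeholder assumption rather than the paper's algorithm, which additionally models how humans incorporate the explanations.

```python
def teach(examples, learner, explain, rounds=10):
    """Greedy machine-teaching loop: show examples plus interpretable feedback."""
    shown = []
    for _ in range(rounds):
        # pick the example expected to improve the learner model the most
        best = max(examples, key=lambda x: learner.expected_gain(x))
        feedback = explain(best)        # e.g. highlight class-relevant image parts
        learner.update(best, feedback)  # model of how the learner absorbs feedback
        shown.append(best)
        examples = [x for x in examples if x is not best]
    return shown
```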

A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model

Title A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model
Authors Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael I. Jordan, Richard G. Baraniuk
Abstract Inspired by the success of Convolutional Neural Networks (CNNs) for supervised prediction on images, we design the Deconvolutional Generative Model (DGM), a new probabilistic generative model whose inference calculations correspond to those in a given CNN architecture. The DGM uses a CNN to design the prior distribution in the probabilistic model. Furthermore, the DGM generates images from coarse to finer scales. It introduces a small set of latent variables at each scale and enforces dependencies among all the latent variables via a conjugate prior distribution. This conjugate prior yields a new regularizer for training CNNs based on paths rendered in the generative model: the Rendering Path Normalization (RPN). We demonstrate that this regularizer improves generalization, both in theory and in practice. In addition, likelihood estimation in the DGM yields training losses for CNNs, and inspired by this, we design a new loss, termed the Max-Min cross entropy, which outperforms the traditional cross-entropy loss for object classification. The Max-Min cross entropy suggests a new deep network architecture, namely the Max-Min network, which can learn from less labeled data while maintaining good prediction performance. Our experiments demonstrate that the DGM with the RPN and the Max-Min architecture exceeds or matches the state of the art on benchmarks including SVHN, CIFAR10, and CIFAR100 for semi-supervised and supervised learning tasks.
Tasks Object Classification
Published 2018-11-01
URL https://arxiv.org/abs/1811.02657v2
PDF https://arxiv.org/pdf/1811.02657v2.pdf
PWC https://paperswithcode.com/paper/a-bayesian-perspective-of-convolutional
Repo
Framework

Estimation of Tissue Oxygen Saturation from RGB Images based on Pixel-level Image Translation

Title Estimation of Tissue Oxygen Saturation from RGB Images based on Pixel-level Image Translation
Authors Qing-Biao Li, Xiao-Yun Zhou, Jianyu Lin, Jian-Qing Zheng, Neil T. Clancy, Daniel S. Elson
Abstract Intra-operative measurement of tissue oxygen saturation (StO2) has been widely explored by pulse oximetry or hyperspectral imaging (HSI) to assess the function and viability of tissue. In this paper we propose a pixel-level image-to-image translation approach based on conditional Generative Adversarial Networks (cGAN) to estimate tissue oxygen saturation (StO2) directly from RGB images. The real-time performance and non-reliance on additional hardware enable a seamless integration of the proposed method into surgical and diagnostic workflows with standard endoscope systems. For validation, RGB images and StO2 ground truth were simulated and estimated from HSI images collected by a liquid crystal tuneable filter (LCTF) endoscope for three tissue types (porcine bowel, lamb uterus and rabbit uterus). The results show that the proposed method can achieve visually identical images with comparable accuracy. (A pix2pix-style loss sketch follows this entry.)
Tasks Image-to-Image Translation
Published 2018-04-19
URL http://arxiv.org/abs/1804.07116v1
PDF http://arxiv.org/pdf/1804.07116v1.pdf
PWC https://paperswithcode.com/paper/estimation-of-tissue-oxygen-saturation-from
Repo
Framework
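
One common way to realize such a pixel-level cGAN mapping is a pix2pix-style objective; the abstract does not give the paper's exact architecture or loss weighting, so the L1 term and `lam` below are assumptions.

```python
import torch
import torch.nn.functional as F

def cgan_generator_loss(G, D, rgb, sto2_gt, lam=100.0):
    """G maps an RGB endoscopic image to a per-pixel StO2 estimate."""
    pred = G(rgb)
    # condition the discriminator on the input image (pix2pix convention)
    logits = D(torch.cat([rgb, pred], dim=1))
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return adv + lam * F.l1_loss(pred, sto2_gt)   # adversarial + pixel-level loss
```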

ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

Title ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking
Authors Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi
Abstract Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in the context of generalised object stacking. To this end, we provide ShapeStacks: a simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives, richly annotated regarding semantics and structural stability. We train visual classifiers for binary stability prediction on the ShapeStacks data and scrutinise their learned physical intuition. Due to the richness of the training data, our approach also generalises favourably to real-world scenarios, achieving state-of-the-art stability prediction on a publicly available benchmark of block towers. We then leverage the physical intuition learned by our model to actively construct stable stacks and observe the emergence of an intuitive notion of stackability, an inherent object affordance, induced by the active stacking task. Our approach performs well even in challenging conditions where it considerably exceeds the stack height observed during training or where initially unstable structures must be stabilised via counterbalancing. (A minimal classifier sketch follows this entry.)
Tasks
Published 2018-04-21
URL http://arxiv.org/abs/1804.08018v2
PDF http://arxiv.org/pdf/1804.08018v2.pdf
PWC https://paperswithcode.com/paper/shapestacks-learning-vision-based-physical
Repo
Framework
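
A minimal sketch of a binary stability classifier of the kind such a benchmark supports; the backbone, input size, and label convention are assumptions, since the abstract does not specify the classifiers' details.

```python
import torch
import torch.nn as nn

# small CNN that maps a stack image to a single stability logit
stability_net = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(128, 1),             # logit for "stack is unstable" (assumed label)
)
loss_fn = nn.BCEWithLogitsLoss()   # binary stability label per stack image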

Coloring black boxes: visualization of neural network decisions

Title Coloring black boxes: visualization of neural network decisions
Authors Wlodzislaw Duch
Abstract Neural networks are commonly regarded as black boxes performing incomprehensible functions. For classification problems, networks provide maps from a high-dimensional feature space to a K-dimensional image space. Images of training vectors are projected onto polygon vertices, providing a visualization of the network function. Such visualization may show the dynamics of learning, allow for comparison of different networks, display training vectors around which potential problems may arise, show differences due to regularization and optimization procedures, investigate the stability of network classification under perturbation of the original vectors, and place a new data sample in relation to the training data, allowing for estimation of confidence in the classification of a given sample. Illustrative examples for the three-class Wine data and the five-class Satimage data are described. The visualization method proposed here is applicable to any black box system that provides continuous outputs. (A short sketch of this projection follows this entry.)
Tasks
Published 2018-02-23
URL http://arxiv.org/abs/1802.08478v1
PDF http://arxiv.org/pdf/1802.08478v1.pdf
PWC https://paperswithcode.com/paper/coloring-black-boxes-visualization-of-neural
Repo
Framework
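
The described projection can be sketched directly by treating the K network outputs (e.g. softmax probabilities) as barycentric weights over the vertices of a regular K-gon; this is an assumed but natural reading of the abstract.

```python
import numpy as np

def polygon_project(probs):
    """probs: (N, K) rows summing to 1 -> (N, 2) points inside a regular K-gon."""
    K = probs.shape[1]
    angles = 2 * np.pi * np.arange(K) / K
    vertices = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # one vertex per class
    return probs @ vertices   # convex combination: confident samples sit near a vertex
```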

Combining Restricted Boltzmann Machines with Neural Networks for Latent Truth Discovery

Title Combining Restricted Boltzmann Machines with Neural Networks for Latent Truth Discovery
Authors Klaus Broelemann, Gjergji Kasneci
Abstract Latent truth discovery, LTD for short, refers to the problem of aggregating multiple claims from various sources in order to estimate the plausibility of statements about entities. In the absence of a ground truth, this problem is highly challenging when some sources provide conflicting claims and others no claims at all. In this work we provide an unsupervised stochastic inference procedure on top of a model that combines restricted Boltzmann machines with feed-forward neural networks to accurately infer the reliability of sources as well as the plausibility of statements about entities. In comparison to prior work, our approach stands out (1) by allowing the incorporation of arbitrary features about sources and claims, (2) by generalizing from reliability per source towards a reliability function, and thus (3) enabling the estimation of source reliability even for sources that have provided no or very few claims, (4) by building on efficient and scalable stochastic inference algorithms, and (5) by outperforming the state-of-the-art by a considerable margin.
Tasks
Published 2018-07-27
URL http://arxiv.org/abs/1807.10680v1
PDF http://arxiv.org/pdf/1807.10680v1.pdf
PWC https://paperswithcode.com/paper/combining-restricted-boltzmann-machines-with
Repo
Framework

Deep Photovoltaic Nowcasting

Title Deep Photovoltaic Nowcasting
Authors Jinsong Zhang, Rodrigo Verschae, Shohei Nobuhara, Jean-François Lalonde
Abstract Predicting the short-term power output of a photovoltaic panel is an important task for the efficient management of smart grids. Short-term forecasting at the minute scale, also known as nowcasting, can benefit from sky images captured by regular cameras installed close to the solar panel. However, estimating the weather conditions from these images (sun intensity, cloud appearance and movement, etc.) is a very challenging task that the community has yet to solve with traditional computer vision techniques. In this work, we propose to learn the relationship between sky appearance and future photovoltaic power output using deep learning. We train several variants of convolutional neural networks which take historical photovoltaic power values and sky images as input and estimate photovoltaic power over a very short time horizon. In particular, we compare three different architectures based on a multi-layer perceptron (MLP), a convolutional neural network (CNN), and a long short-term memory (LSTM) module. We evaluate our approach quantitatively on a dataset of photovoltaic power values and corresponding images gathered in Kyoto, Japan. Our experiments reveal that the MLP network, already used similarly in previous work, achieves an RMSE skill score of 7% over the commonly used persistence baseline on the 1-minute future photovoltaic power prediction task. Our CNN-based network improves upon this with a 12% skill score. In contrast, our LSTM-based model, which can learn the temporal dependencies in the data, achieves a 21% RMSE skill score, thus outperforming all other approaches. (A skill-score sketch follows this entry.)
Tasks
Published 2018-10-15
URL http://arxiv.org/abs/1810.06327v1
PDF http://arxiv.org/pdf/1810.06327v1.pdf
PWC https://paperswithcode.com/paper/deep-photovoltaic-nowcasting
Repo
Framework
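
The RMSE skill scores quoted above are commonly defined relative to the persistence baseline as 1 - RMSE_model / RMSE_persistence; a minimal sketch under that assumption:

```python
import numpy as np

def rmse(pred, truth):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(truth)) ** 2)))

def skill_score(pred, truth, persistence_pred):
    """Positive values mean the model beats persistence; 0.21 corresponds to 21%."""
    return 1.0 - rmse(pred, truth) / rmse(persistence_pred, truth)
```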