July 28, 2019

Paper Group ANR 430

Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms

Title Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms
Authors Timo Hinzmann, Tim Taubner, Roland Siegwart
Abstract This paper proposes a computationally efficient method to estimate the time-varying relative pose between two visual-inertial sensor rigs mounted on the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated relative poses are used to generate highly accurate depth maps in real-time and can be employed for obstacle avoidance in low-altitude flights or landing maneuvers. The approach is structured as follows: Initially, a wing model is identified by fitting a probability density function to measured deviations from the nominal relative baseline transformation. At run-time, the prior knowledge about the wing model is fused in an Extended Kalman filter (EKF) together with relative pose measurements obtained from solving a relative perspective N-point problem (PNP), and the linear accelerations and angular velocities measured by the two inertial measurement units (IMU) which are rigidly attached to the cameras. Results obtained from extensive synthetic experiments demonstrate that our proposed framework is able to estimate highly accurate baseline transformations and depth maps.
Tasks
Published 2017-12-19
URL http://arxiv.org/abs/1712.06837v2
PDF http://arxiv.org/pdf/1712.06837v2.pdf
PWC https://paperswithcode.com/paper/flexible-stereo-constrained-non-rigid-wide
Repo
Framework
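The fusion of a prior with a noisy measurement that the abstract describes can be illustrated with a scalar Kalman measurement update. This is only a minimal sketch, assuming a one-dimensional state (for example a single component of the baseline deviation), a Gaussian prior standing in for the wing model, and one noisy relative-pose measurement. The paper's filter is an EKF over full poses with IMU inputs; the function name and the numbers below are hypothetical.

```python
def kalman_update(prior_mean, prior_var, measurement, meas_var):
    """Fuse a Gaussian prior with one noisy measurement (scalar case)."""
    # The gain weights the measurement by how uncertain the prior is.
    gain = prior_var / (prior_var + meas_var)
    post_mean = prior_mean + gain * (measurement - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical values: the prior says "no deviation", the measurement says 2.0,
# and both are equally uncertain, so the estimate lands halfway between them.
mean, var = kalman_update(prior_mean=0.0, prior_var=4.0,
                          measurement=2.0, meas_var=4.0)
print(mean, var)
```

With equal prior and measurement variances the posterior mean is the midpoint and the variance is halved, which is the basic reason a prior wing model can stabilise noisy pose measurements.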

Review on Computer Vision Techniques in Emergency Situation

Title Review on Computer Vision Techniques in Emergency Situation
Authors Laura Lopez-Fuentes, Joost van de Weijer, Manuel Gonzalez-Hidalgo, Harald Skinnemoen, Andrew D. Bagdanov
Abstract In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools have been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, obviating important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies.
Tasks
Published 2017-08-24
URL http://arxiv.org/abs/1708.07455v2
PDF http://arxiv.org/pdf/1708.07455v2.pdf
PWC https://paperswithcode.com/paper/review-on-computer-vision-techniques-in
Repo
Framework

Prediction of Sea Surface Temperature using Long Short-Term Memory

Title Prediction of Sea Surface Temperature using Long Short-Term Memory
Authors Qin Zhang, Hui Wang, Junyu Dong, Guoqiang Zhong, Xin Sun
Abstract This letter adopts long short-term memory (LSTM) to predict sea surface temperature (SST), which is the first attempt, to our knowledge, to use a recurrent neural network to solve the problem of SST prediction, and to make one-week and one-month daily predictions. We formulate the SST prediction problem as a time series regression problem. LSTM is a special kind of recurrent neural network that introduces a gate mechanism into the vanilla RNN to prevent the vanishing or exploding gradient problem. It has a strong ability to model the temporal relationships of time series data and can handle the long-term dependency problem well. The proposed network architecture is composed of two kinds of layers: an LSTM layer and a fully connected dense layer. The LSTM layer is utilized to model the time series relationship. The fully connected layer is utilized to map the output of the LSTM layer to a final prediction. We explore the optimal setting of this architecture by experiments and report the accuracy for the coastal seas of China to confirm the effectiveness of the proposed method. In addition, we also show its online updated characteristics.
Tasks Time Series
Published 2017-05-19
URL http://arxiv.org/abs/1705.06861v1
PDF http://arxiv.org/pdf/1705.06861v1.pdf
PWC https://paperswithcode.com/paper/prediction-of-sea-surface-temperature-using
Repo
Framework
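Formulating prediction as time series regression, as the abstract puts it, usually means turning a sequence into (past window, future values) training pairs. The sketch below shows only that data preparation step, assuming a plain list of daily readings. The function name, window sizes and temperature values are hypothetical, and the LSTM model itself is not included.

```python
def make_windows(series, history, horizon):
    """Build (input, target) pairs: `history` past values -> next `horizon` values."""
    pairs = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        pairs.append((past, future))
    return pairs


# Made-up daily temperatures in degrees Celsius, for illustration only.
sst = [20.1, 20.3, 20.2, 20.6, 20.8, 21.0]
pairs = make_windows(sst, history=3, horizon=1)
for past, future in pairs:
    print(past, "->", future)
```

A one-week forecast would correspond to `horizon=7` in this framing; each pair would then feed a recurrent layer with the past window and regress onto the future values.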

Video-based Person Re-identification with Accumulative Motion Context

Title Video-based Person Re-identification with Accumulative Motion Context
Authors Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng
Abstract Video based person re-identification plays a central role in realistic security and video surveillance. In this paper we propose a novel Accumulative Motion Context (AMOC) network for addressing this important problem, which effectively exploits the long-range motion context for robustly identifying the same person under challenging conditions. Given a video sequence of the same or different persons, the proposed AMOC network jointly learns appearance representation and motion context from a collection of adjacent frames using a two-stream convolutional architecture. Then AMOC accumulates clues from motion context by recurrent aggregation, allowing effective information flow among adjacent frames and capturing dynamic gist of the persons. The architecture of AMOC is end-to-end trainable and thus motion context can be adapted to complement appearance clues under unfavorable conditions (e.g. occlusions). Extensive experiments are conducted on three public benchmark datasets, i.e., the iLIDS-VID, PRID-2011 and MARS datasets, to investigate the performance of AMOC. The experimental results demonstrate that the proposed AMOC network significantly outperforms state-of-the-art methods for video-based re-identification and confirm the advantage of exploiting long-range motion context for video based person re-identification, validating our motivation.
Tasks Person Re-Identification, Video-Based Person Re-Identification
Published 2017-01-01
URL http://arxiv.org/abs/1701.00193v2
PDF http://arxiv.org/pdf/1701.00193v2.pdf
PWC https://paperswithcode.com/paper/video-based-person-re-identification-with
Repo
Framework

Adaptive Simulation-based Training of AI Decision-makers using Bayesian Optimization

Title Adaptive Simulation-based Training of AI Decision-makers using Bayesian Optimization
Authors Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr
Abstract This work studies how an AI-controlled dog-fighting agent with tunable decision-making parameters can learn to optimize performance against an intelligent adversary, as measured by a stochastic objective function evaluated on simulated combat engagements. Gaussian process Bayesian optimization (GPBO) techniques are developed to automatically learn global Gaussian Process (GP) surrogate models, which provide statistical performance predictions in both explored and unexplored areas of the parameter space. This allows a learning engine to sample full-combat simulations at parameter values that are most likely to optimize performance and also provide highly informative data points for improving future predictions. However, standard GPBO methods do not provide a reliable surrogate model for the highly volatile objective functions found in aerial combat, and thus do not reliably identify global maxima. These issues are addressed by novel Repeat Sampling (RS) and Hybrid Repeat/Multi-point Sampling (HRMS) techniques. Simulation studies show that HRMS improves the accuracy of GP surrogate models, allowing AI decision-makers to more accurately predict performance and efficiently tune parameters.
Tasks Decision Making
Published 2017-03-27
URL http://arxiv.org/abs/1703.09310v2
PDF http://arxiv.org/pdf/1703.09310v2.pdf
PWC https://paperswithcode.com/paper/adaptive-simulation-based-training-of-ai
Repo
Framework
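The abstract does not spell out how Repeat Sampling works, so the following is only a sketch of the general idea its name suggests: evaluating a noisy objective several times at the same parameter value and averaging, which gives a surrogate model a less volatile data point plus an estimate of the noise. The objective function, its peak location and the noise level are all hypothetical stand-ins for a combat simulation.

```python
import random
import statistics


def noisy_objective(x, rng):
    """Hypothetical stochastic objective: a smooth peak at x = 0.3 plus noise."""
    return -(x - 0.3) ** 2 + rng.gauss(0.0, 0.5)


def repeat_sample(x, repeats, rng):
    """Evaluate the objective `repeats` times at x; return mean and standard error."""
    values = [noisy_objective(x, rng) for _ in range(repeats)]
    mean = statistics.fmean(values)
    stderr = statistics.stdev(values) / repeats ** 0.5
    return mean, stderr


rng = random.Random(0)  # fixed seed so the run is reproducible
mean, stderr = repeat_sample(0.3, repeats=50, rng=rng)
print(f"mean={mean:.3f} stderr={stderr:.3f}")
```

The standard error shrinks with the square root of the number of repeats, so the trade-off is between cleaner observations at fewer locations and noisier observations at more locations.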

An End to End Deep Neural Network for Iris Segmentation in Unconstraint Scenarios

Title An End to End Deep Neural Network for Iris Segmentation in Unconstraint Scenarios
Authors Shabab Bazrafkan, Shejin Thavalengal, Peter Corcoran
Abstract With the increasing imaging and processing capabilities of today’s mobile devices, user authentication using iris biometrics has become feasible. However, as the acquisition conditions become more unconstrained and as image quality is typically lower than dedicated iris acquisition systems, the accurate segmentation of iris regions is crucial for these devices. In this work, an end to end Fully Convolutional Deep Neural Network (FCDNN) design is proposed to perform the iris segmentation task for lower-quality iris images. The network design process is explained in detail, and the resulting network is trained and tuned using several large public iris datasets. A set of methods to generate and augment suitable lower quality iris images from the high-quality public databases are provided. The network is trained on Near InfraRed (NIR) images initially and later tuned on additional datasets derived from visible images. Comprehensive inter-database comparisons are provided together with results from a selection of experiments detailing the effects of different tunings of the network. Finally, the proposed model is compared with SegNet-basic, and a near-optimal tuning of the network is compared to a selection of other state-of-art iris segmentation algorithms. The results show very promising performance from the optimized Deep Neural Networks design when compared with state-of-art techniques applied to the same lower quality datasets.
Tasks Iris Segmentation
Published 2017-12-07
URL http://arxiv.org/abs/1712.02877v1
PDF http://arxiv.org/pdf/1712.02877v1.pdf
PWC https://paperswithcode.com/paper/an-end-to-end-deep-neural-network-for-iris
Repo
Framework

Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes

Title Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes
Authors Elizabeth Polgreen, Viraj Wijesuriya, Sofie Haesaert, Alessandro Abate
Abstract We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system. We obtain the confidence that the underlying system satisfies a given property, and show that the method uses data efficiently and thus is robust to the amount of data available. These characteristics are achieved by firstly exploiting parameter synthesis to establish a feasible set of parameters for which the underlying system will satisfy the property; secondly, by actively synthesising experiments to increase the amount of information in the collected data that is relevant to the property; and finally propagating this information over the model parameters, obtaining a confidence that reflects our belief whether or not the system parameters lie in the feasible set, thereby solving the verification problem.
Tasks
Published 2017-07-05
URL http://arxiv.org/abs/1707.01322v1
PDF http://arxiv.org/pdf/1707.01322v1.pdf
PWC https://paperswithcode.com/paper/automated-experiment-design-for-data
Repo
Framework
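The last step in the abstract, a confidence that the parameters lie in the feasible set, can be illustrated for the simplest possible case. The sketch below assumes a single unknown transition probability, a uniform prior, and a feasible set that is just an interval. None of these choices come from the paper, and the counts and bounds are hypothetical.

```python
import random


def confidence_in_feasible_set(successes, failures, low, high, samples, rng):
    """Monte Carlo estimate of P(low <= p <= high) under a Beta(1+s, 1+f) posterior."""
    inside = 0
    for _ in range(samples):
        # Draw a plausible parameter value given the observed transition counts.
        p = rng.betavariate(1 + successes, 1 + failures)
        inside += low <= p <= high
    return inside / samples


rng = random.Random(0)  # fixed seed so the run is reproducible
# Hypothetical data: 80 of 100 observed transitions went to the "good" state,
# and parameter synthesis is assumed to have returned the feasible set [0.7, 1.0].
conf = confidence_in_feasible_set(80, 20, 0.7, 1.0, samples=10_000, rng=rng)
print(conf)
```

More data concentrates the posterior, which pushes the confidence towards 0 or 1; experiment design then amounts to choosing which transitions to observe so that this happens quickly.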

Merging real and virtual worlds: An analysis of the state of the art and practical evaluation of Microsoft Hololens

Title Merging real and virtual worlds: An analysis of the state of the art and practical evaluation of Microsoft Hololens
Authors Adrien Coppens
Abstract Achieving a symbiotic blending between reality and virtuality is a dream that has been lying in the minds of many people for a long time. Advances in various domains constantly bring us closer to making that dream come true. Augmented reality as well as virtual reality are in fact trending terms and are expected to further progress in the years to come. This master’s thesis aims to explore these areas and starts by defining necessary terms such as augmented reality (AR) or virtual reality (VR). Usual taxonomies to classify and compare the corresponding experiences are then discussed. In order to enable those applications, many technical challenges need to be tackled, such as accurate motion tracking with 6 degrees of freedom (positional and rotational), that is necessary for compelling experiences and to prevent user sickness. Additionally, augmented reality experiences typically rely on image processing to position the superimposed content. To do so, “paper” markers or features extracted from the environment are often employed. Both sets of techniques are explored and common solutions and algorithms are presented. After investigating those technical aspects, I carry out an objective comparison of the existing state-of-the-art and state-of-the-practice in those domains, and I discuss present and potential applications in these areas. As a practical validation, I present the results of an application that I have developed using Microsoft HoloLens, one of the more advanced affordable technologies for augmented reality that is available today. Based on the experience and lessons learned during this development, I discuss the limitations of current technologies and present some avenues of future research.
Tasks
Published 2017-06-25
URL http://arxiv.org/abs/1706.08096v1
PDF http://arxiv.org/pdf/1706.08096v1.pdf
PWC https://paperswithcode.com/paper/merging-real-and-virtual-worlds-an-analysis
Repo
Framework

Multimodal Clustering for Community Detection

Title Multimodal Clustering for Community Detection
Authors Dmitry I. Ignatov, Alexander Semenov, Daria Komissarova, Dmitry V. Gnatyshak
Abstract Multimodal clustering is an unsupervised technique for mining interesting patterns in $n$-adic binary relations or $n$-mode networks. Among different types of such generalized patterns one can find biclusters and formal concepts (maximal bicliques) for 2-mode case, triclusters and triconcepts for 3-mode case, closed $n$-sets for $n$-mode case, etc. Object-attribute biclustering (OA-biclustering) for mining large binary datatables (formal contexts or 2-mode networks) arose by the end of the last decade due to intractability of computation problems related to formal concepts; this type of pattern was proposed as a meaningful and scalable approximation of formal concepts. In this paper, our aim is to present recent advances in OA-biclustering and its extensions to mining multi-mode communities in SNA setting. We also discuss the connection between clustering coefficients known in SNA community for 1-mode and 2-mode networks and OA-bicluster density, the main quality measure of an OA-bicluster. Our experiments with 2-, 3-, and 4-mode large real-world networks show that this type of pattern is suitable for community detection in multi-mode cases within reasonable time even though the number of corresponding $n$-cliques is still unknown due to computation difficulties. An interpretation of OA-biclusters for 1-mode networks is provided as well.
Tasks Community Detection
Published 2017-02-27
URL http://arxiv.org/abs/1702.08557v1
PDF http://arxiv.org/pdf/1702.08557v1.pdf
PWC https://paperswithcode.com/paper/multimodal-clustering-for-community-detection
Repo
Framework
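OA-bicluster density, named in the abstract as the main quality measure, can be sketched on a toy 2-mode network. The sketch assumes the usual definition from the OA-biclustering literature: a pair (g, m) in the relation generates the bicluster made of all objects that have m and all attributes that g has, and density is the share of cells in that rectangle that belong to the relation. The data and function names are hypothetical.

```python
def oa_bicluster(relation, obj, attr):
    """Return the OA-bicluster generated by the pair (obj, attr)."""
    objects = {g for (g, m) in relation if m == attr}    # objects having attr
    attributes = {m for (g, m) in relation if g == obj}  # attributes of obj
    return objects, attributes


def density(relation, objects, attributes):
    """Fraction of cells in objects x attributes that belong to the relation."""
    filled = sum((g, m) in relation for g in objects for m in attributes)
    return filled / (len(objects) * len(attributes))


# Toy 2-mode network: users (objects) and the groups they belong to (attributes).
relation = {("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "b")}
objs, attrs = oa_bicluster(relation, "u1", "a")
rho = density(relation, objs, attrs)
print(sorted(objs), sorted(attrs), rho)
```

Here the bicluster is {u1, u2} x {a, b} and three of its four cells are filled, so the density is 0.75; a formal concept would be the special case with density 1.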

Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks

Title Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks
Authors Weiyue Wang, Qiangui Huang, Suya You, Chao Yang, Ulrich Neumann
Abstract Recent advances in convolutional neural networks have shown promising results in 3D shape completion. But due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial Network (3D-ED-GAN) and a Long-term Recurrent Convolutional Network (LRCN). The 3D-ED-GAN is a 3D convolutional neural network trained with a generative adversarial paradigm to fill missing 3D data in low-resolution. LRCN adopts a recurrent neural network architecture to minimize GPU memory usage and incorporates an Encoder-Decoder pair into a Long Short-term Memory Network. By handling the 3D model as a sequence of 2D slices, LRCN transforms a coarse 3D shape into a more complete and higher resolution volume. While 3D-ED-GAN captures global contextual structure of the 3D shape, LRCN localizes the fine-grained details. Experimental results on both real-world and synthetic data show reconstructions from corrupted models result in complete and high-resolution 3D objects.
Tasks
Published 2017-11-17
URL http://arxiv.org/abs/1711.06375v1
PDF http://arxiv.org/pdf/1711.06375v1.pdf
PWC https://paperswithcode.com/paper/shape-inpainting-using-3d-generative
Repo
Framework

From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics

Title From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics
Authors Martijn van Otterlo
Abstract Ethics of algorithms is an emerging topic in various disciplines such as social science, law, and philosophy, but also artificial intelligence (AI). The value alignment problem expresses the challenge of (machine) learning values that are, in some way, aligned with human requirements or values. In this paper I argue for looking at how humans have formalized and communicated values, in professional codes of ethics, and for exploring declarative decision-theoretic ethical programs (DDTEP) to formalize codes of ethics. This renders machine ethical reasoning and decision-making, as well as learning, more transparent and hopefully more accountable. The paper includes proof-of-concept examples of known toy dilemmas and gatekeeping domains such as archives and libraries.
Tasks Decision Making
Published 2017-11-16
URL http://arxiv.org/abs/1711.06035v1
PDF http://arxiv.org/pdf/1711.06035v1.pdf
PWC https://paperswithcode.com/paper/from-algorithmic-black-boxes-to-adaptive
Repo
Framework

Efficient Low-Order Approximation of First-Passage Time Distributions

Title Efficient Low-Order Approximation of First-Passage Time Distributions
Authors David Schnoerr, Botond Cseke, Ramon Grima, Guido Sanguinetti
Abstract We consider the problem of computing first-passage time distributions for reaction processes modelled by master equations. We show that this generally intractable class of problems is equivalent to a sequential Bayesian inference problem for an auxiliary observation process. The solution can be approximated efficiently by solving a closed set of coupled ordinary differential equations (for the low-order moments of the process) whose size scales with the number of species. We apply it to an epidemic model and a trimerisation process, and show good agreement with stochastic simulations.
Tasks Bayesian Inference
Published 2017-06-01
URL http://arxiv.org/abs/1706.00348v2
PDF http://arxiv.org/pdf/1706.00348v2.pdf
PWC https://paperswithcode.com/paper/efficient-low-order-approximation-of-first
Repo
Framework
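The stochastic simulations that the abstract uses as a point of comparison can be sketched with a Gillespie simulation. This is a minimal sketch of that simulation baseline only, not of the paper's moment-based approximation. It assumes a simple SIR epidemic model and defines the first-passage time as the first time the infected count reaches a threshold; the rates, population sizes and threshold are hypothetical.

```python
import random


def first_passage_time(s, i, beta, gamma, threshold, rng):
    """Gillespie run of an SIR model; time at which infected first hits threshold."""
    n = s + i  # total population (no one has recovered at the start)
    t = 0.0
    while 0 < i < threshold:
        infect = beta * s * i / n   # rate of S + I -> 2I
        recover = gamma * i         # rate of I -> R
        total = infect + recover
        t += rng.expovariate(total)  # waiting time to the next reaction
        if rng.random() * total < infect:
            s, i = s - 1, i + 1
        else:
            i -= 1
    return t if i >= threshold else None  # None: epidemic died out first


rng = random.Random(1)  # fixed seed so the run is reproducible
times = [first_passage_time(95, 5, 1.0, 0.2, 20, rng) for _ in range(200)]
hits = [t for t in times if t is not None]
print(len(hits), sum(hits) / len(hits))
```

A histogram of `hits` approximates the first-passage time distribution, at the cost of many simulation runs. That cost is what makes a cheaper approximation attractive.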

Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game

Title Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game
Authors Mario Corsolini, Andrea Carta
Abstract In a previous paper [ arXiv:1508.03269 ] we described the techniques we successfully employed for automatically reconstructing the whole move sequence of a Go game by means of a set of pictures. Now we describe how it is possible to reconstruct the move sequence by means of a video stream (which may be provided by an unattended webcam), possibly in real-time. Although the basic algorithms remain the same, we will discuss the new problems that arise when dealing with videos, with special care for the ones that could block a real-time analysis and require an improvement of our previous techniques or even a completely brand new approach. Eventually we present a number of preliminary but positive experimental results supporting the effectiveness of the software we are developing, built on the ideas here outlined.
Tasks
Published 2017-01-19
URL http://arxiv.org/abs/1701.05419v1
PDF http://arxiv.org/pdf/1701.05419v1.pdf
PWC https://paperswithcode.com/paper/moving-to-videokifu-the-last-steps-toward-a
Repo
Framework

Multi-layer Visualization for Medical Mixed Reality

Title Multi-layer Visualization for Medical Mixed Reality
Authors Séverine Habert, Ma Meng, Pascal Fallavollita, Nassir Navab
Abstract Medical Mixed Reality helps surgeons to contextualize intraoperative data with video of the surgical scene. Nonetheless, the surgical scene and anatomical target are often occluded by surgical instruments and surgeon hands. In this paper, we propose a multi-layer visualization solution for Medical Mixed Reality which subtly improves a surgeon’s visualization by making the occluding objects transparent. As an example scenario, we use an augmented reality C-arm fluoroscope device. A video image is created using a volumetric-based image synthesization technique and stereo-RGBD cameras mounted on the C-arm. From this synthesized view, the background which is occluded by the surgical instruments and surgeon hands is recovered by modifying the volumetric-based image synthesization technique. The occluding objects can, therefore, become transparent over the surgical scene. Experimentation with different augmented reality scenarios yields results demonstrating that the background of the surgical scenes can be recovered with accuracy between 45% and 99%. In conclusion, we presented a Mixed Reality solution for medicine that provides transparency to objects occluding the surgical scene. To our knowledge, this work is also the first application of volumetric field for Diminished Reality/Mixed Reality.
Tasks
Published 2017-09-26
URL http://arxiv.org/abs/1709.08962v1
PDF http://arxiv.org/pdf/1709.08962v1.pdf
PWC https://paperswithcode.com/paper/multi-layer-visualization-for-medical-mixed
Repo
Framework

SegFlow: Joint Learning for Video Object Segmentation and Optical Flow

Title SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
Authors Jingchun Cheng, Yi-Hsuan Tsai, Shengjin Wang, Ming-Hsuan Yang
Abstract This paper proposes an end-to-end trainable network, SegFlow, for simultaneously predicting pixel-wise object segmentation and optical flow in videos. The proposed SegFlow has two branches where useful information of object segmentation and optical flow is propagated bidirectionally in a unified framework. The segmentation branch is based on a fully convolutional network, which has been proved effective in image segmentation tasks, and the optical flow branch takes advantage of the FlowNet model. The unified framework is trained iteratively offline to learn a generic notion, and fine-tuned online for specific objects. Extensive experiments on both video object segmentation and optical flow datasets demonstrate that introducing optical flow improves the performance of segmentation and vice versa, compared with state-of-the-art algorithms.
Tasks Optical Flow Estimation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking
Published 2017-09-20
URL http://arxiv.org/abs/1709.06750v1
PDF http://arxiv.org/pdf/1709.06750v1.pdf
PWC https://paperswithcode.com/paper/segflow-joint-learning-for-video-object
Repo
Framework