Paper Group ANR 430
Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms. Review on Computer Vision Techniques in Emergency Situation. Prediction of Sea Surface Temperature using Long Short-Term Memory. Video-based Person Re-identification with Accumulative Motion Context. Adaptive Simulation-based Training of AI Decision-makers using Bayesian Optimization. An End to End Deep Neural Network for Iris Segmentation in Unconstraint Scenarios. Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes. Merging real and virtual worlds: An analysis of the state of the art and practical evaluation of Microsoft Hololens. Multimodal Clustering for Community Detection. Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks. From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics. Efficient Low-Order Approximation of First-Passage Time Distributions. Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game. Multi-layer Visualization for Medical Mixed Reality. SegFlow: Joint Learning for Video Object Segmentation and Optical Flow.
Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms
Title | Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms |
Authors | Timo Hinzmann, Tim Taubner, Roland Siegwart |
Abstract | This paper proposes a computationally efficient method to estimate the time-varying relative pose between two visual-inertial sensor rigs mounted on the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated relative poses are used to generate highly accurate depth maps in real-time and can be employed for obstacle avoidance in low-altitude flights or landing maneuvers. The approach is structured as follows: Initially, a wing model is identified by fitting a probability density function to measured deviations from the nominal relative baseline transformation. At run-time, the prior knowledge about the wing model is fused in an Extended Kalman filter (EKF) together with relative pose measurements obtained from solving a relative perspective N-point problem (PnP), and the linear accelerations and angular velocities measured by the two inertial measurement units (IMU) which are rigidly attached to the cameras. Results obtained from extensive synthetic experiments demonstrate that our proposed framework is able to estimate highly accurate baseline transformations and depth maps. |
Tasks | |
Published | 2017-12-19 |
URL | http://arxiv.org/abs/1712.06837v2 |
PDF | http://arxiv.org/pdf/1712.06837v2.pdf |
PWC | https://paperswithcode.com/paper/flexible-stereo-constrained-non-rigid-wide |
Repo | |
Framework | |
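The run-time fusion described in the abstract is straightforward to prototype. Below is a minimal sketch, not the authors' implementation: the full SE(3) relative-pose state is reduced to a single scalar baseline deviation, the wing model supplies the prior and process noise, and noisy PnP-style measurements drive the EKF update. All numbers are illustrative assumptions.

```python
# Toy EKF for the baseline deviation of a flexible stereo wing rig.
import numpy as np

class BaselineEKF:
    def __init__(self, sigma_wing=0.05, sigma_meas=0.05):
        self.x = 0.0                        # baseline deviation from nominal [m]
        self.P = sigma_wing ** 2            # state covariance, seeded from wing model
        self.Q = (0.1 * sigma_wing) ** 2    # process noise (wing flexing)
        self.R = sigma_meas ** 2            # PnP measurement noise

    def predict(self):
        # Wing deviation modelled as a random walk around the nominal baseline.
        self.P += self.Q

    def update(self, z):
        # z: baseline deviation measured by the relative PnP solver.
        K = self.P / (self.P + self.R)      # Kalman gain (H = 1)
        self.x += K * (z - self.x)
        self.P *= (1.0 - K)

ekf = BaselineEKF()
rng = np.random.default_rng(0)
true_dev = 0.02                             # simulated wing flex [m]
for _ in range(50):
    ekf.predict()
    ekf.update(true_dev + rng.normal(0.0, 0.05))
print(f"estimated deviation: {ekf.x:.4f} m")  # converges toward 0.02
```

The real filter additionally integrates the linear accelerations and angular velocities from the two IMUs in its prediction step; here the prediction is a pure random walk for brevity.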
Review on Computer Vision Techniques in Emergency Situation
Title | Review on Computer Vision Techniques in Emergency Situation |
Authors | Laura Lopez-Fuentes, Joost van de Weijer, Manuel Gonzalez-Hidalgo, Harald Skinnemoen, Andrew D. Bagdanov |
Abstract | In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, whether in smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The range of emergencies in which computer vision tools have been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as the one they are studying, overlooking important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objectives that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies. |
Tasks | |
Published | 2017-08-24 |
URL | http://arxiv.org/abs/1708.07455v2 |
PDF | http://arxiv.org/pdf/1708.07455v2.pdf |
PWC | https://paperswithcode.com/paper/review-on-computer-vision-techniques-in |
Repo | |
Framework | |
Prediction of Sea Surface Temperature using Long Short-Term Memory
Title | Prediction of Sea Surface Temperature using Long Short-Term Memory |
Authors | Qin Zhang, Hui Wang, Junyu Dong, Guoqiang Zhong, Xin Sun |
Abstract | This letter adopts long short-term memory (LSTM) to predict sea surface temperature (SST), which is, to our knowledge, the first attempt to use a recurrent neural network to solve the problem of SST prediction and to make one-week and one-month daily predictions. We formulate the SST prediction problem as a time series regression problem. LSTM is a special kind of recurrent neural network that introduces a gate mechanism into the vanilla RNN to prevent the vanishing or exploding gradient problem. It has a strong ability to model the temporal relationships of time series data and can handle the long-term dependency problem well. The proposed network architecture is composed of two kinds of layers: an LSTM layer and a fully-connected dense layer. The LSTM layer is utilized to model the time series relationship, and the fully-connected layer maps the output of the LSTM layer to a final prediction. We explore the optimal setting of this architecture through experiments and report the prediction accuracy for the coastal seas of China to confirm the effectiveness of the proposed method. In addition, we also demonstrate its online updating characteristics. |
Tasks | Time Series |
Published | 2017-05-19 |
URL | http://arxiv.org/abs/1705.06861v1 |
PDF | http://arxiv.org/pdf/1705.06861v1.pdf |
PWC | https://paperswithcode.com/paper/prediction-of-sea-surface-temperature-using |
Repo | |
Framework | |
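The architecture described in the abstract — an LSTM layer followed by a fully-connected layer — maps directly onto a few lines of PyTorch. The sketch below is a hedged illustration, not the authors' code; the hidden size, sequence length, and 7-day horizon are assumptions for the example.

```python
# Minimal LSTM-plus-dense time series regressor in the spirit of the paper.
import torch
import torch.nn as nn

class SSTPredictor(nn.Module):
    def __init__(self, hidden_size=64, horizon=7):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            batch_first=True)
        self.fc = nn.Linear(hidden_size, horizon)  # e.g. 7-day-ahead forecast

    def forward(self, x):                  # x: (batch, seq_len, 1) past SSTs
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])      # predict from the last time step

model = SSTPredictor()
past = torch.randn(8, 30, 1)               # 8 sequences of 30 daily SST values
print(model(past).shape)                    # torch.Size([8, 7])
```

Training would minimize a regression loss (e.g. MSE) between the predicted and observed SST sequences, and the online-updating behaviour reported in the letter corresponds to continuing such updates as new daily observations arrive.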
Video-based Person Re-identification with Accumulative Motion Context
Title | Video-based Person Re-identification with Accumulative Motion Context |
Authors | Hao Liu, Zequn Jie, Karlekar Jayashree, Meibin Qi, Jianguo Jiang, Shuicheng Yan, Jiashi Feng |
Abstract | Video-based person re-identification plays a central role in realistic security and video surveillance. In this paper we propose a novel Accumulative Motion Context (AMOC) network for addressing this important problem, which effectively exploits long-range motion context for robustly identifying the same person under challenging conditions. Given a video sequence of the same or different persons, the proposed AMOC network jointly learns appearance representations and motion context from a collection of adjacent frames using a two-stream convolutional architecture. AMOC then accumulates clues from the motion context by recurrent aggregation, allowing effective information flow among adjacent frames and capturing the dynamic gist of the persons. The architecture of AMOC is end-to-end trainable, so the motion context can be adapted to complement appearance clues under unfavorable conditions (e.g. occlusions). Extensive experiments are conducted on three public benchmark datasets, i.e., the iLIDS-VID, PRID-2011 and MARS datasets, to investigate the performance of AMOC. The experimental results demonstrate that the proposed AMOC network significantly outperforms state-of-the-art methods for video-based re-identification and confirm the advantage of exploiting long-range motion context, clearly validating our motivation. |
Tasks | Person Re-Identification, Video-Based Person Re-Identification |
Published | 2017-01-01 |
URL | http://arxiv.org/abs/1701.00193v2 |
PDF | http://arxiv.org/pdf/1701.00193v2.pdf |
PWC | https://paperswithcode.com/paper/video-based-person-re-identification-with |
Repo | |
Framework | |
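A rough PyTorch sketch of the two-stream-plus-recurrence idea follows. The layer sizes and the GRU aggregator are simplified assumptions, not the paper's exact network: per frame, an appearance stream and a motion-context stream produce features that are fused and accumulated over the sequence.

```python
# Two-stream feature extraction with recurrent accumulation (AMOC-style toy).
import torch
import torch.nn as nn

class AMOCSketch(nn.Module):
    def __init__(self, feat=128):
        super().__init__()
        def stream(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, feat))
        self.appearance = stream(3)   # RGB frames
        self.motion = stream(2)       # motion maps, e.g. estimated flow (u, v)
        self.gru = nn.GRU(2 * feat, feat, batch_first=True)

    def forward(self, frames, motion):   # (B, T, 3, H, W) and (B, T, 2, H, W)
        fused = []
        for t in range(frames.shape[1]):
            a = self.appearance(frames[:, t])
            m = self.motion(motion[:, t])
            fused.append(torch.cat([a, m], dim=1))
        _, h = self.gru(torch.stack(fused, dim=1))
        return h[-1]                      # accumulated sequence embedding

net = AMOCSketch()
emb = net(torch.randn(2, 8, 3, 64, 32), torch.randn(2, 8, 2, 64, 32))
print(emb.shape)                          # torch.Size([2, 128])
```

Re-identification would then compare these accumulated embeddings between query and gallery sequences, e.g. by cosine distance under a metric-learning loss.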
Adaptive Simulation-based Training of AI Decision-makers using Bayesian Optimization
Title | Adaptive Simulation-based Training of AI Decision-makers using Bayesian Optimization |
Authors | Brett W. Israelsen, Nisar Ahmed, Kenneth Center, Roderick Green, Winston Bennett Jr |
Abstract | This work studies how an AI-controlled dog-fighting agent with tunable decision-making parameters can learn to optimize performance against an intelligent adversary, as measured by a stochastic objective function evaluated on simulated combat engagements. Gaussian process Bayesian optimization (GPBO) techniques are developed to automatically learn global Gaussian Process (GP) surrogate models, which provide statistical performance predictions in both explored and unexplored areas of the parameter space. This allows a learning engine to sample full-combat simulations at parameter values that are most likely to optimize performance and also provide highly informative data points for improving future predictions. However, standard GPBO methods do not provide a reliable surrogate model for the highly volatile objective functions found in aerial combat, and thus do not reliably identify global maxima. These issues are addressed by novel Repeat Sampling (RS) and Hybrid Repeat/Multi-point Sampling (HRMS) techniques. Simulation studies show that HRMS improves the accuracy of GP surrogate models, allowing AI decision-makers to more accurately predict performance and efficiently tune parameters. |
Tasks | Decision Making |
Published | 2017-03-27 |
URL | http://arxiv.org/abs/1703.09310v2 |
PDF | http://arxiv.org/pdf/1703.09310v2.pdf |
PWC | https://paperswithcode.com/paper/adaptive-simulation-based-training-of-ai |
Repo | |
Framework | |
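A hedged sketch of the GP Bayesian optimization loop with the Repeat Sampling idea: each candidate parameter value is evaluated several times and the repeats are averaged to tame the volatile, stochastic objective before fitting the GP surrogate. The objective function and all settings below are illustrative stand-ins for the combat simulations.

```python
# GPBO loop with repeat sampling on a noisy 1-D toy objective.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)

def noisy_objective(x):                     # stand-in for a combat-sim score
    return float(np.sin(3 * x) - 0.5 * x + rng.normal(0, 0.3))

def expected_improvement(mu, sigma, best):  # acquisition for maximization
    z = (mu - best) / np.maximum(sigma, 1e-9)
    return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

X, y = [], []
grid = np.linspace(0, 2, 200).reshape(-1, 1)
for it in range(15):
    x = rng.uniform(0, 2) if it < 3 else float(grid[np.argmax(ei), 0])
    score = np.mean([noisy_objective(x) for _ in range(5)])  # repeat sampling
    X.append([x]); y.append(score)
    gp = GaussianProcessRegressor(kernel=RBF(0.3), alpha=0.3**2 / 5)
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(grid, return_std=True)
    ei = expected_improvement(mu, sigma, max(y))
print(f"best parameter so far: x = {X[int(np.argmax(y))][0]:.3f}")
```

Averaging the repeats shrinks the effective observation noise passed to the GP through `alpha`, which is the basic intuition behind the paper's RS and HRMS schemes; the hybrid multi-point variant additionally batches acquisition points per iteration.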
An End to End Deep Neural Network for Iris Segmentation in Unconstraint Scenarios
Title | An End to End Deep Neural Network for Iris Segmentation in Unconstraint Scenarios |
Authors | Shabab Bazrafkan, Shejin Thavalengal, Peter Corcoran |
Abstract | With the increasing imaging and processing capabilities of today’s mobile devices, user authentication using iris biometrics has become feasible. However, as the acquisition conditions become more unconstrained and as image quality is typically lower than in dedicated iris acquisition systems, the accurate segmentation of iris regions is crucial for these devices. In this work, an end-to-end Fully Convolutional Deep Neural Network (FCDNN) design is proposed to perform the iris segmentation task for lower-quality iris images. The network design process is explained in detail, and the resulting network is trained and tuned using several large public iris datasets. A set of methods to generate and augment suitable lower-quality iris images from the high-quality public databases is provided. The network is trained on Near InfraRed (NIR) images initially and later tuned on additional datasets derived from visible images. Comprehensive inter-database comparisons are provided together with results from a selection of experiments detailing the effects of different tunings of the network. Finally, the proposed model is compared with SegNet-basic, and a near-optimal tuning of the network is compared to a selection of other state-of-the-art iris segmentation algorithms. The results show very promising performance from the optimized Deep Neural Network design when compared with state-of-the-art techniques applied to the same lower-quality datasets. |
Tasks | Iris Segmentation |
Published | 2017-12-07 |
URL | http://arxiv.org/abs/1712.02877v1 |
PDF | http://arxiv.org/pdf/1712.02877v1.pdf |
PWC | https://paperswithcode.com/paper/an-end-to-end-deep-neural-network-for-iris |
Repo | |
Framework | |
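A minimal fully convolutional encoder-decoder in the spirit of the FCDNN described above; the layer sizes are illustrative assumptions, not the paper's tuned design. The network maps a grayscale (e.g. NIR) iris image to a per-pixel iris/non-iris probability mask.

```python
# Toy fully convolutional encoder-decoder for binary iris segmentation.
import torch
import torch.nn as nn

class IrisFCN(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return torch.sigmoid(self.decoder(self.encoder(x)))

net = IrisFCN()
mask = net(torch.randn(1, 1, 128, 128))    # one 128x128 grayscale iris image
print(mask.shape)                           # torch.Size([1, 1, 128, 128])
```

Training against ground-truth masks with a binary cross-entropy loss, plus the paper's degradation-based augmentation of high-quality databases, would complete the pipeline.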
Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes
Title | Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes |
Authors | Elizabeth Polgreen, Viraj Wijesuriya, Sofie Haesaert, Alessandro Abate |
Abstract | We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system. We obtain the confidence that the underlying system satisfies a given property, and show that the method uses data efficiently and is thus robust to the amount of data available. These characteristics are achieved by firstly exploiting parameter synthesis to establish a feasible set of parameters for which the underlying system will satisfy the property; secondly, by actively synthesising experiments to increase the amount of information in the collected data that is relevant to the property; and finally, by propagating this information over the model parameters, obtaining a confidence that reflects our belief whether or not the system parameters lie in the feasible set, thereby solving the verification problem. |
Tasks | |
Published | 2017-07-05 |
URL | http://arxiv.org/abs/1707.01322v1 |
PDF | http://arxiv.org/pdf/1707.01322v1.pdf |
PWC | https://paperswithcode.com/paper/automated-experiment-design-for-data |
Repo | |
Framework | |
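A toy sketch of the final step described above: given a feasible parameter set (from parameter synthesis) and experiment outcomes, propagate a Bayesian posterior over the unknown transition probability and read off the confidence that the system satisfies the property. The feasible interval and the data below are invented for illustration.

```python
# Confidence that an unknown transition probability lies in the feasible set.
from scipy.stats import beta

feasible = (0.6, 1.0)        # synthesised set: property holds iff p in [0.6, 1]
successes, failures = 17, 3  # outcomes observed in experiments on the system

# Beta(1, 1) prior updated with the experimental data.
posterior = beta(1 + successes, 1 + failures)
confidence = posterior.cdf(feasible[1]) - posterior.cdf(feasible[0])
print(f"confidence that the property holds: {confidence:.3f}")
```

In the paper the feasible set comes from parameter synthesis over the parametric MDP and the experiments are chosen actively to be maximally informative; here both are fixed by hand to keep the example short.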
Merging real and virtual worlds: An analysis of the state of the art and practical evaluation of Microsoft Hololens
Title | Merging real and virtual worlds: An analysis of the state of the art and practical evaluation of Microsoft Hololens |
Authors | Adrien Coppens |
Abstract | Achieving a symbiotic blending between reality and virtuality is a dream that has been lying in the minds of many people for a long time. Advances in various domains constantly bring us closer to making that dream come true. Augmented reality as well as virtual reality are in fact trending terms and are expected to further progress in the years to come. This master’s thesis aims to explore these areas and starts by defining necessary terms such as augmented reality (AR) or virtual reality (VR). Usual taxonomies to classify and compare the corresponding experiences are then discussed. In order to enable those applications, many technical challenges need to be tackled, such as accurate motion tracking with 6 degrees of freedom (positional and rotational), that is necessary for compelling experiences and to prevent user sickness. Additionally, augmented reality experiences typically rely on image processing to position the superimposed content. To do so, “paper” markers or features extracted from the environment are often employed. Both sets of techniques are explored and common solutions and algorithms are presented. After investigating those technical aspects, I carry out an objective comparison of the existing state-of-the-art and state-of-the-practice in those domains, and I discuss present and potential applications in these areas. As a practical validation, I present the results of an application that I have developed using Microsoft HoloLens, one of the more advanced affordable technologies for augmented reality that is available today. Based on the experience and lessons learned during this development, I discuss the limitations of current technologies and present some avenues of future research. |
Tasks | |
Published | 2017-06-25 |
URL | http://arxiv.org/abs/1706.08096v1 |
PDF | http://arxiv.org/pdf/1706.08096v1.pdf |
PWC | https://paperswithcode.com/paper/merging-real-and-virtual-worlds-an-analysis |
Repo | |
Framework | |
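As a brief illustration of the marker-based positioning technique the thesis discusses, the sketch below uses OpenCV's ArUco module to detect fiducial markers in a camera frame. Note that the aruco Python API differs across OpenCV versions (newer releases use `cv2.aruco.ArucoDetector`); this follows the classic `cv2.aruco.detectMarkers` interface, and the blank frame is a placeholder for a real camera image.

```python
# Detect ArUco fiducial markers, the "paper markers" used to anchor AR content.
import cv2
import numpy as np

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
frame = np.full((480, 640, 3), 255, np.uint8)   # placeholder; use a camera frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
corners, ids, _ = cv2.aruco.detectMarkers(gray, dictionary)
if ids is not None:
    # Each marker yields four image corners; superimposed content is then
    # anchored to the camera pose estimated from these correspondences.
    print(f"found markers: {ids.ravel().tolist()}")
```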
Multimodal Clustering for Community Detection
Title | Multimodal Clustering for Community Detection |
Authors | Dmitry I. Ignatov, Alexander Semenov, Daria Komissarova, Dmitry V. Gnatyshak |
Abstract | Multimodal clustering is an unsupervised technique for mining interesting patterns in $n$-adic binary relations or $n$-mode networks. Among different types of such generalized patterns one can find biclusters and formal concepts (maximal bicliques) for the 2-mode case, triclusters and triconcepts for the 3-mode case, closed $n$-sets for the $n$-mode case, etc. Object-attribute biclustering (OA-biclustering) for mining large binary data tables (formal contexts or 2-mode networks) arose at the end of the last decade due to the intractability of computational problems related to formal concepts; this type of pattern was proposed as a meaningful and scalable approximation of formal concepts. In this paper, our aim is to present recent advances in OA-biclustering and its extensions to mining multi-mode communities in the SNA setting. We also discuss the connection between the clustering coefficients known in the SNA community for 1-mode and 2-mode networks and OA-bicluster density, the main quality measure of an OA-bicluster. Our experiments with 2-, 3-, and 4-mode large real-world networks show that this type of pattern is suitable for community detection in multi-mode cases within reasonable time, even though the number of corresponding $n$-cliques is still unknown due to computational difficulties. An interpretation of OA-biclusters for 1-mode networks is provided as well. |
Tasks | Community Detection |
Published | 2017-02-27 |
URL | http://arxiv.org/abs/1702.08557v1 |
PDF | http://arxiv.org/pdf/1702.08557v1.pdf |
PWC | https://paperswithcode.com/paper/multimodal-clustering-for-community-detection |
Repo | |
Framework | |
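A small sketch of the OA-bicluster construction and its density, the quality measure mentioned in the abstract. For a pair (g, m) in a binary relation I, the bicluster is (m′, g′), where m′ is the set of all objects sharing attribute m and g′ is the set of all attributes of object g; density is the fraction of filled cells in the m′ × g′ rectangle. The tiny relation below is invented for illustration.

```python
# OA-bicluster of a pair (g, m) in a binary object-attribute relation I.
I = {("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "c"), ("u3", "b")}

def oa_bicluster(g, m, I):
    objs = {x for (x, y) in I if y == m}    # m': objects having attribute m
    attrs = {y for (x, y) in I if x == g}   # g': attributes of object g
    filled = sum((x, y) in I for x in objs for y in attrs)
    density = filled / (len(objs) * len(attrs))
    return objs, attrs, density

print(oa_bicluster("u1", "a", I))  # ({'u1', 'u2'}, {'a', 'b'}, 0.75)
```

Dense OA-biclusters (density close to 1) approximate formal concepts while remaining cheap to enumerate, which is what makes them usable on large multi-mode networks.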
Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks
Title | Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks |
Authors | Weiyue Wang, Qiangui Huang, Suya You, Chao Yang, Ulrich Neumann |
Abstract | Recent advances in convolutional neural networks have shown promising results in 3D shape completion. But due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial Network (3D-ED-GAN) and a Long-term Recurrent Convolutional Network (LRCN). The 3D-ED-GAN is a 3D convolutional neural network trained with a generative adversarial paradigm to fill missing 3D data at low resolution. The LRCN adopts a recurrent neural network architecture to minimize GPU memory usage and incorporates an Encoder-Decoder pair into a Long Short-term Memory Network. By handling the 3D model as a sequence of 2D slices, the LRCN transforms a coarse 3D shape into a more complete and higher-resolution volume. While the 3D-ED-GAN captures the global contextual structure of the 3D shape, the LRCN localizes the fine-grained details. Experimental results on both real-world and synthetic data show that reconstructions from corrupted models result in complete and high-resolution 3D objects. |
Tasks | |
Published | 2017-11-17 |
URL | http://arxiv.org/abs/1711.06375v1 |
PDF | http://arxiv.org/pdf/1711.06375v1.pdf |
PWC | https://paperswithcode.com/paper/shape-inpainting-using-3d-generative |
Repo | |
Framework | |
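A compact sketch of a 3D encoder-decoder generator like the 3D-ED-GAN's (channel counts and resolution are illustrative assumptions): the generator takes a corrupted low-resolution voxel grid and outputs a completed occupancy volume. The adversarial discriminator and the LRCN slice-by-slice refinement stage are omitted for brevity.

```python
# Toy 3D encoder-decoder generator for voxel shape completion.
import torch
import torch.nn as nn

class EDGenerator3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv3d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.dec = nn.Sequential(
            nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1))

    def forward(self, v):                  # v: (B, 1, 32, 32, 32) voxel grid
        return torch.sigmoid(self.dec(self.enc(v)))

g = EDGenerator3D()
print(g(torch.rand(1, 1, 32, 32, 32)).shape)  # torch.Size([1, 1, 32, 32, 32])
```

In the full framework a 3D discriminator judges completed volumes against real ones, and the LRCN then upsamples the coarse output by treating the volume as a sequence of 2D slices, which is what keeps GPU memory in check.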
From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics
Title | From Algorithmic Black Boxes to Adaptive White Boxes: Declarative Decision-Theoretic Ethical Programs as Codes of Ethics |
Authors | Martijn van Otterlo |
Abstract | Ethics of algorithms is an emerging topic in various disciplines such as social science, law, and philosophy, but also artificial intelligence (AI). The value alignment problem expresses the challenge of (machine) learning values that are, in some way, aligned with human requirements or values. In this paper I argue for looking at how humans have formalized and communicated values, in professional codes of ethics, and for exploring declarative decision-theoretic ethical programs (DDTEP) to formalize codes of ethics. This renders machine ethical reasoning and decision-making, as well as learning, more transparent and hopefully more accountable. The paper includes proof-of-concept examples of known toy dilemmas and gatekeeping domains such as archives and libraries. |
Tasks | Decision Making |
Published | 2017-11-16 |
URL | http://arxiv.org/abs/1711.06035v1 |
PDF | http://arxiv.org/pdf/1711.06035v1.pdf |
PWC | https://paperswithcode.com/paper/from-algorithmic-black-boxes-to-adaptive |
Repo | |
Framework | |
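A toy sketch of the decision-theoretic flavour of a DDTEP: each action has probabilistic outcomes scored against declaratively stated values, and the agent picks the action with the highest expected utility. The values, actions, and numbers below are invented for illustration of a gatekeeping (archive/library) scenario like those mentioned in the abstract, not the paper's formalism.

```python
# Expected-utility choice over actions scored against declared ethical values.
values = {"privacy": 0.7, "access": 0.3}   # weights from a code of ethics

actions = {  # action -> list of (probability, {value: score}) outcomes
    "release_record": [(0.9, {"privacy": -1.0, "access": 1.0}),
                       (0.1, {"privacy": -1.0, "access": 0.0})],
    "withhold_record": [(1.0, {"privacy": 1.0, "access": -1.0})],
}

def expected_utility(outcomes):
    return sum(p * sum(values[v] * s for v, s in scores.items())
               for p, scores in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # 'withhold_record' under these privacy-leaning weights
```

Making the value weights explicit and adjustable is what renders such a program a "white box": changing the code of ethics changes the decision in an inspectable way.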
Efficient Low-Order Approximation of First-Passage Time Distributions
Title | Efficient Low-Order Approximation of First-Passage Time Distributions |
Authors | David Schnoerr, Botond Cseke, Ramon Grima, Guido Sanguinetti |
Abstract | We consider the problem of computing first-passage time distributions for reaction processes modelled by master equations. We show that this generally intractable class of problems is equivalent to a sequential Bayesian inference problem for an auxiliary observation process. The solution can be approximated efficiently by solving a closed set of coupled ordinary differential equations (for the low-order moments of the process) whose size scales with the number of species. We apply it to an epidemic model and a trimerisation process, and show good agreement with stochastic simulations. |
Tasks | Bayesian Inference |
Published | 2017-06-01 |
URL | http://arxiv.org/abs/1706.00348v2 |
PDF | http://arxiv.org/pdf/1706.00348v2.pdf |
PWC | https://paperswithcode.com/paper/efficient-low-order-approximation-of-first |
Repo | |
Framework | |
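A small sketch estimating a first-passage time distribution for a simple epidemic by direct Gillespie simulation — the stochastic baseline the moment-based method above is validated against (the moment-closure approximation itself is beyond a short sketch). The rates and threshold are illustrative assumptions.

```python
# Monte Carlo first-passage times for an SIR-type epidemic via Gillespie SSA.
import numpy as np

rng = np.random.default_rng(2)

def fpt_infection(threshold=30, S=100, I=5, beta=0.003, gamma=0.1):
    """Time until the infected count first reaches `threshold`."""
    t = 0.0
    while I < threshold:
        rates = np.array([beta * S * I, gamma * I])  # infection, recovery
        total = rates.sum()
        if total == 0.0:
            return np.inf                            # epidemic died out
        t += rng.exponential(1.0 / total)
        if rng.random() < rates[0] / total:
            S, I = S - 1, I + 1
        else:
            I -= 1
    return t

samples = np.array([fpt_infection() for _ in range(2000)])
finite = samples[np.isfinite(samples)]
print(f"P(reach threshold) = {finite.size / samples.size:.2f}, "
      f"mean FPT = {finite.mean():.2f}")
```

The paper's contribution is to approximate this distribution without simulation, by integrating a closed set of ODEs for the low-order moments of an auxiliary observation process, which scales with the number of species rather than the population size.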
Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game
Title | Moving to VideoKifu: the last steps toward a fully automatic record-keeping of a Go game |
Authors | Mario Corsolini, Andrea Carta |
Abstract | In a previous paper [ arXiv:1508.03269 ] we described the techniques we successfully employed for automatically reconstructing the whole move sequence of a Go game by means of a set of pictures. Now we describe how it is possible to reconstruct the move sequence by means of a video stream (which may be provided by an unattended webcam), possibly in real-time. Although the basic algorithms remain the same, we discuss the new problems that arise when dealing with videos, with special care for the ones that could block a real-time analysis and require an improvement of our previous techniques or even a completely new approach. Finally, we present a number of preliminary but positive experimental results supporting the effectiveness of the software we are developing, built on the ideas outlined here. |
Tasks | |
Published | 2017-01-19 |
URL | http://arxiv.org/abs/1701.05419v1 |
PDF | http://arxiv.org/pdf/1701.05419v1.pdf |
PWC | https://paperswithcode.com/paper/moving-to-videokifu-the-last-steps-toward-a |
Repo | |
Framework | |
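A toy sketch of the core per-frame step such a system needs: compare consecutive (rectified) board images at the grid intersections and report intersections whose intensity changed enough to suggest a newly placed stone. The thresholds and board geometry are illustrative assumptions; the real pipeline also handles occlusions, lighting changes, and move-order disambiguation.

```python
# Detect newly placed Go stones by patch differencing at grid intersections.
import numpy as np

def new_stones(prev, curr, board_size=19, radius=4, thresh=25.0):
    """prev, curr: grayscale board images rectified to a square."""
    h, w = curr.shape
    ys = np.linspace(radius, h - radius - 1, board_size).astype(int)
    xs = np.linspace(radius, w - radius - 1, board_size).astype(int)
    moves = []
    for r, y in enumerate(ys):
        for c, x in enumerate(xs):
            p = prev[y - radius:y + radius, x - radius:x + radius]
            q = curr[y - radius:y + radius, x - radius:x + radius]
            if abs(q.mean() - p.mean()) > thresh:
                moves.append((r, c))
    return moves  # board coordinates of intersections that changed

prev = np.full((380, 380), 128.0)
curr = prev.copy()
curr[0:12, 0:12] = 0.0            # a black stone appears at intersection (0, 0)
print(new_stones(prev, curr))     # [(0, 0)]
```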
Multi-layer Visualization for Medical Mixed Reality
Title | Multi-layer Visualization for Medical Mixed Reality |
Authors | Séverine Habert, Ma Meng, Pascal Fallavollita, Nassir Navab |
Abstract | Medical Mixed Reality helps surgeons to contextualize intraoperative data with video of the surgical scene. Nonetheless, the surgical scene and anatomical target are often occluded by surgical instruments and the surgeon’s hands. In this paper, to our knowledge for the first time, we propose a multi-layer visualization solution for Medical Mixed Reality which subtly improves a surgeon’s visualization by making the occluding objects transparent. As an example scenario, we use an augmented reality C-arm fluoroscope device. A video image is created using a volumetric-based image synthesization technique and stereo-RGBD cameras mounted on the C-arm. From this synthesized view, the background occluded by the surgical instruments and surgeon’s hands is recovered by modifying the volumetric-based image synthesization technique. The occluding objects can therefore become transparent over the surgical scene. Experimentation with different augmented reality scenarios yields results demonstrating that the background of the surgical scenes can be recovered with an accuracy between 45% and 99%. In conclusion, we presented a Mixed Reality solution for medicine, providing transparency to objects occluding the surgical scene. This work is also the first application of volumetric fields for Diminished Reality/Mixed Reality. |
Tasks | |
Published | 2017-09-26 |
URL | http://arxiv.org/abs/1709.08962v1 |
PDF | http://arxiv.org/pdf/1709.08962v1.pdf |
PWC | https://paperswithcode.com/paper/multi-layer-visualization-for-medical-mixed |
Repo | |
Framework | |
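A minimal sketch of the multi-layer transparency idea: wherever a mask marks occluding instruments or hands, blend the live video with the background recovered by the volumetric image synthesis, so the occluders appear transparent. The inputs here are synthetic placeholders standing in for the real camera and recovered layers.

```python
# Alpha-blend a recovered background layer over masked occluding objects.
import numpy as np

def see_through(video, background, occluder_mask, alpha=0.7):
    """alpha: how transparent the occluders are rendered (0 = fully opaque)."""
    out = video.astype(float).copy()
    m = occluder_mask.astype(bool)
    out[m] = alpha * background[m] + (1.0 - alpha) * video[m]
    return out.astype(np.uint8)

video = np.full((240, 320, 3), 60, dtype=np.uint8)        # live surgical view
background = np.full((240, 320, 3), 200, dtype=np.uint8)  # recovered layer
mask = np.zeros((240, 320), dtype=bool)
mask[100:140, 150:200] = True                             # instrument region
blended = see_through(video, background, mask)
print(blended[120, 170], blended[0, 0])   # [158 158 158] vs untouched [60 60 60]
```

The hard part, of course, is producing the `background` layer in the first place; that is what the modified volumetric image synthesization from the stereo-RGBD cameras provides.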
SegFlow: Joint Learning for Video Object Segmentation and Optical Flow
Title | SegFlow: Joint Learning for Video Object Segmentation and Optical Flow |
Authors | Jingchun Cheng, Yi-Hsuan Tsai, Shengjin Wang, Ming-Hsuan Yang |
Abstract | This paper proposes an end-to-end trainable network, SegFlow, for simultaneously predicting pixel-wise object segmentation and optical flow in videos. The proposed SegFlow has two branches in which useful information of object segmentation and optical flow is propagated bidirectionally in a unified framework. The segmentation branch is based on a fully convolutional network, which has been proven effective in the image segmentation task, and the optical flow branch takes advantage of the FlowNet model. The unified framework is trained iteratively offline to learn a generic notion, and fine-tuned online for specific objects. Extensive experiments on both video object segmentation and optical flow datasets demonstrate that introducing optical flow improves the performance of segmentation and vice versa, performing favorably against state-of-the-art algorithms. |
Tasks | Optical Flow Estimation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, Visual Object Tracking |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06750v1 |
PDF | http://arxiv.org/pdf/1709.06750v1.pdf |
PWC | https://paperswithcode.com/paper/segflow-joint-learning-for-video-object |
Repo | |
Framework | |
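A rough sketch of SegFlow's two-branch idea (shapes and layer sizes are simplified assumptions, not the paper's FCN/FlowNet backbones): a segmentation branch and a flow branch share information by fusing each other's intermediate features before producing their respective outputs.

```python
# Two-branch network with shared features for joint segmentation and flow.
import torch
import torch.nn as nn

class SegFlowSketch(nn.Module):
    def __init__(self, feat=32):
        super().__init__()
        self.seg_enc = nn.Conv2d(3, feat, 3, padding=1)    # current frame
        self.flow_enc = nn.Conv2d(6, feat, 3, padding=1)   # stacked frame pair
        self.seg_head = nn.Conv2d(2 * feat, 1, 3, padding=1)   # object mask
        self.flow_head = nn.Conv2d(2 * feat, 2, 3, padding=1)  # (u, v) flow

    def forward(self, frame, frame_pair):
        fs = torch.relu(self.seg_enc(frame))
        ff = torch.relu(self.flow_enc(frame_pair))
        fused = torch.cat([fs, ff], dim=1)     # bidirectional feature sharing
        return torch.sigmoid(self.seg_head(fused)), self.flow_head(fused)

net = SegFlowSketch()
mask, flow = net(torch.randn(1, 3, 64, 64), torch.randn(1, 6, 64, 64))
print(mask.shape, flow.shape)  # (1, 1, 64, 64) and (1, 2, 64, 64)
```

Joint training on segmentation and flow losses lets each branch regularize the other, which is the effect the paper's experiments quantify.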