Paper Group ANR 273
RegNet: Multimodal Sensor Registration Using Deep Neural Networks. Robotic Ironing with 3D Perception and Force/Torque Feedback in Household Environments. Experimental comparison of single-pixel imaging algorithms. Blind Image Deblurring Using Row-Column Sparse Representations. Polynomial Time Efficient Construction Heuristics for Vertex Separation …
RegNet: Multimodal Sensor Registration Using Deep Neural Networks
Title | RegNet: Multimodal Sensor Registration Using Deep Neural Networks |
Authors | Nick Schneider, Florian Piewak, Christoph Stiller, Uwe Franke |
Abstract | In this paper, we present RegNet, the first deep convolutional neural network (CNN) to infer a 6 degrees of freedom (DOF) extrinsic calibration between multimodal sensors, exemplified using a scanning LiDAR and a monocular camera. Compared to existing approaches, RegNet casts all three conventional calibration steps (feature extraction, feature matching and global regression) into a single real-time capable CNN. Our method does not require any human interaction and bridges the gap between classical offline and target-less online calibration approaches as it provides both a stable initial estimation as well as a continuous online correction of the extrinsic parameters. During training we randomly decalibrate our system in order to train RegNet to infer the correspondence between projected depth measurements and RGB image and finally regress the extrinsic calibration. Additionally, with an iterative execution of multiple CNNs, that are trained on different magnitudes of decalibration, our approach compares favorably to state-of-the-art methods in terms of a mean calibration error of 0.28 degrees for the rotational and 6 cm for the translation components even for large decalibrations up to 1.5 m and 20 degrees. |
Tasks | Calibration |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03167v1 |
http://arxiv.org/pdf/1707.03167v1.pdf | |
PWC | https://paperswithcode.com/paper/regnet-multimodal-sensor-registration-using |
Repo | |
Framework | |
Robotic Ironing with 3D Perception and Force/Torque Feedback in Household Environments
Title | Robotic Ironing with 3D Perception and Force/Torque Feedback in Household Environments |
Authors | David Estevez, Juan G. Victores, Raul Fernandez-Fernandez, Carlos Balaguer |
Abstract | As robotic systems become more popular in household environments, the complexity of required tasks also increases. In this work we focus on a domestic chore deemed dull by a majority of the population, the task of ironing. The presented algorithm improves on the limited number of previous works by joining 3D perception with force/torque sensing, with emphasis on finding a practical solution with a feasible implementation in a domestic setting. Our algorithm obtains a point cloud representation of the working environment. From this point cloud, the garment is segmented and a custom Wrinkleness Local Descriptor (WiLD) is computed to determine the location of the present wrinkles. Using this descriptor, the most suitable ironing path is computed and, based on it, the manipulation algorithm performs the force-controlled ironing operation. Experiments have been performed with a humanoid robot platform, proving that our algorithm is able to successfully detect wrinkles present in garments and iteratively reduce the wrinkleness using an unmodified iron. |
Tasks | |
Published | 2017-06-16 |
URL | http://arxiv.org/abs/1706.05340v1 |
http://arxiv.org/pdf/1706.05340v1.pdf | |
PWC | https://paperswithcode.com/paper/robotic-ironing-with-3d-perception-and |
Repo | |
Framework | |
Experimental comparison of single-pixel imaging algorithms
Title | Experimental comparison of single-pixel imaging algorithms |
Authors | Liheng Bian, Jinli Suo, Qionghai Dai, Feng Chen |
Abstract | Single-pixel imaging (SPI) is a novel technique capturing 2D images using a photodiode, instead of conventional 2D array sensors. SPI offers a high signal-to-noise ratio, wide spectrum range, low cost, and robustness to light scattering. Various algorithms have been proposed for SPI reconstruction, including the linear correlation methods, the alternating projection method (AP), and the compressive sensing based methods. However, there has been no comprehensive review discussing their respective advantages, which is important for SPI’s further applications and development. In this paper, we reviewed and compared these algorithms in a unified reconstruction framework. Besides, we proposed two other SPI algorithms including a conjugate gradient descent based method (CGD) and a Poisson maximum likelihood based method. Both simulations and experiments validate the following conclusions: to obtain comparable reconstruction accuracy, the compressive sensing based total variation regularization method (TV) requires the least measurements and consumes the least running time for small-scale reconstruction; the CGD and AP methods run fastest in large-scale cases; the TV and AP methods are the most robust to measurement noise. In short, there are trade-offs between capture efficiency, computational complexity and robustness to noise among different SPI algorithms. We have released our source code for non-commercial use. |
Tasks | Compressive Sensing |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03164v2 |
http://arxiv.org/pdf/1707.03164v2.pdf | |
PWC | https://paperswithcode.com/paper/experimental-comparison-of-single-pixel |
Repo | |
Framework | |
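The linear correlation reconstruction named in the abstract above can be illustrated with a minimal sketch. It assumes one common form of the estimator (correlating mean-subtracted measurements with the illumination patterns) and is not taken from the authors' released code; the function names `measure` and `correlation_reconstruct`, and the toy 4-pixel scene, are hypothetical.

```python
from itertools import product


def measure(patterns, scene):
    """Simulate the single-pixel detector: one scalar per illumination pattern."""
    return [sum(p * s for p, s in zip(pattern, scene)) for pattern in patterns]


def correlation_reconstruct(patterns, measurements):
    """Estimate the scene as the average of (y_i - mean(y)) * pattern_i."""
    m = len(patterns)
    n = len(patterns[0])
    mean_y = sum(measurements) / m
    image = [0.0] * n
    for pattern, y in zip(patterns, measurements):
        weight = y - mean_y  # mean-subtracted measurement
        for k in range(n):
            image[k] += weight * pattern[k]
    return [value / m for value in image]


# Toy example (assumption): a 4-pixel scene lit by every binary pattern once.
scene = [0.0, 1.0, 0.5, 0.25]
patterns = [list(bits) for bits in product([0, 1], repeat=4)]
estimate = correlation_reconstruct(patterns, measure(patterns, scene))
```

With the full set of binary patterns, each pixel of `estimate` equals the scene value scaled by the pattern variance (1/4), so the scene is recovered up to a constant factor. With fewer, random patterns the same estimator is only approximate.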
Blind Image Deblurring Using Row-Column Sparse Representations
Title | Blind Image Deblurring Using Row-Column Sparse Representations |
Authors | Mohammad Tofighi, Yuelong Li, Vishal Monga |
Abstract | Blind image deblurring is a particularly challenging inverse problem where the blur kernel is unknown and must be estimated en route to recover the deblurred image. The problem is of strong practical relevance since many imaging devices, such as cellphone cameras, must rely on deblurring algorithms to yield satisfactory image quality. Despite significant research effort, handling large motions remains an open problem. In this paper, we develop a new method called Blind Image Deblurring using Row-Column Sparsity (BD-RCS) to address this issue. Specifically, we model the outer product of kernel and image coefficients in certain transformation domains as a rank-one matrix, and recover it by solving a rank minimization problem. Our central contribution then includes solving two new optimization problems involving row and column sparsity to automatically determine blur kernel and image support sequentially. The kernel and image can then be recovered through a singular value decomposition (SVD). Experimental results on linear motion deblurring demonstrate that BD-RCS can yield better results than state of the art, particularly when the blur is caused by large motion. This is confirmed both visually and through quantitative measures. |
Tasks | Blind Image Deblurring, Deblurring |
Published | 2017-12-05 |
URL | http://arxiv.org/abs/1712.01937v1 |
http://arxiv.org/pdf/1712.01937v1.pdf | |
PWC | https://paperswithcode.com/paper/blind-image-deblurring-using-row-column |
Repo | |
Framework | |
Polynomial Time Efficient Construction Heuristics for Vertex Separation Minimization Problem
Title | Polynomial Time Efficient Construction Heuristics for Vertex Separation Minimization Problem |
Authors | Pallavi Jain, Gur Saran, Kamal Srivastava |
Abstract | Vertex Separation Minimization Problem (VSMP) consists of finding a layout of a graph G = (V,E) which minimizes the maximum vertex cut or separation of a layout. It is an NP-complete problem in general for which metaheuristic techniques can be applied to find near-optimal solutions. VSMP has applications in VLSI design, graph drawing and computer language compiler design. VSMP is polynomially solvable for grids, trees, permutation graphs and cographs. Construction heuristics play a very important role in the metaheuristic techniques as they are responsible for generating initial solutions which lead to fast convergence. In this paper, we have proposed three construction heuristics H1, H2 and H3 and performed experiments on Grids, Small graphs, Trees and Harwell Boeing graphs, totaling 248 instances of graphs. Experiments reveal that H1, H2 and H3 are able to achieve best results for 88.71%, 43.5% and 37.1% of the total instances respectively while the best construction heuristic in the literature achieves the best solution for 39.9% of the total instances. We have also compared the results with the state-of-the-art metaheuristic GVNS and observed that the proposed construction heuristics improve the results for some of the input instances. It was found that GVNS obtained best results for 82.9% of all input instances and the heuristic H1 obtained best results for 82.3% of all input instances. |
Tasks | |
Published | 2017-02-19 |
URL | http://arxiv.org/abs/1702.05710v1 |
http://arxiv.org/pdf/1702.05710v1.pdf | |
PWC | https://paperswithcode.com/paper/polynomial-time-efficient-construction |
Repo | |
Framework | |
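The objective described in the abstract above (the maximum vertex cut over a layout) can be made concrete with a short sketch. It uses the standard definition of vertex separation and an exhaustive search that is only feasible for tiny graphs. It does not implement the heuristics H1, H2 or H3, and the names `vertex_separation` and `brute_force_vsmp` are hypothetical.

```python
from itertools import permutations


def vertex_separation(adj, layout):
    """Maximum, over all positions i, of the number of vertices placed at
    positions <= i that still have a neighbour placed after position i."""
    pos = {v: i for i, v in enumerate(layout)}
    worst = 0
    for i in range(len(layout)):
        cut = sum(
            1
            for u in layout[: i + 1]
            if any(pos[w] > i for w in adj[u])
        )
        worst = max(worst, cut)
    return worst


def brute_force_vsmp(adj):
    """Try every layout; exponential, so only usable as a reference on tiny graphs."""
    return min(vertex_separation(adj, list(p)) for p in permutations(adj))


# Toy example (assumption): the path graph a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
natural = vertex_separation(path, ["a", "b", "c", "d"])
shuffled = vertex_separation(path, ["a", "c", "b", "d"])
```

Here the natural left-to-right layout of the path has separation 1, while placing `c` before `b` raises it to 2. A construction heuristic aims to produce low-separation layouts like the first one without enumerating all permutations.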
Bayesian model selection consistency and oracle inequality with intractable marginal likelihood
Title | Bayesian model selection consistency and oracle inequality with intractable marginal likelihood |
Authors | Yun Yang, Debdeep Pati |
Abstract | In this article, we investigate large sample properties of model selection procedures in a general Bayesian framework when a closed form expression of the marginal likelihood function is not available or a local asymptotic quadratic approximation of the log-likelihood function does not exist. Under appropriate identifiability assumptions on the true model, we provide sufficient conditions for a Bayesian model selection procedure to be consistent and obey the Occam’s razor phenomenon, i.e., the probability of selecting the “smallest” model that contains the truth tends to one as the sample size goes to infinity. In order to show that a Bayesian model selection procedure selects the smallest model containing the truth, we impose a prior anti-concentration condition, requiring the prior mass assigned by large models to a neighborhood of the truth to be sufficiently small. In a more general setting where the strong model identifiability assumption may not hold, we introduce the notion of local Bayesian complexity and develop oracle inequalities for Bayesian model selection procedures. Our Bayesian oracle inequality characterizes a trade-off between the approximation error and a Bayesian characterization of the local complexity of the model, illustrating the adaptive nature of averaging-based Bayesian procedures towards achieving an optimal rate of posterior convergence. Specific applications of the model selection theory are discussed in the context of high-dimensional nonparametric regression and density regression where the regression function or the conditional density is assumed to depend on a fixed subset of predictors. As a result of independent interest, we propose a general technique for obtaining upper bounds of certain small ball probability of stationary Gaussian processes. |
Tasks | Gaussian Processes, Model Selection |
Published | 2017-01-02 |
URL | http://arxiv.org/abs/1701.00311v2 |
http://arxiv.org/pdf/1701.00311v2.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-model-selection-consistency-and |
Repo | |
Framework | |
Underwater Optical Image Processing: A Comprehensive Review
Title | Underwater Optical Image Processing: A Comprehensive Review |
Authors | Huimin Lu, Yujie Li, Yudong Zhang, Min Chen, Seiichi Serikawa, Hyoungseop Kim |
Abstract | Underwater cameras are widely used to observe the sea floor. They are usually included in autonomous underwater vehicles, unmanned underwater vehicles, and in situ ocean sensor networks. Despite being an important sensor for monitoring underwater scenes, there exist many issues with recent underwater camera sensors. Because of light's transport characteristics in water and the biological activity at the sea floor, the acquired underwater images often suffer from scattering and large amounts of noise. Over the last five years, many methods have been proposed to overcome traditional underwater imaging problems. This paper aims to review the state-of-the-art techniques in underwater image processing by highlighting the contributions and challenges presented in over 40 papers. We present an overview of various underwater image processing approaches, such as underwater image descattering, underwater image color restoration, and underwater image quality assessments. Finally, we summarize the future trends and challenges in designing and processing underwater imaging sensors. |
Tasks | |
Published | 2017-02-13 |
URL | http://arxiv.org/abs/1702.03600v1 |
http://arxiv.org/pdf/1702.03600v1.pdf | |
PWC | https://paperswithcode.com/paper/underwater-optical-image-processing-a |
Repo | |
Framework | |
Quadratic Upper Bound for Recursive Teaching Dimension of Finite VC Classes
Title | Quadratic Upper Bound for Recursive Teaching Dimension of Finite VC Classes |
Authors | Lunjia Hu, Ruihan Wu, Tianhong Li, Liwei Wang |
Abstract | In this work we study the quantitative relation between the recursive teaching dimension (RTD) and the VC dimension (VCD) of concept classes of finite sizes. The RTD of a concept class $\mathcal C \subseteq \{0, 1\}^n$, introduced by Zilles et al. (2011), is a combinatorial complexity measure characterized by the worst-case number of examples necessary to identify a concept in $\mathcal C$ according to the recursive teaching model. For any finite concept class $\mathcal C \subseteq \{0,1\}^n$ with $\mathrm{VCD}(\mathcal C)=d$, Simon & Zilles (2015) posed an open problem $\mathrm{RTD}(\mathcal C) = O(d)$, i.e., is RTD linearly upper bounded by VCD? Previously, the best known result is an exponential upper bound $\mathrm{RTD}(\mathcal C) = O(d \cdot 2^d)$, due to Chen et al. (2016). In this paper, we show a quadratic upper bound: $\mathrm{RTD}(\mathcal C) = O(d^2)$, much closer to an answer to the open problem. We also discuss the challenges in fully solving the problem. |
Tasks | |
Published | 2017-02-18 |
URL | http://arxiv.org/abs/1702.05677v1 |
http://arxiv.org/pdf/1702.05677v1.pdf | |
PWC | https://paperswithcode.com/paper/quadratic-upper-bound-for-recursive-teaching |
Repo | |
Framework | |
Evaluating Deep Convolutional Neural Networks for Material Classification
Title | Evaluating Deep Convolutional Neural Networks for Material Classification |
Authors | Grigorios Kalliatakis, Georgios Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Anca Sticlaru, Klaus D. McDonald-Maier |
Abstract | Determining the material category of a surface from an image is a demanding task in perception that is drawing increasing attention. Following the recent remarkable results achieved for image classification and object detection utilising Convolutional Neural Networks (CNNs), we empirically study material classification of everyday objects employing these techniques. More specifically, we conduct a rigorous evaluation of how state-of-the-art CNN architectures compare on a common ground over widely used material databases. Experimental results on three challenging material databases show that the best performing CNN architectures can achieve up to 94.99% mean average precision when classifying materials. |
Tasks | Image Classification, Material Classification, Object Detection |
Published | 2017-03-12 |
URL | http://arxiv.org/abs/1703.04101v2 |
http://arxiv.org/pdf/1703.04101v2.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-deep-convolutional-neural-networks |
Repo | |
Framework | |
Foreground Detection in Camouflaged Scenes
Title | Foreground Detection in Camouflaged Scenes |
Authors | Shuai Li, Dinei Florencio, Yaqin Zhao, Chris Cook, Wanqing Li |
Abstract | Foreground detection has been widely studied for decades due to its importance in many practical applications. Most of the existing methods assume foreground and background show visually distinct characteristics and thus the foreground can be detected once a good background model is obtained. However, there are many situations where this is not the case. Of particular interest in video surveillance is the camouflage case. For example, an active attacker camouflages by intentionally wearing clothes that are visually similar to the background. In such cases, even given a decent background model, it is not trivial to detect foreground objects. This paper proposes a texture guided weighted voting (TGWV) method which can efficiently detect foreground objects in camouflaged scenes. The proposed method employs the stationary wavelet transform to decompose the image into frequency bands. We show that the small and hardly noticeable differences between foreground and background in the image domain can be effectively captured in certain wavelet frequency bands. To make the final foreground decision, a weighted voting scheme is developed based on intensity and texture of all the wavelet bands with weights carefully designed. Experimental results demonstrate that the proposed method achieves superior performance compared to the current state-of-the-art results. |
Tasks | |
Published | 2017-07-11 |
URL | http://arxiv.org/abs/1707.03166v1 |
http://arxiv.org/pdf/1707.03166v1.pdf | |
PWC | https://paperswithcode.com/paper/foreground-detection-in-camouflaged-scenes |
Repo | |
Framework | |
SfM-Net: Learning of Structure and Motion from Video
Title | SfM-Net: Learning of Structure and Motion from Video |
Authors | Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki |
Abstract | We propose SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), differentiably warps frames in time to match pixels and back-propagates. The model can be trained with various degrees of supervision: 1) self-supervised by the re-projection photometric error (completely unsupervised), 2) supervised by ego-motion (camera motion), or 3) supervised by depth (e.g., as provided by RGBD sensors). SfM-Net extracts meaningful depth estimates and successfully estimates frame-to-frame camera rotations and translations. It often successfully segments the moving objects in the scene, even though such supervision is never provided. |
Tasks | Motion Estimation, Optical Flow Estimation |
Published | 2017-04-25 |
URL | http://arxiv.org/abs/1704.07804v1 |
http://arxiv.org/pdf/1704.07804v1.pdf | |
PWC | https://paperswithcode.com/paper/sfm-net-learning-of-structure-and-motion-from |
Repo | |
Framework | |
Multi-Modal Trip Hazard Affordance Detection On Construction Sites
Title | Multi-Modal Trip Hazard Affordance Detection On Construction Sites |
Authors | Sean McMahon, Niko Sünderhauf, Ben Upcroft, Michael Milford |
Abstract | Trip hazards are a significant contributor to accidents on construction and manufacturing sites, where over a third of Australian workplace injuries occur [1]. Current safety inspections are labour intensive and limited by human fallibility, making automation of trip hazard detection appealing from both a safety and economic perspective. Trip hazards present an interesting challenge to modern learning techniques because they are defined as much by affordance as by object type; for example wires on a table are not a trip hazard, but can be if lying on the ground. To address these challenges, we conduct a comprehensive investigation into the performance characteristics of 11 different colour and depth fusion approaches, including 4 fusion and one non-fusion approach; using colour and two types of depth images. Trained and tested on over 600 labelled trip hazards over 4 floors and 2000 m$^{2}$ in an active construction site, this approach was able to differentiate between identical objects in different physical configurations (see Figure 1). Outperforming a colour-only detector, our multi-modal trip detector fuses colour and depth information to achieve a 4% absolute improvement in F1-score. These investigative results and the extensive publicly available dataset move us one step closer to assistive or fully automated safety inspection systems on construction sites. |
Tasks | |
Published | 2017-06-21 |
URL | http://arxiv.org/abs/1706.06718v1 |
http://arxiv.org/pdf/1706.06718v1.pdf | |
PWC | https://paperswithcode.com/paper/multi-modal-trip-hazard-affordance-detection |
Repo | |
Framework | |
Fuzzy Clustering Data Given in the Ordinal Scale
Title | Fuzzy Clustering Data Given in the Ordinal Scale |
Authors | Zhengbing Hu, Yevgeniy V. Bodyanskiy, Oleksii K. Tyshchenko, Viktoriia O. Samitova |
Abstract | A fuzzy clustering algorithm for multidimensional data is proposed in this article. The data is described by vectors whose components are linguistic variables defined in an ordinal scale. The obtained results confirm the efficiency of the proposed approach. |
Tasks | |
Published | 2017-01-13 |
URL | http://arxiv.org/abs/1701.03571v1 |
http://arxiv.org/pdf/1701.03571v1.pdf | |
PWC | https://paperswithcode.com/paper/fuzzy-clustering-data-given-in-the-ordinal |
Repo | |
Framework | |
Robust Non-Rigid Registration with Reweighted Position and Transformation Sparsity
Title | Robust Non-Rigid Registration with Reweighted Position and Transformation Sparsity |
Authors | Kun Li, Jingyu Yang, Yu-Kun Lai, Daoliang Guo |
Abstract | Non-rigid registration is challenging because it is ill-posed with high degrees of freedom and is thus sensitive to noise and outliers. We propose a robust non-rigid registration method using reweighted sparsities on position and transformation to estimate the deformations between 3-D shapes. We formulate the energy function with position and transformation sparsity on both the data term and the smoothness term, and define the smoothness constraint using local rigidity. The double sparsity based non-rigid registration model is enhanced with a reweighting scheme, and solved by transferring the model into four alternately-optimized subproblems which have exact solutions and guaranteed convergence. Experimental results on both public datasets and real scanned datasets show that our method outperforms the state-of-the-art methods and is more robust to noise and outliers than conventional non-rigid registration methods. |
Tasks | |
Published | 2017-03-15 |
URL | https://arxiv.org/abs/1703.04861v2 |
https://arxiv.org/pdf/1703.04861v2.pdf | |
PWC | https://paperswithcode.com/paper/robust-non-rigid-registration-with-reweighted |
Repo | |
Framework | |
$L_2$Boosting for Economic Applications
Title | $L_2$Boosting for Economic Applications |
Authors | Ye Luo, Martin Spindler |
Abstract | In recent years more and more high-dimensional data sets, where the number of parameters $p$ is high compared to the number of observations $n$ or even larger, are available for applied researchers. Boosting algorithms represent one of the major advances in machine learning and statistics in recent years and are suitable for the analysis of such data sets. While Lasso has been applied very successfully for high-dimensional data sets in Economics, boosting has been underutilized in this field, although it has been proven very powerful in fields like Biostatistics and Pattern Recognition. We attribute this to missing theoretical results for boosting. The goal of this paper is to fill this gap and show that boosting is a competitive method for inference of a treatment effect or instrumental variable (IV) estimation in a high-dimensional setting. First, we present the $L_2$Boosting with componentwise least squares algorithm and variants which are tailored for regression problems, which are the workhorse for most Econometric problems. Then we show how $L_2$Boosting can be used for estimation of treatment effects and IV estimation. We highlight the methods and illustrate them with simulations and empirical examples. For further results and technical details we refer to Luo and Spindler (2016, 2017) and to the online supplement of the paper. |
Tasks | |
Published | 2017-02-10 |
URL | http://arxiv.org/abs/1702.03244v1 |
http://arxiv.org/pdf/1702.03244v1.pdf | |
PWC | https://paperswithcode.com/paper/l_2boosting-for-economic-applications |
Repo | |
Framework | |
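The $L_2$Boosting with componentwise least squares algorithm named in the abstract above can be sketched in a few lines. This is the textbook form of the procedure (fit each single predictor to the current residuals, keep the best one, take a shrunken step), written under assumptions: the function name `l2_boost`, the step size `nu`, the fixed number of `steps`, and the toy data are all hypothetical, and the paper's variants and stopping rules are not reproduced.

```python
def l2_boost(X, y, steps=200, nu=0.1):
    """Componentwise least-squares L2Boosting on a list-of-rows design matrix X.

    Returns (intercept, coefficients). Columns are centred internally.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    resid = [yi - y_mean for yi in y]  # start from the constant fit
    coef = [0.0] * p
    for _ in range(steps):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j, col in enumerate(cols):
            sxx = sum(c * c for c in col)
            if sxx == 0.0:
                continue  # constant column carries no information
            sxr = sum(c * r for c, r in zip(col, resid))
            gain = sxr * sxr / sxx  # reduction in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, sxr / sxx, gain
        if best_j is None:
            break  # no predictor reduces the residuals any further
        coef[best_j] += nu * best_beta  # shrunken update of one coefficient
        resid = [r - nu * best_beta * c for r, c in zip(resid, cols[best_j])]
    intercept = y_mean - sum(b * m for b, m in zip(coef, col_means))
    return intercept, coef


# Toy data (assumption): y depends only on the first predictor, y = 1 + 2 * x0.
X = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
y = [-1.0, -1.0, 3.0, 3.0]
intercept, coef = l2_boost(X, y)
```

On this toy data the procedure selects only the first predictor and its coefficient approaches 2, while the second stays at 0. In practice the number of steps acts as the regularization parameter and would be chosen by a stopping criterion rather than fixed in advance.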