Paper Group ANR 217
Hierarchical Auto-Regressive Model for Image Compression Incorporating Object Saliency and a Deep Perceptual Loss. Neural Networks for Encoding Dynamic Security-Constrained Optimal Power Flow to Mixed-Integer Linear Programs. Uncertainty Estimation for End-To-End Learned Dense Stereo Matching via Probabilistic Deep Learning. A hybrid algorithm for …
Hierarchical Auto-Regressive Model for Image Compression Incorporating Object Saliency and a Deep Perceptual Loss
Title | Hierarchical Auto-Regressive Model for Image Compression Incorporating Object Saliency and a Deep Perceptual Loss |
Authors | Yash Patel, Srikar Appalaraju, R. Manmatha |
Abstract | We propose a new end-to-end trainable model for lossy image compression which includes a number of novel components. This approach incorporates 1) a hierarchical auto-regressive model; 2) it also incorporates saliency in the images and focuses on reconstructing the salient regions better; 3) in addition, we empirically demonstrate that the popularly used evaluation metrics such as MS-SSIM and PSNR are inadequate for judging the performance of deep learned image compression techniques as they do not align well with human perceptual similarity. We therefore propose an alternative metric, which is learned on perceptual similarity data specific to image compression. Our experiments show that this new metric aligns significantly better with human judgments when compared to other hand-crafted or learned metrics. The proposed compression model not only generates images that are visually better but also gives superior performance for subsequent computer vision tasks such as object detection and segmentation when compared to other engineered or learned codecs. |
Tasks | Image Compression, Object Detection |
Published | 2020-02-12 |
URL | https://arxiv.org/abs/2002.04988v1, https://arxiv.org/pdf/2002.04988v1.pdf |
PWC | https://paperswithcode.com/paper/hierarchical-auto-regressive-model-for-image |
Repo | |
Framework | |
Neural Networks for Encoding Dynamic Security-Constrained Optimal Power Flow to Mixed-Integer Linear Programs
Title | Neural Networks for Encoding Dynamic Security-Constrained Optimal Power Flow to Mixed-Integer Linear Programs |
Authors | Andreas Venzke, Daniel Timon Viola, Jeanne Mermet-Guyennet, George S. Misyris, Spyros Chatzivasileiadis |
Abstract | This paper introduces a framework to capture previously intractable optimization constraints and transform them to a mixed-integer linear program, through the use of neural networks. We encode the feasible space of optimization problems characterized by both tractable and intractable constraints, e.g. differential equations, to a neural network. Leveraging an exact mixed-integer reformulation of neural networks, we solve mixed-integer linear programs that accurately approximate solutions to the originally intractable non-linear optimization problem. We apply our methods to the AC optimal power flow problem (AC-OPF), where directly including dynamic security constraints renders the AC-OPF intractable. Our proposed approach has the potential to be significantly more scalable than traditional approaches. We demonstrate our approach for power system operation considering N-1 security and small-signal stability, showing how it can efficiently obtain cost-optimal solutions which at the same time satisfy both static and dynamic security constraints. |
Tasks | |
Published | 2020-03-17 |
URL | https://arxiv.org/abs/2003.07939v3, https://arxiv.org/pdf/2003.07939v3.pdf |
PWC | https://paperswithcode.com/paper/neural-networks-for-encoding-dynamic-security |
Repo | |
Framework | |
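The abstract above relies on an exact mixed-integer reformulation of a neural network. As a rough illustration only, and assuming ReLU activations (the abstract does not name the activation), a single unit $y = \max(0, \hat{y})$ with known bounds $\hat{y}^{\min} < 0 < \hat{y}^{\max}$ can be written with one binary variable $b$. The symbols below are illustrative and are not taken from the paper.

```latex
\begin{aligned}
y &\ge 0, \\
y &\ge \hat{y}, \\
y &\le \hat{y} - \hat{y}^{\min}\,(1 - b), \\
y &\le \hat{y}^{\max}\, b, \\
b &\in \{0, 1\}.
\end{aligned}
```

With $b = 1$ the constraints force $y = \hat{y} \ge 0$, and with $b = 0$ they force $y = 0$ and $\hat{y} \le 0$, so the linear system reproduces the ReLU exactly within the assumed bounds.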
Uncertainty Estimation for End-To-End Learned Dense Stereo Matching via Probabilistic Deep Learning
Title | Uncertainty Estimation for End-To-End Learned Dense Stereo Matching via Probabilistic Deep Learning |
Authors | Max Mehltretter |
Abstract | Motivated by the need to identify erroneous disparity assignments, various approaches for uncertainty and confidence estimation of dense stereo matching have been presented in recent years. As in many other fields, deep learning based methods in particular have shown convincing results. However, most of these methods only model the uncertainty contained in the data, while ignoring the uncertainty of the employed dense stereo matching procedure. Additionally modelling the latter, however, is particularly beneficial if the domain of the training data varies from that of the data to be processed. For this purpose, in the present work the idea of probabilistic deep learning is applied to the task of dense stereo matching for the first time. Based on the well-known and commonly employed GC-Net architecture, a novel probabilistic neural network is presented for the task of joint depth and uncertainty estimation from epipolar rectified stereo image pairs. Instead of learning the network parameters directly, the proposed probabilistic neural network learns a probability distribution from which parameters are sampled for every prediction. The variations between multiple such predictions on the same image pair allow the model uncertainty to be approximated. The quality of the estimated depth and uncertainty information is assessed in an extensive evaluation on three different datasets. |
Tasks | Stereo Matching |
Published | 2020-02-10 |
URL | https://arxiv.org/abs/2002.03663v1, https://arxiv.org/pdf/2002.03663v1.pdf |
PWC | https://paperswithcode.com/paper/uncertainty-estimation-for-end-to-end-learned |
Repo | |
Framework | |
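The sampling idea in the abstract above (draw parameters from a learned distribution for every prediction, then read the model uncertainty off the spread of the predictions) can be sketched on a toy model. This is a minimal sketch, not the paper's network: the one-weight linear model, the Gaussian weight distribution, and the function names are all assumptions made for illustration.

```python
import random
import statistics


def sample_predictions(x, weight_mean, weight_std, n_samples, rng):
    """Run a toy model y = w * x n_samples times, drawing w ~ N(mean, std) each time."""
    return [rng.gauss(weight_mean, weight_std) * x for _ in range(n_samples)]


def predict_with_uncertainty(x, weight_mean, weight_std, n_samples=200, seed=0):
    """Return (mean prediction, variance across sampled predictions).

    The variance across repeated predictions on the same input serves as
    an approximation of the model uncertainty.
    """
    rng = random.Random(seed)
    preds = sample_predictions(x, weight_mean, weight_std, n_samples, rng)
    return statistics.mean(preds), statistics.pvariance(preds)
```

A wider weight distribution yields a larger variance for the same input, which is the behaviour one would expect of a model-uncertainty estimate.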
A hybrid algorithm for disparity calculation from sparse disparity estimates based on stereo vision
Title | A hybrid algorithm for disparity calculation from sparse disparity estimates based on stereo vision |
Authors | Subhayan Mukherjee, Ram Mohana Reddy Guddeti |
Abstract | In this paper, we propose a novel method for stereo disparity estimation by combining the existing methods of block based and region based stereo matching. Our method can generate dense disparity maps from disparity measurements of only 18% of the pixels of either the left or the right image of a stereo image pair. It works by segmenting the lightness values of image pixels using a fast implementation of K-Means clustering. It then refines those segment boundaries by morphological filtering and connected components analysis, thus removing a lot of redundant boundary pixels. This is followed by determining the boundaries’ disparities by the SAD cost function. Lastly, we reconstruct the entire disparity map of the scene from the boundaries’ disparities through disparity propagation along the scan lines and disparity prediction of regions of uncertainty by considering disparities of the neighboring regions. Experimental results on the Middlebury stereo vision dataset demonstrate that the proposed method outperforms traditional disparity determination methods like SAD and NCC by up to 30% and achieves an improvement of 2.6% when compared to a recent approach based on absolute difference (AD) cost function for disparity calculations [1]. |
Tasks | Disparity Estimation, Stereo Matching |
Published | 2020-01-20 |
URL | https://arxiv.org/abs/2001.06967v1, https://arxiv.org/pdf/2001.06967v1.pdf |
PWC | https://paperswithcode.com/paper/a-hybrid-algorithm-for-disparity-calculation |
Repo | |
Framework | |
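The SAD (sum of absolute differences) cost mentioned in the abstract above is the standard block-matching cost. The sketch below shows only that one step, on plain nested lists; the function names, the block half-width, and the search range are assumptions, and the paper's segmentation and propagation stages are not reproduced.

```python
def sad(left, right, row, col, d, half):
    """Sum of absolute differences between the block centred at (row, col)
    in `left` and the block shifted d pixels to the left in `right`."""
    total = 0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            total += abs(left[r][c] - right[r][c - d])
    return total


def best_disparity(left, right, row, col, max_d, half=1):
    """Pick the disparity in [0, max_d] with the lowest SAD cost."""
    # Keep only disparities for which the shifted block stays inside the image.
    candidates = [d for d in range(max_d + 1) if col - half - d >= 0]
    return min(candidates, key=lambda d: sad(left, right, row, col, d, half))
```

The caller is expected to pass a pixel whose block lies fully inside both images; no bounds checking is done on the rows or on the right-hand edge.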
Model-Based Reinforcement Learning for Physical Systems Without Velocity and Acceleration Measurements
Title | Model-Based Reinforcement Learning for Physical Systems Without Velocity and Acceleration Measurements |
Authors | Alberto Dalla Libera, Diego Romeres, Devesh K. Jha, Bill Yerazunis, Daniel Nikovski |
Abstract | In this paper, we propose a derivative-free model learning framework for Reinforcement Learning (RL) algorithms based on Gaussian Process Regression (GPR). In many mechanical systems, only positions can be measured by the sensing instruments. Then, instead of representing the system state as suggested by the physics with a collection of positions, velocities, and accelerations, we define the state as the set of past position measurements. However, the equations of motion derived from physical first principles cannot be directly applied in this framework, being functions of velocities and accelerations. For this reason, we introduce a novel derivative-free physically-inspired kernel, which can be easily combined with nonparametric derivative-free Gaussian Process models. Tests performed on two real platforms show that the considered state definition combined with the proposed model improves estimation performance and data-efficiency w.r.t. traditional models based on GPR. Finally, we validate the proposed framework by solving two RL control problems for two real robotic systems. |
Tasks | |
Published | 2020-02-25 |
URL | https://arxiv.org/abs/2002.10621v1, https://arxiv.org/pdf/2002.10621v1.pdf |
PWC | https://paperswithcode.com/paper/model-based-reinforcement-learning-for-2 |
Repo | |
Framework | |
A Sparsity Inducing Nuclear-Norm Estimator (SpINNEr) for Matrix-Variate Regression in Brain Connectivity Analysis
Title | A Sparsity Inducing Nuclear-Norm Estimator (SpINNEr) for Matrix-Variate Regression in Brain Connectivity Analysis |
Authors | Damian Brzyski, Xixi Hu, Joaquin Goni, Beau Ances, Timothy W. Randolph, Jaroslaw Harezlak |
Abstract | Classical scalar-response regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. In medical applications, however, regressors are often in the form of multi-dimensional arrays. For example, one may be interested in using MRI to identify which brain regions are associated with a health outcome. Vectorizing the two-dimensional image arrays is an unsatisfactory approach since it destroys the inherent spatial structure of the images and can be computationally challenging. We present an alternative approach, regularized matrix regression, where the matrix of regression coefficients is defined as a solution to a specific optimization problem. The method, called SParsity Inducing Nuclear Norm EstimatoR (SpINNEr), simultaneously imposes two penalty types on the regression coefficient matrix (the nuclear norm and the lasso norm) to encourage a low rank matrix solution that also has entry-wise sparsity. A specific implementation of the alternating direction method of multipliers (ADMM) is used to build a fast and efficient numerical solver. Our simulations show that SpINNEr outperforms other methods in estimation accuracy when the response-related entries (representing the brain’s functional connectivity) are arranged in well-connected communities. SpINNEr is applied to investigate associations between HIV-related outcomes and functional connectivity in the human brain. |
Tasks | |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11548v1, https://arxiv.org/pdf/2001.11548v1.pdf |
PWC | https://paperswithcode.com/paper/a-sparsity-inducing-nuclear-norm-estimator |
Repo | |
Framework | |
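For orientation, a doubly penalized matrix regression of the kind the abstract above describes could be written as follows. This is a simplified sketch under assumptions: $y_i$ is the scalar response, $A_i$ the matrix-valued regressor of subject $i$, $B$ the coefficient matrix, and $\lambda_N, \lambda_L \ge 0$ the two tuning parameters. The notation is illustrative, and any additional covariates or weighting used in the paper are omitted.

```latex
\hat{B} = \arg\min_{B}\;
  \frac{1}{2}\sum_{i=1}^{n}\bigl(y_i - \langle A_i, B\rangle\bigr)^2
  + \lambda_N \,\|B\|_{*}
  + \lambda_L \,\|B\|_{1}
```

Here $\|B\|_{*}$ is the nuclear norm (sum of singular values), which encourages low rank, and $\|B\|_{1}$ is the entry-wise lasso norm, which encourages sparsity.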
The Impact of Hole Geometry on Relative Robustness of In-Painting Networks: An Empirical Study
Title | The Impact of Hole Geometry on Relative Robustness of In-Painting Networks: An Empirical Study |
Authors | Masood S. Mortazavi, Ning Yan |
Abstract | In-painting networks use existing pixels to generate appropriate pixels to fill “holes” placed on parts of an image. A 2-D in-painting network’s input usually consists of (1) a three-channel 2-D image, and (2) an additional channel for the “holes” to be in-painted in that image. In this paper, we study the robustness of a given in-painting neural network against variations in hole geometry distributions. We observe that the robustness of an in-painting network is dependent on the probability distribution function (PDF) of the hole geometry presented to it during its training even if the underlying image dataset used (in training and testing) does not change. We develop an experimental methodology for testing and evaluating relative robustness of in-painting networks against four different kinds of hole geometry PDFs. We examine a number of hypotheses regarding (1) the natural bias of in-painting networks to the hole distribution used for their training, (2) the underlying dataset’s ability to differentiate relative robustness as hole distributions vary in a train-test (cross-comparison) grid, and (3) the impact of the directional distribution of edges in the holes and in the image dataset. We present results for L1, PSNR and SSIM quality metrics and develop a specific measure of relative in-painting robustness to be used in cross-comparison grids based on these quality metrics. (One can incorporate other quality metrics in this relative measure.) The empirical work reported here is an initial step in a broader and deeper investigation of “filling the blank” neural networks’ sensitivity, robustness and regularization with respect to hole “geometry” PDFs, and it suggests further research in this domain. |
Tasks | |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.02314v1, https://arxiv.org/pdf/2003.02314v1.pdf |
PWC | https://paperswithcode.com/paper/the-impact-of-hole-geometry-on-relative |
Repo | |
Framework | |
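Two of the quality metrics named in the abstract above, L1 and PSNR, have standard textbook definitions that are short enough to sketch. The version below works on flat lists of pixel values and assumes an 8-bit range by default; the function names are illustrative and the paper's relative robustness measure is not reproduced.

```python
import math


def l1_error(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Lower L1 and higher PSNR both indicate an in-painted result closer to the reference image.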
Automated Pavement Crack Segmentation Using Fully Convolutional U-Net with a Pretrained ResNet-34 Encoder
Title | Automated Pavement Crack Segmentation Using Fully Convolutional U-Net with a Pretrained ResNet-34 Encoder |
Authors | Stephen L. H. Lau, Xin Wang, Xu Yang, Edwin K. P. Chong |
Abstract | Automated pavement crack segmentation is a challenging task because of inherent irregular patterns and lighting conditions, in addition to the presence of noise in images. Conventional approaches require a substantial amount of feature engineering to differentiate crack regions from non-affected regions. In this paper, we propose a deep learning technique based on a convolutional neural network to perform segmentation tasks on pavement crack images. Our approach requires minimal feature engineering compared to other machine learning techniques. The proposed neural network architecture is a modified U-Net in which the encoder is replaced with a pretrained ResNet-34 network. To minimize the Dice coefficient loss function, we optimize the parameters in the neural network by using an adaptive moment optimizer called AdamW. Additionally, we use a systematic method to find the optimum learning rate instead of doing parametric sweeps. We used a “one-cycle” training schedule based on cyclical learning rates to speed up the convergence. We evaluated the performance of our convolutional neural network on CFD, a pavement crack image dataset. Our method achieved an F1 score of about 96%. This is the best performance among all algorithms tested on this dataset, outperforming the previous best method by a 1.7% margin. |
Tasks | Feature Engineering |
Published | 2020-01-07 |
URL | https://arxiv.org/abs/2001.01912v3, https://arxiv.org/pdf/2001.01912v3.pdf |
PWC | https://paperswithcode.com/paper/automated-pavement-crack-segmentation-using |
Repo | |
Framework | |
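The Dice coefficient loss named in the abstract above is commonly defined as one minus twice the overlap divided by the total mass of prediction and target. The following is a minimal sketch of that common soft form on flat lists of values in [0, 1]; the smoothing constant `eps` and the function name are assumptions, and the paper's exact variant may differ.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - (2 * sum(p * t) + eps) / (sum(p) + sum(t) + eps)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # eps keeps the ratio defined when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)
```

A perfect match gives a loss of 0 and disjoint masks give a loss close to 1, which makes the loss less sensitive than pixel-wise accuracy to the heavy class imbalance typical of thin crack masks.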
MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
Title | MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships |
Authors | Yongjian Chen, Lei Tai, Kai Sun, Mingyang Li |
Abstract | Monocular 3D object detection is an essential component of autonomous driving but is challenging to solve, especially for occluded samples which are only partially visible. Most detectors consider each 3D object as an independent training target, inevitably resulting in a lack of useful information for occluded samples. To this end, we propose a novel method to improve monocular 3D object detection by considering the relationship of paired samples. This allows us to encode spatial constraints for partially-occluded objects from their adjacent neighbors. Specifically, the proposed detector computes uncertainty-aware predictions for object locations and 3D distances for the adjacent object pairs, which are subsequently jointly optimized by nonlinear least squares. Finally, the one-stage uncertainty-aware prediction structure and the post-optimization module are purposely integrated to ensure run-time efficiency. Experiments demonstrate that our method yields the best performance on the KITTI 3D detection benchmark, outperforming state-of-the-art competitors by wide margins, especially for the hard samples. |
Tasks | 3D Object Detection, Autonomous Driving, Object Detection |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.00504v1, https://arxiv.org/pdf/2003.00504v1.pdf |
PWC | https://paperswithcode.com/paper/monopair-monocular-3d-object-detection-using |
Repo | |
Framework | |
A Deep Neural Networks Approach for Pixel-Level Runway Pavement Crack Segmentation Using Drone-Captured Images
Title | A Deep Neural Networks Approach for Pixel-Level Runway Pavement Crack Segmentation Using Drone-Captured Images |
Authors | Liming Jiang, Yuanchang Xie, Tianzhu Ren |
Abstract | Pavement conditions are a critical aspect of asset management and directly affect safety. This study introduces a deep neural network method called U-Net for pavement crack segmentation based on drone-captured images to reduce the cost and time needed for airport runway inspection. The proposed approach can also be used for highway pavement condition assessment during off-peak periods when there are few vehicles on the road. In this study, runway pavement images are collected using a drone at various heights from the Fitchburg Municipal Airport (FMA) in Massachusetts to evaluate their quality and applicability for crack segmentation, from which an optimal height is determined. Drone images captured at the optimal height are then used to evaluate the crack segmentation performance of the U-Net model. Deep learning methods typically require a huge amount of annotated training data for model development, which can be a major obstacle for their applications. An online annotated pavement image dataset is used together with the FMA data to train the U-Net model. The results show that U-Net performs well on the FMA testing data even with limited FMA training images, suggesting that it has good generalization ability and great potential to be used for both airport runways and highway pavements. |
Tasks | |
Published | 2020-01-09 |
URL | https://arxiv.org/abs/2001.03257v1, https://arxiv.org/pdf/2001.03257v1.pdf |
PWC | https://paperswithcode.com/paper/a-deep-neural-networks-approach-for-pixel |
Repo | |
Framework | |
Deep Learning for ECG Segmentation
Title | Deep Learning for ECG Segmentation |
Authors | Viktor Moskalenko, Nikolai Zolotykh, Grigory Osipov |
Abstract | We propose an algorithm for electrocardiogram (ECG) segmentation using a UNet-like fully convolutional neural network. The algorithm receives an ECG signal with an arbitrary sampling rate as input, and gives a list of onsets and offsets of P and T waves and QRS complexes as output. Our method of segmentation differs from others in its speed, small number of parameters and good generalization: it is adaptive to different sampling rates and it generalizes to various types of ECG monitors. The proposed approach is superior to other state-of-the-art segmentation methods in terms of quality. In particular, F1-measures for detection of onsets and offsets of P and T waves and for QRS-complexes are at least 97.8%, 99.5%, and 99.9%, respectively. |
Tasks | |
Published | 2020-01-14 |
URL | https://arxiv.org/abs/2001.04689v1, https://arxiv.org/pdf/2001.04689v1.pdf |
PWC | https://paperswithcode.com/paper/deep-learning-for-ecg-segmentation |
Repo | |
Framework | |
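The abstract above describes an output in the form of onsets and offsets for each wave type. One plausible post-processing step, assuming the network emits one class label per sample, is to collapse runs of equal labels into time intervals. The label coding (0 for background, other integers for P, QRS and T) and the function name below are hypothetical and not taken from the paper.

```python
def mask_to_intervals(labels, sampling_rate_hz, background=0):
    """Collapse a per-sample label sequence into (label, onset_s, offset_s) tuples.

    Onset and offset are the times, in seconds, of the first and last
    sample of each run of identical non-background labels.
    """
    intervals = []
    run_label, run_start = background, 0
    # A trailing background sentinel closes a run that reaches the end.
    for i, lab in enumerate(list(labels) + [background]):
        if lab != run_label:
            if run_label != background:
                intervals.append(
                    (run_label, run_start / sampling_rate_hz, (i - 1) / sampling_rate_hz)
                )
            run_label, run_start = lab, i
    return intervals
```

Because the conversion divides sample indices by the sampling rate, the same routine applies unchanged to recordings taken at different rates.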
Reconfigurable Design for Omni-adaptive Grasp Learning
Title | Reconfigurable Design for Omni-adaptive Grasp Learning |
Authors | Fang Wan, Haokun Wang, Jiyuan Wu, Yujia Liu, Sheng Ge, Chaoyang Song |
Abstract | The engineering design of robotic grippers presents an ample design space for optimization towards robust grasping. In this paper, we adopt the reconfigurable design of the robotic gripper using a novel soft finger structure with omni-directional adaptation, which generates a large number of possible gripper configurations by rearranging these fingers. Such reconfigurable design with these omni-adaptive fingers enables us to systematically investigate the optimal arrangement of the fingers towards robust grasping. Furthermore, we adopt a learning-based method as the baseline to benchmark the effectiveness of each design configuration. As a result, we found that the 3-finger and 4-finger radial configurations are the most effective, achieving an average 96% grasp success rate on seen and novel objects selected from the YCB dataset. We also discuss the influence of the frictional surface on the finger in improving grasp robustness. |
Tasks | |
Published | 2020-02-29 |
URL | https://arxiv.org/abs/2003.01582v1, https://arxiv.org/pdf/2003.01582v1.pdf |
PWC | https://paperswithcode.com/paper/reconfigurable-design-for-omni-adaptive-grasp |
Repo | |
Framework | |
Shape retrieval of non-rigid 3d human models
Title | Shape retrieval of non-rigid 3d human models |
Authors | David Pickup, Xianfang Sun, Paul L Rosin, Ralph R Martin, Z Cheng, Zhouhui Lian, Masaki Aono, A Ben Hamza, A Bronstein, M Bronstein, S Bu, Umberto Castellani, S Cheng, Valeria Garro, Andrea Giachetti, Afzal Godil, Luca Isaia, J Han, Henry Johan, L Lai, Bo Li, C Li, Haisheng Li, Roee Litman, X Liu, Z Liu, Yijuan Lu, L Sun, G Tam, Atsushi Tatsuma, J Ye |
Abstract | 3D models of humans are commonly used within computer graphics and vision, and so the ability to distinguish between body shapes is an important shape retrieval problem. We extend our recent paper which provided a benchmark for testing non-rigid 3D shape retrieval algorithms on 3D human models. This benchmark provided a far stricter challenge than previous shape benchmarks. We have added 145 new models for use as a separate training set, in order to standardise the training data used and provide a fairer comparison. We have also included experiments with the FAUST dataset of human scans. All participants of the previous benchmark study have taken part in the new tests reported here, many providing updated results using the new data. In addition, further participants have also taken part, and we provide extra analysis of the retrieval results. A total of 25 different shape retrieval methods are compared. |
Tasks | 3D Shape Retrieval |
Published | 2020-03-01 |
URL | https://arxiv.org/abs/2003.08763v1, https://arxiv.org/pdf/2003.08763v1.pdf |
PWC | https://paperswithcode.com/paper/shape-retrieval-of-non-rigid-3d-human-models |
Repo | |
Framework | |
Privacy-Preserving Gaussian Process Regression – A Modular Approach to the Application of Homomorphic Encryption
Title | Privacy-Preserving Gaussian Process Regression – A Modular Approach to the Application of Homomorphic Encryption |
Authors | Peter Fenner, Edward O. Pyzer-Knapp |
Abstract | Much of machine learning relies on the use of large amounts of data to train models to make predictions. When this data comes from multiple sources, for example when evaluation of data against a machine learning model is offered as a service, there can be privacy issues and legal concerns over the sharing of data. Fully homomorphic encryption (FHE) allows data to be computed on whilst encrypted, which can provide a solution to the problem of data privacy. However, FHE is both slow and restrictive, so existing algorithms must be manipulated to make them work efficiently under the FHE paradigm. Some commonly used machine learning algorithms, such as Gaussian process regression, are poorly suited to FHE and cannot be manipulated to work both efficiently and accurately. In this paper, we show that a modular approach, which applies FHE to only the sensitive steps of a workflow that need protection, allows one party to make predictions on their data using a Gaussian process regression model built from another party’s data, without either party gaining access to the other’s data, in a way which is both accurate and efficient. This construction is, to our knowledge, the first example of an effectively encrypted Gaussian process. |
Tasks | |
Published | 2020-01-28 |
URL | https://arxiv.org/abs/2001.10893v1, https://arxiv.org/pdf/2001.10893v1.pdf |
PWC | https://paperswithcode.com/paper/privacy-preserving-gaussian-process |
Repo | |
Framework | |
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Title | Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence |
Authors | Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien |
Abstract | We propose a stochastic variant of the classical Polyak step-size (Polyak, 1987) commonly used in the subgradient method. Although computing the Polyak step-size requires knowledge of the optimal function values, this information is readily available for typical modern machine learning applications. Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD). We provide theoretical convergence guarantees for SGD equipped with SPS in different settings, including strongly convex, convex and non-convex functions. Furthermore, our analysis results in novel convergence guarantees for SGD with a constant step-size. We show that SPS is particularly effective when training over-parameterized models capable of interpolating the training data. In this setting, we prove that SPS enables SGD to converge to the true solution at a fast rate without requiring the knowledge of any problem-dependent constants or additional computational overhead. We experimentally validate our theoretical results via extensive experiments on synthetic and real datasets. We demonstrate the strong performance of SGD with SPS compared to state-of-the-art optimization methods when training over-parameterized models. |
Tasks | |
Published | 2020-02-24 |
URL | https://arxiv.org/abs/2002.10542v1, https://arxiv.org/pdf/2002.10542v1.pdf |
PWC | https://paperswithcode.com/paper/stochastic-polyak-step-size-for-sgd-an |
Repo | |
Framework | |
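The step-size rule in the abstract above can be illustrated on a toy problem. The sketch below assumes the usual Polyak form, step = (f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), applied to one-dimensional least-squares terms whose per-sample optimum f_i* is 0 (an interpolation setting). The constant `c`, the cap `gamma_max`, and the function name are illustrative choices, not values confirmed by the listing.

```python
import random


def sps_sgd(data, x0, c=0.5, gamma_max=10.0, steps=200, seed=0):
    """SGD on f_i(x) = 0.5 * (a_i * x - b_i)^2 with a capped stochastic Polyak step.

    Assumes every f_i can reach 0, so the optimal value f_i* is known to be 0.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        a, b = rng.choice(data)          # sample one term f_i
        residual = a * x - b
        loss = 0.5 * residual ** 2       # f_i(x)
        grad = a * residual              # f_i'(x)
        denom = c * grad ** 2
        if denom == 0.0:
            continue                     # this sample is already fit exactly
        step = min((loss - 0.0) / denom, gamma_max)  # f_i* = 0 by assumption
        x -= step * grad
    return x
```

For example, `sps_sgd([(1.0, 3.0), (2.0, 6.0), (0.5, 1.5)], x0=0.0)` drives `x` to the common solution 3.0 of the consistent system, without any hand-tuned learning rate.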