Paper Group ANR 377
Deep Autoencoder for Combined Human Pose Estimation and body Model Upscaling
Title | Deep Autoencoder for Combined Human Pose Estimation and body Model Upscaling |
Authors | Matthew Trumble, Andrew Gilbert, Adrian Hilton, John Collomosse |
Abstract | We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views. We train a symmetric convolutional autoencoder with a dual loss that enforces learning of a latent representation that encodes skeletal joint positions, and at the same time learns a deep representation of volumetric body shape. We harness the latter to up-scale input volumetric data by a factor of $4 \times$, whilst recovering a 3D estimate of joint positions with equal or greater accuracy than the state of the art. Inference runs in real-time (25 fps) and has the potential for passive human behaviour monitoring where there is a requirement for high fidelity estimation of human body shape and pose. |
Tasks | 3D Human Pose Estimation, Pose Estimation |
Published | 2018-07-04 |
URL | http://arxiv.org/abs/1807.01511v1 |
PDF | http://arxiv.org/pdf/1807.01511v1.pdf |
PWC | https://paperswithcode.com/paper/deep-autoencoder-for-combined-human-pose |
Repo | |
Framework | |
Towards Large-Scale Video Video Object Mining
Title | Towards Large-Scale Video Video Object Mining |
Authors | Aljosa Osep, Paul Voigtlaender, Jonathon Luiten, Stefan Breuers, Bastian Leibe |
Abstract | We propose to leverage a generic object tracker in order to perform object mining in large-scale unlabeled videos, captured in a realistic automotive setting. We present a dataset of more than 360’000 automatically mined object tracks from 10+ hours of video data (560’000 frames) and propose a method for automated novel category discovery and detector learning. In addition, we show preliminary results on using the mined tracks for object detector adaptation. |
Tasks | |
Published | 2018-09-19 |
URL | http://arxiv.org/abs/1809.07316v1 |
PDF | http://arxiv.org/pdf/1809.07316v1.pdf |
PWC | https://paperswithcode.com/paper/towards-large-scale-video-video-object-mining |
Repo | |
Framework | |
Fast Online Exact Solutions for Deterministic MDPs with Sparse Rewards
Title | Fast Online Exact Solutions for Deterministic MDPs with Sparse Rewards |
Authors | Joshua R. Bertram, Xuxi Yang, Peng Wei |
Abstract | Markov Decision Processes (MDPs) are a mathematical framework for modeling sequential decision making under uncertainty. The classical approaches for solving MDPs are well known and have been widely studied, some of which rely on approximation techniques to solve MDPs with large state space and/or action space. However, most of these classical solution approaches and their approximation techniques still take much computation time to converge and usually must be re-computed if the reward function is changed. This paper introduces a novel alternative approach for exactly and efficiently solving deterministic, continuous MDPs with sparse reward sources. When the environment is such that the “distance” between states can be determined in constant time, e.g. grid world, our algorithm offers $O( R^2 \times A^2 \times S)$ time complexity, where $R$ is the number of reward sources, $A$ is the number of actions, and $S$ is the number of states. Memory complexity for the algorithm is $O( S + R \times A)$. This new approach opens new avenues for boosting computational performance for certain classes of MDPs and is of tremendous value for MDP applications such as robotics and unmanned systems. This paper describes the algorithm and presents numerical experiment results to demonstrate its powerful computational performance. We also provide a rigorous mathematical description of the approach. |
Tasks | Decision Making, Decision Making Under Uncertainty |
Published | 2018-05-08 |
URL | http://arxiv.org/abs/1805.02785v3 |
PDF | http://arxiv.org/pdf/1805.02785v3.pdf |
PWC | https://paperswithcode.com/paper/fast-online-exact-solutions-for-deterministic |
Repo | |
Framework | |
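The abstract above relies on the distance between states being computable in constant time. The following is a minimal sketch of that idea for the simplest possible case only: one terminal reward source in an obstacle-free grid world, where the Manhattan distance gives the optimal value in closed form. It is not the paper's multi-reward algorithm; the function names, the reward convention (reward paid on entering the goal) and the grid setup are assumptions made for illustration. A plain value-iteration routine is included as a reference to compare against.

```python
from itertools import product


def closed_form_values(width, height, goal, reward, gamma):
    """Value of every cell for ONE terminal reward source (illustrative assumption).

    The reward is paid on entering the goal, so a cell at Manhattan
    distance d >= 1 is worth reward * gamma**(d - 1).
    """
    values = {}
    for x, y in product(range(width), range(height)):
        d = abs(x - goal[0]) + abs(y - goal[1])
        values[(x, y)] = 0.0 if d == 0 else reward * gamma ** (d - 1)
    return values


def value_iteration(width, height, goal, reward, gamma, sweeps=100):
    """Reference solution by repeated Bellman backups on the same grid."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    values = {s: 0.0 for s in product(range(width), range(height))}
    for _ in range(sweeps):
        updated = {}
        for (x, y) in values:
            if (x, y) == goal:
                updated[(x, y)] = 0.0  # terminal state
                continue
            candidates = []
            for dx, dy in moves:
                # Moves that would leave the grid keep the agent in place.
                nx = min(max(x + dx, 0), width - 1)
                ny = min(max(y + dy, 0), height - 1)
                r = reward if (nx, ny) == goal else 0.0
                candidates.append(r + gamma * values[(nx, ny)])
            updated[(x, y)] = max(candidates)
        values = updated
    return values
```

The closed form touches each state once, whereas value iteration needs at least as many sweeps as the largest distance to the goal; with several interacting reward sources the closed form above no longer applies.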
Causal Bandits with Propagating Inference
Title | Causal Bandits with Propagating Inference |
Authors | Akihiro Yabe, Daisuke Hatano, Hanna Sumita, Shinji Ito, Naonori Kakimura, Takuro Fukunaga, Ken-ichi Kawarabayashi |
Abstract | Bandit is a framework for designing sequential experiments. In each experiment, a learner selects an arm $A \in \mathcal{A}$ and obtains an observation corresponding to $A$. Theoretically, the tight regret lower-bound for the general bandit is polynomial with respect to the number of arms $\lvert\mathcal{A}\rvert$. This makes bandit incapable of handling an exponentially large number of arms, hence the bandit problem with side-information is often considered to overcome this lower bound. Recently, a bandit framework over a causal graph was introduced, where the structure of the causal graph is available as side-information. A causal graph is a fundamental model that is frequently used with a variety of real problems. In this setting, the arms are identified with interventions on a given causal graph, and the effect of an intervention propagates throughout the causal graph. The task is to find the best intervention that maximizes the expected value on a target node. Existing algorithms for causal bandit overcame the $\Omega(\sqrt{\lvert\mathcal{A}\rvert/T})$ simple-regret lower-bound; however, their algorithms work only when the interventions $\mathcal{A}$ are localized around a single node (i.e., an intervention propagates only to its neighbors). We propose a novel causal bandit algorithm for an arbitrary set of interventions, which can propagate throughout the causal graph. We also show that it achieves $O(\sqrt{ \gamma^*\log(\lvert\mathcal{A}\rvert T) / T})$ regret bound, where $\gamma^*$ is determined by using a causal graph structure. In particular, if the in-degree of the causal graph is bounded, then $\gamma^* = O(N^2)$, where $N$ is the number of nodes. |
Tasks | |
Published | 2018-06-06 |
URL | http://arxiv.org/abs/1806.02252v1 |
PDF | http://arxiv.org/pdf/1806.02252v1.pdf |
PWC | https://paperswithcode.com/paper/causal-bandits-with-propagating-inference |
Repo | |
Framework | |
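To see why the bound in the abstract above matters when the number of arms is exponential, the two rates can be compared numerically. This is only a rough sketch: all constants hidden by the $\Omega$ and $O$ notation are dropped, and $\gamma^*$ is replaced by $N^2$ as an order-of-magnitude stand-in for the bounded in-degree case. The function names are hypothetical.

```python
import math


def general_rate(num_arms, horizon):
    """sqrt(|A| / T): the general simple-regret lower-bound rate, constants dropped."""
    return math.sqrt(num_arms / horizon)


def causal_rate(gamma_star, num_arms, horizon):
    """sqrt(gamma* log(|A| T) / T): the rate stated in the abstract, constants dropped."""
    return math.sqrt(gamma_star * math.log(num_arms * horizon) / horizon)


# Hypothetical setting: N = 20 binary nodes, one arm per joint assignment.
num_nodes = 20
num_arms = 2 ** num_nodes
horizon = 10 ** 6
gamma_star = num_nodes ** 2  # stand-in for gamma* = O(N^2)

print(general_rate(num_arms, horizon))
print(causal_rate(gamma_star, num_arms, horizon))
```

The general rate grows with the square root of the number of arms, while the causal rate depends on it only through a logarithm, which is why the second value is much smaller in this setting.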
Sensor Transfer: Learning Optimal Sensor Effect Image Augmentation for Sim-to-Real Domain Adaptation
Title | Sensor Transfer: Learning Optimal Sensor Effect Image Augmentation for Sim-to-Real Domain Adaptation |
Authors | Alexandra Carlson, Katherine A. Skinner, Ram Vasudevan, Matthew Johnson-Roberson |
Abstract | Performance on benchmark datasets has drastically improved with advances in deep learning. Still, cross-dataset generalization performance remains relatively low due to the domain shift that can occur between two different datasets. This domain shift is especially exaggerated between synthetic and real datasets. Significant research has been done to reduce this gap, specifically via modeling variation in the spatial layout of a scene, such as occlusions, and scene environmental factors, such as time of day and weather effects. However, few works have addressed modeling the variation in the sensor domain as a means of reducing the synthetic to real domain gap. The camera or sensor used to capture a dataset introduces artifacts into the image data that are unique to the sensor model, suggesting that sensor effects may also contribute to domain shift. To address this, we propose a learned augmentation network composed of physically-based augmentation functions. Our proposed augmentation pipeline transfers specific effects of the sensor model – chromatic aberration, blur, exposure, noise, and color temperature – from a real dataset to a synthetic dataset. We provide experiments that demonstrate that augmenting synthetic training datasets with the proposed learned augmentation framework reduces the domain gap between synthetic and real domains for object detection in urban driving scenes. |
Tasks | Domain Adaptation, Image Augmentation, Object Detection, Transfer Learning |
Published | 2018-09-17 |
URL | http://arxiv.org/abs/1809.06256v2 |
PDF | http://arxiv.org/pdf/1809.06256v2.pdf |
PWC | https://paperswithcode.com/paper/sensor-transfer-learning-optimal-sensor |
Repo | |
Framework | |
GPU Accelerated Sub-Sampled Newton’s Method
Title | GPU Accelerated Sub-Sampled Newton’s Method |
Authors | Sudhir B. Kylasa, Farbod Roosta-Khorasani, Michael W. Mahoney, Ananth Grama |
Abstract | First order methods, which solely rely on gradient information, are commonly used in diverse machine learning (ML) and data analysis (DA) applications. This is attributed to the simplicity of their implementations, as well as low per-iteration computational/storage costs. However, they suffer from significant disadvantages; most notably, their performance degrades with increasing problem ill-conditioning. Furthermore, they often involve a large number of hyper-parameters, and are notoriously sensitive to parameters such as the step-size. By incorporating additional information from the Hessian, second-order methods have been shown to be resilient to many such adversarial effects. However, these advantages of using curvature information come at the cost of higher per-iteration costs, which in “big data” regimes can be computationally prohibitive. In this paper, we show that, contrary to conventional belief, second-order methods, when implemented appropriately, can be more efficient than first-order alternatives in many large-scale ML/DA applications. In particular, in convex settings, we consider variants of classical Newton’s method in which the Hessian and/or the gradient are randomly sub-sampled. We show that by effectively leveraging the power of GPUs, such randomized Newton-type algorithms can be significantly accelerated, and can easily outperform state of the art implementations of existing techniques in popular ML/DA software packages such as TensorFlow. Additionally, these randomized methods incur a small memory overhead compared to first-order methods. In particular, we show that for million-dimensional problems, our GPU accelerated sub-sampled Newton’s method achieves a higher test accuracy in milliseconds as compared with tens of seconds for first order alternatives. |
Tasks | |
Published | 2018-02-26 |
URL | http://arxiv.org/abs/1802.09113v2 |
PDF | http://arxiv.org/pdf/1802.09113v2.pdf |
PWC | https://paperswithcode.com/paper/gpu-accelerated-sub-sampled-newtons-method |
Repo | |
Framework | |
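The sub-sampling idea in the abstract above can be sketched on a toy problem. The code below is a CPU-only, pure-Python illustration and not the authors' GPU implementation. It fits a two-parameter, L2-regularized logistic regression using the full gradient and a Hessian estimated from a random subset of the data. The backtracking line search, the synthetic data, and every name and default value are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(w, data, lam):
    total = 0.0
    for (x0, x1), y in data:
        z = w[0] * x0 + w[1] * x1
        # log(1 + exp(z)) - y * z, written in a numerically stable form
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    return total / len(data) + 0.5 * lam * (w[0] ** 2 + w[1] ** 2)


def gradient(w, data, lam):
    g0 = g1 = 0.0
    for (x0, x1), y in data:
        r = sigmoid(w[0] * x0 + w[1] * x1) - y
        g0 += r * x0
        g1 += r * x1
    n = len(data)
    return g0 / n + lam * w[0], g1 / n + lam * w[1]


def subsampled_hessian(w, sample, lam):
    h00 = h01 = h11 = 0.0
    for (x0, x1), _ in sample:
        p = sigmoid(w[0] * x0 + w[1] * x1)
        s = p * (1.0 - p)
        h00 += s * x0 * x0
        h01 += s * x0 * x1
        h11 += s * x1 * x1
    m = len(sample)
    return h00 / m + lam, h01 / m, h11 / m + lam


def subsampled_newton(data, lam=1e-2, sample_size=50, steps=10, seed=0):
    rng = random.Random(seed)
    w = (0.0, 0.0)
    history = [loss(w, data, lam)]
    for _ in range(steps):
        g = gradient(w, data, lam)  # full gradient
        h00, h01, h11 = subsampled_hessian(w, rng.sample(data, sample_size), lam)
        det = h00 * h11 - h01 * h01  # positive because lam > 0
        p = ((h11 * g[0] - h01 * g[1]) / det, (h00 * g[1] - h01 * g[0]) / det)
        slope = g[0] * p[0] + g[1] * p[1]
        t, accepted = 1.0, False
        for _ in range(30):  # backtracking (Armijo) line search
            cand = (w[0] - t * p[0], w[1] - t * p[1])
            cand_loss = loss(cand, data, lam)
            if cand_loss <= history[-1] - 1e-4 * t * slope:
                w, accepted = cand, True
                history.append(cand_loss)
                break
            t *= 0.5
        if not accepted:
            break
    return w, history


def make_data(n=400, seed=1):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = 1.0 if x0 + 0.5 * x1 + rng.gauss(0, 0.5) > 0 else 0.0
        data.append(((x0, x1), y))
    return data
```

Calling `subsampled_newton(make_data())` returns the fitted weights and the loss after each accepted step. Each iteration evaluates curvature on only `sample_size` points, which is the per-iteration saving that sub-sampling is meant to provide.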
Face Completion with Semantic Knowledge and Collaborative Adversarial Learning
Title | Face Completion with Semantic Knowledge and Collaborative Adversarial Learning |
Authors | Haofu Liao, Gareth Funka-Lea, Yefeng Zheng, Jiebo Luo, S. Kevin Zhou |
Abstract | Unlike a conventional background inpainting approach that infers a missing area from image patches similar to the background, face completion requires semantic knowledge about the target object for realistic outputs. Current image inpainting approaches utilize generative adversarial networks (GANs) to achieve such semantic understanding. However, in adversarial learning, the semantic knowledge is learned implicitly and hence good semantic understanding is not always guaranteed. In this work, we propose a collaborative adversarial learning approach to face completion to explicitly induce the training process. Our method is formulated under a novel generative framework called collaborative GAN (collaGAN), which allows better semantic understanding of a target object through collaborative learning of multiple tasks including face completion, landmark detection, and semantic segmentation. Together with the collaGAN, we also introduce an inpainting concentrated scheme such that the model emphasizes more on inpainting instead of autoencoding. Extensive experiments show that the proposed designs are indeed effective and collaborative adversarial learning provides better feature representations of the faces. In comparison with other generative image inpainting models and single task learning methods, our solution produces superior performances on all tasks. |
Tasks | Facial Inpainting, Image Inpainting, Semantic Segmentation |
Published | 2018-12-08 |
URL | https://arxiv.org/abs/1812.03252v2 |
PDF | https://arxiv.org/pdf/1812.03252v2.pdf |
PWC | https://paperswithcode.com/paper/face-completion-with-semantic-knowledge-and |
Repo | |
Framework | |
Ensemble of Multi-View Learning Classifiers for Cross-Domain Iris Presentation Attack Detection
Title | Ensemble of Multi-View Learning Classifiers for Cross-Domain Iris Presentation Attack Detection |
Authors | Andrey Kuehlkamp, Allan Pinto, Anderson Rocha, Kevin Bowyer, Adam Czajka |
Abstract | The adoption of large-scale iris recognition systems around the world has brought to light the importance of detecting presentation attack images (textured contact lenses and printouts). This work presents a new approach in iris Presentation Attack Detection (PAD), by exploring combinations of Convolutional Neural Networks (CNNs) and transformed input spaces through binarized statistical image features (BSIF). Our method combines lightweight CNNs to classify multiple BSIF views of the input image. Following explorations on complementary input spaces leading to more discriminative features to detect presentation attacks, we also propose an algorithm to select the best (and most discriminative) predictors for the task at hand. An ensemble of predictors makes use of their expected individual performances to aggregate their results into a final prediction. Results show that this technique improves on the current state of the art in iris PAD, outperforming the winner of LivDet-Iris2017 competition both for intra- and cross-dataset scenarios, and illustrating the very difficult nature of the cross-dataset scenario. |
Tasks | Cross-Domain Iris Presentation Attack Detection, Iris Recognition, Multi-View Learning |
Published | 2018-11-25 |
URL | http://arxiv.org/abs/1811.10068v1 |
PDF | http://arxiv.org/pdf/1811.10068v1.pdf |
PWC | https://paperswithcode.com/paper/ensemble-of-multi-view-learning-classifiers |
Repo | |
Framework | |
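The abstract above describes selecting the best predictors and aggregating them according to their expected individual performance. The exact selection algorithm and aggregation rule are not given there, so the sketch below makes two assumptions: selection keeps the top-k predictors by validation accuracy, and aggregation is an accuracy-weighted average of attack scores. The predictor names and numbers are hypothetical.

```python
def select_top_k(validation_accuracy, k):
    """Keep the k predictors with the highest validation accuracy."""
    ranked = sorted(validation_accuracy, key=validation_accuracy.get, reverse=True)
    return ranked[:k]


def weighted_vote(scores, validation_accuracy, members):
    """Average the members' attack scores, weighted by validation accuracy."""
    total = sum(validation_accuracy[m] for m in members)
    return sum(scores[m] * validation_accuracy[m] for m in members) / total


# Hypothetical per-view CNNs, their validation accuracy, and scores for one image.
accuracy = {"view_a": 0.95, "view_b": 0.60, "view_c": 0.80}
scores = {"view_a": 0.9, "view_b": 0.2, "view_c": 0.6}

members = select_top_k(accuracy, k=2)
print(members, weighted_vote(scores, accuracy, members))
```

Dropping the weakest view before voting keeps a poorly performing predictor from diluting the final score.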
Exploiting Partial Structural Symmetry For Patient-Specific Image Augmentation in Trauma Interventions
Title | Exploiting Partial Structural Symmetry For Patient-Specific Image Augmentation in Trauma Interventions |
Authors | Javad Fotouhi, Mathias Unberath, Giacomo Taylor, Arash Ghaani Farashahi, Bastian Bier, Russell H. Taylor, Greg M. Osgood, M. D., Mehran Armand, Nassir Navab |
Abstract | In unilateral pelvic fracture reductions, surgeons attempt to reconstruct the bone fragments such that bilateral symmetry in the bony anatomy is restored. We propose to exploit this “structurally symmetric” nature of the pelvic bone, and provide intra-operative image augmentation to assist the surgeon in repairing dislocated fragments. The main challenge is to automatically estimate the desired plane of symmetry within the patient’s pre-operative CT. We propose to estimate this plane using a non-linear optimization strategy, by minimizing Tukey’s biweight robust estimator, relying on the partial symmetry of the anatomy. Moreover, a regularization term is designed to enforce the similarity of bone density histograms on both sides of this plane, relying on the biological fact that, even if injured, the dislocated bone segments remain within the body. The experimental results demonstrate the performance of the proposed method in estimating this “plane of partial symmetry” using CT images of both healthy and injured anatomy. Examples of unilateral pelvic fractures are used to show how intra-operative X-ray images could be augmented with the forward-projections of the mirrored anatomy, acting as an objective road-map for fracture reduction procedures. |
Tasks | Image Augmentation |
Published | 2018-04-09 |
URL | http://arxiv.org/abs/1804.03224v1 |
PDF | http://arxiv.org/pdf/1804.03224v1.pdf |
PWC | https://paperswithcode.com/paper/exploiting-partial-structural-symmetry-for |
Repo | |
Framework | |
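Tukey's biweight, named in the abstract above, is what lets a symmetry cost ignore the fractured side: residuals beyond a cutoff all contribute the same constant. The sketch below shows the standard form of the loss. The cutoff value 4.685 is the common default for unit-variance residuals and is an assumption here, not a value taken from the paper, and the residuals are made up.

```python
def tukey_biweight(residual, c=4.685):
    """Standard Tukey biweight loss: smooth near zero, constant beyond |r| > c."""
    if abs(residual) > c:
        return c * c / 6.0
    u = residual / c
    return (c * c / 6.0) * (1.0 - (1.0 - u * u) ** 3)


# Hypothetical intensity residuals between a volume and its mirrored copy;
# the last two stand in for voxels on the fractured, non-symmetric side.
residuals = [0.1, -0.3, 0.2, 25.0, -40.0]
robust_cost = sum(tukey_biweight(r) for r in residuals)
squared_cost = sum(0.5 * r * r for r in residuals)
print(robust_cost, squared_cost)
```

With a squared loss the two large residuals dominate the cost, whereas the biweight caps each of them at `c * c / 6`, so the estimate is driven by the region that is actually symmetric.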
A Generative Model For Electron Paths
Title | A Generative Model For Electron Paths |
Authors | John Bradshaw, Matt J. Kusner, Brooks Paige, Marwin H. S. Segler, José Miguel Hernández-Lobato |
Abstract | Chemical reactions can be described as the stepwise redistribution of electrons in molecules. As such, reactions are often depicted using ‘arrow-pushing’ diagrams which show this movement as a sequence of arrows. We propose an electron path prediction model (ELECTRO) to learn these sequences directly from raw reaction data. Instead of predicting product molecules directly from reactant molecules in one shot, learning a model of electron movement has the benefits of (a) being easy for chemists to interpret, (b) incorporating constraints of chemistry, such as balanced atom counts before and after the reaction, and (c) naturally encoding the sparsity of chemical reactions, which usually involve changes in only a small number of atoms in the reactants. We design a method to extract approximate reaction paths from any dataset of atom-mapped reaction SMILES strings. Our model achieves excellent performance on an important subset of the USPTO reaction dataset, comparing favorably to the strongest baselines. Furthermore, we show that our model recovers a basic knowledge of chemistry without being explicitly trained to do so. |
Tasks | One-Shot Learning |
Published | 2018-05-23 |
URL | http://arxiv.org/abs/1805.10970v2 |
PDF | http://arxiv.org/pdf/1805.10970v2.pdf |
PWC | https://paperswithcode.com/paper/predicting-electron-paths |
Repo | |
Framework | |
Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved
Title | Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved |
Authors | Jiahao Chen, Nathan Kallus, Xiaojie Mao, Geoffry Svacha, Madeleine Udell |
Abstract | Assessing the fairness of a decision making system with respect to a protected class, such as gender or race, is challenging when class membership labels are unavailable. Probabilistic models for predicting the protected class based on observable proxies, such as surname and geolocation for race, are sometimes used to impute these missing labels for compliance assessments. Empirically, these methods are observed to exaggerate disparities, but the reason why is unknown. In this paper, we decompose the biases in estimating outcome disparity via threshold-based imputation into multiple interpretable bias sources, allowing us to explain when over- or underestimation occurs. We also propose an alternative weighted estimator that uses soft classification, and show that its bias arises simply from the conditional covariance of the outcome with the true class membership. Finally, we illustrate our results with numerical simulations and a public dataset of mortgage applications, using geolocation as a proxy for race. We confirm that the bias of threshold-based imputation is generally upward, but its magnitude varies strongly with the threshold chosen. Our new weighted estimator tends to have a negative bias that is much simpler to analyze and reason about. |
Tasks | Decision Making, Imputation |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11154v1 |
PDF | http://arxiv.org/pdf/1811.11154v1.pdf |
PWC | https://paperswithcode.com/paper/fairness-under-unawareness-assessing |
Repo | |
Framework | |
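The two estimators contrasted in the abstract above can be written down in a few lines. This is a sketch under assumptions: the disparity is taken to be the difference in mean outcome between two groups, the thresholded estimator assigns a record to a group only when the proxy probability exceeds `q` (or falls below `1 - q`) and discards the rest, and the weighted estimator uses the proxy probability as a soft membership weight. Function names and the toy data are hypothetical.

```python
def thresholded_disparity(proxy_probs, outcomes, q):
    """Hard imputation: keep only records whose proxy probability is decisive."""
    group_a = [y for p, y in zip(proxy_probs, outcomes) if p > q]
    group_b = [y for p, y in zip(proxy_probs, outcomes) if p < 1 - q]
    if not group_a or not group_b:
        raise ValueError("threshold leaves one of the groups empty")
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)


def weighted_disparity(proxy_probs, outcomes):
    """Soft classification: every record counts, weighted by its proxy probability."""
    weight_a = sum(proxy_probs)
    weight_b = sum(1 - p for p in proxy_probs)
    mean_a = sum(p * y for p, y in zip(proxy_probs, outcomes)) / weight_a
    mean_b = sum((1 - p) * y for p, y in zip(proxy_probs, outcomes)) / weight_b
    return mean_a - mean_b


# Hypothetical data: probability of belonging to group A, and a binary outcome.
probs = [0.9, 0.8, 0.6, 0.3, 0.1]
approved = [1, 1, 0, 0, 0]

print(thresholded_disparity(probs, approved, q=0.75))
print(weighted_disparity(probs, approved))
```

On this toy data the thresholded estimate is 1.0 while the weighted estimate is roughly 0.5, which illustrates how strongly the result can depend on the choice of estimator and threshold.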
Semantic Integration in the Information Flow Framework
Title | Semantic Integration in the Information Flow Framework |
Authors | Robert E. Kent |
Abstract | The Information Flow Framework (IFF) is a descriptive category metatheory currently under development, which is being offered as the structural aspect of the Standard Upper Ontology (SUO). The architecture of the IFF is composed of metalevels, namespaces and meta-ontologies. The main application of the IFF is institutional: the notion of institutions and their morphisms are being axiomatized in the upper metalevels of the IFF, and the lower metalevel of the IFF has axiomatized various institutions in which semantic integration has a natural expression as the colimit of theories. |
Tasks | |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.08236v1 |
PDF | http://arxiv.org/pdf/1810.08236v1.pdf |
PWC | https://paperswithcode.com/paper/semantic-integration-in-the-information-flow |
Repo | |
Framework | |
Supervised Nonnegative Matrix Factorization to Predict ICU Mortality Risk
Title | Supervised Nonnegative Matrix Factorization to Predict ICU Mortality Risk |
Authors | Guoqing Chao, Chengsheng Mao, Fei Wang, Yuan Zhao, Yuan Luo |
Abstract | ICU mortality risk prediction is a tough yet important task. On one hand, due to the complex temporal data collected, it is difficult to identify the effective features and interpret them easily; on the other hand, good prediction can help clinicians take timely actions to prevent mortality. These correspond to the interpretability and accuracy problems. Most existing methods lack interpretability, but recently Subgraph Augmented Nonnegative Matrix Factorization (SANMF) has been successfully applied to time series data to provide a path to interpret the features well. Therefore, we adopted this approach as the backbone to analyze the patient data. One limitation of the raw SANMF method is its poor prediction ability due to its unsupervised nature. To deal with this problem, we proposed a supervised SANMF algorithm by integrating the logistic regression loss function into the NMF framework and solved it with an alternating optimization procedure. We used simulated data to verify the effectiveness of this method, and then we applied it to ICU mortality risk prediction and demonstrated its superiority over other conventional supervised NMF methods. |
Tasks | Time Series |
Published | 2018-09-27 |
URL | http://arxiv.org/abs/1809.10680v2 |
PDF | http://arxiv.org/pdf/1809.10680v2.pdf |
PWC | https://paperswithcode.com/paper/supervised-nonnegative-matrix-factorization |
Repo | |
Framework | |
Convergence guarantees for a class of non-convex and non-smooth optimization problems
Title | Convergence guarantees for a class of non-convex and non-smooth optimization problems |
Authors | Koulik Khamaru, Martin J. Wainwright |
Abstract | We consider the problem of finding critical points of functions that are non-convex and non-smooth. Studying a fairly broad class of such problems, we analyze the behavior of three gradient-based methods (gradient descent, proximal update, and Frank-Wolfe update). For each of these methods, we establish rates of convergence for general problems, and also prove faster rates for continuous sub-analytic functions. We also show that our algorithms can escape strict saddle points for a class of non-smooth functions, thereby generalizing known results for smooth functions. Our analysis leads to a simplification of the popular CCCP algorithm, used for optimizing functions that can be written as a difference of two convex functions. Our simplified algorithm retains all the convergence properties of CCCP, along with a significantly lower cost per iteration. We illustrate our methods and theory via applications to the problems of best subset selection, robust estimation, mixture density estimation, and shape-from-shading reconstruction. |
Tasks | Density Estimation |
Published | 2018-04-25 |
URL | http://arxiv.org/abs/1804.09629v1 |
PDF | http://arxiv.org/pdf/1804.09629v1.pdf |
PWC | https://paperswithcode.com/paper/convergence-guarantees-for-a-class-of-non |
Repo | |
Framework | |
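For readers unfamiliar with CCCP, mentioned in the abstract above, a one-dimensional example shows the mechanics. The sketch below is the classical convex-concave procedure, not the simplified variant the paper proposes. It minimizes $f(x) = g(x) - h(x)$ with $g(x) = x^4$ and $h(x) = x^2$, both convex: each step linearizes $h$ at the current point and minimizes the remaining convex function exactly, which here reduces to solving $4x^3 = 2x_k$. The test function and names are chosen purely for illustration.

```python
import math


def f(x):
    """Difference of two convex functions: g(x) = x**4 minus h(x) = x**2."""
    return x ** 4 - x ** 2


def cccp_step(x_k):
    """Minimize x**4 - h'(x_k) * x, where h'(x_k) = 2 * x_k.

    Setting the derivative to zero gives 4 * x**3 = 2 * x_k,
    so x = cbrt(x_k / 2), with the sign handled explicitly.
    """
    return math.copysign(abs(x_k / 2.0) ** (1.0 / 3.0), x_k)


def cccp(x0, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(cccp_step(xs[-1]))
    return xs


iterates = cccp(1.0)
print(iterates[-1], f(iterates[-1]))
```

Starting from `x0 = 1.0`, the iterates approach $1/\sqrt{2}$, a critical point of $f$ where $f = -1/4$, and the objective never increases along the way. Each classical step requires solving a convex subproblem, which is the per-iteration cost the paper's simplification aims to reduce.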
Object 3D Reconstruction based on Photometric Stereo and Inverted Rendering
Title | Object 3D Reconstruction based on Photometric Stereo and Inverted Rendering |
Authors | Anish R. Khadka, Paolo Remagnino, Vasileios Argyriou |
Abstract | Methods for 3D reconstruction such as photometric stereo (PS) recover the shape and reflectance properties using multiple images of an object taken with variable lighting conditions from a fixed viewpoint. Photometric stereo assumes that a scene is illuminated only directly by the illumination source. As a result, indirect illumination effects due to inter-reflections introduce strong biases in the recovered shape. Our suggested approach is to recover scene properties in the presence of indirect illumination. To this end, we propose an iterative PS method combined with a reverted Monte-Carlo ray tracing algorithm to overcome the inter-reflection effects, aiming to separate the direct and indirect lighting. This approach iteratively reconstructs a surface considering both the environment around the object and its concavities. We demonstrate and evaluate our approach using three datasets and the overall results illustrate improvement over the classic PS approaches. |
Tasks | 3D Reconstruction |
Published | 2018-11-06 |
URL | http://arxiv.org/abs/1811.02357v1 |
PDF | http://arxiv.org/pdf/1811.02357v1.pdf |
PWC | https://paperswithcode.com/paper/object-3d-reconstruction-based-on-photometric |
Repo | |
Framework | |
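The classic photometric stereo baseline that the abstract above builds on can be sketched for a single pixel. Under the Lambertian, direct-illumination-only assumption, the intensity under light direction $l$ is $\rho \, (n \cdot l)$, so three known lights give a 3x3 linear system for $b = \rho n$. The code below solves it with Cramer's rule. It does not model inter-reflections or the paper's iterative ray-tracing correction, and the light directions and intensities are made-up values.

```python
import math


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for row in range(3):
            replaced[row][col] = rhs[row]
        solution.append(det3(replaced) / d)
    return solution


def lambertian_normal(lights, intensities):
    """Recover albedo and unit normal from three lights and three intensities."""
    b = solve3(lights, intensities)  # b = albedo * normal
    albedo = math.sqrt(sum(v * v for v in b))
    return albedo, [v / albedo for v in b]


# Hypothetical pixel: surface facing the camera (normal along +z), albedo 0.8.
lights = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.6, 0.8)]
intensities = [0.8, 0.64, 0.64]
print(lambertian_normal(lights, intensities))
```

Any light that reaches the pixel indirectly adds to the measured intensities without appearing in this model, which is the source of the bias that the paper sets out to remove.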