January 26, 2020

Paper Group ANR 1407

Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods


Title	Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods
Authors	Karel Lenc, Erich Elsen, Tom Schaul, Karen Simonyan
Abstract	In this work we show that Evolution Strategies (ES) are a viable method for learning non-differentiable parameters of large supervised models. ES are black-box optimization algorithms that estimate distributions of model parameters; however, they have only been used for relatively small problems so far. We show that it is possible to scale ES to more complex tasks and models with millions of parameters. While using ES for differentiable parameters is computationally impractical (although possible), we show that a hybrid approach is practically feasible in the case where the model has both differentiable and non-differentiable parameters. In this approach we use standard gradient-based methods for learning differentiable weights, while using ES for learning non-differentiable parameters - in our case sparsity masks of the weights. This proposed method is surprisingly competitive, and when parallelized over multiple devices has only negligible training time overhead compared to training with gradient descent. Additionally, this method allows training sparse models from the first training step, so they can be much larger than when using methods that require training dense models first. We present results and analysis of supervised feed-forward models (such as MNIST and CIFAR-10 classification), as well as recurrent models, such as SparseWaveRNN for text-to-speech.
Tasks
Published	2019-06-07
URL	https://arxiv.org/abs/1906.03139v1
PDF	https://arxiv.org/pdf/1906.03139v1.pdf
PWC	https://paperswithcode.com/paper/non-differentiable-supervised-learning-with
Repo
Framework
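
The idea of using ES to learn a non-differentiable sparsity mask can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a Bernoulli search distribution over mask entries, a score-function (NES-style) update on its logits, and a toy black-box fitness. The names `learn_mask_with_es`, `toy_fitness`, and `TARGET` are hypothetical and exist only for this illustration.

```python
import math
import random
import statistics


def learn_mask_with_es(fitness, dim, steps=200, population=50, lr=0.5, seed=0):
    """Learn keep-probabilities for a binary mask using only fitness evaluations."""
    rng = random.Random(seed)
    logits = [0.0] * dim  # parameters of the search distribution over masks
    for _ in range(steps):
        probs = [1.0 / (1.0 + math.exp(-t)) for t in logits]
        # Sample a population of candidate masks from the current distribution.
        masks = [[1 if rng.random() < p else 0 for p in probs]
                 for _ in range(population)]
        scores = [fitness(m) for m in masks]
        # Standardize scores so the step size does not depend on fitness scale.
        mean = statistics.fmean(scores)
        std = statistics.pstdev(scores) + 1e-8
        for i in range(dim):
            # Score-function gradient: d log p(mask) / d logit_i = mask_i - prob_i.
            grad = sum((s - mean) / std * (m[i] - probs[i])
                       for s, m in zip(scores, masks)) / population
            logits[i] += lr * grad
    return [1.0 / (1.0 + math.exp(-t)) for t in logits]


# Hypothetical black-box objective: count entries that agree with a hidden mask.
TARGET = [1, 0, 0, 1, 0, 1, 0, 0]


def toy_fitness(mask):
    return sum(int(a == b) for a, b in zip(mask, TARGET))


probs = learn_mask_with_es(toy_fitness, dim=len(TARGET))
learned = [int(p > 0.5) for p in probs]
```

In a hybrid setup of the kind the abstract describes, the fitness would presumably be the model's loss under a sampled mask, while the dense weights are updated separately by gradient descent; that coupling is left out here.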

H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions


Title	H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions
Authors	Bugra Tekin, Federica Bogo, Marc Pollefeys
Abstract	We present a unified framework for understanding 3D hand and object interactions in raw image sequences from egocentric RGB cameras. Given a single RGB image, our model jointly estimates the 3D hand and object poses, models their interactions, and recognizes the object and action classes with a single feed-forward pass through a neural network. We propose a single architecture that does not rely on external detection algorithms but rather is trained end-to-end on single images. We further merge and propagate information in the temporal domain to infer interactions between hand and object trajectories and recognize actions. The complete model takes as input a sequence of frames and outputs per-frame 3D hand and object pose predictions along with the estimates of object and action categories for the entire sequence. We demonstrate state-of-the-art performance of our algorithm even in comparison to the approaches that work on depth data and ground-truth annotations.
Tasks
Published	2019-04-10
URL	http://arxiv.org/abs/1904.05349v1
PDF	http://arxiv.org/pdf/1904.05349v1.pdf
PWC	https://paperswithcode.com/paper/ho-unified-egocentric-recognition-of-3d-hand
Repo
Framework

Self-supervised Learning for Single View Depth and Surface Normal Estimation


Title	Self-supervised Learning for Single View Depth and Surface Normal Estimation
Authors	Huangying Zhan, Chamara Saroj Weerasekera, Ravi Garg, Ian Reid
Abstract	In this work we present a self-supervised learning framework to simultaneously train two Convolutional Neural Networks (CNNs) to predict depth and surface normals from a single image. In contrast to most existing frameworks, which represent outdoor scenes as fronto-parallel planes at piece-wise smooth depth, we propose to predict depth with surface orientation while assuming that natural scenes have piece-wise smooth normals. We show that a simple depth-normal consistency as a soft constraint on the predictions is sufficient and effective for training both these networks simultaneously. The trained normal network provides state-of-the-art predictions, while the depth network, relying on a much more realistic smooth-normal assumption, outperforms the traditional self-supervised depth prediction network by a large margin on the KITTI benchmark. Demo video: https://youtu.be/ZD-ZRsw7hdM
Tasks	Depth Estimation, Monocular Depth Estimation
Published	2019-03-01
URL	http://arxiv.org/abs/1903.00112v1
PDF	http://arxiv.org/pdf/1903.00112v1.pdf
PWC	https://paperswithcode.com/paper/self-supervised-learning-for-single-view
Repo
Framework
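
A depth-normal consistency term can be sketched as follows. The abstract does not give the exact formulation, so this is an assumption: pixels are back-projected with a pinhole camera, and the predicted normal is penalized for not being orthogonal to the 3D vectors joining neighboring pixels. The function names and the camera constants are hypothetical.

```python
def backproject(depth, u, v, focal, cx, cy):
    """Lift pixel (u, v) to a 3D point using a pinhole camera model."""
    z = depth[v][u]
    return ((u - cx) / focal * z, (v - cy) / focal * z, z)


def consistency_loss(depth, normals, focal, cx, cy):
    """Mean |n . t| over tangents t to the right and lower neighbor of each pixel."""
    height, width = len(depth), len(depth[0])
    total, count = 0.0, 0
    for v in range(height - 1):
        for u in range(width - 1):
            p = backproject(depth, u, v, focal, cx, cy)
            n = normals[v][u]
            for du, dv in ((1, 0), (0, 1)):
                q = backproject(depth, u + du, v + dv, focal, cx, cy)
                tangent = [b - a for a, b in zip(p, q)]
                total += abs(sum(ni * ti for ni, ti in zip(n, tangent)))
                count += 1
    return total / count


# Toy scene (assumed): a tilted plane n . P = 4 seen by an 8x8 camera.
FOCAL, CX, CY, SIZE = 8.0, 3.5, 3.5, 8
PLANE_NORMAL, PLANE_OFFSET = (0.0, 0.6, 0.8), 4.0
depth = [[PLANE_OFFSET / (PLANE_NORMAL[0] * (u - CX) / FOCAL
                          + PLANE_NORMAL[1] * (v - CY) / FOCAL
                          + PLANE_NORMAL[2])
          for u in range(SIZE)] for v in range(SIZE)]

matching = [[PLANE_NORMAL] * SIZE for _ in range(SIZE)]
mismatched = [[(0.0, 0.0, 1.0)] * SIZE for _ in range(SIZE)]

loss_matching = consistency_loss(depth, matching, FOCAL, CX, CY)
loss_mismatched = consistency_loss(depth, mismatched, FOCAL, CX, CY)
```

With the true plane normal the loss is zero up to rounding, while a fronto-parallel normal leaves a clearly positive residual, which is the signal a soft constraint of this kind would feed back to both networks.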

Machine Vision in the Context of Robotics: A Systematic Literature Review


Title	Machine Vision in the Context of Robotics: A Systematic Literature Review
Authors	Javad Ghofrani, Robert Kirschne, Daniel Rossburg, Dirk Reichelt, Tom Dimter
Abstract	Machine vision is critical to robotics due to a wide range of applications, such as autonomous mobile robots and smart production systems, that rely on input from visual sensors. To create the smart homes and systems of tomorrow, an overview of current challenges in the research field, created in a systematic and reproducible manner, would help identify possible further directions. In this work a systematic literature review was conducted covering research from the last 10 years. We screened 172 papers from four databases and selected 52 relevant papers. While robustness and computation time were improved greatly, occlusion and lighting variance are still the biggest problems faced. From the number of recent publications, we conclude that the observed field is of relevance and interest to the research community. Further challenges arise in many areas of the field.
Tasks
Published	2019-05-03
URL	https://arxiv.org/abs/1905.03708v1
PDF	https://arxiv.org/pdf/1905.03708v1.pdf
PWC	https://paperswithcode.com/paper/190503708
Repo
Framework

Knowledge Graph Development for App Store Data Modeling


Title	Knowledge Graph Development for App Store Data Modeling
Authors	Mariia Rizun, Artur Strzelecki
Abstract	Usage of mobile applications has become a part of our lives today, since every day we use our smartphones for communication, entertainment, business and education. High demand for apps has led to significant growth in supply, yet the large offering has complicated users' search for the one suitable application. The authors have made an attempt to solve the problem of facilitating the search in app stores. With the help of website crawling software, a sample of data was retrieved from one of the well-known mobile app stores and divided into 11 groups by type. These groups of data were used to construct a Knowledge Schema - a graphic model of interconnections of data that characterize any mobile app in the selected store. Schema creation is the first step in the process of developing a Knowledge Graph that will perform application clustering to facilitate users' search in app stores.
Tasks
Published	2019-03-17
URL	https://arxiv.org/abs/1903.07182v2
PDF	https://arxiv.org/pdf/1903.07182v2.pdf
PWC	https://paperswithcode.com/paper/knowledge-graph-development-for-app-store
Repo
Framework

Co-regularized Multi-view Sparse Reconstruction Embedding for Dimension Reduction


Title	Co-regularized Multi-view Sparse Reconstruction Embedding for Dimension Reduction
Authors	Huibing Wang, Jinjia Peng, Xianping Fu
Abstract	With the development of information technology, we have witnessed an age of data explosion which produces a large variety of data filled with redundant information. Because dimension reduction is an essential tool which embeds high-dimensional data into a lower-dimensional subspace to avoid redundant information, it has attracted interest from researchers all over the world. However, when faced with features from multiple views, it’s difficult for most dimension reduction methods to fully comprehend multi-view features and integrate compatible and complementary information from these features to construct a low-dimensional subspace directly. Furthermore, most multi-view dimension reduction methods cannot handle features from nonlinear spaces with high dimensions. Therefore, constructing a multi-view dimension reduction method which can deal with multi-view features from a high-dimensional nonlinear space is of vital importance but challenging. In order to address this problem, we propose a novel method named Co-regularized Multi-view Sparse Reconstruction Embedding (CMSRE) in this paper. By exploiting correlations of sparse reconstruction from multiple views, CMSRE is able to learn local sparse structures of nonlinear manifolds from multiple views and construct meaningful low-dimensional representations for them. Due to the proposed co-regularized scheme, correlations of sparse reconstructions from multiple views are preserved by CMSRE as much as possible. Furthermore, sparse representation produces more meaningful correlations between features from each single view, which helps CMSRE achieve better performance. Various evaluations based on the applications of document classification, face recognition and image retrieval demonstrate the effectiveness of the proposed approach on multi-view dimension reduction.
Tasks	Dimensionality Reduction, Document Classification, Face Recognition, Image Retrieval
Published	2019-04-01
URL	http://arxiv.org/abs/1904.08499v1
PDF	http://arxiv.org/pdf/1904.08499v1.pdf
PWC	https://paperswithcode.com/paper/190408499
Repo
Framework

Benchmarking Deep Learning Architectures for Predicting Readmission to the ICU and Describing Patients-at-Risk


Title	Benchmarking Deep Learning Architectures for Predicting Readmission to the ICU and Describing Patients-at-Risk
Authors	Sebastiano Barbieri, James Kemp, Oscar Perez-Concha, Sradha Kotwal, Martin Gallagher, Angus Ritchie, Louisa Jorm
Abstract	Objective: To compare different deep learning architectures for predicting the risk of readmission within 30 days of discharge from the intensive care unit (ICU). The interpretability of attention-based models is leveraged to describe patients-at-risk. Methods: Several deep learning architectures making use of attention mechanisms, recurrent layers, neural ordinary differential equations (ODEs), and medical concept embeddings with time-aware attention were trained using publicly available electronic medical record data (MIMIC-III) associated with 45,298 ICU stays for 33,150 patients. Bayesian inference was used to compute the posterior over weights of an attention-based model. Odds ratios associated with an increased risk of readmission were computed for static variables. Diagnoses, procedures, medications, and vital signs were ranked according to the associated risk of readmission. Results: A recurrent neural network, with time dynamics of code embeddings computed by neural ODEs, achieved the highest average precision of 0.331 (AUROC: 0.739, F1-Score: 0.372). Predictive accuracy was comparable across neural network architectures. Groups of patients at risk included those suffering from infectious complications, with chronic or progressive conditions, and for whom standard medical care was not suitable. Conclusions: Attention-based networks may be preferable to recurrent networks if an interpretable model is required, at only marginal cost in predictive accuracy.
Tasks	Bayesian Inference
Published	2019-05-21
URL	https://arxiv.org/abs/1905.08547v3
PDF	https://arxiv.org/pdf/1905.08547v3.pdf
PWC	https://paperswithcode.com/paper/a-deep-representation-of-longitudinal-emr
Repo
Framework

Unitary Kernel Quadrature for Training Parameter Distributions


Title	Unitary Kernel Quadrature for Training Parameter Distributions
Authors	Sho Sonoda
Abstract	A shallow model commonly appears in machine learning and signal processing. Whether it is parametric or non-parametric, a shallow model is formulated as the integration of feature maps against an unknown parameter distribution, or a complex-valued measure. As is often the case with neural networks, if a model is parameterized in vectors, the parameter dimension needs to be manually determined by model selection; and the gradient descent training often results in a non-convex optimization problem, even when the loss function is convex. On the other hand, if the model is re-parameterized in measures, the parameter dimension can be automatically determined; and the training is convex when the loss function is convex. Therefore, it is natural to consider training the parameter distribution. However, handling a measure is difficult in practice because (1) we need a finite sum of point masses, and (2) the parameterization is not always unique. In other words, two different parameter distributions may indicate the same function. For example, kernels on the input space, priors on the parameter space, and random feature methods are too weak to handle point masses, because they turn point masses into smooth functions. On the other hand, versatile topologies for measures such as the total variation and the Wasserstein distance are too strong when the parameterization is not unique. Namely, these topologies unnecessarily distinguish two measures that indicate the same function, which causes another non-convexity. To address these difficulties, we investigate the generalized kernel quadrature for handling complex-valued point masses; and propose to employ unitary kernel embedding for killing non-uniqueness. The proposed method converges in $L^2$-norm at Barron’s theoretical fast ratio.
Tasks	Model Selection
Published	2019-02-02
URL	https://arxiv.org/abs/1902.00648v2
PDF	https://arxiv.org/pdf/1902.00648v2.pdf
PWC	https://paperswithcode.com/paper/numerical-integration-method-for-training
Repo
Framework

Learning Linear Dynamical Systems with Semi-Parametric Least Squares


Title	Learning Linear Dynamical Systems with Semi-Parametric Least Squares
Authors	Max Simchowitz, Ross Boczar, Benjamin Recht
Abstract	We analyze a simple prefiltered variation of the least squares estimator for the problem of estimation with biased, semi-parametric noise, an error model studied more broadly in causal statistics and active learning. We prove an oracle inequality which demonstrates that this procedure provably mitigates the variance introduced by long-term dependencies. We then demonstrate that prefiltered least squares yields, to our knowledge, the first algorithm that provably estimates the parameters of partially-observed linear systems and attains rates which do not incur a worst-case dependence on the rate at which these dependencies decay. The algorithm is provably consistent even for systems which satisfy the weaker marginal stability condition obeyed by many classical models based on Newtonian mechanics. In this context, our semi-parametric framework yields guarantees for both stochastic and worst-case noise.
Tasks	Active Learning
Published	2019-02-02
URL	http://arxiv.org/abs/1902.00768v1
PDF	http://arxiv.org/pdf/1902.00768v1.pdf
PWC	https://paperswithcode.com/paper/learning-linear-dynamical-systems-with-semi
Repo
Framework

The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation


Title	The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation
Authors	Mai Oudah, Amjad Almahairi, Nizar Habash
Abstract	Neural networks have become the state-of-the-art approach for machine translation (MT) in many languages. While linguistically-motivated tokenization techniques were shown to have significant effects on the performance of statistical MT, it remains unclear if those techniques are well suited for neural MT. In this paper, we systematically compare neural and statistical MT models for Arabic-English translation on data preprocessed by various prominent tokenization schemes. Furthermore, we consider a range of data and vocabulary sizes and compare their effect on both approaches. Our empirical results show that the best choice of tokenization scheme is largely based on the type of model and the size of data. We also show that we can gain significant improvements using a system selection that combines the output from neural and statistical MT.
Tasks	Machine Translation, Tokenization
Published	2019-06-27
URL	https://arxiv.org/abs/1906.11751v1
PDF	https://arxiv.org/pdf/1906.11751v1.pdf
PWC	https://paperswithcode.com/paper/the-impact-of-preprocessing-on-arabic-english
Repo
Framework

Learning Neural Search Policies for Classical Planning


Title	Learning Neural Search Policies for Classical Planning
Authors	Pawel Gomoluch, Dalal Alrajeh, Alessandra Russo, Antonio Bucchiarone
Abstract	Heuristic forward search is currently the dominant paradigm in classical planning. Forward search algorithms typically rely on a single, relatively simple variation of best-first search and remain fixed throughout the process of solving a planning problem. Existing work combining multiple search techniques usually aims at supporting best-first search with an additional exploratory mechanism, triggered using a handcrafted criterion. A notable exception is very recent work which combines various search techniques using a trainable policy. It is, however, confined to a discrete action space comprising several fixed subroutines. In this paper, we introduce a parametrized search algorithm template which combines various search techniques within a single routine. The template’s parameter space defines an infinite space of search algorithms, including, among others, BFS, local and random search. We further introduce a neural architecture for designating the values of the search parameters given the state of the search. This enables expressing neural search policies that change the values of the parameters as the search progresses. The policies can be learned automatically, with the objective of maximizing the planner’s performance on a given distribution of planning problems. We consider a training setting based on a stochastic optimization algorithm known as the cross-entropy method (CEM). Experimental evaluation of our approach shows that it is capable of finding effective distribution-specific search policies, outperforming the relevant baselines.
Tasks	Stochastic Optimization
Published	2019-11-27
URL	https://arxiv.org/abs/1911.12200v1
PDF	https://arxiv.org/pdf/1911.12200v1.pdf
PWC	https://paperswithcode.com/paper/learning-neural-search-policies-for-classical
Repo
Framework
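
The cross-entropy method named in the abstract can be sketched in a few lines. This is a generic diagonal-Gaussian CEM on a toy objective, not the paper's training setup: `toy_score` is a hypothetical stand-in for planner performance, and the population size, elite fraction, and standard-deviation floor are assumed values.

```python
import random
import statistics


def cross_entropy_method(objective, dim, iterations=50, population=100,
                         elite_frac=0.2, seed=0):
    """Maximize `objective` over R^dim with a diagonal-Gaussian CEM."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [2.0] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current distribution.
        samples = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                   for _ in range(population)]
        # Keep the best-scoring candidates (the elite set).
        samples.sort(key=objective, reverse=True)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite set.
        for i in range(dim):
            column = [x[i] for x in elite]
            mean[i] = statistics.fmean(column)
            std[i] = statistics.pstdev(column) + 1e-3  # floor keeps sampling alive
    return mean


def toy_score(x):
    # Hypothetical objective with its maximum at (1, -2).
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)


best = cross_entropy_method(toy_score, dim=2)
```

Because CEM only ranks candidates by score, it needs no gradient of the objective, which is what makes it usable when the score is something like the outcome of a search run.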

Understanding BERT performance in propaganda analysis


Title	Understanding BERT performance in propaganda analysis
Authors	Yiqing Hua
Abstract	In this paper, we describe our system used in the shared task for fine-grained propaganda analysis at sentence level. Despite the challenging nature of the task, our pretrained BERT model (team YMJA), fine-tuned on the training dataset provided by the shared task, scored 0.62 F1 on the test set and ranked third among 25 teams who participated in the contest. We present a set of illustrative experiments to better understand the performance of our BERT model on this shared task. Further, we explore beyond the given dataset for false-positive cases that are likely to be produced by our system. We show that despite the high performance on the given test set, our system may have the tendency of classifying opinion pieces as propaganda and cannot distinguish quotations of propaganda speech from actual usage of propaganda techniques.
Tasks
Published	2019-11-11
URL	https://arxiv.org/abs/1911.04525v1
PDF	https://arxiv.org/pdf/1911.04525v1.pdf
PWC	https://paperswithcode.com/paper/understanding-bert-performance-in-propaganda-1
Repo
Framework

Forecaster: A Graph Transformer for Forecasting Spatial and Time-Dependent Data


Title	Forecaster: A Graph Transformer for Forecasting Spatial and Time-Dependent Data
Authors	Yang Li, José M. F. Moura
Abstract	Spatial and time-dependent data is of interest in many applications. Forecasting such data is difficult due to its complex spatial dependency, long-range temporal dependency, data non-stationarity, and data heterogeneity. To address these challenges, we propose Forecaster, a graph Transformer architecture. Specifically, we start by learning the structure of the graph that parsimoniously represents the spatial dependency between the data at different locations. Based on the topology of the graph, we sparsify the Transformer to account for the strength of spatial dependency, long-range temporal dependency, data non-stationarity, and data heterogeneity. We evaluate Forecaster in the problem of forecasting taxi ride-hailing demand and show that our proposed architecture significantly outperforms the state-of-the-art baselines.
Tasks
Published	2019-09-09
URL	https://arxiv.org/abs/1909.04019v5
PDF	https://arxiv.org/pdf/1909.04019v5.pdf
PWC	https://paperswithcode.com/paper/forecaster-a-graph-transformer-for
Repo
Framework

Perceptual representations of structural information in images: application to quality assessment of synthesized view in FTV scenario


Title	Perceptual representations of structural information in images: application to quality assessment of synthesized view in FTV scenario
Authors	Ling Suiyi, Li Jing, Le Callet Patrick, Wang Junle
Abstract	As immersive multimedia techniques like Free-viewpoint TV (FTV) develop at an astonishing rate, users’ demand for high-quality immersive content increases dramatically. Unlike traditional uniform artifacts, the distortions within immersive content can be non-uniform and structure-related, and thus are challenging for commonly used quality metrics. Recent studies have demonstrated that the representation of visual features can be extracted from multiple levels of the hierarchy. Inspired by the hierarchical representation mechanism in the human visual system (HVS), in this paper we explore adopting structural representations to quantitatively measure the impact of such structure-related distortion on perceived quality in the FTV scenario. More specifically, a bio-inspired full-reference image quality metric is proposed based on 1) a low-level contour descriptor; 2) a mid-level contour category descriptor; and 3) a task-oriented non-natural structure descriptor. The experimental results show that the proposed model significantly outperforms the state-of-the-art metrics.
Tasks
Published	2019-07-08
URL	https://arxiv.org/abs/1907.03448v1
PDF	https://arxiv.org/pdf/1907.03448v1.pdf
PWC	https://paperswithcode.com/paper/perceptual-representations-of-structural
Repo
Framework

Calibrated Model-Based Deep Reinforcement Learning


Title	Calibrated Model-Based Deep Reinforcement Learning
Authors	Ali Malik, Volodymyr Kuleshov, Jiaming Song, Danny Nemer, Harlan Seymour, Stefano Ermon
Abstract	Estimates of predictive uncertainty are important for accurate model-based planning and reinforcement learning. However, predictive uncertainties, especially ones derived from modern deep learning systems, can be inaccurate and impose a bottleneck on performance. This paper explores which uncertainties are needed for model-based reinforcement learning and argues that good uncertainties must be calibrated, i.e. their probabilities should match empirical frequencies of predicted events. We describe a simple way to augment any model-based reinforcement learning agent with a calibrated model and show that doing so consistently improves planning, sample complexity, and exploration. On the HalfCheetah MuJoCo task, our system achieves state-of-the-art performance using 50% fewer samples than the current leading approach. Our findings suggest that calibration can improve the performance of model-based reinforcement learning with minimal computational and implementation overhead.
Tasks	Calibration
Published	2019-06-19
URL	https://arxiv.org/abs/1906.08312v1
PDF	https://arxiv.org/pdf/1906.08312v1.pdf
PWC	https://paperswithcode.com/paper/calibrated-model-based-deep-reinforcement
Repo
Framework
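
The calibration criterion stated in the abstract, that predicted probabilities should match empirical frequencies, can be checked with a short sketch. It assumes Gaussian forecasts and synthetic outcomes, and it only measures calibration; it does not reproduce the paper's recalibration procedure. The function names and the chosen confidence levels are hypothetical.

```python
import math
import random


def gaussian_cdf(y, mu, sigma):
    """CDF of a normal distribution N(mu, sigma^2) evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def calibration_curve(forecasts, outcomes, levels):
    """For each level p, the fraction of outcomes whose predicted CDF value is <= p."""
    pit = [gaussian_cdf(y, mu, sigma)
           for (mu, sigma), y in zip(forecasts, outcomes)]
    return [sum(v <= p for v in pit) / len(pit) for p in levels]


rng = random.Random(0)
outcomes = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # synthetic ground truth
levels = [0.1, 0.5, 0.9]

# Forecast with the correct spread versus one that is too narrow.
well_calibrated = calibration_curve([(0.0, 1.0)] * 5000, outcomes, levels)
overconfident = calibration_curve([(0.0, 0.5)] * 5000, outcomes, levels)
```

For the forecast with the correct spread, each empirical frequency lands close to its nominal level. The too-narrow forecast puts far more than 10% of outcomes below its 10% quantile and far fewer than 90% below its 90% quantile, which is the kind of mismatch a calibrated model is meant to remove.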