April 2, 2020

Paper Group ANR 112

WISDoM: a framework for the Analysis of Wishart distributed matrices. Unique Class Group Based Multi-Label Balancing Optimizer for Action Unit Detection. Unsupervised Enhancement of Real-World Depth Images Using Tri-Cycle GAN. Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems. Face Attribute Invertion. A Nu …

WISDoM: a framework for the Analysis of Wishart distributed matrices

Title WISDoM: a framework for the Analysis of Wishart distributed matrices
Authors Carlo Mengucci, Daniel Remondini, Enrico Giampieri
Abstract WISDoM (Wishart Distributed Matrices) is a new framework for the characterization of symmetric positive-definite matrices associated with experimental samples, such as covariance or correlation matrices, based on the Wishart distribution as a null model. WISDoM can be applied to supervised learning tasks, such as classification, even when such matrices are generated by data of different dimensionality (e.g. time series with the same number of variables but different time sampling). In particular, we show the application of the method to the ranking of features associated with electroencephalogram (EEG) data with a time series design, providing a theoretically sound approach for this type of study.
Tasks EEG, Time Series
Published 2020-01-28
URL https://arxiv.org/abs/2001.10342v1
PDF https://arxiv.org/pdf/2001.10342v1.pdf
PWC https://paperswithcode.com/paper/wisdom-a-framework-for-the-analysis-of
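
For reference, the abstract above names the Wishart distribution as its null model. The standard Wishart density is written out below; the symbols $p$ (matrix dimension), $n$ (degrees of freedom) and $\Sigma$ (scale matrix) are notation assumed here, not taken from the paper, and how WISDoM uses this density for classification or feature ranking is not specified in the abstract.

```latex
% Density of a p x p symmetric positive-definite matrix X ~ W_p(n, \Sigma), with n > p - 1
f(X \mid n, \Sigma) =
  \frac{|X|^{(n-p-1)/2} \, \exp\!\left(-\tfrac{1}{2}\,\operatorname{tr}\!\left(\Sigma^{-1} X\right)\right)}
       {2^{np/2} \, |\Sigma|^{n/2} \, \Gamma_p\!\left(\tfrac{n}{2}\right)},
\qquad
\Gamma_p(a) = \pi^{p(p-1)/4} \prod_{j=1}^{p} \Gamma\!\left(a - \tfrac{j-1}{2}\right)
```

A scatter matrix built from $n$ independent zero-mean Gaussian samples with covariance $\Sigma$ follows this law, which is why it is a natural reference model for sample covariance matrices.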

Unique Class Group Based Multi-Label Balancing Optimizer for Action Unit Detection

Title Unique Class Group Based Multi-Label Balancing Optimizer for Action Unit Detection
Authors Ines Rieger, Jaspar Pahl, Dominik Seuss
Abstract Balancing methods for single-label data cannot be applied to multi-label problems as they would also resample the samples with high occurrences. We propose to reformulate this problem as an optimization problem in order to balance multi-label data. We apply this balancing algorithm to training datasets for detecting isolated facial movements, so-called Action Units. Several Action Units can describe combined emotions or physical states such as pain. As datasets in this area are limited and mostly imbalanced, we show how optimized balancing and then augmentation can improve Action Unit detection. At the IEEE Conference on Face and Gesture Recognition 2020, we ranked third in the Affective Behavior Analysis in-the-wild (ABAW) challenge for the Action Unit detection task.
Tasks Action Unit Detection, Gesture Recognition
Published 2020-03-05
URL https://arxiv.org/abs/2003.08751v1
PDF https://arxiv.org/pdf/2003.08751v1.pdf
PWC https://paperswithcode.com/paper/unique-class-group-based-multi-label

Unsupervised Enhancement of Real-World Depth Images Using Tri-Cycle GAN

Title Unsupervised Enhancement of Real-World Depth Images Using Tri-Cycle GAN
Authors Alona Baruhov, Guy Gilboa
Abstract Low quality depth poses a considerable challenge to computer vision algorithms. In this work we aim to enhance highly degraded, real-world depth images acquired by a low-cost sensor, for which an analytical noise model is unavailable. In the absence of clean ground-truth, we approach the task as an unsupervised domain-translation between the low-quality sensor domain and a high-quality sensor domain, represented using two unpaired training sets. We apply the highly successful Cycle-GAN to this task, but find it to perform poorly in this case. Identifying the sources of the failure, we introduce several modifications to the framework, including a larger generator architecture, depth-specific losses that take into account missing pixels, and a novel Tri-Cycle loss which promotes information-preservation while addressing the asymmetry between the domains. We show that the resulting framework dramatically improves over the original Cycle-GAN both visually and quantitatively, extending its applicability to more challenging and asymmetric translation tasks.
Published 2020-01-11
URL https://arxiv.org/abs/2001.03779v1
PDF https://arxiv.org/pdf/2001.03779v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-enhancement-of-real-world-depth

Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems

Title Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
Authors Noah Golowich, Sarath Pattathil, Constantinos Daskalakis, Asuman Ozdaglar
Abstract In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of $O(1/T)$ (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of $O(1/\sqrt{T})$. To the best of our knowledge, this is the first paper to provide a convergence rate guarantee for the last iterate of EG for the smooth convex-concave saddle point problem. Moreover, we show that this rate is tight by proving a lower bound of $\Omega(1/\sqrt{T})$ for the last iterate. This lower bound therefore shows a quadratic separation of the convergence rates of ergodic and last iterates in smooth convex-concave saddle point problems.
Published 2020-01-31
URL https://arxiv.org/abs/2002.00057v1
PDF https://arxiv.org/pdf/2002.00057v1.pdf
PWC https://paperswithcode.com/paper/last-iterate-is-slower-than-averaged-iterate
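
To make the two quantities compared in the abstract above concrete, here is a minimal sketch of the Extragradient update that tracks both the last iterate and the averaged (ergodic) iterate. The bilinear objective f(x, y) = x * y, the step size 0.1, the starting point, and the use of distance to the saddle point (0, 0) as the error measure are all assumptions chosen for illustration. This easy instance does not reproduce the paper's worst-case lower bound; it only shows how the two iterates are formed.

```python
import math

def extragradient_bilinear(x0, y0, step, n_steps):
    """Run Extragradient on f(x, y) = x * y (min over x, max over y).

    Returns the last iterate and the average of the iterates 1..n_steps.
    The objective is a hypothetical toy problem, not one from the paper.
    """
    x, y = x0, y0
    sum_x, sum_y = 0.0, 0.0
    for _ in range(n_steps):
        # Extrapolation step: grad_x f = y, grad_y f = x.
        x_half = x - step * y
        y_half = y + step * x
        # Update step: reuse the gradients evaluated at the midpoint.
        x, y = x - step * y_half, y + step * x_half
        sum_x += x
        sum_y += y
    return (x, y), (sum_x / n_steps, sum_y / n_steps)

last, avg = extragradient_bilinear(1.0, 1.0, step=0.1, n_steps=1000)
last_error = math.hypot(*last)  # distance of the last iterate to (0, 0)
avg_error = math.hypot(*avg)    # distance of the averaged iterate to (0, 0)
print(last_error, avg_error)
```

On this particular problem both errors shrink quickly; the separation proved in the paper concerns worst-case smooth convex-concave problems, not every instance.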

Face Attribute Invertion

Title Face Attribute Invertion
Authors X G Tu, Y Luo, H S Zhang, W J Ai, Z Ma, M Xie
Abstract Manipulating human facial images between two domains is an important and interesting problem. Most of the existing methods address this issue by applying two generators or one generator with extra conditional inputs. In this paper, we propose a novel self-perception method based on GANs for automatic face attribute inversion. The proposed method takes face images as inputs and employs only a single generator without being conditioned on other inputs. Profiting from the multi-loss strategy and modified U-net structure, our model is quite stable in training and capable of preserving finer details of the original face images.
Published 2020-01-14
URL https://arxiv.org/abs/2001.04665v1
PDF https://arxiv.org/pdf/2001.04665v1.pdf
PWC https://paperswithcode.com/paper/face-attribute-invertion

A Numerical Transform of Random Forest Regressors corrects Systematically-Biased Predictions

Title A Numerical Transform of Random Forest Regressors corrects Systematically-Biased Predictions
Authors Shipra Malhotra, John Karanicolas
Abstract Over the past decade, random forest models have become widely used as a robust method for high-dimensional data regression tasks. In part, the popularity of these models arises from the fact that they require little hyperparameter tuning and are not very susceptible to overfitting. Random forest regression models are comprised of an ensemble of decision trees that independently predict the value of a (continuous) dependent variable; predictions from each of the trees are ultimately averaged to yield an overall predicted value from the forest. Using a suite of representative real-world datasets, we find a systematic bias in predictions from random forest models. We find that this bias is recapitulated in simple synthetic datasets, regardless of whether or not they include irreducible error (noise) in the data, but that models employing boosting do not exhibit this bias. Here we demonstrate the basis for this problem, and we use the training data to define a numerical transformation that fully corrects it. Application of this transformation yields improved predictions in every one of the real-world and synthetic datasets evaluated in our study.
Published 2020-03-16
URL https://arxiv.org/abs/2003.07445v1
PDF https://arxiv.org/pdf/2003.07445v1.pdf
PWC https://paperswithcode.com/paper/a-numerical-transform-of-random-forest

Improving Yorùbá Diacritic Restoration

Title Improving Yorùbá Diacritic Restoration
Authors Iroro Orife, David I. Adelani, Timi Fasubaa, Victor Williamson, Wuraola Fisayo Oyewusi, Olamilekan Wahab, Kola Tubosun
Abstract Yorùbá is a widely spoken West African language with a writing system rich in orthographic and tonal diacritics. They provide morphological information, are crucial for lexical disambiguation and pronunciation, and are vital for any computational Speech or Natural Language Processing tasks. However, diacritic marks are commonly excluded from electronic texts due to limited device and application support as well as general education on proper usage. We report on recent efforts at dataset cultivation. By aggregating and improving disparate texts from the web and various personal libraries, we were able to significantly grow our clean Yorùbá dataset from a majority Biblical text corpus with three sources to millions of tokens from over a dozen sources. We evaluate updated diacritic restoration models on a new, general purpose, public-domain Yorùbá evaluation dataset of modern journalistic news text, selected to be multi-purpose and reflecting contemporary usage. All pre-trained models, datasets and source-code have been released as an open-source project to advance efforts on Yorùbá language technology.
Published 2020-03-23
URL https://arxiv.org/abs/2003.10564v1
PDF https://arxiv.org/pdf/2003.10564v1.pdf
PWC https://paperswithcode.com/paper/improving-yoruba-diacritic-restoration

Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks

Title Restoration of Fragmentary Babylonian Texts Using Recurrent Neural Networks
Authors Ethan Fetaya, Yonatan Lifshitz, Elad Aaron, Shai Gordin
Abstract The main sources of information regarding ancient Mesopotamian history and culture are clay cuneiform tablets. Despite being an invaluable resource, many tablets are fragmented, leading to missing information. Currently these missing parts are manually completed by experts. In this work we investigate the possibility of assisting scholars and even automatically completing the breaks in ancient Akkadian texts from Achaemenid period Babylonia by modelling the language using recurrent neural networks.
Published 2020-03-04
URL https://arxiv.org/abs/2003.01912v1
PDF https://arxiv.org/pdf/2003.01912v1.pdf
PWC https://paperswithcode.com/paper/restoration-of-fragmentary-babylonian-texts

Active and Incremental Learning with Weak Supervision

Title Active and Incremental Learning with Weak Supervision
Authors Clemens-Alexander Brust, Christoph Käding, Joachim Denzler
Abstract Large amounts of labeled training data are one of the main contributors to the great success that deep models have achieved in the past. Label acquisition for tasks other than benchmarks can pose a challenge due to requirements of both funding and expertise. By selecting unlabeled examples that are promising in terms of model improvement and only asking for respective labels, active learning can increase the efficiency of the labeling process in terms of time and cost. In this work, we describe combinations of an incremental learning scheme and methods of active learning. These allow for continuous exploration of newly observed unlabeled data. We describe selection criteria based on model uncertainty as well as expected model output change (EMOC). An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset. We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application where images from camera traps are analyzed. Labeling only 32 images by accepting or rejecting proposals generated by our method yields an increase in accuracy from 25.4% to 42.6%.
Tasks Active Learning, Object Detection
Published 2020-01-20
URL https://arxiv.org/abs/2001.07100v1
PDF https://arxiv.org/pdf/2001.07100v1.pdf
PWC https://paperswithcode.com/paper/active-and-incremental-learning-with-weak
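
The abstract above mentions selection criteria based on model uncertainty. As a generic illustration only, the sketch below ranks unlabeled examples by the entropy of their predicted class probabilities and picks the most uncertain ones. The function names, the entropy criterion, and the toy probabilities are assumptions made here; the paper's own criteria (including EMOC) are not reproduced.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` pool examples with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Hypothetical model outputs for three unlabeled examples.
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform, most uncertain
    [0.70, 0.20, 0.10],  # moderately uncertain
]
chosen = select_most_uncertain(pool, budget=2)
print(chosen)
```

In an incremental setting, the selected examples would be labeled, added to the training set, and the model updated before the next selection round.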

Indoor Scene Recognition in 3D

Title Indoor Scene Recognition in 3D
Authors Shengyu Huang, Mikhail Usvyatsov, Konrad Schindler
Abstract Recognising in what type of environment one is located is an important perception task. For instance, for a robot operating indoors it is helpful to be aware of whether it is in a kitchen, a hallway or a bedroom. Existing approaches attempt to classify the scene based on 2D images or 2.5D range images. Here, we study scene recognition from 3D point cloud (or voxel) data, and show that it greatly outperforms methods based on 2D birds-eye views. Moreover, we advocate multi-task learning as a way of improving scene recognition, building on the fact that the scene type is highly correlated with the objects in the scene, and therefore with its semantic segmentation into different object classes. In a series of ablation studies, we show that successful scene recognition is not just the recognition of individual objects unique to some scene type (such as a bathtub), but depends on several different cues, including coarse 3D geometry, colour, and the (implicit) distribution of object categories. Moreover, we demonstrate that surprisingly sparse 3D data is sufficient to classify indoor scenes with good accuracy.
Tasks Multi-Task Learning, Scene Recognition, Semantic Segmentation
Published 2020-02-28
URL https://arxiv.org/abs/2002.12819v1
PDF https://arxiv.org/pdf/2002.12819v1.pdf
PWC https://paperswithcode.com/paper/indoor-scene-recognition-in-3d

Angle-Based Cost-Sensitive Multicategory Classification

Title Angle-Based Cost-Sensitive Multicategory Classification
Authors Yi Yang, Yuxuan Guo, Xiangyu Chang
Abstract Many real-world classification problems come with costs which can vary for different types of misclassification. It is thus important to develop cost-sensitive classifiers which minimize the total misclassification cost. Although binary cost-sensitive classifiers have been well-studied, solving multicategory classification problems is still challenging. A popular approach to address this issue is to construct K classification functions for a K-class problem and remove the redundancy by imposing a sum-to-zero constraint. However, such a method usually results in higher computational complexity and inefficient algorithms. In this paper, we propose a novel angle-based cost-sensitive classification framework for multicategory classification without the sum-to-zero constraint. Loss functions included in the angle-based cost-sensitive classification framework are further shown to be Fisher consistent. To show the usefulness of the framework, two cost-sensitive multicategory boosting algorithms are derived as concrete instances. Numerical experiments demonstrate that the proposed boosting algorithms yield competitive classification performance against other existing boosting approaches.
Published 2020-03-08
URL https://arxiv.org/abs/2003.03691v1
PDF https://arxiv.org/pdf/2003.03691v1.pdf
PWC https://paperswithcode.com/paper/angle-based-cost-sensitive-multicategory

Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time

Title Time-varying Gaussian Process Bandit Optimization with Non-constant Evaluation Time
Authors Hideaki Imamura, Nontawat Charoenphakdee, Futoshi Futami, Issei Sato, Junya Honda, Masashi Sugiyama
Abstract The Gaussian process bandit is a problem in which we want to find a maximizer of a black-box function with the minimum number of function evaluations. If the black-box function varies with time, then time-varying Bayesian optimization is a promising framework. However, a drawback of current methods is the assumption that the evaluation time for every observation is constant, which can be unrealistic for many practical applications, e.g., recommender systems and environmental monitoring. As a result, the performance of current methods can be degraded when this assumption is violated. To cope with this problem, we propose a novel time-varying Bayesian optimization algorithm that can effectively handle the non-constant evaluation time. Furthermore, we theoretically establish a regret bound for our algorithm. Our bound elucidates that the pattern of the evaluation time sequence can hugely affect the difficulty of the problem. We also provide experimental results to validate the practical effectiveness of the proposed method.
Tasks Recommendation Systems
Published 2020-03-10
URL https://arxiv.org/abs/2003.04691v2
PDF https://arxiv.org/pdf/2003.04691v2.pdf
PWC https://paperswithcode.com/paper/time-varying-gaussian-process-bandit-1

Explanation-Based Tuning of Opaque Machine Learners with Application to Paper Recommendation

Title Explanation-Based Tuning of Opaque Machine Learners with Application to Paper Recommendation
Authors Benjamin Charles Germain Lee, Kyle Lo, Doug Downey, Daniel S. Weld
Abstract Research in human-centered AI has shown the benefits of machine-learning systems that can explain their predictions. Methods that allow users to tune a model in response to the explanations are similarly useful. While both capabilities are well-developed for transparent learning models (e.g., linear models and GA2Ms), and recent techniques (e.g., LIME and SHAP) can generate explanations for opaque models, no method currently exists for tuning of opaque models in response to explanations. This paper introduces LIMEADE, a general framework for tuning an arbitrary machine learning model based on an explanation of the model’s prediction. We apply our framework to Semantic Sanity, a neural recommender system for scientific papers, and report on a detailed user study, showing that our framework leads to significantly higher perceived user control, trust, and satisfaction.
Tasks Recommendation Systems
Published 2020-03-09
URL https://arxiv.org/abs/2003.04315v1
PDF https://arxiv.org/pdf/2003.04315v1.pdf
PWC https://paperswithcode.com/paper/explanation-based-tuning-of-opaque-machine

Regret analysis of the Piyavskii-Shubert algorithm for global Lipschitz optimization

Title Regret analysis of the Piyavskii-Shubert algorithm for global Lipschitz optimization
Authors Clément Bouttier, Tommaso Cesari, Sébastien Gerchinovitz
Abstract We consider the problem of maximizing a non-concave Lipschitz multivariate function f over a compact domain. We provide regret guarantees (i.e., optimization error bounds) for a very natural algorithm originally designed by Piyavskii and Shubert in 1972. Our results hold in a general setting in which values of f can only be accessed approximately. In particular, they yield state-of-the-art regret bounds both when f is observed exactly and when evaluations are perturbed by an independent subgaussian noise.
Published 2020-02-06
URL https://arxiv.org/abs/2002.02390v1
PDF https://arxiv.org/pdf/2002.02390v1.pdf
PWC https://paperswithcode.com/paper/regret-analysis-of-the-piyavskii-shubert
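
For readers unfamiliar with the algorithm named above, the following is a minimal one-dimensional sketch of the Piyavskii-Shubert idea with exact evaluations: keep a piecewise-linear upper bound built from the Lipschitz constant and always query where that bound is highest. The paper treats the multivariate and noisy settings; the 1D restriction, the test function, the interval, and the Lipschitz constant 4.0 are assumptions chosen here for illustration.

```python
import bisect
import math

def piyavskii_shubert(f, a, b, lipschitz, n_evals):
    """Maximize an L-Lipschitz function f on [a, b] using n_evals evaluations.

    Between neighbouring evaluated points (x_l, f_l) and (x_r, f_r), the upper
    bound min(f_l + L*(x - x_l), f_r + L*(x_r - x)) peaks where the two lines
    cross; the next query is the highest such peak over all intervals.
    """
    xs = [a, b]
    fs = [f(a), f(b)]
    for _ in range(n_evals - 2):
        best_bound, best_x = -math.inf, None
        for i in range(len(xs) - 1):
            x_l, x_r, f_l, f_r = xs[i], xs[i + 1], fs[i], fs[i + 1]
            peak_x = 0.5 * (x_l + x_r) + (f_r - f_l) / (2.0 * lipschitz)
            peak_val = 0.5 * (f_l + f_r) + 0.5 * lipschitz * (x_r - x_l)
            if peak_val > best_bound:
                best_bound, best_x = peak_val, peak_x
        # Insert the new evaluation while keeping xs sorted.
        idx = bisect.bisect(xs, best_x)
        xs.insert(idx, best_x)
        fs.insert(idx, f(best_x))
    k = max(range(len(xs)), key=fs.__getitem__)
    return xs[k], fs[k]

def toy_objective(x):
    # Non-concave on [0, pi]; |f'(x)| = |cos(x) + 3*cos(3x)| <= 4.
    return math.sin(x) + math.sin(3.0 * x)

best_x, best_f = piyavskii_shubert(toy_objective, 0.0, math.pi, 4.0, 200)
print(best_x, best_f)
```

The regret studied in the paper corresponds to the gap between the true maximum and `best_f`; overestimating the Lipschitz constant keeps the bound valid but makes the search less selective.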

Multi-object Tracking via End-to-end Tracklet Searching and Ranking

Title Multi-object Tracking via End-to-end Tracklet Searching and Ranking
Authors Tao Hu, Lichao Huang, Han Shen
Abstract Recent works in multiple object tracking use sequence models to calculate the similarity score between the detections and the previous tracklets. However, the forced exposure to ground-truth in the training stage leads to the training-inference discrepancy problem, i.e., exposure bias, where association errors can accumulate during inference and make the trajectories drift. In this paper, we propose a novel method for optimizing tracklet consistency, which directly takes the prediction errors into account by introducing an online, end-to-end tracklet search training process. Notably, our methods directly optimize the whole tracklet score instead of pairwise affinity. With sequence models as appearance encoders of tracklets, our tracker achieves a remarkable performance gain over a conventional tracklet association baseline. Our methods have also achieved state-of-the-art results on the MOT15-17 challenge benchmarks using public detection and online settings.
Tasks Multi-Object Tracking, Multiple Object Tracking, Object Tracking
Published 2020-03-04
URL https://arxiv.org/abs/2003.02795v1
PDF https://arxiv.org/pdf/2003.02795v1.pdf
PWC https://paperswithcode.com/paper/multi-object-tracking-via-end-to-end-tracklet