January 26, 2020

3188 words 15 mins read

Paper Group ANR 1471

Replicated Vector Approximate Message Passing For Resampling Problem

Title Replicated Vector Approximate Message Passing For Resampling Problem
Authors Takashi Takahashi, Yoshiyuki Kabashima
Abstract Resampling techniques are widely used in statistical inference and ensemble learning, in which estimators’ statistical properties are essential. However, existing methods are computationally demanding, because repetitions of estimation/learning via numerical optimization/integration for each resampled dataset are required. In this study, we introduce a computationally efficient method to resolve this problem: replicated vector approximate message passing. It is based on a combination of the replica method of statistical physics and an accurate approximate inference algorithm, namely the vector approximate message passing of information theory. The method provides tractable densities without repeating estimation/learning, and from these densities the estimators’ moments of arbitrary degree can be approximated in practical time. In the experiment, we apply the proposed method to stability selection, which is commonly used in variable selection problems. The numerical results show its fast convergence and high approximation accuracy for problems involving both synthetic and real-world datasets.
Tasks
Published 2019-05-23
URL https://arxiv.org/abs/1905.09545v1
PDF https://arxiv.org/pdf/1905.09545v1.pdf
PWC https://paperswithcode.com/paper/replicated-vector-approximate-message-passing
Repo
Framework
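The stability-selection experiment in this paper is exactly the kind of workload rVAMP avoids: naively, one must resample the data many times and rerun the selector on every resample, counting how often each variable is selected. A minimal sketch of that naive baseline, using a simple correlation-based screener as a stand-in for the Lasso-type estimator (the selector, data, and all names here are illustrative, not from the paper):

```python
import numpy as np

def select_via_correlation(X, y, k):
    # Stand-in base selector: pick the k features most correlated with y.
    scores = np.abs(X.T @ y)
    return set(np.argsort(scores)[-k:])

def stability_selection(X, y, k=3, n_resamples=200, seed=0):
    """Naive stability selection: repeatedly subsample half the rows,
    rerun the selector, and record each feature's selection frequency."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_resamples):
        idx = rng.choice(n, size=n // 2, replace=False)
        for j in select_via_correlation(X[idx], y[idx], k):
            counts[j] += 1
    return counts / n_resamples

# Toy data: only the first 2 of 10 features influence y.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(200)
freq = stability_selection(X, y)
```

Under rVAMP, these per-variable selection probabilities would instead be read off approximate densities computed once, without the resampling loop.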

Efficient Fair Principal Component Analysis

Title Efficient Fair Principal Component Analysis
Authors Mohammad Mahdi Kamani, Farzin Haddadpour, Rana Forsati, Mehrdad Mahdavi
Abstract It has been shown that dimension reduction methods such as PCA may be inherently prone to unfairness, treating data from different sensitive groups defined by race, color, sex, etc. unfairly. In pursuit of fairness-enhancing dimensionality reduction, using the notion of Pareto optimality we propose an adaptive first-order algorithm to learn a subspace that preserves fairness while slightly compromising the reconstruction loss. Theoretically, we provide sufficient conditions under which the solution of the proposed algorithm belongs to the Pareto frontier for all sensitive groups; thereby, the optimal trade-off between overall reconstruction loss and fairness constraints is guaranteed. We also provide the convergence analysis of our algorithm and show its efficacy through empirical studies on different datasets, which demonstrate superior performance in comparison with state-of-the-art algorithms. The proposed fairness-aware PCA algorithm can be efficiently generalized to sensitive features with multiple groups and effectively reduces unfair decisions in downstream tasks such as classification.
Tasks Dimensionality Reduction
Published 2019-11-12
URL https://arxiv.org/abs/1911.04931v2
PDF https://arxiv.org/pdf/1911.04931v2.pdf
PWC https://paperswithcode.com/paper/efficient-fair-principal-component-analysis
Repo
Framework
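The paper's adaptive first-order algorithm targets the Pareto frontier between per-group reconstruction losses directly; a crude way to see the same trade-off is a scalarization sweep, where each trade-off weight yields an ordinary eigendecomposition of a weighted mixture of group covariances (the data, group structure, and names below are illustrative only):

```python
import numpy as np

def recon_loss(X, W):
    # Mean squared reconstruction error after projecting onto span(W).
    R = X - (X @ W) @ W.T
    return float((R ** 2).sum() / len(X))

def weighted_pca(cov_a, cov_b, alpha, d):
    # Top-d eigenvectors of the alpha-weighted covariance mixture.
    vals, vecs = np.linalg.eigh(alpha * cov_a + (1 - alpha) * cov_b)
    return vecs[:, -d:]

rng = np.random.default_rng(0)
# Two groups whose dominant directions differ.
A = rng.standard_normal((300, 5)) * np.array([3, 1, 1, 1, 1])
B = rng.standard_normal((300, 5)) * np.array([1, 3, 1, 1, 1])
cov_a, cov_b = A.T @ A / len(A), B.T @ B / len(B)

# Sweep the trade-off weight to trace (loss on A, loss on B) pairs.
frontier = []
for alpha in np.linspace(0, 1, 11):
    W = weighted_pca(cov_a, cov_b, alpha, d=1)
    frontier.append((recon_loss(A, W), recon_loss(B, W)))
```

No single weight minimizes both group losses, which is the Pareto tension the paper's algorithm navigates adaptively rather than by grid search.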

Context-aware Human Motion Prediction

Title Context-aware Human Motion Prediction
Authors Enric Corona, Albert Pumarola, Guillem Alenyà, Francesc Moreno-Noguer
Abstract The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. Current state-of-the-art approaches formulate this problem as a sequence-to-sequence task, in which a history of 3D skeletons feeds a Recurrent Neural Network (RNN) that predicts future movements, typically on the order of 1 to 2 seconds. However, one aspect that has been overlooked so far is the fact that human motion is inherently driven by interactions with objects and/or other humans in the environment. In this paper, we explore this scenario using a novel context-aware motion prediction architecture. We use a semantic-graph model where the nodes parameterize the human and objects in the scene and the edges their mutual interactions. These interactions are iteratively learned through a graph attention layer, fed with the past observations, which now include both object and human body motions. Once this semantic graph is learned, we inject it into a standard RNN to predict future movements of the human/s and object/s. We consider two variants of our architecture, either freezing the contextual interactions in the future or updating them. A thorough evaluation on the “Whole-Body Human Motion Database” shows that in both cases, our context-aware networks clearly outperform baselines in which the context information is not considered.
Tasks motion prediction
Published 2019-04-06
URL https://arxiv.org/abs/1904.03419v3
PDF https://arxiv.org/pdf/1904.03419v3.pdf
PWC https://paperswithcode.com/paper/context-aware-human-motion-prediction
Repo
Framework
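The interaction-learning step can be pictured with a generic single-head graph-attention layer over a fully connected semantic graph; this is a textbook GAT-style head, not the paper's architecture, and all shapes and names are illustrative:

```python
import numpy as np

def graph_attention(H, W, a):
    """One attention head over a fully connected graph of nodes.
    H: node features (n, d); W: projection (d, d'); a: attention
    vector (2*d',). Returns attention-weighted neighbour aggregates."""
    Z = H @ W
    n = Z.shape[0]
    # Pairwise attention logits e_ij = leaky_relu(a . [z_i, z_j]).
    pairs = np.concatenate(
        [np.repeat(Z, n, axis=0), np.tile(Z, (n, 1))], axis=1)
    e = pairs @ a
    e = np.where(e > 0, e, 0.2 * e).reshape(n, n)
    # Softmax over each node's neighbours.
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha @ Z

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 6))   # e.g. one human node + three object nodes
out = graph_attention(H, rng.standard_normal((6, 8)),
                      rng.standard_normal(16))
```

In the paper's pipeline, the aggregated node representations (human and object motions) would then feed the RNN decoder.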

The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA

Title The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA
Authors Luigi Gresele, Paul K. Rubenstein, Arash Mehrjou, Francesco Locatello, Bernhard Schölkopf
Abstract We consider the problem of recovering a common latent source with independent components from multiple views. This applies to settings in which a variable is measured with multiple experimental modalities, and where the goal is to synthesize the disparate measurements into a single unified representation. We consider the case that the observed views are a nonlinear mixing of component-wise corruptions of the sources. When the views are considered separately, this reduces to nonlinear Independent Component Analysis (ICA) for which it is provably impossible to undo the mixing. We present novel identifiability proofs that this is possible when the multiple views are considered jointly, showing that the mixing can theoretically be undone using function approximators such as deep neural networks. In contrast to known identifiability results for nonlinear ICA, we prove that independent latent sources with arbitrary mixing can be recovered as long as multiple, sufficiently different noisy views are available.
Tasks
Published 2019-05-16
URL https://arxiv.org/abs/1905.06642v2
PDF https://arxiv.org/pdf/1905.06642v2.pdf
PWC https://paperswithcode.com/paper/the-incomplete-rosetta-stone-problem
Repo
Framework
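The generative setting is easy to state in code: each view is an invertible nonlinear mixing of a component-wise corrupted copy of the same independent sources, and the views must be sufficiently different. A toy data generator under that model (the mixings, noise level, and dimensions are arbitrary choices, not from the paper):

```python
import numpy as np

def make_view(sources, A, noise_scale, rng):
    # Component-wise corruption followed by an invertible nonlinear mixing.
    corrupted = sources + noise_scale * rng.standard_normal(sources.shape)
    return np.tanh(corrupted @ A)

rng = np.random.default_rng(0)
s = rng.laplace(size=(1000, 2))            # independent non-Gaussian sources
A1 = np.array([[1.0, 0.6], [0.4, 1.0]])    # two different mixing maps,
A2 = np.array([[1.0, -0.5], [0.7, 1.0]])   # i.e. two "sufficiently different" views
x1 = make_view(s, A1, 0.3, rng)
x2 = make_view(s, A2, 0.3, rng)
```

The identifiability results say that, given enough such views jointly, the map back to `s` can in principle be undone, even though each single view alone is a provably unidentifiable nonlinear ICA problem.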

Online Music Listening Culture of Kids and Adolescents: Listening Analysis and Music Recommendation Tailored to the Young

Title Online Music Listening Culture of Kids and Adolescents: Listening Analysis and Music Recommendation Tailored to the Young
Authors Markus Schedl, Christine Bauer
Abstract In this paper, we analyze a large dataset of user-generated music listening events from Last.fm, focusing on users aged 6 to 18 years. Our contribution is two-fold. First, we study the music genre preferences of this young user group and analyze these preferences for homogeneity within more fine-grained age groups and with respect to gender and countries. Second, we investigate the performance of a collaborative filtering recommender when tailoring music recommendations to different age groups. We find that doing so improves performance for all user groups up to 18 years, but decreases performance for adult users aged 19 years and older.
Tasks
Published 2019-12-24
URL https://arxiv.org/abs/1912.11564v1
PDF https://arxiv.org/pdf/1912.11564v1.pdf
PWC https://paperswithcode.com/paper/online-music-listening-culture-of-kids-and
Repo
Framework

NPSA: Nonorthogonal Principal Skewness Analysis

Title NPSA: Nonorthogonal Principal Skewness Analysis
Authors Xiurui Geng, Lei Wang
Abstract Principal skewness analysis (PSA) has been introduced for feature extraction in hyperspectral imagery. As a third-order generalization of principal component analysis (PCA), its solution of searching for the locally maximum skewness direction is transformed into the problem of calculating the eigenpairs (the eigenvalues and the corresponding eigenvectors) of a coskewness tensor. By combining a fixed-point method with an orthogonal constraint, it can prevent the new eigenpairs from converging to maxima that have already been determined. However, the eigenvectors of the supersymmetric tensor are not inherently orthogonal in general, which implies that the results obtained by the search strategy used in PSA may unavoidably deviate from the actual eigenpairs. In this paper, we propose a new nonorthogonal search strategy to solve this problem, and the new algorithm is named nonorthogonal principal skewness analysis (NPSA). The contribution of NPSA lies in the finding that the search space of the eigenvector to be determined can be enlarged by using the orthogonal complement of the Kronecker product of the previous one, instead of its orthogonal complement space. We give a detailed theoretical proof to illustrate why the new strategy can result in more accurate eigenpairs. In addition, after some algebraic derivations, the complexity of the presented algorithm is also greatly reduced. Experiments with both simulated data and real multi/hyperspectral imagery demonstrate its validity in feature extraction.
Tasks
Published 2019-07-23
URL https://arxiv.org/abs/1907.09811v1
PDF https://arxiv.org/pdf/1907.09811v1.pdf
PWC https://paperswithcode.com/paper/npsa-nonorthogonal-principal-skewness
Repo
Framework
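For context, the PSA baseline that NPSA refines finds a skewness-maximizing direction via a fixed-point iteration on the coskewness tensor. A single-direction sketch, assuming already-whitened data (the deflation step with the orthogonal constraint, which NPSA replaces, is omitted; all names are illustrative):

```python
import numpy as np

def coskewness_tensor(X):
    # S_ijk = E[x_i x_j x_k], estimated from (assumed whitened) samples.
    return np.einsum('ni,nj,nk->ijk', X, X, X) / len(X)

def principal_skewness_direction(S, n_iter=200, seed=0):
    """Fixed-point iteration w <- normalize(S x2 w x3 w), converging to
    a locally maximal-skewness direction (a tensor eigenvector)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(S.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        w_new = np.einsum('ijk,j,k->i', S, w, w)
        norm = np.linalg.norm(w_new)
        if norm < 1e-12:
            break
        w_new /= norm
        if np.allclose(w_new, w):
            break
        w = w_new
    return w

# Toy data: one strongly skewed coordinate among symmetric ones.
rng = np.random.default_rng(1)
skewed = rng.exponential(1.0, 20000) - 1.0     # skewness ~ 2
X = np.column_stack([skewed, rng.standard_normal((20000, 2))])
X /= X.std(axis=0)
w = principal_skewness_direction(coskewness_tensor(X))
```

NPSA's point is about what happens *after* this first direction: the next eigenvector need not be orthogonal to `w`, so the search space for it should not be restricted to `w`'s orthogonal complement.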

‘In-Between’ Uncertainty in Bayesian Neural Networks

Title ‘In-Between’ Uncertainty in Bayesian Neural Networks
Authors Andrew Y. K. Foong, Yingzhen Li, José Miguel Hernández-Lobato, Richard E. Turner
Abstract We describe a limitation in the expressiveness of the predictive uncertainty estimate given by mean-field variational inference (MFVI), a popular approximate inference method for Bayesian neural networks. In particular, MFVI fails to give calibrated uncertainty estimates in between separated regions of observations. This can lead to catastrophically overconfident predictions when testing on out-of-distribution data. Avoiding such overconfidence is critical for active learning, Bayesian optimisation and out-of-distribution robustness. We instead find that a classical technique, the linearised Laplace approximation, can handle ‘in-between’ uncertainty much better for small network architectures.
Tasks Active Learning, Bayesian Optimisation
Published 2019-06-27
URL https://arxiv.org/abs/1906.11537v1
PDF https://arxiv.org/pdf/1906.11537v1.pdf
PWC https://paperswithcode.com/paper/in-between-uncertainty-in-bayesian-neural
Repo
Framework

The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention

Title The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention
Authors Yusuke Yamaura, Nobuya Kanemaki, Yukihiro Tsuboshita
Abstract The resale price assessment of secondhand jewelry items relies heavily on the individual knowledge and skill of domain experts. In this paper, we propose a methodology for constructing an AI system that autonomously assesses the resale prices of secondhand jewelry items without the need for professional knowledge. As shown in recent studies on fashion items, multimodal approaches combining specifications and visual information of items have succeeded in obtaining fine-grained representations of fashion items, although they generally apply only simple vector operations in the multimodal fusion. We similarly build a multimodal model using images and attributes of the product and further employ state-of-the-art multimodal deep neural networks applied in computer vision to achieve a practical performance level. In addition, we model the pricing procedure of an expert using iterative co-attention networks in which the appearance and attributes of the product are carefully and iteratively observed. Herein, we demonstrate the effectiveness of our model using a large dataset of secondhand no-brand jewelry items received from a collaborating fashion retailer, and show that the iterative co-attention process operates effectively in the context of resale price prediction. Our model architecture is widely applicable to other fashion items where appearance and specifications are important aspects.
Tasks
Published 2019-07-01
URL https://arxiv.org/abs/1907.00661v1
PDF https://arxiv.org/pdf/1907.00661v1.pdf
PWC https://paperswithcode.com/paper/the-resale-price-prediction-of-secondhand
Repo
Framework

Pretrained Language Models for Sequential Sentence Classification

Title Pretrained Language Models for Sequential Sentence Classification
Authors Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
Abstract As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in the context of the document. Recent successful models for this task have used hierarchical models to contextualize sentence representations, and Conditional Random Fields (CRFs) to incorporate dependencies between subsequent labels. In this work, we show that pretrained language models, BERT (Devlin et al., 2018) in particular, can be used for this task to capture contextual dependencies without the need for hierarchical encoding or a CRF. Specifically, we construct a joint sentence representation that allows BERT Transformer layers to directly utilize contextual information from all words in all sentences. Our approach achieves state-of-the-art results on four datasets, including a new dataset of structured scientific abstracts.
Tasks Sentence Classification
Published 2019-09-09
URL https://arxiv.org/abs/1909.04054v2
PDF https://arxiv.org/pdf/1909.04054v2.pdf
PWC https://paperswithcode.com/paper/pretrained-language-models-for-sequential
Repo
Framework
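The joint representation can be sketched as packing all sentences into one token stream with a delimiter per sentence, then classifying each sentence from the contextual embedding of its delimiter token. A toy whitespace tokenizer stands in for BERT's WordPiece vocabulary here; everything in this fragment is illustrative:

```python
def build_joint_input(sentences, vocab, cls="[CLS]", sep="[SEP]"):
    """Pack sentences into one token stream so a Transformer can attend
    across sentence boundaries. Returns token ids plus the index of each
    sentence's [SEP] delimiter, whose contextual embedding would feed
    the per-sentence classifier."""
    tokens = [cls]
    sep_positions = []
    for sent in sentences:
        tokens.extend(sent.lower().split())
        tokens.append(sep)
        sep_positions.append(len(tokens) - 1)
    # Assign ids from a growing toy vocabulary.
    ids = [vocab.setdefault(t, len(vocab)) for t in tokens]
    return ids, sep_positions

vocab = {"[PAD]": 0, "[CLS]": 1, "[SEP]": 2}
ids, seps = build_joint_input(
    ["We propose a model .", "Results improve the state of the art ."],
    vocab,
)
```

Because every token attends to every other token in the packed sequence, each sentence's delimiter embedding is contextualized by the whole document segment, which is what removes the need for a separate hierarchical encoder.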

2DR1-PCA and 2DL1-PCA: two variant 2DPCA algorithms based on none L2 norm

Title 2DR1-PCA and 2DL1-PCA: two variant 2DPCA algorithms based on none L2 norm
Authors Xing Liu, Xiao-Jun Wu, Zi-Qi Li
Abstract In this paper, two novel methods, 2DR1-PCA and 2DL1-PCA, are proposed for face recognition. Compared to the traditional 2DPCA algorithm, 2DR1-PCA and 2DL1-PCA are based on the R1 norm and L1 norm, respectively. The advantage of these proposed methods is that they are less sensitive to outliers. The proposed methods are tested on the ORL, YALE and XM2VTS databases and the performance of the related methods is compared experimentally.
Tasks Face Recognition
Published 2019-12-23
URL https://arxiv.org/abs/1912.10768v1
PDF https://arxiv.org/pdf/1912.10768v1.pdf
PWC https://paperswithcode.com/paper/2dr1-pca-and-2dl1-pca-two-variant-2dpca
Repo
Framework
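To see why an L1 criterion resists outliers, consider the classic one-dimensional L1-PCA fixed-point iteration, in which every sample contributes only its sign to the update; 2DL1-PCA adapts this idea to image matrices, so this vector-space sketch is not the paper's 2D algorithm (data and names are illustrative):

```python
import numpy as np

def l1_pca_direction(X, n_iter=100, seed=0):
    """Fixed-point iteration maximizing sum_i |w . x_i| (L1 dispersion):
    w <- normalize(sum_i sign(w . x_i) * x_i). Each sample is weighted
    only by its sign, so one large outlier cannot dominate the update."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = np.sign(X @ w)
        s[s == 0] = 1.0
        w_new = X.T @ s
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):
            break
        w = w_new
    return w

# Toy data elongated along the first axis.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3)) * np.array([5.0, 1.0, 1.0])
w = l1_pca_direction(X)
```

In L2-based PCA each sample instead contributes proportionally to its magnitude, which is exactly how outliers skew the principal directions.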

An adaptive and fully automatic method for estimating the 3D position of bendable instruments using endoscopic images

Title An adaptive and fully automatic method for estimating the 3D position of bendable instruments using endoscopic images
Authors Paolo Cabras, Florent Nageotte, Philippe Zanne, Christophe Doignon
Abstract Background. Flexible bendable instruments are key tools for performing surgical endoscopy. Being able to measure the 3D position of such instruments can be useful for various tasks, such as automatically controlling robotized instruments and analyzing motions. Methods. We propose an automatic method to infer the 3D pose of a single bending section instrument, using only the images provided by a monocular camera embedded at the tip of the endoscope. The proposed method relies on colored markers attached onto the bending section. The image of the instrument is segmented using a graph-based method and the corners of the markers are extracted by detecting the color transition along Bézier curves fitted on edge points. These features are accurately located and then used to estimate the 3D pose of the instrument using an adaptive model that takes into account the mechanical play between the instrument and its housing channel. Results. The feature extraction method provides good localization of marker corners in images of the in vivo environment despite sensor saturation due to strong lighting. The RMS error on the estimation of the tip position of the instrument for laboratory experiments was 2.1, 1.96, 3.18 mm in the x, y and z directions respectively. Qualitative analysis in the case of in vivo images shows the ability to correctly estimate the 3D position of the instrument tip during real motions. Conclusions. The proposed method provides an automatic and accurate estimation of the 3D position of the tip of a bendable instrument in realistic conditions, where standard approaches fail.
Tasks
Published 2019-11-29
URL https://arxiv.org/abs/1911.13125v1
PDF https://arxiv.org/pdf/1911.13125v1.pdf
PWC https://paperswithcode.com/paper/an-adaptive-and-fully-automatic-method-for
Repo
Framework

A Deep Learning Framework for Pricing Financial Instruments

Title A Deep Learning Framework for Pricing Financial Instruments
Authors Qiong Wu, Zheng Zhang, Andrea Pizzoferrato, Mihai Cucuringu, Zhenming Liu
Abstract We propose an integrated deep learning architecture for stock movement prediction. Our architecture simultaneously leverages all available alpha sources, including technical signals, financial news signals, and cross-sectional signals. Our architecture possesses three main properties. First, it eludes overfitting: although it consumes a large number of technical signals, it has better generalization properties than linear models. Second, it effectively captures the interactions between signals from different categories. Third, it has low computation cost: we design a graph-based component that extracts cross-sectional interactions, circumventing the SVD computation needed in standard models. Experimental results on the real-world stock market show that our approach outperforms the existing baselines. Meanwhile, the results from different trading simulators demonstrate that we can effectively monetize the signals.
Tasks
Published 2019-09-07
URL https://arxiv.org/abs/1909.04497v1
PDF https://arxiv.org/pdf/1909.04497v1.pdf
PWC https://paperswithcode.com/paper/a-deep-learning-framework-for-pricing
Repo
Framework

Development of Clinical Concept Extraction Applications: A Methodology Review

Title Development of Clinical Concept Extraction Applications: A Methodology Review
Authors Sunyang Fu, David Chen, Huan He, Sijia Liu, Sungrim Moon, Kevin J Peterson, Feichen Shen, Liwei Wang, Yanshan Wang, Andrew Wen, Yiqing Zhao, Sunghwan Sohn, Hongfang Liu
Abstract Background Concept extraction, a subdomain of natural language processing (NLP) with a focus on extracting concepts of interest, has been adopted to computationally extract clinical information from text for a wide range of applications ranging from clinical decision support to care quality improvement. Objectives In this literature review, we provide a methodology review of clinical concept extraction, aiming to catalog development processes, available methods and tools, and specific considerations when developing clinical concept extraction applications. Methods Based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a literature search was conducted for retrieving EHR-based information extraction articles written in English and published from January 2009 through June 2019 from Ovid MEDLINE In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Scopus, Web of Science, and the ACM Digital Library. Results A total of 6,555 publications were retrieved. After title and abstract screening, 224 publications were selected. The methods used for developing clinical concept extraction applications were discussed in this review.
Tasks Clinical Concept Extraction, Decision Making
Published 2019-10-24
URL https://arxiv.org/abs/1910.11377v3
PDF https://arxiv.org/pdf/1910.11377v3.pdf
PWC https://paperswithcode.com/paper/a-review-of-the-end-to-end-methodologies-for
Repo
Framework

Language Modeling with Deep Transformers

Title Language Modeling with Deep Transformers
Authors Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
Abstract We explore deep autoregressive Transformer models in language modeling for speech recognition. We focus on two aspects. First, we revisit Transformer model configurations specifically for language modeling. We show that well-configured Transformer models outperform our baseline models based on a shallow stack of LSTM recurrent neural network layers. We carry out experiments on the open-source LibriSpeech 960hr task, for both 200K vocabulary word-level and 10K byte-pair encoding subword-level language modeling. We apply our word-level models to conventional hybrid speech recognition by lattice rescoring, and the subword-level models to attention-based encoder-decoder models by shallow fusion. Second, we show that deep Transformer language models do not require positional encoding. Positional encoding is an essential augmentation for the self-attention mechanism, which is otherwise invariant to sequence ordering. However, in an autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal on its own. The analysis of attention weights shows that deep autoregressive self-attention models can automatically make use of such positional information. We find that removing the positional encoding even slightly improves the performance of these models.
Tasks Language Modelling, Speech Recognition
Published 2019-05-10
URL https://arxiv.org/abs/1905.04226v2
PDF https://arxiv.org/pdf/1905.04226v2.pdf
PWC https://paperswithcode.com/paper/language-modeling-with-deep-transformers
Repo
Framework
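The claim that autoregressive self-attention carries its own positional signal can be checked directly: without the causal mask, self-attention is permutation-equivariant (reordering the tokens merely reorders the outputs, so positions are indistinguishable), while the causal mask breaks that symmetry. A minimal numpy check with identity query/key/value projections (an illustration, not the paper's model):

```python
import numpy as np

def self_attention(X, causal=False):
    # Scaled dot-product self-attention with identity Q/K/V projections.
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)
    if causal:
        # Each position may only attend to itself and its predecessors.
        scores = scores.copy()
        scores[np.triu(np.ones((n, n), dtype=bool), k=1)] = -np.inf
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ X

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
perm = np.array([3, 0, 4, 2, 1])

# Unmasked attention: permuting tokens merely permutes the outputs.
unmasked_equivariant = np.allclose(
    self_attention(X[perm]), self_attention(X)[perm])

# Causal attention: outputs depend on token order, i.e. position is encoded.
causal_equivariant = np.allclose(
    self_attention(X[perm], causal=True), self_attention(X, causal=True)[perm])
```

The growing attention window under the causal mask is precisely the "amount of information increases along the position dimension" signal the abstract describes.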

Automatic Scale Estimation of Structure from Motion based 3D Models using Laser Scalers

Title Automatic Scale Estimation of Structure from Motion based 3D Models using Laser Scalers
Authors Klemen Istenic, Nuno Gracias, Aurelien Arnaubec, Javier Escartin, Rafael Garcia
Abstract Recent advances in structure-from-motion techniques are enabling many scientific fields to benefit from the routine creation of detailed 3D models. However, for a large number of applications, only a single camera is available, due to cost or space constraints in the survey platforms. Monocular structure-from-motion raises the issue of properly estimating the scale of the 3D models, in order to later use those models for metrology. The scale can be determined from the presence of visible objects of known dimensions, or from information on the magnitude of the camera motion provided by other sensors, such as GPS. This paper addresses the problem of accurately scaling 3D models created from monocular cameras in GPS-denied environments, such as in underwater applications. Motivated by the common availability of underwater laser scalers, we present two novel approaches. A fully-calibrated method enables the use of arbitrary laser setups, while a partially-calibrated method reduces the need for calibration by only assuming parallelism of the laser beams, with no constraints on the camera. The proposed methods have several advantages with respect to the existing methods. The need for laser alignment with the optical axis of the camera is removed, together with the extremely error-prone manual identification of image points on the 3D model. The performance of the methods and their applicability were evaluated on both data generated from a realistic 3D model and data collected during an oceanographic cruise in 2017. Three separate laser configurations were tested, encompassing nearly all possible laser setups, to evaluate the effects of terrain roughness, noise, camera perspective angle and camera-scene distance. In the real scenario, the computation of 6 independent model scale estimates using our fully-calibrated approach produced values with a standard deviation of 0.3%.
Tasks Calibration
Published 2019-06-19
URL https://arxiv.org/abs/1906.08019v1
PDF https://arxiv.org/pdf/1906.08019v1.pdf
PWC https://paperswithcode.com/paper/automatic-scale-estimation-of-structure-from
Repo
Framework
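In the simplest (parallel-beam) case the underlying idea reduces to a ratio: the known physical separation of the laser beams versus the distance between the laser spots in the unscaled model. The paper's methods handle far more general geometry and automate the spot identification; this fragment only illustrates the principle, and all values are hypothetical:

```python
import numpy as np

def scale_estimate(spot_a, spot_b, true_separation):
    """Ratio of the known physical laser-beam separation to the spot
    distance measured in the unscaled SfM model (metres per model unit)."""
    model_separation = np.linalg.norm(np.asarray(spot_a) - np.asarray(spot_b))
    return true_separation / model_separation

# Hypothetical: lasers mounted 0.10 m apart; their spots land 0.004
# model units apart in the unscaled reconstruction.
true_sep = 0.10
spot_a = np.array([1.20, 0.50, 3.10])
spot_b = spot_a + np.array([0.004, 0.0, 0.0])
s = scale_estimate(spot_a, spot_b, true_sep)
```

In practice several such estimates would be averaged (the paper reports 6 independent ones with 0.3% standard deviation), since terrain roughness and viewing angle perturb each individual measurement.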