Paper Group ANR 1471
Replicated Vector Approximate Message Passing For Resampling Problem. Efficient Fair Principal Component Analysis. Context-aware Human Motion Prediction. The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA. Online Music Listening Culture of Kids and Adolescents: Listening Analysis and Music Recommendation Tailored to the Young. NPSA: Nonorthogonal Principal Skewness Analysis. ‘In-Between’ Uncertainty in Bayesian Neural Networks. The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention. Pretrained Language Models for Sequential Sentence Classification. 2DR1-PCA and 2DL1-PCA: two variant 2DPCA algorithms based on none L2 norm. An adaptive and fully automatic method for estimating the 3D position of bendable instruments using endoscopic images. A Deep Learning Framework for Pricing Financial Instruments. Development of Clinical Concept Extraction Applications: A Methodology Review. Language Modeling with Deep Transformers. Automatic Scale Estimation of Structure from Motion based 3D Models using Laser Scalers.
Replicated Vector Approximate Message Passing For Resampling Problem
Title | Replicated Vector Approximate Message Passing For Resampling Problem |
Authors | Takashi Takahashi, Yoshiyuki Kabashima |
Abstract | Resampling techniques are widely used in statistical inference and ensemble learning, where the statistical properties of estimators are essential. However, existing methods are computationally demanding because they repeat the estimation/learning, via numerical optimization or integration, for each resampled dataset. In this study, we introduce a computationally efficient method to resolve this problem: replicated vector approximate message passing. It is based on a combination of the replica method of statistical physics and an accurate approximate inference algorithm, namely the vector approximate message passing of information theory. The method provides tractable densities without repeating estimation/learning, and from these densities the estimators’ moments of arbitrary order can be approximated in practical time. In the experiments, we apply the proposed method to stability selection, which is commonly used in variable selection problems. The numerical results show its fast convergence and high approximation accuracy for problems involving both synthetic and real-world datasets. |
Tasks | |
Published | 2019-05-23 |
URL | https://arxiv.org/abs/1905.09545v1 |
https://arxiv.org/pdf/1905.09545v1.pdf | |
PWC | https://paperswithcode.com/paper/replicated-vector-approximate-message-passing |
Repo | |
Framework | |
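For context, here is a minimal sketch of the naive procedure the paper accelerates: stability selection by repeatedly refitting a lasso on resampled data. The dataset, regularization strength, and number of resamples are illustrative assumptions, not the paper's setup.

```python
# Naive stability selection: refit a lasso on many half-subsamples and record
# how often each variable is selected. This repeated optimization is exactly
# the cost that replicated VAMP is designed to avoid.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0                      # 5 truly active variables (assumption)
y = X @ beta_true + 0.1 * rng.normal(size=n)

n_resamples = 100
selected = np.zeros(p)
for _ in range(n_resamples):
    idx = rng.choice(n, size=n // 2, replace=False)   # subsample half the data
    fit = Lasso(alpha=0.1).fit(X[idx], y[idx])
    selected += fit.coef_ != 0

stability = selected / n_resamples       # per-variable selection probability
print(np.round(stability[:10], 2))       # the 5 active variables should be near 1
```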
Efficient Fair Principal Component Analysis
Title | Efficient Fair Principal Component Analysis |
Authors | Mohammad Mahdi Kamani, Farzin Haddadpour, Rana Forsati, Mehrdad Mahdavi |
Abstract | It has been shown that dimension reduction methods such as PCA may be inherently prone to unfairness and treat data from different sensitive groups such as race, color, sex, etc., unfairly. In pursuit of fairness-enhancing dimensionality reduction, using the notion of Pareto optimality, we propose an adaptive first-order algorithm to learn a subspace that preserves fairness while slightly compromising the reconstruction loss. Theoretically, we provide sufficient conditions under which the solution of the proposed algorithm belongs to the Pareto frontier for all sensitive groups, thereby guaranteeing the optimal trade-off between overall reconstruction loss and fairness constraints. We also provide a convergence analysis of our algorithm and show its efficacy through empirical studies on different datasets, which demonstrate superior performance in comparison with state-of-the-art algorithms. The proposed fairness-aware PCA algorithm can be efficiently generalized to sensitive features with multiple groups and effectively reduces unfair decisions in downstream tasks such as classification. |
Tasks | Dimensionality Reduction |
Published | 2019-11-12 |
URL | https://arxiv.org/abs/1911.04931v2 |
https://arxiv.org/pdf/1911.04931v2.pdf | |
PWC | https://paperswithcode.com/paper/efficient-fair-principal-component-analysis |
Repo | |
Framework | |
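To make the Pareto trade-off concrete, here is a toy sketch in the spirit of the abstract: a first-order method that balances per-group reconstruction losses with an adaptive weighting while a QR retraction keeps the subspace orthonormal. The two synthetic groups, the softmax-style weighting, and the step size are illustrative assumptions, not the authors' exact algorithm.

```python
# Toy fairness-aware PCA: descend a weighted sum of per-group reconstruction
# losses, re-weighting toward the currently worse-off group at every step.
import numpy as np

rng = np.random.default_rng(1)
Xa = rng.normal(size=(300, 20))                               # sensitive group A
Xb = rng.normal(size=(100, 20)) * np.linspace(1.0, 3.0, 20)   # group B, different spectrum
Ca, Cb = Xa.T @ Xa / len(Xa), Xb.T @ Xb / len(Xb)             # per-group covariances

k, lr = 5, 1e-2
U = np.linalg.qr(rng.normal(size=(20, k)))[0]                 # orthonormal init

for _ in range(500):
    # per-group reconstruction loss (up to a constant): tr(C) - tr(U^T C U)
    la = np.trace(Ca) - np.trace(U.T @ Ca @ U)
    lb = np.trace(Cb) - np.trace(U.T @ Cb @ U)
    w = np.exp([la, lb]); w = w / w.sum()                     # weight the worse-off group more
    grad = -2 * (w[0] * Ca + w[1] * Cb) @ U                   # gradient of the weighted loss
    U = np.linalg.qr(U - lr * grad)[0]                        # retract to orthonormal frames

print(f"group losses after training: {la:.3f}, {lb:.3f}")
```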
Context-aware Human Motion Prediction
Title | Context-aware Human Motion Prediction |
Authors | Enric Corona, Albert Pumarola, Guillem Alenyà, Francesc Moreno-Noguer |
Abstract | The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. Current state-of-the-art methods formulate this problem as a sequence-to-sequence task, in which a history of 3D skeletons feeds a Recurrent Neural Network (RNN) that predicts future movements, typically on the order of 1 to 2 seconds. However, one aspect that has been overlooked so far is the fact that human motion is inherently driven by interactions with objects and/or other humans in the environment. In this paper, we explore this scenario using a novel context-aware motion prediction architecture. We use a semantic-graph model in which the nodes parameterize the human and objects in the scene and the edges their mutual interactions. These interactions are iteratively learned through a graph attention layer fed with the past observations, which now include both object and human body motions. Once this semantic graph is learned, we inject it into a standard RNN to predict future movements of the human/s and object/s. We consider two variants of our architecture, either freezing the contextual interactions in the future or updating them. A thorough evaluation on the “Whole-Body Human Motion Database” shows that in both cases, our context-aware networks clearly outperform baselines in which the context information is not considered. |
Tasks | motion prediction |
Published | 2019-04-06 |
URL | https://arxiv.org/abs/1904.03419v3 |
https://arxiv.org/pdf/1904.03419v3.pdf | |
PWC | https://paperswithcode.com/paper/context-aware-human-motion-prediction |
Repo | |
Framework | |
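A minimal PyTorch sketch of the described pipeline follows: a graph-attention step over human/object nodes whose aggregated context feeds an RNN that predicts the next pose. All dimensions and the single-layer design are illustrative assumptions, not the authors' architecture.

```python
# Graph attention over scene nodes (node 0 = the human), then a GRU over time.
import torch
import torch.nn as nn

class ContextAwarePredictor(nn.Module):
    def __init__(self, node_dim=32, hidden=64, pose_dim=63):
        super().__init__()
        self.att = nn.Linear(2 * node_dim, 1)       # GAT-style pairwise scores
        self.proj = nn.Linear(node_dim, node_dim)
        self.rnn = nn.GRU(node_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, nodes):                       # nodes: (T, N, node_dim)
        T, N, D = nodes.shape
        h = self.proj(nodes)
        # attention over all node pairs at each timestep
        pairs = torch.cat([h.unsqueeze(2).expand(T, N, N, D),
                           h.unsqueeze(1).expand(T, N, N, D)], dim=-1)
        alpha = torch.softmax(self.att(pairs).squeeze(-1), dim=-1)  # (T, N, N)
        ctx = torch.einsum('tnm,tmd->tnd', alpha, h)  # aggregate neighbour context
        human = ctx[:, 0]                             # the human node's context
        out, _ = self.rnn(human.unsqueeze(0))         # (1, T, hidden)
        return self.head(out[0, -1])                  # predicted next pose

pred = ContextAwarePredictor()(torch.randn(10, 4, 32))
print(pred.shape)  # torch.Size([63])
```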
The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA
Title | The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA |
Authors | Luigi Gresele, Paul K. Rubenstein, Arash Mehrjou, Francesco Locatello, Bernhard Schölkopf |
Abstract | We consider the problem of recovering a common latent source with independent components from multiple views. This applies to settings in which a variable is measured with multiple experimental modalities, and where the goal is to synthesize the disparate measurements into a single unified representation. We consider the case that the observed views are a nonlinear mixing of component-wise corruptions of the sources. When the views are considered separately, this reduces to nonlinear Independent Component Analysis (ICA) for which it is provably impossible to undo the mixing. We present novel identifiability proofs that this is possible when the multiple views are considered jointly, showing that the mixing can theoretically be undone using function approximators such as deep neural networks. In contrast to known identifiability results for nonlinear ICA, we prove that independent latent sources with arbitrary mixing can be recovered as long as multiple, sufficiently different noisy views are available. |
Tasks | |
Published | 2019-05-16 |
URL | https://arxiv.org/abs/1905.06642v2 |
https://arxiv.org/pdf/1905.06642v2.pdf | |
PWC | https://paperswithcode.com/paper/the-incomplete-rosetta-stone-problem |
Repo | |
Framework | |
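The generative setting that the identifiability results concern can be sketched as follows: independent sources, component-wise corruption per view, then a nonlinear mixing per view. The specific nonlinearity and noise levels here are illustrative assumptions.

```python
# Two noisy nonlinear views of the same independent (non-Gaussian) sources.
import numpy as np

rng = np.random.default_rng(2)
n, d = 1000, 3
s = rng.laplace(size=(n, d))                   # independent non-Gaussian sources

def view(s, noise, W, b):
    z = s + noise * rng.normal(size=s.shape)   # component-wise corruption
    return np.tanh(z @ W + b)                  # nonlinear mixing (W a.s. invertible)

x1 = view(s, 0.1, rng.normal(size=(d, d)), rng.normal(size=d))
x2 = view(s, 0.3, rng.normal(size=(d, d)), rng.normal(size=d))
# Each view alone is a nonlinear ICA problem and hence unidentifiable; the
# paper shows that jointly, sufficiently different views make the sources
# recoverable.
print(x1.shape, x2.shape)
```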
Online Music Listening Culture of Kids and Adolescents: Listening Analysis and Music Recommendation Tailored to the Young
Title | Online Music Listening Culture of Kids and Adolescents: Listening Analysis and Music Recommendation Tailored to the Young |
Authors | Markus Schedl, Christine Bauer |
Abstract | In this paper, we analyze a large dataset of user-generated music listening events from Last.fm, focusing on users aged 6 to 18 years. Our contribution is two-fold. First, we study the music genre preferences of this young user group and analyze these preferences for homogeneity within more fine-grained age groups and with respect to gender and countries. Second, we investigate the performance of a collaborative filtering recommender when tailoring music recommendations to different age groups. We find that doing so improves performance for all user groups up to 18 years, but decreases performance for adult users aged 19 years and older. |
Tasks | |
Published | 2019-12-24 |
URL | https://arxiv.org/abs/1912.11564v1 |
https://arxiv.org/pdf/1912.11564v1.pdf | |
PWC | https://paperswithcode.com/paper/online-music-listening-culture-of-kids-and |
Repo | |
Framework | |
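The second contribution, tailoring collaborative filtering to age groups, amounts to fitting one recommender per age band. A toy sketch with a truncated-SVD latent-factor model and synthetic listening data follows; both the recommender and the data are illustrative assumptions, not the paper's dataset or system.

```python
# Per-age-group collaborative filtering on a synthetic interaction matrix.
import numpy as np

rng = np.random.default_rng(3)
n_users, n_items, k = 500, 200, 10
ages = rng.integers(6, 30, size=n_users)
R = (rng.random((n_users, n_items)) < 0.05).astype(float)  # binary listening events

def recommend_scores(R_group):
    # truncated-SVD latent-factor model fit on one group's interactions
    U, s, Vt = np.linalg.svd(R_group, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

for lo, hi in [(6, 12), (13, 18), (19, 30)]:               # age bands (assumed)
    mask = (ages >= lo) & (ages <= hi)
    scores = recommend_scores(R[mask])
    print(f"ages {lo}-{hi}: {mask.sum()} users, score matrix {scores.shape}")
```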
NPSA: Nonorthogonal Principal Skewness Analysis
Title | NPSA: Nonorthogonal Principal Skewness Analysis |
Authors | Xiurui Geng, Lei Wang |
Abstract | Principal skewness analysis (PSA) has been introduced for feature extraction in hyperspectral imagery. As a third-order generalization of principal component analysis (PCA), its search for the locally maximum skewness direction is transformed into the problem of calculating the eigenpairs (the eigenvalues and the corresponding eigenvectors) of a coskewness tensor. By combining a fixed-point method with an orthogonal constraint, it can prevent new eigenpairs from converging to the same maxima that have been determined before. However, the eigenvectors of a supersymmetric tensor are not inherently orthogonal in general, which implies that the results obtained by the search strategy used in PSA may unavoidably deviate from the actual eigenpairs. In this paper, we propose a new nonorthogonal search strategy to solve this problem, and the new algorithm is named nonorthogonal principal skewness analysis (NPSA). The contribution of NPSA lies in the finding that the search space of the eigenvector to be determined can be enlarged by using the orthogonal complement of the Kronecker product of the previous one, instead of its orthogonal complement space. We give a detailed theoretical proof to illustrate why the new strategy can result in more accurate eigenpairs. In addition, after some algebraic derivations, the complexity of the presented algorithm is also greatly reduced. Experiments with both simulated data and real multi/hyperspectral imagery demonstrate its validity in feature extraction. |
Tasks | |
Published | 2019-07-23 |
URL | https://arxiv.org/abs/1907.09811v1 |
https://arxiv.org/pdf/1907.09811v1.pdf | |
PWC | https://paperswithcode.com/paper/npsa-nonorthogonal-principal-skewness |
Repo | |
Framework | |
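To illustrate what PSA-style algorithms iterate, here is a sketch of the fixed-point search for a locally maximum-skewness direction on whitened data, shown without the orthogonal (PSA) or nonorthogonal (NPSA) deflation constraints that the paper is about. Data and iteration budget are illustrative assumptions.

```python
# Fixed-point search for a maximum-skewness direction w on whitened data:
# iterate w <- E[z (w^T z)^2], then renormalise.
import numpy as np

rng = np.random.default_rng(4)
n, d = 5000, 5
X = rng.normal(size=(n, d))
X[:, 0] = rng.exponential(size=n)          # one deliberately skewed axis

X = X - X.mean(0)                          # center, then whiten
evals, evecs = np.linalg.eigh(np.cov(X.T))
Z = X @ evecs @ np.diag(evals ** -0.5)

w = rng.normal(size=d)
w /= np.linalg.norm(w)
for _ in range(50):
    w = (Z * (Z @ w)[:, None] ** 2).mean(0)   # fixed-point step
    w /= np.linalg.norm(w)

print("skewness along w:", ((Z @ w) ** 3).mean().round(3))
```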
‘In-Between’ Uncertainty in Bayesian Neural Networks
Title | ‘In-Between’ Uncertainty in Bayesian Neural Networks |
Authors | Andrew Y. K. Foong, Yingzhen Li, José Miguel Hernández-Lobato, Richard E. Turner |
Abstract | We describe a limitation in the expressiveness of the predictive uncertainty estimate given by mean-field variational inference (MFVI), a popular approximate inference method for Bayesian neural networks. In particular, MFVI fails to give calibrated uncertainty estimates in between separated regions of observations. This can lead to catastrophically overconfident predictions when testing on out-of-distribution data. Avoiding such overconfidence is critical for active learning, Bayesian optimisation and out-of-distribution robustness. We instead find that a classical technique, the linearised Laplace approximation, can handle ‘in-between’ uncertainty much better for small network architectures. |
Tasks | Active Learning, Bayesian Optimisation |
Published | 2019-06-27 |
URL | https://arxiv.org/abs/1906.11537v1 |
https://arxiv.org/pdf/1906.11537v1.pdf | |
PWC | https://paperswithcode.com/paper/in-between-uncertainty-in-bayesian-neural |
Repo | |
Framework | |
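The classical alternative the authors favour, the linearised Laplace approximation, has a standard closed form worth recalling; the notation below follows the usual presentation, not necessarily the paper's.

```latex
% Linearise the network output around the MAP weights \hat\theta:
f_\theta(x) \;\approx\; f_{\hat\theta}(x) + J(x)^{\top}(\theta - \hat\theta),
\qquad J(x) = \left.\nabla_\theta f_\theta(x)\right|_{\theta = \hat\theta}.
% With the Gaussian posterior \theta \sim \mathcal{N}(\hat\theta, H^{-1}),
% where H is the Hessian (or GGN) of the negative log posterior at \hat\theta,
% the predictive at a test input x^* is Gaussian:
p\big(f(x^*) \mid \mathcal{D}\big) \;\approx\;
\mathcal{N}\!\big(f_{\hat\theta}(x^*),\; J(x^*)^{\top} H^{-1} J(x^*)\big).
```

Because the predictive variance grows wherever the linearised model is poorly constrained by the data, this form can assign uncertainty to regions between clusters of observations — the 'in-between' behaviour the paper examines.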
The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention
Title | The Resale Price Prediction of Secondhand Jewelry Items Using a Multi-modal Deep Model with Iterative Co-Attention |
Authors | Yusuke Yamaura, Nobuya Kanemaki, Yukihiro Tsuboshita |
Abstract | The resale price assessment of secondhand jewelry items relies heavily on the individual knowledge and skill of domain experts. In this paper, we propose a methodology for constructing an AI system that autonomously assesses the resale prices of secondhand jewelry items without the need for professional knowledge. As shown in recent studies on fashion items, multimodal approaches combining specifications and visual information of items have succeeded in obtaining fine-grained representations of fashion items, although they generally apply only simple vector operations for multimodal fusion. We similarly build a multimodal model using images and attributes of the product, and further employ state-of-the-art multimodal deep neural networks from computer vision to achieve a practical performance level. In addition, we model the pricing procedure of an expert using iterative co-attention networks in which the appearance and attributes of the product are carefully and iteratively observed. Herein, we demonstrate the effectiveness of our model using a large dataset of secondhand no-brand jewelry items provided by a collaborating fashion retailer, and show that the iterative co-attention process operates effectively in the context of resale price prediction. Our model architecture is widely applicable to other fashion items where appearance and specifications are important aspects. |
Tasks | |
Published | 2019-07-01 |
URL | https://arxiv.org/abs/1907.00661v1 |
https://arxiv.org/pdf/1907.00661v1.pdf | |
PWC | https://paperswithcode.com/paper/the-resale-price-prediction-of-secondhand |
Repo | |
Framework | |
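A condensed PyTorch sketch of iterative co-attention between an image feature map and a set of attribute embeddings follows: each modality repeatedly attends to the other, mimicking an appraiser alternating between looking at the item and reading its specs. Dimensions, the pooling scheme, and the number of steps are illustrative assumptions, not the authors' exact network.

```python
# Two modalities take turns guiding attention over each other.
import torch
import torch.nn as nn

class CoAttention(nn.Module):
    def __init__(self, dim=128, steps=2):
        super().__init__()
        self.steps = steps
        self.q_img = nn.Linear(dim, dim)
        self.q_att = nn.Linear(dim, dim)

    def attend(self, query, keys):                  # query: (B, D), keys: (B, N, D)
        scores = torch.softmax(keys @ query.unsqueeze(-1), dim=1)  # (B, N, 1)
        return (scores * keys).sum(1)               # attention-weighted summary

    def forward(self, img, att):                    # (B, R, D) regions, (B, A, D) attrs
        v, a = img.mean(1), att.mean(1)             # initial summaries
        for _ in range(self.steps):                 # iterative co-attention
            v = self.attend(self.q_att(a), img)     # attributes guide image attention
            a = self.attend(self.q_img(v), att)     # image guides attribute attention
        return torch.cat([v, a], dim=-1)            # fused vector -> price regressor

fused = CoAttention()(torch.randn(4, 49, 128), torch.randn(4, 12, 128))
print(fused.shape)  # torch.Size([4, 256])
```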
Pretrained Language Models for Sequential Sentence Classification
Title | Pretrained Language Models for Sequential Sentence Classification |
Authors | Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld |
Abstract | As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in the context of the document. Recent successful models for this task have used hierarchical models to contextualize sentence representations, and Conditional Random Fields (CRFs) to incorporate dependencies between subsequent labels. In this work, we show that pretrained language models, BERT (Devlin et al., 2018) in particular, can be used for this task to capture contextual dependencies without the need for hierarchical encoding or a CRF. Specifically, we construct a joint sentence representation that allows BERT Transformer layers to directly utilize contextual information from all words in all sentences. Our approach achieves state-of-the-art results on four datasets, including a new dataset of structured scientific abstracts. |
Tasks | Sentence Classification |
Published | 2019-09-09 |
URL | https://arxiv.org/abs/1909.04054v2 |
https://arxiv.org/pdf/1909.04054v2.pdf | |
PWC | https://paperswithcode.com/paper/pretrained-language-models-for-sequential |
Repo | |
Framework | |
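The core trick — packing all of a document's sentences into one BERT input and reading off each sentence's prediction from its [SEP] token's contextualised vector — can be sketched as follows. The classifier head and label count are illustrative assumptions; the paper's fine-tuning setup is not reproduced here.

```python
# Joint encoding of all sentences; one [SEP] vector per sentence.
import torch
from transformers import BertTokenizer, BertModel

tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

sentences = ["We study sequential sentence classification.",
             "Prior work used hierarchical encoders and CRFs.",
             "We instead use a single joint BERT encoding."]
text = " [SEP] ".join(sentences)                     # one input for the whole abstract
enc = tok(text, return_tensors="pt")

with torch.no_grad():
    hidden = bert(**enc).last_hidden_state[0]        # (seq_len, 768)
sep_positions = (enc["input_ids"][0] == tok.sep_token_id).nonzero().squeeze(-1)
sep_vectors = hidden[sep_positions]                  # one contextual vector per sentence

head = torch.nn.Linear(768, 5)                       # e.g. 5 rhetorical categories (assumed)
logits = head(sep_vectors)
print(logits.shape)                                  # (num_sentences, 5)
```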
2DR1-PCA and 2DL1-PCA: two variant 2DPCA algorithms based on none L2 norm
Title | 2DR1-PCA and 2DL1-PCA: two variant 2DPCA algorithms based on none L2 norm |
Authors | Xing Liu, Xiao-Jun Wu, Zi-Qi Li |
Abstract | In this paper, two novel methods, 2DR1-PCA and 2DL1-PCA, are proposed for face recognition. Compared to the traditional 2DPCA algorithm, 2DR1-PCA and 2DL1-PCA are based on the R1 norm and the L1 norm, respectively. The advantage of the proposed methods is that they are less sensitive to outliers. The proposed methods are tested on the ORL, YALE and XM2VTS databases, and the performance of the related methods is compared experimentally. |
Tasks | Face Recognition |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10768v1 |
https://arxiv.org/pdf/1912.10768v1.pdf | |
PWC | https://paperswithcode.com/paper/2dr1-pca-and-2dl1-pca-two-variant-2dpca |
Repo | |
Framework | |
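As a sketch of the kind of projection-vector search L1-norm 2DPCA variants build on, here is a greedy sign-based fixed-point iteration that maximises the summed L1 norm of the projected image matrices. Data and iteration budget are illustrative assumptions; the R1-norm variant replaces this objective with the rotation-invariant R1 norm.

```python
# L1-norm 2DPCA direction search: maximise sum_i ||A_i w||_1 over unit w.
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(100, 28, 28))          # 100 image matrices (e.g. faces)

w = rng.normal(size=28)
w /= np.linalg.norm(w)
for _ in range(100):
    # fixed-point step: w <- sum_i A_i^T sign(A_i w), then renormalise
    w_new = np.einsum('nij,ni->j', A, np.sign(A @ w))
    w_new /= np.linalg.norm(w_new)
    if np.allclose(w_new, w):
        break
    w = w_new

print("L1 objective:", np.abs(A @ w).sum().round(2))
```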
An adaptive and fully automatic method for estimating the 3D position of bendable instruments using endoscopic images
Title | An adaptive and fully automatic method for estimating the 3D position of bendable instruments using endoscopic images |
Authors | Paolo Cabras, Florent Nageotte, Philippe Zanne, Christophe Doignon |
Abstract | Background. Flexible bendable instruments are key tools for performing surgical endoscopy. Being able to measure the 3D position of such instruments can be useful for various tasks, such as automatically controlling robotized instruments and analyzing motions. Methods. We propose an automatic method to infer the 3D pose of a single-bending-section instrument, using only the images provided by a monocular camera embedded at the tip of the endoscope. The proposed method relies on colored markers attached to the bending section. The image of the instrument is segmented using a graph-based method, and the corners of the markers are extracted by detecting the color transition along Bézier curves fitted to edge points. These features are accurately located and then used to estimate the 3D pose of the instrument using an adaptive model that takes into account the mechanical play between the instrument and its housing channel. Results. The feature extraction method provides good localization of marker corners in images of in vivo environments despite sensor saturation due to strong lighting. The RMS error on the estimated tip position of the instrument in laboratory experiments was 2.1, 1.96 and 3.18 mm in the x, y and z directions, respectively. Qualitative analysis of in vivo images shows the ability to correctly estimate the 3D position of the instrument tip during real motions. Conclusions. The proposed method provides an automatic and accurate estimation of the 3D position of the tip of a bendable instrument in realistic conditions where standard approaches fail. |
Tasks | |
Published | 2019-11-29 |
URL | https://arxiv.org/abs/1911.13125v1 |
https://arxiv.org/pdf/1911.13125v1.pdf | |
PWC | https://paperswithcode.com/paper/an-adaptive-and-fully-automatic-method-for |
Repo | |
Framework | |
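As a minimal illustration of the colour-marker idea, the sketch below segments a coloured band by thresholding in HSV space. The synthetic frame and threshold values are illustrative assumptions; the paper itself uses a graph-based segmentation and Bézier-curve corner extraction, which are not reproduced here.

```python
# Colour-band segmentation by HSV thresholding on a synthetic frame.
import numpy as np
import cv2

frame = np.zeros((240, 320, 3), np.uint8)
cv2.rectangle(frame, (140, 60), (180, 200), (60, 180, 75), -1)  # green band (BGR)

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (40, 80, 60), (80, 255, 255))           # green hue range
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
print("marker bounding box:", x, y, w, h)
```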
A Deep Learning Framework for Pricing Financial Instruments
Title | A Deep Learning Framework for Pricing Financial Instruments |
Authors | Qiong Wu, Zheng Zhang, Andrea Pizzoferrato, Mihai Cucuringu, Zhenming Liu |
Abstract | We propose an integrated deep learning architecture for stock movement prediction. Our architecture simultaneously leverages all available alpha sources: technical signals, financial news signals, and cross-sectional signals. It possesses three main properties. First, it avoids overfitting: although it consumes a large number of technical signals, it has better generalization properties than linear models. Second, it effectively captures the interactions between signals from different categories. Third, it has low computational cost: we design a graph-based component that extracts cross-sectional interactions and circumvents the SVD computation needed in standard models. Experimental results on real-world stock market data show that our approach outperforms existing baselines. Meanwhile, results from different trading simulators demonstrate that we can effectively monetize the signals. |
Tasks | |
Published | 2019-09-07 |
URL | https://arxiv.org/abs/1909.04497v1 |
https://arxiv.org/pdf/1909.04497v1.pdf | |
PWC | https://paperswithcode.com/paper/a-deep-learning-framework-for-pricing |
Repo | |
Framework | |
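A condensed PyTorch sketch of the three-branch idea follows: separate encoders for technical and news signals, plus a learned graph-like affinity that mixes information across stocks in place of an SVD. All dimensions and the specific fusion are illustrative assumptions, not the authors' architecture.

```python
# Technical + news branches, with learned cross-stock mixing.
import torch
import torch.nn as nn

class PricingNet(nn.Module):
    def __init__(self, d_tech=16, d_news=32, d_hidden=32):
        super().__init__()
        self.tech = nn.Sequential(nn.Linear(d_tech, d_hidden), nn.ReLU())
        self.news = nn.Sequential(nn.Linear(d_news, d_hidden), nn.ReLU())
        self.affinity = nn.Linear(d_hidden, d_hidden)      # graph-like mixing weights
        self.head = nn.Linear(2 * d_hidden, 1)

    def forward(self, tech, news):                         # (S, d_tech), (S, d_news)
        h = self.tech(tech)
        a = torch.softmax(self.affinity(h) @ h.T, dim=-1)  # stock-to-stock affinities
        cross = a @ h                                      # cross-sectional signal
        z = torch.cat([self.news(news) + h, cross], -1)
        return self.head(z).squeeze(-1)                    # per-stock movement score

scores = PricingNet()(torch.randn(50, 16), torch.randn(50, 32))
print(scores.shape)  # torch.Size([50])
```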
Development of Clinical Concept Extraction Applications: A Methodology Review
Title | Development of Clinical Concept Extraction Applications: A Methodology Review |
Authors | Sunyang Fu, David Chen, Huan He, Sijia Liu, Sungrim Moon, Kevin J Peterson, Feichen Shen, Liwei Wang, Yanshan Wang, Andrew Wen, Yiqing Zhao, Sunghwan Sohn, Hongfang Liu |
Abstract | Background: Concept extraction, a subdomain of natural language processing (NLP) with a focus on extracting concepts of interest, has been adopted to computationally extract clinical information from text for a wide range of applications ranging from clinical decision support to care quality improvement. Objectives: In this literature review, we provide a methodology review of clinical concept extraction, aiming to catalog development processes, available methods and tools, and specific considerations when developing clinical concept extraction applications. Methods: Based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a literature search was conducted to retrieve EHR-based information extraction articles written in English and published from January 2009 through June 2019 from Ovid MEDLINE In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Scopus, Web of Science, and the ACM Digital Library. Results: A total of 6,555 publications were retrieved. After title and abstract screening, 224 publications were selected. The methods used for developing clinical concept extraction applications are discussed in this review. |
Tasks | Clinical Concept Extraction, Decision Making |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11377v3 |
https://arxiv.org/pdf/1910.11377v3.pdf | |
PWC | https://paperswithcode.com/paper/a-review-of-the-end-to-end-methodologies-for |
Repo | |
Framework | |
Language Modeling with Deep Transformers
Title | Language Modeling with Deep Transformers |
Authors | Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney |
Abstract | We explore deep autoregressive Transformer models in language modeling for speech recognition. We focus on two aspects. First, we revisit Transformer model configurations specifically for language modeling. We show that well-configured Transformer models outperform our baseline models based on shallow stacks of LSTM recurrent neural network layers. We carry out experiments on the open-source LibriSpeech 960hr task, for both 200K-vocabulary word-level and 10K byte-pair-encoding subword-level language modeling. We apply our word-level models to conventional hybrid speech recognition by lattice rescoring, and the subword-level models to attention-based encoder-decoder models by shallow fusion. Second, we show that deep Transformer language models do not require positional encoding. Positional encoding is an essential augmentation for the self-attention mechanism, which is otherwise invariant to sequence ordering. However, in the autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal in its own right. An analysis of attention weights shows that deep autoregressive self-attention models can automatically make use of such positional information. We find that removing the positional encoding even slightly improves the performance of these models. |
Tasks | Language Modelling, Speech Recognition |
Published | 2019-05-10 |
URL | https://arxiv.org/abs/1905.04226v2 |
https://arxiv.org/pdf/1905.04226v2.pdf | |
PWC | https://paperswithcode.com/paper/language-modeling-with-deep-transformers |
Repo | |
Framework | |
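The second finding is easy to demonstrate structurally: an autoregressive Transformer LM that relies only on the causal mask, with no positional encoding added to the embeddings. The vocabulary and model sizes below are illustrative assumptions, not the paper's configuration.

```python
# Autoregressive Transformer LM without positional encoding: position is
# implicit in how much each token's causal self-attention can see.
import torch
import torch.nn as nn

class NoPosLM(nn.Module):
    def __init__(self, vocab=10000, d=256, heads=4, layers=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)        # note: no positional embedding added
        layer = nn.TransformerEncoderLayer(d, heads, 4 * d, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, layers)
        self.out = nn.Linear(d, vocab)

    def forward(self, tokens):                   # tokens: (B, T)
        T = tokens.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        return self.out(self.enc(self.emb(tokens), mask=causal))

logits = NoPosLM()(torch.randint(0, 10000, (2, 32)))
print(logits.shape)  # torch.Size([2, 32, 10000])
```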
Automatic Scale Estimation of Structure from Motion based 3D Models using Laser Scalers
Title | Automatic Scale Estimation of Structure from Motion based 3D Models using Laser Scalers |
Authors | Klemen Istenic, Nuno Gracias, Aurelien Arnaubec, Javier Escartin, Rafael Garcia |
Abstract | Recent advances in structure-from-motion techniques are enabling many scientific fields to benefit from the routine creation of detailed 3D models. However, for a large number of applications, only a single camera is available, due to cost or space constraints in the survey platforms. Monocular structure-from-motion raises the issue of properly estimating the scale of the 3D models, in order to later use those models for metrology. The scale can be determined from the presence of visible objects of known dimensions, or from information on the magnitude of the camera motion provided by other sensors, such as GPS. This paper addresses the problem of accurately scaling 3D models created from monocular cameras in GPS-denied environments, such as in underwater applications. Motivated by the common availability of underwater laser scalers, we present two novel approaches. A fully calibrated method enables the use of arbitrary laser setups, while a partially calibrated method reduces the need for calibration by only assuming parallelism of the laser beams, with no constraints on the camera. The proposed methods have several advantages with respect to existing methods. The need for laser alignment with the optical axis of the camera is removed, together with the extremely error-prone manual identification of image points on the 3D model. The performance of the methods and their applicability were evaluated on both data generated from a realistic 3D model and data collected during an oceanographic cruise in 2017. Three separate laser configurations were tested, encompassing nearly all possible laser setups, to evaluate the effects of terrain roughness, noise, camera perspective angle and camera-scene distance. In the real scenario, the computation of 6 independent model scale estimates using our fully calibrated approach produced values with a standard deviation of 0.3%. |
Tasks | Calibration |
Published | 2019-06-19 |
URL | https://arxiv.org/abs/1906.08019v1 |
https://arxiv.org/pdf/1906.08019v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-scale-estimation-of-structure-from |
Repo | |
Framework | |
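The underlying geometric idea can be sketched in a few lines: with two parallel laser beams a known distance apart, the ratio between that physical separation and the separation of the laser spots on the (unscaled) SfM model gives the model scale. The numbers below are illustrative assumptions; the paper's methods additionally estimate the full laser geometry and avoid manual point picking.

```python
# Toy scale estimate from a parallel laser pair on an unscaled SfM model.
import numpy as np

laser_separation_m = 0.10                       # known physical spacing (10 cm, assumed)
spot_a = np.array([1.32, 0.57, 2.11])           # laser spot positions on the
spot_b = np.array([1.32, 0.98, 2.13])           # 3D model, in model units (assumed)

model_separation = np.linalg.norm(spot_b - spot_a)
scale = laser_separation_m / model_separation   # metres per model unit
print(f"scale: {scale:.4f} m per model unit")
```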