January 30, 2020

3179 words 15 mins read

Paper Group ANR 334

Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap. KRM-based Dialogue Management. One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud. Automatic Routing of Goldstone Diagrams using Genetic Algorithms. Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding. Towards Ethical Machin …

Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap

Title Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap
Authors Miles E. Lopes
Abstract Although the methods of bagging and random forests are some of the most widely used prediction methods, relatively little is known about their algorithmic convergence. In particular, there are not many theoretical guarantees for deciding when an ensemble is “large enough” — so that its accuracy is close to that of an ideal infinite ensemble. Due to the fact that bagging and random forests are randomized algorithms, the choice of ensemble size is closely related to the notion of “algorithmic variance” (i.e. the variance of prediction error due only to the training algorithm). In the present work, we propose a bootstrap method to estimate this variance for bagging, random forests, and related methods in the context of classification. To be specific, suppose the training dataset is fixed, and let the random variable $Err_t$ denote the prediction error of a randomized ensemble of size $t$. Working under a “first-order model” for randomized ensembles, we prove that the centered law of $Err_t$ can be consistently approximated via the proposed method as $t\to\infty$. Meanwhile, the computational cost of the method is quite modest, by virtue of an extrapolation technique. As a consequence, the method offers a practical guideline for deciding when the algorithmic fluctuations of $Err_t$ are negligible.
Tasks
Published 2019-07-20
URL https://arxiv.org/abs/1907.08742v1
PDF https://arxiv.org/pdf/1907.08742v1.pdf
PWC https://paperswithcode.com/paper/estimating-the-algorithmic-variance-of
Repo
Framework
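
A minimal sketch of the core idea above, using a basic bootstrap over the trees of a trained random forest: resample the per-tree test-set predictions with replacement to form bootstrap ensembles of size $t$ and read off the spread of their prediction errors. The dataset, ensemble size, and number of replicates are arbitrary choices, and the extrapolation technique the paper uses to reduce cost is omitted.

```python
# Sketch: bootstrap over ensemble members to estimate the algorithmic variance
# of a random forest's test error (illustrative only; no extrapolation step).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

t = 100  # ensemble size under study
rf = RandomForestClassifier(n_estimators=t, random_state=0).fit(X_tr, y_tr)

# Per-tree votes on the test set: shape (t, n_test)
votes = np.stack([tree.predict(X_te) for tree in rf.estimators_])

rng = np.random.default_rng(0)
boot_errs = []
for _ in range(200):                        # bootstrap replicates
    idx = rng.integers(0, t, size=t)        # resample trees with replacement
    maj = votes[idx].mean(axis=0) > 0.5     # majority vote of the bootstrap ensemble
    boot_errs.append(np.mean(maj != y_te))  # prediction error Err_t*

print("estimated algorithmic std of Err_t:", np.std(boot_errs))
```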

KRM-based Dialogue Management

Title KRM-based Dialogue Management
Authors Wenwu Qu, Xiaoyu Chi, Wei Zheng
Abstract A KRM-based dialogue management (DM) approach is proposed for implementing human-computer dialogue systems in complex scenarios. KRM-based DM has strong descriptive ability and can ensure the logical consistency of the dialogue process. We then introduce a complex application scenario in the Internet of Things (IoT) industry and a dialogue system implemented with the KRM-based DM, in which enterprise customers can customize topics and the system adapts the corresponding topics while interacting with users. The experimental results show that the system completes interactive tasks well and effectively solves the problems of topic switching, information inheritance between topics, and change of dominance.
Tasks Dialogue Management
Published 2019-12-02
URL https://arxiv.org/abs/1912.00669v1
PDF https://arxiv.org/pdf/1912.00669v1.pdf
PWC https://paperswithcode.com/paper/krm-based-dialogue-management
Repo
Framework

One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud

Title One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud
Authors Xiao-Yun Zhou, Zhao-Yang Wang, Peichao Li, Jian-Qing Zheng, Guang-Zhong Yang
Abstract Shape instantiation, which predicts the 3D shape of a dynamic target from one or more 2D images, is important for real-time intra-operative navigation. Previously, a general shape instantiation framework was proposed with manual image segmentation to generate a 2D Statistical Shape Model (SSM) and with Kernel Partial Least Square Regression (KPLSR) to learn the relationship between the 2D and 3D SSM for 3D shape prediction. In this paper, the two-stage shape instantiation is improved to be one-stage. PointOutNet, with 19 convolutional layers and three fully-connected layers, is used as the network structure, and the Chamfer distance is used as the loss function to predict the 3D target point cloud from a single 2D image. With the proposed one-stage shape instantiation algorithm, spontaneous image-to-point-cloud training and inference can be achieved. A dataset of 27 Right Ventricle (RV) subjects, comprising 609 experiments, was used to validate the proposed one-stage shape instantiation algorithm. An average point cloud-to-point cloud (PC-to-PC) error of 1.72mm has been achieved, which is comparable to the PLSR-based (1.42mm) and KPLSR-based (1.31mm) two-stage shape instantiation algorithms.
Tasks Semantic Segmentation
Published 2019-07-24
URL https://arxiv.org/abs/1907.10763v1
PDF https://arxiv.org/pdf/1907.10763v1.pdf
PWC https://paperswithcode.com/paper/one-stage-shape-instantiation-from-a-single
Repo
Framework
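
The loss function named in the abstract is the Chamfer distance between the predicted and ground-truth point clouds. Below is a small NumPy sketch of that distance on toy data; PointOutNet itself and the training loop are not reproduced here.

```python
# Minimal NumPy sketch of the symmetric Chamfer distance used as the training
# loss for image-to-point-cloud prediction.
import numpy as np

def chamfer_distance(P, Q):
    """Average nearest-neighbour squared distance from P to Q and from Q to P.
    P: (n, 3) predicted points, Q: (m, 3) ground-truth points."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Toy usage: a predicted cloud close to the target gives a small loss.
rng = np.random.default_rng(0)
target = rng.normal(size=(609, 3))
pred = target + 0.01 * rng.normal(size=(609, 3))
print(chamfer_distance(pred, target))
```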

Automatic Routing of Goldstone Diagrams using Genetic Algorithms

Title Automatic Routing of Goldstone Diagrams using Genetic Algorithms
Authors Nils Herrmann, Michael Hanrath
Abstract This paper presents an algorithm for the automatic transformation (routing) of time-ordered topologies of Goldstone diagrams (i.e. Wick contractions) into graphical representations of these topologies. Since there is no hard criterion for an optimal routing, the proposed algorithm minimizes an empirically chosen cost function over a set of parameters. Some of the latter are naturally of discrete type (e.g. the interchange of particle/hole lines due to antisymmetry), while others (e.g. the x,y-positions of nodes) are naturally continuous. In order to arrive at a manageable optimization problem, the position space is artificially discretized. In terms of (i) the cost function, (ii) the discrete vertex placement, and (iii) the interchange of particle/hole lines, the routing problem is now well defined and fully discrete. However, it shows an exponential complexity with the number of vertices, which suggests applying a genetic algorithm for its solution. The presented algorithm is capable of routing non-trivial Goldstone diagrams (with several loops and crossings). The resulting diagrams are qualitatively fully equivalent to manually routed ones. The proposed algorithm is successfully applied to several Coupled Cluster approaches and a perturbative (fixpoint-iterative) CCSD expansion with repeated diagram substitution.
Tasks
Published 2019-06-30
URL https://arxiv.org/abs/1907.00426v1
PDF https://arxiv.org/pdf/1907.00426v1.pdf
PWC https://paperswithcode.com/paper/automatic-routing-of-goldstone-diagrams-using
Repo
Framework
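
As a rough illustration of the optimization setting only (not the paper's implementation), the sketch below runs a plain genetic algorithm over a discrete genome holding grid positions of vertices and particle/hole swap bits. The cost function is a stand-in placeholder; the paper's empirically chosen cost (penalizing crossings, loop shapes, etc.) would replace it.

```python
# Hedged sketch: genetic algorithm over a discrete routing genome.
import random

GRID, N_VERTICES = 8, 6          # assumed discretized layout size

def random_genome():
    pos = [(random.randrange(GRID), random.randrange(GRID)) for _ in range(N_VERTICES)]
    swaps = [random.randint(0, 1) for _ in range(N_VERTICES)]   # particle/hole swaps
    return pos, swaps

def cost(genome):
    # Placeholder cost: penalize vertices placed on the same grid point.
    pos, _ = genome
    return N_VERTICES - len(set(pos))

def crossover(a, b):
    cut = random.randrange(1, N_VERTICES)
    return a[0][:cut] + b[0][cut:], a[1][:cut] + b[1][cut:]

def mutate(genome, rate=0.1):
    pos, swaps = genome
    pos = [(random.randrange(GRID), random.randrange(GRID)) if random.random() < rate else p
           for p in pos]
    swaps = [1 - s if random.random() < rate else s for s in swaps]
    return pos, swaps

population = [random_genome() for _ in range(50)]
for _ in range(100):                                   # generations
    population.sort(key=cost)
    parents = population[:10]                          # elitist selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(40)]
    population = parents + children

best = min(population, key=cost)
print("best cost:", cost(best))
```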

Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding

Title Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding
Authors A. Mishra, H. Mishra, S. Rathee
Abstract Humans have entered the age of algorithms. Each minute, algorithms shape countless preferences, from suggesting a product to suggesting a potential life partner. In the marketplace, algorithms are trained to learn consumer preferences from customer reviews, because user-generated reviews are considered the voice of customers and a valuable source of information for firms. Insights mined from reviews play an indispensable role in several business activities, ranging from product recommendation and targeted advertising to promotions and segmentation. In this research, we question whether reviews might hold stereotypic gender bias that algorithms learn and propagate. Utilizing data from millions of observations and a word embedding approach, GloVe, we show that algorithms designed to learn from human language output also learn gender bias. We also examine why such biases occur: whether the bias is caused by a negative bias against females or a positive bias for males. We examine the impact of gender bias in reviews on choice and conclude with policy implications for female consumers, especially when they are unaware of the bias, and with the ethical implications for firms.
Tasks Product Recommendation
Published 2019-02-01
URL http://arxiv.org/abs/1902.00496v1
PDF http://arxiv.org/pdf/1902.00496v1.pdf
PWC https://paperswithcode.com/paper/examining-the-presence-of-gender-bias-in
Repo
Framework
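
One common way to quantify the kind of bias described above is a relative cosine-similarity score between a target word and gendered anchor words in the embedding space. The sketch below illustrates such a score with toy vectors; it is not necessarily the authors' exact metric, and in practice the vectors would be GloVe embeddings trained on the review corpus.

```python
# Sketch of a relative-similarity gender-bias score over word embeddings.
import numpy as np

vecs = {  # placeholder 3-d embeddings; replace with GloVe vectors in practice
    "he":        np.array([0.9, 0.1, 0.0]),
    "she":       np.array([0.1, 0.9, 0.0]),
    "competent": np.array([0.8, 0.2, 0.1]),
    "emotional": np.array([0.2, 0.8, 0.1]),
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def gender_bias(word):
    """Positive -> word sits closer to 'he'; negative -> closer to 'she'."""
    return cos(vecs[word], vecs["he"]) - cos(vecs[word], vecs["she"])

for w in ("competent", "emotional"):
    print(w, round(gender_bias(w), 3))
```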

Towards Ethical Machines Via Logic Programming

Title Towards Ethical Machines Via Logic Programming
Authors Abeer Dyoub, Stefania Costantini, Francesca A. Lisi
Abstract Autonomous intelligent agents are playing increasingly important roles in our lives. They contain information about us and are starting to perform tasks on our behalf. Chatbots are an example of such agents, as they need to engage in complex conversations with humans. Thus, we need to ensure that they behave ethically. In this work we propose a hybrid logic-based approach for ethical chatbots.
Tasks
Published 2019-09-18
URL https://arxiv.org/abs/1909.08255v1
PDF https://arxiv.org/pdf/1909.08255v1.pdf
PWC https://paperswithcode.com/paper/towards-ethical-machines-via-logic
Repo
Framework

Width Provably Matters in Optimization for Deep Linear Neural Networks

Title Width Provably Matters in Optimization for Deep Linear Neural Networks
Authors Simon S. Du, Wei Hu
Abstract We prove that for an $L$-layer fully-connected linear neural network, if the width of every hidden layer is $\tilde\Omega (L \cdot r \cdot d_{\mathrm{out}} \cdot \kappa^3 )$, where $r$ and $\kappa$ are the rank and the condition number of the input data, and $d_{\mathrm{out}}$ is the output dimension, then gradient descent with Gaussian random initialization converges to a global minimum at a linear rate. The number of iterations to find an $\epsilon$-suboptimal solution is $O(\kappa \log(\frac{1}{\epsilon}))$. Our polynomial upper bound on the total running time for wide deep linear networks and the $\exp\left(\Omega\left(L\right)\right)$ lower bound for narrow deep linear neural networks [Shamir, 2018] together demonstrate that wide layers are necessary for optimizing deep models.
Tasks
Published 2019-01-24
URL https://arxiv.org/abs/1901.08572v3
PDF https://arxiv.org/pdf/1901.08572v3.pdf
PWC https://paperswithcode.com/paper/width-provably-matters-in-optimization-for
Repo
Framework

Meta Dynamic Pricing: Learning Across Experiments

Title Meta Dynamic Pricing: Learning Across Experiments
Authors Hamsa Bastani, David Simchi-Levi, Ruihao Zhu
Abstract We study the problem of learning shared structure \emph{across} a sequence of dynamic pricing experiments for related products. We consider a practical formulation where the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon $T$) for $N$ different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (\emph{meta-exploration}) with the need to leverage the estimated prior to achieve good performance (\emph{meta-exploitation}), and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the prior as a function of its estimation error, thereby ensuring convergence of each price experiment. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in $N$; an immediate consequence of our analysis is that the price of an unknown prior in Thompson sampling is negligible in experiment-rich environments with shared structure (large $N$). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared to prior-independent algorithms or a naive approach of greedily using the updated prior across products.
Tasks
Published 2019-02-28
URL https://arxiv.org/abs/1902.10918v2
PDF https://arxiv.org/pdf/1902.10918v2.pdf
PWC https://paperswithcode.com/paper/meta-dynamic-pricing-learning-across
Repo
Framework
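
The sketch below is a stylized illustration of the two ingredients highlighted in the abstract: Thompson sampling within one pricing experiment, and a prior estimated from earlier products that is widened as a function of how little data it is based on. The linear demand model, the specific widening rule, and all constants are assumptions for illustration, not the paper's algorithm.

```python
# Stylized sketch: Thompson-sampling pricing with a "widened" estimated prior
# for a linear demand model d = a - b * p + noise.
import numpy as np

rng = np.random.default_rng(0)
true_a, true_b, sigma = 10.0, 1.5, 0.5           # unknown demand parameters

# Prior estimated from N earlier products, widened roughly like 1/sqrt(N).
N = 5
mu0 = np.array([9.0, 1.2])                       # estimated prior mean of (a, b)
Sigma0 = np.eye(2) * (0.5 + 1.0 / np.sqrt(N))    # widened prior covariance

Lam = np.linalg.inv(Sigma0)                      # posterior precision
eta = Lam @ mu0                                  # precision-weighted mean
revenue = 0.0
for t in range(200):                             # one experiment, horizon T = 200
    Sigma = np.linalg.inv(Lam)
    mu = Sigma @ eta
    a_s, b_s = rng.multivariate_normal(mu, Sigma)         # Thompson sample
    price = float(np.clip(a_s / (2 * max(b_s, 0.1)), 0.1, 20.0))
    demand = true_a - true_b * price + sigma * rng.normal()
    revenue += price * demand
    x = np.array([1.0, -price])                  # demand = [1, -p] @ [a, b] + noise
    Lam += np.outer(x, x) / sigma**2             # Bayesian linear-regression update
    eta += x * demand / sigma**2

mu = np.linalg.inv(Lam) @ eta
print("total revenue:", round(revenue, 1), "posterior mean (a, b):", np.round(mu, 2))
```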

Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks

Title Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks
Authors Md Zahangir Alom, Theus Aspiras, Tarek M. Taha, Vijayan K. Asari
Abstract In the last few years, Deep Learning (DL) has been showing superior performance in different modalities of biomedical image analysis. Several DL architectures have been proposed for classification, segmentation, and detection tasks in medical imaging and computational pathology. In this paper, we propose a new DL architecture, the NABLA-N network, with better feature fusion techniques in decoding units for dermoscopic image segmentation tasks. The NABLA-N network has several advances for segmentation tasks. First, this model ensures better feature representation for semantic segmentation with a combination of low- to high-level feature maps. Second, this network shows better quantitative and qualitative results with the same or fewer network parameters compared to other methods. In addition, the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model is used for skin cancer classification. The proposed NABLA-N network and IRRCNN models are evaluated for skin cancer segmentation and classification on the benchmark datasets from the International Skin Imaging Collaboration 2018 (ISIC-2018). The experimental results show superior performance on segmentation tasks compared to the Recurrent Residual U-Net (R2U-Net). The classification model achieves around 87% testing accuracy for dermoscopic skin cancer classification on the ISIC-2018 dataset.
Tasks Semantic Segmentation, Skin Cancer Classification, Skin Cancer Segmentation
Published 2019-04-25
URL http://arxiv.org/abs/1904.11126v1
PDF http://arxiv.org/pdf/1904.11126v1.pdf
PWC https://paperswithcode.com/paper/190411126
Repo
Framework

Learning with minibatch Wasserstein : asymptotic and gradient properties

Title Learning with minibatch Wasserstein : asymptotic and gradient properties
Authors Kilian Fatras, Younes Zine, Rémi Flamary, Rémi Gribonval, Nicolas Courty
Abstract Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, i.e. they average the outcome of several smaller optimal transport problems. In this paper we propose an analysis of this practice, whose effects are not yet well understood. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients, and a concentration bound around the expectation, but also with defects such as the loss of the distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs, and color transfer that highlight the practical interest of this strategy.
Tasks
Published 2019-10-09
URL https://arxiv.org/abs/1910.04091v3
PDF https://arxiv.org/pdf/1910.04091v3.pdf
PWC https://paperswithcode.com/paper/learning-with-minibatch-wasserstein
Repo
Framework
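
A minimal sketch of the minibatch estimator being analysed: draw random minibatches from the two samples, solve the exact OT problem on each, and average the costs. It assumes the POT library (`pip install pot`) for the exact solver.

```python
# Minibatch Wasserstein estimate: average exact OT costs over random minibatches.
import numpy as np
import ot  # Python Optimal Transport (POT)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(5000, 2))           # samples from the source
Y = rng.normal(1.0, 1.0, size=(5000, 2))           # samples from the target

def minibatch_w2(X, Y, batch=128, k=50):
    vals = []
    for _ in range(k):
        xb = X[rng.choice(len(X), batch, replace=False)]
        yb = Y[rng.choice(len(Y), batch, replace=False)]
        M = ot.dist(xb, yb)                        # squared Euclidean cost matrix
        a = b = np.full(batch, 1.0 / batch)        # uniform minibatch weights
        vals.append(ot.emd2(a, b, M))              # exact OT cost of this minibatch
    return float(np.mean(vals))

print("minibatch squared-W2 estimate:", minibatch_w2(X, Y))
```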

An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

Title An improper estimator with optimal excess risk in misspecified density estimation and logistic regression
Authors Jaouad Mourtada, Stéphane Gaïffas
Abstract We introduce a procedure for predictive conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This predictor minimizes a new general excess risk bound, which critically remains valid under model misspecification. On standard examples, this bound scales as $d/n$ where $d$ is the dimension of the model and $n$ the sample size, regardless of the true distribution. The SMP, which is an improper (out-of-model) procedure, improves over proper (within-model) estimators (such as the maximum likelihood estimator), whose excess risk can degrade arbitrarily in the misspecified case. For density estimation, our bounds improve over approaches based on online-to-batch conversion, by removing suboptimal $\log n$ factors, addressing an open problem from Grünwald and Kotłowski (2011) for the considered models. For the Gaussian linear model, the SMP admits an explicit expression, and its expected excess risk in the general misspecified case is at most twice the minimax excess risk in the \emph{well-specified case}, but without any condition on the noise variance or approximation error of the linear model. For logistic regression, a penalized SMP can be computed efficiently by training two logistic regressions, and achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ is a bound on the norm of the features and $B$ the norm of the comparison linear predictor. This improves the rates of proper (within-model) estimators, since such procedures can achieve no better rate than $\min(BR/\sqrt{n},de^{BR}/n)$ in general. This also provides a computationally more efficient alternative to approaches based on online-to-batch conversion of Bayesian mixture procedures, which require approximate posterior sampling, thereby partly answering a question by Foster et al. (2018).
Tasks Density Estimation
Published 2019-12-23
URL https://arxiv.org/abs/1912.10784v1
PDF https://arxiv.org/pdf/1912.10784v1.pdf
PWC https://paperswithcode.com/paper/an-improper-estimator-with-optimal-excess
Repo
Framework
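
For intuition only, the sketch below illustrates the "two logistic regressions" mechanism mentioned in the abstract: refit with the query point labelled each way, then normalize the likelihoods the two refitted models assign to their own virtual label. The penalization and exact normalization of the penalized SMP follow the paper; this is a simplified, unpenalized variant.

```python
# Sketch of an SMP-style improper predictor for binary logistic regression.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
x_new = X[-1]
X, y = X[:-1], y[:-1]                             # hold out one query point

def smp_predict(X, y, x_new, C=1.0):
    probs = []
    for label in (0, 1):
        Xa = np.vstack([X, x_new])                # append the query point
        ya = np.append(y, label)                  # ... with a virtual label
        clf = LogisticRegression(C=C, max_iter=1000).fit(Xa, ya)
        # likelihood the refitted model assigns to its own virtual label
        probs.append(clf.predict_proba(x_new[None, :])[0, label])
    probs = np.array(probs)
    return probs / probs.sum()                    # normalized improper prediction

print("SMP-style predictive distribution:", np.round(smp_predict(X, y, x_new), 3))
```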

Diversely Stale Parameters for Efficient Training of CNNs

Title Diversely Stale Parameters for Efficient Training of CNNs
Authors An Xu, Zhouyuan Huo, Heng Huang
Abstract The backpropagation algorithm is the most popular algorithm for training neural networks nowadays. However, it suffers from the forward-locking, backward-locking, and update-locking problems, especially when a neural network is so large that its layers are distributed across multiple devices. Existing solutions either handle only one locking problem or lead to severe accuracy loss or memory inefficiency. Moreover, none of them consider the straggler problem among devices. In this paper, we propose Layer-wise Staleness and a novel efficient training algorithm, Diversely Stale Parameters (DSP), which addresses all these challenges without loss of accuracy or memory issues. We also analyze the convergence of DSP with two popular gradient-based methods and prove that both of them are guaranteed to converge to critical points for non-convex problems. Finally, extensive experimental results on training deep convolutional neural networks demonstrate that our proposed DSP algorithm can achieve significant training speedup with stronger robustness and better generalization than the compared methods.
Tasks
Published 2019-09-05
URL https://arxiv.org/abs/1909.02625v2
PDF https://arxiv.org/pdf/1909.02625v2.pdf
PWC https://paperswithcode.com/paper/diversely-stale-parameters-for-efficient
Repo
Framework

A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image

Title A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image
Authors Md Kamrul Hasan, Nazmus Sakib, Joshua Field, Richard R. Love, Sheikh I. Ahamed
Abstract Over the last decade, smartphones have changed radically to support us with mHealth technology, cloud computing, and machine learning algorithms. Building on these multifaceted capabilities, we present a novel smartphone-based noninvasive hemoglobin (Hb) level prediction model that analyzes the hue, saturation, and value (HSV) of a fingertip video. We collect 60 videos of 60 subjects from two different locations: the Blood Center of Wisconsin, USA, and AmaderGram, Bangladesh. We extract red, green, and blue (RGB) pixel intensities of selected images of those videos captured by the smartphone camera with the flash on. Then we convert the RGB values of the selected video frames into the HSV color space and generate histogram values of these HSV pixel intensities. We average these histogram values over a fingertip video and treat the result as one observation against the gold-standard Hb concentration. We generate two input feature matrices based on the observations of the two data sets. The Partial Least Squares (PLS) algorithm is applied to the input feature matrices. We observe $R^2 = 0.95$ in both data sets. We analyze our data using Python OpenCV, Matlab, and the R statistics tool.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.02579v1
PDF https://arxiv.org/pdf/1910.02579v1.pdf
PWC https://paperswithcode.com/paper/a-novel-technique-of-noninvasive-hemoglobin
Repo
Framework
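
A sketch of the described pipeline, assuming OpenCV for frame extraction and scikit-learn for PLS: per-frame HSV histograms are averaged into one observation per video and regressed against gold-standard Hb values. The file names, bin counts, frame limits, and number of PLS components below are placeholders.

```python
# Fingertip video -> averaged HSV histograms -> PLS regression of Hb level.
import cv2
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def video_hsv_feature(path, bins=32, max_frames=100):
    cap = cv2.VideoCapture(path)
    hists = []
    while len(hists) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        # one histogram per HSV channel, concatenated into a single feature vector
        h = [np.histogram(hsv[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
        hists.append(np.concatenate(h))
    cap.release()
    if not hists:
        raise FileNotFoundError(f"could not read frames from {path}")
    return np.mean(hists, axis=0)                 # average over frames -> 1 observation

# Placeholder (video path, gold-standard Hb) pairs; extend with all 60 subjects.
videos = [("subject01.mp4", 13.2), ("subject02.mp4", 11.8)]
X = np.stack([video_hsv_feature(p) for p, _ in videos])
y = np.array([hb for _, hb in videos])

pls = PLSRegression(n_components=2).fit(X, y)     # tune n_components on full data
print("R^2 on training data:", pls.score(X, y))
```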

Towards Successful Collaboration: Design Guidelines for AI-based Services enriching Information Systems in Organisations

Title Towards Successful Collaboration: Design Guidelines for AI-based Services enriching Information Systems in Organisations
Authors Nicholas R. J. Frick, Felix Brünker, Björn Ross, Stefan Stieglitz
Abstract Information systems (IS) are widely used in organisations to improve business performance. The steady progress of technologies such as artificial intelligence (AI), together with the need to secure the future success of organisations, leads to new requirements for IS. This research in progress first introduces the term AI-based services (AIBS), describing AI as a component that enriches IS, aims at collaborating with employees, and assists in the execution of work-related tasks. Following Design Science Research (DSR), the study derives requirements for the successful design of AIBS from ten expert interviews. For a successful deployment of AIBS in organisations, the D&M IS Success Model is used to validate requirements within three major dimensions of quality: Information Quality, System Quality, and Service Quality. Amongst others, preliminary findings propose that AIBS should preferably be authentic. Further discussion and research on AIBS is thereby encouraged, providing first insights into the deployment of AIBS in organisations.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.01077v1
PDF https://arxiv.org/pdf/1912.01077v1.pdf
PWC https://paperswithcode.com/paper/towards-successful-collaboration-design
Repo
Framework

Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics

Title Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics
Authors Cheng Deng, Zhao Li, Xinbo Gao, Dacheng Tao
Abstract As JPEG is the most widely used image format, the importance of tampering detection for JPEG images in blind forensics is self-evident. In this area, extracting effective statistical characteristics from a JPEG image for classification remains a challenge. In traditional methods, effective features are designed manually, which requires extensive, labor-intensive research and derivation. In this paper, we propose a novel image tampering detection method based on deep multi-scale discriminative networks (MSD-Nets). The multi-scale module is designed to automatically extract multiple features from the discrete cosine transform (DCT) coefficient histograms of the JPEG image. This module can capture the characteristic information in different scale spaces. In addition, a discriminative module is utilized to improve the detection performance of the networks in the difficult situations where the first compression quality (QF1) is higher than the second one (QF2). A special network in this module is designed to distinguish the small statistical differences between authentic and tampered regions in these cases. Finally, a probability map is obtained and the specific tampering area is located using the final classification results. Extensive experiments demonstrate the superiority of our proposed method in both quantitative and qualitative metrics when compared with state-of-the-art approaches.
Tasks
Published 2019-04-04
URL http://arxiv.org/abs/1904.02520v1
PDF http://arxiv.org/pdf/1904.02520v1.pdf
PWC https://paperswithcode.com/paper/deep-multi-scale-discriminative-networks-for
Repo
Framework
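
The input representation named above, DCT coefficient histograms, can be sketched as follows: block the luminance channel into 8x8 tiles, take each block's 2D DCT, and histogram a few low-frequency coefficient positions. Which positions and bins to keep, and the image path, are assumptions here; the paper's multi-scale module would consume such histograms at several scales.

```python
# Block-DCT coefficient histograms as forensic features for a JPEG image.
import numpy as np
from PIL import Image
from scipy.fft import dctn

def dct_coeff_histograms(path, n_coeffs=9, bins=np.arange(-20.5, 21.5)):
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) - 128.0
    h, w = img.shape
    # crop to multiples of 8 and cut into 8x8 blocks
    blocks = (img[:h - h % 8, :w - w % 8]
              .reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3).reshape(-1, 8, 8))
    coeffs = np.stack([dctn(b, norm="ortho") for b in blocks])   # (n_blocks, 8, 8)
    # histogram a few low-frequency AC positions (DC excluded)
    positions = [(0, 1), (1, 0), (1, 1), (0, 2), (2, 0), (1, 2), (2, 1), (2, 2), (0, 3)]
    feats = []
    for (i, j) in positions[:n_coeffs]:
        hist, _ = np.histogram(np.round(coeffs[:, i, j]), bins=bins, density=True)
        feats.append(hist)
    return np.concatenate(feats)    # feature vector for the multi-scale module

features = dct_coeff_histograms("suspect.jpg")   # path is a placeholder
print(features.shape)
```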