Paper Group ANR 334
Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap. KRM-based Dialogue Management. One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud. Automatic Routing of Goldstone Diagrams using Genetic Algorithms. Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding. Towards Ethical Machines Via Logic Programming. Width Provably Matters in Optimization for Deep Linear Neural Networks. Meta Dynamic Pricing: Learning Across Experiments. Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks. Learning with minibatch Wasserstein : asymptotic and gradient properties. An improper estimator with optimal excess risk in misspecified density estimation and logistic regression. Diversely Stale Parameters for Efficient Training of CNNs. A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image. Towards Successful Collaboration: Design Guidelines for AI-based Services enriching Information Systems in Organisations. Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics.
Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap
Title | Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap |
Authors | Miles E. Lopes |
Abstract | Although the methods of bagging and random forests are some of the most widely used prediction methods, relatively little is known about their algorithmic convergence. In particular, there are not many theoretical guarantees for deciding when an ensemble is “large enough” — so that its accuracy is close to that of an ideal infinite ensemble. Due to the fact that bagging and random forests are randomized algorithms, the choice of ensemble size is closely related to the notion of “algorithmic variance” (i.e. the variance of prediction error due only to the training algorithm). In the present work, we propose a bootstrap method to estimate this variance for bagging, random forests, and related methods in the context of classification. To be specific, suppose the training dataset is fixed, and let the random variable $Err_t$ denote the prediction error of a randomized ensemble of size $t$. Working under a “first-order model” for randomized ensembles, we prove that the centered law of $Err_t$ can be consistently approximated via the proposed method as $t\to\infty$. Meanwhile, the computational cost of the method is quite modest, by virtue of an extrapolation technique. As a consequence, the method offers a practical guideline for deciding when the algorithmic fluctuations of $Err_t$ are negligible. |
Tasks | |
Published | 2019-07-20 |
URL | https://arxiv.org/abs/1907.08742v1 |
https://arxiv.org/pdf/1907.08742v1.pdf | |
PWC | https://paperswithcode.com/paper/estimating-the-algorithmic-variance-of |
Repo | |
Framework | |
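
The bootstrap described in the abstract resamples the randomized ensemble members themselves, with the training data held fixed. A minimal sketch of that idea, assuming a scikit-learn random forest, a held-out test set for measuring $Err_t$, and a plain resampling loop in place of the paper's extrapolation technique; all of these are illustrative simplifications rather than the paper's exact procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Fixed training set and a random forest of size t.
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
t = 200
forest = RandomForestClassifier(n_estimators=t, random_state=0).fit(X_tr, y_tr)

# Per-tree votes on the held-out set: an (n_test, t) matrix of 0/1 predictions.
votes = np.stack([tree.predict(X_te) for tree in forest.estimators_], axis=1)

def ensemble_error(vote_matrix, y_true):
    """Test error of the majority-vote ensemble formed by the given columns of votes."""
    majority = (vote_matrix.mean(axis=1) >= 0.5).astype(int)
    return float(np.mean(majority != y_true))

# Bootstrap over the t randomized members: resample trees with replacement and
# recompute the ensemble error, giving draws of the algorithmic fluctuation of Err_t.
rng = np.random.default_rng(0)
boot_errs = np.array([ensemble_error(votes[:, rng.integers(0, t, size=t)], y_te)
                      for _ in range(500)])
print("observed Err_t:", ensemble_error(votes, y_te),
      "| bootstrap SD of Err_t:", boot_errs.std().round(4))
```

The spread of the bootstrap errors gives a rough sense of when the algorithmic fluctuations of $Err_t$ are negligible for the chosen ensemble size.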
KRM-based Dialogue Management
Title | KRM-based Dialogue Management |
Authors | Wenwu Qu, Xiaoyu Chi, Wei Zheng |
Abstract | A KRM-based dialogue management (DM) framework is proposed for implementing human-computer dialogue systems in complex scenarios. KRM-based DM has strong descriptive ability and can ensure the logical consistency of the dialogue process. We then introduce a complex application scenario in the Internet of Things (IoT) industry and a dialogue system implemented on top of the KRM-based DM, in which the system allows enterprise customers to customize topics and adapts the corresponding topics while interacting with users. The experimental results show that the system completes the interactive tasks well and can effectively solve the problems of topic switching, information inheritance between topics, and change of dominance. |
Tasks | Dialogue Management |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.00669v1 |
https://arxiv.org/pdf/1912.00669v1.pdf | |
PWC | https://paperswithcode.com/paper/krm-based-dialogue-management |
Repo | |
Framework | |
One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud
Title | One-stage Shape Instantiation from a Single 2D Image to 3D Point Cloud |
Authors | Xiao-Yun Zhou, Zhao-Yang Wang, Peichao Li, Jian-Qing Zheng, Guang-Zhong Yang |
Abstract | Shape instantiation, which predicts the 3D shape of a dynamic target from one or more 2D images, is important for real-time intra-operative navigation. Previously, a general shape instantiation framework was proposed that used manual image segmentation to generate a 2D Statistical Shape Model (SSM) and Kernel Partial Least Squares Regression (KPLSR) to learn the relationship between the 2D and 3D SSM for 3D shape prediction. In this paper, the two-stage shape instantiation is improved to be one-stage. PointOutNet, with 19 convolutional layers and three fully-connected layers, is used as the network structure, and the Chamfer distance is used as the loss function to predict the 3D target point cloud from a single 2D image. With the proposed one-stage shape instantiation algorithm, direct image-to-point-cloud training and inference can be achieved. A dataset of 27 Right Ventricle (RV) subjects, comprising 609 experiments, was used to validate the proposed one-stage shape instantiation algorithm. An average point cloud-to-point cloud (PC-to-PC) error of 1.72 mm was achieved, which is comparable to the PLSR-based (1.42 mm) and KPLSR-based (1.31 mm) two-stage shape instantiation algorithms. |
Tasks | Semantic Segmentation |
Published | 2019-07-24 |
URL | https://arxiv.org/abs/1907.10763v1 |
https://arxiv.org/pdf/1907.10763v1.pdf | |
PWC | https://paperswithcode.com/paper/one-stage-shape-instantiation-from-a-single |
Repo | |
Framework | |
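
The loss function named above, the Chamfer distance, compares a predicted point cloud to the ground-truth cloud via nearest-neighbour distances in both directions. A minimal NumPy sketch, assuming small point clouds of shape (N, 3) and (M, 3); a real training setup would use a differentiable GPU implementation inside the network's training loop.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N, 3) and q (M, 3):
    mean squared nearest-neighbour distance from p to q plus from q to p."""
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)   # pairwise squared distances, (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Toy usage: identical clouds give 0, a small perturbation gives a small positive value.
target = np.random.default_rng(0).random((1024, 3))
print(chamfer_distance(target, target))
print(chamfer_distance(target + 0.01, target))
```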
Automatic Routing of Goldstone Diagrams using Genetic Algorithms
Title | Automatic Routing of Goldstone Diagrams using Genetic Algorithms |
Authors | Nils Herrmann, Michael Hanrath |
Abstract | This paper presents an algorithm for the automatic transformation (= routing) of time-ordered topologies of Goldstone diagrams (i.e. Wick contractions) into graphical representations of these topologies. Since there is no hard criterion for an optimal routing, the proposed algorithm minimizes an empirically chosen cost function over a set of parameters. Some of the latter are naturally of discrete type (e.g. interchange of particle/hole lines due to antisymmetry) while others (e.g. the x,y-positions of nodes) are naturally continuous. In order to arrive at a manageable optimization problem, the position space is artificially discretized. In terms of (i) the cost function, (ii) the discrete vertex placement, and (iii) the interchange of particle/hole lines, the routing problem is now well defined and fully discrete. However, it shows an exponential complexity with the number of vertices, suggesting the use of a genetic algorithm for its solution. The presented algorithm is capable of routing non-trivial (several loops and crossings) Goldstone diagrams. The resulting diagrams are qualitatively fully equivalent to manually routed ones. The proposed algorithm is successfully applied to several Coupled Cluster approaches and a perturbative (fixed-point iterative) CCSD expansion with repeated diagram substitution. |
Tasks | |
Published | 2019-06-30 |
URL | https://arxiv.org/abs/1907.00426v1 |
https://arxiv.org/pdf/1907.00426v1.pdf | |
PWC | https://paperswithcode.com/paper/automatic-routing-of-goldstone-diagrams-using |
Repo | |
Framework | |
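
The routing task above is a discrete minimization of an empirically chosen cost over vertex positions and particle/hole line interchanges, attacked with a genetic algorithm. The sketch below shows only the generic optimization loop, with the cost function, layout encoding, and genetic operators passed in as parameters; the operators and settings are illustrative, not the paper's.

```python
import random

def genetic_minimize(cost, random_layout, mutate, crossover,
                     pop_size=100, generations=500, elite=10):
    """Minimize `cost` over discrete layouts with a simple elitist genetic algorithm."""
    population = [random_layout() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)                     # best layouts first
        parents = population[:elite]                  # elite survives unchanged
        children = []
        while len(children) < pop_size - elite:
            a, b = random.sample(parents, 2)
            children.append(mutate(crossover(a, b)))  # recombine, then perturb
        population = parents + children
    return min(population, key=cost)

# Toy usage: recover a 20-bit string of all ones (a stand-in for a diagram layout).
best = genetic_minimize(
    cost=lambda bits: bits.count(0),
    random_layout=lambda: [random.randint(0, 1) for _ in range(20)],
    mutate=lambda bits: [b ^ (random.random() < 0.05) for b in bits],
    crossover=lambda a, b: [random.choice(pair) for pair in zip(a, b)],
)
print(best)
```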
Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding
Title | Examining the Presence of Gender Bias in Customer Reviews Using Word Embedding |
Authors | A. Mishra, H. Mishra, S. Rathee |
Abstract | Humans have entered the age of algorithms. Each minute, algorithms shape countless preferences, from suggesting a product to suggesting a potential life partner. In the marketplace, algorithms are trained to learn consumer preferences from customer reviews, because user-generated reviews are considered the voice of customers and a valuable source of information for firms. Insights mined from reviews play an indispensable role in several business activities, ranging from product recommendation and targeted advertising to promotions and segmentation. In this research, we ask whether reviews might hold stereotypic gender bias that algorithms learn and propagate. Utilizing data from millions of observations and a word embedding approach, GloVe, we show that algorithms designed to learn from human language output also learn gender bias. We also examine why such biases occur: whether the bias is caused by a negative bias against females or a positive bias for males. We examine the impact of gender bias in reviews on choice and conclude with policy implications for female consumers, especially when they are unaware of the bias, and ethical implications for firms. |
Tasks | Product Recommendation |
Published | 2019-02-01 |
URL | http://arxiv.org/abs/1902.00496v1 |
http://arxiv.org/pdf/1902.00496v1.pdf | |
PWC | https://paperswithcode.com/paper/examining-the-presence-of-gender-bias-in |
Repo | |
Framework | |
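
A minimal sketch of the kind of embedding-based bias measurement implied above: compare cosine similarities between attribute words and groups of gendered words. The publicly available GloVe vectors loaded via gensim and the short word lists are stand-ins for the authors' review-trained embeddings and test sets.

```python
import numpy as np
import gensim.downloader as api

# Publicly available GloVe vectors stand in for embeddings trained on review text.
glove = api.load("glove-wiki-gigaword-100")

female, male = ["she", "woman", "her"], ["he", "man", "his"]
attributes = ["competent", "logical", "emotional", "helpful"]

def association(word, group):
    """Mean cosine similarity between an attribute word and a group of gendered words."""
    return float(np.mean([glove.similarity(word, g) for g in group]))

for w in attributes:
    # A positive gap means the attribute sits closer to the female words in embedding space.
    print(w, round(association(w, female) - association(w, male), 3))
```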
Towards Ethical Machines Via Logic Programming
Title | Towards Ethical Machines Via Logic Programming |
Authors | Abeer Dyoub, Stefania Costantini, Francesca A. Lisi |
Abstract | Autonomous intelligent agents are playing increasingly important roles in our lives. They contain information about us and have started to perform tasks on our behalf. Chatbots are an example of such agents that need to engage in complex conversations with humans. Thus, we need to ensure that they behave ethically. In this work we propose a hybrid logic-based approach for ethical chatbots. |
Tasks | |
Published | 2019-09-18 |
URL | https://arxiv.org/abs/1909.08255v1 |
https://arxiv.org/pdf/1909.08255v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-ethical-machines-via-logic |
Repo | |
Framework | |
Width Provably Matters in Optimization for Deep Linear Neural Networks
Title | Width Provably Matters in Optimization for Deep Linear Neural Networks |
Authors | Simon S. Du, Wei Hu |
Abstract | We prove that for an $L$-layer fully-connected linear neural network, if the width of every hidden layer is $\tilde\Omega (L \cdot r \cdot d_{\mathrm{out}} \cdot \kappa^3 )$, where $r$ and $\kappa$ are the rank and the condition number of the input data, and $d_{\mathrm{out}}$ is the output dimension, then gradient descent with Gaussian random initialization converges to a global minimum at a linear rate. The number of iterations to find an $\epsilon$-suboptimal solution is $O(\kappa \log(\frac{1}{\epsilon}))$. Our polynomial upper bound on the total running time for wide deep linear networks and the $\exp\left(\Omega\left(L\right)\right)$ lower bound for narrow deep linear neural networks [Shamir, 2018] together demonstrate that wide layers are necessary for optimizing deep models. |
Tasks | |
Published | 2019-01-24 |
URL | https://arxiv.org/abs/1901.08572v3 |
https://arxiv.org/pdf/1901.08572v3.pdf | |
PWC | https://paperswithcode.com/paper/width-provably-matters-in-optimization-for |
Repo | |
Framework | |
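
A small NumPy simulation of the setting analysed above: gradient descent with Gaussian random initialization on a deep linear network trained with the squared loss $\frac{1}{2n}\|W_L \cdots W_1 X - Y\|_F^2$. The width, depth, step size, and initialization scale are arbitrary illustrative choices and do not match the theorem's constants.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, width, depth, n = 10, 5, 100, 4, 50
X = rng.standard_normal((d_in, n))
Y = rng.standard_normal((d_out, n))

# Gaussian random initialization; dims lists the layer sizes from input to output.
dims = [d_in] + [width] * (depth - 1) + [d_out]
W = [rng.standard_normal((dims[i + 1], dims[i])) / np.sqrt(dims[i]) for i in range(depth)]

lr = 2e-3
for step in range(4001):
    acts = [X]
    for Wi in W:                                   # forward pass through the linear layers
        acts.append(Wi @ acts[-1])
    resid = acts[-1] - Y
    if step % 1000 == 0:
        print(step, round(0.5 / n * np.sum(resid ** 2), 4))   # decreases toward the least-squares optimum
    grad_out = resid / n                           # gradient of the (1/n)-scaled squared loss
    for i in reversed(range(depth)):
        grad_Wi = grad_out @ acts[i].T             # dL/dW_i, computed with the pre-update weights
        grad_out = W[i].T @ grad_out
        W[i] = W[i] - lr * grad_Wi
```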
Meta Dynamic Pricing: Learning Across Experiments
Title | Meta Dynamic Pricing: Learning Across Experiments |
Authors | Hamsa Bastani, David Simchi-Levi, Ruihao Zhu |
Abstract | We study the problem of learning shared structure \emph{across} a sequence of dynamic pricing experiments for related products. We consider a practical formulation where the unknown demand parameters for each product come from an unknown distribution (prior) that is shared across products. We then propose a meta dynamic pricing algorithm that learns this prior online while solving a sequence of Thompson sampling pricing experiments (each with horizon $T$) for $N$ different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (\emph{meta-exploration}) with the need to leverage the estimated prior to achieve good performance (\emph{meta-exploitation}), and (ii) accounting for uncertainty in the estimated prior by appropriately “widening” the prior as a function of its estimation error, thereby ensuring convergence of each price experiment. Unlike prior-independent approaches, our algorithm’s meta regret grows sublinearly in $N$; an immediate consequence of our analysis is that the price of an unknown prior in Thompson sampling is negligible in experiment-rich environments with shared structure (large $N$). Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared to prior-independent algorithms or a naive approach of greedily using the updated prior across products. |
Tasks | |
Published | 2019-02-28 |
URL | https://arxiv.org/abs/1902.10918v2 |
https://arxiv.org/pdf/1902.10918v2.pdf | |
PWC | https://paperswithcode.com/paper/meta-dynamic-pricing-learning-across |
Repo | |
Framework | |
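
A minimal sketch of the mechanics described above: Thompson sampling for a single linear-demand pricing experiment, starting from a prior estimated on earlier products whose covariance is inflated ("widened") by a term standing in for its estimation error. The conjugate Gaussian model, the widening constant, and the demand parameters are simplifications for illustration, not the paper's algorithm or guarantees.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5                                   # known demand-noise standard deviation
theta_true = np.array([2.0, 0.7])             # true (a, b): demand = a - b * price + noise

# Prior estimated from earlier products, then "widened": its covariance is inflated
# by a term standing in for the prior's own estimation error.
mu0 = np.array([1.8, 0.6])
widening = 0.5
prec = np.linalg.inv(np.eye(2) * (0.1 + widening))   # posterior precision, initialised at the prior
bvec = prec @ mu0

revenue = 0.0
for t in range(1, 201):
    Sigma = np.linalg.inv(prec)
    mu = Sigma @ bvec
    a_s, b_s = rng.multivariate_normal(mu, Sigma)    # Thompson sample of the demand parameters
    price = float(np.clip(a_s / (2 * max(b_s, 1e-3)), 0.1, 5.0))   # revenue-maximising price for the sample
    demand = theta_true @ np.array([1.0, -price]) + sigma * rng.standard_normal()
    revenue += price * demand

    # Conjugate Gaussian update with feature x = (1, -price) and response = observed demand.
    x = np.array([1.0, -price])
    prec += np.outer(x, x) / sigma**2
    bvec += x * demand / sigma**2

Sigma = np.linalg.inv(prec)
print("posterior mean of (a, b):", Sigma @ bvec, "| total revenue:", round(revenue, 1))
```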
Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks
Title | Skin Cancer Segmentation and Classification with NABLA-N and Inception Recurrent Residual Convolutional Networks |
Authors | Md Zahangir Alom, Theus Aspiras, Tarek M. Taha, Vijayan K. Asari |
Abstract | In the last few years, Deep Learning (DL) has shown superior performance in different modalities of biomedical image analysis. Several DL architectures have been proposed for classification, segmentation, and detection tasks in medical imaging and computational pathology. In this paper, we propose a new DL architecture, the NABLA-N network, with better feature fusion techniques in decoding units for dermoscopic image segmentation tasks. The NABLA-N network has several advances for segmentation tasks. First, this model ensures better feature representation for semantic segmentation with a combination of low to high-level feature maps. Second, this network shows better quantitative and qualitative results with the same or fewer network parameters compared to other methods. In addition, the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model is used for skin cancer classification. The proposed NABLA-N network and IRRCNN models are evaluated for skin cancer segmentation and classification on the benchmark datasets from the International Skin Imaging Collaboration 2018 (ISIC-2018). The experimental results show superior performance on segmentation tasks compared to the Recurrent Residual U-Net (R2U-Net). The classification model shows around 87% testing accuracy for dermoscopic skin cancer classification on the ISIC-2018 dataset. |
Tasks | Semantic Segmentation, Skin Cancer Classification, Skin Cancer Segmentation |
Published | 2019-04-25 |
URL | http://arxiv.org/abs/1904.11126v1 |
http://arxiv.org/pdf/1904.11126v1.pdf | |
PWC | https://paperswithcode.com/paper/190411126 |
Repo | |
Framework | |
Learning with minibatch Wasserstein : asymptotic and gradient properties
Title | Learning with minibatch Wasserstein : asymptotic and gradient properties |
Authors | Kilian Fatras, Younes Zine, Rémi Flamary, Rémi Gribonval, Nicolas Courty |
Abstract | Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large-scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, {\em i.e.} they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, whose effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of the distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs and color transfer that highlight the practical interest of this strategy. |
Tasks | |
Published | 2019-10-09 |
URL | https://arxiv.org/abs/1910.04091v3 |
https://arxiv.org/pdf/1910.04091v3.pdf | |
PWC | https://paperswithcode.com/paper/learning-with-minibatch-wasserstein |
Repo | |
Framework | |
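
A minimal sketch of the minibatch practice analysed above: average exact optimal-transport costs computed on small equal-size subsamples, so that each subproblem reduces to an assignment problem solvable with SciPy. The batch sizes and the Gaussian toy data are illustrative; the paper's experiments use gradient flows, GANs and color transfer.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def minibatch_wasserstein(x, y, m=64, k=100, seed=0):
    """Average of exact squared-W2 optimal transport costs over k minibatch pairs of size m.
    With equal-size uniform minibatches, each OT problem is an assignment problem."""
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(k):
        xb = x[rng.choice(len(x), size=m, replace=False)]
        yb = y[rng.choice(len(y), size=m, replace=False)]
        C = cdist(xb, yb, metric="sqeuclidean")
        rows, cols = linear_sum_assignment(C)      # exact OT plan for uniform weights
        costs.append(C[rows, cols].mean())
    return float(np.mean(costs))

# Toy usage: two Gaussian clouds whose population squared W2 distance is 1.0.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))
y = rng.standard_normal((5000, 2)) + np.array([1.0, 0.0])
print(minibatch_wasserstein(x, y))   # typically above 1.0: the minibatch estimator is not the full distance
```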
An improper estimator with optimal excess risk in misspecified density estimation and logistic regression
Title | An improper estimator with optimal excess risk in misspecified density estimation and logistic regression |
Authors | Jaouad Mourtada, Stéphane Gaïffas |
Abstract | We introduce a procedure for predictive conditional density estimation under logarithmic loss, which we call SMP (Sample Minmax Predictor). This predictor minimizes a new general excess risk bound, which critically remains valid under model misspecification. On standard examples, this bound scales as $d/n$ where $d$ is the dimension of the model and $n$ the sample size, regardless of the true distribution. The SMP, which is an improper (out-of-model) procedure, improves over proper (within-model) estimators (such as the maximum likelihood estimator), whose excess risk can degrade arbitrarily in the misspecified case. For density estimation, our bounds improve over approaches based on online-to-batch conversion, by removing suboptimal $\log n$ factors, addressing an open problem from Grünwald and Kotłowski (2011) for the considered models. For the Gaussian linear model, the SMP admits an explicit expression, and its expected excess risk in the general misspecified case is at most twice the minimax excess risk in the \emph{well-specified case}, but without any condition on the noise variance or approximation error of the linear model. For logistic regression, a penalized SMP can be computed efficiently by training two logistic regressions, and achieves a non-asymptotic excess risk of $O((d + B^2R^2)/n)$, where $R$ is a bound on the norm of the features and $B$ the norm of the comparison linear predictor. This improves the rates of proper (within-model) estimators, since such procedures can achieve no better rate than $\min(BR/\sqrt{n},de^{BR}/n)$ in general. This also provides a computationally more efficient alternative to approaches based on online-to-batch conversion of Bayesian mixture procedures, which require approximate posterior sampling, thereby partly answering a question by Foster et al. (2018). |
Tasks | Density Estimation |
Published | 2019-12-23 |
URL | https://arxiv.org/abs/1912.10784v1 |
https://arxiv.org/pdf/1912.10784v1.pdf | |
PWC | https://paperswithcode.com/paper/an-improper-estimator-with-optimal-excess |
Repo | |
Framework | |
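
The abstract notes that for logistic regression a penalized SMP can be computed by training two logistic regressions. The sketch below follows one plausible reading of that construction: refit an L2-penalized logistic regression once with each candidate label attached to the query point and normalize the two within-model probabilities. The exact penalty and normalization in the paper may differ; treat this as an illustration of the improper (out-of-model) flavour of the predictor, not its precise definition.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def smp_predict_proba(X_train, y_train, x_query, C=1.0):
    """Improper predictive probability for a single query point: refit an L2-penalized
    logistic regression once with each candidate label attached to the query point,
    then normalize the two within-model probabilities."""
    probs = {}
    for label in (0, 1):
        X_aug = np.vstack([X_train, x_query])
        y_aug = np.append(y_train, label)
        clf = LogisticRegression(C=C, max_iter=1000).fit(X_aug, y_aug)
        # Probability the refitted model assigns to its own augmented label at the query point.
        probs[label] = clf.predict_proba(x_query)[0, label]
    return probs[1] / (probs[0] + probs[1])        # normalized probability of label 1

# Toy usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(200) > 0).astype(int)
print(smp_predict_proba(X, y, X[:1]))
```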
Diversely Stale Parameters for Efficient Training of CNNs
Title | Diversely Stale Parameters for Efficient Training of CNNs |
Authors | An Xu, Zhouyuan Huo, Heng Huang |
Abstract | The backpropagation algorithm is the most popular algorithm for training neural networks nowadays. However, it suffers from the forward locking, backward locking and update locking problems, especially when a neural network is so large that its layers are distributed across multiple devices. Existing solutions either can only handle one locking problem or lead to severe accuracy loss or memory inefficiency. Moreover, none of them consider the straggler problem among devices. In this paper, we propose Layer-wise Staleness and a novel efficient training algorithm, Diversely Stale Parameters (DSP), which can address all these challenges without loss of accuracy or memory issues. We also analyze the convergence of DSP with two popular gradient-based methods and prove that both of them are guaranteed to converge to critical points for non-convex problems. Finally, extensive experimental results on training deep convolutional neural networks demonstrate that our proposed DSP algorithm can achieve significant training speedup with stronger robustness and better generalization than the compared methods. |
Tasks | |
Published | 2019-09-05 |
URL | https://arxiv.org/abs/1909.02625v2 |
https://arxiv.org/pdf/1909.02625v2.pdf | |
PWC | https://paperswithcode.com/paper/diversely-stale-parameters-for-efficient |
Repo | |
Framework | |
A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image
Title | A Novel Technique of Noninvasive Hemoglobin Level Measurement Using HSV Value of Fingertip Image |
Authors | Md Kamrul Hasan, Nazmus Sakib, Joshua Field, Richard R. Love, Sheikh I. Ahamed |
Abstract | Over the last decade, smartphones have changed radically to support us with mHealth technology, cloud computing, and machine learning algorithms. Leveraging these multifaceted capabilities, we present a novel smartphone-based noninvasive hemoglobin (Hb) level prediction model that analyzes the hue, saturation and value (HSV) of a fingertip video. Here, we collect 60 videos of 60 subjects from two different locations: the Blood Center of Wisconsin, USA, and AmaderGram, Bangladesh. We extract red, green, and blue (RGB) pixel intensities of selected images of those videos captured by the smartphone camera with the flash on. We then convert the RGB values of selected video frames of a fingertip video into the HSV color space and generate histograms of these HSV pixel intensities. We average these histogram values over a fingertip video and treat them as one observation against the gold-standard Hb concentration. We generate two input feature matrices based on the observations of the two different data sets. The Partial Least Squares (PLS) algorithm is applied to the input feature matrix. We observe $R^2 = 0.95$ in both data sets through our research. We analyze our data using Python OpenCV, Matlab, and the R statistics tool. |
Tasks | |
Published | 2019-10-07 |
URL | https://arxiv.org/abs/1910.02579v1 |
https://arxiv.org/pdf/1910.02579v1.pdf | |
PWC | https://paperswithcode.com/paper/a-novel-technique-of-noninvasive-hemoglobin |
Repo | |
Framework | |
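
A minimal sketch of the feature pipeline described above: convert fingertip-video frames from RGB to HSV, average per-channel histograms over the frames of each video to get one observation, and regress on gold-standard Hb values with partial least squares. The OpenCV and scikit-learn calls are standard; frame selection, bin counts, and the commented-out file handling are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def hsv_histogram_features(video_path, bins=32, max_frames=100):
    """Average per-channel HSV histograms over the selected frames of a fingertip video."""
    cap = cv2.VideoCapture(video_path)
    feats = []
    while len(feats) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)       # OpenCV decodes frames as BGR
        hist = [np.histogram(hsv[..., c], bins=bins, range=(0, 256), density=True)[0]
                for c in range(3)]                          # note: hue occupies 0-179 in 8-bit HSV
        feats.append(np.concatenate(hist))
    cap.release()
    return np.mean(feats, axis=0)                           # one observation per video

# One row per subject's video, regressed on gold-standard Hb concentrations (hypothetical inputs):
# X = np.stack([hsv_histogram_features(p) for p in video_paths])
# pls = PLSRegression(n_components=5).fit(X, hb_values)
# print(pls.score(X, hb_values))                            # R^2 on the training observations
```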
Towards Successful Collaboration: Design Guidelines for AI-based Services enriching Information Systems in Organisations
Title | Towards Successful Collaboration: Design Guidelines for AI-based Services enriching Information Systems in Organisations |
Authors | Nicholas R. J. Frick, Felix Brünker, Björn Ross, Stefan Stieglitz |
Abstract | Information systems (IS) are widely used in organisations to improve business performance. The steady progress of technologies such as artificial intelligence (AI) and the need to secure the future success of organisations lead to new requirements for IS. This research-in-progress first introduces the term AI-based services (AIBS), describing AI as a component that enriches IS, aiming at collaborating with employees and assisting in the execution of work-related tasks. The study derives requirements for the successful design of AIBS from ten expert interviews, following Design Science Research (DSR). For a successful deployment of AIBS in organisations, the D&M IS Success Model is used to validate requirements within three major dimensions of quality: Information Quality, System Quality, and Service Quality. Among other findings, the preliminary results suggest that AIBS should preferably be authentic. Further discussion and research on AIBS is encouraged, thereby providing first insights into the deployment of AIBS in organisations. |
Tasks | |
Published | 2019-12-02 |
URL | https://arxiv.org/abs/1912.01077v1 |
https://arxiv.org/pdf/1912.01077v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-successful-collaboration-design |
Repo | |
Framework | |
Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics
Title | Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics |
Authors | Cheng Deng, Zhao Li, Xinbo Gao, Dacheng Tao |
Abstract | As JPEG is the most widely used image format, the importance of tampering detection for JPEG images in blind forensics is self-evident. In this area, extracting effective statistical characteristics from a JPEG image for classification remains a challenge. In traditional methods, effective features are designed manually, which requires extensive and labor-intensive research and derivation. In this paper, we propose a novel image tampering detection method based on deep multi-scale discriminative networks (MSD-Nets). The multi-scale module is designed to automatically extract multiple features from the discrete cosine transform (DCT) coefficient histograms of the JPEG image. This module can capture the characteristic information in different scale spaces. In addition, a discriminative module is utilized to improve the detection performance of the networks in the difficult situations when the first compression quality (QF1) is higher than the second one (QF2). A special network in this module is designed to distinguish the small statistical differences between authentic and tampered regions in these cases. Finally, a probability map is obtained and the specific tampered area is located using the final classification results. Extensive experiments demonstrate the superiority of our proposed method in both quantitative and qualitative metrics when compared with state-of-the-art approaches. |
Tasks | |
Published | 2019-04-04 |
URL | http://arxiv.org/abs/1904.02520v1 |
http://arxiv.org/pdf/1904.02520v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-multi-scale-discriminative-networks-for |
Repo | |
Framework | |
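
A minimal sketch of the inputs named above: per-frequency histograms of 8x8 block-DCT coefficients, the statistics in which double JPEG compression leaves its characteristic traces. This version recomputes the DCT from a decoded grayscale image with SciPy; the actual method would read quantized coefficients from the JPEG stream and feed the resulting histograms to the multi-scale network.

```python
import numpy as np
from scipy.fftpack import dct

def block_dct_histograms(img, bins=np.arange(-20.5, 21.5)):
    """8x8 block DCT of a grayscale image, then one histogram of rounded coefficients
    per selected low-frequency position (a rough stand-in for quantized JPEG coefficients)."""
    img = img.astype(float) - 128.0                          # JPEG-style level shift
    h, w = (img.shape[0] // 8) * 8, (img.shape[1] // 8) * 8
    blocks = img[:h, :w].reshape(h // 8, 8, w // 8, 8).transpose(0, 2, 1, 3)
    coeffs = dct(dct(blocks, axis=-1, norm="ortho"), axis=-2, norm="ortho")
    hists = []
    for u, v in [(0, 1), (1, 0), (1, 1), (0, 2)]:            # a few low-frequency AC modes
        c = np.rint(coeffs[..., u, v]).ravel()
        hists.append(np.histogram(c, bins=bins)[0] / c.size) # normalized per-frequency histogram
    return np.stack(hists)

# Toy usage on a synthetic smooth "image"; real inputs are decoded JPEG luminance channels.
x = np.linspace(0, 4 * np.pi, 256)
img = 128 + 100 * np.outer(np.sin(x), np.cos(x))
print(block_dct_histograms(img).shape)                       # (4, 41)
```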