January 30, 2020

Paper Group ANR 432

An Effective Label Noise Model for DNN Text Classification. Diagnostic Curves for Black Box Models. Assessment of Shift-Invariant CNN Gaze Mappings for PS-OG Eye Movement Sensors. Adversarial Sub-sequence for Text Generation. Hierarchical Attention Generative Adversarial Networks for Cross-domain Sentiment Classification. Mean-Field Langevin Dynami …

An Effective Label Noise Model for DNN Text Classification

Title An Effective Label Noise Model for DNN Text Classification
Authors Ishan Jindal, Daniel Pressel, Brian Lester, Matthew Nokleby
Abstract Because large, human-annotated datasets suffer from labeling errors, it is crucial to be able to train deep neural networks in the presence of label noise. While training image classification models with label noise has received much attention, training text classification models has not. In this paper, we propose an approach to training deep networks that is robust to label noise. This approach introduces a non-linear processing layer (noise model) that models the statistics of the label noise into a convolutional neural network (CNN) architecture. The noise model and the CNN weights are learned jointly from noisy training data, which prevents the model from overfitting to erroneous labels. Through extensive experiments on several text classification datasets, we show that this approach enables the CNN to learn better sentence representations and is robust even to extreme label noise. We find that proper initialization and regularization of this noise model are critical. Further, in contrast to results focusing on large batch sizes for mitigating label noise for image classification, we find that altering the batch size does not have much effect on classification performance.
Tasks Image Classification, Text Classification
Published 2019-03-18
URL http://arxiv.org/abs/1903.07507v1
PDF http://arxiv.org/pdf/1903.07507v1.pdf
PWC https://paperswithcode.com/paper/an-effective-label-noise-model-for-dnn-text
Repo
Framework
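
The abstract does not spell out the form of the noise layer, so the following is only a minimal sketch of one common formulation, assumed here for illustration: the classifier's clean class probabilities are passed through a row-stochastic transition matrix whose entry [i][j] is the probability that true class i is recorded as label j. The names softmax, apply_noise_model and the matrix values are hypothetical and not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_noise_model(clean_probs, transition):
    """Map clean class probabilities to noisy-label probabilities.

    transition[i][j] is the assumed probability that true class i
    is observed as label j; every row must sum to 1.
    """
    k = len(clean_probs)
    return [sum(clean_probs[i] * transition[i][j] for i in range(k))
            for j in range(k)]

# Hypothetical 3-class example with 20% label noise spread evenly.
clean = softmax([2.0, 0.5, -1.0])
T = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
noisy = apply_noise_model(clean, T)
```

In a training loop of this kind, the loss would be computed on the noisy probabilities against the observed labels, while predictions at test time would use the clean probabilities.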

Diagnostic Curves for Black Box Models

Title Diagnostic Curves for Black Box Models
Authors David I. Inouye, Liu Leqi, Joon Sik Kim, Bryon Aragam, Pradeep Ravikumar
Abstract In safety-critical applications of machine learning, it is often necessary to look beyond standard metrics such as test accuracy in order to validate various qualitative properties such as monotonicity with respect to a feature or combination of features, checking for undesirable changes or oscillations in the response, and differences in outcomes (e.g. discrimination) for a protected class. To help answer this need, we propose a framework for approximately validating (or invalidating) various properties of a black box model by finding a univariate diagnostic curve in the input space whose output maximally violates a given property. These diagnostic curves show the exact value of the model along the curve and can be displayed with a simple and intuitive line graph. We demonstrate the usefulness of these diagnostic curves across multiple use-cases and datasets including selecting between two models and understanding out-of-sample behavior.
Tasks
Published 2019-12-02
URL https://arxiv.org/abs/1912.01108v1
PDF https://arxiv.org/pdf/1912.01108v1.pdf
PWC https://paperswithcode.com/paper/diagnostic-curves-for-black-box-models
Repo
Framework
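
As a rough illustration of what a diagnostic curve reports, the sketch below evaluates a model along a fixed straight line in input space and measures the largest drop between consecutive values, which is one way to quantify a monotonicity violation. The paper searches for the curve that maximally violates a property; that search is not shown here, and toy_model, the line endpoints and the violation measure are all assumptions made for this example.

```python
def evaluate_along_line(model, start, end, steps):
    """Return model outputs at steps + 1 evenly spaced points from start to end."""
    values = []
    for k in range(steps + 1):
        t = k / steps
        point = [a + t * (b - a) for a, b in zip(start, end)]
        values.append(model(point))
    return values

def max_monotonicity_violation(values):
    """Largest drop between consecutive values; 0.0 if the sequence never decreases."""
    return max([0.0] + [prev - nxt for prev, nxt in zip(values, values[1:])])

def toy_model(x):
    # Hypothetical black box: rises and then falls in the first feature.
    return -(x[0] - 1.0) ** 2 + x[1]

values = evaluate_along_line(toy_model, [0.0, 0.0], [2.0, 0.0], steps=4)
violation = max_monotonicity_violation(values)
```

Plotting values against the curve parameter gives the simple line graph the abstract describes.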

Assessment of Shift-Invariant CNN Gaze Mappings for PS-OG Eye Movement Sensors

Title Assessment of Shift-Invariant CNN Gaze Mappings for PS-OG Eye Movement Sensors
Authors Henry K. Griffith, Dmytro Katrychuk, Oleg V. Komogortsev
Abstract Photosensor oculography (PS-OG) eye movement sensors offer desirable performance characteristics for integration within wireless head mounted devices (HMDs), including low power consumption and high sampling rates. To address the known performance degradation of these sensors due to HMD shifts, various machine learning techniques have been proposed for mapping sensor outputs to gaze location. This paper advances the understanding of a recently introduced convolutional neural network designed to provide shift invariant gaze mapping within a specified range of sensor translations. Performance is assessed for shift training examples which better reflect the distribution of values that would be generated through manual repositioning of the HMD during a dedicated collection of training data. The network is shown to exhibit comparable accuracy for this realistic shift distribution versus a previously considered rectangular grid, thereby enhancing the feasibility of in-field set-up. In addition, this work further demonstrates the practical viability of the proposed initialization process by demonstrating robust mapping performance versus training data scale. The ability to maintain reasonable accuracy for shifts extending beyond those introduced during training is also demonstrated.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.05655v1
PDF https://arxiv.org/pdf/1909.05655v1.pdf
PWC https://paperswithcode.com/paper/assessment-of-shift-invariant-cnn-gaze
Repo
Framework

Adversarial Sub-sequence for Text Generation

Title Adversarial Sub-sequence for Text Generation
Authors Xingyuan Chen, Yanzhe Li, Peng Jin, Jiuhua Zhang, Xinyu Dai, Jiajun Chen, Gang Song
Abstract Generative adversarial nets (GANs) have been successfully introduced for generating text to alleviate the exposure bias. However, discriminators in these models only evaluate the entire sequence, which causes feedback sparsity and mode collapse. To tackle these problems, we propose a novel mechanism. It first segments the entire sequence into several sub-sequences. Then these sub-sequences, together with the entire sequence, are evaluated individually by the discriminator. Finally, these feedback signals are all used to guide the learning of the GAN. This mechanism learns the generation of both the entire sequence and the sub-sequences simultaneously. Learning to generate sub-sequences is easy and is helpful in generating an entire sequence. It is easy to improve the existing GAN-based models with this mechanism. We rebuild three previous well-designed models with our mechanism, and the experimental results on benchmark data show these models are improved significantly; the best one outperforms the state-of-the-art model. All code and data are available at https://github.com/liyzcj/seggan.git
Tasks Text Generation
Published 2019-05-30
URL https://arxiv.org/abs/1905.12835v1
PDF https://arxiv.org/pdf/1905.12835v1.pdf
PWC https://paperswithcode.com/paper/adversarial-sub-sequence-for-text-generation
Repo
Framework
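
The segmentation-and-feedback idea in the abstract can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: the abstract does not say how sub-sequences are chosen, so fixed-length consecutive chunks are used, and toy_discriminator is a hypothetical stand-in for a trained discriminator.

```python
def segment(tokens, length):
    """Split tokens into consecutive chunks of at most `length` tokens."""
    return [tokens[i:i + length] for i in range(0, len(tokens), length)]

def collect_feedback(tokens, discriminator, length):
    """Score every sub-sequence and then the entire sequence."""
    pieces = segment(tokens, length) + [tokens]
    return [discriminator(piece) for piece in pieces]

def toy_discriminator(tokens):
    # Hypothetical score: fraction of distinct tokens in the piece.
    return len(set(tokens)) / len(tokens)

sentence = "the cat sat on the mat".split()
feedback = collect_feedback(sentence, toy_discriminator, length=2)
```

The generator then receives one signal per sub-sequence plus one for the full sequence, instead of a single sparse score.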

Hierarchical Attention Generative Adversarial Networks for Cross-domain Sentiment Classification

Title Hierarchical Attention Generative Adversarial Networks for Cross-domain Sentiment Classification
Authors Yuebing Zhang, Duoqian Miao, Jiaqi Wang
Abstract Cross-domain sentiment classification (CDSC) is an important task in domain adaptation and sentiment classification. Due to the domain discrepancy, a sentiment classifier trained on source domain data may not work well on target domain data. In recent years, many researchers have used deep neural network models for the cross-domain sentiment classification task, many of which use a Gradient Reversal Layer (GRL) to design an adversarial network structure to train a domain-shared sentiment classifier. Different from those methods, we propose Hierarchical Attention Generative Adversarial Networks (HAGAN), which alternately trains a generator and a discriminator in order to produce a document representation which is sentiment-distinguishable but domain-indistinguishable. Besides, the HAGAN model applies a Bidirectional Gated Recurrent Unit (Bi-GRU) to encode the contextual information of a word and a sentence into the document representation. In addition, the HAGAN model uses a hierarchical attention mechanism to optimize the document representation and automatically capture the pivots and non-pivots. The experiments on the Amazon review dataset show the effectiveness of HAGAN.
Tasks Domain Adaptation, Sentiment Analysis
Published 2019-03-27
URL http://arxiv.org/abs/1903.11334v1
PDF http://arxiv.org/pdf/1903.11334v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-attention-generative-adversarial
Repo
Framework

Mean-Field Langevin Dynamics and Energy Landscape of Neural Networks

Title Mean-Field Langevin Dynamics and Energy Landscape of Neural Networks
Authors Kaitong Hu, Zhenjie Ren, David Siska, Lukasz Szpruch
Abstract We present a probabilistic analysis of the long-time behaviour of the nonlocal, diffusive equations with a gradient flow structure in 2-Wasserstein metric, namely, the Mean-Field Langevin Dynamics (MFLD). Our work is motivated by a desire to provide a theoretical underpinning for the convergence of stochastic gradient type algorithms widely used for non-convex learning tasks such as training of deep neural networks. The key insight is that a certain class of finite-dimensional non-convex problems becomes convex when lifted to the infinite-dimensional space of measures. We leverage this observation and show that the corresponding energy functional defined on the space of probability measures has a unique minimiser which can be characterised by a first order condition using the notion of linear functional derivative. Next, we show that the flow of marginal laws induced by the MFLD converges to the stationary distribution, which is exactly the minimiser of the energy functional. We show that this convergence is exponential under conditions that are satisfied for highly regularised learning tasks. At the heart of our analysis is a pathwise perspective on Otto calculus used in gradient flow literature, which is of independent interest. Our proof of convergence to the stationary probability measure is novel and it relies on a generalisation of LaSalle’s invariance principle. Importantly, we do not assume that the interaction potential of MFLD is of convolution type, nor that it has any particular symmetric structure. This is critical for applications. Finally, we show that the error between the finite-dimensional optimisation problem and its infinite-dimensional limit is of order one over the number of parameters.
Tasks
Published 2019-05-19
URL https://arxiv.org/abs/1905.07769v1
PDF https://arxiv.org/pdf/1905.07769v1.pdf
PWC https://paperswithcode.com/paper/mean-field-langevin-dynamics-and-energy
Repo
Framework

Vision-based Pedestrian Alert Safety System (PASS) for Signalized Intersections

Title Vision-based Pedestrian Alert Safety System (PASS) for Signalized Intersections
Authors Mhafuzul Islam, Mizanur Rahman, Mashrur Chowdhury, Gurcan Comert, Eshaa Deepak Sood, Amy Apon
Abstract Although Vehicle-to-Pedestrian (V2P) communication can significantly improve pedestrian safety at a signalized intersection, this safety is hindered as pedestrians often do not carry hand-held devices (e.g., Dedicated short-range communication (DSRC) and 5G enabled cell phone) to communicate with connected vehicles nearby. To overcome this limitation, in this study, traffic cameras at a signalized intersection were used to accurately detect and locate pedestrians via a vision-based deep learning technique to generate safety alerts in real-time about possible conflicts between vehicles and pedestrians. The contribution of this paper lies in the development of a system using a vision-based deep learning model that is able to generate personal safety messages (PSMs) in real-time (every 100 milliseconds). We develop a pedestrian alert safety system (PASS) to generate a safety alert of an imminent pedestrian-vehicle crash using generated PSMs to improve pedestrian safety at a signalized intersection. Our approach estimates the location and velocity of a pedestrian more accurately than existing DSRC-enabled pedestrian hand-held devices. A connected vehicle application, the Pedestrian in Signalized Crosswalk Warning (PSCW), was developed to evaluate the vision-based PASS. Numerical analyses show that our vision-based PASS is able to satisfy the accuracy and latency requirements of pedestrian safety applications in a connected vehicle environment.
Tasks
Published 2019-07-02
URL https://arxiv.org/abs/1907.05284v1
PDF https://arxiv.org/pdf/1907.05284v1.pdf
PWC https://paperswithcode.com/paper/vision-based-pedestrian-alert-safety-system
Repo
Framework

A Noise-Sensitivity-Analysis-Based Test Prioritization Technique for Deep Neural Networks

Title A Noise-Sensitivity-Analysis-Based Test Prioritization Technique for Deep Neural Networks
Authors Long Zhang, Xuechao Sun, Yong Li, Zhenyu Zhang
Abstract Deep neural networks (DNNs) have been widely used in fields such as natural language processing, computer vision and image recognition. But several studies have shown that deep neural networks can be easily fooled by artificial examples with some perturbations, which are widely known as adversarial examples. Adversarial examples can be used to attack deep neural networks or to improve the robustness of deep neural networks. A common way of generating adversarial examples is to first generate some noise and then add it to original examples. In practice, different examples have different noise sensitivity. To generate an effective adversarial example, it may be necessary to add a lot of noise to an example with low noise sensitivity, which may make the adversarial example meaningless. In this paper, we propose a noise-sensitivity-analysis-based test prioritization technique to pick out examples by their noise sensitivity. We construct an experiment to validate our approach on four image sets and two DNN models, which shows that examples are sensitive to noise and our method can effectively pick out examples by their noise sensitivity.
Tasks
Published 2019-01-01
URL http://arxiv.org/abs/1901.00054v3
PDF http://arxiv.org/pdf/1901.00054v3.pdf
PWC https://paperswithcode.com/paper/a-noise-sensitivity-analysis-based-test
Repo
Framework
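
A minimal sketch of ranking examples by noise sensitivity is given below. The abstract does not define the sensitivity measure, so this example assumes one for illustration: the mean absolute change in a scalar model output under Gaussian perturbations. The functions noise_sensitivity, prioritize and toy_model are hypothetical names, not the paper's.

```python
import math
import random

def noise_sensitivity(model, example, scale, trials, rng):
    """Mean absolute change in the model output under Gaussian input noise."""
    base = model(example)
    total = 0.0
    for _ in range(trials):
        perturbed = [x + rng.gauss(0.0, scale) for x in example]
        total += abs(model(perturbed) - base)
    return total / trials

def prioritize(model, examples, scale=0.1, trials=50, seed=0):
    """Return example indices ordered from most to least noise-sensitive."""
    rng = random.Random(seed)
    scored = [(noise_sensitivity(model, ex, scale, trials, rng), i)
              for i, ex in enumerate(examples)]
    return [i for _, i in sorted(scored, reverse=True)]

def toy_model(x):
    # Hypothetical classifier score: logistic function of the feature sum.
    return 1.0 / (1.0 + math.exp(-sum(x)))

# The first example sits in a saturated region, the second on the boundary.
examples = [[5.0, 5.0], [0.0, 0.0]]
order = prioritize(toy_model, examples)
```

Examples near the decision boundary come first, since small perturbations change their output the most.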

Adaptive Bayesian Reticulum

Title Adaptive Bayesian Reticulum
Authors Giuseppe Nuti, Lluís Antoni Jiménez Rugama, Kaspar Thommen
Abstract Neural Networks and Decision Trees, two popular techniques for supervised learning that are seemingly disconnected in their formulation and optimization method, have recently been combined in a single construct. The connection pivots on assembling an artificial Neural Network with nodes that allow for a gate-like function to mimic a tree split, optimized using the standard approach of recursively applying the chain rule to update its parameters. Yet two main challenges have impeded wide use of this hybrid approach: (a) the inability of global gradient ascent techniques to optimize hierarchical parameters (as introduced by the gate function); and (b) the construction of the tree structure, which has relied on standard decision tree algorithms to learn the network topology or on incrementally (and heuristically) searching the space at random. Here we propose a probabilistic construct that exploits the idea of a node’s unexplained potential (the total error channeled through the node) in order to decide where to expand further, mimicking the standard tree construction in a Neural Network setting, alongside a modified gradient ascent that first locally optimizes an expanded node before a global optimization. The probabilistic approach allows us to evaluate each new split as a ratio of likelihoods that balances the statistical improvement in explaining the evidence against the additional model complexity, thus providing a natural stopping condition. The result is a novel classification and regression technique that leverages the strength of both: a tree structure that grows naturally and is simple to interpret, with the plasticity of Neural Networks that allow for soft margins and slanted boundaries.
Tasks
Published 2019-12-12
URL https://arxiv.org/abs/1912.05901v3
PDF https://arxiv.org/pdf/1912.05901v3.pdf
PWC https://paperswithcode.com/paper/adaptive-reticulum
Repo
Framework

Robots as Actors in a Film: No War, A Robot Story

Title Robots as Actors in a Film: No War, A Robot Story
Authors Andreagiovanni Reina, Viktor Ioannou, Junjin Chen, Lu Lu, Charles Kent, James A. R. Marshall
Abstract Will the Third World War be fought by robots? This short film is a light-hearted comedy that aims to trigger an interesting discussion and reflexion on the terrifying killer-robot stories that increasingly fill us with dread when we read the news headlines. The fictional scenario takes inspiration from current scientific research and describes a future where robots are asked by humans to join the war. Robots are divided, sparking protests in robot society… will robots join the conflict or will they refuse to be employed in human warfare? Food for thought for engineers, roboticists and anyone imagining what the upcoming robot revolution could look like. We let robots pop on camera to tell a story, taking on the role of actors playing in the film, instructed through code on how to “act” for each scene.
Tasks
Published 2019-10-27
URL https://arxiv.org/abs/1910.12294v1
PDF https://arxiv.org/pdf/1910.12294v1.pdf
PWC https://paperswithcode.com/paper/robots-as-actors-in-a-film-no-war-a-robot
Repo
Framework

NLS: an accurate and yet easy-to-interpret regression method

Title NLS: an accurate and yet easy-to-interpret regression method
Authors Victor Coscrato, Marco Henrique de Almeida Inácio, Tiago Botari, Rafael Izbicki
Abstract An important feature of successful supervised machine learning applications is to be able to explain the predictions given by the regression or classification model being used. However, most state-of-the-art models that have good predictive power lead to predictions that are hard to interpret. Thus, several model-agnostic interpreters have been developed recently as a way of explaining black-box classifiers. In practice, using these methods is a slow process because a new fit is required for each new testing instance, and several non-trivial choices must be made. We develop NLS (neural local smoother), a method that is complex enough to give good predictions, and yet gives solutions that are easy to interpret without the need for a separate interpreter. The key idea is to use a neural network that imposes a local linear shape on the output layer. We show that NLS leads to predictive power that is comparable to state-of-the-art machine learning models, and yet is easier to interpret.
Tasks
Published 2019-10-11
URL https://arxiv.org/abs/1910.05206v1
PDF https://arxiv.org/pdf/1910.05206v1.pdf
PWC https://paperswithcode.com/paper/nls-an-accurate-and-yet-easy-to-interpret
Repo
Framework
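
One way to read "a local linear shape on the output layer" is that the network outputs an intercept and slopes that depend on the input, and the prediction is the resulting linear function evaluated at that same input. The sketch below assumes this reading; coefficient_net is a hypothetical fixed function standing in for a trained network, not the authors' architecture.

```python
import math

def coefficient_net(x):
    """Hypothetical stand-in for a trained network: returns (intercept, slopes) at x."""
    intercept = math.tanh(x[0])
    slopes = [1.0 + 0.5 * math.tanh(x[1]), -2.0]
    return intercept, slopes

def predict(x):
    """Locally linear prediction: intercept(x) + sum_j slopes_j(x) * x_j.

    The slopes are returned with the prediction because they serve as
    the explanation for this particular input.
    """
    intercept, slopes = coefficient_net(x)
    value = intercept + sum(s * xi for s, xi in zip(slopes, x))
    return value, slopes

value, slopes = predict([0.0, 2.0])
```

Because the slopes are produced together with the prediction, no separate interpreter has to be fitted for each new instance.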

Estimation of preterm birth markers with U-Net segmentation network

Title Estimation of preterm birth markers with U-Net segmentation network
Authors Tomasz Włodarczyk, Szymon Płotka, Tomasz Trzciński, Przemysław Rokita, Nicole Sochacki-Wójcicka, Michał Lipa, Jakub Wójcicki
Abstract Preterm birth is the most common cause of neonatal death. Current diagnostic methods that assess the risk of preterm birth involve the collection of maternal characteristics and transvaginal ultrasound imaging conducted in the first and second trimester of pregnancy. Analysis of the ultrasound data is based on visual inspection of images by a gynaecologist, sometimes supported by hand-designed image features such as cervical length. Due to the complexity of this process and its subjective component, approximately 30% of spontaneous preterm deliveries are not correctly predicted. Moreover, 10% of the predicted preterm deliveries are false-positives. In this paper, we address the problem of predicting spontaneous preterm delivery using machine learning. To achieve this goal, we propose to first use a deep neural network architecture for segmenting prenatal ultrasound images and then automatically extract two biophysical ultrasound markers, cervical length (CL) and anterior cervical angle (ACA), from the resulting images. Our method allows ultrasound markers to be estimated without human oversight. Furthermore, we show that CL and ACA markers, when combined, allow us to decrease the false-negative ratio from 30% to 18%. Finally, contrary to current diagnostic methods that rely only on the gynaecologist’s expertise, our method introduces objectively obtained results.
Tasks
Published 2019-08-24
URL https://arxiv.org/abs/1908.09148v1
PDF https://arxiv.org/pdf/1908.09148v1.pdf
PWC https://paperswithcode.com/paper/estimation-of-preterm-birth-markers-with-u
Repo
Framework

Interpreting Undesirable Pixels for Image Classification on Black-Box Models

Title Interpreting Undesirable Pixels for Image Classification on Black-Box Models
Authors Sin-Han Kang, Hong-Gyu Jung, Seong-Whan Lee
Abstract In an effort to interpret black-box models, research on explanation methods has progressed in recent years. Most studies have tried to identify input pixels that are crucial to the prediction of a classifier. While this approach is meaningful for analysing the characteristics of black-box models, it is also important to investigate pixels that interfere with the prediction. To tackle this issue, in this paper, we propose an explanation method that visualizes regions that are undesirable for classifying an image as a target class. To be specific, we divide the concept of undesirable regions into two terms: (1) factors for a target class, which hinder black-box models from identifying intrinsic characteristics of a target class, and (2) factors for non-target classes, which are important regions for an image to be classified as other classes. We visualize such undesirable regions on heatmaps to qualitatively validate the proposed method. Furthermore, we present an evaluation metric to provide quantitative results on ImageNet.
Tasks Image Classification
Published 2019-09-27
URL https://arxiv.org/abs/1909.12446v2
PDF https://arxiv.org/pdf/1909.12446v2.pdf
PWC https://paperswithcode.com/paper/interpreting-undesirable-pixels-for-image
Repo
Framework

Bayesian Optimized Continual Learning with Attention Mechanism

Title Bayesian Optimized Continual Learning with Attention Mechanism
Authors Ju Xu, Jin Ma, Zhanxing Zhu
Abstract Though neural networks have achieved much progress in various applications, it is still highly challenging for them to learn from a continuous stream of tasks without forgetting. Continual learning, a new learning paradigm, aims to solve this issue. In this work, we propose a new model for continual learning, called Bayesian Optimized Continual Learning with Attention Mechanism (BOCL) that dynamically expands the network capacity upon the arrival of new tasks by Bayesian optimization and selectively utilizes previous knowledge (e.g. feature maps of previous tasks) via attention mechanism. Our experiments on variants of MNIST and CIFAR-100 demonstrate that our methods outperform the state-of-the-art in preventing catastrophic forgetting and fitting new tasks better.
Tasks Continual Learning
Published 2019-05-10
URL https://arxiv.org/abs/1905.03980v1
PDF https://arxiv.org/pdf/1905.03980v1.pdf
PWC https://paperswithcode.com/paper/bayesian-optimized-continual-learning-with
Repo
Framework

Cost Sensitive Learning in the Presence of Symmetric Label Noise

Title Cost Sensitive Learning in the Presence of Symmetric Label Noise
Authors Sandhya Tripathi, N. Hemachandra
Abstract In the binary classification framework, we are interested in making cost sensitive label predictions in the presence of uniform/symmetric label noise. We first observe that $0$-$1$ Bayes classifiers are not (uniform) noise robust in the cost sensitive setting. To circumvent this impossibility result, we present two schemes; unlike the existing methods, our schemes do not require the noise rate. The first one uses the $\alpha$-weighted $\gamma$-uneven margin squared loss function, $l_{\alpha, usq}$, which can handle cost sensitivity arising due to domain requirement (using user given $\alpha$) or class imbalance (by tuning $\gamma$) or both. However, we observe that $l_{\alpha, usq}$ Bayes classifiers are also not cost sensitive and noise robust. We show that regularized ERM of this loss function over the class of linear classifiers yields a cost sensitive uniform noise robust classifier as a solution of a system of linear equations. We also provide a performance bound for this classifier. The second scheme that we propose is a re-sampling based scheme that exploits the special structure of the uniform noise models and uses in-class probability $\eta$ estimates. Our computational experiments on some UCI datasets with class imbalance show that classifiers of our two schemes are on par with the existing methods and in fact better in some cases w.r.t. Accuracy and Arithmetic Mean, without using/tuning the noise rate. We also consider other cost sensitive performance measures, viz. F measure and Weighted Cost, for evaluation. As our re-sampling scheme requires estimates of $\eta$, we provide a detailed comparative study of various $\eta$ estimation methods on synthetic datasets, w.r.t. half a dozen evaluation criteria. Also, we provide understanding on the interpretation of cost parameters $\alpha$ and $\gamma$ using different synthetic data experiments.
Tasks
Published 2019-01-08
URL https://arxiv.org/abs/1901.02271v4
PDF https://arxiv.org/pdf/1901.02271v4.pdf
PWC https://paperswithcode.com/paper/cost-sensitive-learning-in-the-presence-of
Repo
Framework
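
Uniform (symmetric) label noise in the binary setting means that every label is flipped independently with the same probability, regardless of its class. The sketch below simulates that corruption for labels in {-1, +1}; the function name flip_labels and the parameter name rho are assumptions chosen for this example, and the abstract's two learning schemes are not implemented here.

```python
import random

def flip_labels(labels, rho, seed=0):
    """Flip each label in {-1, +1} independently with probability rho."""
    rng = random.Random(seed)
    return [-y if rng.random() < rho else y for y in labels]

# Hypothetical usage: corrupt a small label vector at a 30% noise rate.
clean = [1, -1, 1, 1, -1]
noisy = flip_labels(clean, rho=0.3)
```

Synthetic experiments on noise robustness typically apply a corruption of this kind to the training labels only and evaluate on clean test labels.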