Paper Group ANR 33
Case Study: Explaining Diabetic Retinopathy Detection Deep CNNs via Integrated Gradients
Title | Case Study: Explaining Diabetic Retinopathy Detection Deep CNNs via Integrated Gradients |
Authors | Linyi Li, Matt Fredrikson, Shayak Sen, Anupam Datta |
Abstract | In this report, we applied integrated gradients to explain a neural network for diabetic retinopathy detection. Integrated gradients is an attribution method that measures the contributions of the input to the quantity of interest. We explored some new ways of applying this method, such as explaining intermediate layers, filtering out unimportant units by their attribution value, and generating contrary samples. Moreover, the visualization results extend the use of the diabetic retinopathy detection model from merely predicting to assisting in finding potential lesions. |
Tasks | Diabetic Retinopathy Detection |
Published | 2017-09-27 |
URL | http://arxiv.org/abs/1709.09586v3 |
http://arxiv.org/pdf/1709.09586v3.pdf | |
PWC | https://paperswithcode.com/paper/case-study-explaining-diabetic-retinopathy |
Repo | |
Framework | |
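The attribution method named in the abstract above can be illustrated on a toy model. This is a minimal sketch, not the authors' code: the logistic "model" `f`, its weights `w`, the all-zero baseline, and the midpoint Riemann sum are all assumptions chosen for illustration. It checks the completeness property of integrated gradients, namely that the attributions sum to the difference between the model output at the input and at the baseline.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    delta = x - baseline
    # Average the gradient along the straight path from baseline to x.
    avg_grad = np.mean([grad_fn(baseline + a * delta) for a in alphas], axis=0)
    return delta * avg_grad

# Hypothetical toy model (an assumption, not the paper's network):
# a logistic unit F(x) = sigmoid(w . x) with a closed-form gradient.
w = np.array([1.0, -2.0, 0.5])

def f(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def grad_f(x):
    s = f(x)
    return s * (1.0 - s) * w

x = np.array([0.8, 0.1, -0.4])
baseline = np.zeros_like(x)   # all-zero baseline, a common but arbitrary choice
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness: attributions should sum to F(x) - F(baseline).
completeness_gap = abs(attributions.sum() - (f(x) - f(baseline)))
```

For a real network, `grad_fn` would be replaced by a framework's automatic differentiation, and the baseline would need to be chosen for the image domain.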
TIP: Typifying the Interpretability of Procedures
Title | TIP: Typifying the Interpretability of Procedures |
Authors | Amit Dhurandhar, Vijay Iyengar, Ronny Luss, Karthikeyan Shanmugam |
Abstract | We provide a novel notion of what it means to be interpretable, looking past the usual association with human understanding. Our key insight is that interpretability is not an absolute concept and so we define it relative to a target model, which may or may not be a human. We define a framework that allows for comparing interpretable procedures by linking them to important practical aspects such as accuracy and robustness. We characterize many of the current state-of-the-art interpretable methods in our framework portraying its general applicability. Finally, principled interpretable strategies are proposed and empirically evaluated on synthetic data, as well as on the largest public olfaction dataset that was made recently available. We also experiment on MNIST with a simple target model and different oracle models of varying complexity. This leads to the insight that the improvement in the target model is not only a function of the oracle model’s performance, but also its relative complexity with respect to the target model. Further experiments on CIFAR-10, a real manufacturing dataset, and the FICO dataset showcase the benefit of our methods over Knowledge Distillation when the target models are simple and the complex model is a neural network. |
Tasks | |
Published | 2017-06-09 |
URL | http://arxiv.org/abs/1706.02952v3 |
http://arxiv.org/pdf/1706.02952v3.pdf | |
PWC | https://paperswithcode.com/paper/tip-typifying-the-interpretability-of |
Repo | |
Framework | |
Facial Affect Estimation in the Wild Using Deep Residual and Convolutional Networks
Title | Facial Affect Estimation in the Wild Using Deep Residual and Convolutional Networks |
Authors | Behzad Hasani, Mohammad H. Mahoor |
Abstract | Automated affective computing in the wild is a challenging task in the field of computer vision. This paper presents three neural network-based methods proposed for the task of facial affect estimation submitted to the First Affect-in-the-Wild challenge. These methods are based on Inception-ResNet modules redesigned specifically for the task of facial affect estimation. These methods are: Shallow Inception-ResNet, Deep Inception-ResNet, and Inception-ResNet with LSTMs. These networks extract facial features at different scales and simultaneously estimate both the valence and arousal in each frame. Root Mean Square Error (RMSE) rates of 0.4 and 0.3 are achieved for the valence and arousal respectively, with corresponding Concordance Correlation Coefficient (CCC) rates of 0.04 and 0.29, using the Deep Inception-ResNet method. |
Tasks | |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07884v1 |
http://arxiv.org/pdf/1705.07884v1.pdf | |
PWC | https://paperswithcode.com/paper/facial-affect-estimation-in-the-wild-using |
Repo | |
Framework | |
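The two metrics reported in the abstract above, RMSE and the Concordance Correlation Coefficient, can be computed as follows. This is a minimal sketch using the standard definition of CCC with population (biased) variances; the function names and the toy arrays are assumptions for illustration and are not taken from the paper or the challenge's evaluation code.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def ccc(y_true, y_pred):
    """Concordance correlation coefficient: 2*cov / (var_t + var_p + (mean_t - mean_p)^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    # Population covariance, matching np.var's default (ddof=0).
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return float(2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2))

# Toy valence annotations and two hypothetical predictions.
labels = np.array([-1.0, 0.0, 1.0])
shifted = labels + 1.0     # perfectly correlated but biased by a constant
```

Unlike Pearson correlation, CCC penalizes a constant offset: `shifted` is perfectly correlated with `labels`, yet its CCC is below 1 because the means differ. This is one reason a model can reach a moderate RMSE while scoring a low CCC.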
Backpropagation in matrix notation
Title | Backpropagation in matrix notation |
Authors | N. M. Mishachev |
Abstract | In this note we calculate the gradient of the network function in matrix notation. |
Tasks | |
Published | 2017-07-10 |
URL | http://arxiv.org/abs/1707.02746v2 |
http://arxiv.org/pdf/1707.02746v2.pdf | |
PWC | https://paperswithcode.com/paper/backpropagation-in-matrix-notation |
Repo | |
Framework | |
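As an illustration of what a gradient in matrix notation looks like, the following sketch writes backpropagation for a two-layer network as a handful of matrix products and checks one entry against a finite difference. The architecture (tanh hidden layer, squared-error loss), the shapes, and all variable names are assumptions chosen for the example and are not taken from the note itself.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))    # 5 samples, 3 input features
Y = rng.normal(size=(5, 2))    # 5 targets, 2 outputs
W1 = rng.normal(size=(3, 4))   # first-layer weights
W2 = rng.normal(size=(4, 2))   # second-layer weights

def loss(W1, W2):
    """L = 0.5 * ||tanh(X W1) W2 - Y||_F^2."""
    H = np.tanh(X @ W1)
    return 0.5 * np.sum((H @ W2 - Y) ** 2)

def gradients(W1, W2):
    """Backpropagation expressed entirely with matrix operations."""
    H = np.tanh(X @ W1)                  # hidden activations
    E = H @ W2 - Y                       # dL/d(output)
    grad_W2 = H.T @ E                    # dL/dW2
    dZ = (E @ W2.T) * (1.0 - H ** 2)     # back through tanh (elementwise)
    grad_W1 = X.T @ dZ                   # dL/dW1
    return grad_W1, grad_W2

def numeric_grad_entry(i, j, eps=1e-6):
    """Central finite difference of the loss with respect to W1[i, j]."""
    W_plus, W_minus = W1.copy(), W1.copy()
    W_plus[i, j] += eps
    W_minus[i, j] -= eps
    return (loss(W_plus, W2) - loss(W_minus, W2)) / (2.0 * eps)

grad_W1, grad_W2 = gradients(W1, W2)
```

Comparing `grad_W1[i, j]` with `numeric_grad_entry(i, j)` is a quick way to confirm that the transposes in the matrix form are placed correctly.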
Eye-Movement behavior identification for AD diagnosis
Title | Eye-Movement behavior identification for AD diagnosis |
Authors | Juan Biondi, Gerardo Fernandez, Silvia Castro, Osvaldo Agamennoni |
Abstract | In the present work, we develop a deep-learning approach for differentiating the eye-movement behavior of people with neurodegenerative diseases from that of healthy control subjects while reading well-defined sentences. We define an information compaction of the eye-tracking data of subjects with and without probable Alzheimer’s disease when reading a set of well-defined, previously validated sentences, including high- and low-predictability sentences and proverbs. Using this information, we train a set of denoising sparse autoencoders and build a deep neural network with these and a softmax classifier. Our results are very promising and show that these models may help to understand the dynamics of eye-movement behavior and its relationship with underlying neuropsychological correlates. |
Tasks | Denoising, Eye Tracking |
Published | 2017-02-02 |
URL | http://arxiv.org/abs/1702.00837v3 |
http://arxiv.org/pdf/1702.00837v3.pdf | |
PWC | https://paperswithcode.com/paper/eye-movement-behavior-identification-for-ad |
Repo | |
Framework | |
A Nested Attention Neural Hybrid Model for Grammatical Error Correction
Title | A Nested Attention Neural Hybrid Model for Grammatical Error Correction |
Authors | Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao |
Abstract | Grammatical error correction (GEC) systems strive to correct both global errors in word order and usage, and local errors in spelling and inflection. Further developing upon recent work on neural machine translation, we propose a new hybrid neural model with nested attention layers for GEC. Experiments show that the new model can effectively correct errors of both types by incorporating word and character-level information, and that the model significantly outperforms previous neural models for GEC as measured on the standard CoNLL-14 benchmark dataset. Further analysis also shows that the superiority of the proposed model can be largely attributed to the use of the nested attention mechanism, which has proven particularly effective in correcting local errors that involve small edits in orthography. |
Tasks | Grammatical Error Correction, Machine Translation |
Published | 2017-07-07 |
URL | http://arxiv.org/abs/1707.02026v2 |
http://arxiv.org/pdf/1707.02026v2.pdf | |
PWC | https://paperswithcode.com/paper/a-nested-attention-neural-hybrid-model-for |
Repo | |
Framework | |
Robust and Efficient Transfer Learning with Hidden-Parameter Markov Decision Processes
Title | Robust and Efficient Transfer Learning with Hidden-Parameter Markov Decision Processes |
Authors | Taylor Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez |
Abstract | We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings. Our new framework correctly models the joint uncertainty in the latent parameters and the state space. We also replace the original Gaussian Process-based model with a Bayesian Neural Network, enabling more scalable inference. Thus, we expand the scope of the HiP-MDP to applications with higher dimensions and more complex dynamics. |
Tasks | Transfer Learning |
Published | 2017-06-20 |
URL | http://arxiv.org/abs/1706.06544v3 |
http://arxiv.org/pdf/1706.06544v3.pdf | |
PWC | https://paperswithcode.com/paper/robust-and-efficient-transfer-learning-with |
Repo | |
Framework | |
Learning Lexico-Functional Patterns for First-Person Affect
Title | Learning Lexico-Functional Patterns for First-Person Affect |
Authors | Lena Reed, Jiaqi Wu, Shereen Oraby, Pranav Anand, Marilyn Walker |
Abstract | Informal first-person narratives are a unique resource for computational models of everyday events and people’s affective reactions to them. People blogging about their day tend not to explicitly say "I am happy". Instead, they describe situations from which other humans can readily infer their affective reactions. However, current sentiment dictionaries are missing much of the information needed to make similar inferences. We build on recent work that models affect in terms of lexical predicate functions and affect on the predicate’s arguments. We present a method to learn proxies for these functions from first-person narratives. We construct a novel fine-grained test set, and show that the patterns we learn improve our ability to predict first-person affective reactions to everyday events, from a Stanford sentiment baseline of .67F to .75F. |
Tasks | |
Published | 2017-08-31 |
URL | http://arxiv.org/abs/1708.09789v1 |
http://arxiv.org/pdf/1708.09789v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-lexico-functional-patterns-for-first |
Repo | |
Framework | |
Weighted Voting Via No-Regret Learning
Title | Weighted Voting Via No-Regret Learning |
Authors | Nika Haghtalab, Ritesh Noothigattu, Ariel D. Procaccia |
Abstract | Voting systems typically treat all voters equally. We argue that perhaps they should not: Voters who have supported good choices in the past should be given higher weight than voters who have supported bad ones. To develop a formal framework for desirable weighting schemes, we draw on no-regret learning. Specifically, given a voting rule, we wish to design a weighting scheme such that applying the voting rule, with voters weighted by the scheme, leads to choices that are almost as good as those endorsed by the best voter in hindsight. We derive possibility and impossibility results for the existence of such weighting schemes, depending on whether the voting rule and the weighting scheme are deterministic or randomized, as well as on the social choice axioms satisfied by the voting rule. |
Tasks | |
Published | 2017-03-14 |
URL | http://arxiv.org/abs/1703.04756v1 |
http://arxiv.org/pdf/1703.04756v1.pdf | |
PWC | https://paperswithcode.com/paper/weighted-voting-via-no-regret-learning |
Repo | |
Framework | |
Asymptotically optimal private estimation under mean square loss
Title | Asymptotically optimal private estimation under mean square loss |
Authors | Min Ye, Alexander Barg |
Abstract | We consider the minimax estimation problem of a discrete distribution with support size $k$ under locally differential privacy constraints. A privatization scheme is applied to each raw sample independently, and we need to estimate the distribution of the raw samples from the privatized samples. A positive number $\epsilon$ measures the privacy level of a privatization scheme. In our previous work (arXiv:1702.00610), we proposed a family of new privatization schemes and the corresponding estimator. We also proved that our scheme and estimator are order optimal in the regime $e^{\epsilon} \ll k$ under both $\ell_2^2$ and $\ell_1$ loss. In other words, for a large number of samples the worst-case estimation loss of our scheme was shown to differ from the optimal value by at most a constant factor. In this paper, we eliminate this gap by showing asymptotic optimality of the proposed scheme and estimator under the $\ell_2^2$ (mean square) loss. More precisely, we show that for any $k$ and $\epsilon,$ the ratio between the worst-case estimation loss of our scheme and the optimal value approaches $1$ as the number of samples tends to infinity. |
Tasks | |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1708.00059v1 |
http://arxiv.org/pdf/1708.00059v1.pdf | |
PWC | https://paperswithcode.com/paper/asymptotically-optimal-private-estimation |
Repo | |
Framework | |
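To make the setting in the abstract above concrete (each raw sample is privatized independently, and the distribution is estimated from the privatized samples), here is a minimal sketch using k-ary randomized response, a standard locally differentially private mechanism. It is not the privatization scheme or estimator proposed in the paper; the function names, the distribution `true_p`, and the values of `k`, `eps`, and `n` are all assumptions for illustration.

```python
import numpy as np

def k_rr(samples, k, eps, rng):
    """k-ary randomized response: keep each symbol with probability
    e^eps / (e^eps + k - 1), otherwise report a uniformly random other symbol."""
    p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
    keep = rng.random(samples.shape) < p_keep
    # An offset in 1..k-1 (mod k) always yields a different symbol.
    offsets = rng.integers(1, k, size=samples.shape)
    return np.where(keep, samples, (samples + offsets) % k)

def estimate(reports, k, eps):
    """Unbiased estimate of the raw distribution from privatized reports."""
    q = np.bincount(reports, minlength=k) / len(reports)
    # Invert P(report = i) = (p_i * (e^eps - 1) + 1) / (e^eps + k - 1).
    return (q * (np.exp(eps) + k - 1) - 1.0) / (np.exp(eps) - 1.0)

rng = np.random.default_rng(0)
k, eps, n = 4, 1.0, 200_000
true_p = np.array([0.5, 0.25, 0.15, 0.10])   # hypothetical raw distribution
raw = rng.choice(k, size=n, p=true_p)
reports = k_rr(raw, k, eps, rng)
p_hat = estimate(reports, k, eps)
mse = float(np.sum((p_hat - true_p) ** 2))    # squared l2 loss of this run
```

The quantity `mse` is one realization of the $\ell_2^2$ loss discussed in the abstract; the paper's results concern how small the worst case of that loss can be made over all privatization schemes at a given $\epsilon$.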
An Efficient Single Chord-based Accumulation Technique (SCA) to Detect More Reliable Corners
Title | An Efficient Single Chord-based Accumulation Technique (SCA) to Detect More Reliable Corners |
Authors | Mohammad Asiful Hossain, Abdul Kawsar Tushar, Shofiullah Babor |
Abstract | Corner detection is a vital operation in numerous computer vision applications. The Chord-to-Point Distance Accumulation (CPDA) detector is recognized as the contour-based corner detector producing the lowest localization error while localizing corners in an image. However, in our experiments, we demonstrate that the CPDA detector often misses some potential corners. Moreover, the detection algorithm of CPDA is computationally costly. In this paper, we focus on reducing localization error as well as increasing average repeatability. The preprocessing and refinement steps of the proposed process are similar to those of CPDA. Our experimental results show the effectiveness and robustness of the proposed process over CPDA. |
Tasks | |
Published | 2017-08-20 |
URL | http://arxiv.org/abs/1708.05979v1 |
http://arxiv.org/pdf/1708.05979v1.pdf | |
PWC | https://paperswithcode.com/paper/an-efficient-single-chord-based-accumulation |
Repo | |
Framework | |
Semi-supervised model-based clustering with controlled clusters leakage
Title | Semi-supervised model-based clustering with controlled clusters leakage |
Authors | Marek Śmieja, Łukasz Struski, Jacek Tabor |
Abstract | In this paper, we focus on finding clusters in partially categorized data sets. We propose a semi-supervised version of Gaussian mixture model, called C3L, which retrieves natural subgroups of given categories. In contrast to other semi-supervised models, C3L is parametrized by user-defined leakage level, which controls maximal inconsistency between initial categorization and resulting clustering. Our method can be implemented as a module in practical expert systems to detect clusters, which combine expert knowledge with true distribution of data. Moreover, it can be used for improving the results of less flexible clustering techniques, such as projection pursuit clustering. The paper presents extensive theoretical analysis of the model and fast algorithm for its efficient optimization. Experimental results show that C3L finds high quality clustering model, which can be applied in discovering meaningful groups in partially classified data. |
Tasks | |
Published | 2017-05-04 |
URL | http://arxiv.org/abs/1705.01877v1 |
http://arxiv.org/pdf/1705.01877v1.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-model-based-clustering-with |
Repo | |
Framework | |
Robust Regression via Mutivariate Regression Depth
Title | Robust Regression via Mutivariate Regression Depth |
Authors | Chao Gao |
Abstract | This paper studies robust regression in the settings of Huber’s $\epsilon$-contamination models. We consider estimators that are maximizers of multivariate regression depth functions. These estimators are shown to achieve minimax rates in the settings of $\epsilon$-contamination models for various regression problems including nonparametric regression, sparse linear regression, reduced rank regression, etc. We also discuss a general notion of depth function for linear operators that has potential applications in robust functional linear regression. |
Tasks | |
Published | 2017-02-15 |
URL | http://arxiv.org/abs/1702.04656v1 |
http://arxiv.org/pdf/1702.04656v1.pdf | |
PWC | https://paperswithcode.com/paper/robust-regression-via-mutivariate-regression |
Repo | |
Framework | |
CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks
Title | CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks |
Authors | Michela Paganini, Luke de Oliveira, Benjamin Nachman |
Abstract | The precise modeling of subatomic particle interactions and propagation through matter is paramount for the advancement of nuclear and particle physics searches and precision measurements. The most computationally expensive step in the simulation pipeline of a typical experiment at the Large Hadron Collider (LHC) is the detailed modeling of the full complexity of physics processes that govern the motion and evolution of particle showers inside calorimeters. We introduce CaloGAN, a new fast simulation technique based on generative adversarial networks (GANs). We apply these neural networks to the modeling of electromagnetic showers in a longitudinally segmented calorimeter, and achieve speedup factors comparable to or better than existing full simulation techniques on CPU ($100\times$-$1000\times$) and even faster on GPU (up to $\sim10^5\times$). There are still challenges for achieving precision across the entire phase space, but our solution can reproduce a variety of geometric shower shape properties of photons, positrons and charged pions. This represents a significant stepping stone toward a full neural network-based detector simulation that could save significant computing time and enable many analyses now and in the future. |
Tasks | |
Published | 2017-12-21 |
URL | http://arxiv.org/abs/1712.10321v1 |
http://arxiv.org/pdf/1712.10321v1.pdf | |
PWC | https://paperswithcode.com/paper/calogan-simulating-3d-high-energy-particle |
Repo | |
Framework | |
Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep cnn
Title | Skeleton based action recognition using translation-scale invariant image mapping and multi-scale deep cnn |
Authors | Bo Li, Mingyi He, Xuelian Cheng, Yucheng Chen, Yuchao Dai |
Abstract | This paper presents an image classification based approach for the skeleton-based video action recognition problem. First, a dataset-independent translation-scale invariant image mapping method is proposed, which transforms the skeleton videos into colour images, named skeleton-images. Second, a multi-scale deep convolutional neural network (CNN) architecture is proposed, which can be built and fine-tuned on powerful pre-trained CNNs, e.g., AlexNet, VGGNet, ResNet, etc. Even though the skeleton-images are very different from natural images, the fine-tuning strategy still works well. Finally, we show that our method also works well on 2D skeleton video data. We achieve state-of-the-art results on the popular benchmark datasets, e.g., NTU RGB+D, UTD-MHAD, MSRC-12, and G3D. In particular, on the largest and challenging NTU RGB+D, UTD-MHAD, and MSRC-12 datasets, our method outperforms other methods by a large margin, which demonstrates the efficacy of the proposed method. |
Tasks | Image Classification, Skeleton Based Action Recognition, Temporal Action Localization |
Published | 2017-04-19 |
URL | http://arxiv.org/abs/1704.05645v2 |
http://arxiv.org/pdf/1704.05645v2.pdf | |
PWC | https://paperswithcode.com/paper/skeleton-based-action-recognition-using-1 |
Repo | |
Framework | |