Paper Group AWR 145
Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs. Deep Joint Entity Disambiguation with Local Neural Attention. Knowledge Transfer for Melanoma Screening with Deep Learning. Greedy Search for Descriptive Spatial Face Features. Towards Syntactic Iberian Polarity Classification. Fast Generation for Convolutional Autoregressive Models. Semantically Decomposing the Latent Spaces of Generative Adversarial Networks. Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network. Real Time Image Saliency for Black Box Classifiers. Explaining Recurrent Neural Network Predictions in Sentiment Analysis. Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network. Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle. An Expectation Conditional Maximization approach for Gaussian graphical models. Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory. How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change.
Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs
Title | Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs |
Authors | Pulkit Kumar, Monika Grewal, Muktabh Mayank Srivastava |
Abstract | Chest X-ray is one of the most accessible medical imaging techniques for the diagnosis of multiple diseases. With the availability of ChestX-ray14, a massive dataset of chest X-ray images that provides annotations for 14 thoracic diseases, it is possible to train Deep Convolutional Neural Networks (DCNN) to build Computer Aided Diagnosis (CAD) systems. In this work, we experiment with a set of deep learning models and present a cascaded deep neural network that can diagnose all 14 pathologies better than the baseline and is competitive with other published methods. Our work provides quantitative results to answer the following research questions for the dataset: 1) What loss functions should be used to train a DCNN from scratch on ChestX-ray14, a dataset that exhibits high class imbalance and label co-occurrence? 2) How can cascading be used to model label dependency and improve the accuracy of the deep learning model? |
Tasks | Lung Disease Classification |
Published | 2017-11-23 |
URL | http://arxiv.org/abs/1711.08760v1 |
PDF | http://arxiv.org/pdf/1711.08760v1.pdf |
PWC | https://paperswithcode.com/paper/boosted-cascaded-convnets-for-multilabel |
Repo | https://github.com/Azure/AzureChestXRay |
Framework | pytorch |
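The class-imbalance question raised in the abstract is commonly handled with a per-class weighted binary cross-entropy. A minimal sketch in PyTorch (the listed framework), with positive-class weights derived from label frequencies; this weighting scheme is an illustrative assumption, not the paper's exact loss:

```python
import torch
import torch.nn as nn

def make_pos_weights(label_matrix: torch.Tensor) -> torch.Tensor:
    """label_matrix: (N, 14) binary multilabel targets (one column per disease)."""
    pos = label_matrix.sum(dim=0)            # positives per class
    neg = label_matrix.shape[0] - pos        # negatives per class
    return neg / pos.clamp(min=1)            # up-weight rare positive labels

labels = torch.randint(0, 2, (1000, 14)).float()   # toy stand-in for ChestX-ray14 labels
criterion = nn.BCEWithLogitsLoss(pos_weight=make_pos_weights(labels))

logits = torch.randn(8, 14, requires_grad=True)    # model outputs for one batch
loss = criterion(logits, labels[:8])
loss.backward()
```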
Deep Joint Entity Disambiguation with Local Neural Attention
Title | Deep Joint Entity Disambiguation with Local Neural Attention |
Authors | Octavian-Eugen Ganea, Thomas Hofmann |
Abstract | We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations. Key components are entity embeddings, a neural attention mechanism over local context windows, and a differentiable joint inference stage for disambiguation. Our approach thereby combines benefits of deep learning with more traditional approaches such as graphical models and probabilistic mention-entity maps. Extensive experiments show that we are able to obtain competitive or state-of-the-art accuracy at moderate computational costs. |
Tasks | Entity Disambiguation |
Published | 2017-04-17 |
URL | http://arxiv.org/abs/1704.04920v3 |
PDF | http://arxiv.org/pdf/1704.04920v3.pdf |
PWC | https://paperswithcode.com/paper/deep-joint-entity-disambiguation-with-local |
Repo | https://github.com/dalab/deep-ed |
Framework | torch |
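A sketch of the local attention idea from the abstract: context words are scored against the candidate entities, pruned to a hard top-k, and the resulting soft context vector scores each candidate. The shapes, the bilinear maps `A` and `B`, and `keep_k` are simplifying assumptions (the paper parameterizes these maps as diagonal matrices):

```python
import torch
import torch.nn.functional as F

def local_attention_scores(ctx_vecs, cand_entity_vecs, A, B, keep_k=25):
    """
    ctx_vecs: (W, d) embeddings of words in a local context window
    cand_entity_vecs: (C, d) embeddings of candidate entities
    A, B: (d, d) learned bilinear maps
    Returns (C,) compatibility scores of each candidate with the context.
    """
    # Each context word is scored by its best-matching candidate entity.
    word_scores = (cand_entity_vecs @ A @ ctx_vecs.T).max(dim=0).values  # (W,)
    # Hard attention: keep only the top-k context words, then soft-normalize.
    topk = torch.topk(word_scores, min(keep_k, len(word_scores)))
    weights = F.softmax(topk.values, dim=0)                              # (k,)
    context = (weights[:, None] * ctx_vecs[topk.indices]).sum(0)         # (d,)
    return cand_entity_vecs @ B @ context                                # (C,)
```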
Knowledge Transfer for Melanoma Screening with Deep Learning
Title | Knowledge Transfer for Melanoma Screening with Deep Learning |
Authors | Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, Eduardo Valle |
Abstract | Knowledge transfer impacts the performance of deep learning – the state of the art for image classification tasks, including automated melanoma screening. Deep learning’s greed for large amounts of training data poses a challenge for medical tasks, which we can alleviate by recycling knowledge from models trained on different tasks, in a scheme called transfer learning. Although much of the best art on automated melanoma screening employs some form of transfer learning, a systematic evaluation was missing. Here we investigate the presence of transfer, from which task the transfer is sourced, and the application of fine-tuning (i.e., retraining of the deep learning model after transfer). We also test the impact of picking deeper (and more expensive) models. Our results favor deeper models, pre-trained on ImageNet, with fine-tuning, reaching an AUC of 80.7% and 84.5% for the two skin-lesion datasets evaluated. |
Tasks | Image Classification, Skin Cancer Classification, Transfer Learning |
Published | 2017-03-22 |
URL | http://arxiv.org/abs/1703.07479v1 |
PDF | http://arxiv.org/pdf/1703.07479v1.pdf |
PWC | https://paperswithcode.com/paper/knowledge-transfer-for-melanoma-screening |
Repo | https://github.com/learningtitans/data-depth-design |
Framework | tf |
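The winning recipe reported in the abstract — a deeper ImageNet-pretrained model with fine-tuning — looks roughly like the following tf.keras sketch. The backbone choice, head, and hyperparameters here are assumptions for illustration, not the paper's exact setup:

```python
import tensorflow as tf

# ImageNet-pretrained backbone; fine-tuning means the transferred weights
# are retrained on the skin-lesion data rather than frozen.
base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                      pooling="avg", input_shape=(224, 224, 3))
base.trainable = True

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(1, activation="sigmoid"),   # melanoma vs. benign
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["AUC"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```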
Greedy Search for Descriptive Spatial Face Features
Title | Greedy Search for Descriptive Spatial Face Features |
Authors | Caner Gacav, Burak Benligiray, Cihan Topal |
Abstract | Facial expression recognition methods use a combination of geometric and appearance-based features. Spatial features are derived from displacements of facial landmarks, and carry geometric information. These features are either selected based on prior knowledge, or dimension-reduced from a large pool. In this study, we produce a large number of potential spatial features using two combinations of facial landmarks. Among these, we search for a descriptive subset of features using sequential forward selection. The chosen feature subset is used to classify facial expressions in the extended Cohn-Kanade dataset (CK+), and delivers 88.7% recognition accuracy without using any appearance-based features. |
Tasks | Facial Expression Recognition |
Published | 2017-01-07 |
URL | http://arxiv.org/abs/1701.01879v2 |
PDF | http://arxiv.org/pdf/1701.01879v2.pdf |
PWC | https://paperswithcode.com/paper/greedy-search-for-descriptive-spatial-face |
Repo | https://github.com/kyranstar/Narcissus |
Framework | tf |
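A minimal sketch of sequential forward selection as described: greedily add whichever candidate spatial feature most improves cross-validated accuracy. The SVM classifier, fold count, and stopping size below are assumptions, not the paper's exact protocol:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def sequential_forward_selection(X, y, n_select=20):
    """Greedy SFS: repeatedly add the feature that most improves CV accuracy.
    X: (N, F) candidate spatial features; y: (N,) expression labels."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_select:
        best_feat, best_score = None, -np.inf
        for f in remaining:
            trial = selected + [f]
            score = cross_val_score(SVC(), X[:, trial], y, cv=5).mean()
            if score > best_score:
                best_feat, best_score = f, score
        selected.append(best_feat)
        remaining.remove(best_feat)
    return selected
```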
Towards Syntactic Iberian Polarity Classification
Title | Towards Syntactic Iberian Polarity Classification |
Authors | David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez |
Abstract | Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines. Thus, the rules are also language-dependent and require adaptation, especially in multilingual scenarios. We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-based Iberian system with rules shared across five official languages: Basque, Catalan, Galician, Portuguese and Spanish. The model is made available. |
Tasks | |
Published | 2017-08-17 |
URL | http://arxiv.org/abs/1708.05269v1 |
PDF | http://arxiv.org/pdf/1708.05269v1.pdf |
PWC | https://paperswithcode.com/paper/towards-syntactic-iberian-polarity |
Repo | https://github.com/aghie/uuusa |
Framework | none |
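A toy illustration of the lexicon-plus-syntax polarity composition such systems implement: scores propagate up a dependency tree, with intensification and negation rules applied at each head. The lexicon entries and rule weights below are hypothetical, not the released UUUSA rules:

```python
# Hypothetical lexicon and rule weights for illustration only.
LEXICON = {"good": 1, "bad": -1, "great": 2}
INTENSIFIERS = {"very": 1.5}
NEGATORS = {"not", "no"}

def node_polarity(word, children):
    """word: head token; children: list of (dependent_word, child_polarity)."""
    score = LEXICON.get(word, 0) + sum(p for _, p in children)
    for dep_word, _ in children:
        if dep_word in INTENSIFIERS:
            score *= INTENSIFIERS[dep_word]   # amplify under intensification
        if dep_word in NEGATORS:
            score = -score                    # invert under negation
    return score

# "not very good", with "good" as the head:
print(node_polarity("good", [("very", 0), ("not", 0)]))  # -1.5
```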
Fast Generation for Convolutional Autoregressive Models
Title | Fast Generation for Convolutional Autoregressive Models |
Authors | Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang |
Abstract | Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented in a naïve fashion where redundant computations are unnecessarily repeated. This results in slow generation, making such models infeasible for production environments. In this work, we describe a method to speed up generation in convolutional autoregressive models. The key idea is to cache hidden states to avoid redundant computation. We apply our fast generation method to the Wavenet and PixelCNN++ models and achieve up to 21× and 183× speedups respectively. |
Tasks | |
Published | 2017-04-20 |
URL | http://arxiv.org/abs/1704.06001v1 |
PDF | http://arxiv.org/pdf/1704.06001v1.pdf |
PWC | https://paperswithcode.com/paper/fast-generation-for-convolutional |
Repo | https://github.com/PrajitR/fast-pixel-cnn |
Framework | tf |
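The caching idea is simple to state: a dilated causal convolution at generation step t only needs the hidden state from `dilation` steps back, so a fixed-size queue per layer replaces reconvolving the whole history. A NumPy sketch under an assumed filter width of 2 (not the authors' implementation):

```python
import numpy as np
from collections import deque

class FastConvLayer:
    """One dilated causal conv layer with a hidden-state cache (filter width 2)."""
    def __init__(self, w_prev, w_curr, dilation):
        self.w_prev, self.w_curr = w_prev, w_curr          # (out, in) kernel taps
        self.queue = deque([np.zeros(w_prev.shape[1])] * dilation, maxlen=dilation)

    def step(self, x):
        """One generation step: pop the state from t - dilation, cache the new one."""
        popped = self.queue.popleft()
        self.queue.append(x)
        return self.w_prev @ popped + self.w_curr @ x      # conv as two matmuls

layer = FastConvLayer(np.random.randn(16, 16), np.random.randn(16, 16), dilation=4)
h = layer.step(np.random.randn(16))                        # O(1) work per step
```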
Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
Title | Semantically Decomposing the Latent Spaces of Generative Adversarial Networks |
Authors | Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, Julian McAuley |
Abstract | We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce pairs that are photorealistic, distinct, and appear to depict the same individual. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to facilitate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm’s ability to generate convincing, identity-matched photographs. |
Tasks | Face Verification, Image Generation |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07904v3 |
PDF | http://arxiv.org/pdf/1705.07904v3.pdf |
PWC | https://paperswithcode.com/paper/semantically-decomposing-the-latent-spaces-of |
Repo | https://github.com/chrisdonahue/sdgan |
Framework | tf |
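The pairwise sampling scheme in the abstract reduces to splitting the latent code and sharing its identity half across a pair of samples. A sketch assuming a 50/50 split of a 100-d code (the exact split is an assumption):

```python
import torch

d_i, d_o = 50, 50   # assumed identity / observation split of the latent code

def sample_pair(generator, batch):
    z_identity = torch.randn(batch, d_i)            # shared within each pair
    z_obs1 = torch.randn(batch, d_o)                # varies per image
    z_obs2 = torch.randn(batch, d_o)
    x1 = generator(torch.cat([z_identity, z_obs1], dim=1))
    x2 = generator(torch.cat([z_identity, z_obs2], dim=1))
    return x1, x2   # a pair the (Siamese) discriminator judges jointly
```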
Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network
Title | Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network |
Authors | Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell |
Abstract | In this paper, we propose a generic and simple strategy for utilizing stochastic gradient information in optimization. The technique essentially contains two consecutive steps in each iteration: 1) computing and normalizing each block (layer) of the mini-batch stochastic gradient; 2) selecting an appropriate step size to update the decision variable (parameter) towards the negative of the block-normalized gradient. We conduct extensive empirical studies on various non-convex neural network optimization problems, including multi-layer perceptrons, convolutional neural networks and recurrent neural networks. The results indicate that the block-normalized gradient can help accelerate the training of neural networks. In particular, we observe that normalized gradient methods with a constant step size and occasional decay, such as SGD with momentum, perform better on deep convolutional neural networks, while those with adaptive step sizes, such as Adam, perform better on recurrent neural networks. We also observe that this line of methods can lead to solutions with better generalization properties, which is confirmed by the performance improvement over strong baselines. |
Tasks | |
Published | 2017-07-16 |
URL | http://arxiv.org/abs/1707.04822v2 |
PDF | http://arxiv.org/pdf/1707.04822v2.pdf |
PWC | https://paperswithcode.com/paper/block-normalized-gradient-method-an-empirical |
Repo | https://github.com/AliOsm/shakkelha |
Framework | none |
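The two steps in the abstract fit in a few lines: normalize each parameter block's gradient to unit norm, then step along it. A PyTorch sketch (the constant step size and stabilizing epsilon are assumptions; the paper also studies adaptive-step-size variants):

```python
import torch

def block_normalized_sgd_step(model, lr):
    """One update: each parameter block (layer) is divided by its own
    L2 gradient norm, so only the per-block direction is used."""
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            g = p.grad
            p -= lr * g / (g.norm() + 1e-8)
```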
Real Time Image Saliency for Black Box Classifiers
Title | Real Time Image Saliency for Black Box Classifiers |
Authors | Piotr Dabkowski, Yarin Gal |
Abstract | In this work we develop a fast saliency detection method that can be applied to any differentiable image classifier. We train a masking model to manipulate the scores of the classifier by masking salient parts of the input image. Our model generalises well to unseen images and requires a single forward pass to perform saliency detection, making it suitable for use in real-time systems. We test our approach on the CIFAR-10 and ImageNet datasets and show that the produced saliency maps are easily interpretable, sharp, and free of artifacts. We suggest a new metric for saliency and test our method on the ImageNet object localisation task. We achieve results outperforming other weakly supervised methods. |
Tasks | Saliency Detection |
Published | 2017-05-22 |
URL | http://arxiv.org/abs/1705.07857v1 |
PDF | http://arxiv.org/pdf/1705.07857v1.pdf |
PWC | https://paperswithcode.com/paper/real-time-image-saliency-for-black-box |
Repo | https://github.com/PiotrDabkowski/pytorch-saliency |
Framework | pytorch |
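A sketch of the kind of masking objective the abstract describes: the masking model must produce small, smooth masks whose removal destroys the classifier's confidence in the target class. The blur-based "removal", loss weights, and total-variation smoothness term are simplifying assumptions:

```python
import torch
import torch.nn.functional as F

def saliency_loss(classifier, image, mask, target, l_area=8.0, l_tv=0.1):
    """image: (N, 3, H, W); mask: (N, 1, H, W) in [0, 1]; target: class index."""
    blurred = F.avg_pool2d(image, 11, stride=1, padding=5)    # stand-in "removal"
    masked_in = image * (1 - mask) + blurred * mask
    # Minimizing this drives the target probability down once salient
    # evidence is masked out...
    class_drop = F.softmax(classifier(masked_in), dim=1)[:, target].mean()
    area = mask.mean()                                        # ...with a small mask...
    tv = (mask[..., 1:, :] - mask[..., :-1, :]).abs().mean() + \
         (mask[..., :, 1:] - mask[..., :, :-1]).abs().mean()  # ...that is smooth.
    return class_drop + l_area * area + l_tv * tv
```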
Explaining Recurrent Neural Network Predictions in Sentiment Analysis
Title | Explaining Recurrent Neural Network Predictions in Sentiment Analysis |
Authors | Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek |
Abstract | Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a gradient-based related method which was used in previous work. |
Tasks | Interpretable Machine Learning, Sentiment Analysis |
Published | 2017-06-22 |
URL | http://arxiv.org/abs/1706.07206v2 |
PDF | http://arxiv.org/pdf/1706.07206v2.pdf |
PWC | https://paperswithcode.com/paper/explaining-recurrent-neural-network |
Repo | https://github.com/ArrasL/LRP_for_LSTM |
Framework | none |
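One natural reading of the rule proposed for multiplicative connections is that all incoming relevance is routed to the signal (e.g. the LSTM cell input) and none to the gate, while linear layers use the standard epsilon rule. A NumPy sketch of both rules, with shapes and epsilon as assumptions:

```python
import numpy as np

def lrp_multiplicative(relevance_z, gate, source):
    """At a product z = gate * source, the gate neuron gets zero relevance
    and the signal neuron inherits everything."""
    return np.zeros_like(gate), relevance_z

def lrp_linear(relevance_out, w, x, b, eps=1e-3):
    """Epsilon-LRP through a linear layer y = w @ x + b."""
    z = w @ x + b
    s = relevance_out / (z + eps * np.where(z >= 0, 1.0, -1.0))  # stabilized
    return x * (w.T @ s)      # relevance redistributed onto the inputs
```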
Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network
Title | Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network |
Authors | Eunhee Kang, Jaejun Yoo, Jong Chul Ye |
Abstract | Model based iterative reconstruction (MBIR) algorithms for low-dose X-ray CT are computationally expensive. To address this problem, we recently proposed a deep convolutional neural network (CNN) for low-dose X-ray CT and won the second place in the 2016 AAPM Low-Dose CT Grand Challenge. However, some of the texture was not fully recovered. To address this problem, here we propose a novel framelet-based denoising algorithm using a wavelet residual network, which synergistically combines the expressive power of deep learning and the performance guarantee of framelet-based denoising algorithms. The new algorithm was inspired by the recent interpretation of the deep convolutional neural network (CNN) as a cascaded convolution framelet signal representation. Extensive experimental results confirm that the proposed networks have significantly improved performance and preserve the detailed texture of the original images. |
Tasks | Denoising |
Published | 2017-07-31 |
URL | http://arxiv.org/abs/1707.09938v3 |
PDF | http://arxiv.org/pdf/1707.09938v3.pdf |
PWC | https://paperswithcode.com/paper/deep-convolutional-framelet-denosing-for-low |
Repo | https://github.com/eunh/low_dose_CT |
Framework | none |
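A sketch of the wavelet residual idea: transform to a multi-scale domain, let a network predict the noise residual in the high-frequency bands, subtract it, and invert. A single-level Haar decomposition with PyWavelets stands in for the paper's framelet transform (an assumption on our part):

```python
import pywt

def wavelet_residual_denoise(noisy_img, denoiser):
    """noisy_img: 2-D array; denoiser: callable predicting the noise in a band."""
    ll, (lh, hl, hh) = pywt.dwt2(noisy_img, "haar")   # low/high-frequency bands
    # Residual learning: subtract the predicted noise from each detail band.
    lh, hl, hh = [band - denoiser(band) for band in (lh, hl, hh)]
    return pywt.idwt2((ll, (lh, hl, hh)), "haar")
```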
Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle
Title | Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle |
Authors | Haresh Karnan, Aritra Biswas, Pranav Vaidik Dhulipala, Jan Dufek, Robin Murphy |
Abstract | This paper presents an algorithm and the implementation of a motor schema to aid the visual localization subsystem of the ongoing EMILY project at Texas A&M University. The EMILY project aims to team an Unmanned Surface Vehicle (USV) with an Unmanned Aerial Vehicle (UAV) to augment the search and rescue of marine casualties during an emergency response phase. The USV is designed to serve as a flotation device once it reaches the victims. A live video feed from the UAV is provided to the casualty responders, giving them a visual estimate of the USV's orientation and position to help with its navigation. One of the challenges involved in casualty response using a USV-UAV team is to simultaneously control the USV and track it. In this paper, we present an implemented solution to automate the UAV camera movements to keep the USV in view at all times. The proposed motor schema uses the USV's coordinates from the visual localization subsystem to control the UAV's camera movements and track the USV with minimal camera movements, such that the USV is always in the camera's field of view. |
Tasks | Visual Localization |
Published | 2017-10-09 |
URL | http://arxiv.org/abs/1710.02932v1 |
PDF | http://arxiv.org/pdf/1710.02932v1.pdf |
PWC | https://paperswithcode.com/paper/visual-servoing-of-unmanned-surface-vehicle |
Repo | https://github.com/jan-dufek/emily-tracker |
Framework | none |
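Such a motor schema reduces to proportional control on the USV's pixel position with a dead-band, so the camera moves only when the USV drifts away from the frame center ("minimal camera movements"). The gains, frame size, and dead-band below are hypothetical:

```python
def camera_command(usv_px, frame_w=1920, frame_h=1080, dead_band=0.3, k=0.5):
    """usv_px: (x, y) of the USV from the visual localization subsystem.
    Returns (pan_rate, tilt_rate); zero while the USV stays near center."""
    ex = (usv_px[0] - frame_w / 2) / (frame_w / 2)   # normalized error in [-1, 1]
    ey = (usv_px[1] - frame_h / 2) / (frame_h / 2)
    pan = k * ex if abs(ex) > dead_band else 0.0     # dead-band suppresses jitter
    tilt = k * ey if abs(ey) > dead_band else 0.0
    return pan, tilt
```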
An Expectation Conditional Maximization approach for Gaussian graphical models
Title | An Expectation Conditional Maximization approach for Gaussian graphical models |
Authors | Zehang Richard Li, Tyler H. McCormick |
Abstract | Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an Expectation Conditional Maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information. |
Tasks | |
Published | 2017-09-20 |
URL | http://arxiv.org/abs/1709.06970v3 |
PDF | http://arxiv.org/pdf/1709.06970v3.pdf |
PWC | https://paperswithcode.com/paper/an-expectation-conditional-maximization |
Repo | https://github.com/richardli/EMGS |
Framework | none |
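In spike-and-slab formulations of this kind, the E-step has a closed form: for each off-diagonal precision entry, the posterior probability that it was drawn from the slab rather than the spike component, exactly as in EM for Bayesian variable selection. A minimal sketch, assuming Gaussian spike/slab scales and a fixed mixture weight (hypothetical values):

```python
from scipy.stats import norm

def edge_inclusion_prob(omega_ij, v0=0.02, v1=1.0, pi=0.5):
    """Posterior probability that precision entry omega_ij comes from the
    slab (scale v1) rather than the spike (scale v0), given prior weight pi."""
    slab = pi * norm.pdf(omega_ij, scale=v1)
    spike = (1 - pi) * norm.pdf(omega_ij, scale=v0)
    return slab / (slab + spike)
```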
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory
Title | Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory |
Authors | Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu |
Abstract | Perception and expression of emotion are key factors to the success of dialogue systems or conversational agents. However, this problem has not been studied in large-scale conversation generation so far. In this paper, we propose the Emotional Chatting Machine (ECM), which can generate responses that are appropriate not only in content (relevant and grammatical) but also in emotion (emotionally consistent). To the best of our knowledge, this is the first work that addresses the emotion factor in large-scale conversation generation. ECM addresses the factor using three new mechanisms that respectively (1) model the high-level abstraction of emotion expressions by embedding emotion categories, (2) capture the change of implicit internal emotion states, and (3) use explicit emotion expressions with an external emotion vocabulary. Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion. |
Tasks | |
Published | 2017-04-04 |
URL | http://arxiv.org/abs/1704.01074v4 |
PDF | http://arxiv.org/pdf/1704.01074v4.pdf |
PWC | https://paperswithcode.com/paper/emotional-chatting-machine-emotional |
Repo | https://github.com/tuxchow/ecm |
Framework | tf |
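A sketch of mechanism (3), the external emotion vocabulary: a learned gate mixes a generic-vocabulary distribution with an emotion-vocabulary distribution at each decoding step. Shapes are simplified and the two vocabularies are overlaid rather than disjoint, both assumptions:

```python
import torch
import torch.nn.functional as F

def ecm_readout(state, W_generic, W_emotion, w_gate):
    """state: (d,) decoder hidden state; W_*: (V, d) output projections;
    w_gate: (d,) gate weights. Returns a (V,) mixed word distribution."""
    alpha = torch.sigmoid(w_gate @ state)                 # emotion-word gate
    p_generic = F.softmax(W_generic @ state, dim=0)
    p_emotion = F.softmax(W_emotion @ state, dim=0)
    return (1 - alpha) * p_generic + alpha * p_emotion
```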
How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change
Title | How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change |
Authors | Lee Clement, Jonathan Kelly |
Abstract | Direct visual localization has recently enjoyed a resurgence in popularity with the increasing availability of cheap mobile computing power. The competitive accuracy and robustness of these algorithms compared to state-of-the-art feature-based methods, as well as their natural ability to yield dense maps, makes them an appealing choice for a variety of mobile robotics applications. However, direct methods remain brittle in the face of appearance change due to their underlying assumption of photometric consistency, which is commonly violated in practice. In this paper, we propose to mitigate this problem by training deep convolutional encoder-decoder models to transform images of a scene such that they correspond to a previously-seen canonical appearance. We validate our method in multiple environments and illumination conditions using high-fidelity synthetic RGB-D datasets, and integrate the trained models into a direct visual localization pipeline, yielding improvements in visual odometry (VO) accuracy through time-varying illumination conditions, as well as improved metric relocalization performance under illumination change, where conventional methods normally fail. We further provide a preliminary investigation of transfer learning from synthetic to real environments in a localization context. An open-source implementation of our method using PyTorch is available at https://github.com/utiasSTARS/cat-net. |
Tasks | Transfer Learning, Visual Localization, Visual Odometry |
Published | 2017-09-09 |
URL | http://arxiv.org/abs/1709.03009v5 |
PDF | http://arxiv.org/pdf/1709.03009v5.pdf |
PWC | https://paperswithcode.com/paper/how-to-train-a-cat-learning-canonical |
Repo | https://github.com/utiasSTARS/cat-net |
Framework | pytorch |
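A toy version of the encoder-decoder transform the abstract describes, in PyTorch (the released cat-net model is deeper and trained against canonical-illumination renders; the depth, loss, and layer sizes here are assumptions):

```python
import torch.nn as nn

class CATNetSketch(nn.Module):
    """Maps an image under arbitrary illumination to a canonical appearance."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

# Assumed training target: pixel loss against the canonical-condition image.
loss_fn = nn.L1Loss()
```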