July 29, 2019

2638 words 13 mins read

Paper Group AWR 145

Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs. Deep Joint Entity Disambiguation with Local Neural Attention. Knowledge Transfer for Melanoma Screening with Deep Learning. Greedy Search for Descriptive Spatial Face Features. Towards Syntactic Iberian Polarity Classification. Fast Generation for Con …

Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs

Title Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs
Authors Pulkit Kumar, Monika Grewal, Muktabh Mayank Srivastava
Abstract Chest X-ray is one of the most accessible medical imaging techniques for the diagnosis of multiple diseases. With the availability of ChestX-ray14, a massive dataset of chest X-ray images annotated for 14 thoracic diseases, it is possible to train Deep Convolutional Neural Networks (DCNN) to build Computer Aided Diagnosis (CAD) systems. In this work, we experiment with a set of deep learning models and present a cascaded deep neural network that can diagnose all 14 pathologies better than the baseline and is competitive with other published methods. Our work provides quantitative results to answer the following research questions for the dataset: 1) Which loss functions should be used to train a DCNN from scratch on ChestX-ray14, a dataset exhibiting high class imbalance and label co-occurrence? 2) How can cascading be used to model label dependency and improve the accuracy of the deep learning model?
Tasks Lung Disease Classification
Published 2017-11-23
URL http://arxiv.org/abs/1711.08760v1
PDF http://arxiv.org/pdf/1711.08760v1.pdf
PWC https://paperswithcode.com/paper/boosted-cascaded-convnets-for-multilabel
Repo https://github.com/Azure/AzureChestXRay
Framework pytorch
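
The two research questions above lend themselves to a short sketch. Below is a minimal, hypothetical PyTorch illustration (not the authors’ code, which lives in the linked repo) of a per-class `pos_weight` to counter class imbalance, plus a second stage that conditions on first-stage predictions to model label co-occurrence; all sizes and weights are made up.

```python
# Minimal sketch: weighted multilabel loss + a cascaded second stage.
import torch
import torch.nn as nn

NUM_CLASSES = 14  # the 14 ChestX-ray14 pathologies

# pos_weight > 1 up-weights rare positive labels; the value is illustrative.
pos_weight = torch.full((NUM_CLASSES,), 10.0)
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

backbone = nn.Sequential(            # stand-in for a DCNN feature extractor
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, NUM_CLASSES))

# Cascade: stage 2 conditions on stage 1's outputs, letting it exploit
# label dependencies (e.g. effusion often co-occurs with edema).
stage2 = nn.Linear(2 * NUM_CLASSES, NUM_CLASSES)

x = torch.randn(4, 1, 224, 224)      # dummy batch of chest X-rays
y = torch.randint(0, 2, (4, NUM_CLASSES)).float()

logits1 = backbone(x)
logits2 = stage2(torch.cat([logits1, torch.sigmoid(logits1)], dim=1))
loss = criterion(logits1, y) + criterion(logits2, y)
loss.backward()
```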

Deep Joint Entity Disambiguation with Local Neural Attention

Title Deep Joint Entity Disambiguation with Local Neural Attention
Authors Octavian-Eugen Ganea, Thomas Hofmann
Abstract We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations. Key components are entity embeddings, a neural attention mechanism over local context windows, and a differentiable joint inference stage for disambiguation. Our approach thereby combines benefits of deep learning with more traditional approaches such as graphical models and probabilistic mention-entity maps. Extensive experiments show that we are able to obtain competitive or state-of-the-art accuracy at moderate computational costs.
Tasks Entity Disambiguation
Published 2017-04-17
URL http://arxiv.org/abs/1704.04920v3
PDF http://arxiv.org/pdf/1704.04920v3.pdf
PWC https://paperswithcode.com/paper/deep-joint-entity-disambiguation-with-local
Repo https://github.com/dalab/deep-ed
Framework torch
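
A rough sketch of the local-attention scoring the abstract describes, with illustrative dimensions: candidate entity embeddings score context words through a learned bilinear form, the most informative words are softly re-weighted, and the attended context is combined with a (here random) mention-entity prior. This paraphrases the idea; it is not the paper’s exact model.

```python
# Hedged sketch of local neural attention for entity disambiguation.
import torch

d = 300                          # embedding dimension (illustrative)
K, C = 5, 20                     # candidate entities, context window words
entity_emb = torch.randn(K, d)   # candidate entity embeddings
context_emb = torch.randn(C, d)  # word embeddings of the local context
A = torch.randn(d, d) * 0.01     # learned bilinear attention matrix

# Each context word gets its max attention score over the candidates,
# then a softmax keeps only the most informative words.
scores = entity_emb @ A @ context_emb.T          # (K, C)
word_relevance = scores.max(dim=0).values        # (C,)
attn = torch.softmax(word_relevance, dim=0)      # weights over context
context_vec = attn @ context_emb                 # attended context, (d,)

log_prior = torch.log(torch.rand(K))             # stand-in for p(e|m) map
local_score = entity_emb @ context_vec + log_prior
print("best candidate:", local_score.argmax().item())
```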

Knowledge Transfer for Melanoma Screening with Deep Learning

Title Knowledge Transfer for Melanoma Screening with Deep Learning
Authors Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, Eduardo Valle
Abstract Knowledge transfer impacts the performance of deep learning, the state of the art for image classification tasks, including automated melanoma screening. Deep learning’s greed for large amounts of training data poses a challenge for medical tasks, which we can alleviate by recycling knowledge from models trained on different tasks, in a scheme called transfer learning. Although much of the best work on automated melanoma screening employs some form of transfer learning, a systematic evaluation was missing. Here we investigate the presence of transfer, the task from which the transfer is sourced, and the application of fine-tuning (i.e., retraining of the deep learning model after transfer). We also test the impact of picking deeper (and more expensive) models. Our results favor deeper models, pre-trained on ImageNet, with fine-tuning, reaching AUCs of 80.7% and 84.5% on the two skin-lesion datasets evaluated.
Tasks Image Classification, Skin Cancer Classification, Transfer Learning
Published 2017-03-22
URL http://arxiv.org/abs/1703.07479v1
PDF http://arxiv.org/pdf/1703.07479v1.pdf
PWC https://paperswithcode.com/paper/knowledge-transfer-for-melanoma-screening
Repo https://github.com/learningtitans/data-depth-design
Framework tf
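
Since the paper’s central recipe is ImageNet pre-training plus fine-tuning, a minimal sketch of that setup follows, written in PyTorch for brevity even though the linked repository uses TensorFlow; the data and hyperparameters are placeholders.

```python
# Minimal fine-tuning sketch: ImageNet backbone, new melanoma head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 1)  # melanoma vs. benign logit

# Fine-tuning keeps all weights trainable; for pure feature extraction,
# freeze the backbone with requires_grad_(False) before swapping the head.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(2, 3, 224, 224)        # dummy skin-lesion batch
y = torch.tensor([[1.0], [0.0]])
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```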

Greedy Search for Descriptive Spatial Face Features

Title Greedy Search for Descriptive Spatial Face Features
Authors Caner Gacav, Burak Benligiray, Cihan Topal
Abstract Facial expression recognition methods use a combination of geometric and appearance-based features. Spatial features are derived from displacements of facial landmarks and carry geometric information. These features are either selected based on prior knowledge or dimension-reduced from a large pool. In this study, we produce a large number of potential spatial features using two combinations of facial landmarks. Among these, we search for a descriptive subset of features using sequential forward selection. The chosen feature subset is used to classify facial expressions in the extended Cohn-Kanade (CK+) dataset, delivering 88.7% recognition accuracy without any appearance-based features.
Tasks Facial Expression Recognition
Published 2017-01-07
URL http://arxiv.org/abs/1701.01879v2
PDF http://arxiv.org/pdf/1701.01879v2.pdf
PWC https://paperswithcode.com/paper/greedy-search-for-descriptive-spatial-face
Repo https://github.com/kyranstar/Narcissus
Framework tf
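
Sequential forward selection is easy to state in code. The sketch below is a generic version with stand-in data and a stand-in classifier, not the authors’ pipeline: at each round, add the single feature whose inclusion most improves cross-validated accuracy, and stop once no candidate helps.

```python
# Generic sequential forward selection over dummy "spatial features".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))       # 50 candidate features (dummy)
y = rng.integers(0, 2, size=200)     # expression labels (dummy)

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
for _ in range(10):                  # grow the subset up to 10 features
    scores = {f: cross_val_score(LogisticRegression(max_iter=200),
                                 X[:, selected + [f]], y, cv=3).mean()
              for f in remaining}
    f_best = max(scores, key=scores.get)
    if scores[f_best] <= best_score: # stop when no feature improves CV
        break
    best_score = scores[f_best]
    selected.append(f_best)
    remaining.remove(f_best)
print("chosen features:", selected, "cv accuracy:", round(best_score, 3))
```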

Towards Syntactic Iberian Polarity Classification

Title Towards Syntactic Iberian Polarity Classification
Authors David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez
Abstract Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines. Thus, rules are also dependent and require adaptation, especially in multilingual scenarios. We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-based Iberian system with rules shared across five official languages: Basque, Catalan, Galician, Portuguese and Spanish. The model is made available.
Tasks
Published 2017-08-17
URL http://arxiv.org/abs/1708.05269v1
PDF http://arxiv.org/pdf/1708.05269v1.pdf
PWC https://paperswithcode.com/paper/towards-syntactic-iberian-polarity
Repo https://github.com/aghie/uuusa
Framework none
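
A toy illustration of the symbolic approach, assuming a hypothetical lexicon and two language-independent rules (negation and intensification) composed bottom-up over a dependency tree; the real system’s rules and parser are far more elaborate.

```python
# Toy syntax-based polarity composition over a dependency tree.
LEXICON = {"good": 1.0, "bad": -1.0, "film": 0.0, "not": 0.0, "very": 0.0}

def polarity(node):
    # node = (word, children); rules fire on the dependents of a head
    word, children = node
    score = LEXICON.get(word, 0.0)
    for child in children:
        dep_word = child[0]
        if dep_word == "very":       # intensification rule: amplify
            score *= 1.5
        elif dep_word == "not":      # negation rule: flip the polarity
            score = -score
        else:
            score += polarity(child) # otherwise, compose recursively
    return score

# "not a very good film" as a tiny tree rooted at "film"
tree = ("film", [("good", [("very", []), ("not", [])])])
print(polarity(tree))                # -1.5: intensified, then negated
```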

Fast Generation for Convolutional Autoregressive Models

Title Fast Generation for Convolutional Autoregressive Models
Authors Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
Abstract Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented in a naïve fashion where redundant computations are unnecessarily repeated. This results in slow generation, making such models infeasible for production environments. In this work, we describe a method to speed up generation in convolutional autoregressive models. The key idea is to cache hidden states to avoid redundant computation. We apply our fast generation method to the Wavenet and PixelCNN++ models and achieve up to $21\times$ and $183\times$ speedups respectively.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06001v1
PDF http://arxiv.org/pdf/1704.06001v1.pdf
PWC https://paperswithcode.com/paper/fast-generation-for-convolutional
Repo https://github.com/PrajitR/fast-pixel-cnn
Framework tf
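
The caching idea can be shown on a toy dilated causal convolution: each layer keeps a queue of exactly `dilation` past activations, so producing one sample costs a single pass through the stack instead of a full re-convolution over the whole history. A minimal NumPy sketch with made-up weights:

```python
# Cached incremental generation for a stack of dilated causal convs.
import collections
import numpy as np

class CachedDilatedLayer:
    """1x2 causal conv with dilation d: y_t = tanh(w0*x_{t-d} + w1*x_t)."""
    def __init__(self, dilation, w0=0.5, w1=0.5):
        self.w0, self.w1 = w0, w1
        # the queue holds exactly `dilation` past inputs (zeros initially)
        self.queue = collections.deque([0.0] * dilation, maxlen=dilation)

    def step(self, x_t):
        x_past = self.queue[0]       # oldest cached input = x_{t-dilation}
        self.queue.append(x_t)       # maxlen makes the deque drop the oldest
        return np.tanh(self.w0 * x_past + self.w1 * x_t)

layers = [CachedDilatedLayer(d) for d in (1, 2, 4, 8)]  # toy WaveNet stack
sample = 0.1
for t in range(16):                  # O(layers) work per generated sample
    h = sample
    for layer in layers:
        h = layer.step(h)
    sample = h                       # feed the output back autoregressively
print("last sample:", sample)
```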

Semantically Decomposing the Latent Spaces of Generative Adversarial Networks

Title Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
Authors Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, Julian McAuley
Abstract We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce pairs that are photorealistic, distinct, and appear to depict the same individual. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to facilitate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm’s ability to generate convincing, identity-matched photographs.
Tasks Face Verification, Image Generation
Published 2017-05-22
URL http://arxiv.org/abs/1705.07904v3
PDF http://arxiv.org/pdf/1705.07904v3.pdf
PWC https://paperswithcode.com/paper/semantically-decomposing-the-latent-spaces-of
Repo https://github.com/chrisdonahue/sdgan
Framework tf
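
A hedged sketch of the pairwise sampling scheme: two generator draws share an identity code but differ in their observation codes, and the discriminator scores the image pair jointly. The networks below are trivial placeholders, not the DCGAN/BEGAN architectures from the paper.

```python
# Pairwise training sketch with shared identity / distinct observation codes.
import torch
import torch.nn as nn

D_I, D_O = 50, 50                        # identity / observation code sizes

G = nn.Sequential(nn.Linear(D_I + D_O, 128), nn.ReLU(),
                  nn.Linear(128, 64 * 64))           # toy 64x64 "faces"
D = nn.Sequential(nn.Linear(2 * 64 * 64, 128), nn.ReLU(),
                  nn.Linear(128, 1))                 # scores an image *pair*

batch = 8
z_i = torch.randn(batch, D_I)            # one identity code per pair
z_o1, z_o2 = torch.randn(batch, D_O), torch.randn(batch, D_O)

fake1 = G(torch.cat([z_i, z_o1], dim=1))             # same identity,
fake2 = G(torch.cat([z_i, z_o2], dim=1))             # different observations
pair_score = D(torch.cat([fake1, fake2], dim=1))     # must depict one person
g_loss = nn.functional.binary_cross_entropy_with_logits(
    pair_score, torch.ones_like(pair_score))         # generator fools D
g_loss.backward()
```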

Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network

Title Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network
Authors Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell
Abstract In this paper, we propose a generic and simple strategy for utilizing stochastic gradient information in optimization. The technique essentially contains two consecutive steps in each iteration: 1) computing and normalizing each block (layer) of the mini-batch stochastic gradient; 2) selecting an appropriate step size to update the decision variable (parameter) towards the negative of the block-normalized gradient. We conduct extensive empirical studies on various non-convex neural network optimization problems, including multi-layer perceptrons, convolutional neural networks and recurrent neural networks. The results indicate that the block-normalized gradient can help accelerate the training of neural networks. In particular, we observe that normalized gradient methods with a constant step size and occasional decay, such as SGD with momentum, perform better on deep convolutional neural networks, while those with adaptive step sizes, such as Adam, perform better on recurrent neural networks. We also observe that this line of methods can lead to solutions with better generalization properties, which is confirmed by their performance improvement over strong baselines.
Tasks
Published 2017-07-16
URL http://arxiv.org/abs/1707.04822v2
PDF http://arxiv.org/pdf/1707.04822v2.pdf
PWC https://paperswithcode.com/paper/block-normalized-gradient-method-an-empirical
Repo https://github.com/AliOsm/shakkelha
Framework none
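
The two-step recipe translates almost directly into an optimizer step. A generic sketch follows (not the authors’ implementation); `eps` guards against zero-norm gradients.

```python
# Block-normalized SGD-with-momentum step: normalize each layer's gradient
# to unit norm, then take a momentum step with the chosen learning rate.
import torch

def block_normalized_sgd_step(params, lr, momentum, buffers, eps=1e-8):
    for i, p in enumerate(params):
        if p.grad is None:
            continue
        g = p.grad / (p.grad.norm() + eps)   # 1) normalize the block
        buffers[i].mul_(momentum).add_(g)    # momentum on normalized grad
        p.data.add_(buffers[i], alpha=-lr)   # 2) step with the chosen lr

# usage on a toy model
model = torch.nn.Linear(10, 1)
params = list(model.parameters())
buffers = [torch.zeros_like(p) for p in params]
loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward()
block_normalized_sgd_step(params, lr=0.1, momentum=0.9, buffers=buffers)
```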

Real Time Image Saliency for Black Box Classifiers

Title Real Time Image Saliency for Black Box Classifiers
Authors Piotr Dabkowski, Yarin Gal
Abstract In this work we develop a fast saliency detection method that can be applied to any differentiable image classifier. We train a masking model to manipulate the scores of the classifier by masking salient parts of the input image. Our model generalises well to unseen images and requires a single forward pass to perform saliency detection, making it suitable for use in real-time systems. We test our approach on the CIFAR-10 and ImageNet datasets and show that the produced saliency maps are easily interpretable, sharp, and free of artifacts. We suggest a new metric for saliency and test our method on the ImageNet object localisation task, achieving results that outperform other weakly supervised methods.
Tasks Saliency Detection
Published 2017-05-22
URL http://arxiv.org/abs/1705.07857v1
PDF http://arxiv.org/pdf/1705.07857v1.pdf
PWC https://paperswithcode.com/paper/real-time-image-saliency-for-black-box
Repo https://github.com/PiotrDabkowski/pytorch-saliency
Framework pytorch
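
The training objective can be sketched as follows: a masking model proposes a mask that, when used to erase image regions, should crush the classifier’s confidence in the target class, subject to area and smoothness penalties. The networks, weights, and data below are stand-ins; a real setup would also freeze the classifier.

```python
# Sketch of the masking objective for real-time saliency.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(8, 10))     # pretrained in practice
masker = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())

x = torch.randn(2, 3, 32, 32)
target = torch.tensor([3, 7])

mask = masker(x)                                 # salient regions -> 1
masked = x * (1 - mask)                          # remove salient evidence
logp = torch.log_softmax(classifier(masked), dim=1)
class_conf = logp.gather(1, target[:, None]).mean()  # drive this DOWN
area = mask.mean()                               # penalize large masks
tv = (mask[..., 1:, :] - mask[..., :-1, :]).abs().mean() + \
     (mask[..., :, 1:] - mask[..., :, :-1]).abs().mean()  # smoothness
loss = class_conf + 0.5 * area + 0.2 * tv        # weights are illustrative
loss.backward()                                  # freeze classifier in practice
```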

Explaining Recurrent Neural Network Predictions in Sentiment Analysis

Title Explaining Recurrent Neural Network Predictions in Sentiment Analysis
Authors Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek
Abstract Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a related gradient-based method used in previous work.
Tasks Interpretable Machine Learning, Sentiment Analysis
Published 2017-06-22
URL http://arxiv.org/abs/1706.07206v2
PDF http://arxiv.org/pdf/1706.07206v2.pdf
PWC https://paperswithcode.com/paper/explaining-recurrent-neural-network
Repo https://github.com/ArrasL/LRP_for_LSTM
Framework none
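
The proposed rule for multiplicative connections can be shown in a few lines: at a gate-times-signal product, the signal inherits all incoming relevance and the gate receives none, while ordinary linear layers use standard epsilon-LRP redistribution. The numbers below are toy values, not the paper’s bi-LSTM.

```python
# Sketch: LRP through a multiplicative (gated) connection and a linear layer.
import numpy as np

def lrp_linear(x, w, b, r_out, eps=1e-3):
    """Redistribute relevance r_out of y = x @ w + b back onto x."""
    z = x @ w + b
    s = r_out / (z + eps * np.sign(z))      # stabilized ratio
    return x * (s @ w.T)                    # contribution-weighted shares

rng = np.random.default_rng(0)
x = rng.normal(size=4)                      # input to a cell update
w, b = rng.normal(size=(4, 3)), np.zeros(3)
gate = 1 / (1 + np.exp(-rng.normal(size=3)))    # sigmoid gate values

signal = x @ w + b
cell = gate * signal                        # multiplicative connection
r_cell = np.array([1.0, 0.5, 0.2])          # relevance arriving at the cell

r_signal = r_cell                           # signal takes ALL the relevance
r_gate = np.zeros_like(r_cell)              # gate takes none
r_x = lrp_linear(x, w, b, r_signal)         # continue back through the linear
print("input relevances:", np.round(r_x, 3))
```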

Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network

Title Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network
Authors Eunhee Kang, Jaejun Yoo, Jong Chul Ye
Abstract Model-based iterative reconstruction (MBIR) algorithms for low-dose X-ray CT are computationally expensive. To address this problem, we recently proposed a deep convolutional neural network (CNN) for low-dose X-ray CT and won second place in the 2016 AAPM Low-Dose CT Grand Challenge. However, some of the textures were not fully recovered. To address this problem, here we propose a novel framelet-based denoising algorithm using a wavelet residual network, which synergistically combines the expressive power of deep learning with the performance guarantees of framelet-based denoising algorithms. The new algorithm was inspired by the recent interpretation of a deep CNN as a cascaded convolution framelet signal representation. Extensive experimental results confirm that the proposed networks significantly improve performance and preserve the detail texture of the original images.
Tasks Denoising
Published 2017-07-31
URL http://arxiv.org/abs/1707.09938v3
PDF http://arxiv.org/pdf/1707.09938v3.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-framelet-denosing-for-low
Repo https://github.com/eunh/low_dose_CT
Framework none
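
A rough sketch of working in the wavelet domain with a residual formulation, using PyWavelets for the decomposition and a no-op stand-in where the trained residual network would go; the paper’s framelet construction and network are considerably more involved.

```python
# Sketch: denoise by predicting a residual in the wavelet domain.
import numpy as np
import pywt

def denoise_wavelet_residual(img, predict_residual):
    coeffs = pywt.wavedec2(img, "db3", level=2)   # wavelet decomposition
    arr, slices = pywt.coeffs_to_array(coeffs)    # flatten for the "CNN"
    arr = arr - predict_residual(arr)             # subtract noise estimate
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, "db3")           # back to image domain

noisy = np.random.default_rng(0).normal(size=(128, 128))
fake_cnn = lambda a: 0.1 * a      # stand-in for the residual network
clean = denoise_wavelet_residual(noisy, fake_cnn)
print(clean.shape)
```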

Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle

Title Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle
Authors Haresh Karnan, Aritra Biswas, Pranav Vaidik Dhulipala, Jan Dufek, Robin Murphy
Abstract This paper presents an algorithm and the implementation of a motor schema to aid the visual localization subsystem of the ongoing EMILY project at Texas A&M University. The EMILY project aims to team an Unmanned Surface Vehicle (USV) with an Unmanned Aerial Vehicle (UAV) to augment the search and rescue of marine casualties during an emergency response phase. The USV is designed to serve as a flotation device once it reaches the victims. A live video feed from the UAV is provided to the casualty responders, giving them a visual estimate of the USV’s orientation and position to help with its navigation. One of the challenges involved with casualty response using a USV-UAV team is to simultaneously control the USV and track it. In this paper, we present an implemented solution to automate the UAV camera movements to keep the USV in view at all times. The proposed motor schema uses the USV’s coordinates from the visual localization subsystem to control the UAV’s camera movements and track the USV with minimal camera movements, such that the USV is always in the camera’s field of view.
Tasks Visual Localization
Published 2017-10-09
URL http://arxiv.org/abs/1710.02932v1
PDF http://arxiv.org/pdf/1710.02932v1.pdf
PWC https://paperswithcode.com/paper/visual-servoing-of-unmanned-surface-vehicle
Repo https://github.com/jan-dufek/emily-tracker
Framework none
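
The “minimal camera movements” behavior amounts to a dead-zone proportional controller on the USV’s pixel position. A sketch with made-up gains and thresholds (the actual gimbal interface and tracker are in the linked repo):

```python
# Dead-zone proportional controller: move the camera only when the
# tracked USV leaves a band around the image center.
def camera_command(usv_px, frame_size, dead_zone=0.15, gain=0.002):
    """Return (pan, tilt) rate commands from the USV's pixel position."""
    cx, cy = frame_size[0] / 2, frame_size[1] / 2
    ex, ey = usv_px[0] - cx, usv_px[1] - cy        # pixel error from center
    # Inside the dead zone: do nothing, keeping the camera steady.
    if (abs(ex) < dead_zone * frame_size[0]
            and abs(ey) < dead_zone * frame_size[1]):
        return 0.0, 0.0
    return -gain * ex, -gain * ey                  # re-center proportionally

print(camera_command((1500, 400), (1920, 1080)))   # USV right of center
print(camera_command((980, 560), (1920, 1080)))    # in dead zone: no motion
```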

An Expectation Conditional Maximization approach for Gaussian graphical models

Title An Expectation Conditional Maximization approach for Gaussian graphical models
Authors Zehang Richard Li, Tyler H. McCormick
Abstract Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an Expectation Conditional Maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information.
Tasks
Published 2017-09-20
URL http://arxiv.org/abs/1709.06970v3
PDF http://arxiv.org/pdf/1709.06970v3.pdf
PWC https://paperswithcode.com/paper/an-expectation-conditional-maximization
Repo https://github.com/richardli/EMGS
Framework none
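
For readers unfamiliar with ECM, the generic recipe the abstract invokes is: keep EM’s E-step, but replace the single M-step with a sequence of conditional maximizations, each over one parameter block with the others held fixed. The notation below is generic, not the paper’s:

```latex
% Generic ECM updates (two parameter blocks shown for concreteness).
\begin{align*}
  \text{E-step:}\quad Q(\theta \mid \theta^{(t)})
      &= \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log p(X, Z \mid \theta)\right], \\
  \text{CM-steps:}\quad \theta_1^{(t+1)}
      &= \arg\max_{\theta_1} Q(\theta_1, \theta_2^{(t)} \mid \theta^{(t)}), \\
  \theta_2^{(t+1)}
      &= \arg\max_{\theta_2} Q(\theta_1^{(t+1)}, \theta_2 \mid \theta^{(t)}).
\end{align*}
```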

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Title Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory
Authors Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu
Abstract Perception and expression of emotion are key factors in the success of dialogue systems or conversational agents. However, this problem has not been studied in large-scale conversation generation so far. In this paper, we propose the Emotional Chatting Machine (ECM), which can generate appropriate responses not only in content (relevant and grammatical) but also in emotion (emotionally consistent). To the best of our knowledge, this is the first work that addresses the emotion factor in large-scale conversation generation. ECM addresses the factor using three new mechanisms that respectively (1) model the high-level abstraction of emotion expressions by embedding emotion categories, (2) capture the change of implicit internal emotion states, and (3) use explicit emotion expressions with an external emotion vocabulary. Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion.
Tasks
Published 2017-04-04
URL http://arxiv.org/abs/1704.01074v4
PDF http://arxiv.org/pdf/1704.01074v4.pdf
PWC https://paperswithcode.com/paper/emotional-chatting-machine-emotional
Repo https://github.com/tuxchow/ecm
Framework tf
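
The three mechanisms can be caricatured in one decoder step: an emotion-category embedding conditions the recurrent input, an internal emotion state decays as it is consumed, and a learned gate splits probability mass between a generic vocabulary and an external emotion vocabulary. Everything below (sizes, decay factor, architecture) is an illustrative guess at the structure, not the released TensorFlow model.

```python
# One decoder step caricaturing ECM's three emotion mechanisms.
import torch
import torch.nn as nn

V, V_EMO, H, E = 1000, 50, 128, 64       # vocab, emotion vocab, sizes
emotion_embedding = nn.Embedding(6, E)   # (1) emotion-category embedding
cell = nn.GRUCell(E + E, H)              # input: word emb + emotion emb
generic_head = nn.Linear(H, V)
emotion_head = nn.Linear(H, V_EMO)       # (3) external emotion vocabulary
type_gate = nn.Linear(H, 1)              # choose emotion vs. generic word

word_emb = torch.randn(1, E)
emo = emotion_embedding(torch.tensor([2]))       # e.g. "happy"
internal_state = torch.ones(1, E)                # (2) internal emotion memory

h = cell(torch.cat([word_emb, emo * internal_state], dim=1),
         torch.zeros(1, H))
internal_state = internal_state * 0.9            # state decays once consumed

g = torch.sigmoid(type_gate(h))                  # P(emit an emotion word)
p_generic = (1 - g) * torch.softmax(generic_head(h), dim=1)
p_emotion = g * torch.softmax(emotion_head(h), dim=1)
print(p_generic.shape, p_emotion.shape)          # distributions to merge
```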

How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change

Title How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change
Authors Lee Clement, Jonathan Kelly
Abstract Direct visual localization has recently enjoyed a resurgence in popularity with the increasing availability of cheap mobile computing power. The competitive accuracy and robustness of these algorithms compared to state-of-the-art feature-based methods, as well as their natural ability to yield dense maps, makes them an appealing choice for a variety of mobile robotics applications. However, direct methods remain brittle in the face of appearance change due to their underlying assumption of photometric consistency, which is commonly violated in practice. In this paper, we propose to mitigate this problem by training deep convolutional encoder-decoder models to transform images of a scene such that they correspond to a previously-seen canonical appearance. We validate our method in multiple environments and illumination conditions using high-fidelity synthetic RGB-D datasets, and integrate the trained models into a direct visual localization pipeline, yielding improvements in visual odometry (VO) accuracy through time-varying illumination conditions, as well as improved metric relocalization performance under illumination change, where conventional methods normally fail. We further provide a preliminary investigation of transfer learning from synthetic to real environments in a localization context. An open-source implementation of our method using PyTorch is available at https://github.com/utiasSTARS/cat-net.
Tasks Transfer Learning, Visual Localization, Visual Odometry
Published 2017-09-09
URL http://arxiv.org/abs/1709.03009v5
PDF http://arxiv.org/pdf/1709.03009v5.pdf
PWC https://paperswithcode.com/paper/how-to-train-a-cat-learning-canonical
Repo https://github.com/utiasSTARS/cat-net
Framework pytorch
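
A minimal sketch of the training setup the abstract describes, with a toy encoder-decoder and random tensors standing in for synthetic image pairs; the real CAT model is the open-source implementation linked above.

```python
# Toy canonical-appearance transformation: map an image under arbitrary
# illumination to its canonical-appearance counterpart, supervised with
# pairs from synthetic renders (random tensors here).
import torch
import torch.nn as nn

cat_net = nn.Sequential(                       # toy encoder-decoder
    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))

img_lit = torch.randn(4, 3, 64, 64)            # scene under some illumination
img_canonical = torch.randn(4, 3, 64, 64)      # same scene, canonical lighting

loss = nn.functional.l1_loss(cat_net(img_lit), img_canonical)
loss.backward()
# At localization time, both the live image and the map imagery would be
# passed through the model before direct (photometric) alignment.
```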