July 29, 2019

2638 words 13 mins read

Paper Group AWR 145

Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs. Deep Joint Entity Disambiguation with Local Neural Attention. Knowledge Transfer for Melanoma Screening with Deep Learning. Greedy Search for Descriptive Spatial Face Features. Towards Syntactic Iberian Polarity Classification. Fast Generation for Con …

Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs

Title Boosted Cascaded Convnets for Multilabel Classification of Thoracic Diseases in Chest Radiographs
Authors Pulkit Kumar, Monika Grewal, Muktabh Mayank Srivastava
Abstract Chest X-ray is one of the most accessible medical imaging techniques for the diagnosis of multiple diseases. With the availability of ChestX-ray14, a massive dataset of chest X-ray images annotated for 14 thoracic diseases, it is possible to train Deep Convolutional Neural Networks (DCNN) to build Computer Aided Diagnosis (CAD) systems. In this work, we experiment with a set of deep learning models and present a cascaded deep neural network that can diagnose all 14 pathologies better than the baseline and is competitive with other published methods. Our work provides quantitative results to answer the following research questions for the dataset: 1) Which loss functions should be used to train a DCNN from scratch on ChestX-ray14, a dataset exhibiting high class imbalance and label co-occurrence? 2) How can cascading be used to model label dependency and improve the accuracy of the deep learning model?
Tasks Lung Disease Classification
Published 2017-11-23
URL http://arxiv.org/abs/1711.08760v1
PDF http://arxiv.org/pdf/1711.08760v1.pdf
PWC https://paperswithcode.com/paper/boosted-cascaded-convnets-for-multilabel
Repo https://github.com/Azure/AzureChestXRay
Framework pytorch
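
The two research questions above lend themselves to a short sketch. Below is a minimal, hypothetical PyTorch illustration (not the authors’ code, which lives in the linked repo) of a per-class `pos_weight` to counter class imbalance, plus a second stage that conditions on first-stage predictions to model label co-occurrence; all sizes and weights are made up.

```python
# Minimal sketch: weighted multilabel loss + a cascaded second stage.
import torch
import torch.nn as nn

NUM_CLASSES = 14  # the 14 ChestX-ray14 pathologies

# pos_weight > 1 up-weights rare positive labels; the value is illustrative.
pos_weight = torch.full((NUM_CLASSES,), 10.0)
criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

backbone = nn.Sequential(            # stand-in for a DCNN feature extractor
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, NUM_CLASSES))

# Cascade: stage 2 conditions on stage 1's outputs, letting it exploit
# label dependencies (e.g. effusion often co-occurs with edema).
stage2 = nn.Linear(2 * NUM_CLASSES, NUM_CLASSES)

x = torch.randn(4, 1, 224, 224)      # dummy batch of chest X-rays
y = torch.randint(0, 2, (4, NUM_CLASSES)).float()

logits1 = backbone(x)
logits2 = stage2(torch.cat([logits1, torch.sigmoid(logits1)], dim=1))
loss = criterion(logits1, y) + criterion(logits2, y)
loss.backward()
```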

Deep Joint Entity Disambiguation with Local Neural Attention

Title Deep Joint Entity Disambiguation with Local Neural Attention
Authors Octavian-Eugen Ganea, Thomas Hofmann
Abstract We propose a novel deep learning model for joint document-level entity disambiguation, which leverages learned neural representations. Key components are entity embeddings, a neural attention mechanism over local context windows, and a differentiable joint inference stage for disambiguation. Our approach thereby combines benefits of deep learning with more traditional approaches such as graphical models and probabilistic mention-entity maps. Extensive experiments show that we are able to obtain competitive or state-of-the-art accuracy at moderate computational costs.
Tasks Entity Disambiguation
Published 2017-04-17
URL http://arxiv.org/abs/1704.04920v3
PDF http://arxiv.org/pdf/1704.04920v3.pdf
PWC https://paperswithcode.com/paper/deep-joint-entity-disambiguation-with-local
Repo https://github.com/dalab/deep-ed
Framework torch
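
A rough sketch of the local-attention scoring the abstract describes, with illustrative dimensions: candidate entity embeddings score context words through a learned bilinear form, the most informative words are softly re-weighted, and the attended context is combined with a (here random) mention-entity prior. This paraphrases the idea; it is not the paper’s exact model.

```python
# Hedged sketch of local neural attention for entity disambiguation.
import torch

d = 300                          # embedding dimension (illustrative)
K, C = 5, 20                     # candidate entities, context window words
entity_emb = torch.randn(K, d)   # candidate entity embeddings
context_emb = torch.randn(C, d)  # word embeddings of the local context
A = torch.randn(d, d) * 0.01     # learned bilinear attention matrix

# Each context word gets its max attention score over the candidates,
# then a softmax keeps only the most informative words.
scores = entity_emb @ A @ context_emb.T          # (K, C)
word_relevance = scores.max(dim=0).values        # (C,)
attn = torch.softmax(word_relevance, dim=0)      # weights over context
context_vec = attn @ context_emb                 # attended context, (d,)

log_prior = torch.log(torch.rand(K))             # stand-in for p(e|m) map
local_score = entity_emb @ context_vec + log_prior
print("best candidate:", local_score.argmax().item())
```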

Knowledge Transfer for Melanoma Screening with Deep Learning

Title Knowledge Transfer for Melanoma Screening with Deep Learning
Authors Afonso Menegola, Michel Fornaciali, Ramon Pires, Flávia Vasques Bittencourt, Sandra Avila, Eduardo Valle
Abstract Knowledge transfer impacts the performance of deep learning, the state of the art for image classification tasks, including automated melanoma screening. Deep learning’s greed for large amounts of training data poses a challenge for medical tasks, which we can alleviate by recycling knowledge from models trained on different tasks, in a scheme called transfer learning. Although much of the best work on automated melanoma screening employs some form of transfer learning, a systematic evaluation was missing. Here we investigate the presence of transfer, the task from which the transfer is sourced, and the application of fine-tuning (i.e., retraining of the deep learning model after transfer). We also test the impact of picking deeper (and more expensive) models. Our results favor deeper models, pre-trained on ImageNet, with fine-tuning, reaching AUCs of 80.7% and 84.5% on the two skin-lesion datasets evaluated.
Tasks Image Classification, Skin Cancer Classification, Transfer Learning
Published 2017-03-22
URL http://arxiv.org/abs/1703.07479v1
PDF http://arxiv.org/pdf/1703.07479v1.pdf
PWC https://paperswithcode.com/paper/knowledge-transfer-for-melanoma-screening
Repo https://github.com/learningtitans/data-depth-design
Framework tf
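
Since the paper’s central recipe is ImageNet pre-training plus fine-tuning, a minimal sketch of that setup follows, written in PyTorch for brevity even though the linked repository uses TensorFlow; the data and hyperparameters are placeholders.

```python
# Minimal fine-tuning sketch: ImageNet backbone, new melanoma head.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 1)  # melanoma vs. benign logit

# Fine-tuning keeps all weights trainable; for pure feature extraction,
# freeze the backbone with requires_grad_(False) before swapping the head.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.BCEWithLogitsLoss()

x = torch.randn(2, 3, 224, 224)        # dummy skin-lesion batch
y = torch.tensor([[1.0], [0.0]])
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```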

Greedy Search for Descriptive Spatial Face Features

Title Greedy Search for Descriptive Spatial Face Features
Authors Caner Gacav, Burak Benligiray, Cihan Topal
Abstract Facial expression recognition methods use a combination of geometric and appearance-based features. Spatial features are derived from displacements of facial landmarks and carry geometric information. These features are either selected based on prior knowledge or dimension-reduced from a large pool. In this study, we produce a large number of potential spatial features using two combinations of facial landmarks. Among these, we search for a descriptive subset of features using sequential forward selection. The chosen feature subset is used to classify facial expressions in the extended Cohn-Kanade (CK+) dataset, delivering 88.7% recognition accuracy without any appearance-based features.
Tasks Facial Expression Recognition
Published 2017-01-07
URL http://arxiv.org/abs/1701.01879v2
PDF http://arxiv.org/pdf/1701.01879v2.pdf
PWC https://paperswithcode.com/paper/greedy-search-for-descriptive-spatial-face
Repo https://github.com/kyranstar/Narcissus
Framework tf
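
Sequential forward selection is easy to state in code. The sketch below is a generic version with stand-in data and a stand-in classifier, not the authors’ pipeline: at each round, add the single feature whose inclusion most improves cross-validated accuracy, and stop once no candidate helps.

```python
# Generic sequential forward selection over dummy "spatial features".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))       # 50 candidate features (dummy)
y = rng.integers(0, 2, size=200)     # expression labels (dummy)

selected, remaining = [], list(range(X.shape[1]))
best_score = 0.0
for _ in range(10):                  # grow the subset up to 10 features
    scores = {f: cross_val_score(LogisticRegression(max_iter=200),
                                 X[:, selected + [f]], y, cv=3).mean()
              for f in remaining}
    f_best = max(scores, key=scores.get)
    if scores[f_best] <= best_score: # stop when no feature improves CV
        break
    best_score = scores[f_best]
    selected.append(f_best)
    remaining.remove(f_best)
print("chosen features:", selected, "cv accuracy:", round(best_score, 3))
```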

Towards Syntactic Iberian Polarity Classification

Title Towards Syntactic Iberian Polarity Classification
Authors David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez
Abstract Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines. Thus, rules are also dependent and require adaptation, especially in multilingual scenarios. We tackle this challenge in the context of the Iberian Peninsula, releasing the first symbolic syntax-based Iberian system with rules shared across five official languages: Basque, Catalan, Galician, Portuguese and Spanish. The model is made available.
Tasks
Published 2017-08-17
URL http://arxiv.org/abs/1708.05269v1
PDF http://arxiv.org/pdf/1708.05269v1.pdf
PWC https://paperswithcode.com/paper/towards-syntactic-iberian-polarity
Repo https://github.com/aghie/uuusa
Framework none
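
A toy illustration of the symbolic approach, assuming a hypothetical lexicon and two language-independent rules (negation and intensification) composed bottom-up over a dependency tree; the real system’s rules and parser are far more elaborate.

```python
# Toy syntax-based polarity composition over a dependency tree.
LEXICON = {"good": 1.0, "bad": -1.0, "film": 0.0, "not": 0.0, "very": 0.0}

def polarity(node):
    # node = (word, children); rules fire on the dependents of a head
    word, children = node
    score = LEXICON.get(word, 0.0)
    for child in children:
        dep_word = child[0]
        if dep_word == "very":       # intensification rule: amplify
            score *= 1.5
        elif dep_word == "not":      # negation rule: flip the polarity
            score = -score
        else:
            score += polarity(child) # otherwise, compose recursively
    return score

# "not a very good film" as a tiny tree rooted at "film"
tree = ("film", [("good", [("very", []), ("not", [])])])
print(polarity(tree))                # -1.5: intensified, then negated
```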

Fast Generation for Convolutional Autoregressive Models

Title Fast Generation for Convolutional Autoregressive Models
Authors Prajit Ramachandran, Tom Le Paine, Pooya Khorrami, Mohammad Babaeizadeh, Shiyu Chang, Yang Zhang, Mark A. Hasegawa-Johnson, Roy H. Campbell, Thomas S. Huang
Abstract Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented in a naïve fashion where redundant computations are unnecessarily repeated. This results in slow generation, making such models infeasible for production environments. In this work, we describe a method to speed up generation in convolutional autoregressive models. The key idea is to cache hidden states to avoid redundant computation. We apply our fast generation method to the Wavenet and PixelCNN++ models and achieve up to $21\times$ and $183\times$ speedups respectively.
Tasks
Published 2017-04-20
URL http://arxiv.org/abs/1704.06001v1
PDF http://arxiv.org/pdf/1704.06001v1.pdf
PWC https://paperswithcode.com/paper/fast-generation-for-convolutional
Repo https://github.com/PrajitR/fast-pixel-cnn
Framework tf
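
The caching idea can be shown on a toy dilated causal convolution: each layer keeps a queue of exactly `dilation` past activations, so producing one sample costs a single pass through the stack instead of a full re-convolution over the whole history. A minimal NumPy sketch with made-up weights:

```python
# Cached incremental generation for a stack of dilated causal convs.
import collections
import numpy as np

class CachedDilatedLayer:
    """1x2 causal conv with dilation d: y_t = tanh(w0*x_{t-d} + w1*x_t)."""
    def __init__(self, dilation, w0=0.5, w1=0.5):
        self.w0, self.w1 = w0, w1
        # the queue holds exactly `dilation` past inputs (zeros initially)
        self.queue = collections.deque([0.0] * dilation, maxlen=dilation)

    def step(self, x_t):
        x_past = self.queue[0]       # oldest cached input = x_{t-dilation}
        self.queue.append(x_t)       # maxlen makes the deque drop the oldest
        return np.tanh(self.w0 * x_past + self.w1 * x_t)

layers = [CachedDilatedLayer(d) for d in (1, 2, 4, 8)]  # toy WaveNet stack
sample = 0.1
for t in range(16):                  # O(layers) work per generated sample
    h = sample
    for layer in layers:
        h = layer.step(h)
    sample = h                       # feed the output back autoregressively
print("last sample:", sample)
```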

Semantically Decomposing the Latent Spaces of Generative Adversarial Networks

Title Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
Authors Chris Donahue, Zachary C. Lipton, Akshay Balsubramani, Julian McAuley
Abstract We propose a new algorithm for training generative adversarial networks that jointly learns latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). By fixing the identity portion of the latent codes, we can generate diverse images of the same subject, and by fixing the observation portion, we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce pairs that are photorealistic, distinct, and appear to depict the same individual. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to facilitate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm’s ability to generate convincing, identity-matched photographs.
Tasks Face Verification, Image Generation
Published 2017-05-22
URL http://arxiv.org/abs/1705.07904v3
PDF http://arxiv.org/pdf/1705.07904v3.pdf
PWC https://paperswithcode.com/paper/semantically-decomposing-the-latent-spaces-of
Repo https://github.com/chrisdonahue/sdgan
Framework tf
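
A hedged sketch of the pairwise sampling scheme: two generator draws share an identity code but differ in their observation codes, and the discriminator scores the image pair jointly. The networks below are trivial placeholders, not the DCGAN/BEGAN architectures from the paper.

```python
# Pairwise training sketch with shared identity / distinct observation codes.
import torch
import torch.nn as nn

D_I, D_O = 50, 50                        # identity / observation code sizes

G = nn.Sequential(nn.Linear(D_I + D_O, 128), nn.ReLU(),
                  nn.Linear(128, 64 * 64))           # toy 64x64 "faces"
D = nn.Sequential(nn.Linear(2 * 64 * 64, 128), nn.ReLU(),
                  nn.Linear(128, 1))                 # scores an image *pair*

batch = 8
z_i = torch.randn(batch, D_I)            # one identity code per pair
z_o1, z_o2 = torch.randn(batch, D_O), torch.randn(batch, D_O)

fake1 = G(torch.cat([z_i, z_o1], dim=1))             # same identity,
fake2 = G(torch.cat([z_i, z_o2], dim=1))             # different observations
pair_score = D(torch.cat([fake1, fake2], dim=1))     # must depict one person
g_loss = nn.functional.binary_cross_entropy_with_logits(
    pair_score, torch.ones_like(pair_score))         # generator fools D
g_loss.backward()
```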

Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network

Title Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network
Authors Adams Wei Yu, Lei Huang, Qihang Lin, Ruslan Salakhutdinov, Jaime Carbonell
Abstract In this paper, we propose a generic and simple strategy for utilizing stochastic gradient information in optimization. The technique essentially contains two consecutive steps in each iteration: 1) computing and normalizing each block (layer) of the mini-batch stochastic gradient; 2) selecting an appropriate step size to update the decision variable (parameter) towards the negative of the block-normalized gradient. We conduct extensive empirical studies on various non-convex neural network optimization problems, including multi-layer perceptrons, convolutional neural networks and recurrent neural networks. The results indicate that the block-normalized gradient can help accelerate the training of neural networks. In particular, we observe that normalized gradient methods with a constant step size and occasional decay, such as SGD with momentum, perform better on deep convolutional neural networks, while those with adaptive step sizes, such as Adam, perform better on recurrent neural networks. We also observe that this line of methods can lead to solutions with better generalization properties, which is confirmed by their performance improvement over strong baselines.
Tasks
Published 2017-07-16
URL http://arxiv.org/abs/1707.04822v2
PDF http://arxiv.org/pdf/1707.04822v2.pdf
PWC https://paperswithcode.com/paper/block-normalized-gradient-method-an-empirical
Repo https://github.com/AliOsm/shakkelha
Framework none
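
The two-step recipe translates almost directly into an optimizer step. A generic sketch follows (not the authors’ implementation); `eps` guards against zero-norm gradients.

```python
# Block-normalized SGD-with-momentum step: normalize each layer's gradient
# to unit norm, then take a momentum step with the chosen learning rate.
import torch

def block_normalized_sgd_step(params, lr, momentum, buffers, eps=1e-8):
    for i, p in enumerate(params):
        if p.grad is None:
            continue
        g = p.grad / (p.grad.norm() + eps)   # 1) normalize the block
        buffers[i].mul_(momentum).add_(g)    # momentum on normalized grad
        p.data.add_(buffers[i], alpha=-lr)   # 2) step with the chosen lr

# usage on a toy model
model = torch.nn.Linear(10, 1)
params = list(model.parameters())
buffers = [torch.zeros_like(p) for p in params]
loss = model(torch.randn(4, 10)).pow(2).mean()
loss.backward()
block_normalized_sgd_step(params, lr=0.1, momentum=0.9, buffers=buffers)
```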

Real Time Image Saliency for Black Box Classifiers

Title Real Time Image Saliency for Black Box Classifiers
Authors Piotr Dabkowski, Yarin Gal
Abstract In this work we develop a fast saliency detection method that can be applied to any differentiable image classifier. We train a masking model to manipulate the scores of the classifier by masking salient parts of the input image. Our model generalises well to unseen images and requires a single forward pass to perform saliency detection, making it suitable for use in real-time systems. We test our approach on the CIFAR-10 and ImageNet datasets and show that the produced saliency maps are easily interpretable, sharp, and free of artifacts. We suggest a new metric for saliency and test our method on the ImageNet object localisation task, achieving results that outperform other weakly supervised methods.
Tasks Saliency Detection
Published 2017-05-22
URL http://arxiv.org/abs/1705.07857v1
PDF http://arxiv.org/pdf/1705.07857v1.pdf
PWC https://paperswithcode.com/paper/real-time-image-saliency-for-black-box
Repo https://github.com/PiotrDabkowski/pytorch-saliency
Framework pytorch
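
The training objective can be sketched as follows: a masking model proposes a mask that, when used to erase image regions, should crush the classifier’s confidence in the target class, subject to area and smoothness penalties. The networks, weights, and data below are stand-ins; a real setup would also freeze the classifier.

```python
# Sketch of the masking objective for real-time saliency.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                           nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(8, 10))     # pretrained in practice
masker = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid())

x = torch.randn(2, 3, 32, 32)
target = torch.tensor([3, 7])

mask = masker(x)                                 # salient regions -> 1
masked = x * (1 - mask)                          # remove salient evidence
logp = torch.log_softmax(classifier(masked), dim=1)
class_conf = logp.gather(1, target[:, None]).mean()  # drive this DOWN
area = mask.mean()                               # penalize large masks
tv = (mask[..., 1:, :] - mask[..., :-1, :]).abs().mean() + \
     (mask[..., :, 1:] - mask[..., :, :-1]).abs().mean()  # smoothness
loss = class_conf + 0.5 * area + 0.2 * tv        # weights are illustrative
loss.backward()                                  # freeze classifier in practice
```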

Explaining Recurrent Neural Network Predictions in Sentiment Analysis

Title Explaining Recurrent Neural Network Predictions in Sentiment Analysis
Authors Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek
Abstract Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in recurrent network architectures such as LSTMs and GRUs. We apply our technique to a word-based bi-directional LSTM model on a five-class sentiment prediction task, and evaluate the resulting LRP relevances both qualitatively and quantitatively, obtaining better results than a related gradient-based method used in previous work.
Tasks Interpretable Machine Learning, Sentiment Analysis
Published 2017-06-22
URL http://arxiv.org/abs/1706.07206v2
PDF http://arxiv.org/pdf/1706.07206v2.pdf
PWC https://paperswithcode.com/paper/explaining-recurrent-neural-network
Repo https://github.com/ArrasL/LRP_for_LSTM
Framework none
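
The proposed rule for multiplicative connections can be shown in a few lines: at a gate-times-signal product, the signal inherits all incoming relevance and the gate receives none, while ordinary linear layers use standard epsilon-LRP redistribution. The numbers below are toy values, not the paper’s bi-LSTM.

```python
# Sketch: LRP through a multiplicative (gated) connection and a linear layer.
import numpy as np

def lrp_linear(x, w, b, r_out, eps=1e-3):
    """Redistribute relevance r_out of y = x @ w + b back onto x."""
    z = x @ w + b
    s = r_out / (z + eps * np.sign(z))      # stabilized ratio
    return x * (s @ w.T)                    # contribution-weighted shares

rng = np.random.default_rng(0)
x = rng.normal(size=4)                      # input to a cell update
w, b = rng.normal(size=(4, 3)), np.zeros(3)
gate = 1 / (1 + np.exp(-rng.normal(size=3)))    # sigmoid gate values

signal = x @ w + b
cell = gate * signal                        # multiplicative connection
r_cell = np.array([1.0, 0.5, 0.2])          # relevance arriving at the cell

r_signal = r_cell                           # signal takes ALL the relevance
r_gate = np.zeros_like(r_cell)              # gate takes none
r_x = lrp_linear(x, w, b, r_signal)         # continue back through the linear
print("input relevances:", np.round(r_x, 3))
```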

Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network

Title Deep Convolutional Framelet Denosing for Low-Dose CT via Wavelet Residual Network
Authors Eunhee Kang, Jaejun Yoo, Jong Chul Ye
Abstract Model-based iterative reconstruction (MBIR) algorithms for low-dose X-ray CT are computationally expensive. To address this problem, we recently proposed a deep convolutional neural network (CNN) for low-dose X-ray CT and won second place in the 2016 AAPM Low-Dose CT Grand Challenge. However, some of the textures were not fully recovered. To address this problem, here we propose a novel framelet-based denoising algorithm using a wavelet residual network, which synergistically combines the expressive power of deep learning with the performance guarantees of framelet-based denoising algorithms. The new algorithm was inspired by the recent interpretation of a deep CNN as a cascaded convolution framelet signal representation. Extensive experimental results confirm that the proposed networks significantly improve performance and preserve the detail texture of the original images.
Tasks Denoising
Published 2017-07-31
URL http://arxiv.org/abs/1707.09938v3
PDF http://arxiv.org/pdf/1707.09938v3.pdf
PWC https://paperswithcode.com/paper/deep-convolutional-framelet-denosing-for-low
Repo https://github.com/eunh/low_dose_CT
Framework none
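
A rough sketch of working in the wavelet domain with a residual formulation, using PyWavelets for the decomposition and a no-op stand-in where the trained residual network would go; the paper’s framelet construction and network are considerably more involved.

```python
# Sketch: denoise by predicting a residual in the wavelet domain.
import numpy as np
import pywt

def denoise_wavelet_residual(img, predict_residual):
    coeffs = pywt.wavedec2(img, "db3", level=2)   # wavelet decomposition
    arr, slices = pywt.coeffs_to_array(coeffs)    # flatten for the "CNN"
    arr = arr - predict_residual(arr)             # subtract noise estimate
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, "db3")           # back to image domain

noisy = np.random.default_rng(0).normal(size=(128, 128))
fake_cnn = lambda a: 0.1 * a      # stand-in for the residual network
clean = denoise_wavelet_residual(noisy, fake_cnn)
print(clean.shape)
```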

Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle

Title Visual Servoing of Unmanned Surface Vehicle from Small Tethered Unmanned Aerial Vehicle
Authors Haresh Karnan, Aritra Biswas, Pranav Vaidik Dhulipala, Jan Dufek, Robin Murphy
Abstract This paper presents an algorithm and the implementation of a motor schema to aid the visual localization subsystem of the ongoing EMILY project at Texas A&M University. The EMILY project aims to team an Unmanned Surface Vehicle (USV) with an Unmanned Aerial Vehicle (UAV) to augment the search and rescue of marine casualties during an emergency response phase. The USV is designed to serve as a flotation device once it reaches the victims. A live video feed from the UAV is provided to the casualty responders, giving them a visual estimate of the USV’s orientation and position to help with its navigation. One of the challenges involved with casualty response using a USV-UAV team is to simultaneously control the USV and track it. In this paper, we present an implemented solution to automate the UAV camera movements to keep the USV in view at all times. The proposed motor schema uses the USV’s coordinates from the visual localization subsystem to control the UAV’s camera movements and track the USV with minimal camera movements, such that the USV is always in the camera’s field of view.
Tasks Visual Localization
Published 2017-10-09
URL http://arxiv.org/abs/1710.02932v1
PDF http://arxiv.org/pdf/1710.02932v1.pdf
PWC https://paperswithcode.com/paper/visual-servoing-of-unmanned-surface-vehicle
Repo https://github.com/jan-dufek/emily-tracker
Framework none
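
The “minimal camera movements” behavior amounts to a dead-zone proportional controller on the USV’s pixel position. A sketch with made-up gains and thresholds (the actual gimbal interface and tracker are in the linked repo):

```python
# Dead-zone proportional controller: move the camera only when the
# tracked USV leaves a band around the image center.
def camera_command(usv_px, frame_size, dead_zone=0.15, gain=0.002):
    """Return (pan, tilt) rate commands from the USV's pixel position."""
    cx, cy = frame_size[0] / 2, frame_size[1] / 2
    ex, ey = usv_px[0] - cx, usv_px[1] - cy        # pixel error from center
    # Inside the dead zone: do nothing, keeping the camera steady.
    if (abs(ex) < dead_zone * frame_size[0]
            and abs(ey) < dead_zone * frame_size[1]):
        return 0.0, 0.0
    return -gain * ex, -gain * ey                  # re-center proportionally

print(camera_command((1500, 400), (1920, 1080)))   # USV right of center
print(camera_command((980, 560), (1920, 1080)))    # in dead zone: no motion
```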

An Expectation Conditional Maximization approach for Gaussian graphical models

Title An Expectation Conditional Maximization approach for Gaussian graphical models
Authors Zehang Richard Li, Tyler H. McCormick
Abstract Bayesian graphical models are a useful tool for understanding dependence relationships among many variables, particularly in situations with external prior information. In high-dimensional settings, the space of possible graphs becomes enormous, rendering even state-of-the-art Bayesian stochastic search computationally infeasible. We propose a deterministic alternative to estimate Gaussian and Gaussian copula graphical models using an Expectation Conditional Maximization (ECM) algorithm, extending the EM approach from Bayesian variable selection to graphical model estimation. We show that the ECM approach enables fast posterior exploration under a sequence of mixture priors, and can incorporate multiple sources of information.
Tasks
Published 2017-09-20
URL http://arxiv.org/abs/1709.06970v3
PDF http://arxiv.org/pdf/1709.06970v3.pdf
PWC https://paperswithcode.com/paper/an-expectation-conditional-maximization
Repo https://github.com/richardli/EMGS
Framework none
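
For readers unfamiliar with ECM, the generic recipe the abstract invokes is: keep EM’s E-step, but replace the single M-step with a sequence of conditional maximizations, each over one parameter block with the others held fixed. The notation below is generic, not the paper’s:

```latex
% Generic ECM updates (two parameter blocks shown for concreteness).
\begin{align*}
  \text{E-step:}\quad Q(\theta \mid \theta^{(t)})
      &= \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\!\left[\log p(X, Z \mid \theta)\right], \\
  \text{CM-steps:}\quad \theta_1^{(t+1)}
      &= \arg\max_{\theta_1} Q(\theta_1, \theta_2^{(t)} \mid \theta^{(t)}), \\
  \theta_2^{(t+1)}
      &= \arg\max_{\theta_2} Q(\theta_1^{(t+1)}, \theta_2 \mid \theta^{(t)}).
\end{align*}
```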

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

Title Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory
Authors Hao Zhou, Minlie Huang, Tianyang Zhang, Xiaoyan Zhu, Bing Liu
Abstract Perception and expression of emotion are key factors in the success of dialogue systems or conversational agents. However, this problem has not been studied in large-scale conversation generation so far. In this paper, we propose the Emotional Chatting Machine (ECM), which can generate appropriate responses not only in content (relevant and grammatical) but also in emotion (emotionally consistent). To the best of our knowledge, this is the first work that addresses the emotion factor in large-scale conversation generation. ECM addresses the factor using three new mechanisms that respectively (1) model the high-level abstraction of emotion expressions by embedding emotion categories, (2) capture the change of implicit internal emotion states, and (3) use explicit emotion expressions with an external emotion vocabulary. Experiments show that the proposed model can generate responses appropriate not only in content but also in emotion.
Tasks
Published 2017-04-04
URL http://arxiv.org/abs/1704.01074v4
PDF http://arxiv.org/pdf/1704.01074v4.pdf
PWC https://paperswithcode.com/paper/emotional-chatting-machine-emotional
Repo https://github.com/tuxchow/ecm
Framework tf
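
The three mechanisms can be caricatured in one decoder step: an emotion-category embedding conditions the recurrent input, an internal emotion state decays as it is consumed, and a learned gate splits probability mass between a generic vocabulary and an external emotion vocabulary. Everything below (sizes, decay factor, architecture) is an illustrative guess at the structure, not the released TensorFlow model.

```python
# One decoder step caricaturing ECM's three emotion mechanisms.
import torch
import torch.nn as nn

V, V_EMO, H, E = 1000, 50, 128, 64       # vocab, emotion vocab, sizes
emotion_embedding = nn.Embedding(6, E)   # (1) emotion-category embedding
cell = nn.GRUCell(E + E, H)              # input: word emb + emotion emb
generic_head = nn.Linear(H, V)
emotion_head = nn.Linear(H, V_EMO)       # (3) external emotion vocabulary
type_gate = nn.Linear(H, 1)              # choose emotion vs. generic word

word_emb = torch.randn(1, E)
emo = emotion_embedding(torch.tensor([2]))       # e.g. "happy"
internal_state = torch.ones(1, E)                # (2) internal emotion memory

h = cell(torch.cat([word_emb, emo * internal_state], dim=1),
         torch.zeros(1, H))
internal_state = internal_state * 0.9            # state decays once consumed

g = torch.sigmoid(type_gate(h))                  # P(emit an emotion word)
p_generic = (1 - g) * torch.softmax(generic_head(h), dim=1)
p_emotion = g * torch.softmax(emotion_head(h), dim=1)
print(p_generic.shape, p_emotion.shape)          # distributions to merge
```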

How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change

Title How to Train a CAT: Learning Canonical Appearance Transformations for Direct Visual Localization Under Illumination Change
Authors Lee Clement, Jonathan Kelly
Abstract Direct visual localization has recently enjoyed a resurgence in popularity with the increasing availability of cheap mobile computing power. The competitive accuracy and robustness of these algorithms compared to state-of-the-art feature-based methods, as well as their natural ability to yield dense maps, makes them an appealing choice for a variety of mobile robotics applications. However, direct methods remain brittle in the face of appearance change due to their underlying assumption of photometric consistency, which is commonly violated in practice. In this paper, we propose to mitigate this problem by training deep convolutional encoder-decoder models to transform images of a scene such that they correspond to a previously-seen canonical appearance. We validate our method in multiple environments and illumination conditions using high-fidelity synthetic RGB-D datasets, and integrate the trained models into a direct visual localization pipeline, yielding improvements in visual odometry (VO) accuracy through time-varying illumination conditions, as well as improved metric relocalization performance under illumination change, where conventional methods normally fail. We further provide a preliminary investigation of transfer learning from synthetic to real environments in a localization context. An open-source implementation of our method using PyTorch is available at https://github.com/utiasSTARS/cat-net.
Tasks Transfer Learning, Visual Localization, Visual Odometry
Published 2017-09-09
URL http://arxiv.org/abs/1709.03009v5
PDF http://arxiv.org/pdf/1709.03009v5.pdf
PWC https://paperswithcode.com/paper/how-to-train-a-cat-learning-canonical
Repo https://github.com/utiasSTARS/cat-net
Framework pytorch
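
A minimal sketch of the training setup the abstract describes, with a toy encoder-decoder and random tensors standing in for synthetic image pairs; the real CAT model is the open-source implementation linked above.

```python
# Toy canonical-appearance transformation: map an image under arbitrary
# illumination to its canonical-appearance counterpart, supervised with
# pairs from synthetic renders (random tensors here).
import torch
import torch.nn as nn

cat_net = nn.Sequential(                       # toy encoder-decoder
    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))

img_lit = torch.randn(4, 3, 64, 64)            # scene under some illumination
img_canonical = torch.randn(4, 3, 64, 64)      # same scene, canonical lighting

loss = nn.functional.l1_loss(cat_net(img_lit), img_canonical)
loss.backward()
# At localization time, both the live image and the map imagery would be
# passed through the model before direct (photometric) alignment.
```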