July 28, 2019


Paper Group ANR 222


Low Precision RNNs: Quantizing RNNs Without Losing Accuracy


Title	Low Precision RNNs: Quantizing RNNs Without Losing Accuracy
Authors	Supriya Kapur, Asit Mishra, Debbie Marr
Abstract	Similar to convolutional neural networks, recurrent neural networks (RNNs) typically suffer from over-parameterization. Reducing the bit-widths of weights and activations improves runtime efficiency on hardware, yet it often comes at the cost of reduced accuracy. This paper proposes a quantization approach that increases model size as bit-width is reduced. This approach allows networks to perform at their baseline accuracy while still maintaining the benefits of reduced precision and overall model size reduction.
Tasks	Quantization
Published	2017-10-20
URL	http://arxiv.org/abs/1710.07706v1
PDF	http://arxiv.org/pdf/1710.07706v1.pdf
PWC	https://paperswithcode.com/paper/low-precision-rnns-quantizing-rnns-without
Repo
Framework
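
The trade-off described in this abstract (fewer bits per weight, more weights) can be illustrated with a minimal sketch. It assumes a symmetric uniform quantizer, which is a common choice but is not confirmed by the abstract as the authors' scheme; the function names and the model sizes are hypothetical.

```python
def quantize_uniform(values, bits):
    """Snap each value to a signed, symmetric uniform grid.

    The grid has 2**(bits - 1) - 1 positive levels, the same number of
    negative levels, and zero. `values` is assumed to be non-empty.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed grid")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels          # distance between adjacent grid points
    return [round(v / scale) * scale for v in values]


def storage_bits(num_weights, bits):
    """Storage needed for `num_weights` weights at the given bit-width."""
    return num_weights * bits


# Hypothetical sizes: a model with 4x the weights, stored at 4 bits,
# still needs half the storage of the 32-bit baseline.
baseline = storage_bits(1_000, 32)
widened = storage_bits(4_000, 4)
```

The arithmetic at the end is the point of the sketch: growing the model does not cancel the savings as long as the bit-width shrinks faster than the parameter count grows.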

Learning to Imagine Manipulation Goals for Robot Task Planning


Title	Learning to Imagine Manipulation Goals for Robot Task Planning
Authors	Chris Paxton, Kapil Katyal, Christian Rupprecht, Raman Arora, Gregory D. Hager
Abstract	Prospection is an important part of how humans come up with new task plans, but has not been explored in depth in robotics. Predicting multiple task-level outcomes is a challenging problem that involves capturing both task semantics and continuous variability over the state of the world. Ideally, we would combine the ability of machine learning to leverage big data for learning the semantics of a task with techniques from task planning that reliably generalize to new environments. In this work, we propose a method for learning a model encoding just such a representation for task planning. We learn a neural net that encodes the $k$ most likely outcomes of high-level actions in a given world. Our approach creates comprehensible task plans that allow us to predict changes to the environment many time steps into the future. We demonstrate this approach via application to a stacking task in a cluttered environment, where the robot must select between different colored blocks while avoiding obstacles in order to perform a task. We also show results on a simple navigation task. Our algorithm generates realistic image and pose predictions at multiple points in a given task.
Tasks	Robot Task Planning
Published	2017-11-08
URL	http://arxiv.org/abs/1711.02783v2
PDF	http://arxiv.org/pdf/1711.02783v2.pdf
PWC	https://paperswithcode.com/paper/learning-to-imagine-manipulation-goals-for
Repo
Framework

Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks


Title	Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks
Authors	Kunal Swami, Pranav P. Deshpande, Gaurav Khandelwal, Ajay Vijayvargiya
Abstract	Image orientation detection requires high-level scene understanding. Humans use object recognition and contextual scene information to correctly orient images. In the literature, the problem of image orientation detection is mostly addressed using low-level vision features, while some approaches incorporate a few easily detectable semantic cues to gain minor improvements. The vast amount of semantic content in images makes orientation detection challenging, and therefore there is a large semantic gap between existing methods and human behavior. Also, existing methods in the literature report highly discrepant detection rates, mainly due to large differences in datasets and the limited variety of test images used for evaluation. In this work, for the first time, we leverage the power of deep learning and adapt pre-trained convolutional neural networks using the largest training dataset to date for the image orientation detection task. An extensive evaluation of our model on different public datasets shows that it generalizes remarkably well to correctly orient a large set of unconstrained images; it also significantly outperforms the state of the art and achieves accuracy very close to that of humans.
Tasks	Object Recognition, Scene Understanding
Published	2017-12-04
URL	http://arxiv.org/abs/1712.01195v1
PDF	http://arxiv.org/pdf/1712.01195v1.pdf
PWC	https://paperswithcode.com/paper/why-my-photos-look-sideways-or-upside-down
Repo
Framework
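
One way to picture the training data for such a task is sketched below. It assumes the problem is framed as four-way classification over rotations of 0, 90, 180 and 270 degrees, which the abstract does not state explicitly; the helper names are hypothetical, and images are stood in for by small nested lists.

```python
def rotate90_cw(grid):
    """Rotate a 2-D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def make_orientation_samples(upright):
    """Return (rotated_grid, label) pairs for labels 0..3.

    Label k means the upright image was rotated clockwise by k * 90 degrees.
    """
    samples = []
    current = [list(row) for row in upright]
    for label in range(4):
        samples.append((current, label))
        current = rotate90_cw(current)
    return samples


# Tiny 2x2 "image" used purely for illustration.
samples = make_orientation_samples([[1, 2], [3, 4]])
```

A classifier trained on such pairs predicts the label, and the image is corrected by rotating it back by the predicted amount.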

Modeling Retinal Ganglion Cell Population Activity with Restricted Boltzmann Machines


Title	Modeling Retinal Ganglion Cell Population Activity with Restricted Boltzmann Machines
Authors	Matteo Zanotto, Riccardo Volpi, Alessandro Maccione, Luca Berdondini, Diego Sona, Vittorio Murino
Abstract	The retina is a complex nervous system which encodes visual stimuli before higher order processing occurs in the visual cortex. In this study we evaluated whether information about the stimuli received by the retina can be retrieved from the firing rate distribution of Retinal Ganglion Cells (RGCs), exploiting High-Density 64x64 MEA technology. To this end, we modeled the RGC population activity using mean-covariance Restricted Boltzmann Machines, latent variable models capable of learning the joint distribution of a set of continuous observed random variables and a set of binary unobserved random units. The idea was to determine whether binary latent states encode the regularities associated with different visual stimuli, as modes in the joint distribution. We measured the goodness of mcRBM encoding by calculating the Mutual Information between the latent states and the stimuli shown to the retina. Results show that binary states can encode the regularities associated with different stimuli, using both gratings and natural scenes as stimuli. We also discovered that hidden variables encode interesting properties of retinal activity, interpreted as population receptive fields. We further investigated the ability of the model to learn different modes in population activity by comparing results associated with a retina in normal conditions and after pharmacologically blocking GABA receptors (GABAC at first, and then also GABAA and GABAB). As expected, Mutual Information tends to decrease if we pharmacologically block receptors. We finally stress that the computational method described in this work could potentially be applied to any kind of neural data obtained through MEA technology, though different techniques should be applied to interpret the results.
Tasks	Latent Variable Models
Published	2017-01-11
URL	http://arxiv.org/abs/1701.02898v2
PDF	http://arxiv.org/pdf/1701.02898v2.pdf
PWC	https://paperswithcode.com/paper/modeling-retinal-ganglion-cell-population
Repo
Framework
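
The mutual information measure mentioned in this abstract can be sketched for the discrete case as follows. This is a simple plug-in estimate from paired samples, assuming each latent state and each stimulus is represented by a hashable label; the abstract does not say which estimator the authors used, and the function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy / (p_x * p_y) simplifies to c * n / (count_x * count_y)
        mi += p_xy * log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

When the latent state fully determines the stimulus, the estimate equals the entropy of the stimulus labels; when the two are independent in the sample, it is zero.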

Attribute-controlled face photo synthesis from simple line drawing


Title	Attribute-controlled face photo synthesis from simple line drawing
Authors	Qi Guo, Ce Zhu, Zhiqiang Xia, Zhengtao Wang, Yipeng Liu
Abstract	Face photo synthesis from a simple line drawing is a one-to-many task, as a simple line drawing merely contains the contour of a human face. Previous exemplar-based methods are over-dependent on the datasets and are hard to generalize to complicated natural scenes. Recently, several works have used deep neural networks to improve generalization, but they still offer users limited controllability. In this paper, we propose a deep generative model to synthesize face photos from simple line drawings, controlled by face attributes such as hair color and complexion. To maximize the controllability of face attributes, an attribute-disentangled variational auto-encoder (AD-VAE) is first introduced to learn latent representations disentangled with respect to specified attributes. Then we conduct photo synthesis from simple line drawings based on AD-VAE. Experiments show that our model can disentangle the variations of attributes from other variations of face photos well and synthesize detailed photorealistic face images with desired attributes. Regarding background and illumination as the style and the human face as the content, we can also synthesize face photos with the target style of a style photo.
Tasks
Published	2017-02-09
URL	http://arxiv.org/abs/1702.02805v1
PDF	http://arxiv.org/pdf/1702.02805v1.pdf
PWC	https://paperswithcode.com/paper/attribute-controlled-face-photo-synthesis
Repo
Framework

Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises


Title	Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises
Authors	Dat Tien Nguyen, Firoj Alam, Ferda Ofli, Muhammad Imran
Abstract	The extensive use of social media platforms, especially during disasters, creates unique opportunities for humanitarian organizations to gain situational awareness and launch relief operations accordingly. In addition to textual content, people post overwhelming amounts of imagery data on social networks within minutes of a disaster hitting. Studies point to the importance of this online imagery content for emergency response. Despite recent advances in the computer vision field, automatic processing of crisis-related social media imagery data remains a challenging task, because the majority of it consists of redundant and irrelevant content. In this paper, we present an image processing pipeline that comprises de-duplication and relevancy filtering mechanisms to collect and filter social media image content in real-time during a crisis event. Results obtained from extensive experiments on real-world crisis datasets demonstrate the significance of the proposed pipeline for optimal utilization of both human and machine computing resources.
Tasks
Published	2017-04-09
URL	http://arxiv.org/abs/1704.02602v1
PDF	http://arxiv.org/pdf/1704.02602v1.pdf
PWC	https://paperswithcode.com/paper/automatic-image-filtering-on-social-networks
Repo
Framework
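
The de-duplication step can be pictured with a minimal perceptual-hash sketch. It uses an average hash, one of the simplest perceptual hashes, which may differ from the hash the authors chose; images are assumed to be already reduced to small grayscale grids of equal size, and the function names and threshold are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_near_duplicate(img_a, img_b, max_distance=1):
    """Treat two images as duplicates if their hashes are close enough."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_distance


original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]   # same picture, uniformly brightened
inverted = [[200, 10], [30, 220]]   # a different picture
```

Because the hash depends on each pixel's brightness relative to the mean, a uniformly brightened copy hashes identically, while a structurally different image lands far away in Hamming distance.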

Segmenting Dermoscopic Images


Title	Segmenting Dermoscopic Images
Authors	Mario Rosario Guarracino, Lucia Maddalena
Abstract	We propose an automatic algorithm, named SDI, for the segmentation of skin lesions in dermoscopic images, articulated into three main steps: selection of the image ROI, selection of the segmentation band, and segmentation. We present extensive experimental results achieved by the SDI algorithm on the lesion segmentation dataset made available for the ISIC 2017 challenge on Skin Lesion Analysis Towards Melanoma Detection, highlighting its advantages and disadvantages.
Tasks	Lesion Segmentation
Published	2017-03-09
URL	http://arxiv.org/abs/1703.03186v1
PDF	http://arxiv.org/pdf/1703.03186v1.pdf
PWC	https://paperswithcode.com/paper/segmenting-dermoscopic-images
Repo
Framework

Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction


Title	Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Authors	Chunting Zhou, Graham Neubig
Abstract	Labeled sequence transduction is the task of transforming one sequence into another sequence that satisfies desiderata specified by a set of labels. In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. The generative model can use neural networks to handle both discrete and continuous latent variables to exploit various features of data. Experiments show that our model not only provides a powerful supervised framework but can also effectively take advantage of unlabeled data. On the SIGMORPHON morphological inflection benchmark, our model outperforms single-model state-of-the-art results by a large margin for the majority of languages.
Tasks	Morphological Inflection
Published	2017-04-06
URL	http://arxiv.org/abs/1704.01691v2
PDF	http://arxiv.org/pdf/1704.01691v2.pdf
PWC	https://paperswithcode.com/paper/multi-space-variational-encoder-decoders-for
Repo
Framework

DeepFace: Face Generation using Deep Learning


Title	DeepFace: Face Generation using Deep Learning
Authors	Hardie Cate, Fahim Dalvi, Zeshan Hussain
Abstract	We use CNNs to build a system that both classifies images of faces based on a variety of different facial attributes and generates new faces given a set of desired facial characteristics. After introducing the problem and providing context in the first section, we discuss recent work related to image generation in Section 2. In Section 3, we describe the methods used to fine-tune our CNN and generate new images using a novel approach inspired by a Gaussian mixture model. In Section 4, we discuss our working dataset and describe our preprocessing steps and handling of facial attributes. Finally, in Sections 5, 6 and 7, we explain our experiments and results and conclude in the following section. Our classification system has 82% test accuracy. Furthermore, our generation pipeline successfully creates well-formed faces.
Tasks	Face Generation, Image Generation
Published	2017-01-07
URL	http://arxiv.org/abs/1701.01876v1
PDF	http://arxiv.org/pdf/1701.01876v1.pdf
PWC	https://paperswithcode.com/paper/deepface-face-generation-using-deep-learning
Repo
Framework

Using Deep Learning Method for Classification: A Proposed Algorithm for the ISIC 2017 Skin Lesion Classification Challenge


Title	Using Deep Learning Method for Classification: A Proposed Algorithm for the ISIC 2017 Skin Lesion Classification Challenge
Authors	Wenhao Zhang, Liangcai Gao, Runtao Liu
Abstract	Skin cancer, the most common human malignancy, is primarily diagnosed visually by physicians [1]. Classification with an automated method like CNN [2, 3] shows potential for challenging tasks [1]. Deep convolutional neural networks are now on par with human dermatologists [1]. This abstract is dedicated to developing a deep learning method for the ISIC [5] 2017 Skin Lesion Detection Competition hosted at [6] to classify dermatology pictures, with the aim of improving the diagnostic accuracy rate and the general level of human health. The challenge is divided into three sub-challenges: Lesion Segmentation, Lesion Dermoscopic Feature Extraction and Lesion Classification. This project participates only in the Lesion Classification part. The algorithm comprises three steps: (1) preprocessing the original images, (2) modelling the processed images using a CNN [2, 3] in the Caffe [4] framework, (3) predicting on the test images and calculating the scores that represent the likelihood of the corresponding classification. The models are built on the source images using the Caffe [4] framework. The scores in the prediction step are obtained by two different models from the source images.
Tasks	Lesion Segmentation, Skin Lesion Classification
Published	2017-03-07
URL	http://arxiv.org/abs/1703.02182v2
PDF	http://arxiv.org/pdf/1703.02182v2.pdf
PWC	https://paperswithcode.com/paper/using-deep-learning-method-for-classification
Repo
Framework

A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification


Title	A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification
Authors	Xulei Yang, Zeng Zeng, Si Yong Yeo, Colin Tan, Hong Liang Tey, Yi Su
Abstract	In this study, a multi-task deep neural network is proposed for skin lesion analysis. The proposed multi-task learning model solves different tasks (e.g., lesion segmentation and two independent binary lesion classifications) at the same time by exploiting commonalities and differences across tasks. This results in improved learning efficiency and potential prediction accuracy for the task-specific models, when compared to training the individual models separately. The proposed multi-task deep learning model is trained and evaluated on the dermoscopic image sets from the International Skin Imaging Collaboration (ISIC) 2017 Challenge - Skin Lesion Analysis towards Melanoma Detection, which consists of 2000 training samples and 150 evaluation samples. The experimental results show that the proposed multi-task deep learning model achieves promising performances on skin lesion segmentation and classification. The average value of Jaccard index for lesion segmentation is 0.724, while the average values of area under the receiver operating characteristic curve (AUC) on two individual lesion classifications are 0.880 and 0.972, respectively.
Tasks	Lesion Segmentation, Multi-Task Learning
Published	2017-03-03
URL	http://arxiv.org/abs/1703.01025v1
PDF	http://arxiv.org/pdf/1703.01025v1.pdf
PWC	https://paperswithcode.com/paper/a-novel-multi-task-deep-learning-model-for
Repo
Framework
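
The Jaccard index reported for segmentation is the ratio of the overlap to the union of the predicted and ground-truth lesion masks. A minimal sketch follows, assuming binary masks stored as equally sized nested lists of 0 and 1; the function name and the convention of returning 1.0 for two empty masks are assumptions, not taken from the challenge rules.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two binary masks of the same shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
score = jaccard_index(predicted, ground_truth)
```

Here one pixel is shared and two pixels are covered by either mask, so the score is one half; a challenge-level figure such as the 0.724 above would be an average of such scores over all evaluation images.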

Skin Lesion Analysis Towards Melanoma Detection Using Deep Learning Network


Title	Skin Lesion Analysis Towards Melanoma Detection Using Deep Learning Network
Authors	Yuexiang Li, Linlin Shen
Abstract	Skin lesions are a severe disease worldwide. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, accurate recognition of melanoma is extremely challenging for several reasons, e.g. low contrast between lesions and skin and visual similarity between melanoma and non-melanoma lesions. Hence, reliable automatic detection of skin tumors is very useful for increasing the accuracy and efficiency of pathologists. International Skin Imaging Collaboration (ISIC) is a challenge focusing on the automatic analysis of skin lesions. In this paper, we propose two deep learning methods to address all three tasks announced in ISIC 2017, i.e. lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully-convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straightforward CNN is proposed for the dermoscopic feature extraction task. To the best of our knowledge, no previous work has been proposed for this task. The proposed deep learning frameworks were evaluated on the ISIC 2017 testing set. Experimental results show the promising accuracies of our frameworks: 0.718 for task 1, 0.833 for task 2 and 0.823 for task 3.
Tasks	Lesion Segmentation
Published	2017-03-02
URL	http://arxiv.org/abs/1703.00577v2
PDF	http://arxiv.org/pdf/1703.00577v2.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-analysis-towards-melanoma
Repo
Framework

Skin cancer reorganization and classification with deep neural network


Title	Skin cancer reorganization and classification with deep neural network
Authors	Hao Chang
Abstract	As one kind of skin cancer, melanoma is very dangerous. A dermoscopy-based early detection and recognition strategy is critical for melanoma therapy. However, diagnostic accuracy depends largely on well-trained dermatologists. To solve this problem, many efforts focus on developing automatic image analysis systems. Here we report a novel strategy based on deep learning that achieves very high skin lesion segmentation and melanoma diagnosis accuracy: 1) we build a segmentation neural network (skin_segnn), which achieves very high lesion boundary detection accuracy; 2) we build another very deep neural network (skin_recnn) based on the Google Inception v3 network and its well-trained weights. The newly designed transfer-learning-based deep neural network skin_inceptions_v3_nn helps to achieve a high prediction accuracy.
Tasks	Boundary Detection, Lesion Segmentation, Transfer Learning
Published	2017-03-01
URL	http://arxiv.org/abs/1703.00534v1
PDF	http://arxiv.org/pdf/1703.00534v1.pdf
PWC	https://paperswithcode.com/paper/skin-cancer-reorganization-and-classification
Repo
Framework

ISIC 2017 - Skin Lesion Analysis Towards Melanoma Detection


Title	ISIC 2017 - Skin Lesion Analysis Towards Melanoma Detection
Authors	Matt Berseth
Abstract	Our system addresses Part 1, Lesion Segmentation and Part 3, Lesion Classification of the ISIC 2017 challenge. Both algorithms make use of deep convolutional networks to achieve the challenge objective.
Tasks	Lesion Segmentation
Published	2017-03-01
URL	http://arxiv.org/abs/1703.00523v1
PDF	http://arxiv.org/pdf/1703.00523v1.pdf
PWC	https://paperswithcode.com/paper/isic-2017-skin-lesion-analysis-towards
Repo
Framework

Infinite Sparse Structured Factor Analysis


Title	Infinite Sparse Structured Factor Analysis
Authors	Matthew C. Pearce, Simon R. White
Abstract	Matrix factorisation methods decompose multivariate observations as linear combinations of latent feature vectors. The Indian Buffet Process (IBP) provides a way to model the number of latent features required for a good approximation in terms of regularised reconstruction error. Previous work has focussed on latent feature vectors with independent entries. We extend the model to include nondiagonal latent covariance structures representing characteristics such as smoothness. This is done by . Using simulations we demonstrate that under appropriate conditions a smoothness prior helps to recover the true latent features, while denoising more accurately. We demonstrate our method on a real neuroimaging dataset, where computational tractability is a sufficient challenge that the efficient strategy presented here is essential.
Tasks	Denoising
Published	2017-04-13
URL	http://arxiv.org/abs/1704.04031v1
PDF	http://arxiv.org/pdf/1704.04031v1.pdf
PWC	https://paperswithcode.com/paper/infinite-sparse-structured-factor-analysis
Repo
Framework