Paper Group ANR 222
Low Precision RNNs: Quantizing RNNs Without Losing Accuracy
Title | Low Precision RNNs: Quantizing RNNs Without Losing Accuracy |
Authors | Supriya Kapur, Asit Mishra, Debbie Marr |
Abstract | Similar to convolutional neural networks, recurrent neural networks (RNNs) typically suffer from over-parameterization. Quantizing the bit-widths of weights and activations improves runtime efficiency on hardware, yet it often comes at the cost of reduced accuracy. This paper proposes a quantization approach that increases model size as bit-width is reduced. This approach allows networks to perform at their baseline accuracy while still maintaining the benefits of reduced precision and overall model size reduction. |
Tasks | Quantization |
Published | 2017-10-20 |
URL | http://arxiv.org/abs/1710.07706v1 |
PDF | http://arxiv.org/pdf/1710.07706v1.pdf |
PWC | https://paperswithcode.com/paper/low-precision-rnns-quantizing-rnns-without |
Repo | |
Framework | |
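As a rough illustration of the trade-off described in the abstract above, the following sketch quantizes a list of weights to a signed uniform grid and compares storage cost at two bit-widths. It is not the authors' method: the function names `quantize_uniform` and `model_bits`, the symmetric rounding scheme, and the parameter counts in the usage note are all assumptions made for illustration.

```python
def quantize_uniform(values, bits):
    """Symmetric uniform quantization of floats to a signed `bits`-bit grid.

    Illustrative assumption: one scale per list, derived from the largest
    absolute value. Real quantization schemes vary.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed symmetric grid")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0.0 for _ in values]
    levels = 2 ** (bits - 1) - 1   # e.g. 127 for 8 bits, 7 for 4 bits
    scale = max_abs / levels       # real-valued step between adjacent levels
    # Round to the nearest integer level, clamp, then map back to floats.
    return [max(-levels, min(levels, round(v / scale))) * scale for v in values]


def model_bits(num_weights, bits):
    """Storage cost in bits for `num_weights` parameters at a given bit-width."""
    return num_weights * bits
```

With hypothetical numbers, a network that doubles its parameter count from 1,000,000 to 2,000,000 while dropping from 32-bit to 4-bit weights still stores fewer bits overall (`model_bits(2_000_000, 4)` versus `model_bits(1_000_000, 32)`), which is the kind of net size reduction the abstract refers to.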
Learning to Imagine Manipulation Goals for Robot Task Planning
Title | Learning to Imagine Manipulation Goals for Robot Task Planning |
Authors | Chris Paxton, Kapil Katyal, Christian Rupprecht, Raman Arora, Gregory D. Hager |
Abstract | Prospection is an important part of how humans come up with new task plans, but has not been explored in depth in robotics. Predicting multiple task-level outcomes is a challenging problem that involves capturing both task semantics and continuous variability over the state of the world. Ideally, we would combine the ability of machine learning to leverage big data for learning the semantics of a task with techniques from task planning that reliably generalize to new environments. In this work, we propose a method for learning a model encoding just such a representation for task planning. We learn a neural net that encodes the $k$ most likely outcomes of high-level actions from a given world. Our approach creates comprehensible task plans that allow us to predict changes to the environment many time steps into the future. We demonstrate this approach via application to a stacking task in a cluttered environment, where the robot must select between different colored blocks while avoiding obstacles in order to perform a task. We also show results on a simple navigation task. Our algorithm generates realistic image and pose predictions at multiple points in a given task. |
Tasks | Robot Task Planning |
Published | 2017-11-08 |
URL | http://arxiv.org/abs/1711.02783v2 |
PDF | http://arxiv.org/pdf/1711.02783v2.pdf |
PWC | https://paperswithcode.com/paper/learning-to-imagine-manipulation-goals-for |
Repo | |
Framework | |
Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks
Title | Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks |
Authors | Kunal Swami, Pranav P. Deshpande, Gaurav Khandelwal, Ajay Vijayvargiya |
Abstract | Image orientation detection requires high-level scene understanding. Humans use object recognition and contextual scene information to correctly orient images. In the literature, the problem of image orientation detection is mostly addressed using low-level vision features, while some approaches incorporate a few easily detectable semantic cues to gain minor improvements. The vast amount of semantic content in images makes orientation detection challenging, and therefore there is a large semantic gap between existing methods and human behavior. Also, existing methods in the literature report highly discrepant detection rates, mainly due to large differences in datasets and the limited variety of test images used for evaluation. In this work, for the first time, we leverage the power of deep learning and adapt pre-trained convolutional neural networks, using the largest training dataset to date for the image orientation detection task. An extensive evaluation of our model on different public datasets shows that it generalizes remarkably well to correctly orient a large set of unconstrained images; it also significantly outperforms the state-of-the-art and achieves accuracy very close to that of humans. |
Tasks | Object Recognition, Scene Understanding |
Published | 2017-12-04 |
URL | http://arxiv.org/abs/1712.01195v1 |
PDF | http://arxiv.org/pdf/1712.01195v1.pdf |
PWC | https://paperswithcode.com/paper/why-my-photos-look-sideways-or-upside-down |
Repo | |
Framework | |
Modeling Retinal Ganglion Cell Population Activity with Restricted Boltzmann Machines
Title | Modeling Retinal Ganglion Cell Population Activity with Restricted Boltzmann Machines |
Authors | Matteo Zanotto, Riccardo Volpi, Alessandro Maccione, Luca Berdondini, Diego Sona, Vittorio Murino |
Abstract | The retina is a complex nervous system which encodes visual stimuli before higher order processing occurs in the visual cortex. In this study we evaluated whether information about the stimuli received by the retina can be retrieved from the firing rate distribution of Retinal Ganglion Cells (RGCs), exploiting High-Density 64x64 MEA technology. To this end, we modeled the RGC population activity using mean-covariance Restricted Boltzmann Machines (mcRBMs), latent variable models capable of learning the joint distribution of a set of continuous observed random variables and a set of binary unobserved random units. The aim was to determine whether binary latent states encode the regularities associated with different visual stimuli, as modes in the joint distribution. We measured the goodness of mcRBM encoding by calculating the Mutual Information between the latent states and the stimuli shown to the retina. Results show that binary states can encode the regularities associated with different stimuli, using both gratings and natural scenes as stimuli. We also discovered that hidden variables encode interesting properties of retinal activity, interpreted as population receptive fields. We further investigated the ability of the model to learn different modes in population activity by comparing results associated with a retina in normal conditions and after pharmacologically blocking GABA receptors (GABAC at first, and then also GABAA and GABAB). As expected, Mutual Information tends to decrease if we pharmacologically block receptors. We finally stress that the computational method described in this work could potentially be applied to any kind of neural data obtained through MEA technology, though different techniques should be applied to interpret the results. |
Tasks | Latent Variable Models |
Published | 2017-01-11 |
URL | http://arxiv.org/abs/1701.02898v2 |
PDF | http://arxiv.org/pdf/1701.02898v2.pdf |
PWC | https://paperswithcode.com/paper/modeling-retinal-ganglion-cell-population |
Repo | |
Framework | |
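The abstract above scores the encoding by the Mutual Information between latent states and stimuli. A minimal sketch of a plug-in estimate from paired discrete samples might look like the following. The function name `mutual_information`, the use of hashable labels for latent states, and the choice of a simple count-based estimator are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(states, stimuli):
    """Plug-in estimate of I(state; stimulus) in bits from paired samples.

    `states` and `stimuli` are equal-length sequences of hashable labels,
    e.g. a tuple of binary hidden units and a stimulus identifier.
    """
    if len(states) != len(stimuli):
        raise ValueError("states and stimuli must have the same length")
    n = len(states)
    if n == 0:
        raise ValueError("at least one sample is required")
    joint = Counter(zip(states, stimuli))
    state_counts = Counter(states)
    stim_counts = Counter(stimuli)
    mi = 0.0
    for (s, x), count in joint.items():
        # p(s, x) / (p(s) p(x)) simplifies to count * n / (c_s * c_x).
        ratio = count * n / (state_counts[s] * stim_counts[x])
        mi += (count / n) * log2(ratio)
    return mi
```

When every latent state maps to exactly one of two equally frequent stimuli the estimate is 1 bit, and when states and stimuli are independent it is 0. Plug-in estimates of this kind are biased upward for small sample sizes, so they are best read as a rough indicator.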
Attribute-controlled face photo synthesis from simple line drawing
Title | Attribute-controlled face photo synthesis from simple line drawing |
Authors | Qi Guo, Ce Zhu, Zhiqiang Xia, Zhengtao Wang, Yipeng Liu |
Abstract | Face photo synthesis from a simple line drawing is a one-to-many task, as a simple line drawing merely contains the contour of a human face. Previous exemplar-based methods are over-dependent on the datasets and are hard to generalize to complicated natural scenes. Recently, several works have used deep neural networks to improve generalization, but they still offer users limited control. In this paper, we propose a deep generative model to synthesize face photos from simple line drawings, controlled by face attributes such as hair color and complexion. In order to maximize the controllability of face attributes, an attribute-disentangled variational auto-encoder (AD-VAE) is first introduced to learn latent representations disentangled with respect to specified attributes. Then we conduct photo synthesis from simple line drawings based on AD-VAE. Experiments show that our model can disentangle the variations of attributes from other variations of face photos and synthesize detailed photorealistic face images with desired attributes. Regarding background and illumination as the style and the human face as the content, we can also synthesize face photos with the target style of a style photo. |
Tasks | |
Published | 2017-02-09 |
URL | http://arxiv.org/abs/1702.02805v1 |
PDF | http://arxiv.org/pdf/1702.02805v1.pdf |
PWC | https://paperswithcode.com/paper/attribute-controlled-face-photo-synthesis |
Repo | |
Framework | |
Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises
Title | Automatic Image Filtering on Social Networks Using Deep Learning and Perceptual Hashing During Crises |
Authors | Dat Tien Nguyen, Firoj Alam, Ferda Ofli, Muhammad Imran |
Abstract | The extensive use of social media platforms, especially during disasters, creates unique opportunities for humanitarian organizations to gain situational awareness and launch relief operations accordingly. In addition to textual content, people post overwhelming amounts of imagery data on social networks within minutes of a disaster striking. Studies point to the importance of this online imagery content for emergency response. Despite recent advances in the computer vision field, automatic processing of crisis-related social media imagery data remains a challenging task, because a majority of it consists of redundant and irrelevant content. In this paper, we present an image processing pipeline that comprises de-duplication and relevancy filtering mechanisms to collect and filter social media image content in real time during a crisis event. Results obtained from extensive experiments on real-world crisis datasets demonstrate the significance of the proposed pipeline for optimal utilization of both human and machine computing resources. |
Tasks | |
Published | 2017-04-09 |
URL | http://arxiv.org/abs/1704.02602v1 |
PDF | http://arxiv.org/pdf/1704.02602v1.pdf |
PWC | https://paperswithcode.com/paper/automatic-image-filtering-on-social-networks |
Repo | |
Framework | |
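The de-duplication step mentioned in the abstract above can be sketched with a very simple perceptual hash. The example below uses an average hash (each bit records whether a pixel is brighter than the image mean) and a Hamming-distance threshold. This is a stand-in technique chosen for brevity, not necessarily the hash used in the paper; the function names, the `max_distance` threshold, and the assumption that images arrive as small, equally sized grayscale grids are all illustrative.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean.

    `pixels` is a grayscale image given as a list of rows, assumed to be
    already downscaled to a small fixed size (an illustrative assumption).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def deduplicate(images, max_distance=1):
    """Keep the first image of each group of near-duplicates.

    `images` is a list of (name, pixels) pairs; two images count as
    duplicates when their hashes differ in at most `max_distance` bits.
    """
    kept = []  # (name, hash) pairs for images retained so far
    for name, pixels in images:
        h = average_hash(pixels)
        if all(hamming(h, kept_hash) > max_distance for _, kept_hash in kept):
            kept.append((name, h))
    return [name for name, _ in kept]
```

Comparing each new image against every retained hash is quadratic in the number of kept images, which is acceptable for a sketch but would likely need an index structure at real-time social media volumes.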
Segmenting Dermoscopic Images
Title | Segmenting Dermoscopic Images |
Authors | Mario Rosario Guarracino, Lucia Maddalena |
Abstract | We propose an automatic algorithm, named SDI, for the segmentation of skin lesions in dermoscopic images, articulated into three main steps: selection of the image ROI, selection of the segmentation band, and segmentation. We present extensive experimental results achieved by the SDI algorithm on the lesion segmentation dataset made available for the ISIC 2017 challenge on Skin Lesion Analysis Towards Melanoma Detection, highlighting its advantages and disadvantages. |
Tasks | Lesion Segmentation |
Published | 2017-03-09 |
URL | http://arxiv.org/abs/1703.03186v1 |
PDF | http://arxiv.org/pdf/1703.03186v1.pdf |
PWC | https://paperswithcode.com/paper/segmenting-dermoscopic-images |
Repo | |
Framework | |
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Title | Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction |
Authors | Chunting Zhou, Graham Neubig |
Abstract | Labeled sequence transduction is the task of transforming one sequence into another sequence that satisfies desiderata specified by a set of labels. In this paper we propose multi-space variational encoder-decoders, a new model for labeled sequence transduction with semi-supervised learning. The generative model can use neural networks to handle both discrete and continuous latent variables to exploit various features of data. Experiments show that our model not only provides a powerful supervised framework but can also effectively take advantage of unlabeled data. On the SIGMORPHON morphological inflection benchmark, our model outperforms single-model state-of-the-art results by a large margin for the majority of languages. |
Tasks | Morphological Inflection |
Published | 2017-04-06 |
URL | http://arxiv.org/abs/1704.01691v2 |
PDF | http://arxiv.org/pdf/1704.01691v2.pdf |
PWC | https://paperswithcode.com/paper/multi-space-variational-encoder-decoders-for |
Repo | |
Framework | |
DeepFace: Face Generation using Deep Learning
Title | DeepFace: Face Generation using Deep Learning |
Authors | Hardie Cate, Fahim Dalvi, Zeshan Hussain |
Abstract | We use CNNs to build a system that both classifies images of faces based on a variety of different facial attributes and generates new faces given a set of desired facial characteristics. After introducing the problem and providing context in the first section, we discuss recent work related to image generation in Section 2. In Section 3, we describe the methods used to fine-tune our CNN and generate new images using a novel approach inspired by a Gaussian mixture model. In Section 4, we discuss our working dataset and describe our preprocessing steps and handling of facial attributes. Finally, in Sections 5, 6 and 7, we explain our experiments and results and conclude in the following section. Our classification system has 82% test accuracy. Furthermore, our generation pipeline successfully creates well-formed faces. |
Tasks | Face Generation, Image Generation |
Published | 2017-01-07 |
URL | http://arxiv.org/abs/1701.01876v1 |
PDF | http://arxiv.org/pdf/1701.01876v1.pdf |
PWC | https://paperswithcode.com/paper/deepface-face-generation-using-deep-learning |
Repo | |
Framework | |
Using Deep Learning Method for Classification: A Proposed Algorithm for the ISIC 2017 Skin Lesion Classification Challenge
Title | Using Deep Learning Method for Classification: A Proposed Algorithm for the ISIC 2017 Skin Lesion Classification Challenge |
Authors | Wenhao Zhang, Liangcai Gao, Runtao Liu |
Abstract | Skin cancer, the most common human malignancy, is primarily diagnosed visually by physicians [1]. Classification with an automated method such as a CNN [2, 3] shows potential for challenging tasks [1]. Deep convolutional neural networks are now on par with human dermatologists [1]. This abstract describes the development of a deep learning method for the ISIC [5] 2017 Skin Lesion Detection Competition hosted at [6] to classify dermatology pictures, with the aim of improving diagnostic accuracy and the general level of human health. The challenge comprises three sub-challenges: Lesion Segmentation, Lesion Dermoscopic Feature Extraction and Lesion Classification. This project participates only in the Lesion Classification part. The algorithm comprises three steps: (1) preprocessing the original images, (2) modelling the processed images using a CNN [2, 3] in the Caffe [4] framework, (3) predicting on the test images and calculating scores that represent the likelihood of the corresponding classification. The models are built on the source images using the Caffe [4] framework. The scores in the prediction step are obtained by two different models from the source images. |
Tasks | Lesion Segmentation, Skin Lesion Classification |
Published | 2017-03-07 |
URL | http://arxiv.org/abs/1703.02182v2 |
PDF | http://arxiv.org/pdf/1703.02182v2.pdf |
PWC | https://paperswithcode.com/paper/using-deep-learning-method-for-classification |
Repo | |
Framework | |
A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification
Title | A Novel Multi-task Deep Learning Model for Skin Lesion Segmentation and Classification |
Authors | Xulei Yang, Zeng Zeng, Si Yong Yeo, Colin Tan, Hong Liang Tey, Yi Su |
Abstract | In this study, a multi-task deep neural network is proposed for skin lesion analysis. The proposed multi-task learning model solves different tasks (e.g., lesion segmentation and two independent binary lesion classifications) at the same time by exploiting commonalities and differences across tasks. This results in improved learning efficiency and potential prediction accuracy for the task-specific models, when compared to training the individual models separately. The proposed multi-task deep learning model is trained and evaluated on the dermoscopic image sets from the International Skin Imaging Collaboration (ISIC) 2017 Challenge - Skin Lesion Analysis towards Melanoma Detection, which consists of 2000 training samples and 150 evaluation samples. The experimental results show that the proposed multi-task deep learning model achieves promising performances on skin lesion segmentation and classification. The average value of Jaccard index for lesion segmentation is 0.724, while the average values of area under the receiver operating characteristic curve (AUC) on two individual lesion classifications are 0.880 and 0.972, respectively. |
Tasks | Lesion Segmentation, Multi-Task Learning |
Published | 2017-03-03 |
URL | http://arxiv.org/abs/1703.01025v1 |
PDF | http://arxiv.org/pdf/1703.01025v1.pdf |
PWC | https://paperswithcode.com/paper/a-novel-multi-task-deep-learning-model-for |
Repo | |
Framework | |
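The segmentation score reported in the abstract above is the Jaccard index, i.e. intersection over union of the predicted and ground-truth lesion masks. A minimal sketch for binary masks stored as nested lists follows. The function name `jaccard_index` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper or the challenge.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two same-shaped binary masks.

    `pred` and `truth` are lists of rows containing 0/1 values. Returning
    1.0 for two empty masks is a convention chosen here for illustration.
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return 1.0 if union == 0 else intersection / union
```

For example, a prediction that shares one lesion pixel with the ground truth while the two masks together cover three pixels scores 1/3. A dataset-level figure such as the 0.724 quoted above would be the mean of this value over all evaluation images.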
Skin Lesion Analysis Towards Melanoma Detection Using Deep Learning Network
Title | Skin Lesion Analysis Towards Melanoma Detection Using Deep Learning Network |
Authors | Yuexiang Li, Linlin Shen |
Abstract | Skin lesions are a severe disease worldwide. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, the accurate recognition of melanoma is extremely challenging for several reasons, e.g. low contrast between lesions and skin, and visual similarity between melanoma and non-melanoma lesions. Hence, reliable automatic detection of skin tumors is very useful for increasing the accuracy and efficiency of pathologists. International Skin Imaging Collaboration (ISIC) is a challenge focusing on the automatic analysis of skin lesions. In this paper, we propose two deep learning methods to address all three tasks announced in ISIC 2017, i.e. lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2) and lesion classification (task 3). A deep learning framework consisting of two fully-convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straightforward CNN is proposed for the dermoscopic feature extraction task. To the best of our knowledge, there is no previous work proposed for this task. The proposed deep learning frameworks were evaluated on the ISIC 2017 testing set. Experimental results show the promising accuracies of our frameworks: 0.718 for task 1, 0.833 for task 2 and 0.823 for task 3. |
Tasks | Lesion Segmentation |
Published | 2017-03-02 |
URL | http://arxiv.org/abs/1703.00577v2 |
PDF | http://arxiv.org/pdf/1703.00577v2.pdf |
PWC | https://paperswithcode.com/paper/skin-lesion-analysis-towards-melanoma |
Repo | |
Framework | |
Skin cancer reorganization and classification with deep neural network
Title | Skin cancer reorganization and classification with deep neural network |
Authors | Hao Chang |
Abstract | As one kind of skin cancer, melanoma is very dangerous. A dermoscopy-based early detection and recognition strategy is critical for melanoma therapy. However, diagnostic accuracy depends heavily on well-trained dermatologists. To address this problem, many efforts focus on developing automatic image analysis systems. Here we report a novel strategy based on deep learning that achieves very high skin lesion segmentation and melanoma diagnosis accuracy: 1) we build a segmentation neural network (skin_segnn), which achieves very high lesion boundary detection accuracy; 2) we build another very deep neural network (skin_recnn) based on the Google Inception v3 network and its pre-trained weights. The newly designed transfer-learning-based deep neural network skin_inceptions_v3_nn helps to achieve high prediction accuracy. |
Tasks | Boundary Detection, Lesion Segmentation, Transfer Learning |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00534v1 |
PDF | http://arxiv.org/pdf/1703.00534v1.pdf |
PWC | https://paperswithcode.com/paper/skin-cancer-reorganization-and-classification |
Repo | |
Framework | |
ISIC 2017 - Skin Lesion Analysis Towards Melanoma Detection
Title | ISIC 2017 - Skin Lesion Analysis Towards Melanoma Detection |
Authors | Matt Berseth |
Abstract | Our system addresses Part 1, Lesion Segmentation and Part 3, Lesion Classification of the ISIC 2017 challenge. Both algorithms make use of deep convolutional networks to achieve the challenge objective. |
Tasks | Lesion Segmentation |
Published | 2017-03-01 |
URL | http://arxiv.org/abs/1703.00523v1 |
PDF | http://arxiv.org/pdf/1703.00523v1.pdf |
PWC | https://paperswithcode.com/paper/isic-2017-skin-lesion-analysis-towards |
Repo | |
Framework | |
Infinite Sparse Structured Factor Analysis
Title | Infinite Sparse Structured Factor Analysis |
Authors | Matthew C. Pearce, Simon R. White |
Abstract | Matrix factorisation methods decompose multivariate observations as linear combinations of latent feature vectors. The Indian Buffet Process (IBP) provides a way to model the number of latent features required for a good approximation in terms of regularised reconstruction error. Previous work has focussed on latent feature vectors with independent entries. We extend the model to include nondiagonal latent covariance structures representing characteristics such as smoothness. This is done by . Using simulations we demonstrate that under appropriate conditions a smoothness prior helps to recover the true latent features, while denoising more accurately. We demonstrate our method on a real neuroimaging dataset, where computational tractability is a sufficient challenge that the efficient strategy presented here is essential. |
Tasks | Denoising |
Published | 2017-04-13 |
URL | http://arxiv.org/abs/1704.04031v1 |
PDF | http://arxiv.org/pdf/1704.04031v1.pdf |
PWC | https://paperswithcode.com/paper/infinite-sparse-structured-factor-analysis |
Repo | |
Framework | |