Paper Group ANR 428
Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images
Title | Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images |
Authors | Dongwon Park, Yonghyeok Seo, Se Young Chun |
Abstract | Robotic grasp detection for novel objects is a challenging task, but for the last few years, deep learning based approaches have achieved remarkable performance improvements, up to 96.1% accuracy, with RGB-D data. In this paper, we propose fully convolutional neural network (FCNN) based methods for robotic grasp detection. Our methods also achieved state-of-the-art detection accuracy (up to 96.6%) with state-of-the-art real-time computation time for high-resolution images (6-20ms per 360x360 image) on the Cornell dataset. Because they are fully convolutional, our proposed methods can be applied to images of any size for detecting multiple grasps on multiple objects. The proposed methods were evaluated using a 4-axis robot arm with a small parallel gripper and an RGB-D camera for grasping challenging small, novel objects. With accurate vision-robot coordinate calibration through our proposed learning-based, fully automatic approach, our proposed method yielded a 90% success rate. |
Tasks | Calibration |
Published | 2018-09-16 |
URL | https://arxiv.org/abs/1809.05828v2 |
https://arxiv.org/pdf/1809.05828v2.pdf | |
PWC | https://paperswithcode.com/paper/real-time-highly-accurate-robotic-grasp |
Repo | |
Framework | |
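The grasp detector above is fully convolutional, which is what allows arbitrary input sizes and multiple grasps per image. A minimal PyTorch sketch of that property follows; the layer sizes and the 6-channel output head (confidence, offsets, angle, width) are illustrative assumptions, not the authors' architecture.

```python
# Hedged sketch of a fully convolutional grasp-detection head (PyTorch).
# The channel layout (confidence, dx, dy, sin/cos of angle, width) is an
# illustrative assumption, not the paper's exact architecture.
import torch
import torch.nn as nn

class GraspFCNN(nn.Module):
    def __init__(self, in_channels=4):  # e.g. RGB-D input
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Dense per-location grasp predictions: no fully connected layers,
        # so any input resolution (e.g. 360x360) is accepted.
        self.head = nn.Conv2d(128, 6, 1)

    def forward(self, x):
        return self.head(self.features(x))

# Works for any spatial size; multiple objects simply yield multiple
# high-confidence cells in the output grid.
maps = GraspFCNN()(torch.randn(1, 4, 360, 360))
print(maps.shape)  # torch.Size([1, 6, 90, 90])
```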
A Navigational Approach to Health
Title | A Navigational Approach to Health |
Authors | Nitish Nag, Ramesh Jain |
Abstract | Lifestyle and environment interacting with our biological machine are primarily responsible for shaping our health and wellbeing. Continuous, multi-modal, and quantitative approaches to understanding and controlling these factors will allow each person to better reach their desired quality of life. A navigational paradigm can help users towards a specified health goal by using constantly captured measurements to feed estimations of how a user’s health is continuously changing in order to provide actionable guidance. As various actions are taken by the user, measurements of the resulting effects loop back into the estimation and the next step of guidance. This perpetual cycle of measuring, estimating, guiding, and acting articulates a Personal Health Navigation information and actuation framework. Personal Health Navigation focuses on fulfilling a user’s health goals by ensuring minimal deviation from healthy states, rather than treating disease or symptoms after derailment from proper biological function. |
Tasks | |
Published | 2018-12-04 |
URL | http://arxiv.org/abs/1812.01638v1 |
http://arxiv.org/pdf/1812.01638v1.pdf | |
PWC | https://paperswithcode.com/paper/a-navigational-approach-to-health |
Repo | |
Framework | |
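The measure-estimate-guide-act cycle described above is essentially a feedback control loop. The toy Python sketch below only illustrates that loop structure; the scalar "health state", the smoothing estimator, and the proportional guidance rule are all hypothetical placeholders, not the Personal Health Navigation framework itself.

```python
# Toy illustration of the measure -> estimate -> guide -> act loop.
# Everything here (a single scalar health state, random drift, proportional
# guidance) is a hypothetical placeholder for the framework's components.
import random

def navigate(goal=0.0, steps=10, tolerance=0.1):
    state_estimate = 0.0
    true_state = 2.0                                  # user starts off the goal
    for t in range(steps):
        measurement = true_state + random.gauss(0, 0.05)           # measure
        state_estimate = 0.5 * state_estimate + 0.5 * measurement  # estimate
        deviation = state_estimate - goal
        guidance = 0.0 if abs(deviation) < tolerance else -0.3 * deviation  # guide
        true_state += guidance + random.gauss(0, 0.02)             # act (and drift)
        print(f"step {t}: estimate={state_estimate:.2f}, guidance={guidance:.2f}")

navigate()
```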
A general metric for identifying adversarial images
Title | A general metric for identifying adversarial images |
Authors | Siddharth Krishna Kumar |
Abstract | It is well known that a determined adversary can fool a neural network by making imperceptible adversarial perturbations to an image. Recent studies have shown that these perturbations can be detected even without information about the neural network if the strategy taken by the adversary is known beforehand. Unfortunately, these studies suffer from the generalization limitation – the detection method has to be recalibrated every time the adversary changes his strategy. In this study, we attempt to overcome the generalization limitation by deriving a metric which reliably identifies adversarial images even when the approach taken by the adversary is unknown. Our metric leverages key differences between the spectra of clean and adversarial images when an image is treated as a matrix. Our metric is able to detect adversarial images across different datasets and attack strategies without any additional re-calibration. In addition, our approach provides geometric insights into several unanswered questions about adversarial perturbations. |
Tasks | Calibration |
Published | 2018-07-26 |
URL | http://arxiv.org/abs/1807.10335v1 |
http://arxiv.org/pdf/1807.10335v1.pdf | |
PWC | https://paperswithcode.com/paper/a-general-metric-for-identifying-adversarial |
Repo | |
Framework | |
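The metric above compares the spectra of clean and adversarial images treated as matrices. The numpy sketch below shows the basic ingredient, computing the singular-value spectrum of an image matrix; the tail-energy statistic used for comparison here is an assumption, not the paper's actual metric.

```python
# Hedged sketch: treat a grayscale image as a matrix and inspect its
# singular-value spectrum. The comparison statistic below (tail energy of the
# spectrum) is an illustrative assumption, not the paper's exact metric.
import numpy as np

def spectrum(image):
    """Singular values of an image treated as a matrix, largest first."""
    return np.linalg.svd(image, compute_uv=False)

def tail_energy(image, k=10):
    """Fraction of spectral energy outside the top-k singular values."""
    s = spectrum(image)
    return 1.0 - (s[:k] ** 2).sum() / (s ** 2).sum()

rng = np.random.default_rng(0)
clean = rng.random((224, 224))                                   # stand-in image
adversarial = clean + 0.02 * rng.standard_normal(clean.shape)    # stand-in perturbation
print(tail_energy(clean), tail_energy(adversarial))
```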
Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces
Title | Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces |
Authors | Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass |
Abstract | Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper we target learning a cross-modal alignment between the embedding spaces of speech and text learned from corpora of their respective modalities in an unsupervised fashion. The proposed framework learns the individual speech and text embedding spaces, and attempts to align the two spaces via adversarial training, followed by a refinement procedure. We show how our framework could be used to perform spoken word classification and translation, and the results on these two tasks demonstrate that the performance of our unsupervised alignment approach is comparable to its supervised counterpart. Our framework is especially useful for developing automatic speech recognition (ASR) and speech-to-text translation systems for low- or zero-resource languages, which have little parallel audio-text data for training modern supervised ASR and speech-to-text translation models, but account for the majority of the languages spoken across the world. |
Tasks | Speech Recognition, Word Embeddings |
Published | 2018-05-18 |
URL | http://arxiv.org/abs/1805.07467v2 |
http://arxiv.org/pdf/1805.07467v2.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-cross-modal-alignment-of-speech |
Repo | |
Framework | |
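The adversarial alignment step described above is in the spirit of unsupervised cross-lingual word-embedding alignment: a linear map takes speech embeddings into the text space while a discriminator tries to tell mapped speech embeddings from real text embeddings. The PyTorch sketch below illustrates only that step, with made-up dimensions and random stand-in embeddings, and omits the refinement procedure.

```python
# Hedged sketch of adversarial alignment between two embedding spaces (PyTorch).
# A linear map W sends "speech" embeddings into the "text" space; a discriminator
# tries to distinguish mapped speech embeddings from text embeddings.
import torch
import torch.nn as nn

dim = 100
mapping = nn.Linear(dim, dim, bias=False)
discriminator = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

opt_map = torch.optim.SGD(mapping.parameters(), lr=0.1)
opt_dis = torch.optim.SGD(discriminator.parameters(), lr=0.1)
bce = nn.BCEWithLogitsLoss()

speech_emb = torch.randn(5000, dim)  # random stand-ins for the learned spaces
text_emb = torch.randn(5000, dim)

for step in range(100):
    idx = torch.randint(0, 5000, (64,))
    mapped = mapping(speech_emb[idx])
    real = text_emb[torch.randint(0, 5000, (64,))]

    # 1) Train the discriminator to separate mapped speech (label 0) from text (label 1).
    d_loss = bce(discriminator(mapped.detach()), torch.zeros(64, 1)) + \
             bce(discriminator(real), torch.ones(64, 1))
    opt_dis.zero_grad(); d_loss.backward(); opt_dis.step()

    # 2) Train the mapping to fool the discriminator.
    g_loss = bce(discriminator(mapping(speech_emb[idx])), torch.ones(64, 1))
    opt_map.zero_grad(); g_loss.backward(); opt_map.step()
```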
The Outer Product Structure of Neural Network Derivatives
Title | The Outer Product Structure of Neural Network Derivatives |
Authors | Craig Bakker, Michael J. Henry, Nathan O. Hodas |
Abstract | In this paper, we show that feedforward and recurrent neural networks exhibit an outer product derivative structure but that convolutional neural networks do not. This structure makes it possible to use higher-order information without needing approximations or infeasibly large amounts of memory, and it may also provide insights into the geometry of neural network optima. The ability to easily access these derivatives also suggests a new, geometric approach to regularization. We then discuss how this structure could be used to improve training methods, increase network robustness and generalizability, and inform network compression methods. |
Tasks | |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03798v1 |
http://arxiv.org/pdf/1810.03798v1.pdf | |
PWC | https://paperswithcode.com/paper/the-outer-product-structure-of-neural-network |
Repo | |
Framework | |
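For a single fully connected layer, the outer-product structure referred to above is just the standard backpropagation identity; the worked equation below states it explicitly, only to illustrate the kind of factorization the paper exploits.

```latex
% Fully connected layer: z = W x + b, loss L.
% The chain rule gives dL/dW_{ij} = (dL/dz_i) x_j, a rank-one outer product:
\[
  z = W x + b, \qquad
  \delta \;\equiv\; \frac{\partial L}{\partial z}, \qquad
  \frac{\partial L}{\partial W} \;=\; \delta \, x^{\top}.
\]
% The paper's observation is that higher-order derivatives of feedforward and
% recurrent networks inherit a similarly factored structure, so higher-order
% information can be used without storing full dense derivative tensors.
```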
Conditional Generative Refinement Adversarial Networks for Unbalanced Medical Image Semantic Segmentation
Title | Conditional Generative Refinement Adversarial Networks for Unbalanced Medical Image Semantic Segmentation |
Authors | Mina Rezaei, Haojin Yang, Christoph Meinel |
Abstract | We propose a new generative adversarial architecture to mitigate the imbalanced data problem in medical image semantic segmentation, where the majority of pixels belong to a healthy region and few belong to a lesion or non-healthy region. A model trained on imbalanced data tends to be biased toward the healthy class, which is not desired in clinical applications, and the outputs predicted by such networks have high precision but low sensitivity. We propose a new conditional generative refinement network with three components: a generative, a discriminative, and a refinement network, which mitigate the unbalanced data problem through ensemble learning. The generative network learns to segment at the pixel level by getting feedback from the discriminative network according to the true positive and true negative maps. The refinement network, in turn, learns to predict the false positive and false negative masks produced by the generative network, which is of significant value, especially in medical applications. The final semantic segmentation masks are then composed from the outputs of the three networks. The proposed architecture shows state-of-the-art results on LiTS-2017 for liver lesion segmentation and on two microscopic cell segmentation datasets, MDA231 and PhC-HeLa. We have achieved competitive results on BraTS-2017 for brain tumour segmentation. |
Tasks | Cell Segmentation, Lesion Segmentation, Semantic Segmentation |
Published | 2018-10-09 |
URL | http://arxiv.org/abs/1810.03871v1 |
http://arxiv.org/pdf/1810.03871v1.pdf | |
PWC | https://paperswithcode.com/paper/conditional-generative-refinement-adversarial |
Repo | |
Framework | |
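The abstract above does not spell out how the final mask is composed from the three networks' outputs; the sketch below uses one plausible rule (add predicted false negatives, remove predicted false positives), which should be read as an assumption for illustration only.

```python
# Hedged sketch: compose a final mask from a generator's segmentation and a
# refinement network's predicted false-positive / false-negative maps.
# The composition rule below is an assumption for illustration only.
import numpy as np

def compose_masks(generator_mask, false_pos_mask, false_neg_mask):
    """All inputs are binary arrays of the same shape."""
    refined = generator_mask.copy()
    refined[false_neg_mask == 1] = 1   # add back pixels the generator missed
    refined[false_pos_mask == 1] = 0   # remove pixels the generator over-segmented
    return refined

g = np.array([[0, 1, 1], [0, 1, 0], [0, 0, 0]])
fp = np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]])
fn = np.array([[0, 0, 0], [0, 0, 0], [0, 1, 0]])
print(compose_masks(g, fp, fn))
```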
A Detection and Segmentation Architecture for Skin Lesion Segmentation on Dermoscopy Images
Title | A Detection and Segmentation Architecture for Skin Lesion Segmentation on Dermoscopy Images |
Authors | Chengyao Qian, Ting Liu, Hao Jiang, Zhe Wang, Pengfei Wang, Mingxin Guan, Biao Sun |
Abstract | This report summarises our method and validation results for the ISIC Challenge 2018 - Skin Lesion Analysis Towards Melanoma Detection - Task 1: Lesion Segmentation. We present a two-stage method for lesion segmentation with an optimised training method and an ensemble post-process. Our method achieves state-of-the-art performance on lesion segmentation, and we won first place in ISIC 2018 Task 1. |
Tasks | Lesion Segmentation |
Published | 2018-09-11 |
URL | http://arxiv.org/abs/1809.03917v2 |
http://arxiv.org/pdf/1809.03917v2.pdf | |
PWC | https://paperswithcode.com/paper/a-detection-and-segmentation-architecture-for |
Repo | |
Framework | |
Shallow vs deep learning architectures for white matter lesion segmentation in the early stages of multiple sclerosis
Title | Shallow vs deep learning architectures for white matter lesion segmentation in the early stages of multiple sclerosis |
Authors | Francesco La Rosa, Mário João Fartaria, Tobias Kober, Jonas Richiardi, Cristina Granziera, Jean-Philippe Thiran, Meritxell Bach Cuadra |
Abstract | In this work, we present a comparison of a shallow and a deep learning architecture for the automated segmentation of white matter lesions in MR images of multiple sclerosis patients. In particular, we train and test both methods on early stage disease patients, to verify their performance in challenging conditions, more similar to a clinical setting than what is typically provided in multiple sclerosis segmentation challenges. Furthermore, we evaluate a prototype naive combination of the two methods, which refines the final segmentation. All methods were trained on 32 patients, and the evaluation was performed on a pure test set of 73 cases. Results show low lesion-wise false positives (30%) for the deep learning architecture, whereas the shallow architecture yields the best Dice coefficient (63%) and volume difference (19%). Combining both shallow and deep architectures further improves the lesion-wise metrics (69% and 26% lesion-wise true and false positive rate, respectively). |
Tasks | Lesion Segmentation |
Published | 2018-09-10 |
URL | http://arxiv.org/abs/1809.03185v1 |
http://arxiv.org/pdf/1809.03185v1.pdf | |
PWC | https://paperswithcode.com/paper/shallow-vs-deep-learning-architectures-for |
Repo | |
Framework | |
Corresponding Projections for Orphan Screening
Title | Corresponding Projections for Orphan Screening |
Authors | Sven Giesselbach, Katrin Ullrich, Michael Kamp, Daniel Paurat, Thomas Gärtner |
Abstract | We propose a novel transfer learning approach for orphan screening called corresponding projections. In orphan screening the learning task is to predict the binding affinities of compounds to an orphan protein, i.e., one for which no training data is available. The identification of compounds with high affinity is a central concern in medicine since it can be used for drug discovery and design. Given a set of prediction models for proteins with labelled training data and a similarity between the proteins, corresponding projections constructs a model for the orphan protein from them such that the similarity between models resembles the one between proteins. Under the assumption that the similarity resemblance holds, we derive an efficient algorithm for kernel methods. We empirically show that the approach outperforms the state-of-the-art in orphan screening. |
Tasks | Drug Discovery, Transfer Learning |
Published | 2018-11-30 |
URL | http://arxiv.org/abs/1812.00058v1 |
http://arxiv.org/pdf/1812.00058v1.pdf | |
PWC | https://paperswithcode.com/paper/corresponding-projections-for-orphan |
Repo | |
Framework | |
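One way to read the construction described above as an optimization problem: choose the orphan model so that its similarities to the existing models mirror the corresponding protein similarities. The formulation below is a hedged paraphrase of that idea, not the paper's exact objective.

```latex
% Given models w_1, ..., w_n trained for proteins p_1, ..., p_n, a model
% similarity k_M, and a protein similarity k_P, a corresponding-projections-style
% objective for the orphan protein p_0 could be written as (assumed form):
\[
  w_0 \;=\; \arg\min_{w} \sum_{i=1}^{n}
    \bigl( k_M(w, w_i) - k_P(p_0, p_i) \bigr)^{2},
\]
% i.e. pick the orphan model so that the similarity between models resembles the
% similarity between the corresponding proteins.
```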
Generating Highly Realistic Images of Skin Lesions with GANs
Title | Generating Highly Realistic Images of Skin Lesions with GANs |
Authors | Christoph Baur, Shadi Albarqouni, Nassir Navab |
Abstract | As with many other machine learning driven medical image analysis tasks, skin image analysis suffers from a chronic lack of labeled data and skewed class distributions, which poses problems for the training of robust and well-generalizing models. The ability to synthesize realistic-looking images of skin lesions could alleviate the aforementioned problems. Generative Adversarial Networks (GANs) have been successfully used to synthesize realistic-looking medical images, however only at low resolution, whereas machine learning models for challenging tasks such as skin lesion segmentation or classification benefit from much higher-resolution data. In this work, we successfully synthesize realistic-looking images of skin lesions with GANs at such high resolution. To this end, we utilize the concept of progressive growing, which we compare both quantitatively and qualitatively to other GAN architectures such as the DCGAN and the LAPGAN. Our results show that, with the help of progressive growing, we can synthesize highly realistic dermoscopic images of skin lesions that even expert dermatologists find hard to distinguish from real ones. |
Tasks | Lesion Segmentation |
Published | 2018-09-05 |
URL | http://arxiv.org/abs/1809.01410v2 |
http://arxiv.org/pdf/1809.01410v2.pdf | |
PWC | https://paperswithcode.com/paper/generating-highly-realistic-images-of-skin |
Repo | |
Framework | |
Deep-Learning Ensembles for Skin-Lesion Segmentation, Analysis, Classification: RECOD Titans at ISIC Challenge 2018
Title | Deep-Learning Ensembles for Skin-Lesion Segmentation, Analysis, Classification: RECOD Titans at ISIC Challenge 2018 |
Authors | Alceu Bissoto, Fábio Perez, Vinícius Ribeiro, Michel Fornaciali, Sandra Avila, Eduardo Valle |
Abstract | This extended abstract describes the participation of RECOD Titans in parts 1 to 3 of the ISIC Challenge 2018 “Skin Lesion Analysis Towards Melanoma Detection” (MICCAI 2018). Although our team has a long experience with melanoma classification and moderate experience with lesion segmentation, the ISIC Challenge 2018 was the very first time we worked on lesion attribute detection. For each task we submitted 3 different ensemble approaches, varying combinations of models and datasets. Our best results on the official testing set, regarding the official metric of each task, were: 0.728 (segmentation), 0.344 (attribute detection) and 0.803 (classification). Those submissions reached, respectively, the 56th, 14th and 9th places. |
Tasks | Lesion Segmentation |
Published | 2018-08-25 |
URL | http://arxiv.org/abs/1808.08480v1 |
http://arxiv.org/pdf/1808.08480v1.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-ensembles-for-skin-lesion |
Repo | |
Framework | |
A Multi-task Framework for Skin Lesion Detection and Segmentation
Title | A Multi-task Framework for Skin Lesion Detection and Segmentation |
Authors | Sulaiman Vesal, Shreyas Malakarjun Patil, Nishant Ravikumar, Andreas Maier |
Abstract | Early detection and segmentation of skin lesions is crucial for timely diagnosis and treatment, necessary to improve the survival rate of patients. However, manual delineation is time-consuming and subject to intra- and inter-observer variations among dermatologists. This underlines the need for an accurate and automatic approach to skin lesion segmentation. To tackle this issue, we propose a multi-task convolutional neural network (CNN) based, joint detection and segmentation framework, designed to initially localize the lesion and subsequently segment it. A 'Faster region-based convolutional neural network' (Faster-RCNN), which comprises a region proposal network (RPN), is used to generate bounding boxes/region proposals for lesion localization in each image. The proposed regions are subsequently refined using a softmax classifier and a bounding-box regressor. The refined bounding boxes are finally cropped and segmented using 'SkinNet', a modified version of U-Net. We trained and evaluated the performance of our network using the ISBI 2017 challenge and the PH2 datasets, and compared it with the state-of-the-art using the official test data released as part of the challenge for the former. Our approach outperformed others in terms of Dice coefficient ($>0.93$), Jaccard index ($>0.88$), accuracy ($>0.96$) and sensitivity ($>0.95$), across five-fold cross-validation experiments. |
Tasks | Lesion Segmentation |
Published | 2018-08-05 |
URL | http://arxiv.org/abs/1808.01676v1 |
http://arxiv.org/pdf/1808.01676v1.pdf | |
PWC | https://paperswithcode.com/paper/a-multi-task-framework-for-skin-lesion |
Repo | |
Framework | |
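The pipeline above is detect-then-segment: Faster-RCNN proposes and refines lesion boxes, and each cropped box is segmented by SkinNet. The PyTorch/torchvision sketch below follows that flow with an off-the-shelf Faster R-CNN and a small placeholder segmentation network standing in for SkinNet; the score threshold and the tiny segmenter are assumptions.

```python
# Hedged sketch of a detect-then-segment pipeline (PyTorch / torchvision).
# Faster R-CNN localizes candidate lesions; each kept box is cropped and passed
# to a small placeholder segmentation network standing in for SkinNet.
import torch
import torch.nn as nn
from torchvision.models.detection import fasterrcnn_resnet50_fpn

class TinySegNet(nn.Module):
    """Placeholder segmenter; the actual pipeline uses SkinNet (a U-Net variant)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

detector = fasterrcnn_resnet50_fpn(num_classes=2).eval()  # background + lesion
segmenter = TinySegNet().eval()

image = torch.rand(3, 512, 512)
with torch.no_grad():
    detections = detector([image])[0]  # dict with "boxes", "labels", "scores"
    for box, score in zip(detections["boxes"], detections["scores"]):
        x1, y1, x2, y2 = box.int().tolist()
        if score < 0.5 or (x2 - x1) < 8 or (y2 - y1) < 8:
            continue  # skip low-confidence or degenerate boxes
        crop = image[:, y1:y2, x1:x2].unsqueeze(0)  # crop the proposed lesion
        mask = segmenter(crop)  # per-pixel lesion probability inside the box
```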
Gaussian Processes Over Graphs
Title | Gaussian Processes Over Graphs |
Authors | Arun Venkitaraman, Saikat Chatterjee, Peter Händel |
Abstract | We propose Gaussian processes for signals over graphs (GPG) using the a priori knowledge that the target vectors lie over a graph. We incorporate this information using a graph-Laplacian based regularization, which enforces the target vectors to have a specific profile in terms of graph Fourier transform coefficients, for example lowpass or bandpass graph signals. We discuss how the regularization affects the mean and the variance in the prediction output. In particular, we prove that the predictive variance of the GPG is strictly smaller than that of the conventional Gaussian process (GP) for any non-trivial graph. We validate our concepts by application to various real-world graph signals. Our experiments show that the performance of the GPG is superior to GP for small training data sizes and under noisy training. |
Tasks | Gaussian Processes |
Published | 2018-03-15 |
URL | http://arxiv.org/abs/1803.05776v2 |
http://arxiv.org/pdf/1803.05776v2.pdf | |
PWC | https://paperswithcode.com/paper/gaussian-processes-over-graphs |
Repo | |
Framework | |
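The key ingredient above is the graph-Laplacian based regularization, which pulls a target vector defined over graph nodes toward smoothness on the graph. The numpy sketch below illustrates only that regularization via the operator (I + alpha*L)^(-1); it is not the paper's full GPG predictive mean and variance.

```python
# Hedged sketch of graph-Laplacian regularization (numpy).
# A target vector defined over graph nodes is smoothed by (I + alpha*L)^{-1};
# the paper's full GPG predictive equations are not reproduced here.
import numpy as np

# Small example graph: a 4-node path 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # combinatorial graph Laplacian

alpha = 2.0
y_noisy = np.array([1.0, 0.2, 0.9, 0.1])                 # noisy target over the nodes
y_smooth = np.linalg.solve(np.eye(4) + alpha * L, y_noisy)

print(y_noisy)
print(y_smooth)   # neighbouring nodes are pulled toward each other
```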
Augmenting word2vec with latent Dirichlet allocation within a clinical application
Title | Augmenting word2vec with latent Dirichlet allocation within a clinical application |
Authors | Akshay Budhkar, Frank Rudzicz |
Abstract | This paper presents three hybrid models that directly combine latent Dirichlet allocation and word embedding for distinguishing between speakers with and without Alzheimer’s disease from transcripts of picture descriptions. Two of our models get F-scores over the current state-of-the-art using automatic methods on the DementiaBank dataset. |
Tasks | |
Published | 2018-08-12 |
URL | http://arxiv.org/abs/1808.03967v1 |
http://arxiv.org/pdf/1808.03967v1.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-word2vec-with-latent-dirichlet |
Repo | |
Framework | |
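A straightforward way to combine the two representations mentioned above is to concatenate a transcript's average word2vec vector with its LDA topic distribution into one feature vector. The gensim sketch below shows that naive feature-level combination; it is not necessarily any of the paper's three hybrid models.

```python
# Hedged sketch: concatenate a document's mean word2vec vector with its LDA
# topic distribution (gensim 4.x API). This is a naive feature-level combination,
# not necessarily one of the paper's hybrid models.
import numpy as np
from gensim.corpora import Dictionary
from gensim.models import LdaModel, Word2Vec

docs = [
    "the boy is stealing a cookie from the jar".split(),
    "the woman is washing dishes at the sink".split(),
    "water is overflowing onto the floor".split(),
]

w2v = Word2Vec(sentences=docs, vector_size=20, min_count=1, epochs=50)
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=3, random_state=0)

def hybrid_features(doc):
    mean_vec = np.mean([w2v.wv[w] for w in doc], axis=0)        # word2vec part
    topic_vec = np.zeros(lda.num_topics)                        # LDA part
    for topic_id, prob in lda.get_document_topics(dictionary.doc2bow(doc),
                                                  minimum_probability=0.0):
        topic_vec[topic_id] = prob
    return np.concatenate([mean_vec, topic_vec])                # hybrid feature

print(hybrid_features(docs[0]).shape)  # (23,)
```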
Tumor Delineation For Brain Radiosurgery by a ConvNet and Non-Uniform Patch Generation
Title | Tumor Delineation For Brain Radiosurgery by a ConvNet and Non-Uniform Patch Generation |
Authors | Egor Krivov, Valery Kostjuchenko, Alexandra Dalechina, Boris Shirokikh, Gleb Makarchuk, Alexander Denisenko, Andrey Golanov, Mikhail Belyaev |
Abstract | Deep learning methods are actively used for brain lesion segmentation. One of the most popular models is DeepMedic, which was developed for the segmentation of relatively large lesions like glioma and ischemic stroke. In our work, we consider segmentation of brain tumors appropriate for stereotactic radiosurgery, which limits typical lesion sizes. These differences in target volumes lead to a large number of false negatives (especially for small lesions) as well as to an increased number of false positives for DeepMedic. We propose a new patch-sampling procedure to increase network performance for small lesions. We used a 6-year dataset from a stereotactic radiosurgery center. To evaluate our approach, we conducted experiments with the three most frequent brain tumors: metastasis, meningioma, and schwannoma. In addition to cross-validation, we estimated quality on a hold-out test set which was collected several years later than the training set. The experimental results show solid improvements in both cases. |
Tasks | Lesion Segmentation |
Published | 2018-08-01 |
URL | http://arxiv.org/abs/1808.00244v1 |
http://arxiv.org/pdf/1808.00244v1.pdf | |
PWC | https://paperswithcode.com/paper/tumor-delineation-for-brain-radiosurgery-by-a |
Repo | |
Framework | |
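The patch-sampling idea above can be illustrated by drawing patch centers with probability weighted toward lesion voxels, so that small lesions appear in training patches far more often than under uniform sampling. In the numpy sketch below, the 50x lesion weighting is an assumption, not the paper's exact procedure.

```python
# Hedged sketch: sample patch centers with probability biased toward lesion
# voxels, so small lesions are over-represented relative to uniform sampling.
# The specific weighting (lesion voxels 50x more likely) is an assumption.
import numpy as np

rng = np.random.default_rng(0)

volume_shape = (64, 64, 64)
lesion_mask = np.zeros(volume_shape, dtype=bool)
lesion_mask[30:33, 30:33, 30:33] = True          # a tiny synthetic lesion

weights = np.where(lesion_mask.ravel(), 50.0, 1.0)
probs = weights / weights.sum()

def sample_patch_centers(n):
    flat_idx = rng.choice(lesion_mask.size, size=n, p=probs)
    return np.stack(np.unravel_index(flat_idx, volume_shape), axis=1)

centers = sample_patch_centers(1000)
hit_rate = lesion_mask[tuple(centers.T)].mean()
print(f"fraction of patch centers inside the lesion: {hit_rate:.2%}")
```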