January 31, 2020

Paper Group AWR 376

DSNet: Automatic Dermoscopic Skin Lesion Segmentation

Title DSNet: Automatic Dermoscopic Skin Lesion Segmentation
Authors Md. Kamrul Hasan, Lavsen Dahal, Prasad N. Samarakoon, Fakrul Islam Tushar, Robert Marti Marly
Abstract Automatic segmentation of skin lesions is considered a crucial step in Computer Aided Diagnosis (CAD) for melanoma diagnosis. Despite its significance, skin lesion segmentation remains a challenging, open problem because lesions vary widely in color and texture and often have indistinguishable boundaries. In this study, we present a new, automatic semantic segmentation network for robust skin lesion segmentation named Dermoscopic Skin Network (DSNet). To reduce the number of parameters and keep the network lightweight, we used depth-wise separable convolution in lieu of standard convolution to project the learned discriminating features onto the pixel space at different stages of the encoder. Additionally, we implemented U-Net and Fully Convolutional Network (FCN8s) to compare against the proposed DSNet. We evaluate our proposed model on two publicly available datasets, namely ISIC-2017 and PH2. The obtained mean Intersection over Union (mIoU) is 77.5% and 87.0% for the ISIC-2017 and PH2 datasets respectively, which outperforms the ISIC-2017 challenge winner by 1.0% with respect to mIoU. Our proposed network also outperformed U-Net and FCN8s by 3.6% and 6.8% respectively with respect to mIoU on the ISIC-2017 dataset. Our network for skin lesion segmentation outperforms other methods and can provide better segmented masks on two different test datasets, which can lead to better performance in melanoma detection. Our trained model, along with the source code and predicted masks, is made publicly available.
Tasks Lesion Segmentation, Semantic Segmentation
Published 2019-07-09
URL https://arxiv.org/abs/1907.04305v2
PDF https://arxiv.org/pdf/1907.04305v2.pdf
PWC https://paperswithcode.com/paper/dsnet-automatic-dermoscopic-skin-lesion
Repo https://github.com/kamruleee51/Skin-Lesion-Segmentation-Using-Proposed-DSNet
Framework none
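
The parameter saving that the DSNet abstract attributes to depth-wise separable convolution can be illustrated with a short count. This is a minimal sketch, not the authors' code: the function names and the layer sizes (64 input channels, 128 output channels, 3 x 3 kernel) are assumptions chosen for illustration, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depth-wise separable convolution (bias ignored).

    One k x k depth-wise filter per input channel, followed by a
    1 x 1 point-wise convolution that mixes channels.
    """
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
std = standard_conv_params(64, 128, 3)
sep = separable_conv_params(64, 128, 3)
print(std, sep, round(std / sep, 2))
```

For these assumed sizes the separable layer needs roughly an eighth of the weights of the standard one; the actual saving in DSNet depends on its layer configuration, which the abstract does not give.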

Refined-Segmentation R-CNN: A Two-stage Convolutional Neural Network for Punctate White Matter Lesion Segmentation in Preterm Infants

Title Refined-Segmentation R-CNN: A Two-stage Convolutional Neural Network for Punctate White Matter Lesion Segmentation in Preterm Infants
Authors Yalong Liu, Jie Li, Ying Wang, Miaomiao Wang, Xianjun Li, Zhicheng Jiao, Jian Yang, Xingbo Gao
Abstract Accurate segmentation of punctate white matter lesions (PWMLs) in infantile brains by an automatic algorithm can reduce potential risks to postnatal development. How to segment PWMLs effectively has become one of the active topics in medical image segmentation in recent years. In this paper, we construct an efficient two-stage PWML semantic segmentation network based on the characteristics of the lesion, called refined segmentation R-CNN (RS RCNN). We propose a heuristic RPN (H-RPN) that uses information surrounding the PWMLs for heuristic segmentation. We also design a lightweight segmentation network to segment the lesion quickly. A densely connected conditional random field (DCRF) is used to optimize the segmentation results. We use only T1w MRIs to segment PWMLs. The results show that our model segments lesions of ordinary size, and even of pixel size, well. The Dice similarity coefficient reaches 0.6616, the sensitivity is 0.7069, the specificity is 0.9997, and the Hausdorff distance is 52.9130. The proposed method outperforms the state-of-the-art algorithm. The code for this paper is available at https://github.com/YalongLiu/Refined-Segmentation-R-CNN.
Tasks Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published 2019-06-24
URL https://arxiv.org/abs/1906.09684v2
PDF https://arxiv.org/pdf/1906.09684v2.pdf
PWC https://paperswithcode.com/paper/refined-segmentation-r-cnn-a-two-stage
Repo https://github.com/YalongLiu/Refined-Segmentation-R-CNN
Framework tf
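
Three of the metrics quoted in the abstract above (Dice, sensitivity, specificity) follow directly from pixel-level confusion counts. The sketch below uses the standard definitions on flattened binary masks; the function names and the toy masks are illustrative assumptions, not taken from the paper's repository, and the Hausdorff distance is not covered.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-length flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(tp, fp, fn):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def sensitivity(tp, fn):
    """Fraction of true lesion pixels that were predicted as lesion."""
    return tp / (tp + fn) if (tp + fn) else 1.0


def specificity(tn, fp):
    """Fraction of true background pixels that were predicted as background."""
    return tn / (tn + fp) if (tn + fp) else 1.0


# Toy 8-pixel masks, invented for illustration (1 = lesion, 0 = background).
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(pred, truth)
print(dice(tp, fp, fn), sensitivity(tp, fn), specificity(tn, fp))
```

When lesions occupy only a few pixels, true negatives dominate the counts, which is one reason a specificity close to 1 can coexist with a much lower Dice score.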

Handling Inter-Annotator Agreement for Automated Skin Lesion Segmentation

Title Handling Inter-Annotator Agreement for Automated Skin Lesion Segmentation
Authors Vinicius Ribeiro, Sandra Avila, Eduardo Valle
Abstract In this work, we explore the issue of the inter-annotator agreement for training and evaluating automated segmentation of skin lesions. We explore what different degrees of agreement represent, and how they affect different use cases for segmentation. We also evaluate how conditioning the ground truths using different (but very simple) algorithms may help to enhance agreement and may be appropriate for some use cases. The segmentation of skin lesions is a cornerstone task for automated skin lesion analysis, useful both as an end-result to locate/detect the lesions and as an ancillary task for lesion classification. Lesion segmentation, however, is a very challenging task, due not only to the challenge of image segmentation itself but also to the difficulty in obtaining properly annotated data. Detecting accurately the borders of lesions is challenging even for trained humans, since, for many lesions, those borders are fuzzy and ill-defined. Using lesions and annotations from the ISIC Archive, we estimate inter-annotator agreement for skin-lesion segmentation and propose several simple procedures that may help to improve inter-annotator agreement if used to condition the ground truths.
Tasks Lesion Segmentation, Semantic Segmentation
Published 2019-06-06
URL https://arxiv.org/abs/1906.02415v1
PDF https://arxiv.org/pdf/1906.02415v1.pdf
PWC https://paperswithcode.com/paper/handling-inter-annotator-agreement-for
Repo https://github.com/vribeiro1/skin-lesion-segmentation-agreement
Framework pytorch
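
One common way to quantify agreement between two annotators' binary masks is Cohen's kappa, which corrects raw pixel agreement for the agreement expected by chance. The sketch below is an assumption about how such a score could be computed; the abstract does not say which agreement statistic the authors use, and the function name and toy masks are hypothetical.

```python
def cohens_kappa(mask_a, mask_b):
    """Cohen's kappa between two equal-length flattened binary masks."""
    if len(mask_a) != len(mask_b) or not mask_a:
        raise ValueError("masks must be non-empty and of equal length")
    n = len(mask_a)
    # Observed agreement: fraction of pixels labeled identically.
    observed = sum(1 for a, b in zip(mask_a, mask_b) if bool(a) == bool(b)) / n
    # Chance agreement from each annotator's marginal lesion rate.
    rate_a = sum(1 for a in mask_a if a) / n
    rate_b = sum(1 for b in mask_b if b) / n
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy 8-pixel annotations, invented for illustration.
annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 0, 1, 1, 0, 0]
print(cohens_kappa(annotator_1, annotator_2))
```

Here the annotators agree on 6 of 8 pixels, yet kappa is well below 0.75 because much of that agreement is expected by chance on a mostly background image.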

A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations

Title A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations
Authors Saeid Asgari Taghanaki, Kumar Abhishek, Shekoofeh Azizi, Ghassan Hamarneh
Abstract The linear and non-flexible nature of deep convolutional models makes them vulnerable to carefully crafted adversarial perturbations. To tackle this problem, we propose a non-linear radial basis convolutional feature mapping by learning a Mahalanobis-like distance function. Our method then maps the convolutional features onto a linearly well-separated manifold, which prevents small adversarial perturbations from forcing a sample to cross the decision boundary. We test the proposed method on three publicly available image classification and segmentation datasets namely, MNIST, ISBI ISIC 2017 skin lesion segmentation, and NIH Chest X-Ray-14. We evaluate the robustness of our method to different gradient (targeted and untargeted) and non-gradient based attacks and compare it to several non-gradient masking defense strategies. Our results demonstrate that the proposed method can increase the resilience of deep convolutional neural networks to adversarial perturbations without accuracy drop on clean data.
Tasks Image Classification, Lesion Segmentation
Published 2019-03-03
URL https://arxiv.org/abs/1903.01015v2
PDF https://arxiv.org/pdf/1903.01015v2.pdf
PWC https://paperswithcode.com/paper/a-kernelized-manifold-mapping-to-diminish-the
Repo https://github.com/asgsaeid/KernelizedManifoldMapping
Framework none
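
A radial basis response under a Mahalanobis-like distance, as mentioned in the abstract above, can be sketched as follows. This is a generic illustration, not the paper's formulation: the matrix `m` stands in for a learned positive semi-definite matrix, `center` for a learned prototype, and all names and values are assumptions.

```python
import math


def mahalanobis_sq(x, center, m):
    """Squared Mahalanobis-like distance (x - c)^T M (x - c)."""
    d = [xi - ci for xi, ci in zip(x, center)]
    n = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(n) for j in range(n))


def rbf_response(x, center, m):
    """Radial basis activation exp(-distance^2); equals 1 at the center."""
    return math.exp(-mahalanobis_sq(x, center, m))


# Hypothetical 2-D feature, center, and metric matrix (identity here,
# which reduces the distance to the plain squared Euclidean distance).
center = [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(rbf_response([0.0, 0.0], center, identity))
print(rbf_response([1.0, 0.0], center, identity))
```

Because the response decays with distance from the center, a small perturbation of the input changes the activation only slightly, which is the intuition the abstract gives for robustness.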

Max-margin Class Imbalanced Learning with Gaussian Affinity

Title Max-margin Class Imbalanced Learning with Gaussian Affinity
Authors Munawar Hayat, Salman Khan, Waqas Zamir, Jianbing Shen, Ling Shao
Abstract Real-world object classes appear in imbalanced ratios. This poses a significant challenge for classifiers which get biased towards frequent classes. We hypothesize that improving the generalization capability of a classifier should improve learning on imbalanced datasets. Here, we introduce the first hybrid loss function that jointly performs classification and clustering in a single formulation. Our approach is based on an 'affinity measure' in Euclidean space that leads to the following benefits: (1) direct enforcement of maximum margin constraints on classification boundaries, (2) a tractable way to ensure uniformly spaced and equidistant cluster centers, (3) flexibility to learn multiple class prototypes to support diversity and discriminability in feature space. Our extensive experiments demonstrate the significant performance improvements on visual classification and verification tasks on multiple imbalanced datasets. The proposed loss can easily be plugged in any deep architecture as a differentiable block and demonstrates robustness against different levels of data imbalance and corrupted labels.
Tasks
Published 2019-01-23
URL http://arxiv.org/abs/1901.07711v1
PDF http://arxiv.org/pdf/1901.07711v1.pdf
PWC https://paperswithcode.com/paper/max-margin-class-imbalanced-learning-with
Repo https://github.com/koshian2/affinity-loss
Framework tf

Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)

Title Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)
Authors Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, Harald Kittler, Allan Halpern
Abstract This work summarizes the results of the largest skin image analysis challenge in the world, hosted by the International Skin Imaging Collaboration (ISIC), a global partnership that has organized the world’s largest public repository of dermoscopic images of skin. The challenge was hosted in 2018 at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference in Granada, Spain. The dataset included over 12,500 images across 3 tasks. 900 users registered for data download, 115 submitted to the lesion segmentation task, 25 submitted to the lesion attribute detection task, and 159 submitted to the disease classification task. Novel evaluation protocols were established, including a new test for segmentation algorithm performance, and a test for algorithm ability to generalize. Results show that top segmentation algorithms still fail on over 10% of images on average, and algorithms with equal performance on test data can have different abilities to generalize. This is an important consideration for agencies regulating the growing set of machine learning tools in the healthcare domain, and sets a new standard for future public challenges in healthcare.
Tasks Lesion Segmentation
Published 2019-02-09
URL http://arxiv.org/abs/1902.03368v2
PDF http://arxiv.org/pdf/1902.03368v2.pdf
PWC https://paperswithcode.com/paper/skin-lesion-analysis-toward-melanoma-1
Repo https://github.com/kianoush/Skin_Cancer_CNN
Framework none

Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?

Title Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?
Authors Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
Abstract Recent efforts in cross-lingual word embedding (CLWE) learning have predominantly focused on fully unsupervised approaches that project monolingual embeddings into a shared cross-lingual space without any cross-lingual signal. The lack of any supervision makes such approaches conceptually attractive. Yet, their only core difference from (weakly) supervised projection-based CLWE methods is in the way they obtain a seed dictionary used to initialize an iterative self-learning procedure. The fully unsupervised methods have arguably become more robust, and their primary use case is CLWE induction for pairs of resource-poor and distant languages. In this paper, we question the ability of even the most robust unsupervised CLWE approaches to induce meaningful CLWEs in these more challenging settings. A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs). Even when they succeed, they never surpass the performance of weakly supervised methods (seeded with 500-1,000 translation pairs) using the same self-learning procedure in any BLI setup, and the gaps are often substantial. These findings call for revisiting the main motivations behind fully unsupervised CLWE methods.
Tasks
Published 2019-09-04
URL https://arxiv.org/abs/1909.01638v1
PDF https://arxiv.org/pdf/1909.01638v1.pdf
PWC https://paperswithcode.com/paper/do-we-really-need-fully-unsupervised-cross
Repo https://github.com/cambridgeltl/panlex-bli
Framework none

A Finnish News Corpus for Named Entity Recognition

Title A Finnish News Corpus for Named Entity Recognition
Authors Teemu Ruokolainen, Pekka Kauppinen, Miikka Silfverberg, Krister Lindén
Abstract We present a corpus of Finnish news articles with a manually prepared named entity annotation. The corpus consists of 953 articles (193,742 word tokens) with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The corpus is available for research purposes. We present baseline experiments on the corpus using a rule-based and two deep learning systems on two, in-domain and out-of-domain, test sets.
Tasks Named Entity Recognition
Published 2019-08-12
URL https://arxiv.org/abs/1908.04212v1
PDF https://arxiv.org/pdf/1908.04212v1.pdf
PWC https://paperswithcode.com/paper/a-finnish-news-corpus-for-named-entity
Repo https://github.com/TurkuNLP/FinBERT
Framework pytorch

Deep Learning for 3D Point Clouds: A Survey

Title Deep Learning for 3D Point Clouds: A Survey
Authors Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, Mohammed Bennamoun
Abstract Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has begun to thrive, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
Tasks 3D Object Detection, Autonomous Driving, Object Detection
Published 2019-12-27
URL https://arxiv.org/abs/1912.12033v1
PDF https://arxiv.org/pdf/1912.12033v1.pdf
PWC https://paperswithcode.com/paper/deep-learning-for-3d-point-clouds-a-survey
Repo https://github.com/QingyongHu/Benchmark_results_3D_point_cloud
Framework none

Temporal Unet: Sample Level Human Action Recognition using WiFi

Title Temporal Unet: Sample Level Human Action Recognition using WiFi
Authors Fei Wang, Yunpeng Song, Jimuyang Zhang, Jinsong Han, Dong Huang
Abstract Human actions distort WiFi signals, an effect that has been widely explored for action recognition tasks such as fall detection for the elderly, hand sign language recognition, and keystroke estimation. To the best of our knowledge, past work recognizes human action by categorizing one complete distortion series into one action, which we term series-level action recognition. In this paper, we introduce a much more fine-grained and challenging action recognition task into the WiFi sensing domain, i.e., sample-level action recognition. In this task, every WiFi distortion sample in the whole series should be categorized into one action, which is a critical technique in precise action localization, continuous action segmentation, and real-time action recognition. To achieve WiFi-based sample-level action recognition, we thoroughly analyze approaches in image-based semantic segmentation as well as in video-based frame-level action recognition, then propose a simple yet efficient deep convolutional neural network, i.e., Temporal Unet. Experimental results show that Temporal Unet performs well on this novel task. Code has been made publicly available at https://github.com/geekfeiw/WiSLAR.
Tasks Action Localization, Action Segmentation, Semantic Segmentation, Sign Language Recognition, Temporal Action Localization
Published 2019-04-19
URL http://arxiv.org/abs/1904.11953v1
PDF http://arxiv.org/pdf/1904.11953v1.pdf
PWC https://paperswithcode.com/paper/190411953
Repo https://github.com/geekfeiw/WiSLAR
Framework pytorch

Distributional Policy Optimization: An Alternative Approach for Continuous Control

Title Distributional Policy Optimization: An Alternative Approach for Continuous Control
Authors Chen Tessler, Guy Tennenholtz, Shie Mannor
Abstract We identify a fundamental problem in policy gradient-based methods in continuous control. As policy gradient methods require the agent’s underlying probability distribution, they limit policy representation to parametric distribution classes. We show that optimizing over such sets results in local movement in the action space and thus convergence to sub-optimal solutions. We suggest a novel distributional framework, able to represent arbitrary distribution functions over the continuous action space. Using this framework, we construct a generative scheme, trained using an off-policy actor-critic paradigm, which we call the Generative Actor Critic (GAC). Compared to policy gradient methods, GAC does not require knowledge of the underlying probability distribution, thereby overcoming these limitations. Empirical evaluation shows that our approach is comparable to, and often surpasses, current state-of-the-art baselines in continuous domains.
Tasks Continuous Control, Policy Gradient Methods
Published 2019-05-23
URL https://arxiv.org/abs/1905.09855v2
PDF https://arxiv.org/pdf/1905.09855v2.pdf
PWC https://paperswithcode.com/paper/distributional-policy-optimization-an
Repo https://github.com/tesslerc/GAC
Framework pytorch

Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings

Title Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Authors Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann
Abstract In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three different benchmark datasets (Inspec, SemEval 2010, SemEval 2017) and compare with existing popular unsupervised and supervised techniques. Our results quantify the benefits of (a) using contextualized embeddings (e.g. BERT) over fixed word embeddings (e.g. Glove); (b) using a BiLSTM-CRF architecture with contextualized word embeddings over fine-tuning the contextualized word embedding model directly, and (c) using genre-specific contextualized embeddings (SciBERT). Through error analysis, we also provide some insights into why particular models work better than others. Lastly, we present a case study where we analyze different self-attention layers of the two best models (BERT and SciBERT) to better understand the predictions made by each for the task of keyphrase extraction.
Tasks Word Embeddings
Published 2019-10-19
URL https://arxiv.org/abs/1910.08840v1
PDF https://arxiv.org/pdf/1910.08840v1.pdf
PWC https://paperswithcode.com/paper/keyphrase-extraction-from-scholarly-articles
Repo https://github.com/midas-research/keyphrase-extraction-as-sequence-labeling-data
Framework none

Lipschitz Generative Adversarial Nets

Title Lipschitz Generative Adversarial Nets
Authors Zhiming Zhou, Jiadong Liang, Yuxuan Song, Lantao Yu, Hongwei Wang, Weinan Zhang, Yong Yu, Zhihua Zhang
Abstract In this paper, we study the convergence of generative adversarial networks (GANs) from the perspective of the informativeness of the gradient of the optimal discriminative function. We show that GANs without restriction on the discriminative function space commonly suffer from the problem that the gradient produced by the discriminator is uninformative to guide the generator. By contrast, Wasserstein GAN (WGAN), where the discriminative function is restricted to 1-Lipschitz, does not suffer from such a gradient uninformativeness problem. We further show in the paper that the model with a compact dual form of Wasserstein distance, where the Lipschitz condition is relaxed, may also theoretically suffer from this issue. This implies the importance of Lipschitz condition and motivates us to study the general formulation of GANs with Lipschitz constraint, which leads to a new family of GANs that we call Lipschitz GANs (LGANs). We show that LGANs guarantee the existence and uniqueness of the optimal discriminative function as well as the existence of a unique Nash equilibrium. We prove that LGANs are generally capable of eliminating the gradient uninformativeness problem. According to our empirical analysis, LGANs are more stable and generate consistently higher quality samples compared with WGAN.
Tasks
Published 2019-02-15
URL https://arxiv.org/abs/1902.05687v4
PDF https://arxiv.org/pdf/1902.05687v4.pdf
PWC https://paperswithcode.com/paper/lipschitz-generative-adversarial-nets
Repo https://github.com/ZhimingZhou/AdaShift-Lipschitz-GANs-MaxGP
Framework tf

Spatiotemporal CNN for Video Object Segmentation

Title Spatiotemporal CNN for Video Object Segmentation
Authors Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang
Abstract In this paper, we present a unified, end-to-end trainable spatiotemporal CNN model for video object segmentation (VOS), which consists of two branches, i.e., the temporal coherence branch and the spatial segmentation branch. Specifically, the temporal coherence branch, pretrained in an adversarial fashion on unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation. The spatial segmentation branch focuses on segmenting objects accurately based on the learned appearance and motion cues. To obtain accurate segmentation results, we design a coarse-to-fine process that sequentially applies a designed attention module to multi-scale feature maps and concatenates them to produce the final prediction. In this way, the spatial segmentation branch is forced to gradually concentrate on object regions. These two branches are jointly fine-tuned on video segmentation sequences in an end-to-end manner. Several experiments are carried out on three challenging datasets (i.e., DAVIS-2016, DAVIS-2017 and Youtube-Object) to show that our method achieves favorable performance against the state of the art. Code is available at https://github.com/longyin880815/STCNN.
Tasks Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published 2019-04-04
URL http://arxiv.org/abs/1904.02363v1
PDF http://arxiv.org/pdf/1904.02363v1.pdf
PWC https://paperswithcode.com/paper/spatiotemporal-cnn-for-video-object
Repo https://github.com/longyin880815/STCNN
Framework pytorch

A Multi-task Approach for Named Entity Recognition in Social Media Data

Title A Multi-task Approach for Named Entity Recognition in Social Media Data
Authors Gustavo Aguilar, Suraj Maharjan, Adrian Pastor López-Monroy, Thamar Solorio
Abstract Named Entity Recognition for social media data is challenging because of its inherent noisiness. In addition to improper grammatical structures, it contains spelling inconsistencies and numerous informal abbreviations. We propose a novel multi-task approach by employing a more general secondary task of Named Entity (NE) segmentation together with the primary task of fine-grained NE categorization. The multi-task neural network architecture learns higher order feature representations from word and character sequences along with basic Part-of-Speech tags and gazetteer information. This neural network acts as a feature extractor to feed a Conditional Random Fields classifier. We were able to obtain the first position in the 3rd Workshop on Noisy User-generated Text (WNUT-2017) with a 41.86% entity F1-score and a 40.24% surface F1-score.
Tasks Named Entity Recognition
Published 2019-06-10
URL https://arxiv.org/abs/1906.04135v1
PDF https://arxiv.org/pdf/1906.04135v1.pdf
PWC https://paperswithcode.com/paper/a-multi-task-approach-for-named-entity-1
Repo https://github.com/tavo91/NER-WNUT17
Framework none
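
The entity F1-score reported in the abstract above is an entity-level measure: a prediction counts only if its span and type both match a gold entity. The sketch below shows that exact-match computation under assumed conventions; the tuple layout `(start, end, type)`, the function name, and the toy spans are illustrative and are not taken from the WNUT-2017 scoring script.

```python
def entity_f1(gold, pred):
    """Exact-match entity-level F1 over (start, end, type) tuples."""
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy annotations, invented for illustration.
gold_entities = [(0, 2, "person"), (5, 6, "location"), (8, 10, "group")]
pred_entities = [(0, 2, "person"), (5, 6, "person"), (12, 13, "product")]
print(entity_f1(gold_entities, pred_entities))
```

In this toy case only one of three predictions matches, because the second prediction has the right span but the wrong type. A surface-form score would instead compare the sets of distinct entity strings, so repeated mentions of the same entity count once.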