January 31, 2020


Paper Group AWR 376


DSNet: Automatic Dermoscopic Skin Lesion Segmentation


Title	DSNet: Automatic Dermoscopic Skin Lesion Segmentation
Authors	Md. Kamrul Hasan, Lavsen Dahal, Prasad N. Samarakoon, Fakrul Islam Tushar, Robert Marti Marly
Abstract	Automatic segmentation of skin lesions is considered a crucial step in Computer Aided Diagnosis (CAD) for melanoma diagnosis. Despite its significance, skin lesion segmentation remains a challenging, open problem because lesions vary in color and texture and often have indistinguishable boundaries. In this study, we present a new automatic semantic segmentation network for robust skin lesion segmentation named Dermoscopic Skin Network (DSNet). To reduce the number of parameters and make the network lightweight, we used depth-wise separable convolution in lieu of standard convolution to project the learned discriminating features onto the pixel space at different stages of the encoder. Additionally, we implemented U-Net and Fully Convolutional Network (FCN8s) to compare against the proposed DSNet. We evaluate our proposed model on two publicly available datasets, namely ISIC-2017 and PH2. The obtained mean Intersection over Union (mIoU) is 77.5 % and 87.0 % for the ISIC-2017 and PH2 datasets, respectively, which outperforms the ISIC-2017 challenge winner by 1.0 % with respect to mIoU. Our proposed network also outperformed U-Net and FCN8s by 3.6 % and 6.8 %, respectively, with respect to mIoU on the ISIC-2017 dataset. Our network for skin lesion segmentation outperforms other methods and can provide better segmented masks on two different test datasets, which can lead to better performance in melanoma detection. Our trained model, along with the source code and predicted masks, is made publicly available.
Tasks	Lesion Segmentation, Semantic Segmentation
Published	2019-07-09
URL	https://arxiv.org/abs/1907.04305v2
PDF	https://arxiv.org/pdf/1907.04305v2.pdf
PWC	https://paperswithcode.com/paper/dsnet-automatic-dermoscopic-skin-lesion
Repo	https://github.com/kamruleee51/Skin-Lesion-Segmentation-Using-Proposed-DSNet
Framework	none
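
The abstract above credits depth-wise separable convolution with making the network lightweight. A minimal sketch of the standard weight-count comparison follows; the layer sizes are hypothetical and are not taken from DSNet, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k standard convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    k, c_in, c_out = 3, 128, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

The saving grows with the number of output channels, since the k x k spatial filtering is no longer repeated for every input/output channel pair.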

Refined-Segmentation R-CNN: A Two-stage Convolutional Neural Network for Punctate White Matter Lesion Segmentation in Preterm Infants


Title	Refined-Segmentation R-CNN: A Two-stage Convolutional Neural Network for Punctate White Matter Lesion Segmentation in Preterm Infants
Authors	Yalong Liu, Jie Li, Ying Wang, Miaomiao Wang, Xianjun Li, Zhicheng Jiao, Jian Yang, Xingbo Gao
Abstract	Accurate segmentation of punctate white matter lesions (PWMLs) in infantile brains by an automatic algorithm can reduce the potential risk of postnatal development. How to segment PWMLs effectively has become one of the active topics in medical image segmentation in recent years. In this paper, we construct an efficient two-stage PWML semantic segmentation network based on the characteristics of the lesion, called refined segmentation R-CNN (RS RCNN). We propose a heuristic RPN (H-RPN) which can utilize surrounding information around the PWMLs for heuristic segmentation. We also design a lightweight segmentation network to segment the lesion quickly. A densely connected conditional random field (DCRF) is used to optimize the segmentation results. We only use T1w MRIs to segment PWMLs. The results show that our model segments lesions of ordinary size, and even pixel-sized lesions, well. The Dice similarity coefficient reaches 0.6616, the sensitivity is 0.7069, the specificity is 0.9997, and the Hausdorff distance is 52.9130. The proposed method outperforms the state-of-the-art algorithm. (The code of this paper is available at https://github.com/YalongLiu/Refined-Segmentation-R-CNN)
Tasks	Lesion Segmentation, Medical Image Segmentation, Semantic Segmentation
Published	2019-06-24
URL	https://arxiv.org/abs/1906.09684v2
PDF	https://arxiv.org/pdf/1906.09684v2.pdf
PWC	https://paperswithcode.com/paper/refined-segmentation-r-cnn-a-two-stage
Repo	https://github.com/YalongLiu/Refined-Segmentation-R-CNN
Framework	tf
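
The abstract above reports Dice, sensitivity, and specificity. As a rough illustration of how these three numbers relate, the sketch below computes them from two flattened binary masks using the usual confusion-count definitions; the toy masks are invented for the example and the authors' evaluation code may differ.

```python
def confusion_counts(pred, truth):
    """Return (tp, fp, fn, tn) for two equal-length binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred, truth):
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # two empty masks agree fully


def sensitivity(pred, truth):
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(pred, truth):
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp) if (tn + fp) else 0.0


if __name__ == "__main__":
    # Toy flattened masks (1 = lesion pixel), invented for illustration.
    pred = [1, 1, 0, 0, 0, 0, 0, 1]
    truth = [1, 0, 0, 0, 0, 0, 1, 1]
    print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
```

When lesions occupy only a few pixels, true negatives dominate the image, which is one reason specificity can sit close to 1 while Dice stays much lower.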

Handling Inter-Annotator Agreement for Automated Skin Lesion Segmentation


Title	Handling Inter-Annotator Agreement for Automated Skin Lesion Segmentation
Authors	Vinicius Ribeiro, Sandra Avila, Eduardo Valle
Abstract	In this work, we explore the issue of the inter-annotator agreement for training and evaluating automated segmentation of skin lesions. We explore what different degrees of agreement represent, and how they affect different use cases for segmentation. We also evaluate how conditioning the ground truths using different (but very simple) algorithms may help to enhance agreement and may be appropriate for some use cases. The segmentation of skin lesions is a cornerstone task for automated skin lesion analysis, useful both as an end-result to locate/detect the lesions and as an ancillary task for lesion classification. Lesion segmentation, however, is a very challenging task, due not only to the challenge of image segmentation itself but also to the difficulty in obtaining properly annotated data. Detecting accurately the borders of lesions is challenging even for trained humans, since, for many lesions, those borders are fuzzy and ill-defined. Using lesions and annotations from the ISIC Archive, we estimate inter-annotator agreement for skin-lesion segmentation and propose several simple procedures that may help to improve inter-annotator agreement if used to condition the ground truths.
Tasks	Lesion Segmentation, Semantic Segmentation
Published	2019-06-06
URL	https://arxiv.org/abs/1906.02415v1
PDF	https://arxiv.org/pdf/1906.02415v1.pdf
PWC	https://paperswithcode.com/paper/handling-inter-annotator-agreement-for
Repo	https://github.com/vribeiro1/skin-lesion-segmentation-agreement
Framework	pytorch
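
The abstract above does not say which agreement statistic is used. As one possible choice, the sketch below computes Cohen's kappa between two annotators' binary masks; treating kappa as the measure is an assumption made here for illustration, and the toy masks are invented.

```python
def cohen_kappa(mask_a, mask_b):
    """Cohen's kappa for two equal-length binary masks (1 = lesion pixel)."""
    if len(mask_a) != len(mask_b) or len(mask_a) == 0:
        raise ValueError("masks must be non-empty and of equal length")
    n = len(mask_a)
    # Observed agreement: fraction of pixels both annotators label the same.
    observed = sum(1 for a, b in zip(mask_a, mask_b) if bool(a) == bool(b)) / n
    # Chance agreement from each annotator's marginal lesion fraction.
    frac_a = sum(1 for a in mask_a if a) / n
    frac_b = sum(1 for b in mask_b if b) / n
    expected = frac_a * frac_b + (1 - frac_a) * (1 - frac_b)
    if expected == 1.0:
        return 1.0  # both masks are constant and identical
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator_a = [1, 1, 1, 0, 0, 0, 0, 0]
    annotator_b = [1, 1, 0, 0, 0, 0, 0, 1]
    print(round(cohen_kappa(annotator_a, annotator_b), 4))
```

Unlike raw pixel agreement, kappa discounts the agreement expected by chance, which matters when most of the image is background.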

A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations


Title	A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations
Authors	Saeid Asgari Taghanaki, Kumar Abhishek, Shekoofeh Azizi, Ghassan Hamarneh
Abstract	The linear and non-flexible nature of deep convolutional models makes them vulnerable to carefully crafted adversarial perturbations. To tackle this problem, we propose a non-linear radial basis convolutional feature mapping by learning a Mahalanobis-like distance function. Our method then maps the convolutional features onto a linearly well-separated manifold, which prevents small adversarial perturbations from forcing a sample to cross the decision boundary. We test the proposed method on three publicly available image classification and segmentation datasets namely, MNIST, ISBI ISIC 2017 skin lesion segmentation, and NIH Chest X-Ray-14. We evaluate the robustness of our method to different gradient (targeted and untargeted) and non-gradient based attacks and compare it to several non-gradient masking defense strategies. Our results demonstrate that the proposed method can increase the resilience of deep convolutional neural networks to adversarial perturbations without accuracy drop on clean data.
Tasks	Image Classification, Lesion Segmentation
Published	2019-03-03
URL	https://arxiv.org/abs/1903.01015v2
PDF	https://arxiv.org/pdf/1903.01015v2.pdf
PWC	https://paperswithcode.com/paper/a-kernelized-manifold-mapping-to-diminish-the
Repo	https://github.com/asgsaeid/KernelizedManifoldMapping
Framework	none

Max-margin Class Imbalanced Learning with Gaussian Affinity


Title	Max-margin Class Imbalanced Learning with Gaussian Affinity
Authors	Munawar Hayat, Salman Khan, Waqas Zamir, Jianbing Shen, Ling Shao
Abstract	Real-world object classes appear in imbalanced ratios. This poses a significant challenge for classifiers which get biased towards frequent classes. We hypothesize that improving the generalization capability of a classifier should improve learning on imbalanced datasets. Here, we introduce the first hybrid loss function that jointly performs classification and clustering in a single formulation. Our approach is based on an 'affinity measure' in Euclidean space that leads to the following benefits: (1) direct enforcement of maximum margin constraints on classification boundaries, (2) a tractable way to ensure uniformly spaced and equidistant cluster centers, (3) flexibility to learn multiple class prototypes to support diversity and discriminability in feature space. Our extensive experiments demonstrate the significant performance improvements on visual classification and verification tasks on multiple imbalanced datasets. The proposed loss can easily be plugged in any deep architecture as a differentiable block and demonstrates robustness against different levels of data imbalance and corrupted labels.
Tasks
Published	2019-01-23
URL	http://arxiv.org/abs/1901.07711v1
PDF	http://arxiv.org/pdf/1901.07711v1.pdf
PWC	https://paperswithcode.com/paper/max-margin-class-imbalanced-learning-with
Repo	https://github.com/koshian2/affinity-loss
Framework	tf
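
To make the idea of an affinity measure with a margin constraint concrete, here is a minimal sketch. It assumes a Gaussian similarity exp(-||f - w||^2 / sigma) between a feature and a class prototype, combined with a hinge-style margin term; the exact formulation in the paper may differ, and `sigma`, `margin`, and the function names are hypothetical.

```python
import math


def gaussian_affinity(feature, prototype, sigma=1.0):
    """Similarity in (0, 1]; equals 1 when feature and prototype coincide."""
    sq_dist = sum((f - w) ** 2 for f, w in zip(feature, prototype))
    return math.exp(-sq_dist / sigma)


def max_margin_term(feature, prototypes, label, margin=0.5, sigma=1.0):
    """Penalize any wrong-class affinity that comes within `margin` of the
    affinity to the true-class prototype (assumed hinge form)."""
    sims = [gaussian_affinity(feature, w, sigma) for w in prototypes]
    target = sims[label]
    return sum(
        max(0.0, margin + s - target)
        for j, s in enumerate(sims)
        if j != label
    )


if __name__ == "__main__":
    prototypes = [[0.0, 0.0], [10.0, 10.0]]  # one toy prototype per class
    feature = [0.0, 0.0]
    print(max_margin_term(feature, prototypes, label=0))  # correct class
    print(max_margin_term(feature, prototypes, label=1))  # wrong class
```

A feature sitting on its own class prototype incurs no penalty, while the same feature labeled with the distant class is penalized by roughly the margin plus the affinity gap.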

Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)


Title	Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)
Authors	Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael Marchetti, Harald Kittler, Allan Halpern
Abstract	This work summarizes the results of the largest skin image analysis challenge in the world, hosted by the International Skin Imaging Collaboration (ISIC), a global partnership that has organized the world’s largest public repository of dermoscopic images of skin. The challenge was hosted in 2018 at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference in Granada, Spain. The dataset included over 12,500 images across 3 tasks. 900 users registered for data download, 115 submitted to the lesion segmentation task, 25 submitted to the lesion attribute detection task, and 159 submitted to the disease classification task. Novel evaluation protocols were established, including a new test for segmentation algorithm performance, and a test for algorithm ability to generalize. Results show that top segmentation algorithms still fail on over 10% of images on average, and algorithms with equal performance on test data can have different abilities to generalize. This is an important consideration for agencies regulating the growing set of machine learning tools in the healthcare domain, and sets a new standard for future public challenges in healthcare.
Tasks	Lesion Segmentation
Published	2019-02-09
URL	http://arxiv.org/abs/1902.03368v2
PDF	http://arxiv.org/pdf/1902.03368v2.pdf
PWC	https://paperswithcode.com/paper/skin-lesion-analysis-toward-melanoma-1
Repo	https://github.com/kianoush/Skin_Cancer_CNN
Framework	none

Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?


Title	Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?
Authors	Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
Abstract	Recent efforts in cross-lingual word embedding (CLWE) learning have predominantly focused on fully unsupervised approaches that project monolingual embeddings into a shared cross-lingual space without any cross-lingual signal. The lack of any supervision makes such approaches conceptually attractive. Yet, their only core difference from (weakly) supervised projection-based CLWE methods is in the way they obtain a seed dictionary used to initialize an iterative self-learning procedure. The fully unsupervised methods have arguably become more robust, and their primary use case is CLWE induction for pairs of resource-poor and distant languages. In this paper, we question the ability of even the most robust unsupervised CLWE approaches to induce meaningful CLWEs in these more challenging settings. A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs). Even when they succeed, they never surpass the performance of weakly supervised methods (seeded with 500-1,000 translation pairs) using the same self-learning procedure in any BLI setup, and the gaps are often substantial. These findings call for revisiting the main motivations behind fully unsupervised CLWE methods.
Tasks
Published	2019-09-04
URL	https://arxiv.org/abs/1909.01638v1
PDF	https://arxiv.org/pdf/1909.01638v1.pdf
PWC	https://paperswithcode.com/paper/do-we-really-need-fully-unsupervised-cross
Repo	https://github.com/cambridgeltl/panlex-bli
Framework	none
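
Bilingual lexicon induction scores a shared embedding space by checking whether each source word retrieves its gold translation. The sketch below uses cosine nearest-neighbour retrieval and precision at 1 as a simple stand-in; the abstract does not name the metric, and the two-dimensional vectors and word pairs are invented for illustration.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def induce_translation(src_vec, tgt_embeddings):
    """Return the target word whose vector is closest to src_vec."""
    return max(tgt_embeddings, key=lambda w: cosine(src_vec, tgt_embeddings[w]))


def precision_at_1(src_embeddings, tgt_embeddings, gold):
    """Fraction of gold pairs whose nearest target neighbour is correct."""
    hits = sum(
        1
        for src_word, tgt_word in gold.items()
        if induce_translation(src_embeddings[src_word], tgt_embeddings) == tgt_word
    )
    return hits / len(gold)


if __name__ == "__main__":
    # Toy vectors assumed to be already projected into one shared space.
    src = {"dog": [1.0, 0.1], "cat": [0.1, 1.0]}
    tgt = {"koira": [0.9, 0.2], "kissa": [0.2, 0.9]}
    gold = {"dog": "koira", "cat": "kissa"}
    print(precision_at_1(src, tgt, gold))
```

A score of zero for a language pair, as reported in the abstract for 87 of 210 pairs, would mean no gold translation is retrieved at all.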

A Finnish News Corpus for Named Entity Recognition


Title	A Finnish News Corpus for Named Entity Recognition
Authors	Teemu Ruokolainen, Pekka Kauppinen, Miikka Silfverberg, Krister Lindén
Abstract	We present a corpus of Finnish news articles with a manually prepared named entity annotation. The corpus consists of 953 articles (193,742 word tokens) with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The corpus is available for research purposes. We present baseline experiments on the corpus using a rule-based and two deep learning systems on two, in-domain and out-of-domain, test sets.
Tasks	Named Entity Recognition
Published	2019-08-12
URL	https://arxiv.org/abs/1908.04212v1
PDF	https://arxiv.org/pdf/1908.04212v1.pdf
PWC	https://paperswithcode.com/paper/a-finnish-news-corpus-for-named-entity
Repo	https://github.com/TurkuNLP/FinBERT
Framework	pytorch

Deep Learning for 3D Point Clouds: A Survey


Title	Deep Learning for 3D Point Clouds: A Survey
Authors	Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, Mohammed Bennamoun
Abstract	Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominant technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges of processing point clouds with deep neural networks. Recently, deep learning on point clouds has been thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.
Tasks	3D Object Detection, Autonomous Driving, Object Detection
Published	2019-12-27
URL	https://arxiv.org/abs/1912.12033v1
PDF	https://arxiv.org/pdf/1912.12033v1.pdf
PWC	https://paperswithcode.com/paper/deep-learning-for-3d-point-clouds-a-survey
Repo	https://github.com/QingyongHu/Benchmark_results_3D_point_cloud
Framework	none

Temporal Unet: Sample Level Human Action Recognition using WiFi


Title	Temporal Unet: Sample Level Human Action Recognition using WiFi
Authors	Fei Wang, Yunpeng Song, Jimuyang Zhang, Jinsong Han, Dong Huang
Abstract	Human actions distort WiFi signals, an effect that has been widely explored for action recognition tasks such as elderly fall detection, hand sign language recognition, and keystroke estimation. To the best of our knowledge, past work recognizes human action by categorizing one complete distortion series into one action, which we term series-level action recognition. In this paper, we introduce a much more fine-grained and challenging action recognition task into the WiFi sensing domain, i.e., sample-level action recognition. In this task, every WiFi distortion sample in the whole series should be categorized into one action, which is a critical technique for precise action localization, continuous action segmentation, and real-time action recognition. To achieve WiFi-based sample-level action recognition, we fully analyze approaches in image-based semantic segmentation as well as in video-based frame-level action recognition, then propose a simple yet efficient deep convolutional neural network, i.e., Temporal Unet. Experimental results show that Temporal Unet performs well on this novel task. Code has been made publicly available at https://github.com/geekfeiw/WiSLAR.
Tasks	Action Localization, Action Segmentation, Semantic Segmentation, Sign Language Recognition, Temporal Action Localization
Published	2019-04-19
URL	http://arxiv.org/abs/1904.11953v1
PDF	http://arxiv.org/pdf/1904.11953v1.pdf
PWC	https://paperswithcode.com/paper/190411953
Repo	https://github.com/geekfeiw/WiSLAR
Framework	pytorch
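
Sample-level recognition assigns one label to every sample in a series, and action localization then follows by grouping consecutive identical labels. The sketch below shows only that grouping step; the label names are invented and this is not the authors' pipeline.

```python
from itertools import groupby


def labels_to_segments(labels):
    """Collapse per-sample labels into (label, start, end) runs, end exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


if __name__ == "__main__":
    # Hypothetical per-sample predictions for one WiFi distortion series.
    predicted = ["none", "none", "wave", "wave", "wave", "none", "fall", "fall"]
    for label, start, end in labels_to_segments(predicted):
        print(f"{label}: samples {start}..{end - 1}")
```

A series-level classifier would return a single label for the whole list, so the start and end of each action would be lost.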

Distributional Policy Optimization: An Alternative Approach for Continuous Control


Title	Distributional Policy Optimization: An Alternative Approach for Continuous Control
Authors	Chen Tessler, Guy Tennenholtz, Shie Mannor
Abstract	We identify a fundamental problem in policy gradient-based methods in continuous control. As policy gradient methods require the agent’s underlying probability distribution, they limit policy representation to parametric distribution classes. We show that optimizing over such sets results in local movement in the action space and thus convergence to sub-optimal solutions. We suggest a novel distributional framework, able to represent arbitrary distribution functions over the continuous action space. Using this framework, we construct a generative scheme, trained using an off-policy actor-critic paradigm, which we call the Generative Actor Critic (GAC). Compared to policy gradient methods, GAC does not require knowledge of the underlying probability distribution, thereby overcoming these limitations. Empirical evaluation shows that our approach is comparable to, and often surpasses, current state-of-the-art baselines in continuous domains.
Tasks	Continuous Control, Policy Gradient Methods
Published	2019-05-23
URL	https://arxiv.org/abs/1905.09855v2
PDF	https://arxiv.org/pdf/1905.09855v2.pdf
PWC	https://paperswithcode.com/paper/distributional-policy-optimization-an
Repo	https://github.com/tesslerc/GAC
Framework	pytorch

Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings


Title	Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Authors	Dhruva Sahrawat, Debanjan Mahata, Mayank Kulkarni, Haimin Zhang, Rakesh Gosangi, Amanda Stent, Agniv Sharma, Yaman Kumar, Rajiv Ratn Shah, Roger Zimmermann
Abstract	In this paper, we formulate keyphrase extraction from scholarly articles as a sequence labeling task solved using a BiLSTM-CRF, where the words in the input text are represented using deep contextualized embeddings. We evaluate the proposed architecture using both contextualized and fixed word embedding models on three different benchmark datasets (Inspec, SemEval 2010, SemEval 2017) and compare with existing popular unsupervised and supervised techniques. Our results quantify the benefits of (a) using contextualized embeddings (e.g. BERT) over fixed word embeddings (e.g. Glove); (b) using a BiLSTM-CRF architecture with contextualized word embeddings over fine-tuning the contextualized word embedding model directly, and (c) using genre-specific contextualized embeddings (SciBERT). Through error analysis, we also provide some insights into why particular models work better than others. Lastly, we present a case study where we analyze different self-attention layers of the two best models (BERT and SciBERT) to better understand the predictions made by each for the task of keyphrase extraction.
Tasks	Word Embeddings
Published	2019-10-19
URL	https://arxiv.org/abs/1910.08840v1
PDF	https://arxiv.org/pdf/1910.08840v1.pdf
PWC	https://paperswithcode.com/paper/keyphrase-extraction-from-scholarly-articles
Repo	https://github.com/midas-research/keyphrase-extraction-as-sequence-labeling-data
Framework	none
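
Casting keyphrase extraction as sequence labeling means turning each document's gold keyphrases into one tag per token. The sketch below assumes a plain B/I/O scheme with case-insensitive exact matching; the tag names and matching rule are assumptions for illustration and may differ from the released dataset.

```python
def bio_tags(tokens, keyphrases):
    """Tag each token as B (begins a keyphrase), I (inside one), or O (outside)."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for phrase in keyphrases:
        words = phrase.lower().split()
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(tag == "O" for tag in tags[i:i + n])
            if lowered[i:i + n] == words and window_free:
                tags[i] = "B"
                for j in range(i + 1, i + n):
                    tags[j] = "I"
    return tags


if __name__ == "__main__":
    tokens = "We use contextualized embeddings for keyphrase extraction".split()
    keyphrases = ["contextualized embeddings", "keyphrase extraction"]
    for token, tag in zip(tokens, bio_tags(tokens, keyphrases)):
        print(f"{token}\t{tag}")
```

A tagger such as a BiLSTM-CRF is then trained to predict this tag sequence from the token embeddings.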

Lipschitz Generative Adversarial Nets


Title	Lipschitz Generative Adversarial Nets
Authors	Zhiming Zhou, Jiadong Liang, Yuxuan Song, Lantao Yu, Hongwei Wang, Weinan Zhang, Yong Yu, Zhihua Zhang
Abstract	In this paper, we study the convergence of generative adversarial networks (GANs) from the perspective of the informativeness of the gradient of the optimal discriminative function. We show that GANs without restriction on the discriminative function space commonly suffer from the problem that the gradient produced by the discriminator is uninformative to guide the generator. By contrast, Wasserstein GAN (WGAN), where the discriminative function is restricted to 1-Lipschitz, does not suffer from such a gradient uninformativeness problem. We further show in the paper that the model with a compact dual form of Wasserstein distance, where the Lipschitz condition is relaxed, may also theoretically suffer from this issue. This implies the importance of Lipschitz condition and motivates us to study the general formulation of GANs with Lipschitz constraint, which leads to a new family of GANs that we call Lipschitz GANs (LGANs). We show that LGANs guarantee the existence and uniqueness of the optimal discriminative function as well as the existence of a unique Nash equilibrium. We prove that LGANs are generally capable of eliminating the gradient uninformativeness problem. According to our empirical analysis, LGANs are more stable and generate consistently higher quality samples compared with WGAN.
Tasks
Published	2019-02-15
URL	https://arxiv.org/abs/1902.05687v4
PDF	https://arxiv.org/pdf/1902.05687v4.pdf
PWC	https://paperswithcode.com/paper/lipschitz-generative-adversarial-nets
Repo	https://github.com/ZhimingZhou/AdaShift-Lipschitz-GANs-MaxGP
Framework	tf

Spatiotemporal CNN for Video Object Segmentation


Title	Spatiotemporal CNN for Video Object Segmentation
Authors	Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang
Abstract	In this paper, we present a unified, end-to-end trainable spatiotemporal CNN model for video object segmentation (VOS), which consists of two branches, i.e., the temporal coherence branch and the spatial segmentation branch. Specifically, the temporal coherence branch, pretrained in an adversarial fashion on unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation. The spatial segmentation branch focuses on segmenting objects accurately based on the learned appearance and motion cues. To obtain accurate segmentation results, we design a coarse-to-fine process that sequentially applies a designed attention module to multi-scale feature maps and concatenates them to produce the final prediction. In this way, the spatial segmentation branch is forced to gradually concentrate on object regions. These two branches are jointly fine-tuned on video segmentation sequences in an end-to-end manner. Several experiments are carried out on three challenging datasets (i.e., DAVIS-2016, DAVIS-2017 and Youtube-Object) to show that our method achieves favorable performance against the state of the art. Code is available at https://github.com/longyin880815/STCNN.
Tasks	Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation
Published	2019-04-04
URL	http://arxiv.org/abs/1904.02363v1
PDF	http://arxiv.org/pdf/1904.02363v1.pdf
PWC	https://paperswithcode.com/paper/spatiotemporal-cnn-for-video-object
Repo	https://github.com/longyin880815/STCNN
Framework	pytorch

A Multi-task Approach for Named Entity Recognition in Social Media Data


Title	A Multi-task Approach for Named Entity Recognition in Social Media Data
Authors	Gustavo Aguilar, Suraj Maharjan, Adrian Pastor López-Monroy, Thamar Solorio
Abstract	Named Entity Recognition for social media data is challenging because of its inherent noisiness. In addition to improper grammatical structures, it contains spelling inconsistencies and numerous informal abbreviations. We propose a novel multi-task approach by employing a more general secondary task of Named Entity (NE) segmentation together with the primary task of fine-grained NE categorization. The multi-task neural network architecture learns higher order feature representations from word and character sequences along with basic Part-of-Speech tags and gazetteer information. This neural network acts as a feature extractor to feed a Conditional Random Fields classifier. We were able to obtain the first position in the 3rd Workshop on Noisy User-generated Text (WNUT-2017) with a 41.86% entity F1-score and a 40.24% surface F1-score.
Tasks	Named Entity Recognition
Published	2019-06-10
URL	https://arxiv.org/abs/1906.04135v1
PDF	https://arxiv.org/pdf/1906.04135v1.pdf
PWC	https://paperswithcode.com/paper/a-multi-task-approach-for-named-entity-1
Repo	https://github.com/tavo91/NER-WNUT17
Framework	none
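
The entity F1-score quoted above is a span-level metric. A minimal sketch of exact-match entity F1 is shown below, assuming each entity is a (sentence_id, start, end, type) tuple; this representation and the toy spans are assumptions, and the surface-form variant of the score is not covered.

```python
def entity_f1(gold, predicted):
    """Exact-match F1 over entity spans given as hashable tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical spans: (sentence_id, start_token, end_token, entity_type).
    gold = {(0, 0, 2, "person"), (0, 5, 6, "location"), (1, 3, 4, "product")}
    predicted = {(0, 0, 2, "person"), (0, 5, 6, "person"), (1, 3, 4, "product")}
    print(round(entity_f1(gold, predicted), 4))
```

Under exact matching, a span with the right boundaries but the wrong type counts as both a false positive and a false negative.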