January 29, 2020

Paper Group ANR 768

Truly Generalizable Radiograph Segmentation with Conditional Domain Adaptation

Title Truly Generalizable Radiograph Segmentation with Conditional Domain Adaptation
Authors Hugo Oliveira, Edemir Ferreira, Jefersson A. dos Santos
Abstract Digitization techniques for biomedical images yield different visual patterns in radiological exams. These differences may hamper the use of data-driven approaches for inference over these images, such as Deep Neural Networks. Another noticeable difficulty in this field is the lack of labeled data, even though in many cases there is an abundance of unlabeled data available. Therefore an important step in improving the generalization capabilities of these methods is to perform Unsupervised and Semi-Supervised Domain Adaptation between different datasets of biomedical images. In order to tackle this problem, in this work we propose an Unsupervised and Semi-Supervised Domain Adaptation method for segmentation of biomedical images using Generative Adversarial Networks for Unsupervised Image Translation. We merge these unsupervised networks with supervised deep semantic segmentation architectures in order to create a semi-supervised method capable of learning from both unlabeled and labeled data, whenever labeling is available. We compare our method using several domains, datasets, segmentation tasks and traditional baselines, such as unsupervised distance-based methods and reusing pretrained models both with and without Fine-tuning. We perform both quantitative and qualitative analysis of the proposed method and baselines in the distinct scenarios considered in our experimental evaluation. The proposed method shows consistently better results than the baselines in scarce labeled data scenarios, achieving Jaccard values greater than 0.9 and good segmentation quality in most tasks. Unsupervised Domain Adaptation results were observed to be close to the Fully Supervised Domain Adaptation used in the traditional procedure of Fine-tuning pretrained networks.
Tasks Domain Adaptation, Semantic Segmentation, Transfer Learning, Unsupervised Domain Adaptation
Published 2019-01-16
URL https://arxiv.org/abs/1901.05553v4
PDF https://arxiv.org/pdf/1901.05553v4.pdf
PWC https://paperswithcode.com/paper/conditional-domain-adaptation-gans-for
Repo
Framework
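
The abstract above reports Jaccard values as its segmentation metric. As a reminder of what that number measures, here is a minimal sketch of the Jaccard index (intersection over union) for two binary masks. The flat-list mask format, the function name, and the empty-mask convention are assumptions made for illustration; this is not the authors' evaluation code.

```python
def jaccard_index(pred, target):
    """Jaccard index (intersection over union) of two binary masks.

    Both masks are flat sequences of 0/1 values of equal length.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Convention (an assumption): two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


pred = [0, 1, 1, 1, 0, 0]
target = [0, 1, 1, 0, 1, 0]
score = jaccard_index(pred, target)  # intersection 2, union 4
```

A value above 0.9, as reported in the abstract, means the predicted and reference masks overlap on more than 90% of their combined area.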

Sign Language Recognition Analysis using Multimodal Data

Title Sign Language Recognition Analysis using Multimodal Data
Authors Al Amin Hosain, Panneer Selvam Santhalingam, Parth Pathak, Jana Kosecka, Huzefa Rangwala
Abstract Voice-controlled personal and home assistants (such as the Amazon Echo and Apple Siri) are becoming increasingly popular for a variety of applications. However, the benefits of these technologies are not readily accessible to Deaf or Hard-of-Hearing (DHH) users. The objective of this study is to develop and evaluate a sign recognition system using multiple modalities that can be used by DHH signers to interact with voice-controlled devices. With the advancement of depth sensors, skeletal data is used for applications like video analysis and activity recognition. Despite its similarity to the well-studied problem of human activity recognition, the use of 3D skeleton data in sign language recognition is rare. This is because, unlike activity recognition, sign language is mostly dependent on hand shape pattern. In this work, we investigate the feasibility of using skeletal and RGB video data for sign language recognition using a combination of different deep learning architectures. We validate our results on a large-scale American Sign Language (ASL) dataset of 12 users and 13107 samples across 51 signs, named GMUASL51. We collected the dataset over 6 months and it will be publicly released in the hope of spurring further machine learning research towards providing improved accessibility for digital assistants.
Tasks Activity Recognition, Human Activity Recognition, Sign Language Recognition
Published 2019-09-24
URL https://arxiv.org/abs/1909.11232v1
PDF https://arxiv.org/pdf/1909.11232v1.pdf
PWC https://paperswithcode.com/paper/sign-language-recognition-analysis-using
Repo
Framework

MinCall - MinION end2end convolutional deep learning basecaller

Title MinCall - MinION end2end convolutional deep learning basecaller
Authors Neven Miculinić, Marko Ratković, Mile Šikić
Abstract Oxford Nanopore Technologies’ MinION is the first portable DNA sequencing device. It is capable of producing long reads; reads of over 100 kbp have been reported. However, it has a significantly higher error rate than other methods. In this study, we present MinCall, an end2end basecaller model for the MinION. The model is based on deep learning and uses convolutional neural networks (CNN) in its implementation. For extra performance, it uses cutting-edge deep learning techniques and architectures, batch normalization and Connectionist Temporal Classification (CTC) loss. The best performing deep learning model achieves a 91.4% median match rate on an E. coli dataset using R9 pore chemistry and 1D reads.
Tasks
Published 2019-04-22
URL http://arxiv.org/abs/1904.10337v1
PDF http://arxiv.org/pdf/1904.10337v1.pdf
PWC https://paperswithcode.com/paper/mincall-minion-end2end-convolutional-deep
Repo
Framework

AI-based Pilgrim Detection using Convolutional Neural Networks

Title AI-based Pilgrim Detection using Convolutional Neural Networks
Authors Marwa Ben Jabra, Adel Ammar, Anis Koubaa, Omar Cheikhrouhou, Habib Hamam
Abstract Pilgrimage represents the most important Islamic religious gathering in the world where millions of pilgrims visit the holy places of Makkah and Madinah to perform their rituals. The safety and security of pilgrims is the highest priority for the authorities. In Makkah, 5000 cameras are spread around the holy for monitoring pilgrims, but it is almost impossible for humans to track all events considering the huge number of images collected every second. To address this issue, we propose to use an artificial intelligence technique based on deep learning and convolutional neural networks to detect and identify pilgrims and their features. For this purpose, we built a comprehensive dataset for the detection of pilgrims and their genders. Then, we develop two convolutional neural networks based on YOLOv3 and Faster-RCNN for the detection of pilgrims. Experimental results show that Faster-RCNN with an Inception v2 feature extractor provides the best mean average precision over all classes, at 51%.
Tasks
Published 2019-11-18
URL https://arxiv.org/abs/1911.07509v2
PDF https://arxiv.org/pdf/1911.07509v2.pdf
PWC https://paperswithcode.com/paper/ai-based-pilgrim-detection-using
Repo
Framework

Obj-GloVe: Scene-Based Contextual Object Embedding

Title Obj-GloVe: Scene-Based Contextual Object Embedding
Authors Canwen Xu, Zhenzhong Chen, Chenliang Li
Abstract Recently, with the prevalence of large-scale image datasets, the co-occurrence information among classes has become rich, calling for a new way to exploit it to facilitate inference. In this paper, we propose Obj-GloVe, a generic scene-based contextual embedding for common visual objects, where we adopt the word embedding method GloVe to exploit the co-occurrence between entities. We train the embedding on the pre-processed Open Images V4 dataset and provide extensive visualization and analysis by dimensionality reduction, by projecting the vectors along a specific semantic axis, and by showcasing the nearest neighbors of the most common objects. Furthermore, we reveal the potential applications of Obj-GloVe to object detection and text-to-image synthesis, then verify its effectiveness on these two applications respectively.
Tasks Dimensionality Reduction, Image Generation, Object Detection
Published 2019-07-02
URL https://arxiv.org/abs/1907.01478v1
PDF https://arxiv.org/pdf/1907.01478v1.pdf
PWC https://paperswithcode.com/paper/obj-glove-scene-based-contextual-object
Repo
Framework
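
GloVe fits embeddings to co-occurrence statistics, so the input that a scene-based object embedding needs is a table of how often two object labels appear in the same image. The sketch below builds such a table from per-image label lists. The toy scenes, the function name, and the choice to count each pair at most once per image are assumptions for illustration; the abstract does not specify the authors' pre-processing or weighting.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scenes):
    """Count how often each unordered pair of object labels shares a scene."""
    counts = Counter()
    for labels in scenes:
        # Deduplicate so a label pair counts at most once per scene.
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical scenes: each inner list holds the object labels of one image.
scenes = [
    ["person", "dog", "ball"],
    ["person", "dog"],
    ["car", "person"],
]
counts = cooccurrence_counts(scenes)
```

Pairs are stored with their labels in alphabetical order, so `counts[("dog", "person")]` is the lookup for that pair regardless of the order in which the labels appeared in a scene.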

Fake news detection using Deep Learning

Title Fake news detection using Deep Learning
Authors Álvaro Ibrain Rodríguez, Lara Lloret Iglesias
Abstract The evolution of information and communication technologies has dramatically increased the number of people with access to the Internet, which has changed the way information is consumed. As a consequence, fake news has become one of the major concerns because of its potential to destabilize governments, which makes it a potential danger to modern society. An example of this can be found in the US electoral campaign, where the term “fake news” gained great notoriety due to the influence of hoaxes on the final result. In this work, the feasibility of applying deep learning techniques to discriminate fake news on the Internet using only their text is studied. In order to accomplish that, three different neural network architectures are proposed, one of them based on BERT, a modern language model created by Google which achieves state-of-the-art results.
Tasks Fake News Detection, Language Modelling
Published 2019-09-29
URL https://arxiv.org/abs/1910.03496v2
PDF https://arxiv.org/pdf/1910.03496v2.pdf
PWC https://paperswithcode.com/paper/fake-news-detection-using-deep-learning
Repo
Framework

ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels

Title ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels
Authors Chinot Geoffrey
Abstract We study Empirical Risk Minimizers (ERM) and Regularized Empirical Risk Minimizers (RERM) for regression problems with convex and $L$-Lipschitz loss functions. We consider a setting where $\mathcal O$ malicious outliers may contaminate the labels. In that case, we show that the $L_2$-error rate is bounded by $r_N + L \mathcal O/N$, where $N$ is the total number of observations and $r_N$ is the $L_2$-error rate in the non-contaminated setting. When $r_N$ is minimax-rate-optimal in a non-contaminated setting, the rate $r_N + L\mathcal O/N$ is also minimax-rate-optimal when $\mathcal O$ outliers contaminate the labels. The main results of the paper can be used for many non-regularized and regularized procedures under weak assumptions on the noise. For instance, we present results for Huber’s M-estimators (without penalization or regularized by the $\ell_1$-norm) and for general regularized learning problems in reproducing kernel Hilbert spaces.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.10923v1
PDF https://arxiv.org/pdf/1910.10923v1.pdf
PWC https://paperswithcode.com/paper/erm-and-rerm-are-optimal-estimators-for
Repo
Framework
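
As a rough illustration of how the two terms in the bound trade off, the rate stated in the abstract can be written out with hypothetical numbers. The values of $N$, $\mathcal O$, $L$ and $r_N$ below are assumptions chosen only for the arithmetic, and constants are ignored.

```latex
\[
  \text{($L_2$-error rate)} \;\lesssim\; r_N + \frac{L\,\mathcal O}{N}
\]
\[
  N = 10^4,\quad \mathcal O = 100,\quad L = 1
  \;\Longrightarrow\;
  \frac{L\,\mathcal O}{N} = \frac{100}{10^4} = 10^{-2}
\]
```

If $r_N$ were of order $N^{-1/2} = 10^{-2}$ (an assumption, not a claim from the paper), the contamination term would be of the same order, so corrupting about 1% of the labels would not change the order of the rate in this example.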

GAMIN: An Adversarial Approach to Black-Box Model Inversion

Title GAMIN: An Adversarial Approach to Black-Box Model Inversion
Authors Ulrich Aïvodji, Sébastien Gambs, Timon Ther
Abstract Recent works have demonstrated that machine learning models are vulnerable to model inversion attacks, which lead to the exposure of sensitive information contained in their training dataset. While some model inversion attacks have been developed in the past in the black-box attack setting, in which the adversary does not have direct access to the structure of the model, few of these have been conducted so far against complex models such as deep neural networks. In this paper, we introduce GAMIN (for Generative Adversarial Model INversion), a new black-box model inversion attack framework achieving significant results even against deep models such as convolutional neural networks at a reasonable computing cost. GAMIN is based on the continuous training of a surrogate model for the target model under attack and a generator whose objective is to generate inputs resembling those used to train the target model. The attack was validated against various neural networks used as image classifiers. In particular, when attacking models trained on the MNIST dataset, GAMIN is able to extract recognizable digits for up to 60% of labels produced by the target. Attacks against skin classification models trained on the pilot parliament dataset also demonstrated the capacity to extract recognizable features from the targets.
Tasks
Published 2019-09-26
URL https://arxiv.org/abs/1909.11835v1
PDF https://arxiv.org/pdf/1909.11835v1.pdf
PWC https://paperswithcode.com/paper/gamin-an-adversarial-approach-to-black-box
Repo
Framework

Nature Inspired Dimensional Reduction Technique for Fast and Invariant Visual Feature Extraction

Title Nature Inspired Dimensional Reduction Technique for Fast and Invariant Visual Feature Extraction
Authors Ravimal Bandara, Lochandaka Ranathunga, Nor Aniza Abdullah
Abstract Fast and invariant feature extraction is crucial in certain computer vision applications where the computation time is constrained in both the training and testing phases of the classifier. In this paper, we propose a nature-inspired dimensionality reduction technique for fast and invariant visual feature extraction. The human brain can exchange spatial and spectral resolution to reconstruct missing colors in visual perception. The phenomenon is widely used in the printing industry to reduce the number of colors used to print, through a technique called color dithering. In this work, we adopt a fast error-diffusion color dithering algorithm to reduce the spectral resolution and extract salient features by employing a novel Hessian matrix analysis technique; the result is then described by a spatial-chromatic histogram. The computation time, descriptor dimensionality and classification performance of the proposed feature are assessed under drastic variances in orientation, viewing angle and illumination of objects, compared with several different state-of-the-art handcrafted and deep-learned features. Extensive experiments on two publicly available object datasets, COIL-100 and ALOI, carried out on both a desktop PC and a Raspberry Pi device, show multiple advantages of using the proposed approach, such as lower computation time, high robustness, and comparable classification accuracy in a weakly supervised environment. Further, it showed the capability of operating solely inside a conventional SoC device, utilizing a small fraction of the available hardware resources.
Tasks Dimensionality Reduction
Published 2019-07-01
URL https://arxiv.org/abs/1907.01102v1
PDF https://arxiv.org/pdf/1907.01102v1.pdf
PWC https://paperswithcode.com/paper/nature-inspired-dimensional-reduction
Repo
Framework
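
To make the error-diffusion idea concrete, here is a minimal sketch of Floyd-Steinberg dithering on a grayscale image. This is a stand-in chosen for illustration: the abstract does not say which error-diffusion algorithm the authors use, and their method works on color rather than grayscale. The list-of-rows image format and the function name are assumptions.

```python
def floyd_steinberg(image, levels=2):
    """Quantize a grayscale image (rows of 0-255 values) with error diffusion."""
    height, width = len(image), len(image[0])
    work = [[float(v) for v in row] for row in image]  # working copy
    step = 255.0 / (levels - 1)
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Snap to the nearest allowed level, clamped to the valid range.
            new = min(255.0, max(0.0, round(old / step) * step))
            out[y][x] = int(round(new))
            err = old - new
            # Push the quantization error onto neighbours not yet visited.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


gray = [[128] * 4 for _ in range(4)]  # uniform mid-gray test image
dithered = floyd_steinberg(gray)
```

With two levels, every output pixel is either 0 or 255, and a uniform mid-gray input comes out as a mix of both. That is the trade of spectral resolution for spatial resolution that the abstract describes.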

Exploiting Multi-domain Visual Information for Fake News Detection

Title Exploiting Multi-domain Visual Information for Fake News Detection
Authors Peng Qi, Juan Cao, Tianyun Yang, Junbo Guo, Jintao Li
Abstract The increasing popularity of social media promotes the proliferation of fake news. With the development of multimedia technology, fake news attempts to utilize multimedia content with images or videos to attract and mislead readers for rapid dissemination, which makes visual content an important part of fake news. Fake-news images, images attached to fake news posts, include not only fake images which are maliciously tampered with but also real images which are wrongly used to represent irrelevant events. Hence, how to fully exploit the inherent characteristics of fake-news images is an important but challenging problem for fake news detection. In the real world, fake-news images may have significantly different characteristics from real-news images at both physical and semantic levels, which can be clearly reflected in the frequency and pixel domain, respectively. Therefore, we propose a novel framework, Multi-domain Visual Neural Network (MVNN), to fuse the visual information of frequency and pixel domains for detecting fake news. Specifically, we design a CNN-based network to automatically capture the complex patterns of fake-news images in the frequency domain, and utilize a multi-branch CNN-RNN model to extract visual features from different semantic levels in the pixel domain. An attention mechanism is utilized to fuse the feature representations of frequency and pixel domains dynamically. Extensive experiments conducted on a real-world dataset demonstrate that MVNN outperforms existing methods by at least 9.2% in accuracy, and can help improve the performance of multimodal fake news detection by over 5.2%.
Tasks Fake News Detection
Published 2019-08-13
URL https://arxiv.org/abs/1908.04472v1
PDF https://arxiv.org/pdf/1908.04472v1.pdf
PWC https://paperswithcode.com/paper/exploiting-multi-domain-visual-information
Repo
Framework

An Improvement of PAA on Trend-Based Approximation for Time Series

Title An Improvement of PAA on Trend-Based Approximation for Time Series
Authors Chunkai Zhang, Yingyang Chen, Ao Yin, Zhen Qin, Xing Zhang, Keli Zhang, Zoe L. Jiang
Abstract Piecewise Aggregate Approximation (PAA) is a competitive basic dimension reduction method for high-dimensional time series mining. When deployed, however, its limitation is obvious: some important information is lost, especially the trend. In this paper, we propose two new approaches for time series that utilize approximate trend feature information. Our first method records the trend using the relative mean value of each segment: it divides each segment into two parts and uses the numerical average of each part to represent the trend. We prove that this method satisfies the lower-bounding property, which guarantees no false dismissals. Our second method uses a binary string to record the trend, which is also relative to the mean of each segment. Our methods are applied to similarity measurement in classification and anomaly detection; the experimental results show improved accuracy and effectiveness when the trend feature is extracted suitably.
Tasks Anomaly Detection, Dimensionality Reduction, Time Series
Published 2019-06-28
URL https://arxiv.org/abs/1907.00700v1
PDF https://arxiv.org/pdf/1907.00700v1.pdf
PWC https://paperswithcode.com/paper/an-improvement-of-paa-on-trend-based
Repo
Framework
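
The limitation described in the abstract is easy to see on a small example: plain PAA keeps only the mean of each segment, so a rising segment and a falling segment with the same mean become indistinguishable. The sketch below shows standard PAA next to a half-segment variant in the spirit of the first method (each segment split in two, one average per half). The function names and the exact representation are assumptions; the abstract does not give the authors' precise encoding or distance measure.

```python
def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length segment."""
    if len(series) % n_segments != 0:
        raise ValueError("series length must be divisible by n_segments")
    size = len(series) // n_segments
    return [sum(series[i:i + size]) / size for i in range(0, len(series), size)]


def paa_with_half_means(series, n_segments):
    """Per segment, return (mean of first half, mean of second half)."""
    size = len(series) // n_segments
    if len(series) % n_segments != 0 or size % 2 != 0:
        raise ValueError("segments must have equal, even length")
    half = size // 2
    result = []
    for i in range(0, len(series), size):
        first = series[i:i + half]
        second = series[i + half:i + size]
        result.append((sum(first) / half, sum(second) / half))
    return result


series = [1, 2, 3, 4, 4, 3, 2, 1]
means = paa(series, 2)                   # [2.5, 2.5]
halves = paa_with_half_means(series, 2)  # [(1.5, 3.5), (3.5, 1.5)]
```

Both segments have the same PAA value of 2.5, while the half-segment means show that the first segment rises and the second falls.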

Polysemy and brevity versus frequency in language

Title Polysemy and brevity versus frequency in language
Authors Bernardino Casas, Antoni Hernández-Fernández, Neus Català, Ramon Ferrer-i-Cancho, Jaume Baixeries
Abstract The pioneering research of G. K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. The most popular is Zipf’s law for word frequencies. Here we focus on two laws that have been studied less intensively: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. In a previous work, we tested the robustness of these Zipfian laws for English, roughly measuring word length in number of characters and distinguishing adult from child speech. In the present article, we extend our study to other languages (Dutch and Spanish) and introduce two additional measures of length: syllabic length and phonemic length. Our correlation analysis indicates that both the meaning-frequency law and the law of abbreviation hold overall in all the analyzed languages.
Tasks
Published 2019-03-27
URL http://arxiv.org/abs/1904.00812v1
PDF http://arxiv.org/pdf/1904.00812v1.pdf
PWC https://paperswithcode.com/paper/polysemy-and-brevity-versus-frequency-in
Repo
Framework

Dynamic Embedding on Textual Networks via a Gaussian Process

Title Dynamic Embedding on Textual Networks via a Gaussian Process
Authors Pengyu Cheng, Yitong Li, Xinyuan Zhang, Liqun Cheng, David Carlson, Lawrence Carin
Abstract Textual network embedding aims to learn low-dimensional representations of text-annotated nodes in a graph. Prior work in this area has typically focused on fixed graph structures; however, real-world networks are often dynamic. We address this challenge with a novel end-to-end node-embedding model, called Dynamic Embedding for Textual Networks with a Gaussian Process (DetGP). After training, DetGP can be applied efficiently to dynamic graphs without re-training or backpropagation. The learned representation of each node is a combination of textual and structural embeddings. Because the structure is allowed to be dynamic, our method uses the Gaussian process to take advantage of its non-parametric properties. To use both local and global graph structures, diffusion is used to model multiple hops between neighbors. The relative importance of global versus local structure for the embeddings is learned automatically. With the non-parametric nature of the Gaussian process, updating the embeddings for a changed graph structure requires only a forward pass through the learned model. Considering link prediction and node classification, experiments demonstrate the empirical effectiveness of our method compared to baseline approaches. We further show that DetGP can be straightforwardly and efficiently applied to dynamic textual networks.
Tasks Link Prediction, Network Embedding, Node Classification
Published 2019-10-05
URL https://arxiv.org/abs/1910.02187v3
PDF https://arxiv.org/pdf/1910.02187v3.pdf
PWC https://paperswithcode.com/paper/gaussian-process-based-dynamic-embedding-for
Repo
Framework

Real-time Event Detection on Social Data Streams

Title Real-time Event Detection on Social Data Streams
Authors Mateusz Fedoryszak, Brent Frederick, Vijay Rajaram, Changtao Zhong
Abstract Social networks are quickly becoming the primary medium for discussing what is happening around real-world events. The information that is generated on social platforms like Twitter can produce rich data streams for immediate insights into ongoing matters and the conversations around them. To tackle the problem of event detection, we model events as a list of clusters of trending entities over time. We describe a real-time system for discovering events that is modular in design and novel in scale and speed: it applies clustering on a large stream with millions of entities per minute and produces a dynamically updated set of events. In order to assess clustering methodologies, we build an evaluation dataset derived from a snapshot of the full Twitter Firehose and propose novel metrics for measuring clustering quality. Through experiments and system profiling, we highlight key results from the offline and online pipelines. Finally, we visualize a high profile event on Twitter to show the importance of modeling the evolution of events, especially those detected from social data streams.
Tasks
Published 2019-07-25
URL https://arxiv.org/abs/1907.11229v1
PDF https://arxiv.org/pdf/1907.11229v1.pdf
PWC https://paperswithcode.com/paper/real-time-event-detection-on-social-data
Repo
Framework

Effective Medical Test Suggestions Using Deep Reinforcement Learning

Title Effective Medical Test Suggestions Using Deep Reinforcement Learning
Authors Yang-En Chen, Kai-Fu Tang, Yu-Shao Peng, Edward Y. Chang
Abstract Effective medical test suggestions benefit both patients and physicians to conserve time and improve diagnosis accuracy. In this work, we show that an agent can learn to suggest effective medical tests. We formulate the problem as a stage-wise Markov decision process and propose a reinforcement learning method to train the agent. We introduce a new representation of multiple action policy along with the training method of the proposed representation. Furthermore, a new exploration scheme is proposed to accelerate the learning of disease distributions. Our experimental results demonstrate that the accuracy of disease diagnosis can be significantly improved with good medical test suggestions.
Tasks
Published 2019-05-30
URL https://arxiv.org/abs/1905.12916v2
PDF https://arxiv.org/pdf/1905.12916v2.pdf
PWC https://paperswithcode.com/paper/effective-medical-test-suggestions-using-deep
Repo
Framework