January 30, 2020

Paper Group ANR 478

Complement Face Forensic Detection and Localization with FacialLandmarks

Title Complement Face Forensic Detection and Localization with FacialLandmarks
Authors Kritaphat Songsri-in, Stefanos Zafeiriou
Abstract Recently, Generative Adversarial Networks (GANs) and image manipulation methods have become more powerful and can produce highly realistic face images that humans cannot tell apart from real ones, which has raised significant concerns regarding the authenticity of digital media. Although some prior works tackle the face forensic classification problem, it is not trivial to estimate edited locations from classification predictions. In this paper, we propose, to the best of our knowledge, the first rigorous face forensic localization dataset, which consists of genuine, generated, and manipulated face images. In particular, the pristine part contains face images from the CelebA and FFHQ datasets. The fake images are generated by various GAN methods, namely DCGANs, LSGANs, BEGANs, WGAN-GP, ProGANs, and StyleGANs. Lastly, the edited subset is generated by StarGAN and SEFCGAN based on free-form masks. In total, the dataset contains about 1.3 million facial images labelled with corresponding binary masks. Based on the proposed dataset, we demonstrate that explicitly adding facial landmark information to the input images improves performance. In addition, our proposed method consists of two branches and can coherently predict face forensic detection and localization, outperforming the previous state-of-the-art techniques on the newly proposed dataset as well as on the FaceForensics++ dataset, especially on low-quality videos.
Tasks
Published 2019-10-12
URL https://arxiv.org/abs/1910.05455v1
PDF https://arxiv.org/pdf/1910.05455v1.pdf
PWC https://paperswithcode.com/paper/complement-face-forensic-detection-and
Repo
Framework

Knowledge Flow: Improve Upon Your Teachers

Title Knowledge Flow: Improve Upon Your Teachers
Authors Iou-Jen Liu, Jian Peng, Alexander G. Schwing
Abstract A zoo of deep nets is available these days for almost any given task, and it is increasingly unclear which net to start with when addressing a new task, or which net to use as an initialization for fine-tuning a new model. To address this issue, in this paper, we develop knowledge flow which moves ‘knowledge’ from multiple deep nets, referred to as teachers, to a new deep net model, called the student. The structure of the teachers and the student can differ arbitrarily and they can be trained on entirely different tasks with different output spaces too. Upon training with knowledge flow the student is independent of the teachers. We demonstrate our approach on a variety of supervised and reinforcement learning tasks, outperforming fine-tuning and other ‘knowledge exchange’ methods.
Tasks
Published 2019-04-11
URL http://arxiv.org/abs/1904.05878v1
PDF http://arxiv.org/pdf/1904.05878v1.pdf
PWC https://paperswithcode.com/paper/knowledge-flow-improve-upon-your-teachers-1
Repo
Framework

Solving Dynamic Multi-objective Optimization Problems Using Incremental Support Vector Machine

Title Solving Dynamic Multi-objective Optimization Problems Using Incremental Support Vector Machine
Authors Weizhen Hu, Min Jiang, Xing Gao, Kay Chen Tan, Yiu-ming Cheung
Abstract The main feature of Dynamic Multi-objective Optimization Problems (DMOPs) is that the objective functions change over time or with the environment. One promising approach to solving DMOPs is to reuse the obtained Pareto optimal set (POS) to train prediction models via machine learning. In this paper, we train an Incremental Support Vector Machine (ISVM) classifier with the past POS, and the candidate solutions of the DMOP to be solved at the next moment are then filtered through the trained ISVM classifier. The ISVM classifier generates a high-quality initial population, from which a variety of population-based dynamic multi-objective optimization algorithms can benefit. To verify this idea, we incorporate the proposed approach into three evolutionary algorithms: multi-objective particle swarm optimization (MOPSO), Nondominated Sorting Genetic Algorithm II (NSGA-II), and the Regularity Model-based multi-objective estimation of distribution algorithm (RE-MEDA). We test these algorithms experimentally, and the results show the effectiveness of the approach.
Tasks
Published 2019-10-19
URL https://arxiv.org/abs/1910.08751v1
PDF https://arxiv.org/pdf/1910.08751v1.pdf
PWC https://paperswithcode.com/paper/solving-dynamic-multi-objective-optimization
Repo
Framework

First Analysis of Local GD on Heterogeneous Data

Title First Analysis of Local GD on Heterogeneous Data
Authors Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik
Abstract We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions. Problems of this form and local gradient descent as a solution method are of importance in federated learning, where each function is based on private data stored by a user on a mobile device, and the data of different users can be arbitrarily heterogeneous. We show that in a low accuracy regime, the method has the same communication complexity as gradient descent.
Tasks
Published 2019-09-10
URL https://arxiv.org/abs/1909.04715v2
PDF https://arxiv.org/pdf/1909.04715v2.pdf
PWC https://paperswithcode.com/paper/first-analysis-of-local-gd-on-heterogeneous
Repo
Framework
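
The local gradient descent scheme named in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function name `local_gd`, the scalar quadratic objectives, the step size, and the number of rounds are not taken from the paper. Each worker takes a few gradient steps on its own function without communicating, and the server then averages the local iterates.

```python
def local_gd(grads, x0, step, local_steps, rounds):
    """Run local gradient descent and return the final averaged iterate.

    grads: one gradient function per worker (hypothetical interface).
    """
    x = x0
    for _ in range(rounds):
        local_iterates = []
        for grad in grads:
            y = x  # every worker starts from the shared point
            for _ in range(local_steps):
                y = y - step * grad(y)  # local step, no communication
            local_iterates.append(y)
        # Communication round: average the local iterates.
        x = sum(local_iterates) / len(local_iterates)
    return x


# Heterogeneous toy objectives (assumed): f_i(x) = 0.5 * a_i * (x - b_i)^2
coeffs = [(1.0, 0.0), (2.0, 3.0), (0.5, -4.0)]
grads = [lambda x, a=a, b=b: a * (x - b) for a, b in coeffs]

# Minimizer of the average objective: sum(a_i * b_i) / sum(a_i)
x_star = sum(a * b for a, b in coeffs) / sum(a for a, _ in coeffs)

# One local step per round is ordinary gradient descent on the average.
x_one = local_gd(grads, x0=10.0, step=0.1, local_steps=1, rounds=500)
# Ten local steps per round with the same fixed step size.
x_many = local_gd(grads, x0=10.0, step=0.1, local_steps=10, rounds=500)
```

In this toy run, `x_one` reaches the minimizer `x_star`, while `x_many` settles at a nearby point that is not the minimizer, because the workers drift toward their own optima between averaging rounds. This only illustrates the mechanics; it says nothing about the rates proved in the paper.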

Towards DeepSpray: Using Convolutional Neural Network to post-process Shadowgraphy Images of Liquid Atomization

Title Towards DeepSpray: Using Convolutional Neural Network to post-process Shadowgraphy Images of Liquid Atomization
Authors Geoffroy Chaussonnet, Christian Lieber, Yan Yikang, Wenda Gu, Andreas Bartschat, Markus Reischl, Rainer Koch, Ralf Mikut, Hans-Jörg Bauer
Abstract This technical report investigates the potential of Convolutional Neural Networks to post-process images from primary atomization. Three tasks are investigated. The first is the detection and segmentation of liquid droplets in degraded optical conditions. The second is the detection of overlapping ellipses and the prediction of their geometrical characteristics, which corresponds to extrapolating the hidden contour of an ellipse from reduced visual information. In the third, several features of the liquid surface during primary breakup (ligaments, bags, rims) are manually annotated on 15 experimental images; the detector is trained on this minimal database using simple data augmentation and then applied to other images from numerical simulation and from other experiments. In these three tasks, models from the literature based on Convolutional Neural Networks showed very promising results, thus demonstrating the high potential of Deep Learning to post-process images of liquid atomization. The next step is to embed these models into a unified framework, DeepSpray.
Tasks Data Augmentation
Published 2019-10-11
URL https://arxiv.org/abs/1910.11073v1
PDF https://arxiv.org/pdf/1910.11073v1.pdf
PWC https://paperswithcode.com/paper/towards-deepspray-using-convolutional-neural
Repo
Framework

Character Feature Engineering for Japanese Word Segmentation

Title Character Feature Engineering for Japanese Word Segmentation
Authors Mike Tian-Jian Jiang
Abstract On word segmentation problems, machine learning architecture engineering often draws attention. The problem representation itself, however, has remained almost static as either word lattice ranking or character sequence tagging, for at least two decades. The latter often shows stronger predictive power than the former on the out-of-vocabulary (OOV) issue. When the issue escalates to rapid adaptation, which is a common scenario for industrial applications, active learning of partial annotations or re-training with additional lexical resources is usually applied, though from a somewhat word-based perspective. Not only is it difficult for end-users to comply with linguistically consistent word boundary decisions, but the risk and cost of permanently forking models with estimated weights are also seldom affordable. To overcome this obstacle, this work provides an alternative that uses linguistic intuition about character compositions, such that a sophisticated feature set and its derived scheme can enable dynamic lexicon expansion with the model remaining intact. Experimental results suggest that the proposed solution, with or without external lexemes, performs competitively in terms of F1 score and OOV recall across various datasets.
Tasks Active Learning, Feature Engineering
Published 2019-10-03
URL https://arxiv.org/abs/1910.01761v1
PDF https://arxiv.org/pdf/1910.01761v1.pdf
PWC https://paperswithcode.com/paper/character-feature-engineering-for-japanese
Repo
Framework

Multi-label Detection and Classification of Red Blood Cells in Microscopic Images

Title Multi-label Detection and Classification of Red Blood Cells in Microscopic Images
Authors Wei Qiu, Jiaming Guo, Xiang Li, Mengjia Xu, Mo Zhang, Ning Guo, Quanzheng Li
Abstract Cell detection and cell type classification from biomedical images play an important role in high-throughput imaging and various clinical applications. While classification of single-cell samples can be performed with standard computer vision and machine learning methods, analysis of multi-label samples (regions containing congregating cells) is more challenging, as separation of individual cells can be difficult (e.g. touching cells) or even impossible (e.g. overlapping cells). As multi-instance images are common in analyzing Red Blood Cells (RBCs) for Sickle Cell Disease (SCD) diagnosis, we develop and implement a multi-instance cell detection and classification framework to address this challenge. The framework first trains a region proposal model based on a Region-based Convolutional Network (RCNN) to obtain bounding boxes of regions potentially containing single or multiple cells from input microscopic images, which are extracted as image patches. High-level image features are then calculated from the image patches through a pre-trained Convolutional Neural Network (CNN) with the ResNet-50 structure. Using these image features as inputs, six networks are then trained to make multi-label predictions of whether a given patch contains cells belonging to a specific cell type. As the six networks are trained with image patches consisting of both individual cells and touching/overlapping cells, they can effectively recognize cell types that are present in multi-instance image samples. Finally, for the purpose of SCD testing, we train another machine learning classifier to predict whether the given image patch contains an abnormal cell type based on outputs from the six networks. Test results show that the proposed framework can achieve good performance in automatic cell detection and classification.
Tasks
Published 2019-10-07
URL https://arxiv.org/abs/1910.02672v2
PDF https://arxiv.org/pdf/1910.02672v2.pdf
PWC https://paperswithcode.com/paper/multi-label-detection-and-classification-of
Repo
Framework

Kernel Transform Learning

Title Kernel Transform Learning
Authors Jyoti Maggu, Angshul Majumdar
Abstract This work proposes kernel transform learning. The idea of dictionary learning is well known; it is a synthesis formulation where a basis is learnt along with the coefficients so as to generate or synthesize the data. Transform learning is its analysis equivalent; the transform operates on, or analyses, the data to generate the coefficients. The concept of kernel dictionary learning has been introduced in the recent past, where the dictionary is represented as a linear combination of non-linear versions of the data. Its success has been showcased in feature extraction. In this work we propose to kernelize transform learning along lines similar to kernel dictionary learning. An efficient solution for kernel transform learning is proposed, especially for problems where the number of samples is much larger than the dimensionality of the input samples, making the kernel matrix very high dimensional. Kernel transform learning has been compared with other representation learning tools such as the autoencoder and the restricted Boltzmann machine, as well as with dictionary learning (and its kernelized version). Our proposed kernel transform learning yields better results than all the aforesaid techniques; experiments have been carried out on benchmark databases.
Tasks Dictionary Learning, Representation Learning
Published 2019-12-11
URL https://arxiv.org/abs/1912.12129v1
PDF https://arxiv.org/pdf/1912.12129v1.pdf
PWC https://paperswithcode.com/paper/kernel-transform-learning
Repo
Framework

Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

Title Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition
Authors Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze
Abstract The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors. Usually, either a quality control stage discards transcriptions with too many errors, or the noisy transcriptions are used as is. We introduce Lead2Gold, a method to train an ASR system that exploits the full potential of noisy transcriptions. Based on a noise model of transcription errors, Lead2Gold searches for better transcriptions of the training data with a beam search that takes this noise model into account. The beam search is differentiable and does not require a forced alignment step, thus the whole system is trained end-to-end. Lead2Gold can be viewed as a new loss function that can be used on top of any sequence-to-sequence deep neural network. We conduct proof-of-concept experiments on noisy transcriptions generated from letter corruptions with different noise levels. We show that Lead2Gold obtains a better ASR accuracy than a competitive baseline which does not account for the (artificially-introduced) transcription noise.
Tasks Speech Recognition
Published 2019-10-16
URL https://arxiv.org/abs/1910.07323v1
PDF https://arxiv.org/pdf/1910.07323v1.pdf
PWC https://paperswithcode.com/paper/lead2gold-towards-exploiting-the-full
Repo
Framework

Movienet: A Movie Multilayer Network Model using Visual and Textual Semantic Cues

Title Movienet: A Movie Multilayer Network Model using Visual and Textual Semantic Cues
Authors Youssef Mourchid, Benjamin Renoust, Olivier Roupin, Le Van, Hocine Cherifi, Mohammed El Hassouni
Abstract Discovering content and stories in movies is one of the most important concepts in multimedia content research studies. Network models have proven to be an efficient choice for this purpose. When an audience watches a movie, they usually compare the characters and the relationships between them. For this reason, most of the models developed so far are based on social networks analysis. They focus essentially on the characters at play. By analyzing characters’ interactions, we can obtain a broad picture of the narration’s content. Other works have proposed to exploit semantic elements such as scenes, dialogues, etc. However, they are always captured from a single facet. Motivated by these limitations, we introduce in this work a multilayer network model to capture the narration of a movie based on its script, its subtitles, and the movie content. After introducing the model and the extraction process from the raw data, we perform a comparative analysis of the whole 6-movie cycle of the Star Wars saga. Results demonstrate the effectiveness of the proposed framework for video content representation and analysis.
Tasks
Published 2019-10-18
URL https://arxiv.org/abs/1910.09368v1
PDF https://arxiv.org/pdf/1910.09368v1.pdf
PWC https://paperswithcode.com/paper/movienet-a-movie-multilayer-network-model
Repo
Framework

Machine learning approaches in Detecting the Depression from Resting-state Electroencephalogram (EEG): A Review Study

Title Machine learning approaches in Detecting the Depression from Resting-state Electroencephalogram (EEG): A Review Study
Authors Milena Cukic Radenkovic
Abstract In this paper, we review several different approaches currently used in the search for more accurate diagnosis and treatment management in mental healthcare. Our focus is on mood disorders, and in particular on major depressive disorder (MDD). We first review and discuss findings based on neuroimaging studies (MRI and fMRI) to get an impression of the body of knowledge about the anatomical and functional differences in depression. Then, we focus on less expensive data-driven approaches applicable to everyday clinical practice, in particular those based on electroencephalographic (EEG) recordings. Among the studies utilizing EEG, we discuss a group of applications used for detecting depression based on resting-state EEG (detection studies) and interventional studies (using stimuli in their protocols or aiming to predict the outcome of therapy). We conclude with a discussion and review of guidelines to improve the reliability of developed models that could help improve the diagnosis of depression in psychiatry.
Tasks EEG
Published 2019-03-26
URL http://arxiv.org/abs/1903.11454v1
PDF http://arxiv.org/pdf/1903.11454v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-approaches-in-detecting-the
Repo
Framework

Classification of EEG-Based Brain Connectivity Networks in Schizophrenia Using a Multi-Domain Connectome Convolutional Neural Network

Title Classification of EEG-Based Brain Connectivity Networks in Schizophrenia Using a Multi-Domain Connectome Convolutional Neural Network
Authors Chun-Ren Phang, Chee-Ming Ting, Fuad Noman, Hernando Ombao
Abstract We exploit altered patterns in brain functional connectivity as features for automatic discriminative analysis of neuropsychiatric patients. Deep learning methods have been introduced to functional network classification only very recently for fMRI, and the proposed architectures essentially focused on a single type of connectivity measure. We propose a deep convolutional neural network (CNN) framework for classification of electroencephalogram (EEG)-derived brain connectome in schizophrenia (SZ). To capture complementary aspects of disrupted connectivity in SZ, we explore a combination of various connectivity features consisting of time and frequency-domain metrics of effective connectivity based on a vector autoregressive model and partial directed coherence, and complex network measures of network topology. We design a novel multi-domain connectome CNN (MDC-CNN) based on a parallel ensemble of 1D and 2D CNNs to integrate the features from various domains and dimensions using different fusion strategies. Hierarchical latent representations learned by the multiple convolutional layers from EEG connectivity reveal apparent group differences between SZ and healthy controls (HC). Results on a large resting-state EEG dataset show that the proposed CNNs significantly outperform traditional support vector machine classifiers. The MDC-CNN with combined connectivity features further improves performance over single-domain CNNs using individual features, achieving a remarkable accuracy of 93.06% with decision-level fusion. By integrating information from diverse brain connectivity descriptors, the proposed MDC-CNN is able to accurately discriminate SZ from HC. The new framework is potentially useful for developing diagnostic tools for SZ and other disorders.
Tasks EEG
Published 2019-03-21
URL http://arxiv.org/abs/1903.08858v1
PDF http://arxiv.org/pdf/1903.08858v1.pdf
PWC https://paperswithcode.com/paper/classification-of-eeg-based-brain
Repo
Framework

Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

Title Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression
Authors Arun Sai Suggala, Kush Bhatia, Pradeep Ravikumar, Prateek Jain
Abstract We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting either don’t guarantee consistent estimates or can only handle a small fraction of corruptions. We also extend our estimator to robust sparse linear regression and show that similar guarantees hold in this setting. Finally, we apply our estimator to the problem of linear regression with heavy-tailed noise and show that our estimator consistently estimates the regression vector even when the noise has unbounded variance (e.g., Cauchy distribution), for which most existing results don’t even apply. Our estimator is based on a novel variant of outlier removal via hard thresholding in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.
Tasks
Published 2019-03-19
URL http://arxiv.org/abs/1903.08192v1
PDF http://arxiv.org/pdf/1903.08192v1.pdf
PWC https://paperswithcode.com/paper/adaptive-hard-thresholding-for-near-optimal
Repo
Framework
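
To make the idea of outlier removal via hard thresholding concrete, here is a deliberately simplified sketch. It is not the adaptive, randomized estimator from the paper: the number of retained points `keep` is fixed by hand, the model is a scalar regression without intercept, and the names `fit_scalar` and `hard_threshold_regression` are hypothetical. The loop alternates between fitting on the currently trusted points and re-selecting the points with the smallest residuals.

```python
import random


def fit_scalar(xs, ys):
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def hard_threshold_regression(xs, ys, keep, iters=20):
    """Alternate between fitting and keeping the `keep` smallest residuals."""
    w = fit_scalar(xs, ys)  # start from ordinary least squares on all points
    for _ in range(iters):
        # Hard thresholding step: trust only the best-fitting points.
        order = sorted(range(len(xs)), key=lambda i: abs(ys[i] - w * xs[i]))
        kept = order[:keep]
        w = fit_scalar([xs[i] for i in kept], [ys[i] for i in kept])
    return w


rng = random.Random(0)
true_w = 2.0
xs = [rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [true_w * x for x in xs]
# Corrupt 40 responses with offsets chosen independently of x, in the spirit
# of the oblivious adversary described above (toy values, assumed).
for i in range(40):
    ys[i] += rng.uniform(5.0, 10.0)

w_ols = fit_scalar(xs, ys)                              # uses corrupted points
w_ht = hard_threshold_regression(xs, ys, keep=160)      # discards them
```

On this toy data the thresholded fit recovers the true slope, while the plain least-squares fit is pulled away by the corrupted responses. A fixed `keep` requires knowing the corruption level in advance, which is one limitation that an adaptively chosen threshold is meant to avoid.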

A simple and efficient architecture for trainable activation functions

Title A simple and efficient architecture for trainable activation functions
Authors Andrea Apicella, Francesco Isgrò, Roberto Prevete
Abstract Automatically learning the best activation function for a task is an active topic in neural network research. At the moment, despite promising results, it is still difficult to find a method for learning an activation function that is both theoretically simple and easy to implement. Moreover, most of the methods proposed so far introduce new parameters or adopt different learning techniques. In this work we propose a simple method to obtain a trained activation function by adding to the neural network local subnetworks with a small number of neurons. Experiments show that this approach can lead to better results than using a pre-defined activation function, without introducing a large number of extra parameters that need to be learned.
Tasks
Published 2019-02-08
URL http://arxiv.org/abs/1902.03306v2
PDF http://arxiv.org/pdf/1902.03306v2.pdf
PWC https://paperswithcode.com/paper/a-simple-and-efficient-architecture-for
Repo
Framework
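
One way to picture an activation function realized by a small local subnetwork is the sketch below. The abstract does not specify the architecture, so the parametrization here (one hidden layer of ReLU units acting on the scalar pre-activation, with weights `alphas`, biases `gammas`, and output weights `betas`) is purely an assumption for illustration, and only the forward pass is shown.

```python
def relu(z):
    """Standard rectified linear unit on a scalar."""
    return max(0.0, z)


def subnetwork_activation(a, alphas, gammas, betas):
    """Scalar activation computed by a tiny one-hidden-layer subnetwork.

    a: pre-activation of the neuron the subnetwork is attached to.
    alphas, gammas, betas: hypothetical trainable weights, one per hidden unit.
    """
    return sum(b * relu(al * a + g) for al, g, b in zip(alphas, gammas, betas))


# With a single hidden unit and these weights the subnetwork is exactly ReLU.
relu_params = ([1.0], [0.0], [1.0])
# With two hidden units it can express other shapes, e.g. |a| = relu(a) + relu(-a).
abs_params = ([1.0, -1.0], [0.0, 0.0], [1.0, 1.0])

out_relu_neg = subnetwork_activation(-2.0, *relu_params)
out_relu_pos = subnetwork_activation(3.0, *relu_params)
out_abs_neg = subnetwork_activation(-2.0, *abs_params)
```

In a real network these weights would be trained by backpropagation together with the rest of the model; with `k` hidden units the sketch adds `3 * k` parameters per activation, which stays small for small `k`.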

Residual Deep Convolutional Neural Network for EEG Signal Classification in Epilepsy

Title Residual Deep Convolutional Neural Network for EEG Signal Classification in Epilepsy
Authors Diyuan Lu, Jochen Triesch
Abstract Epilepsy is the fourth most common neurological disorder, affecting about 1% of the population at all ages. As many as 60% of people with epilepsy experience focal seizures which originate in a certain brain area and are limited to part of one cerebral hemisphere. In focal epilepsy patients, a precise surgical removal of the seizure onset zone can lead to effective seizure control or even a seizure-free outcome. Thus, correct identification of the seizure onset zone is essential. For clinical evaluation purposes, electroencephalography (EEG) recordings are commonly used. However, their interpretation is usually done manually by physicians and is time-consuming and error-prone. In this work, we propose an automated epileptic signal classification method based on modern deep learning methods. In contrast to previous approaches, the network is trained directly on the EEG recordings, avoiding hand-crafted feature extraction and selection procedures. This exploits the ability of deep neural networks to detect and extract relevant features automatically, that may be too complex or subtle to be noticed by humans. The proposed network structure is based on a convolutional neural network with residual connections. We demonstrate that our network produces state-of-the-art performance on two benchmark data sets, a data set from Bonn University and the Bern-Barcelona data set. We conclude that modern deep learning approaches can reach state-of-the-art performance on epileptic EEG classification and automated seizure onset zone identification tasks when trained on raw EEG data. This suggests that such approaches have potential for improving clinical practice.
Tasks EEG
Published 2019-03-19
URL http://arxiv.org/abs/1903.08100v1
PDF http://arxiv.org/pdf/1903.08100v1.pdf
PWC https://paperswithcode.com/paper/residual-deep-convolutional-neural-network
Repo
Framework
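
The residual connection mentioned in the abstract above can be sketched on a one-dimensional signal as follows. This is a generic residual block, not the network from the paper: the single convolution, the odd-length kernel, and the function names `conv1d_same` and `residual_block` are assumptions for illustration.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output length equals input length.

    Assumes an odd-length kernel so the padding is symmetric.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def residual_block(signal, kernel):
    """Return signal + relu(conv(signal)): the skip path carries the input through."""
    conv = conv1d_same(signal, kernel)
    activated = [max(0.0, c) for c in conv]
    return [s + a for s, a in zip(signal, activated)]


raw = [1.0, -2.0, 3.0]  # stand-in for a short raw EEG segment (toy values)
out_identity = residual_block(raw, [0.0, 0.0, 0.0])  # zero kernel: block is the identity
out_passthrough = residual_block(raw, [0.0, 1.0, 0.0])  # conv copies the input
```

With a zero kernel the block returns its input unchanged, which shows why residual connections make it easy for a layer to fall back to the identity mapping.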