October 19, 2019

2708 words 13 mins read

Paper Group ANR 112

Fingerprint Presentation Attack Detection: Generalization and Efficiency. An Image dehazing approach based on the airlight field estimation. Low-Resource Speech-to-Text Translation. Dense Scene Flow from Stereo Disparity and Optical Flow. Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages. Deep Learni …

Fingerprint Presentation Attack Detection: Generalization and Efficiency

Title Fingerprint Presentation Attack Detection: Generalization and Efficiency
Authors Tarang Chugh, Anil K. Jain
Abstract We study the problem of fingerprint presentation attack detection (PAD) under unknown PA materials not seen during PAD training. A dataset of 5,743 bonafide and 4,912 PA images of 12 different materials is used to evaluate a state-of-the-art PAD system, namely Fingerprint Spoof Buster. We utilize 3D t-SNE visualization and clustering of material characteristics to identify a representative set of PA materials that covers most of the PA feature space. We observe that a set of six PA materials, namely Silicone, 2D Paper, Play Doh, Gelatin, Latex Body Paint and Monster Liquid Latex, provides a good representative set that should be included in training to achieve generalization of PAD. We also propose an optimized Android app of Fingerprint Spoof Buster that can run on a commodity smartphone (Xiaomi Redmi Note 4) without a significant drop in PAD performance (from TDR = 95.7% to 95.3% @ FDR = 0.2%) and can make a PA prediction in less than 300 ms.
Tasks
Published 2018-12-30
URL http://arxiv.org/abs/1812.11574v1
PDF http://arxiv.org/pdf/1812.11574v1.pdf
PWC https://paperswithcode.com/paper/fingerprint-presentation-attack-detection
Repo
Framework
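
The material-selection idea above can be sketched as follows: embed per-material feature vectors with t-SNE, cluster the embedding, and keep the material nearest each cluster center. This is a minimal illustration, not the paper's method; the feature vectors, material names, and cluster count are placeholders rather than the actual Fingerprint Spoof Buster features or settings.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def representative_materials(features, names, n_clusters=6, seed=0):
    """Embed per-material feature vectors in 3D with t-SNE, cluster the
    embedding, and return the material closest to each cluster center."""
    emb = TSNE(n_components=3, perplexity=3, random_state=seed).fit_transform(features)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(emb)
    reps = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]           # members of cluster c
        center = km.cluster_centers_[c]
        # pick the member nearest the centroid as the cluster's representative
        reps.append(names[idx[np.argmin(np.linalg.norm(emb[idx] - center, axis=1))]])
    return reps
```

Training on the representatives selected this way is what the abstract argues gives generalization to unseen materials.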

An Image dehazing approach based on the airlight field estimation

Title An Image dehazing approach based on the airlight field estimation
Authors Lijun Zhang, Yongbin Gao, Yujin Zhang
Abstract This paper proposes a scheme for single image haze removal based on airlight field (ALF) estimation. Conventional image dehazing methods based on a physical model generally take the global atmospheric light to be constant. However, the constant-airlight assumption may be unsuitable for images with large sky regions, where it causes unacceptable brightness imbalance and color distortion in the recovered images. This paper models the atmospheric light as a field function and presents a maximum a posteriori (MAP) method for jointly estimating the airlight field, the transmission rate and the haze-free image. We also introduce a valid haze-level prior for effective estimation of the transmission. Evaluation on real-world images shows that the proposed approach outperforms existing methods in single image dehazing, especially when large sky regions are present.
Tasks Image Dehazing, Single Image Dehazing, Single Image Haze Removal
Published 2018-05-06
URL http://arxiv.org/abs/1805.02142v1
PDF http://arxiv.org/pdf/1805.02142v1.pdf
PWC https://paperswithcode.com/paper/an-image-dehazing-approach-based-on-the
Repo
Framework
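
The physical model underlying this line of work is the atmospheric scattering equation I(x) = J(x)·t(x) + A(x)·(1 − t(x)), where I is the hazy image, J the haze-free image, t the transmission, and A the airlight. The paper's contribution is estimating A as a spatially varying field; the sketch below only shows the final inversion step once A(x) and t(x) are available, and is not the paper's joint MAP estimator.

```python
import numpy as np

def dehaze(I, A, t, t_min=0.1):
    """Invert the scattering model I = J*t + A*(1-t) given an estimated
    airlight field A (H x W x 3, spatially varying rather than a single
    global constant) and transmission map t (H x W).

    The lower bound t_min avoids division blow-up in dense haze.
    """
    t = np.clip(t, t_min, 1.0)[..., None]   # broadcast over color channels
    J = (I - A) / t + A                     # recovered radiance
    return np.clip(J, 0.0, 1.0)
```

With a constant A this reduces to conventional dehazing; letting A vary per pixel is what corrects the brightness imbalance in sky regions.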

Low-Resource Speech-to-Text Translation

Title Low-Resource Speech-to-Text Translation
Authors Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater
Abstract Speech-to-text translation has many potential applications for low-resource languages, but the typical approach of cascading speech recognition with machine translation is often impossible, since the transcripts needed to train a speech recognizer are usually not available for low-resource languages. Recent work has found that neural encoder-decoder models can learn to directly translate foreign speech in high-resource scenarios, without the need for intermediate transcription. We investigate whether this approach also works in settings where both data and computation are limited. To make the approach efficient, we make several architectural changes, including a switch from character-level to word-level decoding. We find that this choice yields crucial speed improvements that allow us to train with fewer computational resources, while still performing well on frequent words. We explore models trained on between 20 and 160 hours of data, and find that although models trained on less data have considerably lower BLEU scores, they can still predict words with relatively high precision and recall: around 50% for a model trained on 50 hours of data, versus around 60% for the full 160-hour model. Thus, they may still be useful for some low-resource scenarios.
Tasks Machine Translation, Speech Recognition
Published 2018-03-24
URL http://arxiv.org/abs/1803.09164v2
PDF http://arxiv.org/pdf/1803.09164v2.pdf
PWC https://paperswithcode.com/paper/low-resource-speech-to-text-translation
Repo
Framework
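
The word-level precision and recall figures quoted above can be computed in several ways; a simple bag-of-words version, shown below as an illustration (the paper's exact matching criterion may differ), counts how many hypothesis words also occur in the reference.

```python
from collections import Counter

def word_prf(reference, hypothesis):
    """Bag-of-words precision/recall/F1 between a reference translation
    and a model hypothesis (both whitespace-tokenized strings).

    Counter intersection clips repeated words to their reference count,
    so a hypothesis cannot inflate recall by repeating one word.
    """
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    p = overlap / max(sum(hyp.values()), 1)   # fraction of hypothesis words that are correct
    r = overlap / max(sum(ref.values()), 1)   # fraction of reference words that are recovered
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Unlike BLEU, this metric ignores word order, which is why a low-BLEU model can still score usefully on it.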

Dense Scene Flow from Stereo Disparity and Optical Flow

Title Dense Scene Flow from Stereo Disparity and Optical Flow
Authors René Schuster, Oliver Wasenmüller, Didier Stricker
Abstract Scene flow describes 3D motion in a 3D scene. It can either be modeled as a single task, or it can be reconstructed from the auxiliary tasks of stereo depth and optical flow estimation. While the second approach can achieve real-time performance by using real-time auxiliary methods, it typically produces non-dense results. In this presentation of a basic combination approach for scene flow estimation, we tackle the problem of non-density by interpolation.
Tasks Optical Flow Estimation, Scene Flow Estimation
Published 2018-08-30
URL http://arxiv.org/abs/1808.10146v1
PDF http://arxiv.org/pdf/1808.10146v1.pdf
PWC https://paperswithcode.com/paper/dense-scene-flow-from-stereo-disparity-and
Repo
Framework
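
The combination step can be sketched as follows: back-project each pixel to 3D using the disparity at time t, follow the optical flow to the corresponding pixel at t+1, back-project again, and take the difference. This is an illustrative reconstruction under a rectified stereo rig with known focal length and baseline, using nearest-neighbor disparity lookup; the interpolation step that densifies the result is omitted.

```python
import numpy as np

def scene_flow(disp0, disp1, flow, f, B):
    """3D scene flow from stereo disparity at two time steps plus optical flow.

    disp0, disp1 : H x W disparity maps at t and t+1 (pixels)
    flow         : H x W x 2 optical flow (u, v) from t to t+1
    f, B         : focal length (pixels) and stereo baseline (meters)
    Returns H x W x 3 displacements; invalid disparities become NaN.
    """
    H, W = disp0.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)

    def backproject(x, y, d):
        # Pinhole + rectified stereo: Z = f*B/d, X = x*Z/f, Y = y*Z/f
        Z = np.where(d > 0, f * B / np.maximum(d, 1e-6), np.nan)
        return np.stack([x * Z / f, y * Z / f, Z], axis=-1)

    P0 = backproject(xs, ys, disp0)
    # Follow the optical flow to the matching pixel at t+1 (nearest neighbor)
    x1, y1 = xs + flow[..., 0], ys + flow[..., 1]
    xi = np.clip(np.round(x1).astype(int), 0, W - 1)
    yi = np.clip(np.round(y1).astype(int), 0, H - 1)
    P1 = backproject(x1, y1, disp1[yi, xi])
    return P1 - P0
```

Pixels where either auxiliary estimate is missing yield no scene flow, which is exactly the non-density the paper addresses by interpolation.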

Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages

Title Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages
Authors Roger Granada, Renata Vieira, Cassia Trojahn, Nathalie Aussenac-Gilles
Abstract Modern information systems are shifting from “data processing” to “concept processing”: instead of processing words, such systems process semantic concepts which carry meaning and share contexts with other concepts. An ontology is commonly used as a structure that captures the knowledge about a certain area by providing concepts and relations between them. Traditionally, concept hierarchies have been built manually by knowledge engineers or domain experts. However, the manual construction of a concept hierarchy suffers from several limitations, such as limited coverage and the enormous costs of extension and maintenance. Ontology learning, usually understood as (semi-)automatic support in ontology development, is typically divided into steps, going from concept identification, through the detection of hierarchical and non-hierarchical relations, to, more rarely, axiom extraction. It is reasonable to say that among these steps the current frontier is the establishment of concept hierarchies, since these are the backbone of ontologies and, therefore, a good concept hierarchy is already a valuable resource for many ontology applications. The automatic construction of concept hierarchies from texts is a complex task, and much work has proposed approaches to better extract relations between concepts. These different proposals have never been contrasted against each other on the same set of data and across different languages. Such a comparison is important to see whether they are complementary or incremental, and whether they show different tendencies towards recall and precision. This paper evaluates these methods on the basis of hierarchy metrics such as density and depth, and evaluation metrics such as recall and precision. The results shed light on the comprehensive set of methods according to the literature in the area.
Tasks Relation Extraction
Published 2018-11-08
URL http://arxiv.org/abs/1811.03245v1
PDF http://arxiv.org/pdf/1811.03245v1.pdf
PWC https://paperswithcode.com/paper/evaluating-the-complementarity-of-taxonomic
Repo
Framework

Deep Learning in Pharmacogenomics: From Gene Regulation to Patient Stratification

Title Deep Learning in Pharmacogenomics: From Gene Regulation to Patient Stratification
Authors Alexandr A. Kalinin, Gerald A. Higgins, Narathip Reamaroon, S. M. Reza Soroushmehr, Ari Allyn-Feuer, Ivo D. Dinov, Kayvan Najarian, Brian D. Athey
Abstract This Perspective provides examples of current and future applications of deep learning in pharmacogenomics, including: (1) identification of novel regulatory variants located in noncoding domains and their function as applied to pharmacoepigenomics; (2) patient stratification from medical records; and (3) prediction of drugs, targets, and their interactions. Deep learning encapsulates a family of machine learning algorithms that over the last decade has transformed many important subfields of artificial intelligence (AI) and has demonstrated breakthrough performance improvements on a wide range of tasks in biomedicine. We anticipate that in the future deep learning will be widely used to predict personalized drug response and optimize medication selection and dosing, using knowledge extracted from large and complex molecular, epidemiological, clinical, and demographic datasets.
Tasks
Published 2018-01-25
URL http://arxiv.org/abs/1801.08570v2
PDF http://arxiv.org/pdf/1801.08570v2.pdf
PWC https://paperswithcode.com/paper/deep-learning-in-pharmacogenomics-from-gene
Repo
Framework

Safe end-to-end imitation learning for model predictive control

Title Safe end-to-end imitation learning for model predictive control
Authors Keuntaek Lee, Kamil Saigol, Evangelos A. Theodorou
Abstract We propose the use of Bayesian neural networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or a human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully observable and vision-based partially observable systems using cart-pole and autonomous driving simulations with deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability.
Tasks Autonomous Driving, Imitation Learning
Published 2018-03-27
URL http://arxiv.org/abs/1803.10231v3
PDF http://arxiv.org/pdf/1803.10231v3.pdf
PWC https://paperswithcode.com/paper/safe-end-to-end-imitation-learning-for-model
Repo
Framework
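
The gating mechanism described above can be sketched as follows: sample the stochastic policy several times (as in Monte Carlo dropout), and when the spread of the sampled actions exceeds the learned threshold, hand control back to the fallback controller. The function names and sample count are illustrative placeholders; in the paper the threshold is learned rather than supplied.

```python
import numpy as np

def safe_control(state, stochastic_policy, mpc_fallback, threshold, n_samples=20):
    """Uncertainty-gated control. `stochastic_policy` is one stochastic
    forward pass of a Bayesian neural network policy (e.g. with dropout
    kept on at test time); `mpc_fallback` is the trusted controller."""
    actions = np.array([stochastic_policy(state) for _ in range(n_samples)])
    mean, std = actions.mean(axis=0), actions.std(axis=0)
    if np.any(std > threshold):
        return mpc_fallback(state)   # corrective action: return control
    return mean                      # confident: use the learned policy
```

On in-distribution inputs the samples agree and the cheap learned policy acts; on novel inputs the disagreement triggers the safe fallback.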

Automated image segmentation for detecting cell spreading for metastasizing assessments of cancer development

Title Automated image segmentation for detecting cell spreading for metastasizing assessments of cancer development
Authors Sholpan Kauanova, Ivan Vorobjev, Alex Pappachen James
Abstract The automated segmentation of cells in microscopic images is an open research problem with important implications for studies of developmental and cancer processes based on in vitro models. In this paper, we present an approach for segmenting DIC images of cultured cells using G-neighbor smoothing followed by Kuwahara filtering and a local-standard-deviation approach for boundary detection. NIH FIJI/ImageJ tools are used to create the ground-truth dataset. The results of this work indicate that detecting cell boundaries with a segmentation approach remains a challenging problem, even under realistic measurement conditions.
Tasks Boundary Detection, Semantic Segmentation
Published 2018-01-01
URL http://arxiv.org/abs/1801.00455v1
PDF http://arxiv.org/pdf/1801.00455v1.pdf
PWC https://paperswithcode.com/paper/automated-image-segmentation-for-detecting
Repo
Framework
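
The local-standard-deviation step can be illustrated as below: compute a per-pixel standard deviation over a sliding window and threshold it, since flat interiors have low local variance while the intensity halo at DIC cell edges has high variance. This is a generic sketch of the technique, not the paper's tuned pipeline; the smoothing and Kuwahara stages are omitted and the threshold rule is an assumption.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_std_boundaries(img, size=5, k=1.0):
    """Binary boundary map from the local standard deviation of intensities.

    Uses the identity Var = E[x^2] - E[x]^2 over a size x size window,
    then keeps pixels whose local std exceeds mean + k*std of the map.
    """
    mean = uniform_filter(img.astype(float), size)
    sq_mean = uniform_filter(img.astype(float) ** 2, size)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))  # clamp rounding error
    return std > (std.mean() + k * std.std())
```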

An Introduction to Image Synthesis with Generative Adversarial Nets

Title An Introduction to Image Synthesis with Generative Adversarial Nets
Authors He Huang, Philip S. Yu, Changhu Wang
Abstract There has been a drastic growth of research on Generative Adversarial Nets (GANs) in the past few years. Proposed in 2014, GANs have been applied in domains such as computer vision and natural language processing, achieving impressive performance. Among the many applications of GANs, image synthesis is the most well-studied one, and research in this area has already demonstrated the great potential of using GANs for image synthesis. In this paper, we provide a taxonomy of methods used in image synthesis, review different models for text-to-image synthesis and image-to-image translation, and discuss some evaluation metrics as well as possible future research directions in image synthesis with GANs.
Tasks Image Generation, Image-to-Image Translation
Published 2018-03-12
URL http://arxiv.org/abs/1803.04469v2
PDF http://arxiv.org/pdf/1803.04469v2.pdf
PWC https://paperswithcode.com/paper/an-introduction-to-image-synthesis-with
Repo
Framework

Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification

Title Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification
Authors Mingwen Dong
Abstract Music genre classification is one example of content-based analysis of music signals. Traditionally, human-engineered features were used to automate this task, and 61% accuracy has been achieved for 10-genre classification. However, this is still below the 70% accuracy that humans achieve on the same task. Here, we propose a new method that combines knowledge from human perception studies of music genre classification with the neurophysiology of the auditory system. The method works by training a simple convolutional neural network (CNN) to classify short segments of the music signal. The genre of a piece of music is then determined by splitting it into short segments and combining the CNN’s predictions from all of them. After training, this method achieves human-level (70%) accuracy, and the filters learned in the CNN resemble the spectrotemporal receptive fields (STRFs) in the auditory system.
Tasks
Published 2018-02-27
URL http://arxiv.org/abs/1802.09697v1
PDF http://arxiv.org/pdf/1802.09697v1.pdf
PWC https://paperswithcode.com/paper/convolutional-neural-network-achieves-human
Repo
Framework
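
The segment-then-combine scheme can be sketched as below. The CNN itself is abstracted away; only the splitting of a signal into fixed-length segments and the soft-vote aggregation of per-segment class probabilities are shown, with segment length and hop chosen arbitrarily for illustration.

```python
import numpy as np

def split_segments(signal, seg_len, hop):
    """Cut a 1-D signal into fixed-length, possibly overlapping segments."""
    return np.array([signal[i:i + seg_len]
                     for i in range(0, len(signal) - seg_len + 1, hop)])

def classify_track(segment_probs):
    """Combine per-segment class probabilities (n_segments x n_genres)
    into one track-level prediction by averaging, i.e. soft voting."""
    return int(np.argmax(np.mean(segment_probs, axis=0)))
```

Averaging probabilities across segments smooths out locally ambiguous passages, which is why the track-level accuracy exceeds the per-segment accuracy.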

Protecting Sensitive Attributes via Generative Adversarial Networks

Title Protecting Sensitive Attributes via Generative Adversarial Networks
Authors Aria Rezaei, Chaowei Xiao, Jie Gao, Bo Li
Abstract Recent advances in computing have made it possible to collect large amounts of data on personal activities and private living spaces. Collecting and publishing a dataset in this environment can raise concerns over the privacy of the individuals in the dataset. In this paper we examine these privacy concerns: in particular, given a target application, how can we mask sensitive attributes in the data while preserving the utility of the data for that application? Our focus is on protecting attributes that are hidden but can be inferred from the data by machine learning algorithms. We propose a generic framework that (1) removes the knowledge useful for inferring sensitive information, but (2) preserves the knowledge relevant to a given target application. We use deep neural networks and generative adversarial networks (GANs) to create privacy-preserving perturbations. Our noise-generating network is compact and efficient for running on mobile devices. Through extensive experiments, we show that our method outperforms conventional methods in effectively hiding sensitive attributes while guaranteeing high performance on the target application. Our results hold for neural network architectures not seen during training and are suitable for training new classifiers.
Tasks
Published 2018-12-26
URL http://arxiv.org/abs/1812.10193v1
PDF http://arxiv.org/pdf/1812.10193v1.pdf
PWC https://paperswithcode.com/paper/protecting-sensitive-attributes-via
Repo
Framework

Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry

Title Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry
Authors Alexander Bernstein, Evgeny Burnaev, Ekaterina Kondratyeva, Svetlana Sushchinskaya, Maxim Sharaev, Alexander Andreev, Alexey Artemov, Renat Akzhigitov
Abstract We consider a problem of diagnostic pattern recognition/classification from neuroimaging data. We propose a common data analysis pipeline for neuroimaging-based diagnostic classification problems using various ML algorithms and processing toolboxes for brain imaging. We illustrate the pipeline application by discovering new biomarkers for diagnostics of epilepsy and depression based on clinical and MRI/fMRI data for patients and healthy volunteers.
Tasks
Published 2018-04-26
URL http://arxiv.org/abs/1804.10163v1
PDF http://arxiv.org/pdf/1804.10163v1.pdf
PWC https://paperswithcode.com/paper/machine-learning-pipeline-for-discovering
Repo
Framework
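
A common pipeline of the kind described above chains feature normalization, feature (biomarker) selection, and a classifier, evaluated with cross-validation. The sketch below uses scikit-learn with synthetic stand-in data; the feature matrix is a hypothetical placeholder for per-subject neuroimaging features (e.g. regional volumes or fMRI connectivity strengths), not the paper's clinical data or toolboxes.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in: 60 subjects x 100 imaging-derived features,
# with a binary diagnostic label driven by one informative feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))
y = (X[:, 0] + 0.5 * rng.normal(size=60) > 0).astype(int)

pipe = Pipeline([
    ("scale", StandardScaler()),                # normalize each feature
    ("select", SelectKBest(f_classif, k=10)),   # keep candidate biomarkers
    ("clf", LogisticRegression(max_iter=1000)), # diagnostic classifier
])
scores = cross_val_score(pipe, X, y, cv=5)      # cross-validated accuracy
```

Wrapping selection inside the pipeline ensures biomarkers are chosen only from training folds, avoiding the selection leakage that inflates reported accuracy.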

Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings

Title Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Authors Seonwook Park, Xucong Zhang, Andreas Bulling, Otmar Hilliges
Abstract Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras. In unconstrained real-world settings, however, such methods are surpassed by recent appearance-based methods due to difficulties in modeling factors such as illumination changes and other visual artifacts. We present a novel learning-based method for eye region landmark localization that enables conventional methods to be competitive to latest appearance-based methods. Despite having been trained exclusively on synthetic data, our method exceeds the state of the art for iris localization and eye shape registration on real-world imagery. We then use the detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods. Our approach outperforms existing model-fitting and appearance-based methods in the context of person-independent and personalized gaze estimation.
Tasks Gaze Estimation
Published 2018-05-12
URL http://arxiv.org/abs/1805.04771v1
PDF http://arxiv.org/pdf/1805.04771v1.pdf
PWC https://paperswithcode.com/paper/learning-to-find-eye-region-landmarks-for
Repo
Framework

Towards Learning Fine-Grained Disentangled Representations from Speech

Title Towards Learning Fine-Grained Disentangled Representations from Speech
Authors Yuan Gong, Christian Poellabauer
Abstract Learning disentangled representations of high-dimensional data is currently an active research area. However, compared to the field of computer vision, less work has been done for speech processing. In this paper, we provide a review of two representative efforts on this topic and propose the novel concept of fine-grained disentangled speech representation learning.
Tasks Representation Learning
Published 2018-08-08
URL http://arxiv.org/abs/1808.02939v1
PDF http://arxiv.org/pdf/1808.02939v1.pdf
PWC https://paperswithcode.com/paper/towards-learning-fine-grained-disentangled
Repo
Framework

Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion

Title Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion
Authors Zhourong Chen, Xiaopeng Li, Nevin L. Zhang
Abstract Despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. In contrast, structure learning has been studied extensively for probabilistic graphical models (PGMs). In particular, an efficient algorithm has been developed for learning a class of tree-structured PGMs called hierarchical latent tree models (HLTMs), where there is a layer of observed variables at the bottom and multiple layers of latent variables on top. In this paper, we propose a simple method for learning the structures of feedforward neural networks (FNNs) based on HLTMs. The idea is to expand the connections in the tree skeletons from HLTMs and to use the resulting structures for FNNs. An important characteristic of FNN structures learned this way is that they are sparse. We present extensive empirical results showing that, compared with manually tuned standard FNNs, the sparse FNNs learned by our method achieve better or comparable classification performance with far fewer parameters. They are also more interpretable.
Tasks
Published 2018-03-16
URL http://arxiv.org/abs/1803.06120v1
PDF http://arxiv.org/pdf/1803.06120v1.pdf
PWC https://paperswithcode.com/paper/learning-sparse-deep-feedforward-networks-via
Repo
Framework