Paper Group ANR 112
Fingerprint Presentation Attack Detection: Generalization and Efficiency. An Image dehazing approach based on the airlight field estimation. Low-Resource Speech-to-Text Translation. Dense Scene Flow from Stereo Disparity and Optical Flow. Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages. Deep Learni …
Fingerprint Presentation Attack Detection: Generalization and Efficiency
Title | Fingerprint Presentation Attack Detection: Generalization and Efficiency |
Authors | Tarang Chugh, Anil K. Jain |
Abstract | We study the problem of fingerprint presentation attack detection (PAD) under unknown PA materials not seen during PAD training. A dataset of 5,743 bonafide and 4,912 PA images of 12 different materials is used to evaluate a state-of-the-art PAD, namely Fingerprint Spoof Buster. We utilize 3D t-SNE visualization and clustering of material characteristics to identify a representative set of PA materials that covers most of the PA feature space. We observe that a set of six PA materials, namely Silicone, 2D Paper, Play Doh, Gelatin, Latex Body Paint and Monster Liquid Latex provide a good representative set that should be included in training to achieve generalization of PAD. We also propose an optimized Android app of Fingerprint Spoof Buster that can run on a commodity smartphone (Xiaomi Redmi Note 4) without a significant drop in PAD performance (from TDR = 95.7% to 95.3% @ FDR = 0.2%) and can make a PA prediction in less than 300 ms. |
Tasks | |
Published | 2018-12-30 |
URL | http://arxiv.org/abs/1812.11574v1 |
http://arxiv.org/pdf/1812.11574v1.pdf | |
PWC | https://paperswithcode.com/paper/fingerprint-presentation-attack-detection |
Repo | |
Framework | |
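The representative-set idea above — cluster materials in a shared feature space, then keep one material per cluster for training — can be sketched with plain k-means. This is a hypothetical illustration with made-up 2D feature vectors, not the paper's actual pipeline (which clusters 3D t-SNE embeddings of deep features):

```python
import numpy as np

def representative_materials(features, materials, k):
    """Cluster per-material feature vectors with plain k-means (deterministic
    farthest-point init) and return the material nearest to each centroid
    as a candidate representative training set."""
    X = np.asarray(features, dtype=float)
    # Farthest-point initialization: start at X[0], repeatedly add the point
    # farthest from all chosen centroids.
    centroids = [X[0]]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - np.array(centroids)[None], axis=2), axis=1)
        centroids.append(X[int(np.argmax(d))])
    centroids = np.array(centroids)
    for _ in range(100):  # Lloyd iterations until convergence
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    picks = [materials[int(np.argmin(np.linalg.norm(X - c, axis=1)))] for c in centroids]
    return sorted(set(picks))

# Toy 2D "material characteristic" vectors (hypothetical, for illustration only).
feats = [[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 4.9], [10.0, 0.0], [9.9, 0.2]]
names = ["Silicone", "Gelatin", "Play Doh", "2D Paper", "Latex", "Monster Latex"]
print(representative_materials(feats, names, k=3))
```

With three well-separated clusters, the returned set contains one material per cluster — the same "coverage" argument the paper makes with six materials over twelve.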
An Image dehazing approach based on the airlight field estimation
Title | An Image dehazing approach based on the airlight field estimation |
Authors | Lijun Zhang, Yongbin Gao, Yujin Zhang |
Abstract | This paper proposes a scheme for single image haze removal based on the airlight field (ALF) estimation. Conventional image dehazing methods which are based on a physical model generally take the global atmospheric light as a constant. However, the constant-airlight assumption may be unsuitable for images with large sky regions, which causes unacceptable brightness imbalance and color distortion in recovered images. This paper models the atmospheric light as a field function, and presents a maximum a-posteriori (MAP) method for jointly estimating the airlight field, the transmission rate and the haze-free image. We also introduce a valid haze-level prior for effective estimation of the transmission. Evaluation on real-world images shows that the proposed approach outperforms existing methods in single image dehazing, especially when a large sky region is included. |
Tasks | Image Dehazing, Single Image Dehazing, Single Image Haze Removal |
Published | 2018-05-06 |
URL | http://arxiv.org/abs/1805.02142v1 |
http://arxiv.org/pdf/1805.02142v1.pdf | |
PWC | https://paperswithcode.com/paper/an-image-dehazing-approach-based-on-the |
Repo | |
Framework | |
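The physical model behind this line of work is the atmospheric scattering equation I = J·t + A·(1 − t), here with a spatially varying airlight field A(x) rather than a constant. The paper estimates A, t and J jointly via MAP; the sketch below only shows the final step — inverting the model once estimates are available:

```python
import numpy as np

def recover_scene(I, A, t, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t),
    with a spatially varying airlight field A (same shape as I)."""
    t = np.clip(t, t_min, 1.0)  # avoid amplifying noise where t -> 0
    return (I - A * (1.0 - t)) / t

# Toy single-channel example: synthesize haze, then invert it exactly.
J = np.array([[0.2, 0.8], [0.5, 0.1]])    # true scene radiance
A = np.array([[0.9, 0.9], [0.95, 0.95]])  # airlight field (note: not constant)
t = np.array([[0.5, 0.7], [0.6, 0.9]])    # transmission
I = J * t + A * (1 - t)                   # hazy observation
print(np.allclose(recover_scene(I, A, t), J))  # → True
```

The clip on t is the standard guard against division blow-up in heavily hazed pixels; the hard part the paper addresses is estimating A and t in the first place.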
Low-Resource Speech-to-Text Translation
Title | Low-Resource Speech-to-Text Translation |
Authors | Sameer Bansal, Herman Kamper, Karen Livescu, Adam Lopez, Sharon Goldwater |
Abstract | Speech-to-text translation has many potential applications for low-resource languages, but the typical approach of cascading speech recognition with machine translation is often impossible, since the transcripts needed to train a speech recognizer are usually not available for low-resource languages. Recent work has found that neural encoder-decoder models can learn to directly translate foreign speech in high-resource scenarios, without the need for intermediate transcription. We investigate whether this approach also works in settings where both data and computation are limited. To make the approach efficient, we make several architectural changes, including a change from character-level to word-level decoding. We find that this choice yields crucial speed improvements that allow us to train with fewer computational resources, yet still performs well on frequent words. We explore models trained on between 20 and 160 hours of data, and find that although models trained on less data have considerably lower BLEU scores, they can still predict words with relatively high precision and recall—around 50% for a model trained on 50 hours of data, versus around 60% for the full 160 hour model. Thus, they may still be useful for some low-resource scenarios. |
Tasks | Machine Translation, Speech Recognition |
Published | 2018-03-24 |
URL | http://arxiv.org/abs/1803.09164v2 |
http://arxiv.org/pdf/1803.09164v2.pdf | |
PWC | https://paperswithcode.com/paper/low-resource-speech-to-text-translation |
Repo | |
Framework | |
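The speed argument for word-level decoding is simply that an encoder-decoder runs one decoding step per target symbol, so shorter target sequences mean fewer steps per sentence. A minimal illustration:

```python
sentence = "the cat sat on the mat"
char_targets = list(sentence)    # character-level decoding targets
word_targets = sentence.split()  # word-level decoding targets

# The decoder takes one step per target symbol, so word-level decoding
# shortens this sequence by roughly 4x.
print(len(char_targets), len(word_targets))  # → 22 6
```

The trade-off noted in the abstract is vocabulary coverage: word-level models cannot emit unseen words, which is why they still "perform well on frequent words" rather than uniformly.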
Dense Scene Flow from Stereo Disparity and Optical Flow
Title | Dense Scene Flow from Stereo Disparity and Optical Flow |
Authors | René Schuster, Oliver Wasenmüller, Didier Stricker |
Abstract | Scene flow describes 3D motion in a 3D scene. It can either be modeled as a single task, or it can be reconstructed from the auxiliary tasks of stereo depth and optical flow estimation. While the second method can achieve real-time performance by using real-time auxiliary methods, it will typically produce non-dense results. In this work, we present a basic combination approach for scene flow estimation and tackle the problem of non-density by interpolation. |
Tasks | Optical Flow Estimation, Scene Flow Estimation |
Published | 2018-08-30 |
URL | http://arxiv.org/abs/1808.10146v1 |
http://arxiv.org/pdf/1808.10146v1.pdf | |
PWC | https://paperswithcode.com/paper/dense-scene-flow-from-stereo-disparity-and |
Repo | |
Framework | |
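The auxiliary-task reconstruction works per pixel: stereo disparity gives depth via Z = f·B/d, optical flow gives the corresponding pixel at the next frame, and the 3D scene flow is the difference of the two back-projected points. A sketch under assumed camera parameters (the f, B, cx, cy values below are illustrative KITTI-like numbers, not from the paper):

```python
import numpy as np

def backproject(u, v, d, f, B, cx, cy):
    """Stereo pinhole model: depth Z = f*B/d, then back-project pixel (u, v)."""
    Z = f * B / d
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return np.array([X, Y, Z])

def scene_flow(u, v, d0, flow, d1, f=700.0, B=0.54, cx=320.0, cy=240.0):
    """3D motion of one pixel: disparity d0 at time t, optical flow (du, dv)
    to time t+1, and disparity d1 at the flowed position."""
    P0 = backproject(u, v, d0, f, B, cx, cy)
    P1 = backproject(u + flow[0], v + flow[1], d1, f, B, cx, cy)
    return P1 - P0

# A pixel whose disparity grows from 10 to 12 px is moving toward the camera.
sf = scene_flow(u=400.0, v=260.0, d0=10.0, flow=(4.0, 0.0), d1=12.0)
print(sf[2] < 0)  # Z decreases → True
```

The "non-density" problem arises because d0, flow, and d1 must all be valid at the same pixel; wherever any of the three is missing, the paper fills the gap by interpolation.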
Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages
Title | Evaluating the Complementarity of Taxonomic Relation Extraction Methods Across Different Languages |
Authors | Roger Granada, Renata Vieira, Cassia Trojahn, Nathalie Aussenac-Gilles |
Abstract | Modern information systems are changing the idea of “data processing” to the idea of “concept processing”: instead of processing words, such systems process semantic concepts, which carry meaning and share contexts with other concepts. An ontology is commonly used as a structure that captures the knowledge about a certain area by providing the concepts and the relations between them. Traditionally, concept hierarchies have been built manually by knowledge engineers or domain experts. However, the manual construction of a concept hierarchy suffers from several limitations, such as its coverage and the enormous cost of its extension and maintenance. Ontology learning, usually understood as the (semi-)automatic support of ontology development, is commonly divided into steps, ranging from concept identification, through the detection of hierarchical and non-hierarchical relations, to, more rarely, axiom extraction. It is reasonable to say that among these steps the current frontier is the establishment of concept hierarchies, since these form the backbone of ontologies and, therefore, a good concept hierarchy is already a valuable resource for many ontology applications. The automatic construction of concept hierarchies from texts is a complex task, and much work has proposed approaches to better extract relations between concepts. These different proposals have never been contrasted against each other on the same dataset and across different languages. Such a comparison is important to see whether they are complementary or incremental, and whether they present different tendencies towards recall and precision. This paper evaluates these different methods on the basis of hierarchy metrics such as density and depth, and evaluation metrics such as recall and precision. The results shed light on the comprehensive set of methods according to the literature in the area. |
Tasks | Relation Extraction |
Published | 2018-11-08 |
URL | http://arxiv.org/abs/1811.03245v1 |
http://arxiv.org/pdf/1811.03245v1.pdf | |
PWC | https://paperswithcode.com/paper/evaluating-the-complementarity-of-taxonomic |
Repo | |
Framework | |
Deep Learning in Pharmacogenomics: From Gene Regulation to Patient Stratification
Title | Deep Learning in Pharmacogenomics: From Gene Regulation to Patient Stratification |
Authors | Alexandr A. Kalinin, Gerald A. Higgins, Narathip Reamaroon, S. M. Reza Soroushmehr, Ari Allyn-Feuer, Ivo D. Dinov, Kayvan Najarian, Brian D. Athey |
Abstract | This Perspective provides examples of current and future applications of deep learning in pharmacogenomics, including: (1) identification of novel regulatory variants located in noncoding domains and their function as applied to pharmacoepigenomics; (2) patient stratification from medical records; and (3) prediction of drugs, targets, and their interactions. Deep learning encapsulates a family of machine learning algorithms that over the last decade has transformed many important subfields of artificial intelligence (AI) and has demonstrated breakthrough performance improvements on a wide range of tasks in biomedicine. We anticipate that in the future deep learning will be widely used to predict personalized drug response and optimize medication selection and dosing, using knowledge extracted from large and complex molecular, epidemiological, clinical, and demographic datasets. |
Tasks | |
Published | 2018-01-25 |
URL | http://arxiv.org/abs/1801.08570v2 |
http://arxiv.org/pdf/1801.08570v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-learning-in-pharmacogenomics-from-gene |
Repo | |
Framework | |
Safe end-to-end imitation learning for model predictive control
Title | Safe end-to-end imitation learning for model predictive control |
Authors | Keuntaek Lee, Kamil Saigol, Evangelos A. Theodorou |
Abstract | We propose the use of Bayesian networks, which provide both a mean value and an uncertainty estimate as output, to enhance the safety of learned control policies under circumstances in which a test-time input differs significantly from the training set. Our algorithm combines reinforcement learning and end-to-end imitation learning to simultaneously learn a control policy as well as a threshold over the predictive uncertainty of the learned model, with no hand-tuning required. Corrective action, such as a return of control to the model predictive controller or human expert, is taken when the uncertainty threshold is exceeded. We validate our method on fully-observable and vision-based partially-observable systems using cart-pole and autonomous driving simulations using deep convolutional Bayesian neural networks. We demonstrate that our method is robust to uncertainty resulting from varying system dynamics as well as from partial state observability. |
Tasks | Autonomous Driving, Imitation Learning |
Published | 2018-03-27 |
URL | http://arxiv.org/abs/1803.10231v3 |
http://arxiv.org/pdf/1803.10231v3.pdf | |
PWC | https://paperswithcode.com/paper/safe-end-to-end-imitation-learning-for-model |
Repo | |
Framework | |
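The safety mechanism described above reduces to a gate on predictive uncertainty: act on the learned policy's mean output while the spread across stochastic forward passes stays below a threshold, otherwise hand control back. The paper learns the threshold jointly with the policy; in this minimal sketch tau is simply fixed by hand:

```python
import numpy as np

def gated_action(predictions, tau, fallback):
    """Return the learned policy's mean action unless predictive uncertainty
    (std across stochastic forward passes of a Bayesian network) exceeds tau,
    in which case control returns to a fallback controller (MPC / human)."""
    preds = np.asarray(predictions, dtype=float)
    mean, std = preds.mean(), preds.std()
    if std > tau:
        return fallback(), "fallback"
    return mean, "policy"

mpc = lambda: 0.0  # stand-in for the model predictive controller

# Tight agreement across sampled passes -> trust the learned policy.
print(gated_action([0.51, 0.49, 0.50, 0.52], tau=0.1, fallback=mpc))
# Wild disagreement (out-of-distribution input) -> return control to MPC.
print(gated_action([0.9, -0.8, 0.1, 0.7], tau=0.1, fallback=mpc))
```

The key property is that the gate needs no explicit out-of-distribution detector: inputs far from the training set manifest directly as disagreement between the sampled predictions.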
Automated image segmentation for detecting cell spreading for metastasizing assessments of cancer development
Title | Automated image segmentation for detecting cell spreading for metastasizing assessments of cancer development |
Authors | Sholpan Kauanova, Ivan Vorobjev, Alex Pappachen James |
Abstract | The automated segmentation of cells in microscopic images is an open research problem that has important implications for studies of the developmental and cancer processes based on in vitro models. In this paper, we present an approach for segmenting DIC images of cultured cells using G-neighbor smoothing followed by Kuwahara filtering and a local-standard-deviation approach for boundary detection. NIH FIJI/ImageJ tools are used to create the ground-truth dataset. The results of this work indicate that detecting cell boundaries with a segmentation approach remains challenging, even under realistic measurement conditions. |
Tasks | Boundary Detection, Semantic Segmentation |
Published | 2018-01-01 |
URL | http://arxiv.org/abs/1801.00455v1 |
http://arxiv.org/pdf/1801.00455v1.pdf | |
PWC | https://paperswithcode.com/paper/automated-image-segmentation-for-detecting |
Repo | |
Framework | |
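The local-standard-deviation step of the pipeline has a compact formulation: each pixel is replaced by the standard deviation of its neighborhood, which is near zero in flat regions and high along boundaries. A minimal sketch of that step alone (omitting the G-neighbor smoothing and Kuwahara filtering that precede it in the paper):

```python
import numpy as np

def local_std_map(img, radius=1):
    """Standard deviation of each pixel's (2r+1)x(2r+1) neighborhood;
    high values indicate texture/boundaries, low values flat regions."""
    img = np.asarray(img, dtype=float)
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    # Stack every shifted window, then take std over the window axis.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(2 * radius + 1)
                        for dx in range(2 * radius + 1)])
    return windows.std(axis=0)

# Flat background with one bright square: the std map peaks on its border.
img = np.zeros((9, 9))
img[3:6, 3:6] = 1.0
std = local_std_map(img)
print(std[4, 4] < std[3, 3])  # cell interior vs. cell border → True
print(std[0, 0] == 0.0)       # flat background → True
```

Thresholding such a map yields candidate boundary pixels; the difficulty the abstract points to is that DIC images of real cells are far less clean than this toy input.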
An Introduction to Image Synthesis with Generative Adversarial Nets
Title | An Introduction to Image Synthesis with Generative Adversarial Nets |
Authors | He Huang, Philip S. Yu, Changhu Wang |
Abstract | There has been drastic growth of research in Generative Adversarial Nets (GANs) in the past few years. Proposed in 2014, GANs have been applied to various applications such as computer vision and natural language processing, achieving impressive performance. Among the many applications of GANs, image synthesis is the most well-studied one, and research in this area has already demonstrated the great potential of using GANs in image synthesis. In this paper, we provide a taxonomy of methods used in image synthesis, review different models for text-to-image synthesis and image-to-image translation, and discuss some evaluation metrics as well as possible future research directions in image synthesis with GANs. |
Tasks | Image Generation, Image-to-Image Translation |
Published | 2018-03-12 |
URL | http://arxiv.org/abs/1803.04469v2 |
http://arxiv.org/pdf/1803.04469v2.pdf | |
PWC | https://paperswithcode.com/paper/an-introduction-to-image-synthesis-with |
Repo | |
Framework | |
Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification
Title | Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification |
Authors | Mingwen Dong |
Abstract | Music genre classification is one example of content-based analysis of music signals. Traditionally, human-engineered features were used to automate this task, and 61% accuracy has been achieved in 10-genre classification. However, this is still below the 70% accuracy that humans achieve on the same task. Here, we propose a new method that combines knowledge from human perception studies of music genre classification with the neurophysiology of the auditory system. The method works by training a simple convolutional neural network (CNN) to classify short segments of the music signal; the genre of a piece is then determined by splitting it into short segments and combining the CNN’s predictions from all of them. After training, this method achieves human-level (70%) accuracy, and the filters learned in the CNN resemble the spectrotemporal receptive fields (STRFs) in the auditory system. |
Tasks | |
Published | 2018-02-27 |
URL | http://arxiv.org/abs/1802.09697v1 |
http://arxiv.org/pdf/1802.09697v1.pdf | |
PWC | https://paperswithcode.com/paper/convolutional-neural-network-achieves-human |
Repo | |
Framework | |
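The segment-combination step described in the abstract is a straightforward aggregation: average the CNN's per-segment class probabilities over all segments of a track, then take the argmax. A minimal sketch (the per-segment softmax values below are made up):

```python
import numpy as np

def classify_track(segment_probs):
    """Average per-segment class probabilities (CNN softmax outputs) over all
    short segments of a track, then pick the argmax genre index."""
    probs = np.asarray(segment_probs, dtype=float)
    return int(probs.mean(axis=0).argmax())

# Three segments, three genres: individually noisy, jointly confident.
segs = [[0.5, 0.3, 0.2],
        [0.2, 0.6, 0.2],
        [0.1, 0.6, 0.3]]
print(classify_track(segs))  # → 1
```

Averaging before the argmax is what lets a weak per-segment classifier reach track-level accuracy: segment errors tend to cancel while the true genre accumulates evidence.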
Protecting Sensitive Attributes via Generative Adversarial Networks
Title | Protecting Sensitive Attributes via Generative Adversarial Networks |
Authors | Aria Rezaei, Chaowei Xiao, Jie Gao, Bo Li |
Abstract | Recent advances in computing have made it possible to collect large amounts of data on personal activities and private living spaces. Collecting and publishing a dataset in this environment can raise privacy concerns for the individuals in the dataset. In this paper, we examine these concerns: in particular, given a target application, how can we mask sensitive attributes in the data while preserving the utility of the data for that application? Our focus is on protecting attributes that are hidden but can be inferred from the data by machine learning algorithms. We propose a generic framework that (1) removes the knowledge useful for inferring sensitive information, but (2) preserves the knowledge relevant to a given target application. We use deep neural networks and generative adversarial networks (GANs) to create privacy-preserving perturbations. Our noise-generating network is compact and efficient enough to run on mobile devices. Through extensive experiments, we show that our method outperforms conventional methods in effectively hiding sensitive attributes while guaranteeing high performance for the target application. Our results hold for neural network architectures not seen during training and are suitable for training new classifiers. |
Tasks | |
Published | 2018-12-26 |
URL | http://arxiv.org/abs/1812.10193v1 |
http://arxiv.org/pdf/1812.10193v1.pdf | |
PWC | https://paperswithcode.com/paper/protecting-sensitive-attributes-via |
Repo | |
Framework | |
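The framework's two objectives — destroy the signal that predicts the sensitive attribute, keep the signal the target application needs — can be shown with a deliberately degenerate toy. Here the "leaky" coordinate is known in advance and simply replaced with noise; the paper's contribution is learning such a perturbation with a GAN when no such axis is given:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
sensitive = rng.integers(0, 2, n)  # hidden attribute to protect
target = rng.integers(0, 2, n)     # label the target application needs

# Raw features leak both signals (one coordinate per signal, plus noise).
X = np.stack([target + 0.1 * rng.normal(size=n),
              sensitive + 0.1 * rng.normal(size=n)], axis=1)

def sanitize(X, rng):
    """Toy 'perturbation': overwrite the sensitive-correlated coordinate with
    fresh noise, leaving the utility coordinate intact."""
    Xp = X.copy()
    Xp[:, 1] = rng.normal(size=len(X))
    return Xp

Xp = sanitize(X, rng)
corr = lambda a, b: abs(np.corrcoef(a, b)[0, 1])
print(corr(Xp[:, 0], target))     # stays high: utility preserved
print(corr(Xp[:, 1], sensitive))  # drops to ~0: sensitive leak removed
```

In realistic data the sensitive information is entangled across many features, which is why the paper trains a noise-generating network adversarially instead of zeroing a known coordinate.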
Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry
Title | Machine Learning pipeline for discovering neuroimaging-based biomarkers in neurology and psychiatry |
Authors | Alexander Bernstein, Evgeny Burnaev, Ekaterina Kondratyeva, Svetlana Sushchinskaya, Maxim Sharaev, Alexander Andreev, Alexey Artemov, Renat Akzhigitov |
Abstract | We consider a problem of diagnostic pattern recognition/classification from neuroimaging data. We propose a common data analysis pipeline for neuroimaging-based diagnostic classification problems using various ML algorithms and processing toolboxes for brain imaging. We illustrate the pipeline application by discovering new biomarkers for diagnostics of epilepsy and depression based on clinical and MRI/fMRI data for patients and healthy volunteers. |
Tasks | |
Published | 2018-04-26 |
URL | http://arxiv.org/abs/1804.10163v1 |
http://arxiv.org/pdf/1804.10163v1.pdf | |
PWC | https://paperswithcode.com/paper/machine-learning-pipeline-for-discovering |
Repo | |
Framework | |
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
Title | Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings |
Authors | Seonwook Park, Xucong Zhang, Andreas Bulling, Otmar Hilliges |
Abstract | Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras. In unconstrained real-world settings, however, such methods are surpassed by recent appearance-based methods due to difficulties in modeling factors such as illumination changes and other visual artifacts. We present a novel learning-based method for eye region landmark localization that enables conventional methods to be competitive to latest appearance-based methods. Despite having been trained exclusively on synthetic data, our method exceeds the state of the art for iris localization and eye shape registration on real-world imagery. We then use the detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods. Our approach outperforms existing model-fitting and appearance-based methods in the context of person-independent and personalized gaze estimation. |
Tasks | Gaze Estimation |
Published | 2018-05-12 |
URL | http://arxiv.org/abs/1805.04771v1 |
http://arxiv.org/pdf/1805.04771v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-to-find-eye-region-landmarks-for |
Repo | |
Framework | |
Towards Learning Fine-Grained Disentangled Representations from Speech
Title | Towards Learning Fine-Grained Disentangled Representations from Speech |
Authors | Yuan Gong, Christian Poellabauer |
Abstract | Learning disentangled representations of high-dimensional data is currently an active research area. However, compared to the field of computer vision, less work has been done for speech processing. In this paper, we provide a review of two representative efforts on this topic and propose the novel concept of fine-grained disentangled speech representation learning. |
Tasks | Representation Learning |
Published | 2018-08-08 |
URL | http://arxiv.org/abs/1808.02939v1 |
http://arxiv.org/pdf/1808.02939v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-learning-fine-grained-disentangled |
Repo | |
Framework | |
Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion
Title | Learning Sparse Deep Feedforward Networks via Tree Skeleton Expansion |
Authors | Zhourong Chen, Xiaopeng Li, Nevin L. Zhang |
Abstract | Despite the popularity of deep learning, structure learning for deep models remains a relatively under-explored area. In contrast, structure learning has been studied extensively for probabilistic graphical models (PGMs). In particular, an efficient algorithm has been developed for learning a class of tree-structured PGMs called hierarchical latent tree models (HLTMs), where there is a layer of observed variables at the bottom and multiple layers of latent variables on top. In this paper, we propose a simple method for learning the structures of feedforward neural networks (FNNs) based on HLTMs. The idea is to expand the connections in the tree skeletons from HLTMs and to use the resulting structures for FNNs. An important characteristic of FNN structures learned this way is that they are sparse. We present extensive empirical results to show that, compared with standard FNNs tuned manually, sparse FNNs learned by our method achieve better or comparable classification performance with far fewer parameters. They are also more interpretable. |
Tasks | |
Published | 2018-03-16 |
URL | http://arxiv.org/abs/1803.06120v1 |
http://arxiv.org/pdf/1803.06120v1.pdf | |
PWC | https://paperswithcode.com/paper/learning-sparse-deep-feedforward-networks-via |
Repo | |
Framework | |
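A tree skeleton translates into a sparse FNN layer as a binary connection mask: a lower unit connects to an upper unit only where the tree has an edge, instead of everywhere as in a dense layer. The sketch below shows only the skeleton edges; the paper additionally expands connections beyond the strict tree before using the structure:

```python
import numpy as np

def mask_from_tree(parents, n_lower, n_upper):
    """Binary connection mask for one FNN layer from a tree skeleton:
    lower unit i connects to upper unit j only if j is i's parent.
    (A standard dense layer would set every entry to 1.)"""
    M = np.zeros((n_upper, n_lower))
    for child, parent in parents.items():
        M[parent, child] = 1.0
    return M

# 4 observed units under 2 latent parents, as in an HLTM's bottom two layers.
parents = {0: 0, 1: 0, 2: 1, 3: 1}
M = mask_from_tree(parents, n_lower=4, n_upper=2)
W = np.random.randn(2, 4) * M  # element-wise mask keeps the weights sparse
print(int(M.sum()), M.shape)   # → 4 (2, 4)
```

Multiplying the weight matrix (and its gradients) by the mask at every step keeps the layer at 4 parameters instead of the dense layer's 8, which is the source of the parameter savings the abstract reports.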