May 5, 2019

Paper Group ANR 570

Using k-nearest neighbors to construct cancelable minutiae templates. Hierarchical Clustering in Face Similarity Score Space. Network of Experts for Large-Scale Image Categorization. Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities. Monaural Multi-Talker Speech Recognition using Factorial Speec …

Using k-nearest neighbors to construct cancelable minutiae templates

Title Using k-nearest neighbors to construct cancelable minutiae templates
Authors Qinghai Gao
Abstract Fingerprints are widely used in a variety of applications. Security measures have to be taken to protect the privacy of fingerprint data. Cancelable biometrics has been proposed as an effective mechanism for using and protecting biometrics. In this paper we propose a new method of constructing a cancelable fingerprint template by combining a real template with a synthetic template. Specifically, each user is given one synthetic minutia template generated with a random number generator. Every minutia point from the real template is individually placed into the synthetic template, from which its k-nearest neighbors are found. The verification template is constructed by combining an arbitrary set of the k-nearest neighbors. To validate the scheme, testing is carried out on three databases. The results show that the constructed templates satisfy the requirements of cancelable biometrics.
Tasks
Published 2016-08-29
URL http://arxiv.org/abs/1608.07897v2
PDF http://arxiv.org/pdf/1608.07897v2.pdf
PWC https://paperswithcode.com/paper/using-k-nearest-neighbors-to-construct
Repo
Framework
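
The construction described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes minutiae are plain 2D points, uses Euclidean distance, and replaces the abstract's "arbitrary set" of neighbors with a fixed choice of neighbor ranks. All function names and parameters (`synthetic_template`, `cancelable_template`, `pick`, the 256x256 image size) are hypothetical.

```python
import math
import random


def synthetic_template(seed, n_points, width=256, height=256):
    """Generate a per-user synthetic minutia template from a seeded RNG."""
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_points)]


def k_nearest(point, template, k):
    """Return the k points of `template` closest to `point` (Euclidean distance)."""
    return sorted(template, key=lambda q: math.dist(point, q))[:k]


def cancelable_template(real_minutiae, synthetic, k=5, pick=(0, 2)):
    """For each real minutia, keep a chosen subset of its k nearest synthetic points.

    `pick` lists the neighbor ranks to keep; it is an illustrative stand-in
    for the "arbitrary set" of neighbors mentioned in the abstract.
    """
    protected = []
    for minutia in real_minutiae:
        neighbors = k_nearest(minutia, synthetic, k)
        protected.extend(neighbors[i] for i in pick)
    return protected


# Usage: the stored template contains only synthetic points, and issuing a
# new seed yields a different synthetic template for the same finger.
synthetic = synthetic_template(seed=42, n_points=50)
real = [(10.0, 20.0), (100.0, 150.0)]
template = cancelable_template(real, synthetic)
```

Under these assumptions, revoking a template amounts to issuing the user a new seed, since the stored points are drawn from the synthetic template rather than from the real minutiae.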

Hierarchical Clustering in Face Similarity Score Space

Title Hierarchical Clustering in Face Similarity Score Space
Authors Jason Grant, Patrick Flynn
Abstract Similarity scores in face recognition represent the proximity between pairs of images as computed by a matching algorithm. Given a large set of images and the proximities between all pairs, a similarity score space is defined. Cluster analysis was applied to the similarity score space to develop various taxonomies. Given the number of subjects in the dataset, we used hierarchical methods to aggregate images of the same subject. We also explored the hierarchy above and below the subject level, including clusters that reflect gender and ethnicity. Evidence supports the existence of clustering by race, gender, subject, and illumination condition.
Tasks Face Recognition
Published 2016-05-19
URL http://arxiv.org/abs/1605.06052v1
PDF http://arxiv.org/pdf/1605.06052v1.pdf
PWC https://paperswithcode.com/paper/hierarchical-clustering-in-face-similarity
Repo
Framework
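
As a rough illustration of clustering directly in a similarity score space, the sketch below runs agglomerative clustering on a symmetric similarity matrix. The abstract does not say which linkage was used, so average linkage is an assumption here, and the function name `agglomerate` and the toy scores are hypothetical.

```python
def agglomerate(sim, n_clusters):
    """Average-linkage agglomerative clustering on a symmetric similarity matrix.

    `sim[i][j]` is the matcher's similarity score for images i and j
    (higher means more alike). Clusters are merged greedily until only
    `n_clusters` remain.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the most similar pair
        del clusters[b]
    return clusters


# Toy scores: images 0 and 1 match strongly, as do images 2 and 3.
scores = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
groups = agglomerate(scores, n_clusters=2)
```

Stopping the merge loop at different values of `n_clusters` gives the different levels of the hierarchy, for example one cluster per subject versus coarser groupings above the subject level.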

Network of Experts for Large-Scale Image Categorization

Title Network of Experts for Large-Scale Image Categorization
Authors Karim Ahmed, Mohammad Haris Baig, Lorenzo Torresani
Abstract We present a tree-structured network architecture for large scale image classification. The trunk of the network contains convolutional layers optimized over all classes. At a given depth, the trunk splits into separate branches, each dedicated to discriminate a different subset of classes. Each branch acts as an expert classifying a set of categories that are difficult to tell apart, while the trunk provides common knowledge to all experts in the form of shared features. The training of our “network of experts” is completely end-to-end: the partition of categories into disjoint subsets is learned simultaneously with the parameters of the network trunk and the experts are trained jointly by minimizing a single learning objective over all classes. The proposed structure can be built from any existing convolutional neural network (CNN). We demonstrate its generality by adapting 4 popular CNNs for image categorization into the form of networks of experts. Our experiments on CIFAR100 and ImageNet show that in every case our method yields a substantial improvement in accuracy over the base CNN, and gives the best result achieved so far on CIFAR100. Finally, the improvement in accuracy comes at little additional cost: compared to the base network, the training time is only moderately increased and the number of parameters is comparable or in some cases even lower.
Tasks Image Categorization, Image Classification
Published 2016-04-20
URL http://arxiv.org/abs/1604.06119v3
PDF http://arxiv.org/pdf/1604.06119v3.pdf
PWC https://paperswithcode.com/paper/network-of-experts-for-large-scale-image
Repo
Framework

Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities

Title Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities
Authors Mohsen Ghafoorian, Nico Karssemeijer, Tom Heskes, Inge van Uden, Clara Sanchez, Geert Litjens, Frank-Erik de Leeuw, Bram van Ginneken, Elena Marchiori, Bram Platel
Abstract The anatomical location of imaging features is of crucial importance for accurate diagnosis in many medical tasks. Convolutional neural networks (CNN) have had huge successes in computer vision, but they lack the natural ability to incorporate the anatomical location in their decision making process, hindering success in some medical image analysis tasks. In this paper, to integrate the anatomical location information into the network, we propose several deep CNN architectures that consider multi-scale patches or take explicit location features while training. We apply and compare the proposed architectures for segmentation of white matter hyperintensities in brain MR images on a large dataset. As a result, we observe that the CNNs that incorporate location information substantially outperform a conventional segmentation method with hand-crafted features as well as CNNs that do not integrate location information. On a test set of 46 scans, the best configuration of our networks obtained a Dice score of 0.791, compared to 0.797 for an independent human observer. Performance levels of the machine and the independent human observer were not statistically significantly different (p-value=0.17).
Tasks Decision Making
Published 2016-10-16
URL http://arxiv.org/abs/1610.04834v2
PDF http://arxiv.org/pdf/1610.04834v2.pdf
PWC https://paperswithcode.com/paper/location-sensitive-deep-convolutional-neural
Repo
Framework
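
The Dice score quoted in the abstract above measures overlap between a predicted segmentation and a reference segmentation. A minimal sketch, assuming both masks are flat sequences of 0/1 voxel labels; the function name and the convention of returning 1.0 for two empty masks are assumptions, not taken from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# One overlapping voxel out of two marked in each mask.
example = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```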

Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models

Title Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models
Authors Mahdi Khademian, Mohammad Mehdi Homayounpour
Abstract A Pascal challenge entitled monaural multi-talker speech recognition was developed, targeting the problem of making automatic speech recognition robust against speech-like noise, which significantly degrades the performance of automatic speech recognition systems. In this challenge, two competing speakers say a simple command simultaneously and the objective is to recognize the speech of the target speaker. Surprisingly, during the challenge a team from IBM Research achieved performance better than human listeners on this task. The IBM team's method consists of an intermediate speech separation step followed by single-talker speech recognition. This paper reconsiders the task of this challenge based on gain-adapted factorial speech processing models. It develops a joint-token passing algorithm for direct, simultaneous utterance decoding of both the target and masker speakers. Compared to the challenge winner, it uses maximum uncertainty during decoding, which cannot be used in the earlier two-phase method. It provides a detailed derivation of inference on these models based on general inference procedures of probabilistic graphical models. As another improvement, it uses deep neural networks for joint-speaker identification and gain estimation, which makes these two steps easier than before while producing competitive results for these steps. The proposed method outperforms past super-human results and even the results recently achieved by Microsoft Research using deep neural networks. It achieves a 5.5% absolute task performance improvement compared to the first super-human system and a 2.7% absolute task performance improvement compared to its recent competitor.
Tasks Speaker Identification, Speech Recognition, Speech Separation
Published 2016-10-05
URL http://arxiv.org/abs/1610.01367v1
PDF http://arxiv.org/pdf/1610.01367v1.pdf
PWC https://paperswithcode.com/paper/monaural-multi-talker-speech-recognition
Repo
Framework

Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)

Title Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)
Authors Ilya Loshchilov, Tobias Glasmachers
Abstract We propose a multi-objective optimization algorithm aimed at achieving good anytime performance over a wide range of problems. Performance is assessed in terms of the hypervolume metric. The algorithm called HMO-CMA-ES represents a hybrid of several old and new variants of CMA-ES, complemented by BOBYQA as a warm start. We benchmark HMO-CMA-ES on the recently introduced bi-objective problem suite of the COCO framework (COmparing Continuous Optimizers), consisting of 55 scalable continuous optimization problems, which is used by the Black-Box Optimization Benchmarking (BBOB) Workshop 2016.
Tasks
Published 2016-05-09
URL http://arxiv.org/abs/1605.02720v1
PDF http://arxiv.org/pdf/1605.02720v1.pdf
PWC https://paperswithcode.com/paper/anytime-bi-objective-optimization-with-a
Repo
Framework
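
For readers unfamiliar with the hypervolume metric mentioned in the abstract above, here is a minimal sketch for the bi-objective case. It assumes both objectives are minimized and that a reference point is given; it is a generic sweep over sorted points, not the COCO framework's own indicator code, and the function name is hypothetical.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref`, with both objectives minimized."""
    # Keep only points that are strictly better than the reference point.
    inside = [(x, y) for x, y in points if x < ref[0] and y < ref[1]]
    area = 0.0
    best_y = ref[1]
    # Sweep from left to right; each point that improves on the best second
    # objective seen so far contributes a new rectangular slab.
    for x, y in sorted(inside):
        if y < best_y:
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
hv = hypervolume_2d(front, ref=(4.0, 4.0))
```

Anytime performance can then be tracked by recomputing this value on the set of solutions found so far after each batch of function evaluations.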

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification

Title Look, Listen and Learn - A Multimodal LSTM for Speaker Identification
Authors Jimmy Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan
Abstract Speaker identification refers to the task of localizing the face of a person who has the same identity as the ongoing voice in a video. This task requires not only collective perception over both visual and auditory signals; robustness to severe quality degradations and unconstrained content variations is also indispensable. In this paper, we describe a novel multimodal Long Short-Term Memory (LSTM) architecture which seamlessly unifies both visual and auditory modalities from the beginning of each sequence input. The key idea is to extend the conventional LSTM by not only sharing weights across time steps, but also sharing weights across modalities. We show that modeling the temporal dependency across face and voice can significantly improve the robustness to content quality degradations and variations. We also found that our multimodal LSTM is robust to distractors, namely the non-speaking identities. We applied our multimodal LSTM to The Big Bang Theory dataset and showed that our system outperforms the state-of-the-art systems in speaker identification with a lower false alarm rate and higher recognition accuracy.
Tasks Speaker Identification
Published 2016-02-13
URL http://arxiv.org/abs/1602.04364v1
PDF http://arxiv.org/pdf/1602.04364v1.pdf
PWC https://paperswithcode.com/paper/look-listen-and-learn-a-multimodal-lstm-for
Repo
Framework

Computational linking theory

Title Computational linking theory
Authors Aaron Steven White, Drew Reisinger, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
Abstract A linking theory explains how verbs’ semantic arguments are mapped to their syntactic arguments—the inverse of the Semantic Role Labeling task from the shallow semantic parsing literature. In this paper, we develop the Computational Linking Theory framework as a method for implementing and testing linking theories proposed in the theoretical literature. We deploy this framework to assess two cross-cutting types of linking theory: local v. global models and categorical v. featural models. To further investigate the behavior of these models, we develop a measurement model in the spirit of previous work in semantic role induction: the Semantic Proto-Role Linking Model. We use this model, which implements a generalization of Dowty’s seminal Proto-Role Theory, to induce semantic proto-roles, which we compare to those Dowty proposes.
Tasks Semantic Parsing, Semantic Role Labeling
Published 2016-10-08
URL http://arxiv.org/abs/1610.02544v1
PDF http://arxiv.org/pdf/1610.02544v1.pdf
PWC https://paperswithcode.com/paper/computational-linking-theory
Repo
Framework

Engineering Deep Representations for Modeling Aesthetic Perception

Title Engineering Deep Representations for Modeling Aesthetic Perception
Authors Yanxiang Chen, Yuxing Hu, Luming Zhang, Ping Li, Chao Zhang
Abstract Many aesthetic models in computer vision suffer from two shortcomings: 1) the low descriptiveness and interpretability of hand-crafted aesthetic criteria (i.e., nonindicative of region-level aesthetics), and 2) the difficulty of engineering aesthetic features adaptively and automatically for different image sets. To remedy these problems, we develop a deep architecture to learn aesthetically-relevant visual attributes from Flickr, which are localized by multiple textual attributes in a weakly-supervised setting. More specifically, using a bag-of-words (BoW) representation of the frequent Flickr image tags, a sparsity-constrained subspace algorithm discovers a compact set of textual attributes (e.g., landscape and sunset) for each image. Then, a weakly-supervised learning algorithm projects the textual attributes at image-level to the highly-responsive image patches at pixel-level. These patches indicate where humans look at appealing regions with respect to each textual attribute, and they are employed to learn the visual attributes. Psychological and anatomical studies have shown that humans perceive visual concepts hierarchically. Hence, we normalize these patches and feed them into a five-layer convolutional neural network (CNN) to mimic the hierarchy with which humans perceive the visual attributes. We apply the learned deep features to image retargeting, aesthetics ranking, and retrieval. Both subjective and objective experimental results thoroughly demonstrate the competitiveness of our approach.
Tasks
Published 2016-05-25
URL https://arxiv.org/abs/1605.07699v2
PDF https://arxiv.org/pdf/1605.07699v2.pdf
PWC https://paperswithcode.com/paper/describing-human-aesthetic-perception-by
Repo
Framework

A Deep Neural Network to identify foreshocks in real time

Title A Deep Neural Network to identify foreshocks in real time
Authors K. Vikraman
Abstract Foreshock events provide valuable insight to predict imminent major earthquakes. However, it is difficult to identify them in real time. In this paper, I propose an algorithm based on deep learning to instantaneously classify a seismic waveform as a foreshock, mainshock or an aftershock event achieving a high accuracy of 99% in classification. As a result, this is by far the most reliable method to predict major earthquakes that are preceded by foreshocks. In addition, I discuss methods to create an earthquake dataset that is compatible with deep networks.
Tasks
Published 2016-11-26
URL http://arxiv.org/abs/1611.08655v1
PDF http://arxiv.org/pdf/1611.08655v1.pdf
PWC https://paperswithcode.com/paper/a-deep-neural-network-to-identify-foreshocks
Repo
Framework

Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization

Title Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization
Authors Alexander G. Ororbia II, C. Lee Giles, Daniel Kifer
Abstract Many previous proposals for adversarial training of deep neural nets have included directly modifying the gradient, training on a mix of original and adversarial examples, using contractive penalties, and approximately optimizing constrained adversarial objective functions. In this paper, we show these proposals are actually all instances of optimizing a general, regularized objective we call DataGrad. Our proposed DataGrad framework, which can be viewed as a deep extension of the layerwise contractive autoencoder penalty, cleanly simplifies prior work and easily allows extensions such as adversarial training with multi-task cues. In our experiments, we find that the deep gradient regularization of DataGrad (which also has L1 and L2 flavors of regularization) outperforms alternative forms of regularization, including classical L1, L2, and multi-task, both on the original dataset as well as on adversarial sets. Furthermore, we find that combining multi-task optimization with DataGrad adversarial training results in the most robust performance.
Tasks
Published 2016-01-26
URL http://arxiv.org/abs/1601.07213v3
PDF http://arxiv.org/pdf/1601.07213v3.pdf
PWC https://paperswithcode.com/paper/unifying-adversarial-training-algorithms-with
Repo
Framework

Mammalian Value Systems

Title Mammalian Value Systems
Authors Gopal P. Sarma, Nick J. Hay
Abstract Characterizing human values is a topic deeply interwoven with the sciences, humanities, art, and many other human endeavors. In recent years, a number of thinkers have argued that accelerating trends in computer science, cognitive science, and related disciplines foreshadow the creation of intelligent machines which meet and ultimately surpass the cognitive abilities of human beings, thereby entangling an understanding of human values with future technological development. Contemporary research accomplishments suggest sophisticated AI systems becoming widespread and responsible for managing many aspects of the modern world, from preemptively planning users’ travel schedules and logistics, to fully autonomous vehicles, to domestic robots assisting in daily living. The extrapolation of these trends has been most forcefully described in the context of a hypothetical “intelligence explosion,” in which the capabilities of an intelligent software agent would rapidly increase due to the presence of feedback loops unavailable to biological organisms. The possibility of superintelligent agents, or simply the widespread deployment of sophisticated, autonomous AI systems, highlights an important theoretical problem: the need to separate the cognitive and rational capacities of an agent from the fundamental goal structure, or value system, which constrains and guides the agent’s actions. The “value alignment problem” is to specify a goal structure for autonomous agents compatible with human values. In this brief article, we suggest that recent ideas from affective neuroscience and related disciplines aimed at characterizing neurological and behavioral universals in the mammalian class provide important conceptual foundations relevant to describing human values. We argue that the notion of “mammalian value systems” points to a potential avenue for fundamental research in AI safety and AI ethics.
Tasks Autonomous Vehicles
Published 2016-07-28
URL http://arxiv.org/abs/1607.08289v4
PDF http://arxiv.org/pdf/1607.08289v4.pdf
PWC https://paperswithcode.com/paper/mammalian-value-systems
Repo
Framework

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms

Title Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms
Authors Mathieu Blondel, Masakazu Ishihata, Akinori Fujino, Naonori Ueda
Abstract Polynomial networks and factorization machines are two recently-proposed models that can efficiently use feature interactions in classification and regression tasks. In this paper, we revisit both models from a unified perspective. Based on this new view, we study the properties of both models and propose new efficient training algorithms. Key to our approach is to cast parameter learning as a low-rank symmetric tensor estimation problem, which we solve by multi-convex optimization. We demonstrate our approach on regression and recommender system tasks.
Tasks Recommendation Systems
Published 2016-07-29
URL http://arxiv.org/abs/1607.08810v1
PDF http://arxiv.org/pdf/1607.08810v1.pdf
PWC https://paperswithcode.com/paper/polynomial-networks-and-factorization
Repo
Framework
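
To make "efficiently use feature interactions" concrete, the sketch below evaluates a standard second-order factorization machine in two ways: a direct sum over all feature pairs, and the well-known reformulation that costs time linear in the number of features. This shows only the usual prediction rule, not the training algorithms proposed in the paper; the function names and toy parameters are hypothetical.

```python
def fm_predict_naive(x, w0, w, V):
    """y = w0 + <w, x> + sum over i<j of <V[i], V[j]> * x[i] * x[j]."""
    d = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(d))
    for i in range(d):
        for j in range(i + 1, d):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            y += dot * x[i] * x[j]
    return y


def fm_predict_fast(x, w0, w, V):
    """Same value in O(d * k) time, where k is the number of latent factors.

    Uses the identity
    sum_{i<j} <V[i], V[j]> x_i x_j
        = 0.5 * sum_f [(sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2].
    """
    d, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(d))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(d))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(d))
        y += 0.5 * (s * s - s_sq)
    return y


x = [1.0, 2.0, 3.0]
w0, w = 0.5, [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one latent vector per feature
y_naive = fm_predict_naive(x, w0, w, V)
y_fast = fm_predict_fast(x, w0, w, V)
```

The pairwise weights are never stored explicitly; they are inner products of rows of the low-rank matrix `V`, which is what keeps the number of parameters manageable.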

Functional archetype and archetypoid analysis

Title Functional archetype and archetypoid analysis
Authors Irene Epifanio
Abstract Archetype and archetypoid analysis can be extended to functional data. Each function is represented as a mixture of actual observations (functional archetypoids) or functional archetypes, which are a mixture of observations in the data set. Well-known Canadian temperature data are used to illustrate the analysis developed. Computational methods are proposed for performing these analyses, based on the coefficients of a basis. Unlike a previous attempt to compute functional archetypes, which was only valid for an orthogonal basis, the proposed methodology can be used for any basis. It is computationally less demanding than the simple approach of discretizing the functions. Multivariate functional archetype and archetypoid analysis are also introduced and applied in an interesting problem about the study of human development around the world over the last 50 years. These tools can contribute to the understanding of a functional data set, as in the multivariate case.
Tasks
Published 2016-01-26
URL http://arxiv.org/abs/1601.06911v2
PDF http://arxiv.org/pdf/1601.06911v2.pdf
PWC https://paperswithcode.com/paper/functional-archetype-and-archetypoid-analysis
Repo
Framework

QBF Solving by Counterexample-guided Expansion

Title QBF Solving by Counterexample-guided Expansion
Authors Roderick Bloem, Nicolas Braud-Santoni, Vedad Hadzic
Abstract We introduce a novel generalization of Counterexample-Guided Inductive Synthesis (CEGIS) and instantiate it to yield a novel, competitive algorithm for solving Quantified Boolean Formulas (QBF). Current QBF solvers based on counterexample-guided expansion use a recursive approach which scales poorly with the number of quantifier alternations. Our generalization of CEGIS removes the need for this recursive approach, and we instantiate it to yield a simple and efficient algorithm for QBF solving. Lastly, this research is supported by a competitive, though straightforward, implementation of the algorithm, making it possible to study the practical impact of our algorithm design decisions, along with various optimizations.
Tasks
Published 2016-11-04
URL http://arxiv.org/abs/1611.01553v4
PDF http://arxiv.org/pdf/1611.01553v4.pdf
PWC https://paperswithcode.com/paper/qbf-solving-by-counterexample-guided
Repo
Framework
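
The counterexample-guided idea can be illustrated on the simplest case, a formula with a single quantifier alternation of the form ∃x ∀y. φ(x, y). The sketch below is the classic two-level CEGIS loop with brute-force enumeration standing in for SAT calls; it is not the paper's generalized algorithm for arbitrary alternation depth, and the function name and encoding of φ as a Python callable are assumptions.

```python
from itertools import product


def cegis_2qbf(phi, n_x, n_y):
    """Decide ∃x ∀y. phi(x, y) over Boolean tuples; return a witness x or None."""
    counterexamples = []
    while True:
        # Synthesis step: find an x that works for every counterexample so far.
        candidate = next(
            (x for x in product([False, True], repeat=n_x)
             if all(phi(x, y) for y in counterexamples)),
            None,
        )
        if candidate is None:
            return None  # no x survives the collected counterexamples
        # Verification step: look for a y that refutes the candidate.
        refutation = next(
            (y for y in product([False, True], repeat=n_y)
             if not phi(candidate, y)),
            None,
        )
        if refutation is None:
            return candidate  # candidate holds for all y
        counterexamples.append(refutation)


# ∃x ∀y. (x ∨ ¬y) is true with x = True; ∃x ∀y. (x ↔ y) is false.
witness = cegis_2qbf(lambda x, y: x[0] or not y[0], n_x=1, n_y=1)
no_witness = cegis_2qbf(lambda x, y: x[0] == y[0], n_x=1, n_y=1)
```

Each refutation rules out the candidate that produced it, so the loop terminates after finitely many rounds. Handling deeper quantifier prefixes without recursion is the part the paper contributes and is not shown here.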