May 5, 2019

Paper Group ANR 570

Using k-nearest neighbors to construct cancelable minutiae templates. Hierarchical Clustering in Face Similarity Score Space. Network of Experts for Large-Scale Image Categorization. Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities. Monaural Multi-Talker Speech Recognition using Factorial Speec …

Using k-nearest neighbors to construct cancelable minutiae templates


Title	Using k-nearest neighbors to construct cancelable minutiae templates
Authors	Qinghai Gao
Abstract	Fingerprints are widely used in a variety of applications, so security measures must be taken to protect the privacy of fingerprint data. Cancelable biometrics has been proposed as an effective mechanism for using and protecting biometrics. In this paper we propose a new method of constructing a cancelable fingerprint template by combining the real template with a synthetic template. Specifically, each user is given one synthetic minutia template generated with a random number generator. Every minutia point from the real template is individually placed into the synthetic template, where its k-nearest neighbors are found. The verification template is constructed by combining an arbitrary set of the k-nearest neighbors. To validate the scheme, testing is carried out on three databases. The results show that the constructed templates satisfy the requirements of cancelable biometrics.
Tasks
Published	2016-08-29
URL	http://arxiv.org/abs/1608.07897v2
PDF	http://arxiv.org/pdf/1608.07897v2.pdf
PWC	https://paperswithcode.com/paper/using-k-nearest-neighbors-to-construct
Repo
Framework
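
The construction described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: it assumes each minutia is reduced to an (x, y) position, that "nearest" means Euclidean distance, and that the "arbitrary set" of neighbors is selected by rank. The function names and the `chosen_ranks` parameter are hypothetical.

```python
import math
import random


def synthetic_template(n_points, width, height, seed):
    """Generate a per-user synthetic minutia template from a seeded RNG.

    Assumption: minutiae are plain (x, y) positions; real templates also
    carry orientation and type, which this sketch ignores.
    """
    rng = random.Random(seed)
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_points)]


def k_nearest(point, template, k):
    """Return the k synthetic points closest to `point` (Euclidean distance)."""
    return sorted(template, key=lambda q: math.dist(point, q))[:k]


def cancelable_template(real_minutiae, synthetic, k, chosen_ranks):
    """Build a verification template from selected neighbors of each real minutia.

    `chosen_ranks` (hypothetical parameter) lists which of the k neighbors,
    by rank starting at 0, are kept for every real minutia.
    """
    result = []
    for minutia in real_minutiae:
        neighbors = k_nearest(minutia, synthetic, k)
        result.extend(neighbors[r] for r in chosen_ranks)
    return result
```

Under these assumptions, issuing a new synthetic template (a new seed) or a different `chosen_ranks` yields a different verification template from the same finger, which is the revocability property cancelable biometrics asks for.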

Hierarchical Clustering in Face Similarity Score Space


Title	Hierarchical Clustering in Face Similarity Score Space
Authors	Jason Grant, Patrick Flynn
Abstract	Similarity scores in face recognition represent the proximity between pairs of images as computed by a matching algorithm. Given a large set of images and the proximities between all pairs, a similarity score space is defined. Cluster analysis was applied to the similarity score space to develop various taxonomies. Given the number of subjects in the dataset, we used hierarchical methods to aggregate images of the same subject. We also explored the hierarchy above and below the subject level, including clusters that reflect gender and ethnicity. Evidence supports the existence of clustering by race, gender, subject, and illumination condition.
Tasks	Face Recognition
Published	2016-05-19
URL	http://arxiv.org/abs/1605.06052v1
PDF	http://arxiv.org/pdf/1605.06052v1.pdf
PWC	https://paperswithcode.com/paper/hierarchical-clustering-in-face-similarity
Repo
Framework
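
As a rough illustration of clustering in a similarity score space, the sketch below runs agglomerative clustering directly on a matrix of pairwise match scores. The abstract does not say which linkage the authors used; average linkage and the `agglomerate` function name are assumptions made here for the example.

```python
def agglomerate(similarity, n_clusters):
    """Average-linkage agglomerative clustering on a similarity matrix.

    `similarity[i][j]` is the match score between images i and j (higher
    means more alike). Merging stops when `n_clusters` groups remain,
    e.g. the known number of subjects.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    clusters = [[i] for i in range(len(similarity))]

    def average_similarity(a, b):
        total = sum(similarity[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None  # (score, index_a, index_b) of the most similar pair
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = average_similarity(clusters[x], clusters[y])
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    return [sorted(c) for c in clusters]
```

Stopping at a smaller or larger `n_clusters` corresponds to reading the hierarchy above or below the subject level.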

Network of Experts for Large-Scale Image Categorization


Title	Network of Experts for Large-Scale Image Categorization
Authors	Karim Ahmed, Mohammad Haris Baig, Lorenzo Torresani
Abstract	We present a tree-structured network architecture for large scale image classification. The trunk of the network contains convolutional layers optimized over all classes. At a given depth, the trunk splits into separate branches, each dedicated to discriminate a different subset of classes. Each branch acts as an expert classifying a set of categories that are difficult to tell apart, while the trunk provides common knowledge to all experts in the form of shared features. The training of our “network of experts” is completely end-to-end: the partition of categories into disjoint subsets is learned simultaneously with the parameters of the network trunk and the experts are trained jointly by minimizing a single learning objective over all classes. The proposed structure can be built from any existing convolutional neural network (CNN). We demonstrate its generality by adapting 4 popular CNNs for image categorization into the form of networks of experts. Our experiments on CIFAR100 and ImageNet show that in every case our method yields a substantial improvement in accuracy over the base CNN, and gives the best result achieved so far on CIFAR100. Finally, the improvement in accuracy comes at little additional cost: compared to the base network, the training time is only moderately increased and the number of parameters is comparable or in some cases even lower.
Tasks	Image Categorization, Image Classification
Published	2016-04-20
URL	http://arxiv.org/abs/1604.06119v3
PDF	http://arxiv.org/pdf/1604.06119v3.pdf
PWC	https://paperswithcode.com/paper/network-of-experts-for-large-scale-image
Repo
Framework

Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities


Title	Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities
Authors	Mohsen Ghafoorian, Nico Karssemeijer, Tom Heskes, Inge van Uden, Clara Sanchez, Geert Litjens, Frank-Erik de Leeuw, Bram van Ginneken, Elena Marchiori, Bram Platel
Abstract	The anatomical location of imaging features is of crucial importance for accurate diagnosis in many medical tasks. Convolutional neural networks (CNN) have had huge successes in computer vision, but they lack the natural ability to incorporate the anatomical location in their decision making process, hindering success in some medical image analysis tasks. In this paper, to integrate the anatomical location information into the network, we propose several deep CNN architectures that consider multi-scale patches or take explicit location features while training. We apply and compare the proposed architectures for segmentation of white matter hyperintensities in brain MR images on a large dataset. As a result, we observe that the CNNs that incorporate location information substantially outperform a conventional segmentation method with hand-crafted features as well as CNNs that do not integrate location information. On a test set of 46 scans, the best configuration of our networks obtained a Dice score of 0.791, compared to 0.797 for an independent human observer. Performance levels of the machine and the independent human observer were not statistically significantly different (p-value=0.17).
Tasks	Decision Making
Published	2016-10-16
URL	http://arxiv.org/abs/1610.04834v2
PDF	http://arxiv.org/pdf/1610.04834v2.pdf
PWC	https://paperswithcode.com/paper/location-sensitive-deep-convolutional-neural
Repo
Framework
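
For reference, the Dice score quoted in the abstract above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference segmentation. A minimal sketch on flattened binary masks follows; returning 1.0 when both masks are empty is a convention chosen for this example, not something stated in the source.

```python
def dice_score(predicted, reference):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")

    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    size_sum = sum(1 for p in predicted if p) + sum(1 for r in reference if r)

    if size_sum == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / size_sum
```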

Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models


Title	Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models
Authors	Mahdi Khademian, Mohammad Mehdi Homayounpour
Abstract	A Pascal challenge entitled monaural multi-talker speech recognition was developed, targeting the problem of making automatic speech recognition robust against speech-like noises, which significantly degrade the performance of automatic speech recognition systems. In this challenge, two competing speakers say a simple command simultaneously and the objective is to recognize the speech of the target speaker. Surprisingly, during the challenge a team from IBM Research achieved performance better than human listeners on this task. The IBM team's method consists of an intermediate speech separation step followed by single-talker speech recognition. This paper reconsiders the task of this challenge based on gain-adapted factorial speech processing models. It develops a joint-token passing algorithm for directly decoding the utterances of the target and masker speakers simultaneously. Compared to the challenge winner, it uses maximum uncertainty during decoding, which cannot be used in the earlier two-phase method. It provides a detailed derivation of inference on these models based on general inference procedures for probabilistic graphical models. As another improvement, it uses deep neural networks for joint speaker identification and gain estimation, which makes these two steps easier than before while producing competitive results for them. The proposed method outperforms past super-human results and even the results recently achieved by Microsoft Research using deep neural networks. It achieves a 5.5% absolute task performance improvement over the first super-human system and a 2.7% absolute improvement over its recent competitor.
Tasks	Speaker Identification, Speech Recognition, Speech Separation
Published	2016-10-05
URL	http://arxiv.org/abs/1610.01367v1
PDF	http://arxiv.org/pdf/1610.01367v1.pdf
PWC	https://paperswithcode.com/paper/monaural-multi-talker-speech-recognition
Repo
Framework

Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)


Title	Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)
Authors	Ilya Loshchilov, Tobias Glasmachers
Abstract	We propose a multi-objective optimization algorithm aimed at achieving good anytime performance over a wide range of problems. Performance is assessed in terms of the hypervolume metric. The algorithm called HMO-CMA-ES represents a hybrid of several old and new variants of CMA-ES, complemented by BOBYQA as a warm start. We benchmark HMO-CMA-ES on the recently introduced bi-objective problem suite of the COCO framework (COmparing Continuous Optimizers), consisting of 55 scalable continuous optimization problems, which is used by the Black-Box Optimization Benchmarking (BBOB) Workshop 2016.
Tasks
Published	2016-05-09
URL	http://arxiv.org/abs/1605.02720v1
PDF	http://arxiv.org/pdf/1605.02720v1.pdf
PWC	https://paperswithcode.com/paper/anytime-bi-objective-optimization-with-a
Repo
Framework
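
The hypervolume metric mentioned in the abstract above measures the region of objective space dominated by a solution set and bounded by a reference point. A minimal sketch for the bi-objective minimization case is given below; it computes the plain hypervolume and is not the normalized indicator used by the COCO framework. The function name and the choice of reference point are assumptions of this example.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both objectives minimized).

    Each point is a pair (f1, f2). Points that do not strictly improve on the
    reference point in both objectives contribute nothing.
    """
    candidates = sorted(
        p for p in points if p[0] < reference[0] and p[1] < reference[1]
    )

    area = 0.0
    best_f2 = reference[1]
    # Sweep by increasing f1; a point adds a new strip only if it lowers f2.
    for f1, f2 in candidates:
        if f2 < best_f2:
            area += (reference[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area
```

Anytime performance can then be tracked by recording this value after each function evaluation: a larger hypervolume at a given budget means a better approximation of the Pareto front.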

Look, Listen and Learn - A Multimodal LSTM for Speaker Identification


Title	Look, Listen and Learn - A Multimodal LSTM for Speaker Identification
Authors	Jimmy Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan
Abstract	Speaker identification refers to the task of localizing the face of a person who has the same identity as the ongoing voice in a video. This task requires not only collective perception over both visual and auditory signals but also robustness to severe quality degradations and unconstrained content variations. In this paper, we describe a novel multimodal Long Short-Term Memory (LSTM) architecture which seamlessly unifies both visual and auditory modalities from the beginning of each sequence input. The key idea is to extend the conventional LSTM by not only sharing weights across time steps, but also sharing weights across modalities. We show that modeling the temporal dependency across face and voice can significantly improve the robustness to content quality degradations and variations. We also found that our multimodal LSTM is robust to distractors, namely the non-speaking identities. We applied our multimodal LSTM to The Big Bang Theory dataset and showed that our system outperforms the state-of-the-art systems in speaker identification with lower false alarm rate and higher recognition accuracy.
Tasks	Speaker Identification
Published	2016-02-13
URL	http://arxiv.org/abs/1602.04364v1
PDF	http://arxiv.org/pdf/1602.04364v1.pdf
PWC	https://paperswithcode.com/paper/look-listen-and-learn-a-multimodal-lstm-for
Repo
Framework

Computational linking theory


Title	Computational linking theory
Authors	Aaron Steven White, Drew Reisinger, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
Abstract	A linking theory explains how verbs’ semantic arguments are mapped to their syntactic arguments—the inverse of the Semantic Role Labeling task from the shallow semantic parsing literature. In this paper, we develop the Computational Linking Theory framework as a method for implementing and testing linking theories proposed in the theoretical literature. We deploy this framework to assess two cross-cutting types of linking theory: local v. global models and categorical v. featural models. To further investigate the behavior of these models, we develop a measurement model in the spirit of previous work in semantic role induction: the Semantic Proto-Role Linking Model. We use this model, which implements a generalization of Dowty’s seminal Proto-Role Theory, to induce semantic proto-roles, which we compare to those Dowty proposes.
Tasks	Semantic Parsing, Semantic Role Labeling
Published	2016-10-08
URL	http://arxiv.org/abs/1610.02544v1
PDF	http://arxiv.org/pdf/1610.02544v1.pdf
PWC	https://paperswithcode.com/paper/computational-linking-theory
Repo
Framework

Engineering Deep Representations for Modeling Aesthetic Perception


Title	Engineering Deep Representations for Modeling Aesthetic Perception
Authors	Yanxiang Chen, Yuxing Hu, Luming Zhang, Ping Li, Chao Zhang
Abstract	Many aesthetic models in computer vision suffer from two shortcomings: 1) the low descriptiveness and interpretability of those hand-crafted aesthetic criteria (i.e., nonindicative of region-level aesthetics), and 2) the difficulty of engineering aesthetic features adaptively and automatically toward different image sets. To remedy these problems, we develop a deep architecture to learn aesthetically-relevant visual attributes from Flickr, which are localized by multiple textual attributes in a weakly-supervised setting. More specifically, using a bag-of-words (BoW) representation of the frequent Flickr image tags, a sparsity-constrained subspace algorithm discovers a compact set of textual attributes (e.g., landscape and sunset) for each image. Then, a weakly-supervised learning algorithm projects the textual attributes at image-level to the highly-responsive image patches at pixel-level. These patches indicate where humans look at appealing regions with respect to each textual attribute, which are employed to learn the visual attributes. Psychological and anatomical studies have shown that humans perceive visual concepts hierarchically. Hence, we normalize these patches and feed them into a five-layer convolutional neural network (CNN) to mimic the hierarchy in which humans perceive the visual attributes. We apply the learned deep features on image retargeting, aesthetics ranking, and retrieval. Both subjective and objective experimental results thoroughly demonstrate the competitiveness of our approach.
Tasks
Published	2016-05-25
URL	https://arxiv.org/abs/1605.07699v2
PDF	https://arxiv.org/pdf/1605.07699v2.pdf
PWC	https://paperswithcode.com/paper/describing-human-aesthetic-perception-by
Repo
Framework

A Deep Neural Network to identify foreshocks in real time


Title	A Deep Neural Network to identify foreshocks in real time
Authors	K. Vikraman
Abstract	Foreshock events provide valuable insight to predict imminent major earthquakes. However, it is difficult to identify them in real time. In this paper, I propose an algorithm based on deep learning to instantaneously classify a seismic waveform as a foreshock, mainshock or an aftershock event achieving a high accuracy of 99% in classification. As a result, this is by far the most reliable method to predict major earthquakes that are preceded by foreshocks. In addition, I discuss methods to create an earthquake dataset that is compatible with deep networks.
Tasks
Published	2016-11-26
URL	http://arxiv.org/abs/1611.08655v1
PDF	http://arxiv.org/pdf/1611.08655v1.pdf
PWC	https://paperswithcode.com/paper/a-deep-neural-network-to-identify-foreshocks
Repo
Framework

Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization


Title	Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization
Authors	Alexander G. Ororbia II, C. Lee Giles, Daniel Kifer
Abstract	Many previous proposals for adversarial training of deep neural nets have included directly modifying the gradient, training on a mix of original and adversarial examples, using contractive penalties, and approximately optimizing constrained adversarial objective functions. In this paper, we show these proposals are actually all instances of optimizing a general, regularized objective we call DataGrad. Our proposed DataGrad framework, which can be viewed as a deep extension of the layerwise contractive autoencoder penalty, cleanly simplifies prior work and easily allows extensions such as adversarial training with multi-task cues. In our experiments, we find that the deep gradient regularization of DataGrad (which also has L1 and L2 flavors of regularization) outperforms alternative forms of regularization, including classical L1, L2, and multi-task, both on the original dataset as well as on adversarial sets. Furthermore, we find that combining multi-task optimization with DataGrad adversarial training results in the most robust performance.
Tasks
Published	2016-01-26
URL	http://arxiv.org/abs/1601.07213v3
PDF	http://arxiv.org/pdf/1601.07213v3.pdf
PWC	https://paperswithcode.com/paper/unifying-adversarial-training-algorithms-with
Repo
Framework

Mammalian Value Systems


Title	Mammalian Value Systems
Authors	Gopal P. Sarma, Nick J. Hay
Abstract	Characterizing human values is a topic deeply interwoven with the sciences, humanities, art, and many other human endeavors. In recent years, a number of thinkers have argued that accelerating trends in computer science, cognitive science, and related disciplines foreshadow the creation of intelligent machines which meet and ultimately surpass the cognitive abilities of human beings, thereby entangling an understanding of human values with future technological development. Contemporary research accomplishments suggest sophisticated AI systems becoming widespread and responsible for managing many aspects of the modern world, from preemptively planning users’ travel schedules and logistics, to fully autonomous vehicles, to domestic robots assisting in daily living. The extrapolation of these trends has been most forcefully described in the context of a hypothetical “intelligence explosion,” in which the capabilities of an intelligent software agent would rapidly increase due to the presence of feedback loops unavailable to biological organisms. The possibility of superintelligent agents, or simply the widespread deployment of sophisticated, autonomous AI systems, highlights an important theoretical problem: the need to separate the cognitive and rational capacities of an agent from the fundamental goal structure, or value system, which constrains and guides the agent’s actions. The “value alignment problem” is to specify a goal structure for autonomous agents compatible with human values. In this brief article, we suggest that recent ideas from affective neuroscience and related disciplines aimed at characterizing neurological and behavioral universals in the mammalian class provide important conceptual foundations relevant to describing human values. We argue that the notion of “mammalian value systems” points to a potential avenue for fundamental research in AI safety and AI ethics.
Tasks	Autonomous Vehicles
Published	2016-07-28
URL	http://arxiv.org/abs/1607.08289v4
PDF	http://arxiv.org/pdf/1607.08289v4.pdf
PWC	https://paperswithcode.com/paper/mammalian-value-systems
Repo
Framework

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms


Title	Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms
Authors	Mathieu Blondel, Masakazu Ishihata, Akinori Fujino, Naonori Ueda
Abstract	Polynomial networks and factorization machines are two recently-proposed models that can efficiently use feature interactions in classification and regression tasks. In this paper, we revisit both models from a unified perspective. Based on this new view, we study the properties of both models and propose new efficient training algorithms. Key to our approach is to cast parameter learning as a low-rank symmetric tensor estimation problem, which we solve by multi-convex optimization. We demonstrate our approach on regression and recommender system tasks.
Tasks	Recommendation Systems
Published	2016-07-29
URL	http://arxiv.org/abs/1607.08810v1
PDF	http://arxiv.org/pdf/1607.08810v1.pdf
PWC	https://paperswithcode.com/paper/polynomial-networks-and-factorization
Repo
Framework
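
To make "efficiently use feature interactions" concrete, the sketch below evaluates a standard second-order factorization machine, y = w0 + Σ w_i x_i + Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j, using the well-known reformulation that avoids the explicit double loop over feature pairs. It shows prediction only; the training algorithms proposed in the paper (low-rank symmetric tensor estimation) are not reproduced here, and the function names are illustrative.

```python
def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction in O(n * k) time.

    x  : feature vector of length n
    w0 : global bias
    w  : linear weights, length n
    V  : n factor vectors, each of length k
    """
    n_factors = len(V[0]) if V else 0
    linear = sum(wi * xi for wi, xi in zip(w, x))

    interactions = 0.0
    for f in range(n_factors):
        weighted_sum = sum(V[i][f] * x[i] for i in range(len(x)))
        sum_of_squares = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        # (sum)^2 - sum of squares equals twice the sum over pairs i < j.
        interactions += weighted_sum ** 2 - sum_of_squares

    return w0 + linear + 0.5 * interactions


def fm_predict_naive(x, w0, w, V):
    """Reference version with the explicit pairwise loop, O(n^2 * k)."""
    total = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total
```

The two functions should agree up to floating-point rounding; the fast form is what makes the model practical on high-dimensional sparse inputs such as recommender system data.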

Functional archetype and archetypoid analysis


Title	Functional archetype and archetypoid analysis
Authors	Irene Epifanio
Abstract	Archetype and archetypoid analysis can be extended to functional data. Each function is represented as a mixture of actual observations (functional archetypoids) or functional archetypes, which are a mixture of observations in the data set. Well-known Canadian temperature data are used to illustrate the analysis developed. Computational methods are proposed for performing these analyses, based on the coefficients of a basis. Unlike a previous attempt to compute functional archetypes, which was only valid for an orthogonal basis, the proposed methodology can be used for any basis. It is computationally less demanding than the simple approach of discretizing the functions. Multivariate functional archetype and archetypoid analysis are also introduced and applied in an interesting problem about the study of human development around the world over the last 50 years. These tools can contribute to the understanding of a functional data set, as in the multivariate case.
Tasks
Published	2016-01-26
URL	http://arxiv.org/abs/1601.06911v2
PDF	http://arxiv.org/pdf/1601.06911v2.pdf
PWC	https://paperswithcode.com/paper/functional-archetype-and-archetypoid-analysis
Repo
Framework

QBF Solving by Counterexample-guided Expansion


Title	QBF Solving by Counterexample-guided Expansion
Authors	Roderick Bloem, Nicolas Braud-Santoni, Vedad Hadzic
Abstract	We introduce a novel generalization of Counterexample-Guided Inductive Synthesis (CEGIS) and instantiate it to yield a novel, competitive algorithm for solving Quantified Boolean Formulas (QBF). Current QBF solvers based on counterexample-guided expansion use a recursive approach which scales poorly with the number of quantifier alternations. Our generalization of CEGIS removes the need for this recursive approach, and we instantiate it to yield a simple and efficient algorithm for QBF solving. Lastly, this research is supported by a competitive, though straightforward, implementation of the algorithm, making it possible to study the practical impact of our algorithm design decisions, along with various optimizations.
Tasks
Published	2016-11-04
URL	http://arxiv.org/abs/1611.01553v4
PDF	http://arxiv.org/pdf/1611.01553v4.pdf
PWC	https://paperswithcode.com/paper/qbf-solving-by-counterexample-guided
Repo
Framework