July 27, 2019

Paper Group ANR 529

Gaussian Processes with Context-Supported Priors for Active Object Localization. Collaborative Deep Reinforcement Learning for Joint Object Search. Large-scale Feature Selection of Risk Genetic Factors for Alzheimer’s Disease via Distributed Group Lasso Regression. RaspiReader: An Open Source Fingerprint Reader Facilitating Spoof Detection. Non-Con …

Gaussian Processes with Context-Supported Priors for Active Object Localization

Title Gaussian Processes with Context-Supported Priors for Active Object Localization
Authors Anthony D. Rhodes, Jordan Witte, Melanie Mitchell, Bruno Jedynak
Abstract We devise an algorithm using a Bayesian optimization framework in conjunction with contextual visual data for the efficient localization of objects in still images. Recent research has demonstrated substantial progress in object localization and related tasks for computer vision. However, many current state-of-the-art object localization procedures still suffer from inaccuracy and inefficiency, in addition to failing to provide a principled and interpretable system amenable to high-level vision tasks. We address these issues with the current research. Our method encompasses an active search procedure that uses contextual data to generate initial bounding-box proposals for a target object. We train a convolutional neural network to approximate an offset distance from the target object. Next, we use a Gaussian Process to model this offset response signal over the search space of the target. We then employ a Bayesian active search for accurate localization of the target. In experiments, we compare our approach to a state-of-the-art bounding-box regression method for a challenging pedestrian localization task. Our method exhibits a substantial improvement over this baseline regression method.
Tasks Active Object Localization, Gaussian Processes, Object Localization
Published 2017-03-25
URL http://arxiv.org/abs/1703.08653v3
PDF http://arxiv.org/pdf/1703.08653v3.pdf
PWC https://paperswithcode.com/paper/gaussian-processes-with-context-supported
Repo
Framework

Collaborative Deep Reinforcement Learning for Joint Object Search

Title Collaborative Deep Reinforcement Learning for Joint Object Search
Authors Xiangyu Kong, Bo Xin, Yizhou Wang, Gang Hua
Abstract We examine the problem of joint top-down active search of multiple objects under interaction, e.g., person riding a bicycle, cups held by the table, etc. Such objects under interaction often can provide contextual cues to each other to facilitate more efficient search. By treating each detector as an agent, we present the first collaborative multi-agent deep reinforcement learning algorithm to learn the optimal policy for joint active object localization, which effectively exploits such beneficial contextual information. We learn inter-agent communication through cross connections with gates between the Q-networks, which is facilitated by a novel multi-agent deep Q-learning algorithm with joint exploitation sampling. We verify our proposed method on multiple object detection benchmarks. Not only does our model help to improve the performance of state-of-the-art active localization models, it also reveals interesting co-detection patterns that are intuitively interpretable.
Tasks Active Object Localization, Object Detection, Object Localization, Q-Learning
Published 2017-02-18
URL http://arxiv.org/abs/1702.05573v1
PDF http://arxiv.org/pdf/1702.05573v1.pdf
PWC https://paperswithcode.com/paper/collaborative-deep-reinforcement-learning-for-1
Repo
Framework

Large-scale Feature Selection of Risk Genetic Factors for Alzheimer’s Disease via Distributed Group Lasso Regression

Title Large-scale Feature Selection of Risk Genetic Factors for Alzheimer’s Disease via Distributed Group Lasso Regression
Authors Qingyang Li, Dajiang Zhu, Jie Zhang, Derrek Paul Hibar, Neda Jahanshad, Yalin Wang, Jieping Ye, Paul M. Thompson, Jie Wang
Abstract Genome-wide association studies (GWAS) have achieved great success in the genetic study of Alzheimer’s disease (AD). Collaborative imaging genetics studies across different research institutions show the effectiveness of detecting genetic risk factors. However, the high dimensionality of GWAS data poses significant challenges in detecting risk SNPs for AD. Selecting relevant features is crucial in predicting the response variable. In this study, we propose a novel Distributed Feature Selection Framework (DFSF) to conduct large-scale imaging genetics studies across multiple institutions. To speed up the learning process, we propose a family of distributed group Lasso screening rules to identify irrelevant features and remove them from the optimization. Then we select the relevant group features by performing the group Lasso feature selection process over a sequence of parameters. Finally, we employ stability selection to rank the top risk SNPs that might help detect the early stage of AD. To the best of our knowledge, this is the first distributed feature selection model that integrates group Lasso feature selection and detects genetic risk factors across a system of multiple research institutions. Empirical studies are conducted on 809 subjects with 5.9 million SNPs which are distributed across several individual institutions, demonstrating the efficiency and effectiveness of the proposed method.
Tasks Feature Selection
Published 2017-04-27
URL http://arxiv.org/abs/1704.08383v1
PDF http://arxiv.org/pdf/1704.08383v1.pdf
PWC https://paperswithcode.com/paper/large-scale-feature-selection-of-risk-genetic
Repo
Framework
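
The group Lasso selection step mentioned in the abstract above can be illustrated with the standard block soft-thresholding operator, which is the proximal step for the group Lasso penalty. This is a minimal sketch and not the paper's distributed screening rules: the function name, the toy weights, the grouping, and the value of `lam` are all assumptions made for illustration.

```python
import math

def group_soft_threshold(weights, groups, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 (block soft-thresholding).

    weights: list of floats; groups: list of index lists (hypothetical grouping).
    """
    out = list(weights)
    for idx in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in idx))
        # A group whose Euclidean norm is at most lam is zeroed out as a whole,
        # which is how the penalty discards entire feature groups.
        scale = 0.0 if norm <= lam else 1.0 - lam / norm
        for i in idx:
            out[i] = scale * weights[i]
    return out

# Toy example: the first group survives (shrunk), the second is removed.
w = [3.0, 4.0, 0.1, -0.2]
groups = [[0, 1], [2, 3]]
shrunk = group_soft_threshold(w, groups, lam=1.0)
```

Because whole groups are set to zero together, the surviving groups are the selected features for that value of `lam`; sweeping `lam` over a sequence of values yields a path of selected groups.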

RaspiReader: An Open Source Fingerprint Reader Facilitating Spoof Detection

Title RaspiReader: An Open Source Fingerprint Reader Facilitating Spoof Detection
Authors Joshua J. Engelsma, Kai Cao, Anil K. Jain
Abstract We present the design and prototype of an open source, optical fingerprint reader, called RaspiReader, using ubiquitous components. RaspiReader, a low-cost and easy to assemble reader, provides the fingerprint research community a seamless and simple method for gaining more control over the sensing component of fingerprint recognition systems. In particular, we posit that this versatile fingerprint reader will encourage researchers to explore novel spoof detection methods that integrate both hardware and software. RaspiReader’s hardware is customized with two cameras for fingerprint acquisition with one camera providing high contrast, frustrated total internal reflection (FTIR) images, and the other camera outputting direct images. Using both of these image streams, we extract complementary information which, when fused together, results in highly discriminative features for fingerprint spoof (presentation attack) detection. Our experimental results demonstrate a marked improvement over previous spoof detection methods which rely only on FTIR images provided by COTS optical readers. Finally, fingerprint matching experiments between images acquired from the FTIR output of the RaspiReader and images acquired from a COTS fingerprint reader verify the interoperability of the RaspiReader with existing COTS optical readers.
Tasks
Published 2017-08-25
URL http://arxiv.org/abs/1708.07887v1
PDF http://arxiv.org/pdf/1708.07887v1.pdf
PWC https://paperswithcode.com/paper/raspireader-an-open-source-fingerprint-reader
Repo
Framework

Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark

Title Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark
Authors N. Dianna Radpour, Vinay Ashokkumar
Abstract One of the most crucial components of natural human-robot interaction is artificial intuition and its influence on dialog systems. The intuitive capability that humans have is undeniably extraordinary, and so remains one of the greatest challenges for natural communicative dialogue between humans and robots. In this paper, we introduce a novel probabilistic modeling framework for identifying, classifying and learning features of sarcastic text by training a neural network with human-informed sarcastic benchmarks. This is necessary for establishing a comprehensive sentiment analysis schema that is sensitive to the nuances of sarcasm-ridden text by being trained on linguistic cues. We show that our model provides a good fit for this type of real-world informed data, with the potential to be as accurate as alternatives, if not more so. Though the implementation and benchmarking is an extensive task, it can be extended via the same method that we present to capture different forms of nuance in communication and make for much more natural and engaging dialogue systems.
Tasks Sentiment Analysis
Published 2017-11-20
URL http://arxiv.org/abs/1711.07404v1
PDF http://arxiv.org/pdf/1711.07404v1.pdf
PWC https://paperswithcode.com/paper/non-contextual-modeling-of-sarcasm-using-a
Repo
Framework

Use Generalized Representations, But Do Not Forget Surface Features

Title Use Generalized Representations, But Do Not Forget Surface Features
Authors Nafise Sadat Moosavi, Michael Strube
Abstract Only a year ago, all state-of-the-art coreference resolvers were using an extensive amount of surface features. Recently, there was a paradigm shift towards using word embeddings and deep neural networks, where the use of surface features is very limited. In this paper, we show that a simple SVM model with surface features outperforms more complex neural models for detecting anaphoric mentions. Our analysis suggests that generalized representations and surface features have different strengths that should both be taken into account for improving coreference resolution.
Tasks Coreference Resolution, Word Embeddings
Published 2017-02-24
URL http://arxiv.org/abs/1702.07507v1
PDF http://arxiv.org/pdf/1702.07507v1.pdf
PWC https://paperswithcode.com/paper/use-generalized-representations-but-do-not
Repo
Framework

Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies

Title Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies
Authors Robert DiPietro, Christian Rupprecht, Nassir Navab, Gregory D. Hager
Abstract Recurrent neural networks (RNNs) have achieved state-of-the-art performance on many diverse tasks, from machine translation to surgical activity recognition, yet training RNNs to capture long-term dependencies remains difficult. To date, the vast majority of successful RNN architectures alleviate this problem using nearly-additive connections between states, as introduced by long short-term memory (LSTM). We take an orthogonal approach and introduce MIST RNNs, a NARX RNN architecture that allows direct connections from the very distant past. We show that MIST RNNs 1) exhibit superior vanishing-gradient properties in comparison to LSTM and previously-proposed NARX RNNs; 2) are far more efficient than previously-proposed NARX RNN architectures, requiring even fewer computations than LSTM; and 3) improve performance substantially over LSTM and Clockwork RNNs on tasks requiring very long-term dependencies.
Tasks Activity Recognition, Machine Translation
Published 2017-02-24
URL http://arxiv.org/abs/1702.07805v4
PDF http://arxiv.org/pdf/1702.07805v4.pdf
PWC https://paperswithcode.com/paper/analyzing-and-exploiting-narx-recurrent
Repo
Framework
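
The idea of "direct connections from the very distant past" in the abstract above can be sketched with a generic scalar NARX-style recurrence, in which the new state depends on several delayed states rather than only the previous one. This is not the MIST RNN architecture itself: the delays `(1, 4)`, the weights, and the function name are hypothetical values chosen for illustration.

```python
import math

def narx_rnn(inputs, delays=(1, 4), w_x=0.5, u=None):
    """Scalar NARX-style recurrence: h_t = tanh(w_x * x_t + sum_d u[d] * h_{t-d}).

    delays, w_x, and u are illustrative assumptions, not values from the paper.
    """
    u = u or {d: 0.5 for d in delays}
    states = []
    for t, x in enumerate(inputs):
        total = w_x * x
        for d in delays:
            if t - d >= 0:
                # Direct connection from the state d steps back.
                total += u[d] * states[t - d]
        states.append(math.tanh(total))
    return states

# An impulse at t=0 reaches t=4 through the delay-4 connection in one hop.
states = narx_rnn([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

With a delay of 4, information from step 0 reaches step 4 through a single nonlinearity instead of four, which is the intuition behind shorter gradient paths in NARX-style networks.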

Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle

Title Assessing Information Transmission in Data Transformations with the Channel Multivariate Entropy Triangle
Authors Francisco J. Valverde-Albacete, Carmen Peláez-Moreno
Abstract Data transformation, e.g. feature transformation and selection, is an integral part of any machine learning procedure. In this paper we introduce an information-theoretic model and tools to assess the quality of data transformations in machine learning tasks. In an unsupervised fashion, we analyze the transfer of information of the transformation of a discrete, multivariate source of information X into a discrete, multivariate sink of information Y related by a distribution $P_{XY}$. The first contribution is a decomposition of the maximal potential entropy of (X, Y) that we call a balance equation, into its a) non-transferable, b) transferable but not transferred and c) transferred parts. Such balance equations can be represented in (de Finetti) entropy diagrams, our second set of contributions. The most important of these, the aggregate Channel Multivariate Entropy Triangle, is a visual exploratory tool to assess the effectiveness of multivariate data transformations in transferring information from input to output variables. We also show how these decompositions and balance equations also apply to the entropies of X and Y respectively and generate entropy triangles for them. As an example, we present the application of these tools to the assessment of information transfer efficiency for PCA and ICA as unsupervised feature transformation and selection procedures in supervised classification tasks.
Tasks
Published 2017-11-30
URL http://arxiv.org/abs/1711.11510v2
PDF http://arxiv.org/pdf/1711.11510v2.pdf
PWC https://paperswithcode.com/paper/assessing-information-transmission-in-data
Repo
Framework
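
A balance equation of the kind described in the abstract above can be checked numerically in the simplest case of one discrete X and one discrete Y. The sketch below assumes that the three parts correspond to standard quantities: the divergence of the marginals from uniformity (non-transferable), the variation of information (transferable but not transferred), and twice the mutual information (transferred). That mapping, the function names, and the toy joint distribution are assumptions for illustration; the multivariate formulation in the paper may differ.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def balance(joint):
    """joint[i][j] = P(X=i, Y=j). Returns (delta_h, two_mi, vi, h_uniform)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    hx, hy = entropy(px), entropy(py)
    hxy = entropy([p for row in joint for p in row])
    mi = hx + hy - hxy                      # mutual information I(X;Y)
    vi = hxy - mi                           # H(X|Y) + H(Y|X)
    h_uniform = math.log2(len(px)) + math.log2(len(py))
    delta_h = h_uniform - hx - hy           # loss from non-uniform marginals
    return delta_h, 2 * mi, vi, h_uniform

# Toy joint distribution with uniform marginals and partial dependence.
delta_h, two_mi, vi, h_uniform = balance([[0.4, 0.1], [0.1, 0.4]])
```

The three parts always sum to `h_uniform`, so dividing each by `h_uniform` gives three non-negative fractions that add to one, which is what allows a point to be plotted in a triangular (de Finetti) diagram.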

Multi-scale Online Learning and its Applications to Online Auctions

Title Multi-scale Online Learning and its Applications to Online Auctions
Authors Sébastien Bubeck, Nikhil R. Devanur, Zhiyi Huang, Rad Niazadeh
Abstract We consider revenue maximization in online auction/pricing problems. A seller sells an identical item in each period to a new buyer, or a new set of buyers. For the online posted pricing problem, we show regret bounds that scale with the best fixed price, rather than the range of the values. We also show regret bounds that are almost scale free, and match the offline sample complexity, when comparing to a benchmark that requires a lower bound on the market share. These results are obtained by generalizing the classical learning from experts and multi-armed bandit problems to their multi-scale versions. In this version, the reward of each action is in a different range, and the regret w.r.t. a given action scales with its own range, rather than the maximum range.
Tasks
Published 2017-05-26
URL http://arxiv.org/abs/1705.09700v2
PDF http://arxiv.org/pdf/1705.09700v2.pdf
PWC https://paperswithcode.com/paper/multi-scale-online-learning-and-its
Repo
Framework

On the Capacity of Face Representation

Title On the Capacity of Face Representation
Authors Sixue Gong, Vishnu Naresh Boddeti, Anil K. Jain
Abstract In this paper we address the following question: given a face representation, how many identities can it resolve? In other words, what is the capacity of the face representation? A scientific basis for estimating the capacity of a given face representation will not only benefit the evaluation and comparison of different representation methods, but will also establish an upper bound on the scalability of an automatic face recognition system. We cast the face capacity problem in terms of packing bounds on a low-dimensional manifold embedded within a deep representation space. By explicitly accounting for the manifold structure of the representation as well as two different sources of representational noise, epistemic (model) uncertainty and aleatoric (data) variability, our approach is able to estimate the capacity of a given face representation. To demonstrate the efficacy of our approach, we estimate the capacity of two deep neural network based face representations, namely 128-dimensional FaceNet and 512-dimensional SphereFace. Numerical experiments on unconstrained faces (IJB-C) provide a capacity upper bound of $2.7\times10^4$ for FaceNet and $8.4\times10^4$ for SphereFace representation at a false acceptance rate (FAR) of 1%. As expected, capacity reduces drastically at lower FARs. The capacity at FAR of 0.1% and 0.001% is $2.2\times10^3$ and $1.6\times10^{1}$, respectively for FaceNet and $3.6\times10^3$ and $6.0\times10^0$, respectively for SphereFace.
Tasks Face Recognition
Published 2017-09-29
URL http://arxiv.org/abs/1709.10433v3
PDF http://arxiv.org/pdf/1709.10433v3.pdf
PWC https://paperswithcode.com/paper/on-the-capacity-of-face-representation
Repo
Framework

Deep Hashing with Triplet Quantization Loss

Title Deep Hashing with Triplet Quantization Loss
Authors Yuefu Zhou, Shanshan Huang, Ya Zhang, Yanfeng Wang
Abstract With the explosive growth of image databases, deep hashing, which learns compact binary descriptors for images, has become critical for fast image retrieval. Many existing deep hashing methods leverage quantization loss, defined as the distance between the features before and after quantization, to reduce the error from binarizing features. While minimizing the quantization loss guarantees that quantization has minimal effect on retrieval accuracy, it unfortunately significantly reduces the expressiveness of features even before the quantization. In this paper, we show that the above definition of quantization loss is too restrictive and in fact not necessary for maintaining high retrieval accuracy. We therefore propose a new form of quantization loss measured in triplets. The core idea of the triplet quantization loss is to learn discriminative real-valued descriptors which lead to minimal loss of retrieval accuracy after quantization. Extensive experiments on two widely used benchmark data sets of different scales, CIFAR-10 and In-shop, demonstrate that the proposed method outperforms the state-of-the-art deep hashing methods. Moreover, we show that the compact binary descriptors obtained with triplet quantization loss lead to a very small performance drop after quantization.
Tasks Image Retrieval, Quantization
Published 2017-10-31
URL http://arxiv.org/abs/1710.11445v1
PDF http://arxiv.org/pdf/1710.11445v1.pdf
PWC https://paperswithcode.com/paper/deep-hashing-with-triplet-quantization-loss
Repo
Framework
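
The pointwise quantization loss described in the abstract above (distance between features before and after quantization) can be contrasted with a triplet-level check on toy data. The abstract does not give the formula of the proposed triplet quantization loss, so the sketch below only shows the conventional loss and a ranking test; sign-based binarization, the squared Euclidean distance, and all feature values are assumptions for illustration.

```python
def quantize(features):
    """Binarize real-valued features to +1/-1 by sign (assumed quantizer)."""
    return [1.0 if v >= 0 else -1.0 for v in features]

def quantization_loss(features):
    """Squared distance between features before and after quantization."""
    return sum((v - b) ** 2 for v, b in zip(features, quantize(features)))

def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def triplet_ranking_preserved(anchor, positive, negative):
    """True if, after quantization, the positive is closer than the negative."""
    qa, qp, qn = quantize(anchor), quantize(positive), quantize(negative)
    return hamming(qa, qp) < hamming(qa, qn)

# Features far from +/-1 have a large pointwise loss...
anchor = [0.1, 0.2, -0.1]
positive = [0.3, 0.1, -0.2]
negative = [-0.2, -0.3, 0.4]
loss = quantization_loss(anchor)
# ...yet the retrieval ranking of the triplet survives binarization.
preserved = triplet_ranking_preserved(anchor, positive, negative)
```

In this toy case the pointwise loss is large while the triplet ordering is intact, which matches the abstract's argument that a small pointwise quantization loss is not necessary for retrieval accuracy.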

A Syntactic Approach to Domain-Specific Automatic Question Generation

Title A Syntactic Approach to Domain-Specific Automatic Question Generation
Authors Guy Danon, Mark Last
Abstract Factoid questions are questions that require short fact-based answers. Automatic generation (AQG) of factoid questions from a given text can contribute to educational activities, interactive question answering systems, search engines, and other applications. The goal of our research is to generate factoid source-question-answer triplets based on a specific domain. We propose a four-component pipeline, which obtains as input a training corpus of domain-specific documents, along with a set of declarative sentences from the same domain, and generates as output a set of factoid questions that refer to the source sentences but are slightly different from them, so that a question-answering system or a person can be asked a question that requires a deeper understanding and knowledge than a simple word-matching. Contrary to existing domain-specific AQG systems that utilize the template-based approach to question generation, we propose to transform each source sentence into a set of questions by applying a series of domain-independent rules (a syntactic-based approach). Our pipeline was evaluated in the domain of cyber security using a series of experiments on each component of the pipeline separately and on the end-to-end system. The proposed approach generated a higher percentage of acceptable questions than a prior state-of-the-art AQG system.
Tasks Question Answering, Question Generation
Published 2017-12-28
URL http://arxiv.org/abs/1712.09827v1
PDF http://arxiv.org/pdf/1712.09827v1.pdf
PWC https://paperswithcode.com/paper/a-syntactic-approach-to-domain-specific
Repo
Framework

Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images

Title Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images
Authors Christoph Wick, Frank Puppe
Abstract We propose a high-performance fully convolutional neural network (FCN) for historical document segmentation that is designed to process a single page in one step. The advantage of this model besides its speed is its ability to learn directly from raw pixels instead of using preprocessing steps such as feature computation or superpixel generation. We show that this network yields better results than existing methods on different public data sets. For evaluation of this model we introduce a novel metric that is independent of ambiguous ground truth called Foreground Pixel Accuracy (FgPA). This pixel-based measure counts only foreground pixels in the binarized page; background pixels are omitted. The major advantage of this metric is that it enables researchers to compare different segmentation methods on their ability to successfully segment text or pictures and not on their ability to learn and possibly overfit the peculiarities of an ambiguous hand-made ground truth segmentation.
Tasks
Published 2017-11-21
URL http://arxiv.org/abs/1711.07695v2
PDF http://arxiv.org/pdf/1711.07695v2.pdf
PWC https://paperswithcode.com/paper/fully-convolutional-neural-networks-for-page
Repo
Framework
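
The Foreground Pixel Accuracy described in the abstract above can be sketched as a pixel accuracy restricted to the foreground of the binarized page. The function name, the nested-list image representation, and the label strings are assumptions for illustration; the paper's exact implementation may differ.

```python
def foreground_pixel_accuracy(pred, truth, binarized):
    """Fraction of foreground pixels whose predicted label matches the truth.

    pred, truth: 2D lists of class labels; binarized: 2D list where a truthy
    value marks a foreground (ink) pixel. Background pixels are ignored.
    """
    correct = total = 0
    for p_row, t_row, b_row in zip(pred, truth, binarized):
        for p, t, fg in zip(p_row, t_row, b_row):
            if fg:
                total += 1
                correct += (p == t)
    return correct / total if total else 0.0

# Toy 2x3 page: four foreground pixels, one of them mislabeled.
binarized = [[1, 0, 1], [0, 1, 1]]
truth = [["text", "bg", "text"], ["bg", "image", "image"]]
pred = [["text", "text", "image"], ["bg", "image", "image"]]
fgpa = foreground_pixel_accuracy(pred, truth, binarized)
```

The background pixel at row 0, column 1 is mislabeled in `pred` but does not affect the score, which shows how the metric ignores disagreements about where a region's boundary falls on empty paper.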

Using objective words in the reviews to improve the colloquial arabic sentiment analysis

Title Using objective words in the reviews to improve the colloquial arabic sentiment analysis
Authors Omar Al-Harbi
Abstract One of the main difficulties in sentiment analysis of the Arabic language is the presence of colloquialisms. In this paper, we examine the effect of using objective words in conjunction with sentimental words on sentiment classification for colloquial Arabic reviews, specifically Jordanian colloquial reviews. Reviews often include both sentimental and objective words; however, most existing sentiment analysis models ignore the objective words because they are considered useless. In this work, we created two lexicons: the first includes the colloquial sentimental words and compound phrases, while the other contains the objective words associated with values of sentiment tendency based on a particular estimation method. We used these lexicons to extract sentiment features that serve as training input to Support Vector Machines (SVM) to classify the sentiment polarity of the reviews. The reviews dataset has been collected manually from the JEERAN website. The results of the experiments show that the proposed approach improves the polarity classification in comparison to two baseline models, reaching an accuracy of 95.6%.
Tasks Arabic Sentiment Analysis, Sentiment Analysis
Published 2017-09-25
URL http://arxiv.org/abs/1709.08521v1
PDF http://arxiv.org/pdf/1709.08521v1.pdf
PWC https://paperswithcode.com/paper/using-objective-words-in-the-reviews-to
Repo
Framework

OhioState at IJCNLP-2017 Task 4: Exploring Neural Architectures for Multilingual Customer Feedback Analysis

Title OhioState at IJCNLP-2017 Task 4: Exploring Neural Architectures for Multilingual Customer Feedback Analysis
Authors Dushyanta Dhyani
Abstract This paper describes our systems for IJCNLP 2017 Shared Task on Customer Feedback Analysis. We experimented with simple neural architectures that gave competitive performance on certain tasks. This includes shallow CNN and Bi-Directional LSTM architectures with Facebook’s Fasttext as a baseline model. Our best performing model was in the Top 5 systems using the Exact-Accuracy and Micro-Average-F1 metrics for the Spanish (85.28% for both) and French (70% and 73.17% respectively) task, and outperformed all the other models on comment (87.28%) and meaningless (51.85%) tags using Micro Average F1 by Tags metric for the French task.
Tasks
Published 2017-10-18
URL http://arxiv.org/abs/1710.06931v2
PDF http://arxiv.org/pdf/1710.06931v2.pdf
PWC https://paperswithcode.com/paper/ohiostate-at-ijcnlp-2017-task-4-exploring
Repo
Framework
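
The two metrics named in the abstract above can be sketched for multi-label tag predictions. The definitions here are the common ones (exact match of the whole tag set, and F1 computed from pooled true positives, false positives, and false negatives); whether the shared task used exactly these definitions is an assumption, and the example tag sets are made up for illustration.

```python
def exact_accuracy(gold, pred):
    """Fraction of items whose predicted tag set equals the gold tag set."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def micro_f1(gold, pred):
    """F1 over tag decisions pooled across all items (micro-averaging)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Illustrative tag sets for three feedback items (not data from the task).
gold = [{"comment"}, {"complaint", "request"}, {"meaningless"}]
pred = [{"comment"}, {"complaint"}, {"comment"}]
acc = exact_accuracy(gold, pred)
f1 = micro_f1(gold, pred)
```

The second item is only partly correct, so it counts as a miss for exact accuracy but still contributes a true positive to micro-averaged F1, which is why the two scores can differ as they do for the French task.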