April 2, 2020

Paper Group ANR 216

TREC CAsT 2019: The Conversational Assistance Track Overview. Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension. Incorporating BERT into Neural Machine Translation. Demographic Bias in Presentation Attack Detection of Iris Recognition Systems. Robust Iris Presentation Attack Detection Fusing 2D and 3D Information. …

TREC CAsT 2019: The Conversational Assistance Track Overview

Title TREC CAsT 2019: The Conversational Assistance Track Overview
Authors Jeffrey Dalton, Chenyan Xiong, Jamie Callan
Abstract The Conversational Assistance Track (CAsT) is a new track for TREC 2019 to facilitate Conversational Information Seeking (CIS) research and to create a large-scale reusable test collection for conversational search systems. The document corpus is 38,426,252 passages from the TREC Complex Answer Retrieval (CAR) and Microsoft MAchine Reading COmprehension (MARCO) datasets. Eighty information seeking dialogues (30 train, 50 test) are an average of 9 to 10 questions long. Relevance assessments are provided for 30 training topics and 20 test topics. This year 21 groups submitted a total of 65 runs using varying methods for conversational query understanding and ranking. Methods include traditional retrieval based methods, feature based learning-to-rank, neural models, and knowledge enhanced methods. A common theme through the runs is the use of BERT-based neural reranking methods. Leading methods also employed document expansion, conversational query expansion, and generative language models for conversational query rewriting (GPT-2). The results show a gap between automatic systems and those using the manually resolved utterances, with a 35% relative improvement of manual rewrites over the best automatic system.
Tasks Learning-To-Rank, Machine Reading Comprehension, Reading Comprehension
Published 2020-03-30
URL https://arxiv.org/abs/2003.13624v1
PDF https://arxiv.org/pdf/2003.13624v1.pdf
PWC https://paperswithcode.com/paper/trec-cast-2019-the-conversational-assistance

Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension

Title Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
Authors Hui Wan
Abstract Multiple-choice Machine Reading Comprehension (MRC) is an important and challenging Natural Language Understanding (NLU) task, in which a machine must choose the answer to a question from a set of choices, with the question placed in the context of text passages or dialog. In the last couple of years the NLU field has been revolutionized by the advent of models based on the Transformer architecture, which are pretrained on massive amounts of unsupervised data and then fine-tuned for various supervised learning NLU tasks. Transformer models have come to dominate a wide variety of leaderboards in the NLU field; in the area of MRC, the current state-of-the-art model on the DREAM dataset (see [Sun et al., 2019]) fine-tunes ALBERT, a large pretrained Transformer-based model, and additionally combines it with an extra layer of multi-head attention between context and question-answer [Zhu et al., 2020]. The purpose of this note is to document a new state-of-the-art result on the DREAM task, which is accomplished by additionally performing multi-task learning on two multi-choice MRC tasks (RACE and DREAM).
Tasks Machine Reading Comprehension, Multi-Task Learning, Reading Comprehension
Published 2020-02-26
URL https://arxiv.org/abs/2003.04992v1
PDF https://arxiv.org/pdf/2003.04992v1.pdf
PWC https://paperswithcode.com/paper/multi-task-learning-with-multi-head-attention

Incorporating BERT into Neural Machine Translation

Title Incorporating BERT into Neural Machine Translation
Authors Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tie-Yan Liu
Abstract The recently proposed BERT has shown great power on a variety of natural language understanding tasks, such as text classification and reading comprehension. However, how to effectively apply BERT to neural machine translation (NMT) has not been explored enough. While BERT is more commonly used for fine-tuning than as a contextual embedding in downstream language understanding tasks, in NMT our preliminary exploration shows that using BERT as a contextual embedding works better than using it for fine-tuning. This motivates us to think about how to better leverage BERT for NMT along this direction. We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence, and then the representations are fused with each layer of the encoder and decoder of the NMT model through attention mechanisms. We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets. Our code is available at https://github.com/bert-nmt/bert-nmt.
Tasks Machine Translation, Reading Comprehension, Text Classification, Unsupervised Machine Translation
Published 2020-02-17
URL https://arxiv.org/abs/2002.06823v1
PDF https://arxiv.org/pdf/2002.06823v1.pdf
PWC https://paperswithcode.com/paper/incorporating-bert-into-neural-machine-1

Demographic Bias in Presentation Attack Detection of Iris Recognition Systems

Title Demographic Bias in Presentation Attack Detection of Iris Recognition Systems
Authors Meiling Fang, Naser Damer, Florian Kirchbuchner, Arjan Kuijper
Abstract With the widespread use of biometric systems, the demographic bias problem is attracting more attention. Although many studies have addressed bias issues in biometric verification, there are no works that analyse the bias in presentation attack detection (PAD) decisions. Hence, we investigate and analyze the demographic bias in iris PAD algorithms in this paper. To enable a clear discussion, we adapt the notions of differential performance and differential outcome to the PAD problem. We study the bias in iris PAD using three baselines (hand-crafted, transfer-learning, and training from scratch) on the NDCLD-2013 database. The experimental results point out that female users will be significantly less protected by the PAD than males.
Tasks Iris Recognition, Transfer Learning
Published 2020-03-06
URL https://arxiv.org/abs/2003.03151v1
PDF https://arxiv.org/pdf/2003.03151v1.pdf
PWC https://paperswithcode.com/paper/demographic-bias-in-presentation-attack

Robust Iris Presentation Attack Detection Fusing 2D and 3D Information

Title Robust Iris Presentation Attack Detection Fusing 2D and 3D Information
Authors Zhaoyuan Fang, Adam Czajka, Kevin W. Bowyer
Abstract The diversity and unpredictability of artifacts potentially presented to an iris sensor call for presentation attack detection methods that are agnostic to the specificity of presentation attack instruments. This paper proposes a method that combines two-dimensional and three-dimensional properties of the observed iris to address the problem of spoof detection when some properties of the artifacts are unknown. The 2D (textural) iris features are extracted by a state-of-the-art method employing Binary Statistical Image Features (BSIF), and an ensemble of classifiers is used to deliver the 2D modality-related decision. The 3D (shape) iris features are reconstructed by a photometric stereo method from only two images captured under near-infrared illumination placed at two different angles, as in many current commercial iris recognition sensors. The map of normal vectors is used to assess the convexity of the observed iris surface. The combination of these two approaches has been applied to detect whether a subject is wearing a textured contact lens to disguise their identity. Extensive experiments with the NDCLD’15 dataset and a newly collected NDIris3D dataset show that the proposed method is highly robust under various open-set testing scenarios, and that it outperforms all available open-source iris PAD methods tested in identical scenarios. The source code and the newly prepared benchmark are made available along with this paper.
Tasks Iris Recognition
Published 2020-02-21
URL https://arxiv.org/abs/2002.09137v1
PDF https://arxiv.org/pdf/2002.09137v1.pdf
PWC https://paperswithcode.com/paper/robust-iris-presentation-attack-detection

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

Title The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits
Authors Ronshee Chawla, Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai
Abstract We consider a decentralized multi-agent Multi Armed Bandit (MAB) setup consisting of $N$ agents, solving the same MAB instance to minimize individual cumulative regret. In our model, agents collaborate by exchanging messages through pairwise gossip style communications on an arbitrary connected graph. We develop two novel algorithms, where each agent only plays from a subset of all the arms. Agents use the communication medium to recommend only arm-IDs (not samples), and thus update the set of arms from which they play. We establish that, if agents communicate $\Omega(\log(T))$ times through any connected pairwise gossip mechanism, then every agent’s regret is a factor of order $N$ smaller compared to the case of no collaborations. Furthermore, we show that the communication constraints only have a second order effect on the regret of our algorithm. We then analyze this second order term of the regret to derive bounds on the regret-communication tradeoffs. Finally, we empirically evaluate our algorithm and conclude that the insights are fundamental and not artifacts of our bounds. We also show a lower bound which gives that the regret scaling obtained by our algorithm cannot be improved even in the absence of any communication constraints. Our results thus demonstrate that even a minimal level of collaboration among agents greatly reduces regret for all agents.
Published 2020-01-15
URL https://arxiv.org/abs/2001.05452v3
PDF https://arxiv.org/pdf/2001.05452v3.pdf
PWC https://paperswithcode.com/paper/the-gossiping-insert-eliminate-algorithm-for

RandomNet: Towards Fully Automatic Neural Architecture Design for Multimodal Learning

Title RandomNet: Towards Fully Automatic Neural Architecture Design for Multimodal Learning
Authors Stefano Alletto, Shenyang Huang, Vincent Francois-Lavet, Yohei Nakata, Guillaume Rabusseau
Abstract Almost all neural architecture search methods are evaluated in terms of the performance (i.e. test accuracy) of the model structures that they find. Should this be the only metric for a good autoML approach? To examine aspects beyond performance, we propose a set of criteria aimed at evaluating the core of the autoML problem: the amount of human intervention required to deploy these methods into real world scenarios. Based on our proposed evaluation checklist, we study the effectiveness of a random search strategy for fully automated multimodal neural architecture search. Compared to traditional methods that rely on manually crafted feature extractors, our method selects each modality from a large search space with minimal human supervision. We show that our proposed random search strategy performs close to the state of the art on the AV-MNIST dataset while meeting the desirable characteristics for a fully automated design process.
Tasks AutoML, Neural Architecture Search
Published 2020-03-02
URL https://arxiv.org/abs/2003.01181v1
PDF https://arxiv.org/pdf/2003.01181v1.pdf
PWC https://paperswithcode.com/paper/randomnet-towards-fully-automatic-neural

Ramifications and Diminution of Image Noise in Iris Recognition System

Title Ramifications and Diminution of Image Noise in Iris Recognition System
Authors Prajoy Podder, A. H. M Shahariar Parvez, Md. Mizanur Rahman, Tanvir Zaman Khan
Abstract Human identity verification has always been an attractive goal in digital security systems. Authentication or identification systems built on human characteristics such as face, fingerprint, hand geometry, iris, and voice are called biometric systems. Among these characteristics, iris recognition relies on the distinctive patterns of the human iris to establish and verify the identity of a person. An image is normally regarded as a collection of information. Noise in the input or processed image degrades image quality, so restoring the original image from its noisy version is essential for recovering as much information as possible from corrupted images. Noisy images in a biometric identification system cannot yield an accurate identity, because image information tends to be lost or damaged. Images are affected by various sorts of noise. This paper mainly focuses on salt-and-pepper noise, Gaussian noise, uniform noise, and speckle noise. Different filtering techniques can be adopted for noise reduction to improve the visual quality as well as the understandability of images. In this paper, four types of noise are applied to some images. These noises are filtered using different types of filters such as the mean, median, Wiener, and Gaussian filters. A comparative analysis is performed using four different categories of filter by computing quality parameters such as mean square error (MSE), peak signal-to-noise ratio (PSNR), average difference value (AD) and maximum difference value (MD).
Tasks Iris Recognition
Published 2020-02-08
URL https://arxiv.org/abs/2002.03125v1
PDF https://arxiv.org/pdf/2002.03125v1.pdf
PWC https://paperswithcode.com/paper/ramifications-and-diminution-of-image-noise
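
The four quality parameters named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, AD is assumed to be the signed mean of (reference minus restored), MD is assumed to be the largest absolute pixel difference (definitions vary between papers), and the 3x3 median filter is a plain NumPy stand-in for the filters the paper compares.

```python
import numpy as np


def quality_metrics(reference, restored, max_value=255.0):
    """Compare a restored image with its clean reference (hypothetical helper)."""
    ref = np.asarray(reference, dtype=np.float64)
    out = np.asarray(restored, dtype=np.float64)
    diff = ref - out
    mse = float(np.mean(diff ** 2))
    # PSNR is undefined for identical images; report infinity in that case.
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(max_value ** 2 / mse))
    ad = float(np.mean(diff))          # assumed definition: signed mean difference
    md = float(np.max(np.abs(diff)))   # assumed definition: largest absolute difference
    return {"MSE": mse, "PSNR": psnr, "AD": ad, "MD": md}


def median_filter_3x3(image):
    """3x3 median filter with edge replication, a common remedy for salt-and-pepper noise."""
    img = np.asarray(image, dtype=np.float64)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Collect the nine shifted views that make up each pixel's 3x3 neighbourhood.
    windows = [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)]
    return np.median(np.stack(windows, axis=0), axis=0)


if __name__ == "__main__":
    clean = np.full((5, 5), 100.0)
    noisy = clean.copy()
    noisy[2, 2] = 255.0  # a single "salt" pixel
    print(quality_metrics(clean, noisy))
    print(quality_metrics(clean, median_filter_3x3(noisy)))
```

A higher PSNR and a lower MSE after filtering indicate that the filter recovered more of the original image; AD and MD show whether the remaining error is systematic or concentrated in a few pixels.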

Morton Filters for Superior Template Protection for Iris Recognition

Title Morton Filters for Superior Template Protection for Iris Recognition
Authors Kiran B. Raja, R. Raghavendra, Sushma Venkatesh, Christoph Busch
Abstract We address the fundamental performance issues of template protection (TP) for iris verification. We base our work on the popular Bloom-filter template protection and address key challenges such as sub-optimal performance and low unlinkability. Specifically, we focus on cases where Bloom-filter templates result in non-ideal performance due to the presence of large degradations within iris images. Iris recognition is challenged by a number of occluding factors, such as eyelashes within the captured image, occlusion due to eyelids, and low-quality iris images due to motion blur. All such degrading factors result in unreliable iris codes and thereby non-ideal biometric performance. These factors directly impact the protected templates derived from iris images when classical Bloom filters are employed. To this end, we propose and extend our earlier ideas of Morton filters for obtaining better and more reliable templates for iris. Morton-filter-based TP for iris codes leverages the intra- and inter-class distribution by exploiting low-rank iris codes to derive the stable bits across iris images for a particular subject, and also by analyzing the discriminable bits across various subjects. Such low-rank, non-noisy iris codes enable template protection in a superior way, which can be used not only in constrained settings but also in relaxed iris imaging. We further extend the work to analyze the applicability to VIS iris images by employing a large-scale public iris image database, UBIRIS (v1 & v2), captured in an unconstrained setting. Through a set of experiments, we demonstrate the applicability of the proposed approach and vet its strengths and weaknesses. Yet another contribution of this work lies in assessing the security of the proposed approach, where unlinkability is studied to indicate the antagonistic nature to relaxed iris imaging scenarios.
Tasks Iris Recognition
Published 2020-01-15
URL https://arxiv.org/abs/2001.05290v1
PDF https://arxiv.org/pdf/2001.05290v1.pdf
PWC https://paperswithcode.com/paper/morton-filters-for-superior-template

AttentionAnatomy: A unified framework for whole-body organs at risk segmentation using multiple partially annotated datasets

Title AttentionAnatomy: A unified framework for whole-body organs at risk segmentation using multiple partially annotated datasets
Authors Shanlin Sun, Yang Liu, Narisu Bai, Hao Tang, Xuming Chen, Qian Huang, Yong Liu, Xiaohui Xie
Abstract Organs-at-risk (OAR) delineation in computed tomography (CT) is an important step in Radiation Therapy (RT) planning. Recently, deep learning based methods for OAR delineation have been proposed and applied in clinical practice for separate regions of the human body (head and neck, thorax, and abdomen). However, there is little research on end-to-end whole-body OAR delineation because the existing datasets are mostly partially or incompletely annotated for such a task. In this paper, our proposed end-to-end convolutional neural network model, called AttentionAnatomy, can be jointly trained with three partially annotated datasets, segmenting OARs from the whole body. Our main contributions are: 1) an attention module implicitly guided by the body region label to modulate the segmentation branch output; 2) a prediction re-calibration operation, exploiting prior information of the input images, to handle the partial-annotation (HPA) problem; 3) a new hybrid loss function combining batch Dice loss and spatially balanced focal loss to alleviate the organ size imbalance problem. Experimental results of our proposed framework show significant improvements in both Sørensen-Dice coefficient (DSC) and 95% Hausdorff distance compared to the baseline model.
Tasks Calibration, Computed Tomography (CT)
Published 2020-01-13
URL https://arxiv.org/abs/2001.04446v1
PDF https://arxiv.org/pdf/2001.04446v1.pdf
PWC https://paperswithcode.com/paper/attentionanatomy-a-unified-framework-for
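
For readers unfamiliar with the Sørensen-Dice coefficient reported above, the following is a minimal sketch of the metric and of a plain soft Dice loss. It is an illustration under stated assumptions, not the paper's hybrid loss: the function names and the smoothing constant `eps` are hypothetical, and neither the batch-level variant nor the spatially balanced focal term is reproduced here.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Sørensen-Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    # eps (an assumed smoothing constant) avoids division by zero for empty masks.
    return float((2.0 * intersection + eps) / (pred.sum() + true.sum() + eps))


def soft_dice_loss(probs, true_mask, eps=1e-7):
    """Differentiable-style Dice loss on predicted probabilities (1 - soft Dice)."""
    p = np.asarray(probs, dtype=np.float64)
    t = np.asarray(true_mask, dtype=np.float64)
    intersection = (p * t).sum()
    return float(1.0 - (2.0 * intersection + eps) / (p.sum() + t.sum() + eps))


if __name__ == "__main__":
    prediction = np.array([[1, 1], [0, 0]])
    ground_truth = np.array([[1, 0], [1, 0]])
    print(dice_coefficient(prediction, ground_truth))
    print(soft_dice_loss(np.array([[0.9, 0.6], [0.2, 0.1]]), ground_truth))
```

Because the score is normalized by the combined mask sizes, a small organ that is missed entirely is penalized as heavily as a large one, which is one reason Dice-style losses are often chosen when organ sizes are imbalanced.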

Deep Transfer Convolutional Neural Network and Extreme Learning Machine for Lung Nodule Diagnosis on CT images

Title Deep Transfer Convolutional Neural Network and Extreme Learning Machine for Lung Nodule Diagnosis on CT images
Authors Xufeng Huang, Qiang Lei, Tingli Xie, Yahui Zhang, Zhen Hu, Qi Zhou
Abstract Diagnosis of benign-malignant nodules in the lung on Computed Tomography (CT) images is critical for determining tumor level and reducing patient mortality. Deep learning-based diagnosis of nodules in lung CT images, however, is time-consuming and less accurate due to redundant structure and the lack of adequate training data. In this paper, a novel diagnosis method based on a Deep Transfer Convolutional Neural Network (DTCNN) and an Extreme Learning Machine (ELM) is explored, which merges the synergy of the two algorithms to deal with benign-malignant nodule classification. An optimal DTCNN, trained beforehand on the ImageNet dataset, is first adopted to extract high-level features of lung nodules. After that, an ELM classifier is further developed to classify benign and malignant lung nodules. Two datasets, the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) public dataset and a private dataset from the First Affiliated Hospital of Guangzhou Medical University in China (FAH-GMU), have been used to verify the efficiency and effectiveness of the proposed approach. The experimental results show that our novel DTCNN-ELM model provides the most reliable results compared with current state-of-the-art methods.
Tasks Computed Tomography (CT)
Published 2020-01-05
URL https://arxiv.org/abs/2001.01279v1
PDF https://arxiv.org/pdf/2001.01279v1.pdf
PWC https://paperswithcode.com/paper/deep-transfer-convolutional-neural-network
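
The ELM stage described above can be sketched in a few lines. This is a generic extreme learning machine, not the authors' DTCNN-ELM: it assumes the CNN features have already been extracted into a matrix `X`, and the class name, hidden size, tanh activation, and seed are all hypothetical choices. The defining trait of an ELM is that the hidden weights are random and fixed, so only the output weights are solved, in closed form, with a pseudo-inverse.

```python
import numpy as np


class ExtremeLearningMachine:
    """Single-hidden-layer classifier with random, untrained hidden weights."""

    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Random projection followed by a nonlinearity (tanh is an assumed choice).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=np.float64)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(np.float64)
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights are the least-squares solution of H @ beta = T.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=np.float64))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


if __name__ == "__main__":
    # Synthetic stand-in for extracted nodule features: two separated clusters.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-2.0, 0.5, size=(50, 8)),
                   rng.normal(2.0, 0.5, size=(50, 8))])
    y = np.array([0] * 50 + [1] * 50)
    model = ExtremeLearningMachine(n_hidden=32).fit(X, y)
    print("training accuracy:", (model.predict(X) == y).mean())
```

Since no gradient descent is involved, training cost is dominated by one pseudo-inverse, which is consistent with the abstract's motivation of avoiding a time-consuming end-to-end deep classifier.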

Xtreaming: an incremental multidimensional projection technique and its application to streaming data

Title Xtreaming: an incremental multidimensional projection technique and its application to streaming data
Authors Tácito T. A. T. Neves, Rafael M. Martins, Danilo B. Coimbra, Kostiantyn Kucher, Andreas Kerren, Fernando V. Paulovich
Abstract Streaming data applications are becoming more common due to the ability of different information sources to continuously capture or produce data, such as sensors and social media. Despite recent advances, most visualization approaches, in particular, multidimensional projection or dimensionality reduction techniques, cannot be directly applied in such scenarios due to the transient nature of streaming data. Currently, only a few methods address this limitation using online or incremental strategies, continuously processing data, and updating the visualization. Despite their relative success, most of them impose the need for storing and accessing the data multiple times, not being appropriate for streaming where data continuously grow. Others do not impose such requirements but are not capable of updating the position of the data already projected, potentially resulting in visual artifacts. In this paper, we present Xtreaming, a novel incremental projection technique that continuously updates the visual representation to reflect new emerging structures or patterns without visiting the multidimensional data more than once. Our tests show that Xtreaming is competitive in terms of global distance preservation if compared to other streaming and incremental techniques, but it is orders of magnitude faster. To the best of our knowledge, it is the first methodology that is capable of evolving a projection to faithfully represent new emerging structures without the need to store all data, providing reliable results for efficiently and effectively projecting streaming data.
Tasks Dimensionality Reduction
Published 2020-03-08
URL https://arxiv.org/abs/2003.09017v1
PDF https://arxiv.org/pdf/2003.09017v1.pdf
PWC https://paperswithcode.com/paper/xtreaming-an-incremental-multidimensional

Emergent Communication with World Models

Title Emergent Communication with World Models
Authors Alexander I. Cowen-Rivers, Jason Naradowsky
Abstract We introduce Language World Models, a class of language-conditional generative models which interpret natural language messages by predicting latent codes of future observations. This provides a visual grounding of the message, similar to an enhanced observation of the world, which may include objects outside of the listening agent’s field-of-view. We incorporate this “observation” into a persistent memory state, and allow the listening agent’s policy to condition on it, akin to the relationship between memory and controller in a World Model. We show this improves effective communication and task success in 2D gridworld speaker-listener navigation tasks. In addition, we develop two losses framed specifically for our model-based formulation to promote positive signalling and positive listening. Finally, because messages are interpreted in a generative model, we can visualize the model beliefs to gain insight into how the communication channel is utilized.
Published 2020-02-22
URL https://arxiv.org/abs/2002.09604v1
PDF https://arxiv.org/pdf/2002.09604v1.pdf
PWC https://paperswithcode.com/paper/emergent-communication-with-world-models

KGvec2go – Knowledge Graph Embeddings as a Service

Title KGvec2go – Knowledge Graph Embeddings as a Service
Authors Jan Portisch, Michael Hladik, Heiko Paulheim
Abstract In this paper, we present KGvec2go, a Web API for accessing and consuming graph embeddings in a light-weight fashion in downstream applications. Currently, we serve pre-trained embeddings for four knowledge graphs. We introduce the service and its usage, and we show further that the trained models have semantic value by evaluating them on multiple semantic benchmarks. The evaluation also reveals that the combination of multiple models can lead to a better outcome than the best individual model.
Tasks Knowledge Graph Embeddings, Knowledge Graphs
Published 2020-03-09
URL https://arxiv.org/abs/2003.05809v1
PDF https://arxiv.org/pdf/2003.05809v1.pdf
PWC https://paperswithcode.com/paper/kgvec2go-knowledge-graph-embeddings-as-a

An Evaluation of Knowledge Graph Embeddings for Autonomous Driving Data: Experience and Practice

Title An Evaluation of Knowledge Graph Embeddings for Autonomous Driving Data: Experience and Practice
Authors Ruwan Wickramarachchi, Cory Henson, Amit Sheth
Abstract The autonomous driving (AD) industry is exploring the use of knowledge graphs (KGs) to manage the vast amount of heterogeneous data generated from vehicular sensors. The various types of equipped sensors include video, LIDAR and RADAR. Scene understanding is an important topic in AD which requires consideration of various aspects of a scene, such as detected objects, events, time and location. Recent work on knowledge graph embeddings (KGEs), an approach that facilitates neuro-symbolic fusion, has been shown to improve the predictive performance of machine learning models. With the expectation that neuro-symbolic fusion through KGEs will improve scene understanding, this research explores the generation and evaluation of KGEs for autonomous driving data. We also present an investigation of the relationship between the level of informational detail in a KG and the quality of its derivative embeddings. By systematically evaluating KGEs along four dimensions – i.e. quality metrics, KG informational detail, algorithms, and datasets – we show that (1) higher levels of informational detail in KGs lead to higher quality embeddings, (2) type and relation semantics are better captured by the semantic transitional distance-based TransE algorithm, and (3) some metrics, such as coherence measure, may not be suitable for intrinsically evaluating KGEs in this domain. Additionally, we present an (early) investigation of the usefulness of KGEs for two use-cases in the AD domain.
Tasks Autonomous Driving, Knowledge Graph Embeddings, Knowledge Graphs, Scene Understanding
Published 2020-02-29
URL https://arxiv.org/abs/2003.00344v1
PDF https://arxiv.org/pdf/2003.00344v1.pdf
PWC https://paperswithcode.com/paper/an-evaluation-of-knowledge-graph-embeddings
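
The translational distance used by TransE, mentioned in the abstract above, can be illustrated with a short sketch. TransE scores a triple (head, relation, tail) by the distance between head + relation and tail, so a lower score means a more plausible triple. The entity and relation names and the two-dimensional toy vectors below are hypothetical and are not taken from the paper's datasets; trained embeddings would replace them in practice.

```python
import numpy as np


def transe_score(head, relation, tail, norm=1):
    """TransE distance ||head + relation - tail||; lower means more plausible."""
    return float(np.linalg.norm(head + relation - tail, ord=norm))


def rank_tails(head, relation, candidates, norm=1):
    """Order candidate tail entities from most to least plausible."""
    scores = {name: transe_score(head, relation, vec, norm)
              for name, vec in candidates.items()}
    return sorted(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical toy embeddings for a driving-scene knowledge graph.
    scene = np.array([1.0, 0.0])
    includes = np.array([0.0, 1.0])
    candidates = {
        "pedestrian": np.array([1.0, 1.0]),
        "traffic_light": np.array([0.5, 0.2]),
    }
    print(rank_tails(scene, includes, candidates))
```

Ranking candidate tails this way is the basis of the link-prediction style evaluations commonly applied to knowledge graph embeddings.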