Paper Group AWR 68
Interactive Natural Language-based Person Search. Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks. A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification. Bayesian Neural Networks With Maximum Mean Discrepancy Regularization. On Identifying Hashtags in Disaster Twitter Data …
Interactive Natural Language-based Person Search
Title | Interactive Natural Language-based Person Search |
Authors | Vikram Shree, Wei-Lun Chao, Mark Campbell |
Abstract | In this work, we consider the problem of searching for people in an unconstrained environment using natural language descriptions. Specifically, we study how to systematically design an algorithm that effectively acquires descriptions from humans. We propose an algorithm that adapts existing visual and language understanding models to search for a person of interest (POI) in a principled way, achieving promising results without the need to design another complicated model from scratch. We then investigate an iterative question-answering (QA) strategy that enables robots to request additional information about the POI's appearance from the user. To this end, we introduce a greedy algorithm that ranks questions by their significance, and equip it with the capability to dynamically adjust the length of human-robot interaction according to the model's uncertainty. Our approach is validated not only on benchmark datasets but also on a mobile robot moving in a dynamic and crowded environment. |
Tasks | Person Search, Question Answering |
Published | 2020-02-19 |
URL | https://arxiv.org/abs/2002.08434v1 |
https://arxiv.org/pdf/2002.08434v1.pdf | |
PWC | https://paperswithcode.com/paper/interactive-natural-language-based-person |
Repo | https://github.com/vikshree/QA_PersonSearchLanguageData |
Framework | none |
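The iterative QA strategy described above ranks candidate questions greedily by how much they are expected to reduce the model's uncertainty, and stops once the belief over candidates is peaked enough. As a rough sketch of that control loop (the `model` interface with `belief()`, `expected_entropy(q)`, and `update(q, a)` is a hypothetical stand-in, not the authors' API):

```python
def greedy_question_loop(model, questions, max_turns=5, confidence=0.9):
    """Ask the question expected to reduce uncertainty the most, stopping
    early once the belief over candidate persons is sufficiently peaked.
    `model` is a hypothetical object, not taken from the authors' repo."""
    for _ in range(max_turns):
        belief = model.belief()            # P(candidate | description so far)
        if max(belief) >= confidence:      # model confident enough: stop asking
            break
        # Greedy ranking: pick the question with the lowest expected
        # posterior entropy, i.e. the most significant question.
        best_q = min(questions, key=model.expected_entropy)
        questions.remove(best_q)
        answer = input(best_q + " ")       # request more detail from the user
        model.update(best_q, answer)       # condition the belief on the answer
    belief = model.belief()
    return max(range(len(belief)), key=belief.__getitem__)
```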
Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks
Title | Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks |
Authors | Lena Schmidt, Julie Weeds, Julian P. T. Higgins |
Abstract | This research on data extraction methods applies recent advances in natural language processing to evidence synthesis based on medical texts. Texts of interest include abstracts of clinical trials in English and in multilingual contexts. The main focus is on information characterized via the Population, Intervention, Comparator, and Outcome (PICO) framework, but data extraction is not limited to these fields. Recent neural network architectures based on transformers show capacity for transfer learning and increased performance on downstream natural language processing tasks such as universal reading comprehension, brought forward by this architecture's use of contextualized word embeddings and self-attention mechanisms. This paper contributes to solving problems related to ambiguity in PICO sentence prediction tasks, and highlights how annotations for training named entity recognition systems can be used to train a high-performing yet flexible architecture for question answering in systematic review automation. Additionally, it demonstrates how the problem of insufficient training annotations for PICO entity extraction can be tackled by data augmentation. All models in this paper were created with the aim of supporting systematic review (semi)automation. They achieve high F1 scores and demonstrate the feasibility of applying transformer-based classification methods to support data mining in the biomedical literature. |
Tasks | Entity Extraction, Named Entity Recognition, Question Answering, Reading Comprehension, Transfer Learning, Word Embeddings |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11268v1 |
https://arxiv.org/pdf/2001.11268v1.pdf | |
PWC | https://paperswithcode.com/paper/data-mining-in-clinical-trial-text |
Repo | https://github.com/L-ENA/HealthINF2020 |
Framework | none |
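For a sense of what transformer-based question answering over trial text looks like in practice, here is a minimal Hugging Face sketch. The checkpoint is a generic SQuAD-tuned model chosen purely for illustration; the paper fine-tunes its own models on PICO-style annotations.

```python
from transformers import pipeline

# Extractive QA over a (made-up) clinical trial abstract.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

abstract = ("We randomized 120 adults with type 2 diabetes to metformin "
            "or placebo and measured HbA1c after 24 weeks.")
for question in ["What is the population?", "What is the intervention?"]:
    result = qa(question=question, context=abstract)
    print(f"{question} -> {result['answer']} (score={result['score']:.2f})")
```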
A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification
Title | A Comparison of Metric Learning Loss Functions for End-To-End Speaker Verification |
Authors | Juan M. Coria, Hervé Bredin, Sahar Ghannay, Sophie Rosset |
Abstract | Despite the growing popularity of metric learning approaches, very little work has attempted a fair comparison of these techniques for speaker verification. We try to fill this gap and compare several metric learning loss functions in a systematic manner on the VoxCeleb dataset. The first family of loss functions is derived from the cross-entropy loss (usually used for supervised classification) and includes the congenerous cosine loss, the additive angular margin loss, and the center loss. The second family of loss functions focuses on the similarity between training samples and includes the contrastive loss and the triplet loss. We show that the additive angular margin loss function outperforms all other loss functions in the study, while learning more robust representations. Based on a combination of SincNet trainable features and the x-vector architecture, the network used in this paper, when combined with the additive angular margin loss, brings us a step closer to a truly end-to-end speaker verification system, while still being competitive with the x-vector baseline. In the spirit of reproducible research, we also release open-source Python code for reproducing our results and share pretrained PyTorch models on torch.hub that can be used either directly or after fine-tuning. |
Tasks | Metric Learning, Speaker Verification |
Published | 2020-03-31 |
URL | https://arxiv.org/abs/2003.14021v1 |
https://arxiv.org/pdf/2003.14021v1.pdf | |
PWC | https://paperswithcode.com/paper/a-comparison-of-metric-learning-loss |
Repo | https://github.com/juanmc2005/SpeakerEmbeddingLossComparison |
Framework | none |
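The additive angular margin loss that comes out on top in this comparison adds a fixed margin to the angle between an embedding and its target class prototype before a scaled softmax cross entropy. A minimal PyTorch sketch (not the authors' released code; shapes and hyperparameters are illustrative):

```python
import torch
import torch.nn.functional as F

def additive_angular_margin_loss(emb, labels, weight, margin=0.2, scale=30.0):
    """ArcFace-style additive angular margin loss over speaker prototypes."""
    # Cosine similarity between L2-normalized embeddings and prototypes.
    cosine = F.linear(F.normalize(emb), F.normalize(weight))
    theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
    # Add the margin m only to the target-class angle.
    target = F.one_hot(labels, num_classes=weight.size(0)).bool()
    logits = torch.where(target, torch.cos(theta + margin), cosine)
    return F.cross_entropy(scale * logits, labels)

# Usage: 8 utterance embeddings of dimension 192, 10 training speakers.
emb = torch.randn(8, 192)
labels = torch.randint(0, 10, (8,))
prototypes = torch.randn(10, 192, requires_grad=True)
additive_angular_margin_loss(emb, labels, prototypes).backward()
```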
Bayesian Neural Networks With Maximum Mean Discrepancy Regularization
Title | Bayesian Neural Networks With Maximum Mean Discrepancy Regularization |
Authors | Jary Pomponi, Simone Scardapane, Aurelio Uncini |
Abstract | Bayesian Neural Networks (BNNs) are trained to optimize an entire distribution over their weights instead of a single set, which yields significant advantages in terms of, e.g., interpretability, multi-task learning, and calibration. Because of the intractability of the resulting optimization problem, most BNNs are either sampled through Monte Carlo methods or trained by minimizing a suitable Evidence Lower BOund (ELBO) on a variational approximation. In this paper, we propose a variant of the latter, wherein we replace the Kullback-Leibler divergence in the ELBO term with a Maximum Mean Discrepancy (MMD) estimator, inspired by recent work in variational inference. After motivating our proposal based on the properties of the MMD term, we proceed to show a number of empirical advantages of the proposed formulation over the state of the art. In particular, our BNNs achieve higher accuracy on multiple benchmarks, including several image classification tasks. In addition, they are more robust to the selection of a prior over the weights, and they are better calibrated. As a second contribution, we provide a new formulation for estimating the uncertainty of a given prediction, showing that it performs more robustly against adversarial attacks and the injection of noise into the inputs than more classical criteria such as the differential entropy. |
Tasks | Calibration, Image Classification, Multi-Task Learning |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00952v1 |
https://arxiv.org/pdf/2003.00952v1.pdf | |
PWC | https://paperswithcode.com/paper/bayesian-neural-networks-with-maximum-mean |
Repo | https://github.com/ispamm/MMD-Bayesian-Neural-Network |
Framework | pytorch |
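The core substitution is replacing the KL term of the ELBO with an MMD estimator between samples from the variational posterior and samples from the prior. A self-contained sketch of the (biased) squared-MMD estimator with a Gaussian kernel, under assumed shapes (illustrative, not the authors' code):

```python
import torch

def mmd2(x, y, bandwidth=1.0):
    """Biased estimator of squared MMD with a Gaussian kernel.
    x: samples from the variational posterior over a flattened weight
    tensor; y: samples from the prior."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)   # pairwise squared distances
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Usage: 64 posterior samples vs. 64 prior samples of a 100-dim weight vector.
posterior = 0.5 * torch.randn(64, 100) + 0.1
prior = torch.randn(64, 100)
print(mmd2(posterior, prior).item())    # > 0 when the distributions differ
```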
On Identifying Hashtags in Disaster Twitter Data
Title | On Identifying Hashtags in Disaster Twitter Data |
Authors | Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea |
Abstract | Tweet hashtags have the potential to improve the search for information during disaster events. However, a large number of disaster-related tweets do not have any user-provided hashtags. Moreover, only a small number of tweets contain actionable hashtags useful for disaster response. To facilitate progress on the automatic identification (or extraction) of disaster hashtags in Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information. Using this dataset, we further investigate Long Short-Term Memory (LSTM)-based models within a multi-task learning framework. The best-performing model achieves an F1-score as high as 92.22%. The dataset, code, and other resources are available on GitHub. |
Tasks | Multi-Task Learning |
Published | 2020-01-05 |
URL | https://arxiv.org/abs/2001.01323v1 |
https://arxiv.org/pdf/2001.01323v1.pdf | |
PWC | https://paperswithcode.com/paper/on-identifying-hashtags-in-disaster-twitter |
Repo | https://github.com/JRC1995/Tweet-Disaster-Keyphrase |
Framework | tf |
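To make the model family concrete, here is a minimal BiLSTM sequence tagger that marks hashtag-worthy tokens in a tweet. It is only a sketch of the kind of LSTM-based model the paper investigates; the vocabulary size, dimensions, and the multi-task heads are placeholder assumptions.

```python
import tensorflow as tf

VOCAB, EMB = 20000, 100   # assumed vocabulary size and embedding dimension
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None,), dtype="int32"),  # token ids
    tf.keras.layers.Embedding(VOCAB, EMB, mask_zero=True),
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(64, return_sequences=True)),
    # One binary decision per token: is this token part of a hashtag-worthy
    # keyphrase? A multi-task variant would add further output heads here.
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(1, activation="sigmoid")),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()
```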
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video
Title | Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video |
Authors | Jie Wu, Guanbin Li, Si Liu, Liang Lin |
Abstract | Temporal language grounding in untrimmed videos is a recently introduced task in video understanding. Most existing methods suffer from inferior efficiency, lack interpretability, and deviate from the human perception mechanism. Inspired by the human coarse-to-fine decision-making paradigm, we formulate a novel Tree-Structured Policy based Progressive Reinforcement Learning (TSP-PRL) framework that sequentially regulates the temporal boundary through an iterative refinement process. Semantic concepts are explicitly represented as branches in the policy, which helps decompose complex policies into interpretable primitive actions. Progressive reinforcement learning provides correct credit assignment via two task-oriented rewards that encourage mutual promotion within the tree-structured policy. We extensively evaluate TSP-PRL on the Charades-STA and ActivityNet datasets, and experimental results show that TSP-PRL achieves competitive performance compared with existing state-of-the-art methods. |
Tasks | Decision Making, Video Understanding |
Published | 2020-01-18 |
URL | https://arxiv.org/abs/2001.06680v1 |
https://arxiv.org/pdf/2001.06680v1.pdf | |
PWC | https://paperswithcode.com/paper/tree-structured-policy-based-progressive |
Repo | https://github.com/WuJie1010/TSP-PRL |
Framework | pytorch |
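The iterative refinement the abstract describes can be pictured as a loop in which a policy repeatedly picks a primitive action that shifts or rescales the current temporal window until it decides to stop. The action set and `policy` below are hypothetical stand-ins for the paper's tree-structured policy, meant only to convey the control flow:

```python
def refine_boundary(policy, video_feats, query_feats, steps=10, delta=0.05):
    """Coarse-to-fine adjustment of a normalized temporal window [start, end]."""
    start, end = 0.0, 1.0
    for _ in range(steps):
        action = policy(video_feats, query_feats, (start, end))
        if action == "stop":          # policy judges the boundary good enough
            break
        moves = {"shift_left":  (-delta, -delta),
                 "shift_right": (+delta, +delta),
                 "shrink":      (+delta, -delta),
                 "expand":      (-delta, +delta)}
        ds, de = moves[action]
        start = min(max(start + ds, 0.0), 1.0)
        end = min(max(end + de, 0.0), 1.0)
    return start, end
```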
Neural Cross-Lingual Transfer and Limited Annotated Data for Named Entity Recognition in Danish
Title | Neural Cross-Lingual Transfer and Limited Annotated Data for Named Entity Recognition in Danish |
Authors | Barbara Plank |
Abstract | Named Entity Recognition (NER) has greatly advanced with the introduction of deep neural architectures. However, the success of these methods depends on large amounts of training data. The scarcity of publicly available human-labeled datasets has resulted in limited evaluation of existing NER systems, as is the case for Danish. This paper studies the effectiveness of cross-lingual transfer for Danish, evaluates its complementarity to limited gold data, and sheds light on the performance of Danish NER. |
Tasks | Cross-Lingual Transfer, Named Entity Recognition |
Published | 2020-03-05 |
URL | https://arxiv.org/abs/2003.02931v1 |
https://arxiv.org/pdf/2003.02931v1.pdf | |
PWC | https://paperswithcode.com/paper/neural-cross-lingual-transfer-and-limited |
Repo | https://github.com/bplank/danish_ner_transfer |
Framework | none |
PhoBERT: Pre-trained language models for Vietnamese
Title | PhoBERT: Pre-trained language models for Vietnamese |
Authors | Dat Quoc Nguyen, Anh Tuan Nguyen |
Abstract | We present PhoBERT in two versions, "base" and "large", the first public large-scale monolingual language models pre-trained for Vietnamese. We show that PhoBERT improves the state of the art in multiple Vietnamese-specific NLP tasks, including part-of-speech tagging, named-entity recognition, and natural language inference. We release PhoBERT to facilitate future research and downstream applications for Vietnamese NLP. PhoBERT is released at: https://github.com/VinAIResearch/PhoBERT |
Tasks | Named Entity Recognition, Natural Language Inference, Part-Of-Speech Tagging |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00744v1 |
https://arxiv.org/pdf/2003.00744v1.pdf | |
PWC | https://paperswithcode.com/paper/phobert-pre-trained-language-models-for |
Repo | https://github.com/VinAIResearch/PhoBERT |
Framework | pytorch |
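A short usage sketch for loading PhoBERT through the transformers library, along the lines of the repository's README (note that PhoBERT expects word-segmented Vietnamese input):

```python
import torch
from transformers import AutoModel, AutoTokenizer

phobert = AutoModel.from_pretrained("vinai/phobert-base")
tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")

sentence = "Tôi là sinh_viên trường đại_học ."   # already word-segmented
input_ids = torch.tensor([tokenizer.encode(sentence)])
with torch.no_grad():
    features = phobert(input_ids)                # contextualized embeddings
print(features.last_hidden_state.shape)
```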
Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages
Title | Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages |
Authors | Edoardo M. Ponti, Ivan Vulić, Ryan Cotterell, Marinela Parovic, Roi Reichart, Anna Korhonen |
Abstract | Most combinations of NLP tasks and language varieties lack in-domain examples for supervised training because of the paucity of annotated data. How can neural models make sample-efficient generalizations from task-language combinations with available data to low-resource ones? In this work, we propose a Bayesian generative model for the space of neural parameters. We assume that this space can be factorized into latent variables for each language and each task. We infer the posteriors over such latent variables based on data from seen task-language combinations through variational inference. This enables zero-shot classification on unseen combinations at prediction time. For instance, given training data for named entity recognition (NER) in Vietnamese and for part-of-speech (POS) tagging in Wolof, our model can perform accurate predictions for NER in Wolof. In particular, we experiment with a typologically diverse sample of 33 languages from 4 continents and 11 families, and show that our model yields results comparable to or better than state-of-the-art zero-shot cross-lingual transfer methods; on average, it improves over the strongest baseline by 4.49 points for POS tagging and 7.73 points for NER. |
Tasks | Cross-Lingual Transfer, Named Entity Recognition, Part-Of-Speech Tagging, Zero-Shot Learning |
Published | 2020-01-30 |
URL | https://arxiv.org/abs/2001.11453v1 |
https://arxiv.org/pdf/2001.11453v1.pdf | |
PWC | https://paperswithcode.com/paper/parameter-space-factorization-for-zero-shot |
Repo | https://github.com/cambridgeltl/parameter-factorization |
Framework | pytorch |
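Schematically, the generative assumption is that a latent task variable and a latent language variable jointly produce the parameters for each task-language pair, and a variational objective is optimized over the seen pairs only (notation here is ours, not necessarily the paper's):

```latex
% z_\tau: latent task variable; z_\ell: latent language variable.
% Parameters for task \tau in language \ell: \theta_{\tau,\ell} = \theta(z_\tau, z_\ell).
% Schematic ELBO over the set S of seen task-language combinations:
\mathcal{L} = \sum_{(\tau,\ell)\in S}
    \mathbb{E}_{q(z_\tau)\, q(z_\ell)}\!\left[
        \log p\!\left(\mathcal{D}_{\tau,\ell} \mid \theta(z_\tau, z_\ell)\right)
    \right]
    - \sum_{\tau} \mathrm{KL}\!\left(q(z_\tau) \,\|\, p(z_\tau)\right)
    - \sum_{\ell} \mathrm{KL}\!\left(q(z_\ell) \,\|\, p(z_\ell)\right)
```

At prediction time, an unseen pair such as (NER, Wolof) reuses the task posterior inferred from NER in other languages and the language posterior inferred from other tasks in Wolof.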
Learning Delicate Local Representations for Multi-Person Pose Estimation
Title | Learning Delicate Local Representations for Multi-Person Pose Estimation |
Authors | Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xinyu Zhou, Erjin Zhou, Xiangyu Zhang, Jian Sun |
Abstract | In this paper, we propose a novel method called Residual Steps Network (RSN). RSN efficiently aggregates features with the same spatial size (intra-level features) to obtain delicate local representations, which retain rich low-level spatial information and result in precise keypoint localization. In addition, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), to further refine keypoint locations. Our approach won 1st place in the COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both the COCO and MPII benchmarks, without using extra training data or pretrained models. Our single model achieves 78.6 on COCO test-dev and 93.0 on the MPII test set. Ensembled models achieve 79.2 on COCO test-dev and 77.1 on the COCO test-challenge set. The source code is publicly available for further research at https://github.com/caiyuanhao1998/RSN |
Tasks | Multi-Person Pose Estimation, Pose Estimation |
Published | 2020-03-09 |
URL | https://arxiv.org/abs/2003.04030v2 |
https://arxiv.org/pdf/2003.04030v2.pdf | |
PWC | https://paperswithcode.com/paper/learning-delicate-local-representations-for |
Repo | https://github.com/caiyuanhao1998/RSN |
Framework | pytorch |
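As a rough picture of aggregating features at the same spatial size through branches with a growing number of 3x3 convolutions, here is a hypothetical PyTorch block; it conveys the intra-level aggregation idea but is not the released RSN architecture (see the repo for that):

```python
import torch
import torch.nn as nn

class IntraLevelAggregation(nn.Module):
    """Branches of 1..B successive 3x3 convs, summed with the identity.
    Spatial size never changes, so low-level spatial detail is preserved."""
    def __init__(self, channels, branches=4):
        super().__init__()
        conv = lambda: nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.branches = nn.ModuleList(
            nn.Sequential(*[conv() for _ in range(i + 1)])
            for i in range(branches))

    def forward(self, x):
        return x + sum(branch(x) for branch in self.branches)

block = IntraLevelAggregation(64)
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 64, 32, 32])
```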
MoVi: A Large Multipurpose Motion and Video Dataset
Title | MoVi: A Large Multipurpose Motion and Video Dataset |
Authors | Saeed Ghorbani, Kimia Mahdaviani, Anne Thaler, Konrad Kording, Douglas James Cook, Gunnar Blohm, Nikolaus F. Troje |
Abstract | Human movements are both an area of intense study and the basis of many applications such as character animation. For many applications, it is crucial to identify movements from videos or to analyze datasets of movements. Here we introduce MoVi, a new human Motion and Video dataset, which we make publicly available. It contains 60 female and 30 male actors performing a collection of 20 predefined everyday actions and sports movements, plus one self-chosen movement. In five capture rounds, the same actors and movements were recorded using different hardware systems, including an optical motion capture system, video cameras, and inertial measurement units (IMUs). In some capture rounds the actors were recorded wearing natural clothing; in the others they wore minimal clothing. In total, our dataset contains 9 hours of motion capture data, 17 hours of video data from 4 different points of view (including one hand-held camera), and 6.6 hours of IMU data. In this paper, we describe how the dataset was collected and post-processed, and we present state-of-the-art estimates of skeletal motions and full-body shape deformations associated with skeletal motion. We also discuss examples of potential studies this dataset could enable. |
Tasks | Motion Capture |
Published | 2020-03-04 |
URL | https://arxiv.org/abs/2003.01888v1 |
https://arxiv.org/pdf/2003.01888v1.pdf | |
PWC | https://paperswithcode.com/paper/movi-a-large-multipurpose-motion-and-video |
Repo | https://github.com/saeed1262/MoVi-Toolbox |
Framework | none |
VegasFlow: accelerating Monte Carlo simulation across multiple hardware platforms
Title | VegasFlow: accelerating Monte Carlo simulation across multiple hardware platforms |
Authors | Stefano Carrazza, Juan M. Cruz-Martinez |
Abstract | We present VegasFlow, a new software package for the fast evaluation of high-dimensional integrals based on Monte Carlo integration techniques, designed for platforms with hardware accelerators. The growing complexity of calculations and simulations in many areas of science has been accompanied by advances in the computational tools that have helped their development. VegasFlow enables developers to delegate all complicated aspects of hardware or platform implementation to the library, so they can focus on the problem at hand. This software is inspired by the Vegas algorithm, ubiquitous in the particle physics community as the driver of cross-section integration, and is based on Google's powerful TensorFlow library. We benchmark the performance of this library on many different consumer- and professional-grade GPUs and CPUs. |
Tasks | |
Published | 2020-02-28 |
URL | https://arxiv.org/abs/2002.12921v1 |
https://arxiv.org/pdf/2002.12921v1.pdf | |
PWC | https://paperswithcode.com/paper/vegasflow-accelerating-monte-carlo-simulation |
Repo | https://github.com/N3PDF/vegasflow |
Framework | tf |
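To illustrate the computation the library accelerates, here is a plain Monte Carlo estimate of a multi-dimensional integral written directly in TensorFlow, so it runs on whatever device is available. This is not VegasFlow's own API (see the repository for that), just the underlying idea:

```python
import tensorflow as tf

@tf.function
def integrand(x):
    # f(x) = prod_i 2*x_i on the unit hypercube; the exact integral is 1.
    return tf.reduce_prod(2.0 * x, axis=1)

def mc_integrate(dim=4, n=1_000_000):
    x = tf.random.uniform((n, dim))              # uniform samples in [0,1]^dim
    fx = integrand(x)
    mean = tf.reduce_mean(fx)                    # MC estimate of the integral
    err = tf.math.reduce_std(fx) / tf.sqrt(float(n))
    return mean.numpy(), err.numpy()

print(mc_integrate())   # ~(1.0, small statistical error)
```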
Gaining a Sense of Touch. Physical Parameters Estimation using a Soft Gripper and Neural Networks
Title | Gaining a Sense of Touch. Physical Parameters Estimation using a Soft Gripper and Neural Networks |
Authors | Michał Bednarek, Piotr Kicki, Jakub Bednarek, Krzysztof Walas |
Abstract | Soft grippers are gaining significant attention for the manipulation of elastic objects, where soft, unstructured objects vulnerable to deformation must be handled. A crucial problem is estimating the physical parameters of a squeezed object in order to adjust the manipulation procedure, which remains a significant challenge. To the best of the authors' knowledge, there is little research on physical parameter estimation using deep learning algorithms applied to measurements from direct interaction with objects using robotic grippers. In our work, we propose a trainable system for regressing a stiffness coefficient and provide extensive experiments in a physics simulator environment. Moreover, we prepared an application that works in a real-world scenario. Our system can reliably estimate the stiffness of an object using the Yale OpenHand soft gripper, based on readings from Inertial Measurement Units (IMUs) attached to its fingers. Additionally, during the experiments, we prepared three datasets of signals gathered while squeezing objects: two created in the simulation environment and one composed of real data. |
Tasks | |
Published | 2020-03-02 |
URL | https://arxiv.org/abs/2003.00784v2 |
https://arxiv.org/pdf/2003.00784v2.pdf | |
PWC | https://paperswithcode.com/paper/gaining-a-sense-of-touch-physical-parameters |
Repo | https://github.com/mbed92/soft-grip |
Framework | tf |
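As a sketch of the regression setup, a small 1D-CNN that maps windows of IMU readings to a stiffness coefficient. The input shape (200 timesteps x 12 channels, e.g., two 6-axis IMUs) and the architecture are illustrative assumptions, not the authors' network:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(200, 12)),      # assumed IMU window shape
    tf.keras.layers.Conv1D(32, 5, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),                    # predicted stiffness coefficient
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()
```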
Towards Detection of Subjective Bias using Contextualized Word Embeddings
Title | Towards Detection of Subjective Bias using Contextualized Word Embeddings |
Authors | Tanvi Dadu, Kartikey Pant, Radhika Mamidi |
Abstract | Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization. This bias is introduced in natural language via inflammatory words and phrases, casting doubt over facts, and presupposing the truth. In this work, we perform comprehensive experiments on detecting subjective bias using BERT-based models on the Wiki Neutrality Corpus (WNC). The dataset consists of 360k labeled instances drawn from Wikipedia edits that remove various instances of bias. We further propose BERT-based ensembles that outperform state-of-the-art methods like BERT-large by a margin of 5.6 F1 points. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2020-02-16 |
URL | https://arxiv.org/abs/2002.06644v1 |
https://arxiv.org/pdf/2002.06644v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-detection-of-subjective-bias-using |
Repo | https://github.com/tanvidadu/Subjective-Bias-Detection |
Framework | none |
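A generic soft-voting recipe over several transformer classifiers, to make the ensembling idea concrete. The checkpoints below are untuned placeholders; in practice each member would be fine-tuned on the WNC subjectivity labels, and nothing here is taken from the authors' code:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MEMBERS = ["bert-base-cased", "roberta-base"]    # placeholder ensemble members

def ensemble_predict(text):
    probs = []
    for name in MEMBERS:
        tok = AutoTokenizer.from_pretrained(name)
        clf = AutoModelForSequenceClassification.from_pretrained(
            name, num_labels=2)                  # subjective vs. neutral
        enc = tok(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs.append(clf(**enc).logits.softmax(-1))
    return torch.stack(probs).mean(0)            # averaged class probabilities

print(ensemble_predict("Clearly the greatest invention of the century."))
```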
Lesion Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale
Title | Lesion Harvester: Iteratively Mining Unlabeled Lesions and Hard-Negative Examples at Scale |
Authors | Jinzheng Cai, Adam P. Harrison, Youjing Zheng, Ke Yan, Yuankai Huo, Jing Xiao, Lin Yang, Le Lu |
Abstract | Acquiring large-scale medical image data, necessary for training machine learning algorithms, is frequently intractable due to prohibitive expert-driven annotation costs. Recent datasets extracted from hospital archives, e.g., DeepLesion, have begun to address this problem. However, these are often incompletely or noisily labeled, e.g., DeepLesion leaves over 50% of its lesions unlabeled. Thus, effective methods to harvest missing annotations are critical for continued progress in medical image analysis. This is the goal of our work, where we develop a powerful system to harvest missing lesions from the DeepLesion dataset at high precision. Accepting the need for some degree of expert labor to achieve high fidelity, we exploit a small fully-labeled subset of medical image volumes and use it to intelligently mine annotations from the remainder. To do this, we chain together a highly sensitive lesion proposal generator and a very selective lesion proposal classifier. While our framework is generic, we optimize performance by proposing a 3D contextual lesion proposal generator and a multi-view, multi-scale lesion proposal classifier. These produce harvested and hard-negative proposals, which we then reuse to fine-tune the proposal generator with a novel hard-negative suppression loss, continuing this process until no extra lesions are found. Extensive experimental analysis demonstrates that our method can harvest an additional 9,805 lesions while keeping precision above 90%. To demonstrate the benefits of our approach, we show that lesion detectors trained on our harvested lesions significantly outperform the same variants trained only on the original annotations, with a boost in average precision of 7% to 10%. We open-source our annotations at https://github.com/JimmyCai91/DeepLesionAnnotation. |
Tasks | |
Published | 2020-01-21 |
URL | https://arxiv.org/abs/2001.07776v2 |
https://arxiv.org/pdf/2001.07776v2.pdf | |
PWC | https://paperswithcode.com/paper/lesion-harvester-iteratively-mining-unlabeled |
Repo | https://github.com/JimmyCai91/DeepLesionAnnotation |
Framework | none |
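The harvesting loop the abstract outlines alternates between proposing lesions, filtering them, and fine-tuning on the results until nothing new is found. A schematic rendering with hypothetical placeholder components (the real system uses a 3D contextual proposal generator and a multi-view, multi-scale classifier):

```python
def harvest(generator, classifier, volumes, labeled, t_pos=0.9, t_neg=0.1):
    """Iteratively mine missing lesions and hard negatives from `volumes`.
    `generator` and `classifier` are hypothetical stand-ins."""
    harvested, hard_negatives = [], []
    while True:
        proposals = generator.propose(volumes)            # high sensitivity
        new = [p for p in proposals
               if p not in labeled and p not in harvested]
        scored = [(p, classifier.score(p)) for p in new]  # high selectivity
        found = [p for p, s in scored if s >= t_pos]
        hard_negatives += [p for p, s in scored if s <= t_neg]
        if not found:                # converged: no extra lesions mined
            break
        harvested += found
        # Fine-tune with a hard-negative suppression term so rejected
        # proposals stop resurfacing in the next round.
        generator.finetune(labeled + harvested, hard_negatives)
    return harvested, hard_negatives
```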