February 1, 2020

3289 words 16 mins read

Paper Group AWR 193

Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution

Title Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution
Authors Vishnu Suresh Lokhande, Shaofei Wang, Maneesh Singh, Julian Yarkony
Abstract In this paper, we introduce a new optimization approach to Entity Resolution. Traditional approaches tackle entity resolution with hierarchical clustering, which does not benefit from a formal optimization formulation. In contrast, we model entity resolution as correlation clustering, which we treat as a weighted set-packing problem and write as an integer linear program (ILP). In this case, sources in the input data correspond to elements, and entities in the output data correspond to sets/clusters. We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation. The set of potential sets/clusters cannot be explicitly enumerated, thus motivating optimization via column generation. In addition to the novel formulation, we also introduce new dual optimal inequalities (DOI), which we call flexible dual optimal inequalities, that tightly lower-bound dual variables during optimization and accelerate column generation. We apply our formulation to entity resolution (also called de-duplication of records) and achieve state-of-the-art accuracy on two popular benchmark datasets. The project page is available at the following URL: https://github.com/lokhande-vishnu/EntityResolution
Tasks Entity Resolution
Published 2019-09-12
URL https://arxiv.org/abs/1909.05460v3
PDF https://arxiv.org/pdf/1909.05460v3.pdf
PWC https://paperswithcode.com/paper/accelerating-column-generation-via-flexible
Repo https://github.com/lokhande-vishnu/EntityResolution
Framework none
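
To make the weighted set-packing view concrete, here is a small illustrative sketch (Python with SciPy, not the authors' code): records are elements, candidate clusters are columns, and a cluster's value is the sum of pairwise affinities inside it. The toy instance enumerates every candidate cluster explicitly, which is exactly what the paper avoids by generating columns on demand and accelerating the process with flexible dual optimal inequalities; the affinity matrix below is invented for illustration.

```python
# Toy sketch: correlation clustering cast as weighted set packing.
import itertools
import numpy as np
from scipy.optimize import milp, LinearConstraint

# Pairwise affinities between 4 records: positive = "same entity" evidence.
theta = np.array([[ 0.0,  2.0,  1.5, -1.0],
                  [ 2.0,  0.0,  1.0, -2.0],
                  [ 1.5,  1.0,  0.0, -1.5],
                  [-1.0, -2.0, -1.5,  0.0]])
n = theta.shape[0]

# Enumerate every non-empty subset as a candidate cluster (a "column").
clusters = [c for r in range(1, n + 1) for c in itertools.combinations(range(n), r)]
reward = [sum(theta[i, j] for i, j in itertools.combinations(c, 2)) for c in clusters]

# Element/cluster incidence matrix: A[e, k] = 1 if record e is in cluster k.
A = np.array([[1 if e in c else 0 for c in clusters] for e in range(n)])

# Set packing: each record belongs to at most one selected cluster.
res = milp(c=-np.array(reward),                      # maximize total reward
           constraints=LinearConstraint(A, -np.inf, 1),
           integrality=np.ones(len(clusters)))
chosen = [clusters[k] for k, x in enumerate(res.x) if x > 0.5]
print("entities:", chosen)                           # records not covered stay singletons
```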

Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors

Title Panoramic Annular Localizer: Tackling the Variation Challenges of Outdoor Localization Using Panoramic Annular Images and Active Deep Descriptors
Authors Ruiqi Cheng, Kaiwei Wang, Shufei Lin, Weijian Hu, Kailun Yang, Xiao Huang, Huabing Li, Dongming Sun, Jian Bai
Abstract Visual localization is an attractive problem that estimates the camera location from database images based on the query image. It is a crucial task for various applications, such as autonomous vehicles, assistive navigation and augmented reality. The challenges of the task lie in the various appearance variations between query and database images, including illumination variations, dynamic object variations and viewpoint variations. To tackle these challenges, this paper proposes the Panoramic Annular Localizer, which incorporates a panoramic annular lens and robust deep image descriptors. The panoramic annular images captured by a single camera are processed and fed into the NetVLAD network to form the active deep descriptor, and sequential matching is utilized to generate the localization result. Experiments carried out on public datasets and in the field validate the proposed system.
Tasks Autonomous Vehicles, Camera Localization, Visual Localization
Published 2019-05-14
URL https://arxiv.org/abs/1905.05425v2
PDF https://arxiv.org/pdf/1905.05425v2.pdf
PWC https://paperswithcode.com/paper/panoramic-annular-localizer-tackling-the
Repo https://github.com/chengricky/PanoramicScenePlaceRecognition
Framework pytorch
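
The matching stage described in the abstract can be illustrated with a short sketch. This is an assumption-laden toy (the descriptor dimension, window size and averaging rule are mine, not the repository's): global descriptors such as NetVLAD vectors are compared by cosine similarity, and scores are accumulated over a short aligned window of consecutive frames before picking the best database match.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def sequence_match(query_desc, db_desc, window=3):
    """query_desc: (Q, D), db_desc: (N, D) global descriptors (e.g. NetVLAD).
    For each query frame, return the database index that maximizes the cosine
    similarity averaged over an aligned window of neighboring frames."""
    q, d = l2_normalize(query_desc), l2_normalize(db_desc)
    sim = q @ d.T                                   # (Q, N) cosine similarities
    Q, N = sim.shape
    matches = np.zeros(Q, dtype=int)
    for i in range(Q):
        scores = np.full(N, -np.inf)
        for j in range(N):
            # assume query and database sequences advance at roughly the same rate
            valid = [(i + o, j + o) for o in range(-window, window + 1)
                     if 0 <= i + o < Q and 0 <= j + o < N]
            scores[j] = np.mean([sim[a, b] for a, b in valid])
        matches[i] = int(np.argmax(scores))
    return matches

# toy usage with random "descriptors"
rng = np.random.default_rng(0)
db = rng.normal(size=(50, 128))
query = db[10:20] + 0.05 * rng.normal(size=(10, 128))   # noisy revisit of frames 10..19
print(sequence_match(query, db))                         # expect ~[10, 11, ..., 19]
```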

A General Framework for Information Extraction using Dynamic Span Graphs

Title A General Framework for Information Extraction using Dynamic Span Graphs
Authors Yi Luan, Dave Wadden, Luheng He, Amy Shah, Mari Ostendorf, Hannaneh Hajishirzi
Abstract We introduce a general framework for several information extraction tasks that share span representations using dynamically constructed span graphs. The graphs are constructed by selecting the most confident entity spans and linking these nodes with confidence-weighted relation types and coreferences. The dynamic span graph allows coreference and relation type confidences to propagate through the graph to iteratively refine the span representations. This is unlike previous multi-task frameworks for information extraction in which the only interaction between tasks is in the shared first-layer LSTM. Our framework significantly outperforms the state-of-the-art on multiple information extraction tasks across multiple datasets reflecting different domains. We further observe that the span enumeration approach is good at detecting nested span entities, with significant F1 score improvement on the ACE dataset.
Tasks Joint Entity and Relation Extraction, Relation Extraction
Published 2019-04-05
URL http://arxiv.org/abs/1904.03296v1
PDF http://arxiv.org/pdf/1904.03296v1.pdf
PWC https://paperswithcode.com/paper/a-general-framework-for-information
Repo https://github.com/luanyi/DyGIE
Framework tf
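
As a rough illustration of the dynamic span-graph idea, the sketch below (shapes and module names are assumed; this is not the DyGIE implementation) performs one refinement round: relation confidences are computed between every pair of selected spans, turned into confidence-weighted messages, and gated back into the span representations.

```python
import torch
import torch.nn as nn

class SpanGraphRefiner(nn.Module):
    def __init__(self, span_dim, num_relations):
        super().__init__()
        self.rel_scorer = nn.Bilinear(span_dim, span_dim, num_relations)  # relation confidences
        self.rel_embed = nn.Linear(num_relations, span_dim)               # confidences -> message weights
        self.gate = nn.Linear(2 * span_dim, span_dim)

    def forward(self, spans):
        # spans: (K, D) representations of the K most confident entity spans
        K, D = spans.shape
        left = spans.unsqueeze(1).expand(K, K, D).contiguous()
        right = spans.unsqueeze(0).expand(K, K, D).contiguous()
        rel_conf = torch.softmax(self.rel_scorer(left, right), dim=-1)     # (K, K, R)
        # message to span i: confidence-weighted sum over the other spans j
        msgs = (self.rel_embed(rel_conf) * right).sum(dim=1)               # (K, D)
        g = torch.sigmoid(self.gate(torch.cat([spans, msgs], dim=-1)))
        return g * spans + (1 - g) * msgs                                  # refined spans

refiner = SpanGraphRefiner(span_dim=64, num_relations=5)
print(refiner(torch.randn(8, 64)).shape)   # torch.Size([8, 64])
```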

Multi-Agent Tensor Fusion for Contextual Trajectory Prediction

Title Multi-Agent Tensor Fusion for Contextual Trajectory Prediction
Authors Tianyang Zhao, Yifei Xu, Mathew Monfort, Wongun Choi, Chris Baker, Yibiao Zhao, Yizhou Wang, Ying Nian Wu
Abstract Accurate prediction of others’ trajectories is essential for autonomous driving. Trajectory prediction is challenging because it requires reasoning about agents’ past movements, social interactions among varying numbers and kinds of agents, constraints from the scene context, and the stochasticity of human behavior. Our approach models these interactions and constraints jointly within a novel Multi-Agent Tensor Fusion (MATF) network. Specifically, the model encodes multiple agents’ past trajectories and the scene context into a Multi-Agent Tensor, then applies convolutional fusion to capture multi-agent interactions while retaining the spatial structure of agents and the scene context. The model then recurrently decodes multiple agents’ future trajectories, using an adversarial loss to learn stochastic predictions. Experiments on both highway driving and pedestrian crowd datasets show that the model achieves state-of-the-art prediction accuracy.
Tasks Autonomous Driving, Trajectory Prediction
Published 2019-04-09
URL https://arxiv.org/abs/1904.04776v2
PDF https://arxiv.org/pdf/1904.04776v2.pdf
PWC https://paperswithcode.com/paper/multi-agent-tensor-fusion-for-contextual
Repo https://github.com/programmingLearner/MATF-architecture-details
Framework pytorch
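
A simplified sketch of the fusion step may help. The code below (dimensions and the scatter rule are assumptions, not the released architecture) encodes each agent's past trajectory, scatters the encodings into a spatial grid at the agents' cells, concatenates the grid with scene features, fuses them with convolutions, and reads the fused features back out at the same cells; if two agents share a cell, this toy simply overwrites rather than pooling.

```python
import torch
import torch.nn as nn

class ToyMATF(nn.Module):
    def __init__(self, agent_dim=32, scene_dim=16, grid=32):
        super().__init__()
        self.grid = grid
        self.traj_enc = nn.LSTM(input_size=2, hidden_size=agent_dim, batch_first=True)
        self.fuse = nn.Sequential(
            nn.Conv2d(agent_dim + scene_dim, agent_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(agent_dim, agent_dim, 3, padding=1))

    def forward(self, trajs, cells, scene):
        # trajs: (A, T, 2) past xy positions, cells: (A, 2) grid cell of each agent,
        # scene: (scene_dim, H, W) encoded context with H = W = grid
        _, (h, _) = self.traj_enc(trajs)
        agent_feat = h[-1]                                    # (A, agent_dim)
        A, D = agent_feat.shape
        tensor = scene.new_zeros(D, self.grid, self.grid)
        tensor[:, cells[:, 1], cells[:, 0]] = agent_feat.t()  # scatter agents into the grid
        fused = self.fuse(torch.cat([tensor, scene], 0).unsqueeze(0)).squeeze(0)
        return fused[:, cells[:, 1], cells[:, 0]].t()         # (A, agent_dim) fused agent features

model = ToyMATF()
out = model(torch.randn(5, 8, 2), torch.randint(0, 32, (5, 2)), torch.randn(16, 32, 32))
print(out.shape)   # torch.Size([5, 32])
```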

Ouroboros: On Accelerating Training of Transformer-Based Language Models

Title Ouroboros: On Accelerating Training of Transformer-Based Language Models
Authors Qian Yang, Zhouyuan Huo, Wenlin Wang, Heng Huang, Lawrence Carin
Abstract Language models are essential for natural language processing (NLP) tasks, such as machine translation and text summarization. Remarkable performance has been demonstrated recently across many NLP domains via a Transformer-based language model with over a billion parameters, verifying the benefits of model size. Model parallelism is required if a model is too large to fit in a single computing device. Current methods for model parallelism either suffer from backward locking in backpropagation or are not applicable to language models. We propose the first model-parallel algorithm that speeds up the training of Transformer-based language models. We also prove that our proposed algorithm is guaranteed to converge to critical points for non-convex problems. Extensive experiments on Transformer and Transformer-XL language models demonstrate that the proposed algorithm achieves a substantial further speedup beyond data parallelism, with comparable or better accuracy. Code to reproduce the experiments can be found at https://github.com/LaraQianYang/Ouroboros.
Tasks Language Modelling, Machine Translation, Text Summarization
Published 2019-09-14
URL https://arxiv.org/abs/1909.06695v1
PDF https://arxiv.org/pdf/1909.06695v1.pdf
PWC https://paperswithcode.com/paper/ouroboros-on-accelerating-training-of
Repo https://github.com/LaraQianYang/Ouroboros
Framework none

Sentence Embedding Alignment for Lifelong Relation Extraction

Title Sentence Embedding Alignment for Lifelong Relation Extraction
Authors Hong Wang, Wenhan Xiong, Mo Yu, Xiaoxiao Guo, Shiyu Chang, William Yang Wang
Abstract Conventional approaches to relation extraction usually require a fixed set of pre-defined relations. Such a requirement is hard to meet in many real applications, especially when new data and relations are emerging incessantly and it is computationally expensive to store all data and re-train the whole model every time new data and relations come in. We formulate such a challenging problem as lifelong relation extraction and investigate memory-efficient incremental learning methods without catastrophically forgetting knowledge learned from previous tasks. We first investigate a modified version of the stochastic gradient methods with a replay memory, which surprisingly outperforms recent state-of-the-art lifelong learning methods. We further propose to improve this approach to alleviate the forgetting problem by anchoring the sentence embedding space. Specifically, we utilize an explicit alignment model to mitigate the sentence embedding distortion of the learned model when training on new data and new relations. Experimental results on multiple benchmarks show that our proposed method significantly outperforms the state-of-the-art lifelong learning approaches.
Tasks Relation Extraction, Sentence Embedding
Published 2019-03-06
URL http://arxiv.org/abs/1903.02588v3
PDF http://arxiv.org/pdf/1903.02588v3.pdf
PWC https://paperswithcode.com/paper/sentence-embedding-alignment-for-lifelong
Repo https://github.com/hongwang600/Lifelong_Relation_Detection
Framework pytorch
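
The two ingredients named in the abstract, replay from an episodic memory and a term that limits sentence-embedding distortion, can be sketched in a few lines. This is a simplification under assumed interfaces (encoder.embed and encoder.classify are hypothetical methods): the paper learns an explicit alignment model, whereas the sketch simply anchors the current memory embeddings to their pre-update values with an MSE penalty.

```python
import torch
import torch.nn.functional as F

def lifelong_step(encoder, optimizer, new_batch, memory, old_memory_emb, align_weight=1.0):
    """new_batch / memory: (sentences, labels) tensors; old_memory_emb: embeddings
    of the memory sentences computed *before* training on the new relations."""
    sents = torch.cat([new_batch[0], memory[0]], dim=0)     # replay memory alongside new data
    labels = torch.cat([new_batch[1], memory[1]], dim=0)
    emb = encoder.embed(sents)                               # (B, D) sentence embeddings (assumed method)
    task_loss = F.cross_entropy(encoder.classify(emb), labels)

    # anchoring term: current embeddings of memory sentences should stay close
    # to their pre-update positions, mitigating embedding-space distortion
    mem_emb = encoder.embed(memory[0])
    align_loss = F.mse_loss(mem_emb, old_memory_emb.detach())

    loss = task_loss + align_weight * align_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```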

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

Title Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
Authors Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein
Abstract Recent state-of-the-art natural language understanding models, such as BERT and XLNet, score a pair of sentences (A and B) using multiple cross-attention operations - a process in which each word in sentence A attends to all words in sentence B and vice versa. As a result, computing the similarity between a query sentence and a set of candidate sentences requires the propagation of all query-candidate sentence-pairs throughout a stack of cross-attention layers. This exhaustive process becomes computationally prohibitive when the number of candidate sentences is large. In contrast, sentence embedding techniques learn a sentence-to-vector mapping and compute the similarity between the sentence vectors via simple elementary operations. In this paper, we introduce Distilled Sentence Embedding (DSE) - a model that is based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks. The outline of DSE is as follows: Given a cross-attentive teacher model (e.g. a fine-tuned BERT), we train a sentence embedding based student model to reconstruct the sentence-pair scores obtained by the teacher model. We empirically demonstrate the effectiveness of DSE on five GLUE sentence-pair tasks. DSE significantly outperforms several ELMo variants and other sentence embedding methods, while accelerating computation of query-candidate sentence-pair similarities by several orders of magnitude, with an average relative degradation of 4.6% compared to BERT. Furthermore, we show that DSE produces sentence embeddings that reach state-of-the-art performance on universal sentence representation benchmarks. Our code is made publicly available at https://github.com/microsoft/Distilled-Sentence-Embedding.
Tasks Semantic Similarity, Sentence Embedding, Sentence Embeddings, Sentence Pair Modeling
Published 2019-08-14
URL https://arxiv.org/abs/1908.05161v3
PDF https://arxiv.org/pdf/1908.05161v3.pdf
PWC https://paperswithcode.com/paper/scalable-attentive-sentence-pair-modeling-via
Repo https://github.com/microsoft/Distilled-Sentence-Embedding
Framework pytorch
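
A minimal sketch of the DSE training signal (my own toy under assumed interfaces, not the Microsoft release): the student's two sentence vectors are combined with cheap elementary operations and a small MLP, and the resulting pair score is regressed onto the score produced by the fine-tuned cross-attention teacher. The student.encode method and the MSE objective are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairScorer(nn.Module):
    """Scores a sentence pair from its two embeddings alone (no cross-attention)."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, a, b):
        feats = torch.cat([a, b, torch.abs(a - b), a * b], dim=-1)
        return self.mlp(feats).squeeze(-1)

def distill_step(student, scorer, optimizer, sents_a, sents_b, teacher_scores):
    # student.encode is an assumed method returning (B, D) sentence embeddings
    za, zb = student.encode(sents_a), student.encode(sents_b)
    pred = scorer(za, zb)
    loss = F.mse_loss(pred, teacher_scores)       # reconstruct the teacher's pair scores
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```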

C3AE: Exploring the Limits of Compact Model for Age Estimation

Title C3AE: Exploring the Limits of Compact Model for Age Estimation
Authors Chao Zhang, Shuaicheng Liu, Xun Xu, Ce Zhu
Abstract Age estimation is a classic learning problem in computer vision. Many larger and deeper CNNs have been proposed with promising performance, such as AlexNet, VggNet, GoogLeNet and ResNet. However, these models are not practical for embedded/mobile devices. Recently, MobileNets and ShuffleNets have been proposed to reduce the number of parameters, yielding lightweight models. However, their representation ability is weakened by the adoption of depth-wise separable convolutions. In this work, we investigate the limits of compact models for small-scale images and propose an extremely Compact yet efficient Cascade Context-based Age Estimation model (C3AE). This model possesses only 1/9 and 1/2000 of the parameters of MobileNets/ShuffleNets and VggNet, respectively, while achieving competitive performance. In particular, we re-define the age estimation problem via a two-points representation, which is implemented by a cascade model. Moreover, to fully utilize facial context information, a multi-branch CNN is proposed to aggregate multi-scale context. Experiments are carried out on three age estimation datasets. State-of-the-art performance among compact models is achieved by a relatively large margin.
Tasks Age Estimation
Published 2019-04-10
URL http://arxiv.org/abs/1904.05059v2
PDF http://arxiv.org/pdf/1904.05059v2.pdf
PWC https://paperswithcode.com/paper/c3ae-exploring-the-limits-of-compact-model
Repo https://github.com/StevenBanama/C3AE
Framework tf
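
The "two-points representation" mentioned in the abstract can be made concrete with a short sketch (an illustration, not the repository code): an age is encoded as a convex combination of the two adjacent anchor ages that bracket it, and decoded back as the expectation over anchors. The anchor spacing below is an arbitrary choice.

```python
import numpy as np

def encode_two_points(age, anchors):
    """anchors: increasing 1-D array of anchor ages (e.g. 0, 10, ..., 100).
    Returns a distribution over anchors with exactly two non-zero entries."""
    anchors = np.asarray(anchors, dtype=float)
    hi = np.clip(np.searchsorted(anchors, age, side="right"), 1, len(anchors) - 1)
    lo = hi - 1
    w_hi = (age - anchors[lo]) / (anchors[hi] - anchors[lo])
    dist = np.zeros(len(anchors))
    dist[lo], dist[hi] = 1.0 - w_hi, w_hi
    return dist

def decode_age(dist, anchors):
    return float(np.dot(dist, anchors))            # expectation over anchor ages

anchors = np.arange(0, 101, 10)
d = encode_two_points(37, anchors)
print(d)                       # weight 0.3 on anchor 30 and 0.7 on anchor 40
print(decode_age(d, anchors))  # 37.0
```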

ET-Net: A Generic Edge-aTtention Guidance Network for Medical Image Segmentation

Title ET-Net: A Generic Edge-aTtention Guidance Network for Medical Image Segmentation
Authors Zhijie Zhang, Huazhu Fu, Hang Dai, Jianbing Shen, Yanwei Pang, Ling Shao
Abstract Segmentation is a fundamental task in medical image analysis. However, most existing methods focus on primary region extraction and ignore edge information, which is useful for obtaining accurate segmentation. In this paper, we propose a generic medical segmentation method, called Edge-aTtention guidance Network (ET-Net), which embeds edge-attention representations to guide the segmentation network. Specifically, an edge guidance module is utilized to learn the edge-attention representations in the early encoding layers, which are then transferred to the multi-scale decoding layers and fused using a weighted aggregation module. The experimental results on four segmentation tasks (i.e., optic disc/cup and vessel segmentation in retinal images, and lung segmentation in chest X-Ray and CT images) demonstrate that preserving edge-attention representations contributes to the final segmentation accuracy, and our proposed method outperforms current state-of-the-art segmentation methods. The source code of our method is available at https://github.com/ZzzJzzZ/ETNet.
Tasks Medical Image Segmentation, Semantic Segmentation
Published 2019-07-25
URL https://arxiv.org/abs/1907.10936v1
PDF https://arxiv.org/pdf/1907.10936v1.pdf
PWC https://paperswithcode.com/paper/et-net-a-generic-edge-attention-guidance
Repo https://github.com/ZzzJzzZ/ETNet
Framework none
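
As an illustration of a weighted aggregation step of this kind (channel sizes and the softmax fusion weights are my assumptions, not the official ETNet code), the sketch below projects multi-scale decoder features to a common width, upsamples them to the finest resolution, and fuses them with learned per-scale weights before the segmentation head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedAggregation(nn.Module):
    def __init__(self, in_channels, mid=64, num_classes=2):
        super().__init__()
        self.proj = nn.ModuleList(nn.Conv2d(c, mid, 1) for c in in_channels)
        self.scale_logits = nn.Parameter(torch.zeros(len(in_channels)))  # learned fusion weights
        self.head = nn.Conv2d(mid, num_classes, 1)

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) decoder features, coarse to fine
        target = feats[-1].shape[-2:]
        ups = [F.interpolate(p(f), size=target, mode="bilinear", align_corners=False)
               for p, f in zip(self.proj, feats)]
        w = torch.softmax(self.scale_logits, dim=0)
        fused = sum(wi * u for wi, u in zip(w, ups))
        return self.head(fused)                      # (B, num_classes, H, W) logits

agg = WeightedAggregation(in_channels=[256, 128, 64])
feats = [torch.randn(1, 256, 16, 16), torch.randn(1, 128, 32, 32), torch.randn(1, 64, 64, 64)]
print(agg(feats).shape)   # torch.Size([1, 2, 64, 64])
```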

Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning

Title Representation Similarity Analysis for Efficient Task taxonomy & Transfer Learning
Authors Kshitij Dwivedi, Gemma Roig
Abstract Transfer learning is widely used in deep neural network models when there are few labeled examples available. The common approach is to take a pre-trained network in a similar task and finetune the model parameters. This is usually done blindly without a pre-selection from a set of pre-trained models, or by finetuning a set of models trained on different tasks and selecting the best performing one by cross-validation. We address this problem by proposing an approach to assess the relationship between visual tasks and their task-specific models. Our method uses Representation Similarity Analysis (RSA), which is commonly used to find a correlation between neuronal responses from brain data and models. With RSA we obtain a similarity score among tasks by computing correlations between models trained on different tasks. Our method is efficient as it requires only pre-trained models and a few images, with no further training. We demonstrate the effectiveness and efficiency of our method for generating a task taxonomy on the Taskonomy dataset. We next evaluate the relationship of RSA with the transfer learning performance on Taskonomy tasks and a new task: Pascal VOC semantic segmentation. Our results reveal that models trained on tasks with higher similarity scores show higher transfer learning performance. Surprisingly, the best transfer learning result for Pascal VOC semantic segmentation is not obtained from the model pre-trained on semantic segmentation, probably due to domain differences, and our method successfully selects the high-performing models.
Tasks Semantic Segmentation, Transfer Learning
Published 2019-04-26
URL http://arxiv.org/abs/1904.11740v1
PDF http://arxiv.org/pdf/1904.11740v1.pdf
PWC https://paperswithcode.com/paper/representation-similarity-analysis-for
Repo https://github.com/kshitijd20/RSA-CVPR19-release
Framework tf
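
The RSA procedure itself is compact enough to sketch directly (a toy illustration, not the release): build a representation dissimilarity matrix (RDM) per model from features of the same images, then score task similarity as the Spearman correlation between the upper triangles of the two RDMs.

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(features):
    """features: (n_images, dim) activations of one pre-trained model.
    Returns the (n_images, n_images) correlation-distance RDM."""
    f = features - features.mean(axis=1, keepdims=True)
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
    return 1.0 - f @ f.T                      # 1 - Pearson correlation per image pair

def rsa_score(feat_a, feat_b):
    iu = np.triu_indices(feat_a.shape[0], k=1)
    rho, _ = spearmanr(rdm(feat_a)[iu], rdm(feat_b)[iu])
    return rho

# toy usage: the second "task" is a linear transform of the first, so its RDM is related
rng = np.random.default_rng(0)
feat_task1 = rng.normal(size=(20, 256))
feat_task2 = feat_task1 @ rng.normal(size=(256, 256))
print(round(rsa_score(feat_task1, feat_task2), 3))
```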

Unsupervised Pre-Training of Image Features on Non-Curated Data

Title Unsupervised Pre-Training of Image Features on Non-Curated Data
Authors Mathilde Caron, Piotr Bojanowski, Julien Mairal, Armand Joulin
Abstract Pre-training general-purpose visual features with convolutional neural networks without relying on annotations is a challenging and important task. Most recent efforts in unsupervised feature learning have focused on either small or highly curated datasets like ImageNet, whereas using uncurated raw datasets was found to decrease the feature quality when evaluated on a transfer task. Our goal is to bridge the performance gap between unsupervised methods trained on curated data, which are costly to obtain, and massive raw datasets that are easily available. To that effect, we propose a new unsupervised approach which leverages self-supervision and clustering to capture complementary statistics from large-scale data. We validate our approach on 96 million images from YFCC100M, achieving state-of-the-art results among unsupervised methods on standard benchmarks, which confirms the potential of unsupervised learning when only uncurated data are available. We also show that pre-training a supervised VGG-16 with our method achieves 74.9% top-1 classification accuracy on the validation set of ImageNet, which is an improvement of +0.8% over the same network trained from scratch. Our code is available at https://github.com/facebookresearch/DeeperCluster.
Tasks
Published 2019-05-03
URL https://arxiv.org/abs/1905.01278v3
PDF https://arxiv.org/pdf/1905.01278v3.pdf
PWC https://paperswithcode.com/paper/leveraging-large-scale-uncurated-data-for
Repo https://github.com/facebookresearch/deepcluster
Framework pytorch

Gender-preserving Debiasing for Pre-trained Word Embeddings

Title Gender-preserving Debiasing for Pre-trained Word Embeddings
Authors Masahiro Kaneko, Danushka Bollegala
Abstract Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the down-stream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. Specifically, we consider four types of information: \emph{feminine}, \emph{masculine}, \emph{gender-neutral} and \emph{stereotypical}, which represent the relationship between gender vs. bias, and propose a debiasing method that (a) preserves the gender-related information in feminine and masculine words, (b) preserves the neutrality in gender-neutral words, and (c) removes the biases from stereotypical words. Experimental results on several previously proposed benchmark datasets show that our proposed method can debias pre-trained word embeddings better than existing SoTA methods proposed for debiasing word embeddings while preserving gender-related but non-discriminative information.
Tasks Word Embeddings
Published 2019-06-03
URL https://arxiv.org/abs/1906.00742v1
PDF https://arxiv.org/pdf/1906.00742v1.pdf
PWC https://paperswithcode.com/paper/190600742
Repo https://github.com/kanekomasahiro/gp_debias
Framework pytorch

BERTSel: Answer Selection with Pre-trained Models

Title BERTSel: Answer Selection with Pre-trained Models
Authors Dongfang Li, Yifei Yu, Qingcai Chen, Xinyu Li
Abstract Recently, pre-trained models have been the dominant paradigm in natural language processing. They have achieved remarkable state-of-the-art performance across a wide range of related tasks, such as textual entailment, natural language inference, question answering, etc. BERT, proposed by Devlin et al., has achieved a remarkable result on the GLUE leaderboard with a deep Transformer architecture. Despite its soaring popularity, however, BERT has not yet been applied to answer selection. This task is different from others in a few nuances: first, modeling the relevance and correctness of candidates matters compared to semantic relatedness and syntactic structure; second, the length of an answer may differ from that of other candidates and questions. In this paper, we are the first to explore the performance of fine-tuning BERT for answer selection. We achieve state-of-the-art results across five popular datasets, demonstrating the success of pre-trained models in this task.
Tasks Answer Selection, Natural Language Inference, Question Answering
Published 2019-05-18
URL https://arxiv.org/abs/1905.07588v1
PDF https://arxiv.org/pdf/1905.07588v1.pdf
PWC https://paperswithcode.com/paper/bertsel-answer-selection-with-pre-trained
Repo https://github.com/BPYap/BERTSel
Framework pytorch
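
The core recipe, scoring each (question, candidate answer) pair with BERT as a sentence-pair classifier and ranking by the positive-class probability, can be sketched with the Hugging Face transformers library. This is my own minimal example, not the BERTSel repository; with an untrained classification head the ranking is meaningless until the model has been fine-tuned on answer-selection data.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()

question = "Who wrote the novel 1984?"
candidates = ["George Orwell wrote the dystopian novel 1984.",
              "The novel was published in June 1949.",
              "Aldous Huxley wrote Brave New World."]

# encode every (question, candidate) pair jointly so BERT can cross-attend
enc = tokenizer([question] * len(candidates), candidates,
                padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits                      # (num_candidates, 2)
scores = torch.softmax(logits, dim=-1)[:, 1]          # P(candidate answers the question)
print(candidates[int(scores.argmax())])               # arbitrary until fine-tuned
```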

Zero-Shot Grounding of Objects from Natural Language Queries

Title Zero-Shot Grounding of Objects from Natural Language Queries
Authors Arka Sadhu, Kan Chen, Ram Nevatia
Abstract A phrase grounding system localizes a particular object in an image referred to by a natural language query. In previous work, the phrases were restricted to nouns that were encountered in training; we extend the task to Zero-Shot Grounding (ZSG), which can include novel, “unseen” nouns. Current phrase grounding systems use an explicit object detection network in a 2-stage framework where one stage generates sparse proposals and the other stage evaluates them. In the ZSG setting, generating appropriate proposals itself becomes an obstacle, as the proposal generator is trained on the entities common to the detection and grounding datasets. We propose a new single-stage model called ZSGNet which combines the detector network and the grounding system and predicts classification scores and regression parameters. Evaluation of a ZSG system brings additional subtleties due to the influence of the relationship between the query and learned categories; we define four distinct conditions that incorporate different levels of difficulty. We also introduce new datasets, sub-sampled from Flickr30k Entities and Visual Genome, that enable evaluations for the four conditions. Our experiments show that ZSGNet achieves state-of-the-art performance on Flickr30k and ReferIt under the usual “seen” settings and performs significantly better than the baseline in the zero-shot setting.
Tasks Object Detection, Phrase Grounding
Published 2019-08-20
URL https://arxiv.org/abs/1908.07129v1
PDF https://arxiv.org/pdf/1908.07129v1.pdf
PWC https://paperswithcode.com/paper/zero-shot-grounding-of-objects-from-natural
Repo https://github.com/TheShadow29/zsgnet-pytorch
Framework pytorch
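
A simplified sketch of the single-stage grounding head (feature sizes are assumed; this is not the released ZSGNet code): the query embedding is broadcast over the image feature map, and every spatial location predicts a grounding score plus box regression parameters, removing the need for an external proposal stage.

```python
import torch
import torch.nn as nn

class GroundingHead(nn.Module):
    def __init__(self, img_dim=256, lang_dim=300, hidden=256):
        super().__init__()
        self.fuse = nn.Conv2d(img_dim + lang_dim, hidden, 1)
        self.score = nn.Conv2d(hidden, 1, 1)     # grounding/classification score per cell
        self.box = nn.Conv2d(hidden, 4, 1)       # box regression parameters per cell

    def forward(self, img_feat, query_emb):
        # img_feat: (B, img_dim, H, W); query_emb: (B, lang_dim), e.g. an encoded phrase
        B, _, H, W = img_feat.shape
        q = query_emb[:, :, None, None].expand(B, query_emb.shape[1], H, W)
        x = torch.relu(self.fuse(torch.cat([img_feat, q], dim=1)))
        return self.score(x).squeeze(1), self.box(x)   # (B, H, W), (B, 4, H, W)

head = GroundingHead()
scores, boxes = head(torch.randn(2, 256, 7, 7), torch.randn(2, 300))
print(scores.shape, boxes.shape)   # torch.Size([2, 7, 7]) torch.Size([2, 4, 7, 7])
```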

An Experimental-based Review of Image Enhancement and Image Restoration Methods for Underwater Imaging

Title An Experimental-based Review of Image Enhancement and Image Restoration Methods for Underwater Imaging
Authors Yan Wang, Wei Song, Giancarlo Fortino, Lizhe Qi, Wenqiang Zhang, Antonio Liotta
Abstract Underwater images play a key role in ocean exploration, but often suffer from severe quality degradation due to light absorption and scattering in the water medium. Although major breakthroughs have been made recently in the general area of image enhancement and restoration, the applicability of new methods for improving the quality of underwater images has not specifically been captured. In this paper, we review the image enhancement and restoration methods that tackle typical underwater image impairments, including some extreme degradations and distortions. Firstly, we introduce the key causes of quality reduction in underwater images, in terms of the underwater image formation model (IFM). Then, we review underwater restoration methods, considering both the IFM-free and the IFM-based approaches. Next, we present an experimental-based comparative evaluation of state-of-the-art IFM-free and IFM-based methods, considering also the prior-based parameter estimation algorithms of the IFM-based methods, using both subjective and objective analysis (the code used is freely available at https://github.com/wangyanckxx/Single-Underwater-Image-Enhancement-and-Color-Restoration). Starting from this study, we pinpoint the key shortcomings of existing methods, drawing recommendations for future research in this area. Our review of underwater image enhancement and restoration provides researchers with the necessary background to appreciate challenges and opportunities in this important field.
Tasks Image Enhancement, Image Restoration
Published 2019-07-07
URL https://arxiv.org/abs/1907.03246v1
PDF https://arxiv.org/pdf/1907.03246v1.pdf
PWC https://paperswithcode.com/paper/an-experimental-based-review-of-image
Repo https://github.com/wangyanckxx/Single-Underwater-Image-Enhancement-and-Color-Restoration
Framework none
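
The image formation model (IFM) that organizes the review can be written out directly. The sketch below states the model I_c(x) = J_c(x) t_c(x) + B_c (1 - t_c(x)) and inverts it given estimates of the transmission t and background light B, which is what the surveyed IFM-based methods attempt to estimate from priors; the transmission and background values in the toy usage are fabricated for illustration.

```python
import numpy as np

def restore_ifm(image, transmission, background, t_min=0.1):
    """Invert the underwater image formation model.
    image: (H, W, 3) observed image in [0, 1]; transmission: (H, W) or (H, W, 3)
    estimated transmission map; background: (3,) back-scattered light."""
    t = np.clip(np.atleast_3d(transmission), t_min, 1.0)
    restored = (image - background) / t + background
    return np.clip(restored, 0.0, 1.0)

# toy usage: synthesize a degraded image from the model, then invert it exactly
rng = np.random.default_rng(0)
scene = rng.uniform(0.0, 1.0, size=(4, 4, 3))
t = np.full((4, 4, 1), 0.6)
B = np.array([0.05, 0.25, 0.35])                 # water tends to retain blue/green light
degraded = scene * t + B * (1 - t)               # forward model
print(np.allclose(restore_ifm(degraded, t, B), scene))   # True
```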