February 1, 2020

Paper Group AWR 359

ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 “Heidelberg”

Title ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 “Heidelberg”
Authors Erich Schubert, Arthur Zimek
Abstract This paper documents the release of the ELKI data mining framework, version 0.7.5. ELKI is an open source (AGPLv3) data mining software written in Java. The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection. In order to achieve high performance and scalability, ELKI offers data index structures such as the R*-tree that can provide major performance gains. ELKI is designed to be easy to extend for researchers and students in this domain, and welcomes contributions of additional methods. ELKI aims at providing a large collection of highly parameterizable algorithms, in order to allow easy and fair evaluation and benchmarking of algorithms. We will first outline the motivation for this release and the plans for the future, and then give a brief overview of the new functionality in this version. We also include an appendix presenting an overview of the overall implemented functionality.
Tasks Outlier Detection
Published 2019-02-10
URL http://arxiv.org/abs/1902.03616v1
PDF http://arxiv.org/pdf/1902.03616v1.pdf
PWC https://paperswithcode.com/paper/elki-a-large-open-source-library-for-data
Repo https://github.com/elki-project/elki
Framework none

Referring Expression Object Segmentation with Caption-Aware Consistency

Title Referring Expression Object Segmentation with Caption-Aware Consistency
Authors Yi-Wen Chen, Yi-Hsuan Tsai, Tiantian Wang, Yen-Yu Lin, Ming-Hsuan Yang
Abstract Referring expressions are natural language descriptions that identify a particular object within a scene and are widely used in our daily conversations. In this work, we focus on segmenting the object in an image specified by a referring expression. To this end, we propose an end-to-end trainable comprehension network that consists of the language and visual encoders to extract feature representations from both domains. We introduce the spatial-aware dynamic filters to transfer knowledge from text to image, and effectively capture the spatial information of the specified object. To better communicate between the language and visual modules, we employ a caption generation network that takes features shared across both domains as input, and improves both representations via a consistency that enforces the generated sentence to be similar to the given referring expression. We evaluate the proposed framework on two referring expression datasets and show that our method performs favorably against the state-of-the-art algorithms.
Tasks Semantic Segmentation
Published 2019-10-10
URL https://arxiv.org/abs/1910.04748v1
PDF https://arxiv.org/pdf/1910.04748v1.pdf
PWC https://paperswithcode.com/paper/referring-expression-object-segmentation-with
Repo https://github.com/wenz116/lang2seg
Framework pytorch
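
For intuition about the dynamic-filter idea in the abstract, here is a minimal, hypothetical PyTorch sketch: a sentence embedding is turned into a per-example 1x1 filter that is correlated with the visual feature map to produce a coarse response for the referred object. The module names, dimensions, and single-filter simplification are assumptions for illustration, not the paper’s architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterHead(nn.Module):
    """Illustrative sketch: predict a 1x1 conv filter from a sentence embedding
    and apply it to visual features (names and sizes are assumptions)."""
    def __init__(self, text_dim=1024, vis_dim=512):
        super().__init__()
        self.filter_gen = nn.Linear(text_dim, vis_dim)   # one filter per sentence

    def forward(self, vis_feat, text_emb):
        # vis_feat: (B, C, H, W); text_emb: (B, T)
        filters = F.normalize(self.filter_gen(text_emb), dim=1)        # (B, C)
        # correlate each image's features with its sentence-specific filter
        response = torch.einsum('bchw,bc->bhw', vis_feat, filters)
        return response.unsqueeze(1)   # coarse response map for the referred object

mask_logits = DynamicFilterHead()(torch.randn(2, 512, 40, 40), torch.randn(2, 1024))
```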

How Can We Know What Language Models Know?

Title How Can We Know What Language Models Know?
Authors Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
Abstract Recent work has presented intriguing results examining the knowledge contained in language models (LM) by having the LM fill in the blanks of prompts such as “Obama is a _ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a _” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts and ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 38.1%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
Tasks
Published 2019-11-28
URL https://arxiv.org/abs/1911.12543v1
PDF https://arxiv.org/pdf/1911.12543v1.pdf
PWC https://paperswithcode.com/paper/how-can-we-know-what-language-models-know
Repo https://github.com/jzbjyb/LPAQA
Framework none
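
The core querying idea, filling the same relational slot with several prompts and ensembling the answers, can be sketched with an off-the-shelf masked LM. The prompts below are hand-written stand-ins for the mined and paraphrased prompts released in LPAQA, and plain score averaging is only one simple ensembling choice.

```python
from collections import defaultdict
from transformers import pipeline

# Illustrative prompts only; LPAQA mines and paraphrases prompts automatically.
prompts = [
    "Obama is a [MASK] by profession.",
    "Obama worked as a [MASK].",
    "Obama's profession is [MASK].",
]

fill = pipeline("fill-mask", model="bert-base-cased")
scores = defaultdict(float)
for prompt in prompts:
    for cand in fill(prompt, top_k=10):
        scores[cand["token_str"]] += cand["score"] / len(prompts)  # average across prompts

print(max(scores, key=scores.get))  # ensembled guess for the profession slot
```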

Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images

Title Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images
Authors Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg
Abstract We propose a network for semantic mapping called the Dense Dilated Convolutions Merging Network (DDCM-Net) to provide a deep learning approach that can recognize multi-scale and complex shaped objects with similar color and textures, such as buildings, surfaces/roads, and trees in very high resolution remote sensing images. The proposed DDCM-Net consists of dense dilated convolutions merged with varying dilation rates. This can effectively enlarge the kernels’ receptive fields, and, more importantly, obtain fused local and global context information to promote surrounding discriminative capability. We demonstrate the effectiveness of the proposed DDCM-Net on the publicly available ISPRS Potsdam dataset and achieve a performance of 92.3% F1-score and 86.0% mean intersection over union accuracy by only using the RGB bands, without any post-processing. We also show results on the ISPRS Vaihingen dataset, where the DDCM-Net trained with IRRG bands, also obtained better mapping accuracy (89.8% F1-score) than previous state-of-the-art approaches.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11799v1
PDF https://arxiv.org/pdf/1908.11799v1.pdf
PWC https://paperswithcode.com/paper/dense-dilated-convolutions-merging-network
Repo https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch
Framework pytorch
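
A rough sketch of the dense dilated merging pattern described in the abstract: each 3x3 convolution uses a different dilation rate and sees the concatenation of the block input with all previous outputs, and a 1x1 convolution fuses the merged features. Channel sizes and dilation rates here are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class DDCMBlockSketch(nn.Module):
    """Dense dilated merging sketch: each dilated conv consumes the running
    concatenation of features; a 1x1 conv fuses the result."""
    def __init__(self, in_ch=64, growth=32, dilations=(1, 2, 3, 5)):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=d, dilation=d),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth                      # dense merging grows the channel count
        self.fuse = nn.Conv2d(ch, in_ch, 1)  # fuse local and enlarged-context features

    def forward(self, x):
        feats = x
        for conv in self.convs:
            feats = torch.cat([feats, conv(feats)], dim=1)
        return self.fuse(feats)

out = DDCMBlockSketch()(torch.randn(1, 64, 128, 128))  # -> (1, 64, 128, 128)
```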

Outlier Exposure with Confidence Control for Out-of-Distribution Detection

Title Outlier Exposure with Confidence Control for Out-of-Distribution Detection
Authors Aristotelis-Angelos Papadopoulos, Mohammad Reza Rajati, Nazim Shaikh, Jiamian Wang
Abstract Deep neural networks have achieved great success in classification tasks in recent years. However, one major obstacle on the path towards artificial intelligence is the inability of neural networks to accurately detect samples from novel class distributions; therefore, most existing classification algorithms assume that all classes are known prior to the training stage. In this work, we propose a methodology for training a neural network that allows it to efficiently detect out-of-distribution (OOD) examples without compromising much of its classification accuracy on test examples from known classes. Based on the Outlier Exposure (OE) technique, we propose a novel loss function, Outlier Exposure with Confidence Control (OECC), that achieves state-of-the-art results in out-of-distribution detection with OE on both image and text classification tasks without requiring access to OOD samples. Additionally, we experimentally show that the combination of OECC with the Mahalanobis distance-based classifier achieves state-of-the-art results in the OOD detection task.
Tasks Out-of-Distribution Detection, Text Classification
Published 2019-06-08
URL https://arxiv.org/abs/1906.03509v2
PDF https://arxiv.org/pdf/1906.03509v2.pdf
PWC https://paperswithcode.com/paper/simultaneous-classification-and-novelty
Repo https://github.com/nazim1021/OOD-detection-using-OECC
Framework pytorch
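
As background for the loss described above, here is a generic outlier-exposure-style objective: cross-entropy on in-distribution batches plus a term that pulls the softmax on exposed outliers toward the uniform distribution. The exact OECC regularizers differ from this simplification; treat it as a sketch of the training signal, not the paper's loss.

```python
import torch
import torch.nn.functional as F

def oe_style_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Sketch of outlier-exposure-style training, not the exact OECC objective:
    standard cross-entropy on in-distribution data plus an L1 penalty pushing the
    softmax on exposed outliers toward the uniform distribution."""
    ce = F.cross_entropy(logits_in, labels_in)
    probs_out = F.softmax(logits_out, dim=1)
    uniform = torch.full_like(probs_out, 1.0 / probs_out.size(1))
    oe_term = (probs_out - uniform).abs().sum(dim=1).mean()
    return ce + lam * oe_term
```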

FastContext: an efficient and scalable implementation of the ConText algorithm

Title FastContext: an efficient and scalable implementation of the ConText algorithm
Authors Jianlin Shi, John F. Hurdle
Abstract Objective: To develop and evaluate FastContext, an efficient, scalable implementation of the ConText algorithm suitable for very large-scale clinical natural language processing. Background: The ConText algorithm performs with state-of-the-art accuracy in detecting the experiencer, negation status, and temporality of concept mentions in clinical narratives. However, the speed limitation of its current implementations hinders its use in big data processing. Methods: We developed FastContext by hashing the ConText rules, then compared its speed and accuracy with JavaConText and GeneralConText, two widely used Java implementations. Results: FastContext ran two orders of magnitude faster, and its speed degraded less as the number of rules increased than that of the other two implementations used for comparison. Additionally, FastContext consistently gained accuracy as rules were added (the desired outcome of adding new rules), while the other two implementations did not. Conclusions: FastContext is an efficient, scalable implementation of the popular ConText algorithm, suitable for natural language applications on very large clinical corpora.
Tasks
Published 2019-04-30
URL http://arxiv.org/abs/1905.00079v1
PDF http://arxiv.org/pdf/1905.00079v1.pdf
PWC https://paperswithcode.com/paper/fastcontext-an-efficient-and-scalable
Repo https://github.com/jianlins/FastContext
Framework none
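
The speed-up comes from indexing rules so that matching does not scan every rule at every token. Below is a toy Python illustration of that hashing idea, with a made-up rule set and attributes, assuming rules are keyed by their first token.

```python
from collections import defaultdict

# Toy rules: (trigger phrase, attributes it sets). Invented for illustration,
# not taken from FastContext's rule files.
RULES = [
    (("no", "evidence", "of"), {"negated": True}),
    (("denies",), {"negated": True}),
    (("history", "of"), {"historical": True}),
]
rule_index = defaultdict(list)
for phrase, attrs in RULES:
    rule_index[phrase[0]].append((phrase, attrs))   # hash rules by first token

def match_rules(tokens):
    hits = []
    for i, tok in enumerate(tokens):
        for phrase, attrs in rule_index.get(tok, []):        # O(1) candidate lookup
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                hits.append((i, phrase, attrs))
    return hits

print(match_rules("patient denies fever and no evidence of pneumonia".split()))
```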

A Comparative Study between Bayesian and Frequentist Neural Networks for Remaining Useful Life Estimation in Condition-Based Maintenance

Title A Comparative Study between Bayesian and Frequentist Neural Networks for Remaining Useful Life Estimation in Condition-Based Maintenance
Authors Luca Della Libera
Abstract In the last decade, deep learning (DL) has outperformed model-based and statistical approaches in predicting the remaining useful life (RUL) of machinery in the context of condition-based maintenance. One of the major drawbacks of DL is that it heavily depends on a large amount of labeled data, which are typically expensive and time-consuming to obtain, especially in industrial applications. Scarce training data lead to uncertain estimates of the model’s parameters, which in turn result in poor prognostic performance. Quantifying this parameter uncertainty is important in order to determine how reliable the prediction is. Traditional DL techniques such as neural networks are incapable of capturing the uncertainty in the training data, thus they are overconfident about their estimates. On the contrary, Bayesian deep learning has recently emerged as a promising solution to account for uncertainty in the training process, achieving state-of-the-art performance in many classification and regression tasks. In this work Bayesian DL techniques such as Bayesian dense neural networks and Bayesian convolutional neural networks are applied to RUL estimation and compared to their frequentist counterparts from the literature. The effectiveness of the proposed models is verified on the popular C-MAPSS dataset. Furthermore, parameter uncertainty is quantified and used to gain additional insight into the data.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.06256v2
PDF https://arxiv.org/pdf/1911.06256v2.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-between-bayesian-and
Repo https://github.com/luca310795/bayesian-deep-rul
Framework pytorch
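
To illustrate why parameter uncertainty matters for RUL prediction, here is a small sketch using Monte Carlo dropout, a common lightweight approximation to Bayesian inference. It is only a stand-in for the Bayesian dense and convolutional networks studied in the paper; the architecture and feature dimension are arbitrary.

```python
import torch
import torch.nn as nn

class RULRegressor(nn.Module):
    """Small regression net with dropout kept active at prediction time (MC dropout)."""
    def __init__(self, n_features=24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1))

    def forward(self, x):
        return self.net(x)

model = RULRegressor()
model.train()                        # keep dropout on to sample multiple predictions
x = torch.randn(8, 24)               # one window of sensor features per engine
samples = torch.stack([model(x) for _ in range(100)])
rul_mean, rul_std = samples.mean(0), samples.std(0)   # prediction and its uncertainty
```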

Sentence Specified Dynamic Video Thumbnail Generation

Title Sentence Specified Dynamic Video Thumbnail Generation
Authors Yitian Yuan, Lin Ma, Wenwu Zhu
Abstract With the tremendous growth of videos over the Internet, video thumbnails, providing video content previews, are becoming increasingly crucial to influencing users’ online searching experiences. Conventional video thumbnails are generated once purely based on the visual characteristics of videos, and then displayed as requested. Hence, such video thumbnails, without considering the users’ searching intentions, cannot provide a meaningful snapshot of the video contents that concern users. In this paper, we define a distinctively new task, namely sentence specified dynamic video thumbnail generation, where the generated thumbnails not only provide a concise preview of the original video contents but also dynamically relate to the users’ searching intentions with semantic correspondences to the users’ query sentences. To tackle such a challenging task, we propose a novel graph convolved video thumbnail pointer (GTP). Specifically, GTP leverages a sentence specified video graph convolutional network to model both the sentence-video semantic interaction and the internal video relationships incorporated with the sentence information, based on which a temporal conditioned pointer network is then introduced to sequentially generate the sentence specified video thumbnails. Moreover, we annotate a new dataset based on ActivityNet Captions for the proposed new task, which consists of 10,000+ video-sentence pairs with each accompanied by an annotated sentence specified video thumbnail. We demonstrate that our proposed GTP outperforms several baseline methods on the created dataset, and thus believe that our initial results along with the release of the new dataset will inspire further research on sentence specified dynamic video thumbnail generation. Dataset and code are available at https://github.com/yytzsy/GTP.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.04052v2
PDF https://arxiv.org/pdf/1908.04052v2.pdf
PWC https://paperswithcode.com/paper/sentence-specified-dynamic-video-thumbnail
Repo https://github.com/yytzsy/GTP
Framework tf

Neural Networks for Relational Data

Title Neural Networks for Relational Data
Authors Navdeep Kaur, Gautam Kunapuli, Saket Joshi, Kristian Kersting, Sriraam Natarajan
Abstract While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1909.04723v3
PDF https://arxiv.org/pdf/1909.04723v3.pdf
PWC https://paperswithcode.com/paper/neural-networks-for-relational-data
Repo https://github.com/navdeepkjohal/NNRPT
Framework tf
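
One way to picture the relational random-walk features mentioned above: walk a small entity-relation graph and count which relation paths connect an entity pair, then feed those counts to a network. The toy graph, relations, and walk settings below are invented for illustration; the paper learns such features and the network architecture jointly.

```python
import random
from collections import defaultdict

# Made-up relational facts: (head, relation, tail)
edges = [("alice", "advises", "bob"), ("bob", "authorOf", "paper1"),
         ("alice", "authorOf", "paper2"), ("carol", "advises", "bob")]
graph = defaultdict(list)
for h, r, t in edges:
    graph[h].append((r, t))

def path_features(start, end, walks=200, max_len=3):
    """Count relation paths (random-walk features) that connect start to end."""
    counts = defaultdict(int)
    for _ in range(walks):
        node, path = start, []
        for _ in range(max_len):
            if not graph[node]:
                break
            rel, node = random.choice(graph[node])
            path.append(rel)
            if node == end:
                counts[tuple(path)] += 1
                break
    return dict(counts)   # e.g. {('advises', 'authorOf'): k} as input features to the net

print(path_features("alice", "paper1"))
```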

Graph Structured Prediction Energy Networks

Title Graph Structured Prediction Energy Networks
Authors Colin Graber, Alexander Schwing
Abstract For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either lack the ability to model high-order correlations among variables while maintaining computationally tractable inference, or they do not allow explicitly modeling known correlations. To address this shortcoming, we introduce “Graph Structured Prediction Energy Networks”, for which we develop inference techniques that allow modeling both explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domains and demonstrate its general utility.
Tasks Structured Prediction
Published 2019-10-31
URL https://arxiv.org/abs/1910.14670v2
PDF https://arxiv.org/pdf/1910.14670v2.pdf
PWC https://paperswithcode.com/paper/graph-structured-prediction-energy-networks
Repo https://github.com/cgraber/GSPEN
Framework pytorch
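
For readers unfamiliar with energy networks, here is a generic SPEN-style sketch: an energy over relaxed labels with unary and pairwise graph terms, and inference by gradient descent on the relaxation. The energy form and inference schedule are simplifications, not the paper's exact GSPEN formulation.

```python
import torch

def energy(y, unary, pairwise, edges):
    # y: relaxed labels (B, N, K); unary: (B, N, K); pairwise: shared (K, K) table
    e = (unary * y).sum(dim=(1, 2))
    for i, j in edges:
        e = e + torch.einsum('bk,kl,bl->b', y[:, i], pairwise, y[:, j])
    return e

def infer(unary, pairwise, edges, steps=50, lr=0.1):
    """Gradient-based inference on a softmax relaxation of the labels."""
    logits = torch.zeros_like(unary, requires_grad=True)
    opt = torch.optim.SGD([logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        energy(torch.softmax(logits, dim=-1), unary, pairwise, edges).sum().backward()
        opt.step()
    return torch.softmax(logits, dim=-1)   # relaxed labels with (locally) minimal energy

labels = infer(torch.randn(2, 4, 3), torch.randn(3, 3), edges=[(0, 1), (1, 2), (2, 3)])
```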

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

Title FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Authors Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu
Abstract Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models. In this work, we propose a novel adversarial training algorithm, FreeLB, that promotes higher invariance in the embedding space, by adding adversarial perturbations to word embeddings and minimizing the resultant adversarial risk inside different regions around input samples. To validate the effectiveness of the proposed approach, we apply it to Transformer-based models for natural language understanding and commonsense reasoning tasks. Experiments on the GLUE benchmark show that when applied only to the finetuning stage, it is able to improve the overall test scores of BERT-base model from 78.3 to 79.4, and RoBERTa-large model from 88.5 to 88.8. In addition, the proposed approach achieves state-of-the-art single-model test accuracies of 85.44% and 67.75% on ARC-Easy and ARC-Challenge. Experiments on CommonsenseQA benchmark further demonstrate that FreeLB can be generalized and boost the performance of RoBERTa-large model on other tasks as well. Code is available at https://github.com/zhuchen03/FreeLB.
Tasks Word Embeddings
Published 2019-09-25
URL https://arxiv.org/abs/1909.11764v4
PDF https://arxiv.org/pdf/1909.11764v4.pdf
PWC https://paperswithcode.com/paper/freelb-enhanced-adversarial-training-for
Repo https://github.com/zhuchen03/FreeLB
Framework pytorch
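
Below is a minimal sketch of a FreeLB-style inner loop, assuming a hypothetical `model` that maps input embeddings directly to logits: the perturbation takes k ascent steps while parameter gradients accumulate "for free", followed by a single optimizer step. Initialization and norm-ball projection are simplified relative to the published algorithm.

```python
import torch

def freelb_step(model, embeds, labels, loss_fn, optimizer, k=3, adv_lr=1e-1, eps=1e-2):
    """FreeLB-style update sketch: k ascent steps on an embedding perturbation,
    averaging the parameter gradients from each step, then one optimizer step."""
    optimizer.zero_grad()
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(k):
        loss = loss_fn(model(embeds + delta), labels) / k
        loss.backward()                                   # also accumulates parameter grads
        grad = delta.grad.detach()
        delta = (delta + adv_lr * grad / (grad.norm() + 1e-12)).detach()
        delta = delta.clamp(-eps, eps).requires_grad_(True)   # crude stand-in for projection
    optimizer.step()                                      # one update with the averaged grads
```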

How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems

Title How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems
Authors Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
Abstract Many works on translation between East Asian languages have found clear advantages in using characters as the translation unit. Unfortunately, traditional recurrent neural machine translation systems hinder the practical usage of such character-based systems due to their architectural limitations: they handle extremely long sequences poorly and are highly restricted in parallelizing their computations. In this paper, we demonstrate that the new transformer architecture can perform character-based translation better than the recurrent one. We conduct experiments on a low-resource language pair, Japanese-Vietnamese. Our models considerably outperform state-of-the-art systems which employ word-based recurrent architectures.
Tasks Machine Translation
Published 2019-10-05
URL https://arxiv.org/abs/1910.02238v2
PDF https://arxiv.org/pdf/1910.02238v2.pdf
PWC https://paperswithcode.com/paper/how-transformer-revitalizes-character-based
Repo https://github.com/ngovinhtn/charTransform
Framework pytorch

CSPNet: A New Backbone that can Enhance Learning Capability of CNN

Title CSPNet: A New Backbone that can Enhance Learning Capability of CNN
Authors Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh
Abstract Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. However, such success greatly relies on costly computation resources, which hinders people with cheap devices from appreciating the advanced technology. In this paper, we propose Cross Stage Partial Network (CSPNet) to mitigate the problem that previous works require heavy inference computations from the network architecture perspective. We attribute the problem to the duplicate gradient information within network optimization. The proposed networks respect the variability of the gradients by integrating feature maps from the beginning and the end of a network stage, which, in our experiments, reduces computations by 20% with equivalent or even superior accuracy on the ImageNet dataset, and significantly outperforms state-of-the-art approaches in terms of AP50 on the MS COCO object detection dataset. The CSPNet is easy to implement and general enough to cope with architectures based on ResNet, ResNeXt, and DenseNet. Source code is at https://github.com/WongKinYiu/CrossStagePartialNetworks.
Tasks Image Classification, Object Detection, Real-Time Object Detection
Published 2019-11-27
URL https://arxiv.org/abs/1911.11929v1
PDF https://arxiv.org/pdf/1911.11929v1.pdf
PWC https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
Repo https://github.com/Code-Fight/darknet_study
Framework tf
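
The cross-stage-partial idea can be sketched in a few lines: split the channels, run only one half through the stage's blocks, and concatenate with the untouched half before a 1x1 transition. The block choice, split ratio, and layer counts below are illustrative assumptions rather than CSPNet's exact layout.

```python
import torch
import torch.nn as nn

class CSPStageSketch(nn.Module):
    """Cross-stage-partial sketch: only half of the channels pay for the stage's
    computation; the other half is merged back untouched."""
    def __init__(self, channels=64, n_blocks=2):
        super().__init__()
        half = channels // 2
        blocks = []
        for _ in range(n_blocks):
            blocks += [nn.Conv2d(half, half, 3, padding=1),
                       nn.BatchNorm2d(half), nn.ReLU(inplace=True)]
        self.blocks = nn.Sequential(*blocks)
        self.transition = nn.Conv2d(channels, channels, 1)   # fuse the two paths

    def forward(self, x):
        part1, part2 = x.chunk(2, dim=1)   # cross-stage split
        part2 = self.blocks(part2)         # only this part goes through the stage
        return self.transition(torch.cat([part1, part2], dim=1))

y = CSPStageSketch()(torch.randn(1, 64, 56, 56))
```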

Positive-Unlabeled Compression on the Cloud

Title Positive-Unlabeled Compression on the Cloud
Authors Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, Dacheng Tao, Chang Xu
Abstract Many attempts have been made to extend the great success of convolutional neural networks (CNNs) achieved on high-end GPU servers to portable devices such as smartphones. Providing compression and acceleration services for deep learning models on the cloud is therefore significant and attractive to end users. However, existing network compression and acceleration approaches usually fine-tune the svelte model by requesting the entire original training data (e.g., ImageNet), which could be more cumbersome than the network itself and cannot be easily uploaded to the cloud. In this paper, we present a novel positive-unlabeled (PU) setting for addressing this problem. In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor. We further introduce a robust knowledge distillation (RKD) scheme to deal with the class imbalance problem of these newly augmented training examples. The superiority of the proposed method is verified through experiments conducted on benchmark models and datasets. We can use only 8% of uniformly selected data from ImageNet to obtain an efficient model with performance comparable to the baseline ResNet-34.
Tasks
Published 2019-09-21
URL https://arxiv.org/abs/1909.09757v2
PDF https://arxiv.org/pdf/1909.09757v2.pdf
PWC https://paperswithcode.com/paper/190909757
Repo https://github.com/huawei-noah/DAFL
Framework pytorch
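
Once the PU classifier has gathered extra training examples, the compressed model is fitted against the teacher with knowledge distillation. The sketch below is a plain distillation loss with temperature softening; the paper's robust KD additionally handles the class imbalance of the PU-selected data, which this sketch does not model.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels=None, T=4.0, alpha=0.9):
    """Standard knowledge distillation: softened KL to the teacher, optionally
    mixed with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * T * T
    if labels is None:
        return soft
    return alpha * soft + (1 - alpha) * F.cross_entropy(student_logits, labels)
```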

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

Title Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization
Authors Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang
Abstract We address the unsupervised open domain recognition (UODR) problem, where the categories in the labeled source domain S are only a subset of those in the unlabeled target domain T. The task is to correctly classify all samples in T, including known and unknown categories. UODR is challenging due to the domain discrepancy, which becomes even harder to bridge when a large number of unknown categories exist in T. Moreover, the classification rules propagated by graph CNN (GCN) may be distracted by unknown categories and lack generalization capability. To measure the domain discrepancy for the asymmetric label space between S and T, we propose Semantic-Guided Matching Discrepancy (SGMD), which first employs instance matching between S and T; the discrepancy is then measured by a weighted feature distance between matched instances. We further design a limited balance constraint to achieve a more balanced classification output on known and unknown categories. We develop the Unsupervised Open Domain Transfer Network (UODTN), which learns the backbone classification network and GCN jointly by reducing the SGMD, enforcing the limited balance constraint, and minimizing the classification loss on S. UODTN better preserves the semantic structure and enforces consistency between the learned domain-invariant visual features and the semantic embeddings. Experimental results show the superiority of our method in recognizing images of both known and unknown categories.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.08631v1
PDF http://arxiv.org/pdf/1904.08631v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-open-domain-recognition-by
Repo https://github.com/junbaoZHUO/UODTN
Framework pytorch
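
A toy version of a matching-based discrepancy, loosely in the spirit of SGMD: match each target feature to its nearest source feature and average the (optionally weighted) distances. The semantic-guided matching and weighting of the actual method are not reproduced here.

```python
import torch

def matching_discrepancy(src_feats, tgt_feats, weights=None):
    """Toy matching-based discrepancy: nearest-source distance per target feature,
    averaged with optional weights. Not the paper's SGMD definition."""
    dists = torch.cdist(tgt_feats, src_feats)   # (Nt, Ns) pairwise distances
    min_dists, _ = dists.min(dim=1)             # nearest source match per target
    if weights is None:
        weights = torch.ones_like(min_dists)
    return (weights * min_dists).sum() / weights.sum()

d = matching_discrepancy(torch.randn(100, 256), torch.randn(80, 256))
```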