February 1, 2020

Paper Group AWR 359

ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 “Heidelberg”

Title ELKI: A large open-source library for data analysis - ELKI Release 0.7.5 “Heidelberg”
Authors Erich Schubert, Arthur Zimek
Abstract This paper documents the release of the ELKI data mining framework, version 0.7.5. ELKI is an open source (AGPLv3) data mining software written in Java. The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection. In order to achieve high performance and scalability, ELKI offers data index structures such as the R*-tree that can provide major performance gains. ELKI is designed to be easy to extend for researchers and students in this domain, and welcomes contributions of additional methods. ELKI aims at providing a large collection of highly parameterizable algorithms, in order to allow easy and fair evaluation and benchmarking of algorithms. We will first outline the motivation for this release and the plans for the future, and then give a brief overview of the new functionality in this version. We also include an appendix presenting an overview of the overall implemented functionality.
Tasks Outlier Detection
Published 2019-02-10
URL http://arxiv.org/abs/1902.03616v1
PDF http://arxiv.org/pdf/1902.03616v1.pdf
PWC https://paperswithcode.com/paper/elki-a-large-open-source-library-for-data
Repo https://github.com/elki-project/elki
Framework none

Referring Expression Object Segmentation with Caption-Aware Consistency

Title Referring Expression Object Segmentation with Caption-Aware Consistency
Authors Yi-Wen Chen, Yi-Hsuan Tsai, Tiantian Wang, Yen-Yu Lin, Ming-Hsuan Yang
Abstract Referring expressions are natural language descriptions that identify a particular object within a scene and are widely used in our daily conversations. In this work, we focus on segmenting the object in an image specified by a referring expression. To this end, we propose an end-to-end trainable comprehension network that consists of the language and visual encoders to extract feature representations from both domains. We introduce the spatial-aware dynamic filters to transfer knowledge from text to image, and effectively capture the spatial information of the specified object. To better communicate between the language and visual modules, we employ a caption generation network that takes features shared across both domains as input, and improves both representations via a consistency that enforces the generated sentence to be similar to the given referring expression. We evaluate the proposed framework on two referring expression datasets and show that our method performs favorably against the state-of-the-art algorithms.
Tasks Semantic Segmentation
Published 2019-10-10
URL https://arxiv.org/abs/1910.04748v1
PDF https://arxiv.org/pdf/1910.04748v1.pdf
PWC https://paperswithcode.com/paper/referring-expression-object-segmentation-with
Repo https://github.com/wenz116/lang2seg
Framework pytorch
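
For intuition about the dynamic-filter idea in the abstract, here is a minimal, hypothetical PyTorch sketch: a sentence embedding is turned into a per-example 1x1 filter that is correlated with the visual feature map to produce a coarse response for the referred object. The module names, dimensions, and single-filter simplification are assumptions for illustration, not the paper’s architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicFilterHead(nn.Module):
    """Illustrative sketch: predict a 1x1 conv filter from a sentence embedding
    and apply it to visual features (names and sizes are assumptions)."""
    def __init__(self, text_dim=1024, vis_dim=512):
        super().__init__()
        self.filter_gen = nn.Linear(text_dim, vis_dim)   # one filter per sentence

    def forward(self, vis_feat, text_emb):
        # vis_feat: (B, C, H, W); text_emb: (B, T)
        filters = F.normalize(self.filter_gen(text_emb), dim=1)        # (B, C)
        # correlate each image's features with its sentence-specific filter
        response = torch.einsum('bchw,bc->bhw', vis_feat, filters)
        return response.unsqueeze(1)   # coarse response map for the referred object

mask_logits = DynamicFilterHead()(torch.randn(2, 512, 40, 40), torch.randn(2, 1024))
```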

How Can We Know What Language Models Know?

Title How Can We Know What Language Models Know?
Authors Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
Abstract Recent work has presented intriguing results examining the knowledge contained in language models (LM) by having the LM fill in the blanks of prompts such as “Obama is a _ by profession”. These prompts are usually manually created, and quite possibly sub-optimal; another prompt such as “Obama worked as a _” may result in more accurately predicting the correct profession. Because of this, given an inappropriate prompt, we might fail to retrieve facts that the LM does know, and thus any given prompt only provides a lower bound estimate of the knowledge contained in an LM. In this paper, we attempt to more accurately estimate the knowledge contained in LMs by automatically discovering better prompts to use in this querying process. Specifically, we propose mining-based and paraphrasing-based methods to automatically generate high-quality and diverse prompts and ensemble methods to combine answers from different prompts. Extensive experiments on the LAMA benchmark for extracting relational knowledge from LMs demonstrate that our methods can improve accuracy from 31.1% to 38.1%, providing a tighter lower bound on what LMs know. We have released the code and the resulting LM Prompt And Query Archive (LPAQA) at https://github.com/jzbjyb/LPAQA.
Tasks
Published 2019-11-28
URL https://arxiv.org/abs/1911.12543v1
PDF https://arxiv.org/pdf/1911.12543v1.pdf
PWC https://paperswithcode.com/paper/how-can-we-know-what-language-models-know
Repo https://github.com/jzbjyb/LPAQA
Framework none
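
The core querying idea, filling the same relational slot with several prompts and ensembling the answers, can be sketched with an off-the-shelf masked LM. The prompts below are hand-written stand-ins for the mined and paraphrased prompts released in LPAQA, and plain score averaging is only one simple ensembling choice.

```python
from collections import defaultdict
from transformers import pipeline

# Illustrative prompts only; LPAQA mines and paraphrases prompts automatically.
prompts = [
    "Obama is a [MASK] by profession.",
    "Obama worked as a [MASK].",
    "Obama's profession is [MASK].",
]

fill = pipeline("fill-mask", model="bert-base-cased")
scores = defaultdict(float)
for prompt in prompts:
    for cand in fill(prompt, top_k=10):
        scores[cand["token_str"]] += cand["score"] / len(prompts)  # average across prompts

print(max(scores, key=scores.get))  # ensembled guess for the profession slot
```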

Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images

Title Dense Dilated Convolutions Merging Network for Semantic Mapping of Remote Sensing Images
Authors Qinghui Liu, Michael Kampffmeyer, Robert Jenssen, Arnt-Børre Salberg
Abstract We propose a network for semantic mapping called the Dense Dilated Convolutions Merging Network (DDCM-Net) to provide a deep learning approach that can recognize multi-scale and complex shaped objects with similar color and textures, such as buildings, surfaces/roads, and trees in very high resolution remote sensing images. The proposed DDCM-Net consists of dense dilated convolutions merged with varying dilation rates. This can effectively enlarge the kernels’ receptive fields, and, more importantly, obtain fused local and global context information to promote surrounding discriminative capability. We demonstrate the effectiveness of the proposed DDCM-Net on the publicly available ISPRS Potsdam dataset and achieve a performance of 92.3% F1-score and 86.0% mean intersection over union accuracy by only using the RGB bands, without any post-processing. We also show results on the ISPRS Vaihingen dataset, where the DDCM-Net trained with IRRG bands, also obtained better mapping accuracy (89.8% F1-score) than previous state-of-the-art approaches.
Tasks
Published 2019-08-30
URL https://arxiv.org/abs/1908.11799v1
PDF https://arxiv.org/pdf/1908.11799v1.pdf
PWC https://paperswithcode.com/paper/dense-dilated-convolutions-merging-network
Repo https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch
Framework pytorch
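
A rough sketch of the dense dilated merging pattern described in the abstract: each 3x3 convolution uses a different dilation rate and sees the concatenation of the block input with all previous outputs, and a 1x1 convolution fuses the merged features. Channel sizes and dilation rates here are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class DDCMBlockSketch(nn.Module):
    """Dense dilated merging sketch: each dilated conv consumes the running
    concatenation of features; a 1x1 conv fuses the result."""
    def __init__(self, in_ch=64, growth=32, dilations=(1, 2, 3, 5)):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for d in dilations:
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=d, dilation=d),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth                      # dense merging grows the channel count
        self.fuse = nn.Conv2d(ch, in_ch, 1)  # fuse local and enlarged-context features

    def forward(self, x):
        feats = x
        for conv in self.convs:
            feats = torch.cat([feats, conv(feats)], dim=1)
        return self.fuse(feats)

out = DDCMBlockSketch()(torch.randn(1, 64, 128, 128))  # -> (1, 64, 128, 128)
```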

Outlier Exposure with Confidence Control for Out-of-Distribution Detection

Title Outlier Exposure with Confidence Control for Out-of-Distribution Detection
Authors Aristotelis-Angelos Papadopoulos, Mohammad Reza Rajati, Nazim Shaikh, Jiamian Wang
Abstract Deep neural networks have achieved great success in classification tasks in recent years. However, one major obstacle on the path towards artificial intelligence is the inability of neural networks to accurately detect samples from novel class distributions; therefore, most existing classification algorithms assume that all classes are known prior to the training stage. In this work, we propose a methodology for training a neural network that allows it to efficiently detect out-of-distribution (OOD) examples without compromising much of its classification accuracy on test examples from known classes. Based on the Outlier Exposure (OE) technique, we propose a novel loss function, Outlier Exposure with Confidence Control (OECC), that achieves state-of-the-art results in out-of-distribution detection with OE on both image and text classification tasks without requiring access to OOD samples. Additionally, we experimentally show that the combination of OECC with the Mahalanobis distance-based classifier achieves state-of-the-art results in the OOD detection task.
Tasks Out-of-Distribution Detection, Text Classification
Published 2019-06-08
URL https://arxiv.org/abs/1906.03509v2
PDF https://arxiv.org/pdf/1906.03509v2.pdf
PWC https://paperswithcode.com/paper/simultaneous-classification-and-novelty
Repo https://github.com/nazim1021/OOD-detection-using-OECC
Framework pytorch
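
As background for the loss described above, here is a generic outlier-exposure-style objective: cross-entropy on in-distribution batches plus a term that pulls the softmax on exposed outliers toward the uniform distribution. The exact OECC regularizers differ from this simplification; treat it as a sketch of the training signal, not the paper's loss.

```python
import torch
import torch.nn.functional as F

def oe_style_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Sketch of outlier-exposure-style training, not the exact OECC objective:
    standard cross-entropy on in-distribution data plus an L1 penalty pushing the
    softmax on exposed outliers toward the uniform distribution."""
    ce = F.cross_entropy(logits_in, labels_in)
    probs_out = F.softmax(logits_out, dim=1)
    uniform = torch.full_like(probs_out, 1.0 / probs_out.size(1))
    oe_term = (probs_out - uniform).abs().sum(dim=1).mean()
    return ce + lam * oe_term
```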

FastContext: an efficient and scalable implementation of the ConText algorithm

Title FastContext: an efficient and scalable implementation of the ConText algorithm
Authors Jianlin Shi, John F. Hurdle
Abstract Objective: To develop and evaluate FastContext, an efficient, scalable implementation of the ConText algorithm suitable for very large-scale clinical natural language processing. Background: The ConText algorithm performs with state-of-the-art accuracy in detecting the experiencer, negation status, and temporality of concept mentions in clinical narratives. However, the speed limitation of its current implementations hinders its use in big data processing. Methods: We developed FastContext by hashing the ConText rules, then compared its speed and accuracy with JavaConText and GeneralConText, two widely used Java implementations. Results: FastContext ran two orders of magnitude faster, and its speed degraded less as the number of rules increased than that of the other two implementations used for comparison. Additionally, FastContext consistently gained accuracy as rules were added (the desired outcome of adding new rules), while the other two implementations did not. Conclusions: FastContext is an efficient, scalable implementation of the popular ConText algorithm, suitable for natural language applications on very large clinical corpora.
Tasks
Published 2019-04-30
URL http://arxiv.org/abs/1905.00079v1
PDF http://arxiv.org/pdf/1905.00079v1.pdf
PWC https://paperswithcode.com/paper/fastcontext-an-efficient-and-scalable
Repo https://github.com/jianlins/FastContext
Framework none
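
The speed-up comes from indexing rules so that matching does not scan every rule at every token. Below is a toy Python illustration of that hashing idea, with a made-up rule set and attributes, assuming rules are keyed by their first token.

```python
from collections import defaultdict

# Toy rules: (trigger phrase, attributes it sets). Invented for illustration,
# not taken from FastContext's rule files.
RULES = [
    (("no", "evidence", "of"), {"negated": True}),
    (("denies",), {"negated": True}),
    (("history", "of"), {"historical": True}),
]
rule_index = defaultdict(list)
for phrase, attrs in RULES:
    rule_index[phrase[0]].append((phrase, attrs))   # hash rules by first token

def match_rules(tokens):
    hits = []
    for i, tok in enumerate(tokens):
        for phrase, attrs in rule_index.get(tok, []):        # O(1) candidate lookup
            if tuple(tokens[i:i + len(phrase)]) == phrase:
                hits.append((i, phrase, attrs))
    return hits

print(match_rules("patient denies fever and no evidence of pneumonia".split()))
```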

A Comparative Study between Bayesian and Frequentist Neural Networks for Remaining Useful Life Estimation in Condition-Based Maintenance

Title A Comparative Study between Bayesian and Frequentist Neural Networks for Remaining Useful Life Estimation in Condition-Based Maintenance
Authors Luca Della Libera
Abstract In the last decade, deep learning (DL) has outperformed model-based and statistical approaches in predicting the remaining useful life (RUL) of machinery in the context of condition-based maintenance. One of the major drawbacks of DL is that it heavily depends on a large amount of labeled data, which are typically expensive and time-consuming to obtain, especially in industrial applications. Scarce training data lead to uncertain estimates of the model’s parameters, which in turn result in poor prognostic performance. Quantifying this parameter uncertainty is important in order to determine how reliable the prediction is. Traditional DL techniques such as neural networks are incapable of capturing the uncertainty in the training data, thus they are overconfident about their estimates. On the contrary, Bayesian deep learning has recently emerged as a promising solution to account for uncertainty in the training process, achieving state-of-the-art performance in many classification and regression tasks. In this work Bayesian DL techniques such as Bayesian dense neural networks and Bayesian convolutional neural networks are applied to RUL estimation and compared to their frequentist counterparts from the literature. The effectiveness of the proposed models is verified on the popular C-MAPSS dataset. Furthermore, parameter uncertainty is quantified and used to gain additional insight into the data.
Tasks
Published 2019-11-14
URL https://arxiv.org/abs/1911.06256v2
PDF https://arxiv.org/pdf/1911.06256v2.pdf
PWC https://paperswithcode.com/paper/a-comparative-study-between-bayesian-and
Repo https://github.com/luca310795/bayesian-deep-rul
Framework pytorch
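
To illustrate why parameter uncertainty matters for RUL prediction, here is a small sketch using Monte Carlo dropout, a common lightweight approximation to Bayesian inference. It is only a stand-in for the Bayesian dense and convolutional networks studied in the paper; the architecture and feature dimension are arbitrary.

```python
import torch
import torch.nn as nn

class RULRegressor(nn.Module):
    """Small regression net with dropout kept active at prediction time (MC dropout)."""
    def __init__(self, n_features=24):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1))

    def forward(self, x):
        return self.net(x)

model = RULRegressor()
model.train()                        # keep dropout on to sample multiple predictions
x = torch.randn(8, 24)               # one window of sensor features per engine
samples = torch.stack([model(x) for _ in range(100)])
rul_mean, rul_std = samples.mean(0), samples.std(0)   # prediction and its uncertainty
```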

Sentence Specified Dynamic Video Thumbnail Generation

Title Sentence Specified Dynamic Video Thumbnail Generation
Authors Yitian Yuan, Lin Ma, Wenwu Zhu
Abstract With the tremendous growth of videos over the Internet, video thumbnails, providing video content previews, are becoming increasingly crucial to influencing users’ online searching experiences. Conventional video thumbnails are generated once purely based on the visual characteristics of videos, and then displayed as requested. Hence, such video thumbnails, without considering the users’ searching intentions, cannot provide a meaningful snapshot of the video contents that concern users. In this paper, we define a distinctively new task, namely sentence specified dynamic video thumbnail generation, where the generated thumbnails not only provide a concise preview of the original video contents but also dynamically relate to the users’ searching intentions with semantic correspondences to the users’ query sentences. To tackle such a challenging task, we propose a novel graph convolved video thumbnail pointer (GTP). Specifically, GTP leverages a sentence specified video graph convolutional network to model both the sentence-video semantic interaction and the internal video relationships incorporated with the sentence information, based on which a temporal conditioned pointer network is then introduced to sequentially generate the sentence specified video thumbnails. Moreover, we annotate a new dataset based on ActivityNet Captions for the proposed new task, which consists of 10,000+ video-sentence pairs with each accompanied by an annotated sentence specified video thumbnail. We demonstrate that our proposed GTP outperforms several baseline methods on the created dataset, and thus believe that our initial results along with the release of the new dataset will inspire further research on sentence specified dynamic video thumbnail generation. Dataset and code are available at https://github.com/yytzsy/GTP.
Tasks
Published 2019-08-12
URL https://arxiv.org/abs/1908.04052v2
PDF https://arxiv.org/pdf/1908.04052v2.pdf
PWC https://paperswithcode.com/paper/sentence-specified-dynamic-video-thumbnail
Repo https://github.com/yytzsy/GTP
Framework tf

Neural Networks for Relational Data

Title Neural Networks for Relational Data
Authors Navdeep Kaur, Gautam Kunapuli, Saket Joshi, Kristian Kersting, Sriraam Natarajan
Abstract While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.
Tasks
Published 2019-08-28
URL https://arxiv.org/abs/1909.04723v3
PDF https://arxiv.org/pdf/1909.04723v3.pdf
PWC https://paperswithcode.com/paper/neural-networks-for-relational-data
Repo https://github.com/navdeepkjohal/NNRPT
Framework tf
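
One way to picture the relational random-walk features mentioned above: walk a small entity-relation graph and count which relation paths connect an entity pair, then feed those counts to a network. The toy graph, relations, and walk settings below are invented for illustration; the paper learns such features and the network architecture jointly.

```python
import random
from collections import defaultdict

# Made-up relational facts: (head, relation, tail)
edges = [("alice", "advises", "bob"), ("bob", "authorOf", "paper1"),
         ("alice", "authorOf", "paper2"), ("carol", "advises", "bob")]
graph = defaultdict(list)
for h, r, t in edges:
    graph[h].append((r, t))

def path_features(start, end, walks=200, max_len=3):
    """Count relation paths (random-walk features) that connect start to end."""
    counts = defaultdict(int)
    for _ in range(walks):
        node, path = start, []
        for _ in range(max_len):
            if not graph[node]:
                break
            rel, node = random.choice(graph[node])
            path.append(rel)
            if node == end:
                counts[tuple(path)] += 1
                break
    return dict(counts)   # e.g. {('advises', 'authorOf'): k} as input features to the net

print(path_features("alice", "paper1"))
```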

Graph Structured Prediction Energy Networks

Title Graph Structured Prediction Energy Networks
Authors Colin Graber, Alexander Schwing
Abstract For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either lack the ability to model high-order correlations among variables while maintaining computationally tractable inference, or they do not allow explicitly modeling known correlations. To address this shortcoming, we introduce “Graph Structured Prediction Energy Networks”, for which we develop inference techniques that allow modeling both explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domains and demonstrate its general utility.
Tasks Structured Prediction
Published 2019-10-31
URL https://arxiv.org/abs/1910.14670v2
PDF https://arxiv.org/pdf/1910.14670v2.pdf
PWC https://paperswithcode.com/paper/graph-structured-prediction-energy-networks
Repo https://github.com/cgraber/GSPEN
Framework pytorch
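
For readers unfamiliar with energy networks, here is a generic SPEN-style sketch: an energy over relaxed labels with unary and pairwise graph terms, and inference by gradient descent on the relaxation. The energy form and inference schedule are simplifications, not the paper's exact GSPEN formulation.

```python
import torch

def energy(y, unary, pairwise, edges):
    # y: relaxed labels (B, N, K); unary: (B, N, K); pairwise: shared (K, K) table
    e = (unary * y).sum(dim=(1, 2))
    for i, j in edges:
        e = e + torch.einsum('bk,kl,bl->b', y[:, i], pairwise, y[:, j])
    return e

def infer(unary, pairwise, edges, steps=50, lr=0.1):
    """Gradient-based inference on a softmax relaxation of the labels."""
    logits = torch.zeros_like(unary, requires_grad=True)
    opt = torch.optim.SGD([logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        energy(torch.softmax(logits, dim=-1), unary, pairwise, edges).sum().backward()
        opt.step()
    return torch.softmax(logits, dim=-1)   # relaxed labels with (locally) minimal energy

labels = infer(torch.randn(2, 4, 3), torch.randn(3, 3), edges=[(0, 1), (1, 2), (2, 3)])
```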

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

Title FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Authors Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu
Abstract Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models. In this work, we propose a novel adversarial training algorithm, FreeLB, that promotes higher invariance in the embedding space, by adding adversarial perturbations to word embeddings and minimizing the resultant adversarial risk inside different regions around input samples. To validate the effectiveness of the proposed approach, we apply it to Transformer-based models for natural language understanding and commonsense reasoning tasks. Experiments on the GLUE benchmark show that when applied only to the finetuning stage, it is able to improve the overall test scores of BERT-base model from 78.3 to 79.4, and RoBERTa-large model from 88.5 to 88.8. In addition, the proposed approach achieves state-of-the-art single-model test accuracies of 85.44% and 67.75% on ARC-Easy and ARC-Challenge. Experiments on CommonsenseQA benchmark further demonstrate that FreeLB can be generalized and boost the performance of RoBERTa-large model on other tasks as well. Code is available at https://github.com/zhuchen03/FreeLB.
Tasks Word Embeddings
Published 2019-09-25
URL https://arxiv.org/abs/1909.11764v4
PDF https://arxiv.org/pdf/1909.11764v4.pdf
PWC https://paperswithcode.com/paper/freelb-enhanced-adversarial-training-for
Repo https://github.com/zhuchen03/FreeLB
Framework pytorch
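
Below is a minimal sketch of a FreeLB-style inner loop, assuming a hypothetical `model` that maps input embeddings directly to logits: the perturbation takes k ascent steps while parameter gradients accumulate "for free", followed by a single optimizer step. Initialization and norm-ball projection are simplified relative to the published algorithm.

```python
import torch

def freelb_step(model, embeds, labels, loss_fn, optimizer, k=3, adv_lr=1e-1, eps=1e-2):
    """FreeLB-style update sketch: k ascent steps on an embedding perturbation,
    averaging the parameter gradients from each step, then one optimizer step."""
    optimizer.zero_grad()
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(k):
        loss = loss_fn(model(embeds + delta), labels) / k
        loss.backward()                                   # also accumulates parameter grads
        grad = delta.grad.detach()
        delta = (delta + adv_lr * grad / (grad.norm() + 1e-12)).detach()
        delta = delta.clamp(-eps, eps).requires_grad_(True)   # crude stand-in for projection
    optimizer.step()                                      # one update with the averaged grads
```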

How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems

Title How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems
Authors Thi-Vinh Ngo, Thanh-Le Ha, Phuong-Thai Nguyen, Le-Minh Nguyen
Abstract Many works on translation between East Asian languages have found clear advantages in using characters as the translation unit. Unfortunately, traditional recurrent neural machine translation systems hinder the practical usage of such character-based systems due to their architectural limitations: they handle extremely long sequences poorly and are highly restricted in parallelizing their computations. In this paper, we demonstrate that the new transformer architecture can perform character-based translation better than the recurrent one. We conduct experiments on a low-resource language pair, Japanese-Vietnamese. Our models considerably outperform state-of-the-art systems which employ word-based recurrent architectures.
Tasks Machine Translation
Published 2019-10-05
URL https://arxiv.org/abs/1910.02238v2
PDF https://arxiv.org/pdf/1910.02238v2.pdf
PWC https://paperswithcode.com/paper/how-transformer-revitalizes-character-based
Repo https://github.com/ngovinhtn/charTransform
Framework pytorch

CSPNet: A New Backbone that can Enhance Learning Capability of CNN

Title CSPNet: A New Backbone that can Enhance Learning Capability of CNN
Authors Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, Jun-Wei Hsieh
Abstract Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. However, such success greatly relies on costly computation resources, which hinders people with cheap devices from appreciating the advanced technology. In this paper, we propose Cross Stage Partial Network (CSPNet) to mitigate the problem that previous works require heavy inference computations from the network architecture perspective. We attribute the problem to the duplicate gradient information within network optimization. The proposed networks respect the variability of the gradients by integrating feature maps from the beginning and the end of a network stage, which, in our experiments, reduces computations by 20% with equivalent or even superior accuracy on the ImageNet dataset, and significantly outperforms state-of-the-art approaches in terms of AP50 on the MS COCO object detection dataset. The CSPNet is easy to implement and general enough to cope with architectures based on ResNet, ResNeXt, and DenseNet. Source code is at https://github.com/WongKinYiu/CrossStagePartialNetworks.
Tasks Image Classification, Object Detection, Real-Time Object Detection
Published 2019-11-27
URL https://arxiv.org/abs/1911.11929v1
PDF https://arxiv.org/pdf/1911.11929v1.pdf
PWC https://paperswithcode.com/paper/cspnet-a-new-backbone-that-can-enhance
Repo https://github.com/Code-Fight/darknet_study
Framework tf
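
The cross-stage-partial idea can be sketched in a few lines: split the channels, run only one half through the stage's blocks, and concatenate with the untouched half before a 1x1 transition. The block choice, split ratio, and layer counts below are illustrative assumptions rather than CSPNet's exact layout.

```python
import torch
import torch.nn as nn

class CSPStageSketch(nn.Module):
    """Cross-stage-partial sketch: only half of the channels pay for the stage's
    computation; the other half is merged back untouched."""
    def __init__(self, channels=64, n_blocks=2):
        super().__init__()
        half = channels // 2
        blocks = []
        for _ in range(n_blocks):
            blocks += [nn.Conv2d(half, half, 3, padding=1),
                       nn.BatchNorm2d(half), nn.ReLU(inplace=True)]
        self.blocks = nn.Sequential(*blocks)
        self.transition = nn.Conv2d(channels, channels, 1)   # fuse the two paths

    def forward(self, x):
        part1, part2 = x.chunk(2, dim=1)   # cross-stage split
        part2 = self.blocks(part2)         # only this part goes through the stage
        return self.transition(torch.cat([part1, part2], dim=1))

y = CSPStageSketch()(torch.randn(1, 64, 56, 56))
```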

Positive-Unlabeled Compression on the Cloud

Title Positive-Unlabeled Compression on the Cloud
Authors Yixing Xu, Yunhe Wang, Hanting Chen, Kai Han, Chunjing Xu, Dacheng Tao, Chang Xu
Abstract Many attempts have been made to extend the great success of convolutional neural networks (CNNs) achieved on high-end GPU servers to portable devices such as smartphones. Providing compression and acceleration services for deep learning models on the cloud is therefore significant and attractive to end users. However, existing network compression and acceleration approaches usually fine-tune the svelte model by requesting the entire original training data (e.g., ImageNet), which could be more cumbersome than the network itself and cannot be easily uploaded to the cloud. In this paper, we present a novel positive-unlabeled (PU) setting for addressing this problem. In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor. We further introduce a robust knowledge distillation (RKD) scheme to deal with the class imbalance problem of these newly augmented training examples. The superiority of the proposed method is verified through experiments conducted on benchmark models and datasets. We can use only 8% of uniformly selected data from ImageNet to obtain an efficient model with performance comparable to the baseline ResNet-34.
Tasks
Published 2019-09-21
URL https://arxiv.org/abs/1909.09757v2
PDF https://arxiv.org/pdf/1909.09757v2.pdf
PWC https://paperswithcode.com/paper/190909757
Repo https://github.com/huawei-noah/DAFL
Framework pytorch
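
Once the PU classifier has gathered extra training examples, the compressed model is fitted against the teacher with knowledge distillation. The sketch below is a plain distillation loss with temperature softening; the paper's robust KD additionally handles the class imbalance of the PU-selected data, which this sketch does not model.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels=None, T=4.0, alpha=0.9):
    """Standard knowledge distillation: softened KL to the teacher, optionally
    mixed with hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * T * T
    if labels is None:
        return soft
    return alpha * soft + (1 - alpha) * F.cross_entropy(student_logits, labels)
```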

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

Title Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization
Authors Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang
Abstract We address the unsupervised open domain recognition (UODR) problem, where the categories in the labeled source domain S are only a subset of those in the unlabeled target domain T. The task is to correctly classify all samples in T, including known and unknown categories. UODR is challenging due to the domain discrepancy, which becomes even harder to bridge when a large number of unknown categories exist in T. Moreover, the classification rules propagated by graph CNN (GCN) may be distracted by unknown categories and lack generalization capability. To measure the domain discrepancy for the asymmetric label space between S and T, we propose Semantic-Guided Matching Discrepancy (SGMD), which first employs instance matching between S and T; the discrepancy is then measured by a weighted feature distance between matched instances. We further design a limited balance constraint to achieve a more balanced classification output on known and unknown categories. We develop the Unsupervised Open Domain Transfer Network (UODTN), which learns the backbone classification network and GCN jointly by reducing the SGMD, enforcing the limited balance constraint, and minimizing the classification loss on S. UODTN better preserves the semantic structure and enforces consistency between the learned domain-invariant visual features and the semantic embeddings. Experimental results show the superiority of our method in recognizing images of both known and unknown categories.
Tasks
Published 2019-04-18
URL http://arxiv.org/abs/1904.08631v1
PDF http://arxiv.org/pdf/1904.08631v1.pdf
PWC https://paperswithcode.com/paper/unsupervised-open-domain-recognition-by
Repo https://github.com/junbaoZHUO/UODTN
Framework pytorch
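
A toy version of a matching-based discrepancy, loosely in the spirit of SGMD: match each target feature to its nearest source feature and average the (optionally weighted) distances. The semantic-guided matching and weighting of the actual method are not reproduced here.

```python
import torch

def matching_discrepancy(src_feats, tgt_feats, weights=None):
    """Toy matching-based discrepancy: nearest-source distance per target feature,
    averaged with optional weights. Not the paper's SGMD definition."""
    dists = torch.cdist(tgt_feats, src_feats)   # (Nt, Ns) pairwise distances
    min_dists, _ = dists.min(dim=1)             # nearest source match per target
    if weights is None:
        weights = torch.ones_like(min_dists)
    return (weights * min_dists).sum() / weights.sum()

d = matching_discrepancy(torch.randn(100, 256), torch.randn(80, 256))
```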