January 24, 2020

2529 words 12 mins read

Paper Group NANR 209

Learning Multi-Class Segmentations From Single-Class Datasets. Adaptive Convolution for Text Classification. Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding. Distilled Person Re-Identification: Towards a More Scalable System. It’s Not About the Journey; It’s About the Destination: Following Soft Pat …

Learning Multi-Class Segmentations From Single-Class Datasets

Title Learning Multi-Class Segmentations From Single-Class Datasets
Authors Konstantin Dmitriev, Arie E. Kaufman
Abstract Multi-class segmentation has recently achieved significant performance in natural images and videos. This achievement is due primarily to the public availability of large multi-class datasets. However, there are certain domains, such as biomedical images, where obtaining sufficient multi-class annotations is a laborious and often impossible task and only single-class datasets are available. While existing segmentation research in such domains use private multi-class datasets or focus on single-class segmentations, we propose a unified highly efficient framework for robust simultaneous learning of multi-class segmentations by combining single-class datasets and utilizing a novel way of conditioning a convolutional network for the purpose of segmentation. We demonstrate various ways of incorporating the conditional information, perform an extensive evaluation, and show compelling multi-class segmentation performance on biomedical images, which outperforms current state-of-the-art solutions (up to 2.7%). Unlike current solutions, which are meticulously tailored for particular single-class datasets, we utilize datasets from a variety of sources. Furthermore, we show the applicability of our method also to natural images and evaluate it on the Cityscapes dataset. We further discuss other possible applications of our proposed framework.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Dmitriev_Learning_Multi-Class_Segmentations_From_Single-Class_Datasets_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Dmitriev_Learning_Multi-Class_Segmentations_From_Single-Class_Datasets_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-multi-class-segmentations-from
Repo
Framework
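
The conditioning mechanism is the core idea of this entry. Below is a minimal sketch of one way to realize it, assuming a learned class-ID embedding that is broadcast-added to intermediate features; the authors' exact conditioning scheme and backbone are not given here, and `ConditionedSegNet` with its layer sizes is hypothetical.

```python
import torch
import torch.nn as nn

class ConditionedSegNet(nn.Module):
    """Hypothetical class-conditioned segmenter: a learned embedding of the
    requested class ID is broadcast-added to intermediate features, and the
    network predicts a binary mask for that class only, so each single-class
    dataset can supervise its own class."""
    def __init__(self, num_classes: int, channels: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.class_embed = nn.Embedding(num_classes, channels)
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 1))          # mask logits for the requested class

    def forward(self, image, class_id):
        feats = self.encoder(image)
        cond = self.class_embed(class_id)[:, :, None, None]   # (B, C, 1, 1)
        return self.decoder(feats + cond)

# Each single-class dataset yields (image, mask, class_id) triples; training
# uses per-pixel binary cross-entropy against that dataset's own class.
model = ConditionedSegNet(num_classes=4)
logits = model(torch.randn(2, 3, 64, 64), torch.tensor([0, 2]))
loss = nn.functional.binary_cross_entropy_with_logits(
    logits, torch.randint(0, 2, (2, 1, 64, 64)).float())
```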

Adaptive Convolution for Text Classification

Title Adaptive Convolution for Text Classification
Authors Byung-Ju Choi, Jun-Hyung Park, SangKeun Lee
Abstract In this paper, we present an adaptive convolution for text classification to give flexibility to convolutional neural networks (CNNs). Unlike traditional convolutions, which utilize the same set of filters regardless of the input, the adaptive convolution employs adaptively generated convolutional filters conditioned on inputs. We achieve this by attaching filter-generating networks, which are carefully designed to generate input-specific filters, to convolution blocks in existing CNNs. We show the efficacy of our approach in existing CNNs based on the performance evaluation. Our evaluation indicates that all of our baselines achieve performance improvements with adaptive convolutions, by up to 2.6 percentage points, on seven benchmark text classification datasets.
Tasks Text Classification
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1256/
PDF https://www.aclweb.org/anthology/N19-1256
PWC https://paperswithcode.com/paper/adaptive-convolution-for-text-classification
Repo
Framework
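
As a rough illustration of the filter-generating idea in the entry above, the sketch below produces per-example 1D convolution filters from a pooled summary of the input and applies them with a grouped convolution; the generator design and sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConv1d(nn.Module):
    """Input-conditioned convolution: a small filter-generating network emits
    the conv filters for each example (an illustrative sketch)."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, kernel_size
        # Filter generator: summarize the input, then emit a filter bank.
        self.gen = nn.Sequential(
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(in_ch, out_ch * in_ch * kernel_size))

    def forward(self, x):                      # x: (B, in_ch, L)
        B, _, L = x.shape
        filters = self.gen(x).view(B * self.out_ch, self.in_ch, self.k)
        # Grouped convolution applies each example's own filters to it alone.
        out = F.conv1d(x.reshape(1, B * self.in_ch, L), filters,
                       padding=self.k // 2, groups=B)
        return out.view(B, self.out_ch, L)

# Example: token embeddings of a sentence batch, shape (batch, emb_dim, seq_len).
x = torch.randn(4, 50, 20)
y = AdaptiveConv1d(50, 64, kernel_size=3)(x)   # (4, 64, 20)
```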

Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding

Title Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding
Authors Quynh Do, Judith Gaspers
Abstract A typical cross-lingual transfer learning approach for boosting model performance on a target language is to pre-train the model on all available supervised data from another language. However, in large-scale systems this leads to high training times and computational requirements. In addition, characteristic differences between the source and target languages raise a natural question of whether source data selection can improve the knowledge transfer. In this paper, we address this question and propose a simple but effective language-model-based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. The experimental results show that with data selection i) the amount of source data, and hence training time, is reduced significantly and ii) model performance is improved.
Tasks Cross-Lingual Transfer, Language Modelling, Spoken Language Understanding, Transfer Learning
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1153/
PDF https://www.aclweb.org/anthology/D19-1153
PWC https://paperswithcode.com/paper/cross-lingual-transfer-learning-with-data
Repo
Framework
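
The abstract does not detail the selection criterion, so the sketch below illustrates a generic language-model-based selection: score each source-language utterance with a simple LM trained on whatever in-domain text is available and keep only the best-scoring fraction. The unigram LM, the scoring data, and the 50% keep ratio are stand-ins, not the paper's choices.

```python
import math
from collections import Counter

def train_unigram_lm(sentences):
    """Add-one-smoothed unigram LM; a stand-in for whatever LM the paper uses."""
    counts = Counter(w for s in sentences for w in s.split())
    total, vocab = sum(counts.values()), len(counts) + 1
    return lambda w: (counts[w] + 1) / (total + vocab)

def perplexity(lm, sentence):
    words = sentence.split()
    return math.exp(-sum(math.log(lm(w)) for w in words) / max(len(words), 1))

def select_source_data(source_sents, in_domain_sents, keep_ratio=0.5):
    """Keep the keep_ratio fraction of source utterances that the in-domain LM
    scores best (lowest perplexity)."""
    lm = train_unigram_lm(in_domain_sents)
    ranked = sorted(source_sents, key=lambda s: perplexity(lm, s))
    return ranked[: int(len(ranked) * keep_ratio)]

# Only the selected subset is used for pre-training, which shrinks the source
# data and hence the training time.
selected = select_source_data(
    ["book a flight to berlin", "the cat sat on the mat"],
    ["play some music", "book a table for two", "set an alarm"])
```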

Distilled Person Re-Identification: Towards a More Scalable System

Title Distilled Person Re-Identification: Towards a More Scalable System
Authors Ancong Wu, Wei-Shi Zheng, Xiaowei Guo, Jian-Huang Lai
Abstract Person re-identification (Re-ID), for matching pedestrians across non-overlapping camera views, has made great progress in supervised learning with abundant labelled data. However, the scalability problem is the bottleneck for applications in large-scale systems. We consider the scalability problem of Re-ID from three aspects: (1) low labelling cost by reducing label amount, (2) low extension cost by reusing existing knowledge and (3) low testing computation cost by using lightweight models. The requirements render scalable Re-ID a challenging problem. To solve these problems in a unified system, we propose a Multi-teacher Adaptive Similarity Distillation Framework, which requires only a few labelled identities of target domain to transfer knowledge from multiple teacher models to a user-specified lightweight student model without accessing source domain data. We propose the Log-Euclidean Similarity Distillation Loss for Re-ID and further integrate the Adaptive Knowledge Aggregator to select effective teacher models to transfer target-adaptive knowledge. Extensive evaluations show that our method can extend with high scalability and the performance is comparable to the state-of-the-art unsupervised and semi-supervised Re-ID methods.
Tasks Person Re-Identification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_Distilled_Person_Re-Identification_Towards_a_More_Scalable_System_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_Distilled_Person_Re-Identification_Towards_a_More_Scalable_System_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/distilled-person-re-identification-towards-a
Repo
Framework
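
The Log-Euclidean Similarity Distillation Loss and the Adaptive Knowledge Aggregator are the paper's specific contributions and are not reproduced here. The sketch below substitutes a plain similarity-matrix distillation with softmax-weighted teachers, only to show the general shape of multi-teacher similarity distillation on unlabeled target images.

```python
import torch
import torch.nn.functional as F

def similarity_matrix(feats: torch.Tensor) -> torch.Tensor:
    feats = F.normalize(feats, dim=1)          # cosine similarities in [-1, 1]
    return feats @ feats.t()

def multi_teacher_distill_loss(student_feats, teacher_feats_list, teacher_weights):
    """teacher_weights: a learnable vector, softmaxed to weight each teacher."""
    w = torch.softmax(teacher_weights, dim=0)
    target = sum(wi * similarity_matrix(tf)
                 for wi, tf in zip(w, teacher_feats_list))
    return F.mse_loss(similarity_matrix(student_feats), target)

# Random features stand in for CNN embeddings of a batch of target images.
student = torch.randn(8, 128)
teachers = [torch.randn(8, 256), torch.randn(8, 512)]   # teacher dims may differ
weights = torch.nn.Parameter(torch.zeros(len(teachers)))
loss = multi_teacher_distill_loss(student, teachers, weights)
loss.backward()
```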

It’s Not About the Journey; It’s About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning

Title It’s Not About the Journey; It’s About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning
Authors Monica Haurilet, Alina Roitberg, Rainer Stiefelhagen
Abstract Visual Reasoning remains a challenging task, as it has to deal with long-range and multi-step object relationships in the scene. We present a new model for Visual Reasoning, aimed at capturing the interplay among individual objects in the image represented as a scene graph. As not all graph components are relevant for the query, we introduce the concept of a question-based visual guide, which constrains the potential solution space by learning an optimal traversal scheme, where the final destination nodes alone are used to produce the answer. We show that finding relevant semantic structures facilitates generalization to new tasks by introducing a novel problem of knowledge transfer: training on one question type and answering questions from a different domain without any training data. Furthermore, we report state-of-the-art results for Visual Reasoning on multiple query types and diverse image and video datasets.
Tasks Transfer Learning, Visual Reasoning
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Haurilet_Its_Not_About_the_Journey_Its_About_the_Destination_Following_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Haurilet_Its_Not_About_the_Journey_Its_About_the_Destination_Following_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/its-not-about-the-journey-its-about-the
Repo
Framework
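
One way to picture the question-guided soft traversal is a differentiable random walk over the scene graph whose transition scores are conditioned on the question, with the answer read off the final node distribution only. The sketch below is an interpretation under that assumption; `SoftWalker`, its bilinear edge scorer, and the fixed step count are all hypothetical, not the authors' model.

```python
import torch
import torch.nn as nn

class SoftWalker(nn.Module):
    def __init__(self, node_dim, question_dim, num_answers, steps=3):
        super().__init__()
        self.steps = steps
        self.edge_score = nn.Bilinear(node_dim, node_dim, question_dim)
        self.answer_head = nn.Linear(node_dim, num_answers)

    def forward(self, nodes, adjacency, question):
        # nodes: (N, D); adjacency: (N, N) with 1 where an edge exists; question: (Q,)
        N = nodes.size(0)
        pairwise = self.edge_score(nodes.repeat_interleave(N, 0),
                                   nodes.repeat(N, 1)).view(N, N, -1)
        logits = pairwise @ question                       # question-conditioned edge scores
        logits = logits.masked_fill(adjacency == 0, -1e9)  # only real edges are walkable
        trans = torch.softmax(logits, dim=1)               # row-stochastic transition matrix
        p = torch.full((N,), 1.0 / N)                      # start uniformly over nodes
        for _ in range(self.steps):
            p = p @ trans                                  # one soft traversal step
        destination = p @ nodes                            # expected "destination" features
        return self.answer_head(destination)

# Toy example: 5 nodes; self-loops guarantee every row has at least one edge.
nodes = torch.randn(5, 16)
adjacency = ((torch.rand(5, 5) > 0.7).float() + torch.eye(5)).clamp(max=1.0)
answer_logits = SoftWalker(16, 8, num_answers=10)(nodes, adjacency, torch.randn(8))
```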

Graphemic ambiguous queries on Arabic-scripted historical corpora

Title Graphemic ambiguous queries on Arabic-scripted historical corpora
Authors Alicia González Martínez
Abstract
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-9001/
PDF https://www.aclweb.org/anthology/W19-9001
PWC https://paperswithcode.com/paper/graphemic-ambiguous-queries-on-arabic
Repo
Framework

Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)

Title Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)
Authors
Abstract
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-5300/
PDF https://www.aclweb.org/anthology/D19-5300
PWC https://paperswithcode.com/paper/proceedings-of-the-thirteenth-workshop-on-1
Repo
Framework

WSLLN: Weakly Supervised Natural Language Localization Networks

Title WSLLN: Weakly Supervised Natural Language Localization Networks
Authors Mingfei Gao, Larry Davis, Richard Socher, Caiming Xiong
Abstract We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries. To learn the correspondence between visual segments and texts, most previous methods require temporal coordinates (start and end times) of events for training, which leads to high costs of annotation. WSLLN relieves the annotation burden by training with only video-sentence pairs, without access to the temporal locations of events. With a simple end-to-end structure, WSLLN measures segment-text consistency and conducts segment selection (conditioned on the text) simultaneously. Results from both are merged and optimized as a video-sentence matching problem. Experiments on ActivityNet Captions and DiDeMo demonstrate that WSLLN achieves state-of-the-art performance.
Tasks
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1157/
PDF https://www.aclweb.org/anthology/D19-1157
PWC https://paperswithcode.com/paper/wsllnweakly-supervised-natural-language
Repo
Framework
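
A compact way to picture the two-branch design in the entry above: score segment-text consistency, select segments conditioned on the text, merge the two per segment, and supervise only the aggregated video-level matching score. The module below sketches that under assumed shapes and fusion choices (sigmoid, softmax, elementwise product); the paper's actual heads and merging may differ.

```python
import torch
import torch.nn as nn

class WeakLocalizer(nn.Module):
    def __init__(self, seg_dim, txt_dim):
        super().__init__()
        self.align = nn.Linear(seg_dim + txt_dim, 1)    # segment-text consistency branch
        self.select = nn.Linear(seg_dim + txt_dim, 1)   # text-conditioned selection branch

    def forward(self, segments, sentence):
        # segments: (N, seg_dim) proposal features; sentence: (txt_dim,) query feature
        pairs = torch.cat([segments, sentence.expand(segments.size(0), -1)], dim=1)
        consistency = torch.sigmoid(self.align(pairs)).squeeze(1)        # (N,)
        selection = torch.softmax(self.select(pairs).squeeze(1), dim=0)  # (N,)
        segment_scores = consistency * selection       # merged per-segment scores
        return segment_scores.sum(), segment_scores    # video-level score, localization scores

# Training needs only matched/mismatched (video, sentence) pairs: the video-level
# score is supervised as a matching problem, while the per-segment scores are
# what localize the event at test time.
video_score, seg_scores = WeakLocalizer(512, 300)(torch.randn(12, 512), torch.randn(300))
```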

DL2: Training and Querying Neural Networks with Logic

Title DL2: Training and Querying Neural Networks with Logic
Authors Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev
Abstract We present DL2, a system for training and querying neural networks with logical constraints. The key idea is to translate these constraints into a differentiable loss with desirable mathematical properties and to then either train with this loss in an iterative manner or to use the loss for querying the network for inputs subject to the constraints. We empirically demonstrate that DL2 is effective in both training and querying scenarios, across a range of constraints and data sets.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=H1faSn0qY7
PDF https://openreview.net/pdf?id=H1faSn0qY7
PWC https://paperswithcode.com/paper/dl2-training-and-querying-neural-networks
Repo
Framework
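
The central mechanism is compiling a logical constraint into a non-negative differentiable loss that is zero when the constraint is satisfied. The sketch below follows the general DL2-style recipe (hinge terms for comparisons, sum for conjunction, product for disjunction); the exact operators and the querying procedure in the paper may differ, and the class indices are hypothetical.

```python
import torch

def leq(a, b):            # translate "a <= b": zero when satisfied, positive otherwise
    return torch.clamp(a - b, min=0.0)

def land(*losses):        # logical "and": every part must be satisfied
    return sum(losses)

def lor(*losses):         # logical "or": any satisfied part zeroes the product
    out = losses[0]
    for l in losses[1:]:
        out = out * l
    return out

# Example constraint on output probabilities p: "p[cat] <= 0.1 or p[dog] >= 0.8".
p = torch.softmax(torch.randn(10, requires_grad=True), dim=0)
constraint_loss = lor(leq(p[3], torch.tensor(0.1)),   # index 3 = "cat" (hypothetical)
                      leq(torch.tensor(0.8), p[5]))   # index 5 = "dog" (hypothetical)
constraint_loss.backward()  # gradients push the network toward satisfying the constraint
```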

Enhancing Transformer for End-to-end Speech-to-Text Translation

Title Enhancing Transformer for End-to-end Speech-to-Text Translation
Authors Mattia Antonino Di Gangi, Matteo Negri, Roldano Cattoni, Roberto Dessi, Marco Turchi
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-6603/
PDF https://www.aclweb.org/anthology/W19-6603
PWC https://paperswithcode.com/paper/enhancing-transformer-for-end-to-end-speech
Repo
Framework

SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums

Title SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums
Authors Sapna Negi, Tobias Daudert, Paul Buitelaar
Abstract We present the pilot SemEval task on Suggestion Mining. The task consists of subtasks A and B, where we created labeled data from feedback forum and hotel reviews, respectively. Subtask A provides training and test data from the same domain, while Subtask B evaluates the system on a test dataset from a different domain than the available training data. 33 teams participated in the shared task, with a total of 50 members. We summarize the problem definition, benchmark dataset preparation, and methods used by the participating teams, providing details of the methods used by the top-ranked systems. The dataset is made freely available to help advance the research in suggestion mining and to reproduce the systems submitted under this task.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2151/
PDF https://www.aclweb.org/anthology/S19-2151
PWC https://paperswithcode.com/paper/semeval-2019-task-9-suggestion-mining-from
Repo
Framework

Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change?

Title Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change?
Authors Sidsel Boldsen, Manex Agirrezabal, Patrizia Paggio
Abstract In this work we propose a data-driven methodology for identifying temporal trends in a corpus of medieval charters. We have used perplexities derived from RNNs as a distance measure between documents and then performed clustering on those distances. We argue that perplexities calculated by such language models are representative of temporal trends. The clusters produced using the K-Means algorithm give an insight into the differences in language in different time periods, at least partly due to language change. We suggest that the temporal distribution of the individual clusters might provide a more nuanced picture of temporal trends compared to discrete bins, thus providing better results when used in a classification task.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4711/
PDF https://www.aclweb.org/anthology/W19-4711
PWC https://paperswithcode.com/paper/identifying-temporal-trends-based-on
Repo
Framework
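
To make the pipeline concrete, the sketch below replaces the RNN language models with character-bigram models (purely as a stand-in): each document is represented by its perplexities under LMs trained on a few reference texts, and those vectors are clustered with K-Means. The reference-LM setup and the feature construction are assumptions about the method, not a reproduction of it.

```python
import math
from collections import Counter
import numpy as np
from sklearn.cluster import KMeans

def char_bigram_lm(text):
    """Add-one-smoothed character bigram LM (stand-in for the RNN LMs)."""
    bigrams, unigrams = Counter(zip(text, text[1:])), Counter(text)
    vocab = len(set(text)) + 1
    return lambda a, b: (bigrams[(a, b)] + 1) / (unigrams[a] + vocab)

def perplexity(lm, text):
    logp = sum(math.log(lm(a, b)) for a, b in zip(text, text[1:]))
    return math.exp(-logp / max(len(text) - 1, 1))

def cluster_by_perplexity(documents, reference_docs, k=2, seed=0):
    """Represent each document by its perplexity under LMs trained on a few
    reference documents, then cluster those vectors with K-Means."""
    lms = [char_bigram_lm(r) for r in reference_docs]
    features = np.array([[perplexity(lm, d) for lm in lms] for d in documents])
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)

labels = cluster_by_perplexity(
    ["anno domini millesimo ducentesimo", "in the year of our lord",
     "witnesseth this indenture", "actum anno domini"],
    ["anno domini", "this deed witnesseth"], k=2)
```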

Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series—A Case Study in Zhanjiang, China

Title Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series—A Case Study in Zhanjiang, China
Authors Hongwei Zhao, Zhongxin Chen, Hao Jiang, Wenlong Jing, Liang Sun, Min Feng
Abstract Timely and accurate estimation of the area and distribution of crops is vital for food security. Optical remote sensing has been a key technique for acquiring crop area and conditions on regional to global scales, but great challenges arise due to frequent cloudy days in southern China, which often make optical remote sensing images unavailable. Synthetic aperture radar (SAR) could bridge this gap since it is less affected by clouds. The recent availability of Sentinel-1A (S1A) SAR imagery with a 12-day revisit period at a high spatial resolution of about 10 m makes it possible to fully utilize phenological information to improve early crop classification. Among deep learning methods, one-dimensional convolutional neural networks (1D CNNs), long short-term memory recurrent neural networks (LSTM RNNs), and gated recurrent unit RNNs (GRU RNNs) have been shown to efficiently extract temporal features for classification tasks. However, due to the complexity of training, these three deep learning methods have been less used in early crop classification. In this work, we attempted to combine them with an incremental classification method to avoid the need for training optimal architectures and hyper-parameters for the data from each time series. First, we trained 1D CNNs, LSTM RNNs, and GRU RNNs based on the full image time series to obtain three classifiers with optimal architectures and hyper-parameters. Then, starting at the first time point, we performed an incremental classification process to train each classifier using all of the previous data, and obtained a classification network with all parameter values (including the hyper-parameters) at each time point. Finally, test accuracies at each time point were assessed for each crop type to determine the optimal time series length. A case study was conducted in Suixi and Leizhou counties of Zhanjiang City, China. To verify the effectiveness of this method, we also implemented the classic random forest (RF) approach. The results were as follows: (i) 1D CNNs achieved the highest Kappa coefficient (0.942) of the four classifiers, and the highest value (0.934) in the GRU RNNs time series was attained earlier than with the other classifiers; (ii) all three deep learning methods and the RF achieved F-measures above 0.900 before the end of the growth seasons of banana, eucalyptus, second-season paddy rice, and sugarcane, while the 1D CNN classifier was the only one that could obtain an F-measure above 0.900 for pineapple before harvest. All results indicated the effectiveness of the solution combining the deep learning models with the incremental classification approach for early crop classification. This method is expected to provide new perspectives for early mapping of croplands in cloudy areas.
Tasks Crop Classification, Time Series
Published 2019-11-15
URL https://www.mdpi.com/2072-4292/11/22/2673
PDF https://www.mdpi.com/2072-4292/11/22/2673/pdf
PWC https://paperswithcode.com/paper/evaluation-of-three-deep-learning-models-for
Repo
Framework
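
A minimal sketch of the incremental early-classification loop with the 1D CNN backbone (the LSTM/GRU variants swap only the backbone): one classifier is trained and evaluated per time-series length, using all acquisitions up to that date. The architecture, optimizer, and toy data below are illustrative, not the tuned configuration from the paper.

```python
import torch
import torch.nn as nn

class Crop1DCNN(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(64, num_classes))

    def forward(self, x):            # x: (batch, bands, time_steps)
        return self.net(x)

def incremental_classification(series, labels, num_classes, epochs=5):
    """Train and evaluate one classifier per time-series length t = 1..T,
    each time using all observations available up to t."""
    accuracies = []
    for t in range(1, series.size(2) + 1):
        model = Crop1DCNN(series.size(1), num_classes)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.cross_entropy(model(series[:, :, :t]), labels)
            loss.backward()
            opt.step()
        preds = model(series[:, :, :t]).argmax(1)
        accuracies.append((preds == labels).float().mean().item())
    return accuracies   # in practice, assessed on a held-out test set per time point

# Toy data: 40 parcels, 2 SAR bands (VV/VH), 10 acquisition dates, 3 crop types.
acc = incremental_classification(torch.randn(40, 2, 10),
                                 torch.randint(0, 3, (40,)), num_classes=3)
```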

Geometry-Aware Distillation for Indoor Semantic Segmentation

Title Geometry-Aware Distillation for Indoor Semantic Segmentation
Authors Jianbo Jiao, Yunchao Wei, Zequn Jie, Honghui Shi, Rynson W.H. Lau, Thomas S. Huang
Abstract It has been shown that jointly reasoning over the 2D appearance and 3D information from RGB-D domains is beneficial to indoor scene semantic segmentation. However, most existing approaches require an accurate depth map as input to segment the scene, which severely limits their applications. In this paper, we propose to jointly infer the semantic and depth information by distilling geometry-aware embedding to eliminate such strong constraint while still exploiting the helpful depth domain information. In addition, we use this learned embedding to improve the quality of semantic segmentation, through a proposed geometry-aware propagation framework followed by several multi-level skip feature fusion blocks. By decoupling the single task prediction network into two joint tasks of semantic segmentation and geometry embedding learning, together with the proposed information propagation and feature fusion architecture, our method is shown to perform favorably against state-of-the-art methods for semantic segmentation on publicly available challenging indoor datasets.
Tasks Semantic Segmentation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Jiao_Geometry-Aware_Distillation_for_Indoor_Semantic_Segmentation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Jiao_Geometry-Aware_Distillation_for_Indoor_Semantic_Segmentation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/geometry-aware-distillation-for-indoor
Repo
Framework
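
The key point is that depth supervises a geometry branch only at training time, and the learned geometry embedding is propagated back into the semantic features, so inference needs RGB alone. The two-head module below is a schematic under that reading; the layer choices and the fusion step are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GeometryAwareSegNet(nn.Module):
    def __init__(self, num_classes, ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.geometry_head = nn.Conv2d(ch, 1, 1)           # depth-like geometry embedding
        self.fuse = nn.Conv2d(ch + 1, ch, 3, padding=1)    # propagate geometry into features
        self.semantic_head = nn.Conv2d(ch, num_classes, 1)

    def forward(self, rgb):
        feats = self.encoder(rgb)
        geometry = self.geometry_head(feats)
        fused = torch.relu(self.fuse(torch.cat([feats, geometry], dim=1)))
        return self.semantic_head(fused), geometry

# Training: cross-entropy on the semantic logits plus a regression loss between
# the geometry output and the ground-truth depth map; at test time only RGB is fed.
model = GeometryAwareSegNet(num_classes=13)
sem_logits, geom = model(torch.randn(2, 3, 64, 64))
```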

Non-Synergistic Variational Autoencoders

Title Non-Synergistic Variational Autoencoders
Authors Gonzalo Barrientos, Sten Sootla
Abstract Learning disentangled representations of the independent factors of variation that explain the data in an unsupervised setting is still a major challenge. In this paper we address the task of disentanglement and introduce a new state-of-the-art approach called Non-synergistic variational Autoencoder (Non-Syn VAE). Our model draws inspiration from population coding, where the notion of synergy arises when we describe the encoded information by neurons in the form of responses from the stimuli. If those responses convey more information together than separately as independent sources of encoding information, they are acting synergistically. By penalizing the synergistic mutual information within the latents we encourage information independence and thereby disentangle the latent factors. Notably, our approach can be added to the VAE framework easily, and the new ELBO function is still a lower bound on the log likelihood. In addition, we qualitatively compare our model with Factor VAE and show that it implicitly minimises the synergy of the latents.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=Skl3M20qYQ
PDF https://openreview.net/pdf?id=Skl3M20qYQ
PWC https://paperswithcode.com/paper/non-synergistic-variational-autoencoders
Repo
Framework