Paper Group NANR 209
Learning Multi-Class Segmentations From Single-Class Datasets. Adaptive Convolution for Text Classification. Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding. Distilled Person Re-Identification: Towards a More Scalable System. It’s Not About the Journey; It’s About the Destination: Following Soft Pat …
Learning Multi-Class Segmentations From Single-Class Datasets
Title | Learning Multi-Class Segmentations From Single-Class Datasets |
Authors | Konstantin Dmitriev, Arie E. Kaufman |
Abstract | Multi-class segmentation has recently achieved significant performance in natural images and videos. This achievement is due primarily to the public availability of large multi-class datasets. However, there are certain domains, such as biomedical images, where obtaining sufficient multi-class annotations is a laborious and often impossible task and only single-class datasets are available. While existing segmentation research in such domains uses private multi-class datasets or focuses on single-class segmentations, we propose a unified, highly efficient framework for robust simultaneous learning of multi-class segmentations by combining single-class datasets and utilizing a novel way of conditioning a convolutional network for the purpose of segmentation. We demonstrate various ways of incorporating the conditional information, perform an extensive evaluation, and show compelling multi-class segmentation performance on biomedical images, which outperforms current state-of-the-art solutions (by up to 2.7%). Unlike current solutions, which are meticulously tailored for particular single-class datasets, we utilize datasets from a variety of sources. Furthermore, we also show the applicability of our method to natural images and evaluate it on the Cityscapes dataset. We further discuss other possible applications of our proposed framework. |
Tasks | |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Dmitriev_Learning_Multi-Class_Segmentations_From_Single-Class_Datasets_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Dmitriev_Learning_Multi-Class_Segmentations_From_Single-Class_Datasets_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/learning-multi-class-segmentations-from |
Repo | |
Framework | |
Adaptive Convolution for Text Classification
Title | Adaptive Convolution for Text Classification |
Authors | Byung-Ju Choi, Jun-Hyung Park, SangKeun Lee |
Abstract | In this paper, we present an adaptive convolution for text classification to give flexibility to convolutional neural networks (CNNs). Unlike traditional convolutions, which utilize the same set of filters regardless of the input, the adaptive convolution employs adaptively generated convolutional filters conditioned on inputs. We achieve this by attaching filter-generating networks, which are carefully designed to generate input-specific filters, to convolution blocks in existing CNNs. We show the efficacy of our approach in existing CNNs based on the performance evaluation. Our evaluation indicates that all of our baselines achieve performance improvements of up to 2.6 percentage points with adaptive convolutions on seven benchmark text classification datasets. |
Tasks | Text Classification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1256/ |
https://www.aclweb.org/anthology/N19-1256 | |
PWC | https://paperswithcode.com/paper/adaptive-convolution-for-text-classification |
Repo | |
Framework | |
Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding
Title | Cross-lingual Transfer Learning with Data Selection for Large-Scale Spoken Language Understanding |
Authors | Quynh Do, Judith Gaspers |
Abstract | A typical cross-lingual transfer learning approach for boosting model performance on a language is to pre-train the model on all available supervised data from another language. However, in large-scale systems this leads to high training times and computational requirements. In addition, characteristic differences between the source and target languages raise a natural question of whether source data selection can improve the knowledge transfer. In this paper, we address this question and propose a simple but effective language-model-based source-language data selection method for cross-lingual transfer learning in large-scale spoken language understanding. The experimental results show that with data selection i) the amount of source data, and hence training time, is reduced significantly and ii) model performance is improved. |
Tasks | Cross-Lingual Transfer, Language Modelling, Spoken Language Understanding, Transfer Learning |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1153/ |
https://www.aclweb.org/anthology/D19-1153 | |
PWC | https://paperswithcode.com/paper/cross-lingual-transfer-learning-with-data |
Repo | |
Framework | |
Distilled Person Re-Identification: Towards a More Scalable System
Title | Distilled Person Re-Identification: Towards a More Scalable System |
Authors | Ancong Wu, Wei-Shi Zheng, Xiaowei Guo, Jian-Huang Lai |
Abstract | Person re-identification (Re-ID), for matching pedestrians across non-overlapping camera views, has made great progress in supervised learning with abundant labelled data. However, the scalability problem is the bottleneck for applications in large-scale systems. We consider the scalability problem of Re-ID from three aspects: (1) low labelling cost by reducing the number of labels, (2) low extension cost by reusing existing knowledge and (3) low testing computation cost by using lightweight models. These requirements render scalable Re-ID a challenging problem. To solve these problems in a unified system, we propose a Multi-teacher Adaptive Similarity Distillation Framework, which requires only a few labelled identities of the target domain to transfer knowledge from multiple teacher models to a user-specified lightweight student model without accessing source domain data. We propose the Log-Euclidean Similarity Distillation Loss for Re-ID and further integrate the Adaptive Knowledge Aggregator to select effective teacher models to transfer target-adaptive knowledge. Extensive evaluations show that our method can extend with high scalability and the performance is comparable to the state-of-the-art unsupervised and semi-supervised Re-ID methods. |
Tasks | Person Re-Identification |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Wu_Distilled_Person_Re-Identification_Towards_a_More_Scalable_System_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Wu_Distilled_Person_Re-Identification_Towards_a_More_Scalable_System_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/distilled-person-re-identification-towards-a |
Repo | |
Framework | |
It’s Not About the Journey; It’s About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning
Title | It’s Not About the Journey; It’s About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning |
Authors | Monica Haurilet, Alina Roitberg, Rainer Stiefelhagen |
Abstract | Visual Reasoning remains a challenging task, as it has to deal with long-range and multi-step object relationships in the scene. We present a new model for Visual Reasoning, aimed at capturing the interplay among individual objects in the image represented as a scene graph. As not all graph components are relevant for the query, we introduce the concept of a question-based visual guide, which constrains the potential solution space by learning an optimal traversal scheme, where the final destination nodes alone are used to produce the answer. We show that finding relevant semantic structures facilitates generalization to new tasks by introducing a novel problem of knowledge transfer: training on one question type and answering questions from a different domain without any training data. Furthermore, we report state-of-the-art results for Visual Reasoning on multiple query types and diverse image and video datasets. |
Tasks | Transfer Learning, Visual Reasoning |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Haurilet_Its_Not_About_the_Journey_Its_About_the_Destination_Following_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Haurilet_Its_Not_About_the_Journey_Its_About_the_Destination_Following_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/its-not-about-the-journey-its-about-the |
Repo | |
Framework | |
Graphemic ambiguous queries on Arabic-scripted historical corpora
Title | Graphemic ambiguous queries on Arabic-scripted historical corpora |
Authors | Alicia González Martínez |
Abstract | |
Tasks | |
Published | 2019-09-01 |
URL | https://www.aclweb.org/anthology/W19-9001/ |
https://www.aclweb.org/anthology/W19-9001 | |
PWC | https://paperswithcode.com/paper/graphemic-ambiguous-queries-on-arabic |
Repo | |
Framework | |
Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13)
Title | Proceedings of the Thirteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-13) |
Authors | |
Abstract | |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5300/ |
https://www.aclweb.org/anthology/D19-5300 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-thirteenth-workshop-on-1 |
Repo | |
Framework | |
WSLLN:Weakly Supervised Natural Language Localization Networks
Title | WSLLN:Weakly Supervised Natural Language Localization Networks |
Authors | Mingfei Gao, Larry Davis, Richard Socher, Caiming Xiong |
Abstract | We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries. To learn the correspondence between visual segments and texts, most previous methods require temporal coordinates (start and end times) of events for training, which leads to high annotation costs. WSLLN relieves the annotation burden by training with only video-sentence pairs, without access to the temporal locations of events. With a simple end-to-end structure, WSLLN measures segment-text consistency and conducts segment selection (conditioned on the text) simultaneously. Results from both are merged and optimized as a video-sentence matching problem. Experiments on ActivityNet Captions and DiDeMo demonstrate that WSLLN achieves state-of-the-art performance. |
Tasks | |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1157/ |
https://www.aclweb.org/anthology/D19-1157 | |
PWC | https://paperswithcode.com/paper/wsllnweakly-supervised-natural-language |
Repo | |
Framework | |
DL2: Training and Querying Neural Networks with Logic
Title | DL2: Training and Querying Neural Networks with Logic |
Authors | Marc Fischer, Mislav Balunovic, Dana Drachsler-Cohen, Timon Gehr, Ce Zhang, Martin Vechev |
Abstract | We present DL2, a system for training and querying neural networks with logical constraints. The key idea is to translate these constraints into a differentiable loss with desirable mathematical properties and to then either train with this loss in an iterative manner or to use the loss for querying the network for inputs subject to the constraints. We empirically demonstrate that DL2 is effective in both training and querying scenarios, across a range of constraints and data sets. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=H1faSn0qY7 |
https://openreview.net/pdf?id=H1faSn0qY7 | |
PWC | https://paperswithcode.com/paper/dl2-training-and-querying-neural-networks |
Repo | |
Framework | |
Enhancing Transformer for End-to-end Speech-to-Text Translation
Title | Enhancing Transformer for End-to-end Speech-to-Text Translation |
Authors | Mattia Antonino Di Gangi, Matteo Negri, Roldano Cattoni, Roberto Dessi, Marco Turchi |
Abstract | |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-6603/ |
https://www.aclweb.org/anthology/W19-6603 | |
PWC | https://paperswithcode.com/paper/enhancing-transformer-for-end-to-end-speech |
Repo | |
Framework | |
SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums
Title | SemEval-2019 Task 9: Suggestion Mining from Online Reviews and Forums |
Authors | Sapna Negi, Tobias Daudert, Paul Buitelaar |
Abstract | We present the pilot SemEval task on Suggestion Mining. The task consists of subtasks A and B, where we created labeled data from a feedback forum and hotel reviews, respectively. Subtask A provides training and test data from the same domain, while Subtask B evaluates the system on a test dataset from a different domain than the available training data. 33 teams participated in the shared task, with a total of 50 members. We summarize the problem definition, benchmark dataset preparation, and methods used by the participating teams, providing details of the methods used by the top-ranked systems. The dataset is made freely available to help advance research in suggestion mining and to reproduce the systems submitted under this task. |
Tasks | |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2151/ |
https://www.aclweb.org/anthology/S19-2151 | |
PWC | https://paperswithcode.com/paper/semeval-2019-task-9-suggestion-mining-from |
Repo | |
Framework | |
Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change?
Title | Identifying Temporal Trends Based on Perplexity and Clustering: Are We Looking at Language Change? |
Authors | Sidsel Boldsen, Manex Agirrezabal, Patrizia Paggio |
Abstract | In this work we propose a data-driven methodology for identifying temporal trends in a corpus of medieval charters. We have used perplexities derived from RNNs as a distance measure between documents and then performed clustering on those distances. We argue that perplexities calculated by such language models are representative of temporal trends. The clusters produced using the K-Means algorithm give insight into the differences in language across time periods, which are at least partly due to language change. We suggest that the temporal distribution of the individual clusters might provide a more nuanced picture of temporal trends compared to discrete bins, thus providing better results when used in a classification task. |
Tasks | |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4711/ |
https://www.aclweb.org/anthology/W19-4711 | |
PWC | https://paperswithcode.com/paper/identifying-temporal-trends-based-on |
Repo | |
Framework | |
Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series—A Case Study in Zhanjiang, China
Title | Evaluation of Three Deep Learning Models for Early Crop Classification Using Sentinel-1A Imagery Time Series—A Case Study in Zhanjiang, China |
Authors | Hongwei Zhao, Zhongxin Chen, Hao Jiang, Wenlong Jing, Liang Sun, Min Feng |
Abstract | Timely and accurate estimation of the area and distribution of crops is vital for food security. Optical remote sensing has been a key technique for acquiring crop area and conditions on regional to global scales, but great challenges arise due to frequent cloudy days in southern China. This makes optical remote sensing images usually unavailable. Synthetic aperture radar (SAR) could bridge this gap since it is less affected by clouds. The recent availability of Sentinel-1A (S1A) SAR imagery with a 12-day revisit period at a high spatial resolution of about 10 m makes it possible to fully utilize phenological information to improve early crop classification. In deep learning methods, one-dimensional convolutional neural networks (1D CNNs), long short-term memory recurrent neural networks (LSTM RNNs), and gated recurrent unit RNNs (GRU RNNs) have been shown to efficiently extract temporal features for classification tasks. However, due to the complexity of training, these three deep learning methods have been less used in early crop classification. In this work, we attempted to combine them with an incremental classification method to avoid the need for training optimal architectures and hyper-parameters for data from each time series. First, we trained 1D CNNs, LSTM RNNs, and GRU RNNs based on the full images’ time series to attain three classifiers with optimal architectures and hyper-parameters. Then, starting at the first time point, we performed an incremental classification process to train each classifier using all of the previous data, and obtained a classification network with all parameter values (including the hyper-parameters) at each time point. Finally, test accuracies of each time point were assessed for each crop type to determine the optimal time series length. A case study was conducted in Suixi and Leizhou counties of Zhanjiang City, China.
To verify the effectiveness of this method, we also implemented the classic random forest (RF) approach. The results were as follows: (i) 1D CNNs achieved the highest Kappa coefficient (0.942) of the four classifiers, and the highest value (0.934) in the GRU RNNs time series was attained earlier than with other classifiers; (ii) all three deep learning methods and the RF achieved F measures above 0.900 before the end of growth seasons of banana, eucalyptus, second-season paddy rice, and sugarcane; while, the 1D CNN classifier was the only one that could obtain an F-measure above 0.900 for pineapple before harvest. All results indicated the effectiveness of the solution combining the deep learning models with the incremental classification approach for early crop classification. This method is expected to provide new perspectives for early mapping of croplands in cloudy areas. |
Tasks | Crop Classification, Time Series |
Published | 2019-11-15 |
URL | https://www.mdpi.com/2072-4292/11/22/2673 |
https://www.mdpi.com/2072-4292/11/22/2673/pdf | |
PWC | https://paperswithcode.com/paper/evaluation-of-three-deep-learning-models-for |
Repo | |
Framework | |
Geometry-Aware Distillation for Indoor Semantic Segmentation
Title | Geometry-Aware Distillation for Indoor Semantic Segmentation |
Authors | Jianbo Jiao, Yunchao Wei, Zequn Jie, Honghui Shi, Rynson W.H. Lau, Thomas S. Huang |
Abstract | It has been shown that jointly reasoning about the 2D appearance and 3D information from RGB-D domains is beneficial to indoor scene semantic segmentation. However, most existing approaches require an accurate depth map as input to segment the scene, which severely limits their applications. In this paper, we propose to jointly infer the semantic and depth information by distilling a geometry-aware embedding, eliminating this strong constraint while still exploiting the helpful depth domain information. In addition, we use this learned embedding to improve the quality of semantic segmentation, through a proposed geometry-aware propagation framework followed by several multi-level skip feature fusion blocks. By decoupling the single task prediction network into two joint tasks of semantic segmentation and geometry embedding learning, together with the proposed information propagation and feature fusion architecture, our method is shown to perform favorably against state-of-the-art methods for semantic segmentation on publicly available challenging indoor datasets. |
Tasks | Semantic Segmentation |
Published | 2019-06-01 |
URL | http://openaccess.thecvf.com/content_CVPR_2019/html/Jiao_Geometry-Aware_Distillation_for_Indoor_Semantic_Segmentation_CVPR_2019_paper.html |
http://openaccess.thecvf.com/content_CVPR_2019/papers/Jiao_Geometry-Aware_Distillation_for_Indoor_Semantic_Segmentation_CVPR_2019_paper.pdf | |
PWC | https://paperswithcode.com/paper/geometry-aware-distillation-for-indoor |
Repo | |
Framework | |
Non-Synergistic Variational Autoencoders
Title | Non-Synergistic Variational Autoencoders |
Authors | Gonzalo Barrientos, Sten Sootla |
Abstract | Learning disentangled representations of the independent factors of variation that explain the data in an unsupervised setting is still a major challenge. In the following paper we address the task of disentanglement and introduce a new state-of-the-art approach called Non-synergistic variational Autoencoder (Non-Syn VAE). Our model draws inspiration from population coding, where the notion of synergy arises when we describe the information encoded by neurons in the form of responses to stimuli. If those responses convey more information together than separately as independent sources of encoding information, they are acting synergistically. By penalizing the synergistic mutual information within the latents we encourage information independence and thereby disentangle the latent factors. Notably, our approach could be added to the VAE framework easily, where the new ELBO function is still a lower bound on the log likelihood. In addition, we qualitatively compare our model with Factor VAE and show that this one implicitly minimises the synergy of the latents. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=Skl3M20qYQ |
https://openreview.net/pdf?id=Skl3M20qYQ | |
PWC | https://paperswithcode.com/paper/non-synergistic-variational-autoencoders |
Repo | |
Framework | |