January 24, 2020

Paper Group NANR 222

BAM: A combination of deep and shallow models for German Dialect Identification.

Title BAM: A combination of deep and shallow models for German Dialect Identification.
Authors Andrei M. Butnaru
Abstract This is a submission for the Third VarDial Evaluation Campaign. In this paper, we present a machine learning approach for the German Dialect Identification (GDI) Closed Shared Task of the DSL 2019 Challenge. The proposed approach combines deep and shallow models by applying a voting scheme to the outputs of a character-level Convolutional Neural Network (Char-CNN), a Long Short-Term Memory (LSTM) network, and a model based on String Kernels. The first model is the Char-CNN, which merges multiple convolutions computed with kernels of different sizes. The second model is the LSTM network, which applies global max pooling over the returned sequences over time. Both models pass the activation maps to two fully-connected layers. The final model is based on String Kernels computed on character p-grams extracted from speech transcripts. It combines two blended kernel functions: the presence bits kernel and the intersection kernel. The empirical results obtained in the shared task show that the approach can achieve good results. The system proposed in this paper obtained fourth place with a macro-F1 score of 62.55%.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1413/
PDF https://www.aclweb.org/anthology/W19-1413
PWC https://paperswithcode.com/paper/bam-a-combination-of-deep-and-shallow-models
Repo
Framework
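
The two string kernels named in the abstract can be illustrated with a small sketch. It uses the standard definitions (the presence bits kernel counts the distinct character p-grams two strings share; the intersection kernel sums the minimum occurrence count of each shared p-gram), blended by summing over several p-gram lengths. The function names and the `p_values` range are assumptions made for illustration; the abstract does not state the p-gram lengths, normalization, or classifier the author used.

```python
from collections import Counter


def char_pgrams(text, p):
    """Count every contiguous character p-gram in text."""
    return Counter(text[i:i + p] for i in range(len(text) - p + 1))


def presence_bits_kernel(s, t, p_values):
    """Blended presence bits kernel: distinct shared p-grams, summed over p_values."""
    total = 0
    for p in p_values:  # p_values is an illustrative choice, not taken from the paper
        total += len(set(char_pgrams(s, p)) & set(char_pgrams(t, p)))
    return total


def intersection_kernel(s, t, p_values):
    """Blended intersection kernel: minimum count of each shared p-gram, summed over p_values."""
    total = 0
    for p in p_values:
        counts_s, counts_t = char_pgrams(s, p), char_pgrams(t, p)
        shared = counts_s.keys() & counts_t.keys()
        total += sum(min(counts_s[g], counts_t[g]) for g in shared)
    return total


if __name__ == "__main__":
    print(presence_bits_kernel("abab", "ababab", (2, 3)))
    print(intersection_kernel("abab", "ababab", (2, 3)))
```

In practice such kernel values are usually normalized and passed to a kernel-based classifier; that step is left out here because the abstract does not describe it.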

Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses

Title Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses
Authors Jing Shi, Jia Xu, Boqing Gong, Chenliang Xu
Abstract We investigate the problem of weakly-supervised video grounding, where only video-level sentences are provided. This is a challenging task, and previous Multi-Instance Learning (MIL) based image grounding methods tend to fail in the video domain. Recent work attempts to decompose the video-level MIL into frame-level MIL by applying a weighted sentence-frame ranking loss over frames, but it is not robust and does not exploit the rich temporal information in videos. In this work, we address these issues by extending frame-level MIL with a false positive frame-bag constraint and modeling the visual feature consistency in the video. Specifically, we design a contextual similarity between semantic and visual features to deal with sparse object associations across frames. Furthermore, we leverage temporal coherence by strengthening the clustering effect of similar features in the visual space. We conduct an extensive evaluation on the YouCookII and RoboWatch datasets, and demonstrate that our method significantly outperforms prior state-of-the-art methods.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Shi_Not_All_Frames_Are_Equal_Weakly-Supervised_Video_Grounding_With_Contextual_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Shi_Not_All_Frames_Are_Equal_Weakly-Supervised_Video_Grounding_With_Contextual_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/not-all-frames-are-equal-weakly-supervised
Repo
Framework

Supporting content evaluation of student summaries by Idea Unit embedding

Title Supporting content evaluation of student summaries by Idea Unit embedding
Authors Marcello Gecchele, Hiroaki Yamada, Takenobu Tokunaga, Yasuyo Sawaki
Abstract This paper discusses the computer-assisted content evaluation of summaries. We propose a method to make a correspondence between the segments of the source text and its summary. As a unit of the segment, we adopt "Idea Unit (IU)", which is proposed in Applied Linguistics. Introducing IUs enables us to make a correspondence even for sentences that contain multiple ideas. The IU correspondence is made based on the similarity between vector representations of IUs. An evaluation experiment with two source texts and 20 summaries showed that the proposed method is more robust against rephrased expressions than the conventional ROUGE-based baselines. Also, the proposed method outperformed the baselines in recall. We implemented the proposed method in a GUI tool, "Segment Matcher", that helps teachers establish a link between corresponding IUs across the summary and source text.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4436/
PDF https://www.aclweb.org/anthology/W19-4436
PWC https://paperswithcode.com/paper/supporting-content-evaluation-of-student
Repo
Framework

Incremental Learning Using Conditional Adversarial Networks

Title Incremental Learning Using Conditional Adversarial Networks
Authors Ye Xiang, Ying Fu, Pan Ji, Hua Huang
Abstract Incremental learning using Deep Neural Networks (DNNs) suffers from catastrophic forgetting. Existing methods mitigate it by either storing old image examples or only updating a few fully connected layers of DNNs, which, however, requires a large memory footprint or hurts the plasticity of models. In this paper, we propose a new incremental learning strategy based on conditional adversarial networks. Our new strategy allows us to use memory-efficient statistical information to store old knowledge, and to fine-tune both convolutional layers and fully connected layers to consolidate new knowledge. Specifically, we propose a model consisting of three parts, i.e., a base sub-net, a generator, and a discriminator. The base sub-net works as a feature extractor which can be pre-trained on large-scale datasets and shared across multiple image recognition tasks. The generator, conditioned on labeled embeddings, aims to construct pseudo-examples with the same distribution as the old data. The discriminator combines real examples from new data and pseudo-examples generated from the old data distribution to learn representations for both old and new classes. Through adversarial training of the discriminator and generator, we accomplish multiple continuous incremental learning. Comparison with state-of-the-art methods on the public CIFAR-100 and CUB-200 datasets shows that our method achieves the best accuracies on both old and new classes while requiring relatively less memory storage.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Xiang_Incremental_Learning_Using_Conditional_Adversarial_Networks_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Xiang_Incremental_Learning_Using_Conditional_Adversarial_Networks_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/incremental-learning-using-conditional
Repo
Framework

Emotion-Aware Human Attention Prediction

Title Emotion-Aware Human Attention Prediction
Authors Macario O. Cordel II, Shaojing Fan, Zhiqi Shen, Mohan S. Kankanhalli
Abstract Despite the recent success in face recognition and object classification, in the field of human gaze prediction, computer models are still struggling to accurately mimic human attention. One main reason is that visual attention is a complex human behavior influenced by multiple factors, ranging from low-level features (e.g., color, contrast) to high-level human perception (e.g., object interactions, object sentiment), making it difficult to model computationally. In this work, we investigate the relation between object sentiment and human attention. We first introduce a new evaluation metric (AttI) for measuring human attention that focuses on human fixation consensus. A series of empirical data analyses with AttI indicates that emotion-evoking objects are favored by attention, especially when they co-occur with emotionally-neutral objects, and that this favor varies with image complexity. Based on the empirical analyses, we design a deep neural network for human attention prediction which allows the attention bias on emotion-evoking objects to be encoded in its feature space. Experiments on two benchmark datasets demonstrate its superior performance, especially on metrics that evaluate the relative importance of salient regions. This research provides the clearest picture to date on how object sentiments influence human attention, and it makes one of the first attempts to model this phenomenon computationally.
Tasks Face Recognition, Gaze Prediction, Object Classification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Cordel_Emotion-Aware_Human_Attention_Prediction_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Cordel_Emotion-Aware_Human_Attention_Prediction_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/emotion-aware-human-attention-prediction
Repo
Framework

Developing Universal Dependencies for Wolof

Title Developing Universal Dependencies for Wolof
Authors Cheikh Bamba Dione
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-8003/
PDF https://www.aclweb.org/anthology/W19-8003
PWC https://paperswithcode.com/paper/developing-universal-dependencies-for-wolof
Repo
Framework

SyntaxFest 2019 Invited talk - Inductive biases and language emergence in communicative agents

Title SyntaxFest 2019 Invited talk - Inductive biases and language emergence in communicative agents
Authors Emmanuel Dupoux
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7701/
PDF https://www.aclweb.org/anthology/W19-7701
PWC https://paperswithcode.com/paper/syntaxfest-2019-invited-talk-inductive-biases
Repo
Framework

SyntaxFest 2019 Invited talk - Transferring NLP models across languages and domains

Title SyntaxFest 2019 Invited talk - Transferring NLP models across languages and domains
Authors Barbara Plank
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7702/
PDF https://www.aclweb.org/anthology/W19-7702
PWC https://paperswithcode.com/paper/syntaxfest-2019-invited-talk-transferring-nlp
Repo
Framework

Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting

Title Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting
Authors Vishwanath A. Sindagi, Vishal M. Patel
Abstract Crowd counting presents enormous challenges in the form of large variation in scales within images and across the dataset. These issues are further exacerbated in highly congested scenes. Approaches based on straightforward fusion of multi-scale features from a deep network seem to be obvious solutions to this problem. However, these fusion approaches do not yield significant improvements in the case of crowd counting in congested scenes. This is usually due to their limited ability to effectively combine the multi-scale features for problems like crowd counting. To overcome this, we focus on how to efficiently leverage information present in different layers of the network. Specifically, we present a network that involves: (i) a multi-level bottom-top and top-bottom fusion (MBTTBF) method to combine information from shallower to deeper layers and vice versa at multiple levels, (ii) scale complementary feature extraction blocks (SCFB) involving cross-scale residual functions to explicitly enable flow of complementary features from adjacent conv layers along the fusion paths. Furthermore, in order to increase the effectiveness of the multi-scale fusion, we employ a principled way of generating scale-aware ground-truth density maps for training. Experiments conducted on three datasets that contain highly congested scenes (ShanghaiTech, UCF_CC_50, and UCF-QNRF) demonstrate that the proposed method is able to outperform several recent methods on all the datasets.
Tasks Crowd Counting
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Sindagi_Multi-Level_Bottom-Top_and_Top-Bottom_Feature_Fusion_for_Crowd_Counting_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Sindagi_Multi-Level_Bottom-Top_and_Top-Bottom_Feature_Fusion_for_Crowd_Counting_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/multi-level-bottom-top-and-top-bottom-feature-1
Repo
Framework

Proceedings of the Eighth Workshop on Speech and Language Processing for Assistive Technologies

Title Proceedings of the Eighth Workshop on Speech and Language Processing for Assistive Technologies
Authors
Abstract
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1700/
PDF https://www.aclweb.org/anthology/W19-1700
PWC https://paperswithcode.com/paper/proceedings-of-the-eighth-workshop-on-speech
Repo
Framework

Semantic Projection Network for Zero- and Few-Label Semantic Segmentation

Title Semantic Projection Network for Zero- and Few-Label Semantic Segmentation
Authors Yongqin Xian, Subhabrata Choudhury, Yang He, Bernt Schiele, Zeynep Akata
Abstract Semantic segmentation is one of the most fundamental problems in computer vision, and pixel-level labelling in this context is particularly expensive. Hence, there have been several attempts to reduce the annotation effort, such as learning from image-level labels and bounding box annotations. In this paper we take this one step further and focus on the challenging task of zero- and few-shot learning of semantic segmentation. We define this task as image segmentation by assigning a label to every pixel even though either no labeled sample of that class was present during training, i.e. zero-label semantic segmentation, or only a few labeled samples were present, i.e. few-label semantic segmentation. Our goal is to transfer the knowledge from previously seen classes to novel classes. Our proposed semantic projection network (SPNet) achieves this goal by incorporating class-level semantic information into any network designed for semantic segmentation, in an end-to-end manner. We also propose a benchmark for this task on the challenging COCO-Stuff and PASCAL VOC12 datasets. Our model is effective not only in segmenting novel classes, i.e. alleviating expensive dense annotations, but also in adapting to novel classes without forgetting its prior knowledge, i.e. generalized zero- and few-label semantic segmentation.
Tasks Few-Shot Learning, Semantic Segmentation
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Xian_Semantic_Projection_Network_for_Zero-_and_Few-Label_Semantic_Segmentation_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Xian_Semantic_Projection_Network_for_Zero-_and_Few-Label_Semantic_Segmentation_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/semantic-projection-network-for-zero-and-few
Repo
Framework
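
One way to picture how class-level semantic information can drive per-pixel labelling is the sketch below. It scores a single pixel's feature vector against one embedding per class with a dot product and applies a softmax. This is only an illustrative reading of the abstract: the function name, the toy two-dimensional embeddings, and the class names are all hypothetical, and the paper's actual architecture and training procedure are not reproduced here.

```python
import math


def project_pixel(feature, class_embeddings):
    """Score one pixel feature against each class embedding, then apply a softmax.

    feature: list of floats for a single pixel (assumed to share the embedding dimension).
    class_embeddings: dict mapping a class name to its embedding (hypothetical values).
    """
    scores = {
        name: sum(f * e for f, e in zip(feature, emb))
        for name, emb in class_embeddings.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {name: math.exp(score - top) for name, score in scores.items()}
    norm = sum(exps.values())
    return {name: value / norm for name, value in exps.items()}


if __name__ == "__main__":
    # "zebra" stands in for a class with no labeled pixels during training:
    # it takes part in prediction simply by adding its embedding at test time.
    embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "zebra": [0.7, 0.7]}
    probabilities = project_pixel([2.0, 0.0], embeddings)
    print(max(probabilities, key=probabilities.get))
```

The point of the sketch is that the set of predictable classes is defined by the embedding table rather than by a fixed output layer, which is what makes adding novel classes possible without new dense annotations.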

Classification Approaches to Identify Informative Tweets

Title Classification Approaches to Identify Informative Tweets
Authors Piush Aggarwal
Abstract Social media platforms have become prime forums for reporting news, with users sharing what they saw, heard or read on social media. News from social media is potentially useful for various stakeholders including aid organizations, news agencies, and individuals. However, social media also contains a vast amount of non-news content. For users to be able to draw on benefits from news reported on social media it is necessary to reliably identify news content and differentiate it from non-news. In this paper, we tackle the challenge of classifying a social post as news or not. To this end, we provide a new manually annotated dataset containing 2,992 tweets from 5 different topical categories. Unlike earlier datasets, it includes postings posted by personal users who do not promote a business or a product and are not affiliated with any organization. We also investigate various baseline systems and evaluate their performance on the newly generated dataset. Our results show that the best classifiers are the SVM and BERT models.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/R19-2002/
PDF https://www.aclweb.org/anthology/R19-2002
PWC https://paperswithcode.com/paper/classification-approaches-to-identify
Repo
Framework

Weakly Supervised Attention Networks for Entity Recognition

Title Weakly Supervised Attention Networks for Entity Recognition
Authors Barun Patra, Joel Ruben Antony Moniz
Abstract The task of entity recognition has traditionally been modelled as a sequence labelling task. However, this usually requires a large amount of fine-grained data annotated at the token level, which in turn can be expensive and cumbersome to obtain. In this work, we aim to circumvent this requirement of word-level annotated data. To achieve this, we propose a novel architecture for entity recognition from a corpus containing weak binary presence/absence labels, which are relatively easier to obtain. We show that our proposed weakly supervised model, trained solely on a multi-label classification task, performs reasonably well on the task of entity recognition, despite not having access to any token-level ground truth data.
Tasks Multi-Label Classification
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1652/
PDF https://www.aclweb.org/anthology/D19-1652
PWC https://paperswithcode.com/paper/weakly-supervised-attention-networks-for
Repo
Framework

Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation

Title Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation
Authors Haofu Liao, Wei-An Lin, Jiarui Zhang, Jingdan Zhang, Jiebo Luo, S. Kevin Zhou
Abstract We propose to tackle the problem of multiview 2D/3D rigid registration for intervention via a Point-Of-Interest Network for Tracking and Triangulation (POINT^2). POINT^2 learns to establish 2D point-to-point correspondences between the pre- and intra-intervention images by tracking a set of random POIs. The 3D pose of the pre-intervention volume is then estimated through a triangulation layer. In POINT^2, the unified framework of the POI tracker and the triangulation layer enables learning informative 2D features and estimating 3D pose jointly. In contrast to existing approaches, POINT^2 only requires a single forward pass to achieve a reliable 2D/3D registration. As the POI tracker is shift-invariant, POINT^2 is more robust to the initial pose of the 3D pre-intervention image. Extensive experiments on a large-scale clinical cone-beam CT (CBCT) dataset show that the proposed POINT^2 method outperforms the existing learning-based method in terms of accuracy, robustness and running time. Furthermore, when used as an initial pose estimator, our method also improves the robustness and speed of the state-of-the-art optimization-based approaches tenfold.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Liao_Multiview_2D3D_Rigid_Registration_via_a_Point-Of-Interest_Network_for_Tracking_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Liao_Multiview_2D3D_Rigid_Registration_via_a_Point-Of-Interest_Network_for_Tracking_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/multiview-2d3d-rigid-registration-via-a-point-1
Repo
Framework

Enhancing Air Quality Prediction with Social Media and Natural Language Processing

Title Enhancing Air Quality Prediction with Social Media and Natural Language Processing
Authors Jyun-Yu Jiang, Xue Sun, Wei Wang, Sean Young
Abstract Accompanied by modern industrial developments, air pollution has already become a major concern for human health. Hence, air quality measures, such as the concentration of PM2.5, have attracted increasing attention. Even though some studies apply historical measurements to air quality forecasting, changes in air quality conditions are still hard to monitor. In this paper, we propose to exploit social media and natural language processing techniques to enhance air quality prediction. Social media users are treated as social sensors with their findings and locations. After filtering noisy tweets using word selection and topic modeling, a deep learning model based on convolutional neural networks and over-tweet-pooling is proposed to enhance air quality prediction. We conduct experiments on 7-month real-world Twitter datasets in the five most heavily polluted states in the USA. The results show that our approach significantly improves air quality prediction over the baseline that does not use social media by 6.9% to 17.7% in macro-F1 scores.
Tasks
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1251/
PDF https://www.aclweb.org/anthology/P19-1251
PWC https://paperswithcode.com/paper/enhancing-air-quality-prediction-with-social
Repo
Framework
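
Macro-F1, the metric reported in the abstract above, is the unweighted mean of the per-class F1 scores, so rare classes weigh as much as frequent ones. A minimal sketch follows; the function name and the toy labels are illustrative and are not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        per_class.append(2 * tp / denominator if denominator else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy labels, purely for illustration
    print(macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

For the toy labels, class "a" has F1 = 2/3 and class "b" has F1 = 4/5, so the macro average is 11/15.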