January 24, 2020

Paper Group NANR 259

Combining adaptive algorithms and hypergradient method: a performance and robustness study. Better Modeling of Incomplete Annotations for Named Entity Recognition. Hierarchical Self-Attention Network for Action Localization in Videos. Data-Driven Morphological Analysis for Uralic Languages. Critical Learning Periods in Deep Networks. SSF-DAN: Separ …

Combining adaptive algorithms and hypergradient method: a performance and robustness study

Title Combining adaptive algorithms and hypergradient method: a performance and robustness study
Authors Akram Erraqabi, Nicolas Le Roux
Abstract Wilson et al. (2017) showed that, when the stepsize schedule is properly designed, stochastic gradient generalizes better than ADAM (Kingma & Ba, 2014). In light of recent work on hypergradient methods (Baydin et al., 2018), we revisit these claims to see if such methods close the gap between the most popular optimizers. As a byproduct, we analyze the true benefit of these hypergradient methods compared to more classical schedules, such as the fixed decay of Wilson et al. (2017). In particular, we observe they are of marginal help since their performance varies significantly when tuning their hyperparameters. Finally, as robustness is a critical quality of an optimizer, we provide a sensitivity analysis of these gradient based optimizers to assess how challenging their tuning is.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=rJgSV3AqKQ
PDF https://openreview.net/pdf?id=rJgSV3AqKQ
PWC https://paperswithcode.com/paper/combining-adaptive-algorithms-and
Repo
Framework
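
The hypergradient idea mentioned in the abstract can be illustrated with a minimal sketch in the style of the SGD variant of Baydin et al. (2018): the stepsize is nudged up when successive gradients align and down when they oppose each other. The function name `sgd_hd`, the quadratic test objective, and the hyperparameter values are assumptions made for illustration; they are not taken from the paper listed above.

```python
def sgd_hd(grad, theta, alpha=0.01, beta=1e-4, steps=100):
    """SGD whose stepsize alpha is itself updated by a hypergradient step.

    grad  : function returning the gradient (list of floats) at theta
    alpha : initial stepsize (illustrative value)
    beta  : hypergradient stepsize (illustrative value)
    """
    prev_grad = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta)
        # Dot product of the current and previous gradients: positive when
        # they point the same way, so alpha grows; negative shrinks alpha.
        alpha += beta * sum(a * b for a, b in zip(g, prev_grad))
        theta = [t - alpha * gi for t, gi in zip(theta, g)]
        prev_grad = g
    return theta, alpha


def quad_grad(theta):
    """Gradient of the toy objective f(theta) = sum(theta_i ** 2)."""
    return [2.0 * t for t in theta]


theta, alpha = sgd_hd(quad_grad, [3.0, -2.0])
```

On this toy objective the stepsize grows from its initial value while the iterates shrink toward zero. How sensitive the outcome is to `alpha` and `beta` is the kind of question the abstract's sensitivity analysis addresses.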

Better Modeling of Incomplete Annotations for Named Entity Recognition

Title Better Modeling of Incomplete Annotations for Named Entity Recognition
Authors Zhanming Jie, Pengjun Xie, Wei Lu, Ruixue Ding, Linlin Li
Abstract Supervised approaches to named entity recognition (NER) are largely developed based on the assumption that the training data is fully annotated with named entity information. However, in practice, annotated data can often be imperfect, with one typical issue being that the training data may contain incomplete annotations. We highlight several pitfalls associated with learning under such a setup in the context of NER and identify limitations associated with existing approaches, proposing a novel yet easy-to-implement approach for recognizing named entities with incomplete data annotations. We demonstrate the effectiveness of our approach through extensive experiments.
Tasks Named Entity Recognition
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1079/
PDF https://www.aclweb.org/anthology/N19-1079
PWC https://paperswithcode.com/paper/better-modeling-of-incomplete-annotations-for
Repo
Framework

Hierarchical Self-Attention Network for Action Localization in Videos

Title Hierarchical Self-Attention Network for Action Localization in Videos
Authors Rizard Renanda Adhi Pramono, Yie-Tarng Chen, Wen-Hsien Fang
Abstract This paper presents a novel Hierarchical Self-Attention Network (HISAN) to generate spatial-temporal tubes for action localization in videos. The essence of HISAN is to combine the two-stream convolutional neural network (CNN) with a hierarchical bidirectional self-attention mechanism, which comprises two levels of bidirectional self-attention to capture both long-term temporal dependency information and spatial context information, yielding more precise action localization. Also, a sequence rescoring (SR) algorithm is employed to resolve the dilemma of inconsistent detection scores incurred by occlusion or background clutter. Moreover, a new fusion scheme is invoked, which integrates not only the appearance and motion information from the two-stream network, but also the motion saliency to mitigate the effect of camera motion. Simulations reveal that the new approach achieves performance competitive with state-of-the-art works in terms of action localization and recognition accuracy on the widely used UCF101-24 and J-HMDB datasets.
Tasks Action Localization
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Pramono_Hierarchical_Self-Attention_Network_for_Action_Localization_in_Videos_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Pramono_Hierarchical_Self-Attention_Network_for_Action_Localization_in_Videos_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/hierarchical-self-attention-network-for
Repo
Framework

Data-Driven Morphological Analysis for Uralic Languages

Title Data-Driven Morphological Analysis for Uralic Languages
Authors Miikka Silfverberg, Francis Tyers
Abstract
Tasks Lemmatization, Morphological Analysis
Published 2019-01-01
URL https://www.aclweb.org/anthology/W19-0301/
PDF https://www.aclweb.org/anthology/W19-0301
PWC https://paperswithcode.com/paper/data-driven-morphological-analysis-for-uralic
Repo
Framework

Critical Learning Periods in Deep Networks

Title Critical Learning Periods in Deep Networks
Authors Alessandro Achille, Matteo Rovere, Stefano Soatto
Abstract Similar to humans and animals, deep artificial neural networks exhibit critical periods during which a temporary stimulus deficit can impair the development of a skill. The extent of the impairment depends on the onset and length of the deficit window, as in animal models, and on the size of the neural network. Deficits that do not affect low-level statistics, such as vertical flipping of the images, have no lasting effect on performance and can be overcome with further training. To better understand this phenomenon, we use the Fisher Information of the weights to measure the effective connectivity between layers of a network during training. Counterintuitively, information rises rapidly in the early phases of training, and then decreases, preventing redistribution of information resources in a phenomenon we refer to as a loss of “Information Plasticity”. Our analysis suggests that the first few epochs are critical for the creation of strong connections that are optimal relative to the input data distribution. Once such strong connections are created, they do not appear to change during additional training. These findings suggest that the initial learning transient, under-scrutinized compared to asymptotic behavior, plays a key role in determining the outcome of the training process. Our findings, combined with recent theoretical results in the literature, also suggest that forgetting (decrease of information in the weights) is critical to achieving invariance and disentanglement in representation learning. Finally, critical periods are not restricted to biological systems, but can emerge naturally in learning systems, whether biological or artificial, due to fundamental constraints arising from learning dynamics and information processing.
Tasks Representation Learning
Published 2019-05-01
URL https://openreview.net/forum?id=BkeStsCcKQ
PDF https://openreview.net/pdf?id=BkeStsCcKQ
PWC https://paperswithcode.com/paper/critical-learning-periods-in-deep-networks
Repo
Framework
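
To make the quantity named in the abstract concrete, here is a minimal sketch of the trace of the Fisher Information of the weights for a much simpler model than the deep networks studied in the paper: binary logistic regression. The model choice, the function names, and the sample inputs are assumptions made for illustration only; the paper's own estimator for deep networks is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fisher_trace(w, X):
    """Trace of the Fisher Information of logistic-regression weights w.

    For p = sigmoid(w . x) the score is (y - p) * x, so its expected squared
    norm over y ~ Bernoulli(p) is p * (1 - p) * ||x||^2. Averaging this over
    the inputs X gives the trace of the Fisher Information matrix.
    """
    total = 0.0
    for x in X:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total += p * (1.0 - p) * sum(xi * xi for xi in x)
    return total / len(X)


X = [[1.0, 0.0], [0.0, 2.0]]           # hypothetical inputs
trace_at_zero = fisher_trace([0.0, 0.0], X)
trace_confident = fisher_trace([10.0, 0.0], X)
```

In this toy model the trace is largest where predictions are uncertain and shrinks as the model becomes confident on an input, which gives some intuition for treating it as a measure of how much the weights still matter to the output.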

SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation

Title SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation
Authors Liang Du, Jingang Tan, Hongye Yang, Jianfeng Feng, Xiangyang Xue, Qibao Zheng, Xiaoqing Ye, Xiaolin Zhang
Abstract Despite the great success achieved by supervised fully convolutional models in semantic segmentation, training the models requires a large amount of labor-intensive work to generate pixel-level annotations. Recent works exploit synthetic data to train the model for semantic segmentation, but the domain adaptation between real and synthetic images remains a challenging problem. In this work, we propose a Separated Semantic Feature based domain adaptation network, named SSF-DAN, for semantic segmentation. First, a Semantic-wise Separable Discriminator (SS-D) is designed to independently adapt semantic features across the target and source domains, which addresses the inconsistent adaptation issue in the class-wise adversarial learning. In SS-D, a progressive confidence strategy is included to achieve a more reliable separation. Then, an efficient Class-wise Adversarial loss Reweighting module (CA-R) is introduced to balance the class-wise adversarial learning process, which leads the generator to focus more on poorly adapted classes. The presented framework demonstrates robust performance, superior to state-of-the-art methods on benchmark datasets.
Tasks Domain Adaptation, Semantic Segmentation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Du_SSF-DAN_Separated_Semantic_Feature_Based_Domain_Adaptation_Network_for_Semantic_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Du_SSF-DAN_Separated_Semantic_Feature_Based_Domain_Adaptation_Network_for_Semantic_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/ssf-dan-separated-semantic-feature-based
Repo
Framework

BLCU_NLP at SemEval-2019 Task 7: An Inference Chain-based GPT Model for Rumour Evaluation

Title BLCU_NLP at SemEval-2019 Task 7: An Inference Chain-based GPT Model for Rumour Evaluation
Authors Ruoyao Yang, Wanying Xie, Chunhua Liu, Dong Yu
Abstract The rapid spread of unsubstantiated rumours on social media platforms has drawn increasing research attention to rumour evaluation, including SemEval 2019 Task 7. However, labelled data for learning rumour veracity is scarce, and labels in rumour stance data are highly disproportionate, making it challenging for a model to perform supervised learning adequately. We propose an inference chain-based system, which fully utilizes conversation structure-based knowledge in the limited data and expands the training data in minority categories to alleviate class imbalance. Our approach obtains a 12.6% improvement over the baseline system for subtask A, ranks 1st among 21 systems in subtask A, and ranks 4th among 12 systems in subtask B.
Tasks Rumour Detection
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2191/
PDF https://www.aclweb.org/anthology/S19-2191
PWC https://paperswithcode.com/paper/blcu_nlp-at-semeval-2019-task-7-an-inference
Repo
Framework

DON’T JUDGE A BOOK BY ITS COVER - ON THE DYNAMICS OF RECURRENT NEURAL NETWORKS

Title DON’T JUDGE A BOOK BY ITS COVER - ON THE DYNAMICS OF RECURRENT NEURAL NETWORKS
Authors Doron Haviv, Alexander Rivkind, Omri Barak
Abstract To be effective in sequential data processing, Recurrent Neural Networks (RNNs) are required to keep track of past events by creating memories. Consequently, RNNs are harder to train than their feedforward counterparts, prompting the development of both dedicated units such as LSTM and GRU and of a handful of training tricks. In this paper, we investigate the effect of different training protocols on the representation of memories in RNNs. While reaching similar performance for different protocols, RNNs are shown to exhibit substantial differences in their ability to generalize to unforeseen tasks or conditions. We analyze the dynamics of the network’s hidden state, and uncover the reasons for this difference. Each memory is found to be associated with a nearly steady state of the dynamics whose speed predicts performance on unforeseen tasks and which we refer to as a ‘slow point’. By tracing the formation of the slow points we are able to understand the origin of differences between training protocols. Our results show that multiple solutions to the same task exist but may rely on different dynamical mechanisms, and that training protocols can bias the choice of such solutions in an interpretable way.
Tasks
Published 2019-05-01
URL https://openreview.net/forum?id=H1z_Z2A5tX
PDF https://openreview.net/pdf?id=H1z_Z2A5tX
PWC https://paperswithcode.com/paper/dont-judge-a-book-by-its-cover-on-the
Repo
Framework

Columbia at SemEval-2019 Task 7: Multi-task Learning for Stance Classification and Rumour Verification

Title Columbia at SemEval-2019 Task 7: Multi-task Learning for Stance Classification and Rumour Verification
Authors Zhuoran Liu, Shivali Goel, Mukund Yelahanka Raghuprasad, Smaranda Muresan
Abstract The paper presents the Columbia team's participation in the SemEval 2019 Shared Task 7: RumourEval 2019. Detecting rumours on social networks has been a focus of research in recent years. Previous work suffered from data sparsity, which potentially limited the application of more sophisticated neural architectures to this task. We mitigate this problem by proposing a multi-task learning approach together with language model fine-tuning. Our attention-based model allows different tasks to leverage different levels of information. Our system ranked 6th overall with an F1-score of 36.25 on stance classification and F1 of 22.44 on rumour verification.
Tasks Language Modelling, Multi-Task Learning
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2194/
PDF https://www.aclweb.org/anthology/S19-2194
PWC https://paperswithcode.com/paper/columbia-at-semeval-2019-task-7-multi-task
Repo
Framework

Neural Network Prediction of Censorable Language

Title Neural Network Prediction of Censorable Language
Authors Kei Yin Ng, Anna Feldman, Jing Peng, Chris Leberknight
Abstract Internet censorship imposes restrictions on what information can be publicized or viewed on the Internet. According to Freedom House's annual Freedom on the Net report, more than half the world's Internet users now live in a place where the Internet is censored or restricted. China has built the world's most extensive and sophisticated online censorship system. In this paper, we describe a new corpus of censored and uncensored social media tweets from a Chinese microblogging website, Sina Weibo, collected by tracking posts that mention ‘sensitive’ topics or are authored by ‘sensitive’ users. We use this corpus to build a neural network classifier to predict censorship. Our model achieves 88.50% accuracy using only linguistic features. We discuss these features in detail and hypothesize that they could potentially be used for censorship circumvention.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-2105/
PDF https://www.aclweb.org/anthology/W19-2105
PWC https://paperswithcode.com/paper/neural-network-prediction-of-censorable
Repo
Framework

SINAI-DL at SemEval-2019 Task 7: Data Augmentation and Temporal Expressions

Title SINAI-DL at SemEval-2019 Task 7: Data Augmentation and Temporal Expressions
Authors Miguel A. García-Cumbreras, Salud María Jiménez-Zafra, Arturo Montejo-Ráez, Manuel Carlos Díaz-Galiano, Estela Saquete
Abstract This paper describes the participation of the SINAI-DL team at RumourEval (Task 7 in SemEval 2019, subtask A: SDQC). SDQC addresses the challenge of rumour stance classification as an indirect way of identifying potential rumours. Given a tweet with several replies, our system classifies each reply into either supporting, denying, questioning or commenting on the underlying rumours. We have applied data augmentation, temporal expressions labelling and transfer learning with a four-layer neural classifier. We achieve an accuracy of 0.715 with the official run over reply tweets.
Tasks Data Augmentation, Rumour Detection, Transfer Learning
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2196/
PDF https://www.aclweb.org/anthology/S19-2196
PWC https://paperswithcode.com/paper/sinai-dl-at-semeval-2019-task-7-data
Repo
Framework

UPV-28-UNITO at SemEval-2019 Task 7: Exploiting Post’s Nesting and Syntax Information for Rumor Stance Classification

Title UPV-28-UNITO at SemEval-2019 Task 7: Exploiting Post’s Nesting and Syntax Information for Rumor Stance Classification
Authors Bilal Ghanem, Alessandra Teresa Cignarella, Cristina Bosco, Paolo Rosso, Francisco Manuel Rangel Pardo
Abstract In the present paper we describe the UPV-28-UNITO system's submission to the RumorEval 2019 shared task. The approach we applied to both subtasks of the contest exploits classical machine learning algorithms as well as word embeddings, and it is based on diverse groups of features: stylistic, lexical, emotional, sentiment, meta-structural and Twitter-based. The paper also introduces a novel set of features that take advantage of the syntactic information in texts.
Tasks Word Embeddings
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2197/
PDF https://www.aclweb.org/anthology/S19-2197
PWC https://paperswithcode.com/paper/upv-28-unito-at-semeval-2019-task-7
Repo
Framework

Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation

Title Distant Supervised Centroid Shift: A Simple and Efficient Approach to Visual Domain Adaptation
Authors Jian Liang, Ran He, Zhenan Sun, Tieniu Tan
Abstract Conventional domain adaptation methods usually resort to deep neural networks or subspace learning to find invariant representations across domains. However, most deep learning methods highly rely on large-size source domains and are computationally expensive to train, while subspace learning methods always have a quadratic time complexity that suffers from the large domain size. This paper provides a simple and efficient solution, which could be regarded as a well-performing baseline for domain adaptation tasks. Our method is built upon the nearest centroid classifier, seeking a subspace where the centroids in the target domain are moderately shifted from those in the source domain. Specifically, we design a unified objective without accessing the source domain data and adopt an alternating minimization scheme to iteratively discover the pseudo target labels, invariant subspace, and target centroids. Besides its privacy-preserving property (distant supervision), the algorithm is provably convergent and has a promising linear time complexity. In addition, the proposed method can be readily extended to multi-source setting and domain generalization, and it remarkably enhances popular deep adaptation methods by borrowing the learned transferable features. Extensive experiments on several benchmarks including object, digit, and face recognition datasets validate that our methods yield state-of-the-art results in various domain adaptation tasks.
Tasks Domain Adaptation, Domain Generalization, Face Recognition
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Liang_Distant_Supervised_Centroid_Shift_A_Simple_and_Efficient_Approach_to_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Liang_Distant_Supervised_Centroid_Shift_A_Simple_and_Efficient_Approach_to_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/distant-supervised-centroid-shift-a-simple
Repo
Framework
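
The abstract says the method is built upon the nearest centroid classifier. A minimal sketch of that base classifier alone follows; the subspace learning, centroid shift, and pseudo-labelling steps of the paper are not reproduced. The function names and the toy data are assumptions made for illustration.

```python
from math import dist


def fit_centroids(X, y):
    """Return {label: mean feature vector} for the labelled points (X, y)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, x):
    """Assign x to the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


# Hypothetical two-class toy data.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
```

Fitting makes a single pass over the data and prediction compares a point against one centroid per class, which is consistent with the abstract's emphasis on a simple baseline with linear time complexity.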

CodeForTheChange at SemEval-2019 Task 8: Skip-Thoughts for Fact Checking in Community Question Answering

Title CodeForTheChange at SemEval-2019 Task 8: Skip-Thoughts for Fact Checking in Community Question Answering
Authors Adithya Avvaru, Anupam Pandey
Abstract The strengths of the scalable gradient tree boosting algorithm XGBoost and the distributed sentence encoder Skip-Thought Vectors have not yet been explored by the cQA research community. We apply and combine these two effective methods to determine the factual nature of questions and answers. The work also includes experimentation with other popular classifier models such as AdaBoost Classifier, DecisionTree Classifier, RandomForest Classifier, ExtraTrees Classifier, XGBoost Classifier and Multi-layer Neural Network. In this paper, we present the features used, the approaches followed for feature engineering, the models experimented with, and finally the results.
Tasks Community Question Answering, Feature Engineering, Question Answering
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2199/
PDF https://www.aclweb.org/anthology/S19-2199
PWC https://paperswithcode.com/paper/codeforthechange-at-semeval-2019-task-8-skip
Repo
Framework

ColumbiaNLP at SemEval-2019 Task 8: The Answer is Language Model Fine-tuning

Title ColumbiaNLP at SemEval-2019 Task 8: The Answer is Language Model Fine-tuning
Authors Tuhin Chakrabarty, Smaranda Muresan
Abstract Community Question Answering forums are very popular nowadays, as they represent an effective means for communities to share information around particular topics. But the information shared on these forums is often not authentic. This paper presents the ColumbiaNLP submission for the SemEval-2019 Task 8: Fact-Checking in Community Question Answering Forums. We show how fine-tuning a language model on a large unannotated corpus of old threads from the Qatar Living forum helps us to classify question types (factual, opinion, socializing) and to judge the factuality of answers on the shared task labeled data from the same forum. Our system finished 4th and 2nd on Subtask A (question type classification) and B (answer factuality prediction), respectively, based on the official metric of accuracy.
Tasks Community Question Answering, Language Modelling, Question Answering
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2200/
PDF https://www.aclweb.org/anthology/S19-2200
PWC https://paperswithcode.com/paper/columbianlp-at-semeval-2019-task-8-the-answer
Repo
Framework