January 24, 2020

Paper Group NANR 261

Learning Transferable Feature Representations Using Neural Networks. UNH at SemEval-2019 Task 12: Toponym Resolution in Scientific Papers. University of Arizona at SemEval-2019 Task 12: Deep-Affix Named Entity Recognition of Geolocation Entities. Employing Deep Part-Object Relationships for Salient Object Detection. Just “OneSeC” for Producing Mu …

Learning Transferable Feature Representations Using Neural Networks

Title Learning Transferable Feature Representations Using Neural Networks
Authors Himanshu Sharad Bhatt, Shourya Roy, Arun Rajkumar, Sriranjani Ramakrishnan
Abstract Learning representations such that the source and target distributions appear as similar as possible has benefited transfer learning tasks across several applications. Generally, it requires labeled data from the source and only unlabeled data from the target to learn such representations. While these representations act like a bridge to transfer knowledge learned in the source to the target, they may lead to negative transfer when the source-specific characteristics detract from their ability to represent the target data. We present a novel neural network architecture to simultaneously learn a two-part representation, based on the principle of segregating the source-specific representation from the common representation. The first part captures the source-specific characteristics while the second part captures the truly common representation. Our architecture optimizes an objective function which acts adversarially for the source-specific part if it contributes towards the cross-domain learning. We empirically show that the two parts of the representation, in different arrangements, outperform existing learning algorithms on the source learning as well as cross-domain tasks on multiple datasets.
Tasks Transfer Learning
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1404/
PDF https://www.aclweb.org/anthology/P19-1404
PWC https://paperswithcode.com/paper/learning-transferable-feature-representations
Repo
Framework

UNH at SemEval-2019 Task 12: Toponym Resolution in Scientific Papers

Title UNH at SemEval-2019 Task 12: Toponym Resolution in Scientific Papers
Authors Matthew Magnusson, Laura Dietz
Abstract The SemEval-2019 Task 12 is toponym resolution in scientific papers. We focus on Subtask 1: Toponym Detection, which is the identification of spans of text for place names mentioned in a document. We propose two methods: 1) a sliding-window convolutional neural network using ELMo embeddings (cnn-elmo), and 2) a sliding-window multi-layer perceptron using ELMo embeddings (mlp-elmo). We also submit Bi-lateral LSTM with Conditional Random Fields (bi-LSTM) as a strong baseline given its state-of-the-art performance in the Named Entity Recognition (NER) task. Our best performing model is cnn-elmo with an F1 of 0.844, which was below the bi-LSTM F1 of 0.862 when evaluated on overlap macro detection. Eight teams participated in this subtask with a total of 21 submissions.
Tasks Named Entity Recognition
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2230/
PDF https://www.aclweb.org/anthology/S19-2230
PWC https://paperswithcode.com/paper/unh-at-semeval-2019-task-12-toponym
Repo
Framework

University of Arizona at SemEval-2019 Task 12: Deep-Affix Named Entity Recognition of Geolocation Entities

Title University of Arizona at SemEval-2019 Task 12: Deep-Affix Named Entity Recognition of Geolocation Entities
Authors Vikas Yadav, Egoitz Laparra, Ti-Tai Wang, Mihai Surdeanu, Steven Bethard
Abstract We present the Named Entity Recognition (NER) and disambiguation model used by the University of Arizona team (UArizona) for the SemEval 2019 task 12. We achieved fourth place on tasks 1 and 3. We implemented a deep-affix based LSTM-CRF NER model for task 1, which utilizes only character, word, prefix and suffix information for the identification of geolocation entities. Despite using just the training data provided by task organizers and not using any lexicon features, we achieved 78.85% strict micro F-score on task 1. We used the unsupervised population heuristics for task 3 and achieved 52.99% strict micro-F1 score in this task.
Tasks Named Entity Recognition
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2232/
PDF https://www.aclweb.org/anthology/S19-2232
PWC https://paperswithcode.com/paper/university-of-arizona-at-semeval-2019-task-12
Repo
Framework
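
The abstract above says the model uses only character, word, prefix and suffix information. As a rough illustration of what per-token affix inputs could look like, here is a minimal sketch. The function name `affix_features`, the lowercasing step, and the affix length of 3 are assumptions made for this example and are not taken from the paper.

```python
def affix_features(tokens, n=3):
    """Return (word, prefix, suffix) triples for each token.

    Hypothetical helper: tokens shorter than n characters simply use
    the whole token as both prefix and suffix.
    """
    features = []
    for tok in tokens:
        lower = tok.lower()
        features.append((lower, lower[:n], lower[-n:]))
    return features


if __name__ == "__main__":
    for row in affix_features(["Samples", "from", "Guangdong", "province"]):
        print(row)
```

In an LSTM-CRF setup, each of the three strings would typically be mapped to its own embedding and concatenated with character-level features before the recurrent layer; that wiring is omitted here.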

Employing Deep Part-Object Relationships for Salient Object Detection

Title Employing Deep Part-Object Relationships for Salient Object Detection
Authors Yi Liu, Qiang Zhang, Dingwen Zhang, Jungong Han
Abstract Although Convolutional Neural Network (CNN) based methods have been successful in detecting salient objects, their underlying mechanism, which decides the salient intensity of each image part separately, cannot avoid inconsistency among parts within the same salient object. This ultimately results in an incomplete shape of the detected salient object. To solve this problem, we dig into part-object relationships and take the unprecedented attempt to employ these relationships endowed by the Capsule Network (CapsNet) for salient object detection. The entire salient object detection system is built directly on a Two-Stream Part-Object Assignment Network (TSPOANet) consisting of three algorithmic steps. In the first step, the learned deep feature maps of the input image are transformed to a group of primary capsules. In the second step, we feed the primary capsules into two identical streams, within each of which low-level capsules (parts) will be assigned to their familiar high-level capsules (object) via a locally connected routing. In the final step, the two streams are integrated in the form of a fully connected layer, where the relevant parts can be clustered together to form a complete salient object. Experimental results demonstrate the superiority of the proposed salient object detection network over the state-of-the-art methods.
Tasks Object Detection, Salient Object Detection
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Liu_Employing_Deep_Part-Object_Relationships_for_Salient_Object_Detection_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Liu_Employing_Deep_Part-Object_Relationships_for_Salient_Object_Detection_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/employing-deep-part-object-relationships-for
Repo
Framework

Just “OneSeC” for Producing Multilingual Sense-Annotated Data

Title Just “OneSeC” for Producing Multilingual Sense-Annotated Data
Authors Bianca Scarlini, Tommaso Pasini, Roberto Navigli
Abstract The well-known problem of knowledge acquisition is one of the biggest issues in Word Sense Disambiguation (WSD), where annotated data are still scarce in English and almost absent in other languages. In this paper we formulate the assumption of One Sense per Wikipedia Category and present OneSeC, a language-independent method for the automatic extraction of hundreds of thousands of sentences in which a target word is tagged with its meaning. Our automatically-generated data consistently lead a supervised WSD model to state-of-the-art performance when compared with other automatic and semi-automatic methods. Moreover, our approach outperforms its competitors on multilingual and domain-specific settings, where it beats the existing state of the art on all languages and most domains. All the training data are available for research purposes at http://trainomatic.org/onesec.
Tasks Word Sense Disambiguation
Published 2019-07-01
URL https://www.aclweb.org/anthology/P19-1069/
PDF https://www.aclweb.org/anthology/P19-1069
PWC https://paperswithcode.com/paper/just-onesec-for-producing-multilingual-sense
Repo
Framework

EGNet: Edge Guidance Network for Salient Object Detection

Title EGNet: Edge Guidance Network for Salient Object Detection
Authors Jia-Xing Zhao, Jiang-Jiang Liu, Deng-Ping Fan, Yang Cao, Jufeng Yang, Ming-Ming Cheng
Abstract Fully convolutional neural networks (FCNs) have shown their advantages in the salient object detection task. However, most existing FCN-based methods still suffer from coarse object boundaries. In this paper, to solve this problem, we focus on the complementarity between salient edge information and salient object information. Accordingly, we present an edge guidance network (EGNet) for salient object detection with three steps to simultaneously model these two kinds of complementary information in a single network. In the first step, we extract the salient object features through progressive fusion. In the second step, we integrate the local edge information and global location information to obtain the salient edge features. Finally, to sufficiently leverage these complementary features, we couple the same salient edge features with salient object features at various resolutions. Benefiting from the rich edge information and location information in salient edge features, the fused features can help locate salient objects, especially their boundaries, more accurately. Experimental results demonstrate that the proposed method performs favorably against the state-of-the-art methods on six widely used datasets without any pre-processing and post-processing. The source code is available at http://mmcheng.net/egnet/.
Tasks Object Detection, Salient Object Detection
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Zhao_EGNet_Edge_Guidance_Network_for_Salient_Object_Detection_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Zhao_EGNet_Edge_Guidance_Network_for_Salient_Object_Detection_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/egnet-edge-guidance-network-for-salient
Repo
Framework

L2F/INESC-ID at SemEval-2019 Task 2: Unsupervised Lexical Semantic Frame Induction using Contextualized Word Representations

Title L2F/INESC-ID at SemEval-2019 Task 2: Unsupervised Lexical Semantic Frame Induction using Contextualized Word Representations
Authors Eugénio Ribeiro, Vânia Mendonça, Ricardo Ribeiro, David Martins de Matos, Alberto Sardinha, Ana Lúcia Santos, Luísa Coheur
Abstract Building large datasets annotated with semantic information, such as FrameNet, is an expensive process. Consequently, such resources are unavailable for many languages and specific domains. This problem can be alleviated by using unsupervised approaches to induce the frames evoked by a collection of documents. That is the objective of the second task of SemEval 2019, which comprises three subtasks: clustering of verbs that evoke the same frame and clustering of arguments into both frame-specific slots and semantic roles. We approach all the subtasks by applying a graph clustering algorithm on contextualized embedding representations of the verbs and arguments. Using such representations is appropriate in the context of this task, since they provide cues for word-sense disambiguation. Thus, they can be used to identify different frames evoked by the same words. Using this approach we were able to outperform all of the baselines reported for the task on the test set in terms of Purity F1, as well as in terms of BCubed F1 in most cases.
Tasks Graph Clustering, Word Sense Disambiguation
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2019/
PDF https://www.aclweb.org/anthology/S19-2019
PWC https://paperswithcode.com/paper/l2finesc-id-at-semeval-2019-task-2
Repo
Framework
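
The abstract above reports results in terms of Purity F1. A minimal sketch of that kind of score follows, assuming the common clustering-evaluation definition: the harmonic mean of purity and inverse purity. Whether this matches the task's official scorer in every detail is an assumption, and the function names are illustrative.

```python
from collections import Counter, defaultdict


def purity(clusters, gold):
    """Fraction of items that share the majority gold label of their cluster."""
    by_cluster = defaultdict(list)
    for c, g in zip(clusters, gold):
        by_cluster[c].append(g)
    majority = sum(Counter(members).most_common(1)[0][1]
                   for members in by_cluster.values())
    return majority / len(gold)


def purity_f1(clusters, gold):
    """Harmonic mean of purity and inverse purity (assumed definition)."""
    p = purity(clusters, gold)
    ip = purity(gold, clusters)  # inverse purity: swap the two roles
    return 2 * p * ip / (p + ip)
```

Here `clusters` holds the induced cluster id of each verb or argument and `gold` holds the reference frame or role label, in the same order.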

Semantic Spatial Representation: a unique representation of an environment based on an ontology for robotic applications

Title Semantic Spatial Representation: a unique representation of an environment based on an ontology for robotic applications
Authors Guillaume Sarthou, Aurélie Clodic, Rachid Alami
Abstract It is important, for human-robot interaction, to endow the robot with the knowledge necessary to understand human needs and to be able to respond to them. We present a formalized and unified representation for indoor environments using an ontology devised for a route description task in which a robot must provide explanations to a person. We show that this representation can be used to choose a route to explain to a human as well as to verbalize it using a route perspective. Based on ontology, this representation has a strong possibility of evolution to adapt to many other applications. With it, we get the semantics of the environment elements while keeping a description of the known connectivity of the environment. This representation and the illustration algorithms, to find and verbalize a route, have been tested in two environments of different scales.
Tasks
Published 2019-06-01
URL https://www.aclweb.org/anthology/W19-1606/
PDF https://www.aclweb.org/anthology/W19-1606
PWC https://paperswithcode.com/paper/semantic-spatial-representation-a-unique
Repo
Framework

Morphology-aware Word-Segmentation in Dialectal Arabic Adaptation of Neural Machine Translation

Title Morphology-aware Word-Segmentation in Dialectal Arabic Adaptation of Neural Machine Translation
Authors Ahmed Tawfik, Mahitab Emam, Khaled Essam, Robert Nabil, Hany Hassan
Abstract Parallel corpora available for building machine translation (MT) models for dialectal Arabic (DA) are rather limited. The scarcity of resources has prompted the use of abundant Modern Standard Arabic (MSA) resources to complement the limited dialectal resources. However, clitics often differ between MSA and DA. This paper compares morphology-aware DA word segmentation to other word segmentation approaches like Byte Pair Encoding (BPE) and Sub-word Regularization (SR). A set of experiments conducted on Egyptian Arabic (EA), Levantine Arabic (LA), and Gulf Arabic (GA) shows that a sufficiently accurate morphology-aware segmentation used in conjunction with BPE outperforms the other word segmentation approaches.
Tasks Machine Translation
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-4602/
PDF https://www.aclweb.org/anthology/W19-4602
PWC https://paperswithcode.com/paper/morphology-aware-word-segmentation-in
Repo
Framework

Using Wiktionary as a resource for WSD : the case of French verbs

Title Using Wiktionary as a resource for WSD : the case of French verbs
Authors Vincent Segonne, Marie Candito, Benoît Crabbé
Abstract As opposed to word sense induction, word sense disambiguation (WSD) has the advantage of using interpretable senses, but requires annotated data, which are quite rare for most languages except English (Miller et al. 1993; Fellbaum, 1998). In this paper, we investigate which strategy to adopt to achieve WSD for languages lacking data that was annotated specifically for the task, focusing on the particular case of verb disambiguation in French. We first study the usability of Eurosense (Bovi et al. 2017), a multilingual corpus extracted from Europarl (Koehn, 2005) and automatically annotated with BabelNet (Navigli and Ponzetto, 2010) senses. Such a resource opened up the way to supervised and semi-supervised WSD for resourceless languages like French. While this perspective looked promising, our evaluation on French verbs was inconclusive and showed that the quality of the annotated senses was not sufficient for supervised WSD on French verbs. Instead, we propose to use Wiktionary, a collaboratively edited, multilingual online dictionary, as a resource for WSD. Wiktionary provides both a sense inventory and manually sense-tagged examples which can be used to train supervised and semi-supervised WSD systems. Yet, because the distribution of senses in the lexicographic examples found in Wiktionary differs from that in natural text, we then focus on studying the impact on WSD of the training data size and the sense distribution. Using state-of-the-art semi-supervised systems, we report experiments of Wiktionary-based WSD for French verbs, evaluated on FrenchSemEval (FSE), a new dataset of French verbs manually annotated with Wiktionary senses.
Tasks Word Sense Disambiguation, Word Sense Induction
Published 2019-05-01
URL https://www.aclweb.org/anthology/W19-0422/
PDF https://www.aclweb.org/anthology/W19-0422
PWC https://paperswithcode.com/paper/using-wiktionary-as-a-resource-for-wsd-the
Repo
Framework

Optimizing for Generalization in Machine Learning with Cross-Validation Gradients

Title Optimizing for Generalization in Machine Learning with Cross-Validation Gradients
Authors Shane Barratt, Rishi Sharma
Abstract Cross-validation is the workhorse of modern applied statistics and machine learning, as it provides a principled framework for selecting the model that maximizes generalization performance. In this paper, we show that the cross-validation risk is differentiable with respect to the hyperparameters and training data for many common machine learning algorithms, including logistic regression, elastic-net regression, and support vector machines. Leveraging this property of differentiability, we propose a cross-validation gradient method (CVGM) for hyperparameter optimization. Our method enables efficient optimization in high-dimensional hyperparameter spaces of the cross-validation risk, the best surrogate of the true generalization ability of our learning algorithm.
Tasks Hyperparameter Optimization
Published 2019-05-01
URL https://openreview.net/forum?id=rJlMBjAcYX
PDF https://openreview.net/pdf?id=rJlMBjAcYX
PWC https://paperswithcode.com/paper/optimizing-for-generalization-in-machine
Repo
Framework
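
The abstract above rests on the validation risk being differentiable with respect to a hyperparameter. A minimal sketch of that idea follows, using one-dimensional ridge regression with a single train/validation split because its solution has a closed form. This model choice, the single split (a full cross-validation risk would average this quantity over folds), and all function names are assumptions for illustration, not the paper's CVGM implementation.

```python
def ridge_fit(xs, ys, lam):
    """Closed-form 1-D ridge solution: theta = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def val_risk_and_grad(train, val, lam):
    """Validation mean squared error and its derivative with respect to lam."""
    xs, ys = train
    xv, yv = val
    theta = ridge_fit(xs, ys, lam)
    sxx = sum(x * x for x in xs)
    # Differentiating the closed form: d(theta)/d(lam) = -theta / (sxx + lam)
    dtheta_dlam = -theta / (sxx + lam)
    residuals = [x * theta - y for x, y in zip(xv, yv)]
    risk = sum(r * r for r in residuals) / len(xv)
    drisk_dtheta = sum(2 * r * x for r, x in zip(residuals, xv)) / len(xv)
    # Chain rule gives the hyperparameter gradient
    return risk, drisk_dtheta * dtheta_dlam


def tune_lambda(train, val, lam=1.0, step=0.5, iters=200):
    """Projected gradient descent on lam, keeping lam non-negative."""
    for _ in range(iters):
        _, grad = val_risk_and_grad(train, val, lam)
        lam = max(0.0, lam - step * grad)
    return lam
```

The same chain-rule structure carries over to models without a closed form, where the derivative of the fitted parameters with respect to the hyperparameter has to be obtained some other way, for example by implicit differentiation of the optimality conditions.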

Control Batch Size and Learning Rate to Generalize Well: Theoretical and Empirical Evidence

Title Control Batch Size and Learning Rate to Generalize Well: Theoretical and Empirical Evidence
Authors Fengxiang He, Tongliang Liu, Dacheng Tao
Abstract Deep neural networks have achieved dramatic success based on the optimization method of stochastic gradient descent (SGD). However, it is still not clear how to tune hyper-parameters, especially batch size and learning rate, to ensure good generalization. This paper reports both theoretical and empirical evidence for a training strategy: the ratio of batch size to learning rate should be kept from growing too large in order to achieve good generalization. Specifically, we prove a PAC-Bayes generalization bound for neural networks trained by SGD, which has a positive correlation with the ratio of batch size to learning rate. This correlation builds the theoretical foundation of the training strategy. Furthermore, we conduct a large-scale experiment to verify the correlation and training strategy. We trained 1,600 models based on the architectures ResNet-110 and VGG-19 with the datasets CIFAR-10 and CIFAR-100 while strictly controlling unrelated variables. Accuracies on the test sets are collected for the evaluation. Spearman’s rank-order correlation coefficients and the corresponding p-values on 164 groups of the collected data demonstrate that the correlation is statistically significant, which fully supports the training strategy.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/8398-control-batch-size-and-learning-rate-to-generalize-well-theoretical-and-empirical-evidence
PDF http://papers.nips.cc/paper/8398-control-batch-size-and-learning-rate-to-generalize-well-theoretical-and-empirical-evidence.pdf
PWC https://paperswithcode.com/paper/control-batch-size-and-learning-rate-to
Repo
Framework
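
The abstract above evaluates the relationship between the batch-size-to-learning-rate ratio and test accuracy with Spearman's rank-order correlation. A minimal pure-Python sketch of that coefficient follows, computed as the Pearson correlation of average ranks. The function names are illustrative, and the paper's own analysis code is not available here, so this only shows the statistic itself.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

To apply it, one would pass a list of ratios (batch size divided by learning rate) and the matching list of test accuracies from one's own runs. A negative coefficient would be consistent with the direction described in the abstract, where a larger ratio goes with weaker generalization.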

Measuring and Modeling Language Change

Title Measuring and Modeling Language Change
Authors Jacob Eisenstein
Abstract This tutorial is designed to help researchers answer the following sorts of questions: - Are people happier on the weekend? - What was 1861’s word of the year? - Are Democrats and Republicans more different than ever? - When did “gay” stop meaning “happy”? - Are gender stereotypes getting weaker, stronger, or just different? - Who is a linguistic leader? - How can we get internet users to be more polite and objective? Such questions are fundamental to the social sciences and humanities, and scholars in these disciplines are increasingly turning to computational techniques for answers. Meanwhile, the ACL community is increasingly engaged with data that varies across time, and with the social insights that can be offered by analyzing temporal patterns and trends. The purpose of this tutorial is to facilitate this convergence in two main ways: 1. By synthesizing recent computational techniques for handling and modeling temporal data, such as dynamic word embeddings, the tutorial will provide a starting point for future computational research. It will also identify useful tools for social scientists and digital humanities scholars. 2. The tutorial will provide an overview of techniques and datasets from the quantitative social sciences and the digital humanities, which are not well-known in the computational linguistics community. These techniques include vector autoregressive models, multiple comparisons corrections for hypothesis testing, and causal inference. Datasets include historical newspaper archives and corpora of contemporary political speech.
Tasks Causal Inference, Word Embeddings
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-5003/
PDF https://www.aclweb.org/anthology/N19-5003
PWC https://paperswithcode.com/paper/measuring-and-modeling-language-change
Repo
Framework

Activity recognition using ST-GCN with 3D motion data

Title Activity recognition using ST-GCN with 3D motion data
Authors Xin Cao, Wataru Kudo, Chihiro Ito, Masaki Shuzo, Eisaku Maeda
Abstract For the Nurse Care Activity Recognition Challenge, an activity recognition algorithm was developed by Team TDU-DSML. A spatial-temporal graph convolutional network (ST-GCN) was applied to process 3D motion capture data included in the challenge dataset. Time-series data was divided into 20-second segments with a 10-second overlap. The recognition model with a tree-structure graph was then created. The prediction result was set to one-minute segments on the basis of a majority decision from each segment output. Our model was evaluated by using leave-one-subject-out cross-validation methods. An average accuracy of 57% for all six subjects was achieved.
Tasks Activity Recognition, Motion Capture, Multimodal Activity Recognition, Time Series
Published 2019-09-13
URL https://doi.org/10.1145/3341162.3345581
PDF http://delivery.acm.org/10.1145/3350000/3345581/p689-cao.pdf
PWC https://paperswithcode.com/paper/activity-recognition-using-st-gcn-with-3d
Repo
Framework
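
The abstract above describes cutting the time series into 20-second segments with a 10-second overlap, then deciding each one-minute label by majority over the segment outputs. A minimal sketch of those two steps follows. The sampling rate parameter, the rule of assigning a window to the minute that contains its start, and the function names are assumptions for illustration; the ST-GCN model itself is not shown.

```python
from collections import Counter


def sliding_windows(n_samples, rate_hz, window_s=20, overlap_s=10):
    """Return (start, end) sample indices of overlapping windows."""
    size = window_s * rate_hz
    hop = (window_s - overlap_s) * rate_hz
    return [(s, s + size) for s in range(0, n_samples - size + 1, hop)]


def majority_vote_per_minute(windows, labels, rate_hz):
    """Vote over window-level labels within each one-minute segment.

    Assumption: a window belongs to the minute that contains its start.
    """
    votes = {}
    for (start, _), label in zip(windows, labels):
        minute = start // (60 * rate_hz)
        votes.setdefault(minute, Counter())[label] += 1
    return {m: c.most_common(1)[0][0] for m, c in sorted(votes.items())}
```

`labels` would hold one predicted activity per window, in the same order as `windows`. Ties within a minute are resolved by `Counter.most_common`, which is an arbitrary choice here rather than something the abstract specifies.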

TuEval at SemEval-2019 Task 5: LSTM Approach to Hate Speech Detection in English and Spanish

Title TuEval at SemEval-2019 Task 5: LSTM Approach to Hate Speech Detection in English and Spanish
Authors Mihai Manolescu, Denise Löfflad, Adham Nasser Mohamed Saber, Masoumeh Moradipour Tari
Abstract The detection of hate speech, especially in online platforms and forums, is quickly becoming a hot topic as anti-hate speech legislation begins to be applied to public discourse online. The HatEval shared task was created with this in mind; participants were expected to develop a model capable of determining whether or not input (in this case, Twitter datasets in English and Spanish) could be considered hate speech (designated as Task A), whether it was aggressive, and whether the tweet was targeting an individual or speaking generally (Task B). We approached this task by creating an LSTM model with an embedding layer. We found that our model performed considerably better on English language input when compared to Spanish language input. In English, we achieved an F1-Score of 0.466 for Task A and 0.462 for Task B; in Spanish, we achieved scores of 0.617 and 0.612 on Task A and Task B, respectively.
Tasks Hate Speech Detection
Published 2019-06-01
URL https://www.aclweb.org/anthology/S19-2089/
PDF https://www.aclweb.org/anthology/S19-2089
PWC https://paperswithcode.com/paper/tueval-at-semeval-2019-task-5-lstm-approach
Repo
Framework