October 15, 2019

Paper Group NANR 195

Stochastic Training of Graph Convolutional Networks. Towards Controllable Story Generation. Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction. Content Extraction and Lexical Analysis from Customer-Agent Interactions. Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization. Syntactically Aware Neural Arc …

Stochastic Training of Graph Convolutional Networks

Title Stochastic Training of Graph Convolutional Networks
Authors Jianfei Chen, Jun Zhu
Abstract Graph convolutional networks (GCNs) are powerful deep neural networks for graph-structured data. However, GCN computes nodes’ representation recursively from their neighbors, making the receptive field size grow exponentially with the number of layers. Previous attempts on reducing the receptive field size by subsampling neighbors do not have any convergence guarantee, and their receptive field size per node is still in the order of hundreds. In this paper, we develop a preprocessing strategy and two control variate based algorithms to further reduce the receptive field size. Our algorithms are guaranteed to converge to GCN’s local optimum regardless of the neighbor sampling size. Empirical results show that our algorithms have a similar convergence speed per epoch with the exact algorithm even using only two neighbors per node. The time consumption of our algorithm on the Reddit dataset is only one fifth of previous neighbor sampling algorithms.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=rylejExC-
PDF https://openreview.net/pdf?id=rylejExC-
PWC https://paperswithcode.com/paper/stochastic-training-of-graph-convolutional-1
Repo
Framework
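
The exponential growth of the receptive field mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code: the helper names `build_binary_tree` and `receptive_field` and the toy tree graph are assumptions made for illustration. It only shows how the set of nodes a layer-wise neighbor aggregation depends on expands with depth.

```python
def build_binary_tree(depth):
    """Undirected complete binary tree as an adjacency dict (hypothetical toy graph)."""
    n = 2 ** (depth + 1) - 1
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for child in (2 * i + 1, 2 * i + 2):
            if child < n:
                adj[i].add(child)
                adj[child].add(i)
    return adj


def receptive_field(adj, node, num_layers):
    """Nodes whose features influence `node` after `num_layers` rounds of neighbor aggregation."""
    field = {node}
    frontier = {node}
    for _ in range(num_layers):
        # Each layer pulls in the neighbors of everything reached so far.
        frontier = {nbr for u in frontier for nbr in adj[u]} - field
        field |= frontier
    return field


tree = build_binary_tree(3)
sizes = [len(receptive_field(tree, 0, k)) for k in range(4)]
print(sizes)
```

On this toy tree the receptive field of the root roughly doubles with every added layer, which is the effect that neighbor subsampling is meant to limit.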

Towards Controllable Story Generation

Title Towards Controllable Story Generation
Authors Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight
Abstract We present a general framework of analyzing existing story corpora to generate controllable and creative new stories. The proposed framework needs little manual annotation to achieve controllable story generation. It creates a new interface for humans to interact with computers to generate personalized stories. We apply the framework to build recurrent neural network (RNN)-based generation models to control story ending valence and storyline. Experiments show that our methods successfully achieve the control and enhance the coherence of stories through introducing storylines. With additional control factors, the generation model gets lower perplexity, and yields more coherent stories that are faithful to the control factors according to human evaluation.
Tasks Text Generation
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1505/
PDF https://www.aclweb.org/anthology/W18-1505
PWC https://paperswithcode.com/paper/towards-controllable-story-generation
Repo
Framework

Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction

Title Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction
Authors Joël Legrand, Yannick Toussaint, Chedy Raïssi, Adrien Coulet
Abstract
Tasks Domain Adaptation, Relation Extraction, Transfer Learning
Published 2018-10-01
URL https://www.aclweb.org/anthology/papers/W18-5617/w18-5617
PDF https://www.aclweb.org/anthology/W18-5617
PWC https://paperswithcode.com/paper/syntax-based-transfer-learning-for-the-task
Repo
Framework

Content Extraction and Lexical Analysis from Customer-Agent Interactions

Title Content Extraction and Lexical Analysis from Customer-Agent Interactions
Authors Sergiu Nisioi, Anca Bucur, Liviu P. Dinu
Abstract In this paper, we provide a lexical comparative analysis of the vocabulary used by customers and agents in an Enterprise Resource Planning (ERP) environment and a potential solution to clean the data and extract relevant content for NLP. As a result, we demonstrate that the actual vocabulary for the language that prevails in the ERP conversations is highly divergent from the standardized dictionary and further different from general language usage as extracted from the Common Crawl corpus. Moreover, in specific business communication circumstances, where it is expected to observe a high usage of standardized language, code switching and non-standard expression are predominant, emphasizing once more the discrepancy between the day-to-day use of language and the standardized one.
Tasks Lexical Analysis
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6118/
PDF https://www.aclweb.org/anthology/W18-6118
PWC https://paperswithcode.com/paper/content-extraction-and-lexical-analysis-from
Repo
Framework

Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization

Title Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization
Authors Jinghui Chen, Pan Xu, Lingxiao Wang, Jian Ma, Quanquan Gu
Abstract We propose a nonconvex estimator for the covariate adjusted precision matrix estimation problem in the high dimensional regime, under sparsity constraints. To solve this estimator, we propose an alternating gradient descent algorithm with hard thresholding. Compared with existing methods along this line of research, which lack theoretical guarantees in optimization error and/or statistical error, the proposed algorithm not only is computationally much more efficient with a linear rate of convergence, but also attains the optimal statistical rate up to a logarithmic factor. Thorough experiments on both synthetic and real data support our theory.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2478
PDF http://proceedings.mlr.press/v80/chen18n/chen18n.pdf
PWC https://paperswithcode.com/paper/covariate-adjusted-precision-matrix
Repo
Framework
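
The hard-thresholding step named in the abstract is, in its generic form, a projection onto vectors with at most `s` nonzero entries. The sketch below shows only that generic operator on a flat list. It is an assumption-laden illustration, not the paper's alternating algorithm, and the function name `hard_threshold` is hypothetical.

```python
def hard_threshold(values, s):
    """Keep the s largest-magnitude entries of `values` and zero out the rest."""
    if s <= 0:
        return [0.0] * len(values)
    # Rank indices by decreasing magnitude; ties go to the lower index.
    order = sorted(range(len(values)), key=lambda i: (-abs(values[i]), i))
    keep = set(order[:s])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


print(hard_threshold([0.5, -3.0, 0.1, 2.0], 2))
```

In a gradient method with hard thresholding, an operator of this kind would typically be applied after each gradient step so that the iterate stays sparse.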

Syntactically Aware Neural Architectures for Definition Extraction

Title Syntactically Aware Neural Architectures for Definition Extraction
Authors Luis Espinosa-Anke, Steven Schockaert
Abstract Automatically identifying definitional knowledge in text corpora (Definition Extraction or DE) is an important task with direct applications in, among others, Automatic Glossary Generation, Taxonomy Learning, Question Answering and Semantic Search. It is generally cast as a binary classification problem between definitional and non-definitional sentences. In this paper we present a set of neural architectures combining Convolutional and Recurrent Neural Networks, which are further enriched by incorporating linguistic information via syntactic dependencies. Our experimental results in the task of sentence classification, on two benchmarking DE datasets (one generic, one domain-specific), show that these models obtain consistent state of the art results. Furthermore, we demonstrate that models trained on clean Wikipedia-like definitions can successfully be applied to more noisy domain-specific corpora.
Tasks Question Answering, Sentence Classification, Word Sense Disambiguation
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-2061/
PDF https://www.aclweb.org/anthology/N18-2061
PWC https://paperswithcode.com/paper/syntactically-aware-neural-architectures-for
Repo
Framework

Testing Sparsity over Known and Unknown Bases

Title Testing Sparsity over Known and Unknown Bases
Authors Siddharth Barman, Arnab Bhattacharyya, Suprovat Ghoshal
Abstract Sparsity is a basic property of real vectors that is exploited in a wide variety of machine learning applications. In this work, we describe property testing algorithms for sparsity that observe a low-dimensional projection of the input. We consider two settings. In the first setting, we test sparsity with respect to an unknown basis: given input vectors $y_1, \ldots, y_p \in \mathbb{R}^d$ whose concatenation as columns forms $Y \in \mathbb{R}^{d \times p}$, does $Y = AX$ for matrices $A \in \mathbb{R}^{d \times m}$ and $X \in \mathbb{R}^{m \times p}$ such that each column of $X$ is $k$-sparse, or is $Y$ “far” from having such a decomposition? In the second setting, we test sparsity with respect to a known basis: for a fixed design matrix $A \in \mathbb{R}^{d \times m}$, given input vector $y \in \mathbb{R}^d$, is $y = Ax$ for some $k$-sparse vector $x$ or is $y$ “far” from having such a decomposition? We analyze our algorithms using tools from high-dimensional geometry and probability.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2164
PDF http://proceedings.mlr.press/v80/barman18a/barman18a.pdf
PWC https://paperswithcode.com/paper/testing-sparsity-over-known-and-unknown-bases
Repo
Framework

Generative Models for Alignment and Data Efficiency in Language

Title Generative Models for Alignment and Data Efficiency in Language
Authors Dustin Tran, Yura Burda, Ilya Sutskever
Abstract We examine how learning from unaligned data can improve both the data efficiency of supervised tasks as well as enable alignments without any supervision. For example, consider unsupervised machine translation: the input is two corpora of English and French, and the task is to translate from one language to the other but without any pairs of English and French sentences. To address this, we develop feature-matching autoencoders (FMAEs). FMAEs ensure that the marginal distribution of feature layers are preserved across forward and inverse mappings between domains. We show that FMAEs achieve state of the art for data efficiency and alignment across three tasks: text decipherment, sentiment transfer, and neural machine translation for English-to-German and English-to-French. Most compellingly, FMAEs achieve state of the art for neural translation with limited supervision, with significant BLEU score differences of up to 5.7 and 6.3 over traditional supervised models. Furthermore, on English-to-German, they outperform last year’s best fully supervised models such as ByteNet (Kalchbrenner et al., 2016) while using only half as many supervised examples.
Tasks Machine Translation, Unsupervised Machine Translation
Published 2018-01-01
URL https://openreview.net/forum?id=rJ7RBNe0-
PDF https://openreview.net/pdf?id=rJ7RBNe0-
PWC https://paperswithcode.com/paper/generative-models-for-alignment-and-data
Repo
Framework

A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing

Title A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing
Authors Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract We propose an efficient dynamic oracle for training the 2-Planar transition-based parser, a linear-time parser with over 99% coverage on non-projective syntactic corpora. This novel approach outperforms the static training strategy in the vast majority of languages tested and scored better on most datasets than the arc-hybrid parser enhanced with the Swap transition, which can handle unrestricted non-projectivity.
Tasks Dependency Parsing
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-2062/
PDF https://www.aclweb.org/anthology/N18-2062
PWC https://paperswithcode.com/paper/a-dynamic-oracle-for-linear-time-2-planar
Repo
Framework

Non-convex Conditional Gradient Sliding

Title Non-convex Conditional Gradient Sliding
Authors Chao Qu, Yan Li, Huan Xu
Abstract We investigate a projection free optimization method, namely non-convex conditional gradient sliding (NCGS) for non-convex optimization problems on the batch, stochastic and finite-sum settings. Conditional gradient sliding (CGS) method, by integrating Nesterov’s accelerated gradient method with Frank-Wolfe (FW) method in a smart way, outperforms FW for convex optimization, by reducing the amount of gradient computations. However, the study of CGS in the non-convex setting is limited. In this paper, we propose the non-convex conditional gradient sliding (NCGS) methods and analyze their convergence properties. We also leverage the idea of variance reduction from the recent progress in convex optimization to obtain a new algorithm termed variance reduced NCGS (NCGS-VR), and obtain faster convergence rate than the batch NCGS in the finite-sum setting. We show that NCGS algorithms outperform their Frank-Wolfe counterparts both in theory and in practice, for all three settings, namely the batch, stochastic and finite-sum setting. This significantly improves our understanding of optimizing non-convex functions with complicated feasible sets (where projection is prohibitively expensive).
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1947
PDF http://proceedings.mlr.press/v80/qu18a/qu18a.pdf
PWC https://paperswithcode.com/paper/non-convex-conditional-gradient-sliding
Repo
Framework
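
For readers unfamiliar with projection-free methods, here is a minimal sketch of the classic Frank-Wolfe iteration that the abstract uses as its baseline. It is not NCGS or NCGS-VR: the objective (squared distance to a point `b`), the feasible set (the probability simplex), the step-size rule, and the name `frank_wolfe_simplex` are all assumptions chosen to keep the example small.

```python
def frank_wolfe_simplex(b, steps=2000):
    """Minimize ||x - b||^2 over the probability simplex with classic Frank-Wolfe."""
    n = len(b)
    x = [1.0 / n] * n  # start at the center of the simplex
    for t in range(steps):
        grad = [2.0 * (xi - bi) for xi, bi in zip(x, b)]
        # Linear minimization oracle: over the simplex, the minimizer of a
        # linear function is the vertex with the smallest gradient coordinate.
        j = min(range(n), key=lambda i: grad[i])
        gamma = 2.0 / (t + 2.0)  # standard diminishing step size
        # Convex combination of the current point and vertex e_j; no projection needed.
        x = [(1.0 - gamma) * xi for xi in x]
        x[j] += gamma
    return x


print(frank_wolfe_simplex([0.2, 0.5, 0.3]))
```

Every iterate is a convex combination of simplex vertices, so feasibility is maintained without ever computing a projection, which is the property the abstract relies on for complicated feasible sets.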

Kronecker-factored Curvature Approximations for Recurrent Neural Networks

Title Kronecker-factored Curvature Approximations for Recurrent Neural Networks
Authors James Martens, Jimmy Ba, Matt Johnson
Abstract Kronecker-factored Approximate Curvature (Martens & Grosse, 2015) (K-FAC) is a 2nd-order optimization method which has been shown to give state-of-the-art performance on large-scale neural network optimization tasks (Ba et al., 2017). It is based on an approximation to the Fisher information matrix (FIM) that makes assumptions about the particular structure of the network and the way it is parameterized. The original K-FAC method was applicable only to fully-connected networks, although it has been recently extended by Grosse & Martens (2016) to handle convolutional networks as well. In this work we extend the method to handle RNNs by introducing a novel approximation to the FIM for RNNs. This approximation works by modelling the covariance structure between the gradient contributions at different time-steps using a chain-structured linear Gaussian graphical model, summing the various cross-covariances, and computing the inverse in closed form. We demonstrate in experiments that our method significantly outperforms general purpose state-of-the-art optimizers like SGD with momentum and Adam on several challenging RNN training tasks.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=HyMTkQZAb
PDF https://openreview.net/pdf?id=HyMTkQZAb
PWC https://paperswithcode.com/paper/kronecker-factored-curvature-approximations
Repo
Framework
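
Kronecker-factored approximations are attractive largely because of the identity $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$: inverting two small factors replaces inverting one large matrix. The sketch below checks that identity on two small matrices using exact fractions. The helper names and the example matrices are assumptions for illustration, and nothing here reproduces the paper's RNN-specific approximation.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def inv2(m):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


A = [[2, 1], [1, 3]]
B = [[1, 2], [0, 1]]
# Multiply the Kronecker product of the inverses by the Kronecker product itself.
product = matmul(kron(inv2(A), inv2(B)), kron(A, B))
print(product == [[int(i == j) for j in range(4)] for i in range(4)])
```

Here two 2x2 inversions stand in for one 4x4 inversion; the saving grows quickly when the factors correspond to layer-sized matrices.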

The Potential of the Computational Linguistic Analysis of Social Media for Population Studies

Title The Potential of the Computational Linguistic Analysis of Social Media for Population Studies
Authors Letizia Mencarini
Abstract The paper provides an outline of the scope for synergy between computational linguistic analysis and population studies. It first reviews where population studies stand in terms of using social media data. Demographers are entering the realm of big data in force. But, this paper argues, population studies have much to gain from computational linguistic analysis, especially in terms of explaining the drivers behind population processes. The paper gives two examples of how the method can be applied, and concludes with a fundamental caveat. Yes, computational linguistic analysis provides a possible key for integrating micro theory into any demographic analysis of social media data. But results may be of little value in as much as knowledge about fundamental sample characteristics are unknown.
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1109/
PDF https://www.aclweb.org/anthology/W18-1109
PWC https://paperswithcode.com/paper/the-potential-of-the-computational-linguistic
Repo
Framework

Lancaster at SemEval-2018 Task 3: Investigating Ironic Features in English Tweets

Title Lancaster at SemEval-2018 Task 3: Investigating Ironic Features in English Tweets
Authors Edward Dearden, Alistair Baron
Abstract This paper describes the system we submitted to SemEval-2018 Task 3. The aim of the system is to distinguish between irony and non-irony in English tweets. We create a targeted feature set and analyse how different features are useful in the task of irony detection, achieving an F1-score of 0.5914. The analysis of individual features provides insight that may be useful in future attempts at detecting irony in tweets.
Tasks Sentiment Analysis
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1096/
PDF https://www.aclweb.org/anthology/S18-1096
PWC https://paperswithcode.com/paper/lancaster-at-semeval-2018-task-3
Repo
Framework

Identification of Alias Links among Participants in Narratives

Title Identification of Alias Links among Participants in Narratives
Authors Sangameshwar Patil, Sachin Pawar, Swapnil Hingmire, Girish Palshikar, Vasudeva Varma, Pushpak Bhattacharyya
Abstract Identification of distinct and independent participants (entities of interest) in a narrative is an important task for many NLP applications. This task becomes challenging because these participants are often referred to using multiple aliases. In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or noun phrases with common noun headword. We use Markov Logic Network (MLN) to encode the linguistic knowledge for identification of aliases. We evaluate on four diverse history narratives of varying complexity. Our approach performs better than the state-of-the-art approach as well as a combination of standard named entity recognition and coreference resolution techniques.
Tasks Coreference Resolution, Named Entity Recognition, Question Answering
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-2011/
PDF https://www.aclweb.org/anthology/P18-2011
PWC https://paperswithcode.com/paper/identification-of-alias-links-among
Repo
Framework

UTFPR at IEST 2018: Exploring Character-to-Word Composition for Emotion Analysis

Title UTFPR at IEST 2018: Exploring Character-to-Word Composition for Emotion Analysis
Authors Gustavo Paetzold
Abstract We introduce the UTFPR system for the Implicit Emotions Shared Task of 2018: A compositional character-to-word recurrent neural network that does not exploit heavy and/or hard-to-obtain resources. We find that our approach can outperform multiple baselines, and offers an elegant and effective solution to the problem of orthographic variance in tweets.
Tasks Emotion Recognition
Published 2018-10-01
URL https://www.aclweb.org/anthology/W18-6224/
PDF https://www.aclweb.org/anthology/W18-6224
PWC https://paperswithcode.com/paper/utfpr-at-iest-2018-exploring-character-to
Repo
Framework