October 15, 2019

Paper Group NANR 195

Stochastic Training of Graph Convolutional Networks


Title	Stochastic Training of Graph Convolutional Networks
Authors	Jianfei Chen, Jun Zhu
Abstract	Graph convolutional networks (GCNs) are powerful deep neural networks for graph-structured data. However, GCN computes nodes’ representation recursively from their neighbors, making the receptive field size grow exponentially with the number of layers. Previous attempts on reducing the receptive field size by subsampling neighbors do not have any convergence guarantee, and their receptive field size per node is still in the order of hundreds. In this paper, we develop a preprocessing strategy and two control variate based algorithms to further reduce the receptive field size. Our algorithms are guaranteed to converge to GCN’s local optimum regardless of the neighbor sampling size. Empirical results show that our algorithms have a similar convergence speed per epoch with the exact algorithm even using only two neighbors per node. The time consumption of our algorithm on the Reddit dataset is only one fifth of previous neighbor sampling algorithms.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=rylejExC-
PDF	https://openreview.net/pdf?id=rylejExC-
PWC	https://paperswithcode.com/paper/stochastic-training-of-graph-convolutional-1
Repo
Framework
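
The receptive-field growth the abstract describes can be illustrated with a minimal sketch. This is not the authors' algorithm; it only counts how many nodes an exact GCN would have to read for one target node as layers are added. The toy graph (a complete binary tree) and the helper names `receptive_field` and `complete_binary_tree` are assumptions made for illustration.

```python
def receptive_field(adj, node, num_layers):
    """Return the set of nodes within `num_layers` hops of `node`.

    An exact L-layer GCN needs the features of all these nodes
    to compute the representation of `node`.
    """
    seen = {node}
    frontier = {node}
    for _ in range(num_layers):
        # Expand one hop, keeping only nodes not reached before.
        frontier = {v for u in frontier for v in adj[u]} - seen
        seen |= frontier
    return seen


def complete_binary_tree(depth):
    """Adjacency lists for a complete binary tree (hypothetical toy graph)."""
    n = 2 ** (depth + 1) - 1
    adj = {i: [] for i in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        adj[parent].append(child)
        adj[child].append(parent)
    return adj


adj = complete_binary_tree(depth=6)
# Receptive field size of the root for 0, 1, 2 and 3 layers.
sizes = [len(receptive_field(adj, 0, layers)) for layers in range(4)]
print(sizes)  # [1, 3, 7, 15]
```

Even with only two new neighbors per hop, the count roughly doubles with each layer, which is the cost that neighbor sampling and the control variate estimators in the paper aim to reduce.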

Towards Controllable Story Generation


Title	Towards Controllable Story Generation
Authors	Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight
Abstract	We present a general framework of analyzing existing story corpora to generate controllable and creative new stories. The proposed framework needs little manual annotation to achieve controllable story generation. It creates a new interface for humans to interact with computers to generate personalized stories. We apply the framework to build recurrent neural network (RNN)-based generation models to control story ending valence and storyline. Experiments show that our methods successfully achieve the control and enhance the coherence of stories through introducing storylines. With additional control factors, the generation model gets lower perplexity, and yields more coherent stories that are faithful to the control factors according to human evaluation.
Tasks	Text Generation
Published	2018-06-01
URL	https://www.aclweb.org/anthology/W18-1505/
PDF	https://www.aclweb.org/anthology/W18-1505
PWC	https://paperswithcode.com/paper/towards-controllable-story-generation
Repo
Framework

Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction


Title	Syntax-based Transfer Learning for the Task of Biomedical Relation Extraction
Authors	Joël Legrand, Yannick Toussaint, Chedy Raïssi, Adrien Coulet
Abstract
Tasks	Domain Adaptation, Relation Extraction, Transfer Learning
Published	2018-10-01
URL	https://www.aclweb.org/anthology/papers/W18-5617/w18-5617
PDF	https://www.aclweb.org/anthology/W18-5617
PWC	https://paperswithcode.com/paper/syntax-based-transfer-learning-for-the-task
Repo
Framework

Content Extraction and Lexical Analysis from Customer-Agent Interactions


Title	Content Extraction and Lexical Analysis from Customer-Agent Interactions
Authors	Sergiu Nisioi, Anca Bucur, Liviu P. Dinu
Abstract	In this paper, we provide a lexical comparative analysis of the vocabulary used by customers and agents in an Enterprise Resource Planning (ERP) environment and a potential solution to clean the data and extract relevant content for NLP. As a result, we demonstrate that the actual vocabulary for the language that prevails in the ERP conversations is highly divergent from the standardized dictionary and further different from general language usage as extracted from the Common Crawl corpus. Moreover, in specific business communication circumstances, where it is expected to observe a high usage of standardized language, code switching and non-standard expression are predominant, emphasizing once more the discrepancy between the day-to-day use of language and the standardized one.
Tasks	Lexical Analysis
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6118/
PDF	https://www.aclweb.org/anthology/W18-6118
PWC	https://paperswithcode.com/paper/content-extraction-and-lexical-analysis-from
Repo
Framework

Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization


Title	Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization
Authors	Jinghui Chen, Pan Xu, Lingxiao Wang, Jian Ma, Quanquan Gu
Abstract	We propose a nonconvex estimator for the covariate adjusted precision matrix estimation problem in the high dimensional regime, under sparsity constraints. To solve this estimator, we propose an alternating gradient descent algorithm with hard thresholding. Compared with existing methods along this line of research, which lack theoretical guarantees in optimization error and/or statistical error, the proposed algorithm not only is computationally much more efficient with a linear rate of convergence, but also attains the optimal statistical rate up to a logarithmic factor. Thorough experiments on both synthetic and real data support our theory.
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2478
PDF	http://proceedings.mlr.press/v80/chen18n/chen18n.pdf
PWC	https://paperswithcode.com/paper/covariate-adjusted-precision-matrix
Repo
Framework
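
The hard thresholding step named in the abstract can be sketched as follows. This is a generic version of the operator (keep the `s` largest-magnitude entries, zero the rest) applied to a flat list; how the paper applies it inside its alternating gradient descent is not shown here, and the function name `hard_threshold` is an assumption.

```python
def hard_threshold(values, s):
    """Keep the s entries of largest magnitude and set all others to zero."""
    s = max(s, 0)
    # Indices sorted by decreasing absolute value; the first s are kept.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(order[:s])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


sparse = hard_threshold([3.0, -1.0, 0.5, -4.0], s=2)
print(sparse)  # [3.0, 0.0, 0.0, -4.0]
```

In a sparsity-constrained estimator, an operator of this kind is typically applied after each gradient step so that the iterate stays within the sparsity budget.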

Syntactically Aware Neural Architectures for Definition Extraction


Title	Syntactically Aware Neural Architectures for Definition Extraction
Authors	Luis Espinosa-Anke, Steven Schockaert
Abstract	Automatically identifying definitional knowledge in text corpora (Definition Extraction or DE) is an important task with direct applications in, among others, Automatic Glossary Generation, Taxonomy Learning, Question Answering and Semantic Search. It is generally cast as a binary classification problem between definitional and non-definitional sentences. In this paper we present a set of neural architectures combining Convolutional and Recurrent Neural Networks, which are further enriched by incorporating linguistic information via syntactic dependencies. Our experimental results in the task of sentence classification, on two benchmarking DE datasets (one generic, one domain-specific), show that these models obtain consistent state of the art results. Furthermore, we demonstrate that models trained on clean Wikipedia-like definitions can successfully be applied to more noisy domain-specific corpora.
Tasks	Question Answering, Sentence Classification, Word Sense Disambiguation
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2061/
PDF	https://www.aclweb.org/anthology/N18-2061
PWC	https://paperswithcode.com/paper/syntactically-aware-neural-architectures-for
Repo
Framework

Testing Sparsity over Known and Unknown Bases


Title	Testing Sparsity over Known and Unknown Bases
Authors	Siddharth Barman, Arnab Bhattacharyya, Suprovat Ghoshal
Abstract	Sparsity is a basic property of real vectors that is exploited in a wide variety of machine learning applications. In this work, we describe property testing algorithms for sparsity that observe a low-dimensional projection of the input. We consider two settings. In the first setting, we test sparsity with respect to an unknown basis: given input vectors $y_1, \ldots, y_p \in R^d$ whose concatenation as columns forms $Y \in R^{d \times p}$, does $Y = AX$ for matrices $A \in R^{d \times m}$ and $X \in R^{m \times p}$ such that each column of $X$ is $k$-sparse, or is $Y$ “far” from having such a decomposition? In the second setting, we test sparsity with respect to a known basis: for a fixed design matrix $A \in R^{d \times m}$, given input vector $y \in R^d$, is $y = Ax$ for some $k$-sparse vector $x$ or is $y$ “far” from having such a decomposition? We analyze our algorithms using tools from high-dimensional geometry and probability.
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2164
PDF	http://proceedings.mlr.press/v80/barman18a/barman18a.pdf
PWC	https://paperswithcode.com/paper/testing-sparsity-over-known-and-unknown-bases
Repo
Framework

Generative Models for Alignment and Data Efficiency in Language


Title	Generative Models for Alignment and Data Efficiency in Language
Authors	Dustin Tran, Yura Burda, Ilya Sutskever
Abstract	We examine how learning from unaligned data can improve both the data efficiency of supervised tasks as well as enable alignments without any supervision. For example, consider unsupervised machine translation: the input is two corpora of English and French, and the task is to translate from one language to the other but without any pairs of English and French sentences. To address this, we develop feature-matching autoencoders (FMAEs). FMAEs ensure that the marginal distribution of feature layers are preserved across forward and inverse mappings between domains. We show that FMAEs achieve state of the art for data efficiency and alignment across three tasks: text decipherment, sentiment transfer, and neural machine translation for English-to-German and English-to-French. Most compellingly, FMAEs achieve state of the art for neural translation with limited supervision, with significant BLEU score differences of up to 5.7 and 6.3 over traditional supervised models. Furthermore, on English-to-German, they outperform last year’s best fully supervised models such as ByteNet (Kalchbrenner et al., 2016) while using only half as many supervised examples.
Tasks	Machine Translation, Unsupervised Machine Translation
Published	2018-01-01
URL	https://openreview.net/forum?id=rJ7RBNe0-
PDF	https://openreview.net/pdf?id=rJ7RBNe0-
PWC	https://paperswithcode.com/paper/generative-models-for-alignment-and-data
Repo
Framework

A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing


Title	A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing
Authors	Daniel Fernández-González, Carlos Gómez-Rodríguez
Abstract	We propose an efficient dynamic oracle for training the 2-Planar transition-based parser, a linear-time parser with over 99% coverage on non-projective syntactic corpora. This novel approach outperforms the static training strategy in the vast majority of languages tested and scored better on most datasets than the arc-hybrid parser enhanced with the Swap transition, which can handle unrestricted non-projectivity.
Tasks	Dependency Parsing
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-2062/
PDF	https://www.aclweb.org/anthology/N18-2062
PWC	https://paperswithcode.com/paper/a-dynamic-oracle-for-linear-time-2-planar
Repo
Framework

Non-convex Conditional Gradient Sliding


Title	Non-convex Conditional Gradient Sliding
Authors	Chao Qu, Yan Li, Huan Xu
Abstract	We investigate a projection free optimization method, namely non-convex conditional gradient sliding (NCGS) for non-convex optimization problems on the batch, stochastic and finite-sum settings. Conditional gradient sliding (CGS) method, by integrating Nesterov’s accelerated gradient method with Frank-Wolfe (FW) method in a smart way, outperforms FW for convex optimization, by reducing the amount of gradient computations. However, the study of CGS in the non-convex setting is limited. In this paper, we propose the non-convex conditional gradient sliding (NCGS) methods and analyze their convergence properties. We also leverage the idea of variance reduction from the recent progress in convex optimization to obtain a new algorithm termed variance reduced NCGS (NCGS-VR), and obtain faster convergence rate than the batch NCGS in the finite-sum setting. We show that NCGS algorithms outperform their Frank-Wolfe counterparts both in theory and in practice, for all three settings, namely the batch, stochastic and finite-sum setting. This significantly improves our understanding of optimizing non-convex functions with complicated feasible sets (where projection is prohibitively expensive).
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=1947
PDF	http://proceedings.mlr.press/v80/qu18a/qu18a.pdf
PWC	https://paperswithcode.com/paper/non-convex-conditional-gradient-sliding
Repo
Framework
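
To make "projection free" concrete, here is a minimal sketch of the plain Frank-Wolfe method that the abstract uses as its baseline. It is not NCGS or NCGS-VR. The feasible set (the probability simplex), the quadratic objective, the step size 2/(t+2), and the name `frank_wolfe_simplex` are all assumptions chosen for illustration.

```python
def frank_wolfe_simplex(grad, x0, num_steps):
    """Plain Frank-Wolfe over the probability simplex.

    Each step calls a linear minimization oracle instead of a projection.
    On the simplex the oracle simply returns the vertex (coordinate)
    with the smallest gradient entry.
    """
    x = list(x0)
    for t in range(num_steps):
        g = grad(x)
        i_min = min(range(len(x)), key=lambda i: g[i])
        gamma = 2.0 / (t + 2)  # classic diminishing step size
        # Move toward the chosen vertex: x <- (1 - gamma) * x + gamma * e_i.
        x = [(1.0 - gamma) * xi for xi in x]
        x[i_min] += gamma
    return x


# Toy objective f(x) = ||x - target||^2 with the target inside the simplex.
target = [0.2, 0.3, 0.5]
grad = lambda x: [2.0 * (xi - bi) for xi, bi in zip(x, target)]
x = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], num_steps=200)
print(x)
```

Every iterate is a convex combination of simplex vertices, so feasibility holds without ever computing a projection, which is the property that matters when projection is expensive.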

Kronecker-factored Curvature Approximations for Recurrent Neural Networks


Title	Kronecker-factored Curvature Approximations for Recurrent Neural Networks
Authors	James Martens, Jimmy Ba, Matt Johnson
Abstract	Kronecker-factor Approximate Curvature (Martens & Grosse, 2015) (K-FAC) is a 2nd-order optimization method which has been shown to give state-of-the-art performance on large-scale neural network optimization tasks (Ba et al., 2017). It is based on an approximation to the Fisher information matrix (FIM) that makes assumptions about the particular structure of the network and the way it is parameterized. The original K-FAC method was applicable only to fully-connected networks, although it has been recently extended by Grosse & Martens (2016) to handle convolutional networks as well. In this work we extend the method to handle RNNs by introducing a novel approximation to the FIM for RNNs. This approximation works by modelling the covariance structure between the gradient contributions at different time-steps using a chain-structured linear Gaussian graphical model, summing the various cross-covariances, and computing the inverse in closed form. We demonstrate in experiments that our method significantly outperforms general purpose state-of-the-art optimizers like SGD with momentum and Adam on several challenging RNN training tasks.
Tasks
Published	2018-01-01
URL	https://openreview.net/forum?id=HyMTkQZAb
PDF	https://openreview.net/pdf?id=HyMTkQZAb
PWC	https://paperswithcode.com/paper/kronecker-factored-curvature-approximations
Repo
Framework
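
As background for the abstract above, the Kronecker structure that K-FAC exploits can be written out for the original fully connected case (Martens & Grosse, 2015); the RNN approximation introduced in this paper is not reproduced here. The symbols are assumptions for illustration: $a$ denotes a layer's input activations, $g$ the back-propagated gradients with respect to its pre-activations, $V$ a matrix with the same shape as the layer's weight matrix, and $\mathrm{vec}$ stacks columns.

$$
F \approx A \otimes G, \qquad A = E[a a^\top], \qquad G = E[g g^\top]
$$

$$
(A \otimes G)^{-1} = A^{-1} \otimes G^{-1}, \qquad (A \otimes G)^{-1} \, \mathrm{vec}(V) = \mathrm{vec}\left(G^{-1} V A^{-1}\right)
$$

The second line is why the approximation is cheap to invert: only the two small factors $A$ and $G$ are inverted, never the full Fisher block.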

The Potential of the Computational Linguistic Analysis of Social Media for Population Studies


Title	The Potential of the Computational Linguistic Analysis of Social Media for Population Studies
Authors	Letizia Mencarini
Abstract	The paper provides an outline of the scope for synergy between computational linguistic analysis and population studies. It first reviews where population studies stand in terms of using social media data. Demographers are entering the realm of big data in force. But, this paper argues, population studies have much to gain from computational linguistic analysis, especially in terms of explaining the drivers behind population processes. The paper gives two examples of how the method can be applied, and concludes with a fundamental caveat. Yes, computational linguistic analysis provides a possible key for integrating micro theory into any demographic analysis of social media data. But results may be of little value in as much as knowledge about fundamental sample characteristics are unknown.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/W18-1109/
PDF	https://www.aclweb.org/anthology/W18-1109
PWC	https://paperswithcode.com/paper/the-potential-of-the-computational-linguistic
Repo
Framework

Lancaster at SemEval-2018 Task 3: Investigating Ironic Features in English Tweets


Title	Lancaster at SemEval-2018 Task 3: Investigating Ironic Features in English Tweets
Authors	Edward Dearden, Alistair Baron
Abstract	This paper describes the system we submitted to SemEval-2018 Task 3. The aim of the system is to distinguish between irony and non-irony in English tweets. We create a targeted feature set and analyse how different features are useful in the task of irony detection, achieving an F1-score of 0.5914. The analysis of individual features provides insight that may be useful in future attempts at detecting irony in tweets.
Tasks	Sentiment Analysis
Published	2018-06-01
URL	https://www.aclweb.org/anthology/S18-1096/
PDF	https://www.aclweb.org/anthology/S18-1096
PWC	https://paperswithcode.com/paper/lancaster-at-semeval-2018-task-3
Repo
Framework

Identification of Alias Links among Participants in Narratives


Title	Identification of Alias Links among Participants in Narratives
Authors	Sangameshwar Patil, Sachin Pawar, Swapnil Hingmire, Girish Palshikar, Vasudeva Varma, Pushpak Bhattacharyya
Abstract	Identification of distinct and independent participants (entities of interest) in a narrative is an important task for many NLP applications. This task becomes challenging because these participants are often referred to using multiple aliases. In this paper, we propose an approach based on linguistic knowledge for identification of aliases mentioned using proper nouns, pronouns or noun phrases with common noun headword. We use Markov Logic Network (MLN) to encode the linguistic knowledge for identification of aliases. We evaluate on four diverse history narratives of varying complexity. Our approach performs better than the state-of-the-art approach as well as a combination of standard named entity recognition and coreference resolution techniques.
Tasks	Coreference Resolution, Named Entity Recognition, Question Answering
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-2011/
PDF	https://www.aclweb.org/anthology/P18-2011
PWC	https://paperswithcode.com/paper/identification-of-alias-links-among
Repo
Framework

UTFPR at IEST 2018: Exploring Character-to-Word Composition for Emotion Analysis


Title	UTFPR at IEST 2018: Exploring Character-to-Word Composition for Emotion Analysis
Authors	Gustavo Paetzold
Abstract	We introduce the UTFPR system for the Implicit Emotions Shared Task of 2018: A compositional character-to-word recurrent neural network that does not exploit heavy and/or hard-to-obtain resources. We find that our approach can outperform multiple baselines, and offers an elegant and effective solution to the problem of orthographic variance in tweets.
Tasks	Emotion Recognition
Published	2018-10-01
URL	https://www.aclweb.org/anthology/W18-6224/
PDF	https://www.aclweb.org/anthology/W18-6224
PWC	https://paperswithcode.com/paper/utfpr-at-iest-2018-exploring-character-to
Repo
Framework