July 26, 2019

1837 words 9 mins read

Paper Group NANR 91



Variance in Historical Data: How bad is it and how can we profit from it for historical linguistics?

Title Variance in Historical Data: How bad is it and how can we profit from it for historical linguistics?
Authors Stefanie Dipper
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0501/
PDF https://www.aclweb.org/anthology/W17-0501
PWC https://paperswithcode.com/paper/variance-in-historical-data-how-bad-is-it-and
Repo
Framework

Synchronized Mediawiki based analyzer dictionary development

Title Synchronized Mediawiki based analyzer dictionary development
Authors Jack Rueter, Mika Hämäläinen
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-0601/
PDF https://www.aclweb.org/anthology/W17-0601
PWC https://paperswithcode.com/paper/synchronized-mediawiki-based-analyzer
Repo
Framework

Preliminary Experiments concerning Verbal Predicative Structure Extraction from a Large Finnish Corpus

Title Preliminary Experiments concerning Verbal Predicative Structure Extraction from a Large Finnish Corpus
Authors Guersande Chaminade, Thierry Poibeau
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-0605/
PDF https://www.aclweb.org/anthology/W17-0605
PWC https://paperswithcode.com/paper/preliminary-experiments-concerning-verbal
Repo
Framework

Language technology resources and tools for Mansi: an overview

Title Language technology resources and tools for Mansi: an overview
Authors Csilla Horváth, Norbert Szilágyi, Veronika Vincze, Ágoston Nagy
Abstract
Tasks
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-0606/
PDF https://www.aclweb.org/anthology/W17-0606
PWC https://paperswithcode.com/paper/language-technology-resources-and-tools-for
Repo
Framework

Distributional regularities of verbs and verbal adjectives: Treebank evidence and broader implications

Title Distributional regularities of verbs and verbal adjectives: Treebank evidence and broader implications
Authors Dani{"e}l de Kok, Patricia Fischer, Corina Dima, Erhard Hinrichs
Abstract
Tasks Lemmatization, Word Embeddings
Published 2017-01-01
URL https://www.aclweb.org/anthology/W17-7603/
PDF https://www.aclweb.org/anthology/W17-7603
PWC https://paperswithcode.com/paper/distributional-regularities-of-verbs-and
Repo
Framework

Predicting Japanese scrambling in the wild

Title Predicting Japanese scrambling in the wild
Authors Naho Orita
Abstract Japanese speakers have a choice between canonical SOV and scrambled OSV word order to express the same meaning. Although previous experiments examine the influence of one or two factors for scrambling in a controlled setting, it is not yet known what kinds of multiple effects contribute to scrambling. This study uses naturally distributed data to test the multiple effects on scrambling simultaneously. A regression analysis replicates the NP length effect and suggests the influence of noun types, but it provides no evidence for syntactic priming, given-new ordering, and the animacy effect. These findings only show evidence for sentence-internal factors, but we find no evidence that discourse level factors play a role.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/W17-0706/
PDF https://www.aclweb.org/anthology/W17-0706
PWC https://paperswithcode.com/paper/predicting-japanese-scrambling-in-the-wild
Repo
Framework
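
The abstract above describes a regression analysis that tests several candidate predictors of scrambling at once on naturally occurring data. As a rough sketch of what such a setup looks like, the example below fits a logistic regression with statsmodels; the column names, predictors, and simulated data are assumptions made for illustration, not the paper's corpus or feature set.

```python
# Illustrative only: a logistic regression over hypothetical scrambling predictors.
# The data is simulated; the paper's actual features and corpus differ.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "np_length_diff": rng.integers(-5, 6, n),   # object length minus subject length
    "object_given":   rng.integers(0, 2, n),    # object mentioned in prior discourse
    "object_animate": rng.integers(0, 2, n),
    "primed":         rng.integers(0, 2, n),    # OSV order in the preceding clause
})
# Simulated outcome: longer objects are slightly more likely to scramble.
logit_p = -2.0 + 0.4 * df["np_length_diff"]
probs = 1 / (1 + np.exp(-logit_p.to_numpy()))
df["scrambled"] = rng.binomial(1, probs)        # 1 = OSV, 0 = SOV

# Test all candidate factors simultaneously and inspect coefficients/p-values.
model = smf.logit(
    "scrambled ~ np_length_diff + object_given + object_animate + primed",
    data=df,
).fit(disp=False)
print(model.summary())
```

With real corpus data, the fitted coefficients and their p-values are what would support or fail to support each factor, as in the abstract's NP-length finding.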

Capacity Releasing Diffusion for Speed and Locality.

Title Capacity Releasing Diffusion for Speed and Locality.
Authors Di Wang, Kimon Fountoulakis, Monika Henzinger, Michael W. Mahoney, Satish Rao
Abstract Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each step in an iterative manner, they can sometimes spread mass “too aggressively,” thereby failing to find the “right” clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and stays more local than the classical spectral diffusion process. As an application, we use our CRD Process to develop an improved local algorithm for graph clustering. Our local graph clustering method can find local clusters in a model of clustering where one begins the CRD Process in a cluster whose vertices are connected better internally than externally by an $O(\log^2 n)$ factor, where $n$ is the number of nodes in the cluster. Thus, our CRD Process is the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular for realistic social graphs where there are moderately good—but not very good—clusters.
Tasks Graph Clustering
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=827
PDF http://proceedings.mlr.press/v70/wang17b/wang17b.pdf
PWC https://paperswithcode.com/paper/capacity-releasing-diffusion-for-speed-and
Repo
Framework
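
As a concrete point of reference for the abstract above, the sketch below shows the classical local-clustering baseline that CRD is contrasted with: an approximate personalized-PageRank push from a seed node followed by a sweep cut by conductance. This is not the paper's Capacity Releasing Diffusion process, only the standard spectral-style procedure it improves on; the graph and parameters are arbitrary.

```python
# Classical local clustering baseline (PPR push + sweep cut) -- NOT the paper's CRD.
import networkx as nx

def ppr_push(G, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank from `seed` via the push procedure."""
    p, r = {}, {seed: 1.0}
    active = [seed]
    while active:
        u = active.pop()
        du = G.degree(u)
        if r.get(u, 0.0) < eps * du:
            continue
        ru = r[u]
        p[u] = p.get(u, 0.0) + alpha * ru          # keep alpha fraction at u
        r[u] = (1 - alpha) * ru / 2                # keep half the rest as residual
        share = (1 - alpha) * ru / (2 * du)        # spread the other half to neighbors
        for v in G.neighbors(u):
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * G.degree(v):
                active.append(v)
        if r[u] >= eps * du:
            active.append(u)
    return p

def sweep_cut(G, p):
    """Return the prefix of the degree-normalized ranking with smallest conductance."""
    order = sorted(p, key=lambda u: p[u] / G.degree(u), reverse=True)
    total_vol = 2 * G.number_of_edges()
    in_set, vol, cut = set(), 0, 0
    best, best_phi = None, float("inf")
    for u in order:
        vol += G.degree(u)
        cut += G.degree(u) - 2 * sum(1 for v in G.neighbors(u) if v in in_set)
        in_set.add(u)
        denom = min(vol, total_vol - vol)
        if denom > 0 and cut / denom < best_phi:
            best_phi, best = cut / denom, set(in_set)
    return best, best_phi

G = nx.karate_club_graph()
cluster, phi = sweep_cut(G, ppr_push(G, seed=0))
print(sorted(cluster), round(phi, 3))
```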

Leveraging Linguistic Resources for Improving Neural Text Classification

Title Leveraging Linguistic Resources for Improving Neural Text Classification
Authors Ming Liu, Gholamreza Haffari, Wray Buntine, Michelle Ananda-Rajah
Abstract
Tasks Document Classification, Information Retrieval, Sentiment Analysis, Text Classification, Word Embeddings
Published 2017-12-01
URL https://www.aclweb.org/anthology/U17-1004/
PDF https://www.aclweb.org/anthology/U17-1004
PWC https://paperswithcode.com/paper/leveraging-linguistic-resources-for-improving
Repo
Framework

Generating Natural Language Question-Answer Pairs from a Knowledge Graph Using a RNN Based Question Generation Model

Title Generating Natural Language Question-Answer Pairs from a Knowledge Graph Using a RNN Based Question Generation Model
Authors Sathish Reddy, Dinesh Raghu, Mitesh M. Khapra, Sachindra Joshi
Abstract In recent years, knowledge graphs such as Freebase that capture facts about entities and relationships between them have been used actively for answering factoid questions. In this paper, we explore the problem of automatically generating question answer pairs from a given knowledge graph. The generated question answer (QA) pairs can be used in several downstream applications. For example, they could be used for training better QA systems. To generate such QA pairs, we first extract a set of keywords from entities and relationships expressed in a triple stored in the knowledge graph. From each such set, we use a subset of keywords to generate a natural language question that has a unique answer. We treat this subset of keywords as a sequence and propose a sequence to sequence model using RNN to generate a natural language question from it. Our RNN based model generates QA pairs with an accuracy of 33.61 percent and performs 110.47 percent (relative) better than a state-of-the-art template based method for generating natural language question from keywords. We also do an extrinsic evaluation by using the generated QA pairs to train a QA system and observe that the F1-score of the QA system improves by 5.5 percent (relative) when using automatically generated QA pairs in addition to manually generated QA pairs available for training.
Tasks Knowledge Graphs, Question Answering, Question Generation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1036/
PDF https://www.aclweb.org/anthology/E17-1036
PWC https://paperswithcode.com/paper/generating-natural-language-question-answer
Repo
Framework
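
The data-preparation step described in the abstract (keywords extracted from a knowledge-graph triple, with the remaining entity as the answer) can be sketched as below. The triples, the keyword choice, and the crude question template are illustrative assumptions; in the paper an RNN sequence-to-sequence model consumes the keyword sequence and generates the question, rather than a template.

```python
# Illustrative sketch of turning KG triples into (question, answer) pairs.
# Triples and template are made up; the paper generates questions with a seq2seq RNN.
triples = [
    ("Berlin", "located_in", "Germany"),
    ("Python", "created_by", "Guido van Rossum"),
]

def triple_to_pair(subj, relation, obj):
    # Keywords: subject entity plus the relation tokens; the object is the answer.
    keywords = [subj] + relation.split("_")
    # A trained seq2seq model would consume `keywords` and emit a fluent question;
    # this crude template only stands in for that generation step.
    question = f"Which entity is {subj} {' '.join(relation.split('_'))}?"
    return {"keywords": keywords, "question": question, "answer": obj}

for pair in (triple_to_pair(*t) for t in triples):
    print(pair["keywords"], "->", pair["question"], "/", pair["answer"])
```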

SaToS: Assessing and Summarising Terms of Services from German Webshops

Title SaToS: Assessing and Summarising Terms of Services from German Webshops
Authors Daniel Braun, Elena Scepankova, Patrick Holl, Florian Matthes
Abstract Every time we buy something online, we are confronted with Terms of Services. However, only a few people actually read these terms, before accepting them, often to their disadvantage. In this paper, we present the SaToS browser plugin which summarises and simplifies Terms of Services from German webshops.
Tasks Text Generation
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-3534/
PDF https://www.aclweb.org/anthology/W17-3534
PWC https://paperswithcode.com/paper/satos-assessing-and-summarising-terms-of
Repo
Framework

Discourse Annotation of Non-native Spontaneous Spoken Responses Using the Rhetorical Structure Theory Framework

Title Discourse Annotation of Non-native Spontaneous Spoken Responses Using the Rhetorical Structure Theory Framework
Authors Xinhao Wang, James Bruno, Hillary Molloy, Keelan Evanini, Klaus Zechner
Abstract The availability of the Rhetorical Structure Theory (RST) Discourse Treebank has spurred substantial research into discourse analysis of written texts; however, limited research has been conducted to date on RST annotation and parsing of spoken language, in particular, non-native spontaneous speech. Considering that the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spoken language, we initiated a research effort to obtain RST annotations of a large number of non-native spoken responses from a standardized assessment of academic English proficiency. The resulting inter-annotator kappa agreements on the three different levels of Span, Nuclearity, and Relation are 0.848, 0.766, and 0.653, respectively. Furthermore, a set of features was explored to evaluate the discourse structure of non-native spontaneous speech based on these annotations; the highest performing feature resulted in a correlation of 0.612 with scores of discourse coherence provided by expert human raters.
Tasks Machine Translation, Text Generation, Text Summarization
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2041/
PDF https://www.aclweb.org/anthology/P17-2041
PWC https://paperswithcode.com/paper/discourse-annotation-of-non-native
Repo
Framework
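
The agreement figures quoted in the abstract are kappa values. As a minimal illustration of the basic statistic, the sketch below computes Cohen's kappa for two annotators labeling the same items; the labels are invented, and the paper's span-, nuclearity-, and relation-level agreement over RST trees is more involved than this.

```python
# Minimal Cohen's kappa for two annotators over the same items; labels are invented.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["Elaboration", "Background", "Elaboration", "Contrast", "Elaboration"]
b = ["Elaboration", "Elaboration", "Elaboration", "Contrast", "Background"]
print(round(cohens_kappa(a, b), 3))
```

scikit-learn's cohen_kappa_score computes the same quantity for nominal labels.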

StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent

Title StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent
Authors Tyler B. Johnson, Carlos Guestrin
Abstract Coordinate descent (CD) is a scalable and simple algorithm for solving many optimization problems in machine learning. Despite this fact, CD can also be very computationally wasteful. Due to sparsity in sparse regression problems, for example, the majority of CD updates often result in no progress toward the solution. To address this inefficiency, we propose a modified CD algorithm named “StingyCD.” By skipping over many updates that are guaranteed to not decrease the objective value, StingyCD significantly reduces convergence times. Since StingyCD only skips updates with this guarantee, however, StingyCD does not fully exploit the problem’s sparsity. For this reason, we also propose StingyCD+, an algorithm that achieves further speed-ups by skipping updates more aggressively. Since StingyCD and StingyCD+ rely on simple modifications to CD, it is also straightforward to use these algorithms with other approaches to scaling optimization. In empirical comparisons, StingyCD and StingyCD+ improve convergence times considerably for several L1-regularized optimization problems.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=764
PDF http://proceedings.mlr.press/v70/johnson17a/johnson17a.pdf
PWC https://paperswithcode.com/paper/stingycd-safely-avoiding-wasteful-updates-in
Repo
Framework
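
For context on the abstract above, the sketch below is a plain cyclic coordinate descent solver for the Lasso, the kind of solver StingyCD wraps; the comment marks where the paper's safe skip test would be inserted. The data is synthetic, and the skip test itself is deliberately omitted since its derivation is the paper's contribution.

```python
# Baseline cyclic coordinate descent for the Lasso (1/2 ||y - Xw||^2 + lam ||w||_1).
# StingyCD adds a cheap, safe test (omitted here) to skip useless coordinate updates.
import numpy as np

def lasso_cd(X, y, lam, n_epochs=100):
    n, d = X.shape
    w = np.zeros(d)
    r = y - X @ w                       # residual, kept consistent incrementally
    col_sq = (X ** 2).sum(axis=0)       # ||x_j||^2 for each column
    for _ in range(n_epochs):
        for j in range(d):
            # <-- StingyCD's test would go here: skip j when the update is
            #     guaranteed not to change w[j] (e.g. it stays at zero).
            rho = X[:, j] @ r + col_sq[j] * w[j]
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_w = np.zeros(50)
true_w[:5] = 1.0
y = X @ true_w + 0.1 * rng.standard_normal(200)
w_hat = lasso_cd(X, y, lam=20.0)
print(np.count_nonzero(w_hat), "nonzero coefficients")
```

The abstract's observation is visible even in this toy run: most coordinates stay at zero on every pass, so most updates do no useful work.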

Chatbot with a Discourse Structure-Driven Dialogue Management

Title Chatbot with a Discourse Structure-Driven Dialogue Management
Authors Boris Galitsky, Dmitry Ilvovsky
Abstract We build a chat bot with iterative content exploration that leads a user through a personalized knowledge acquisition session. The chat bot is designed as an automated customer support or product recommendation agent assisting a user in learning product features, product usability, suitability, troubleshooting and other related tasks. To control the user navigation through content, we extend the notion of a linguistic discourse tree (DT) towards a set of documents with multiple sections covering a topic. For a given paragraph, a DT is built by DT parsers. We then combine DTs for the paragraphs of documents to form what we call extended DT, which is a basis for interactive content exploration facilitated by the chat bot. To provide cohesive answers, we use a measure of rhetoric agreement between a question and an answer by tree kernel learning of their DTs.
Tasks Chatbot, Dialogue Management, Product Recommendation
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3022/
PDF https://www.aclweb.org/anthology/E17-3022
PWC https://paperswithcode.com/paper/chatbot-with-a-discourse-structure-driven
Repo
Framework

A Feature Structure Algebra for FTAG

Title A Feature Structure Algebra for FTAG
Authors Alexander Koller
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6201/
PDF https://www.aclweb.org/anthology/W17-6201
PWC https://paperswithcode.com/paper/a-feature-structure-algebra-for-ftag
Repo
Framework

Approximating Style by N-gram-based Annotation

Title Approximating Style by N-gram-based Annotation
Authors Melanie Andresen, Heike Zinsmeister
Abstract The concept of style is much debated in theoretical as well as empirical terms. From an empirical perspective, the key question is how to operationalize style and thus make it accessible for annotation and quantification. In authorship attribution, many different approaches have successfully resolved this issue at the cost of linguistic interpretability: The resulting algorithms may be able to distinguish one language variety from the other, but do not give us much information on their distinctive linguistic properties. We approach the issue of interpreting stylistic features by extracting linear and syntactic n-grams that are distinctive for a language variety. We present a study that exemplifies this process by a comparison of the German academic languages of linguistics and literary studies. Overall, our findings show that distinctive n-grams can be related to linguistic categories. The results suggest that the style of German literary studies is characterized by nominal structures and the style of linguistics by verbal ones.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4913/
PDF https://www.aclweb.org/anthology/W17-4913
PWC https://paperswithcode.com/paper/approximating-style-by-n-gram-based
Repo
Framework
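
One common way to operationalize "n-grams distinctive for a variety", in the spirit of the abstract above, is a keyness score such as the log-likelihood ratio between two corpora. The sketch below is only that generic procedure on toy data; the paper's corpora, preprocessing, and its syntactic n-grams over dependency analyses are not reproduced here.

```python
# Toy keyness sketch: rank bigrams that distinguish two text collections with the
# log-likelihood ratio (Dunning-style G2). Corpora and tokenization are made up.
import math
from collections import Counter

def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

def log_likelihood(a, b, total_a, total_b):
    """G2 keyness of an item seen a times in corpus A and b times in corpus B."""
    e_a = total_a * (a + b) / (total_a + total_b)
    e_b = total_b * (a + b) / (total_a + total_b)
    g2 = 0.0
    if a > 0:
        g2 += 2 * a * math.log(a / e_a)
    if b > 0:
        g2 += 2 * b * math.log(b / e_b)
    return g2

corpus_a = "the analysis of the data shows a clear effect".split()
corpus_b = "the reading of the novel evokes a sense of loss".split()
counts_a, counts_b = Counter(bigrams(corpus_a)), Counter(bigrams(corpus_b))
total_a, total_b = sum(counts_a.values()), sum(counts_b.values())

scored = {
    ng: log_likelihood(counts_a[ng], counts_b[ng], total_a, total_b)
    for ng in set(counts_a) | set(counts_b)
}
for ng, g2 in sorted(scored.items(), key=lambda kv: -kv[1])[:5]:
    print(ng, round(g2, 2))
```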