Paper Group NANR 208
Tagging Ingush - Language Technology For Low-Resource Languages Using Resources From Linguistic Field Work
Title | Tagging Ingush - Language Technology For Low-Resource Languages Using Resources From Linguistic Field Work |
Authors | Jörg Tiedemann, Johanna Nichols, Ronald Sprouse |
Abstract | This paper presents on-going work on creating NLP tools for under-resourced languages from very sparse training data coming from linguistic field work. In this work, we focus on Ingush, a Nakh-Daghestanian language spoken by about 300,000 people in the Russian republics Ingushetia and Chechnya. We present work on morphosyntactic taggers trained on transcribed and linguistically analyzed recordings and dependency parsers using English glosses to project annotation for creating synthetic treebanks. Our preliminary results are promising, supporting the goal of bootstrapping efficient NLP tools with limited or no task-specific annotated data resources available. |
Tasks | Cross-Lingual Transfer |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4020/ |
https://www.aclweb.org/anthology/W16-4020 | |
PWC | https://paperswithcode.com/paper/tagging-ingush-language-technology-for-low |
Repo | |
Framework | |
Unified Methods for Exploiting Piecewise Linear Structure in Convex Optimization
Title | Unified Methods for Exploiting Piecewise Linear Structure in Convex Optimization |
Authors | Tyler B. Johnson, Carlos Guestrin |
Abstract | We develop methods for rapidly identifying important components of a convex optimization problem for the purpose of achieving fast convergence times. By considering a novel problem formulation—the minimization of a sum of piecewise functions—we describe a principled and general mechanism for exploiting piecewise linear structure in convex optimization. This result leads to a theoretically justified working set algorithm and a novel screening test, which generalize and improve upon many prior results on exploiting structure in convex optimization. In empirical comparisons, we study the scalability of our methods. We find that screening scales surprisingly poorly with the size of the problem, while our working set algorithm convincingly outperforms alternative approaches. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6043-unified-methods-for-exploiting-piecewise-linear-structure-in-convex-optimization |
http://papers.nips.cc/paper/6043-unified-methods-for-exploiting-piecewise-linear-structure-in-convex-optimization.pdf | |
PWC | https://paperswithcode.com/paper/unified-methods-for-exploiting-piecewise |
Repo | |
Framework | |
Neural Enquirer: Learning to Query Tables in Natural Language
Title | Neural Enquirer: Learning to Query Tables in Natural Language |
Authors | Pengcheng Yin, Zhengdong Lu, Hang Li, Ben Kao |
Abstract | |
Tasks | Learning to Execute, Question Answering, Semantic Parsing |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0105/ |
https://www.aclweb.org/anthology/W16-0105 | |
PWC | https://paperswithcode.com/paper/neural-enquirer-learning-to-query-tables-in |
Repo | |
Framework | |
Using Confusion Graphs to Understand Classifier Error
Title | Using Confusion Graphs to Understand Classifier Error |
Authors | Davis Yoshida, Jordan Boyd-Graber |
Abstract | |
Tasks | Question Answering |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0108/ |
https://www.aclweb.org/anthology/W16-0108 | |
PWC | https://paperswithcode.com/paper/using-confusion-graphs-to-understand |
Repo | |
Framework | |
Could Machine Learning Shed Light on Natural Language Complexity?
Title | Could Machine Learning Shed Light on Natural Language Complexity? |
Authors | Maria Dolores Jiménez-López, Leonor Becerra-Bonache |
Abstract | In this paper, we propose to use a subfield of machine learning (grammatical inference) to measure linguistic complexity from a developmental point of view. We focus on relative complexity by considering a child learner in the process of first language acquisition. The relevance of grammatical inference models for measuring linguistic complexity from a developmental point of view is based on the fact that algorithms proposed in this area can be considered computational models for studying first language acquisition. Even though it will be possible to use different techniques from the field of machine learning as computational models for dealing with linguistic complexity (since in any model we have algorithms that can learn from data), we claim that grammatical inference models offer some advantages over other tools. |
Tasks | Language Acquisition |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4101/ |
https://www.aclweb.org/anthology/W16-4101 | |
PWC | https://paperswithcode.com/paper/could-machine-learning-shed-light-on-natural |
Repo | |
Framework | |
Web services and data mining: combining linguistic tools for Polish with an analytical platform
Title | Web services and data mining: combining linguistic tools for Polish with an analytical platform |
Authors | Maciej Ogrodniczuk |
Abstract | In this paper we present a new combination of existing language tools for Polish with a popular data mining platform intended to help researchers from digital humanities perform computational analyses without any programming. The toolset includes RapidMiner Studio, a software solution offering graphical setup of integrated analytical processes, and Multiservice, a Web service offering access to several state-of-the-art linguistic tools for Polish. The setting is verified in a simple task of counting frequencies of unknown words in a small corpus. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4025/ |
https://www.aclweb.org/anthology/W16-4025 | |
PWC | https://paperswithcode.com/paper/web-services-and-data-mining-combining |
Repo | |
Framework | |
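The verification task named in the abstract above (counting frequencies of unknown words in a small corpus) can be illustrated in a few lines of Python. This is only a hedged sketch, not the RapidMiner and Multiservice workflow the paper describes: the function name `unknown_word_frequencies`, the regex tokenizer, and the idea of treating a word as "unknown" when it is absent from a supplied word list are all assumptions made for illustration.

```python
import re
from collections import Counter


def unknown_word_frequencies(texts, known_words):
    """Count tokens that do not appear in `known_words`.

    Assumption: "unknown" means "not in the supplied lexicon"; the paper
    relies on Polish linguistic tools for this decision instead.
    """
    known = {w.lower() for w in known_words}
    counts = Counter()
    for text in texts:
        # Naive tokenization: lowercase, then runs of word characters.
        for token in re.findall(r"\w+", text.lower()):
            if token not in known:
                counts[token] += 1
    return counts


corpus = ["Ala ma kota", "Ala ma xyzzy i xyzzy"]
lexicon = {"ala", "ma", "kota", "i"}
frequencies = unknown_word_frequencies(corpus, lexicon)
```

In a realistic setting the lexicon lookup would be replaced by a morphological analyser that flags out-of-vocabulary forms.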
Addressing surprisal deficiencies in reading time models
Title | Addressing surprisal deficiencies in reading time models |
Authors | Marten van Schijndel, William Schuler |
Abstract | This study demonstrates a weakness in how n-gram and PCFG surprisal are used to predict reading times in eye-tracking data. In particular, the information conveyed by words skipped during saccades is not usually included in the surprisal measures. This study shows that correcting the surprisal calculation improves n-gram surprisal and that upcoming n-grams affect reading times, replicating previous findings of how lexical frequencies affect reading times. In contrast, the predictivity of PCFG surprisal does not benefit from the surprisal correction despite the fact that lexical sequences skipped by saccades are processed by readers, as demonstrated by the corrected n-gram measure. These results raise questions about the formulation of information-theoretic measures of syntactic processing such as PCFG surprisal and entropy reduction when applied to reading times. |
Tasks | Eye Tracking |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4104/ |
https://www.aclweb.org/anthology/W16-4104 | |
PWC | https://paperswithcode.com/paper/addressing-surprisal-deficiencies-in-reading |
Repo | |
Framework | |
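The correction discussed in the abstract above can be sketched as follows. This is one plausible reading, stated as an assumption: the surprisal assigned to a fixated word also includes the surprisal of any words skipped since the previous fixation. The function names, the per-word probabilities, and the restriction to strictly forward fixations (no regressions) are illustrative choices; the paper's exact formulation may differ.

```python
import math


def word_surprisals(probs):
    """Surprisal in bits for each word, given P(word | context)."""
    return [-math.log2(p) for p in probs]


def fixation_surprisals(surprisals, fixated, accumulate=True):
    """Return one surprisal value per fixated word index.

    `fixated` must be strictly increasing word indices (assumption:
    no regressive saccades). With accumulate=True, each fixation also
    receives the surprisal of the words skipped since the last fixation.
    """
    out = []
    prev = -1
    for i in fixated:
        if accumulate:
            out.append(sum(surprisals[prev + 1 : i + 1]))
        else:
            out.append(surprisals[i])
        prev = i
    return out


# Hypothetical conditional probabilities for a four-word sentence.
s = word_surprisals([0.5, 0.25, 0.125, 0.5])
uncorrected = fixation_surprisals(s, [0, 2, 3], accumulate=False)
corrected = fixation_surprisals(s, [0, 2, 3], accumulate=True)
```

Here the reader skips the second word, so the corrected value for the third word adds the skipped word's surprisal, while the uncorrected measure ignores it.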
Memory access during incremental sentence processing causes reading time latency
Title | Memory access during incremental sentence processing causes reading time latency |
Authors | Cory Shain, Marten van Schijndel, Richard Futrell, Edward Gibson, William Schuler |
Abstract | Studies on the role of memory as a predictor of reading time latencies (1) differ in their predictions about when memory effects should occur in processing and (2) have had mixed results, with strong positive effects emerging from isolated constructed stimuli and weak or even negative effects emerging from naturally-occurring stimuli. Our study addresses these concerns by comparing several implementations of prominent sentence processing theories on an exploratory corpus and evaluating the most successful of these on a confirmatory corpus, using a new self-paced reading corpus of seemingly natural narratives constructed to contain an unusually high proportion of memory-intensive constructions. We show highly significant and complementary broad-coverage latency effects both for predictors based on the Dependency Locality Theory and for predictors based on a left-corner parsing model of sentence processing. Our results indicate that memory access during sentence processing does take time, but suggest that stimuli requiring many memory access events may be necessary in order to observe the effect. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4106/ |
https://www.aclweb.org/anthology/W16-4106 | |
PWC | https://paperswithcode.com/paper/memory-access-during-incremental-sentence |
Repo | |
Framework | |
Automatic Triage of Mental Health Forum Posts
Title | Automatic Triage of Mental Health Forum Posts |
Authors | Benjamin Shickel, Parisa Rashidi |
Abstract | |
Tasks | Sentiment Analysis, Text Classification, Word Embeddings |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0326/ |
https://www.aclweb.org/anthology/W16-0326 | |
PWC | https://paperswithcode.com/paper/automatic-triage-of-mental-health-forum-posts |
Repo | |
Framework | |
Semi-supervised CLPsych 2016 Shared Task System Submission
Title | Semi-supervised CLPsych 2016 Shared Task System Submission |
Authors | Nicolas Rey-Villamizar, Prasha Shrestha, Thamar Solorio, Farig Sadeque, Steven Bethard, Ted Pedersen |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0322/ |
https://www.aclweb.org/anthology/W16-0322 | |
PWC | https://paperswithcode.com/paper/semi-supervised-clpsych-2016-shared-task |
Repo | |
Framework | |
Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description
Title | Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description |
Authors | Arunavha Chanda, Dipankar Das, Chandan Mazumdar |
Abstract | |
Tasks | Language Identification |
Published | 2016-11-01 |
URL | https://www.aclweb.org/anthology/W16-5814/ |
https://www.aclweb.org/anthology/W16-5814 | |
PWC | https://paperswithcode.com/paper/columbia-jadavpur-submission-for-emnlp-2016 |
Repo | |
Framework | |
Duluth at SemEval 2016 Task 14: Extending Gloss Overlaps to Enrich Semantic Taxonomies
Title | Duluth at SemEval 2016 Task 14: Extending Gloss Overlaps to Enrich Semantic Taxonomies |
Authors | Ted Pedersen |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1207/ |
https://www.aclweb.org/anthology/S16-1207 | |
PWC | https://paperswithcode.com/paper/duluth-at-semeval-2016-task-14-extending |
Repo | |
Framework | |
A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque
Title | A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque |
Authors | Itziar Gonzalez-Dios, María Jesús Aranzabe, Arantza Díaz de Ilarraza |
Abstract | In this paper, we present a comparative analysis of statistically predictive syntactic features of complexity and the treatment of these features by humans when simplifying texts. To that end, we have used a list of the five most statistically predictive features obtained automatically and the Corpus of Basque Simplified Texts (CBST) to analyse how the syntactic phenomena in these features have been manually simplified. Our aim is to go beyond the descriptions of operations found in the corpus and relate the multidisciplinary findings to understand text complexity from different points of view. We also present some issues that can be important when analysing linguistic complexity. |
Tasks | Text Simplification |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4110/ |
https://www.aclweb.org/anthology/W16-4110 | |
PWC | https://paperswithcode.com/paper/a-preliminary-study-of-statistically |
Repo | |
Framework | |
Fast and Provably Good Seedings for k-Means
Title | Fast and Provably Good Seedings for k-Means |
Authors | Olivier Bachem, Mario Lucic, Hamed Hassani, Andreas Krause |
Abstract | Seeding, the task of finding initial cluster centers, is critical in obtaining high-quality clusterings for k-Means. However, k-means++ seeding, the state-of-the-art algorithm, does not scale well to massive datasets as it is inherently sequential and requires k full passes through the data. It was recently shown that Markov chain Monte Carlo sampling can be used to efficiently approximate the seeding step of k-means++. However, this result requires assumptions on the data generating distribution. We propose a simple yet fast seeding algorithm that produces provably good clusterings even without assumptions on the data. Our analysis shows that the algorithm allows for a favourable trade-off between solution quality and computational cost, speeding up k-means++ seeding by up to several orders of magnitude. We validate our theoretical results in extensive experiments on a variety of real-world data sets. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6478-fast-and-provably-good-seedings-for-k-means |
http://papers.nips.cc/paper/6478-fast-and-provably-good-seedings-for-k-means.pdf | |
PWC | https://paperswithcode.com/paper/fast-and-provably-good-seedings-for-k-means |
Repo | |
Framework | |
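For context, the baseline that the abstract above aims to speed up, standard k-means++ seeding, can be sketched as below. This is not the authors' proposed algorithm; it is a minimal pure-Python version of the well-known D² sampling scheme, and the names `sq_dist` and `kmeanspp_seed` are illustrative. Each new center needs a full pass over the data to update the distances, which is the "k full passes" cost the abstract mentions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (k-means++)."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]  # first center: uniform at random
    # Squared distance from every point to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(rng.choice(points))
            continue
        r = rng.uniform(0, total)
        acc = 0.0
        chosen = None
        for p, w in zip(points, d2):
            if w > 0:
                chosen = p  # last positive-weight point is the fallback
                acc += w
                if acc >= r:
                    break
        centers.append(chosen)
        # One full pass over the data per new center.
        d2 = [min(w, sq_dist(p, chosen)) for p, w in zip(points, d2)]
    return centers


pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0), (-10.0, 5.0)]
seeds = kmeanspp_seed(pts, 3, random.Random(1))
```

Because a point is sampled with probability proportional to its squared distance from the nearest existing center, already-chosen points have zero weight and far-away regions are favoured.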
Computer-assisted stylistic revision with incomplete and noisy feedback. A pilot study
Title | Computer-assisted stylistic revision with incomplete and noisy feedback. A pilot study |
Authors | Christian M. Meyer, Johann Frerik Koch |
Abstract | |
Tasks | Grammatical Error Correction |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0505/ |
https://www.aclweb.org/anthology/W16-0505 | |
PWC | https://paperswithcode.com/paper/computer-assisted-stylistic-revision-with |
Repo | |
Framework | |