Paper Group NANR 49
Rule-Based Translation of Spanish Verb-Noun Combinations into Basque. A Generative Model of Phonotactics. Developing a web-based workbook for English supporting the interaction of students and teachers. Communication with Robots using Multilayer Recurrent Networks. Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier …
Rule-Based Translation of Spanish Verb-Noun Combinations into Basque
Title | Rule-Based Translation of Spanish Verb-Noun Combinations into Basque |
Authors | Uxoa Iñurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola |
Abstract | This paper presents a method to improve the translation of Verb-Noun Combinations (VNCs) in a rule-based Machine Translation (MT) system for Spanish-Basque. Linguistic information about a set of VNCs is gathered from the public database Konbitzul, and it is integrated into the MT system, leading to an improvement in BLEU, NIST and TER scores, as well as the results being evidently better according to human evaluators. |
Tasks | Machine Translation |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1720/ |
https://www.aclweb.org/anthology/W17-1720 | |
PWC | https://paperswithcode.com/paper/rule-based-translation-of-spanish-verb-noun |
Repo | |
Framework | |
A Generative Model of Phonotactics
Title | A Generative Model of Phonotactics |
Authors | Richard Futrell, Adam Albright, Peter Graff, Timothy J. O'Donnell |
Abstract | We present a probabilistic model of phonotactics, the set of well-formed phoneme sequences in a language. Unlike most computational models of phonotactics (Hayes and Wilson, 2008; Goldsmith and Riggle, 2012), we take a fully generative approach, modeling a process where forms are built up out of subparts by phonologically-informed structure building operations. We learn an inventory of subparts by applying stochastic memoization (Johnson et al., 2007; Goodman et al., 2008) to a generative process for phonemes structured as an and-or graph, based on concepts of feature hierarchy from generative phonology (Clements, 1985; Dresher, 2009). Subparts are combined in a way that allows tier-based feature interactions. We evaluate our models' ability to capture phonotactic distributions in the lexicons of 14 languages drawn from the WOLEX corpus (Graff, 2012). Our full model robustly assigns higher probabilities to held-out forms than a sophisticated N-gram model for all languages. We also present novel analyses that probe model behavior in more detail. |
Tasks | |
Published | 2017-01-01 |
URL | https://www.aclweb.org/anthology/Q17-1006/ |
https://www.aclweb.org/anthology/Q17-1006 | |
PWC | https://paperswithcode.com/paper/a-generative-model-of-phonotactics |
Repo | |
Framework | |
Developing a web-based workbook for English supporting the interaction of students and teachers
Title | Developing a web-based workbook for English supporting the interaction of students and teachers |
Authors | Björn Rudzewitz, Ramon Ziai, Kordula De Kuthy, Detmar Meurers |
Abstract | |
Tasks | Language Acquisition |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0305/ |
https://www.aclweb.org/anthology/W17-0305 | |
PWC | https://paperswithcode.com/paper/developing-a-web-based-workbook-for-english |
Repo | |
Framework | |
Communication with Robots using Multilayer Recurrent Networks
Title | Communication with Robots using Multilayer Recurrent Networks |
Authors | Bedřich Pišl, David Mareček |
Abstract | In this paper, we describe an improvement on the task of giving instructions to robots in a simulated block world using unrestricted natural language commands. |
Tasks | Tokenization |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2806/ |
https://www.aclweb.org/anthology/W17-2806 | |
PWC | https://paperswithcode.com/paper/communication-with-robots-using-multilayer |
Repo | |
Framework | |
Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier Composition
Title | Identifying 1950s American Jazz Musicians: Fine-Grained IsA Extraction via Modifier Composition |
Authors | Ellie Pavlick, Marius Paşca |
Abstract | We present a method for populating fine-grained classes (e.g., "1950s American jazz musicians") with instances (e.g., Charles Mingus). While state-of-the-art methods tend to treat class labels as single lexical units, the proposed method considers each of the individual modifiers in the class label relative to the head. An evaluation on the task of reconstructing Wikipedia category pages demonstrates a >10 point increase in AUC, over a strong baseline relying on widely-used Hearst patterns. |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1192/ |
https://www.aclweb.org/anthology/P17-1192 | |
PWC | https://paperswithcode.com/paper/identifying-1950s-american-jazz-musicians |
Repo | |
Framework | |
Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths
Title | Classifying Temporal Relations by Bidirectional LSTM over Dependency Paths |
Authors | Fei Cheng, Yusuke Miyao |
Abstract | Temporal relation classification is becoming an active research field. Lots of methods have been proposed, while most of them focus on extracting features from external resources. Less attention has been paid to a significant advance in a closely related task: relation extraction. In this work, we borrow a state-of-the-art method in relation extraction by adopting bidirectional long short-term memory (Bi-LSTM) along dependency paths (DP). We make a "common root" assumption to extend DP representations of cross-sentence links. In the final comparison to two state-of-the-art systems on TimeBank-Dense, our model achieves comparable performance, without using external knowledge, as well as manually annotated attributes of entities (class, tense, polarity, etc.). |
Tasks | Question Answering, Relation Classification, Relation Extraction |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2001/ |
https://www.aclweb.org/anthology/P17-2001 | |
PWC | https://paperswithcode.com/paper/classifying-temporal-relations-by |
Repo | |
Framework | |
A KL-LUCB algorithm for Large-Scale Crowdsourcing
Title | A KL-LUCB algorithm for Large-Scale Crowdsourcing |
Authors | Ervin Tanczos, Robert Nowak, Bob Mankoff |
Abstract | This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest. |
Tasks | Multi-Armed Bandits |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7171-a-kl-lucb-algorithm-for-large-scale-crowdsourcing |
http://papers.nips.cc/paper/7171-a-kl-lucb-algorithm-for-large-scale-crowdsourcing.pdf | |
PWC | https://paperswithcode.com/paper/a-kl-lucb-algorithm-for-large-scale |
Repo | |
Framework | |
Ultra-Concise Multi-genre Summarisation of Web2.0: towards Intelligent Content Generation
Title | Ultra-Concise Multi-genre Summarisation of Web2.0: towards Intelligent Content Generation |
Authors | Elena Lloret, Ester Boldrini, Patricio Martínez-Barco, Manuel Palomar |
Abstract | The electronic Word of Mouth has become the most powerful communication channel thanks to the wide usage of the Social Media. Our research proposes an approach towards the production of automatic ultra-concise summaries from multiple Web 2.0 sources. We exploit user-generated content from reviews and microblogs in different domains, and compile and analyse four types of ultra-concise summaries: a) positive information, b) negative information, c) both, or d) objective information. The appropriateness and usefulness of our model is demonstrated by its successful results and great potential in real-life applications, thus meaning a relevant advancement of the state-of-the-art approaches. |
Tasks | Information Retrieval, Opinion Mining |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1006/ |
https://www.aclweb.org/anthology/W17-1006 | |
PWC | https://paperswithcode.com/paper/ultra-concise-multi-genre-summarisation-of |
Repo | |
Framework | |
IIIT-H at IJCNLP-2017 Task 4: Customer Feedback Analysis using Machine Learning and Neural Network Approaches
Title | IIIT-H at IJCNLP-2017 Task 4: Customer Feedback Analysis using Machine Learning and Neural Network Approaches |
Authors | Prathyusha Danda, Pruthwik Mishra, Silpa Kanneganti, Soujanya Lanka |
Abstract | The IJCNLP 2017 shared task on Customer Feedback Analysis focuses on classifying customer feedback into one of a predefined set of categories or classes. In this paper, we describe our approach to this problem and the results on four languages, i.e. English, French, Japanese and Spanish. Our system implemented a bidirectional LSTM (Graves and Schmidhuber, 2005) using pre-trained glove (Pennington et al., 2014) and fastText (Joulin et al., 2016) embeddings, and SVM (Cortes and Vapnik, 1995) with TF-IDF vectors for classifying the feedback data which is described in the later sections. We also tried different machine learning techniques and compared the results in this paper. Out of the 12 participating teams, our systems obtained 0.65, 0.86, 0.70 and 0.56 exact accuracy score in English, Spanish, French and Japanese respectively. We observed that our systems perform better than the baseline systems in three languages while we match the baseline accuracy for Japanese on our submitted systems. We noticed significant improvements in Japanese in later experiments, matching the highest performing system that was submitted in the shared task, which we will discuss in this paper. |
Tasks | |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4026/ |
https://www.aclweb.org/anthology/I17-4026 | |
PWC | https://paperswithcode.com/paper/iiit-h-at-ijcnlp-2017-task-4-customer |
Repo | |
Framework | |
Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter
Title | Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter |
Authors | Svitlana Volkova, Kyle Shaffer, Jin Yea Jang, Nathan Hodas |
Abstract | Pew research polls report 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that "made-up news" has caused a "great deal of confusion" about the facts of current events (Barthel et al., 2016). Fabricated stories in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news: satire, hoaxes, clickbait and propaganda. We show that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models does not improve performance. Incorporating linguistic features improves classification results; however, social interaction features are most informative for finer-grained separation between four types of suspicious news posts. |
Tasks | Deception Detection |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2102/ |
https://www.aclweb.org/anthology/P17-2102 | |
PWC | https://paperswithcode.com/paper/separating-facts-from-fiction-linguistic |
Repo | |
Framework | |
Speech- and Text-driven Features for Automated Scoring of English Speaking Tasks
Title | Speech- and Text-driven Features for Automated Scoring of English Speaking Tasks |
Authors | Anastassia Loukina, Nitin Madnani, Aoife Cahill |
Abstract | We consider the automatic scoring of a task for which both the content of the response as well as its spoken fluency are important. We combine features from a text-only content scoring system originally designed for written responses with several categories of acoustic features. Although adding any single category of acoustic features to the text-only system on its own does not significantly improve performance, adding all acoustic features together does yield a small but significant improvement. These results are consistent for responses to open-ended questions and to questions focused on some given source material. |
Tasks | Speech Recognition |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-4609/ |
https://www.aclweb.org/anthology/W17-4609 | |
PWC | https://paperswithcode.com/paper/speech-and-text-driven-features-for-automated |
Repo | |
Framework | |
Taking into account Inter-sentence Similarity for Update Summarization
Title | Taking into account Inter-sentence Similarity for Update Summarization |
Authors | Maâli Mnasri, Gaël de Chalendar, Olivier Ferret |
Abstract | Following Gillick and Favre (2009), a lot of work about extractive summarization has modeled this task by associating two contrary constraints: one aims at maximizing the coverage of the summary with respect to its information content while the other represents its size limit. In this context, the notion of redundancy is only implicitly taken into account. In this article, we extend the framework defined by Gillick and Favre (2009) by examining how and to what extent integrating semantic sentence similarity into an update summarization system can improve its results. We show more precisely the impact of this strategy through evaluations performed on DUC 2007 and TAC 2008 and 2009 datasets. |
Tasks | Document Summarization, Multi-Document Summarization, Natural Language Inference, Topic Models |
Published | 2017-11-01 |
URL | https://www.aclweb.org/anthology/I17-2035/ |
https://www.aclweb.org/anthology/I17-2035 | |
PWC | https://paperswithcode.com/paper/taking-into-account-inter-sentence-similarity |
Repo | |
Framework | |
Toward a Web-based Speech Corpus for Algerian Dialectal Arabic Varieties
Title | Toward a Web-based Speech Corpus for Algerian Dialectal Arabic Varieties |
Authors | Soumia Bougrine, Aicha Chorana, Abdallah Lakhdari, Hadda Cherroun |
Abstract | The success of machine learning for automatic speech processing has raised the need for large-scale datasets. However, collecting such data is often a challenging task as it implies significant investment involving time and money cost. In this paper, we devise a recipe for building large-scale Speech Corpora by harnessing Web resources, namely YouTube, other Social Media, Online Radio and TV. We illustrate our methodology by building KALAM'DZ, an Arabic spoken corpus dedicated to Algerian dialectal varieties. The preliminary version of our dataset covers all major Algerian dialects. In addition, we make sure that this material takes into account numerous aspects that foster its richness. In fact, we have targeted various speech topics. Some automatic and manual annotations are provided. They gather useful information related to the speakers and sub-dialect information at the utterance level. Our corpus encompasses the 8 major Algerian Arabic sub-dialects with 4881 speakers and more than 104.4 hours segmented in utterances of at least 6 s. |
Tasks | Speech Recognition, Speech Synthesis |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1317/ |
https://www.aclweb.org/anthology/W17-1317 | |
PWC | https://paperswithcode.com/paper/toward-a-web-based-speech-corpus-for-algerian |
Repo | |
Framework | |
Revising the METU-Sabancı Turkish Treebank: An Exercise in Surface-Syntactic Annotation of Agglutinative Languages
Title | Revising the METU-Sabancı Turkish Treebank: An Exercise in Surface-Syntactic Annotation of Agglutinative Languages |
Authors | Alicia Burga, Alp Öktem, Leo Wanner |
Abstract | |
Tasks | Language Modelling |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6506/ |
https://www.aclweb.org/anthology/W17-6506 | |
PWC | https://paperswithcode.com/paper/revising-the-metu-sabanca-turkish-treebank-an |
Repo | |
Framework | |
Towards Never Ending Language Learning for Morphologically Rich Languages
Title | Towards Never Ending Language Learning for Morphologically Rich Languages |
Authors | Kseniya Buraya, Lidia Pivovarova, Sergey Budkov, Andrey Filchenkov |
Abstract | This work deals with ontology learning from unstructured Russian text. We implement one of the components of the Never Ending Language Learner and introduce algorithm extensions aimed at capturing the specificity of a morphologically rich free-word-order language. We demonstrate that this method may be successfully applied to Russian data. In addition, we perform several experiments comparing different settings of the training process. We demonstrate that utilizing morphological features significantly improves the system precision, while using seed patterns helps to improve the coverage. |
Tasks | Information Retrieval |
Published | 2017-04-01 |
URL | https://www.aclweb.org/anthology/W17-1417/ |
https://www.aclweb.org/anthology/W17-1417 | |
PWC | https://paperswithcode.com/paper/towards-never-ending-language-learning-for |
Repo | |
Framework | |