May 4, 2019

Paper Group NANR 176

SemEval-2016 Task 8: Meaning Representation Parsing

Title SemEval-2016 Task 8: Meaning Representation Parsing
Authors Jonathan May
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1166/
PDF https://www.aclweb.org/anthology/S16-1166
PWC https://paperswithcode.com/paper/semeval-2016-task-8-meaning-representation
Repo
Framework

Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity

Title Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity
Authors Peyman Passban, Chris Hokamp, Andy Way, Qun Liu
Abstract
Tasks Language Modelling, Machine Translation
Published 2016-01-01
URL https://www.aclweb.org/anthology/W16-3403/
PDF https://www.aclweb.org/anthology/W16-3403
PWC https://paperswithcode.com/paper/improving-phrase-based-smt-using-cross
Repo
Framework

Blind Attacks on Machine Learners

Title Blind Attacks on Machine Learners
Authors Alex Beatson, Zhaoran Wang, Han Liu
Abstract The importance of studying the robustness of learners to malicious data is well established. While much work has been done establishing both robust estimators and effective data injection attacks when the attacker is omniscient, the ability of an attacker to provably harm learning while having access to little information is largely unstudied. We study the potential of a “blind attacker” to provably limit a learner’s performance by data injection attack without observing the learner’s training set or any parameter of the distribution from which it is drawn. We provide examples of simple yet effective attacks in two settings: firstly, where an “informed learner” knows the strategy chosen by the attacker, and secondly, where a “blind learner” knows only the proportion of malicious data and some family to which the malicious distribution chosen by the attacker belongs. For each attack, we analyze minimax rates of convergence and establish lower bounds on the learner’s minimax risk, exhibiting limits on a learner’s ability to learn under data injection attack even when the attacker is “blind”.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners
PDF http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners.pdf
PWC https://paperswithcode.com/paper/blind-attacks-on-machine-learners
Repo
Framework

Select-and-Sample for Spike-and-Slab Sparse Coding

Title Select-and-Sample for Spike-and-Slab Sparse Coding
Authors Abdul-Saboor Sheikh, Jörg Lücke
Abstract Probabilistic inference serves as a popular model for neural processing. It is still unclear, however, how approximate probabilistic inference can be accurate and scalable to very high-dimensional continuous latent spaces. Especially as typical posteriors for sensory data can be expected to exhibit complex latent dependencies including multiple modes. Here, we study an approach that can efficiently be scaled while maintaining a richly structured posterior approximation under these conditions. As example model we use spike-and-slab sparse coding for V1 processing, and combine latent subspace selection with Gibbs sampling (select-and-sample). Unlike factored variational approaches, the method can maintain large numbers of posterior modes and complex latent dependencies. Unlike pure sampling, the method is scalable to very high-dimensional latent spaces. Among all sparse coding approaches with non-trivial posterior approximations (MAP or ICA-like models), we report the largest-scale results. In applications we firstly verify the approach by showing competitiveness in standard denoising benchmarks. Secondly, we use its scalability to, for the first time, study highly-overcomplete settings for V1 encoding using sophisticated posterior representations. More generally, our study shows that very accurate probabilistic inference for multi-modal posteriors with complex dependencies is tractable, functionally desirable and consistent with models for neural inference.
Tasks Denoising
Published 2016-12-01
URL http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding
PDF http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding.pdf
PWC https://paperswithcode.com/paper/select-and-sample-for-spike-and-slab-sparse
Repo
Framework

Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons

Title Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons
Authors Antoine Bourlon, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Abstract Sentence alignment is a task that consists in aligning the parallel sentences in a translated article pair. This paper describes a method to perform sentence boundary detection and alignment simultaneously, which significantly improves the alignment accuracy on languages like Chinese with uncertain sentence boundaries. It relies on the definition of hard (certain) and soft (uncertain) punctuation delimiters, the latter being possibly ignored to optimize the alignment result. The alignment method is used in combination with lexicons automatically generated from the input article pairs using pivot-based MT, achieving better coverage of the input words with fewer entries than pre-existing dictionaries. Pivot-based MT makes it possible to build dictionaries for language pairs that have scarce parallel data. The alignment method is implemented in a tool that will be freely available in the near future.
Tasks Boundary Detection, Machine Translation
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1348/
PDF https://www.aclweb.org/anthology/L16-1348
PWC https://paperswithcode.com/paper/simultaneous-sentence-boundary-detection-and
Repo
Framework
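
The hard/soft delimiter idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' tool: the delimiter sets and the function name `candidate_segmentations` are assumptions made for illustration, and a real aligner would score candidates against the other language rather than enumerate them exhaustively (enumeration grows exponentially with the number of soft delimiters).

```python
from itertools import product

# Assumed delimiter sets, chosen for illustration only.
HARD = set("。！？")  # always end a sentence
SOFT = set("，；")    # may or may not end a sentence


def candidate_segmentations(text):
    """Yield every segmentation of text: hard delimiters always split,
    each soft delimiter is either used as a boundary or ignored."""
    hard_positions = {i for i, ch in enumerate(text) if ch in HARD}
    soft_positions = [i for i, ch in enumerate(text) if ch in SOFT]
    for choice in product([False, True], repeat=len(soft_positions)):
        split_at = set(hard_positions)
        split_at |= {p for p, use in zip(soft_positions, choice) if use}
        segments, start = [], 0
        for i in sorted(split_at):
            segments.append(text[start:i + 1])
            start = i + 1
        if start < len(text):
            segments.append(text[start:])  # trailing text without a delimiter
        yield segments


if __name__ == "__main__":
    for segmentation in candidate_segmentations("A，B。C。"):
        print(segmentation)
```

An alignment method in this spirit would keep whichever candidate segmentation aligns best with the sentences of the translated article.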

Effects of Sampling on Twitter Trend Detection

Title Effects of Sampling on Twitter Trend Detection
Authors Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder
Abstract Much research has focused on detecting trends on Twitter, including health-related trends such as mentions of Influenza-like illnesses or their symptoms. The majority of this research has been conducted using Twitter’s public feed, which includes only about 1% of all public tweets. It is unclear if, when, and how using Twitter’s 1% feed has affected the evaluation of trend detection methods. In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection. We focus on using health-related trends to estimate the prevalence of Influenza-like illnesses based on tweets. We use ground truth obtained from the CDC and Google Flu Trends to explore how the prevalence estimates degrade when moving from a 100% to a 1% sample. We find that using the 1% sample is unlikely to substantially harm ILI estimates made at the national level, but can cause poor performance when estimates are made at the city level.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1479/
PDF https://www.aclweb.org/anthology/L16-1479
PWC https://paperswithcode.com/paper/effects-of-sampling-on-twitter-trend
Repo
Framework
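
One way to see why a 1% sample might matter more at the city level than at the national level is a back-of-the-envelope calculation. The sketch below is not the paper's evaluation: it assumes each tweet is kept independently with probability `rate`, and the tweet counts are hypothetical. Under that assumption the sampled count is binomial, so scaling it back up by `1 / rate` gives an estimate whose relative standard error is `sqrt((1 - rate) / (rate * true_count))`.

```python
import math


def relative_standard_error(true_count, rate):
    """Relative standard error of the scaled-up count estimate when each of
    true_count tweets is kept independently with probability rate."""
    return math.sqrt((1.0 - rate) / (rate * true_count))


if __name__ == "__main__":
    # Hypothetical weekly counts of ILI-related tweets.
    national = relative_standard_error(100_000, 0.01)
    city = relative_standard_error(500, 0.01)
    print(f"national: {national:.3f}, city: {city:.3f}")
```

With these hypothetical counts, the sampling noise is a few percent at the national level but tens of percent at the city level, which is consistent in direction with the abstract's finding.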

Tense and Aspect in Runyankore Using a Context-Free Grammar

Title Tense and Aspect in Runyankore Using a Context-Free Grammar
Authors Joan Byamugisha, C. Maria Keet, Brian DeRenzi
Abstract
Tasks Text Generation
Published 2016-09-01
URL https://www.aclweb.org/anthology/W16-6614/
PDF https://www.aclweb.org/anthology/W16-6614
PWC https://paperswithcode.com/paper/tense-and-aspect-in-runyankore-using-a
Repo
Framework

SemEval-2016 Task 5: Aspect Based Sentiment Analysis

Title SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Authors Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeniy Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, Gülşen Eryiğit
Abstract
Tasks Aspect-Based Sentiment Analysis, Coreference Resolution, Decision Making, Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1002/
PDF https://www.aclweb.org/anthology/S16-1002
PWC https://paperswithcode.com/paper/semeval-2016-task-5-aspect-based-sentiment
Repo
Framework

NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features

Title NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features
Authors Zhiqiang Toh, Jian Su
Abstract
Tasks Aspect-Based Sentiment Analysis, Opinion Mining, Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1045/
PDF https://www.aclweb.org/anthology/S16-1045
PWC https://paperswithcode.com/paper/nlangp-at-semeval-2016-task-5-improving
Repo
Framework

Big Community Data before World Wide Web Era

Title Big Community Data before World Wide Web Era
Authors Tomoya Iwakura, Tetsuro Takahashi, Akihiro Ohtani, Kunio Matsui
Abstract This paper introduces the NIFTY-Serve corpus, a large data archive collected from Japanese discussion forums that operated via a Bulletin Board System (BBS) between 1987 and 2006. This corpus can be used in Artificial Intelligence research such as Natural Language Processing, Community Analysis, and so on. The NIFTY-Serve corpus differs from data on the WWW in three ways: (1) it is essentially spam- and duplication-free because of strict data collection procedures, (2) it is historic user-generated data from before the WWW, and (3) it is a complete data set because the service has now shut down. We also introduce some examples of use of the corpus.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-5408/
PDF https://www.aclweb.org/anthology/W16-5408
PWC https://paperswithcode.com/paper/big-community-data-before-world-wide-web-era
Repo
Framework

Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German

Title Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German
Authors Maria Sukhareva, Christian Chiarcos
Abstract In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties for the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: Instead of training and evaluating against a given tagset, we decompose it into independent features which are predicted independently by a neural network. Using consistency constraints (axioms) from an ontology, then, the predicted feature probabilities are decoded into a sound ontological representation. Using these representations, we can finally bootstrap a POS tagset capturing only morphosyntactic features which could be reliably predicted. In this way, our approach is capable of optimizing precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset rather than performing iterative cycles.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1234/
PDF https://www.aclweb.org/anthology/L16-1234
PWC https://paperswithcode.com/paper/combining-ontologies-and-neural-networks-for
Repo
Framework
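
The decoding step described in the abstract above, where independently predicted feature probabilities are combined under consistency constraints, can be sketched roughly as follows. This is an illustrative assumption, not the authors' implementation: the feature names, the probabilities, the single constraint, and the function names `decode` and `toy_axiom` are all hypothetical, and a plain Python predicate stands in for ontology axioms.

```python
from itertools import product


def decode(feature_probs, is_consistent):
    """Return the most probable joint feature assignment that satisfies
    is_consistent, treating features as independent (scores multiply)."""
    names = list(feature_probs)
    best, best_score = None, 0.0
    for values in product(*(feature_probs[n] for n in names)):
        assignment = dict(zip(names, values))
        if not is_consistent(assignment):
            continue  # rejected by the (stand-in) ontology constraint
        score = 1.0
        for name, value in assignment.items():
            score *= feature_probs[name][value]
        if score > best_score:
            best, best_score = assignment, score
    return best


def toy_axiom(assignment):
    """Hypothetical constraint: verbs carry a tense, nouns do not."""
    return (assignment["pos"] == "verb") == (assignment["tense"] != "none")


if __name__ == "__main__":
    # Hypothetical per-feature probabilities from independent predictors.
    probs = {
        "pos": {"noun": 0.6, "verb": 0.4},
        "tense": {"past": 0.7, "none": 0.3},
    }
    print(decode(probs, toy_axiom))
```

In this toy case the individually most likely values (noun, past) contradict each other, so the constraint steers decoding to a consistent alternative.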

Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task

Title Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task
Authors Juri Opitz
Abstract
Tasks Document Classification, Information Retrieval, Text Categorization
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0320/
PDF https://www.aclweb.org/anthology/W16-0320
PWC https://paperswithcode.com/paper/using-linear-classifiers-for-the-automatic
Repo
Framework

QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification

Title QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Authors Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani
Abstract
Tasks Sentiment Analysis
Published 2016-06-01
URL https://www.aclweb.org/anthology/S16-1006/
PDF https://www.aclweb.org/anthology/S16-1006
PWC https://paperswithcode.com/paper/qcri-at-semeval-2016-task-4-probabilistic
Repo
Framework

A Framework for Mining Enterprise Risk and Risk Factors from News Documents

Title A Framework for Mining Enterprise Risk and Risk Factors from News Documents
Authors Tirthankar Dasgupta, Lipika Dey, Prasenjit Dey, Rupsa Saha
Abstract Any real-world event or trend that can affect a company’s growth trajectory can be considered a risk. There has been a growing need to automatically identify, extract and analyze risk-related statements from news events. In this demonstration, we will present a risk analytics framework that processes enterprise project management reports in the form of textual data and news documents and classifies them into valid and invalid risk categories. The framework also extracts information from the text pertaining to the different categories of risks, such as their possible causes and impacts. Accordingly, we have used machine learning based techniques and studied different linguistic features like n-gram, POS, dependency, future timing, uncertainty factors in texts and their various combinations. A manual annotation study from management experts using risk descriptions collected for a specific organization was conducted to evaluate the framework. The evaluation showed promising results for automated risk analysis and identification.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2038/
PDF https://www.aclweb.org/anthology/C16-2038
PWC https://paperswithcode.com/paper/a-framework-for-mining-enterprise-risk-and
Repo
Framework

Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations

Title Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations
Authors Ichiro Umata, Koki Ijuin, Mitsuru Ishida, Moe Takeuchi, Seiichi Yamamoto
Abstract The listener’s gazing activities during utterances were analyzed in a face-to-face three-party conversation setting. The function of each utterance was categorized according to the Grounding Acts defined by Traum (Traum, 1994) so that gazes during utterances could be analyzed from the viewpoint of grounding in communication (Clark, 1996). Quantitative analysis showed that the listeners were gazing at the speakers more in the second language (L2) conversation than in the native language (L1) conversation during the utterances that added new pieces of information, suggesting that they are using visual information to compensate for their lack of linguistic proficiency in L2 conversation.
Tasks
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1673/
PDF https://www.aclweb.org/anthology/L16-1673
PWC https://paperswithcode.com/paper/quantitative-analysis-of-gazes-and-grounding
Repo
Framework