May 4, 2019

Paper Group NANR 176

SemEval-2016 Task 8: Meaning Representation Parsing


Title	SemEval-2016 Task 8: Meaning Representation Parsing
Authors	Jonathan May
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1166/
PDF	https://www.aclweb.org/anthology/S16-1166
PWC	https://paperswithcode.com/paper/semeval-2016-task-8-meaning-representation
Repo
Framework

Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity


Title	Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity
Authors	Peyman Passban, Chris Hokamp, Andy Way, Qun Liu
Abstract
Tasks	Language Modelling, Machine Translation
Published	2016-01-01
URL	https://www.aclweb.org/anthology/W16-3403/
PDF	https://www.aclweb.org/anthology/W16-3403
PWC	https://paperswithcode.com/paper/improving-phrase-based-smt-using-cross
Repo
Framework

Blind Attacks on Machine Learners


Title	Blind Attacks on Machine Learners
Authors	Alex Beatson, Zhaoran Wang, Han Liu
Abstract	The importance of studying the robustness of learners to malicious data is well established. While much work has been done establishing both robust estimators and effective data injection attacks when the attacker is omniscient, the ability of an attacker to provably harm learning while having access to little information is largely unstudied. We study the potential of a “blind attacker” to provably limit a learner’s performance by data injection attack without observing the learner’s training set or any parameter of the distribution from which it is drawn. We provide examples of simple yet effective attacks in two settings: firstly, where an “informed learner” knows the strategy chosen by the attacker, and secondly, where a “blind learner” knows only the proportion of malicious data and some family to which the malicious distribution chosen by the attacker belongs. For each attack, we analyze minimax rates of convergence and establish lower bounds on the learner’s minimax risk, exhibiting limits on a learner’s ability to learn under data injection attack even when the attacker is “blind”.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners
PDF	http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners.pdf
PWC	https://paperswithcode.com/paper/blind-attacks-on-machine-learners
Repo
Framework

Select-and-Sample for Spike-and-Slab Sparse Coding


Title	Select-and-Sample for Spike-and-Slab Sparse Coding
Authors	Abdul-Saboor Sheikh, Jörg Lücke
Abstract	Probabilistic inference serves as a popular model for neural processing. It is still unclear, however, how approximate probabilistic inference can be accurate and scalable to very high-dimensional continuous latent spaces, especially as typical posteriors for sensory data can be expected to exhibit complex latent dependencies, including multiple modes. Here, we study an approach that can efficiently be scaled while maintaining a richly structured posterior approximation under these conditions. As an example model, we use spike-and-slab sparse coding for V1 processing, and combine latent subspace selection with Gibbs sampling (select-and-sample). Unlike factored variational approaches, the method can maintain large numbers of posterior modes and complex latent dependencies. Unlike pure sampling, the method is scalable to very high-dimensional latent spaces. Among all sparse coding approaches with non-trivial posterior approximations (MAP or ICA-like models), we report the largest-scale results. In applications, we first verify the approach by showing competitiveness in standard denoising benchmarks. Secondly, we use its scalability to, for the first time, study highly-overcomplete settings for V1 encoding using sophisticated posterior representations. More generally, our study shows that very accurate probabilistic inference for multi-modal posteriors with complex dependencies is tractable, functionally desirable and consistent with models for neural inference.
Tasks	Denoising
Published	2016-12-01
URL	http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding
PDF	http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding.pdf
PWC	https://paperswithcode.com/paper/select-and-sample-for-spike-and-slab-sparse
Repo
Framework

Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons


Title	Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons
Authors	Antoine Bourlon, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi
Abstract	Sentence alignment is a task that consists in aligning the parallel sentences in a translated article pair. This paper describes a method to perform sentence boundary detection and alignment simultaneously, which significantly improves the alignment accuracy on languages like Chinese with uncertain sentence boundaries. It relies on the definition of hard (certain) and soft (uncertain) punctuation delimiters, the latter being possibly ignored to optimize the alignment result. The alignment method is used in combination with lexicons automatically generated from the input article pairs using pivot-based MT, achieving better coverage of the input words with fewer entries than pre-existing dictionaries. Pivot-based MT makes it possible to build dictionaries for language pairs that have scarce parallel data. The alignment method is implemented in a tool that will be freely available in the near future.
Tasks	Boundary Detection, Machine Translation
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1348/
PDF	https://www.aclweb.org/anthology/L16-1348
PWC	https://paperswithcode.com/paper/simultaneous-sentence-boundary-detection-and
Repo
Framework

Effects of Sampling on Twitter Trend Detection


Title	Effects of Sampling on Twitter Trend Detection
Authors	Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder
Abstract	Much research has focused on detecting trends on Twitter, including health-related trends such as mentions of Influenza-like illnesses or their symptoms. The majority of this research has been conducted using Twitter's public feed, which includes only about 1% of all public tweets. It is unclear if, when, and how using Twitter's 1% feed has affected the evaluation of trend detection methods. In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection. We focus on using health-related trends to estimate the prevalence of Influenza-like illnesses based on tweets. We use ground truth obtained from the CDC and Google Flu Trends to explore how the prevalence estimates degrade when moving from a 100% to a 1% sample. We find that using the 1% sample is unlikely to substantially harm ILI estimates made at the national level, but can cause poor performance when estimates are made at the city level.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1479/
PDF	https://www.aclweb.org/anthology/L16-1479
PWC	https://paperswithcode.com/paper/effects-of-sampling-on-twitter-trend
Repo
Framework

Tense and Aspect in Runyankore Using a Context-Free Grammar


Title	Tense and Aspect in Runyankore Using a Context-Free Grammar
Authors	Joan Byamugisha, C. Maria Keet, Brian DeRenzi
Abstract
Tasks	Text Generation
Published	2016-09-01
URL	https://www.aclweb.org/anthology/W16-6614/
PDF	https://www.aclweb.org/anthology/W16-6614
PWC	https://paperswithcode.com/paper/tense-and-aspect-in-runyankore-using-a
Repo
Framework

SemEval-2016 Task 5: Aspect Based Sentiment Analysis


Title	SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Authors	Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeniy Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, Gülşen Eryiğit
Abstract
Tasks	Aspect-Based Sentiment Analysis, Coreference Resolution, Decision Making, Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1002/
PDF	https://www.aclweb.org/anthology/S16-1002
PWC	https://paperswithcode.com/paper/semeval-2016-task-5-aspect-based-sentiment
Repo
Framework

NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features


Title	NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features
Authors	Zhiqiang Toh, Jian Su
Abstract
Tasks	Aspect-Based Sentiment Analysis, Opinion Mining, Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1045/
PDF	https://www.aclweb.org/anthology/S16-1045
PWC	https://paperswithcode.com/paper/nlangp-at-semeval-2016-task-5-improving
Repo
Framework

Big Community Data before World Wide Web Era


Title	Big Community Data before World Wide Web Era
Authors	Tomoya Iwakura, Tetsuro Takahashi, Akihiro Ohtani, Kunio Matsui
Abstract	This paper introduces the NIFTY-Serve corpus, a large data archive collected from Japanese discussion forums that operated via a Bulletin Board System (BBS) between 1987 and 2006. This corpus can be used in Artificial Intelligence research such as Natural Language Processing, Community Analysis, and so on. The NIFTY-Serve corpus differs from data on the WWW in three ways: (1) it is essentially spam- and duplication-free because of strict data collection procedures, (2) it is historic user-generated data from before the WWW, and (3) it is a complete data set because the service has now shut down. We also introduce some examples of use of the corpus.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-5408/
PDF	https://www.aclweb.org/anthology/W16-5408
PWC	https://paperswithcode.com/paper/big-community-data-before-world-wide-web-era
Repo
Framework

Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German


Title	Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German
Authors	Maria Sukhareva, Christian Chiarcos
Abstract	In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties for the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: Instead of training and evaluating against a given tagset, we decompose it into independent features which are predicted independently by a neural network. Using consistency constraints (axioms) from an ontology, then, the predicted feature probabilities are decoded into a sound ontological representation. Using these representations, we can finally bootstrap a POS tagset capturing only morphosyntactic features which could be reliably predicted. In this way, our approach is capable of optimizing precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset rather than performing iterative cycles.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1234/
PDF	https://www.aclweb.org/anthology/L16-1234
PWC	https://paperswithcode.com/paper/combining-ontologies-and-neural-networks-for
Repo
Framework

Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task


Title	Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task
Authors	Juri Opitz
Abstract
Tasks	Document Classification, Information Retrieval, Text Categorization
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0320/
PDF	https://www.aclweb.org/anthology/W16-0320
PWC	https://paperswithcode.com/paper/using-linear-classifiers-for-the-automatic
Repo
Framework

QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification


Title	QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Authors	Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani
Abstract
Tasks	Sentiment Analysis
Published	2016-06-01
URL	https://www.aclweb.org/anthology/S16-1006/
PDF	https://www.aclweb.org/anthology/S16-1006
PWC	https://paperswithcode.com/paper/qcri-at-semeval-2016-task-4-probabilistic
Repo
Framework

A Framework for Mining Enterprise Risk and Risk Factors from News Documents


Title	A Framework for Mining Enterprise Risk and Risk Factors from News Documents
Authors	Tirthankar Dasgupta, Lipika Dey, Prasenjit Dey, Rupsa Saha
Abstract	Any real-world events or trends that can affect the company's growth trajectory can be considered as risk. There has been a growing need to automatically identify, extract and analyze risk-related statements from news events. In this demonstration, we will present a risk analytics framework that processes enterprise project management reports in the form of textual data and news documents and classifies them into valid and invalid risk categories. The framework also extracts information from the text pertaining to the different categories of risks, such as their possible causes and impacts. Accordingly, we have used machine learning based techniques and studied different linguistic features such as n-grams, POS, dependency, future timing, and uncertainty factors in texts, and their various combinations. A manual annotation study from management experts using risk descriptions collected for a specific organization was conducted to evaluate the framework. The evaluation showed promising results for automated risk analysis and identification.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2038/
PDF	https://www.aclweb.org/anthology/C16-2038
PWC	https://paperswithcode.com/paper/a-framework-for-mining-enterprise-risk-and
Repo
Framework

Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations


Title	Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations
Authors	Ichiro Umata, Koki Ijuin, Mitsuru Ishida, Moe Takeuchi, Seiichi Yamamoto
Abstract	The listener's gazing activities during utterances were analyzed in a face-to-face three-party conversation setting. The function of each utterance was categorized according to the Grounding Acts defined by Traum (Traum, 1994) so that gazes during utterances could be analyzed from the viewpoint of grounding in communication (Clark, 1996). Quantitative analysis showed that the listeners were gazing at the speakers more in the second language (L2) conversation than in the native language (L1) conversation during the utterances that added new pieces of information, suggesting that they are using visual information to compensate for their lack of linguistic proficiency in L2 conversation.
Tasks
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1673/
PDF	https://www.aclweb.org/anthology/L16-1673
PWC	https://paperswithcode.com/paper/quantitative-analysis-of-gazes-and-grounding
Repo
Framework