Paper Group NANR 176
SemEval-2016 Task 8: Meaning Representation Parsing. Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity. Blind Attacks on Machine Learners. Select-and-Sample for Spike-and-Slab Sparse Coding. Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons. Effects of Sampling on Tw …
SemEval-2016 Task 8: Meaning Representation Parsing
Title | SemEval-2016 Task 8: Meaning Representation Parsing |
Authors | Jonathan May |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1166/ |
https://www.aclweb.org/anthology/S16-1166 | |
PWC | https://paperswithcode.com/paper/semeval-2016-task-8-meaning-representation |
Repo | |
Framework | |
Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity
Title | Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity |
Authors | Peyman Passban, Chris Hokamp, Andy Way, Qun Liu |
Abstract | |
Tasks | Language Modelling, Machine Translation |
Published | 2016-01-01 |
URL | https://www.aclweb.org/anthology/W16-3403/ |
https://www.aclweb.org/anthology/W16-3403 | |
PWC | https://paperswithcode.com/paper/improving-phrase-based-smt-using-cross |
Repo | |
Framework | |
Blind Attacks on Machine Learners
Title | Blind Attacks on Machine Learners |
Authors | Alex Beatson, Zhaoran Wang, Han Liu |
Abstract | The importance of studying the robustness of learners to malicious data is well established. While much work has been done establishing both robust estimators and effective data injection attacks when the attacker is omniscient, the ability of an attacker to provably harm learning while having access to little information is largely unstudied. We study the potential of a “blind attacker” to provably limit a learner’s performance by data injection attack without observing the learner’s training set or any parameter of the distribution from which it is drawn. We provide examples of simple yet effective attacks in two settings: firstly, where an “informed learner” knows the strategy chosen by the attacker, and secondly, where a “blind learner” knows only the proportion of malicious data and some family to which the malicious distribution chosen by the attacker belongs. For each attack, we analyze minimax rates of convergence and establish lower bounds on the learner’s minimax risk, exhibiting limits on a learner’s ability to learn under data injection attack even when the attacker is “blind”. |
Tasks | |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners |
http://papers.nips.cc/paper/6482-blind-attacks-on-machine-learners.pdf | |
PWC | https://paperswithcode.com/paper/blind-attacks-on-machine-learners |
Repo | |
Framework | |
Select-and-Sample for Spike-and-Slab Sparse Coding
Title | Select-and-Sample for Spike-and-Slab Sparse Coding |
Authors | Abdul-Saboor Sheikh, Jörg Lücke |
Abstract | Probabilistic inference serves as a popular model for neural processing. It is still unclear, however, how approximate probabilistic inference can be accurate and scalable to very high-dimensional continuous latent spaces, especially as typical posteriors for sensory data can be expected to exhibit complex latent dependencies including multiple modes. Here, we study an approach that can efficiently be scaled while maintaining a richly structured posterior approximation under these conditions. As an example model, we use spike-and-slab sparse coding for V1 processing, and combine latent subspace selection with Gibbs sampling (select-and-sample). Unlike factored variational approaches, the method can maintain large numbers of posterior modes and complex latent dependencies. Unlike pure sampling, the method is scalable to very high-dimensional latent spaces. Among all sparse coding approaches with non-trivial posterior approximations (MAP or ICA-like models), we report the largest-scale results. In applications we firstly verify the approach by showing competitiveness in standard denoising benchmarks. Secondly, we use its scalability to, for the first time, study highly-overcomplete settings for V1 encoding using sophisticated posterior representations. More generally, our study shows that very accurate probabilistic inference for multi-modal posteriors with complex dependencies is tractable, functionally desirable and consistent with models for neural inference. |
Tasks | Denoising |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding |
http://papers.nips.cc/paper/6276-select-and-sample-for-spike-and-slab-sparse-coding.pdf | |
PWC | https://paperswithcode.com/paper/select-and-sample-for-spike-and-slab-sparse |
Repo | |
Framework | |
Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons
Title | Simultaneous Sentence Boundary Detection and Alignment with Pivot-based Machine Translation Generated Lexicons |
Authors | Antoine Bourlon, Chenhui Chu, Toshiaki Nakazawa, Sadao Kurohashi |
Abstract | Sentence alignment is a task that consists in aligning the parallel sentences in a translated article pair. This paper describes a method to perform sentence boundary detection and alignment simultaneously, which significantly improves the alignment accuracy on languages like Chinese with uncertain sentence boundaries. It relies on the definition of hard (certain) and soft (uncertain) punctuation delimiters, the latter being possibly ignored to optimize the alignment result. The alignment method is used in combination with lexicons automatically generated from the input article pairs using pivot-based MT, achieving better coverage of the input words with fewer entries than pre-existing dictionaries. Pivot-based MT makes it possible to build dictionaries for language pairs that have scarce parallel data. The alignment method is implemented in a tool that will be freely available in the near future. |
Tasks | Boundary Detection, Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1348/ |
https://www.aclweb.org/anthology/L16-1348 | |
PWC | https://paperswithcode.com/paper/simultaneous-sentence-boundary-detection-and |
Repo | |
Framework | |
Effects of Sampling on Twitter Trend Detection
Title | Effects of Sampling on Twitter Trend Detection |
Authors | Andrew Yates, Alek Kolcz, Nazli Goharian, Ophir Frieder |
Abstract | Much research has focused on detecting trends on Twitter, including health-related trends such as mentions of Influenza-like illnesses or their symptoms. The majority of this research has been conducted using Twitter's public feed, which includes only about 1% of all public tweets. It is unclear if, when, and how using Twitter's 1% feed has affected the evaluation of trend detection methods. In this work we use a larger feed to investigate the effects of sampling on Twitter trend detection. We focus on using health-related trends to estimate the prevalence of Influenza-like illnesses based on tweets. We use ground truth obtained from the CDC and Google Flu Trends to explore how the prevalence estimates degrade when moving from a 100% to a 1% sample. We find that using the 1% sample is unlikely to substantially harm ILI estimates made at the national level, but can cause poor performance when estimates are made at the city level. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1479/ |
https://www.aclweb.org/anthology/L16-1479 | |
PWC | https://paperswithcode.com/paper/effects-of-sampling-on-twitter-trend |
Repo | |
Framework | |
Tense and Aspect in Runyankore Using a Context-Free Grammar
Title | Tense and Aspect in Runyankore Using a Context-Free Grammar |
Authors | Joan Byamugisha, C. Maria Keet, Brian DeRenzi |
Abstract | |
Tasks | Text Generation |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-6614/ |
https://www.aclweb.org/anthology/W16-6614 | |
PWC | https://paperswithcode.com/paper/tense-and-aspect-in-runyankore-using-a |
Repo | |
Framework | |
SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Title | SemEval-2016 Task 5: Aspect Based Sentiment Analysis |
Authors | Maria Pontiki, Dimitris Galanis, Haris Papageorgiou, Ion Androutsopoulos, Suresh Manandhar, Mohammad AL-Smadi, Mahmoud Al-Ayyoub, Yanyan Zhao, Bing Qin, Orphée De Clercq, Véronique Hoste, Marianna Apidianaki, Xavier Tannier, Natalia Loukachevitch, Evgeniy Kotelnikov, Nuria Bel, Salud María Jiménez-Zafra, Gülşen Eryiğit |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Coreference Resolution, Decision Making, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1002/ |
https://www.aclweb.org/anthology/S16-1002 | |
PWC | https://paperswithcode.com/paper/semeval-2016-task-5-aspect-based-sentiment |
Repo | |
Framework | |
NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features
Title | NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis using Neural Network Features |
Authors | Zhiqiang Toh, Jian Su |
Abstract | |
Tasks | Aspect-Based Sentiment Analysis, Opinion Mining, Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1045/ |
https://www.aclweb.org/anthology/S16-1045 | |
PWC | https://paperswithcode.com/paper/nlangp-at-semeval-2016-task-5-improving |
Repo | |
Framework | |
Big Community Data before World Wide Web Era
Title | Big Community Data before World Wide Web Era |
Authors | Tomoya Iwakura, Tetsuro Takahashi, Akihiro Ohtani, Kunio Matsui |
Abstract | This paper introduces the NIFTY-Serve corpus, a large data archive collected from Japanese discussion forums that operated via a Bulletin Board System (BBS) between 1987 and 2006. This corpus can be used in Artificial Intelligence research such as Natural Language Processing, Community Analysis, and so on. The NIFTY-Serve corpus differs from data on the WWW in three ways: (1) it is essentially spam- and duplication-free because of strict data collection procedures, (2) it contains historic user-generated data from before the WWW, and (3) it is a complete data set because the service has now shut down. We also introduce some examples of use of the corpus. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-5408/ |
https://www.aclweb.org/anthology/W16-5408 | |
PWC | https://paperswithcode.com/paper/big-community-data-before-world-wide-web-era |
Repo | |
Framework | |
Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German
Title | Combining Ontologies and Neural Networks for Analyzing Historical Language Varieties. A Case Study in Middle Low German |
Authors | Maria Sukhareva, Christian Chiarcos |
Abstract | In this paper, we describe experiments on the morphosyntactic annotation of historical language varieties for the example of Middle Low German (MLG), the official language of the German Hanse during the Middle Ages and a dominant language around the Baltic Sea at the time. To the best of our knowledge, this is the first experiment in automatically producing morphosyntactic annotations for Middle Low German, and accordingly, no part-of-speech (POS) tagset is currently agreed upon. In our experiment, we illustrate how ontology-based specifications of projected annotations can be employed to circumvent this issue: Instead of training and evaluating against a given tagset, we decompose it into independent features which are predicted independently by a neural network. Using consistency constraints (axioms) from an ontology, then, the predicted feature probabilities are decoded into a sound ontological representation. Using these representations, we can finally bootstrap a POS tagset capturing only morphosyntactic features which could be reliably predicted. In this way, our approach is capable of optimizing precision and recall of morphosyntactic annotations simultaneously with bootstrapping a tagset rather than performing iterative cycles. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1234/ |
https://www.aclweb.org/anthology/L16-1234 | |
PWC | https://paperswithcode.com/paper/combining-ontologies-and-neural-networks-for |
Repo | |
Framework | |
Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task
Title | Using Linear Classifiers for the Automatic Triage of Posts in the 2016 CLPsych Shared Task |
Authors | Juri Opitz |
Abstract | |
Tasks | Document Classification, Information Retrieval, Text Categorization |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/W16-0320/ |
https://www.aclweb.org/anthology/W16-0320 | |
PWC | https://paperswithcode.com/paper/using-linear-classifiers-for-the-automatic |
Repo | |
Framework | |
QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Title | QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification |
Authors | Giovanni Da San Martino, Wei Gao, Fabrizio Sebastiani |
Abstract | |
Tasks | Sentiment Analysis |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1006/ |
https://www.aclweb.org/anthology/S16-1006 | |
PWC | https://paperswithcode.com/paper/qcri-at-semeval-2016-task-4-probabilistic |
Repo | |
Framework | |
A Framework for Mining Enterprise Risk and Risk Factors from News Documents
Title | A Framework for Mining Enterprise Risk and Risk Factors from News Documents |
Authors | Tirthankar Dasgupta, Lipika Dey, Prasenjit Dey, Rupsa Saha |
Abstract | Any real-world event or trend that can affect a company's growth trajectory can be considered a risk. There has been a growing need to automatically identify, extract and analyze risk-related statements from news events. In this demonstration, we will present a risk analytics framework that processes enterprise project management reports in the form of textual data and news documents and classifies them into valid and invalid risk categories. The framework also extracts information from the text pertaining to the different categories of risks, like their possible causes and impacts. Accordingly, we have used machine learning based techniques and studied different linguistic features like n-gram, POS, dependency, future timing, uncertainty factors in texts and their various combinations. A manual annotation study from management experts using risk descriptions collected for a specific organization was conducted to evaluate the framework. The evaluation showed promising results for automated risk analysis and identification. |
Tasks | |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/C16-2038/ |
https://www.aclweb.org/anthology/C16-2038 | |
PWC | https://paperswithcode.com/paper/a-framework-for-mining-enterprise-risk-and |
Repo | |
Framework | |
Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations
Title | Quantitative Analysis of Gazes and Grounding Acts in L1 and L2 Conversations |
Authors | Ichiro Umata, Koki Ijuin, Mitsuru Ishida, Moe Takeuchi, Seiichi Yamamoto |
Abstract | The listener's gazing activities during utterances were analyzed in a face-to-face three-party conversation setting. The function of each utterance was categorized according to the Grounding Acts defined by Traum (Traum, 1994) so that gazes during utterances could be analyzed from the viewpoint of grounding in communication (Clark, 1996). Quantitative analysis showed that the listeners were gazing at the speakers more in the second language (L2) conversation than in the native language (L1) conversation during the utterances that added new pieces of information, suggesting that they are using visual information to compensate for their lack of linguistic proficiency in L2 conversation. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1673/ |
https://www.aclweb.org/anthology/L16-1673 | |
PWC | https://paperswithcode.com/paper/quantitative-analysis-of-gazes-and-grounding |
Repo | |
Framework | |