May 5, 2019

Paper Group NANR 127

Word Segmentation in Sanskrit Using Path Constrained Random Walks


Title	Word Segmentation in Sanskrit Using Path Constrained Random Walks
Authors	Amrith Krishna, Bishal Santra, Pavankumar Satuluri, Sasi Prasanth Bandaru, Bhumi Faldu, Yajuvendra Singh, Pawan Goyal
Abstract	In Sanskrit, the phonemes at word boundaries undergo changes to form new phonemes through a process called sandhi. A fused sentence can be segmented into multiple possible segmentations. We propose a word segmentation approach that predicts the most semantically valid segmentation for a given sentence. We treat the problem as a query expansion problem and use the path-constrained random walks framework to predict the correct segments.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1048/
PDF	https://www.aclweb.org/anthology/C16-1048
PWC	https://paperswithcode.com/paper/word-segmentation-in-sanskrit-using-path
Repo
Framework
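
The abstract above names the path-constrained random walks framework. As a rough illustration of the general idea only (not the authors' segmentation model), the sketch below follows one fixed sequence of relation types through a typed graph and spreads probability uniformly over the neighbours reachable at each step. The function name, the graph encoding, and the node and relation names are all hypothetical.

```python
from collections import defaultdict


def path_constrained_walk(edges, start, path):
    """Return the probability of reaching each node from `start` when the
    walk must follow the relation types in `path`, in order.

    edges: dict mapping (node, relation) -> list of neighbour nodes
           (an assumed encoding, chosen for brevity).
    """
    dist = {start: 1.0}
    for relation in path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbours = edges.get((node, relation), [])
            # Split the mass at `node` uniformly over its typed neighbours;
            # mass at nodes with no matching edge is dropped.
            for nb in neighbours:
                nxt[nb] += prob / len(neighbours)
        dist = dict(nxt)
    return dist


# Toy graph with made-up nodes and relations.
toy_edges = {
    ("a", "r1"): ["b", "c"],
    ("b", "r2"): ["d"],
    ("c", "r2"): ["d", "e"],
}
scores = path_constrained_walk(toy_edges, "a", ["r1", "r2"])
```

In a ranking setting, scores from several such paths would typically be combined with learned weights; how the paper does this is not stated in the abstract.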

Applying Core Scientific Concepts to Context-Based Citation Recommendation


Title	Applying Core Scientific Concepts to Context-Based Citation Recommendation
Authors	Daniel Duma, Maria Liakata, Amanda Clare, James Ravenscroft, Ewan Klein
Abstract	The task of recommending relevant scientific literature for a draft academic paper has recently received significant interest. In our effort to ease the discovery of scientific literature and augment scientific writing, we aim to improve the relevance of results based on a shallow semantic analysis of the source document and the potential documents to recommend. We investigate the utility of automatic argumentative and rhetorical annotation of documents for this purpose. Specifically, we integrate automatic Core Scientific Concepts (CoreSC) classification into a prototype context-based citation recommendation system and investigate its usefulness to the task. We frame citation recommendation as an information retrieval task and we use the categories of the annotation schemes to apply different weights to the similarity formula. Our results show interesting and consistent correlations between the type of citation and the type of sentence containing the relevant information.
Tasks	Information Retrieval
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1274/
PDF	https://www.aclweb.org/anthology/L16-1274
PWC	https://paperswithcode.com/paper/applying-core-scientific-concepts-to-context
Repo
Framework

Multistage Campaigning in Social Networks


Title	Multistage Campaigning in Social Networks
Authors	Mehrdad Farajtabar, Xiaojing Ye, Sahar Harati, Le Song, Hongyuan Zha
Abstract	We consider control problems for multi-stage campaigning over social networks. The dynamic programming framework is employed to balance the high present reward and large penalty on low future outcome in the presence of extensive uncertainties. In particular, we establish theoretical foundations of optimal campaigning over social networks where the user activities are modeled as a multivariate Hawkes process, and we derive a time dependent linear relation between the intensity of exogenous events and several commonly used objective functions of campaigning. We further develop a convex dynamic programming framework for determining the optimal intervention policy that prescribes the required level of external drive at each stage for the desired campaigning result. Experiments on both synthetic data and the real-world MemeTracker dataset show that our algorithm can steer the user activities for optimal campaigning much more accurately than baselines.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6102-multistage-campaigning-in-social-networks
PDF	http://papers.nips.cc/paper/6102-multistage-campaigning-in-social-networks.pdf
PWC	https://paperswithcode.com/paper/multistage-campaigning-in-social-networks
Repo
Framework
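
The abstract above models user activity as a multivariate Hawkes process. A minimal sketch of the conditional intensity of such a process follows, assuming an exponential decay kernel (the abstract does not say which kernel the authors use). The names `mu` (exogenous base rates), `A` (influence matrix), and `omega` (decay rate) are illustrative choices, not taken from the paper.

```python
import math


def hawkes_intensity(t, mu, A, omega, events):
    """Intensity of each user at time t for a multivariate Hawkes process
    with an (assumed) exponential kernel:

        lambda_i(t) = mu[i] + sum over past events (t_k, j) of
                      A[i][j] * exp(-omega * (t - t_k))

    events: list of (time, user_index) pairs.
    """
    lam = list(mu)  # start from the exogenous base rates
    for t_k, j in events:
        if t_k < t:  # only events strictly in the past excite the process
            decay = math.exp(-omega * (t - t_k))
            for i in range(len(mu)):
                lam[i] += A[i][j] * decay
    return lam


# Toy example: user 1 posts at time 1.0 and excites user 0.
base_rates = [0.5, 0.2]
influence = [[0.0, 0.3],
             [0.4, 0.0]]
lam_at_2 = hawkes_intensity(2.0, base_rates, influence, 1.0, [(1.0, 1)])
```

The base rates `mu` are the exogenous part of the intensity, which is the quantity the abstract describes steering at each campaign stage.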

Subdialectal Differences in Sorani Kurdish


Title	Subdialectal Differences in Sorani Kurdish
Authors	Shervin Malmasi
Abstract	In this study we apply classification methods for detecting subdialectal differences in Sorani Kurdish texts produced in different regions, namely Iran and Iraq. As Sorani is a low-resource language, no corpus including texts from different regions was readily available. To this end, we identified data sources that could be leveraged for this task to create a dataset of 200,000 sentences. Using surface features, we attempted to classify Sorani subdialects, showing that sentences from news sources in Iraq and Iran are distinguishable with 96% accuracy. This is the first preliminary study for a dialect that has not been widely studied in computational linguistics, evidencing the possible existence of distinct subdialects.
Tasks	Information Retrieval, Language Identification, Machine Translation
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4812/
PDF	https://www.aclweb.org/anthology/W16-4812
PWC	https://paperswithcode.com/paper/subdialectal-differences-in-sorani-kurdish
Repo
Framework

iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking


Title	iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking
Authors	Ahmed Abdelali, Nadir Durrani, Francisco Guzmán
Abstract
Tasks	Eye Tracking, Machine Translation
Published	2016-06-01
URL	https://www.aclweb.org/anthology/N16-3004/
PDF	https://www.aclweb.org/anthology/N16-3004
PWC	https://paperswithcode.com/paper/iappraise-a-manual-machine-translation
Repo
Framework

Crowdsourcing for (almost) Real-time Question Answering


Title	Crowdsourcing for (almost) Real-time Question Answering
Authors	Denis Savenkov, Scott Weitzner, Eugene Agichtein
Abstract
Tasks	Community Question Answering, Question Answering
Published	2016-06-01
URL	https://www.aclweb.org/anthology/W16-0102/
PDF	https://www.aclweb.org/anthology/W16-0102
PWC	https://paperswithcode.com/paper/crowdsourcing-for-almost-real-time-question
Repo
Framework

Multilingual Information Extraction with PolyglotIE


Title	Multilingual Information Extraction with PolyglotIE
Authors	Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li, Huaiyu Zhu
Abstract	We present PolyglotIE, a web-based tool for developing extractors that perform Information Extraction (IE) over multilingual data. Our tool has two core features: First, it allows users to develop extractors against a unified abstraction that is shared across a large set of natural languages. This means that an extractor needs only be created once for one language, but will then run on multilingual data without any additional effort or language-specific knowledge on the part of the user. Second, it embeds this abstraction as a set of views within a declarative IE system, allowing users to quickly create extractors using a mature IE query language. We present PolyglotIE as a hands-on demo in which users can experiment with creating extractors, execute them on multilingual text and inspect extraction results. Using the UI, we discuss the challenges and potential of using unified, crosslingual semantic abstractions as a basis for downstream applications. We demonstrate multilingual IE for 9 languages from 4 different language groups: English, German, French, Spanish, Japanese, Chinese, Arabic, Russian and Hindi.
Tasks	Semantic Parsing
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-2056/
PDF	https://www.aclweb.org/anthology/C16-2056
PWC	https://paperswithcode.com/paper/multilingual-information-extraction-with
Repo
Framework

Letter Sequence Labeling for Compound Splitting


Title	Letter Sequence Labeling for Compound Splitting
Authors	Jianqiang Ma, Verena Henrich, Erhard Hinrichs
Abstract
Tasks	Machine Translation, Transfer Learning
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-2012/
PDF	https://www.aclweb.org/anthology/W16-2012
PWC	https://paperswithcode.com/paper/letter-sequence-labeling-for-compound
Repo
Framework

Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images


Title	Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images
Authors	Manuela Hürlimann, Johan Bos
Abstract
Tasks	Object Detection, Object Recognition, Relation Classification
Published	2016-08-01
URL	https://www.aclweb.org/anthology/W16-3202/
PDF	https://www.aclweb.org/anthology/W16-3202
PWC	https://paperswithcode.com/paper/combining-lexical-and-spatial-knowledge-to
Repo
Framework

CMA-ES with Optimal Covariance Update and Storage Complexity


Title	CMA-ES with Optimal Covariance Update and Storage Complexity
Authors	Oswin Krause, Dídac Rodríguez Arbonès, Christian Igel
Abstract	The covariance matrix adaptation evolution strategy (CMA-ES) is arguably one of the most powerful real-valued derivative-free optimization algorithms, finding many applications in machine learning. The CMA-ES is a Monte Carlo method, sampling from a sequence of multivariate Gaussian distributions. Given the function values at the sampled points, updating and storing the covariance matrix dominates the time and space complexity in each iteration of the algorithm. We propose a numerically stable quadratic-time covariance matrix update scheme with minimal memory requirements based on maintaining triangular Cholesky factors. This requires a modification of the cumulative step-size adaptation (CSA) mechanism in the CMA-ES, in which we replace the inverse of the square root of the covariance matrix by the inverse of the triangular Cholesky factor. Because the triangular Cholesky factor changes smoothly with the matrix square root, this modification does not change the behavior of the CMA-ES in terms of required objective function evaluations, as verified empirically. Thus, the described algorithm can and should replace the standard CMA-ES if updating and storing the covariance matrix matters.
Tasks
Published	2016-12-01
URL	http://papers.nips.cc/paper/6457-cma-es-with-optimal-covariance-update-and-storage-complexity
PDF	http://papers.nips.cc/paper/6457-cma-es-with-optimal-covariance-update-and-storage-complexity.pdf
PWC	https://paperswithcode.com/paper/cma-es-with-optimal-covariance-update-and
Repo
Framework
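
The abstract above describes a quadratic-time covariance update that maintains a triangular Cholesky factor. The sketch below shows the textbook rank-one Cholesky update applied to a covariance change of the form C' = alpha * C + beta * v v^T. It illustrates the general technique and is not necessarily the authors' exact algorithm; the function names and the `alpha`/`beta` parametrisation are assumptions made for this example.

```python
import math


def cholesky_rank_one_update(L, v, alpha=1.0, beta=1.0):
    """Given lower-triangular L with C = L L^T, return lower-triangular L'
    with L' L'^T = alpha * C + beta * v v^T, in O(n^2) time.

    Requires alpha > 0 and beta >= 0 (a pure update, no downdate).
    """
    n = len(v)
    sa = math.sqrt(alpha)
    # Scale the factor so that L L^T becomes alpha * C; copy to avoid mutation.
    L = [[sa * L[i][j] for j in range(n)] for i in range(n)]
    x = [math.sqrt(beta) * vi for vi in v]
    for k in range(n):
        r = math.hypot(L[k][k], x[k])
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]  # uses the already-updated L[i][k]
    return L


def gram(L):
    """Return L L^T as a list of lists (helper for checking the result)."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Start from the identity covariance and add the outer product of [1, 1].
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = cholesky_rank_one_update(identity, [1.0, 1.0])
covariance = gram(updated)  # approximately [[2, 1], [1, 2]]
```

Only the triangular factor is stored and touched, which is where the quadratic time and reduced memory mentioned in the abstract come from; a full refactorisation of the covariance matrix would cost cubic time.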

Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing


Title	Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing
Authors	Hao Zhou, Yue Zhang, Shujian Huang, Xin-Yu Dai, Jiajun Chen
Abstract	Greedy transition-based parsers are appealing for their speed and reasonably high accuracy. In this paper, we build a fast shift-reduce neural constituent parser by using a neural network to make local decisions. One challenge to the parsing speed is the large hidden and output layer sizes caused by the number of constituent labels and branching options. We speed up the parser by using a hierarchical output layer, inspired by the hierarchical log-bilinear neural language model. In standard WSJ experiments, the neural parser achieves an almost 2.4-fold speedup (320 sentences/sec) compared to a non-hierarchical baseline without significant accuracy loss (89.06 vs 89.13 F-score).
Tasks	Language Modelling
Published	2016-05-01
URL	https://www.aclweb.org/anthology/L16-1104/
PDF	https://www.aclweb.org/anthology/L16-1104
PWC	https://paperswithcode.com/paper/evaluating-a-deterministic-shift-reduce
Repo
Framework
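
The abstract above attributes the speedup to a hierarchical output layer. A minimal sketch of a two-level output factorisation follows, in which the probability of a label is the probability of its group times the probability of the label within that group. The grouping, the function names, and the labels `NP`, `VP`, `PP` are assumptions for illustration; the abstract does not describe the parser's actual hierarchy.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_probs(group_scores, label_scores_by_group):
    """Two-level output layer: P(label) = P(group) * P(label | group).

    group_scores: one score per group.
    label_scores_by_group: list of dicts (label -> score), one per group.
    """
    group_p = softmax(group_scores)
    probs = {}
    for g, label_scores in enumerate(label_scores_by_group):
        labels = list(label_scores)
        within = softmax([label_scores[lab] for lab in labels])
        for lab, p in zip(labels, within):
            probs[lab] = group_p[g] * p
    return probs


def greedy_label(group_scores, label_scores_by_group):
    """Pick the best group first, then the best label inside that group,
    so only one group's label scores need to be inspected."""
    best_group = max(range(len(group_scores)), key=lambda g: group_scores[g])
    candidates = label_scores_by_group[best_group]
    return max(candidates, key=candidates.get)


probs = hierarchical_probs([0.0, 0.0], [{"NP": 0.0, "VP": 0.0}, {"PP": 0.0}])
```

With a greedy decision, scoring the groups and then the labels of a single group touches far fewer output units than a flat softmax over every label, which is the usual motivation for this kind of layer.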

Integrating empty category detection into preordering Machine Translation


Title	Integrating empty category detection into preordering Machine Translation
Authors	Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
Abstract	We propose a method for integrating Japanese empty category detection into the preordering process of Japanese-to-English statistical machine translation. First, we apply machine-learning-based empty category detection to estimate the position and the type of empty categories in the constituent tree of the source sentence. Then, we apply discriminative preordering to the augmented constituent tree in which empty categories are treated as if they are normal lexical symbols. We find that it is effective to filter empty categories based on the confidence of estimation. Our experiments show that, for the IWSLT dataset consisting of short travel conversations, the insertion of empty categories alone improves the BLEU score from 33.2 to 34.3 and the RIBES score from 76.3 to 78.7, which implies that reordering has improved. For the KFTT dataset consisting of Wikipedia sentences, the proposed preordering method considering empty categories improves the BLEU score from 19.9 to 20.2 and the RIBES score from 66.2 to 66.3, which shows that both translation and reordering have improved slightly.
Tasks	Machine Translation, Word Alignment
Published	2016-12-01
URL	https://www.aclweb.org/anthology/W16-4615/
PDF	https://www.aclweb.org/anthology/W16-4615
PWC	https://paperswithcode.com/paper/integrating-empty-category-detection-into
Repo
Framework

Squibs: When the Whole Is Less Than the Sum of Its Parts: How Composition Affects PMI Values in Distributional Semantic Vectors


Title	Squibs: When the Whole Is Less Than the Sum of Its Parts: How Composition Affects PMI Values in Distributional Semantic Vectors
Authors	Denis Paperno, Marco Baroni
Abstract
Tasks
Published	2016-06-01
URL	https://www.aclweb.org/anthology/J16-2006/
PDF	https://www.aclweb.org/anthology/J16-2006
PWC	https://paperswithcode.com/paper/squibs-when-the-whole-is-less-than-the-sum-of
Repo
Framework

Modeling Discourse Segments in Lyrics Using Repeated Patterns


Title	Modeling Discourse Segments in Lyrics Using Repeated Patterns
Authors	Kento Watanabe, Yuichiroh Matsubayashi, Naho Orita, Naoaki Okazaki, Kentaro Inui, Satoru Fukayama, Tomoyasu Nakano, Jordan Smith, Masataka Goto
Abstract	This study proposes a computational model of the discourse segments in lyrics to understand and to model the structure of lyrics. To test our hypothesis that discourse segmentations in lyrics strongly correlate with repeated patterns, we conduct the first large-scale corpus study on discourse segments in lyrics. Next, we propose the task of automatically identifying segment boundaries in lyrics and train a logistic regression model for the task with repeated-pattern and textual features. The results of our empirical experiments illustrate the significance of capturing repeated patterns in predicting the boundaries of discourse segments in lyrics.
Tasks
Published	2016-12-01
URL	https://www.aclweb.org/anthology/C16-1184/
PDF	https://www.aclweb.org/anthology/C16-1184
PWC	https://paperswithcode.com/paper/modeling-discourse-segments-in-lyrics-using
Repo
Framework

Disambiguating Entities Referred by Web Endpoints using Tree Ensembles


Title	Disambiguating Entities Referred by Web Endpoints using Tree Ensembles
Authors	Gitansh Khirbat, Jianzhong Qi, Rui Zhang
Abstract
Tasks	Coreference Resolution, Entity Linking
Published	2016-12-01
URL	https://www.aclweb.org/anthology/U16-1021/
PDF	https://www.aclweb.org/anthology/U16-1021
PWC	https://paperswithcode.com/paper/disambiguating-entities-referred-by-web
Repo
Framework