May 5, 2019

Paper Group NANR 127

Word Segmentation in Sanskrit Using Path Constrained Random Walks

Title Word Segmentation in Sanskrit Using Path Constrained Random Walks
Authors Amrith Krishna, Bishal Santra, Pavankumar Satuluri, Sasi Prasanth Bandaru, Bhumi Faldu, Yajuvendra Singh, Pawan Goyal
Abstract In Sanskrit, the phonemes at word boundaries undergo changes to form new phonemes through a process called sandhi. A fused sentence can be segmented into multiple possible segmentations. We propose a word segmentation approach that predicts the most semantically valid segmentation for a given sentence. We treat the problem as a query expansion problem and use the path-constrained random walks framework to predict the correct segments.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1048/
PDF https://www.aclweb.org/anthology/C16-1048
PWC https://paperswithcode.com/paper/word-segmentation-in-sanskrit-using-path
Repo
Framework

Applying Core Scientific Concepts to Context-Based Citation Recommendation

Title Applying Core Scientific Concepts to Context-Based Citation Recommendation
Authors Daniel Duma, Maria Liakata, Amanda Clare, James Ravenscroft, Ewan Klein
Abstract The task of recommending relevant scientific literature for a draft academic paper has recently received significant interest. In our effort to ease the discovery of scientific literature and augment scientific writing, we aim to improve the relevance of results based on a shallow semantic analysis of the source document and the potential documents to recommend. We investigate the utility of automatic argumentative and rhetorical annotation of documents for this purpose. Specifically, we integrate automatic Core Scientific Concepts (CoreSC) classification into a prototype context-based citation recommendation system and investigate its usefulness to the task. We frame citation recommendation as an information retrieval task and we use the categories of the annotation schemes to apply different weights to the similarity formula. Our results show interesting and consistent correlations between the type of citation and the type of sentence containing the relevant information.
Tasks Information Retrieval
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1274/
PDF https://www.aclweb.org/anthology/L16-1274
PWC https://paperswithcode.com/paper/applying-core-scientific-concepts-to-context
Repo
Framework

Multistage Campaigning in Social Networks

Title Multistage Campaigning in Social Networks
Authors Mehrdad Farajtabar, Xiaojing Ye, Sahar Harati, Le Song, Hongyuan Zha
Abstract We consider control problems for multi-stage campaigning over social networks. The dynamic programming framework is employed to balance the high present reward and large penalty on low future outcome in the presence of extensive uncertainties. In particular, we establish theoretical foundations of optimal campaigning over social networks where the user activities are modeled as a multivariate Hawkes process, and we derive a time dependent linear relation between the intensity of exogenous events and several commonly used objective functions of campaigning. We further develop a convex dynamic programming framework for determining the optimal intervention policy that prescribes the required level of external drive at each stage for the desired campaigning result. Experiments on both synthetic data and the real-world MemeTracker dataset show that our algorithm can steer the user activities for optimal campaigning much more accurately than baselines.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6102-multistage-campaigning-in-social-networks
PDF http://papers.nips.cc/paper/6102-multistage-campaigning-in-social-networks.pdf
PWC https://paperswithcode.com/paper/multistage-campaigning-in-social-networks
Repo
Framework
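
The abstract above models user activity as a multivariate Hawkes process, in which each user's event rate is an exogenous base rate plus contributions from past events. As a rough illustration only, the sketch below evaluates such an intensity with an exponential decay kernel. The kernel choice, the decay rate `omega`, the influence matrix `A`, and the function name `hawkes_intensity` are assumptions made for this example and are not taken from the paper.

```python
import math

def hawkes_intensity(t, mu, A, events, omega=1.0):
    """Intensity of each user at time t for a multivariate Hawkes process.

    mu     -- exogenous (base) intensity per user; in the campaigning setting
              this is the quantity an intervention would adjust
    A      -- A[i][j] is the assumed influence of user j's events on user i
    events -- list of (time, user) pairs that have already occurred
    omega  -- decay rate of the assumed exponential kernel
    """
    lam = list(mu)  # start from the exogenous part
    for t_k, j in events:
        if t_k < t:  # only strictly earlier events contribute
            decay = math.exp(-omega * (t - t_k))
            for i in range(len(mu)):
                lam[i] += A[i][j] * decay
    return lam
```

For a fixed event history the intensity is affine in `mu`: raising a user's base rate raises that user's intensity by the same amount, which hints at why a linear relation between exogenous intensity and campaign objectives is plausible.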

Subdialectal Differences in Sorani Kurdish

Title Subdialectal Differences in Sorani Kurdish
Authors Shervin Malmasi
Abstract In this study we apply classification methods for detecting subdialectal differences in Sorani Kurdish texts produced in different regions, namely Iran and Iraq. As Sorani is a low-resource language, no corpus including texts from different regions was readily available. To this end, we identified data sources that could be leveraged for this task to create a dataset of 200,000 sentences. Using surface features, we attempted to classify Sorani subdialects, showing that sentences from news sources in Iraq and Iran are distinguishable with 96% accuracy. This is the first preliminary study for a dialect that has not been widely studied in computational linguistics, evidencing the possible existence of distinct subdialects.
Tasks Information Retrieval, Language Identification, Machine Translation
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4812/
PDF https://www.aclweb.org/anthology/W16-4812
PWC https://paperswithcode.com/paper/subdialectal-differences-in-sorani-kurdish
Repo
Framework

iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking

Title iAppraise: A Manual Machine Translation Evaluation Environment Supporting Eye-tracking
Authors Ahmed Abdelali, Nadir Durrani, Francisco Guzmán
Abstract
Tasks Eye Tracking, Machine Translation
Published 2016-06-01
URL https://www.aclweb.org/anthology/N16-3004/
PDF https://www.aclweb.org/anthology/N16-3004
PWC https://paperswithcode.com/paper/iappraise-a-manual-machine-translation
Repo
Framework

Crowdsourcing for (almost) Real-time Question Answering

Title Crowdsourcing for (almost) Real-time Question Answering
Authors Denis Savenkov, Scott Weitzner, Eugene Agichtein
Abstract
Tasks Community Question Answering, Question Answering
Published 2016-06-01
URL https://www.aclweb.org/anthology/W16-0102/
PDF https://www.aclweb.org/anthology/W16-0102
PWC https://paperswithcode.com/paper/crowdsourcing-for-almost-real-time-question
Repo
Framework

Multilingual Information Extraction with PolyglotIE

Title Multilingual Information Extraction with PolyglotIE
Authors Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li, Huaiyu Zhu
Abstract We present PolyglotIE, a web-based tool for developing extractors that perform Information Extraction (IE) over multilingual data. Our tool has two core features: First, it allows users to develop extractors against a unified abstraction that is shared across a large set of natural languages. This means that an extractor needs only be created once for one language, but will then run on multilingual data without any additional effort or language-specific knowledge on part of the user. Second, it embeds this abstraction as a set of views within a declarative IE system, allowing users to quickly create extractors using a mature IE query language. We present PolyglotIE as a hands-on demo in which users can experiment with creating extractors, execute them on multilingual text and inspect extraction results. Using the UI, we discuss the challenges and potential of using unified, crosslingual semantic abstractions as basis for downstream applications. We demonstrate multilingual IE for 9 languages from 4 different language groups: English, German, French, Spanish, Japanese, Chinese, Arabic, Russian and Hindi.
Tasks Semantic Parsing
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-2056/
PDF https://www.aclweb.org/anthology/C16-2056
PWC https://paperswithcode.com/paper/multilingual-information-extraction-with
Repo
Framework

Letter Sequence Labeling for Compound Splitting

Title Letter Sequence Labeling for Compound Splitting
Authors Jianqiang Ma, Verena Henrich, Erhard Hinrichs
Abstract
Tasks Machine Translation, Transfer Learning
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-2012/
PDF https://www.aclweb.org/anthology/W16-2012
PWC https://paperswithcode.com/paper/letter-sequence-labeling-for-compound
Repo
Framework

Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images

Title Combining Lexical and Spatial Knowledge to Predict Spatial Relations between Objects in Images
Authors Manuela Hürlimann, Johan Bos
Abstract
Tasks Object Detection, Object Recognition, Relation Classification
Published 2016-08-01
URL https://www.aclweb.org/anthology/W16-3202/
PDF https://www.aclweb.org/anthology/W16-3202
PWC https://paperswithcode.com/paper/combining-lexical-and-spatial-knowledge-to
Repo
Framework

CMA-ES with Optimal Covariance Update and Storage Complexity

Title CMA-ES with Optimal Covariance Update and Storage Complexity
Authors Oswin Krause, Dídac Rodríguez Arbonès, Christian Igel
Abstract The covariance matrix adaptation evolution strategy (CMA-ES) is arguably one of the most powerful real-valued derivative-free optimization algorithms, finding many applications in machine learning. The CMA-ES is a Monte Carlo method, sampling from a sequence of multi-variate Gaussian distributions. Given the function values at the sampled points, updating and storing the covariance matrix dominates the time and space complexity in each iteration of the algorithm. We propose a numerically stable quadratic-time covariance matrix update scheme with minimal memory requirements based on maintaining triangular Cholesky factors. This requires a modification of the cumulative step-size adaption (CSA) mechanism in the CMA-ES, in which we replace the inverse of the square root of the covariance matrix by the inverse of the triangular Cholesky factor. Because the triangular Cholesky factor changes smoothly with the matrix square root, this modification does not change the behavior of the CMA-ES in terms of required objective function evaluations as verified empirically. Thus, the described algorithm can and should replace the standard CMA-ES if updating and storing the covariance matrix matters.
Tasks
Published 2016-12-01
URL http://papers.nips.cc/paper/6457-cma-es-with-optimal-covariance-update-and-storage-complexity
PDF http://papers.nips.cc/paper/6457-cma-es-with-optimal-covariance-update-and-storage-complexity.pdf
PWC https://paperswithcode.com/paper/cma-es-with-optimal-covariance-update-and
Repo
Framework
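
The abstract above relies on updating a triangular Cholesky factor in quadratic time instead of refactorizing the covariance matrix. The sketch below is the textbook rank-one Cholesky update, generalized to `alpha * A + beta * x x^T`; it is not the paper's exact scheme, which the abstract does not spell out. The function name `chol_rank_one_update` and the parameters `alpha` and `beta` are assumptions chosen for illustration.

```python
import math

def chol_rank_one_update(L, x, alpha=1.0, beta=1.0):
    """Return lower-triangular L_new with
    L_new @ L_new.T == alpha * (L @ L.T) + beta * outer(x, x).

    L is a lower-triangular list of lists with positive diagonal.
    Runs in O(n^2) time; requires alpha > 0 and beta >= 0.
    """
    n = len(x)
    sa, sb = math.sqrt(alpha), math.sqrt(beta)
    L = [[sa * v for v in row] for row in L]  # scaled copy of the factor
    x = [sb * v for v in x]                   # scaled copy of the vector
    for k in range(n):
        r = math.hypot(L[k][k], x[k])
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]  # uses the updated L[i][k]
    return L
```

Each of the `n` outer steps touches one column below the diagonal, so the total cost is quadratic in the dimension, compared with cubic cost for a fresh factorization.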

Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing

Title Evaluating a Deterministic Shift-Reduce Neural Parser for Constituent Parsing
Authors Hao Zhou, Yue Zhang, Shujian Huang, Xin-Yu Dai, Jiajun Chen
Abstract Greedy transition-based parsers are appealing for their very fast speed, with reasonably high accuracies. In this paper, we build a fast shift-reduce neural constituent parser by using a neural network to make local decisions. One challenge to the parsing speed is the large hidden and output layer sizes caused by the number of constituent labels and branching options. We speed up the parser by using a hierarchical output layer, inspired by the hierarchical log-bilinear neural language model. In standard WSJ experiments, the neural parser achieves a speedup of almost 2.4 times (320 sen/sec) compared to a non-hierarchical baseline without significant accuracy loss (89.06 vs 89.13 F-score).
Tasks Language Modelling
Published 2016-05-01
URL https://www.aclweb.org/anthology/L16-1104/
PDF https://www.aclweb.org/anthology/L16-1104
PWC https://paperswithcode.com/paper/evaluating-a-deterministic-shift-reduce
Repo
Framework
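
The abstract above attributes the speedup to a hierarchical output layer. One common way to build such a layer is a two-level (class-factored) softmax, where the probability of an output is the probability of its group times its probability within that group. The sketch below shows that factorization only; the grouping, the score inputs, and the function names are assumptions for illustration, and the paper's actual layer may be organized differently.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def two_level_prob(group_scores, within_scores, group, member):
    """P(output) = P(group) * P(member | group).

    group_scores  -- one score per group (e.g. per action type)
    within_scores -- within_scores[g] holds scores for the members of group g
    """
    p_group = softmax(group_scores)[group]
    p_member = softmax(within_scores[group])[member]
    return p_group * p_member
```

A greedy decoder using this layout can pick the best group first and then score only that group's members, so it evaluates far fewer outputs than a flat softmax over every label.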

Integrating empty category detection into preordering Machine Translation

Title Integrating empty category detection into preordering Machine Translation
Authors Shunsuke Takeno, Masaaki Nagata, Kazuhide Yamamoto
Abstract We propose a method for integrating Japanese empty category detection into the preordering process of Japanese-to-English statistical machine translation. First, we apply machine-learning-based empty category detection to estimate the position and the type of empty categories in the constituent tree of the source sentence. Then, we apply discriminative preordering to the augmented constituent tree in which empty categories are treated as if they are normal lexical symbols. We find that it is effective to filter empty categories based on the confidence of estimation. Our experiments show that, for the IWSLT dataset consisting of short travel conversations, the insertion of empty categories alone improves the BLEU score from 33.2 to 34.3 and the RIBES score from 76.3 to 78.7, which implies that reordering has improved. For the KFTT dataset consisting of Wikipedia sentences, the proposed preordering method considering empty categories improves the BLEU score from 19.9 to 20.2 and the RIBES score from 66.2 to 66.3, which shows that both translation and reordering have improved slightly.
Tasks Machine Translation, Word Alignment
Published 2016-12-01
URL https://www.aclweb.org/anthology/W16-4615/
PDF https://www.aclweb.org/anthology/W16-4615
PWC https://paperswithcode.com/paper/integrating-empty-category-detection-into
Repo
Framework

Squibs: When the Whole Is Less Than the Sum of Its Parts: How Composition Affects PMI Values in Distributional Semantic Vectors

Title Squibs: When the Whole Is Less Than the Sum of Its Parts: How Composition Affects PMI Values in Distributional Semantic Vectors
Authors Denis Paperno, Marco Baroni
Abstract
Tasks
Published 2016-06-01
URL https://www.aclweb.org/anthology/J16-2006/
PDF https://www.aclweb.org/anthology/J16-2006
PWC https://paperswithcode.com/paper/squibs-when-the-whole-is-less-than-the-sum-of
Repo
Framework

Modeling Discourse Segments in Lyrics Using Repeated Patterns

Title Modeling Discourse Segments in Lyrics Using Repeated Patterns
Authors Kento Watanabe, Yuichiroh Matsubayashi, Naho Orita, Naoaki Okazaki, Kentaro Inui, Satoru Fukayama, Tomoyasu Nakano, Jordan Smith, Masataka Goto
Abstract This study proposes a computational model of the discourse segments in lyrics to understand and to model the structure of lyrics. To test our hypothesis that discourse segmentations in lyrics strongly correlate with repeated patterns, we conduct the first large-scale corpus study on discourse segments in lyrics. Next, we propose the task to automatically identify segment boundaries in lyrics and train a logistic regression model for the task with the repeated pattern and textual features. The results of our empirical experiments illustrate the significance of capturing repeated patterns in predicting the boundaries of discourse segments in lyrics.
Tasks
Published 2016-12-01
URL https://www.aclweb.org/anthology/C16-1184/
PDF https://www.aclweb.org/anthology/C16-1184
PWC https://paperswithcode.com/paper/modeling-discourse-segments-in-lyrics-using
Repo
Framework

Disambiguating Entities Referred by Web Endpoints using Tree Ensembles

Title Disambiguating Entities Referred by Web Endpoints using Tree Ensembles
Authors Gitansh Khirbat, Jianzhong Qi, Rui Zhang
Abstract
Tasks Coreference Resolution, Entity Linking
Published 2016-12-01
URL https://www.aclweb.org/anthology/U16-1021/
PDF https://www.aclweb.org/anthology/U16-1021
PWC https://paperswithcode.com/paper/disambiguating-entities-referred-by-web
Repo
Framework