Paper Group NANR 60
UoB-UK at SemEval-2016 Task 1: A Flexible and Extendable System for Semantic Text Similarity using Types, Surprise and Phrase Linking. UNBNLP at SemEval-2016 Task 1: Semantic Textual Similarity: A Unified Framework for Semantic Processing and Evaluation. Fast recovery from a union of subspaces. Annotating Logical Forms for EHR Questions. Semantic T …
UoB-UK at SemEval-2016 Task 1: A Flexible and Extendable System for Semantic Text Similarity using Types, Surprise and Phrase Linking
Title | UoB-UK at SemEval-2016 Task 1: A Flexible and Extendable System for Semantic Text Similarity using Types, Surprise and Phrase Linking |
Authors | Harish Tayyar Madabushi, Mark Buhagiar, Mark Lee |
Abstract | |
Tasks | Machine Translation |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1104/ |
https://www.aclweb.org/anthology/S16-1104 | |
PWC | https://paperswithcode.com/paper/uob-uk-at-semeval-2016-task-1-a-flexible-and |
Repo | |
Framework | |
UNBNLP at SemEval-2016 Task 1: Semantic Textual Similarity: A Unified Framework for Semantic Processing and Evaluation
Title | UNBNLP at SemEval-2016 Task 1: Semantic Textual Similarity: A Unified Framework for Semantic Processing and Evaluation |
Authors | Milton King, Waseem Gharbieh, SoHyun Park, Paul Cook |
Abstract | |
Tasks | Semantic Textual Similarity, Word Embeddings |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1113/ |
https://www.aclweb.org/anthology/S16-1113 | |
PWC | https://paperswithcode.com/paper/unbnlp-at-semeval-2016-task-1-semantic |
Repo | |
Framework | |
Fast recovery from a union of subspaces
Title | Fast recovery from a union of subspaces |
Authors | Chinmay Hegde, Piotr Indyk, Ludwig Schmidt |
Abstract | We address the problem of recovering a high-dimensional but structured vector from linear observations in a general setting where the vector can come from an arbitrary union of subspaces. This setup includes well-studied problems such as compressive sensing and low-rank matrix recovery. We show how to design more efficient algorithms for the union-of-subspace recovery problem by using approximate projections. Instantiating our general framework for the low-rank matrix recovery problem gives the fastest provable running time for an algorithm with optimal sample complexity. Moreover, we give fast approximate projections for 2D histograms, another well-studied low-dimensional model of data. We complement our theoretical results with experiments demonstrating that our framework also leads to improved time and sample complexity empirically. |
Tasks | Compressive Sensing |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6484-fast-recovery-from-a-union-of-subspaces |
http://papers.nips.cc/paper/6484-fast-recovery-from-a-union-of-subspaces.pdf | |
PWC | https://paperswithcode.com/paper/fast-recovery-from-a-union-of-subspaces |
Repo | |
Framework | |
Annotating Logical Forms for EHR Questions
Title | Annotating Logical Forms for EHR Questions |
Authors | Kirk Roberts, Dina Demner-Fushman |
Abstract | This paper discusses the creation of a semantically annotated corpus of questions about patient data in electronic health records (EHRs). The goal is to provide the training data necessary for semantic parsers to automatically convert EHR questions into a structured query. A layered annotation strategy is used which mirrors a typical natural language processing (NLP) pipeline. First, questions are syntactically analyzed to identify multi-part questions. Second, medical concepts are recognized and normalized to a clinical ontology. Finally, logical forms are created using a lambda calculus representation. We use a corpus of 446 questions asking for patient-specific information. From these, 468 specific questions are found containing 259 unique medical concepts and requiring 53 unique predicates to represent the logical forms. We further present detailed characteristics of the corpus, including inter-annotator agreement results, and describe the challenges automatic NLP systems will face on this task. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1598/ |
https://www.aclweb.org/anthology/L16-1598 | |
PWC | https://paperswithcode.com/paper/annotating-logical-forms-for-ehr-questions |
Repo | |
Framework | |
Semantic Textual Similarity in Quality Estimation
Title | Semantic Textual Similarity in Quality Estimation |
Authors | Hanna Bechara, Carla Parra Escartin, Constantin Orasan, Lucia Specia |
Abstract | |
Tasks | Machine Translation, Semantic Textual Similarity |
Published | 2016-01-01 |
URL | https://www.aclweb.org/anthology/W16-3413/ |
https://www.aclweb.org/anthology/W16-3413 | |
PWC | https://paperswithcode.com/paper/semantic-textual-similarity-in-quality |
Repo | |
Framework | |
Keynote - More than meets the ear: Processes that shape dialogue
Title | Keynote - More than meets the ear: Processes that shape dialogue |
Authors | Susan Brennan |
Abstract | |
Tasks | Spoken Dialogue Systems |
Published | 2016-09-01 |
URL | https://www.aclweb.org/anthology/W16-3607/ |
https://www.aclweb.org/anthology/W16-3607 | |
PWC | https://paperswithcode.com/paper/keynote-more-than-meets-the-ear-processes |
Repo | |
Framework | |
Unsupervised Learning from Noisy Networks with Applications to Hi-C Data
Title | Unsupervised Learning from Noisy Networks with Applications to Hi-C Data |
Authors | Bo Wang, Junjie Zhu, Armin Pourshafeie, Oana Ursu, Serafim Batzoglou, Anshul Kundaje |
Abstract | Complex networks play an important role in a plethora of disciplines in natural sciences. Cleaning up noisy observed networks poses an important challenge in network analysis. Existing methods utilize labeled data to alleviate the noise effect in the network. However, labeled data is usually expensive to collect while unlabeled data can be gathered cheaply. In this paper, we propose an optimization framework to mine useful structures from noisy networks in an unsupervised manner. The key feature of our optimization framework is its ability to utilize local structures as well as global patterns in the network. We extend our method to incorporate multi-resolution networks in order to add further resistance to high levels of noise. We also generalize our framework to utilize partial labels to enhance the performance. We specifically focus our method on multi-resolution Hi-C data by recovering clusters of genomic regions that co-localize in 3D space. Additionally, we use Capture-C-generated partial labels to further denoise the Hi-C network. We empirically demonstrate the effectiveness of our framework in denoising the network and improving community detection results. |
Tasks | Community Detection, Denoising |
Published | 2016-12-01 |
URL | http://papers.nips.cc/paper/6291-unsupervised-learning-from-noisy-networks-with-applications-to-hi-c-data |
http://papers.nips.cc/paper/6291-unsupervised-learning-from-noisy-networks-with-applications-to-hi-c-data.pdf | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-from-noisy-networks |
Repo | |
Framework | |
Parallel Sentence Extraction from Comparable Corpora with Neural Network Features
Title | Parallel Sentence Extraction from Comparable Corpora with Neural Network Features |
Authors | Chenhui Chu, Raj Dabre, Sadao Kurohashi |
Abstract | Parallel corpora are crucial for machine translation (MT); however, they are quite scarce for most language pairs and domains. As comparable corpora are far more available, many studies have been conducted to extract parallel sentences from them for MT. In this paper, we exploit the neural network features acquired from neural MT for parallel sentence extraction. We observe significant improvements for both accuracy in sentence extraction and MT performance. |
Tasks | Machine Translation |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1468/ |
https://www.aclweb.org/anthology/L16-1468 | |
PWC | https://paperswithcode.com/paper/parallel-sentence-extraction-from-comparable |
Repo | |
Framework | |
Summ-it++: an Enriched Version of the Summ-it Corpus
Title | Summ-it++: an Enriched Version of the Summ-it Corpus |
Authors | Evandro Fonseca, André Antonitsch, Sandra Collovini, Daniela Amaral, Renata Vieira, Anny Figueira |
Abstract | This paper presents Summ-it++, an enriched version of the Summ-it corpus. In this new version, the corpus has received new semantic layers, named entity categories and relations between named entities, adding to the previous coreference annotation. In addition, we change the original Summ-it format to SemEval |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1324/ |
https://www.aclweb.org/anthology/L16-1324 | |
PWC | https://paperswithcode.com/paper/summ-it-an-enriched-version-of-the-summ-it |
Repo | |
Framework | |
Japanese-English Machine Translation of Recipe Texts
Title | Japanese-English Machine Translation of Recipe Texts |
Authors | Takayuki Sato, Jun Harashima, Mamoru Komachi |
Abstract | Concomitant with the globalization of food culture, demand for the recipes of specialty dishes has been increasing. The recent growth in recipe sharing websites and food blogs has resulted in numerous recipe texts being available for diverse foods in various languages. However, little work has been done on machine translation of recipe texts. In this paper, we address the task of translating recipes and investigate the advantages and disadvantages of traditional phrase-based statistical machine translation and more recent neural machine translation. Specifically, we translate Japanese recipes into English, analyze errors in the translated recipes, and discuss available room for improvements. |
Tasks | Information Retrieval, Machine Translation |
Published | 2016-12-01 |
URL | https://www.aclweb.org/anthology/W16-4603/ |
https://www.aclweb.org/anthology/W16-4603 | |
PWC | https://paperswithcode.com/paper/japanese-english-machine-translation-of |
Repo | |
Framework | |
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
Title | Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) |
Authors | |
Abstract | |
Tasks | |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1000/ |
https://www.aclweb.org/anthology/S16-1000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-10th-international-3 |
Repo | |
Framework | |
iUBC at SemEval-2016 Task 2: RNNs and LSTMs for interpretable STS
Title | iUBC at SemEval-2016 Task 2: RNNs and LSTMs for interpretable STS |
Authors | Iñigo Lopez-Gazpio, Eneko Agirre, Montse Maritxalar |
Abstract | |
Tasks | Chunking, Semantic Textual Similarity |
Published | 2016-06-01 |
URL | https://www.aclweb.org/anthology/S16-1119/ |
https://www.aclweb.org/anthology/S16-1119 | |
PWC | https://paperswithcode.com/paper/iubc-at-semeval-2016-task-2-rnns-and-lstms |
Repo | |
Framework | |
SCALE: A Scalable Language Engineering Toolkit
Title | SCALE: A Scalable Language Engineering Toolkit |
Authors | Joris Pelemans, Lyan Verwimp, Kris Demuynck, Hugo Van hamme, Patrick Wambacq |
Abstract | In this paper we present SCALE, a new Python toolkit that contains two extensions to n-gram language models. The first extension is a novel technique to model compound words called Semantic Head Mapping (SHM). The second extension, Bag-of-Words Language Modeling (BagLM), bundles popular models such as Latent Semantic Analysis and Continuous Skip-grams. Both extensions scale to large data and allow the integration into first-pass ASR decoding. The toolkit is open source, includes working examples and can be found on http://github.com/jorispelemans/scale. |
Tasks | Language Modelling |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1612/ |
https://www.aclweb.org/anthology/L16-1612 | |
PWC | https://paperswithcode.com/paper/scale-a-scalable-language-engineering-toolkit |
Repo | |
Framework | |
Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis
Title | Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis |
Authors | Sandrine Brognaux, Thomas François, Marco Saerens |
Abstract | Text-to-speech has long been centered on the production of an intelligible message of good quality. More recently, interest has shifted to the generation of more natural and expressive speech. A major issue of existing approaches is that they usually rely on a manual annotation in expressive styles, which tends to be rather subjective. A typical related issue is that the annotation is strongly influenced ― and possibly biased ― by the semantic content of the text (e.g. a shot or a fault may incite the annotator to tag that sequence as expressing a high degree of excitation, independently of its acoustic realization). This paper investigates the assumption that human annotation of basketball commentaries in excitation levels can be automatically improved on the basis of acoustic features. It presents two techniques for label correction exploiting a Gaussian mixture and a proportional-odds logistic regression. The automatically re-annotated corpus is then used to train HMM-based expressive speech synthesizers, the performance of which is assessed through subjective evaluations. The results indicate that the automatic correction of the annotation with Gaussian mixture helps to synthesize more contrasted excitation levels, while preserving naturalness. |
Tasks | Speech Synthesis |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1613/ |
https://www.aclweb.org/anthology/L16-1613 | |
PWC | https://paperswithcode.com/paper/combining-manual-and-automatic-prosodic |
Repo | |
Framework | |
BAS Speech Science Web Services - an Update of Current Developments
Title | BAS Speech Science Web Services - an Update of Current Developments |
Authors | Thomas Kisler, Uwe Reichel, Florian Schiel, Christoph Draxler, Bernhard Jackl, Nina Pörner |
Abstract | In 2012 the Bavarian Archive for Speech Signals started providing some of its tools from the field of spoken language in the form of Software as a Service (SaaS). This means users access the processing functionality over a web browser and therefore do not have to install complex software packages on a local computer. Amongst others, these tools include segmentation & labeling, grapheme-to-phoneme conversion, text alignment, syllabification and metadata generation, where all but the last are available for a variety of languages. Since its creation the number of available services and the web interface have changed considerably. We give an overview and a detailed description of the system architecture, the available web services and their functionality. Furthermore, we show how the number of files processed over the system developed in the last four years. |
Tasks | |
Published | 2016-05-01 |
URL | https://www.aclweb.org/anthology/L16-1614/ |
https://www.aclweb.org/anthology/L16-1614 | |
PWC | https://paperswithcode.com/paper/bas-speech-science-web-services-an-update-of |
Repo | |
Framework | |