Paper Group NANR 143
Paths for uncertainty: Exploring the intricacies of uncertainty identification for news
Title | Paths for uncertainty: Exploring the intricacies of uncertainty identification for news |
Authors | Chrysoula Zerva, Sophia Ananiadou |
Abstract | Currently, news articles are produced, shared and consumed at an extremely rapid rate. Although their quantity is increasing, at the same time, their quality and trustworthiness are becoming fuzzier. Hence, it is important not only to automate information extraction but also to quantify the certainty of this information. Automated identification of certainty has been studied both in the scientific and newswire domains, but performance is considerably higher in tasks focusing on scientific text. We compare the differences in the definition and expression of uncertainty between a scientific domain, i.e., biomedicine, and newswire. We delve into the different aspects that affect the certainty of an extracted event in a news article and examine whether they can be easily identified by techniques already validated in the biomedical domain. Finally, we present a comparison of the syntactic and lexical differences between the expression of certainty in the biomedical and newswire domains, using two annotated corpora. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/W18-1302/ |
https://www.aclweb.org/anthology/W18-1302 | |
PWC | https://paperswithcode.com/paper/paths-for-uncertainty-exploring-the |
Repo | |
Framework | |
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
Title | Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP |
Authors | |
Abstract | |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3400/ |
https://www.aclweb.org/anthology/W18-3400 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-workshop-on-deep-learning |
Repo | |
Framework | |
Focus Manipulation Detection via Photometric Histogram Analysis
Title | Focus Manipulation Detection via Photometric Histogram Analysis |
Authors | Can Chen, Scott McCloskey, Jingyi Yu |
Abstract | With the rise of misinformation spread via social media channels, enabled by the increasing automation and realism of image manipulation tools, image forensics is an increasingly relevant problem. Classic image forensic methods leverage low-level cues such as metadata, sensor noise fingerprints, and others that are easily fooled when the image is re-encoded upon upload to Facebook, etc. This necessitates the use of higher-level physical and semantic cues that, once hard to estimate reliably in the wild, have become more effective due to the increasing power of computer vision. In particular, we detect manipulations introduced by artificial blurring of the image, which creates inconsistent photometric relationships between image intensity and various cues. We achieve 98% accuracy on the most challenging cases in a new dataset of blur manipulations, where the blur is geometrically correct and consistent with the scene’s physical arrangement. Such manipulations are now easily generated, for instance, by smartphone cameras having hardware to measure depth, e.g. ‘Portrait Mode’ of the iPhone 7 Plus. We also demonstrate good performance on a challenge dataset evaluating a wider range of manipulations in imagery representing ‘in the wild’ conditions. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Chen_Focus_Manipulation_Detection_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Chen_Focus_Manipulation_Detection_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/focus-manipulation-detection-via-photometric |
Repo | |
Framework | |
Scalable approximate Bayesian inference for particle tracking data
Title | Scalable approximate Bayesian inference for particle tracking data |
Authors | Ruoxi Sun, Liam Paninski |
Abstract | Many important datasets in physics, chemistry, and biology consist of noisy sequences of images of multiple moving overlapping particles. In many cases, the observed particles are indistinguishable, leading to unavoidable uncertainty about nearby particles’ identities. Exact Bayesian inference is intractable in this setting, and previous approximate Bayesian methods scale poorly. Non-Bayesian approaches that output a single “best” estimate of the particle tracks (thus discarding important uncertainty information) are therefore dominant in practice. Here we propose a flexible and scalable amortized approach for Bayesian inference on this task. We introduce a novel neural network method to approximate the (intractable) filter-backward-sample-forward algorithm for Bayesian inference in this setting. By varying the simulated training data for the network, we can perform inference on a wide variety of data types. This approach is therefore highly flexible and improves on the state of the art in terms of accuracy; provides uncertainty estimates about the particle locations and identities; and has a test run-time that scales linearly as a function of the data length and number of particles, thus enabling Bayesian inference in arbitrarily large particle tracking datasets. |
Tasks | Bayesian Inference |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2375 |
http://proceedings.mlr.press/v80/sun18b/sun18b.pdf | |
PWC | https://paperswithcode.com/paper/scalable-approximate-bayesian-inference-for |
Repo | |
Framework | |
SimPA: A Sentence-Level Simplification Corpus for the Public Administration Domain
Title | SimPA: A Sentence-Level Simplification Corpus for the Public Administration Domain |
Authors | Carolina Scarton, Gustavo Paetzold, Lucia Specia |
Abstract | |
Tasks | Lexical Simplification, Text Simplification, Word Embeddings |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1685/ |
https://www.aclweb.org/anthology/L18-1685 | |
PWC | https://paperswithcode.com/paper/simpa-a-sentence-level-simplification-corpus |
Repo | |
Framework | |
Korean L2 Vocabulary Prediction: Can a Large Annotated Corpus be Used to Train Better Models for Predicting Unknown Words?
Title | Korean L2 Vocabulary Prediction: Can a Large Annotated Corpus be Used to Train Better Models for Predicting Unknown Words? |
Authors | Kevin Yancey, Yves Lepage |
Abstract | |
Tasks | Language Acquisition, Lexical Simplification, Reading Comprehension, Text Simplification |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1068/ |
https://www.aclweb.org/anthology/L18-1068 | |
PWC | https://paperswithcode.com/paper/korean-l2-vocabulary-prediction-can-a-large |
Repo | |
Framework | |
The Alexa Meaning Representation Language
Title | The Alexa Meaning Representation Language |
Authors | Thomas Kollar, Danielle Berry, Lauren Stuart, Karolina Owczarzak, Tagyoung Chung, Lambert Mathias, Michael Kayser, Bradford Snow, Spyros Matsoukas |
Abstract | This paper introduces a meaning representation for spoken language understanding. The Alexa meaning representation language (AMRL), unlike previous approaches, which factor spoken utterances into domains, provides a common representation for how people communicate in spoken language. AMRL is a rooted graph, links to a large-scale ontology, supports cross-domain queries, fine-grained types, complex utterances and composition. A spoken language dataset has been collected for Alexa, which contains ∼20k examples across eight domains. A version of this meaning representation was released to developers at a trade show in 2016. |
Tasks | Spoken Language Understanding |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3022/ |
https://www.aclweb.org/anthology/N18-3022 | |
PWC | https://paperswithcode.com/paper/the-alexa-meaning-representation-language |
Repo | |
Framework | |
MULLE: A grammar-based Latin language learning tool to supplement the classroom setting
Title | MULLE: A grammar-based Latin language learning tool to supplement the classroom setting |
Authors | Herbert Lange, Peter Ljunglöf |
Abstract | MULLE is a tool for language learning that focuses on teaching Latin as a foreign language. It is designed for easy integration into the traditional classroom setting and syllabus, which makes it distinct from other language learning tools that provide a standalone learning experience. It uses grammar-based lessons and embraces methods of gamification to improve learner motivation. The main type of exercise provided by our application is to practice translation, but it is also possible to shift the focus to vocabulary or morphology training. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3715/ |
https://www.aclweb.org/anthology/W18-3715 | |
PWC | https://paperswithcode.com/paper/mulle-a-grammar-based-latin-language-learning |
Repo | |
Framework | |
Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features
Title | Efficient High Dimensional Bayesian Optimization with Additivity and Quadrature Fourier Features |
Authors | Mojmir Mutny, Andreas Krause |
Abstract | We develop an efficient and provably no-regret Bayesian optimization (BO) algorithm for optimization of black-box functions in high dimensions. We assume a generalized additive model with possibly overlapping variable groups. When the groups do not overlap, we are able to provide the first provably no-regret polynomial time (in the number of evaluations of the acquisition function) algorithm for solving high dimensional BO. To make the optimization efficient and feasible, we introduce a novel deterministic Fourier Features approximation based on numerical integration with detailed analysis for the squared exponential kernel. The error of this approximation decreases exponentially with the number of features, and allows for a precise approximation of both posterior mean and variance. In addition, the kernel matrix inversion improves in its complexity from cubic to essentially linear in the number of data points measured in basic arithmetic operations. |
Tasks | Hyperparameter Optimization |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8115-efficient-high-dimensional-bayesian-optimization-with-additivity-and-quadrature-fourier-features |
http://papers.nips.cc/paper/8115-efficient-high-dimensional-bayesian-optimization-with-additivity-and-quadrature-fourier-features.pdf | |
PWC | https://paperswithcode.com/paper/efficient-high-dimensional-bayesian |
Repo | |
Framework | |
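The abstract above describes a deterministic Fourier feature approximation of the squared exponential kernel built on numerical integration. The following is a minimal one-dimensional sketch of that idea, assuming Gauss-Hermite quadrature as the integration rule; the function names `qff_features` and `se_kernel`, the node count, and the test points are illustrative choices and are not taken from the paper.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def qff_features(x, num_nodes=20, lengthscale=1.0):
    """Deterministic Fourier features for the 1-D squared exponential kernel.

    Uses the identity k(r) = E[cos(w * r)] with w ~ N(0, 1 / lengthscale**2)
    and replaces the expectation with a Gauss-Hermite quadrature rule
    (an assumed choice of integration rule for this sketch).
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    # Nodes and weights for integrals of the form  int exp(-t^2) f(t) dt.
    nodes, weights = hermgauss(num_nodes)
    # Change of variables w = sqrt(2) * t / lengthscale.
    omegas = np.sqrt(2.0) * nodes / lengthscale
    # Quadrature weights are positive, so the square root is well defined.
    scale = np.sqrt(weights / np.sqrt(np.pi))
    # cos(w x) cos(w y) + sin(w x) sin(w y) = cos(w (x - y)).
    return np.hstack([scale * np.cos(x * omegas), scale * np.sin(x * omegas)])


def se_kernel(x, y, lengthscale=1.0):
    """Exact squared exponential kernel matrix for 1-D inputs."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(1, -1)
    return np.exp(-((x - y) ** 2) / (2.0 * lengthscale**2))


points = np.linspace(-1.0, 1.0, 5)
phi = qff_features(points)
# Inner products of the features should closely match the exact kernel.
max_error = np.max(np.abs(phi @ phi.T - se_kernel(points, points)))
```

Because the feature map is finite-dimensional, a Gaussian process regression on `phi` can work with a matrix whose size depends on the number of features rather than the number of data points, which is the kind of saving the abstract refers to.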
The University of Helsinki submissions to the WMT18 news task
Title | The University of Helsinki submissions to the WMT18 news task |
Authors | Alessandro Raganato, Yves Scherrer, Tommi Nieminen, Arvi Hurskainen, Jörg Tiedemann |
Abstract | This paper describes the University of Helsinki's submissions to the WMT18 shared news translation task for English-Finnish and English-Estonian, in both directions. This year, our main submissions employ a novel neural architecture, the Transformer, using the open-source OpenNMT framework. Our experiments couple domain labeling and fine-tuned multilingual models with shared vocabularies between the source and target language, using the provided parallel data of the shared task and additional back-translations. Finally, we compare, for the English-to-Finnish case, the effectiveness of different machine translation architectures, starting from a rule-based approach to our best neural model, analyzing the output and highlighting future research. |
Tasks | Machine Translation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/W18-6425/ |
https://www.aclweb.org/anthology/W18-6425 | |
PWC | https://paperswithcode.com/paper/the-university-of-helsinki-submissions-to-the |
Repo | |
Framework | |
Annotation Schemes for Surface Construction Labeling
Title | Annotation Schemes for Surface Construction Labeling |
Authors | Lori Levin |
Abstract | In this talk I will describe the interaction of linguistics and language technologies in Surface Construction Labeling (SCL) from the perspective of corpus annotation tasks such as definiteness, modality, and causality. Linguistically, following Construction Grammar, SCL recognizes that meaning may be carried by morphemes, words, or arbitrary constellations of morpho-lexical elements. SCL is like Shallow Semantic Parsing in that it does not attempt a full compositional analysis of meaning, but rather identifies only the main elements of a semantic frame, where the frames may be invoked by constructions as well as lexical items. Computationally, SCL is different from tasks such as information extraction in that it deals only with meanings that are expressed in a conventional, grammaticalized way and does not address inferred meanings. I review the work of Dunietz (2018) on the labeling of causal frames including causal connectives and cause and effect arguments. I will describe how to design an annotation scheme for SCL, including isolating basic units of form and meaning and building a “constructicon”. I will conclude with remarks about the nature of universal categories and universal meaning representations in language technologies. This talk describes joint work with Jaime Carbonell, Jesse Dunietz, Nathan Schneider, and Miriam Petruck. |
Tasks | Semantic Parsing |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/W18-4901/ |
https://www.aclweb.org/anthology/W18-4901 | |
PWC | https://paperswithcode.com/paper/annotation-schemes-for-surface-construction |
Repo | |
Framework | |
Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections
Title | Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections |
Authors | True Price, Johannes L. Schönberger, Zhen Wei, Marc Pollefeys, Jan-Michael Frahm |
Abstract | Image-based 3D reconstruction for Internet photo collections has become a robust technology to produce impressive virtual representations of real-world scenes. However, several fundamental challenges remain for Structure-from-Motion (SfM) pipelines, namely: the placement and reconstruction of transient objects only observed in single views, estimating the absolute scale of the scene, and (surprisingly often) recovering ground surfaces in the scene. We propose a method to jointly address these remaining open problems of SfM. In particular, we focus on detecting people in individual images and accurately placing them into an existing 3D model. As part of this placement, our method also estimates the absolute scale of the scene from object semantics, which in this case constitutes the height distribution of the population. Further, we obtain a smooth approximation of the ground surface and recover the gravity vector of the scene directly from the individual person detections. We demonstrate the results of our approach on a number of unordered Internet photo collections, and we quantitatively evaluate the obtained absolute scene scales. |
Tasks | 3D Reconstruction |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Price_Augmenting_Crowd-Sourced_3D_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Price_Augmenting_Crowd-Sourced_3D_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/augmenting-crowd-sourced-3d-reconstructions |
Repo | |
Framework | |
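The abstract above mentions recovering absolute scene scale from the heights of detected people. As a rough illustration of the underlying arithmetic only, the sketch below divides an assumed reference height by the median person height measured in model units. The 1.7 m reference value, the function name `estimate_scale`, and the sample heights are all hypothetical, and the paper itself works with a height distribution rather than this single point estimate.

```python
from statistics import median

# Assumed average standing height in metres; an illustrative value only.
ASSUMED_MEAN_HEIGHT_M = 1.7


def estimate_scale(person_heights_model_units,
                   reference_height_m=ASSUMED_MEAN_HEIGHT_M):
    """Return metres per model unit from person heights in model units.

    The median keeps a few bad detections (for example a seated person
    or a statue) from dominating the estimate.
    """
    heights = [h for h in person_heights_model_units if h > 0]
    if not heights:
        raise ValueError("need at least one positive person height")
    return reference_height_m / median(heights)


# Made-up heights in arbitrary reconstruction units; the last is an outlier.
detected_heights = [0.41, 0.43, 0.39, 0.42, 0.85]
scale = estimate_scale(detected_heights)
```

Multiplying any distance in the reconstruction by `scale` would then give an approximate metric distance under the stated assumption about typical height.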
Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students
Title | Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students |
Authors | Alejandro Dorantes, Gerardo Sierra, Tlauhlia Yamín Donohue Pérez, Gemma Bel-Enguix, Mónica Jasso Rosales |
Abstract | This work presents the Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students, a corpus of raw data for general use. Its purpose is to offer data for the study of language and interactions via Instant Messaging (IM) among bachelors. Our paper consists of an overview of both the corpus's content and demographic metadata. Furthermore, it presents the current research being conducted with it, namely parenthetical expressions, orality traits, and code-switching. This work also includes a brief outline of similar corpora and recent studies in the field of IM. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-3501/ |
https://www.aclweb.org/anthology/W18-3501 | |
PWC | https://paperswithcode.com/paper/sociolinguistic-corpus-of-whatsapp-chats-in |
Repo | |
Framework | |
繁體中文依存句法剖析器 (Traditional Chinese Dependency Parser) [In Chinese]
Title | 繁體中文依存句法剖析器 (Traditional Chinese Dependency Parser) [In Chinese] |
Authors | Yen-Hsuan Lee, Yih-Ru Wang |
Abstract | |
Tasks | |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/O18-1005/ |
https://www.aclweb.org/anthology/O18-1005 | |
PWC | https://paperswithcode.com/paper/c1ea-a34aa3aa-traditional-chinese-dependency |
Repo | |
Framework | |
Quantifying training challenges of dependency parsers
Title | Quantifying training challenges of dependency parsers |
Authors | Lauriane Aufrant, Guillaume Wisniewski, François Yvon |
Abstract | Not all dependencies are equal when training a dependency parser: some are straightforward enough to be learned with only a sample of data, others embed more complexity. This work introduces a series of metrics to quantify those differences, and thereby to expose the shortcomings of various parsing algorithms and strategies. Apart from a more thorough comparison of parsing systems, these new tools also prove useful for characterizing the information conveyed by cross-lingual parsers, in a quantitative but still interpretable way. |
Tasks | Cross-Lingual Transfer, Dependency Parsing |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-1270/ |
https://www.aclweb.org/anthology/C18-1270 | |
PWC | https://paperswithcode.com/paper/quantifying-training-challenges-of-dependency |
Repo | |
Framework | |