October 15, 2019

Paper Group NANR 228

Vocabulary Tailored Summary Generation

Title Vocabulary Tailored Summary Generation
Authors Kundan Krishna, Aniket Murhekar, Saumitra Sharma, Balaji Vasan Srinivasan
Abstract Neural sequence-to-sequence models have been successfully extended for summary generation. However, existing frameworks generate a single summary for a given input and do not tune the summaries towards any additional constraints/preferences. Such a tunable framework is desirable to account for linguistic preferences of the specific audience who will consume the summary. In this paper, we propose a neural framework to generate summaries constrained to the vocabulary-defined linguistic preferences of a target audience. The proposed method accounts for the generation context by tuning the summary words at the time of generation. Our evaluations indicate that the proposed approach tunes summaries to the target vocabulary while still maintaining a superior summary quality against a state-of-the-art word embedding based lexical substitution algorithm, suggesting the feasibility of the proposed approach. We demonstrate two applications of the proposed approach: generating understandable summaries with simpler words, and readable summaries with shorter words.
Tasks Abstractive Text Summarization, Text Summarization
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1068/
PDF https://www.aclweb.org/anthology/C18-1068
PWC https://paperswithcode.com/paper/vocabulary-tailored-summary-generation
Repo
Framework

Modern Neural Networks Generalize on Small Data Sets

Title Modern Neural Networks Generalize on Small Data Sets
Authors Matthew Olson, Abraham Wyner, Richard Berk
Abstract In this paper, we use a linear program to empirically decompose fitted neural networks into ensembles of low-bias sub-networks. We show that these sub-networks are relatively uncorrelated which leads to an internal regularization process, very much like a random forest, which can explain why a neural network is surprisingly resistant to overfitting. We then demonstrate this in practice by applying large neural networks, with hundreds of parameters per training observation, to a collection of 116 real-world data sets from the UCI Machine Learning Repository. This collection of data sets contains a much smaller number of training examples than the types of image classification tasks generally studied in the deep learning literature, as well as non-trivial label noise. We show that even in this setting deep neural nets are capable of achieving superior classification accuracy without overfitting.
Tasks Image Classification
Published 2018-12-01
URL http://papers.nips.cc/paper/7620-modern-neural-networks-generalize-on-small-data-sets
PDF http://papers.nips.cc/paper/7620-modern-neural-networks-generalize-on-small-data-sets.pdf
PWC https://paperswithcode.com/paper/modern-neural-networks-generalize-on-small
Repo
Framework

Combining Human and Machine Transcriptions on the Zooniverse Platform

Title Combining Human and Machine Transcriptions on the Zooniverse Platform
Authors Daniel Hanson, Andrea Simenstad
Abstract Transcribing handwritten documents to create fully searchable texts is an essential part of the archival process. Traditional text recognition methods, such as optical character recognition (OCR), do not work on handwritten documents due to their frequent noisiness and OCR's need for individually segmented letters. Crowdsourcing and improved machine models are two modern methods for transcribing handwritten documents.
Tasks Optical Character Recognition
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6129/
PDF https://www.aclweb.org/anthology/W18-6129
PWC https://paperswithcode.com/paper/combining-human-and-machine-transcriptions-on
Repo
Framework

Towards Understanding the Geometry of Knowledge Graph Embeddings

Title Towards Understanding the Geometry of Knowledge Graph Embeddings
Authors Chandrahas, Aditya Sharma, Partha Talukdar
Abstract Knowledge Graph (KG) embedding has emerged as a very active area of research over the last few years, resulting in the development of several embedding methods. These KG embedding methods represent KG entities and relations as vectors in a high-dimensional space. Despite this popularity and effectiveness of KG embeddings in various tasks (e.g., link prediction), geometric understanding of such embeddings (i.e., arrangement of entity and relation vectors in vector space) is unexplored; we fill this gap in the paper. We initiate a study to analyze the geometry of KG embeddings and correlate it with task performance and other hyperparameters. To the best of our knowledge, this is the first study of its kind. Through extensive experiments on real-world datasets, we discover several insights. For example, we find that there are sharp differences between the geometry of embeddings learnt by different classes of KG embedding methods. We hope that this initial study will inspire other follow-up research on this important but unexplored problem.
Tasks Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/P18-1012/
PDF https://www.aclweb.org/anthology/P18-1012
PWC https://paperswithcode.com/paper/towards-understanding-the-geometry-of
Repo
Framework

Deep Cauchy Hashing for Hamming Space Retrieval

Title Deep Cauchy Hashing for Hamming Space Retrieval
Authors Yue Cao, Mingsheng Long, Bin Liu, Jianmin Wang
Abstract Due to its computation efficiency and retrieval quality, hashing has been widely applied to approximate nearest neighbor search for large-scale image retrieval, while deep hashing further improves the retrieval quality by end-to-end representation learning and hash coding. With compact hash codes, Hamming space retrieval enables the most efficient constant-time search that returns data points within a given Hamming radius to each query, by hash table lookups instead of linear scan. However, subject to the weak capability of concentrating relevant images to be within a small Hamming ball due to mis-specified loss functions, existing deep hashing methods may underperform for Hamming space retrieval. This work presents Deep Cauchy Hashing (DCH), a novel deep hashing model that generates compact and concentrated binary hash codes to enable efficient and effective Hamming space retrieval. The main idea is to design a pairwise cross-entropy loss based on Cauchy distribution, which penalizes significantly on similar image pairs with Hamming distance larger than the given Hamming radius threshold. Comprehensive experiments demonstrate that DCH can generate highly concentrated hash codes and yield state-of-the-art Hamming space retrieval performance on three datasets, NUS-WIDE, CIFAR-10, and MS-COCO.
Tasks Image Retrieval, Representation Learning
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Cao_Deep_Cauchy_Hashing_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Cao_Deep_Cauchy_Hashing_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/deep-cauchy-hashing-for-hamming-space
Repo
Framework
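
The abstract above mentions a pairwise cross-entropy loss built on a Cauchy distribution over Hamming distances. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the similarity probability takes the form gamma / (gamma + d) for Hamming distance d, and the function names, the `gamma` default, and the `eps` clamp are all hypothetical choices made for illustration.

```python
import math


def hamming_distance(a, b):
    """Count positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cauchy_similarity_prob(distance, gamma=2.0):
    """Assumed Cauchy-shaped probability that a pair is similar.

    Equals 1.0 at distance 0 and decays slowly (heavy tail) as the
    Hamming distance grows; gamma is an illustrative scale parameter.
    """
    return gamma / (gamma + distance)


def pairwise_cross_entropy(distance, similar, gamma=2.0, eps=1e-12):
    """Cross-entropy between the pair label and the assumed probability."""
    p = cauchy_similarity_prob(distance, gamma)
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    return -math.log(p) if similar else -math.log(1.0 - p)


code_a = [1, 0, 1, 1]
code_b = [1, 1, 1, 0]
d = hamming_distance(code_a, code_b)
loss_if_similar = pairwise_cross_entropy(d, similar=True)
```

Under this assumed form, a similar pair is penalized more the farther apart its codes are, while a dissimilar pair is penalized more the closer they are, which matches the abstract's goal of concentrating relevant items inside a small Hamming ball.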

Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network

Title Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network
Authors Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu
Abstract Defocus blur detection (DBD) is the separation of in-focus and out-of-focus regions in an image. This process has received considerable attention because of its remarkable potential applications. Accurate differentiation of homogeneous regions and detection of low-contrast focal regions, as well as suppression of background clutter, are challenges associated with DBD. To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD. First, we develop a fully convolutional BTBNet to integrate low-level cues and high-level semantic information. Then, considering that the degree of defocus blur is sensitive to scales, we propose multi-stream BTBNets that handle input images with different scales to improve the performance of DBD. Finally, we design a fusion and recursive reconstruction network to recursively refine the preceding blur detection maps. To promote further study and evaluation of the DBD models, we construct a new database of 500 challenging images and their pixel-wise defocus blur annotations. Experimental results on the existing and our new datasets demonstrate that the proposed method achieves significantly better performance than other state-of-the-art algorithms.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Zhao_Defocus_Blur_Detection_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhao_Defocus_Blur_Detection_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/defocus-blur-detection-via-multi-stream
Repo
Framework

A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents

Title A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
Authors Ayla Rigouts Terryn, Véronique Hoste, Els Lefever
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1284/
PDF https://www.aclweb.org/anthology/L18-1284
PWC https://paperswithcode.com/paper/a-gold-standard-for-multilingual-automatic
Repo
Framework

Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task

Title Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task
Authors Tommaso Caselli, Roser Morante
Abstract
Tasks Natural Language Inference, Question Answering, Relation Classification
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1051/
PDF https://www.aclweb.org/anthology/L18-1051
PWC https://paperswithcode.com/paper/systemsa-agreements-and-disagreements-in
Repo
Framework

Low Resource Methods for Medieval Document Sections Analysis

Title Low Resource Methods for Medieval Document Sections Analysis
Authors Petra Galuščáková, Lucie Neužilová
Abstract
Tasks Information Retrieval
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1371/
PDF https://www.aclweb.org/anthology/L18-1371
PWC https://paperswithcode.com/paper/low-resource-methods-for-medieval-document
Repo
Framework

A Corpus of Natural Multimodal Spatial Scene Descriptions

Title A Corpus of Natural Multimodal Spatial Scene Descriptions
Authors Ting Han, David Schlangen
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1333/
PDF https://www.aclweb.org/anthology/L18-1333
PWC https://paperswithcode.com/paper/a-corpus-of-natural-multimodal-spatial-scene
Repo
Framework

Associative Conversation Model: Generating Visual Information from Textual Information

Title Associative Conversation Model: Generating Visual Information from Textual Information
Authors Yoichi Ishibashi, Hisashi Miyamori
Abstract In this paper, we propose the Associative Conversation Model that generates visual information from textual information and uses it for generating sentences in order to utilize visual information in a dialogue system without image input. In research on Neural Machine Translation, there are studies that generate translated sentences using both images and sentences, and these studies show that visual information improves translation performance. However, it is not possible to use image-based sentence generation algorithms for dialogue systems, since many text-based dialogue systems only accept text input. Our approach generates (associates) visual information from input text and generates response text using a context vector that fuses associative visual information and sentence textual information. A comparative experiment between our proposed model and a model without association showed that our proposed model generates useful sentences by associating visual information related to sentences. Furthermore, an analysis experiment on visual association showed that our proposed model generates (associates) visual information effective for sentence generation.
Tasks Machine Translation
Published 2018-01-01
URL https://openreview.net/forum?id=HJ39YKiTb
PDF https://openreview.net/pdf?id=HJ39YKiTb
PWC https://paperswithcode.com/paper/associative-conversation-model-generating
Repo
Framework

A Fast and Flexible Webinterface for Dialect Research in the Low Countries

Title A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Authors Roel van Hout, Nicoline van der Sijs, Erwin Komen, Henk van den Heuvel
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1572/
PDF https://www.aclweb.org/anthology/L18-1572
PWC https://paperswithcode.com/paper/a-fast-and-flexible-webinterface-for-dialect
Repo
Framework

Selecting NLP Techniques to Evaluate Learning Design Objectives in Collaborative Multi-perspective Elaboration Activities

Title Selecting NLP Techniques to Evaluate Learning Design Objectives in Collaborative Multi-perspective Elaboration Activities
Authors Aneesha Bakharia
Abstract PerspectivesX is a multi-perspective elaboration tool designed to encourage learner submission and curation across a range of collaborative learning activities. In this paper, it is shown that the learning design objectives of collaborative learning activities can be evaluated using NLP techniques, but that careful analysis of learner impact and pedagogical intent is required in order to select appropriate techniques. In particular, this paper focuses on the NLP techniques required to deliver an instructor dashboard, personalized learner feedback and content recommendation within multi-perspective elaboration activities. Key NLP techniques considered for inclusion include summarization, topic modeling, paraphrase detection and diversified content recommendation.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-3712/
PDF https://www.aclweb.org/anthology/W18-3712
PWC https://paperswithcode.com/paper/selecting-nlp-techniques-to-evaluate-learning
Repo
Framework

Compiling Combinatorial Prediction Games

Title Compiling Combinatorial Prediction Games
Authors Frederic Koriche
Abstract In online optimization, the goal is to iteratively choose solutions from a decision space, so as to minimize the average cost over time. As long as this decision space is described by combinatorial constraints, the problem is generally intractable. In this paper, we consider the paradigm of compiling the set of combinatorial constraints into a deterministic and Decomposable Negation Normal Form (dDNNF) circuit, for which the tasks of linear optimization and solution sampling take linear time. Based on this framework, we provide efficient characterizations of existing combinatorial prediction strategies, with particular attention to mirror descent techniques. These strategies are compared on several real-world benchmarks for which the set of Boolean constraints is preliminarily compiled into a dDNNF circuit.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2374
PDF http://proceedings.mlr.press/v80/koriche18a/koriche18a.pdf
PWC https://paperswithcode.com/paper/compiling-combinatorial-prediction-games
Repo
Framework
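
The abstract above states that linear optimization over a compiled dDNNF circuit takes linear time. The sketch below illustrates one standard way this can work, under assumptions not stated in the abstract: the circuit is additionally smooth (all children of an OR node mention the same variables) and is given as a tree. The tuple encoding of nodes and the name `min_cost` are hypothetical, chosen only for this example.

```python
def min_cost(node, cost):
    """Return (minimum linear cost, assignment) over the models of a circuit.

    Nodes are tuples (a hypothetical encoding for illustration):
      ("lit", var, positive)  -- a literal; a true variable pays cost[var]
      ("and", [children])     -- decomposable: children share no variables
      ("or",  [children])     -- assumed smooth: children share all variables
    Each node is visited once in a tree; a shared DAG would need memoization.
    """
    kind = node[0]
    if kind == "lit":
        _, var, positive = node
        return (cost[var] if positive else 0.0), {var: positive}

    results = [min_cost(child, cost) for child in node[1]]
    if kind == "and":
        # Disjoint variable sets, so costs add and assignments merge.
        total = sum(c for c, _ in results)
        assignment = {}
        for _, partial in results:
            assignment.update(partial)
        return total, assignment
    if kind == "or":
        # Any child yields a valid model, so keep the cheapest one.
        return min(results, key=lambda r: r[0])
    raise ValueError(f"unknown node kind: {kind!r}")


# Constraint "exactly one of x1, x2 is true", written as a small circuit.
exactly_one = ("or", [
    ("and", [("lit", 1, True), ("lit", 2, False)]),
    ("and", [("lit", 1, False), ("lit", 2, True)]),
])
best_cost, best_assignment = min_cost(exactly_one, {1: 3.0, 2: 1.0})
```

Here the cheaper model sets x2 to true. Determinism is not used by this minimization pass; it matters for tasks such as counting or sampling solutions, which the abstract also mentions.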

Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security

Title Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security
Authors Nathanael Chambers, Ben Fry, James McMasters
Abstract This paper describes a novel application of NLP models to detect denial of service attacks using only social media as evidence. Individual networks are often slow in reporting attacks, so a detection system from public data could better assist a response to a broad attack across multiple services. We explore NLP methods to use social media as an indirect measure of network service status. We describe two learning frameworks for this task: a feed-forward neural network and a partially labeled LDA model. Both models outperform previous work by significant margins (20% F1 score). We further show that the topic-based model enables the first fine-grained analysis of how the public reacts to ongoing network attacks, discovering multiple "stages" of observation. This is the first model that both detects network attacks (with best performance) and provides an analysis of when and how the public interprets service outages. We describe the models, present experiments on the largest Twitter DDoS corpus to date, and conclude with an analysis of public reactions based on the learned model's output.
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-1147/
PDF https://www.aclweb.org/anthology/N18-1147
PWC https://paperswithcode.com/paper/detecting-denial-of-service-attacks-from
Repo
Framework