October 15, 2019

Paper Group NANR 228

Vocabulary Tailored Summary Generation


Title	Vocabulary Tailored Summary Generation
Authors	Kundan Krishna, Aniket Murhekar, Saumitra Sharma, Balaji Vasan Srinivasan
Abstract	Neural sequence-to-sequence models have been successfully extended for summary generation. However, existing frameworks generate a single summary for a given input and do not tune the summaries towards any additional constraints/preferences. Such a tunable framework is desirable to account for linguistic preferences of the specific audience who will consume the summary. In this paper, we propose a neural framework to generate summaries constrained to the vocabulary-defined linguistic preferences of a target audience. The proposed method accounts for the generation context by tuning the summary words at the time of generation. Our evaluations indicate that the proposed approach tunes summaries to the target vocabulary while still maintaining a superior summary quality against a state-of-the-art word embedding based lexical substitution algorithm, suggesting the feasibility of the proposed approach. We demonstrate two applications of the proposed approach - to generate understandable summaries with simpler words, and readable summaries with shorter words.
Tasks	Abstractive Text Summarization, Text Summarization
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-1068/
PDF	https://www.aclweb.org/anthology/C18-1068
PWC	https://paperswithcode.com/paper/vocabulary-tailored-summary-generation
Repo
Framework

Modern Neural Networks Generalize on Small Data Sets


Title	Modern Neural Networks Generalize on Small Data Sets
Authors	Matthew Olson, Abraham Wyner, Richard Berk
Abstract	In this paper, we use a linear program to empirically decompose fitted neural networks into ensembles of low-bias sub-networks. We show that these sub-networks are relatively uncorrelated which leads to an internal regularization process, very much like a random forest, which can explain why a neural network is surprisingly resistant to overfitting. We then demonstrate this in practice by applying large neural networks, with hundreds of parameters per training observation, to a collection of 116 real-world data sets from the UCI Machine Learning Repository. This collection of data sets contains a much smaller number of training examples than the types of image classification tasks generally studied in the deep learning literature, as well as non-trivial label noise. We show that even in this setting deep neural nets are capable of achieving superior classification accuracy without overfitting.
Tasks	Image Classification
Published	2018-12-01
URL	http://papers.nips.cc/paper/7620-modern-neural-networks-generalize-on-small-data-sets
PDF	http://papers.nips.cc/paper/7620-modern-neural-networks-generalize-on-small-data-sets.pdf
PWC	https://paperswithcode.com/paper/modern-neural-networks-generalize-on-small
Repo
Framework

Combining Human and Machine Transcriptions on the Zooniverse Platform


Title	Combining Human and Machine Transcriptions on the Zooniverse Platform
Authors	Daniel Hanson, Andrea Simenstad
Abstract	Transcribing handwritten documents to create fully searchable texts is an essential part of the archival process. Traditional text recognition methods, such as optical character recognition (OCR), do not work on handwritten documents due to their frequent noisiness and OCR's need for individually segmented letters. Crowdsourcing and improved machine models are two modern methods for transcribing handwritten documents.
Tasks	Optical Character Recognition
Published	2018-11-01
URL	https://www.aclweb.org/anthology/W18-6129/
PDF	https://www.aclweb.org/anthology/W18-6129
PWC	https://paperswithcode.com/paper/combining-human-and-machine-transcriptions-on
Repo
Framework

Towards Understanding the Geometry of Knowledge Graph Embeddings


Title	Towards Understanding the Geometry of Knowledge Graph Embeddings
Authors	Chandrahas, Aditya Sharma, Partha Talukdar
Abstract	Knowledge Graph (KG) embedding has emerged as a very active area of research over the last few years, resulting in the development of several embedding methods. These KG embedding methods represent KG entities and relations as vectors in a high-dimensional space. Despite this popularity and effectiveness of KG embeddings in various tasks (e.g., link prediction), geometric understanding of such embeddings (i.e., arrangement of entity and relation vectors in vector space) is unexplored; we fill this gap in the paper. We initiate a study to analyze the geometry of KG embeddings and correlate it with task performance and other hyperparameters. To the best of our knowledge, this is the first study of its kind. Through extensive experiments on real-world datasets, we discover several insights. For example, we find that there are sharp differences between the geometry of embeddings learnt by different classes of KG embeddings methods. We hope that this initial study will inspire other follow-up research on this important but unexplored problem.
Tasks	Knowledge Graph Embeddings, Knowledge Graphs, Link Prediction, Word Embeddings
Published	2018-07-01
URL	https://www.aclweb.org/anthology/P18-1012/
PDF	https://www.aclweb.org/anthology/P18-1012
PWC	https://paperswithcode.com/paper/towards-understanding-the-geometry-of
Repo
Framework

Deep Cauchy Hashing for Hamming Space Retrieval


Title	Deep Cauchy Hashing for Hamming Space Retrieval
Authors	Yue Cao, Mingsheng Long, Bin Liu, Jianmin Wang
Abstract	Due to its computation efficiency and retrieval quality, hashing has been widely applied to approximate nearest neighbor search for large-scale image retrieval, while deep hashing further improves the retrieval quality by end-to-end representation learning and hash coding. With compact hash codes, Hamming space retrieval enables the most efficient constant-time search that returns data points within a given Hamming radius to each query, by hash table lookups instead of linear scan. However, subject to the weak capability of concentrating relevant images to be within a small Hamming ball due to mis-specified loss functions, existing deep hashing methods may underperform for Hamming space retrieval. This work presents Deep Cauchy Hashing (DCH), a novel deep hashing model that generates compact and concentrated binary hash codes to enable efficient and effective Hamming space retrieval. The main idea is to design a pairwise cross-entropy loss based on Cauchy distribution, which penalizes significantly on similar image pairs with Hamming distance larger than the given Hamming radius threshold. Comprehensive experiments demonstrate that DCH can generate highly concentrated hash codes and yield state-of-the-art Hamming space retrieval performance on three datasets, NUS-WIDE, CIFAR-10, and MS-COCO.
Tasks	Image Retrieval, Representation Learning
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Cao_Deep_Cauchy_Hashing_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Cao_Deep_Cauchy_Hashing_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/deep-cauchy-hashing-for-hamming-space
Repo
Framework
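
The abstract above describes a pairwise cross-entropy loss built on a Cauchy distribution over Hamming distances, but does not give its exact form. The following is a minimal sketch, assuming the similarity probability `gamma / (gamma + d)` for Hamming distance `d`; the function names and the `gamma` scale parameter are illustrative assumptions, not the authors' implementation.

```python
import math


def hamming_distance(a, b):
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def cauchy_similarity(distance, gamma=1.0):
    """Assumed Cauchy-shaped probability that a pair is similar.

    Equals 1 at distance 0 and decays as the distance grows.
    gamma is a hypothetical scale parameter.
    """
    return gamma / (gamma + distance)


def pairwise_cross_entropy(distance, similar, gamma=1.0, eps=1e-12):
    """Cross-entropy between the pair label and the assumed probability."""
    p = cauchy_similarity(distance, gamma)
    p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithm finite
    return -math.log(p) if similar else -math.log(1.0 - p)


code_a = [0, 1, 1, 0, 1, 0, 0, 1]
code_b = [0, 1, 0, 1, 1, 0, 0, 1]
d = hamming_distance(code_a, code_b)
loss_if_similar = pairwise_cross_entropy(d, similar=True)
loss_if_dissimilar = pairwise_cross_entropy(d, similar=False)
```

Under this assumed form, a pair labeled similar incurs a loss that grows with its Hamming distance, while a pair labeled dissimilar incurs a loss that shrinks as the codes move apart.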

Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network


Title	Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network
Authors	Wenda Zhao, Fan Zhao, Dong Wang, Huchuan Lu
Abstract	Defocus blur detection (DBD) is the separation of in-focus and out-of-focus regions in an image. This process has been paid considerable attention because of its remarkable potential applications. Accurate differentiation of homogeneous regions and detection of low-contrast focal regions, as well as suppression of background clutter, are challenges associated with DBD. To address these issues, we propose a multi-stream bottom-top-bottom fully convolutional network (BTBNet), which is the first attempt to develop an end-to-end deep network for DBD. First, we develop a fully convolutional BTBNet to integrate low-level cues and high-level semantic information. Then, considering that the degree of defocus blur is sensitive to scales, we propose multi-stream BTBNets that handle input images with different scales to improve the performance of DBD. Finally, we design a fusion and recursive reconstruction network to recursively refine the preceding blur detection maps. To promote further study and evaluation of the DBD models, we construct a new database of 500 challenging images and their pixel-wise defocus blur annotations. Experimental results on the existing and our new datasets demonstrate that the proposed method achieves significantly better performance than other state-of-the-art algorithms.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Zhao_Defocus_Blur_Detection_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhao_Defocus_Blur_Detection_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/defocus-blur-detection-via-multi-stream
Repo
Framework

A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents


Title	A Gold Standard for Multilingual Automatic Term Extraction from Comparable Corpora: Term Structure and Translation Equivalents
Authors	Ayla Rigouts Terryn, Véronique Hoste, Els Lefever
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1284/
PDF	https://www.aclweb.org/anthology/L18-1284
PWC	https://paperswithcode.com/paper/a-gold-standard-for-multilingual-automatic
Repo
Framework

Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task


Title	Systems’ Agreements and Disagreements in Temporal Processing: An Extensive Error Analysis of the TempEval-3 Task
Authors	Tommaso Caselli, Roser Morante
Abstract
Tasks	Natural Language Inference, Question Answering, Relation Classification
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1051/
PDF	https://www.aclweb.org/anthology/L18-1051
PWC	https://paperswithcode.com/paper/systemsa-agreements-and-disagreements-in
Repo
Framework

Low Resource Methods for Medieval Document Sections Analysis


Title	Low Resource Methods for Medieval Document Sections Analysis
Authors	Petra Galuščáková, Lucie Neužilová
Abstract
Tasks	Information Retrieval
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1371/
PDF	https://www.aclweb.org/anthology/L18-1371
PWC	https://paperswithcode.com/paper/low-resource-methods-for-medieval-document
Repo
Framework

A Corpus of Natural Multimodal Spatial Scene Descriptions


Title	A Corpus of Natural Multimodal Spatial Scene Descriptions
Authors	Ting Han, David Schlangen
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1333/
PDF	https://www.aclweb.org/anthology/L18-1333
PWC	https://paperswithcode.com/paper/a-corpus-of-natural-multimodal-spatial-scene
Repo
Framework

Associative Conversation Model: Generating Visual Information from Textual Information


Title	Associative Conversation Model: Generating Visual Information from Textual Information
Authors	Yoichi Ishibashi, Hisashi Miyamori
Abstract	In this paper, we propose the Associative Conversation Model that generates visual information from textual information and uses it for generating sentences in order to utilize visual information in a dialogue system without image input. In research on Neural Machine Translation, there are studies that generate translated sentences using both images and sentences, and these studies show that visual information improves translation performance. However, it is not possible to use sentence generation algorithms using images for the dialogue systems since many text-based dialogue systems only accept text input. Our approach generates (associates) visual information from input text and generates response text using context vector fusing associative visual information and sentence textual information. A comparative experiment between our proposed model and a model without association showed that our proposed model is generating useful sentences by associating visual information related to sentences. Furthermore, analysis experiment of visual association showed that our proposed model generates (associates) visual information effective for sentence generation.
Tasks	Machine Translation
Published	2018-01-01
URL	https://openreview.net/forum?id=HJ39YKiTb
PDF	https://openreview.net/pdf?id=HJ39YKiTb
PWC	https://paperswithcode.com/paper/associative-conversation-model-generating
Repo
Framework

A Fast and Flexible Webinterface for Dialect Research in the Low Countries


Title	A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Authors	Roel van Hout, Nicoline van der Sijs, Erwin Komen, Henk van den Heuvel
Abstract
Tasks
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1572/
PDF	https://www.aclweb.org/anthology/L18-1572
PWC	https://paperswithcode.com/paper/a-fast-and-flexible-webinterface-for-dialect
Repo
Framework

Selecting NLP Techniques to Evaluate Learning Design Objectives in Collaborative Multi-perspective Elaboration Activities


Title	Selecting NLP Techniques to Evaluate Learning Design Objectives in Collaborative Multi-perspective Elaboration Activities
Authors	Aneesha Bakharia
Abstract	PerspectivesX is a multi-perspective elaboration tool designed to encourage learner submission and curation across a range of collaborative learning activities. In this paper, it is shown that the learning design objectives of collaborative learning activities can be evaluated using NLP techniques, but that careful analysis of learner impact and pedagogical intent are required in order to select appropriate techniques. In particular, this paper focuses on the NLP techniques required to deliver an instructor dashboard, personalized learner feedback and content recommendation within multi-perspective elaboration activities. Key NLP techniques considered for inclusion include summarization, topic modeling, paraphrase detection and diversified content recommendation.
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-3712/
PDF	https://www.aclweb.org/anthology/W18-3712
PWC	https://paperswithcode.com/paper/selecting-nlp-techniques-to-evaluate-learning
Repo
Framework

Compiling Combinatorial Prediction Games


Title	Compiling Combinatorial Prediction Games
Authors	Frederic Koriche
Abstract	In online optimization, the goal is to iteratively choose solutions from a decision space, so as to minimize the average cost over time. As long as this decision space is described by combinatorial constraints, the problem is generally intractable. In this paper, we consider the paradigm of compiling the set of combinatorial constraints into a deterministic and Decomposable Negation Normal Form (dDNNF) circuit, for which the tasks of linear optimization and solution sampling take linear time. Based on this framework, we provide efficient characterizations of existing combinatorial prediction strategies, with a particular attention to mirror descent techniques. These strategies are compared on several real-world benchmarks for which the set of Boolean constraints is preliminarily compiled into a dDNNF circuit.
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=2374
PDF	http://proceedings.mlr.press/v80/koriche18a/koriche18a.pdf
PWC	https://paperswithcode.com/paper/compiling-combinatorial-prediction-games
Repo
Framework
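
The abstract above notes that linear optimization over a compiled dDNNF circuit takes linear time. A minimal sketch of why, assuming a smooth circuit (all children of an OR node mention the same variables) encoded as nested tuples: one bottom-up pass sums costs at AND nodes and takes the minimum at OR nodes. The tuple encoding and the `min_cost` function are hypothetical illustrations, not the paper's code.

```python
def min_cost(node, weights, cache=None):
    """Minimum linear cost of any model of a smooth dDNNF circuit.

    node is ("lit", var, positive), ("and", children) or ("or", children).
    weights[var] is the cost of setting var to true; false costs 0.
    Each node is evaluated once, so the pass is linear in circuit size.
    """
    if cache is None:
        cache = {}
    key = id(node)
    if key in cache:
        return cache[key]
    kind = node[0]
    if kind == "lit":
        _, var, positive = node
        value = weights[var] if positive else 0.0
    elif kind == "and":
        # Decomposable: children share no variables, so costs add up.
        value = sum(min_cost(child, weights, cache) for child in node[1])
    elif kind == "or":
        # Smooth: children cover the same variables, so costs are comparable.
        value = min(min_cost(child, weights, cache) for child in node[1])
    else:
        raise ValueError(f"unknown node kind: {kind}")
    cache[key] = value
    return value


# "Exactly one of x1, x2": (x1 AND NOT x2) OR (NOT x1 AND x2)
exactly_one = (
    "or",
    [
        ("and", [("lit", 1, True), ("lit", 2, False)]),
        ("and", [("lit", 1, False), ("lit", 2, True)]),
    ],
)
best = min_cost(exactly_one, {1: 3.0, 2: 5.0})
```

Here the cheaper of the two feasible assignments sets only x1 to true, so `best` is 3.0. A circuit that is not smooth would need to be smoothed first, or variables missing from a branch would be left out of the comparison.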

Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security


Title	Detecting Denial-of-Service Attacks from Social Media Text: Applying NLP to Computer Security
Authors	Nathanael Chambers, Ben Fry, James McMasters
Abstract	This paper describes a novel application of NLP models to detect denial of service attacks using only social media as evidence. Individual networks are often slow in reporting attacks, so a detection system from public data could better assist a response to a broad attack across multiple services. We explore NLP methods to use social media as an indirect measure of network service status. We describe two learning frameworks for this task: a feed-forward neural network and a partially labeled LDA model. Both models outperform previous work by significant margins (20% F1 score). We further show that the topic-based model enables the first fine-grained analysis of how the public reacts to ongoing network attacks, discovering multiple “stages” of observation. This is the first model that both detects network attacks (with best performance) and provides an analysis of when and how the public interprets service outages. We describe the models, present experiments on the largest Twitter DDoS corpus to date, and conclude with an analysis of public reactions based on the learned model's output.
Tasks
Published	2018-06-01
URL	https://www.aclweb.org/anthology/N18-1147/
PDF	https://www.aclweb.org/anthology/N18-1147
PWC	https://paperswithcode.com/paper/detecting-denial-of-service-attacks-from
Repo
Framework