October 15, 2019

2976 words 14 mins read

Paper Group NANR 276

Convolutional Neural Network for Universal Sentence Embeddings. CNN Driven Sparse Multi-Level B-Spline Image Registration. On the Convergence of PatchMatch and Its Variants. Ab Initio: Automatic Latin Proto-word Reconstruction. TRANSRW at SemEval-2018 Task 12: Transforming Semantic Representations for Argument Reasoning Comprehension. Aggregated Se …

Convolutional Neural Network for Universal Sentence Embeddings

Title Convolutional Neural Network for Universal Sentence Embeddings
Authors Xiaoqi Jiao, Fang Wang, Dan Feng
Abstract This paper proposes a simple CNN model for creating general-purpose sentence embeddings that can transfer easily across domains and can also act as effective initialization for downstream tasks. Recently, averaging the embeddings of words in a sentence has proven to be a surprisingly successful and efficient way of obtaining sentence embeddings. However, these models represent a sentence only in terms of the features of the words, or uni-grams, in it. In contrast, our model (CSE) utilizes features of both words and n-grams to encode sentences, and is in fact a generalization of these bag-of-words models. Extensive experiments demonstrate that CSE outperforms averaging models in the transfer learning setting and exceeds the state of the art in the supervised learning setting when the parameters are initialized with the pre-trained sentence embeddings.
Tasks Semantic Textual Similarity, Sentence Embeddings, Transfer Learning, Word Embeddings
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1209/
PDF https://www.aclweb.org/anthology/C18-1209
PWC https://paperswithcode.com/paper/convolutional-neural-network-for-universal
Repo
Framework
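
The abstract above frames CSE as a CNN encoder that combines unigram (bag-of-words) features with n-gram features. Below is a minimal sketch of that idea; the embedding size, filter counts, n-gram widths, and max-pooling choice are illustrative guesses, not the authors' configuration.

```python
# Sketch: combine an averaged word-embedding (unigram) component with
# max-pooled n-gram convolution features into one sentence vector.
import torch
import torch.nn as nn

class CNNSentenceEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, n_filters=100, ngram_sizes=(2, 3)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # One convolution per n-gram width; each filter responds to an n-gram pattern.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, kernel_size=k) for k in ngram_sizes]
        )

    def forward(self, token_ids):
        emb = self.embed(token_ids)              # (batch, seq_len, emb_dim)
        unigram = emb.mean(dim=1)                # bag-of-words component
        x = emb.transpose(1, 2)                  # (batch, emb_dim, seq_len)
        ngram_feats = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return torch.cat([unigram] + ngram_feats, dim=1)  # sentence embedding

# Usage: encode a batch of two toy "sentences" of padded token ids.
encoder = CNNSentenceEncoder(vocab_size=1000)
ids = torch.randint(1, 1000, (2, 12))
print(encoder(ids).shape)  # torch.Size([2, 500]) with the defaults above
```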

CNN Driven Sparse Multi-Level B-Spline Image Registration

Title CNN Driven Sparse Multi-Level B-Spline Image Registration
Authors Pingge Jiang, James A. Shackleford
Abstract Traditional single-grid and pyramidal B-spline parameterizations used in deformable image registration require users to specify control point spacing configurations capable of accurately capturing both global and complex local deformations. In many cases, such grid configurations are non-obvious and largely selected based on user experience. Recent regularization methods imposing sparsity upon the B-spline coefficients throughout simultaneous multi-grid optimization, however, have provided a promising means of determining suitable configurations automatically. Unfortunately, imposing sparsity on over-parameterized B-spline models is computationally expensive and introduces additional difficulties such as undesirable local minima in the B-spline coefficient optimization process. To overcome these difficulties in determining B-spline grid configurations, this paper investigates the use of convolutional neural networks (CNNs) to learn and infer expressive sparse multi-grid configurations prior to B-spline coefficient optimization. Experimental results show that multi-grid configurations produced in this fashion using our CNN based approach provide registration quality comparable to L1-norm constrained over-parameterizations in terms of exactness, while exhibiting significantly reduced computational requirements.
Tasks Image Registration
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Jiang_CNN_Driven_Sparse_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Jiang_CNN_Driven_Sparse_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/cnn-driven-sparse-multi-level-b-spline-image
Repo
Framework

On the Convergence of PatchMatch and Its Variants

Title On the Convergence of PatchMatch and Its Variants
Authors Thibaud Ehret, Pablo Arias
Abstract Many problems in image/video processing and computer vision require the computation of a dense k-nearest neighbor field (k-NNF) between two images. For each patch in a query image, the k-NNF determines the positions of the k most similar patches in a database image. With the introduction of the PatchMatch algorithm, Barnes et al. demonstrated that this large search problem can be approximated efficiently by collaborative search methods that exploit the local coherency of image patches. After its introduction, several variants of the original PatchMatch algorithm have been proposed, some of them reducing the computational time by two orders of magnitude. In this work we propose a theoretical framework for the analysis of PatchMatch and its variants, and apply it to derive bounds on their convergence rate. We consider a generic PatchMatch algorithm from which most specific instances found in the literature can be derived as particular cases. We also derive more specific bounds for two of these particular cases: the original PatchMatch and Coherency Sensitive Hashing. The proposed bounds are validated by contrasting them to the convergence observed in practice.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Ehret_On_the_Convergence_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Ehret_On_the_Convergence_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/on-the-convergence-of-patchmatch-and-its
Repo
Framework
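
For readers unfamiliar with the algorithm being analysed, here is a minimal sketch of generic PatchMatch for the 1-NN case: a random nearest-neighbour field is improved by alternating propagation of offsets from already-visited neighbours with a shrinking-radius random search. Patch size, iteration count, and the search schedule below are illustrative and not tied to any specific variant covered by the paper's bounds.

```python
import numpy as np

def patch_cost(a, b, ya, xa, yb, xb, p=3):
    # Sum of squared differences between two p x p patches.
    return np.sum((a[ya:ya+p, xa:xa+p] - b[yb:yb+p, xb:xb+p]) ** 2)

def patchmatch(a, b, p=3, iters=4, seed=0):
    rng = np.random.default_rng(seed)
    H, W = a.shape[0] - p + 1, a.shape[1] - p + 1
    Hb, Wb = b.shape[0] - p + 1, b.shape[1] - p + 1
    # Random initialization of the nearest-neighbour field (NNF).
    nnf = np.stack([rng.integers(0, Hb, (H, W)), rng.integers(0, Wb, (H, W))], axis=-1)
    cost = np.array([[patch_cost(a, b, y, x, nnf[y, x][0], nnf[y, x][1], p)
                      for x in range(W)] for y in range(H)])

    def try_offset(y, x, yb, xb):
        yb, xb = int(np.clip(yb, 0, Hb - 1)), int(np.clip(xb, 0, Wb - 1))
        c = patch_cost(a, b, y, x, yb, xb, p)
        if c < cost[y, x]:
            cost[y, x] = c
            nnf[y, x] = (yb, xb)

    for it in range(iters):
        ys = range(H) if it % 2 == 0 else range(H - 1, -1, -1)
        xs = range(W) if it % 2 == 0 else range(W - 1, -1, -1)
        d = 1 if it % 2 == 0 else -1
        for y in ys:
            for x in xs:
                # Propagation: adopt the (shifted) match of already-visited neighbours.
                if 0 <= y - d < H:
                    try_offset(y, x, nnf[y - d, x][0] + d, nnf[y - d, x][1])
                if 0 <= x - d < W:
                    try_offset(y, x, nnf[y, x - d][0], nnf[y, x - d][1] + d)
                # Random search: sample around the current match with shrinking radius.
                r = max(Hb, Wb)
                while r >= 1:
                    try_offset(y, x,
                               nnf[y, x][0] + rng.integers(-r, r + 1),
                               nnf[y, x][1] + rng.integers(-r, r + 1))
                    r //= 2
    return nnf, cost

a = np.random.rand(32, 32)
nnf, cost = patchmatch(a, np.roll(a, 5, axis=1))
print(cost.mean())
```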

Ab Initio: Automatic Latin Proto-word Reconstruction

Title Ab Initio: Automatic Latin Proto-word Reconstruction
Authors Alina Maria Ciobanu, Liviu P. Dinu
Abstract Proto-word reconstruction is central to the study of language evolution. It consists of recreating the words in an ancient language from its modern daughter languages. In this paper we investigate automatic word form reconstruction for Latin proto-words. Having modern word forms in multiple Romance languages (French, Italian, Spanish, Portuguese and Romanian), we infer the form of their common Latin ancestors. Our approach relies on the regularities that occurred when the Latin words entered the modern languages. We leverage information from all modern languages, building an ensemble system for proto-word reconstruction. We use conditional random fields for sequence labeling, but we conduct preliminary experiments with recurrent neural networks as well. We apply our method on multiple datasets, showing that it improves on previous results while also requiring less input data, which is essential in historical linguistics, where resources are generally scarce.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1136/
PDF https://www.aclweb.org/anthology/C18-1136
PWC https://paperswithcode.com/paper/ab-initio-automatic-latin-proto-word
Repo
Framework

TRANSRW at SemEval-2018 Task 12: Transforming Semantic Representations for Argument Reasoning Comprehension

Title TRANSRW at SemEval-2018 Task 12: Transforming Semantic Representations for Argument Reasoning Comprehension
Authors Zhimin Chen, Wei Song, Lizhen Liu
Abstract This paper describes our system for SemEval-2018 Task 12: Argument Reasoning Comprehension. The task is to select the correct warrant that explains the reasoning of a particular argument consisting of a claim and a reason. The main idea behind our methods is the assumption that the semantic composition of the reason and the warrant should be close to the semantic representation of the corresponding claim. We propose two neural network models. The first one considers two warrant candidates simultaneously, while the second one processes each candidate separately and then chooses the better one. We also incorporate sentiment polarity by assuming that there are sentiment associations among the reason, the warrant and the claim. The experiments show that the first framework is more effective and that sentiment polarity is useful.
Tasks Semantic Composition
Published 2018-06-01
URL https://www.aclweb.org/anthology/S18-1194/
PDF https://www.aclweb.org/anthology/S18-1194
PWC https://paperswithcode.com/paper/transrw-at-semeval-2018-task-12-transforming
Repo
Framework
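
The core assumption above, that composing the reason with the correct warrant should land near the claim representation, can be sketched as a simple scoring rule. The averaging encoder and additive composition below are stand-ins for the paper's learned networks, not its actual architecture.

```python
# Score each candidate warrant by how close compose(reason, warrant) is to the claim.
import torch
import torch.nn.functional as F

def encode(token_vectors):
    # token_vectors: (num_tokens, dim) pre-trained word embeddings; mean as a toy encoder.
    return token_vectors.mean(dim=0)

def warrant_score(reason, warrant, claim):
    composed = encode(reason) + encode(warrant)          # toy semantic composition
    return F.cosine_similarity(composed, encode(claim), dim=0)

dim = 50
reason, claim = torch.randn(7, dim), torch.randn(5, dim)
w0, w1 = torch.randn(6, dim), torch.randn(6, dim)
scores = torch.stack([warrant_score(reason, w, claim) for w in (w0, w1)])
print("predicted warrant:", int(scores.argmax()))
```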

Aggregated Semantic Matching for Short Text Entity Linking

Title Aggregated Semantic Matching for Short Text Entity Linking
Authors Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong Pan
Abstract The task of entity linking aims to identify concepts mentioned in a text fragment and link them to a reference knowledge base. Entity linking in long text has been well studied in previous work. However, short text entity linking is more challenging since the texts are noisy and less coherent. To better utilize the local information provided in short texts, we propose a novel neural network framework, Aggregated Semantic Matching (ASM), in which two different aspects of semantic information between the local context and the candidate entity are captured via representation-based and interaction-based neural semantic matching models, and the two matching signals then work jointly for disambiguation through a rank aggregation mechanism. Our evaluation shows that the proposed model outperforms the state of the art on public tweet datasets.
Tasks Card Games, Entity Linking, Information Retrieval, Named Entity Recognition
Published 2018-10-01
URL https://www.aclweb.org/anthology/K18-1046/
PDF https://www.aclweb.org/anthology/K18-1046
PWC https://paperswithcode.com/paper/aggregated-semantic-matching-for-short-text
Repo
Framework
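
A hedged sketch of the two matching signals and the rank aggregation described above: the representation-based score compares pooled vectors, the interaction-based score pools a word-by-word similarity matrix, and a simple Borda count stands in for the paper's aggregation mechanism. All vectors and sizes are synthetic placeholders.

```python
import numpy as np

def rep_score(context, entity):
    # Representation-based matching: cosine of mean-pooled vectors.
    c, e = context.mean(axis=0), entity.mean(axis=0)
    return float(c @ e / (np.linalg.norm(c) * np.linalg.norm(e) + 1e-8))

def inter_score(context, entity):
    # Interaction-based matching: pool a word-by-word cosine similarity matrix.
    sim = context @ entity.T / (
        np.linalg.norm(context, axis=1, keepdims=True) * np.linalg.norm(entity, axis=1) + 1e-8)
    return float(sim.max(axis=1).mean())   # best entity word per context word, averaged

def borda_aggregate(score_lists):
    # Each signal ranks the candidates; candidates collect points by rank.
    n = len(score_lists[0])
    points = np.zeros(n)
    for scores in score_lists:
        order = np.argsort(scores)          # worst .. best
        for rank, idx in enumerate(order):
            points[idx] += rank
    return points

rng = np.random.default_rng(0)
context = rng.normal(size=(8, 50))                         # local context word vectors
candidates = [rng.normal(size=(5, 50)) for _ in range(3)]  # candidate entity descriptions
rep = [rep_score(context, e) for e in candidates]
inter = [inter_score(context, e) for e in candidates]
print("linked candidate:", int(np.argmax(borda_aggregate([rep, inter]))))
```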

Atypical Inputs in Educational Applications

Title Atypical Inputs in Educational Applications
Authors Su-Youn Yoon, Aoife Cahill, Anastassia Loukina, Klaus Zechner, Brian Riordan, Nitin Madnani
Abstract In large-scale educational assessments, the use of automated scoring has recently become quite common. While the majority of student responses can be processed and scored without difficulty, there are a small number of responses that have atypical characteristics that make it difficult for an automated scoring system to assign a correct score. We describe a pipeline that detects and processes these kinds of responses at run-time. We present the most frequent kinds of what are called non-scorable responses along with effective filtering models based on various NLP and speech processing technologies. We give an overview of two operational automated scoring systems, one for essay scoring and one for speech scoring, and describe the filtering models they use. Finally, we present an evaluation and analysis of filtering models used for spoken responses in an assessment of language proficiency.
Tasks Speech Recognition
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-3008/
PDF https://www.aclweb.org/anthology/N18-3008
PWC https://paperswithcode.com/paper/atypical-inputs-in-educational-applications
Repo
Framework
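
To make the run-time filtering idea above concrete, here is a toy pipeline in which each check stands in for one filtering model that flags a non-scorable response before it reaches the scoring engine. The fields, filters, and thresholds are invented for illustration and are not the operational systems' rules.

```python
from dataclasses import dataclass

@dataclass
class Response:
    text: str                # ASR transcript or essay text
    audio_seconds: float     # response duration (spoken case)
    asr_confidence: float    # recognizer confidence in [0, 1]

def is_scorable(r: Response) -> bool:
    # Each check corresponds to one filtering model; thresholds are placeholders.
    if len(r.text.split()) < 5:        # too little content to score
        return False
    if r.audio_seconds < 3.0:          # likely silence or off-task audio
        return False
    if r.asr_confidence < 0.4:         # unreliable recognition output
        return False
    return True

print(is_scorable(Response("I think the graph shows a steady rise", 25.0, 0.9)))
```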

Automatic Parameter Tying in Neural Networks

Title Automatic Parameter Tying in Neural Networks
Authors Yibo Yang, Nicholas Ruozzi, Vibhav Gogate
Abstract Recently, there has been growing interest in methods that perform neural network compression, namely techniques that attempt to substantially reduce the size of a neural network without significant reduction in performance. However, most existing methods are post-processing approaches in that they take a learned neural network as input and output a compressed network by either forcing several parameters to take the same value (parameter tying via quantization) or pruning irrelevant edges (pruning) or both. In this paper, we propose a novel algorithm that jointly learns and compresses a neural network. The key idea in our approach is to change the optimization criteria by adding $k$ independent Gaussian priors over the parameters and a sparsity penalty. We show that our approach is easy to implement using existing neural network libraries, generalizes L1 and L2 regularization and elegantly enforces parameter tying as well as pruning constraints. Experimentally, we demonstrate that our new algorithm yields state-of-the-art compression on several standard benchmarks with minimal loss in accuracy while requiring little to no hyperparameter tuning as compared with related, competing approaches.
Tasks L2 Regularization, Neural Network Compression, Quantization
Published 2018-01-01
URL https://openreview.net/forum?id=HkinqfbAb
PDF https://openreview.net/pdf?id=HkinqfbAb
PWC https://paperswithcode.com/paper/automatic-parameter-tying-in-neural-networks
Repo
Framework
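
A minimal sketch of the joint "learn and tie" objective described above: the task loss is augmented with a quantization-style penalty that pulls every weight toward the nearest of k shared values (the means of the Gaussian priors), plus an L1 term on those shared values to encourage pruning toward zero. The coefficients, model, and plain SGD loop are illustrative, not the authors' training recipe.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
k = 8
centers = nn.Parameter(torch.linspace(-0.5, 0.5, k))          # shared weight values
opt = torch.optim.SGD(list(model.parameters()) + [centers], lr=0.05)
lam_tie, lam_sparse = 1e-3, 1e-4

def tying_penalty():
    # Pull each parameter toward its nearest shared value (soft parameter tying).
    total = 0.0
    for p in model.parameters():
        d = (p.reshape(-1, 1) - centers.reshape(1, -1)) ** 2   # distance to every center
        total = total + d.min(dim=1).values.sum()
    return total

x, y = torch.randn(128, 20), torch.randint(0, 2, (128,))
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y) \
        + lam_tie * tying_penalty() + lam_sparse * centers.abs().sum()
    loss.backward()
    opt.step()
print("learned shared values:", centers.data)
```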

Is Nike female? Exploring the role of sound symbolism in predicting brand name gender

Title Is Nike female? Exploring the role of sound symbolism in predicting brand name gender
Authors Sridhar Moorthy, Ruth Pogacar, Samin Khan, Yang Xu
Abstract Are brand names such as Nike female or male? Previous research suggests that the sound of a person's first name is associated with the person's gender, but no research has tried to use this knowledge to assess the gender of brand names. We present a simple computational approach that uses sound symbolism to address this open issue. Consistent with previous research, a model trained on various linguistic features of name endings predicts human gender with high accuracy. Applying this model to a data set of over a thousand commercially-traded brands in 17 product categories, our results reveal an overall bias toward male names, cutting across both male-oriented product categories as well as female-oriented categories. In addition, we find variation within categories, suggesting that firms might be seeking to imbue their brands with differentiating characteristics as part of their competitive strategy.
Tasks
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1142/
PDF https://www.aclweb.org/anthology/D18-1142
PWC https://paperswithcode.com/paper/is-nike-female-exploring-the-role-of-sound
Repo
Framework
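
A tiny sketch of the name-ending classifier idea above: suffix features of a name feed a logistic regression that predicts gender, which can then be applied to brand names. The eight training names and the features are toy illustrations; the paper uses a proper first-name corpus and richer phonological features.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def ending_features(name):
    n = name.lower()
    return {"last": n[-1], "last2": n[-2:], "ends_vowel": n[-1] in "aeiouy"}

names = ["anna", "maria", "sophia", "emma", "john", "robert", "david", "mark"]
labels = ["f", "f", "f", "f", "m", "m", "m", "m"]

clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit([ending_features(n) for n in names], labels)

# Apply the name-gender model to brand names.
for brand in ["nike", "ford", "chanel"]:
    print(brand, clf.predict([ending_features(brand)])[0])
```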

Generative Entity Networks: Disentangling Entities and Attributes in Visual Scenes using Partial Natural Language Descriptions

Title Generative Entity Networks: Disentangling Entities and Attributes in Visual Scenes using Partial Natural Language Descriptions
Authors Charlie Nash, Sebastian Nowozin, Nate Kushman
Abstract Generative image models have made significant progress in the last few years, and are now able to generate low-resolution images which sometimes look realistic. However, the state-of-the-art models utilize fully entangled latent representations where small changes to a single neuron can affect every output pixel in relatively arbitrary ways, and different neurons have possibly arbitrary relationships with each other. This limits the ability of such models to generalize to new combinations or orientations of objects as well as their ability to connect with more structured representations such as natural language, without explicit strong supervision. In this work we explore the synergistic effect of using partial natural language scene descriptions to help disentangle the latent entities visible in an image. We present a novel neural network architecture called Generative Entity Networks, which jointly generates both the natural language descriptions and the images from a set of latent entities. Our model is based on the variational autoencoder framework and makes use of visual attention to identify and characterise the visual attributes of each entity. Using the Shapeworld dataset, we show that our representation both enables a better generative model of images, leading to higher quality image samples, and creates more semantically useful representations that improve performance over purely discriminative models on a simple natural language yes/no question answering task.
Tasks Question Answering
Published 2018-01-01
URL https://openreview.net/forum?id=BJInMmWC-
PDF https://openreview.net/pdf?id=BJInMmWC-
PWC https://paperswithcode.com/paper/generative-entity-networks-disentangling
Repo
Framework

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Title Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes
Authors Andrea Tirinzoni, Marek Petrik, Xiangli Chen, Brian Ziebart
Abstract What policy should be employed in a Markov decision process with uncertain parameters? The robust optimization answer to this question is to use rectangular uncertainty sets, which independently reflect the available knowledge about each state, and then to obtain a decision policy that maximizes the expected reward under the worst-case decision process parameters from these uncertainty sets. While this rectangularity is convenient computationally and leads to tractable solutions, it often produces policies that are too conservative in practice, and does not facilitate knowledge transfer between portions of the state space or across related decision processes. In this work, we propose non-rectangular uncertainty sets that bound marginal moments of state-action features defined over entire trajectories through a decision process. This enables generalization to different portions of the state space while retaining appropriate uncertainty of the decision process. We develop algorithms for solving the resulting robust decision problems, which reduce to finding an optimal policy for a mixture of decision processes, and demonstrate the benefits of our approach experimentally.
Tasks Transfer Learning
Published 2018-12-01
URL http://papers.nips.cc/paper/8109-policy-conditioned-uncertainty-sets-for-robust-markov-decision-processes
PDF http://papers.nips.cc/paper/8109-policy-conditioned-uncertainty-sets-for-robust-markov-decision-processes.pdf
PWC https://paperswithcode.com/paper/policy-conditioned-uncertainty-sets-for
Repo
Framework
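
In equation form, the abstract above can be paraphrased roughly as follows (notation ours, not the paper's exact definitions): the worst case is taken over transition models whose expected discounted feature counts along trajectories stay close to reference moments, rather than over per-state rectangular sets.

```latex
% Hedged paraphrase of a moment-constrained robust MDP objective.
\max_{\pi} \; \min_{P \in \mathcal{U}(\pi)}
  \mathbb{E}_{P,\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right],
\qquad
\mathcal{U}(\pi) = \Bigl\{ P \;:\;
  \bigl\| \mathbb{E}_{P,\pi}\!\bigl[\textstyle\sum_{t} \gamma^{t}\,\phi(s_t, a_t)\bigr]
  - \bar{\phi} \bigr\| \le \epsilon \Bigr\}.
```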

One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL

Title One Language to rule them all: modelling Morphological Patterns in a Large Scale Italian Lexicon with SWRL
Authors Fahad Khan, Andrea Bellandi, Francesca Frontini, Monica Monachini
Abstract
Tasks
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1694/
PDF https://www.aclweb.org/anthology/L18-1694
PWC https://paperswithcode.com/paper/one-language-to-rule-them-all-modelling
Repo
Framework

A Self-Attentive Hierarchical Model for Jointly Improving Text Summarization and Sentiment Classification

Title A Self-Attentive Hierarchical Model for Jointly Improving Text Summarization and Sentiment Classification
Authors Hongli Wang, Jiangtao Ren
Abstract Text summarization and sentiment classification are two major tasks in NLP text analysis, focusing on extracting the main idea of a text at different levels. Based on the characteristics of both, sentiment classification can be regarded as a more abstractive summarization task. According to this scheme, a Self-Attentive Hierarchical model for jointly improving text Summarization and Sentiment Classification (SAHSSC) is proposed in this paper. This model jointly performs abstractive text summarization and sentiment classification within a hierarchical end-to-end neural framework, in which the sentiment classification layer on top of the summarization layer predicts the sentiment label in the light of the text and the generated summary. Furthermore, a self-attention layer is also proposed in the hierarchical framework, which is the bridge that connects the summarization layer and the sentiment classification layer and aims at capturing emotional information at text-level as well as summary-level. The proposed model can generate a more relevant summary and lead to a more accurate summary-aware sentiment prediction. Experimental results evaluated on SNAP Amazon online review datasets show that our model outperforms the state-of-the-art baselines on both abstractive text summarization and sentiment classification by a considerable margin.
Tasks Abstractive Text Summarization, Sentiment Analysis, Text Summarization
Published 2018-11-14
URL http://proceedings.mlr.press/v95/wang18b.html
PDF http://proceedings.mlr.press/v95/wang18b/wang18b.pdf
PWC https://paperswithcode.com/paper/a-self-attentive-hierarchical-model-for
Repo
Framework
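
A schematic sketch of the hierarchy described above: a summary-aware state is built on top of the text-level state, and the sentiment classifier sees both levels fused by a toy attention-weighted sum. Here the summary is given rather than generated, and all module choices and dimensions are placeholders, not the SAHSSC layers.

```python
import torch
import torch.nn as nn

class ToySAHSSC(nn.Module):
    def __init__(self, vocab, dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.text_enc = nn.GRU(dim, dim, batch_first=True)
        self.summ_enc = nn.GRU(dim, dim, batch_first=True)    # stand-in for the summarization layer
        self.attn = nn.Linear(dim, 1)                         # self-attention over both levels
        self.sentiment = nn.Linear(dim, n_classes)

    def forward(self, text_ids, summary_ids):
        _, h_text = self.text_enc(self.embed(text_ids))                  # text-level state
        _, h_summ = self.summ_enc(self.embed(summary_ids), h_text)       # summary-aware state
        both = torch.cat([h_text, h_summ], dim=0).transpose(0, 1)        # (batch, 2, dim)
        weights = torch.softmax(self.attn(both), dim=1)
        fused = (weights * both).sum(dim=1)
        return self.sentiment(fused)

model = ToySAHSSC(vocab=5000)
text = torch.randint(1, 5000, (4, 60))
summary = torch.randint(1, 5000, (4, 12))
print(model(text, summary).shape)   # torch.Size([4, 2])
```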

Detect Globally, Refine Locally: A Novel Approach to Saliency Detection

Title Detect Globally, Refine Locally: A Novel Approach to Saliency Detection
Authors Tiantian Wang, Lihe Zhang, Shuo Wang, Huchuan Lu, Gang Yang, Xiang Ruan, Ali Borji
Abstract Effective integration of contextual information is crucial for salient object detection. To achieve this, most existing methods based on the ‘skip’ architecture mainly focus on how to integrate hierarchical features of Convolutional Neural Networks (CNNs). They simply apply concatenation or element-wise operations to incorporate high-level semantic cues and low-level detailed information. However, this can degrade the quality of predictions because cluttered and noisy information can also be passed through. To address this problem, we propose a global Recurrent Localization Network (RLN) which exploits contextual information via a weighted response map in order to localize salient objects more accurately. In particular, a recurrent module is employed to progressively refine the inner structure of the CNN over multiple time steps. Moreover, to effectively recover object boundaries, we propose a local Boundary Refinement Network (BRN) to adaptively learn the local contextual information for each spatial position. The learned propagation coefficients can be used to optimally capture relations between each pixel and its neighbors. Experiments on five challenging datasets show that our approach performs favorably against all existing methods in terms of the popular evaluation metrics.
Tasks Object Detection, Saliency Detection, Salient Object Detection
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Wang_Detect_Globally_Refine_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Wang_Detect_Globally_Refine_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/detect-globally-refine-locally-a-novel
Repo
Framework
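
A rough sketch of the local boundary-refinement idea above: for every pixel a small network predicts propagation coefficients over its KxK neighbourhood, and the coarse saliency map is refined as the coefficient-weighted sum of neighbouring predictions. The kernel size and the coefficient network are placeholders, not the paper's BRN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoundaryRefine(nn.Module):
    def __init__(self, k=3):
        super().__init__()
        self.k = k
        # Predict k*k coefficients per pixel from the image (3 ch) + coarse map (1 ch).
        self.coeffs = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, k * k, 3, padding=1))

    def forward(self, image, coarse):
        w = torch.softmax(self.coeffs(torch.cat([image, coarse], dim=1)), dim=1)  # (B, k*k, H, W)
        patches = F.unfold(coarse, self.k, padding=self.k // 2)                   # (B, k*k, H*W)
        B, _, H, W = coarse.shape
        # Refined value = propagation-coefficient-weighted sum of neighbouring predictions.
        refined = (w.flatten(2) * patches).sum(dim=1).reshape(B, 1, H, W)
        return refined

brn = BoundaryRefine()
img, coarse = torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64)
print(brn(img, coarse).shape)   # torch.Size([2, 1, 64, 64])
```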

Extraction Meets Abstraction: Ideal Answer Generation for Biomedical Questions

Title Extraction Meets Abstraction: Ideal Answer Generation for Biomedical Questions
Authors Yutong Li, Nicholas Gekakis, Qiuze Wu, Boyue Li, Khyathi Chandu, Eric Nyberg
Abstract The growing number of biomedical publications is a challenge for human researchers, who invest considerable effort to search for relevant documents and pinpointed answers. Biomedical Question Answering can automatically generate answers for a user's topic or question, significantly reducing the effort required to locate the most relevant information in a large document corpus. Extractive summarization techniques, which concatenate the most relevant text units drawn from multiple documents, perform well on automatic evaluation metrics like ROUGE, but score poorly on human readability, due to the presence of redundant text and grammatical errors in the answer. This work moves toward abstractive summarization, which attempts to distill and present the meaning of the original text in a more coherent way. We incorporate a sentence fusion approach, based on Integer Linear Programming, along with three novel approaches for sentence ordering, in an attempt to improve the human readability of ideal answers. Using an open framework for configuration space exploration (BOOM), we tested over 2000 unique system configurations in order to identify the best-performing combinations for the sixth edition of Phase B of the BioASQ challenge.
Tasks Abstractive Text Summarization, Question Answering, Sentence Ordering
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-5307/
PDF https://www.aclweb.org/anthology/W18-5307
PWC https://paperswithcode.com/paper/extraction-meets-abstraction-ideal-answer
Repo
Framework
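
To give a feel for the ILP-style selection behind sentence fusion mentioned above: choose a subset of candidate sentences that maximizes relevance minus redundancy under a length budget. Real systems solve this with an integer linear program; the brute-force search, the bag-of-words scores, and the toy candidates below are stand-ins.

```python
from itertools import combinations

def overlap(a, b):
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(1, min(len(wa), len(wb)))

def select(question, candidates, budget=30, redundancy_weight=0.5):
    best, best_score = (), float("-inf")
    for r in range(1, len(candidates) + 1):
        for subset in combinations(range(len(candidates)), r):
            if sum(len(candidates[i].split()) for i in subset) > budget:
                continue  # length budget violated
            rel = sum(overlap(question, candidates[i]) for i in subset)
            red = sum(overlap(candidates[i], candidates[j])
                      for i in subset for j in subset if i < j)
            score = rel - redundancy_weight * red
            if score > best_score:
                best, best_score = subset, score
    return [candidates[i] for i in best]

question = "what is the role of BRCA1 in DNA repair"
candidates = [
    "BRCA1 is involved in the repair of DNA double strand breaks",
    "DNA repair of double strand breaks involves BRCA1 and BRCA2",
    "the weather was mild during the study period",
]
print(select(question, candidates))
```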