October 15, 2019


Paper Group NANR 128

Automatic Extraction of Entities and Relation from Legal Documents. Attention-based Semantic Priming for Slot-filling. pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment. Neural Machine Translation Techniques for Named Entity Transliteration. Proceedings of Workshop for NLP Open Source Software (NLP-OSS). The Principle of Log …

Automatic Extraction of Entities and Relation from Legal Documents


Title	Automatic Extraction of Entities and Relation from Legal Documents
Authors	Judith Jeyafreeda Andrew
Abstract	In recent years, journalists and computer scientists have been talking to each other to identify technologies that would help journalists extract useful information. This is called "computational journalism". In this paper, we present a method that enables journalists to automatically identify and annotate entities such as names of people, organizations, and the roles and functions of people in legal documents; the relationships between these entities are also explored. The system combines statistical and rule-based techniques. The statistical method is Conditional Random Fields; the rule-based technique uses document- and language-specific regular expressions.
Tasks	Named Entity Recognition
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2401/
PDF	https://www.aclweb.org/anthology/W18-2401
PWC	https://paperswithcode.com/paper/automatic-extraction-of-entities-and-relation
Repo
Framework
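
The abstract above mentions document- and language-specific regular expressions as the rule-based half of the system. A minimal sketch of that idea follows. The pattern, the role keywords, and the sample sentence are all hypothetical and are not taken from the paper; the Conditional Random Fields component is not shown.

```python
import re

# Hypothetical rule: a role keyword followed by a capitalized personal name.
# The keyword list is an assumption chosen for illustration only.
ROLE_PERSON = re.compile(
    r"\b(?P<role>Judge|Attorney|Plaintiff|Defendant)\s+"
    r"(?P<name>[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)"
)

def extract_role_person(text):
    """Return (role, name) pairs matched by the rule-based pattern."""
    return [(m.group("role"), m.group("name")) for m in ROLE_PERSON.finditer(text)]

sample = (
    "Judge Maria Lopez heard the case brought by "
    "Plaintiff John Smith against Acme Corp."
)
pairs = extract_role_person(sample)
print(pairs)
```

In a combined system, matches like these could serve as extra features or as overrides for a statistical tagger; which of the two the paper does is not stated in the abstract.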

Attention-based Semantic Priming for Slot-filling


Title	Attention-based Semantic Priming for Slot-filling
Authors	Jiewen Wu, Rafael E. Banchs, Luis Fernando D'Haro, Pavitra Krishnaswamy, Nancy Chen
Abstract	The problem of sequence labelling in language understanding would benefit from approaches inspired by semantic priming phenomena. We propose that an attention-based RNN architecture can be used to simulate semantic priming for sequence labelling. Specifically, we employ pre-trained word embeddings to characterize the semantic relationship between utterances and labels. We validate the approach using varying sizes of the ATIS and MEDIA datasets, and show up to 1.4-1.9% improvement in F1 score. The developed framework can enable more explainable and generalizable spoken language understanding systems.
Tasks	Slot Filling, Spoken Language Understanding, Word Embeddings
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2404/
PDF	https://www.aclweb.org/anthology/W18-2404
PWC	https://paperswithcode.com/paper/attention-based-semantic-priming-for-slot
Repo
Framework

pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment


Title	pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
Authors	Je Hyeong Hong, Christopher Zach
Abstract	Bundle adjustment is a nonlinear refinement method for camera poses and 3D structure requiring sufficiently good initialization. In recent years, it was experimentally observed that useful minima can be reached even from arbitrary initialization for affine bundle adjustment problems (and fixed-rank matrix factorization instances in general). The key success factor lies in the use of the variable projection (VarPro) method, which is known to have a wide basin of convergence for such problems. In this paper, we propose the Pseudo Object Space Error (pOSE), which is an objective with cameras represented as a hybrid between the affine and projective models. This formulation allows us to obtain 3D reconstructions that are close to the true projective reconstructions while retaining a bilinear problem structure suitable for the VarPro method. Experimental results show that pOSE yields faithful 3D reconstructions from random initializations with a high success rate, taking one step towards initialization-free structure from motion.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/pose-pseudo-object-space-error-for
Repo
Framework

Neural Machine Translation Techniques for Named Entity Transliteration


Title	Neural Machine Translation Techniques for Named Entity Transliteration
Authors	Roman Grundkiewicz, Kenneth Heafield
Abstract	Transliterating named entities from one language into another can be approached as a neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models. To build a strong transliteration system, we apply well-established techniques from NMT, such as dropout regularization, model ensembling, rescoring with right-to-left models, and back-translation. Our submission to the NEWS 2018 Shared Task on Named Entity Transliteration ranked first in several tracks.
Tasks	Automatic Post-Editing, Grammatical Error Correction, Machine Translation, Transliteration
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2413/
PDF	https://www.aclweb.org/anthology/W18-2413
PWC	https://paperswithcode.com/paper/neural-machine-translation-techniques-for
Repo
Framework

Proceedings of Workshop for NLP Open Source Software (NLP-OSS)


Title	Proceedings of Workshop for NLP Open Source Software (NLP-OSS)
Authors
Abstract
Tasks
Published	2018-07-01
URL	https://www.aclweb.org/anthology/W18-2500/
PDF	https://www.aclweb.org/anthology/W18-2500
PWC	https://paperswithcode.com/paper/proceedings-of-workshop-for-nlp-open-source
Repo
Framework

The Principle of Logit Separation


Title	The Principle of Logit Separation
Authors	Gil Keren, Sivan Sabato, Björn Schuller
Abstract	We consider neural network training in applications where there are many possible classes, but at test-time the task is to identify only whether the given example belongs to a specific class, which can be different in different applications of the classifier. For instance, this is the case in an image search engine. We consider the Single Logit Classification (SLC) task: training the network so that at test-time, it would be possible to accurately identify if the example belongs to a given class, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC task. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that in almost all cases, losses that are aligned with the Principle of Logit Separation obtain a 20%-35% relative performance improvement in the SLC task, compared to losses that are not aligned with it. We therefore conclude that the Principle of Logit Separation sheds light on an important property of the most common loss functions used by neural network classifiers.
Tasks	Image Retrieval
Published	2018-01-01
URL	https://openreview.net/forum?id=r1BRfhiab
PDF	https://openreview.net/pdf?id=r1BRfhiab
PWC	https://paperswithcode.com/paper/the-principle-of-logit-separation
Repo
Framework
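
The abstract above states that cross-entropy is not aligned with the Single Logit Classification task. One way to see why, as a sketch: cross-entropy depends only on differences between logits within a single example, so it places no constraint on how one class's logit compares across examples. The logit values below are made up for illustration and do not come from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """Cross-entropy loss of one example given its true class index."""
    return -math.log(softmax(logits)[true_class])

# Made-up logits for two examples over three classes.
a = [2.0, 0.0, 0.0]  # example A, true class 0
b = [5.0, 9.0, 5.0]  # example B, true class 1

# Both examples are classified correctly with low loss.
loss_a = cross_entropy(a, 0)
loss_b = cross_entropy(b, 1)

# Shifting every logit of an example by a constant leaves the loss unchanged,
# so the absolute value of a single logit is not pinned down by the loss.
shifted_loss_a = cross_entropy([z + 10.0 for z in a], 0)

# Yet the class-0 logit of B (a negative for class 0) exceeds that of A
# (a positive for class 0), so no single threshold on logit 0 separates them.
negative_outranks_positive = b[0] > a[0]
```

Under these assumed values, a test that looks only at the class-0 logit would rank example B above example A even though both are fit well by cross-entropy.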

Deeply Learned Filter Response Functions for Hyperspectral Reconstruction


Title	Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
Authors	Shijie Nie, Lin Gu, Yinqiang Zheng, Antony Lam, Nobutaka Ono, Imari Sato
Abstract	Hyperspectral reconstruction from RGB imaging has recently achieved significant progress via sparse coding and deep learning. However, a largely ignored fact is that existing RGB cameras are tuned to mimic human trichromatic perception, thus their spectral responses are not necessarily optimal for hyperspectral reconstruction. In this paper, rather than use RGB spectral responses, we simultaneously learn optimized camera spectral response functions (to be implemented in hardware) and a mapping for spectral reconstruction by using an end-to-end network. Our core idea is that since camera spectral filters act in effect like the convolution layer, their response functions could be optimized by training standard neural networks. We propose two types of designed filters: a three-chip setup without spatial mosaicing and a single-chip setup with a Bayer-style 2x2 filter array. Numerical simulations verify the advantages of deeply learned spectral responses compared to existing RGB cameras. More interestingly, by considering physical restrictions in the design process, we are able to realize the deeply learned spectral response functions by using modern film filter production technologies, and thus construct data-inspired multispectral cameras for snapshot hyperspectral imaging.
Tasks
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Nie_Deeply_Learned_Filter_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Nie_Deeply_Learned_Filter_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/deeply-learned-filter-response-functions-for
Repo
Framework
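
The abstract above observes that camera spectral filters act like a convolution layer. A minimal sketch of that observation, assuming each filter measurement is a weighted sum of the pixel's spectrum over wavelength bands (which is what a 1x1 convolution across spectral channels computes). The band count, spectrum, and filter weights are hypothetical.

```python
def apply_filters(spectrum, responses):
    """Project one pixel's spectrum onto each filter response (one weighted sum per filter)."""
    for resp in responses:
        if len(resp) != len(spectrum):
            raise ValueError("each response needs one weight per spectral band")
    return [sum(w * s for w, s in zip(resp, spectrum)) for resp in responses]

# Hypothetical 4-band spectrum of a single pixel.
spectrum = [0.2, 0.5, 0.9, 0.1]

# Three made-up filter response functions, one row per filter.
# In a learned setup these weights would be trainable parameters.
responses = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

measurements = apply_filters(spectrum, responses)
print(measurements)
```

Because the operation is linear in the response weights, it can be treated as a layer and optimized jointly with the reconstruction network, which is the idea the abstract describes.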

Interpretable Video Captioning via Trajectory Structured Localization


Title	Interpretable Video Captioning via Trajectory Structured Localization
Authors	Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin
Abstract	Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence. Most existing methods simply borrow ideas from image captioning and obtain a compact video representation from an ensemble of global image features before feeding it to an RNN decoder, which outputs a sentence of variable length. However, it is not only arduous for the generator to focus on specific salient objects at different times given the global video representation, it is even more formidable to capture the fine-grained motion information and the relation between moving instances for more subtle linguistic descriptions. In this paper, we propose a Trajectory Structured Attentional Encoder-Decoder (TSA-ED) neural network framework for more elaborate video captioning, which works by integrating local spatial-temporal representation at the trajectory level through a structured attention mechanism. Our proposed method is based on an LSTM-based encoder-decoder framework, which incorporates an attention modeling scheme to adaptively learn the correlation between sentence structure and the moving objects in videos, and consequently generates more accurate and meticulous descriptions in the decoding stage. Experimental results demonstrate that the feature representation and structured attention mechanism based on the trajectory cluster can efficiently obtain the local motion information in the video to help generate a more fine-grained video description, and achieve state-of-the-art performance on the well-known Charades and MSVD datasets.
Tasks	Image Captioning, Video Captioning, Video Description
Published	2018-06-01
URL	http://openaccess.thecvf.com/content_cvpr_2018/html/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.html
PDF	http://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.pdf
PWC	https://paperswithcode.com/paper/interpretable-video-captioning-via-trajectory
Repo
Framework

Iterative Value-Aware Model Learning


Title	Iterative Value-Aware Model Learning
Authors	Amir-Massoud Farahmand
Abstract	This paper introduces a model-based reinforcement learning (MBRL) framework that incorporates the underlying decision problem in learning the transition model of the environment. This is in contrast with conventional approaches to MBRL that learn the model of the environment, for example by finding the maximum likelihood estimate, without taking into account the decision problem. The Value-Aware Model Learning (VAML) framework argues that this might not be a good idea, especially if the true model of the environment does not belong to the model class from which we are estimating the model. The original VAML framework, however, may result in an optimization problem that is difficult to solve. This paper introduces a new MBRL class of algorithms, called Iterative VAML, that benefits from the structure of how the planning is performed (i.e., through approximate value iteration) to devise a simpler optimization problem. The paper theoretically analyzes Iterative VAML and provides a finite-sample error upper bound guarantee for it.
Tasks
Published	2018-12-01
URL	http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning
PDF	http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning.pdf
PWC	https://paperswithcode.com/paper/iterative-value-aware-model-learning
Repo
Framework

Deep Attentive Sentence Ordering Network


Title	Deep Attentive Sentence Ordering Network
Authors	Baiyun Cui, Yingming Li, Ming Chen, Zhongfei Zhang
Abstract	In this paper, we propose a novel deep attentive sentence ordering network (referred to as ATTOrderNet) which integrates a self-attention mechanism with LSTMs in the encoding of input sentences. It enables us to capture global dependencies among sentences regardless of their input order and obtains a reliable representation of the sentence set. With this representation, a pointer network is exploited to generate an ordered sequence. The proposed model is evaluated on Sentence Ordering and Order Discrimination tasks. The extensive experimental results demonstrate its effectiveness and superiority to the state-of-the-art methods.
Tasks	Concept-To-Text Generation, Document Summarization, Feature Engineering, Multi-Document Summarization, Question Answering, Sentence Ordering, Text Generation
Published	2018-10-01
URL	https://www.aclweb.org/anthology/D18-1465/
PDF	https://www.aclweb.org/anthology/D18-1465
PWC	https://paperswithcode.com/paper/deep-attentive-sentence-ordering-network
Repo
Framework

On the Power of Over-parametrization in Neural Networks with Quadratic Activation


Title	On the Power of Over-parametrization in Neural Networks with Quadratic Activation
Authors	Simon Du, Jason Lee
Abstract	We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a shallow network with $k$ hidden nodes, quadratic activation, and $n$ training data points, we show that as long as $k \ge \sqrt{2n}$, over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, even though the number of parameters may exceed the sample size, using the theory of Rademacher complexity, we show that with weight decay, the solution also generalizes well if the data is sampled from a regular distribution such as a Gaussian. To prove that when $k \ge \sqrt{2n}$ the loss function has benign landscape properties, we adopt an idea from smoothed analysis, which may have other applications in studying loss surfaces of neural networks.
Tasks
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=1923
PDF	http://proceedings.mlr.press/v80/du18a/du18a.pdf
PWC	https://paperswithcode.com/paper/on-the-power-of-over-parametrization-in
Repo
Framework
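
The width condition $k \ge \sqrt{2n}$ in the abstract above can be turned into a quick arithmetic check. The helper below is a hypothetical illustration, not code from the paper; it returns the smallest integer width that satisfies the stated condition for a given number of training points.

```python
import math

def min_hidden_width(n):
    """Smallest integer k with k >= sqrt(2n), i.e. k*k >= 2n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Integer arithmetic avoids floating-point rounding near perfect squares.
    return math.isqrt(2 * n - 1) + 1

# Example sample sizes chosen for illustration.
for n in (50, 51, 50_000):
    k = min_hidden_width(n)
    print(f"n={n}: k >= {k}")
```

The point of the condition is that the required width grows only with the square root of the sample size, so a network can be far narrower than the number of training points and still meet it.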

Appraise Evaluation Framework for Machine Translation


Title	Appraise Evaluation Framework for Machine Translation
Authors	Christian Federmann
Abstract	We present Appraise, an open-source framework for crowd-based annotation tasks, notably for evaluation of machine translation output. This is the software used to run the yearly evaluation campaigns for shared tasks at the WMT Conference on Machine Translation. It has also been used at IWSLT 2017 and, recently, to measure human parity for machine translation for Chinese to English news text. The demo will present the full end-to-end lifecycle of an Appraise evaluation campaign, from task creation to annotation and interpretation of results.
Tasks	Machine Translation
Published	2018-08-01
URL	https://www.aclweb.org/anthology/C18-2019/
PDF	https://www.aclweb.org/anthology/C18-2019
PWC	https://paperswithcode.com/paper/appraise-evaluation-framework-for-machine
Repo
Framework

An Application for Building a Polish Telephone Speech Corpus


Title	An Application for Building a Polish Telephone Speech Corpus
Authors	Bartosz Ziółko, Piotr Żelasko, Ireneusz Gawlik, Tomasz Pędzimąż, Tomasz Jadczyk
Abstract
Tasks	Speech Recognition
Published	2018-05-01
URL	https://www.aclweb.org/anthology/L18-1066/
PDF	https://www.aclweb.org/anthology/L18-1066
PWC	https://paperswithcode.com/paper/an-application-for-building-a-polish
Repo
Framework

Dependent Relational Gamma Process Models for Longitudinal Networks


Title	Dependent Relational Gamma Process Models for Longitudinal Networks
Authors	Sikun Yang, Heinz Koeppl
Abstract	A probabilistic framework based on the covariate-dependent relational gamma process is developed to analyze relational data arising from longitudinal networks. The proposed framework characterizes networked nodes by nonnegative node-group memberships, which allow each node to belong to multiple latent groups simultaneously, and encodes edge probabilities between each pair of nodes using a Bernoulli Poisson link to the embedded latent space. Within the latent space, our framework models the birth and death dynamics of individual groups via a thinning function. Our framework also captures the evolution of individual node-group memberships over time using gamma Markov processes. Exploiting the recent advances in data augmentation and marginalization techniques, a simple and efficient Gibbs sampler is proposed for posterior computation. Experimental results on a simulation study and three real-world temporal network data sets demonstrate the model’s capability, competitive performance and scalability compared to state-of-the-art methods.
Tasks	Data Augmentation
Published	2018-07-01
URL	https://icml.cc/Conferences/2018/Schedule?showEvent=1942
PDF	http://proceedings.mlr.press/v80/yang18b/yang18b.pdf
PWC	https://paperswithcode.com/paper/dependent-relational-gamma-process-models-for
Repo
Framework
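
The abstract above encodes edge probabilities with a Bernoulli Poisson link. Under that link, an edge is present when a latent Poisson count is at least one, which gives an edge probability of $1 - e^{-\lambda}$ for rate $\lambda$. The sketch below assumes the rate is a weighted sum over groups of the product of the two nodes' nonnegative memberships; that rate form, the function name, and the numbers are assumptions for illustration, and the paper's exact parameterization may differ.

```python
import math

def edge_probability(phi_u, phi_v, group_weights):
    """Bernoulli-Poisson link: P(edge) = 1 - exp(-rate) for a nonnegative rate."""
    if not (len(phi_u) == len(phi_v) == len(group_weights)):
        raise ValueError("memberships and weights must have the same length")
    # Assumed rate: sum over latent groups of weight * membership_u * membership_v.
    rate = sum(r * a * b for r, a, b in zip(group_weights, phi_u, phi_v))
    # -expm1(-rate) equals 1 - exp(-rate) but is more accurate for small rates.
    return -math.expm1(-rate)

# Hypothetical memberships of three nodes in two latent groups.
phi_a = [1.5, 0.0]
phi_b = [2.0, 0.3]
phi_c = [0.0, 0.8]
weights = [0.5, 1.0]

p_ab = edge_probability(phi_a, phi_b, weights)  # shared group 0
p_ac = edge_probability(phi_a, phi_c, weights)  # no shared group
```

Nodes with no overlapping group membership get probability zero under this form, and the probability grows toward one as shared memberships increase.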

STRUCTURED ALIGNMENT NETWORKS


Title	STRUCTURED ALIGNMENT NETWORKS
Authors	Yang Liu, Matt Gardner
Abstract	Many tasks in natural language processing involve comparing two sentences to compute some notion of relevance, entailment, or similarity. Typically this comparison is done either at the word level or at the sentence level, with no attempt to leverage the inherent structure of the sentence. When sentence structure is used for comparison, it is obtained during a non-differentiable pre-processing step, leading to propagation of errors. We introduce a model of structured alignments between sentences, showing how to compare two sentences by matching their latent structures. Using a structured attention mechanism, our model matches possible spans in the first sentence to possible spans in the second sentence, simultaneously discovering the tree structure of each sentence and performing a comparison, in a model that is fully differentiable and is trained only on the comparison objective. We evaluate this model on two sentence comparison tasks: the Stanford natural language inference dataset and the TREC-QA dataset. We find that comparing spans results in superior performance to comparing words individually, and that the learned trees are consistent with actual linguistic structures.
Tasks	Natural Language Inference
Published	2018-01-01
URL	https://openreview.net/forum?id=Byht0GbRZ
PDF	https://openreview.net/pdf?id=Byht0GbRZ
PWC	https://paperswithcode.com/paper/structured-alignment-networks
Repo
Framework