October 15, 2019

Paper Group NANR 128

Automatic Extraction of Entities and Relation from Legal Documents. Attention-based Semantic Priming for Slot-filling. pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment. Neural Machine Translation Techniques for Named Entity Transliteration. Proceedings of Workshop for NLP Open Source Software (NLP-OSS). The Principle of Log …

Title Automatic Extraction of Entities and Relation from Legal Documents
Authors Judith Jeyafreeda Andrew
Abstract In recent years, journalists and computer scientists have been talking to each other to identify technologies that would help journalists extract useful information. This is called "computational journalism". In this paper, we present a method that enables journalists to automatically identify and annotate entities such as names of people, organizations, and the roles and functions of people in legal documents; the relationships between these entities are also explored. The system uses a combination of statistical and rule-based techniques. The statistical method used is Conditional Random Fields, and for the rule-based technique, document- and language-specific regular expressions are used.
Tasks Named Entity Recognition
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2401/
PDF https://www.aclweb.org/anthology/W18-2401
PWC https://paperswithcode.com/paper/automatic-extraction-of-entities-and-relation
Repo
Framework

Attention-based Semantic Priming for Slot-filling

Title Attention-based Semantic Priming for Slot-filling
Authors Jiewen Wu, Rafael E. Banchs, Luis Fernando D'Haro, Pavitra Krishnaswamy, Nancy Chen
Abstract The problem of sequence labelling in language understanding would benefit from approaches inspired by semantic priming phenomena. We propose that an attention-based RNN architecture can be used to simulate semantic priming for sequence labelling. Specifically, we employ pre-trained word embeddings to characterize the semantic relationship between utterances and labels. We validate the approach using varying sizes of the ATIS and MEDIA datasets, and show up to 1.4-1.9% improvement in F1 score. The developed framework can enable more explainable and generalizable spoken language understanding systems.
Tasks Slot Filling, Spoken Language Understanding, Word Embeddings
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2404/
PDF https://www.aclweb.org/anthology/W18-2404
PWC https://paperswithcode.com/paper/attention-based-semantic-priming-for-slot
Repo
Framework

pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment

Title pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
Authors Je Hyeong Hong, Christopher Zach
Abstract Bundle adjustment is a nonlinear refinement method for camera poses and 3D structure requiring sufficiently good initialization. In recent years, it was experimentally observed that useful minima can be reached even from arbitrary initialization for affine bundle adjustment problems (and fixed-rank matrix factorization instances in general). The key success factor lies in the use of the variable projection (VarPro) method, which is known to have a wide basin of convergence for such problems. In this paper, we propose the Pseudo Object Space Error (pOSE), which is an objective with cameras represented as a hybrid between the affine and projective models. This formulation allows us to obtain 3D reconstructions that are close to the true projective reconstructions while retaining a bilinear problem structure suitable for the VarPro method. Experimental results show that pOSE yields faithful 3D reconstructions from random initializations with a high success rate, taking one step towards initialization-free structure from motion.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/pose-pseudo-object-space-error-for
Repo
Framework

Neural Machine Translation Techniques for Named Entity Transliteration

Title Neural Machine Translation Techniques for Named Entity Transliteration
Authors Roman Grundkiewicz, Kenneth Heafield
Abstract Transliterating named entities from one language into another can be approached as a neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models. To build a strong transliteration system, we apply well-established techniques from NMT, such as dropout regularization, model ensembling, rescoring with right-to-left models, and back-translation. Our submission to the NEWS 2018 Shared Task on Named Entity Transliteration ranked first in several tracks.
Tasks Automatic Post-Editing, Grammatical Error Correction, Machine Translation, Transliteration
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2413/
PDF https://www.aclweb.org/anthology/W18-2413
PWC https://paperswithcode.com/paper/neural-machine-translation-techniques-for
Repo
Framework

Proceedings of Workshop for NLP Open Source Software (NLP-OSS)

Title Proceedings of Workshop for NLP Open Source Software (NLP-OSS)
Authors
Abstract
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-2500/
PDF https://www.aclweb.org/anthology/W18-2500
PWC https://paperswithcode.com/paper/proceedings-of-workshop-for-nlp-open-source
Repo
Framework

The Principle of Logit Separation

Title The Principle of Logit Separation
Authors Gil Keren, Sivan Sabato, Björn Schuller
Abstract We consider neural network training in applications in which there are many possible classes, but at test-time, the task is to identify only whether the given example belongs to a specific class, which can be different in different applications of the classifier. For instance, this is the case in an image search engine. We consider the Single Logit Classification (SLC) task: training the network so that at test-time, it would be possible to accurately identify if the example belongs to a given class, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC task. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that indeed in almost all cases, losses that are aligned with the Principle of Logit Separation obtain a 20%-35% relative performance improvement in the SLC task, compared to losses that are not aligned with it. We therefore conclude that the Principle of Logit Separation sheds light on an important property of the most common loss functions used by neural network classifiers.
Tasks Image Retrieval
Published 2018-01-01
URL https://openreview.net/forum?id=r1BRfhiab
PDF https://openreview.net/pdf?id=r1BRfhiab
PWC https://paperswithcode.com/paper/the-principle-of-logit-separation
Repo
Framework
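
The abstract above states that cross-entropy is not aligned with the Principle of Logit Separation but does not say why. One property that plausibly matters is that softmax cross-entropy is unchanged when every logit of an example is shifted by the same constant, so a low loss does not pin down the absolute value of any single logit. The sketch below is a toy illustration of that property only, not the paper's analysis or losses; the function names, the example logits, and the threshold of 0.0 are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example (log-sum-exp minus the true logit)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def single_logit_decision(logits, cls, threshold):
    """SLC-style test: look only at the logit of the queried class (hypothetical rule)."""
    return logits[cls] > threshold


# Two examples whose true class is 0. The second has every logit shifted by -5,
# so both have the same softmax output and the same cross-entropy loss.
example_a = [2.0, 0.0, -1.0]
example_b = [z - 5.0 for z in example_a]

loss_a = cross_entropy(example_a, 0)
loss_b = cross_entropy(example_b, 0)

# Same loss, but a fixed threshold on the class-0 logit treats them differently.
decision_a = single_logit_decision(example_a, 0, threshold=0.0)
decision_b = single_logit_decision(example_b, 0, threshold=0.0)
```

Here both examples are classified equally well by the usual arg-max rule, yet a single fixed threshold on the class-0 logit accepts one and rejects the other.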

Deeply Learned Filter Response Functions for Hyperspectral Reconstruction

Title Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
Authors Shijie Nie, Lin Gu, Yinqiang Zheng, Antony Lam, Nobutaka Ono, Imari Sato
Abstract Hyperspectral reconstruction from RGB imaging has recently achieved significant progress via sparse coding and deep learning. However, a largely ignored fact is that existing RGB cameras are tuned to mimic human trichromatic perception, thus their spectral responses are not necessarily optimal for hyperspectral reconstruction. In this paper, rather than use RGB spectral responses, we simultaneously learn optimized camera spectral response functions (to be implemented in hardware) and a mapping for spectral reconstruction by using an end-to-end network. Our core idea is that since camera spectral filters act in effect like the convolution layer, their response functions could be optimized by training standard neural networks. We propose two types of designed filters: a three-chip setup without spatial mosaicing and a single-chip setup with a Bayer-style 2x2 filter array. Numerical simulations verify the advantages of deeply learned spectral responses compared to existing RGB cameras. More interestingly, by considering physical restrictions in the design process, we are able to realize the deeply learned spectral response functions by using modern film filter production technologies, and thus construct data-inspired multispectral cameras for snapshot hyperspectral imaging.
Tasks
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Nie_Deeply_Learned_Filter_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Nie_Deeply_Learned_Filter_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/deeply-learned-filter-response-functions-for
Repo
Framework

Interpretable Video Captioning via Trajectory Structured Localization

Title Interpretable Video Captioning via Trajectory Structured Localization
Authors Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin
Abstract Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence. Most existing methods simply borrow ideas from image captioning and obtain a compact video representation from an ensemble of global image features before feeding it to an RNN decoder which outputs a sentence of variable length. However, it is arduous for the generator to focus on specific salient objects at different times given the global video representation, and it is even harder to capture the fine-grained motion information and the relation between moving instances for more subtle linguistic descriptions. In this paper, we propose a Trajectory Structured Attentional Encoder-Decoder (TSA-ED) neural network framework for more elaborate video captioning which works by integrating local spatial-temporal representation at trajectory level through a structured attention mechanism. Our proposed method is based on an LSTM-based encoder-decoder framework, which incorporates an attention modeling scheme to adaptively learn the correlation between sentence structure and the moving objects in videos, and consequently generates more accurate and meticulous statement description in the decoding stage. Experimental results demonstrate that the feature representation and structured attention mechanism based on the trajectory cluster can efficiently obtain the local motion information in the video to help generate a more fine-grained video description, and achieve the state-of-the-art performance on the well-known Charades and MSVD datasets.
Tasks Image Captioning, Video Captioning, Video Description
Published 2018-06-01
URL http://openaccess.thecvf.com/content_cvpr_2018/html/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.html
PDF http://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.pdf
PWC https://paperswithcode.com/paper/interpretable-video-captioning-via-trajectory
Repo
Framework

Iterative Value-Aware Model Learning

Title Iterative Value-Aware Model Learning
Authors Amir-Massoud Farahmand
Abstract This paper introduces a model-based reinforcement learning (MBRL) framework that incorporates the underlying decision problem in learning the transition model of the environment. This is in contrast with conventional approaches to MBRL that learn the model of the environment, for example by finding the maximum likelihood estimate, without taking into account the decision problem. The Value-Aware Model Learning (VAML) framework argues that this might not be a good idea, especially if the true model of the environment does not belong to the model class from which we are estimating the model. The original VAML framework, however, may result in an optimization problem that is difficult to solve. This paper introduces a new MBRL class of algorithms, called Iterative VAML, that benefits from the structure of how the planning is performed (i.e., through approximate value iteration) to devise a simpler optimization problem. The paper theoretically analyzes Iterative VAML and provides a finite-sample error upper bound guarantee for it.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning
PDF http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning.pdf
PWC https://paperswithcode.com/paper/iterative-value-aware-model-learning
Repo
Framework

Deep Attentive Sentence Ordering Network

Title Deep Attentive Sentence Ordering Network
Authors Baiyun Cui, Yingming Li, Ming Chen, Zhongfei Zhang
Abstract In this paper, we propose a novel deep attentive sentence ordering network (referred to as ATTOrderNet) which integrates a self-attention mechanism with LSTMs in the encoding of input sentences. It enables us to capture global dependencies among sentences regardless of their input order and obtains a reliable representation of the sentence set. With this representation, a pointer network is exploited to generate an ordered sequence. The proposed model is evaluated on Sentence Ordering and Order Discrimination tasks. The extensive experimental results demonstrate its effectiveness and superiority to the state-of-the-art methods.
Tasks Concept-To-Text Generation, Document Summarization, Feature Engineering, Multi-Document Summarization, Question Answering, Sentence Ordering, Text Generation
Published 2018-10-01
URL https://www.aclweb.org/anthology/D18-1465/
PDF https://www.aclweb.org/anthology/D18-1465
PWC https://paperswithcode.com/paper/deep-attentive-sentence-ordering-network
Repo
Framework

On the Power of Over-parametrization in Neural Networks with Quadratic Activation

Title On the Power of Over-parametrization in Neural Networks with Quadratic Activation
Authors Simon Du, Jason Lee
Abstract We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a shallow network with $k$ hidden nodes, quadratic activation, and $n$ training data points, we show that as long as $k \ge \sqrt{2n}$, over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, although the number of parameters may exceed the sample size, using the theory of Rademacher complexity, we show that with weight decay, the solution also generalizes well if the data is sampled from a regular distribution such as Gaussian. To prove that when $k \ge \sqrt{2n}$ the loss function has benign landscape properties, we adopt an idea from smoothed analysis, which may have other applications in studying loss surfaces of neural networks.
Tasks
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1923
PDF http://proceedings.mlr.press/v80/du18a/du18a.pdf
PWC https://paperswithcode.com/paper/on-the-power-of-over-parametrization-in
Repo
Framework
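
To make the width condition $k \ge \sqrt{2n}$ from the abstract above concrete, the following is a small arithmetic sketch that returns the smallest integer number of hidden nodes satisfying it for a given number of training points. The function name `min_hidden_nodes` is hypothetical, and the sketch covers only the inequality itself, not the paper's other assumptions.

```python
import math


def min_hidden_nodes(n):
    """Smallest integer k with k >= sqrt(2n), computed with integer arithmetic."""
    if n < 0:
        raise ValueError("n must be non-negative")
    target = 2 * n
    k = math.isqrt(target)   # floor of sqrt(2n)
    if k * k < target:       # round up when 2n is not a perfect square
        k += 1
    return k


# Width needed for a few training-set sizes.
widths = {n: min_hidden_nodes(n) for n in (1, 50, 51, 5000)}
```

For example, 50 training points give $2n = 100$, so 10 hidden nodes suffice, while 5000 training points require only 100.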

Appraise Evaluation Framework for Machine Translation

Title Appraise Evaluation Framework for Machine Translation
Authors Christian Federmann
Abstract We present Appraise, an open-source framework for crowd-based annotation tasks, notably for evaluation of machine translation output. This is the software used to run the yearly evaluation campaigns for shared tasks at the WMT Conference on Machine Translation. It has also been used at IWSLT 2017 and, recently, to measure human parity for machine translation for Chinese to English news text. The demo will present the full end-to-end lifecycle of an Appraise evaluation campaign, from task creation to annotation and interpretation of results.
Tasks Machine Translation
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-2019/
PDF https://www.aclweb.org/anthology/C18-2019
PWC https://paperswithcode.com/paper/appraise-evaluation-framework-for-machine
Repo
Framework

An Application for Building a Polish Telephone Speech Corpus

Title An Application for Building a Polish Telephone Speech Corpus
Authors Bartosz Ziółko, Piotr Żelasko, Ireneusz Gawlik, Tomasz Pędzimąż, Tomasz Jadczyk
Abstract
Tasks Speech Recognition
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1066/
PDF https://www.aclweb.org/anthology/L18-1066
PWC https://paperswithcode.com/paper/an-application-for-building-a-polish
Repo
Framework

Dependent Relational Gamma Process Models for Longitudinal Networks

Title Dependent Relational Gamma Process Models for Longitudinal Networks
Authors Sikun Yang, Heinz Koeppl
Abstract A probabilistic framework based on the covariate-dependent relational gamma process is developed to analyze relational data arising from longitudinal networks. The proposed framework characterizes networked nodes by nonnegative node-group memberships, which allow each node to belong to multiple latent groups simultaneously, and encodes edge probabilities between each pair of nodes using a Bernoulli Poisson link to the embedded latent space. Within the latent space, our framework models the birth and death dynamics of individual groups via a thinning function. Our framework also captures the evolution of individual node-group memberships over time using gamma Markov processes. Exploiting the recent advances in data augmentation and marginalization techniques, a simple and efficient Gibbs sampler is proposed for posterior computation. Experimental results on a simulation study and three real-world temporal network data sets demonstrate the model’s capability, competitive performance and scalability compared to state-of-the-art methods.
Tasks Data Augmentation
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=1942
PDF http://proceedings.mlr.press/v80/yang18b/yang18b.pdf
PWC https://paperswithcode.com/paper/dependent-relational-gamma-process-models-for
Repo
Framework
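
The abstract above mentions a Bernoulli-Poisson link for edge probabilities. Under that link, an edge is present when a latent Poisson count is at least one, so the edge probability is $1 - e^{-\lambda}$ for a nonnegative rate $\lambda$. The sketch below is a minimal illustration; building the rate as a weighted product of two nodes' group memberships is an assumption made here for illustration, and the names `edge_probability` and `group_weights` are hypothetical rather than taken from the paper.

```python
import math


def edge_probability(memberships_i, memberships_j, group_weights):
    """P(edge) = 1 - exp(-rate) under a Bernoulli-Poisson link.

    The rate is assumed (for illustration) to be the sum over latent groups of
    weight * membership_i * membership_j, with all inputs nonnegative.
    """
    rate = sum(
        w * a * b
        for w, a, b in zip(group_weights, memberships_i, memberships_j)
    )
    # -expm1(-rate) equals 1 - exp(-rate) but is more accurate for small rates.
    return -math.expm1(-rate)


# Nodes that share no group get probability 0; stronger overlap pushes it toward 1.
p_disjoint = edge_probability([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
p_overlap = edge_probability([1.0, 2.0], [1.0, 2.0], [1.0, 1.0])
```

Because the memberships are nonnegative, a node can belong to several groups at once, and each shared group can only raise the probability of an edge.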

STRUCTURED ALIGNMENT NETWORKS

Title STRUCTURED ALIGNMENT NETWORKS
Authors Yang Liu, Matt Gardner
Abstract Many tasks in natural language processing involve comparing two sentences to compute some notion of relevance, entailment, or similarity. Typically this comparison is done either at the word level or at the sentence level, with no attempt to leverage the inherent structure of the sentence. When sentence structure is used for comparison, it is obtained during a non-differentiable pre-processing step, leading to propagation of errors. We introduce a model of structured alignments between sentences, showing how to compare two sentences by matching their latent structures. Using a structured attention mechanism, our model matches possible spans in the first sentence to possible spans in the second sentence, simultaneously discovering the tree structure of each sentence and performing a comparison, in a model that is fully differentiable and is trained only on the comparison objective. We evaluate this model on two sentence comparison tasks: the Stanford natural language inference dataset and the TREC-QA dataset. We find that comparing spans results in superior performance to comparing words individually, and that the learned trees are consistent with actual linguistic structures.
Tasks Natural Language Inference
Published 2018-01-01
URL https://openreview.net/forum?id=Byht0GbRZ
PDF https://openreview.net/pdf?id=Byht0GbRZ
PWC https://paperswithcode.com/paper/structured-alignment-networks
Repo
Framework