Paper Group NANR 128
Automatic Extraction of Entities and Relation from Legal Documents
Title | Automatic Extraction of Entities and Relation from Legal Documents |
Authors | Judith Jeyafreeda Andrew |
Abstract | In recent years, journalists and computer scientists have been talking to each other to identify technologies that would help journalists extract useful information. This is called "computational journalism". In this paper, we present a method that enables journalists to automatically identify and annotate entities such as names of people, organizations, and the roles and functions of people in legal documents; the relationships between these entities are also explored. The system combines statistical and rule-based techniques. The statistical method is Conditional Random Fields; the rule-based technique uses document- and language-specific regular expressions. |
Tasks | Named Entity Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2401/ |
https://www.aclweb.org/anthology/W18-2401 | |
PWC | https://paperswithcode.com/paper/automatic-extraction-of-entities-and-relation |
Repo | |
Framework | |
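The abstract above pairs a CRF with document- and language-specific regular expressions. As a minimal sketch of the rule-based half only, the Python below tags a role followed by a person name. The role list, the name shape, and the function name `extract_role_person` are assumptions made for illustration; the abstract does not give the paper's actual rules.

```python
import re

# Hypothetical role vocabulary; the paper's real rules are not stated in the abstract.
ROLES = ("Judge", "Attorney", "Clerk")

# A role word followed by one or more capitalized tokens, treated as a person name.
PATTERN = re.compile(
    r"\b(?P<role>" + "|".join(ROLES) + r")\s+(?P<name>[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)"
)

def extract_role_person(text):
    """Return (role, name) pairs found by the illustrative pattern."""
    return [(m.group("role"), m.group("name")) for m in PATTERN.finditer(text)]

pairs = extract_role_person(
    "The motion was heard by Judge Maria Lopez and filed by Attorney John Smith."
)
```

In a combined system, matches like these could serve as features for a statistical tagger or as a high-precision override; which of the two the paper does is not stated here.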
Attention-based Semantic Priming for Slot-filling
Title | Attention-based Semantic Priming for Slot-filling |
Authors | Jiewen Wu, Rafael E. Banchs, Luis Fernando D'Haro, Pavitra Krishnaswamy, Nancy Chen |
Abstract | The problem of sequence labelling in language understanding would benefit from approaches inspired by semantic priming phenomena. We propose that an attention-based RNN architecture can be used to simulate semantic priming for sequence labelling. Specifically, we employ pre-trained word embeddings to characterize the semantic relationship between utterances and labels. We validate the approach using varying sizes of the ATIS and MEDIA datasets, and show up to 1.4-1.9% improvement in F1 score. The developed framework can enable more explainable and generalizable spoken language understanding systems. |
Tasks | Slot Filling, Spoken Language Understanding, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2404/ |
https://www.aclweb.org/anthology/W18-2404 | |
PWC | https://paperswithcode.com/paper/attention-based-semantic-priming-for-slot |
Repo | |
Framework | |
pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
Title | pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment |
Authors | Je Hyeong Hong, Christopher Zach |
Abstract | Bundle adjustment is a nonlinear refinement method for camera poses and 3D structure requiring sufficiently good initialization. In recent years, it was experimentally observed that useful minima can be reached even from arbitrary initialization for affine bundle adjustment problems (and fixed-rank matrix factorization instances in general). The key success factor lies in the use of the variable projection (VarPro) method, which is known to have a wide basin of convergence for such problems. In this paper, we propose the Pseudo Object Space Error (pOSE), which is an objective with cameras represented as a hybrid between the affine and projective models. This formulation allows us to obtain 3D reconstructions that are close to the true projective reconstructions while retaining a bilinear problem structure suitable for the VarPro method. Experimental results show that using pOSE has a high success rate to yield faithful 3D reconstructions from random initializations, taking one step towards initialization-free structure from motion. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Hong_pOSE_Pseudo_Object_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/pose-pseudo-object-space-error-for |
Repo | |
Framework | |
Neural Machine Translation Techniques for Named Entity Transliteration
Title | Neural Machine Translation Techniques for Named Entity Transliteration |
Authors | Roman Grundkiewicz, Kenneth Heafield |
Abstract | Transliterating named entities from one language into another can be approached as a neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models. To build a strong transliteration system, we apply well-established techniques from NMT, such as dropout regularization, model ensembling, rescoring with right-to-left models, and back-translation. Our submission to the NEWS 2018 Shared Task on Named Entity Transliteration ranked first in several tracks. |
Tasks | Automatic Post-Editing, Grammatical Error Correction, Machine Translation, Transliteration |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2413/ |
https://www.aclweb.org/anthology/W18-2413 | |
PWC | https://paperswithcode.com/paper/neural-machine-translation-techniques-for |
Repo | |
Framework | |
Proceedings of Workshop for NLP Open Source Software (NLP-OSS)
Title | Proceedings of Workshop for NLP Open Source Software (NLP-OSS) |
Authors | |
Abstract | |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/W18-2500/ |
https://www.aclweb.org/anthology/W18-2500 | |
PWC | https://paperswithcode.com/paper/proceedings-of-workshop-for-nlp-open-source |
Repo | |
Framework | |
The Principle of Logit Separation
Title | The Principle of Logit Separation |
Authors | Gil Keren, Sivan Sabato, Björn Schuller |
Abstract | We consider neural network training, in applications in which there are many possible classes, but at test-time, the task is to identify only whether the given example belongs to a specific class, which can be different in different applications of the classifier. For instance, this is the case in an image search engine. We consider the Single Logit Classification (SLC) task: training the network so that at test-time, it would be possible to accurately identify if the example belongs to a given class, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC task. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that in almost all cases, losses that are aligned with the Principle of Logit Separation obtain a 20%-35% relative performance improvement in the SLC task, compared to losses that are not aligned with it. We therefore conclude that the Principle of Logit Separation sheds light on an important property of the most common loss functions used by neural network classifiers. |
Tasks | Image Retrieval |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=r1BRfhiab |
https://openreview.net/pdf?id=r1BRfhiab | |
PWC | https://paperswithcode.com/paper/the-principle-of-logit-separation |
Repo | |
Framework | |
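One simple way to see why a single logit can be uninformative under softmax cross-entropy: the loss depends only on differences between logits within an example, so shifting every logit of an example by a constant leaves the loss unchanged while moving the target logit arbitrarily. The sketch below illustrates that property with made-up numbers; it is an illustration, not the paper's formal argument, and `cross_entropy` is a helper written for this example.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed with the log-sum-exp trick."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[target]

logits = [2.0, 0.5, -1.0]
shifted = [z + 10.0 for z in logits]  # same example, every logit raised by a constant

loss_a = cross_entropy(logits, target=0)
loss_b = cross_entropy(shifted, target=0)
# The two losses agree up to rounding, although the target logit moved from 2.0 to 12.0,
# so a fixed threshold on that one logit would treat the two cases differently.
```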
Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
Title | Deeply Learned Filter Response Functions for Hyperspectral Reconstruction |
Authors | Shijie Nie, Lin Gu, Yinqiang Zheng, Antony Lam, Nobutaka Ono, Imari Sato |
Abstract | Hyperspectral reconstruction from RGB imaging has recently achieved significant progress via sparse coding and deep learning. However, a largely ignored fact is that existing RGB cameras are tuned to mimic human trichromatic perception, thus their spectral responses are not necessarily optimal for hyperspectral reconstruction. In this paper, rather than use RGB spectral responses, we simultaneously learn optimized camera spectral response functions (to be implemented in hardware) and a mapping for spectral reconstruction by using an end-to-end network. Our core idea is that since camera spectral filters act in effect like the convolution layer, their response functions could be optimized by training standard neural networks. We propose two types of designed filters: a three-chip setup without spatial mosaicing and a single-chip setup with a Bayer-style 2x2 filter array. Numerical simulations verify the advantages of deeply learned spectral responses compared to existing RGB cameras. More interestingly, by considering physical restrictions in the design process, we are able to realize the deeply learned spectral response functions by using modern film filter production technologies, and thus construct data-inspired multispectral cameras for snapshot hyperspectral imaging. |
Tasks | |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Nie_Deeply_Learned_Filter_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Nie_Deeply_Learned_Filter_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/deeply-learned-filter-response-functions-for |
Repo | |
Framework | |
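The claim that camera spectral filters act like a convolution layer can be made concrete: for one pixel, each filter measurement is a weighted sum of the incoming spectrum over wavelength bands, which is what a 1x1 convolution across spectral channels computes. The sketch below assumes that 1x1 interpretation; the band count, response curves, and the name `apply_filters` are hypothetical.

```python
def apply_filters(spectrum, responses):
    """Project one pixel's spectrum onto each filter response curve.

    Each output is a dot product over wavelength bands, i.e. one output
    channel of a 1x1 convolution whose weights are the response curve.
    """
    return [sum(r * s for r, s in zip(resp, spectrum)) for resp in responses]

# Hypothetical 4-band spectrum and 3 filter response curves (made-up numbers).
spectrum = [0.2, 0.4, 0.1, 0.3]
responses = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
measurements = apply_filters(spectrum, responses)
```

Treating `responses` as trainable weights is what would let the response curves be learned jointly with a reconstruction network; a physically realizable filter would presumably also need constraints such as non-negativity, which this sketch does not enforce.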
Interpretable Video Captioning via Trajectory Structured Localization
Title | Interpretable Video Captioning via Trajectory Structured Localization |
Authors | Xian Wu, Guanbin Li, Qingxing Cao, Qingge Ji, Liang Lin |
Abstract | Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence. Most existing methods simply borrow ideas from image captioning and obtain a compact video representation from an ensemble of global image features before feeding it to an RNN decoder which outputs a sentence of variable length. However, it is not only arduous for the generator to focus on specific salient objects at different times given the global video representation, it is even more formidable to capture the fine-grained motion information and the relation between moving instances for more subtle linguistic descriptions. In this paper, we propose a Trajectory Structured Attentional Encoder-Decoder (TSA-ED) neural network framework for more elaborate video captioning which works by integrating local spatial-temporal representation at trajectory level through a structured attention mechanism. Our proposed method is based on an LSTM-based encoder-decoder framework, which incorporates an attention modeling scheme to adaptively learn the correlation between sentence structure and the moving objects in videos, and consequently generates more accurate and meticulous descriptions in the decoding stage. Experimental results demonstrate that the feature representation and structured attention mechanism based on the trajectory cluster can efficiently obtain the local motion information in the video to help generate a more fine-grained video description, and achieve state-of-the-art performance on the well-known Charades and MSVD datasets. |
Tasks | Image Captioning, Video Captioning, Video Description |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.html |
http://openaccess.thecvf.com/content_cvpr_2018/papers/Wu_Interpretable_Video_Captioning_CVPR_2018_paper.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-video-captioning-via-trajectory |
Repo | |
Framework | |
Iterative Value-Aware Model Learning
Title | Iterative Value-Aware Model Learning |
Authors | Amir-Massoud Farahmand |
Abstract | This paper introduces a model-based reinforcement learning (MBRL) framework that incorporates the underlying decision problem in learning the transition model of the environment. This is in contrast with conventional approaches to MBRL that learn the model of the environment, for example by finding the maximum likelihood estimate, without taking into account the decision problem. Value-Aware Model Learning (VAML) framework argues that this might not be a good idea, especially if the true model of the environment does not belong to the model class from which we are estimating the model. The original VAML framework, however, may result in an optimization problem that is difficult to solve. This paper introduces a new MBRL class of algorithms, called Iterative VAML, that benefits from the structure of how the planning is performed (i.e., through approximate value iteration) to devise a simpler optimization problem. The paper theoretically analyzes Iterative VAML and provides finite sample error upper bound guarantee for it. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning |
http://papers.nips.cc/paper/8121-iterative-value-aware-model-learning.pdf | |
PWC | https://paperswithcode.com/paper/iterative-value-aware-model-learning |
Repo | |
Framework | |
Deep Attentive Sentence Ordering Network
Title | Deep Attentive Sentence Ordering Network |
Authors | Baiyun Cui, Yingming Li, Ming Chen, Zhongfei Zhang |
Abstract | In this paper, we propose a novel deep attentive sentence ordering network (referred to as ATTOrderNet) which integrates a self-attention mechanism with LSTMs in the encoding of input sentences. It enables us to capture global dependencies among sentences regardless of their input order and obtains a reliable representation of the sentence set. With this representation, a pointer network is exploited to generate an ordered sequence. The proposed model is evaluated on Sentence Ordering and Order Discrimination tasks. The extensive experimental results demonstrate its effectiveness and superiority to the state-of-the-art methods. |
Tasks | Concept-To-Text Generation, Document Summarization, Feature Engineering, Multi-Document Summarization, Question Answering, Sentence Ordering, Text Generation |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1465/ |
https://www.aclweb.org/anthology/D18-1465 | |
PWC | https://paperswithcode.com/paper/deep-attentive-sentence-ordering-network |
Repo | |
Framework | |
On the Power of Over-parametrization in Neural Networks with Quadratic Activation
Title | On the Power of Over-parametrization in Neural Networks with Quadratic Activation |
Authors | Simon Du, Jason Lee |
Abstract | We provide new theoretical insights on why over-parametrization is effective in learning neural networks. For a shallow network with $k$ hidden nodes, quadratic activation, and $n$ training data points, we show that as long as $k \ge \sqrt{2n}$, over-parametrization enables local search algorithms to find a globally optimal solution for general smooth and convex loss functions. Further, although the number of parameters may exceed the sample size, using the theory of Rademacher complexity, we show that with weight decay, the solution also generalizes well if the data is sampled from a regular distribution such as a Gaussian. To prove that when $k \ge \sqrt{2n}$ the loss function has benign landscape properties, we adopt an idea from smoothed analysis, which may have other applications in studying loss surfaces of neural networks. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1923 |
http://proceedings.mlr.press/v80/du18a/du18a.pdf | |
PWC | https://paperswithcode.com/paper/on-the-power-of-over-parametrization-in |
Repo | |
Framework | |
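As a worked instance of the width condition $k \ge \sqrt{2n}$ from the abstract above, the sketch below computes the smallest integer $k$ that satisfies it for a given number of training points $n$. The function name `min_hidden_nodes` is introduced here for illustration; integer arithmetic is used to avoid floating-point rounding at perfect squares.

```python
import math

def min_hidden_nodes(n):
    """Smallest integer k with k >= sqrt(2n), i.e. k*k >= 2n, using exact integer math."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # ceil(sqrt(N)) for a positive integer N equals isqrt(N - 1) + 1.
    return math.isqrt(2 * n - 1) + 1

k_50 = min_hidden_nodes(50)      # 2n = 100, so k = 10
k_51 = min_hidden_nodes(51)      # 2n = 102, so k = 11
k_5000 = min_hidden_nodes(5000)  # 2n = 10000, so k = 100
```

The required width grows only with the square root of the sample size, so 5000 training points call for 100 hidden nodes under this condition.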
Appraise Evaluation Framework for Machine Translation
Title | Appraise Evaluation Framework for Machine Translation |
Authors | Christian Federmann |
Abstract | We present Appraise, an open-source framework for crowd-based annotation tasks, notably for evaluation of machine translation output. This is the software used to run the yearly evaluation campaigns for shared tasks at the WMT Conference on Machine Translation. It has also been used at IWSLT 2017 and, recently, to measure human parity for machine translation for Chinese to English news text. The demo will present the full end-to-end lifecycle of an Appraise evaluation campaign, from task creation to annotation and interpretation of results. |
Tasks | Machine Translation |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-2019/ |
https://www.aclweb.org/anthology/C18-2019 | |
PWC | https://paperswithcode.com/paper/appraise-evaluation-framework-for-machine |
Repo | |
Framework | |
An Application for Building a Polish Telephone Speech Corpus
Title | An Application for Building a Polish Telephone Speech Corpus |
Authors | Bartosz Ziółko, Piotr Żelasko, Ireneusz Gawlik, Tomasz Pędzimąż, Tomasz Jadczyk |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1066/ |
https://www.aclweb.org/anthology/L18-1066 | |
PWC | https://paperswithcode.com/paper/an-application-for-building-a-polish |
Repo | |
Framework | |
Dependent Relational Gamma Process Models for Longitudinal Networks
Title | Dependent Relational Gamma Process Models for Longitudinal Networks |
Authors | Sikun Yang, Heinz Koeppl |
Abstract | A probabilistic framework based on the covariate-dependent relational gamma process is developed to analyze relational data arising from longitudinal networks. The proposed framework characterizes networked nodes by nonnegative node-group memberships, which allow each node to belong to multiple latent groups simultaneously, and encodes edge probabilities between each pair of nodes using a Bernoulli Poisson link to the embedded latent space. Within the latent space, our framework models the birth and death dynamics of individual groups via a thinning function. Our framework also captures the evolution of individual node-group memberships over time using gamma Markov processes. Exploiting the recent advances in data augmentation and marginalization techniques, a simple and efficient Gibbs sampler is proposed for posterior computation. Experimental results on a simulation study and three real-world temporal network data sets demonstrate the model’s capability, competitive performance and scalability compared to state-of-the-art methods. |
Tasks | Data Augmentation |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1942 |
http://proceedings.mlr.press/v80/yang18b/yang18b.pdf | |
PWC | https://paperswithcode.com/paper/dependent-relational-gamma-process-models-for |
Repo | |
Framework | |
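The Bernoulli-Poisson link mentioned in the abstract above maps a nonnegative rate to an edge probability via p = 1 - exp(-rate). The sketch below assumes the rate is a weighted inner product of the two nodes' nonnegative group memberships, a common choice for this link; the abstract does not state the paper's exact parameterization, so the rate form and the name `edge_probability` are assumptions.

```python
import math

def edge_probability(phi_i, phi_j, weights):
    """Bernoulli-Poisson link: probability that a Poisson count with the given rate is nonzero.

    phi_i, phi_j: nonnegative memberships of the two nodes in each latent group.
    weights: nonnegative per-group weights (assumed form, not taken from the paper).
    """
    rate = sum(a * b * w for a, b, w in zip(phi_i, phi_j, weights))
    return 1.0 - math.exp(-rate)

p_shared = edge_probability([1.0, 0.0], [1.0, 0.0], [2.0, 1.0])    # both nodes in group 0
p_disjoint = edge_probability([1.0, 0.0], [0.0, 1.0], [2.0, 1.0])  # no shared group
```

Nodes with no shared group get probability zero, and the probability rises toward one as shared memberships grow, which is why nonnegative memberships suit this link.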
STRUCTURED ALIGNMENT NETWORKS
Title | STRUCTURED ALIGNMENT NETWORKS |
Authors | Yang Liu, Matt Gardner |
Abstract | Many tasks in natural language processing involve comparing two sentences to compute some notion of relevance, entailment, or similarity. Typically this comparison is done either at the word level or at the sentence level, with no attempt to leverage the inherent structure of the sentence. When sentence structure is used for comparison, it is obtained during a non-differentiable pre-processing step, leading to propagation of errors. We introduce a model of structured alignments between sentences, showing how to compare two sentences by matching their latent structures. Using a structured attention mechanism, our model matches possible spans in the first sentence to possible spans in the second sentence, simultaneously discovering the tree structure of each sentence and performing a comparison, in a model that is fully differentiable and is trained only on the comparison objective. We evaluate this model on two sentence comparison tasks: the Stanford natural language inference dataset and the TREC-QA dataset. We find that comparing spans results in superior performance to comparing words individually, and that the learned trees are consistent with actual linguistic structures. |
Tasks | Natural Language Inference |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Byht0GbRZ |
https://openreview.net/pdf?id=Byht0GbRZ | |
PWC | https://paperswithcode.com/paper/structured-alignment-networks |
Repo | |
Framework | |