January 25, 2020

2411 words 12 mins read

Paper Group NANR 48

Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering. Linked Open Treebanks. Interlinking Syntactically Annotated Corpora in the LiLa Knowledge Base of Linguistic Resources for Latin. Noun Phrases Rooted by Adjectives: A Dependency Grammar Analysis of the Big Mess Construction. Exceptive constructions. A Dependency-base …

Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering

Title Enhancing Key-Value Memory Neural Networks for Knowledge Based Question Answering
Authors Kun Xu, Yuxuan Lai, Yansong Feng, Zhiguo Wang
Abstract Traditional Key-Value Memory Neural Networks (KV-MemNNs) have proved effective for shallow reasoning over a collection of documents in domain-specific Question Answering or Reading Comprehension tasks. However, extending KV-MemNNs to Knowledge Based Question Answering (KB-QA) is not trivial: a complex question must be properly decomposed into a sequence of queries against the memory, and the query representations must be updated to support multi-hop reasoning over the memory. In this paper, we propose a novel mechanism that enables conventional KV-MemNN models to perform interpretable reasoning for complex questions. To achieve this, we design a new query-updating strategy that masks previously addressed memory information from the query representations, and introduce a novel STOP strategy to avoid invalid or repeated memory reading without strong annotation signals. This also enables KV-MemNNs to produce structured queries and work in a semantic parsing fashion. Experimental results on benchmark datasets show that our solution, trained with question-answer pairs only, provides conventional KV-MemNN models with better reasoning abilities on complex questions and achieves state-of-the-art performance.
Tasks Question Answering, Reading Comprehension, Semantic Parsing
Published 2019-06-01
URL https://www.aclweb.org/anthology/N19-1301/
PDF https://www.aclweb.org/anthology/N19-1301
PWC https://paperswithcode.com/paper/enhancing-key-value-memory-neural-networks
Repo
Framework
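
The query-update and STOP ideas from the abstract above can be sketched compactly. Below is a minimal, hypothetical PyTorch illustration of a single reading hop over a key-value memory: attend over the keys, read the values, strip the just-read information from the query, and emit a STOP probability. The module name, shapes, and the exact form of the update are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KVHop(nn.Module):
    """One illustrative reading hop over a key-value memory (sketch only)."""

    def __init__(self, dim):
        super().__init__()
        self.stop_gate = nn.Linear(dim, 1)  # hypothetical STOP scorer

    def forward(self, query, keys, values):
        # query: (B, D); keys, values: (B, S, D)
        scores = torch.einsum("bd,bsd->bs", query, keys)
        attn = F.softmax(scores, dim=-1)                 # address the memory
        read = torch.einsum("bs,bsd->bd", attn, values)  # read the values

        # "Mask out" the addressed information from the query so the next hop
        # is pushed toward unread memory (an assumed, simplified update).
        new_query = query - read

        # STOP probability: if high, the controller halts multi-hop reading.
        p_stop = torch.sigmoid(self.stop_gate(new_query)).squeeze(-1)
        return new_query, read, p_stop
```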

Linked Open Treebanks. Interlinking Syntactically Annotated Corpora in the LiLa Knowledge Base of Linguistic Resources for Latin

Title Linked Open Treebanks. Interlinking Syntactically Annotated Corpora in the LiLa Knowledge Base of Linguistic Resources for Latin
Authors Francesco Mambrini, Marco Passarotti
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7808/
PDF https://www.aclweb.org/anthology/W19-7808
PWC https://paperswithcode.com/paper/linked-open-treebanks-interlinking
Repo
Framework

Noun Phrases Rooted by Adjectives: A Dependency Grammar Analysis of the Big Mess Construction

Title Noun Phrases Rooted by Adjectives: A Dependency Grammar Analysis of the Big Mess Construction
Authors Timothy Osborne
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7707/
PDF https://www.aclweb.org/anthology/W19-7707
PWC https://paperswithcode.com/paper/noun-phrases-rooted-by-adjectives-a
Repo
Framework

Exceptive constructions. A Dependency-based Analysis

Title Exceptive constructions. A Dependency-based Analysis
Authors Mohamed Galal, Sylvain Kahane, Yomna Safwat
Abstract
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-7720/
PDF https://www.aclweb.org/anthology/W19-7720
PWC https://paperswithcode.com/paper/exceptive-constructions-a-dependency-based
Repo
Framework

Ellipsis in Chinese AMR Corpus

Title Ellipsis in Chinese AMR Corpus
Authors Yihuan Liu, Bin Li, Peiyi Yan, Li Song, Weiguang Qu
Abstract Ellipsis is very common in language, and natural language processing needs to restore the elided elements in a sentence. However, only a few corpora annotate ellipsis, which holds back its automatic detection and recovery. This paper introduces the annotation of ellipsis in Chinese sentences using Abstract Meaning Representation (AMR), a graph-based representation that provides a good mechanism for restoring elided elements manually. We annotate 5,000 sentences selected from the Chinese TreeBank (CTB). We find that 54.98% of the sentences contain ellipses; 92% of the ellipses are restored by copying the antecedents' concepts, and 12.9% of them are newly added concepts. In addition, we find that the elided element is a word or phrase in most cases, but is sometimes only the head of a phrase or part of a phrase, which makes automatic recovery of ellipsis rather hard.
Tasks
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-3310/
PDF https://www.aclweb.org/anthology/W19-3310
PWC https://paperswithcode.com/paper/ellipsis-in-chinese-amr-corpus
Repo
Framework

Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification

Title Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification
Authors Hao Guo, Kang Zheng, Xiaochuan Fan, Hongkai Yu, Song Wang
Abstract Human visual perception shows good consistency for many multi-label image classification tasks under certain spatial transforms, such as scaling, rotation, flipping and translation. This has motivated the data augmentation strategy widely used in CNN classifier training – transformed images are included for training under the assumption that they share the class labels of their original images. In this paper, we further propose the assumption of perceptual consistency of visual attention regions for classification under such transforms, i.e., the attention region for a given classification follows the same transform if the input image is spatially transformed. While the attention regions of CNN classifiers can be derived as an attention heatmap in the middle layers of the network, we find that their consistency under many transforms is not preserved. To address this problem, we propose a two-branch network that takes an original image and its transformed image as inputs and introduce a new attention consistency loss that measures the attention heatmap consistency between the two branches. This new loss is then combined with the multi-label image classification loss for network training. Experiments on three datasets verify the superiority of the proposed network, which achieves new state-of-the-art classification performance.
Tasks Data Augmentation, Image Classification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Guo_Visual_Attention_Consistency_Under_Image_Transforms_for_Multi-Label_Image_Classification_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Guo_Visual_Attention_Consistency_Under_Image_Transforms_for_Multi-Label_Image_Classification_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/visual-attention-consistency-under-image
Repo
Framework
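
The attention-consistency loss described above is straightforward to write down for a single transform. The sketch below assumes a horizontal flip and a plain L2 penalty between the two branches' attention heatmaps; it is an illustration of the idea, not the authors' code.

```python
import torch
import torch.nn.functional as F

def attention_consistency_loss(heatmap_orig, heatmap_flipped):
    """Consistency loss for a horizontal-flip transform (illustrative).

    heatmap_orig:    (B, C, H, W) attention heatmaps of the original images
    heatmap_flipped: (B, C, H, W) attention heatmaps of the flipped images
    """
    # The original branch's attention, flipped like the input, should match
    # the flipped branch's attention.
    target = torch.flip(heatmap_orig, dims=[-1])
    return F.mse_loss(heatmap_flipped, target)

def total_loss(logits_orig, logits_flipped, labels, h_orig, h_flipped, lam=1.0):
    """Multi-label BCE classification loss plus the consistency term."""
    cls = F.binary_cross_entropy_with_logits(logits_orig, labels) \
        + F.binary_cross_entropy_with_logits(logits_flipped, labels)
    return cls + lam * attention_consistency_loss(h_orig, h_flipped)
```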

Learning New Tricks From Old Dogs: Multi-Source Transfer Learning From Pre-Trained Networks

Title Learning New Tricks From Old Dogs: Multi-Source Transfer Learning From Pre-Trained Networks
Authors Joshua Lee, Prasanna Sattigeri, Gregory Wornell
Abstract The advent of deep learning algorithms for mobile devices and sensors has led to a dramatic expansion in the availability and number of systems trained on a wide range of machine learning tasks, creating a host of opportunities and challenges in the realm of transfer learning. Currently, most transfer learning methods require some kind of control over the systems learned, either by enforcing constraints during the source training, or through the use of a joint optimization objective between tasks that requires all data be co-located for training. However, for practical, privacy, or other reasons, in a variety of applications we may have no control over the individual source task training, nor access to source training samples. Instead we only have access to features pre-trained on such data as the output of “black boxes.” For such scenarios, we consider the multi-source learning problem of training a classifier using an ensemble of pre-trained neural networks for a set of classes that have not been observed by any of the source networks, and for which we have very few training samples. We show that by using these distributed networks as feature extractors, we can train an effective classifier in a computationally-efficient manner using tools from (nonlinear) maximal correlation analysis. In particular, we develop a method we refer to as maximal correlation weighting (MCW) to build the required target classifier from an appropriate weighting of the feature functions from the source networks. We illustrate the effectiveness of the resulting classifier on datasets derived from the CIFAR-100, Stanford Dogs, and Tiny ImageNet datasets, and, in addition, use the methodology to characterize the relative value of different source tasks in learning a target task.
Tasks Transfer Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/8688-learning-new-tricks-from-old-dogs-multi-source-transfer-learning-from-pre-trained-networks
PDF http://papers.nips.cc/paper/8688-learning-new-tricks-from-old-dogs-multi-source-transfer-learning-from-pre-trained-networks.pdf
PWC https://paperswithcode.com/paper/learning-new-tricks-from-old-dogs-multi
Repo
Framework
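
To make the black-box, few-shot setting concrete, here is a rough sketch of the general recipe: treat each pre-trained network as a frozen feature extractor, score each source by how strongly its features correlate with the few target labels, and combine per-source decisions with those scores as weights. The correlation score and the nearest-centroid classifier below are stand-ins chosen for brevity, not the paper's maximal correlation weighting (MCW) derivation.

```python
import numpy as np

def source_score(features, labels_onehot):
    """Crude per-source score: mean absolute Pearson correlation between each
    feature dimension and the one-hot target labels (a stand-in for MCW)."""
    F = features - features.mean(0)            # (n, d) centered features
    Y = labels_onehot - labels_onehot.mean(0)  # (n, K) centered labels
    denom = np.outer(F.std(0) + 1e-8, Y.std(0) + 1e-8) * len(F)
    corr = F.T @ Y / denom                     # (d, K) correlations
    return np.abs(corr).mean()

def predict(test_feats_per_source, centroids_per_source, weights):
    """Weighted nearest-centroid decision over the source feature spaces."""
    scores = 0.0
    for w, feats, cents in zip(weights, test_feats_per_source, centroids_per_source):
        d = ((feats[:, None, :] - cents[None, :, :]) ** 2).sum(-1)  # (N, K)
        scores = scores + w * (-d)
    return scores.argmax(-1)
```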

Distilling weighted finite automata from arbitrary probabilistic models

Title Distilling weighted finite automata from arbitrary probabilistic models
Authors Ananda Theertha Suresh, Brian Roark, Michael Riley, Vlad Schogol
Abstract Weighted finite automata (WFA) are often used to represent probabilistic models, such as n-gram language models, since they are efficient for recognition tasks in time and space. The probabilistic source to be represented as a WFA, however, may come in many forms. Given a generic probabilistic model over sequences, we propose an algorithm to approximate it as a weighted finite automaton such that the Kullback-Leibler divergence between the source model and the WFA target model is minimized. The proposed algorithm involves a counting step and a difference of convex optimization, both of which can be performed efficiently. We demonstrate the usefulness of our approach on some tasks including distilling n-gram models from neural models.
Tasks
Published 2019-09-01
URL https://www.aclweb.org/anthology/W19-3112/
PDF https://www.aclweb.org/anthology/W19-3112
PWC https://paperswithcode.com/paper/distilling-weighted-finite-automata-from
Repo
Framework
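
The counting step mentioned in the abstract can be illustrated by its simplest special case: sample sequences from the source model, accumulate n-gram counts, and read off a bigram automaton whose arc weights are smoothed conditional probabilities. The difference-of-convex optimization and the paper's actual KL-minimizing construction are omitted; the function below is a hypothetical sketch.

```python
from collections import Counter, defaultdict

def distill_bigram_wfa(sample_fn, num_samples=10000, alpha=0.1):
    """Approximate a generic sequence model with a bigram automaton.

    sample_fn() -> list of tokens drawn from the source probabilistic model.
    Returns {state: {symbol: probability}}, where each state is the previous
    token. Only the counting step is shown; smoothing is naive add-alpha.
    """
    bigram = defaultdict(Counter)
    vocab = set()
    for _ in range(num_samples):
        seq = ["<s>"] + sample_fn() + ["</s>"]
        vocab.update(seq)
        for prev, cur in zip(seq, seq[1:]):
            bigram[prev][cur] += 1

    wfa = {}
    for state, counts in bigram.items():
        total = sum(counts.values()) + alpha * len(vocab)
        wfa[state] = {sym: (counts[sym] + alpha) / total for sym in vocab}
    return wfa
```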

A Communication Efficient Stochastic Multi-Block Alternating Direction Method of Multipliers

Title A Communication Efficient Stochastic Multi-Block Alternating Direction Method of Multipliers
Authors Hao Yu
Abstract The alternating direction method of multipliers (ADMM) has recently received tremendous interest for distributed large-scale optimization in machine learning, statistics, multi-agent networks and related applications. In this paper, we propose a new parallel multi-block stochastic ADMM for distributed stochastic optimization, where each node is only required to perform simple stochastic gradient descent updates. The proposed ADMM is fully parallel, can solve problems with arbitrary block structures, and has a convergence rate comparable to or better than existing state-of-the-art ADMM methods for stochastic optimization. Existing stochastic (or deterministic) ADMMs require each node to exchange its updated primal variables across nodes at each iteration, incurring a significant amount of communication overhead; they require roughly the same number of inter-node communication rounds as in-node computation rounds. In contrast, the number of communication rounds required by our new ADMM is only the square root of the number of computation rounds.
Tasks Stochastic Optimization
Published 2019-12-01
URL http://papers.nips.cc/paper/9068-a-communication-efficient-stochastic-multi-block-alternating-direction-method-of-multipliers
PDF http://papers.nips.cc/paper/9068-a-communication-efficient-stochastic-multi-block-alternating-direction-method-of-multipliers.pdf
PWC https://paperswithcode.com/paper/a-communication-efficient-stochastic-multi
Repo
Framework
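
For context, the standard two-block ADMM iteration for minimizing f(x) + g(z) subject to Ax + Bz = c is shown below in its textbook form; the paper's contribution is a multi-block variant in which the exact primal minimizations are replaced by simple stochastic gradient updates and far fewer communication rounds are needed. The display is the generic scheme, not the paper's algorithm.

```latex
% Augmented Lagrangian and standard two-block ADMM updates (textbook form)
\begin{aligned}
L_\rho(x, z, y) &= f(x) + g(z) + y^\top (Ax + Bz - c)
                   + \tfrac{\rho}{2}\,\lVert Ax + Bz - c \rVert_2^2, \\
x^{k+1} &= \arg\min_x \, L_\rho(x, z^k, y^k), \\
z^{k+1} &= \arg\min_z \, L_\rho(x^{k+1}, z, y^k), \\
y^{k+1} &= y^k + \rho\,(A x^{k+1} + B z^{k+1} - c).
\end{aligned}
```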

A Contrastive Evaluation of Word Sense Disambiguation Systems for Finnish

Title A Contrastive Evaluation of Word Sense Disambiguation Systems for Finnish
Authors Frankie Robertson
Abstract
Tasks Word Sense Disambiguation
Published 2019-01-01
URL https://www.aclweb.org/anthology/W19-0304/
PDF https://www.aclweb.org/anthology/W19-0304
PWC https://paperswithcode.com/paper/a-contrastive-evaluation-of-word-sense
Repo
Framework

PolyU_CBS-CFA at the FinSBD Task: Sentence Boundary Detection of Financial Data with Domain Knowledge Enhancement and Bilingual Training

Title PolyU_CBS-CFA at the FinSBD Task: Sentence Boundary Detection of Financial Data with Domain Knowledge Enhancement and Bilingual Training
Authors Mingyu Wan, Rong Xiang, Emmanuele Chersoni, Natalia Klyueva, Kathleen Ahrens, Bin Miao, David Broadstock, Jian Kang, Amos Yung, Chu-Ren Huang
Abstract
Tasks Boundary Detection
Published 2019-08-01
URL https://www.aclweb.org/anthology/W19-5521/
PDF https://www.aclweb.org/anthology/W19-5521
PWC https://paperswithcode.com/paper/polyu_cbs-cfa-at-the-finsbd-task-sentence
Repo
Framework

Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification

Title Learning to Reduce Dual-Level Discrepancy for Infrared-Visible Person Re-Identification
Authors Zhixiang Wang, Zheng Wang, Yinqiang Zheng, Yung-Yu Chuang, Shin’ichi Satoh
Abstract Infrared-Visible person RE-IDentification (IV-REID) is an emerging task. Compared to conventional person re-identification (re-ID), IV-REID must additionally cope with the modality discrepancy originating from the different imaging processes of spectrum cameras, on top of the appearance discrepancy caused by viewpoint changes, pose variations and deformations present in the conventional re-ID task. The co-existing discrepancies make IV-REID more difficult to solve. Previous methods attempt to reduce the appearance and modality discrepancies simultaneously using feature-level constraints, but it is difficult to eliminate the mixed discrepancies with feature-level constraints alone. To address the problem, this paper introduces a novel Dual-level Discrepancy Reduction Learning (D^2RL) scheme that handles the two discrepancies separately. To reduce the modality discrepancy, an image-level sub-network is trained to translate an infrared image into its visible counterpart and a visible image into its infrared version. With the image-level sub-network, we can unify the representations for images of different modalities. With the help of the unified multi-spectral images, a feature-level sub-network is trained to reduce the remaining appearance discrepancy through feature embedding. By cascading the two sub-networks and training them jointly, the two levels of reduction carry out their roles cooperatively. Extensive experiments demonstrate that the proposed approach outperforms the state-of-the-art methods.
Tasks Person Re-Identification
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Learning_to_Reduce_Dual-Level_Discrepancy_for_Infrared-Visible_Person_Re-Identification_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Learning_to_Reduce_Dual-Level_Discrepancy_for_Infrared-Visible_Person_Re-Identification_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/learning-to-reduce-dual-level-discrepancy-for
Repo
Framework
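
The dual-level scheme amounts to cascading two sub-networks: an image-level translator that unifies the modalities and a feature-level embedding trained for re-ID. The sketch below only shows the wiring of that cascade with placeholder modules; the way the "unified" multi-spectral input is formed here (concatenating the real image with its translated counterpart) is an assumption for illustration, not the D^2RL design.

```python
import torch
import torch.nn as nn

class DualLevelReID(nn.Module):
    """Wiring sketch: image-level unification followed by feature embedding."""

    def __init__(self, ir_to_vis, vis_to_ir, embed_net):
        super().__init__()
        self.ir_to_vis = ir_to_vis  # translates infrared -> visible (placeholder)
        self.vis_to_ir = vis_to_ir  # translates visible  -> infrared (placeholder)
        self.embed_net = embed_net  # feature-level embedding sub-network

    def unify(self, image, modality):
        # Pair each real image with its translated counterpart so both
        # modalities end up in the same multi-spectral input space.
        if modality == "ir":
            return torch.cat([self.ir_to_vis(image), image], dim=1)
        return torch.cat([image, self.vis_to_ir(image)], dim=1)

    def forward(self, image, modality):
        return self.embed_net(self.unify(image, modality))
```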

Minimum Divergence vs. Maximum Margin: an Empirical Comparison on Seq2Seq Models

Title Minimum Divergence vs. Maximum Margin: an Empirical Comparison on Seq2Seq Models
Authors Huan Zhang, Hai Zhao
Abstract Sequence to sequence (seq2seq) models have become a popular framework for neural sequence prediction. While traditional seq2seq models are trained by Maximum Likelihood Estimation (MLE), much recent work has attempted to optimize evaluation scores directly to resolve the mismatch between training and evaluation, since model predictions are usually evaluated by a task-specific metric such as BLEU or ROUGE rather than perplexity. This paper is the first to organize this existing work into two categories: (a) minimum divergence and (b) maximum margin. We introduce a new training criterion based on the analysis of existing work and empirically compare models in the two categories. Our experimental results show that our new training criterion usually works better than existing methods on both machine translation and sentence summarization.
Tasks Machine Translation
Published 2019-05-01
URL https://openreview.net/forum?id=H1xD9sR5Fm
PDF https://openreview.net/pdf?id=H1xD9sR5Fm
PWC https://paperswithcode.com/paper/minimum-divergence-vs-maximum-margin-an
Repo
Framework
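
A representative member of the "minimum divergence" family is minimum risk training: renormalize the model's probabilities over a sampled candidate set and minimize the expected task cost under that distribution. The sketch below is that generic criterion, written to show the shape of such losses; it is not the specific criterion proposed in the paper.

```python
import torch

def minimum_risk_loss(candidate_logprobs, candidate_costs, temperature=1.0):
    """Expected cost under the model's renormalized candidate distribution.

    candidate_logprobs: (B, K) log p(y_k | x) for K sampled or beam candidates
    candidate_costs:    (B, K) task cost, e.g. 1 - sentence-BLEU(y_k, y*)
    """
    q = torch.softmax(candidate_logprobs / temperature, dim=-1)
    return (q * candidate_costs).sum(dim=-1).mean()
```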

Joint Representative Selection and Feature Learning: A Semi-Supervised Approach

Title Joint Representative Selection and Feature Learning: A Semi-Supervised Approach
Authors Suchen Wang, Jingjing Meng, Junsong Yuan, Yap-Peng Tan
Abstract In this paper, we propose a semi-supervised approach for representative selection, which finds a small set of representatives that can well summarize a large data collection. Given labeled source data and big unlabeled target data, we aim to find representatives in the target data, which can not only represent and associate data points belonging to each labeled category, but also discover novel categories in the target data, if any. To leverage labeled source data, we guide representative selection from labeled source to unlabeled target. We propose a joint optimization framework which alternately optimizes (1) representative selection in the target data and (2) discriminative feature learning from both the source and the target for better representative selection. Experiments on image and video datasets demonstrate that our proposed approach not only finds better representatives, but also can discover novel categories in the target data that are not in the source.
Tasks
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Wang_Joint_Representative_Selection_and_Feature_Learning_A_Semi-Supervised_Approach_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Wang_Joint_Representative_Selection_and_Feature_Learning_A_Semi-Supervised_Approach_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/joint-representative-selection-and-feature
Repo
Framework
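
The alternating optimization described above can be caricatured in a few lines: with features fixed, pick representatives that cover the target data; with representatives fixed, update the feature extractor discriminatively. The greedy k-center step and the placeholder callables below are stand-ins that show only the control flow, not the paper's objective.

```python
import numpy as np

def select_representatives(target_feats, k):
    """Greedy k-center selection over target features (illustrative stand-in)."""
    chosen = [0]
    dists = np.linalg.norm(target_feats - target_feats[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dists.argmax())  # farthest point from the current set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(target_feats - target_feats[nxt], axis=1))
    return chosen

def joint_training(extract, update_features, source_data, target_data, k, rounds=5):
    """Alternate representative selection and feature learning (control flow only)."""
    reps = None
    for _ in range(rounds):
        target_feats = extract(target_data)              # (N, D) current features
        reps = select_representatives(target_feats, k)
        update_features(source_data, target_data, reps)  # discriminative update
    return reps
```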

Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing - A Tale of Two Parsers Revisited

Title Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing - A Tale of Two Parsers Revisited
Authors Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre
Abstract Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transition-based parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation by an error analysis of parsing experiments on 13 languages.
Tasks Dependency Parsing, Word Embeddings
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1277/
PDF https://www.aclweb.org/anthology/D19-1277
PWC https://paperswithcode.com/paper/deep-contextualized-word-embeddings-in-1
Repo
Framework