July 26, 2019

2264 words 11 mins read

Paper Group NANR 151
Noisy-context surprisal as a human sentence processing cost model. Learning to Predict Denotational Probabilities For Modeling Entailment. Noisy Uyghur Text Normalization. Self-Paced Co-training. SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization. Learning Deep Latent Gaussian Models with Markov …

Noisy-context surprisal as a human sentence processing cost model

Title Noisy-context surprisal as a human sentence processing cost model
Authors Richard Futrell, Roger Levy
Abstract We use the noisy-channel theory of human sentence comprehension to develop an incremental processing cost model that unifies and extends key features of expectation-based and memory-based models. In this model, which we call noisy-context surprisal, the processing cost of a word is the surprisal of the word given a noisy representation of the preceding context. We show that this model accounts for an outstanding puzzle in sentence comprehension: language-dependent structural forgetting effects (Gibson and Thomas, 1999; Vasishth et al., 2010; Frank et al., 2016), which previously were not well modeled by either expectation-based or memory-based approaches. Additionally, we show that this model derives and generalizes locality effects (Gibson, 1998; Demberg and Keller, 2008), a signature prediction of memory-based models. We give corpus-based evidence for a key assumption in this derivation.
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1065/
PDF https://www.aclweb.org/anthology/E17-1065
PWC https://paperswithcode.com/paper/noisy-context-surprisal-as-a-human-sentence
Repo
Framework
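
As a rough illustration of the model described above, the sketch below computes noisy-context surprisal with a toy bigram language model and a simple erasure noise model in which each context word is independently forgotten. The corpus, deletion rate, and smoothing constant are illustrative assumptions, not details taken from the paper.

```python
import math
import random
from collections import defaultdict

# Toy corpus and bigram counts (illustrative, not the paper's data).
corpus = "the dog that the cat chased barked".split()
bigram = defaultdict(lambda: defaultdict(int))
unigram = defaultdict(int)
for w1, w2 in zip(corpus, corpus[1:]):
    bigram[w1][w2] += 1
    unigram[w1] += 1
vocab = set(corpus)

def p_next(context_word, w, alpha=0.1):
    """Add-alpha smoothed bigram probability P(w | context_word)."""
    return (bigram[context_word][w] + alpha) / (unigram[context_word] + alpha * len(vocab))

def noisy_context_surprisal(context, w, delete_p=0.3, samples=1000):
    """Surprisal of w under a noisy memory of the context: each context
    word is independently erased with probability delete_p, the bigram
    model conditions on the most recent surviving word, and the
    conditional probability is averaged over noise samples."""
    total = 0.0
    for _ in range(samples):
        noisy = [c for c in context if random.random() > delete_p]
        prev = noisy[-1] if noisy else context[0]  # arbitrary fallback if all erased
        total += p_next(prev, w)
    return -math.log2(total / samples)

print(noisy_context_surprisal("the dog that the cat".split(), "chased"))
```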

Learning to Predict Denotational Probabilities For Modeling Entailment

Title Learning to Predict Denotational Probabilities For Modeling Entailment
Authors Alice Lai, Julia Hockenmaier
Abstract We propose a framework that captures the denotational probabilities of words and phrases by embedding them in a vector space, and present a method to induce such an embedding from a dataset of denotational probabilities. We show that our model successfully predicts denotational probabilities for unseen phrases, and that its predictions are useful for textual entailment datasets such as SICK and SNLI.
Tasks Coreference Resolution, Natural Language Inference
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1068/
PDF https://www.aclweb.org/anthology/E17-1068
PWC https://paperswithcode.com/paper/learning-to-predict-denotational
Repo
Framework
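
A minimal sketch of the embedding idea above, assuming made-up denotational probabilities and a simple sigmoid scorer fit by gradient descent in numpy; the paper's actual model and training data differ.

```python
import numpy as np

# Made-up denotational probabilities for five phrases; in the paper these
# are induced from annotated data, here they are placeholders.
rng = np.random.default_rng(0)
n_phrases, dim = 5, 8
targets = np.array([0.9, 0.5, 0.2, 0.7, 0.05])

E = rng.normal(scale=0.1, size=(n_phrases, dim))  # phrase embeddings
w = rng.normal(scale=1.0, size=dim)               # scoring vector

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Fit embeddings so sigmoid(E @ w) matches the target probabilities
# (squared loss, plain gradient descent).
lr = 0.2
for _ in range(2000):
    pred = sigmoid(E @ w)
    grad_logit = (pred - targets) * pred * (1 - pred)  # chain rule through sigmoid
    E -= lr * np.outer(grad_logit, w)
    w -= lr * E.T @ grad_logit

print(np.round(sigmoid(E @ w), 3))  # should approach the targets
```

For entailment, a score like this for "p entails h" can then be read off from predicted denotational probabilities of the phrases involved.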

Noisy Uyghur Text Normalization

Title Noisy Uyghur Text Normalization
Authors Osman Tursun, Ruket Cakici
Abstract Uyghur is the second largest and most actively used social media language in China. However, a non-negligible portion of the Uyghur text appearing on social media is written unsystematically in the Latin alphabet, and its volume continues to grow. Uyghur text in this format is incomprehensible and ambiguous even to native Uyghur speakers, and in this form it cannot be used to advance NLP tasks for the Uyghur language. Restoring noisy Uyghur text written in unsystematic Latin script, and preventing its accumulation, is essential both to protecting the Uyghur language and to improving the accuracy of Uyghur NLP tasks. To this end, we propose and compare the noisy channel model and the neural encoder-decoder model as normalization methods.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4412/
PDF https://www.aclweb.org/anthology/W17-4412
PWC https://paperswithcode.com/paper/noisy-uyghur-text-normalization
Repo
Framework
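
A toy sketch of the noisy channel decision rule that the paper compares against the encoder-decoder: pick the clean word maximizing P(noisy | clean) · P(clean). The lexicon and channel table below are invented placeholders, not real Uyghur statistics.

```python
# Hypothetical lexicon with unigram counts standing in for the language
# model P(clean); a real system would estimate this from a large corpus.
lexicon = {"yaxshi": 50, "yahshi": 2, "kitab": 30}
total = sum(lexicon.values())

# Hypothetical channel model P(noisy | clean); the paper learns this from
# data, here it is a fixed toy table.
channel = {("yahshi", "yaxshi"): 0.4, ("yahshi", "yahshi"): 0.6,
           ("kitab", "kitab"): 0.9}

def normalize(noisy_word):
    """Noisy-channel decoding: argmax over clean candidates of
    P(noisy | clean) * P(clean)."""
    best, best_score = noisy_word, 0.0
    for clean, count in lexicon.items():
        score = channel.get((noisy_word, clean), 0.0) * (count / total)
        if score > best_score:
            best, best_score = clean, score
    return best

print(normalize("yahshi"))  # -> "yaxshi"
```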

Self-Paced Co-training

Title Self-Paced Co-training
Authors Fan Ma, Deyu Meng, Qi Xie, Zina Li, Xuanyi Dong
Abstract Co-training is a well-known semi-supervised learning approach that trains classifiers on two different views and exchanges labels of unlabeled instances iteratively. During the co-training process, labels of unlabeled instances in the training pool are very likely to be false, especially in the initial training rounds, yet the standard co-training algorithm uses a “draw without replacement” strategy and never removes these falsely labeled instances from training. This issue not only tends to degrade performance but also undermines the approach’s theoretical foundations. Moreover, there is no optimization model explaining what objective a co-training process optimizes. To address these issues, we design a new co-training algorithm named self-paced co-training (SPaCo) with a “draw with replacement” learning mode. The rationality of SPaCo can be proved under the theoretical assumptions used in traditional co-training research; furthermore, the algorithm exactly follows the alternating optimization process for an optimization model of self-paced curriculum learning, which can be naturally interpreted from a robust learning perspective. Experimental results substantiate the superiority of the proposed method over current state-of-the-art co-training methods.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=518
PDF http://proceedings.mlr.press/v70/ma17b/ma17b.pdf
PWC https://paperswithcode.com/paper/self-paced-co-training
Repo
Framework
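
A rough sketch of the “draw with replacement” idea on synthetic two-view data, using scikit-learn logistic regression. The data, confidence schedule, and selection rule are illustrative assumptions, not the full SPaCo algorithm.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic two-view data (stand-in for real co-training views).
n = 200
y = rng.integers(0, 2, n)
view1 = y[:, None] + rng.normal(scale=1.0, size=(n, 5))
view2 = y[:, None] + rng.normal(scale=1.0, size=(n, 5))
labeled = np.arange(20)      # small labeled pool
unlabeled = np.arange(20, n)

clf1, clf2 = LogisticRegression(), LogisticRegression()
pseudo = {}                  # unlabeled index -> current pseudo-label

for rnd in range(5):
    # "Draw with replacement": pseudo-labels are re-chosen from scratch
    # each round, so confident-looking mistakes from early rounds can be
    # dropped instead of being kept forever.
    idx = np.concatenate([labeled, np.array(sorted(pseudo), dtype=int)])
    lbl = np.concatenate([y[labeled], np.array([pseudo[i] for i in sorted(pseudo)])])
    clf1.fit(view1[idx], lbl)
    clf2.fit(view2[idx], lbl)

    pseudo = {}
    thresh = 0.95 - 0.05 * rnd   # self-paced: admit easy examples first
    for i in unlabeled:
        p1 = clf1.predict_proba(view1[[i]])[0]
        p2 = clf2.predict_proba(view2[[i]])[0]
        if max(p1) > thresh:
            pseudo[i] = int(np.argmax(p1))
        elif max(p2) > thresh:
            pseudo[i] = int(np.argmax(p2))

print("pseudo-labeled instances:", len(pseudo))
```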

SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization

Title SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
Authors Juyong Kim, Yookoon Park, Gunhee Kim, Sung Ju Hwang
Abstract We propose a novel deep neural network that is both lightweight and effectively structured for model parallelization. Our network, which we name SplitNet, automatically learns to split the network weights into either a set or a hierarchy of multiple groups that use disjoint sets of features, by learning both the class-to-group and feature-to-group assignment matrices along with the network weights. This produces a tree-structured network with no connections between branched subtrees of semantically disparate class groups. SplitNet thus greatly reduces the number of parameters, requires significantly less computation, and is embarrassingly model-parallelizable at test time, since the evaluation of each subnetwork is completely independent except for the shared lower-layer weights, which can be duplicated over multiple processors. We validate our method with two deep network models (ResNet and AlexNet) on two datasets (CIFAR-100 and ILSVRC 2012) for image classification, obtaining networks with a significantly reduced number of parameters while achieving comparable or superior classification accuracy relative to the original full networks, along with accelerated test speed on multiple GPUs.
Tasks Image Classification
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=490
PDF http://proceedings.mlr.press/v70/kim17b/kim17b.pdf
PWC https://paperswithcode.com/paper/splitnet-learning-to-semantically-split-deep
Repo
Framework
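
A small numpy sketch of the assignment-matrix idea: soft feature-to-group and class-to-group assignments define a penalty on cross-group weights; driving it to zero makes the weight matrix block-diagonal, i.e., splittable into independent subnetworks. The matrices here are random placeholders rather than learned, and this is only the split regularizer, not the full SplitNet training objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n_feat, n_class, n_groups = 12, 6, 2

W = rng.normal(size=(n_class, n_feat))  # classifier weights

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Soft assignment matrices (rows sum to 1). In SplitNet these are learned
# jointly with W; here they are random placeholders.
P = softmax(rng.normal(size=(n_feat, n_groups)))   # feature-to-group
Q = softmax(rng.normal(size=(n_class, n_groups)))  # class-to-group

# Probability that class c and feature f fall in the SAME group.
same_group = Q @ P.T                               # (n_class, n_feat)

# Regularizer: weight mass on cross-group connections. Driving this to
# zero makes W block-diagonal after permuting rows/columns by group, so
# each group becomes an independently evaluable subnetwork.
split_penalty = np.sum((W ** 2) * (1.0 - same_group))
print("cross-group weight mass:", split_penalty)
```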

Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo

Title Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo
Authors Matthew D. Hoffman
Abstract Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data. These models are almost always fit using variational expectation-maximization, an approximation to true maximum-marginal-likelihood estimation. In this paper, we propose a different approach: rather than use a variational approximation (which produces biased gradient signals), we use Markov chain Monte Carlo (MCMC, which allows us to trade bias for computation). We find that our MCMC-based approach has several advantages: it yields higher held-out likelihoods, produces sharper images, and does not suffer from the variational overpruning effect. MCMC’s additional computational overhead proves to be significant, but not prohibitive.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=876
PDF http://proceedings.mlr.press/v70/hoffman17a/hoffman17a.pdf
PWC https://paperswithcode.com/paper/learning-deep-latent-gaussian-models-with
Repo
Framework
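
A toy illustration of the MCMC alternative: unadjusted Langevin dynamics over the latent z of a linear-Gaussian “decoder”. The decoder, step size, and data are illustrative assumptions; the paper applies MCMC to deep decoders, where the same gradient-based updates apply.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_z, dim_x = 2, 4
A = rng.normal(size=(dim_x, dim_z))  # toy linear "decoder" weights
x = rng.normal(size=dim_x)           # one observation
sigma2 = 0.5                         # observation noise variance

def grad_log_joint(z):
    """Gradient of log p(x, z) = log N(x; A z, sigma2 I) + log N(z; 0, I)."""
    return A.T @ (x - A @ z) / sigma2 - z

# Unadjusted Langevin dynamics over z: this plays the role the variational
# posterior plays in VAE-style training, but the samples are asymptotically
# unbiased (for small steps) rather than confined to a variational family.
z = np.zeros(dim_z)
eps = 0.01
samples = []
for t in range(5000):
    z = z + 0.5 * eps * grad_log_joint(z) + np.sqrt(eps) * rng.normal(size=dim_z)
    if t > 1000:  # discard burn-in
        samples.append(z.copy())

print("posterior mean estimate:", np.mean(samples, axis=0))
```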

Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing

Title Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing
Authors
Abstract
Tasks
Published 2017-12-01
URL https://www.aclweb.org/anthology/W17-6000/
PDF https://www.aclweb.org/anthology/W17-6000
PWC https://paperswithcode.com/paper/proceedings-of-the-9th-sighan-workshop-on
Repo
Framework

Context-Sensitive Recognition for Emerging and Rare Entities

Title Context-Sensitive Recognition for Emerging and Rare Entities
Authors Jake Williams, Giovanni Santia
Abstract This paper is a shared task system description for the 2017 W-NUT shared task on Rare and Emerging Named Entities. Our paper describes the development and application of a novel algorithm for named entity recognition that relies only on the contexts of word forms. A comparison against the other submitted systems is provided.
Tasks Information Retrieval, Named Entity Recognition
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4423/
PDF https://www.aclweb.org/anthology/W17-4423
PWC https://paperswithcode.com/paper/context-sensitive-recognition-for-emerging
Repo
Framework
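
A minimal sketch of recognition from contexts of word forms alone: store the contexts observed around known entity mentions, then flag test tokens whose contexts match, so unseen names can still be caught. The data and the one-word context window are invented simplifications, not the submitted system's actual algorithm.

```python
from collections import defaultdict

# Tiny illustrative training set: token lists with entity positions marked.
train = [
    (["mayor", "of", "Springfield", "spoke"], {2}),
    (["visited", "the", "city", "of", "Gotham", "today"], {4}),
]

# Collect (left word, right word) contexts observed around entity tokens.
entity_contexts = defaultdict(int)
for tokens, spans in train:
    for i in spans:
        left = tokens[i - 1] if i > 0 else "<s>"
        right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        entity_contexts[(left, right)] += 1

def tag(tokens):
    """Flag tokens whose immediate context matches a known entity context,
    regardless of the word form itself (so rare/emerging names are caught)."""
    out = []
    for i, tok in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else "<s>"
        right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        out.append((tok, (left, right) in entity_contexts))
    return out

print(tag(["mayor", "of", "Arlen", "spoke"]))  # "Arlen" flagged via context
```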

Local-to-Global Bayesian Network Structure Learning

Title Local-to-Global Bayesian Network Structure Learning
Authors Tian Gao, Kshitij Fadnis, Murray Campbell
Abstract We introduce a new local-to-global structure learning algorithm, called graph growing structure learning (GGSL), to learn Bayesian network (BN) structures. GGSL starts at a (random) node and then gradually expands the learned structure through a series of local learning steps. At each local learning step, the proposed algorithm only needs to revisit a subset of the learned nodes, consisting of the local neighborhood of a target, and is therefore more memory- and time-efficient than traditional global structure learning approaches. GGSL also improves on existing local-to-global learning approaches by removing the need for conflict-resolving AND-rules, and achieves better learning accuracy. We provide theoretical analysis for the local learning step, and show that GGSL outperforms existing algorithms on benchmark datasets. Overall, GGSL demonstrates a novel direction for scaling up BN structure learning while limiting accuracy loss.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=688
PDF http://proceedings.mlr.press/v70/gao17a/gao17a.pdf
PWC https://paperswithcode.com/paper/local-to-global-bayesian-network-structure
Repo
Framework
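
A crude sketch of the local-to-global growing loop on synthetic Gaussian chain data. The "local learning step" here is a partial-correlation threshold, a stand-in for GGSL's statistical tests, and what is recovered is only an undirected skeleton, not an oriented BN.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic Gaussian data from a chain X0 -> X1 -> X2 -> X3.
n = 2000
X = np.zeros((n, 4))
X[:, 0] = rng.normal(size=n)
for j in range(1, 4):
    X[:, j] = X[:, j - 1] + 0.5 * rng.normal(size=n)

# Partial correlations from the precision (inverse covariance) matrix;
# for Gaussian data their zero pattern encodes conditional independence.
prec = np.linalg.inv(np.cov(X.T))
d = np.sqrt(np.diag(prec))
pcorr = prec / np.outer(d, d)

def local_neighbors(j, thresh=0.1):
    """Crude local step: nodes conditionally dependent on j."""
    return {k for k in range(X.shape[1]) if k != j and abs(pcorr[j, k]) > thresh}

# Grow the structure outward from a seed node, revisiting only the local
# neighborhood of the current target (the local-to-global idea).
learned, frontier, edges = {0}, [0], set()
while frontier:
    j = frontier.pop()
    for k in local_neighbors(j):
        edges.add(tuple(sorted((j, k))))
        if k not in learned:
            learned.add(k)
            frontier.append(k)

print("undirected skeleton:", sorted(edges))  # expect the chain edges
```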

Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data

Title Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data
Authors Xiuyan Ni, Novi Quadrianto, Yusu Wang, Chao Chen
Abstract Clustering data with both continuous and discrete attributes is a challenging task. Existing methods lack a principled probabilistic formulation. In this paper, we propose a clustering method based on a tree-structured graphical model to describe the generation process of mixed-type data. Our tree-structured model factorizes into a product of pairwise interactions, and thus localizes the interaction between feature variables of different types. To provide a robust clustering method based on the tree model, we adopt a topographical view and compute peaks of the density function and their attractive basins for clustering. Furthermore, we leverage the theory of topological data analysis to adaptively merge trivial peaks into larger ones in order to achieve meaningful clusterings. Our method outperforms state-of-the-art methods on mixed-type data.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=504
PDF http://proceedings.mlr.press/v70/ni17a/ni17a.pdf
PWC https://paperswithcode.com/paper/composing-tree-graphical-models-with
Repo
Framework
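
A 1-D sketch of the topographical clustering step: estimate a density, take its peaks as candidate cluster centers, and merge peaks of low persistence (height above the surrounding saddle) into taller neighbors. The persistence computation here is a simplification of proper topological persistence, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1-D toy data: two true clusters; sampling noise can add spurious bumps.
data = np.concatenate([rng.normal(0.0, 0.3, 200), rng.normal(3.0, 0.3, 200)])

# Kernel density estimate on a grid (unnormalized Gaussian kernel sum).
grid = np.linspace(-2, 5, 500)
h = 0.1
dens = np.exp(-0.5 * ((grid[:, None] - data[None, :]) / h) ** 2).sum(axis=1)

# Candidate cluster centers = local maxima of the estimated density.
peaks = [i for i in range(1, len(grid) - 1)
         if dens[i] > dens[i - 1] and dens[i] > dens[i + 1]]

def persistence(k):
    """Height of peak k above its highest neighboring saddle: a crude 1-D
    version of topological persistence. Low-persistence peaks are noise
    and get absorbed by taller neighbors, as in the merging step."""
    i = peaks[k]
    left = dens[peaks[k - 1]:i].min() if k > 0 else 0.0
    right = dens[i:peaks[k + 1]].min() if k + 1 < len(peaks) else 0.0
    return dens[i] - max(left, right)

tau = 0.2 * dens.max()  # illustrative persistence threshold
centers = [grid[peaks[k]] for k in range(len(peaks)) if persistence(k) > tau]
print("cluster centers:", np.round(centers, 2))  # expect ~[0.0, 3.0]
```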

Learning Hierarchical Features from Deep Generative Models

Title Learning Hierarchical Features from Deep Generative Models
Authors Shengjia Zhao, Jiaming Song, Stefano Ermon
Abstract Deep neural networks have been shown to be very successful at learning feature hierarchies in supervised learning tasks. Generative models, on the other hand, have benefited less from hierarchical models with multiple layers of latent variables. In this paper, we prove that hierarchical latent variable models do not take advantage of the hierarchical structure when trained with existing variational methods, and we characterize limitations on the kinds of features existing models can learn. Finally, we propose an alternative architecture that does not suffer from these limitations. Our model is able to learn highly interpretable and disentangled hierarchical features on several natural image datasets with no task-specific regularization or prior knowledge.
Tasks Latent Variable Models
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=852
PDF http://proceedings.mlr.press/v70/zhao17c/zhao17c.pdf
PWC https://paperswithcode.com/paper/learning-hierarchical-features-from-deep
Repo
Framework
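
For concreteness, the sketch below draws an ancestral sample from a two-layer latent hierarchy p(z2) p(z1 | z2) p(x | z1) of the kind the paper analyzes, with random linear maps standing in for trained networks. It illustrates the hierarchical structure under discussion, not the paper's proposed architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two-layer latent hierarchy with placeholder linear "networks". The
# paper's point is that standard variational training leaves the top
# layer (z2) carrying little information beyond what z1 already encodes.
dim_z2, dim_z1, dim_x = 2, 4, 8
W21 = rng.normal(size=(dim_z1, dim_z2))
W1x = rng.normal(size=(dim_x, dim_z1))

def sample():
    z2 = rng.normal(size=dim_z2)                             # top-level latent
    z1 = np.tanh(W21 @ z2) + 0.1 * rng.normal(size=dim_z1)   # mid-level latent
    x = W1x @ z1 + 0.1 * rng.normal(size=dim_x)              # observation
    return z2, z1, x

z2, z1, x = sample()
print("z2:", z2.round(2))
print("x: ", x.round(2))
```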

Safety-Aware Algorithms for Adversarial Contextual Bandit

Title Safety-Aware Algorithms for Adversarial Contextual Bandit
Authors Wen Sun, Debadeepta Dey, Ashish Kapoor
Abstract In this work we study safe sequential decision making in the setting of adversarial contextual bandits with sequential risk constraints. At each round, nature prepares a context, a cost for each arm, and additionally a risk for each arm. The learner leverages the context to pull an arm and receives the corresponding cost and risk associated with the pulled arm. In addition to minimizing the cumulative cost, the learner must, for safety, make decisions such that the average cumulative risk over all pulled arms does not exceed a pre-defined threshold. To address this problem, we first study online convex programming in the full-information setting, where in each round the learner receives an adversarial convex loss and a convex constraint. We develop a meta-algorithm leveraging online mirror descent for the full-information setting, and then extend it to the contextual-bandit setting with sequential risk constraints using expert advice. Our algorithms achieve near-optimal regret in terms of minimizing the total cost, while maintaining sub-linear growth of cumulative risk-constraint violation. We support our theoretical results by demonstrating our algorithm on a simple simulated robotics reactive-control task.
Tasks Decision Making, Multi-Armed Bandits
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=628
PDF http://proceedings.mlr.press/v70/sun17a/sun17a.pdf
PWC https://paperswithcode.com/paper/safety-aware-algorithms-for-adversarial
Repo
Framework
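
A sketch in the spirit of the full-information meta-algorithm: exponentiated-gradient mirror descent on a Lagrangian of cost plus a dual-weighted risk term, with the multiplier growing while the average-risk constraint is violated. The costs, risks, step sizes, and fixed adversary are illustrative assumptions; the paper's bandit extension with expert advice is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 3, 5000
threshold = 0.4          # pre-defined bound on average risk
eta, lam_step = 0.05, 0.01

weights = np.zeros(K)    # mirror-descent iterate (log-weights)
lam = 0.0                # Lagrange multiplier for the risk constraint
total_cost = total_risk = 0.0

for t in range(T):
    # Toy "adversary": the cheapest arm is also the riskiest.
    cost = np.array([0.1, 0.5, 0.6])
    risk = np.array([0.9, 0.3, 0.1])

    probs = np.exp(weights - weights.max())
    probs /= probs.sum()
    arm = rng.choice(K, p=probs)
    total_cost += cost[arm]
    total_risk += risk[arm]

    # Descend on the Lagrangian cost + lam * risk: lam grows while the
    # constraint is violated, steering play toward safer arms even though
    # they cost more.
    weights -= eta * (cost + lam * risk)  # full-information update
    lam = max(0.0, lam + lam_step * (risk @ probs - threshold))

print("avg cost:", total_cost / T, "avg risk:", total_risk / T)
```

With these numbers the play settles near a mixture of the cheap/risky and expensive/safe arms whose expected risk sits at the threshold.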

End-to-end Relation Extraction using Neural Networks and Markov Logic Networks

Title End-to-end Relation Extraction using Neural Networks and Markov Logic Networks
Authors Sachin Pawar, Pushpak Bhattacharyya, Girish Palshikar
Abstract End-to-end relation extraction refers to identifying boundaries of entity mentions, the entity types of these mentions, and the appropriate semantic relation for each pair of mentions. Traditionally, separate predictive models were trained for each of these tasks and used in a “pipeline” fashion, where the output of one model is fed as input to another, but it has been observed that addressing some of these tasks jointly yields better performance. We propose a single, joint neural network based model to carry out all three tasks of boundary identification, entity type classification, and relation type classification. This model is referred to as the “All Word Pairs” model (AWP-NN), as it assigns an appropriate label to each word pair in a given sentence for performing end-to-end relation extraction. We also propose to refine the output of the AWP-NN model by inference in Markov Logic Networks (MLN), so that additional domain knowledge can be effectively incorporated. We demonstrate the effectiveness of our approach by achieving better end-to-end relation extraction performance than all four previous joint modelling approaches on the standard ACE 2004 dataset.
Tasks Relation Extraction
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-1077/
PDF https://www.aclweb.org/anthology/E17-1077
PWC https://paperswithcode.com/paper/end-to-end-relation-extraction-using-neural
Repo
Framework
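
A small sketch of the “All Word Pairs” encoding: one label per word pair, so within-entity pairs carry entity boundaries/types and cross-entity pairs carry relations. The sentence, label inventory, and gold annotations below are invented; the real AWP-NN predicts this matrix with a neural network and then refines it with MLN inference.

```python
# Build the word-pair label matrix that an AWP-style model would predict,
# here filled in from toy gold annotations for illustration.
tokens = ["John", "Smith", "works", "at", "Acme"]

entities = {(0, 1): "PER", (4, 4): "ORG"}    # token span -> entity type
relations = {((0, 1), (4, 4)): "EMP-ORG"}    # span pair -> relation type

n = len(tokens)
label = [["O"] * n for _ in range(n)]
for (s, e), etype in entities.items():
    for i in range(s, e + 1):
        for j in range(s, e + 1):
            label[i][j] = etype              # within-entity pairs mark type/boundary
for ((s1, e1), (s2, e2)), rtype in relations.items():
    for i in range(s1, e1 + 1):
        for j in range(s2, e2 + 1):
            label[i][j] = label[j][i] = rtype  # cross-entity pairs mark the relation

for row in label:
    print(row)
```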

Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics

Title Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics
Authors
Abstract
Tasks
Published 2017-04-01
URL https://www.aclweb.org/anthology/E17-3000/
PDF https://www.aclweb.org/anthology/E17-3000
PWC https://paperswithcode.com/paper/proceedings-of-the-software-demonstrations-of
Repo
Framework

Analyzing the Revision Logs of a Japanese Newspaper for Article Quality Assessment

Title Analyzing the Revision Logs of a Japanese Newspaper for Article Quality Assessment
Authors Hideaki Tamori, Yuta Hitomi, Naoaki Okazaki, Kentaro Inui
Abstract We address the issue of the quality of journalism and analyze daily article revision logs from a Japanese newspaper company. The revision logs contain data that can help reveal the requirements of quality journalism, such as the types and numbers of edit operations and the aspects commonly focused on in revision. This study also discusses potential applications, such as quality assessment and automatic article revision, as future research directions.
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-4208/
PDF https://www.aclweb.org/anthology/W17-4208
PWC https://paperswithcode.com/paper/analyzing-the-revision-logs-of-a-japanese
Repo
Framework
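
A minimal sketch of the kind of edit-operation counting such a revision-log analysis needs, using Python's difflib on a made-up before/after pair; the paper works with Japanese newspaper revision logs, this is a toy English example of the same idea.

```python
import difflib

# Two hypothetical versions of a sentence, standing in for one
# revision-log pair.
before = "The mayor anounced the new budget plan yesterday".split()
after = "The mayor announced a revised budget plan on Monday".split()

ops = {"replace": 0, "delete": 0, "insert": 0, "equal": 0}
sm = difflib.SequenceMatcher(a=before, b=after)
for tag, i1, i2, j1, j2 in sm.get_opcodes():
    ops[tag] += 1
    if tag != "equal":
        print(f"{tag:8s} {before[i1:i2]} -> {after[j1:j2]}")

print("edit-operation counts:", {k: v for k, v in ops.items() if k != "equal"})
```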