Paper Group NANR 70
Theoretical guarantees for EM under misspecified Gaussian mixture models
Title | Theoretical guarantees for EM under misspecified Gaussian mixture models |
Authors | Raaz Dwivedi, Nhật Hồ, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan |
Abstract | Recent years have witnessed substantial progress in understanding the behavior of EM for mixture models that are correctly specified. Given that model misspecification is common in practice, it is important to understand EM in this more general setting. We provide non-asymptotic guarantees for population and sample-based EM for parameter estimation under a few specific univariate settings of misspecified Gaussian mixture models. Due to misspecification, the EM iterates no longer converge to the true model and instead converge to the projection of the true model onto the set of models being searched over. We provide two classes of theoretical guarantees: first, we characterize the bias introduced due to the misspecification; and second, we prove that population EM converges at a geometric rate to the model projection under a suitable initialization condition. This geometric convergence rate for population EM implies a statistical complexity of order $1/\sqrt{n}$ when running EM with $n$ samples. We validate our theoretical findings in different cases via several numerical examples. |
Tasks | |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/8176-theoretical-guarantees-for-em-under-misspecified-gaussian-mixture-models |
PDF | http://papers.nips.cc/paper/8176-theoretical-guarantees-for-em-under-misspecified-gaussian-mixture-models.pdf |
PWC | https://paperswithcode.com/paper/theoretical-guarantees-for-em-under |
Repo | |
Framework | |
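The abstract's setup is concrete enough to simulate. Below is a minimal numpy sketch (not the authors' code) of EM for a symmetric two-component location mixture $0.5\,N(\theta,1)+0.5\,N(-\theta,1)$ fit to data from a deliberately misspecified source; the particular misspecification chosen here (unequal mixing weights) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data from a *misspecified* source: weights (0.7, 0.3) instead of (0.5, 0.5).
n = 50_000
z = rng.random(n) < 0.7
x = np.where(z, rng.normal(2.0, 1.0, n), rng.normal(-2.0, 1.0, n))

def em_step(theta, x):
    """One EM step for the fitted model 0.5*N(theta,1) + 0.5*N(-theta,1)."""
    # E-step: responsibility of the +theta component, 1 / (1 + exp(-2*theta*x)).
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * x))
    # M-step: closed form for this symmetric location family.
    return np.mean((2 * w - 1) * x)

theta = 0.5  # a "suitable initialization" in the paper's sense
for _ in range(30):
    theta = em_step(theta, x)
print(theta)  # settles geometrically on the projection, biased away from 2.0
```

Running this, the iterates should stabilize after a handful of steps on a biased value of theta, illustrating both classes of guarantees the abstract describes.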
NovelPerspective: Identifying Point of View Characters
Title | NovelPerspective: Identifying Point of View Characters |
Authors | Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun |
Abstract | We present NovelPerspective: a tool that allows consumers to subset their digital literature based on point-of-view (POV) character. Many novels have multiple main characters, each with their own storyline running in parallel. A well-known example is George R. R. Martin's novel "A Game of Thrones", and others from that series. Our tool detects the main character from whose POV each section is written, and allows the user to generate a new ebook with only those sections. This gives consumers new options in how they consume their media, allowing them to pursue the storylines sequentially or skip chapters about characters they find boring. We present two heuristic-based baselines, and two machine learning based methods for the detection of the main character. |
Tasks | Named Entity Recognition |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-4002/ |
PWC | https://paperswithcode.com/paper/novelperspective-identifying-point-of-view |
Repo | |
Framework | |
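For a flavor of the heuristic baselines, here is a toy character-frequency detector. The actual tool builds on named entity recognition and also ships trained classifiers, so the mention-count rule, regexes, and names below are illustrative only.

```python
import re
from collections import Counter

def pov_character(section_text, character_names):
    """Toy heuristic: guess the POV character of a section as the named
    character mentioned most often. The paper's baselines are more refined
    (e.g., sensitive to grammatical role); this is the crudest version."""
    counts = Counter()
    for name in character_names:
        counts[name] = len(re.findall(rf"\b{re.escape(name)}\b", section_text))
    if not counts or max(counts.values()) == 0:
        return None
    return counts.most_common(1)[0][0]

# Usage: split the ebook into sections, tag each section, then keep only
# the sections whose detected POV matches the character the reader chose.
section = "Tyrion poured the wine. Tyrion had never trusted eunuchs."
print(pov_character(section, ["Tyrion", "Arya", "Jon"]))  # -> "Tyrion"
```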
Native Language Identification with User Generated Content
Title | Native Language Identification with User Generated Content |
Authors | Gili Goldin, Ella Rabinovich, Shuly Wintner |
Abstract | We address the task of native language identification in the context of social media content, where authors are highly-fluent, advanced nonnative speakers (of English). Using both linguistically-motivated features and the characteristics of the social media outlet, we obtain high accuracy on this challenging task. We provide a detailed analysis of the features that sheds light on differences between native and nonnative speakers, and among nonnative speakers with different backgrounds. |
Tasks | Language Identification, Native Language Identification |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1395/ |
PWC | https://paperswithcode.com/paper/native-language-identification-with-user |
Repo | |
Framework | |
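For readers who want the shape of such a system, here is a hedged sklearn sketch that uses character n-grams as a stand-in for the paper's linguistically motivated features; the real feature set (and the social-media outlet characteristics) is richer, and all data below is made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: social-media posts and the author's native language.
posts = [
    "I am agree with the previous poster",
    "Yesterday I have seen this movie already",
    "That's a fair point, to be honest",
    "Honestly this thread went downhill fast",
]
langs = ["French", "German", "English", "English"]

# Character n-grams capture sub-word transfer effects; the paper combines
# several such linguistically motivated feature families.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(posts, langs)
print(model.predict(["I am very interesting in this topic"]))
```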
Model-based targeted dimensionality reduction for neuronal population data
Title | Model-based targeted dimensionality reduction for neuronal population data |
Authors | Mikio Aoi, Jonathan W. Pillow |
Abstract | Summarizing high-dimensional data using a small number of parameters is a ubiquitous first step in the analysis of neuronal population activity. Recently developed methods use “targeted” approaches that work by identifying multiple, distinct low-dimensional subspaces of activity that capture the population response to individual experimental task variables, such as the value of a presented stimulus or the behavior of the animal. These methods have gained attention because they decompose total neural activity into what are ostensibly different parts of a neuronal computation. However, existing targeted methods have been developed outside of the confines of probabilistic modeling, making some aspects of the procedures ad hoc, or limited in flexibility or interpretability. Here we propose a new model-based method for targeted dimensionality reduction based on a probabilistic generative model of the population response data. The low-dimensional structure of our model is expressed as a low-rank factorization of a linear regression model. We perform efficient inference using a combination of expectation maximization and direct maximization of the marginal likelihood. We also develop an efficient method for estimating the dimensionality of each subspace. We show that our approach outperforms alternative methods in both mean squared error of the parameter estimates, and in identifying the correct dimensionality of encoding using simulated data. We also show that our method provides more accurate inference of low-dimensional subspaces of activity than a competing algorithm, demixed PCA. |
Tasks | Dimensionality Reduction |
Published | 2018-12-01 |
URL | http://papers.nips.cc/paper/7903-model-based-targeted-dimensionality-reduction-for-neuronal-population-data |
PDF | http://papers.nips.cc/paper/7903-model-based-targeted-dimensionality-reduction-for-neuronal-population-data.pdf |
PWC | https://paperswithcode.com/paper/model-based-targeted-dimensionality-reduction |
Repo | |
Framework | |
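The "low-rank factorization of a linear regression model" has a classical non-probabilistic counterpart, reduced-rank regression, which the numpy sketch below illustrates on simulated population data. The paper's actual estimator is a probabilistic generative model fit by EM with marginal-likelihood rank selection, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: T trials, P task variables (stimulus, choice, ...), N neurons.
N, T, P, rank = 60, 500, 4, 2
X = rng.normal(size=(T, P))                       # task regressors per trial
B_true = rng.normal(size=(P, rank)) @ rng.normal(size=(rank, N))
Y = X @ B_true + 0.5 * rng.normal(size=(T, N))    # population responses

# Reduced-rank regression: OLS fit, then project onto the top right singular
# vectors of the fitted values (the classical closed-form solution).
B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
B_rr = B_ols @ Vt[:rank].T @ Vt[:rank]            # rank-constrained coefficients

# The rank-2 fit should nearly match the full OLS fit (ratio close to 1).
print(np.linalg.norm(Y - X @ B_rr) / np.linalg.norm(Y - X @ B_ols))
```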
Proceedings of the 11th International Conference on Natural Language Generation
Title | Proceedings of the 11th International Conference on Natural Language Generation |
Authors | |
Abstract | |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6500/ |
PWC | https://paperswithcode.com/paper/proceedings-of-the-11th-international-1 |
Repo | |
Framework | |
Learning to Control the Specificity in Neural Response Generation
Title | Learning to Control the Specificity in Neural Response Generation |
Authors | Ruqing Zhang, Jiafeng Guo, Yixing Fan, Yanyan Lan, Jun Xu, Xueqi Cheng |
Abstract | In conversation, a general response (e.g., "I don't know") could correspond to a large variety of input utterances. Previous generative conversational models usually employ a single model to learn the relationship between different utterance-response pairs, and thus tend to favor general and trivial responses which appear frequently. To address this problem, we propose a novel controlled response generation mechanism to handle different utterance-response relationships in terms of specificity. Specifically, we introduce an explicit specificity control variable into a sequence-to-sequence model, which interacts with the usage representation of words through a Gaussian kernel layer, to guide the model to generate responses at different specificity levels. We describe two ways to acquire distant labels for the specificity control variable in learning. Empirical studies show that our model can significantly outperform the state-of-the-art response generation models under both automatic and human evaluations. |
Tasks | Machine Translation |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1102/ |
PWC | https://paperswithcode.com/paper/learning-to-control-the-specificity-in-neural |
Repo | |
Framework | |
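A plausible reconstruction of the Gaussian-kernel interaction, in PyTorch: a scalar specificity variable rescores vocabulary words according to learned per-word usage scores. The paper's exact parameterization may differ, so treat every shape and name here as an assumption.

```python
import torch
import torch.nn as nn

class GaussianSpecificityLayer(nn.Module):
    """Sketch: words whose learned 'usage' score is close to the requested
    specificity level s get boosted in the decoder's output distribution."""
    def __init__(self, vocab_size, sigma=0.1):
        super().__init__()
        self.usage = nn.Embedding(vocab_size, 1)  # one usage scalar per word
        self.sigma = sigma

    def forward(self, decoder_logits, s):
        # decoder_logits: (batch, vocab); s: (batch,), values in [0, 1]
        u = self.usage.weight.squeeze(-1)                    # (vocab,)
        kernel = torch.exp(-(s[:, None] - u[None, :]) ** 2
                           / (2 * self.sigma ** 2))          # (batch, vocab)
        return decoder_logits + torch.log(kernel + 1e-8)     # rescore words

layer = GaussianSpecificityLayer(vocab_size=1000)
logits = torch.randn(2, 1000)
s = torch.tensor([0.1, 0.9])        # a low- and a high-specificity response
print(layer(logits, s).shape)       # torch.Size([2, 1000])
```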
A Neural Architecture for Automated ICD Coding
Title | A Neural Architecture for Automated ICD Coding |
Authors | Pengtao Xie, Eric Xing |
Abstract | The International Classification of Diseases (ICD) provides a hierarchy of diagnostic codes for classifying diseases. Medical coding – which assigns a subset of ICD codes to a patient visit – is a mandatory process that is crucial for patient care and billing. Manual coding is time-consuming, expensive, and error prone. In this paper, we build a neural architecture for automated coding. It takes the diagnosis descriptions (DDs) of a patient as inputs and selects the most relevant ICD codes. This architecture contains four major ingredients: (1) tree-of-sequences LSTM encoding of code descriptions (CDs), (2) adversarial learning for reconciling the different writing styles of DDs and CDs, (3) isotonic constraints for incorporating the importance order among the assigned codes, and (4) attentional matching for performing many-to-one and one-to-many mappings from DDs to CDs. We demonstrate the effectiveness of the proposed methods on a clinical dataset with 59K patient visits. |
Tasks | |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1098/ |
PWC | https://paperswithcode.com/paper/a-neural-architecture-for-automated-icd |
Repo | |
Framework | |
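Of the four ingredients, attentional matching is the easiest to sketch. The toy PyTorch function below scores DD-CD pairs with softmax attention and thresholds them; the pooling and the 0.5 threshold are invented for illustration, and the tree-of-sequences encoders, adversarial learning, and isotonic constraints are omitted.

```python
import torch
import torch.nn.functional as F

def attentional_matching(dd_vecs, cd_vecs):
    """Sketch of many-to-one / one-to-many matching: each encoded diagnosis
    description (DD) attends over all encoded code descriptions (CDs), and a
    code is assigned if some DD attends to it strongly enough."""
    # dd_vecs: (num_dds, d); cd_vecs: (num_codes, d)
    scores = dd_vecs @ cd_vecs.T                 # (num_dds, num_codes)
    attn = F.softmax(scores, dim=-1)             # each DD's attention over codes
    code_relevance = attn.max(dim=0).values      # strongest attention per code
    return code_relevance > 0.5                  # hypothetical threshold

dd = torch.randn(3, 64)     # 3 diagnosis descriptions for one visit
cd = torch.randn(20, 64)    # 20 candidate ICD code descriptions
print(attentional_matching(dd, cd).sum())        # number of assigned codes
```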
Distributed Fine-tuning of Language Models on Private Data
Title | Distributed Fine-tuning of Language Models on Private Data |
Authors | Vadim Popov, Mikhail Kudinov, Irina Piontkovskaya, Petr Vytovtov, Alex Nevidomsky |
Abstract | One of the big challenges in machine learning applications is that training data can be different from the real-world data faced by the algorithm. In language modeling, users’ language (e.g. in private messaging) could change in a year and be completely different from what we observe in publicly available data. At the same time, public data can be used for obtaining general knowledge (i.e. general model of English). We study approaches to distributed fine-tuning of a general model on user private data with the additional requirements of maintaining the quality on the general data and minimization of communication costs. We propose a novel technique that significantly improves prediction quality on users’ language compared to a general model and outperforms gradient compression methods in terms of communication efficiency. The proposed procedure is fast and leads to an almost 70% perplexity reduction and 8.7 percentage point improvement in keystroke saving rate on informal English texts. Finally, we propose an experimental framework for evaluating differential privacy of distributed training of language models and show that our approach has good privacy guarantees. |
Tasks | Language Modelling |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=HkgNdt26Z |
PDF | https://openreview.net/pdf?id=HkgNdt26Z |
PWC | https://paperswithcode.com/paper/distributed-fine-tuning-of-language-models-on |
Repo | |
Framework | |
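A toy server-side round of the general idea, with numpy arrays standing in for model weights: clients fine-tune on private text and send deltas; the server averages them and interpolates with the general model so quality on general data is preserved. The `mix` knob is a made-up stand-in for the paper's actual mechanisms (and its compression and differential-privacy analysis is not reproduced).

```python
import numpy as np

def server_round(general_weights, client_deltas, mix=0.5):
    """Average the clients' private-data updates, then interpolate with the
    general model. Hedged illustration only; the paper's procedure and its
    communication-efficiency tricks are more involved."""
    avg_delta = np.mean(client_deltas, axis=0)   # one round of user updates
    fine_tuned = general_weights + avg_delta
    return mix * general_weights + (1 - mix) * fine_tuned

rng = np.random.default_rng(0)
w = np.zeros(10)                                 # stand-in for LM weights
deltas = [0.01 * rng.normal(size=10) for _ in range(8)]  # 8 users' updates
w = server_round(w, deltas)
print(w)
```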
Don’t encrypt the data; just approximate the model: Towards Secure Transaction and Fair Pricing of Training Data
Title | Don’t encrypt the data; just approximate the model: Towards Secure Transaction and Fair Pricing of Training Data |
Authors | Xinlei Xu |
Abstract | As machine learning becomes ubiquitous, deployed systems need to be as accurate as they can. As a result, machine learning service providers have a surging need for useful, additional training data that benefits training, without giving up all the details about the trained program. At the same time, data owners would like to trade their data for its value, without having to first give away the data itself before receiving compensation. It is difficult for data providers and model providers to agree on a fair price without first revealing the data or the trained model to the other side. Escrow systems only complicate this further, adding an additional layer of trust required of both parties. Currently, data owners and model owners lack a fair pricing system that eliminates the need to trust a third party; the alternative, training the model on the data, 1) takes a long time to complete, and 2) cannot guarantee that useful data is rewarded and useless data is not, without entrusting a third party with both the model and the data. Existing improvements to secure the transaction focus heavily on encrypting or approximating the data, such as training on encrypted data, and variants of federated learning. As powerful as these methods appear to be, we show them to be impractical in our use case with real-world assumptions for preserving privacy for the data owners when facing black-box models. Thus, a fair pricing scheme that does not rely on secure data encryption and obfuscation is needed before the exchange of data. This paper proposes a novel method for fair pricing using data-model efficacy techniques such as influence functions, model extraction, and model compression methods, thus enabling secure data transactions. We successfully show that without running the data through the model, one can approximate the value of the data; that is, if the data turns out redundant, the pricing is minimal, and if the data leads to proper improvement, its value is properly assessed, without placing strong assumptions on the nature of the model. Future work will focus on establishing a system with stronger transactional security against adversarial attacks that would reveal details about the model or the data to the other party. |
Tasks | Model Compression |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=r1ayG7WRZ |
PDF | https://openreview.net/pdf?id=r1ayG7WRZ |
PWC | https://paperswithcode.com/paper/dont-encrypt-the-data-just-approximate-the |
Repo | |
Framework | |
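One of the named ingredients, influence functions, can be shown end-to-end for ridge regression: the score below predicts how much adding a candidate point would reduce test loss, without retraining. This is only the standard influence-function computation, not the paper's full pricing protocol.

```python
import numpy as np

def influence_scores(X, y, X_new, y_new, X_test, y_test, lam=1e-2):
    """Classic influence-function data valuation for ridge regression: a
    positive score predicts that adding the candidate point reduces test
    loss (valuable data); near-zero predicts it is redundant."""
    d = X.shape[1]
    H = X.T @ X + lam * np.eye(d)                     # Hessian of the loss
    w = np.linalg.solve(H, X.T @ y)                   # current ridge model
    g_test = X_test.T @ (X_test @ w - y_test)         # test-loss gradient
    g_new = X_new * (X_new @ w - y_new)[:, None]      # per-candidate gradients
    return g_new @ np.linalg.solve(H, g_test)         # (num_candidates,)

rng = np.random.default_rng(0)
X, w_true = rng.normal(size=(200, 5)), np.ones(5)
y = X @ w_true + 0.1 * rng.normal(size=200)
X_test = rng.normal(size=(50, 5)); y_test = X_test @ w_true
X_new = rng.normal(size=(5, 5)); y_new = X_new @ w_true
print(influence_scores(X, y, X_new, y_new, X_test, y_test))
```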
CARER: Contextualized Affect Representations for Emotion Recognition
Title | CARER: Contextualized Affect Representations for Emotion Recognition |
Authors | Elvis Saravia, Hsien-Chi Toby Liu, Yen-Hao Huang, Junlin Wu, Yi-Shin Chen |
Abstract | Emotions are expressed in nuanced ways, which vary with collective or individual experiences, knowledge, and beliefs. Therefore, to understand emotion as conveyed through text, a robust mechanism capable of capturing and modeling different linguistic nuances and phenomena is needed. We propose a semi-supervised, graph-based algorithm to produce rich structural descriptors which serve as the building blocks for constructing contextualized affect representations from text. The pattern-based representations are further enriched with word embeddings and evaluated through several emotion recognition tasks. Our experimental results demonstrate that the proposed method outperforms state-of-the-art techniques on emotion recognition tasks. |
Tasks | Emotion Recognition, Semantic Textual Similarity, Word Embeddings |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1404/ |
PWC | https://paperswithcode.com/paper/carer-contextualized-affect-representations |
Repo | |
Framework | |
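The abstract leaves the descriptor construction abstract. The sketch below is only a guess at the *shape* of a pattern-plus-embedding representation (keep function words, wildcard the rest, attach averaged embeddings); CARER's graph-based, semi-supervised pattern induction is not reproduced.

```python
import numpy as np

FUNCTION_WORDS = {"i", "am", "so", "the", "a", "to", "of", "my", "this"}

def pattern(tokens):
    """Very rough stand-in for a structural descriptor: keep function words,
    abstract content words to a wildcard, yielding a reusable pattern."""
    return " ".join(t if t in FUNCTION_WORDS else "*" for t in tokens)

def represent(tokens, emb, dim=50):
    """Enrich the pattern with averaged word embeddings, mirroring the
    'pattern-based representations further enriched with embeddings' idea."""
    vecs = [emb[t] for t in tokens if t in emb]
    avg = np.mean(vecs, axis=0) if vecs else np.zeros(dim)
    return pattern(tokens), avg

emb = {"furious": np.ones(50), "thrilled": -np.ones(50)}  # toy embeddings
print(represent("i am so furious".split(), emb)[0])        # 'i am so *'
```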
Going Dutch: Creating SimpleNLG-NL
Title | Going Dutch: Creating SimpleNLG-NL |
Authors | Ruud de Jong, Mariët Theune |
Abstract | This paper presents SimpleNLG-NL, an adaptation of the SimpleNLG surface realisation engine for the Dutch language. It describes a novel method for determining and testing the grammatical constructions to be implemented, using target sentences sampled from a treebank. |
Tasks | Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6508/ |
PWC | https://paperswithcode.com/paper/going-dutch-creating-simplenlg-nl |
Repo | |
Framework | |
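The treebank-driven methodology can be pictured as a coverage harness like the following, where `realise` stands in for a SimpleNLG-NL call and the sampling and comparison details are assumptions.

```python
def coverage_report(treebank_samples, realise):
    """Sketch of the development loop the paper describes: sample target
    sentences from a treebank, try to realise each from its structure, and
    log mismatches to decide which grammatical construction to implement
    next. 'realise' is a placeholder for the engine's realisation call."""
    failures = []
    for parsed, target in treebank_samples:
        if realise(parsed) != target:
            failures.append((parsed, target))
    ok = len(treebank_samples) - len(failures)
    print(f"{ok}/{len(treebank_samples)} target sentences realised correctly")
    return failures  # prioritise the constructions these sentences need

# Toy usage with a stub realiser:
samples = [({"verb": "lopen", "subj": "ik"}, "ik loop")]
coverage_report(samples, lambda parsed: parsed["subj"] + " loop")
```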
Apertium’s Web Toolchain for Low-Resource Language Technology
Title | Apertium’s Web Toolchain for Low-Resource Language Technology |
Authors | Sushain Cherivirala, Shardul Chiplunkar, Jonathan Washington, Kevin Unhammer |
Abstract | |
Tasks | |
Published | 2018-03-01 |
URL | https://www.aclweb.org/anthology/W18-2207/ |
PWC | https://paperswithcode.com/paper/apertiumas-web-toolchain-for-low-resource |
Repo | |
Framework | |
The Lost Combinator
Title | The Lost Combinator |
Authors | Mark Steedman |
Abstract | |
Tasks | Spoken Language Understanding |
Published | 2018-12-01 |
URL | https://www.aclweb.org/anthology/J18-4001/ |
PWC | https://paperswithcode.com/paper/the-lost-combinator |
Repo | |
Framework | |
Tree2Tree Learning with Memory Unit
Title | Tree2Tree Learning with Memory Unit |
Authors | Ning Miao, Hengliang Wang, Ran Le, Chongyang Tao, Mingyue Shang, Rui Yan, Dongyan Zhao |
Abstract | Traditional recurrent neural network (RNN) or convolutional neural network (CNN) based sequence-to-sequence models cannot handle tree-structured data well. To alleviate this problem, in this paper we propose a tree-to-tree model with specially designed encoder and decoder units, which recursively encodes tree inputs into highly folded tree embeddings and decodes the embeddings into tree outputs. Our model can represent the complex information of a tree while also restoring the tree from its embedding. We evaluate our model on a random tree recovery task and a neural machine translation task. Experiments show that our model outperforms the baseline model. |
Tasks | Machine Translation |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=Syt0r4bRZ |
PDF | https://openreview.net/pdf?id=Syt0r4bRZ |
PWC | https://paperswithcode.com/paper/tree2tree-learning-with-memory-unit |
Repo | |
Framework | |
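To make "recursively encodes tree inputs into highly folded tree embeddings" concrete, here is a minimal recursive encoder in PyTorch. A plain GRUCell replaces the paper's specially designed memory units, so this is an illustration, not a reimplementation; a decoder would mirror it by unfolding the vector back into a tree.

```python
import torch
import torch.nn as nn

class TreeEncoder(nn.Module):
    """Each node's vector is computed from its label and its (already
    encoded) children, folding the whole tree into a single embedding."""
    def __init__(self, vocab, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.cell = nn.GRUCell(dim, dim)

    def forward(self, tree):
        label, children = tree           # tree = (label_id, [subtrees])
        h = torch.zeros(1, self.embed.embedding_dim)
        for child in children:           # fold each child into the state
            h = self.cell(self.forward(child), h)
        return self.cell(self.embed(torch.tensor([label])), h)

enc = TreeEncoder(vocab=100)
tree = (1, [(2, []), (3, [(4, [])])])    # a small example tree
print(enc(tree).shape)                   # torch.Size([1, 32])
```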
Human Pose Estimation With Parsing Induced Learner
Title | Human Pose Estimation With Parsing Induced Learner |
Authors | Xuecheng Nie, Jiashi Feng, Yiming Zuo, Shuicheng Yan |
Abstract | Human pose estimation still faces various difficulties in challenging scenarios. Human parsing, as a closely related task, can provide valuable cues for better pose estimation, which however have not been fully exploited. In this paper, we propose a novel Parsing Induced Learner that exploits parsing information to effectively assist pose estimation by learning to quickly adapt the base pose estimation model. The proposed Parsing Induced Learner is composed of a parsing encoder and a pose model parameter adapter, which together learn to predict dynamic parameters of the pose model to extract complementary useful features for more accurate pose estimation. Comprehensive experiments on the LIP and extended PASCAL-Person-Part benchmarks show that the proposed Parsing Induced Learner can improve the performance of both single- and multi-person pose estimation to a new state of the art. Cross-dataset experiments also show that the proposed Parsing Induced Learner from the LIP dataset can accelerate learning of a human pose estimation model on the MPII benchmark, in addition to achieving superior performance. |
Tasks | Human Parsing, Multi-Person Pose Estimation, Pose Estimation |
Published | 2018-06-01 |
URL | http://openaccess.thecvf.com/content_cvpr_2018/html/Nie_Human_Pose_Estimation_CVPR_2018_paper.html |
PDF | http://openaccess.thecvf.com/content_cvpr_2018/papers/Nie_Human_Pose_Estimation_CVPR_2018_paper.pdf |
PWC | https://paperswithcode.com/paper/human-pose-estimation-with-parsing-induced |
Repo | |
Framework | |
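The core "pose model parameter adapter" idea, sketched in PyTorch: parsing features predict per-sample convolution kernels that are applied to the pose features, so parsing cues dynamically adapt the pose model. All channel sizes, the pooling, and the single-kernel design are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ParamAdapter(nn.Module):
    """Predict dynamic convolution parameters from parsing features and
    apply them to pose feature maps (a sketch of parameter adaptation)."""
    def __init__(self, parse_ch=64, pose_ch=64, k=3):
        super().__init__()
        self.pose_ch, self.k = pose_ch, k
        self.predict = nn.Linear(parse_ch, pose_ch * pose_ch * k * k)

    def forward(self, pose_feat, parse_feat):
        # Predict one kernel per sample from globally pooled parsing features.
        pooled = parse_feat.mean(dim=(2, 3))                   # (B, parse_ch)
        w = self.predict(pooled).view(-1, self.pose_ch,
                                      self.pose_ch, self.k, self.k)
        # Apply each sample's own kernel to its own pose features.
        out = [F.conv2d(pose_feat[i:i + 1], w[i], padding=self.k // 2)
               for i in range(pose_feat.size(0))]
        return torch.cat(out, dim=0)

adapter = ParamAdapter()
pose = torch.randn(2, 64, 32, 32)    # base pose-model features
parse = torch.randn(2, 64, 32, 32)   # parsing-encoder features
print(adapter(pose, parse).shape)    # torch.Size([2, 64, 32, 32])
```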