October 15, 2019

2389 words 12 mins read

Paper Group NANR 194

A Multi-Context Character Prediction Model for a Brain-Computer Interface. Challenges and Opportunities of Applying Natural Language Processing in Business Process Management. Objective and efficient inference for couplings in neuronal networks. The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models. Identifying the Mos …

A Multi-Context Character Prediction Model for a Brain-Computer Interface

Title A Multi-Context Character Prediction Model for a Brain-Computer Interface
Authors Shiran Dudy, Shaobin Xu, Steven Bedrick, David Smith
Abstract Brain-computer interfaces and other augmentative and alternative communication devices introduce language-modeling challenges distinct from other character-entry methods. In particular, the acquired EEG (electroencephalogram) signal is noisier, which in turn makes the user's intent harder to decipher. To adapt to this condition, we propose to maintain an ambiguous history for every time step and to employ, apart from the character language model, word information to produce a more robust prediction system. We present preliminary results that compare this proposed Online-Context Language Model (OCLM) to current algorithms used in this type of setting. Evaluation on both perplexity and predictive accuracy demonstrates promising results when dealing with ambiguous histories in order to provide the front end with a distribution over the next character the user might type.
Tasks EEG, Language Modelling
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1210/
PDF https://www.aclweb.org/anthology/W18-1210
PWC https://paperswithcode.com/paper/a-multi-context-character-prediction-model
Repo
Framework
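
The abstract describes keeping a weighted set of ambiguous typing histories and combining a character language model with word-level evidence. The paper's OCLM is more involved; what follows is only a minimal toy sketch of that general idea, in which `char_lm`, `word_prior`, and the bigram/unigram tables are hypothetical stand-ins rather than the paper's models.

```python
import math

# Toy stand-ins (hypothetical, not the paper's models): a character
# bigram scorer and a word-frequency prior over completed prefixes.
CHAR_BIGRAMS = {("t", "h"): 0.4, ("h", "e"): 0.5, ("t", "e"): 0.1}
WORD_PRIOR = {"the": 0.05, "then": 0.01, "tee": 0.001}

def char_lm(prev: str, nxt: str) -> float:
    """Log-probability of `nxt` given the previous character (toy bigram)."""
    return math.log(CHAR_BIGRAMS.get((prev, nxt), 1e-4))

def word_prior(prefix: str) -> float:
    """Log-prior that `prefix` extends to a real word (toy unigram lookup)."""
    mass = sum(p for w, p in WORD_PRIOR.items() if w.startswith(prefix))
    return math.log(mass + 1e-6)

def next_char_distribution(ambiguous_history, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Score each candidate next character against every weighted history.

    `ambiguous_history` is a list of (prefix, log_weight) pairs, since the
    noisy EEG signal leaves several plausible typed prefixes alive at once.
    """
    scores = {}
    for c in alphabet:
        total = -math.inf
        for prefix, logw in ambiguous_history:
            s = logw + char_lm(prefix[-1], c) + word_prior(prefix + c)
            total = max(total, s)  # max over histories; a sum is also plausible
        scores[c] = total
    # Normalize into a distribution for the front end.
    z = max(scores.values())
    exp = {c: math.exp(s - z) for c, s in scores.items()}
    norm = sum(exp.values())
    return {c: v / norm for c, v in exp.items()}

dist = next_char_distribution([("th", math.log(0.7)), ("te", math.log(0.3))])
print(sorted(dist.items(), key=lambda kv: -kv[1])[:3])
```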

Challenges and Opportunities of Applying Natural Language Processing in Business Process Management

Title Challenges and Opportunities of Applying Natural Language Processing in Business Process Management
Authors Han van der Aa, Josep Carmona, Henrik Leopold, Jan Mendling, Lluís Padró
Abstract The Business Process Management (BPM) field focuses on the coordination of labor so that organizational processes are smoothly executed and products and services are properly delivered. At the same time, NLP has reached a maturity level that enables its widespread application in many contexts, thanks to publicly available frameworks. In this position paper, we show how NLP has the potential to raise the benefits of BPM practices at different levels. Rather than being exhaustive, we highlight selected key challenges where a successful application of NLP techniques would facilitate the automation of particular tasks that nowadays require significant effort to accomplish. Finally, we report on applications that consider both the process perspective and its enhancement through NLP.
Tasks
Published 2018-08-01
URL https://www.aclweb.org/anthology/C18-1236/
PDF https://www.aclweb.org/anthology/C18-1236
PWC https://paperswithcode.com/paper/challenges-and-opportunities-of-applying
Repo
Framework

Objective and efficient inference for couplings in neuronal networks

Title Objective and efficient inference for couplings in neuronal networks
Authors Yu Terada, Tomoyuki Obuchi, Takuya Isomura, Yoshiyuki Kabashima
Abstract Inferring directional couplings from the spike data of networks is desired in various scientific fields such as neuroscience. Here, we apply a recently proposed objective procedure to spike data obtained from Hodgkin-Huxley-type models and from in vitro neuronal networks cultured in a circular structure. As a result, we succeed in accurately reconstructing synaptic connections from evoked as well as spontaneous activity. To obtain these results, we derive an analytic formula that approximately implements a method for screening relevant couplings. This significantly reduces the computational cost of the screening step in the proposed objective procedure, making it possible to treat systems as large as those in this study.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7745-objective-and-efficient-inference-for-couplings-in-neuronal-networks
PDF http://papers.nips.cc/paper/7745-objective-and-efficient-inference-for-couplings-in-neuronal-networks.pdf
PWC https://paperswithcode.com/paper/objective-and-efficient-inference-for
Repo
Framework
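
The paper's analytic screening formula is not reproduced in the abstract, so the following is only a hedged sketch of the task: a common lagged-correlation baseline for estimating directional couplings from binned spike trains, followed by a generic screening step that keeps only the strongest candidates for more expensive inference.

```python
import numpy as np

def lagged_coupling_matrix(spikes: np.ndarray, lag: int = 1) -> np.ndarray:
    """Crude directional coupling estimate from binned spike trains.

    spikes: (n_neurons, n_bins) binary array. Entry (i, j) correlates
    neuron j's activity with neuron i's activity `lag` bins later; this
    is a common baseline, not the paper's objective procedure.
    """
    x = spikes - spikes.mean(axis=1, keepdims=True)
    future = x[:, lag:]                # activity of putative targets
    past = x[:, :-lag]                 # activity of putative sources
    cov = future @ past.T / past.shape[1]
    std = x.std(axis=1) + 1e-12
    return cov / np.outer(std, std)    # normalized, indexed (target, source)

def screen_couplings(J: np.ndarray, keep_frac: float = 0.1) -> np.ndarray:
    """Screening step: zero out all but the strongest couplings, so that a
    more expensive inference only needs to refit the surviving entries."""
    thresh = np.quantile(np.abs(J), 1.0 - keep_frac)
    return np.where(np.abs(J) >= thresh, J, 0.0)

rng = np.random.default_rng(0)
spikes = (rng.random((20, 5000)) < 0.05).astype(float)
J = screen_couplings(lagged_coupling_matrix(spikes))
print(f"{(J != 0).sum()} couplings survive screening")
```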

The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models

Title The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models
Authors Chen Dan, Liu Leqi, Bryon Aragam, Pradeep K. Ravikumar, Eric P. Xing
Abstract We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions. Under these assumptions, we establish an $\Omega(K\log K)$ labeled sample complexity bound without imposing parametric assumptions, where $K$ is the number of classes. Our results suggest that even in nonparametric settings it is possible to learn a near-optimal classifier using only a few labeled samples. Unlike previous theoretical work which focuses on binary classification, we consider general multiclass classification ($K>2$), which requires solving a difficult permutation learning problem. This permutation defines a classifier whose classification error is controlled by the Wasserstein distance between mixing measures, and we provide finite-sample results characterizing the behaviour of the excess risk of this classifier. Finally, we describe three algorithms for computing these estimators based on a connection to bipartite graph matching, and perform experiments to illustrate the superiority of the MLE over the majority vote estimator.
Tasks Graph Matching
Published 2018-12-01
URL http://papers.nips.cc/paper/8144-the-sample-complexity-of-semi-supervised-learning-with-nonparametric-mixture-models
PDF http://papers.nips.cc/paper/8144-the-sample-complexity-of-semi-supervised-learning-with-nonparametric-mixture-models.pdf
PWC https://paperswithcode.com/paper/the-sample-complexity-of-semi-supervised
Repo
Framework
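
The abstract mentions that the class-to-component permutation can be computed via bipartite graph matching. Below is a minimal sketch of that connection under toy assumptions (1-D Gaussian components, a handful of labeled points); the cost construction is an illustrative choice, not the paper's estimator.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.stats import norm

# Toy setup: K=3 one-dimensional Gaussian components learned from
# unlabeled data, plus a handful of labeled points.
components = [norm(-3.0, 1.0), norm(0.0, 1.0), norm(3.0, 1.0)]
labeled_x = np.array([-2.8, -3.2, 0.1, 2.9, 3.1])
labeled_y = np.array([0, 0, 1, 2, 2])  # true class of each labeled point

K = len(components)
# cost[k, c] = negative log-likelihood of class c's labeled points under
# component k; a small labeled set suffices to pin down the permutation,
# matching the O(K log K) flavor of the labeled sample complexity bound.
cost = np.zeros((K, K))
for k, comp in enumerate(components):
    for c in range(K):
        cost[k, c] = -comp.logpdf(labeled_x[labeled_y == c]).sum()

rows, cols = linear_sum_assignment(cost)  # min-cost bipartite matching
perm = dict(zip(rows, cols))              # component k -> class perm[k]
print("component-to-class assignment:", perm)
```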

Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations

Title Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations
Authors Prafulla Kumar Choubey, Kaushik Raju, Ruihong Huang
Abstract Identifying the most dominant and central event of a document, which governs and connects the other foreground and background events in the document, is useful for many applications, such as text summarization, storyline generation and text segmentation. We observed that the central event of a document usually has many coreferential event mentions scattered throughout the document, enabling a smooth transition between subtopics. Our empirical experiments, using gold event coreference relations, have shown that the central event of a document can be identified well by mining properties of event coreference chains, but performance drops when switching to system-predicted event coreference relations. In addition, we found that the central event can be identified more accurately by further considering the number of sub-events as well as the realis status of an event.
Tasks Text Summarization
Published 2018-06-01
URL https://www.aclweb.org/anthology/N18-2055/
PDF https://www.aclweb.org/anthology/N18-2055
PWC https://paperswithcode.com/paper/identifying-the-most-dominant-event-in-a-news
Repo
Framework
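
The abstract's core signal is that central events have large, widely scattered coreference chains. A minimal sketch of scoring chains on those two properties follows; the features and their additive combination are hypothetical stand-ins, and the paper's additional sub-event and realis features are omitted.

```python
from dataclasses import dataclass

@dataclass
class EventChain:
    event_id: str
    mention_positions: list  # sentence indices of coreferential mentions

def chain_score(chain: EventChain, num_sentences: int) -> float:
    """Score an event coreference chain by its size and document spread.

    These two features are stand-ins suggested by the abstract (many
    mentions, scattered through the document); the exact feature set and
    combination used in the paper may differ.
    """
    size = len(chain.mention_positions)
    spread = max(chain.mention_positions) - min(chain.mention_positions) + 1
    return size + spread / num_sentences  # simple additive combination

def most_dominant_event(chains, num_sentences):
    return max(chains, key=lambda ch: chain_score(ch, num_sentences))

chains = [
    EventChain("bombing", [0, 3, 7, 12, 18]),  # long, scattered chain
    EventChain("arrest", [14, 15]),            # short, local chain
]
print(most_dominant_event(chains, num_sentences=20).event_id)
```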

Combining rule-based and embedding-based approaches to normalize textual entities with an ontology

Title Combining rule-based and embedding-based approaches to normalize textual entities with an ontology
Authors Arnaud Ferré, Louise Deléger, Pierre Zweigenbaum, Claire Nédellec
Abstract
Tasks Entity Linking, Word Embeddings
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1543/
PDF https://www.aclweb.org/anthology/L18-1543
PWC https://paperswithcode.com/paper/combining-rule-based-and-embedding-based
Repo
Framework

T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples

Title T-REx: A Large Scale Alignment of Natural Language with Knowledge Base Triples
Authors Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Frederique Laforest, Elena Simperl
Abstract
Tasks Entity Linking, Knowledge Base Population, Question Answering, Relation Extraction, Text Generation
Published 2018-05-01
URL https://www.aclweb.org/anthology/L18-1544/
PDF https://www.aclweb.org/anthology/L18-1544
PWC https://paperswithcode.com/paper/t-rex-a-large-scale-alignment-of-natural
Repo
Framework

Deep Bayesian Nonparametric Tracking

Title Deep Bayesian Nonparametric Tracking
Authors Aonan Zhang, John Paisley
Abstract Time-series data often exhibit irregular behavior, making them hard to analyze and explain with a simple dynamic model. For example, information in social networks may show change-point-like bursts that then diffuse with smooth dynamics. Powerful models such as deep neural networks learn smooth functions from data, but are not as well suited (in off-the-shelf form) to discovering and explaining sparse, discrete and bursty dynamic patterns. Bayesian models can do this well by encoding the appropriate probabilistic assumptions in the model prior. We propose an integration of Bayesian nonparametric methods within deep neural networks for modeling irregular patterns in time-series data. We use Bayesian nonparametrics to model change-point behavior in time, and a deep neural network to model nonlinear latent-space dynamics. We compare with a non-deep linear version of the model, also proposed here. Empirical evaluations demonstrate improved performance and interpretable results when tracking stock prices and Twitter trends.
Tasks Time Series
Published 2018-07-01
URL https://icml.cc/Conferences/2018/Schedule?showEvent=2090
PDF http://proceedings.mlr.press/v80/zhang18j/zhang18j.pdf
PWC https://paperswithcode.com/paper/deep-bayesian-nonparametric-tracking
Repo
Framework
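
To make the data regime in the abstract concrete (rare discrete bursts whose effect then diffuses smoothly), here is a toy generative sketch. It only simulates that kind of series; it is not the paper's model, which couples a Bayesian nonparametric change-point prior with deep latent dynamics.

```python
import numpy as np

def simulate_bursty_series(n_steps=500, burst_rate=0.02, decay=0.95, seed=0):
    """Toy generator: a smooth AR(1) drift punctuated by rare,
    discrete bursts (change points) that then decay smoothly."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    for t in range(1, n_steps):
        x[t] = decay * x[t - 1] + 0.05 * rng.normal()  # smooth dynamics
        if rng.random() < burst_rate:                  # rare change point
            x[t] += rng.normal(loc=3.0)                # discrete burst
    return x

series = simulate_bursty_series()
print(f"mean={series.mean():.3f}, max={series.max():.3f}")
```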

Computing Higher Order Derivatives of Matrix and Tensor Expressions

Title Computing Higher Order Derivatives of Matrix and Tensor Expressions
Authors Soeren Laue, Matthias Mitterreiter, Joachim Giesen
Abstract Optimization is an integral part of most machine learning systems and most numerical optimization schemes rely on the computation of derivatives. Therefore, frameworks for computing derivatives are an active area of machine learning research. Surprisingly, as of yet, no existing framework is capable of computing higher order matrix and tensor derivatives directly. Here, we close this fundamental gap and present an algorithmic framework for computing matrix and tensor derivatives that extends seamlessly to higher order derivatives. The framework can be used for symbolic as well as for forward and reverse mode automatic differentiation. Experiments show a speedup between one and four orders of magnitude over state-of-the-art frameworks when evaluating higher order derivatives.
Tasks
Published 2018-12-01
URL http://papers.nips.cc/paper/7540-computing-higher-order-derivatives-of-matrix-and-tensor-expressions
PDF http://papers.nips.cc/paper/7540-computing-higher-order-derivatives-of-matrix-and-tensor-expressions.pdf
PWC https://paperswithcode.com/paper/computing-higher-order-derivatives-of-matrix
Repo
Framework
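
The paper's framework computes higher-order matrix and tensor derivatives directly; that framework is not sketched here. As a hedged illustration of what "higher-order derivative of a matrix expression" means, the snippet below checks the analytic second derivative of $f(x) = x^\top A x$, which is the constant matrix $A + A^\top$, against the brute-force central-difference baseline such frameworks aim to replace.

```python
import numpy as np

# Illustration only, not the paper's framework: for f(x) = x^T A x the
# gradient is (A + A^T) x and the Hessian is the constant matrix A + A^T.
rng = np.random.default_rng(1)
n = 4
A = rng.normal(size=(n, n))
f = lambda x: x @ A @ x

def numerical_hessian(f, x, eps=1e-4):
    """Central-difference Hessian: O(n^2) function evaluations, the
    brute-force baseline for higher-order differentiation."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return H

x0 = rng.normal(size=n)
print(np.allclose(numerical_hessian(f, x0), A + A.T, atol=1e-4))  # True
```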

Learning to Write by Learning the Objective

Title Learning to Write by Learning the Objective
Authors Ari Holtzman, Jan Buys, Maxwell Forbes, Antoine Bosselut, Yejin Choi
Abstract Recurrent Neural Networks (RNNs) are powerful autoregressive sequence models for learning prevalent patterns in natural language. Yet language generated by RNNs often shows several degenerate characteristics that are uncommon in human language; while fluent, RNN language production can be overly generic, repetitive, and even self-contradictory. We postulate that the objective function optimized by RNN language models, which amounts to the overall perplexity of a text, is not expressive enough to capture the abstract qualities of good generation such as Grice’s Maxims. In this paper, we introduce a general learning framework that can construct a decoding objective better suited for generation. Starting with a generatively trained RNN language model, our framework learns to construct a substantially stronger generator by combining several discriminatively trained models that can collectively address the limitations of RNN generation. Human evaluation demonstrates that text generated by the resulting generator is preferred over that of baselines by a large margin and significantly enhances the overall coherence, style, and information content of the generated text.
Tasks Language Modelling
Published 2018-01-01
URL https://openreview.net/forum?id=r1lfpfZAb
PDF https://openreview.net/pdf?id=r1lfpfZAb
PWC https://paperswithcode.com/paper/learning-to-write-by-learning-the-objective
Repo
Framework
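
The general recipe the abstract describes is to rank candidate continuations by a base LM score plus a weighted sum of discriminatively trained scorers. Below is a minimal sketch of that combination; the two toy scorers and the fixed weights are hypothetical stand-ins (the paper learns its discriminators and mixture), chosen here to penalize the repetition degeneracy the abstract mentions.

```python
# Combined decoding objective: base LM log-prob plus weighted auxiliary scores.
def combined_score(candidate: str, base_lm_logprob: float,
                   scorers, weights) -> float:
    return base_lm_logprob + sum(w * s(candidate) for s, w in zip(scorers, weights))

# Toy discriminators (hypothetical, not the paper's learned models).
def repetition_penalty(text: str) -> float:
    words = text.split()
    return -sum(1.0 for a, b in zip(words, words[1:]) if a == b)

def length_bonus(text: str) -> float:
    return 0.1 * len(text.split())

candidates = {
    "the cat sat on the mat": -8.0,   # candidate -> base LM log-prob
    "the the the the the the": -6.5,  # fluent to the LM, but degenerate
}
scorers, weights = [repetition_penalty, length_bonus], [2.0, 1.0]
best = max(candidates,
           key=lambda c: combined_score(c, candidates[c], scorers, weights))
print(best)  # the repetitive candidate is outscored despite lower perplexity
```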

Proceedings of the Second Workshop on Stylistic Variation

Title Proceedings of the Second Workshop on Stylistic Variation
Authors
Abstract
Tasks
Published 2018-06-01
URL https://www.aclweb.org/anthology/W18-1600/
PDF https://www.aclweb.org/anthology/W18-1600
PWC https://paperswithcode.com/paper/proceedings-of-the-second-workshop-on
Repo
Framework

Empirical Risk Landscape Analysis for Understanding Deep Neural Networks

Title Empirical Risk Landscape Analysis for Understanding Deep Neural Networks
Authors Pan Zhou, Jiashi Feng
Abstract This work provides a comprehensive landscape analysis of the empirical risk in deep neural networks (DNNs), including the convergence behavior of its gradient, its stationary points and the empirical risk itself to their corresponding population counterparts, which reveals how various network parameters determine convergence performance. In particular, for an $l$-layer linear neural network with $d_i$ neurons in the $i$-th layer, we prove that the gradient of its empirical risk uniformly converges to that of its population risk at the rate of $\mathcal{O}(r^{2l} \sqrt{l\sqrt{\max_i d_i}\, s\log(d/l)/n})$. Here $d$ is the total weight dimension, $s$ is the number of nonzero entries of all the weights, and the magnitude of the weights per layer is upper bounded by $r$. Moreover, we prove the one-to-one correspondence of the non-degenerate stationary points between the empirical and population risks and provide a convergence guarantee for each pair. We also establish the uniform convergence of the empirical risk to its population counterpart and further derive stability and generalization bounds for the empirical risk. In addition, we analyze these properties for deep \emph{nonlinear} neural networks with sigmoid activation functions, proving similar results for the convergence behavior of their empirical risk gradients, their non-degenerate stationary points, and the empirical risk itself. To the best of our knowledge, this work is the first to theoretically characterize the uniform convergence of the gradient and stationary points of the empirical risk of DNN models, which benefits the theoretical understanding of how the network depth $l$, the layer width $d_i$, the network size $d$, the weight sparsity $s$ and the parameter magnitude $r$ determine the neural network landscape.
Tasks
Published 2018-01-01
URL https://openreview.net/forum?id=B1QgVti6Z
PDF https://openreview.net/pdf?id=B1QgVti6Z
PWC https://paperswithcode.com/paper/empirical-risk-landscape-analysis-for
Repo
Framework

Learning Compression from Limited Unlabeled Data

Title Learning Compression from Limited Unlabeled Data
Authors Xiangyu He, Jian Cheng
Abstract Convolutional neural networks (CNNs) have dramatically advanced the state of the art in a number of domains. However, most models are both computation and memory intensive, which has aroused interest in network compression. While existing compression methods achieve good performance, they suffer from three limitations: 1) the inevitable retraining with enormous labeled data; 2) the massive GPU hours required for retraining; 3) the training tricks needed for model compression. The requirement of retraining on the original datasets in particular makes these methods difficult to apply in many real-world scenarios where training data is not publicly available. In this paper, we show that re-normalization is a practical and effective way to alleviate the above limitations. Through quantization or pruning, most methods compress a large number of parameters but ignore the core cause of performance degradation, namely the Gaussian conjugate prior induced by batch normalization. By employing re-estimated statistics in batch normalization, we significantly improve the accuracy of compressed CNNs. Extensive experiments on ImageNet show our approach outperforms baselines by a large margin and is comparable to label-based methods. Moreover, the fine-tuning process takes less than 5 minutes on a CPU, using 1000 unlabeled images.
Tasks Model Compression, Quantization
Published 2018-09-01
URL http://openaccess.thecvf.com/content_ECCV_2018/html/Xiangyu_He_Learning_Compression_from_ECCV_2018_paper.html
PDF http://openaccess.thecvf.com/content_ECCV_2018/papers/Xiangyu_He_Learning_Compression_from_ECCV_2018_paper.pdf
PWC https://paperswithcode.com/paper/learning-compression-from-limited-unlabeled
Repo
Framework
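
The abstract's key step is re-estimating batch-normalization statistics on a small unlabeled set after compression, with no labels and no weight updates. A minimal PyTorch sketch of that step follows, assuming a model that has already been quantized or pruned; the exact procedure in the paper may differ in details.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def reestimate_bn_stats(model: nn.Module, unlabeled_loader, device="cpu"):
    """Re-estimate batch-norm running statistics on unlabeled data.

    Only forward passes in training mode are involved: no labels, no
    gradient updates, just refreshed running means and variances.
    """
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None  # accumulate a cumulative (unbiased) average
    model.train()              # BN updates running stats only in train mode
    for x in unlabeled_loader:
        model(x.to(device))
    model.eval()
    return model

# Toy usage: a tiny conv net and a few random "unlabeled" batches.
net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
batches = [torch.randn(16, 3, 32, 32) for _ in range(10)]
reestimate_bn_stats(net, batches)
```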

Assigning people to tasks identified in email: The EPA dataset for addressee tagging for detected task intent

Title Assigning people to tasks identified in email: The EPA dataset for addressee tagging for detected task intent
Authors Revanth Rameshkumar, Peter Bailey, Abhishek Jha, Chris Quirk
Abstract We describe the Enron People Assignment (EPA) dataset, in which tasks that are described in emails are associated with the person(s) responsible for carrying out these tasks. We identify tasks and the responsible people in the Enron email dataset. We define evaluation methods for this challenge and report scores for our model and naïve baselines. The resulting model enables a user experience operating within a commercial email service: given a person and a task, it determines if the person should be notified of the task.
Tasks
Published 2018-11-01
URL https://www.aclweb.org/anthology/W18-6104/
PDF https://www.aclweb.org/anthology/W18-6104
PWC https://paperswithcode.com/paper/assigning-people-to-tasks-identified-in-email
Repo
Framework

Turn-Taking Strategies for Human-Robot Peer-Learning Dialogue

Title Turn-Taking Strategies for Human-Robot Peer-Learning Dialogue
Authors Ranjini Das, Heather Pon-Barry
Abstract In this paper, we apply the contribution model of grounding to a corpus of human-human peer-mentoring dialogues. From this analysis, we propose effective turn-taking strategies for human-robot interaction with a teachable robot. Specifically, we focus on (1) how robots can encourage humans to present and (2) how robots can signal that they are going to begin a new presentation. We evaluate the strategies against a corpus of human-robot dialogues and offer three guidelines for teachable robots to follow to achieve more human-like collaborative dialogue.
Tasks
Published 2018-07-01
URL https://www.aclweb.org/anthology/W18-5013/
PDF https://www.aclweb.org/anthology/W18-5013
PWC https://paperswithcode.com/paper/turn-taking-strategies-for-human-robot-peer
Repo
Framework