July 26, 2019


Paper Group NANR 183

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Spherical convolutions and their application in molecular modelling. Mean Field Residual Networks: On the Edge of Chaos. BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affecte …

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Title	Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Authors
Abstract
Tasks
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-2000/
PDF	https://www.aclweb.org/anthology/P17-2000
PWC	https://paperswithcode.com/paper/proceedings-of-the-55th-annual-meeting-of-the
Repo
Framework

Spherical convolutions and their application in molecular modelling


Title	Spherical convolutions and their application in molecular modelling
Authors	Wouter Boomsma, Jes Frellsen
Abstract	Convolutional neural networks are increasingly used outside the domain of image analysis, in particular in various areas of the natural sciences concerned with spatial data. Such networks often work out-of-the box, and in some cases entire model architectures from image analysis can be carried over to other problem domains almost unaltered. Unfortunately, this convenience does not trivially extend to data in non-euclidean spaces, such as spherical data. In this paper, we introduce two strategies for conducting convolutions on the sphere, using either a spherical-polar grid or a grid based on the cubed-sphere representation. We investigate the challenges that arise in this setting, and extend our discussion to include scenarios of spherical volumes, with several strategies for parameterizing the radial dimension. As a proof of concept, we conclude with an assessment of the performance of spherical convolutions in the context of molecular modelling, by considering structural environments within proteins. We show that the models are capable of learning non-trivial functions in these molecular environments, and that our spherical convolutions generally outperform standard 3D convolutions in this setting. In particular, despite the lack of any domain specific feature-engineering, we demonstrate performance comparable to state-of-the-art methods in the field, which build on decades of domain-specific knowledge.
Tasks	Feature Engineering
Published	2017-12-01
URL	http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling
PDF	http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling.pdf
PWC	https://paperswithcode.com/paper/spherical-convolutions-and-their-application
Repo
Framework

Mean Field Residual Networks: On the Edge of Chaos


Title	Mean Field Residual Networks: On the Edge of Chaos
Authors	Ge Yang, Samuel Schoenholz
Abstract	We study randomly initialized residual networks using mean field theory and the theory of difference equations. Classical feedforward neural networks, such as those with tanh activations, exhibit exponential behavior on the average when propagating inputs forward or gradients backward. The exponential forward dynamics causes rapid collapsing of the input space geometry, while the exponential backward dynamics causes drastic vanishing or exploding gradients. We show, in contrast, that by adding skip connections, the network will, depending on the nonlinearity, adopt subexponential forward and backward dynamics, and in many cases in fact polynomial. The exponents of these polynomials are obtained through analytic methods and proved and verified empirically to be correct. In terms of the “edge of chaos” hypothesis, these subexponential and polynomial laws allow residual networks to “hover over the boundary between stability and chaos,” thus preserving the geometry of the input space and the gradient information flow. In our experiments, for each activation function we study here, we initialize residual networks with different hyperparameters and train them on MNIST. Remarkably, our initialization time theory can accurately predict test time performance of these networks, by tracking either the expected amount of gradient explosion or the expected squared distance between the images of two input vectors. Importantly, we show, theoretically as well as empirically, that common initializations such as the Xavier or the He schemes are not optimal for residual networks, because the optimal initialization variances depend on the depth. Finally, we have made mathematical contributions by deriving several new identities for the kernels of powers of ReLU functions by relating them to the zeroth Bessel function of the second kind.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos
PDF	http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos.pdf
PWC	https://paperswithcode.com/paper/mean-field-residual-networks-on-the-edge-of
Repo
Framework

BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations


Title	BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations
Authors	Rezarta Islamaj Doğan, Andrew Chatr-aryamontri, Sun Kim, Chih-Hsuan Wei, Yifan Peng, Donald Comeau, Zhiyong Lu
Abstract	The Precision Medicine Track in BioCreative VI aims to bring together the BioNLP community for a novel challenge focused on mining the biomedical literature in search of mutations and protein-protein interactions (PPI). In order to support this track with an effective training dataset with limited curator time, the track organizers carefully reviewed PubMed articles from two different sources: curated public PPI databases, and the results of state-of-the-art public text mining tools. We detail here the data collection, manual review and annotation process and describe this training corpus characteristics. We also describe a corpus performance baseline. This analysis will provide useful information to developers and researchers for comparing and developing innovative text mining approaches for the BioCreative VI challenge and other Precision Medicine related applications.
Tasks	Relation Extraction
Published	2017-08-01
URL	https://www.aclweb.org/anthology/W17-2321/
PDF	https://www.aclweb.org/anthology/W17-2321
PWC	https://paperswithcode.com/paper/biocreative-vi-precision-medicine-track
Repo
Framework

Position-based Multiple-play Bandit Problem with Unknown Position Bias


Title	Position-based Multiple-play Bandit Problem with Unknown Position Bias
Authors	Junpei Komiyama, Junya Honda, Akiko Takeda
Abstract	Motivated by online advertising, we study a multiple-play multi-armed bandit problem with position bias that involves several slots and the latter slots yield fewer rewards. We characterize the hardness of the problem by deriving an asymptotic regret bound. We propose the Permutation Minimum Empirical Divergence (PMED) algorithm and derive its asymptotically optimal regret bound. Because of the uncertainty of the position bias, the optimal algorithm for such a problem requires non-convex optimizations that are different from usual partial monitoring and semi-bandit problems. We propose a cutting-plane method and related bi-convex relaxation for these optimizations by using auxiliary variables.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias
PDF	http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias.pdf
PWC	https://paperswithcode.com/paper/position-based-multiple-play-bandit-problem
Repo
Framework

Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes


Title	Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes
Authors	Anton Mallasto, Aasa Feragen
Abstract	We introduce a novel framework for statistical analysis of populations of non-degenerate Gaussian processes (GPs), which are natural representations of uncertain curves. This allows inherent variation or uncertainty in function-valued data to be properly incorporated in the population analysis. Using the 2-Wasserstein metric we geometrize the space of GPs with L2 mean and covariance functions over compact index spaces. We prove uniqueness of the barycenter of a population of GPs, as well as convergence of the metric and the barycenter of their finite-dimensional counterparts. This justifies practical computations. Finally, we demonstrate our framework through experimental validation on GP datasets representing brain connectivity and climate development. A Matlab library for relevant computations will be published at https://sites.google.com/view/antonmallasto/software.
Tasks	Gaussian Processes
Published	2017-12-01
URL	http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes
PDF	http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes.pdf
PWC	https://paperswithcode.com/paper/learning-from-uncertain-curves-the-2
Repo
Framework
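
The abstract above notes that the metric on GPs is approximated by the metric on their finite-dimensional counterparts, which are multivariate Gaussians. For two Gaussians the squared 2-Wasserstein distance has a standard closed form, ||m_a - m_b||^2 + Tr(C_a + C_b - 2 (C_a^{1/2} C_b C_a^{1/2})^{1/2}). The following is a minimal NumPy sketch of that formula only; it is not the authors' Matlab library, and the function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    cov_a = np.asarray(cov_a, dtype=float)
    cov_b = np.asarray(cov_b, dtype=float)
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    mean_term = np.sum((mean_a - mean_b) ** 2)
    cov_term = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(mean_term + cov_term)
```

To apply this to GPs in the spirit of the abstract, one would evaluate each GP's mean and covariance function on a shared finite grid of index points and pass the resulting vectors and matrices to `gaussian_w2_squared`; the choice of grid is left to the user in this sketch.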

IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text


Title	IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text
Authors	Abhishek Kumar, Abhishek Sethi, Md Shad Akhtar, Asif Ekbal, Chris Biemann, Pushpak Bhattacharyya
Abstract	This paper reports team IITPB's participation in the SemEval 2017 Task 5 on 'Fine-grained sentiment analysis on financial microblogs and news'. We developed 2 systems for the two tracks. One system was based on an ensemble of Support Vector Classifier and Logistic Regression. This system relied on Distributional Thesaurus (DT), word embeddings and lexicon features to predict a floating sentiment value between -1 and +1. The other system was based on Support Vector Regression using word embeddings, lexicon features, and PMI scores as features. The system was ranked 5th in track 1 and 8th in track 2.
Tasks	Sentiment Analysis, Word Embeddings
Published	2017-08-01
URL	https://www.aclweb.org/anthology/S17-2153/
PDF	https://www.aclweb.org/anthology/S17-2153
PWC	https://paperswithcode.com/paper/iitpb-at-semeval-2017-task-5-sentiment
Repo
Framework

Meritocratic Fairness for Cross-Population Selection


Title	Meritocratic Fairness for Cross-Population Selection
Authors	Michael Kearns, Aaron Roth, Zhiwei Steven Wu
Abstract	We consider the problem of selecting a strong pool of individuals from several populations with incomparable skills (e.g. soccer players, mathematicians, and singers) in a fair manner. The quality of an individual is defined to be their relative rank (by cumulative distribution value) within their own population, which permits cross-population comparisons. We study algorithms which attempt to select the highest quality subset despite the fact that true CDF values are not known, and can only be estimated from the finite pool of candidates. Specifically, we quantify the regret in quality imposed by “meritocratic” notions of fairness, which require that individuals are selected with probability that is monotonically increasing in their true quality. We give algorithms with provable fairness and regret guarantees, as well as lower bounds, and provide empirical results which suggest that our algorithms perform better than the theory suggests. We extend our results to a sequential batch setting, in which an algorithm must repeatedly select subsets of individuals from new pools of applicants, but has the benefit of being able to compare them to the accumulated data from previous rounds.
Tasks
Published	2017-08-01
URL	https://icml.cc/Conferences/2017/Schedule?showEvent=744
PDF	http://proceedings.mlr.press/v70/kearns17a/kearns17a.pdf
PWC	https://paperswithcode.com/paper/meritocratic-fairness-for-cross-population
Repo
Framework
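
The quality notion in the abstract above, an individual's relative rank by cumulative distribution value within their own population, can be illustrated with empirical ranks. The sketch below is an assumption-laden illustration: it estimates ranks from the finite candidate pool and then greedily keeps the highest-ranked candidates. It is not the paper's algorithm and carries none of its fairness or regret guarantees; the names `empirical_cdf_ranks` and `select_top_k` are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf_ranks(scores):
    """Map each raw score to the fraction of its own population scoring at or below it."""
    ordered = sorted(scores)
    n = len(ordered)
    return [bisect_right(ordered, s) / n for s in scores]


def select_top_k(populations, k):
    """Pick the k candidates with the highest within-population empirical rank.

    populations: dict mapping a population name to a list of raw scores.
    Returns a list of (population name, index within population, rank) tuples.
    """
    candidates = []
    for name, scores in populations.items():
        for idx, rank in enumerate(empirical_cdf_ranks(scores)):
            candidates.append((rank, name, idx))
    # Sort by rank only, so raw scores are never compared across populations.
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(name, idx, rank) for rank, name, idx in candidates[:k]]
```

Because only ranks are compared, a goal tally and a publication count can sit in the same candidate pool even though the raw numbers are incomparable.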

Exploring Treebanks with INESS Search


Title	Exploring Treebanks with INESS Search
Authors	Victoria Rosén, Helge Dyvik, Paul Meurer, Koenraad De Smedt
Abstract
Tasks
Published	2017-05-01
URL	https://www.aclweb.org/anthology/W17-0248/
PDF	https://www.aclweb.org/anthology/W17-0248
PWC	https://paperswithcode.com/paper/exploring-treebanks-with-iness-search
Repo
Framework

An Unsupervised Neural Attention Model for Aspect Extraction


Title	An Unsupervised Neural Attention Model for Aspect Extraction
Authors	Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier
Abstract	Aspect extraction is an important and challenging task in aspect-based sentiment analysis. Existing works tend to apply variants of topic models on this task. While fairly successful, these methods usually do not produce highly coherent aspects. In this paper, we present a novel neural approach with the aim of discovering coherent aspects. The model improves coherence by exploiting the distribution of word co-occurrences through the use of neural word embeddings. Unlike topic models which typically assume independently generated words, word embedding models encourage words that appear in similar contexts to be located close to each other in the embedding space. In addition, we use an attention mechanism to de-emphasize irrelevant words during training, further improving the coherence of aspects. Experimental results on real-life datasets demonstrate that our approach discovers more meaningful and coherent aspects, and substantially outperforms baseline methods on several evaluation tasks.
Tasks	Aspect-Based Sentiment Analysis, Aspect Extraction, Domain Adaptation, Sentiment Analysis, Topic Models, Word Embeddings
Published	2017-07-01
URL	https://www.aclweb.org/anthology/P17-1036/
PDF	https://www.aclweb.org/anthology/P17-1036
PWC	https://paperswithcode.com/paper/an-unsupervised-neural-attention-model-for
Repo
Framework

Marginalized graph autoencoder for graph clustering


Title	Marginalized graph autoencoder for graph clustering
Authors	Chun Wang, Shirui Pan, Guodong Long, Xingquan Zhu, Jing Jiang
Abstract	Graph clustering aims to discover community structures in networks, the task being fundamentally challenging mainly because the topology structure and the content of the graphs are difficult to represent for clustering analysis. Recently, graph clustering has moved from traditional shallow methods to deep learning approaches, thanks to the unique feature representation learning capability of deep learning. However, existing deep approaches for graph clustering can only exploit the structure information, while ignoring the content information associated with the nodes in a graph. In this paper, we propose a novel marginalized graph autoencoder (MGAE) algorithm for graph clustering. The key innovation of MGAE is that it advances the autoencoder to the graph domain, so graph representation learning can be carried out not only in a purely unsupervised setting by leveraging structure and content information, it can also be stacked in a deep fashion to learn effective representation. From a technical viewpoint, we propose a marginalized graph convolutional network to corrupt network node content, allowing node content to interact with network features, and marginalizes the corrupted features in a graph autoencoder context to learn graph feature representations. The learned features are fed into the spectral clustering algorithm for graph clustering. Experimental results on benchmark datasets demonstrate the superior performance of MGAE, compared to numerous baselines.
Tasks	Graph Clustering, Graph Representation Learning, Representation Learning
Published	2017-11-06
URL	https://doi.org/10.1145/3132847.3132967
PDF	https://opus.lib.uts.edu.au/handle/10453/127537
PWC	https://paperswithcode.com/paper/marginalized-graph-autoencoder-for-graph
Repo
Framework

Convergence of Gradient EM on Multi-component Mixture of Gaussians


Title	Convergence of Gradient EM on Multi-component Mixture of Gaussians
Authors	Bowei Yan, Mingzhang Yin, Purnamrita Sarkar
Abstract	In this paper, we study convergence properties of the gradient variant of Expectation-Maximization algorithm (Lange, 1995) for Gaussian Mixture Models for arbitrary number of clusters and mixing coefficients. We derive the convergence rate depending on the mixing coefficients, minimum and maximum pairwise distances between the true centers, dimensionality and number of components; and obtain a near-optimal local contraction radius. While there have been some recent notable works that derive local convergence rates for EM in the two symmetric mixture of Gaussians, in the more general case, the derivations need structurally different and non-trivial arguments. We use recent tools from learning theory and empirical processes to achieve our theoretical results.
Tasks
Published	2017-12-01
URL	http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians
PDF	http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians.pdf
PWC	https://paperswithcode.com/paper/convergence-of-gradient-em-on-multi-component
Repo
Framework
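
For readers unfamiliar with the gradient variant of EM mentioned in the abstract above, the idea is to replace the exact M-step with a single gradient ascent step on the expected complete-data log-likelihood. The following is a minimal sketch for the component means only, assuming one-dimensional data, unit variances, and known mixing weights; these simplifications, the step size, and the function names are assumptions of this illustration and are not taken from the paper.

```python
import math


def responsibilities(x, means, weights):
    """E-step for one point: posterior probability of each unit-variance component."""
    logs = [math.log(w) - 0.5 * (x - m) ** 2 for m, w in zip(means, weights)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def gradient_em_step(data, means, weights, step_size):
    """One gradient EM update of the means.

    The gradient of the expected complete-data log-likelihood with respect to
    mean i is the average over points of responsibility * (x - mean i).
    """
    n = len(data)
    grads = [0.0] * len(means)
    for x in data:
        resp = responsibilities(x, means, weights)
        for i, r in enumerate(resp):
            grads[i] += r * (x - means[i]) / n
    return [m + step_size * g for m, g in zip(means, grads)]
```

Iterating `gradient_em_step` from an initialization close to the true centers moves each mean toward the responsibility-weighted average of the data, which is the fixed point of the exact M-step.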

IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis


Title	IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis
Authors	Gaoqi Rao, Baolin Zhang, Endong Xun, Lung-Hao Lee
Abstract	This paper presents the IJCNLP 2017 shared task for Chinese grammatical error diagnosis (CGED) which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 13 teams registered for this shared task, 5 teams developed the system and submitted a total of 13 runs. We expected this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese error detection. All data sets with gold standards and scoring scripts are made publicly available to researchers.
Tasks	Grammatical Error Correction
Published	2017-12-01
URL	https://www.aclweb.org/anthology/I17-4001/
PDF	https://www.aclweb.org/anthology/I17-4001
PWC	https://paperswithcode.com/paper/ijcnlp-2017-task-1-chinese-grammatical-error
Repo
Framework

A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions


Title	A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions
Authors	Kristina Yordanova
Abstract	Different approaches for behaviour understanding rely on textual instructions to generate models of human behaviour. These approaches usually use state of the art parsers to obtain the part of speech (POS) meaning and dependencies of the words in the instructions. For them it is essential that the parser is able to correctly annotate the instructions and especially the verbs as they describe the actions of the person. State of the art parsers usually make errors when annotating textual instructions, as they have short sentence structure often in imperative form. The inability of the parser to identify the verbs results in the inability of behaviour understanding systems to identify the relevant actions. To address this problem, we propose a simple rule-based model that attempts to correct any incorrectly annotated verbs. We argue that the model is able to significantly improve the parser's performance without the need of additional training data. We evaluate our approach by extracting the actions from 61 textual instructions annotated only with the Stanford parser and once again after applying our model. The results show a significant improvement in the recognition rate when applying the rules (75% accuracy compared to 68% without the rules, p-value < 0.001).
Tasks	Action Detection
Published	2017-09-01
URL	https://www.aclweb.org/anthology/R17-1106/
PDF	https://doi.org/10.26615/978-954-452-049-6_106
PWC	https://paperswithcode.com/paper/a-simple-model-for-improving-the-performance
Repo
Framework

Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG


Title	Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG
Authors	Dennis Ryan Storoshenko
Abstract
Tasks
Published	2017-09-01
URL	https://www.aclweb.org/anthology/W17-6206/
PDF	https://www.aclweb.org/anthology/W17-6206
PWC	https://paperswithcode.com/paper/scope-time-and-predicate-restriction-in
Repo
Framework