July 26, 2019

Paper Group NANR 183

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Title Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Authors
Abstract
Tasks
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-2000/
PDF https://www.aclweb.org/anthology/P17-2000
PWC https://paperswithcode.com/paper/proceedings-of-the-55th-annual-meeting-of-the
Repo
Framework

Spherical convolutions and their application in molecular modelling

Title Spherical convolutions and their application in molecular modelling
Authors Wouter Boomsma, Jes Frellsen
Abstract Convolutional neural networks are increasingly used outside the domain of image analysis, in particular in various areas of the natural sciences concerned with spatial data. Such networks often work out of the box, and in some cases entire model architectures from image analysis can be carried over to other problem domains almost unaltered. Unfortunately, this convenience does not trivially extend to data in non-Euclidean spaces, such as spherical data. In this paper, we introduce two strategies for conducting convolutions on the sphere, using either a spherical-polar grid or a grid based on the cubed-sphere representation. We investigate the challenges that arise in this setting, and extend our discussion to include scenarios of spherical volumes, with several strategies for parameterizing the radial dimension. As a proof of concept, we conclude with an assessment of the performance of spherical convolutions in the context of molecular modelling, by considering structural environments within proteins. We show that the models are capable of learning non-trivial functions in these molecular environments, and that our spherical convolutions generally outperform standard 3D convolutions in this setting. In particular, despite the lack of any domain-specific feature engineering, we demonstrate performance comparable to state-of-the-art methods in the field, which build on decades of domain-specific knowledge.
Tasks Feature Engineering
Published 2017-12-01
URL http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling
PDF http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling.pdf
PWC https://paperswithcode.com/paper/spherical-convolutions-and-their-application
Repo
Framework

Mean Field Residual Networks: On the Edge of Chaos

Title Mean Field Residual Networks: On the Edge of Chaos
Authors Ge Yang, Samuel Schoenholz
Abstract We study randomly initialized residual networks using mean field theory and the theory of difference equations. Classical feedforward neural networks, such as those with tanh activations, exhibit exponential behavior on the average when propagating inputs forward or gradients backward. The exponential forward dynamics causes rapid collapsing of the input space geometry, while the exponential backward dynamics causes drastic vanishing or exploding gradients. We show, in contrast, that by adding skip connections, the network will, depending on the nonlinearity, adopt subexponential forward and backward dynamics, and in many cases in fact polynomial. The exponents of these polynomials are obtained through analytic methods and proved and verified empirically to be correct. In terms of the “edge of chaos” hypothesis, these subexponential and polynomial laws allow residual networks to “hover over the boundary between stability and chaos,” thus preserving the geometry of the input space and the gradient information flow. In our experiments, for each activation function we study here, we initialize residual networks with different hyperparameters and train them on MNIST. Remarkably, our initialization time theory can accurately predict test time performance of these networks, by tracking either the expected amount of gradient explosion or the expected squared distance between the images of two input vectors. Importantly, we show, theoretically as well as empirically, that common initializations such as the Xavier or the He schemes are not optimal for residual networks, because the optimal initialization variances depend on the depth. Finally, we have made mathematical contributions by deriving several new identities for the kernels of powers of ReLU functions by relating them to the zeroth Bessel function of the second kind.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos
PDF http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos.pdf
PWC https://paperswithcode.com/paper/mean-field-residual-networks-on-the-edge-of
Repo
Framework

BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations

Title BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations
Authors Rezarta Islamaj Doğan, Andrew Chatr-aryamontri, Sun Kim, Chih-Hsuan Wei, Yifan Peng, Donald Comeau, Zhiyong Lu
Abstract The Precision Medicine Track in BioCreative VI aims to bring together the BioNLP community for a novel challenge focused on mining the biomedical literature in search of mutations and protein-protein interactions (PPI). In order to support this track with an effective training dataset with limited curator time, the track organizers carefully reviewed PubMed articles from two different sources: curated public PPI databases, and the results of state-of-the-art public text mining tools. We detail here the data collection, manual review and annotation process and describe the characteristics of this training corpus. We also describe a corpus performance baseline. This analysis will provide useful information to developers and researchers for comparing and developing innovative text mining approaches for the BioCreative VI challenge and other Precision Medicine related applications.
Tasks Relation Extraction
Published 2017-08-01
URL https://www.aclweb.org/anthology/W17-2321/
PDF https://www.aclweb.org/anthology/W17-2321
PWC https://paperswithcode.com/paper/biocreative-vi-precision-medicine-track
Repo
Framework

Position-based Multiple-play Bandit Problem with Unknown Position Bias

Title Position-based Multiple-play Bandit Problem with Unknown Position Bias
Authors Junpei Komiyama, Junya Honda, Akiko Takeda
Abstract Motivated by online advertising, we study a multiple-play multi-armed bandit problem with position bias that involves several slots and the latter slots yield fewer rewards. We characterize the hardness of the problem by deriving an asymptotic regret bound. We propose the Permutation Minimum Empirical Divergence (PMED) algorithm and derive its asymptotically optimal regret bound. Because of the uncertainty of the position bias, the optimal algorithm for such a problem requires non-convex optimizations that are different from usual partial monitoring and semi-bandit problems. We propose a cutting-plane method and related bi-convex relaxation for these optimizations by using auxiliary variables.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias
PDF http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias.pdf
PWC https://paperswithcode.com/paper/position-based-multiple-play-bandit-problem
Repo
Framework

Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes

Title Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes
Authors Anton Mallasto, Aasa Feragen
Abstract We introduce a novel framework for statistical analysis of populations of non-degenerate Gaussian processes (GPs), which are natural representations of uncertain curves. This allows inherent variation or uncertainty in function-valued data to be properly incorporated in the population analysis. Using the 2-Wasserstein metric we geometrize the space of GPs with L2 mean and covariance functions over compact index spaces. We prove uniqueness of the barycenter of a population of GPs, as well as convergence of the metric and the barycenter of their finite-dimensional counterparts. This justifies practical computations. Finally, we demonstrate our framework through experimental validation on GP datasets representing brain connectivity and climate development. A Matlab library for relevant computations will be published at https://sites.google.com/view/antonmallasto/software.
Tasks Gaussian Processes
Published 2017-12-01
URL http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes
PDF http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes.pdf
PWC https://paperswithcode.com/paper/learning-from-uncertain-curves-the-2
Repo
Framework
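
For orientation, the finite-dimensional counterparts mentioned in the abstract above are ordinary multivariate Gaussians, for which the squared 2-Wasserstein distance has a well-known closed form. The symbols below (means m_1, m_2 and covariance matrices K_1, K_2) are notation chosen for this note, not taken from the paper, and the paper's GP-level construction is not reproduced here.

```latex
W_2^2\bigl(\mathcal{N}(m_1, K_1),\, \mathcal{N}(m_2, K_2)\bigr)
  = \lVert m_1 - m_2 \rVert_2^2
  + \operatorname{Tr}\Bigl( K_1 + K_2
      - 2 \bigl( K_1^{1/2} K_2 K_1^{1/2} \bigr)^{1/2} \Bigr)
```

In one dimension, with standard deviations s_1 and s_2, this reduces to (m_1 - m_2)^2 + (s_1 - s_2)^2.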

IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text

Title IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text
Authors Abhishek Kumar, Abhishek Sethi, Md Shad Akhtar, Asif Ekbal, Chris Biemann, Pushpak Bhattacharyya
Abstract This paper reports team IITPB's participation in the SemEval 2017 Task 5 on 'Fine-grained sentiment analysis on financial microblogs and news'. We developed 2 systems for the two tracks. One system was based on an ensemble of Support Vector Classifier and Logistic Regression. This system relied on Distributional Thesaurus (DT), word embeddings and lexicon features to predict a floating sentiment value between -1 and +1. The other system was based on Support Vector Regression using word embeddings, lexicon features, and PMI scores as features. The system was ranked 5th in track 1 and 8th in track 2.
Tasks Sentiment Analysis, Word Embeddings
Published 2017-08-01
URL https://www.aclweb.org/anthology/S17-2153/
PDF https://www.aclweb.org/anthology/S17-2153
PWC https://paperswithcode.com/paper/iitpb-at-semeval-2017-task-5-sentiment
Repo
Framework
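
The abstract above lists PMI scores among the features without saying how they were computed. As a rough illustration only, the sketch below computes pointwise mutual information between a word and a sentiment label from document counts. The function name `pmi_scores`, the toy `docs` data, and the choice of document-level counting are all assumptions made for this example, not details from the paper.

```python
import math
from collections import Counter


def pmi_scores(docs):
    """Return PMI(word, label) for every (word, label) pair seen in docs.

    docs is a list of (tokens, label) pairs. Probabilities are estimated
    from document counts, which is an illustrative choice, not the paper's.
    """
    n = len(docs)
    word_count = Counter()
    label_count = Counter()
    joint_count = Counter()
    for tokens, label in docs:
        label_count[label] += 1
        for word in set(tokens):  # count each word once per document
            word_count[word] += 1
            joint_count[(word, label)] += 1

    scores = {}
    for (word, label), count in joint_count.items():
        p_joint = count / n
        p_word = word_count[word] / n
        p_label = label_count[label] / n
        # PMI = log2( p(word, label) / (p(word) * p(label)) )
        scores[(word, label)] = math.log2(p_joint / (p_word * p_label))
    return scores


# Hypothetical toy corpus of labelled financial snippets.
docs = [
    (["profit", "surge"], "pos"),
    (["profit", "up"], "pos"),
    (["loss", "widen"], "neg"),
    (["profit", "warning"], "neg"),
]
scores = pmi_scores(docs)
```

A positive score means the word co-occurs with the label more often than independence would predict, and a negative score means less often. Pairs that never co-occur are simply absent from the result.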

Meritocratic Fairness for Cross-Population Selection

Title Meritocratic Fairness for Cross-Population Selection
Authors Michael Kearns, Aaron Roth, Zhiwei Steven Wu
Abstract We consider the problem of selecting a strong pool of individuals from several populations with incomparable skills (e.g. soccer players, mathematicians, and singers) in a fair manner. The quality of an individual is defined to be their relative rank (by cumulative distribution value) within their own population, which permits cross-population comparisons. We study algorithms which attempt to select the highest quality subset despite the fact that true CDF values are not known, and can only be estimated from the finite pool of candidates. Specifically, we quantify the regret in quality imposed by “meritocratic” notions of fairness, which require that individuals are selected with probability that is monotonically increasing in their true quality. We give algorithms with provable fairness and regret guarantees, as well as lower bounds, and provide empirical results which suggest that our algorithms perform better than the theory suggests. We extend our results to a sequential batch setting, in which an algorithm must repeatedly select subsets of individuals from new pools of applicants, but has the benefit of being able to compare them to the accumulated data from previous rounds.
Tasks
Published 2017-08-01
URL https://icml.cc/Conferences/2017/Schedule?showEvent=744
PDF http://proceedings.mlr.press/v70/kearns17a/kearns17a.pdf
PWC https://paperswithcode.com/paper/meritocratic-fairness-for-cross-population
Repo
Framework
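
The quality measure described in the abstract above (relative rank within one's own population) can be illustrated with empirical CDF values. The sketch below is a naive baseline that estimates those values from the finite pool and keeps the top k. It is not one of the paper's fair selection algorithms, and the names `empirical_cdf_values` and `select_top_k` as well as the toy data are assumptions made for this example.

```python
def empirical_cdf_values(scores):
    """Map each raw score to the fraction of its population scoring <= it."""
    n = len(scores)
    return [sum(1 for other in scores if other <= s) / n for s in scores]


def select_top_k(populations, k):
    """Pick the k candidates with the highest within-population CDF value.

    populations maps a population name to a list of raw scores.
    Returns a list of (population, index, cdf_value), best first.
    """
    pool = []
    for name, scores in populations.items():
        for idx, quality in enumerate(empirical_cdf_values(scores)):
            pool.append((name, idx, quality))
    # Raw scores are never compared across populations, only CDF values.
    pool.sort(key=lambda item: item[2], reverse=True)
    return pool[:k]


# Hypothetical raw skill measurements on incomparable scales.
populations = {
    "soccer": [12.0, 30.0, 18.0, 25.0],
    "math": [3.0, 1.0],
}
chosen = select_top_k(populations, 2)
```

Because the CDF values are estimated from a finite pool, they are noisy, which is exactly the difficulty the paper's fairness and regret analysis addresses.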

Exploring Treebanks with INESS Search

Title Exploring Treebanks with INESS Search
Authors Victoria Rosén, Helge Dyvik, Paul Meurer, Koenraad De Smedt
Abstract
Tasks
Published 2017-05-01
URL https://www.aclweb.org/anthology/W17-0248/
PDF https://www.aclweb.org/anthology/W17-0248
PWC https://paperswithcode.com/paper/exploring-treebanks-with-iness-search
Repo
Framework

An Unsupervised Neural Attention Model for Aspect Extraction

Title An Unsupervised Neural Attention Model for Aspect Extraction
Authors Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier
Abstract Aspect extraction is an important and challenging task in aspect-based sentiment analysis. Existing works tend to apply variants of topic models on this task. While fairly successful, these methods usually do not produce highly coherent aspects. In this paper, we present a novel neural approach with the aim of discovering coherent aspects. The model improves coherence by exploiting the distribution of word co-occurrences through the use of neural word embeddings. Unlike topic models which typically assume independently generated words, word embedding models encourage words that appear in similar contexts to be located close to each other in the embedding space. In addition, we use an attention mechanism to de-emphasize irrelevant words during training, further improving the coherence of aspects. Experimental results on real-life datasets demonstrate that our approach discovers more meaningful and coherent aspects, and substantially outperforms baseline methods on several evaluation tasks.
Tasks Aspect-Based Sentiment Analysis, Aspect Extraction, Domain Adaptation, Sentiment Analysis, Topic Models, Word Embeddings
Published 2017-07-01
URL https://www.aclweb.org/anthology/P17-1036/
PDF https://www.aclweb.org/anthology/P17-1036
PWC https://paperswithcode.com/paper/an-unsupervised-neural-attention-model-for
Repo
Framework

Marginalized graph autoencoder for graph clustering

Title Marginalized graph autoencoder for graph clustering
Authors Chun Wang, Shirui Pan, Guodong Long, Xingquan Zhu, Jing Jiang
Abstract Graph clustering aims to discover community structures in networks, the task being fundamentally challenging mainly because the topology structure and the content of the graphs are difficult to represent for clustering analysis. Recently, graph clustering has moved from traditional shallow methods to deep learning approaches, thanks to the unique feature representation learning capability of deep learning. However, existing deep approaches for graph clustering can only exploit the structure information, while ignoring the content information associated with the nodes in a graph. In this paper, we propose a novel marginalized graph autoencoder (MGAE) algorithm for graph clustering. The key innovation of MGAE is that it advances the autoencoder to the graph domain, so graph representation learning can be carried out not only in a purely unsupervised setting by leveraging structure and content information, it can also be stacked in a deep fashion to learn effective representation. From a technical viewpoint, we propose a marginalized graph convolutional network to corrupt network node content, allowing node content to interact with network features, and marginalizes the corrupted features in a graph autoencoder context to learn graph feature representations. The learned features are fed into the spectral clustering algorithm for graph clustering. Experimental results on benchmark datasets demonstrate the superior performance of MGAE, compared to numerous baselines.
Tasks Graph Clustering, Graph Representation Learning, Representation Learning
Published 2017-11-06
URL https://doi.org/10.1145/3132847.3132967
PDF https://opus.lib.uts.edu.au/handle/10453/127537
PWC https://paperswithcode.com/paper/marginalized-graph-autoencoder-for-graph
Repo
Framework

Convergence of Gradient EM on Multi-component Mixture of Gaussians

Title Convergence of Gradient EM on Multi-component Mixture of Gaussians
Authors Bowei Yan, Mingzhang Yin, Purnamrita Sarkar
Abstract In this paper, we study convergence properties of the gradient variant of Expectation-Maximization algorithm (Lange, 1995) for Gaussian Mixture Models for arbitrary number of clusters and mixing coefficients. We derive the convergence rate depending on the mixing coefficients, minimum and maximum pairwise distances between the true centers, dimensionality and number of components; and obtain a near-optimal local contraction radius. While there have been some recent notable works that derive local convergence rates for EM in the two symmetric mixture of Gaussians, in the more general case, the derivations need structurally different and non-trivial arguments. We use recent tools from learning theory and empirical processes to achieve our theoretical results.
Tasks
Published 2017-12-01
URL http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians
PDF http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians.pdf
PWC https://paperswithcode.com/paper/convergence-of-gradient-em-on-multi-component
Repo
Framework
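
To make the gradient variant of EM mentioned in the abstract above concrete, here is a minimal one-dimensional sketch. It assumes unit-variance components and known mixing weights, so only the means are updated, and each iteration takes one gradient step on the EM surrogate objective instead of solving the M-step exactly. The function names, step size, and toy data are assumptions for illustration and do not reproduce the paper's setting or analysis.

```python
import math


def responsibilities(x, means, weights):
    """Posterior probability of each component for point x (unit variance)."""
    dens = [w * math.exp(-0.5 * (x - m) ** 2) for w, m in zip(weights, means)]
    total = sum(dens)
    return [d / total for d in dens]


def gradient_em_step(data, means, weights, step):
    """Take one gradient-ascent step on the EM surrogate Q w.r.t. the means."""
    grads = [0.0] * len(means)
    for x in data:
        resp = responsibilities(x, means, weights)  # E-step at current means
        for i, m in enumerate(means):
            # d/d(mean_i) of Q is the responsibility-weighted residual
            grads[i] += resp[i] * (x - m)
    n = len(data)
    return [m + step * g / n for m, g in zip(means, grads)]


# Hypothetical toy data drawn around two well-separated centers.
data = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
weights = [0.5, 0.5]
means = [-1.0, 1.0]
for _ in range(200):
    means = gradient_em_step(data, means, weights, step=1.0)
```

With this initialization the means move toward the two clusters near -2 and 2. The paper's contribution is characterizing how close the initialization must be, and how fast such iterations contract, in the general multi-component case.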

IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis

Title IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis
Authors Gaoqi Rao, Baolin Zhang, Endong Xun, Lung-Hao Lee
Abstract This paper presents the IJCNLP 2017 shared task for Chinese grammatical error diagnosis (CGED) which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as a foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 13 teams registered for this shared task, 5 teams developed the system and submitted a total of 13 runs. We expected this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese error detection. All data sets with gold standards and scoring scripts are made publicly available to researchers.
Tasks Grammatical Error Correction
Published 2017-12-01
URL https://www.aclweb.org/anthology/I17-4001/
PDF https://www.aclweb.org/anthology/I17-4001
PWC https://paperswithcode.com/paper/ijcnlp-2017-task-1-chinese-grammatical-error
Repo
Framework

A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions

Title A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions
Authors Kristina Yordanova
Abstract Different approaches for behaviour understanding rely on textual instructions to generate models of human behaviour. These approaches usually use state of the art parsers to obtain the part of speech (POS) meaning and dependencies of the words in the instructions. For them it is essential that the parser is able to correctly annotate the instructions and especially the verbs as they describe the actions of the person. State of the art parsers usually make errors when annotating textual instructions, as they have short sentence structure often in imperative form. The inability of the parser to identify the verbs results in the inability of behaviour understanding systems to identify the relevant actions. To address this problem, we propose a simple rule-based model that attempts to correct any incorrectly annotated verbs. We argue that the model is able to significantly improve the parser's performance without the need of additional training data. We evaluate our approach by extracting the actions from 61 textual instructions annotated only with the Stanford parser and once again after applying our model. The results show a significant improvement in the recognition rate when applying the rules (75% accuracy compared to 68% without the rules, p-value < 0.001).
Tasks Action Detection
Published 2017-09-01
URL https://www.aclweb.org/anthology/R17-1106/
PDF https://doi.org/10.26615/978-954-452-049-6_106
PWC https://paperswithcode.com/paper/a-simple-model-for-improving-the-performance
Repo
Framework

Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG

Title Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG
Authors Dennis Ryan Storoshenko
Abstract
Tasks
Published 2017-09-01
URL https://www.aclweb.org/anthology/W17-6206/
PDF https://www.aclweb.org/anthology/W17-6206
PWC https://paperswithcode.com/paper/scope-time-and-predicate-restriction-in
Repo
Framework