Paper Group NANR 183
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Title | Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) |
Authors | |
Abstract | |
Tasks | |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-2000/ |
https://www.aclweb.org/anthology/P17-2000 | |
PWC | https://paperswithcode.com/paper/proceedings-of-the-55th-annual-meeting-of-the |
Repo | |
Framework | |
Spherical convolutions and their application in molecular modelling
Title | Spherical convolutions and their application in molecular modelling |
Authors | Wouter Boomsma, Jes Frellsen |
Abstract | Convolutional neural networks are increasingly used outside the domain of image analysis, in particular in various areas of the natural sciences concerned with spatial data. Such networks often work out-of-the box, and in some cases entire model architectures from image analysis can be carried over to other problem domains almost unaltered. Unfortunately, this convenience does not trivially extend to data in non-euclidean spaces, such as spherical data. In this paper, we introduce two strategies for conducting convolutions on the sphere, using either a spherical-polar grid or a grid based on the cubed-sphere representation. We investigate the challenges that arise in this setting, and extend our discussion to include scenarios of spherical volumes, with several strategies for parameterizing the radial dimension. As a proof of concept, we conclude with an assessment of the performance of spherical convolutions in the context of molecular modelling, by considering structural environments within proteins. We show that the models are capable of learning non-trivial functions in these molecular environments, and that our spherical convolutions generally outperform standard 3D convolutions in this setting. In particular, despite the lack of any domain specific feature-engineering, we demonstrate performance comparable to state-of-the-art methods in the field, which build on decades of domain-specific knowledge. |
Tasks | Feature Engineering |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling |
http://papers.nips.cc/paper/6935-spherical-convolutions-and-their-application-in-molecular-modelling.pdf | |
PWC | https://paperswithcode.com/paper/spherical-convolutions-and-their-application |
Repo | |
Framework | |
Mean Field Residual Networks: On the Edge of Chaos
Title | Mean Field Residual Networks: On the Edge of Chaos |
Authors | Ge Yang, Samuel Schoenholz |
Abstract | We study randomly initialized residual networks using mean field theory and the theory of difference equations. Classical feedforward neural networks, such as those with tanh activations, exhibit exponential behavior on the average when propagating inputs forward or gradients backward. The exponential forward dynamics causes rapid collapsing of the input space geometry, while the exponential backward dynamics causes drastic vanishing or exploding gradients. We show, in contrast, that by adding skip connections, the network will, depending on the nonlinearity, adopt subexponential forward and backward dynamics, and in many cases in fact polynomial. The exponents of these polynomials are obtained through analytic methods and proved and verified empirically to be correct. In terms of the “edge of chaos” hypothesis, these subexponential and polynomial laws allow residual networks to “hover over the boundary between stability and chaos,” thus preserving the geometry of the input space and the gradient information flow. In our experiments, for each activation function we study here, we initialize residual networks with different hyperparameters and train them on MNIST. Remarkably, our initialization time theory can accurately predict test time performance of these networks, by tracking either the expected amount of gradient explosion or the expected squared distance between the images of two input vectors. Importantly, we show, theoretically as well as empirically, that common initializations such as the Xavier or the He schemes are not optimal for residual networks, because the optimal initialization variances depend on the depth. Finally, we have made mathematical contributions by deriving several new identities for the kernels of powers of ReLU functions by relating them to the zeroth Bessel function of the second kind. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos |
http://papers.nips.cc/paper/6879-mean-field-residual-networks-on-the-edge-of-chaos.pdf | |
PWC | https://paperswithcode.com/paper/mean-field-residual-networks-on-the-edge-of |
Repo | |
Framework | |
BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations
Title | BioCreative VI Precision Medicine Track: creating a training corpus for mining protein-protein interactions affected by mutations |
Authors | Rezarta Islamaj Doğan, Andrew Chatr-aryamontri, Sun Kim, Chih-Hsuan Wei, Yifan Peng, Donald Comeau, Zhiyong Lu |
Abstract | The Precision Medicine Track in BioCreative VI aims to bring together the BioNLP community for a novel challenge focused on mining the biomedical literature in search of mutations and protein-protein interactions (PPI). In order to support this track with an effective training dataset with limited curator time, the track organizers carefully reviewed PubMed articles from two different sources: curated public PPI databases, and the results of state-of-the-art public text mining tools. We detail here the data collection, manual review and annotation process and describe this training corpus characteristics. We also describe a corpus performance baseline. This analysis will provide useful information to developers and researchers for comparing and developing innovative text mining approaches for the BioCreative VI challenge and other Precision Medicine related applications. |
Tasks | Relation Extraction |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/W17-2321/ |
https://www.aclweb.org/anthology/W17-2321 | |
PWC | https://paperswithcode.com/paper/biocreative-vi-precision-medicine-track |
Repo | |
Framework | |
Position-based Multiple-play Bandit Problem with Unknown Position Bias
Title | Position-based Multiple-play Bandit Problem with Unknown Position Bias |
Authors | Junpei Komiyama, Junya Honda, Akiko Takeda |
Abstract | Motivated by online advertising, we study a multiple-play multi-armed bandit problem with position bias that involves several slots and the latter slots yield fewer rewards. We characterize the hardness of the problem by deriving an asymptotic regret bound. We propose the Permutation Minimum Empirical Divergence (PMED) algorithm and derive its asymptotically optimal regret bound. Because of the uncertainty of the position bias, the optimal algorithm for such a problem requires non-convex optimizations that are different from usual partial monitoring and semi-bandit problems. We propose a cutting-plane method and related bi-convex relaxation for these optimizations by using auxiliary variables. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias |
http://papers.nips.cc/paper/7085-position-based-multiple-play-bandit-problem-with-unknown-position-bias.pdf | |
PWC | https://paperswithcode.com/paper/position-based-multiple-play-bandit-problem |
Repo | |
Framework | |
Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes
Title | Learning from uncertain curves: The 2-Wasserstein metric for Gaussian processes |
Authors | Anton Mallasto, Aasa Feragen |
Abstract | We introduce a novel framework for statistical analysis of populations of non-degenerate Gaussian processes (GPs), which are natural representations of uncertain curves. This allows inherent variation or uncertainty in function-valued data to be properly incorporated in the population analysis. Using the 2-Wasserstein metric we geometrize the space of GPs with L2 mean and covariance functions over compact index spaces. We prove uniqueness of the barycenter of a population of GPs, as well as convergence of the metric and the barycenter of their finite-dimensional counterparts. This justifies practical computations. Finally, we demonstrate our framework through experimental validation on GP datasets representing brain connectivity and climate development. A Matlab library for relevant computations will be published at https://sites.google.com/view/antonmallasto/software. |
Tasks | Gaussian Processes |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes |
http://papers.nips.cc/paper/7149-learning-from-uncertain-curves-the-2-wasserstein-metric-for-gaussian-processes.pdf | |
PWC | https://paperswithcode.com/paper/learning-from-uncertain-curves-the-2 |
Repo | |
Framework | |
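For orientation, the finite-dimensional counterpart mentioned in the abstract above can be written out. This is the standard closed form of the squared 2-Wasserstein distance between two Gaussian distributions on R^n, not a formula quoted from the paper; the symbols m_1, m_2 (means) and K_1, K_2 (covariance matrices) are notation assumed here for illustration.

```latex
W_2^2\bigl(\mathcal{N}(m_1, K_1),\, \mathcal{N}(m_2, K_2)\bigr)
  = \lVert m_1 - m_2 \rVert_2^2
  + \operatorname{Tr}\!\Bigl( K_1 + K_2
      - 2 \bigl( K_1^{1/2} K_2 K_1^{1/2} \bigr)^{1/2} \Bigr)
```

The first term compares the mean functions and the second compares the covariance structure, which is why the metric is sensitive to differences in uncertainty as well as in the curves themselves.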
IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text
Title | IITPB at SemEval-2017 Task 5: Sentiment Prediction in Financial Text |
Authors | Abhishek Kumar, Abhishek Sethi, Md Shad Akhtar, Asif Ekbal, Chris Biemann, Pushpak Bhattacharyya |
Abstract | This paper reports team IITPB's participation in the SemEval 2017 Task 5 on 'Fine-grained sentiment analysis on financial microblogs and news'. We developed 2 systems for the two tracks. One system was based on an ensemble of Support Vector Classifier and Logistic Regression. This system relied on Distributional Thesaurus (DT), word embeddings and lexicon features to predict a floating sentiment value between -1 and +1. The other system was based on Support Vector Regression using word embeddings, lexicon features, and PMI scores as features. The system was ranked 5th in track 1 and 8th in track 2. |
Tasks | Sentiment Analysis, Word Embeddings |
Published | 2017-08-01 |
URL | https://www.aclweb.org/anthology/S17-2153/ |
https://www.aclweb.org/anthology/S17-2153 | |
PWC | https://paperswithcode.com/paper/iitpb-at-semeval-2017-task-5-sentiment |
Repo | |
Framework | |
Meritocratic Fairness for Cross-Population Selection
Title | Meritocratic Fairness for Cross-Population Selection |
Authors | Michael Kearns, Aaron Roth, Zhiwei Steven Wu |
Abstract | We consider the problem of selecting a strong pool of individuals from several populations with incomparable skills (e.g. soccer players, mathematicians, and singers) in a fair manner. The quality of an individual is defined to be their relative rank (by cumulative distribution value) within their own population, which permits cross-population comparisons. We study algorithms which attempt to select the highest quality subset despite the fact that true CDF values are not known, and can only be estimated from the finite pool of candidates. Specifically, we quantify the regret in quality imposed by “meritocratic” notions of fairness, which require that individuals are selected with probability that is monotonically increasing in their true quality. We give algorithms with provable fairness and regret guarantees, as well as lower bounds, and provide empirical results which suggest that our algorithms perform better than the theory suggests. We extend our results to a sequential batch setting, in which an algorithm must repeatedly select subsets of individuals from new pools of applicants, but has the benefit of being able to compare them to the accumulated data from previous rounds. |
Tasks | |
Published | 2017-08-01 |
URL | https://icml.cc/Conferences/2017/Schedule?showEvent=744 |
http://proceedings.mlr.press/v70/kearns17a/kearns17a.pdf | |
PWC | https://paperswithcode.com/paper/meritocratic-fairness-for-cross-population |
Repo | |
Framework | |
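The quality measure described in the abstract above (an individual's relative rank within their own population) can be sketched with an empirical CDF. This is only an illustration of the finite-pool estimate, not the selection algorithms from the paper; the function name `empirical_cdf_values` and the example scores are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf_values(scores):
    """For each score, return the fraction of the population scoring at or below it."""
    ordered = sorted(scores)
    n = len(ordered)
    return [bisect_right(ordered, s) / n for s in scores]


# Hypothetical raw scores on incomparable scales, one list per population.
populations = {
    "soccer": [55.0, 70.0, 90.0, 62.0],
    "math": [3.1, 4.8, 2.0],
}

# Pool everyone by estimated within-population rank, which is comparable across groups.
pooled = [
    (name, raw, quality)
    for name, scores in populations.items()
    for raw, quality in zip(scores, empirical_cdf_values(scores))
]
ranked = sorted(pooled, key=lambda item: item[2], reverse=True)
print(ranked[:2])
```

Because these values are estimated from a finite pool, they are noisy stand-ins for the true CDF values, which is the gap the fairness and regret guarantees in the abstract address.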
Exploring Treebanks with INESS Search
Title | Exploring Treebanks with INESS Search |
Authors | Victoria Rosén, Helge Dyvik, Paul Meurer, Koenraad De Smedt |
Abstract | |
Tasks | |
Published | 2017-05-01 |
URL | https://www.aclweb.org/anthology/W17-0248/ |
https://www.aclweb.org/anthology/W17-0248 | |
PWC | https://paperswithcode.com/paper/exploring-treebanks-with-iness-search |
Repo | |
Framework | |
An Unsupervised Neural Attention Model for Aspect Extraction
Title | An Unsupervised Neural Attention Model for Aspect Extraction |
Authors | Ruidan He, Wee Sun Lee, Hwee Tou Ng, Daniel Dahlmeier |
Abstract | Aspect extraction is an important and challenging task in aspect-based sentiment analysis. Existing works tend to apply variants of topic models on this task. While fairly successful, these methods usually do not produce highly coherent aspects. In this paper, we present a novel neural approach with the aim of discovering coherent aspects. The model improves coherence by exploiting the distribution of word co-occurrences through the use of neural word embeddings. Unlike topic models which typically assume independently generated words, word embedding models encourage words that appear in similar contexts to be located close to each other in the embedding space. In addition, we use an attention mechanism to de-emphasize irrelevant words during training, further improving the coherence of aspects. Experimental results on real-life datasets demonstrate that our approach discovers more meaningful and coherent aspects, and substantially outperforms baseline methods on several evaluation tasks. |
Tasks | Aspect-Based Sentiment Analysis, Aspect Extraction, Domain Adaptation, Sentiment Analysis, Topic Models, Word Embeddings |
Published | 2017-07-01 |
URL | https://www.aclweb.org/anthology/P17-1036/ |
https://www.aclweb.org/anthology/P17-1036 | |
PWC | https://paperswithcode.com/paper/an-unsupervised-neural-attention-model-for |
Repo | |
Framework | |
Marginalized graph autoencoder for graph clustering
Title | Marginalized graph autoencoder for graph clustering |
Authors | Chun Wang, Shirui Pan, Guodong Long, Xingquan Zhu, Jing Jiang |
Abstract | Graph clustering aims to discover community structures in networks, the task being fundamentally challenging mainly because the topology structure and the content of the graphs are difficult to represent for clustering analysis. Recently, graph clustering has moved from traditional shallow methods to deep learning approaches, thanks to the unique feature representation learning capability of deep learning. However, existing deep approaches for graph clustering can only exploit the structure information, while ignoring the content information associated with the nodes in a graph. In this paper, we propose a novel marginalized graph autoencoder (MGAE) algorithm for graph clustering. The key innovation of MGAE is that it advances the autoencoder to the graph domain, so graph representation learning can be carried out not only in a purely unsupervised setting by leveraging structure and content information, it can also be stacked in a deep fashion to learn effective representation. From a technical viewpoint, we propose a marginalized graph convolutional network to corrupt network node content, allowing node content to interact with network features, and marginalizes the corrupted features in a graph autoencoder context to learn graph feature representations. The learned features are fed into the spectral clustering algorithm for graph clustering. Experimental results on benchmark datasets demonstrate the superior performance of MGAE, compared to numerous baselines. |
Tasks | Graph Clustering, Graph Representation Learning, Representation Learning |
Published | 2017-11-06 |
URL | https://doi.org/10.1145/3132847.3132967 |
https://opus.lib.uts.edu.au/handle/10453/127537 | |
PWC | https://paperswithcode.com/paper/marginalized-graph-autoencoder-for-graph |
Repo | |
Framework | |
Convergence of Gradient EM on Multi-component Mixture of Gaussians
Title | Convergence of Gradient EM on Multi-component Mixture of Gaussians |
Authors | Bowei Yan, Mingzhang Yin, Purnamrita Sarkar |
Abstract | In this paper, we study convergence properties of the gradient variant of Expectation-Maximization algorithm (Lange, 1995) for Gaussian Mixture Models for arbitrary number of clusters and mixing coefficients. We derive the convergence rate depending on the mixing coefficients, minimum and maximum pairwise distances between the true centers, dimensionality and number of components; and obtain a near-optimal local contraction radius. While there have been some recent notable works that derive local convergence rates for EM in the two symmetric mixture of Gaussians, in the more general case, the derivations need structurally different and non-trivial arguments. We use recent tools from learning theory and empirical processes to achieve our theoretical results. |
Tasks | |
Published | 2017-12-01 |
URL | http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians |
http://papers.nips.cc/paper/7271-convergence-of-gradient-em-on-multi-component-mixture-of-gaussians.pdf | |
PWC | https://paperswithcode.com/paper/convergence-of-gradient-em-on-multi-component |
Repo | |
Framework | |
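As a rough illustration of the gradient variant of EM named in the abstract above, the sketch below takes one gradient ascent step on the component means per iteration instead of a full M-step. It assumes one-dimensional data, unit variances, and known mixing weights; the function names, step size, and toy data are hypothetical choices for this example and are not taken from the paper.

```python
import math


def responsibilities(x, means, weights):
    """Posterior probability of each unit-variance 1-D Gaussian component for point x."""
    logs = [math.log(w) - 0.5 * (x - m) ** 2 for m, w in zip(means, weights)]
    top = max(logs)  # subtract the max for numerical stability
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def gradient_em_step(data, means, weights, step=1.0):
    """One gradient ascent step on the EM surrogate with respect to the means."""
    n = len(data)
    grads = [0.0] * len(means)
    for x in data:
        resp = responsibilities(x, means, weights)
        for k, r in enumerate(resp):
            grads[k] += r * (x - means[k])
    return [m + step * g / n for m, g in zip(means, grads)]


def run_gradient_em(data, means, weights, step=1.0, iterations=200):
    """Repeat the gradient step a fixed number of times and return the means."""
    for _ in range(iterations):
        means = gradient_em_step(data, means, weights, step)
    return means


# Toy data with two well-separated clusters around -2 and +2.
data = [-2.1, -1.9, -2.0, 2.0, 2.1, 1.9]
estimated = run_gradient_em(data, [-1.0, 1.0], [0.5, 0.5])
print(estimated)
```

With well-separated clusters and an initialization reasonably close to the true centers, the iterates contract toward the cluster means, which is the local convergence regime the abstract refers to.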
IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis
Title | IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis |
Authors | Gaoqi Rao, Baolin Zhang, Endong Xun, Lung-Hao Lee |
Abstract | This paper presents the IJCNLP 2017 shared task for Chinese grammatical error diagnosis (CGED), which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as a foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 13 teams registered for this shared task, 5 teams developed the system and submitted a total of 13 runs. We expected this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese error detection. All data sets with gold standards and scoring scripts are made publicly available to researchers. |
Tasks | Grammatical Error Correction |
Published | 2017-12-01 |
URL | https://www.aclweb.org/anthology/I17-4001/ |
https://www.aclweb.org/anthology/I17-4001 | |
PWC | https://paperswithcode.com/paper/ijcnlp-2017-task-1-chinese-grammatical-error |
Repo | |
Framework | |
A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions
Title | A Simple Model for Improving the Performance of the Stanford Parser for Action Detection in Textual Instructions |
Authors | Kristina Yordanova |
Abstract | Different approaches for behaviour understanding rely on textual instructions to generate models of human behaviour. These approaches usually use state of the art parsers to obtain the part of speech (POS) meaning and dependencies of the words in the instructions. For them it is essential that the parser is able to correctly annotate the instructions and especially the verbs as they describe the actions of the person. State of the art parsers usually make errors when annotating textual instructions, as they have short sentence structure often in imperative form. The inability of the parser to identify the verbs results in the inability of behaviour understanding systems to identify the relevant actions. To address this problem, we propose a simple rule-based model that attempts to correct any incorrectly annotated verbs. We argue that the model is able to significantly improve the parser's performance without the need of additional training data. We evaluate our approach by extracting the actions from 61 textual instructions annotated only with the Stanford parser and once again after applying our model. The results show a significant improvement in the recognition rate when applying the rules (75% accuracy compared to 68% without the rules, p-value < 0.001). |
Tasks | Action Detection |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/R17-1106/ |
https://doi.org/10.26615/978-954-452-049-6_106 | |
PWC | https://paperswithcode.com/paper/a-simple-model-for-improving-the-performance |
Repo | |
Framework | |
Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG
Title | Scope, Time, and Predicate Restriction in Blackfoot using MC-STAG |
Authors | Dennis Ryan Storoshenko |
Abstract | |
Tasks | |
Published | 2017-09-01 |
URL | https://www.aclweb.org/anthology/W17-6206/ |
https://www.aclweb.org/anthology/W17-6206 | |
PWC | https://paperswithcode.com/paper/scope-time-and-predicate-restriction-in |
Repo | |
Framework | |