Paper Group NANR 52
CensNet: Convolution with Edge-Node Switching in Graph Neural Networks. Differentially Private Covariance Estimation. UA at SemEval-2019 Task 5: Setting A Strong Linear Baseline for Hate Speech Detection. Our Neural Machine Translation Systems for WAT 2019. TLR at BSNLP2019: A Multilingual Named Entity Recognition System. A Walk with SGD: How SGD Explores Regions of Deep Network Loss?. Ranking and Sampling in Open-Domain Question Answering. A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus. LIUM’s Contributions to the WMT2019 News Translation Task: Data and Systems for German-French Language Pairs. Interest aware influential information disseminators in social networks. Natural Language Inference with Monotonicity. Kyoto University Participation to the WMT 2019 News Shared Task. Exploiting Noisy Data in Distant Supervision Relation Classification. ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA. Distributional Semantics in the Real World: Building Word Vector Representations from a Truth-Theoretic Model.
CensNet: Convolution with Edge-Node Switching in Graph Neural Networks
Title | CensNet: Convolution with Edge-Node Switching in Graph Neural Networks |
Authors | Xiaodong Jiang, Pengsheng Ji, Sheng Li |
Abstract | In this paper, we present CensNet, a Convolution with Edge-Node Switching graph neural network, for semi-supervised classification and regression in graph-structured data with both node and edge features. CensNet is a general graph embedding framework which embeds both nodes and edges in a latent feature space. By using the line graph of the original undirected graph, the roles of nodes and edges are switched, and two novel graph convolution operations are proposed for feature propagation. Experimental results on real-world academic citation networks and quantum chemistry graphs show that our approach achieves or matches state-of-the-art performance. |
Tasks | Graph Classification, Graph Embedding, Graph Regression, Node Classification |
Published | 2019-08-10 |
URL | https://doi.org/10.24963/ijcai.2019/369 |
https://www.ijcai.org/proceedings/2019/0369.pdf | |
PWC | https://paperswithcode.com/paper/censnet-convolution-with-edge-node-switching |
Repo | |
Framework | |
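As a rough illustration of the edge-node switching idea in the abstract above, the sketch below runs one GCN-style propagation step on a toy graph and on its line graph, where edges play the role of nodes. The weights, activation, and normalization are illustrative assumptions; the paper's actual convolution operators also mix node and edge signals, which this sketch omits.

```python
# Minimal sketch of edge-node switching via the line graph (assumptions noted above).
import networkx as nx
import numpy as np

def normalized_adj(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

# Toy graph with random node and edge features.
G = nx.cycle_graph(4)
X_node = np.random.randn(G.number_of_nodes(), 8)
X_edge = np.random.randn(G.number_of_edges(), 8)

# Line graph L(G): its nodes are the edges of G, so a standard graph
# convolution on L(G) propagates *edge* features -- the "switching" idea.
L = nx.line_graph(G)
edge_order = [tuple(sorted(e)) for e in G.edges()]
A_node = nx.to_numpy_array(G)
A_edge = nx.to_numpy_array(L, nodelist=edge_order)

W_node = np.random.randn(8, 8)  # hypothetical learnable weights
W_edge = np.random.randn(8, 8)

# One GCN-style propagation step in each role.
H_node = np.tanh(normalized_adj(A_node) @ X_node @ W_node)
H_edge = np.tanh(normalized_adj(A_edge) @ X_edge @ W_edge)
```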
Differentially Private Covariance Estimation
Title | Differentially Private Covariance Estimation |
Authors | Kareem Amin, Travis Dick, Alex Kulesza, Andres Munoz, Sergei Vassilvitskii |
Abstract | The covariance matrix of a dataset is a fundamental statistic that can be used for calculating optimum regression weights as well as in many other learning and data analysis settings. For datasets containing private user information, we often want to estimate the covariance matrix in a way that preserves differential privacy. While there are known methods for privately computing the covariance matrix, they all have one of two major shortcomings. Some, like the Gaussian mechanism, only guarantee (epsilon, delta)-differential privacy, leaving a non-trivial probability of privacy failure. Others give strong epsilon-differential privacy guarantees, but are impractical, requiring complicated sampling schemes, and tend to perform poorly on real data. In this work we propose a new epsilon-differentially private algorithm for computing the covariance matrix of a dataset that addresses both of these limitations. We show that it has lower error than existing state-of-the-art approaches, both analytically and empirically. In addition, the algorithm is significantly less complicated than other methods and can be efficiently implemented with rejection sampling. |
Tasks | |
Published | 2019-12-01 |
URL | http://papers.nips.cc/paper/9567-differentially-private-covariance-estimation |
http://papers.nips.cc/paper/9567-differentially-private-covariance-estimation.pdf | |
PWC | https://paperswithcode.com/paper/differentially-private-covariance-estimation |
Repo | |
Framework | |
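The abstract cites the Gaussian mechanism as the standard (epsilon, delta)-DP approach for covariance release; the sketch below shows that baseline, not the paper's epsilon-DP rejection-sampling algorithm. The unit-row-norm requirement and the resulting sensitivity bound are assumptions of this sketch.

```python
# Gaussian-mechanism baseline for private covariance (not the paper's algorithm).
import numpy as np

def gaussian_mechanism_covariance(X, epsilon, delta):
    """Release X^T X / n with Gaussian noise calibrated to its L2 sensitivity.

    Assumes each row of X has L2 norm at most 1, so replacing one row changes
    X^T X by a matrix of Frobenius norm at most 2; dividing by n gives the
    sensitivity used below (a standard bound, stated here as an assumption).
    """
    n, d = X.shape
    sensitivity = 2.0 / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    cov = X.T @ X / n
    noise = np.random.normal(0.0, sigma, size=(d, d))
    noise = (noise + noise.T) / 2.0  # symmetrize so the released matrix is symmetric
    return cov + noise
```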
UA at SemEval-2019 Task 5: Setting A Strong Linear Baseline for Hate Speech Detection
Title | UA at SemEval-2019 Task 5: Setting A Strong Linear Baseline for Hate Speech Detection |
Authors | Carlos Perelló, David Tomás, Alberto Garcia-Garcia, Jose Garcia-Rodriguez, Jose Camacho-Collados |
Abstract | This paper describes the system developed at the University of Alicante (UA) for the SemEval 2019 Task 5: Shared Task on Multilingual Detection of Hate. The purpose of this work is to build a strong baseline for hate speech detection, using a traditional machine learning approach with standard textual features, which could serve in the near future as a reference point for comparison with deep learning systems. We participated in both task A (Hate Speech Detection against Immigrants and Women) and task B (Aggressive behavior and Target Classification). Despite its simplicity, our system obtained a remarkable F1-score of 72.5 (sixth highest) and an accuracy of 73.6 (second highest) in Spanish (task A), outperforming more complex neural models from a total of 40 participant systems. |
Tasks | Hate Speech Detection |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/S19-2091/ |
https://www.aclweb.org/anthology/S19-2091 | |
PWC | https://paperswithcode.com/paper/ua-at-semeval-2019-task-5-setting-a-strong |
Repo | |
Framework | |
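A minimal sketch of a "traditional machine learning approach with standard textual features" in the spirit of this baseline, assuming TF-IDF n-grams and a linear SVM; the UA team's exact features and classifier are not specified here, and the training data below is a placeholder.

```python
# Hedged sketch of a strong linear baseline for hate speech detection.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["example tweet one", "example tweet two"]  # placeholder training data
labels = [0, 1]                                     # 1 = hate speech (task A)

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1, sublinear_tf=True),
    LinearSVC(C=1.0),
)
baseline.fit(texts, labels)
print(baseline.predict(["another example tweet"]))
```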
Our Neural Machine Translation Systems for WAT 2019
Title | Our Neural Machine Translation Systems for WAT 2019 |
Authors | Wei Yang, Jun Ogata |
Abstract | In this paper, we describe our Neural Machine Translation (NMT) systems for the WAT 2019 translation tasks we focus on. This year we participate in the scientific paper tasks and focus on the language pair between English and Japanese. Throughout this work we use the Transformer model to explore the power of an architecture that relies on the self-attention mechanism. We use different NMT toolkits/libraries as implementations for training the Transformer model, and different subword segmentation strategies for word segmentation depending on the toolkit/library. We not only give the translation accuracy obtained with the absolute position encodings introduced in the Transformer model, but also report the improvements in translation accuracy when replacing absolute position encodings with relative position representations. We also ensemble several independently trained Transformer models to further improve translation accuracy. |
Tasks | Machine Translation |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-5220/ |
https://www.aclweb.org/anthology/D19-5220 | |
PWC | https://paperswithcode.com/paper/our-neural-machine-translation-systems-for |
Repo | |
Framework | |
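For reference, the sketch below computes the absolute sinusoidal position encodings from the original Transformer paper, which this abstract compares against relative position representations.

```python
# Absolute sinusoidal position encodings (Vaswani et al., 2017).
import numpy as np

def sinusoidal_position_encoding(max_len, d_model):
    """PE[pos, 2i] = sin(pos / 10000^(2i/d)), PE[pos, 2i+1] = cos(...). d_model must be even."""
    positions = np.arange(max_len)[:, None]          # (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = sinusoidal_position_encoding(max_len=128, d_model=512)
# Added to token embeddings before the first encoder layer: x = emb + pe[:len(x)]
```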
TLR at BSNLP2019: A Multilingual Named Entity Recognition System
Title | TLR at BSNLP2019: A Multilingual Named Entity Recognition System |
Authors | Jose G. Moreno, Elvys Linhares Pontes, Mickael Coustaty, Antoine Doucet |
Abstract | This paper presents our participation in the shared task on multilingual named entity recognition at BSNLP2019. Our strategy is based on a standard neural architecture for sequence labeling. In particular, we use a mixed model which combines multilingual contextual and language-specific embeddings. Our only submitted run is based on a voting schema using multiple models, one for each of the four languages of the task (Bulgarian, Czech, Polish, and Russian) and another for English. Results for named entity recognition are encouraging for all languages, varying from 60% to 83% in terms of Strict and Relaxed metrics, respectively. |
Tasks | Named Entity Recognition |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-3711/ |
https://www.aclweb.org/anthology/W19-3711 | |
PWC | https://paperswithcode.com/paper/tlr-at-bsnlp2019-a-multilingual-named-entity |
Repo | |
Framework | |
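A small sketch of a token-level voting schema over several sequence labellers, as the abstract describes; the majority rule and its tie-breaking behaviour are assumptions of this sketch.

```python
# Token-level majority voting over multiple NER models' tag sequences.
from collections import Counter

def vote(predictions):
    """predictions: list of tag sequences (one per model) for the same sentence."""
    assert len({len(p) for p in predictions}) == 1, "models must agree on length"
    voted = []
    for tags in zip(*predictions):
        tag, _ = Counter(tags).most_common(1)[0]  # majority tag; ties fall to first seen
        voted.append(tag)
    return voted

model_outputs = [
    ["B-PER", "I-PER", "O", "B-LOC"],
    ["B-PER", "O",     "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "O"],
]
print(vote(model_outputs))  # ['B-PER', 'I-PER', 'O', 'B-LOC']
```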
A Walk with SGD: How SGD Explores Regions of Deep Network Loss?
Title | A Walk with SGD: How SGD Explores Regions of Deep Network Loss? |
Authors | Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio |
Abstract | The non-convex nature of the loss landscape of deep neural networks (DNN) lends them the intuition that over the course of training, stochastic optimization algorithms explore different regions of the loss surface by entering and escaping many local minima due to the noise induced by mini-batches. But is this really the case? This question couples the geometry of the DNN loss landscape with how stochastic optimization algorithms like SGD interact with it during training. Answering this question may help us qualitatively understand the dynamics of deep neural network optimization. We show evidence through qualitative and quantitative experiments that mini-batch SGD rarely crosses barriers during DNN optimization. As we show, the mini-batch induced noise helps SGD explore different regions of the loss surface using a seemingly different mechanism. To complement this finding, we also investigate the qualitative reason behind the slowing down of this exploration when using larger batch-sizes. We show this happens because gradients from larger batch-sizes align more with the top eigenvectors of the Hessian, which makes SGD oscillate in the proximity of the parameter initialization, thus preventing exploration. |
Tasks | Stochastic Optimization |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1l6e3RcF7 |
https://openreview.net/pdf?id=B1l6e3RcF7 | |
PWC | https://paperswithcode.com/paper/a-walk-with-sgd-how-sgd-explores-regions-of |
Repo | |
Framework | |
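The sketch below shows the kind of qualitative probe the abstract suggests: evaluating the training loss along the straight line between two consecutive SGD iterates, where a barrier would appear as a bump above both endpoints. The model, data, and loss function here are placeholders, not the paper's experimental setup.

```python
# Probe the loss on the segment between two parameter snapshots (assumed setup).
import copy
import torch

@torch.no_grad()
def loss_along_segment(model_a, model_b, loss_fn, data, target, steps=20):
    """Interpolate parameters theta = (1-t)*a + t*b and record the loss at each t."""
    probe = copy.deepcopy(model_a)
    losses = []
    for t in torch.linspace(0.0, 1.0, steps):
        for p, pa, pb in zip(probe.parameters(),
                             model_a.parameters(),
                             model_b.parameters()):
            p.copy_((1.0 - t) * pa + t * pb)
        losses.append(loss_fn(probe(data), target).item())
    return losses  # a bump above both endpoints would indicate a crossed barrier
```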
Ranking and Sampling in Open-Domain Question Answering
Title | Ranking and Sampling in Open-Domain Question Answering |
Authors | Yanfu Xu, Zheng Lin, Yuanxin Liu, Rui Liu, Weiping Wang, Dan Meng |
Abstract | Open-domain question answering (OpenQA) aims to answer questions based on a number of unlabeled paragraphs. Existing approaches always follow the distantly supervised setup, where some of the paragraphs are wrongly labeled (noisy), and mainly utilize the paragraph-question relevance to denoise. However, the paragraph-paragraph relevance, which may aggregate the evidence among relevant paragraphs, can also be utilized to discover more useful paragraphs. Moreover, current approaches mainly focus on the positive paragraphs which are known to contain the answer during training. This affects the generalization ability of the model and leaves it vulnerable to similar but irrelevant (distracting) paragraphs during testing. In this paper, we first introduce a ranking model that leverages the paragraph-question and the paragraph-paragraph relevance to compute a confidence score for each paragraph. Furthermore, based on the scores, we design a modified weighted sampling strategy for training to mitigate the influence of the noisy and distracting paragraphs. Experiments on three public datasets (Quasar-T, SearchQA and TriviaQA) show that our model advances the state of the art. |
Tasks | Open-Domain Question Answering, Question Answering |
Published | 2019-11-01 |
URL | https://www.aclweb.org/anthology/D19-1245/ |
https://www.aclweb.org/anthology/D19-1245 | |
PWC | https://paperswithcode.com/paper/ranking-and-sampling-in-open-domain-question |
Repo | |
Framework | |
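A hedged sketch of score-proportional paragraph sampling in the spirit of the abstract's modified weighted sampling strategy; the softmax form and the temperature parameter are assumptions, not the paper's exact scheme.

```python
# Draw training paragraphs with probability proportional to ranker confidence.
import numpy as np

def sample_paragraphs(scores, k, temperature=1.0, rng=None):
    """Sample k paragraph indices, favouring high-confidence (less noisy) ones."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(scores, dtype=float) / temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(scores), size=k, replace=False, p=probs)

scores = [2.1, 0.3, 1.7, -0.5]          # confidence scores from the ranking model
print(sample_paragraphs(scores, k=2))   # indices of the sampled paragraphs
```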
A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus
Title | A New Annotation Scheme for the Sejong Part-of-speech Tagged Corpus |
Authors | Jungyeul Park, Francis Tyers |
Abstract | In this paper we present a new annotation scheme for the Sejong part-of-speech tagged corpus based on Universal Dependencies style annotation. By using a new annotation scheme, we can produce Sejong-style morphological analysis and part-of-speech tagging results which have been the *de facto* standard for Korean language processing. We also explore the possibility of doing named-entity recognition and semantic-role labelling for Korean using the new annotation scheme. |
Tasks | Morphological Analysis, Named Entity Recognition, Part-Of-Speech Tagging |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-4022/ |
https://www.aclweb.org/anthology/W19-4022 | |
PWC | https://paperswithcode.com/paper/a-new-annotation-scheme-for-the-sejong-part |
Repo | |
Framework | |
LIUM’s Contributions to the WMT2019 News Translation Task: Data and Systems for German-French Language Pairs
Title | LIUM’s Contributions to the WMT2019 News Translation Task: Data and Systems for German-French Language Pairs |
Authors | Fethi Bougares, Jane Wottawa, Anne Baillot, Loïc Barrault, Adrien Bardet |
Abstract | This paper describes the neural machine translation (NMT) systems of the LIUM Laboratory developed for the French↔German news translation task of the Fourth Conference on Machine Translation (WMT 2019). The chosen language pair is included for the first time in the WMT news translation task. We describe how the training and the evaluation data was created. We also present our participation in the French↔German translation directions using self-attentional Transformer networks with small and big architectures. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5307/ |
https://www.aclweb.org/anthology/W19-5307 | |
PWC | https://paperswithcode.com/paper/liums-contributions-to-the-wmt2019-news |
Repo | |
Framework | |
Interest aware influential information disseminators in social networks
Title | Interest aware influential information disseminators in social networks |
Authors | Santhoshkumar Srinivasan, L. D. Dhinesh Babu |
Abstract | In recent years, finding influential disseminators in social networks has become a crucial issue due to its importance in controlling the spread of information, product advertisement, and rumor control. Most current research on influencer identification focuses on topological factors such as coreness, centrality, and degree distribution. But these methods do not consider the interest of the receiver, even though it plays a vital role in carrying information forward. To account for the receiver's interest when finding influential spreaders, this paper proposes a robust and reliable two-step influencer finder model which considers the individual's interest in both the spreader and the spreading information. This approach combines the individual's location and their interest in the neighbor/topic. In step 1, a novel method to find a trust vertex of spreaders is proposed. In step 2, a weighted neighborhood centrality method is proposed to identify one or more influential spreaders using the trust vertex. Experiments were conducted on six different datasets to demonstrate the effectiveness of the proposed approach. The results show that the proposed approach outperforms other recent and well-known state-of-the-art algorithms. |
Tasks | |
Published | 2019-10-22 |
URL | https://link.springer.com/article/10.1007/s42452-019-1436-x |
https://link.springer.com/content/pdf/10.1007%2Fs42452-019-1436-x.pdf | |
PWC | https://paperswithcode.com/paper/interest-aware-influential-information |
Repo | |
Framework | |
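A loose sketch of what a weighted neighborhood centrality mixing receiver interest with degree might look like; the weighting scheme and the alpha mixing parameter below are assumptions for illustration, not the paper's formulas.

```python
# Toy interest-weighted neighborhood centrality (assumed formulas, see above).
import networkx as nx

def weighted_neighborhood_centrality(G, interest, alpha=0.5):
    """interest: dict mapping node -> receiver-interest weight in [0, 1]."""
    base = {v: G.degree(v) * interest.get(v, 0.0) for v in G}
    return {
        v: alpha * base[v] + (1 - alpha) * sum(base[u] for u in G.neighbors(v))
        for v in G
    }

G = nx.karate_club_graph()
interest = {v: 1.0 / (1 + v % 3) for v in G}            # toy interest weights
scores = weighted_neighborhood_centrality(G, interest)
top = sorted(scores, key=scores.get, reverse=True)[:5]  # candidate spreaders
print(top)
```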
Natural Language Inference with Monotonicity
Title | Natural Language Inference with Monotonicity |
Authors | Hai Hu, Qi Chen, Larry Moss |
Abstract | This paper describes a working system which performs natural language inference using polarity-marked parse trees. The system handles all of the instances of monotonicity inference in the FraCaS data set. Except for the initial parse, it is entirely deterministic. It handles multi-premise arguments, and the kind of inference performed is essentially “logical”, but it goes beyond what is representable in first-order logic. In any case, the system works on surface forms rather than on representations of any kind. |
Tasks | Natural Language Inference |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0502/ |
https://www.aclweb.org/anthology/W19-0502 | |
PWC | https://paperswithcode.com/paper/natural-language-inference-with-monotonicity |
Repo | |
Framework | |
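A toy illustration of polarity-based monotonicity inference: in an upward-monotone (+) position a word may be replaced by a hypernym, in a downward-monotone (-) position by a hyponym. The lexicon and polarity marks below are hand-written assumptions, not the system's parser output.

```python
# Minimal monotonicity substitution over polarity-marked tokens.
HYPERNYM = {"dog": "animal"}
HYPONYM = {"animal": "dog"}

def infer(tokens, polarities):
    """tokens: surface words; polarities: '+', '-', or '=' per token."""
    out = []
    for tok, pol in zip(tokens, polarities):
        if pol == "+" and tok in HYPERNYM:
            out.append(HYPERNYM[tok])   # upward entailment: dog => animal
        elif pol == "-" and tok in HYPONYM:
            out.append(HYPONYM[tok])    # downward entailment: animal => dog
        else:
            out.append(tok)
    return out

# "every" is downward-monotone in its first argument and upward in its second,
# so "every animal barks" entails "every dog barks":
print(infer(["every", "animal", "barks"], ["=", "-", "+"]))
```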
Kyoto University Participation to the WMT 2019 News Shared Task
Title | Kyoto University Participation to the WMT 2019 News Shared Task |
Authors | Fabien Cromieres, Sadao Kurohashi |
Abstract | We describe here the experiments we did for the news translation shared task of WMT 2019. We focused on the new German-to-French language direction, and mostly used current standard approaches to develop a Neural Machine Translation system. We make use of the Tensor2Tensor implementation of the Transformer model. After carefully cleaning the data and noting the importance of the good use of recent monolingual data for the task, we obtain our final result by combining the output of a diverse set of trained models through the use of their “checkpoint agreement”. |
Tasks | Machine Translation |
Published | 2019-08-01 |
URL | https://www.aclweb.org/anthology/W19-5312/ |
https://www.aclweb.org/anthology/W19-5312 | |
PWC | https://paperswithcode.com/paper/kyoto-university-participation-to-the-wmt |
Repo | |
Framework | |
Exploiting Noisy Data in Distant Supervision Relation Classification
Title | Exploiting Noisy Data in Distant Supervision Relation Classification |
Authors | Kaijia Yang, Liang He, Xin-yu Dai, Shujian Huang, Jiajun Chen |
Abstract | Distant supervision has achieved great progress on the relation classification task. However, it still suffers from the noisy labeling problem. Different from previous works that underutilize noisy data which inherently characterizes the property of classification, in this paper we propose RCEND, a novel framework to enhance Relation Classification by Exploiting Noisy Data. First, an instance discriminator with reinforcement learning is designed to split the noisy data into correctly labeled data and incorrectly labeled data. Second, we learn a robust relation classifier in a semi-supervised way, whereby the correctly and incorrectly labeled data are treated as labeled and unlabeled data respectively. The experimental results show that our method outperforms the state-of-the-art models. |
Tasks | Relation Classification |
Published | 2019-06-01 |
URL | https://www.aclweb.org/anthology/N19-1325/ |
https://www.aclweb.org/anthology/N19-1325 | |
PWC | https://paperswithcode.com/paper/exploiting-noisy-data-in-distant-supervision |
Repo | |
Framework | |
ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA
Title | ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA |
Authors | Jialin Liu, Xiaohan Chen, Zhangyang Wang, Wotao Yin |
Abstract | Deep neural networks based on unfolding an iterative algorithm, for example, LISTA (learned iterative shrinkage thresholding algorithm), have been an empirical success for sparse signal recovery. The weights of these neural networks are currently determined by data-driven “black-box” training. In this work, we propose Analytic LISTA (ALISTA), where the weight matrix in LISTA is computed as the solution to a data-free optimization problem, leaving only the stepsize and threshold parameters to data-driven learning. This significantly simplifies the training. Specifically, the data-free optimization problem is based on coherence minimization. We show our ALISTA retains the optimal linear convergence proved in (Chen et al., 2018) and has a performance comparable to LISTA. Furthermore, we extend ALISTA to convolutional linear operators, again determined in a data-free manner. We also propose a feed-forward framework that combines the data-free optimization and ALISTA networks from end to end, one that can be jointly trained to gain robustness to small perturbations in the encoding model. |
Tasks | |
Published | 2019-05-01 |
URL | https://openreview.net/forum?id=B1lnzn0ctQ |
https://openreview.net/pdf?id=B1lnzn0ctQ | |
PWC | https://paperswithcode.com/paper/alista-analytic-weights-are-as-good-as |
Repo | |
Framework | |
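A compact sketch of the (A)LISTA iteration: soft-thresholding steps in which, per ALISTA, the matrix W is fixed analytically and only the per-layer stepsizes gamma and thresholds theta would be learned. Here W is a placeholder rather than the paper's coherence-minimizing solution, and the numeric values are illustrative assumptions.

```python
# ALISTA-style forward pass: x_{k+1} = soft(x_k - gamma_k * W^T (A x_k - y), theta_k).
import numpy as np

def soft_threshold(x, theta):
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def alista_forward(A, W, y, gammas, thetas):
    x = np.zeros(A.shape[1])
    for gamma, theta in zip(gammas, thetas):
        x = soft_threshold(x - gamma * W.T @ (A @ x - y), theta)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100)) / np.sqrt(50)  # measurement matrix
W = A  # placeholder; ALISTA computes W by data-free coherence minimization
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = 1.0  # 5-sparse ground truth
y = A @ x_true
x_hat = alista_forward(A, W, y, gammas=[0.5] * 16, thetas=[0.05] * 16)
```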
Distributional Semantics in the Real World: Building Word Vector Representations from a Truth-Theoretic Model
Title | Distributional Semantics in the Real World: Building Word Vector Representations from a Truth-Theoretic Model |
Authors | Elizaveta Kuzmenko, Aurélie Herbelot |
Abstract | Distributional semantics models (DSMs) are known to produce excellent representations of word meaning, which correlate with a range of behavioural data. As lexical representations, they have been said to be fundamentally different from truth-theoretic models of semantics, where meaning is defined as a correspondence relation to the world. There are two main aspects to this difference: a) DSMs are built over corpus data which may or may not reflect ‘what is in the world’; b) they are built from word co-occurrences, that is, from lexical types rather than entities and sets. In this paper, we inspect the properties of a distributional model built over a set-theoretic approximation of ‘the real world’. To achieve this, we take the annotations of a large database of images marked with objects, attributes and relations, convert the data into a representation akin to first-order logic and build several distributional models using various combinations of features. We evaluate those models over both relatedness and similarity datasets, demonstrating their effectiveness in standard evaluations. This allows us to conclude that, despite prior claims, truth-theoretic models are good candidates for building graded lexical representations of meaning. |
Tasks | |
Published | 2019-05-01 |
URL | https://www.aclweb.org/anthology/W19-0503/ |
https://www.aclweb.org/anthology/W19-0503 | |
PWC | https://paperswithcode.com/paper/distributional-semantics-in-the-real-world |
Repo | |
Framework | |
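A small sketch of the abstract's pipeline under toy assumptions: treat image annotations as predicate(entity) facts, in a first-order-logic-like form, and build count-based word vectors from the sets of predicates that hold of each entity. The tiny "world" below is an invented placeholder.

```python
# Count-based word vectors from a set-theoretic toy world (assumed data).
from collections import defaultdict

# Facts of the form predicate(entity), flattened to (entity, predicate) pairs.
facts = [
    ("cat_1", "cat"), ("cat_1", "black"), ("cat_1", "on_mat"),
    ("cat_2", "cat"), ("cat_2", "white"),
    ("dog_1", "dog"), ("dog_1", "black"), ("dog_1", "on_mat"),
]

# Group predicates by the entity they hold of.
by_entity = defaultdict(set)
for entity, pred in facts:
    by_entity[entity].add(pred)

# A word's vector counts, for each other predicate, the entities satisfying both.
vectors = defaultdict(lambda: defaultdict(int))
for preds in by_entity.values():
    for p in preds:
        for q in preds:
            if p != q:
                vectors[p][q] += 1

print(dict(vectors["cat"]))  # {'black': 1, 'on_mat': 1, 'white': 1}
```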