Paper Group NANR 255
Scalable Bilinear Pi Learning Using State and Action Features
Title | Scalable Bilinear Pi Learning Using State and Action Features |
Authors | Yichen Chen, Lihong Li, Mengdi Wang |
Abstract | Approximate linear programming (ALP) represents one of the major algorithmic families to solve large-scale Markov decision processes (MDP). In this work, we study a primal-dual formulation of the ALP, and develop a scalable, model-free algorithm called bilinear $\pi$ learning for reinforcement learning when a sampling oracle is provided. This algorithm enjoys a number of advantages. First, it adopts linear and bilinear models to represent the high-dimensional value function and state-action distributions, respectively, using given state and action features. Its run-time complexity depends on the number of features, not the size of the underlying MDPs. Second, it operates in a fully online fashion without having to store any sample, thus having minimal memory footprint. Third, we prove that it is sample-efficient, solving for the optimal policy to high precision with a sample complexity linear in the dimension of the parameter space. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2330 |
http://proceedings.mlr.press/v80/chen18e/chen18e.pdf | |
PWC | https://paperswithcode.com/paper/scalable-bilinear-pi-learning-using-state-and |
Repo | |
Framework | |
Semi-Supervised Learning on Data Streams via Temporal Label Propagation
Title | Semi-Supervised Learning on Data Streams via Temporal Label Propagation |
Authors | Tal Wagner, Sudipto Guha, Shiva Kasiviswanathan, Nina Mishra |
Abstract | We consider the problem of labeling points on a fast-moving data stream when only a small number of labeled examples are available. In our setting, incoming points must be processed efficiently and the stream is too large to store in its entirety. We present a semi-supervised learning algorithm for this task. The algorithm maintains a small synopsis of the stream which can be quickly updated as new points arrive, and labels every incoming point by provably learning from the full history of the stream. Experiments on real datasets validate that the algorithm can quickly and accurately classify points on a stream with a small quantity of labeled examples. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=1879 |
http://proceedings.mlr.press/v80/wagner18a/wagner18a.pdf | |
PWC | https://paperswithcode.com/paper/semi-supervised-learning-on-data-streams-via |
Repo | |
Framework | |
Personalized Microblog Sentiment Classification via Adversarial Cross-lingual Multi-task Learning
Title | Personalized Microblog Sentiment Classification via Adversarial Cross-lingual Multi-task Learning |
Authors | Weichao Wang, Shi Feng, Wei Gao, Daling Wang, Yifei Zhang |
Abstract | Sentiment expression in microblog posts can be affected by a user's personal character, opinion bias, political stance and so on. Most existing personalized microblog sentiment classification methods suffer from the insufficiency of discriminative tweets for personalization learning. We observed that microblog users have consistent individuality and opinion bias in different languages. Based on this observation, in this paper we propose a novel user-attention-based Convolutional Neural Network (CNN) model with an adversarial cross-lingual learning framework. The user attention mechanism is leveraged in the CNN model to capture a user's language-specific individuality from the posts. Then the attention-based CNN model is incorporated into a novel adversarial cross-lingual learning framework, in which, with the help of user properties as a bridge between languages, we can extract the language-specific features and language-independent features to enrich the user post representation so as to alleviate the data insufficiency problem. Results on English and Chinese microblog datasets confirm that our method outperforms state-of-the-art baseline algorithms by large margins. |
Tasks | Multi-Task Learning, Sentiment Analysis |
Published | 2018-10-01 |
URL | https://www.aclweb.org/anthology/D18-1031/ |
https://www.aclweb.org/anthology/D18-1031 | |
PWC | https://paperswithcode.com/paper/personalized-microblog-sentiment |
Repo | |
Framework | |
Neural Network based Extreme Classification and Similarity Models for Product Matching
Title | Neural Network based Extreme Classification and Similarity Models for Product Matching |
Authors | Kashif Shah, Selcuk Kopru, Jean-David Ruvini |
Abstract | Matching a seller-listed item to an appropriate product has become one of the most fundamental and significant steps for e-commerce platforms that offer a product-based experience. It has a huge impact on search effectiveness, search engine optimization, product reviews, and product price estimation, among many other advantages for a better user experience. As significant and vital as it has become, the challenge of tackling its complexity has grown with the exponential growth of individual and business sellers trading millions of products every day. We explored two approaches: classification based on a shallow neural network and similarity based on a deep siamese network. These models outperform the baseline by more than 5% in terms of accuracy and are capable of extremely efficient training and inference. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/N18-3002/ |
https://www.aclweb.org/anthology/N18-3002 | |
PWC | https://paperswithcode.com/paper/neural-network-based-extreme-classification |
Repo | |
Framework | |
NextGen AML: Distributed Deep Learning based Language Technologies to Augment Anti Money Laundering Investigation
Title | NextGen AML: Distributed Deep Learning based Language Technologies to Augment Anti Money Laundering Investigation |
Authors | Jingguang Han, Utsab Barman, Jeremiah Hayes, Jinhua Du, Edward Burgin, Dadong Wan |
Abstract | Most of the current anti money laundering (AML) systems, using handcrafted rules, are heavily reliant on existing structured databases, which are not capable of effectively and efficiently identifying hidden and complex ML activities, especially those with dynamic and time-varying characteristics, resulting in a high percentage of false positives. Therefore, analysts are engaged for further investigation, which significantly increases human capital cost and processing time. To alleviate these issues, this paper presents a novel framework for the next generation AML by applying and visualizing deep learning-driven natural language processing (NLP) technologies in a distributed and scalable manner to augment AML monitoring and investigation. The proposed distributed framework performs news and tweet sentiment analysis, entity recognition, relation extraction, entity linking and link analysis on different data sources (e.g. news articles and tweets) to provide additional evidence to human investigators for final decision-making. Each NLP module is evaluated on a task-specific data set, and the overall experiments are performed on synthetic and real-world datasets. Feedback from AML practitioners suggests that our system can reduce time and cost by approximately 30% compared to their previous manual approaches to AML investigation. |
Tasks | Decision Making, Entity Linking, Relation Extraction, Sentiment Analysis |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-4007/ |
https://www.aclweb.org/anthology/P18-4007 | |
PWC | https://paperswithcode.com/paper/nextgen-aml-distributed-deep-learning-based |
Repo | |
Framework | |
Data-Driven Text Simplification
Title | Data-Driven Text Simplification |
Authors | Sanja Štajner, Horacio Saggion |
Abstract | |
Tasks | Lexical Simplification, Machine Translation, Semantic Role Labeling, Text Simplification |
Published | 2018-08-01 |
URL | https://www.aclweb.org/anthology/C18-3005/ |
https://www.aclweb.org/anthology/C18-3005 | |
PWC | https://paperswithcode.com/paper/data-driven-text-simplification |
Repo | |
Framework | |
Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Title | Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction |
Authors | Adam Ek, Mats Wirén, Robert Östling, Kristina N. Björkenstam, Gintarė Grigonytė, Sofia Gustafson Capková |
Abstract | |
Tasks | Speaker Identification |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1131/ |
https://www.aclweb.org/anthology/L18-1131 | |
PWC | https://paperswithcode.com/paper/identifying-speakers-and-addressees-in |
Repo | |
Framework | |
Unsupervised Learning of Distributional Relation Vectors
Title | Unsupervised Learning of Distributional Relation Vectors |
Authors | Shoaib Jameel, Zied Bouraoui, Steven Schockaert |
Abstract | Word embedding models such as GloVe rely on co-occurrence statistics to learn vector representations of word meaning. While we may similarly expect that co-occurrence statistics can be used to capture rich information about the relationships between different words, existing approaches for modeling such relationships are based on manipulating pre-trained word vectors. In this paper, we introduce a novel method which directly learns relation vectors from co-occurrence statistics. To this end, we first introduce a variant of GloVe, in which there is an explicit connection between word vectors and PMI weighted co-occurrence vectors. We then show how relation vectors can be naturally embedded into the resulting vector space. |
Tasks | Relation Extraction, Word Embeddings |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-1003/ |
https://www.aclweb.org/anthology/P18-1003 | |
PWC | https://paperswithcode.com/paper/unsupervised-learning-of-distributional |
Repo | |
Framework | |
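The abstract above refers to PMI-weighted co-occurrence vectors. As a rough illustration of the underlying quantity only, the sketch below computes pointwise mutual information from raw (word, context) counts. It is not the authors' GloVe variant or their relation-vector model; the function name `pmi_table` and the toy pairs are invented for this example.

```python
import math
from collections import Counter

def pmi_table(pairs):
    """Return PMI(w, c) = log(p(w, c) / (p(w) * p(c))) for each observed pair.

    `pairs` is a list of (word, context) observations; probabilities are
    estimated by simple relative frequencies (an assumption of this sketch).
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(w for w, _ in pairs)
    ctx_counts = Counter(c for _, c in pairs)
    total = len(pairs)
    return {
        (w, c): math.log(n * total / (word_counts[w] * ctx_counts[c]))
        for (w, c), n in pair_counts.items()
    }

# Toy observations, made up for illustration.
toy_pairs = [
    ("paris", "france"),
    ("paris", "france"),
    ("rome", "italy"),
    ("rome", "france"),
]
scores = pmi_table(toy_pairs)
```

Pairs that co-occur more often than their marginal frequencies predict get a positive score, and pairs that co-occur less often get a negative one.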
Feasible Arm Identification
Title | Feasible Arm Identification |
Authors | Julian Katz-Samuels, Clay Scott |
Abstract | We introduce the feasible arm identification problem, a pure exploration multi-armed bandit problem where the agent is given a set of $D$-dimensional arms and a polyhedron $P = \{x : Ax \leq b\} \subset \mathbb{R}^D$. Pulling an arm gives a random vector and the goal is to determine, using a fixed budget of $T$ pulls, which of the arms have means belonging to $P$. We propose three algorithms MD-UCBE, MD-SAR, and MD-APT and provide a unified analysis establishing upper bounds for each of them. We also establish a lower bound that matches up to constants the upper bounds of MD-UCBE and MD-APT. Finally, we demonstrate the effectiveness of our algorithms on synthetic and real-world datasets. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2064 |
http://proceedings.mlr.press/v80/katz-samuels18a/katz-samuels18a.pdf | |
PWC | https://paperswithcode.com/paper/feasible-arm-identification |
Repo | |
Framework | |
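To make the problem statement in the abstract above concrete, the sketch below checks whether an arm's empirical mean lies in a polyhedron of the form $Ax \leq b$. This is only the naive plug-in membership test, not MD-UCBE, MD-SAR, or MD-APT, and it says nothing about how pulls should be allocated. The helper names, the unit-square polyhedron, and the sample pulls are all assumptions made for illustration.

```python
def in_polyhedron(A, b, x, tol=1e-9):
    """Return True if every constraint row satisfies a_i . x <= b_i (+ tol)."""
    return all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i + tol
        for row, b_i in zip(A, b)
    )

def empirical_mean(samples):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

# Hypothetical polyhedron: the unit square 0 <= x_1, x_2 <= 1 in R^2.
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]

# Hypothetical observed pulls per arm (each pull is a 2-dimensional vector).
pulls = {
    "arm_a": [[0.4, 0.6], [0.6, 0.4]],
    "arm_b": [[1.2, 0.5], [1.4, 0.3]],
}
feasible = {name: in_polyhedron(A, b, empirical_mean(s)) for name, s in pulls.items()}
```

With these toy pulls, `arm_a` has empirical mean (0.5, 0.5) and is declared feasible, while `arm_b` has mean (1.3, 0.4) and violates the first constraint.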
CAYLEYNETS: SPECTRAL GRAPH CNNS WITH COMPLEX RATIONAL FILTERS
Title | CAYLEYNETS: SPECTRAL GRAPH CNNS WITH COMPLEX RATIONAL FILTERS |
Authors | Ron Levie, Federico Monti, Xavier Bresson, Michael M. Bronstein |
Abstract | The rise of graph-structured data such as social networks, regulatory networks, citation graphs, and functional brain networks, in combination with the resounding success of deep learning in various applications, has sparked interest in generalizing deep learning models to non-Euclidean domains. In this paper, we introduce a new spectral domain convolutional architecture for deep learning on graphs. The core ingredient of our model is a new class of parametric rational complex functions (Cayley polynomials) that allow efficient computation of spectral filters on graphs that specialize in frequency bands of interest. Our model generates rich spectral filters that are localized in space, scales linearly with the size of the input data for sparsely-connected graphs, and can handle different constructions of Laplacian operators. Extensive experimental results show the superior performance of our approach on spectral image classification, community detection, vertex classification and matrix completion tasks. |
Tasks | Community Detection, Image Classification, Matrix Completion |
Published | 2018-01-01 |
URL | https://openreview.net/forum?id=S1680_1Rb |
https://openreview.net/pdf?id=S1680_1Rb | |
PWC | https://paperswithcode.com/paper/cayleynets-spectral-graph-cnns-with-complex |
Repo | |
Framework | |
ChatEval: A Tool for the Systematic Evaluation of Chatbots
Title | ChatEval: A Tool for the Systematic Evaluation of Chatbots |
Authors | João Sedoc, Daphne Ippolito, Arun Kirubarajan, Jai Thirani, Lyle Ungar, Chris Callison-Burch |
Abstract | |
Tasks | Chatbot, Text Generation |
Published | 2018-11-01 |
URL | https://www.aclweb.org/anthology/W18-6709/ |
https://www.aclweb.org/anthology/W18-6709 | |
PWC | https://paperswithcode.com/paper/chateval-a-tool-for-the-systematic-evaluation |
Repo | |
Framework | |
Predicting accuracy on large datasets from smaller pilot data
Title | Predicting accuracy on large datasets from smaller pilot data |
Authors | Mark Johnson, Peter Anderson, Mark Dras, Mark Steedman |
Abstract | Because obtaining training data is often the most difficult part of an NLP or ML project, we develop methods for predicting how much data is required to achieve a desired test accuracy by extrapolating results from models trained on a small pilot training dataset. We model how accuracy varies as a function of training size on subsets of the pilot data, and use that model to predict how much training data would be required to achieve the desired accuracy. We introduce a new performance extrapolation task to evaluate how well different extrapolations predict accuracy on larger training sets. We show that details of hyperparameter optimisation and the extrapolation models can have dramatic effects in a document classification task. We believe this is an important first step in developing methods for estimating the resources required to meet specific engineering performance targets. |
Tasks | Document Classification |
Published | 2018-07-01 |
URL | https://www.aclweb.org/anthology/P18-2072/ |
https://www.aclweb.org/anthology/P18-2072 | |
PWC | https://paperswithcode.com/paper/predicting-accuracy-on-large-datasets-from |
Repo | |
Framework | |
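The abstract above describes fitting a model of accuracy against training-set size on pilot data and extrapolating from it. The sketch below shows one possible instance of that idea, assuming an inverse power-law error curve `err(n) = a * n**(-c)` fitted by least squares in log-log space. The functional form, the function names, and the synthetic pilot numbers are assumptions for illustration and are not taken from the paper.

```python
import math

def fit_power_law(sizes, errors):
    """Least-squares fit of log(err) = log(a) - c * log(n); returns (a, c)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

def size_for_error(a, c, target_error):
    """Invert err(n) = a * n**(-c) to get the training size for a target error."""
    return (a / target_error) ** (1.0 / c)

# Synthetic pilot measurements generated from a known curve (a = 0.5, c = 0.5).
pilot_sizes = [100, 200, 400, 800]
pilot_errors = [0.5 * n ** -0.5 for n in pilot_sizes]

a, c = fit_power_law(pilot_sizes, pilot_errors)
needed = size_for_error(a, c, target_error=0.01)
```

Because the pilot points here lie exactly on the assumed curve, the fit recovers its parameters. Real pilot data would be noisy, and, as the abstract notes, the choice of extrapolation model can change the prediction dramatically.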
Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus
Title | Collection and Analysis of Code-switch Egyptian Arabic-English Speech Corpus |
Authors | Injy Hamed, Mohamed Elmahdy, Slim Abdennadher |
Abstract | |
Tasks | Speech Recognition |
Published | 2018-05-01 |
URL | https://www.aclweb.org/anthology/L18-1601/ |
https://www.aclweb.org/anthology/L18-1601 | |
PWC | https://paperswithcode.com/paper/collection-and-analysis-of-code-switch |
Repo | |
Framework | |
A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning
Title | A Delay-tolerant Proximal-Gradient Algorithm for Distributed Learning |
Authors | Konstantin Mishchenko, Franck Iutzeler, Jérôme Malick, Massih-Reza Amini |
Abstract | Distributed learning aims at computing high-quality models by training over scattered data. This covers a diversity of scenarios, including computer clusters or mobile agents. One of the main challenges is then to deal with heterogeneous machines and unreliable communications. In this setting, we propose and analyze a flexible asynchronous optimization algorithm for solving nonsmooth learning problems. Unlike most existing methods, our algorithm is adjustable to various levels of communication costs, machines' computational power, and data distribution evenness. We prove that the algorithm converges linearly with a fixed learning rate that depends neither on communication delays nor on the number of machines. Although long delays in communication may slow down performance, no delay can break convergence. |
Tasks | |
Published | 2018-07-01 |
URL | https://icml.cc/Conferences/2018/Schedule?showEvent=2143 |
http://proceedings.mlr.press/v80/mishchenko18a/mishchenko18a.pdf | |
PWC | https://paperswithcode.com/paper/a-delay-tolerant-proximal-gradient-algorithm |
Repo | |
Framework | |
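For readers unfamiliar with the proximal-gradient step that the abstract above builds on, the sketch below runs the plain single-machine, synchronous version on a toy L1-regularized problem. It does not implement the paper's asynchronous, delay-tolerant scheme. The objective, step size, and names are assumptions chosen so that the exact minimizer is known.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [vi - t if vi > t else vi + t if vi < -t else 0.0 for vi in v]

def proximal_gradient(grad_f, x0, step, lam, iters=200):
    """Minimize f(x) + lam * ||x||_1 by alternating a gradient step on the
    smooth part f with a soft-thresholding (proximal) step on the L1 term."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)], step * lam)
    return x

# Toy smooth part: f(x) = 0.5 * ||x - y||^2, whose gradient is x - y.
y = [3.0, -0.2, 1.5]

def grad(x):
    return [xi - yi for xi, yi in zip(x, y)]

x_star = proximal_gradient(grad, [0.0, 0.0, 0.0], step=0.5, lam=1.0)
```

For this objective the minimizer is the soft-thresholding of `y` at level `lam`, namely (2.0, 0.0, 0.5), which the iteration approaches geometrically.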
ECNU at SemEval-2018 Task 2: Leverage Traditional NLP Features and Neural Networks Methods to Address Twitter Emoji Prediction Task
Title | ECNU at SemEval-2018 Task 2: Leverage Traditional NLP Features and Neural Networks Methods to Address Twitter Emoji Prediction Task |
Authors | Xingwu Lu, Xin Mao, Man Lan, Yuanbin Wu |
Abstract | This paper describes our submissions to Task 2 in SemEval 2018, i.e., Multilingual Emoji Prediction. We first investigate several traditional Natural Language Processing (NLP) features, and then design several deep learning models. For subtask 1: Emoji Prediction in English, we combine two different methods to represent tweets, i.e., a supervised model using traditional features and a deep learning model. For subtask 2: Emoji Prediction in Spanish, we only use a deep learning model. |
Tasks | |
Published | 2018-06-01 |
URL | https://www.aclweb.org/anthology/S18-1068/ |
https://www.aclweb.org/anthology/S18-1068 | |
PWC | https://paperswithcode.com/paper/ecnu-at-semeval-2018-task-2-leverage |
Repo | |
Framework | |