January 27, 2020

Paper Group ANR 1137

Porous Lattice-based Transformer Encoder for Chinese NER. Hierarchical Reinforcement Learning for Multi-agent MOBA Game. Evaluation Metrics for Item Recommendation under Sampling. Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation. Journal Name Extraction from Japanese Scientific News Articles. Occlus …

Porous Lattice-based Transformer Encoder for Chinese NER

Title Porous Lattice-based Transformer Encoder for Chinese NER
Authors Xue Mengge, Yu Bowen, Liu Tingwen, Wang Bin, Meng Erli, Li Quangang
Abstract Incorporating lattices into character-level Chinese named entity recognition is an effective method to exploit explicit word information. Recent works extend recurrent and convolutional neural networks to model lattice inputs. However, due to the DAG structure or the variable-sized potential word set for lattice inputs, these models prevent the convenient use of batched computation, resulting in serious inefficiency. In this paper, we propose a porous lattice-based transformer encoder for Chinese named entity recognition, which is able to better exploit GPU parallelism and batch the computation owing to the mask mechanism in the transformer. We first investigate lattice-aware self-attention coupled with relative position representations to explore effective word information in the lattice structure. Besides, to strengthen the local dependencies among neighboring tokens, we propose a novel porous structure for the self-attention computation, in which every two non-neighboring tokens are connected through a shared pivot node. Experimental results on four datasets show that our model runs up to 9.47 times faster than state-of-the-art models, while performing roughly on par with them. The source code of this paper can be obtained from https://github.com/xxx/xxx.
Tasks Chinese Named Entity Recognition, Named Entity Recognition
Published 2019-11-07
URL https://arxiv.org/abs/1911.02733v1
PDF https://arxiv.org/pdf/1911.02733v1.pdf
PWC https://paperswithcode.com/paper/porous-lattice-based-transformer-encoder-for
Repo
Framework

Hierarchical Reinforcement Learning for Multi-agent MOBA Game

Title Hierarchical Reinforcement Learning for Multi-agent MOBA Game
Authors Zhijian Zhang, Haozheng Li, Luo Zhang, Tianyin Zheng, Ting Zhang, Xiong Hao, Xiaoxin Chen, Min Chen, Fangxu Xiao, Wei Zhou
Abstract Real Time Strategy (RTS) games require macro strategies as well as micro strategies to obtain satisfactory performance, since they have large state spaces, large action spaces, and hidden information. This paper presents a novel hierarchical reinforcement learning model for mastering Multiplayer Online Battle Arena (MOBA) games, a sub-genre of RTS games. The novel contributions of this work are: (1) proposing a hierarchical framework, where agents execute macro strategies by imitation learning and carry out micromanipulations through reinforcement learning, (2) developing a simple self-learning method to get better sample efficiency for training, and (3) designing a dense reward function for multi-agent cooperation in the absence of a game engine or Application Programming Interface (API). Finally, various experiments have been performed to validate the superior performance of the proposed method over other state-of-the-art reinforcement learning algorithms. The agent successfully learns to combat and defeat the bronze-level built-in AI with a 100% win rate, and experiments show that our method can create a competitive multi-agent system for the mobile MOBA game King of Glory in 5v5 mode.
Tasks Hierarchical Reinforcement Learning, Imitation Learning
Published 2019-01-23
URL https://arxiv.org/abs/1901.08004v6
PDF https://arxiv.org/pdf/1901.08004v6.pdf
PWC https://paperswithcode.com/paper/hierarchical-reinforcement-learning-for-multi
Repo
Framework

Evaluation Metrics for Item Recommendation under Sampling

Title Evaluation Metrics for Item Recommendation under Sampling
Authors Steffen Rendle
Abstract The task of item recommendation requires ranking a large catalogue of items given a context. Item recommendation algorithms are evaluated using ranking metrics that depend on the positions of relevant items. To speed up the computation of metrics, recent work often uses sampled metrics where only a smaller set of random items and the relevant items are ranked. This paper investigates sampled metrics in more detail and shows that sampled metrics are inconsistent with their exact version. Sampled metrics do not preserve relative statements, e.g., ‘algorithm A is better than B’, not even in expectation. Moreover, the smaller the sample size, the smaller the difference between metrics, and for very small sample sizes, all metrics collapse to the AUC metric.
Tasks
Published 2019-12-04
URL https://arxiv.org/abs/1912.02263v1
PDF https://arxiv.org/pdf/1912.02263v1.pdf
PWC https://paperswithcode.com/paper/evaluation-metrics-for-item-recommendation
Repo
Framework
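
The two claims in the abstract above (sampled metrics can reverse a comparison, and very small samples collapse to AUC) can be illustrated with a small calculation. The following is a minimal sketch, not the paper's code: it assumes a single relevant item per user, negatives sampled uniformly without replacement, and hypothetical ranks for two made-up algorithms A and B. All function names are illustrative.

```python
import math

def recall_at_k(rank, k=1):
    """Recall@k for a single relevant item at the given 1-based rank."""
    return 1.0 if rank <= k else 0.0

def ndcg(rank):
    """NDCG for a single relevant item at the given 1-based rank."""
    return 1.0 / math.log2(1 + rank)

def auc(rank, n):
    """AUC for a single relevant item ranked among n items in total."""
    return (n - rank) / (n - 1)

def expected_sampled(metric, rank, n, m):
    """Expected metric value when the relevant item is ranked against
    m negatives drawn uniformly without replacement (an assumption of
    this sketch). The number of sampled negatives that outrank the
    relevant item is hypergeometric."""
    higher = rank - 1          # negatives scoring above the relevant item
    negatives = n - 1          # all non-relevant items
    total = math.comb(negatives, m)
    expectation = 0.0
    for j in range(min(m, higher) + 1):
        prob = math.comb(higher, j) * math.comb(negatives - higher, m - j) / total
        expectation += prob * metric(j + 1)   # sampled rank is j + 1
    return expectation

def mean(values):
    values = list(values)
    return sum(values) / len(values)

n = 1000                       # hypothetical catalogue size
ranks_a = [1, 500]             # algorithm A: exact ranks for two users
ranks_b = [2, 2]               # algorithm B: exact ranks for the same users

exact_a = mean(recall_at_k(r) for r in ranks_a)
exact_b = mean(recall_at_k(r) for r in ranks_b)
sampled_a = mean(expected_sampled(recall_at_k, r, n, m=1) for r in ranks_a)
sampled_b = mean(expected_sampled(recall_at_k, r, n, m=1) for r in ranks_b)

print("exact Recall@1:   A =", exact_a, " B =", exact_b)
print("sampled Recall@1: A =", sampled_a, " B =", sampled_b)
```

In this toy setup A beats B on exact Recall@1, while B beats A on the expected sampled Recall@1 with one sampled negative. With a single negative the expected sampled Recall@1 is exactly the AUC, and the expected sampled NDCG is an affine function of it.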

Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation

Title Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation
Authors Fangzhao Wu, Junxin Liu, Chuhan Wu, Yongfeng Huang, Xing Xie
Abstract Chinese named entity recognition (CNER) is an important task in the field of Chinese natural language processing. However, CNER is very challenging since Chinese entity names are highly context-dependent. In addition, Chinese texts lack delimiters to separate words, making it difficult to identify the boundaries of entities. Besides, the training data for CNER in many domains is usually insufficient, and annotating enough training data for CNER is very expensive and time-consuming. In this paper, we propose a neural approach for CNER. First, we introduce a CNN-LSTM-CRF neural architecture to capture both local and long-distance contexts for CNER. Second, we propose a unified framework to jointly train CNER and word segmentation models in order to enhance the ability of the CNER model to identify entity boundaries. Third, we introduce an automatic method to generate pseudo-labeled samples from existing labeled data, which can enrich the training data. Experiments on two benchmark datasets show that our approach can effectively improve the performance of Chinese named entity recognition, especially when training data is insufficient.
Tasks Chinese Named Entity Recognition, Named Entity Recognition
Published 2019-04-26
URL http://arxiv.org/abs/1905.01964v1
PDF http://arxiv.org/pdf/1905.01964v1.pdf
PWC https://paperswithcode.com/paper/190501964
Repo
Framework

Journal Name Extraction from Japanese Scientific News Articles

Title Journal Name Extraction from Japanese Scientific News Articles
Authors Masato Kikuchi, Mitsuo Yoshida, Kyoji Umemura
Abstract In Japanese scientific news articles, although the research results are described clearly, the article’s sources tend to be uncited. This makes it difficult for readers to know the details of the research. In this paper, we address the task of extracting journal names from Japanese scientific news articles. We hypothesize that a journal name is likely to occur in a specific context. To support the hypothesis, we construct a character-based method and extract journal names using this method. This method only uses the left and right context features of journal names. The results of the journal name extractions suggest that the distribution hypothesis plays an important role in identifying the journal names.
Tasks
Published 2019-06-11
URL https://arxiv.org/abs/1906.04655v1
PDF https://arxiv.org/pdf/1906.04655v1.pdf
PWC https://paperswithcode.com/paper/journal-name-extraction-from-japanese
Repo
Framework

Occlusion-robust Online Multi-object Visual Tracking using a GM-PHD Filter with CNN-based Re-identification

Title Occlusion-robust Online Multi-object Visual Tracking using a GM-PHD Filter with CNN-based Re-identification
Authors Nathanael L. Baisa
Abstract We propose a novel online multi-object visual tracking algorithm via a tracking-by-detection paradigm using a Gaussian mixture Probability Hypothesis Density (GM-PHD) filter and deep Convolutional Neural Network (CNN) appearance representation learning. The GM-PHD filter has linear complexity in the number of objects and observations while estimating the states and cardinality of an unknown and time-varying number of objects in the scene. Though it handles object birth, death and clutter in a unified framework, it is susceptible to missed detections and does not include the identity of objects. We use visual-spatio-temporal information obtained from object bounding boxes and deeply learned appearance representations to perform estimates-to-tracks data association for labeling each target. We learn the deep CNN appearance representations by training an identification network (IdNet) on large-scale person re-identification data sets. We also employ additional prediction of unassigned tracks after the data association step to overcome the susceptibility of the GM-PHD filter to missed detections caused by occlusion. Our tracker, which runs in real time, is applied to track multiple objects in video sequences acquired under varying environmental conditions and object densities. Lastly, we make extensive evaluations on the Multiple Object Tracking 2016 (MOT16) and 2017 (MOT17) benchmark data sets and find that our online tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy and identification.
Tasks Large-Scale Person Re-Identification, Multiple Object Tracking, Object Tracking, Person Re-Identification, Visual Tracking
Published 2019-12-10
URL https://arxiv.org/abs/1912.05949v2
PDF https://arxiv.org/pdf/1912.05949v2.pdf
PWC https://paperswithcode.com/paper/occlusion-robust-online-multi-object-visual
Repo
Framework

A Formal Framework for Robot Construction Problems: A Hybrid Planning Approach

Title A Formal Framework for Robot Construction Problems: A Hybrid Planning Approach
Authors Faseeh Ahmad, Esra Erdem, Volkan Patoglu
Abstract We study robot construction problems where multiple autonomous robots rearrange stacks of prefabricated blocks to build stable structures. These problems are challenging due to ramifications of actions, true concurrency, and requirements of supportedness of blocks by other blocks and stability of the structure at all times. We propose a formal hybrid planning framework to solve a wide range of robot construction problems, based on Answer Set Programming. This framework not only decides on a stable final configuration of the structure, but also computes the order of manipulation tasks for multiple autonomous robots to build the structure from an initial configuration, while simultaneously ensuring the stability, supportedness and other desired properties of the partial construction at each step of the plan. We prove the soundness and completeness of our formal method with respect to these properties. We introduce a set of challenging robot construction benchmark instances, including bridge building and stack overhanging scenarios, discuss the usefulness of our framework over these instances, and demonstrate the applicability of our method using a bimanual Baxter robot.
Tasks
Published 2019-03-02
URL http://arxiv.org/abs/1903.00745v2
PDF http://arxiv.org/pdf/1903.00745v2.pdf
PWC https://paperswithcode.com/paper/a-formal-framework-for-robot-construction
Repo
Framework

Convex hull algorithms based on some variational models

Title Convex hull algorithms based on some variational models
Authors Lingfeng Li, Shousheng Luo, Xue-Cheng Tai, Jiang Yang
Abstract Seeking the convex hull of an object is a very fundamental problem arising from various tasks. In this work, we propose two variational convex hull models using a level set representation for 2-dimensional data. The first one is an exact model, which can get the convex hull of one or multiple objects. In this model, the convex hull is characterized by the zero sublevel-set of a convex level set function, which is non-positive at every given point. By minimizing the area of the zero sublevel-set, we can find the desired convex hull. The second one is intended to get the convex hull of objects with outliers. Instead of requiring that all the given points be included, this model penalizes the distance from each given point to the zero sublevel-set. Existing methods in the literature are not able to handle outliers. For the solution of these models, we develop efficient numerical schemes using the alternating direction method of multipliers. Numerical examples are given to demonstrate the advantages of the proposed methods.
Tasks
Published 2019-08-09
URL https://arxiv.org/abs/1908.03323v1
PDF https://arxiv.org/pdf/1908.03323v1.pdf
PWC https://paperswithcode.com/paper/convex-hull-algorithms-based-on-some
Repo
Framework

Global Autoregressive Models for Data-Efficient Sequence Learning

Title Global Autoregressive Models for Data-Efficient Sequence Learning
Authors Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman
Abstract Standard autoregressive seq2seq models are easily trained by max-likelihood, but tend to show poor results under small-data conditions. We introduce a class of seq2seq models, GAMs (Global Autoregressive Models), which combine an autoregressive component with a log-linear component, allowing the use of global a priori features to compensate for lack of data. We train these models in two steps. In the first step, we obtain an unnormalized GAM that maximizes the likelihood of the data, but is improper for fast inference or evaluation. In the second step, we use this GAM to train (by distillation) a second autoregressive model that approximates the normalized distribution associated with the GAM, and can be used for fast inference and evaluation. Our experiments focus on language modelling under synthetic conditions and show a strong perplexity reduction of using the second autoregressive model over the standard one.
Tasks Language Modelling
Published 2019-09-16
URL https://arxiv.org/abs/1909.07063v2
PDF https://arxiv.org/pdf/1909.07063v2.pdf
PWC https://paperswithcode.com/paper/global-autoregressive-models-for-data
Repo
Framework
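
One way to read "an autoregressive component combined with a log-linear component" in the abstract above is as a product of the two, normalized over all sequences. The notation below is an assumption made for illustration (r for the autoregressive model, \phi for the global feature vector, \lambda for its weights, Z for the normalizer) and is not taken from the listing itself.

```latex
% Unnormalized score: autoregressive probability times a log-linear factor
\tilde{P}_{\lambda}(x) = r(x)\,\exp\big(\langle \lambda, \phi(x) \rangle\big),
\qquad
r(x) = \prod_{t=1}^{|x|} r(x_t \mid x_{<t})

% Normalized distribution: requires a sum over all sequences
P_{\lambda}(x) = \frac{\tilde{P}_{\lambda}(x)}{Z_{\lambda}},
\qquad
Z_{\lambda} = \sum_{x'} r(x')\,\exp\big(\langle \lambda, \phi(x') \rangle\big)
```

Under this reading, the sum in Z_\lambda ranges over all sequences, which is consistent with the abstract's remark that the unnormalized model is not suited to fast inference and motivates distilling it into a second autoregressive model.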

Network Pruning for Low-Rank Binary Indexing

Title Network Pruning for Low-Rank Binary Indexing
Authors Dongsoo Lee, Se Jung Kwon, Byeongwook Kim, Parichay Kapoor, Gu-Yeon Wei
Abstract Pruning is an efficient model compression technique to remove redundancy in the connectivity of deep neural networks (DNNs). Computations using sparse matrices obtained by pruning parameters, however, exhibit vastly different parallelism depending on the index representation scheme. As a result, fine-grained pruning has not gained much attention due to its irregular index form, which leads to a large memory footprint and low parallelism for convolutions and matrix multiplications. In this paper, we propose a new network pruning technique that generates a low-rank binary index matrix to compress index data, while index data is decompressed by simple binary matrix multiplication. The proposed compression method finds a particular fine-grained pruning mask that can be decomposed into two binary matrices. We also propose a tile-based factorization technique that not only lowers memory requirements but also enhances the compression ratio. Various DNN models can be pruned with far fewer indexes compared to previous sparse matrix formats while maintaining the same pruning rate.
Tasks Model Compression, Network Pruning
Published 2019-05-14
URL https://arxiv.org/abs/1905.05686v1
PDF https://arxiv.org/pdf/1905.05686v1.pdf
PWC https://paperswithcode.com/paper/network-pruning-for-low-rank-binary-indexing
Repo
Framework
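
The idea of storing a pruning mask as two small binary matrices and recovering it by binary matrix multiplication can be sketched as follows. This is an illustrative toy, not the authors' implementation: it assumes the product is a Boolean (OR-of-ANDs) matrix product, and the factor matrices, weights, and function names are hypothetical.

```python
def boolean_matmul(a, b):
    """Boolean matrix product: out[i][j] = OR over k of (a[i][k] AND b[k][j])."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [int(any(a[i][k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(rows)
    ]

def apply_mask(weights, mask):
    """Zero out every weight whose mask bit is 0 (i.e. prune it)."""
    return [
        [w if keep else 0.0 for w, keep in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]

# Hypothetical rank-2 factors for a 4 x 6 pruning mask.
left = [[1, 0],
        [0, 1],
        [1, 1],
        [0, 0]]                       # 4 x 2
right = [[1, 0, 1, 0, 0, 1],
         [0, 1, 0, 0, 1, 0]]          # 2 x 6

# Decompress the index data: the full mask is never stored, only the factors.
mask = boolean_matmul(left, right)

# Made-up dense weights, pruned with the decompressed mask.
weights = [[0.5 * (i + 1) + 0.1 * j for j in range(6)] for i in range(4)]
pruned = apply_mask(weights, mask)

dense_index_bits = len(mask) * len(mask[0])
factored_index_bits = len(left) * len(left[0]) + len(right) * len(right[0])
print("mask:", mask)
print("index bits (dense vs factored):", dense_index_bits, factored_index_bits)
```

The saving is small at this size, but for an m x n mask of rank r the factored form needs r(m + n) bits instead of mn, which grows in favour of the factorization as the matrix gets larger. Not every mask admits such a factorization, which is why the abstract describes finding a particular mask that does.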

Mise en abyme with artificial intelligence: how to predict the accuracy of NN, applied to hyper-parameter tuning

Title Mise en abyme with artificial intelligence: how to predict the accuracy of NN, applied to hyper-parameter tuning
Authors Giorgia Franchini, Mathilde Galinier, Micaela Verucchi
Abstract In the context of deep learning, the costliest phase from a computational point of view is the full training of the learning algorithm. However, this process has to be repeated a significant number of times during the design of a new artificial neural network, leading to extremely expensive operations. Here, we propose a low-cost strategy to predict the accuracy of the algorithm, based only on its initial behaviour. To do so, we train the network of interest up to convergence several times, modifying its characteristics at each training. The initial and final accuracies observed during this preliminary process are stored in a database. We then make use of both curve fitting and Support Vector Machine techniques, the latter being trained on the created database, to predict the final accuracy of the network, given its accuracy on the first iterations of its learning. This approach can be of particular interest when the space of the characteristics of the network is notably large or when its full training is highly time-consuming. The results we obtained are promising and encouraged us to apply this strategy to a topical issue: hyper-parameter optimisation (HO). In particular, we focused on the HO of a convolutional neural network for the classification of the MNIST and CIFAR-10 databases. By using our method of prediction, and an algorithm implemented by us for a probabilistic exploration of the hyper-parameter space, we were able to find the hyper-parameter settings corresponding to the optimal accuracies already known in the literature, at quite low cost.
Tasks
Published 2019-06-28
URL https://arxiv.org/abs/1907.00924v1
PDF https://arxiv.org/pdf/1907.00924v1.pdf
PWC https://paperswithcode.com/paper/mise-en-abyme-with-artificial-intelligence
Repo
Framework

The Limitations of Stylometry for Detecting Machine-Generated Fake News

Title The Limitations of Stylometry for Detecting Machine-Generated Fake News
Authors Tal Schuster, Roei Schuster, Darsh J Shah, Regina Barzilay
Abstract Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake news by capturing their stylistic differences from human-written text. These approaches, broadly termed stylometry, have found success in source attribution and misinformation detection in human-written texts. However, in this work, we show that stylometry is limited against machine-generated misinformation. While humans speak differently when trying to deceive, LMs generate stylistically consistent text, regardless of underlying motive. Thus, though stylometry can successfully prevent impersonation by identifying text provenance, it fails to distinguish legitimate LM applications from those that introduce false information. We create two benchmarks demonstrating the stylistic similarity between malicious and legitimate uses of LMs, employed in auto-completion and editing-assistance settings. Our findings highlight the need for non-stylometry approaches in detecting machine-generated misinformation, and open up the discussion on the desired evaluation benchmarks.
Tasks Fake News Detection, Language Modelling
Published 2019-08-26
URL https://arxiv.org/abs/1908.09805v2
PDF https://arxiv.org/pdf/1908.09805v2.pdf
PWC https://paperswithcode.com/paper/are-we-safe-yet-the-limitations-of
Repo
Framework

Tiered Graph Autoencoders with PyTorch Geometric for Molecular Graphs

Title Tiered Graph Autoencoders with PyTorch Geometric for Molecular Graphs
Authors Daniel T. Chang
Abstract Tiered latent representations and latent spaces for molecular graphs provide a simple but effective way to explicitly represent and utilize groups (e.g., functional groups), which consist of the atom (node) tier, the group tier and the molecule (graph) tier. They can be learned using the tiered graph autoencoder architecture. In this paper we discuss adapting tiered graph autoencoders for use with PyTorch Geometric, for both the deterministic tiered graph autoencoder model and the probabilistic tiered variational graph autoencoder model. We also discuss molecular structure information sources that can be accessed to extract training data for molecular graphs. To support transfer learning, a critical consideration is that the information must utilize standard unique molecule and constituent atom identifiers. As a result of using tiered graph autoencoders for deep learning, each molecular graph possesses tiered latent representations. At each tier, the latent representation consists of: node features, edge indices, edge features, membership matrix, and node embeddings. This enables the utilization and exploration of tiered molecular latent spaces, either individually (the node tier, the group tier, or the graph tier) or jointly, as well as navigation across the tiers.
Tasks Transfer Learning
Published 2019-08-22
URL https://arxiv.org/abs/1908.08612v1
PDF https://arxiv.org/pdf/1908.08612v1.pdf
PWC https://paperswithcode.com/paper/tiered-graph-autoencoders-with-pytorch
Repo
Framework

HAS-QA: Hierarchical Answer Spans Model for Open-domain Question Answering

Title HAS-QA: Hierarchical Answer Spans Model for Open-domain Question Answering
Authors Liang Pang, Yanyan Lan, Jiafeng Guo, Jun Xu, Lixin Su, Xueqi Cheng
Abstract This paper is concerned with open-domain question answering (i.e., OpenQA). Recently, some works have viewed this problem as a reading comprehension (RC) task, and directly applied successful RC models to it. However, the performance of such models is not as good as in the RC task. In our opinion, the RC perspective ignores three characteristics of the OpenQA task: 1) many paragraphs without the answer span are included in the data collection; 2) multiple answer spans may exist within one given paragraph; 3) the end position of an answer span is dependent on the start position. In this paper, we first propose a new probabilistic formulation of OpenQA, based on a three-level hierarchical structure, i.e., the question level, the paragraph level and the answer span level. Then a Hierarchical Answer Spans Model (HAS-QA) is designed to capture each probability. HAS-QA has the ability to tackle the above three problems, and experiments on public OpenQA datasets show that it significantly outperforms traditional RC baselines and recent OpenQA baselines.
Tasks Open-Domain Question Answering, Question Answering, Reading Comprehension
Published 2019-01-12
URL http://arxiv.org/abs/1901.03866v1
PDF http://arxiv.org/pdf/1901.03866v1.pdf
PWC https://paperswithcode.com/paper/has-qa-hierarchical-answer-spans-model-for
Repo
Framework

Modeling Artist Preferences of Users with Different Music Consumption Patterns for Fair Music Recommendations

Title Modeling Artist Preferences of Users with Different Music Consumption Patterns for Fair Music Recommendations
Authors Dominik Kowald, Elisabeth Lex, Markus Schedl
Abstract Music recommender systems have become central parts of popular streaming platforms such as Last.fm, Pandora, or Spotify to help users find music that fits their preferences. These systems learn from the past listening events of users to recommend music a user will likely listen to in the future. Here, current algorithms typically employ collaborative filtering (CF) utilizing similarities between users’ listening behaviors. Some approaches also combine CF with content features into hybrid recommender systems. While music recommender systems can provide quality recommendations to listeners of mainstream music artists, recent research has shown that they tend to discriminate against listeners of unorthodox, low-mainstream artists. This is primarily due to the scarcity of usage data for low-mainstream music, as music consumption patterns are biased towards popular artists. Thus, the objective of our work is to provide a novel approach for modeling artist preferences of users with different music consumption patterns and listening habits.
Tasks Recommendation Systems
Published 2019-07-23
URL https://arxiv.org/abs/1907.09781v1
PDF https://arxiv.org/pdf/1907.09781v1.pdf
PWC https://paperswithcode.com/paper/modeling-artist-preferences-of-users-with
Repo
Framework