January 31, 2020

Paper Group AWR 393

Modeling Color Terminology Across Thousands of Languages. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning. Transferring Robustness for Graph Neural Network Against Poisoning Attacks. A Pilot Study for Chinese SQL Semantic Parsing. Entity, Relation, and Event Extraction with Contextualized Span Representations. …

Modeling Color Terminology Across Thousands of Languages


Title	Modeling Color Terminology Across Thousands of Languages
Authors	Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky
Abstract	There is an extensive history of scholarship into what constitutes a “basic” color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969). This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. Collectively, the 14 empirically-grounded computational linguistic metrics we design—as well as their aggregation—correlate strongly with both the Berlin and Kay basic/secondary color term partition (gamma=0.96) and their hypothesized universal acquisition sequence. The measures and result provide further empirical evidence from computational linguistics in support of their claims, as well as additional nuance: they suggest treating the partition as a spectrum instead of a dichotomy.
Tasks
Published	2019-10-03
URL	https://arxiv.org/abs/1910.01531v1
PDF	https://arxiv.org/pdf/1910.01531v1.pdf
PWC	https://paperswithcode.com/paper/modeling-color-terminology-across-thousands
Repo	https://github.com/aryamccarthy/basic-color-terms
Framework	none
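
The abstract reports the agreement with the basic/secondary partition as gamma=0.96 without naming the statistic. Assuming this refers to Goodman and Kruskal's gamma (an assumption, not stated in the abstract), a minimal sketch of that statistic is given below. The function name, the partition labels, and the scores are all hypothetical illustration data, not values from the paper.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Goodman and Kruskal's gamma: (C - D) / (C + D) over all item pairs,
    where C counts concordant pairs and D discordant pairs. Pairs tied on
    either variable are ignored."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    # Raises ZeroDivisionError if every pair is tied.
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: 1 = "basic" term, 0 = "secondary" term, paired with a
# made-up aggregate score for each color term.
partition = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
gamma = goodman_kruskal_gamma(partition, scores)
```

A value near 1 means the scores almost always rank basic terms above secondary ones; ties within the same partition class do not count either way.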

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning


Title	Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
Authors	Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, Sergey Levine
Abstract	Meta-reinforcement learning algorithms can enable robots to acquire new skills much more quickly, by leveraging prior experience to learn how to learn. However, much of the current research on meta-reinforcement learning focuses on task distributions that are very narrow. For example, a commonly used meta-reinforcement learning benchmark uses different running velocities for a simulated robot as different tasks. When policies are meta-trained on such narrow task distributions, they cannot possibly generalize to more quickly acquire entirely new tasks. Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors. In this paper, we propose an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. Our aim is to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks. We evaluate 6 state-of-the-art meta-reinforcement learning and multi-task learning algorithms on these tasks. Surprisingly, while each task and its variations (e.g., with different object positions) can be learned with reasonable success, these algorithms struggle to learn with multiple tasks at the same time, even with as few as ten distinct training tasks. Our analysis and open-source environments pave the way for future research in multi-task learning and meta-learning that can enable meaningful generalization, thereby unlocking the full potential of these methods.
Tasks	Meta-Learning, Multi-Task Learning
Published	2019-10-24
URL	https://arxiv.org/abs/1910.10897v1
PDF	https://arxiv.org/pdf/1910.10897v1.pdf
PWC	https://paperswithcode.com/paper/meta-world-a-benchmark-and-evaluation-for
Repo	https://github.com/rlworkgroup/metaworld
Framework	none

Transferring Robustness for Graph Neural Network Against Poisoning Attacks


Title	Transferring Robustness for Graph Neural Network Against Poisoning Attacks
Authors	Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, Suhang Wang
Abstract	Graph neural networks (GNNs) are widely used in many applications. However, their robustness against adversarial attacks has been criticized. Prior studies show that unnoticeable modifications to graph topology or nodal features can significantly reduce the performance of GNNs. It is very challenging to design robust graph neural networks against poisoning attacks, and several efforts have been made. Existing work aims at reducing the negative impact of adversarial edges using only the poisoned graph, which is sub-optimal since it fails to discriminate adversarial edges from normal ones. On the other hand, clean graphs from domains similar to that of the target poisoned graph are usually available in the real world. By perturbing these clean graphs, we create supervised knowledge to train the ability to detect adversarial edges so that the robustness of GNNs is elevated. However, this potential of clean graphs is neglected by existing work. To this end, we investigate a novel problem of improving the robustness of GNNs against poisoning attacks by exploring clean graphs. Specifically, we propose PA-GNN, which relies on a penalized aggregation mechanism that directly restricts the negative impact of adversarial edges by assigning them lower attention coefficients. To optimize PA-GNN for a poisoned graph, we design a meta-optimization algorithm that trains PA-GNN to penalize perturbations using clean graphs and their adversarial counterparts, and transfers this ability to improve the robustness of PA-GNN on the poisoned graph. Experimental results on four real-world datasets demonstrate the robustness of PA-GNN against poisoning attacks on graphs. Code and data are available here: https://github.com/tangxianfeng/PA-GNN.
Tasks	Node Classification, Transfer Learning
Published	2019-08-20
URL	https://arxiv.org/abs/1908.07558v3
PDF	https://arxiv.org/pdf/1908.07558v3.pdf
PWC	https://paperswithcode.com/paper/190807558
Repo	https://github.com/tangxianfeng/PA-GNN
Framework	tf

A Pilot Study for Chinese SQL Semantic Parsing


Title	A Pilot Study for Chinese SQL Semantic Parsing
Authors	Qingkai Min, Yuefeng Shi, Yue Zhang
Abstract	The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, and different embedding schemes. Results show that word-based semantic parsers are subject to segmentation errors and that cross-lingual word embeddings are useful for text-to-SQL.
Tasks	Question Answering, Semantic Parsing, Text-To-Sql, Word Embeddings
Published	2019-09-29
URL	https://arxiv.org/abs/1909.13293v2
PDF	https://arxiv.org/pdf/1909.13293v2.pdf
PWC	https://paperswithcode.com/paper/a-pilot-study-for-chinese-sql-semantic
Repo	https://github.com/taolusi/chisp
Framework	pytorch

Entity, Relation, and Event Extraction with Contextualized Span Representations


Title	Entity, Relation, and Event Extraction with Contextualized Span Representations
Authors	David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi
Abstract	We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DyGIE++) accomplishes all tasks by enumerating, refining, and scoring text spans designed to capture local (within-sentence) and global (cross-sentence) context. Our framework achieves state-of-the-art results across all tasks, on four datasets from a variety of domains. We perform experiments comparing different techniques to construct span representations. Contextualized embeddings like BERT perform well at capturing relationships among entities in the same or adjacent sentences, while dynamic span graph updates model long-range cross-sentence relationships. For instance, propagating span representations via predicted coreference links can enable the model to disambiguate challenging entity mentions. Our code is publicly available at https://github.com/dwadden/dygiepp and can be easily adapted for new tasks or datasets.
Tasks	Joint Entity and Relation Extraction, Named Entity Recognition, Relation Extraction
Published	2019-09-08
URL	https://arxiv.org/abs/1909.03546v2
PDF	https://arxiv.org/pdf/1909.03546v2.pdf
PWC	https://paperswithcode.com/paper/entity-relation-and-event-extraction-with
Repo	https://github.com/dwadden/dygiepp
Framework	pytorch
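
The abstract describes the framework as enumerating, refining, and scoring text spans. As a rough illustration of the enumeration step only, the sketch below lists every candidate span up to a maximum width. The function name and the `max_width` cap are assumptions for illustration; the abstract does not say how spans are bounded, and this is not code from the DyGIE++ repository.

```python
def enumerate_spans(tokens, max_width):
    """Return all (start, end) token index pairs, end inclusive, for spans
    of at most max_width tokens. max_width is a hypothetical cap."""
    spans = []
    for start in range(len(tokens)):
        # The span may not run past the end of the sentence.
        last = min(start + max_width, len(tokens))
        for end in range(start, last):
            spans.append((start, end))
    return spans


tokens = ["Alice", "joined", "Acme", "Corp"]
spans = enumerate_spans(tokens, max_width=2)
```

Each enumerated span would then receive a representation and a score for every task; those stages depend on the model and are not sketched here.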

Hyperbolic Graph Convolutional Neural Networks


Title	Hyperbolic Graph Convolutional Neural Networks
Authors	Ines Chami, Rex Ying, Christopher Ré, Jure Leskovec
Abstract	Graph convolutional neural networks (GCNs) embed nodes in a graph into Euclidean space, which has been shown to incur a large distortion when embedding real-world graphs with scale-free or hierarchical structure. Hyperbolic geometry offers an exciting alternative, as it enables embeddings with much smaller distortion. However, extending GCNs to hyperbolic geometry presents several unique challenges because it is not clear how to define neural network operations, such as feature transformation and aggregation, in hyperbolic space. Furthermore, since input features are often Euclidean, it is unclear how to transform the features into hyperbolic embeddings with the right amount of curvature. Here we propose Hyperbolic Graph Convolutional Neural Network (HGCN), the first inductive hyperbolic GCN that leverages both the expressiveness of GCNs and hyperbolic geometry to learn inductive node representations for hierarchical and scale-free graphs. We derive GCN operations in the hyperboloid model of hyperbolic space and map Euclidean input features to embeddings in hyperbolic spaces with different trainable curvature at each layer. Experiments demonstrate that HGCN learns embeddings that preserve hierarchical structure, and leads to improved performance when compared to Euclidean analogs, even with very low dimensional embeddings: compared to state-of-the-art GCNs, HGCN achieves an error reduction of up to 63.1% in ROC AUC for link prediction and of up to 47.5% in F1 score for node classification, also improving the state of the art on the Pubmed dataset.
Tasks	Link Prediction, Node Classification
Published	2019-10-28
URL	https://arxiv.org/abs/1910.12933v1
PDF	https://arxiv.org/pdf/1910.12933v1.pdf
PWC	https://paperswithcode.com/paper/hyperbolic-graph-convolutional-neural
Repo	https://github.com/HazyResearch/hgcn
Framework	pytorch
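
The abstract mentions mapping Euclidean input features into the hyperboloid model but gives no formula. One standard way to do this is the exponential map at the hyperboloid origin, sketched below for curvature -1/K. The function names and the choice of this particular map are assumptions for illustration and are not taken from the HGCN code.

```python
import math


def minkowski_dot(x, y):
    """Minkowski inner product: -x0*y0 + x1*y1 + ... + xd*yd."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v, K=1.0):
    """Map a Euclidean vector v, read as a tangent vector at the origin,
    onto the hyperboloid {x : <x, x> = -K, x0 > 0} of curvature -1/K."""
    sqrt_k = math.sqrt(K)
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        # The zero vector maps to the origin of the hyperboloid.
        return [sqrt_k] + [0.0] * len(v)
    time_coord = sqrt_k * math.cosh(norm / sqrt_k)
    scale = sqrt_k * math.sinh(norm / sqrt_k) / norm
    return [time_coord] + [scale * a for a in v]


K = 2.0
point = exp_map_origin([0.3, -0.4], K=K)
on_manifold_error = abs(minkowski_dot(point, point) + K)
```

The check at the end confirms that the mapped point satisfies the hyperboloid constraint, since K*sinh^2 - K*cosh^2 = -K.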

An interpretable probabilistic machine learning method for heterogeneous longitudinal studies


Title	An interpretable probabilistic machine learning method for heterogeneous longitudinal studies
Authors	Juho Timonen, Henrik Mannerström, Aki Vehtari, Harri Lähdesmäki
Abstract	Identifying risk factors from longitudinal data requires statistical tools that are not restricted to linear models, yet provide interpretable associations between different types of covariates and a response variable. Here, we present a widely applicable and interpretable probabilistic machine learning method for nonparametric longitudinal data analysis using additive Gaussian process regression. We demonstrate that it outperforms previous longitudinal modeling approaches and provides useful novel features, including the ability to account for uncertainty in disease effect times as well as heterogeneity in their effects.
Tasks
Published	2019-12-07
URL	https://arxiv.org/abs/1912.03549v1
PDF	https://arxiv.org/pdf/1912.03549v1.pdf
PWC	https://paperswithcode.com/paper/an-interpretable-probabilistic-machine
Repo	https://github.com/jtimonen/lgpr
Framework	none

PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection


Title	PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Authors	Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng
Abstract	We propose a single-stage Human-Object Interaction (HOI) detection method that outperforms all existing methods on the HICO-DET dataset at 37 fps on a single Titan XP GPU. It is the first real-time HOI detection method. Conventional HOI detection methods are composed of two stages, i.e., human-object proposal generation and proposal classification. Their effectiveness and efficiency are limited by the sequential and separate architecture. In this paper, we propose a Parallel Point Detection and Matching (PPDM) HOI detection framework. In PPDM, an HOI is defined as a point triplet <human point, interaction point, object point>. Human and object points are the centers of the detection boxes, and the interaction point is the midpoint of the human and object points. PPDM contains two parallel branches, namely a point detection branch and a point matching branch. The point detection branch predicts three points. Simultaneously, the point matching branch predicts two displacements from the interaction point to its corresponding human and object points. The human point and the object point originating from the same interaction point are considered a matched pair. In our novel parallel architecture, the interaction points implicitly provide context and regularization for human and object detection. Isolated detection boxes that are unlikely to form meaningful HOI triplets are suppressed, which increases the precision of HOI detection. Moreover, the matching between human and object detection boxes is only applied around a limited number of filtered candidate interaction points, which saves much computational cost. Additionally, we build a new application-oriented database named HOI-A, which serves as a good supplement to the existing datasets. The source code and the dataset will be made publicly available to facilitate the development of HOI detection.
Tasks	Human-Object Interaction Detection, Object Detection
Published	2019-12-30
URL	https://arxiv.org/abs/1912.12898v3
PDF	https://arxiv.org/pdf/1912.12898v3.pdf
PWC	https://paperswithcode.com/paper/ppdm-parallel-point-detection-and-matching
Repo	https://github.com/YueLiao/PPDM
Framework	pytorch
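
The point-triplet matching described in the abstract can be sketched as follows: an interaction point plus its two predicted displacements gives a predicted human center and a predicted object center, and each is paired with a detected center. Picking the nearest detected center by Euclidean distance is an assumption made here for illustration; the abstract does not specify the matching rule, and the function name and coordinates are hypothetical.

```python
from math import hypot


def match_hoi(interaction, disp_human, disp_object, humans, objects):
    """Return (human_index, object_index) for one interaction point.

    The predicted centers are the interaction point shifted by each
    displacement; the closest detected center is chosen (assumed rule)."""
    ix, iy = interaction
    pred_human = (ix + disp_human[0], iy + disp_human[1])
    pred_object = (ix + disp_object[0], iy + disp_object[1])

    def nearest(pred, candidates):
        return min(
            range(len(candidates)),
            key=lambda k: hypot(candidates[k][0] - pred[0],
                                candidates[k][1] - pred[1]),
        )

    return nearest(pred_human, humans), nearest(pred_object, objects)


# Hypothetical detected box centers.
humans = [(10.0, 20.0), (80.0, 40.0)]
objects = [(30.0, 20.0), (90.0, 90.0)]
# The interaction point is the midpoint of humans[0] and objects[0].
interaction = (20.0, 20.0)
pair = match_hoi(interaction, (-10.0, 0.0), (10.0, 0.0), humans, objects)
```

Because matching starts from interaction points, detected boxes that no interaction point leads to never enter a triplet, which is the suppression effect the abstract mentions.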

Domain Generalization by Solving Jigsaw Puzzles


Title	Domain Generalization by Solving Jigsaw Puzzles
Authors	Fabio Maria Carlucci, Antonio D’Innocente, Silvia Bucci, Barbara Caputo, Tatiana Tommasi
Abstract	Human adaptability relies crucially on the ability to learn and merge knowledge from both supervised and unsupervised learning: parents point out a few important concepts, but then children fill in the gaps on their own. This is particularly effective because supervised learning can never be exhaustive, and thus learning autonomously allows the discovery of invariances and regularities that help to generalize. In this paper we propose to apply a similar approach to the task of object recognition across domains: our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals how to solve a jigsaw puzzle on the same images. This secondary task helps the network to learn the concepts of spatial correlation while acting as a regularizer for the classification task. Multiple experiments on the PACS, VLCS, Office-Home and digits datasets confirm our intuition and show that this simple method outperforms previous domain generalization and adaptation solutions. An ablation study further illustrates the inner workings of our approach.
Tasks	Domain Generalization, Object Recognition
Published	2019-03-16
URL	http://arxiv.org/abs/1903.06864v2
PDF	http://arxiv.org/pdf/1903.06864v2.pdf
PWC	https://paperswithcode.com/paper/domain-generalization-by-solving-jigsaw
Repo	https://github.com/Emma0118/domain-generalization
Framework	pytorch
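
The self-supervised jigsaw signal described in the abstract can be illustrated with a small data-preparation sketch: cut an image into a grid of tiles, reorder the tiles with one permutation from a fixed set, and use the permutation's index as the label to predict. The function name, grid size, and permutation set below are assumptions for illustration, not the authors' settings.

```python
import random


def make_jigsaw(image, grid, permutations, rng):
    """Split a square image (list of rows) into grid x grid tiles, reorder
    them by a randomly chosen permutation, and return the shuffled image
    together with the index of that permutation (the jigsaw label)."""
    tile = len(image) // grid
    # Tiles in row-major order; each tile is a list of row fragments.
    tiles = [
        [row[c * tile:(c + 1) * tile] for row in image[r * tile:(r + 1) * tile]]
        for r in range(grid)
        for c in range(grid)
    ]
    label = rng.randrange(len(permutations))
    shuffled = [tiles[k] for k in permutations[label]]
    # Stitch the reordered tiles back into a single image.
    out = []
    for r in range(grid):
        for y in range(tile):
            row = []
            for c in range(grid):
                row.extend(shuffled[r * grid + c][y])
            out.append(row)
    return out, label


# A 4x4 "image" whose pixel values are 0..15, cut into a 2x2 puzzle.
image = [[4 * r + c for c in range(4)] for r in range(4)]
perms = [(0, 1, 2, 3), (3, 2, 1, 0)]
puzzle, label = make_jigsaw(image, 2, perms, random.Random(0))
```

A classifier trained on such pairs would predict `label` from `puzzle` alongside the usual semantic label, which is the secondary task the abstract refers to.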

Exploration via Flow-Based Intrinsic Rewards


Title	Exploration via Flow-Based Intrinsic Rewards
Authors	Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong, Chun-Yi Lee
Abstract	Exploration bonuses derived from the novelty of observations in an environment have become a popular approach to motivate exploration for reinforcement learning (RL) agents in the past few years. Recent methods such as curiosity-driven exploration usually estimate the novelty of new observations by the prediction errors of their system dynamics models. In this paper, we introduce the concept of optical flow estimation from the field of computer vision to the RL domain and utilize the errors from optical flow estimation to evaluate the novelty of new observations. We introduce a flow-based intrinsic curiosity module (FICM) capable of learning the motion features and understanding the observations in a more comprehensive and efficient fashion. We evaluate our method and compare it with a number of baselines on several benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. Our results show that the proposed method is superior to the baselines in certain environments, especially for those featuring sophisticated moving patterns or with high-dimensional observation spaces.
Tasks	Atari Games, Optical Flow Estimation
Published	2019-05-24
URL	https://arxiv.org/abs/1905.10071v2
PDF	https://arxiv.org/pdf/1905.10071v2.pdf
PWC	https://paperswithcode.com/paper/exploration-via-flow-based-intrinsic-rewards
Repo	https://github.com/hellochick/MarioO_O-flow-curioisty
Framework	tf

Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels


Title	Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels
Authors	Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, Pramod Viswanath
Abstract	Designing codes that combat the noise in a communication medium has remained a significant area of research in information theory as well as wireless communications. Asymptotically optimal channel codes have been developed by mathematicians for communicating under canonical models after over 60 years of research. On the other hand, in many non-canonical channel settings, optimal codes do not exist and the codes designed for canonical models are adapted via heuristics to these channels and are thus not guaranteed to be optimal. In this work, we make significant progress on this problem by designing a fully end-to-end jointly trained neural encoder and decoder, namely, Turbo Autoencoder (TurboAE), with the following contributions: (a) under moderate block lengths, TurboAE approaches state-of-the-art performance under canonical channels; (b) moreover, TurboAE outperforms the state-of-the-art codes under non-canonical settings in terms of reliability. TurboAE shows that the development of channel coding design can be automated via deep learning, with near-optimal performance.
Tasks
Published	2019-11-08
URL	https://arxiv.org/abs/1911.03038v1
PDF	https://arxiv.org/pdf/1911.03038v1.pdf
PWC	https://paperswithcode.com/paper/turbo-autoencoder-deep-learning-based-channel
Repo	https://github.com/yihanjiang/turboae
Framework	pytorch

Taxonomy of Real Faults in Deep Learning Systems


Title	Taxonomy of Real Faults in Deep Learning Systems
Authors	Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, Paolo Tonella
Abstract	The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTorch) and from related Stack Overflow posts. Structured interviews with 20 researchers and practitioners describing the problems they have encountered in their experience have enriched our taxonomy with a variety of additional faults that did not emerge from the other two sources. Our final taxonomy was validated with a survey involving an additional set of 21 developers, confirming that almost all fault categories (13/15) were experienced by at least 50% of the survey participants.
Tasks
Published	2019-10-24
URL	https://arxiv.org/abs/1910.11015v3
PDF	https://arxiv.org/pdf/1910.11015v3.pdf
PWC	https://paperswithcode.com/paper/taxonomy-of-real-faults-in-deep-learning
Repo	https://github.com/dlfaults/dl_faults
Framework	tf

Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation


Title	Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation
Authors	Yingbing Hua, Björn Hein
Abstract	The World Wide Web Consortium (W3C) has published several recommendations for building and storing ontologies, including the most recent OWL 2 Web Ontology Language (OWL). These initiatives have been followed by practical implementations that popularize OWL in various domains. For example, OWL has been used for conceptual modeling in industrial engineering, and its reasoning facilities are used to provide a wealth of services, e.g. model diagnosis, automated code generation, and semantic integration. More specifically, recent studies have shown that OWL is well suited for harmonizing information of engineering tools stored as AutomationML (AML) files. However, OWL and its tools can be cumbersome for direct use by engineers such that an ontology expert is often required in practice. Although much attention has been paid in the literature to overcome this issue by transforming OWL ontologies from/to AML models automatically, dealing with OWL complex classes remains an open research question. In this paper, we introduce the AML concept models for representing OWL complex classes in AutomationML, and present algorithms for the bidirectional translation between OWL complex classes and their corresponding AML concept models. We show that this approach provides an efficient and intuitive interface for nonexperts to visualize, modify, and create OWL complex classes.
Tasks	Code Generation
Published	2019-06-04
URL	https://arxiv.org/abs/1906.04240v1
PDF	https://arxiv.org/pdf/1906.04240v1.pdf
PWC	https://paperswithcode.com/paper/interpreting-owl-complex-classes-in
Repo	https://github.com/kit-hua/ETFA2019
Framework	none

CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling


Title	CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling
Authors	Yasir Hussain, Zhiqiu Huang, Senzhang Wang, Yu Zhou
Abstract	Recently, many NLP-based deep learning models have been applied to model source code for source code suggestion and recommendation tasks. A major limitation of these approaches is that they treat source code as simple tokens of text and ignore its contextual, syntactic and structural dependencies. In this work, we present CodeGRU, a Gated Recurrent Unit based source code language model that is capable of capturing contextual, syntactic and structural dependencies for modeling source code. CodeGRU introduces several new components. The Code Sampler is first proposed for selecting noise-free code samples and transforming obfuscated code to its proper syntax, which helps to capture syntactic and structural dependencies. The Code Regularize is next introduced to encode source code, which helps capture the contextual dependencies of the source code. Finally, we propose a novel method which can learn variable size context for modeling source code. We evaluated CodeGRU on a real-world dataset, and the results show that CodeGRU can effectively capture contextual, syntactic and structural dependencies that previous work fails to capture. We also discuss and visualize two use cases of CodeGRU for source code modeling tasks: (1) source code suggestion, and (2) source code generation.
Tasks	Code Generation, Language Modelling
Published	2019-03-03
URL	http://arxiv.org/abs/1903.00884v1
PDF	http://arxiv.org/pdf/1903.00884v1.pdf
PWC	https://paperswithcode.com/paper/codegru-context-aware-deep-learning-with
Repo	https://github.com/yaxirhuxxain/Source-Code-Suggestion
Framework	tf

AutoGAN: Neural Architecture Search for Generative Adversarial Networks


Title	AutoGAN: Neural Architecture Search for Generative Adversarial Networks
Authors	Xinyu Gong, Shiyu Chang, Yifan Jiang, Zhangyang Wang
Abstract	Neural architecture search (NAS) has witnessed prevailing success in image classification and (very recently) segmentation tasks. In this paper, we present the first preliminary study on introducing the NAS algorithm to generative adversarial networks (GANs), dubbed AutoGAN. The marriage of NAS and GANs faces its unique challenges. We define the search space for the generator architectural variations and use an RNN controller to guide the search, with parameter sharing and dynamic-resetting to accelerate the process. Inception score is adopted as the reward, and a multi-level search strategy is introduced to perform NAS in a progressive way. Experiments validate the effectiveness of AutoGAN on the task of unconditional image generation. Specifically, our discovered architectures achieve highly competitive performance compared to current state-of-the-art hand-crafted GANs, e.g., setting new state-of-the-art FID scores of 12.42 on CIFAR-10, and 31.01 on STL-10, respectively. We also conclude with a discussion of the current limitations and future potential of AutoGAN. The code is available at https://github.com/TAMU-VITA/AutoGAN
Tasks	Image Classification, Image Generation, Neural Architecture Search
Published	2019-08-11
URL	https://arxiv.org/abs/1908.03835v1
PDF	https://arxiv.org/pdf/1908.03835v1.pdf
PWC	https://paperswithcode.com/paper/autogan-neural-architecture-search-for
Repo	https://github.com/TAMU-VITA/AutoGAN
Framework	pytorch