January 31, 2020

Paper Group AWR 393

Modeling Color Terminology Across Thousands of Languages. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning. Transferring Robustness for Graph Neural Network Against Poisoning Attacks. A Pilot Study for Chinese SQL Semantic Parsing. Entity, Relation, and Event Extraction with Contextualized Span Representations. …

Modeling Color Terminology Across Thousands of Languages

Title Modeling Color Terminology Across Thousands of Languages
Authors Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky
Abstract There is an extensive history of scholarship into what constitutes a “basic” color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969). This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. Collectively, the 14 empirically grounded computational linguistic metrics we design, as well as their aggregation, correlate strongly with both the Berlin and Kay basic/secondary color term partition (gamma=0.96) and their hypothesized universal acquisition sequence. The measures and results provide further empirical evidence from computational linguistics in support of their claims, as well as additional nuance: they suggest treating the partition as a spectrum instead of a dichotomy.
Tasks
Published 2019-10-03
URL https://arxiv.org/abs/1910.01531v1
PDF https://arxiv.org/pdf/1910.01531v1.pdf
PWC https://paperswithcode.com/paper/modeling-color-terminology-across-thousands
Repo https://github.com/aryamccarthy/basic-color-terms
Framework none
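
The abstract reports a rank correlation of gamma=0.96 against the basic/secondary partition. As a minimal sketch, assuming this refers to Goodman and Kruskal's gamma (the abstract does not spell out the statistic), the measure can be computed from concordant and discordant pairs. The toy labels and scores below are invented for illustration and are not taken from the paper.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Return (C - D) / (C + D), where C and D count concordant and
    discordant pairs; pairs tied on either variable are ignored."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical toy data: 1 = basic term, 0 = secondary term,
# paired with a made-up "basicness" score for each term.
is_basic = [1, 1, 1, 0, 0]
score = [0.9, 0.8, 0.4, 0.5, 0.1]
gamma = goodman_kruskal_gamma(is_basic, score)
```

In this toy example five of the six basic/secondary pairs are ordered consistently with the partition and one is not, so gamma comes out at 4/6.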

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

Title Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
Authors Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, Sergey Levine
Abstract Meta-reinforcement learning algorithms can enable robots to acquire new skills much more quickly, by leveraging prior experience to learn how to learn. However, much of the current research on meta-reinforcement learning focuses on task distributions that are very narrow. For example, a commonly used meta-reinforcement learning benchmark uses different running velocities for a simulated robot as different tasks. When policies are meta-trained on such narrow task distributions, they cannot possibly generalize to more quickly acquire entirely new tasks. Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors. In this paper, we propose an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. Our aim is to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks. We evaluate 6 state-of-the-art meta-reinforcement learning and multi-task learning algorithms on these tasks. Surprisingly, while each task and its variations (e.g., with different object positions) can be learned with reasonable success, these algorithms struggle to learn with multiple tasks at the same time, even with as few as ten distinct training tasks. Our analysis and open-source environments pave the way for future research in multi-task learning and meta-learning that can enable meaningful generalization, thereby unlocking the full potential of these methods.
Tasks Meta-Learning, Multi-Task Learning
Published 2019-10-24
URL https://arxiv.org/abs/1910.10897v1
PDF https://arxiv.org/pdf/1910.10897v1.pdf
PWC https://paperswithcode.com/paper/meta-world-a-benchmark-and-evaluation-for
Repo https://github.com/rlworkgroup/metaworld
Framework none

Transferring Robustness for Graph Neural Network Against Poisoning Attacks

Title Transferring Robustness for Graph Neural Network Against Poisoning Attacks
Authors Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, Suhang Wang
Abstract Graph neural networks (GNNs) are widely used in many applications. However, their robustness against adversarial attacks has been criticized. Prior studies show that unnoticeable modifications to graph topology or nodal features can significantly reduce the performance of GNNs. Designing graph neural networks that are robust against poisoning attacks is very challenging, and several efforts have been made. Existing work aims at reducing the negative impact of adversarial edges using only the poisoned graph, which is sub-optimal because it fails to discriminate adversarial edges from normal ones. On the other hand, clean graphs from domains similar to that of the target poisoned graph are usually available in the real world. By perturbing these clean graphs, we create supervised knowledge to train the ability to detect adversarial edges so that the robustness of GNNs is elevated. However, this potential of clean graphs is neglected by existing work. To this end, we investigate a novel problem of improving the robustness of GNNs against poisoning attacks by exploring clean graphs. Specifically, we propose PA-GNN, which relies on a penalized aggregation mechanism that directly restricts the negative impact of adversarial edges by assigning them lower attention coefficients. To optimize PA-GNN for a poisoned graph, we design a meta-optimization algorithm that trains PA-GNN to penalize perturbations using clean graphs and their adversarial counterparts, and transfers this ability to improve the robustness of PA-GNN on the poisoned graph. Experimental results on four real-world datasets demonstrate the robustness of PA-GNN against poisoning attacks on graphs. Code and data are available here: https://github.com/tangxianfeng/PA-GNN.
Tasks Node Classification, Transfer Learning
Published 2019-08-20
URL https://arxiv.org/abs/1908.07558v3
PDF https://arxiv.org/pdf/1908.07558v3.pdf
PWC https://paperswithcode.com/paper/190807558
Repo https://github.com/tangxianfeng/PA-GNN
Framework tf

A Pilot Study for Chinese SQL Semantic Parsing

Title A Pilot Study for Chinese SQL Semantic Parsing
Authors Qingkai Min, Yuefeng Shi, Yue Zhang
Abstract The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, as well as different embedding schemes. Results show that word-based semantic parsers are subject to segmentation errors and that cross-lingual word embeddings are useful for text-to-SQL.
Tasks Question Answering, Semantic Parsing, Text-To-Sql, Word Embeddings
Published 2019-09-29
URL https://arxiv.org/abs/1909.13293v2
PDF https://arxiv.org/pdf/1909.13293v2.pdf
PWC https://paperswithcode.com/paper/a-pilot-study-for-chinese-sql-semantic
Repo https://github.com/taolusi/chisp
Framework pytorch

Entity, Relation, and Event Extraction with Contextualized Span Representations

Title Entity, Relation, and Event Extraction with Contextualized Span Representations
Authors David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi
Abstract We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DyGIE++) accomplishes all tasks by enumerating, refining, and scoring text spans designed to capture local (within-sentence) and global (cross-sentence) context. Our framework achieves state-of-the-art results across all tasks, on four datasets from a variety of domains. We perform experiments comparing different techniques to construct span representations. Contextualized embeddings like BERT perform well at capturing relationships among entities in the same or adjacent sentences, while dynamic span graph updates model long-range cross-sentence relationships. For instance, propagating span representations via predicted coreference links can enable the model to disambiguate challenging entity mentions. Our code is publicly available at https://github.com/dwadden/dygiepp and can be easily adapted for new tasks or datasets.
Tasks Joint Entity and Relation Extraction, Named Entity Recognition, Relation Extraction
Published 2019-09-08
URL https://arxiv.org/abs/1909.03546v2
PDF https://arxiv.org/pdf/1909.03546v2.pdf
PWC https://paperswithcode.com/paper/entity-relation-and-event-extraction-with
Repo https://github.com/dwadden/dygiepp
Framework pytorch

Hyperbolic Graph Convolutional Neural Networks

Title Hyperbolic Graph Convolutional Neural Networks
Authors Ines Chami, Rex Ying, Christopher Ré, Jure Leskovec
Abstract Graph convolutional neural networks (GCNs) embed nodes in a graph into Euclidean space, which has been shown to incur a large distortion when embedding real-world graphs with scale-free or hierarchical structure. Hyperbolic geometry offers an exciting alternative, as it enables embeddings with much smaller distortion. However, extending GCNs to hyperbolic geometry presents several unique challenges because it is not clear how to define neural network operations, such as feature transformation and aggregation, in hyperbolic space. Furthermore, since input features are often Euclidean, it is unclear how to transform the features into hyperbolic embeddings with the right amount of curvature. Here we propose Hyperbolic Graph Convolutional Neural Network (HGCN), the first inductive hyperbolic GCN that leverages both the expressiveness of GCNs and hyperbolic geometry to learn inductive node representations for hierarchical and scale-free graphs. We derive GCN operations in the hyperboloid model of hyperbolic space and map Euclidean input features to embeddings in hyperbolic spaces with different trainable curvature at each layer. Experiments demonstrate that HGCN learns embeddings that preserve hierarchical structure and leads to improved performance when compared to Euclidean analogs, even with very low dimensional embeddings: compared to state-of-the-art GCNs, HGCN achieves an error reduction of up to 63.1% in ROC AUC for link prediction and of up to 47.5% in F1 score for node classification, also improving the state of the art on the Pubmed dataset.
Tasks Link Prediction, Node Classification
Published 2019-10-28
URL https://arxiv.org/abs/1910.12933v1
PDF https://arxiv.org/pdf/1910.12933v1.pdf
PWC https://paperswithcode.com/paper/hyperbolic-graph-convolutional-neural
Repo https://github.com/HazyResearch/hgcn
Framework pytorch
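
The abstract mentions mapping Euclidean input features into the hyperboloid model with a chosen curvature. One standard way to do this, sketched below in plain Python, is the exponential map at the hyperboloid's origin and its inverse, the logarithmic map. This is an illustration of the geometry only, not the authors' implementation; the function names and the convention of curvature -1/K with K > 0 are assumptions made for this example, and K is a fixed number here rather than a trainable parameter.

```python
import math


def minkowski_dot(x, y):
    """Minkowski inner product: -x0*y0 + sum of the remaining products."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v, K=1.0):
    """Map a Euclidean vector v (read as a tangent vector at the origin)
    onto the hyperboloid {x : <x, x> = -K, x0 > 0}, curvature -1/K."""
    sqrt_k = math.sqrt(K)
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [sqrt_k] + [0.0] * len(v)
    scale = sqrt_k * math.sinh(norm / sqrt_k) / norm
    return [sqrt_k * math.cosh(norm / sqrt_k)] + [scale * a for a in v]


def log_map_origin(x, K=1.0):
    """Inverse of exp_map_origin: recover the Euclidean tangent vector."""
    sqrt_k = math.sqrt(K)
    spatial = x[1:]
    norm = math.sqrt(sum(a * a for a in spatial))
    if norm == 0.0:
        return [0.0] * len(spatial)
    # Clamp guards against rounding pushing the argument just below 1.
    dist = sqrt_k * math.acosh(max(x[0] / sqrt_k, 1.0))
    return [dist * a / norm for a in spatial]


features = [0.3, -0.4]                      # hypothetical Euclidean input
point = exp_map_origin(features, K=2.0)     # lies on the hyperboloid
recovered = log_map_origin(point, K=2.0)    # round trip back to features
```

A mapped point satisfies the hyperboloid constraint (its Minkowski inner product with itself equals -K), and the logarithmic map returns the original features up to floating-point error.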

An interpretable probabilistic machine learning method for heterogeneous longitudinal studies

Title An interpretable probabilistic machine learning method for heterogeneous longitudinal studies
Authors Juho Timonen, Henrik Mannerström, Aki Vehtari, Harri Lähdesmäki
Abstract Identifying risk factors from longitudinal data requires statistical tools that are not restricted to linear models, yet provide interpretable associations between different types of covariates and a response variable. Here, we present a widely applicable and interpretable probabilistic machine learning method for nonparametric longitudinal data analysis using additive Gaussian process regression. We demonstrate that it outperforms previous longitudinal modeling approaches and provides useful novel features, including the ability to account for uncertainty in disease effect times as well as heterogeneity in their effects.
Tasks
Published 2019-12-07
URL https://arxiv.org/abs/1912.03549v1
PDF https://arxiv.org/pdf/1912.03549v1.pdf
PWC https://paperswithcode.com/paper/an-interpretable-probabilistic-machine
Repo https://github.com/jtimonen/lgpr
Framework none

PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection

Title PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Authors Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng
Abstract We propose a single-stage Human-Object Interaction (HOI) detection method that outperforms all existing methods on the HICO-DET dataset while running at 37 fps on a single Titan XP GPU. It is the first real-time HOI detection method. Conventional HOI detection methods are composed of two stages, i.e., human-object proposal generation and proposal classification. Their effectiveness and efficiency are limited by the sequential and separate architecture. In this paper, we propose a Parallel Point Detection and Matching (PPDM) HOI detection framework. In PPDM, an HOI is defined as a point triplet <human point, interaction point, object point>. Human and object points are the centers of the detection boxes, and the interaction point is the midpoint of the human and object points. PPDM contains two parallel branches, namely a point detection branch and a point matching branch. The point detection branch predicts the three points. Simultaneously, the point matching branch predicts two displacements from the interaction point to its corresponding human and object points. A human point and an object point that originate from the same interaction point are considered a matched pair. In our novel parallel architecture, the interaction points implicitly provide context and regularization for human and object detection. Isolated detection boxes that are unlikely to form meaningful HOI triplets are suppressed, which increases the precision of HOI detection. Moreover, the matching between human and object detection boxes is only applied around a limited number of filtered candidate interaction points, which saves much computational cost. Additionally, we build a new application-oriented database named HOI-A, which serves as a good supplement to the existing datasets. The source code and the dataset will be made publicly available to facilitate the development of HOI detection.
Tasks Human-Object Interaction Detection, Object Detection
Published 2019-12-30
URL https://arxiv.org/abs/1912.12898v3
PDF https://arxiv.org/pdf/1912.12898v3.pdf
PWC https://paperswithcode.com/paper/ppdm-parallel-point-detection-and-matching
Repo https://github.com/YueLiao/PPDM
Framework pytorch
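
The point-triplet idea in the PPDM abstract can be illustrated with a small sketch: the interaction point is the midpoint of the human and object box centers, and the two predicted displacements point from it back toward those centers. The nearest-center matching rule and all names below are assumptions made for this example; the abstract does not specify how displaced points are assigned to detections.

```python
import math


def midpoint(a, b):
    """Interaction point: midpoint of a human center and an object center."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def nearest(target, candidates):
    """Index of the candidate center closest to the target point."""
    return min(
        range(len(candidates)),
        key=lambda i: math.hypot(target[0] - candidates[i][0],
                                 target[1] - candidates[i][1]),
    )


def match_triplet(interaction_pt, disp_to_human, disp_to_object,
                  human_centers, object_centers):
    """Follow each displacement from the interaction point and pick the
    closest detected center (an assumed, simplified matching rule)."""
    human_guess = (interaction_pt[0] + disp_to_human[0],
                   interaction_pt[1] + disp_to_human[1])
    object_guess = (interaction_pt[0] + disp_to_object[0],
                    interaction_pt[1] + disp_to_object[1])
    return nearest(human_guess, human_centers), nearest(object_guess, object_centers)


# Hypothetical detections: two human centers and two object centers.
humans = [(10.0, 10.0), (50.0, 50.0)]
objects = [(30.0, 10.0), (90.0, 50.0)]
interaction = midpoint(humans[0], objects[0])
# Slightly noisy displacements, standing in for network predictions.
pair = match_triplet(interaction, (-9.0, 1.0), (11.0, -1.0), humans, objects)
```

Here the interaction point falls at (20.0, 10.0), and the noisy displacements still land closest to the first human and the first object, so the pair of indices is (0, 0).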

Domain Generalization by Solving Jigsaw Puzzles

Title Domain Generalization by Solving Jigsaw Puzzles
Authors Fabio Maria Carlucci, Antonio D’Innocente, Silvia Bucci, Barbara Caputo, Tatiana Tommasi
Abstract Human adaptability relies crucially on the ability to learn and merge knowledge from both supervised and unsupervised learning: parents point out a few important concepts, but then children fill in the gaps on their own. This is particularly effective because supervised learning can never be exhaustive, and learning autonomously makes it possible to discover invariances and regularities that help to generalize. In this paper we propose to apply a similar approach to the task of object recognition across domains: our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals how to solve a jigsaw puzzle on the same images. This secondary task helps the network to learn the concepts of spatial correlation while acting as a regularizer for the classification task. Multiple experiments on the PACS, VLCS, Office-Home and digits datasets confirm our intuition and show that this simple method outperforms previous domain generalization and adaptation solutions. An ablation study further illustrates the inner workings of our approach.
Tasks Domain Generalization, Object Recognition
Published 2019-03-16
URL http://arxiv.org/abs/1903.06864v2
PDF http://arxiv.org/pdf/1903.06864v2.pdf
PWC https://paperswithcode.com/paper/domain-generalization-by-solving-jigsaw
Repo https://github.com/Emma0118/domain-generalization
Framework pytorch
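
The jigsaw auxiliary task described in the abstract can be sketched as follows: an image is cut into tiles, the tiles are reordered by a permutation drawn from a fixed set, and the index of that permutation becomes the self-supervised label, which is trained alongside the ordinary class label. The grid size, the size of the permutation set, the weight `alpha`, and all function names are hypothetical choices for this illustration, and strings stand in for image tiles.

```python
import random


def build_permutation_set(num_tiles, size, rng):
    """Build a fixed list of distinct tile orderings; index 0 is the identity."""
    perms = [tuple(range(num_tiles))]
    while len(perms) < size:
        candidate = list(range(num_tiles))
        rng.shuffle(candidate)
        if tuple(candidate) not in perms:
            perms.append(tuple(candidate))
    return perms


def make_jigsaw_sample(tiles, permutations, rng):
    """Shuffle the tiles with a randomly chosen permutation and return
    the shuffled tiles plus the permutation index (the jigsaw label)."""
    label = rng.randrange(len(permutations))
    order = permutations[label]
    return [tiles[i] for i in order], label


def joint_loss(class_loss, jigsaw_loss, alpha=0.7):
    """Supervised loss plus a weighted self-supervised jigsaw loss."""
    return class_loss + alpha * jigsaw_loss


rng = random.Random(0)                      # seeded for reproducibility
tiles = [f"tile{i}" for i in range(9)]      # stand-ins for a 3x3 grid of patches
perm_set = build_permutation_set(len(tiles), 5, rng)
shuffled, jigsaw_label = make_jigsaw_sample(tiles, perm_set, rng)
```

A network trained on such samples would predict `jigsaw_label` from the shuffled tiles with one output head and the object class with another, and `joint_loss` shows how the two objectives could be combined.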

Exploration via Flow-Based Intrinsic Rewards

Title Exploration via Flow-Based Intrinsic Rewards
Authors Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong, Chun-Yi Lee
Abstract Exploration bonuses derived from the novelty of observations in an environment have become a popular approach to motivate exploration for reinforcement learning (RL) agents in the past few years. Recent methods such as curiosity-driven exploration usually estimate the novelty of new observations by the prediction errors of their system dynamics models. In this paper, we introduce the concept of optical flow estimation from the field of computer vision to the RL domain and utilize the errors from optical flow estimation to evaluate the novelty of new observations. We introduce a flow-based intrinsic curiosity module (FICM) capable of learning the motion features and understanding the observations in a more comprehensive and efficient fashion. We evaluate our method and compare it with a number of baselines on several benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. Our results show that the proposed method is superior to the baselines in certain environments, especially for those featuring sophisticated moving patterns or with high-dimensional observation spaces.
Tasks Atari Games, Optical Flow Estimation
Published 2019-05-24
URL https://arxiv.org/abs/1905.10071v2
PDF https://arxiv.org/pdf/1905.10071v2.pdf
PWC https://paperswithcode.com/paper/exploration-via-flow-based-intrinsic-rewards
Repo https://github.com/hellochick/MarioO_O-flow-curioisty
Framework tf

Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels

Title Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels
Authors Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, Pramod Viswanath
Abstract Designing codes that combat the noise in a communication medium has remained a significant area of research in information theory as well as wireless communications. Asymptotically optimal channel codes have been developed by mathematicians for communicating under canonical models after over 60 years of research. On the other hand, in many non-canonical channel settings, optimal codes do not exist and the codes designed for canonical models are adapted via heuristics to these channels and are thus not guaranteed to be optimal. In this work, we make significant progress on this problem by designing a fully end-to-end jointly trained neural encoder and decoder, namely, Turbo Autoencoder (TurboAE), with the following contributions: (a) under moderate block lengths, TurboAE approaches state-of-the-art performance under canonical channels; (b) moreover, TurboAE outperforms the state-of-the-art codes under non-canonical settings in terms of reliability. TurboAE shows that the development of channel coding design can be automated via deep learning, with near-optimal performance.
Tasks
Published 2019-11-08
URL https://arxiv.org/abs/1911.03038v1
PDF https://arxiv.org/pdf/1911.03038v1.pdf
PWC https://paperswithcode.com/paper/turbo-autoencoder-deep-learning-based-channel
Repo https://github.com/yihanjiang/turboae
Framework pytorch

Taxonomy of Real Faults in Deep Learning Systems

Title Taxonomy of Real Faults in Deep Learning Systems
Authors Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, Paolo Tonella
Abstract The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTorch) and from related Stack Overflow posts. Structured interviews with 20 researchers and practitioners describing the problems they have encountered in their experience have enriched our taxonomy with a variety of additional faults that did not emerge from the other two sources. Our final taxonomy was validated with a survey involving an additional set of 21 developers, confirming that almost all fault categories (13/15) were experienced by at least 50% of the survey participants.
Tasks
Published 2019-10-24
URL https://arxiv.org/abs/1910.11015v3
PDF https://arxiv.org/pdf/1910.11015v3.pdf
PWC https://paperswithcode.com/paper/taxonomy-of-real-faults-in-deep-learning
Repo https://github.com/dlfaults/dl_faults
Framework tf

Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation

Title Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation
Authors Yingbing Hua, Björn Hein
Abstract The World Wide Web Consortium (W3C) has published several recommendations for building and storing ontologies, including the most recent OWL 2 Web Ontology Language (OWL). These initiatives have been followed by practical implementations that popularize OWL in various domains. For example, OWL has been used for conceptual modeling in industrial engineering, and its reasoning facilities are used to provide a wealth of services, e.g. model diagnosis, automated code generation, and semantic integration. More specifically, recent studies have shown that OWL is well suited for harmonizing information of engineering tools stored as AutomationML (AML) files. However, OWL and its tools can be cumbersome for direct use by engineers such that an ontology expert is often required in practice. Although much attention has been paid in the literature to overcome this issue by transforming OWL ontologies from/to AML models automatically, dealing with OWL complex classes remains an open research question. In this paper, we introduce the AML concept models for representing OWL complex classes in AutomationML, and present algorithms for the bidirectional translation between OWL complex classes and their corresponding AML concept models. We show that this approach provides an efficient and intuitive interface for nonexperts to visualize, modify, and create OWL complex classes.
Tasks Code Generation
Published 2019-06-04
URL https://arxiv.org/abs/1906.04240v1
PDF https://arxiv.org/pdf/1906.04240v1.pdf
PWC https://paperswithcode.com/paper/interpreting-owl-complex-classes-in
Repo https://github.com/kit-hua/ETFA2019
Framework none

CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling

Title CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling
Authors Yasir Hussain, Zhiqiu Huang, Senzhang Wang, Yu Zhou
Abstract Recently, many NLP-based deep learning models have been applied to model source code for source code suggestion and recommendation tasks. A major limitation of these approaches is that they treat source code as simple tokens of text and ignore its contextual, syntaxtual and structural dependencies. In this work, we present CodeGRU, a Gated Recurrent Unit based source code language model that is capable of capturing contextual, syntaxtual and structural dependencies for modeling source code. CodeGRU introduces several new components. The Code Sampler is first proposed for selecting noise-free code samples and transforming obfuscated code to its proper syntax, which helps to capture syntaxtual and structural dependencies. The Code Regularize is next introduced to encode source code, which helps capture the contextual dependencies of the source code. Finally, we propose a novel method which can learn variable-size context for modeling source code. We evaluated CodeGRU on a real-world dataset, and the results show that CodeGRU can effectively capture contextual, syntaxtual and structural dependencies that previous works fail to capture. We also discuss and visualize two use cases of CodeGRU for source code modeling tasks: (1) source code suggestion, and (2) source code generation.
Tasks Code Generation, Language Modelling
Published 2019-03-03
URL http://arxiv.org/abs/1903.00884v1
PDF http://arxiv.org/pdf/1903.00884v1.pdf
PWC https://paperswithcode.com/paper/codegru-context-aware-deep-learning-with
Repo https://github.com/yaxirhuxxain/Source-Code-Suggestion
Framework tf

AutoGAN: Neural Architecture Search for Generative Adversarial Networks

Title AutoGAN: Neural Architecture Search for Generative Adversarial Networks
Authors Xinyu Gong, Shiyu Chang, Yifan Jiang, Zhangyang Wang
Abstract Neural architecture search (NAS) has witnessed prevailing success in image classification and (very recently) segmentation tasks. In this paper, we present the first preliminary study on introducing the NAS algorithm to generative adversarial networks (GANs), dubbed AutoGAN. The marriage of NAS and GANs faces its unique challenges. We define the search space for the generator architectural variations and use an RNN controller to guide the search, with parameter sharing and dynamic-resetting to accelerate the process. Inception score is adopted as the reward, and a multi-level search strategy is introduced to perform NAS in a progressive way. Experiments validate the effectiveness of AutoGAN on the task of unconditional image generation. Specifically, our discovered architectures achieve highly competitive performance compared to current state-of-the-art hand-crafted GANs, e.g., setting new state-of-the-art FID scores of 12.42 on CIFAR-10, and 31.01 on STL-10, respectively. We also conclude with a discussion of the current limitations and future potential of AutoGAN. The code is available at https://github.com/TAMU-VITA/AutoGAN
Tasks Image Classification, Image Generation, Neural Architecture Search
Published 2019-08-11
URL https://arxiv.org/abs/1908.03835v1
PDF https://arxiv.org/pdf/1908.03835v1.pdf
PWC https://paperswithcode.com/paper/autogan-neural-architecture-search-for
Repo https://github.com/TAMU-VITA/AutoGAN
Framework pytorch