Paper Group AWR 393
Modeling Color Terminology Across Thousands of Languages
Title | Modeling Color Terminology Across Thousands of Languages |
Authors | Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky |
Abstract | There is an extensive history of scholarship into what constitutes a “basic” color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969). This paper employs a set of diverse measures on massively cross-linguistic data to operationalize and critique the Berlin and Kay color term hypotheses. Collectively, the 14 empirically-grounded computational linguistic metrics we design—as well as their aggregation—correlate strongly with both the Berlin and Kay basic/secondary color term partition (gamma=0.96) and their hypothesized universal acquisition sequence. The measures and result provide further empirical evidence from computational linguistics in support of their claims, as well as additional nuance: they suggest treating the partition as a spectrum instead of a dichotomy. |
Tasks | |
Published | 2019-10-03 |
URL | https://arxiv.org/abs/1910.01531v1 (abstract), https://arxiv.org/pdf/1910.01531v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/modeling-color-terminology-across-thousands |
Repo | https://github.com/aryamccarthy/basic-color-terms |
Framework | none |
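The abstract reports its agreement statistic only as "gamma". Assuming this refers to Goodman and Kruskal's gamma, a rank correlation computed from concordant and discordant pairs, a minimal sketch of that statistic follows. The function name and the toy data are hypothetical and do not come from the paper or its repository.

```python
from itertools import combinations


def goodman_kruskal_gamma(xs, ys):
    """Return (C - D) / (C + D), where C and D count concordant and
    discordant pairs; pairs tied on either variable are ignored."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: 1 marks a basic color term, 0 a secondary one,
# paired with a made-up "basicness" score for the same term.
is_basic = [1, 1, 1, 0, 0, 0]
score = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
gamma = goodman_kruskal_gamma(is_basic, score)
```

With a binary partition on one side, only pairs that straddle the partition contribute, so gamma measures how often a basic term outscores a secondary one.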
Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning
Title | Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning |
Authors | Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Karol Hausman, Chelsea Finn, Sergey Levine |
Abstract | Meta-reinforcement learning algorithms can enable robots to acquire new skills much more quickly, by leveraging prior experience to learn how to learn. However, much of the current research on meta-reinforcement learning focuses on task distributions that are very narrow. For example, a commonly used meta-reinforcement learning benchmark uses different running velocities for a simulated robot as different tasks. When policies are meta-trained on such narrow task distributions, they cannot possibly generalize to more quickly acquire entirely new tasks. Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors. In this paper, we propose an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. Our aim is to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks. We evaluate 6 state-of-the-art meta-reinforcement learning and multi-task learning algorithms on these tasks. Surprisingly, while each task and its variations (e.g., with different object positions) can be learned with reasonable success, these algorithms struggle to learn with multiple tasks at the same time, even with as few as ten distinct training tasks. Our analysis and open-source environments pave the way for future research in multi-task learning and meta-learning that can enable meaningful generalization, thereby unlocking the full potential of these methods. |
Tasks | Meta-Learning, Multi-Task Learning |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.10897v1 (abstract), https://arxiv.org/pdf/1910.10897v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/meta-world-a-benchmark-and-evaluation-for |
Repo | https://github.com/rlworkgroup/metaworld |
Framework | none |
Transferring Robustness for Graph Neural Network Against Poisoning Attacks
Title | Transferring Robustness for Graph Neural Network Against Poisoning Attacks |
Authors | Xianfeng Tang, Yandong Li, Yiwei Sun, Huaxiu Yao, Prasenjit Mitra, Suhang Wang |
Abstract | Graph neural networks (GNNs) are widely used in many applications. However, their robustness against adversarial attacks is criticized. Prior studies show that using unnoticeable modifications on graph topology or nodal features can significantly reduce the performance of GNNs. It is very challenging to design robust graph neural networks against poisoning attacks, and several efforts have been made. Existing work aims at reducing the negative impact from adversarial edges only with the poisoned graph, which is sub-optimal since they fail to discriminate adversarial edges from normal ones. On the other hand, clean graphs from similar domains as the target poisoned graph are usually available in the real world. By perturbing these clean graphs, we create supervised knowledge to train the ability to detect adversarial edges so that the robustness of GNNs is elevated. However, such potential for clean graphs is neglected by existing work. To this end, we investigate a novel problem of improving the robustness of GNNs against poisoning attacks by exploring clean graphs. Specifically, we propose PA-GNN, which relies on a penalized aggregation mechanism that directly restricts the negative impact of adversarial edges by assigning them lower attention coefficients. To optimize PA-GNN for a poisoned graph, we design a meta-optimization algorithm that trains PA-GNN to penalize perturbations using clean graphs and their adversarial counterparts, and transfers such ability to improve the robustness of PA-GNN on the poisoned graph. Experimental results on four real-world datasets demonstrate the robustness of PA-GNN against poisoning attacks on graphs. Code and data are available here: https://github.com/tangxianfeng/PA-GNN. |
Tasks | Node Classification, Transfer Learning |
Published | 2019-08-20 |
URL | https://arxiv.org/abs/1908.07558v3 (abstract), https://arxiv.org/pdf/1908.07558v3.pdf (PDF) |
PWC | https://paperswithcode.com/paper/190807558 |
Repo | https://github.com/tangxianfeng/PA-GNN |
Framework | tf |
A Pilot Study for Chinese SQL Semantic Parsing
Title | A Pilot Study for Chinese SQL Semantic Parsing |
Authors | Qingkai Min, Yuefeng Shi, Yue Zhang |
Abstract | The task of semantic parsing is highly useful for dialogue and question answering systems. Many datasets have been proposed to map natural language text into SQL, among which the recent Spider dataset provides cross-domain samples with multiple tables and complex queries. We build a Spider dataset for Chinese, which is currently a low-resource language in this task area. Interesting research questions arise from the uniqueness of the language, which requires word segmentation, and also from the fact that SQL keywords and columns of DB tables are typically written in English. We compare character- and word-based encoders for a semantic parser, and different embedding schemes. Results show that a word-based semantic parser is subject to segmentation errors and that cross-lingual word embeddings are useful for text-to-SQL. |
Tasks | Question Answering, Semantic Parsing, Text-To-Sql, Word Embeddings |
Published | 2019-09-29 |
URL | https://arxiv.org/abs/1909.13293v2 (abstract), https://arxiv.org/pdf/1909.13293v2.pdf (PDF) |
PWC | https://paperswithcode.com/paper/a-pilot-study-for-chinese-sql-semantic |
Repo | https://github.com/taolusi/chisp |
Framework | pytorch |
Entity, Relation, and Event Extraction with Contextualized Span Representations
Title | Entity, Relation, and Event Extraction with Contextualized Span Representations |
Authors | David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi |
Abstract | We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DyGIE++) accomplishes all tasks by enumerating, refining, and scoring text spans designed to capture local (within-sentence) and global (cross-sentence) context. Our framework achieves state-of-the-art results across all tasks, on four datasets from a variety of domains. We perform experiments comparing different techniques to construct span representations. Contextualized embeddings like BERT perform well at capturing relationships among entities in the same or adjacent sentences, while dynamic span graph updates model long-range cross-sentence relationships. For instance, propagating span representations via predicted coreference links can enable the model to disambiguate challenging entity mentions. Our code is publicly available at https://github.com/dwadden/dygiepp and can be easily adapted for new tasks or datasets. |
Tasks | Joint Entity and Relation Extraction, Named Entity Recognition, Relation Extraction |
Published | 2019-09-08 |
URL | https://arxiv.org/abs/1909.03546v2 (abstract), https://arxiv.org/pdf/1909.03546v2.pdf (PDF) |
PWC | https://paperswithcode.com/paper/entity-relation-and-event-extraction-with |
Repo | https://github.com/dwadden/dygiepp |
Framework | pytorch |
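The abstract describes DyGIE++ as enumerating, refining, and scoring text spans. The enumeration step alone can be sketched as below; this is an illustration under assumptions, not the code in the dygiepp repository, and the function name and the `max_width` cap are hypothetical.

```python
def enumerate_spans(tokens, max_width):
    """Return every (start, end) token span, end inclusive,
    that covers at most max_width tokens."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start, last_end):
            spans.append((start, end))
    return spans


# Hypothetical sentence; each span is a candidate entity or trigger.
tokens = ["Alice", "joined", "Acme", "Corp"]
spans = enumerate_spans(tokens, max_width=2)
```

Capping the width keeps the number of candidates linear in sentence length rather than quadratic, which matters once every span has to be embedded and scored.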
Hyperbolic Graph Convolutional Neural Networks
Title | Hyperbolic Graph Convolutional Neural Networks |
Authors | Ines Chami, Rex Ying, Christopher Ré, Jure Leskovec |
Abstract | Graph convolutional neural networks (GCNs) embed nodes in a graph into Euclidean space, which has been shown to incur a large distortion when embedding real-world graphs with scale-free or hierarchical structure. Hyperbolic geometry offers an exciting alternative, as it enables embeddings with much smaller distortion. However, extending GCNs to hyperbolic geometry presents several unique challenges because it is not clear how to define neural network operations, such as feature transformation and aggregation, in hyperbolic space. Furthermore, since input features are often Euclidean, it is unclear how to transform the features into hyperbolic embeddings with the right amount of curvature. Here we propose Hyperbolic Graph Convolutional Neural Network (HGCN), the first inductive hyperbolic GCN that leverages both the expressiveness of GCNs and hyperbolic geometry to learn inductive node representations for hierarchical and scale-free graphs. We derive GCN operations in the hyperboloid model of hyperbolic space and map Euclidean input features to embeddings in hyperbolic spaces with different trainable curvature at each layer. Experiments demonstrate that HGCN learns embeddings that preserve hierarchical structure, and leads to improved performance when compared to Euclidean analogs, even with very low dimensional embeddings: compared to state-of-the-art GCNs, HGCN achieves an error reduction of up to 63.1% in ROC AUC for link prediction and of up to 47.5% in F1 score for node classification, also improving state-of-the-art on the Pubmed dataset. |
Tasks | Link Prediction, Node Classification |
Published | 2019-10-28 |
URL | https://arxiv.org/abs/1910.12933v1 (abstract), https://arxiv.org/pdf/1910.12933v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/hyperbolic-graph-convolutional-neural |
Repo | https://github.com/HazyResearch/hgcn |
Framework | pytorch |
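The abstract mentions mapping Euclidean input features into the hyperboloid model. One standard way to do this is the exponential map at the hyperboloid's origin, sketched below with its inverse. The sketch assumes curvature -1/K with K > 0 and treats the input as a tangent vector at the origin; the function names are hypothetical, and this is not taken from the hgcn repository.

```python
import math


def minkowski_dot(x, y):
    """Minkowski inner product: -x0*y0 + sum of the remaining products."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def exp_map_origin(v, K=1.0):
    """Map a Euclidean vector v onto the hyperboloid <x, x> = -K."""
    sqrt_k = math.sqrt(K)
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [sqrt_k] + [0.0] * len(v)
    scale = sqrt_k * math.sinh(norm / sqrt_k) / norm
    return [sqrt_k * math.cosh(norm / sqrt_k)] + [scale * a for a in v]


def log_map_origin(x, K=1.0):
    """Inverse of exp_map_origin: recover the Euclidean tangent vector."""
    sqrt_k = math.sqrt(K)
    spatial = x[1:]
    norm = math.sqrt(sum(a * a for a in spatial))
    if norm == 0.0:
        return [0.0] * len(spatial)
    # Clamp guards against rounding just below 1 before acosh.
    dist = sqrt_k * math.acosh(max(x[0] / sqrt_k, 1.0))
    return [dist * a / norm for a in spatial]


# Hypothetical 2-D feature vector and curvature parameter.
feature = [0.3, -0.4]
point = exp_map_origin(feature, K=2.0)
recovered = log_map_origin(point, K=2.0)
```

The mapped point satisfies the hyperboloid constraint by construction, and the log map returns the original feature, which is what lets a layer move between the manifold and a tangent space where ordinary linear operations are defined.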
An interpretable probabilistic machine learning method for heterogeneous longitudinal studies
Title | An interpretable probabilistic machine learning method for heterogeneous longitudinal studies |
Authors | Juho Timonen, Henrik Mannerström, Aki Vehtari, Harri Lähdesmäki |
Abstract | Identifying risk factors from longitudinal data requires statistical tools that are not restricted to linear models, yet provide interpretable associations between different types of covariates and a response variable. Here, we present a widely applicable and interpretable probabilistic machine learning method for nonparametric longitudinal data analysis using additive Gaussian process regression. We demonstrate that it outperforms previous longitudinal modeling approaches and provides useful novel features, including the ability to account for uncertainty in disease effect times as well as heterogeneity in their effects. |
Tasks | |
Published | 2019-12-07 |
URL | https://arxiv.org/abs/1912.03549v1 (abstract), https://arxiv.org/pdf/1912.03549v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/an-interpretable-probabilistic-machine |
Repo | https://github.com/jtimonen/lgpr |
Framework | none |
PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection
Title | PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection |
Authors | Yue Liao, Si Liu, Fei Wang, Yanjie Chen, Chen Qian, Jiashi Feng |
Abstract | We propose a single-stage Human-Object Interaction (HOI) detection method that has outperformed all existing methods on the HICO-DET dataset at 37 fps on a single Titan XP GPU. It is the first real-time HOI detection method. Conventional HOI detection methods are composed of two stages, i.e., human-object proposal generation and proposal classification. Their effectiveness and efficiency are limited by the sequential and separate architecture. In this paper, we propose a Parallel Point Detection and Matching (PPDM) HOI detection framework. In PPDM, an HOI is defined as a point triplet <human point, interaction point, object point>. Human and object points are the centers of the detection boxes, and the interaction point is the midpoint of the human and object points. PPDM contains two parallel branches, namely a point detection branch and a point matching branch. The point detection branch predicts three points. Simultaneously, the point matching branch predicts two displacements from the interaction point to its corresponding human and object points. The human point and the object point originating from the same interaction point are considered a matched pair. In our novel parallel architecture, the interaction points implicitly provide context and regularization for human and object detection. Isolated detection boxes that are unlikely to form meaningful HOI triplets are suppressed, which increases the precision of HOI detection. Moreover, the matching between human and object detection boxes is only applied around a limited number of filtered candidate interaction points, which saves much computational cost. Additionally, we build a new application-oriented database named HOI-A, which serves as a good supplement to the existing datasets. The source code and the dataset will be made publicly available to facilitate the development of HOI detection. |
Tasks | Human-Object Interaction Detection, Object Detection |
Published | 2019-12-30 |
URL | https://arxiv.org/abs/1912.12898v3 (abstract), https://arxiv.org/pdf/1912.12898v3.pdf (PDF) |
PWC | https://paperswithcode.com/paper/ppdm-parallel-point-detection-and-matching |
Repo | https://github.com/YueLiao/PPDM |
Framework | pytorch |
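The point-triplet geometry in the PPDM abstract can be illustrated with plain coordinates: the interaction point is the midpoint of the human and object box centers, and two predicted displacements lead from it back to those centers. In the sketch below, snapping each displaced location to the nearest detected center is an assumption made for illustration; the abstract does not specify the matching rule, and all names and numbers are hypothetical.

```python
def interaction_point(human_center, object_center):
    """Midpoint of the human and object box centers."""
    return (
        (human_center[0] + object_center[0]) / 2,
        (human_center[1] + object_center[1]) / 2,
    )


def nearest_index(target, candidates):
    """Index of the candidate point closest to target (squared distance)."""
    return min(
        range(len(candidates)),
        key=lambda i: (candidates[i][0] - target[0]) ** 2
        + (candidates[i][1] - target[1]) ** 2,
    )


def match_pair(interaction, disp_to_human, disp_to_object, humans, objects):
    """Follow both displacements from the interaction point and snap each
    landing spot to the nearest detected center (assumed matching rule)."""
    human_guess = (interaction[0] + disp_to_human[0],
                   interaction[1] + disp_to_human[1])
    object_guess = (interaction[0] + disp_to_object[0],
                    interaction[1] + disp_to_object[1])
    return nearest_index(human_guess, humans), nearest_index(object_guess, objects)


# Hypothetical detections: two human centers and two object centers.
humans = [(10, 20), (60, 20)]
objects = [(30, 40), (80, 50)]
center = interaction_point(humans[0], objects[0])
# Slightly noisy displacements, as a learned branch might predict.
pair = match_pair(center, (-9, -11), (11, 9), humans, objects)
```

Because matching starts from interaction points, a detected box that no interaction point leads to never enters a pair, which is one way to read the abstract's claim that isolated boxes are suppressed.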
Domain Generalization by Solving Jigsaw Puzzles
Title | Domain Generalization by Solving Jigsaw Puzzles |
Authors | Fabio Maria Carlucci, Antonio D’Innocente, Silvia Bucci, Barbara Caputo, Tatiana Tommasi |
Abstract | Human adaptability relies crucially on the ability to learn and merge knowledge both from supervised and unsupervised learning: the parents point out a few important concepts, but then the children fill in the gaps on their own. This is particularly effective, because supervised learning can never be exhaustive and thus learning autonomously allows the learner to discover invariances and regularities that help to generalize. In this paper we propose to apply a similar approach to the task of object recognition across domains: our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals how to solve a jigsaw puzzle on the same images. This secondary task helps the network to learn the concepts of spatial correlation while acting as a regularizer for the classification task. Multiple experiments on the PACS, VLCS, Office-Home and digits datasets confirm our intuition and show that this simple method outperforms previous domain generalization and adaptation solutions. An ablation study further illustrates the inner workings of our approach. |
Tasks | Domain Generalization, Object Recognition |
Published | 2019-03-16 |
URL | http://arxiv.org/abs/1903.06864v2 (abstract), http://arxiv.org/pdf/1903.06864v2.pdf (PDF) |
PWC | https://paperswithcode.com/paper/domain-generalization-by-solving-jigsaw |
Repo | https://github.com/Emma0118/domain-generalization |
Framework | pytorch |
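The self-supervised signal described in the abstract comes from solving a jigsaw puzzle on the same images. A common way to build such a training sample is to cut the image into a grid of tiles, shuffle them with a permutation drawn from a fixed set, and use the permutation's index as the label. The sketch below assumes that setup; the helper names, the 3x3 grid, and the two-permutation set are hypothetical and not taken from the paper's code.

```python
import random


def split_into_tiles(image, grid=3):
    """Cut a 2-D list (height and width divisible by grid) into
    grid*grid tiles, listed in row-major order."""
    tile_h = len(image) // grid
    tile_w = len(image[0]) // grid
    tiles = []
    for r in range(grid):
        for c in range(grid):
            rows = image[r * tile_h:(r + 1) * tile_h]
            tiles.append([row[c * tile_w:(c + 1) * tile_w] for row in rows])
    return tiles


def assemble(tiles, grid=3):
    """Inverse of split_into_tiles: stitch row-major tiles into one image."""
    tile_h = len(tiles[0])
    image = []
    for r in range(grid):
        for i in range(tile_h):
            row = []
            for c in range(grid):
                row.extend(tiles[r * grid + c][i])
            image.append(row)
    return image


def make_jigsaw_sample(image, permutations, rng, grid=3):
    """Shuffle the tiles with a randomly chosen permutation and return
    the shuffled image together with the permutation index as its label."""
    label = rng.randrange(len(permutations))
    tiles = split_into_tiles(image, grid)
    shuffled = [tiles[i] for i in permutations[label]]
    return assemble(shuffled, grid), label


# Hypothetical 6x6 "image" and a tiny permutation set.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
permutations = [tuple(range(9)), tuple(reversed(range(9)))]
puzzle, label = make_jigsaw_sample(image, permutations, random.Random(0))
```

A classifier trained to predict `label` from `puzzle` has to reason about where parts belong relative to each other, which is the spatial-correlation signal the abstract refers to.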
Exploration via Flow-Based Intrinsic Rewards
Title | Exploration via Flow-Based Intrinsic Rewards |
Authors | Hsuan-Kung Yang, Po-Han Chiang, Min-Fong Hong, Chun-Yi Lee |
Abstract | Exploration bonuses derived from the novelty of observations in an environment have become a popular approach to motivate exploration for reinforcement learning (RL) agents in the past few years. Recent methods such as curiosity-driven exploration usually estimate the novelty of new observations by the prediction errors of their system dynamics models. In this paper, we introduce the concept of optical flow estimation from the field of computer vision to the RL domain and utilize the errors from optical flow estimation to evaluate the novelty of new observations. We introduce a flow-based intrinsic curiosity module (FICM) capable of learning the motion features and understanding the observations in a more comprehensive and efficient fashion. We evaluate our method and compare it with a number of baselines on several benchmark environments, including Atari games, Super Mario Bros., and ViZDoom. Our results show that the proposed method is superior to the baselines in certain environments, especially for those featuring sophisticated moving patterns or with high-dimensional observation spaces. |
Tasks | Atari Games, Optical Flow Estimation |
Published | 2019-05-24 |
URL | https://arxiv.org/abs/1905.10071v2 (abstract), https://arxiv.org/pdf/1905.10071v2.pdf (PDF) |
PWC | https://paperswithcode.com/paper/exploration-via-flow-based-intrinsic-rewards |
Repo | https://github.com/hellochick/MarioO_O-flow-curioisty |
Framework | tf |
Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels
Title | Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels |
Authors | Yihan Jiang, Hyeji Kim, Himanshu Asnani, Sreeram Kannan, Sewoong Oh, Pramod Viswanath |
Abstract | Designing codes that combat the noise in a communication medium has remained a significant area of research in information theory as well as wireless communications. Asymptotically optimal channel codes have been developed by mathematicians for communicating under canonical models after over 60 years of research. On the other hand, in many non-canonical channel settings, optimal codes do not exist and the codes designed for canonical models are adapted via heuristics to these channels and are thus not guaranteed to be optimal. In this work, we make significant progress on this problem by designing a fully end-to-end jointly trained neural encoder and decoder, namely, Turbo Autoencoder (TurboAE), with the following contributions: (a) under moderate block lengths, TurboAE approaches state-of-the-art performance under canonical channels; (b) moreover, TurboAE outperforms the state-of-the-art codes under non-canonical settings in terms of reliability. TurboAE shows that the development of channel coding design can be automated via deep learning, with near-optimal performance. |
Tasks | |
Published | 2019-11-08 |
URL | https://arxiv.org/abs/1911.03038v1 (abstract), https://arxiv.org/pdf/1911.03038v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/turbo-autoencoder-deep-learning-based-channel |
Repo | https://github.com/yihanjiang/turboae |
Framework | pytorch |
Taxonomy of Real Faults in Deep Learning Systems
Title | Taxonomy of Real Faults in Deep Learning Systems |
Authors | Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, Paolo Tonella |
Abstract | The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTorch) and from related Stack Overflow posts. Structured interviews with 20 researchers and practitioners describing the problems they have encountered in their experience have enriched our taxonomy with a variety of additional faults that did not emerge from the other two sources. Our final taxonomy was validated with a survey involving an additional set of 21 developers, confirming that almost all fault categories (13/15) were experienced by at least 50% of the survey participants. |
Tasks | |
Published | 2019-10-24 |
URL | https://arxiv.org/abs/1910.11015v3 (abstract), https://arxiv.org/pdf/1910.11015v3.pdf (PDF) |
PWC | https://paperswithcode.com/paper/taxonomy-of-real-faults-in-deep-learning |
Repo | https://github.com/dlfaults/dl_faults |
Framework | tf |
Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation
Title | Interpreting OWL Complex Classes in AutomationML based on Bidirectional Translation |
Authors | Yingbing Hua, Björn Hein |
Abstract | The World Wide Web Consortium (W3C) has published several recommendations for building and storing ontologies, including the most recent OWL 2 Web Ontology Language (OWL). These initiatives have been followed by practical implementations that popularize OWL in various domains. For example, OWL has been used for conceptual modeling in industrial engineering, and its reasoning facilities are used to provide a wealth of services, e.g. model diagnosis, automated code generation, and semantic integration. More specifically, recent studies have shown that OWL is well suited for harmonizing information of engineering tools stored as AutomationML (AML) files. However, OWL and its tools can be cumbersome for direct use by engineers such that an ontology expert is often required in practice. Although much attention has been paid in the literature to overcome this issue by transforming OWL ontologies from/to AML models automatically, dealing with OWL complex classes remains an open research question. In this paper, we introduce the AML concept models for representing OWL complex classes in AutomationML, and present algorithms for the bidirectional translation between OWL complex classes and their corresponding AML concept models. We show that this approach provides an efficient and intuitive interface for nonexperts to visualize, modify, and create OWL complex classes. |
Tasks | Code Generation |
Published | 2019-06-04 |
URL | https://arxiv.org/abs/1906.04240v1 (abstract), https://arxiv.org/pdf/1906.04240v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/interpreting-owl-complex-classes-in |
Repo | https://github.com/kit-hua/ETFA2019 |
Framework | none |
CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling
Title | CodeGRU: Context-aware Deep Learning with Gated Recurrent Unit for Source Code Modeling |
Authors | Yasir Hussain, Zhiqiu Huang, Senzhang Wang, Yu Zhou |
Abstract | Recently many NLP-based deep learning models have been applied to model source code for source code suggestion and recommendation tasks. A major limitation of these approaches is that they take source code as simple tokens of text and ignore its contextual, syntactic and structural dependencies. In this work, we present CodeGRU, a Gated Recurrent Unit based source code language model that is capable of capturing contextual, syntactic and structural dependencies for modeling the source code. CodeGRU introduces several new components. The Code Sampler is first proposed for selecting noise-free code samples and transforming obfuscated code to its proper syntax, which helps to capture syntactic and structural dependencies. The Code Regularize is next introduced to encode source code, which helps capture the contextual dependencies of the source code. Finally, we propose a novel method which can learn variable-size context for modeling source code. We evaluated CodeGRU with a real-world dataset and it shows that CodeGRU can effectively capture contextual, syntactic and structural dependencies, which previous works fail to do. We also discuss and visualize two use cases of CodeGRU for source code modeling tasks: (1) source code suggestion, and (2) source code generation. |
Tasks | Code Generation, Language Modelling |
Published | 2019-03-03 |
URL | http://arxiv.org/abs/1903.00884v1 (abstract), http://arxiv.org/pdf/1903.00884v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/codegru-context-aware-deep-learning-with |
Repo | https://github.com/yaxirhuxxain/Source-Code-Suggestion |
Framework | tf |
AutoGAN: Neural Architecture Search for Generative Adversarial Networks
Title | AutoGAN: Neural Architecture Search for Generative Adversarial Networks |
Authors | Xinyu Gong, Shiyu Chang, Yifan Jiang, Zhangyang Wang |
Abstract | Neural architecture search (NAS) has witnessed prevailing success in image classification and (very recently) segmentation tasks. In this paper, we present the first preliminary study on introducing the NAS algorithm to generative adversarial networks (GANs), dubbed AutoGAN. The marriage of NAS and GANs faces its unique challenges. We define the search space for the generator architectural variations and use an RNN controller to guide the search, with parameter sharing and dynamic-resetting to accelerate the process. Inception score is adopted as the reward, and a multi-level search strategy is introduced to perform NAS in a progressive way. Experiments validate the effectiveness of AutoGAN on the task of unconditional image generation. Specifically, our discovered architectures achieve highly competitive performance compared to current state-of-the-art hand-crafted GANs, e.g., setting new state-of-the-art FID scores of 12.42 on CIFAR-10, and 31.01 on STL-10, respectively. We also conclude with a discussion of the current limitations and future potential of AutoGAN. The code is available at https://github.com/TAMU-VITA/AutoGAN |
Tasks | Image Classification, Image Generation, Neural Architecture Search |
Published | 2019-08-11 |
URL | https://arxiv.org/abs/1908.03835v1 (abstract), https://arxiv.org/pdf/1908.03835v1.pdf (PDF) |
PWC | https://paperswithcode.com/paper/autogan-neural-architecture-search-for |
Repo | https://github.com/TAMU-VITA/AutoGAN |
Framework | pytorch |
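The abstract states that the Inception score is used as the search reward. The score itself is the exponential of the average KL divergence between each image's predicted class distribution and the marginal class distribution over all images. The sketch below computes it from already-available class probabilities; in practice these would come from a pretrained Inception classifier, and the toy probabilities and function name here are hypothetical. How AutoGAN passes the score to its controller is not shown.

```python
import math


def inception_score(class_probs):
    """exp of the mean KL(p(y|x) || p(y)) over all samples.

    class_probs is a list of per-image class probability lists."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal class distribution p(y), averaged over the images.
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for p in class_probs:
        # Terms with p_k == 0 contribute nothing to the KL sum.
        total_kl += sum(
            pk * math.log(pk / marginal[k]) for k, pk in enumerate(p) if pk > 0
        )
    return math.exp(total_kl / n)


# Confident and diverse predictions give the highest score for two classes.
sharp = inception_score([[1.0, 0.0], [0.0, 1.0]])
# Uninformative predictions give the lowest possible score.
flat = inception_score([[0.5, 0.5], [0.5, 0.5]])
```

The score ranges from 1 up to the number of classes, so a higher value rewards generators whose samples are individually recognizable and collectively varied.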