Paper Group AWR 149
Graph Neural Networks: A Review of Methods and Applications. Deep contextualized word representations. Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet. DensePose: Dense Human Pose Estimation In The Wild. Optimal Transport for structured data with application on graphs. Machine Teaching for Inverse Reinforcement Learnin …
Graph Neural Networks: A Review of Methods and Applications
Title | Graph Neural Networks: A Review of Methods and Applications |
Authors | Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, Maosong Sun |
Abstract | Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all require a model to learn from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency trees of sentences and the scene graphs of images, is an important research topic that also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from their neighborhood with arbitrary depth. Although the primitive GNNs have been found difficult to train to a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on variants of graph neural networks, such as the graph convolutional network (GCN), graph attention network (GAT), and gated graph neural network (GGNN), have demonstrated ground-breaking performance on many of the tasks mentioned above. In this survey, we provide a detailed review of existing graph neural network models, systematically categorize the applications, and propose four open problems for future research. |
Tasks | |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08434v4 |
https://arxiv.org/pdf/1812.08434v4.pdf | |
PWC | https://paperswithcode.com/paper/graph-neural-networks-a-review-of-methods-and |
Repo | https://github.com/NorthPolesky/GNNpaper |
Framework | none |
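As a rough illustration of the message-passing idea this survey covers, here is a minimal GCN-style layer in NumPy. The symmetric normalization, function name, and toy graph are illustrative assumptions, not code from the surveyed papers.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN-style message-passing step: aggregate normalized
    neighbor features, then apply a linear map and a ReLU.

    A : (n, n) adjacency matrix, H : (n, d_in) node features,
    W : (d_in, d_out) learnable weights.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # propagate + ReLU

# Toy 3-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 2)
W = np.random.randn(2, 4)
print(gcn_layer(A, H, W).shape)  # (3, 4)
```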
Deep contextualized word representations
Title | Deep contextualized word representations |
Authors | Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer |
Abstract | We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals. |
Tasks | Citation Intent Classification, Coreference Resolution, Language Modelling, Named Entity Recognition, Natural Language Inference, Question Answering, Semantic Role Labeling, Sentiment Analysis |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05365v2 |
http://arxiv.org/pdf/1802.05365v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-contextualized-word-representations |
Repo | https://github.com/kinimod23/NMT_Project |
Framework | none |
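A minimal sketch of the layer-mixing step the abstract alludes to: task-specific softmax weights over the biLM's layer representations, scaled by a scalar. The shapes and names below are assumptions for illustration, not the authors' code.

```python
import numpy as np

def elmo_mix(layer_reps, s_logits, gamma):
    """Collapse L biLM layers into one contextual vector per token:
    ELMo_t = gamma * sum_j softmax(s)_j * h_{t,j}.

    layer_reps : (L, T, d) hidden states for T tokens,
    s_logits   : (L,) task-specific layer weights (pre-softmax),
    gamma      : scalar task-specific scale.
    """
    s = np.exp(s_logits - s_logits.max())
    s = s / s.sum()                                      # softmax over layers
    return gamma * np.tensordot(s, layer_reps, axes=1)   # (T, d)

layers = np.random.randn(3, 5, 8)   # 3 layers, 5 tokens, 8-dim states
print(elmo_mix(layers, np.zeros(3), 1.0).shape)  # (5, 8)
```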
Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet
Title | Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet |
Authors | Seyyed Hossein Hasanpour, Mohammad Rouhani, Mohsen Fayyaz, Mohammad Sabokrou, Ehsan Adeli |
Abstract | Major winning Convolutional Neural Networks (CNNs), such as VGGNet, ResNet, and DenseNet, include tens to hundreds of millions of parameters, which impose considerable computation and memory overheads. This limits their practical use in training and optimizing for real-world applications. On the contrary, light-weight architectures, such as SqueezeNet, are being proposed to address this issue. However, they mainly suffer from low accuracy, as they have compromised between processing power and efficiency. These inefficiencies mostly stem from following an ad-hoc design procedure. In this work, we discuss and propose several crucial design principles for an efficient architecture design and elaborate intuitions concerning different aspects of the design procedure. Furthermore, we introduce a new layer called SAF-pooling to improve the generalization power of the network while keeping it simple by choosing the best features. Based on such principles, we propose a simple architecture called SimpNet. We empirically show that SimpNet provides a good trade-off between computation/memory efficiency and accuracy solely based on these primitive but crucial principles. SimpNet outperforms deeper and more complex architectures such as VGGNet, ResNet, and WideResidualNet on several well-known benchmarks, while having 2 to 25 times fewer parameters and operations. We obtain state-of-the-art results (in terms of the balance between accuracy and the number of involved parameters) on standard datasets such as CIFAR10, CIFAR100, MNIST and SVHN. The implementations are available at https://github.com/Coderx7/SimpNet. |
Tasks | Image Classification |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06205v1 |
http://arxiv.org/pdf/1802.06205v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-principled-design-of-deep |
Repo | https://github.com/hexpheus/SimpNet-Tensorflow |
Framework | tf |
DensePose: Dense Human Pose Estimation In The Wild
Title | DensePose: Dense Human Pose Estimation In The Wild |
Authors | Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos |
Abstract | In this work, we establish dense correspondences between an RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation. We first gather dense correspondences for 50K persons appearing in the COCO dataset by introducing an efficient annotation pipeline. We then use our dataset to train CNN-based systems that deliver dense correspondence ‘in the wild’, namely in the presence of background, occlusions and scale variations. We improve our training set’s effectiveness by training an ‘inpainting’ network that can fill in missing ground-truth values, and report clear improvements with respect to the best results that would be achievable in the past. We experiment with fully-convolutional networks and region-based models and observe a superiority of the latter; we further improve accuracy through cascading, obtaining a system that delivers highly accurate results in real time. Supplementary materials and videos are provided on the project page http://densepose.org |
Tasks | Pose Estimation |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00434v1 |
http://arxiv.org/pdf/1802.00434v1.pdf | |
PWC | https://paperswithcode.com/paper/densepose-dense-human-pose-estimation-in-the |
Repo | https://github.com/facebookresearch/DensePose |
Framework | none |
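A hedged sketch of how per-pixel dense correspondence outputs could be decoded: classify each pixel into a body part (or background), then read the regressed UV coordinates for that part. The 24-part layout, array names, and background index are illustrative assumptions, not the released DensePose code.

```python
import numpy as np

def decode_densepose(part_logits, u_maps, v_maps):
    """Per-pixel decode sketch: pick the most likely part for each pixel,
    then gather the (u, v) surface coordinates regressed for that part.
    Index 0 is treated as background.

    part_logits : (P+1, H, W) scores for background + P parts,
    u_maps, v_maps : (P+1, H, W) per-part UV regressions in [0, 1].
    """
    parts = part_logits.argmax(axis=0)          # (H, W) part index per pixel
    h_idx, w_idx = np.indices(parts.shape)
    u = u_maps[parts, h_idx, w_idx]             # UV from the winning part
    v = v_maps[parts, h_idx, w_idx]
    u[parts == 0] = 0.0                         # mask out background pixels
    v[parts == 0] = 0.0
    return parts, u, v

logits = np.random.randn(25, 4, 4)              # 24 parts + background
parts, u, v = decode_densepose(logits, np.random.rand(25, 4, 4),
                               np.random.rand(25, 4, 4))
print(parts.shape, u.shape)  # (4, 4) (4, 4)
```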
Optimal Transport for structured data with application on graphs
Title | Optimal Transport for structured data with application on graphs |
Authors | Titouan Vayer, Laetitia Chapel, Rémi Flamary, Romain Tavenard, Nicolas Courty |
Abstract | This work considers the problem of computing distances between structured objects such as undirected graphs, seen as probability distributions in a specific metric space. We consider a new transportation distance (i.e. one that minimizes a total cost of transporting probability masses) that unveils the geometric nature of the structured-object space. Unlike the Wasserstein or Gromov-Wasserstein metrics, which focus solely and respectively on features (by considering a metric in the feature space) or structure (by seeing structure as a metric space), our new distance jointly exploits both kinds of information, and is consequently called Fused Gromov-Wasserstein (FGW). After discussing its properties and computational aspects, we show results on a graph classification task, where our method outperforms both graph kernels and deep graph convolutional networks. Further exploiting the metric properties of FGW, interesting geometric objects such as Fréchet means or barycenters of graphs are illustrated and discussed in a clustering context. |
Tasks | Graph Classification, Graph Clustering, Time Series |
Published | 2018-05-23 |
URL | https://arxiv.org/abs/1805.09114v3 |
https://arxiv.org/pdf/1805.09114v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-transport-for-structured-data-with |
Repo | https://github.com/tvayer/FGW |
Framework | none |
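To make the fused cost concrete, here is a NumPy evaluation of the FGW objective for a fixed coupling. It only scores a given transport plan (it does not solve the optimization), and the squared-loss choice and symbol names are assumptions following standard OT notation rather than the authors' implementation.

```python
import numpy as np

def fgw_cost(M, C1, C2, T, alpha=0.5):
    """Evaluate the Fused Gromov-Wasserstein objective for a fixed coupling T.

    M  : (n1, n2) pairwise feature distances between node attributes,
    C1 : (n1, n1), C2 : (n2, n2) intra-graph structure matrices,
    T  : (n1, n2) coupling whose marginals match the node weights,
    alpha : trade-off between the feature (0) and structure (1) terms.
    """
    feat = (1.0 - alpha) * np.sum(M * T)
    # L[i, j, k, l] = (C1[i, k] - C2[j, l])^2, built by broadcasting.
    L = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2
    struct = alpha * np.einsum('ijkl,ij,kl->', L, T, T)
    return feat + struct

n1, n2 = 3, 4
T = np.full((n1, n2), 1.0 / (n1 * n2))        # uniform (independent) coupling
print(fgw_cost(np.random.rand(n1, n2),
               np.random.rand(n1, n1),
               np.random.rand(n2, n2), T))
```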
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications
Title | Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications |
Authors | Daniel S. Brown, Scott Niekum |
Abstract | Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization. However, despite much recent interest in IRL, little work has been done to understand the minimum set of demonstrations needed to teach a specific sequential decision-making task. We formalize the problem of finding maximally informative demonstrations for IRL as a machine teaching problem where the goal is to find the minimum number of demonstrations needed to specify the reward equivalence class of the demonstrator. We extend previous work on algorithmic teaching for sequential decision-making tasks by showing a reduction to the set cover problem which enables an efficient approximation algorithm for determining the set of maximally-informative demonstrations. We apply our proposed machine teaching algorithm to two novel applications: providing a lower bound on the number of queries needed to learn a policy using active IRL and developing a novel IRL algorithm that can learn more efficiently from informative demonstrations than a standard IRL approach. |
Tasks | Decision Making |
Published | 2018-05-20 |
URL | https://arxiv.org/abs/1805.07687v7 |
https://arxiv.org/pdf/1805.07687v7.pdf | |
PWC | https://paperswithcode.com/paper/machine-teaching-for-inverse-reinforcement |
Repo | https://github.com/dsbrown1331/machine-teaching-irl |
Framework | none |
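The set-cover reduction in the abstract suggests the standard greedy approximation. The sketch below is that generic greedy cover, not the authors' implementation; demonstrations are modelled abstractly as the sets of reward-space constraints they cover.

```python
def greedy_demo_selection(candidate_demos, constraints_to_cover):
    """Greedy set-cover sketch: repeatedly pick the demonstration whose
    constraints cover the most still-uncovered constraints, until the
    reward equivalence class is pinned down (or no demo adds coverage).

    candidate_demos      : dict mapping demo id -> set of covered constraints,
    constraints_to_cover : set of constraints defining the equivalence class.
    """
    uncovered = set(constraints_to_cover)
    chosen = []
    while uncovered:
        best = max(candidate_demos,
                   key=lambda d: len(candidate_demos[d] & uncovered))
        gain = candidate_demos[best] & uncovered
        if not gain:           # remaining constraints cannot be covered
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

demos = {'d1': {1, 2}, 'd2': {2, 3, 4}, 'd3': {4, 5}}
print(greedy_demo_selection(demos, {1, 2, 3, 4, 5}))  # ['d2', 'd1', 'd3']
```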
WikiHow: A Large Scale Text Summarization Dataset
Title | WikiHow: A Large Scale Text Summarization Dataset |
Authors | Mahnaz Koupaee, William Yang Wang |
Abstract | Sequence-to-sequence models have recently achieved state-of-the-art performance in summarization. However, few large-scale, high-quality datasets are available, and almost all of the available ones consist mainly of news articles with a specific writing style. Moreover, abstractive human-style systems involving description of the content at a deeper level require data with higher levels of abstraction. In this paper, we present WikiHow, a dataset of more than 230,000 article and summary pairs extracted and constructed from an online knowledge base written by different human authors. The articles span a wide range of topics and therefore represent a high diversity of styles. We evaluate the performance of existing methods on WikiHow to present its challenges and set some baselines for further improvement. |
Tasks | Text Summarization |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.09305v1 |
http://arxiv.org/pdf/1810.09305v1.pdf | |
PWC | https://paperswithcode.com/paper/wikihow-a-large-scale-text-summarization |
Repo | https://github.com/mahnazkoupaee/WikiHow-Dataset |
Framework | none |
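A hypothetical loading snippet for building article/summary pairs. The file name and the column names (`headline` as the concatenated step summaries, `text` as the article body) are assumptions based on the linked repo's release; verify them against the repo's README before relying on them.

```python
import pandas as pd

# Assumed file and column names from the dataset release; check the repo.
df = pd.read_csv("wikihowAll.csv").dropna(subset=["headline", "text"])

# Build (article, summary) pairs: the concatenated step headlines serve as
# the abstractive summary of the full article text.
pairs = [(row["text"].strip(), row["headline"].strip())
         for _, row in df.iterrows()]
print(len(pairs), "article/summary pairs")
```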
CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark
Title | CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark |
Authors | Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu |
Abstract | Multi-person pose estimation is fundamental to many computer vision tasks and has made significant progress in recent years. However, few previous methods have explored the problem of pose estimation in crowded scenes, even though it remains challenging and inevitable in many scenarios. Moreover, current benchmarks cannot provide an appropriate evaluation for such cases. In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd, along with a new dataset to better evaluate algorithms. Our model consists of two key components: joint-candidate single-person pose estimation (SPPE) and global maximum joints association. With multi-peak prediction for each joint and global association using a graph model, our method is robust to the inevitable interference in crowded scenes and very efficient in inference. The proposed method surpasses the state-of-the-art methods on the CrowdPose dataset by 5.2 mAP, and results on the MSCOCO dataset demonstrate the generalization ability of our method. Source code and dataset will be made publicly available. |
Tasks | Keypoint Detection, Multi-Person Pose Estimation, Pose Estimation |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00324v2 |
http://arxiv.org/pdf/1812.00324v2.pdf | |
PWC | https://paperswithcode.com/paper/crowdpose-efficient-crowded-scenes-pose |
Repo | https://github.com/Jeff-sjtu/CrowdPose |
Framework | pytorch |
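A hedged sketch of the "multi-peak prediction" idea: instead of taking only the single argmax of a joint heatmap, keep every sufficiently strong local maximum as a candidate for the later global association step. The function name, threshold, and 4-neighbour peak test are illustrative assumptions, not the released code.

```python
import numpy as np

def joint_candidates(heatmap, threshold=0.3):
    """Return all local maxima above a score threshold as (row, col, score)
    candidates for one joint, rather than only the global peak."""
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    center = padded[1:-1, 1:-1]
    is_peak = ((center >= padded[:-2, 1:-1]) & (center >= padded[2:, 1:-1]) &
               (center >= padded[1:-1, :-2]) & (center >= padded[1:-1, 2:]) &
               (center > threshold))
    rows, cols = np.nonzero(is_peak)
    return [(r, c, heatmap[r, c]) for r, c in zip(rows, cols)]

hm = np.zeros((5, 5))
hm[1, 1] = 0.9
hm[3, 4] = 0.6
print(joint_candidates(hm))  # [(1, 1, 0.9), (3, 4, 0.6)]
```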
A Compact Embedding for Facial Expression Similarity
Title | A Compact Embedding for Facial Expression Similarity |
Authors | Raviteja Vemulapalli, Aseem Agarwala |
Abstract | Most of the existing work on automatic facial expression analysis focuses on discrete emotion recognition, or facial action unit detection. However, facial expressions do not always fall neatly into pre-defined semantic categories. Also, the similarity between expressions measured in the action unit space need not correspond to how humans perceive expression similarity. Different from previous work, our goal is to describe facial expressions in a continuous fashion using a compact embedding space that mimics human visual preferences. To achieve this goal, we collect a large-scale faces-in-the-wild dataset with human annotations in the form: Expressions A and B are visually more similar when compared to expression C, and use this dataset to train a neural network that produces a compact (16-dimensional) expression embedding. We experimentally demonstrate that the learned embedding can be successfully used for various applications such as expression retrieval, photo album summarization, and emotion recognition. We also show that the embedding learned using the proposed dataset performs better than several other embeddings learned using existing emotion or action unit datasets. |
Tasks | Action Unit Detection, Emotion Recognition, Facial Expression Recognition |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11283v2 |
http://arxiv.org/pdf/1811.11283v2.pdf | |
PWC | https://paperswithcode.com/paper/a-compact-embedding-for-facial-expression |
Repo | https://github.com/GerardLiu96/FECNet |
Framework | tf |
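The embedding is trained from relative similarity judgments ("A and B are more alike than C"), which naturally suggests a triplet-style loss. The margin value and function below are a generic sketch of that idea, not the paper's exact formulation.

```python
import numpy as np

def triplet_loss(emb_a, emb_b, emb_c, margin=0.2):
    """Hinge loss encouraging the pair judged more similar (A, B) to lie
    closer in the embedding than either one is to the odd-one-out C.
    Each argument is a 16-dimensional expression embedding."""
    d_ab = np.sum((emb_a - emb_b) ** 2)
    d_ac = np.sum((emb_a - emb_c) ** 2)
    d_bc = np.sum((emb_b - emb_c) ** 2)
    return (max(0.0, d_ab - d_ac + margin) +
            max(0.0, d_ab - d_bc + margin))

a, b, c = (np.random.randn(16) for _ in range(3))
print(triplet_loss(a, b, c))
```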
nn-dependability-kit: Engineering Neural Networks for Safety-Critical Autonomous Driving Systems
Title | nn-dependability-kit: Engineering Neural Networks for Safety-Critical Autonomous Driving Systems |
Authors | Chih-Hong Cheng, Chung-Hao Huang, Georg Nührenberg |
Abstract | Can engineering neural networks be approached in a disciplined way similar to how engineers build software for civil aircraft? We present nn-dependability-kit, an open-source toolbox to support safety engineering of neural networks for autonomous driving systems. The rationale behind nn-dependability-kit is to consider a structured approach (via Goal Structuring Notation) to argue the quality of neural networks. In particular, the tool realizes recent scientific results including (a) novel dependability metrics for indicating sufficient elimination of uncertainties in the product life cycle, (b) a formal reasoning engine for ensuring that the generalization does not lead to undesired behaviors, and (c) runtime monitoring for reasoning about whether a decision of a neural network in operation is supported by prior similarities in the training data. A proprietary version of nn-dependability-kit has been used to improve the quality of a level-3 autonomous driving component developed by Audi for highway maneuvers. |
Tasks | Autonomous Driving |
Published | 2018-11-16 |
URL | https://arxiv.org/abs/1811.06746v2 |
https://arxiv.org/pdf/1811.06746v2.pdf | |
PWC | https://paperswithcode.com/paper/nn-dependability-kit-engineering-neural |
Repo | https://github.com/dependable-ai/nn-dependability-kit |
Framework | tf |
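The runtime-monitoring idea (flag inputs whose activation pattern was never seen on the training data) can be sketched generically as below. This is not the toolbox's actual API, just an illustrative boolean-abstraction monitor over one layer's activations.

```python
import numpy as np

class ActivationPatternMonitor:
    """Record binarized activation patterns (which neurons fire) of a chosen
    layer on training data; at runtime, flag patterns never seen during
    training as unsupported by prior similarity."""

    def __init__(self):
        self.seen = set()

    def _pattern(self, activations):
        return tuple((np.asarray(activations) > 0).astype(int))

    def record(self, activations):
        self.seen.add(self._pattern(activations))

    def is_supported(self, activations):
        return self._pattern(activations) in self.seen

monitor = ActivationPatternMonitor()
for _ in range(100):                      # "training" activations
    monitor.record(np.random.randn(8))
print(monitor.is_supported(np.random.randn(8)))
```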
Tuning Fairness by Marginalizing Latent Target Labels
Title | Tuning Fairness by Marginalizing Latent Target Labels |
Authors | Thomas Kehrenberg, Zexun Chen, Novi Quadrianto |
Abstract | Addressing fairness in machine learning models has recently attracted a lot of attention, as it will ensure the continued confidence of the general public in the deployment of machine learning systems. Here, we focus on mitigating the harm of a biased system that offers better outputs (e.g. loans, jobs) for certain groups than for others. We show that bias in the output can naturally be handled in probabilistic models by introducing a latent target output that modulates the likelihood function. This simple formulation has several advantages: first, it is a unified framework for several notions of fairness, such as demographic parity and equalized odds; second, it is expressed as a marginalization instead of a constrained problem; and third, it allows encoding our knowledge of what the bias in outputs should be. Practically, the latter translates to the ability to control the level of fairness by directly varying fairness target rates. In contrast, existing approaches rely on intermediate, arguably unintuitive control parameters such as a covariance threshold. |
Tasks | |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05598v3 |
http://arxiv.org/pdf/1810.05598v3.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-fairness-via-target-labels-in |
Repo | https://github.com/predictive-analytics-lab/UniversalGP |
Framework | tf |
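A toy numerical sketch of the marginalization described in the abstract: the observed (possibly biased) label y is modelled through a latent target label ȳ, so p(y | x, s) = Σ_ȳ p(y | ȳ, s) · p(ȳ | x), and the conditional table p(y | ȳ, s) is where target fairness rates can be encoded. The numbers below are made up for illustration.

```python
import numpy as np

def biased_label_prob(p_latent_pos, p_y1_given_latent_s):
    """p(y=1 | x, s) = sum over latent target label ybar of
    p(y=1 | ybar, s) * p(ybar | x)."""
    p_latent = np.array([1.0 - p_latent_pos, p_latent_pos])  # p(ybar | x)
    return float(p_latent @ p_y1_given_latent_s)              # marginalize ybar

# Hypothetical bias model for one group s: a deserving (ybar=1) candidate
# receives a positive observed label y=1 only 70% of the time.
p_y1_given_ybar_s = np.array([0.05, 0.70])   # [p(y=1|ybar=0,s), p(y=1|ybar=1,s)]
print(biased_label_prob(0.8, p_y1_given_ybar_s))  # 0.05*0.2 + 0.70*0.8 = 0.57
```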
Axiomatic Interpretability for Multiclass Additive Models
Title | Axiomatic Interpretability for Multiclass Additive Models |
Authors | Xuezhou Zhang, Sarah Tan, Paul Koch, Yin Lou, Urszula Chajewska, Rich Caruana |
Abstract | Generalized additive models (GAMs) are favored in many regression and binary classification problems because they are able to fit complex, nonlinear functions while still remaining interpretable. In the first part of this paper, we generalize a state-of-the-art GAM learning algorithm based on boosted trees to the multiclass setting, and show that this multiclass algorithm outperforms existing GAM learning algorithms and sometimes matches the performance of full complexity models such as gradient boosted trees. In the second part, we turn our attention to the interpretability of GAMs in the multiclass setting. Surprisingly, the natural interpretability of GAMs breaks down when there are more than two classes. Naive interpretation of multiclass GAMs can lead to false conclusions. Inspired by binary GAMs, we identify two axioms that any additive model must satisfy in order to not be visually misleading. We then develop a technique called Additive Post-Processing for Interpretability (API), that provably transforms a pre-trained additive model to satisfy the interpretability axioms without sacrificing accuracy. The technique works not just on models trained with our learning algorithm, but on any multiclass additive model, including multiclass linear and logistic regression. We demonstrate the effectiveness of API on a 12-class infant mortality dataset. |
Tasks | |
Published | 2018-10-22 |
URL | https://arxiv.org/abs/1810.09092v2 |
https://arxiv.org/pdf/1810.09092v2.pdf | |
PWC | https://paperswithcode.com/paper/interpretability-is-harder-in-the-multiclass |
Repo | https://github.com/microsoft/interpret |
Framework | none |
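The linked microsoft/interpret repo ships a boosted-tree GAM learner, so a minimal multiclass fit might look like the sketch below. Exact class names and behavior may differ across package versions; treat this as an assumed usage pattern and check the package documentation.

```python
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_iris

# Fit a boosted-tree GAM on a small 3-class problem.
X, y = load_iris(return_X_y=True)
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

# Each feature's learned shape function can be inspected per class,
# which is exactly where the multiclass interpretability pitfalls arise.
global_explanation = ebm.explain_global()
print(type(global_explanation).__name__)
```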
Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
Title | Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints |
Authors | Andrew Cotter, Maya Gupta, Heinrich Jiang, Nathan Srebro, Karthik Sridharan, Serena Wang, Blake Woodworth, Seungil You |
Abstract | Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals. We study the generalization performance for such constrained optimization problems, in terms of how well the constraints are satisfied at evaluation time, given that they are satisfied at training time. To improve generalization performance, we frame the problem as a two-player game where one player optimizes the model parameters on a training dataset, and the other player enforces the constraints on an independent validation dataset. We build on recent work in two-player constrained optimization to show that if one uses this two-dataset approach, then constraint generalization can be significantly improved. As we illustrate experimentally, this approach works not only in theory, but also in practice. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1807.00028v2 |
http://arxiv.org/pdf/1807.00028v2.pdf | |
PWC | https://paperswithcode.com/paper/training-well-generalizing-classifiers-for |
Repo | https://github.com/google-research/tensorflow_constrained_optimization |
Framework | tf |
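A minimal NumPy sketch of the two-dataset, two-player idea: the model player takes gradient steps on the training-set Lagrangian, while the constraint player updates its multiplier from the constraint violation measured on a separate validation set. It is a generic Lagrangian toy (a rate constraint on a linear model), not the linked library's API.

```python
import numpy as np

rng = np.random.default_rng(0)
x_tr, x_va = rng.normal(size=300), rng.normal(size=300)
y_tr = 0.5 * x_tr + rng.normal(scale=0.1, size=300)
target_rate = 0.3          # constraint: mean prediction must be >= 0.3

w, b, lam = 0.0, 0.0, 0.0
lr, lr_lam = 0.05, 0.1
for _ in range(2000):
    pred_tr = w * x_tr + b
    # Player 1 (model): gradient step on the train-set Lagrangian
    #   mean((pred - y)^2) + lam * (target_rate - mean(pred)).
    gw = np.mean(2 * (pred_tr - y_tr) * x_tr) - lam * np.mean(x_tr)
    gb = np.mean(2 * (pred_tr - y_tr)) - lam
    w, b = w - lr * gw, b - lr * gb
    # Player 2 (constraints): raise lam when the constraint is violated
    # on the *validation* set, which is the key two-dataset trick.
    violation = target_rate - np.mean(w * x_va + b)
    lam = max(0.0, lam + lr_lam * violation)

print(round(np.mean(w * x_va + b), 3))  # approaches the 0.3 target rate
```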
Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability
Title | Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability |
Authors | Zhihao Li, Toshiyuki Motoyoshi, Kazuma Sasaki, Tetsuya Ogata, Shigeki Sugano |
Abstract | Current end-to-end deep learning driving models have two problems: (1) poor generalization to unobserved driving environments when the diversity of the training driving dataset is limited, and (2) a lack of accident explanation ability when driving models do not work as expected. To tackle these two problems, rooted in the belief that knowledge of an associated easy task is beneficial for addressing a difficult task, we propose a new driving model composed of a perception module for “see and think” and a driving module for “behave”, and train it stepwise with multi-task perception-related basic knowledge and driving knowledge. Specifically, the segmentation map and depth map (pixel-level understanding of images) are treated as “what & where” and “how far” knowledge for tackling easier driving-related perception problems before generating the final control commands for the difficult driving task. The results of our experiments demonstrate the effectiveness of multi-task perception knowledge for better generalization and accident explanation ability. With our method, the average success rate of finishing the most difficult navigation tasks in the untrained city of the CoRL test surpasses the current benchmark method by 15 percent in trained weather and 20 percent in untrained weather. A demonstration video is available at: https://www.youtube.com/watch?v=N7ePnnZZwdE |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.11100v1 |
http://arxiv.org/pdf/1809.11100v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-self-driving-multi-task-knowledge-1 |
Repo | https://github.com/jackspp/rethinking-self-driving |
Framework | tf |
A Projection Pursuit Forest Algorithm for Supervised Classification
Title | A Projection Pursuit Forest Algorithm for Supervised Classification |
Authors | Natalia da Silva, Dianne Cook, Eun-Kyung Lee |
Abstract | This paper presents a new ensemble learning method for classification problems called projection pursuit random forest (PPF). PPF uses the PPtree algorithm introduced in Lee et al. (2013). In PPF, trees are constructed by splitting on linear combinations of randomly chosen variables. Projection pursuit is used to choose a projection of the variables that best separates the classes. Utilizing linear combinations of variables to separate classes takes the correlation between variables into account, which allows PPF to outperform a traditional random forest when the separation between groups occurs in combinations of variables. The method presented here can be used in multi-class problems and is implemented in an R (R Core Team, 2018) package, PPforest, which is available on CRAN, with development versions at https://github.com/natydasilva/PPforest. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07207v2 |
http://arxiv.org/pdf/1807.07207v2.pdf | |
PWC | https://paperswithcode.com/paper/a-projection-pursuit-forest-algorithm-for |
Repo | https://github.com/natydasilva/PPforest |
Framework | none |
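A hedged sketch of the node-splitting idea: pick a random subset of variables, compute a 1-D projection that separates two class groups (here simply the direction between class means, a crude stand-in for full projection pursuit), and split on the projected values. Names and the thresholding rule are illustrative assumptions, not the PPforest code.

```python
import numpy as np

def projection_split(X, y, n_vars=2, rng=None):
    """Sketch of a PPF-style node split for a binary node: choose a random
    variable subset, project onto the direction between the two class means,
    and threshold halfway between the projected class means.
    Returns (variable indices, projection direction, threshold)."""
    rng = rng or np.random.default_rng()
    vars_ = rng.choice(X.shape[1], size=n_vars, replace=False)
    Xs = X[:, vars_]
    mu0, mu1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    direction = mu1 - mu0
    direction /= np.linalg.norm(direction) + 1e-12
    proj = Xs @ direction
    threshold = 0.5 * (proj[y == 0].mean() + proj[y == 1].mean())
    return vars_, direction, threshold

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(1.5, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
print(projection_split(X, y, rng=rng))
```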