Paper Group AWR 149
Graph Neural Networks: A Review of Methods and Applications. Deep contextualized word representations. Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet. DensePose: Dense Human Pose Estimation In The Wild. Optimal Transport for structured data with application on graphs. Machine Teaching for Inverse Reinforcement Learnin …
Graph Neural Networks: A Review of Methods and Applications
Title | Graph Neural Networks: A Review of Methods and Applications |
Authors | Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, Maosong Sun |
Abstract | Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all require a model to learn from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency trees of sentences and the scene graphs of images, is an important research topic that also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from their neighborhood with arbitrary depth. Although the primitive GNNs have been found difficult to train to a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on variants of graph neural networks, such as the graph convolutional network (GCN), graph attention network (GAT), and gated graph neural network (GGNN), have demonstrated ground-breaking performance on many of the tasks mentioned above. In this survey, we provide a detailed review of existing graph neural network models, systematically categorize the applications, and propose four open problems for future research. |
Tasks | |
Published | 2018-12-20 |
URL | https://arxiv.org/abs/1812.08434v4 |
https://arxiv.org/pdf/1812.08434v4.pdf | |
PWC | https://paperswithcode.com/paper/graph-neural-networks-a-review-of-methods-and |
Repo | https://github.com/NorthPolesky/GNNpaper |
Framework | none |
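As a rough illustration of the message-passing idea this survey covers, here is a minimal GCN-style layer in NumPy. The symmetric normalization, function name, and toy graph are illustrative assumptions, not code from the surveyed papers.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN-style message-passing step: aggregate normalized
    neighbor features, then apply a linear map and a ReLU.

    A : (n, n) adjacency matrix, H : (n, d_in) node features,
    W : (d_in, d_out) learnable weights.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    deg = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))  # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # propagate + ReLU

# Toy 3-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
H = np.random.randn(3, 2)
W = np.random.randn(2, 4)
print(gcn_layer(A, H, W).shape)  # (3, 4)
```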
Deep contextualized word representations
Title | Deep contextualized word representations |
Authors | Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer |
Abstract | We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus. We show that these representations can be easily added to existing models and significantly improve the state of the art across six challenging NLP problems, including question answering, textual entailment and sentiment analysis. We also present an analysis showing that exposing the deep internals of the pre-trained network is crucial, allowing downstream models to mix different types of semi-supervision signals. |
Tasks | Citation Intent Classification, Coreference Resolution, Language Modelling, Named Entity Recognition, Natural Language Inference, Question Answering, Semantic Role Labeling, Sentiment Analysis |
Published | 2018-02-15 |
URL | http://arxiv.org/abs/1802.05365v2 |
http://arxiv.org/pdf/1802.05365v2.pdf | |
PWC | https://paperswithcode.com/paper/deep-contextualized-word-representations |
Repo | https://github.com/kinimod23/NMT_Project |
Framework | none |
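A minimal sketch of the layer-mixing step the abstract alludes to: task-specific softmax weights over the biLM's layer representations, scaled by a scalar. The shapes and names below are assumptions for illustration, not the authors' code.

```python
import numpy as np

def elmo_mix(layer_reps, s_logits, gamma):
    """Collapse L biLM layers into one contextual vector per token:
    ELMo_t = gamma * sum_j softmax(s)_j * h_{t,j}.

    layer_reps : (L, T, d) hidden states for T tokens,
    s_logits   : (L,) task-specific layer weights (pre-softmax),
    gamma      : scalar task-specific scale.
    """
    s = np.exp(s_logits - s_logits.max())
    s = s / s.sum()                                      # softmax over layers
    return gamma * np.tensordot(s, layer_reps, axes=1)   # (T, d)

layers = np.random.randn(3, 5, 8)   # 3 layers, 5 tokens, 8-dim states
print(elmo_mix(layers, np.zeros(3), 1.0).shape)  # (5, 8)
```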
Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet
Title | Towards Principled Design of Deep Convolutional Networks: Introducing SimpNet |
Authors | Seyyed Hossein Hasanpour, Mohammad Rouhani, Mohsen Fayyaz, Mohammad Sabokrou, Ehsan Adeli |
Abstract | Major winning Convolutional Neural Networks (CNNs), such as VGGNet, ResNet, and DenseNet, include tens to hundreds of millions of parameters, which impose considerable computation and memory overheads. This limits their practical use in training and optimizing for real-world applications. On the contrary, light-weight architectures, such as SqueezeNet, are being proposed to address this issue. However, they mainly suffer from low accuracy, as they have compromised between processing power and efficiency. These inefficiencies mostly stem from following an ad-hoc design procedure. In this work, we discuss and propose several crucial design principles for an efficient architecture design and elaborate intuitions concerning different aspects of the design procedure. Furthermore, we introduce a new layer called SAF-pooling to improve the generalization power of the network while keeping it simple by choosing the best features. Based on such principles, we propose a simple architecture called SimpNet. We empirically show that SimpNet provides a good trade-off between computation/memory efficiency and accuracy solely based on these primitive but crucial principles. SimpNet outperforms deeper and more complex architectures such as VGGNet, ResNet, and WideResidualNet on several well-known benchmarks, while having 2 to 25 times fewer parameters and operations. We obtain state-of-the-art results (in terms of the balance between accuracy and the number of involved parameters) on standard datasets such as CIFAR10, CIFAR100, MNIST and SVHN. The implementations are available at https://github.com/Coderx7/SimpNet. |
Tasks | Image Classification |
Published | 2018-02-17 |
URL | http://arxiv.org/abs/1802.06205v1 |
http://arxiv.org/pdf/1802.06205v1.pdf | |
PWC | https://paperswithcode.com/paper/towards-principled-design-of-deep |
Repo | https://github.com/hexpheus/SimpNet-Tensorflow |
Framework | tf |
DensePose: Dense Human Pose Estimation In The Wild
Title | DensePose: Dense Human Pose Estimation In The Wild |
Authors | Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos |
Abstract | In this work, we establish dense correspondences between an RGB image and a surface-based representation of the human body, a task we refer to as dense human pose estimation. We first gather dense correspondences for 50K persons appearing in the COCO dataset by introducing an efficient annotation pipeline. We then use our dataset to train CNN-based systems that deliver dense correspondence ‘in the wild’, namely in the presence of background, occlusions and scale variations. We improve our training set’s effectiveness by training an ‘inpainting’ network that can fill in missing ground-truth values, and report clear improvements with respect to the best results that would be achievable in the past. We experiment with fully-convolutional networks and region-based models and observe a superiority of the latter; we further improve accuracy through cascading, obtaining a system that delivers highly accurate results in real time. Supplementary materials and videos are provided on the project page http://densepose.org |
Tasks | Pose Estimation |
Published | 2018-02-01 |
URL | http://arxiv.org/abs/1802.00434v1 |
http://arxiv.org/pdf/1802.00434v1.pdf | |
PWC | https://paperswithcode.com/paper/densepose-dense-human-pose-estimation-in-the |
Repo | https://github.com/facebookresearch/DensePose |
Framework | none |
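A hedged sketch of how per-pixel dense correspondence outputs could be decoded: classify each pixel into a body part (or background), then read the regressed UV coordinates for that part. The 24-part layout, array names, and background index are illustrative assumptions, not the released DensePose code.

```python
import numpy as np

def decode_densepose(part_logits, u_maps, v_maps):
    """Per-pixel decode sketch: pick the most likely part for each pixel,
    then gather the (u, v) surface coordinates regressed for that part.
    Index 0 is treated as background.

    part_logits : (P+1, H, W) scores for background + P parts,
    u_maps, v_maps : (P+1, H, W) per-part UV regressions in [0, 1].
    """
    parts = part_logits.argmax(axis=0)          # (H, W) part index per pixel
    h_idx, w_idx = np.indices(parts.shape)
    u = u_maps[parts, h_idx, w_idx]             # UV from the winning part
    v = v_maps[parts, h_idx, w_idx]
    u[parts == 0] = 0.0                         # mask out background pixels
    v[parts == 0] = 0.0
    return parts, u, v

logits = np.random.randn(25, 4, 4)              # 24 parts + background
parts, u, v = decode_densepose(logits, np.random.rand(25, 4, 4),
                               np.random.rand(25, 4, 4))
print(parts.shape, u.shape)  # (4, 4) (4, 4)
```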
Optimal Transport for structured data with application on graphs
Title | Optimal Transport for structured data with application on graphs |
Authors | Titouan Vayer, Laetitia Chapel, Rémi Flamary, Romain Tavenard, Nicolas Courty |
Abstract | This work considers the problem of computing distances between structured objects such as undirected graphs, seen as probability distributions in a specific metric space. We consider a new transportation distance (i.e. one that minimizes a total cost of transporting probability masses) that unveils the geometric nature of the structured-object space. Unlike the Wasserstein or Gromov-Wasserstein metrics, which focus solely and respectively on features (by considering a metric in the feature space) or structure (by seeing structure as a metric space), our new distance jointly exploits both kinds of information, and is consequently called Fused Gromov-Wasserstein (FGW). After discussing its properties and computational aspects, we show results on a graph classification task, where our method outperforms both graph kernels and deep graph convolutional networks. Further exploiting the metric properties of FGW, interesting geometric objects such as Fréchet means or barycenters of graphs are illustrated and discussed in a clustering context. |
Tasks | Graph Classification, Graph Clustering, Time Series |
Published | 2018-05-23 |
URL | https://arxiv.org/abs/1805.09114v3 |
https://arxiv.org/pdf/1805.09114v3.pdf | |
PWC | https://paperswithcode.com/paper/optimal-transport-for-structured-data-with |
Repo | https://github.com/tvayer/FGW |
Framework | none |
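To make the fused cost concrete, here is a NumPy evaluation of the FGW objective for a fixed coupling. It only scores a given transport plan (it does not solve the optimization), and the squared-loss choice and symbol names are assumptions following standard OT notation rather than the authors' implementation.

```python
import numpy as np

def fgw_cost(M, C1, C2, T, alpha=0.5):
    """Evaluate the Fused Gromov-Wasserstein objective for a fixed coupling T.

    M  : (n1, n2) pairwise feature distances between node attributes,
    C1 : (n1, n1), C2 : (n2, n2) intra-graph structure matrices,
    T  : (n1, n2) coupling whose marginals match the node weights,
    alpha : trade-off between the feature (0) and structure (1) terms.
    """
    feat = (1.0 - alpha) * np.sum(M * T)
    # L[i, j, k, l] = (C1[i, k] - C2[j, l])^2, built by broadcasting.
    L = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2
    struct = alpha * np.einsum('ijkl,ij,kl->', L, T, T)
    return feat + struct

n1, n2 = 3, 4
T = np.full((n1, n2), 1.0 / (n1 * n2))        # uniform (independent) coupling
print(fgw_cost(np.random.rand(n1, n2),
               np.random.rand(n1, n1),
               np.random.rand(n2, n2), T))
```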
Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications
Title | Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications |
Authors | Daniel S. Brown, Scott Niekum |
Abstract | Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization. However, despite much recent interest in IRL, little work has been done to understand the minimum set of demonstrations needed to teach a specific sequential decision-making task. We formalize the problem of finding maximally informative demonstrations for IRL as a machine teaching problem where the goal is to find the minimum number of demonstrations needed to specify the reward equivalence class of the demonstrator. We extend previous work on algorithmic teaching for sequential decision-making tasks by showing a reduction to the set cover problem which enables an efficient approximation algorithm for determining the set of maximally-informative demonstrations. We apply our proposed machine teaching algorithm to two novel applications: providing a lower bound on the number of queries needed to learn a policy using active IRL and developing a novel IRL algorithm that can learn more efficiently from informative demonstrations than a standard IRL approach. |
Tasks | Decision Making |
Published | 2018-05-20 |
URL | https://arxiv.org/abs/1805.07687v7 |
https://arxiv.org/pdf/1805.07687v7.pdf | |
PWC | https://paperswithcode.com/paper/machine-teaching-for-inverse-reinforcement |
Repo | https://github.com/dsbrown1331/machine-teaching-irl |
Framework | none |
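The set-cover reduction in the abstract suggests the standard greedy approximation. The sketch below is that generic greedy cover, not the authors' implementation; demonstrations are modelled abstractly as the sets of reward-space constraints they cover.

```python
def greedy_demo_selection(candidate_demos, constraints_to_cover):
    """Greedy set-cover sketch: repeatedly pick the demonstration whose
    constraints cover the most still-uncovered constraints, until the
    reward equivalence class is pinned down (or no demo adds coverage).

    candidate_demos      : dict mapping demo id -> set of covered constraints,
    constraints_to_cover : set of constraints defining the equivalence class.
    """
    uncovered = set(constraints_to_cover)
    chosen = []
    while uncovered:
        best = max(candidate_demos,
                   key=lambda d: len(candidate_demos[d] & uncovered))
        gain = candidate_demos[best] & uncovered
        if not gain:           # remaining constraints cannot be covered
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

demos = {'d1': {1, 2}, 'd2': {2, 3, 4}, 'd3': {4, 5}}
print(greedy_demo_selection(demos, {1, 2, 3, 4, 5}))  # ['d2', 'd1', 'd3']
```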
WikiHow: A Large Scale Text Summarization Dataset
Title | WikiHow: A Large Scale Text Summarization Dataset |
Authors | Mahnaz Koupaee, William Yang Wang |
Abstract | Sequence-to-sequence models have recently achieved state-of-the-art performance in summarization. However, few large-scale, high-quality datasets are available, and almost all of the available ones consist mainly of news articles with a specific writing style. Moreover, abstractive human-style systems involving description of the content at a deeper level require data with higher levels of abstraction. In this paper, we present WikiHow, a dataset of more than 230,000 article and summary pairs extracted and constructed from an online knowledge base written by different human authors. The articles span a wide range of topics and therefore represent a high diversity of styles. We evaluate the performance of existing methods on WikiHow to present its challenges and set some baselines for further improvement. |
Tasks | Text Summarization |
Published | 2018-10-18 |
URL | http://arxiv.org/abs/1810.09305v1 |
http://arxiv.org/pdf/1810.09305v1.pdf | |
PWC | https://paperswithcode.com/paper/wikihow-a-large-scale-text-summarization |
Repo | https://github.com/mahnazkoupaee/WikiHow-Dataset |
Framework | none |
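A hypothetical loading snippet for building article/summary pairs. The file name and the column names (`headline` as the concatenated step summaries, `text` as the article body) are assumptions based on the linked repo's release; verify them against the repo's README before relying on them.

```python
import pandas as pd

# Assumed file and column names from the dataset release; check the repo.
df = pd.read_csv("wikihowAll.csv").dropna(subset=["headline", "text"])

# Build (article, summary) pairs: the concatenated step headlines serve as
# the abstractive summary of the full article text.
pairs = [(row["text"].strip(), row["headline"].strip())
         for _, row in df.iterrows()]
print(len(pairs), "article/summary pairs")
```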
CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark
Title | CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark |
Authors | Jiefeng Li, Can Wang, Hao Zhu, Yihuan Mao, Hao-Shu Fang, Cewu Lu |
Abstract | Multi-person pose estimation is fundamental to many computer vision tasks and has made significant progress in recent years. However, few previous methods have explored the problem of pose estimation in crowded scenes, even though it remains challenging and inevitable in many scenarios. Moreover, current benchmarks cannot provide an appropriate evaluation for such cases. In this paper, we propose a novel and efficient method to tackle the problem of pose estimation in the crowd, along with a new dataset to better evaluate algorithms. Our model consists of two key components: joint-candidate single-person pose estimation (SPPE) and global maximum joints association. With multi-peak prediction for each joint and global association using a graph model, our method is robust to the inevitable interference in crowded scenes and very efficient in inference. The proposed method surpasses the state-of-the-art methods on the CrowdPose dataset by 5.2 mAP, and results on the MSCOCO dataset demonstrate the generalization ability of our method. Source code and dataset will be made publicly available. |
Tasks | Keypoint Detection, Multi-Person Pose Estimation, Pose Estimation |
Published | 2018-12-02 |
URL | http://arxiv.org/abs/1812.00324v2 |
http://arxiv.org/pdf/1812.00324v2.pdf | |
PWC | https://paperswithcode.com/paper/crowdpose-efficient-crowded-scenes-pose |
Repo | https://github.com/Jeff-sjtu/CrowdPose |
Framework | pytorch |
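A hedged sketch of the "multi-peak prediction" idea: instead of taking only the single argmax of a joint heatmap, keep every sufficiently strong local maximum as a candidate for the later global association step. The function name, threshold, and 4-neighbour peak test are illustrative assumptions, not the released code.

```python
import numpy as np

def joint_candidates(heatmap, threshold=0.3):
    """Return all local maxima above a score threshold as (row, col, score)
    candidates for one joint, rather than only the global peak."""
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    center = padded[1:-1, 1:-1]
    is_peak = ((center >= padded[:-2, 1:-1]) & (center >= padded[2:, 1:-1]) &
               (center >= padded[1:-1, :-2]) & (center >= padded[1:-1, 2:]) &
               (center > threshold))
    rows, cols = np.nonzero(is_peak)
    return [(r, c, heatmap[r, c]) for r, c in zip(rows, cols)]

hm = np.zeros((5, 5))
hm[1, 1] = 0.9
hm[3, 4] = 0.6
print(joint_candidates(hm))  # [(1, 1, 0.9), (3, 4, 0.6)]
```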
A Compact Embedding for Facial Expression Similarity
Title | A Compact Embedding for Facial Expression Similarity |
Authors | Raviteja Vemulapalli, Aseem Agarwala |
Abstract | Most of the existing work on automatic facial expression analysis focuses on discrete emotion recognition, or facial action unit detection. However, facial expressions do not always fall neatly into pre-defined semantic categories. Also, the similarity between expressions measured in the action unit space need not correspond to how humans perceive expression similarity. Different from previous work, our goal is to describe facial expressions in a continuous fashion using a compact embedding space that mimics human visual preferences. To achieve this goal, we collect a large-scale faces-in-the-wild dataset with human annotations in the form: Expressions A and B are visually more similar when compared to expression C, and use this dataset to train a neural network that produces a compact (16-dimensional) expression embedding. We experimentally demonstrate that the learned embedding can be successfully used for various applications such as expression retrieval, photo album summarization, and emotion recognition. We also show that the embedding learned using the proposed dataset performs better than several other embeddings learned using existing emotion or action unit datasets. |
Tasks | Action Unit Detection, Emotion Recognition, Facial Expression Recognition |
Published | 2018-11-27 |
URL | http://arxiv.org/abs/1811.11283v2 |
http://arxiv.org/pdf/1811.11283v2.pdf | |
PWC | https://paperswithcode.com/paper/a-compact-embedding-for-facial-expression |
Repo | https://github.com/GerardLiu96/FECNet |
Framework | tf |
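The embedding is trained from relative similarity judgments ("A and B are more alike than C"), which naturally suggests a triplet-style loss. The margin value and function below are a generic sketch of that idea, not the paper's exact formulation.

```python
import numpy as np

def triplet_loss(emb_a, emb_b, emb_c, margin=0.2):
    """Hinge loss encouraging the pair judged more similar (A, B) to lie
    closer in the embedding than either one is to the odd-one-out C.
    Each argument is a 16-dimensional expression embedding."""
    d_ab = np.sum((emb_a - emb_b) ** 2)
    d_ac = np.sum((emb_a - emb_c) ** 2)
    d_bc = np.sum((emb_b - emb_c) ** 2)
    return (max(0.0, d_ab - d_ac + margin) +
            max(0.0, d_ab - d_bc + margin))

a, b, c = (np.random.randn(16) for _ in range(3))
print(triplet_loss(a, b, c))
```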
nn-dependability-kit: Engineering Neural Networks for Safety-Critical Autonomous Driving Systems
Title | nn-dependability-kit: Engineering Neural Networks for Safety-Critical Autonomous Driving Systems |
Authors | Chih-Hong Cheng, Chung-Hao Huang, Georg Nührenberg |
Abstract | Can engineering neural networks be approached in a disciplined way similar to how engineers build software for civil aircraft? We present nn-dependability-kit, an open-source toolbox to support safety engineering of neural networks for autonomous driving systems. The rationale behind nn-dependability-kit is to consider a structured approach (via Goal Structuring Notation) to argue the quality of neural networks. In particular, the tool realizes recent scientific results including (a) novel dependability metrics for indicating sufficient elimination of uncertainties in the product life cycle, (b) a formal reasoning engine for ensuring that the generalization does not lead to undesired behaviors, and (c) runtime monitoring for reasoning about whether a decision of a neural network in operation is supported by prior similarities in the training data. A proprietary version of nn-dependability-kit has been used to improve the quality of a level-3 autonomous driving component developed by Audi for highway maneuvers. |
Tasks | Autonomous Driving |
Published | 2018-11-16 |
URL | https://arxiv.org/abs/1811.06746v2 |
https://arxiv.org/pdf/1811.06746v2.pdf | |
PWC | https://paperswithcode.com/paper/nn-dependability-kit-engineering-neural |
Repo | https://github.com/dependable-ai/nn-dependability-kit |
Framework | tf |
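The runtime-monitoring idea (flag inputs whose activation pattern was never seen on the training data) can be sketched generically as below. This is not the toolbox's actual API, just an illustrative boolean-abstraction monitor over one layer's activations.

```python
import numpy as np

class ActivationPatternMonitor:
    """Record binarized activation patterns (which neurons fire) of a chosen
    layer on training data; at runtime, flag patterns never seen during
    training as unsupported by prior similarity."""

    def __init__(self):
        self.seen = set()

    def _pattern(self, activations):
        return tuple((np.asarray(activations) > 0).astype(int))

    def record(self, activations):
        self.seen.add(self._pattern(activations))

    def is_supported(self, activations):
        return self._pattern(activations) in self.seen

monitor = ActivationPatternMonitor()
for _ in range(100):                      # "training" activations
    monitor.record(np.random.randn(8))
print(monitor.is_supported(np.random.randn(8)))
```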
Tuning Fairness by Marginalizing Latent Target Labels
Title | Tuning Fairness by Marginalizing Latent Target Labels |
Authors | Thomas Kehrenberg, Zexun Chen, Novi Quadrianto |
Abstract | Addressing fairness in machine learning models has recently attracted a lot of attention, as it will ensure the continued confidence of the general public in the deployment of machine learning systems. Here, we focus on mitigating the harm of a biased system that offers better outputs (e.g. loans, jobs) for certain groups than for others. We show that bias in the output can naturally be handled in probabilistic models by introducing a latent target output that modulates the likelihood function. This simple formulation has several advantages: first, it is a unified framework for several notions of fairness, such as demographic parity and equalized odds; second, it is expressed as a marginalization instead of a constrained problem; and third, it allows encoding our knowledge of what the bias in outputs should be. Practically, the latter translates to the ability to control the level of fairness by directly varying fairness target rates. In contrast, existing approaches rely on intermediate, arguably unintuitive control parameters such as a covariance threshold. |
Tasks | |
Published | 2018-10-12 |
URL | http://arxiv.org/abs/1810.05598v3 |
http://arxiv.org/pdf/1810.05598v3.pdf | |
PWC | https://paperswithcode.com/paper/interpretable-fairness-via-target-labels-in |
Repo | https://github.com/predictive-analytics-lab/UniversalGP |
Framework | tf |
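A toy numerical sketch of the marginalization described in the abstract: the observed (possibly biased) label y is modelled through a latent target label ȳ, so p(y | x, s) = Σ_ȳ p(y | ȳ, s) · p(ȳ | x), and the conditional table p(y | ȳ, s) is where target fairness rates can be encoded. The numbers below are made up for illustration.

```python
import numpy as np

def biased_label_prob(p_latent_pos, p_y1_given_latent_s):
    """p(y=1 | x, s) = sum over latent target label ybar of
    p(y=1 | ybar, s) * p(ybar | x)."""
    p_latent = np.array([1.0 - p_latent_pos, p_latent_pos])  # p(ybar | x)
    return float(p_latent @ p_y1_given_latent_s)              # marginalize ybar

# Hypothetical bias model for one group s: a deserving (ybar=1) candidate
# receives a positive observed label y=1 only 70% of the time.
p_y1_given_ybar_s = np.array([0.05, 0.70])   # [p(y=1|ybar=0,s), p(y=1|ybar=1,s)]
print(biased_label_prob(0.8, p_y1_given_ybar_s))  # 0.05*0.2 + 0.70*0.8 = 0.57
```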
Axiomatic Interpretability for Multiclass Additive Models
Title | Axiomatic Interpretability for Multiclass Additive Models |
Authors | Xuezhou Zhang, Sarah Tan, Paul Koch, Yin Lou, Urszula Chajewska, Rich Caruana |
Abstract | Generalized additive models (GAMs) are favored in many regression and binary classification problems because they are able to fit complex, nonlinear functions while still remaining interpretable. In the first part of this paper, we generalize a state-of-the-art GAM learning algorithm based on boosted trees to the multiclass setting, and show that this multiclass algorithm outperforms existing GAM learning algorithms and sometimes matches the performance of full complexity models such as gradient boosted trees. In the second part, we turn our attention to the interpretability of GAMs in the multiclass setting. Surprisingly, the natural interpretability of GAMs breaks down when there are more than two classes. Naive interpretation of multiclass GAMs can lead to false conclusions. Inspired by binary GAMs, we identify two axioms that any additive model must satisfy in order to not be visually misleading. We then develop a technique called Additive Post-Processing for Interpretability (API), that provably transforms a pre-trained additive model to satisfy the interpretability axioms without sacrificing accuracy. The technique works not just on models trained with our learning algorithm, but on any multiclass additive model, including multiclass linear and logistic regression. We demonstrate the effectiveness of API on a 12-class infant mortality dataset. |
Tasks | |
Published | 2018-10-22 |
URL | https://arxiv.org/abs/1810.09092v2 |
https://arxiv.org/pdf/1810.09092v2.pdf | |
PWC | https://paperswithcode.com/paper/interpretability-is-harder-in-the-multiclass |
Repo | https://github.com/microsoft/interpret |
Framework | none |
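The linked microsoft/interpret repo ships a boosted-tree GAM learner, so a minimal multiclass fit might look like the sketch below. Exact class names and behavior may differ across package versions; treat this as an assumed usage pattern and check the package documentation.

```python
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.datasets import load_iris

# Fit a boosted-tree GAM on a small 3-class problem.
X, y = load_iris(return_X_y=True)
ebm = ExplainableBoostingClassifier()
ebm.fit(X, y)

# Each feature's learned shape function can be inspected per class,
# which is exactly where the multiclass interpretability pitfalls arise.
global_explanation = ebm.explain_global()
print(type(global_explanation).__name__)
```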
Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints
Title | Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints |
Authors | Andrew Cotter, Maya Gupta, Heinrich Jiang, Nathan Srebro, Karthik Sridharan, Serena Wang, Blake Woodworth, Seungil You |
Abstract | Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals. We study the generalization performance for such constrained optimization problems, in terms of how well the constraints are satisfied at evaluation time, given that they are satisfied at training time. To improve generalization performance, we frame the problem as a two-player game where one player optimizes the model parameters on a training dataset, and the other player enforces the constraints on an independent validation dataset. We build on recent work in two-player constrained optimization to show that if one uses this two-dataset approach, then constraint generalization can be significantly improved. As we illustrate experimentally, this approach works not only in theory, but also in practice. |
Tasks | |
Published | 2018-06-29 |
URL | http://arxiv.org/abs/1807.00028v2 |
http://arxiv.org/pdf/1807.00028v2.pdf | |
PWC | https://paperswithcode.com/paper/training-well-generalizing-classifiers-for |
Repo | https://github.com/google-research/tensorflow_constrained_optimization |
Framework | tf |
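A minimal NumPy sketch of the two-dataset, two-player idea: the model player takes gradient steps on the training-set Lagrangian, while the constraint player updates its multiplier from the constraint violation measured on a separate validation set. It is a generic Lagrangian toy (a rate constraint on a linear model), not the linked library's API.

```python
import numpy as np

rng = np.random.default_rng(0)
x_tr, x_va = rng.normal(size=300), rng.normal(size=300)
y_tr = 0.5 * x_tr + rng.normal(scale=0.1, size=300)
target_rate = 0.3          # constraint: mean prediction must be >= 0.3

w, b, lam = 0.0, 0.0, 0.0
lr, lr_lam = 0.05, 0.1
for _ in range(2000):
    pred_tr = w * x_tr + b
    # Player 1 (model): gradient step on the train-set Lagrangian
    #   mean((pred - y)^2) + lam * (target_rate - mean(pred)).
    gw = np.mean(2 * (pred_tr - y_tr) * x_tr) - lam * np.mean(x_tr)
    gb = np.mean(2 * (pred_tr - y_tr)) - lam
    w, b = w - lr * gw, b - lr * gb
    # Player 2 (constraints): raise lam when the constraint is violated
    # on the *validation* set, which is the key two-dataset trick.
    violation = target_rate - np.mean(w * x_va + b)
    lam = max(0.0, lam + lr_lam * violation)

print(round(np.mean(w * x_va + b), 3))  # approaches the 0.3 target rate
```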
Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability
Title | Rethinking Self-driving: Multi-task Knowledge for Better Generalization and Accident Explanation Ability |
Authors | Zhihao Li, Toshiyuki Motoyoshi, Kazuma Sasaki, Tetsuya Ogata, Shigeki Sugano |
Abstract | Current end-to-end deep learning driving models have two problems: (1) poor generalization to unobserved driving environments when the diversity of the training driving dataset is limited, and (2) a lack of accident explanation ability when driving models do not work as expected. To tackle these two problems, rooted in the belief that knowledge of an associated easy task is beneficial for addressing a difficult task, we propose a new driving model composed of a perception module for “see and think” and a driving module for “behave”, and train it stepwise with multi-task perception-related basic knowledge and driving knowledge. Specifically, the segmentation map and depth map (pixel-level understanding of images) are treated as “what & where” and “how far” knowledge for tackling easier driving-related perception problems before generating the final control commands for the difficult driving task. The results of our experiments demonstrate the effectiveness of multi-task perception knowledge for better generalization and accident explanation ability. With our method, the average success rate of finishing the most difficult navigation tasks in the untrained city of the CoRL test surpasses the current benchmark method by 15 percent in trained weather and 20 percent in untrained weather. A demonstration video is available at: https://www.youtube.com/watch?v=N7ePnnZZwdE |
Tasks | |
Published | 2018-09-28 |
URL | http://arxiv.org/abs/1809.11100v1 |
http://arxiv.org/pdf/1809.11100v1.pdf | |
PWC | https://paperswithcode.com/paper/rethinking-self-driving-multi-task-knowledge-1 |
Repo | https://github.com/jackspp/rethinking-self-driving |
Framework | tf |
A Projection Pursuit Forest Algorithm for Supervised Classification
Title | A Projection Pursuit Forest Algorithm for Supervised Classification |
Authors | Natalia da Silva, Dianne Cook, Eun-Kyung Lee |
Abstract | This paper presents a new ensemble learning method for classification problems called projection pursuit random forest (PPF). PPF uses the PPtree algorithm introduced in Lee et al. (2013). In PPF, trees are constructed by splitting on linear combinations of randomly chosen variables. Projection pursuit is used to choose a projection of the variables that best separates the classes. Utilizing linear combinations of variables to separate classes takes the correlation between variables into account, which allows PPF to outperform a traditional random forest when the separation between groups occurs in combinations of variables. The method presented here can be used in multi-class problems and is implemented in an R (R Core Team, 2018) package, PPforest, which is available on CRAN, with development versions at https://github.com/natydasilva/PPforest. |
Tasks | |
Published | 2018-07-19 |
URL | http://arxiv.org/abs/1807.07207v2 |
http://arxiv.org/pdf/1807.07207v2.pdf | |
PWC | https://paperswithcode.com/paper/a-projection-pursuit-forest-algorithm-for |
Repo | https://github.com/natydasilva/PPforest |
Framework | none |
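A hedged sketch of the node-splitting idea: pick a random subset of variables, compute a 1-D projection that separates two class groups (here simply the direction between class means, a crude stand-in for full projection pursuit), and split on the projected values. Names and the thresholding rule are illustrative assumptions, not the PPforest code.

```python
import numpy as np

def projection_split(X, y, n_vars=2, rng=None):
    """Sketch of a PPF-style node split for a binary node: choose a random
    variable subset, project onto the direction between the two class means,
    and threshold halfway between the projected class means.
    Returns (variable indices, projection direction, threshold)."""
    rng = rng or np.random.default_rng()
    vars_ = rng.choice(X.shape[1], size=n_vars, replace=False)
    Xs = X[:, vars_]
    mu0, mu1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    direction = mu1 - mu0
    direction /= np.linalg.norm(direction) + 1e-12
    proj = Xs @ direction
    threshold = 0.5 * (proj[y == 0].mean() + proj[y == 1].mean())
    return vars_, direction, threshold

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(1.5, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
print(projection_split(X, y, rng=rng))
```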