January 25, 2020

Paper Group NAWR 28

MARS: Motion-Augmented RGB Stream for Action Recognition

Title MARS: Motion-Augmented RGB Stream for Action Recognition
Authors Nieves Crasto, Philippe Weinzaepfel, Karteek Alahari, Cordelia Schmid
Abstract Most state-of-the-art methods for action recognition consist of a two-stream architecture with 3D convolutions: an appearance stream for RGB frames and a motion stream for optical flow frames. Although combining flow with RGB improves the performance, the cost of computing accurate optical flow is high, and increases action recognition latency. This limits the usage of two-stream approaches in real-world applications requiring low latency. In this paper, we introduce two learning approaches to train a standard 3D CNN, operating on RGB frames, that mimics the motion stream, and as a result avoids flow computation at test time. First, by minimizing a feature-based loss compared to the Flow stream, we show that the network reproduces the motion stream with high fidelity. Second, to leverage both appearance and motion information effectively, we train with a linear combination of the feature-based loss and the standard cross-entropy loss for action recognition. We denote the stream trained using this combined loss as Motion-Augmented RGB Stream (MARS). As a single stream, MARS performs better than RGB or Flow alone, for instance with 72.7% accuracy on Kinetics compared to 72.0% and 65.6% with RGB and Flow streams respectively.
Tasks Action Classification, Action Recognition In Videos, Optical Flow Estimation, Temporal Action Localization
Published 2019-06-01
URL http://openaccess.thecvf.com/content_CVPR_2019/html/Crasto_MARS_Motion-Augmented_RGB_Stream_for_Action_Recognition_CVPR_2019_paper.html
PDF http://openaccess.thecvf.com/content_CVPR_2019/papers/Crasto_MARS_Motion-Augmented_RGB_Stream_for_Action_Recognition_CVPR_2019_paper.pdf
PWC https://paperswithcode.com/paper/mars-motion-augmented-rgb-stream-for-action
Repo https://github.com/craston/MARS
Framework pytorch
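
The combined objective described in the abstract (cross-entropy plus a feature-based loss against the Flow stream) can be sketched as below. This is a minimal illustration in plain Python, not the authors' implementation: the mean-squared-error form of the feature loss, the function names, and the weight `alpha` are assumptions made for the example.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def feature_mse(student_feats, teacher_feats):
    """Mean squared error between RGB-stream (student) and Flow-stream (teacher) features.

    Assumption: the abstract only says "feature-based loss"; MSE is used here for illustration.
    """
    n = len(student_feats)
    return sum((s - t) ** 2 for s, t in zip(student_feats, teacher_feats)) / n


def mars_style_loss(logits, label, student_feats, teacher_feats, alpha=1.0):
    """Linear combination of the two losses; alpha is a hypothetical weighting hyperparameter."""
    return cross_entropy(logits, label) + alpha * feature_mse(student_feats, teacher_feats)
```

With `alpha=0` the objective reduces to plain cross-entropy training of the RGB stream; dropping the cross-entropy term instead corresponds to the first approach in the abstract, which only mimics the Flow stream.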

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

Title SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies
Authors Seyed Kamyar Seyed Ghasemipour, Shixiang (Shane) Gu, Richard Zemel
Abstract Imitation Learning (IL) has been successfully applied to complex sequential decision-making problems where standard Reinforcement Learning (RL) algorithms fail. A number of recent methods extend IL to few-shot learning scenarios, where a meta-trained policy learns to quickly master new tasks using limited demonstrations. However, although Inverse Reinforcement Learning (IRL) often outperforms Behavioral Cloning (BC) in terms of imitation quality, most of these approaches build on BC due to its simple optimization objective. In this work, we propose SMILe, a scalable framework for Meta Inverse Reinforcement Learning (Meta-IRL) based on maximum entropy IRL, which can learn high-quality policies from few demonstrations. We examine the efficacy of our method on a variety of high-dimensional simulated continuous control tasks and observe that SMILe significantly outperforms Meta-BC. Furthermore, we observe that SMILe performs comparably or outperforms Meta-DAgger, while being applicable in the state-only setting and not requiring online experts. To our knowledge, our approach is the first efficient method for Meta-IRL that scales to the function approximator setting. For datasets and reproducing results please refer to https://github.com/KamyarGh/rl_swiss/blob/master/reproducing/smile_paper.md .
Tasks Continuous Control, Decision Making, Few-Shot Learning, Imitation Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/9002-smile-scalable-meta-inverse-reinforcement-learning-through-context-conditional-policies
PDF http://papers.nips.cc/paper/9002-smile-scalable-meta-inverse-reinforcement-learning-through-context-conditional-policies.pdf
PWC https://paperswithcode.com/paper/smile-scalable-meta-inverse-reinforcement
Repo https://github.com/KamyarGh/rl_swiss
Framework pytorch

Generalized Block-Diagonal Structure Pursuit: Learning Soft Latent Task Assignment against Negative Transfer

Title Generalized Block-Diagonal Structure Pursuit: Learning Soft Latent Task Assignment against Negative Transfer
Authors Zhiyong Yang, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang
Abstract In multi-task learning, a major challenge springs from a notorious issue known as negative transfer, which refers to the phenomenon that sharing the knowledge with dissimilar and hard tasks often results in a worsened performance. To circumvent this issue, we propose a novel multi-task learning method, which simultaneously learns latent task representations and a block-diagonal Latent Task Assignment Matrix (LTAM). Different from most of the previous work, pursuing the Block-Diagonal structure of LTAM (assigning latent tasks to output tasks) alleviates negative transfer via collaboratively grouping latent tasks and output tasks such that inter-group knowledge transfer and sharing is suppressed. This goal is challenging, since 1) our notion of Block-Diagonal Property extends the traditional notion for square matrices where the $i$-th row and the $i$-th column represent the same concept; 2) marginal constraints on rows and columns are also required for avoiding isolated latent/output tasks. Facing such challenges, we propose a novel regularizer by means of an equivalent spectral condition realizing this generalized block-diagonal property. Practically, we provide a relaxation scheme which improves the flexibility of the model. With the objective function given, we then propose an alternating optimization method, which not only tells how negative transfer is alleviated in our method but also reveals an interesting connection between our method and the optimal transport problem. Finally, the method is demonstrated on a simulation dataset, three real-world benchmark datasets and further applied to personalized attribute predictions.
Tasks Multi-Task Learning, Transfer Learning
Published 2019-12-01
URL http://papers.nips.cc/paper/8820-generalized-block-diagonal-structure-pursuit-learning-soft-latent-task-assignment-against-negative-transfer
PDF http://papers.nips.cc/paper/8820-generalized-block-diagonal-structure-pursuit-learning-soft-latent-task-assignment-against-negative-transfer.pdf
PWC https://paperswithcode.com/paper/generalized-block-diagonal-structure-pursuit
Repo https://github.com/joshuaas/GBDSP-NeurIPS19
Framework none
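
One way to read the generalized block-diagonal property for a rectangular assignment matrix is through the bipartite graph that links latent tasks (rows) to output tasks (columns): the matrix has k blocks when that graph splits into k connected components. The sketch below counts those components with a union-find structure. It is an illustration under that assumption only; the function name and the tolerance are hypothetical, and the paper's regularizer works with a spectral condition rather than an explicit component count.

```python
def count_groups(ltam, tol=1e-12):
    """Count connected components of the bipartite graph induced by a rectangular matrix.

    Rows are latent tasks, columns are output tasks; an entry larger than tol
    in absolute value is treated as an edge between its row and its column.
    """
    rows, cols = len(ltam), len(ltam[0])
    parent = list(range(rows + cols))  # nodes 0..rows-1 are rows, the rest are columns

    def find(x):
        # Find the component representative, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(rows):
        for j in range(cols):
            if abs(ltam[i][j]) > tol:
                parent[find(i)] = find(rows + j)

    return len({find(x) for x in range(rows + cols)})
```

Note that an all-zero row or column shows up as its own component here, which is the kind of isolated latent/output task the marginal constraints in the abstract are meant to rule out.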

Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks

Title Exploiting Spatial-Temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks
Authors Yujun Cai, Liuhao Ge, Jun Liu, Jianfei Cai, Tat-Jen Cham, Junsong Yuan, Nadia Magnenat Thalmann
Abstract Despite great progress in 3D pose estimation from single-view images or videos, it remains a challenging task due to the substantial depth ambiguity and severe self-occlusions. Motivated by the effectiveness of incorporating spatial dependencies and temporal consistencies to alleviate these issues, we propose a novel graph-based method to tackle the problem of 3D human body and 3D hand pose estimation from a short sequence of 2D joint detections. Particularly, domain knowledge about the human hand (body) configurations is explicitly incorporated into the graph convolutional operations to meet the specific demand of the 3D pose estimation. Furthermore, we introduce a local-to-global network architecture, which is capable of learning multi-scale features for the graph-based representations. We evaluate the proposed method on challenging benchmark datasets for both 3D hand pose estimation and 3D body pose estimation. Experimental results show that our method achieves state-of-the-art performance on both tasks.
Tasks 3D Pose Estimation, Hand Pose Estimation, Pose Estimation
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Cai_Exploiting_Spatial-Temporal_Relationships_for_3D_Pose_Estimation_via_Graph_Convolutional_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Cai_Exploiting_Spatial-Temporal_Relationships_for_3D_Pose_Estimation_via_Graph_Convolutional_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/exploiting-spatial-temporal-relationships-for
Repo https://github.com/vanoracai/Exploiting-Spatial-temporal-Relationships-for-3D-Pose-Estimation-via-Graph-Convolutional-Networks
Framework pytorch

Multitask learning and benchmarking with clinical time series data

Title Multitask learning and benchmarking with clinical time series data
Authors Hrayr Harutyunyan, Hrant Khachatrian, David C. Kale, Greg Ver Steeg, Aram Galstyan
Abstract Health care is one of the most exciting frontiers in data mining and machine learning. Successful adoption of electronic health records (EHRs) created an explosion in digital clinical data available for analysis, but progress in machine learning for healthcare research has been difficult to measure because of the absence of publicly available benchmark data sets. To address this problem, we propose four clinical prediction benchmarks using data derived from the publicly available Medical Information Mart for Intensive Care (MIMIC-III) database. These tasks cover a range of clinical problems including modeling risk of mortality, forecasting length of stay, detecting physiologic decline, and phenotype classification. We propose strong linear and neural baselines for all four tasks and evaluate the effect of deep supervision, multitask training and data-specific architectural modifications on the performance of neural models.
Tasks Computational Phenotyping, Length-of-Stay prediction, Mortality Prediction
Published 2019-06-17
URL https://www.nature.com/articles/s41597-019-0103-9
PDF https://www.nature.com/articles/s41597-019-0103-9.pdf
PWC https://paperswithcode.com/paper/analysis-open-published-17-june-2019
Repo https://github.com/yerevann/mimic3-benchmarks
Framework none

Efficiently Learning Fourier Sparse Set Functions

Title Efficiently Learning Fourier Sparse Set Functions
Authors Andisheh Amrollahi, Amir Zandieh, Michael Kapralov, Andreas Krause
Abstract Learning set functions is a key challenge arising in many domains, ranging from sketching graphs to black-box optimization with discrete parameters. In this paper we consider the problem of efficiently learning set functions that are defined over a ground set of size $n$ and that are sparse (say $k$-sparse) in the Fourier domain. This is a wide class, that includes graph and hypergraph cut functions, decision trees and more. Our central contribution is the first algorithm that allows learning functions whose Fourier support only contains low degree (say degree $d=o(n)$) polynomials using $O(k d \log n)$ sample complexity and runtime $O( kn \log^2 k \log n \log d)$. This implies that sparse graphs with $k$ edges can, for the first time, be learned from $O(k \log n)$ observations of cut values and in linear time in the number of vertices. Our algorithm can also efficiently learn (sums of) decision trees of small depth. The algorithm exploits techniques from the sparse Fourier transform literature and is easily implementable. Lastly, we also develop an efficient robust version of our algorithm and prove $\ell_2/\ell_2$ approximation guarantees without any statistical assumptions on the noise.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9648-efficiently-learning-fourier-sparse-set-functions
PDF http://papers.nips.cc/paper/9648-efficiently-learning-fourier-sparse-set-functions.pdf
PWC https://paperswithcode.com/paper/efficiently-learning-fourier-sparse-set
Repo https://github.com/andisheh94/Efficiently-Learning-Fourier-Sparse-Set-Functions
Framework none
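
To see why graph cut functions count as Fourier sparse, the sketch below evaluates the cut function of a small graph on every vertex subset and applies a normalized Walsh-Hadamard transform. This brute-force transform needs all 2^n function values, which is exactly what the paper's algorithm avoids; it is shown here only to make the sparsity visible. The function names and the example graph are assumptions made for the illustration.

```python
def cut_value(edges, subset_mask):
    """Number of edges with exactly one endpoint in the subset (encoded as a bitmask)."""
    return sum(
        1 for u, v in edges
        if ((subset_mask >> u) & 1) != ((subset_mask >> v) & 1)
    )


def walsh_hadamard(values):
    """Normalized Walsh-Hadamard transform of a list whose length is a power of two."""
    coeffs = list(values)
    n = len(coeffs)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = coeffs[i], coeffs[i + h]
                coeffs[i], coeffs[i + h] = a + b, a - b
        h *= 2
    return [c / n for c in coeffs]


def fourier_support(edges, num_vertices):
    """Map each frequency (bitmask) with a non-zero coefficient to that coefficient."""
    values = [cut_value(edges, mask) for mask in range(1 << num_vertices)]
    return {freq: c for freq, c in enumerate(walsh_hadamard(values)) if c != 0}
```

For a path on four vertices with edges (0,1), (1,2), (2,3), the support contains only the empty frequency and one degree-2 frequency per edge, so a graph with k edges gives a (k+1)-sparse, degree-2 spectrum.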

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

Title Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization
Authors Yujiao Shi, Liu Liu, Xin Yu, Hongdong Li
Abstract In this paper, we develop a new deep network to explicitly address the inherent differences between ground and aerial views. We observe that there exist approximate domain correspondences between ground and aerial images. Specifically, pixels lying on the same azimuth direction in an aerial image approximately correspond to a vertical image column in the ground view image. Thus, we propose a two-step approach to exploit this prior knowledge. The first step is to apply a regular polar transform to warp an aerial image such that its domain is closer to that of a ground-view panorama. Note that the polar transform, as a pure geometric transformation, is agnostic to scene content and hence cannot bring the two domains into full alignment. Then, we add a subsequent spatial-attention mechanism which further brings corresponding deep features closer in the embedding space. To improve the robustness of the feature representation, we introduce a feature aggregation strategy via learning multiple spatial embeddings. With the above two-step approach, we achieve more discriminative deep representations, making cross-view geo-localization more accurate. Our experiments on standard benchmark datasets show a significant performance boost, more than doubling the recall rate of the previous state of the art.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9199-spatial-aware-feature-aggregation-for-image-based-cross-view-geo-localization
PDF http://papers.nips.cc/paper/9199-spatial-aware-feature-aggregation-for-image-based-cross-view-geo-localization.pdf
PWC https://paperswithcode.com/paper/spatial-aware-feature-aggregation-for-image
Repo https://github.com/shiyujiao/SAFA.git
Framework none
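
The first step in the abstract, warping the aerial image with a polar transform so that each azimuth direction becomes one panorama column, can be sketched as follows. This is a generic nearest-neighbour polar warp in plain Python, not the repository's code: the square-image assumption, the choice of "up" as azimuth zero, and the mapping of the top panorama row to the largest radius are conventions picked for the example.

```python
import math


def polar_transform(aerial, out_h, out_w):
    """Warp a square aerial image (list of rows) into an out_h x out_w panorama-like grid.

    Column j of the output samples the azimuth 2*pi*j/out_w around the image centre;
    row 0 samples the largest radius, and the radius shrinks towards the last row.
    """
    size = len(aerial)
    center = (size - 1) / 2.0
    out = []
    for i in range(out_h):
        radius = center * (out_h - i) / out_h
        row = []
        for j in range(out_w):
            theta = 2.0 * math.pi * j / out_w
            src_r = center - radius * math.cos(theta)  # azimuth 0 points "up" in the aerial image
            src_c = center + radius * math.sin(theta)
            r = min(size - 1, max(0, int(round(src_r))))
            c = min(size - 1, max(0, int(round(src_c))))
            row.append(aerial[r][c])  # nearest-neighbour sampling
        out.append(row)
    return out
```

Because the warp depends only on pixel coordinates, it cannot account for scene content, which is why the abstract follows it with a learned spatial-attention step.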

A Boundary-aware Neural Model for Nested Named Entity Recognition

Title A Boundary-aware Neural Model for Nested Named Entity Recognition
Authors Changmeng Zheng, Yi Cai, Jingyun Xu, Ho-fung Leung, Guandong Xu
Abstract In natural language processing, it is common that many entities contain other entities inside them. Most existing works on named entity recognition (NER) only deal with flat entities but ignore nested ones. We propose a boundary-aware neural model for nested NER which leverages entity boundaries to predict entity categorical labels. Our model can locate entities precisely by detecting boundaries using sequence labeling models. Based on the detected boundaries, our model utilizes the boundary-relevant regions to predict entity categorical labels, which can decrease computation cost and relieve the error propagation problem in layered sequence labeling models. We introduce multitask learning to capture the dependencies of entity boundaries and their categorical labels, which helps to improve the performance of identifying entities. We conduct our experiments on the GENIA dataset and the experimental results demonstrate that our model outperforms other state-of-the-art methods.
Tasks Named Entity Recognition, Nested Named Entity Recognition
Published 2019-11-01
URL https://www.aclweb.org/anthology/D19-1034/
PDF https://www.aclweb.org/anthology/D19-1034
PWC https://paperswithcode.com/paper/a-boundary-aware-neural-model-for-nested
Repo https://github.com/thecharm/boundary-aware-nested-ner
Framework pytorch
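
The step from detected boundaries to candidate regions can be sketched as pairing every predicted begin token with every later predicted end token, so that nested spans fall out naturally. The tag scheme below ("B", "E", "O", one tag per token) and the function name are assumptions for the illustration; single-token entities and the region classifier itself are left out.

```python
def candidate_regions(boundary_tags):
    """Return (begin, end) token index pairs built from predicted boundary tags.

    Each "B" position is paired with every "E" position that comes after it;
    a downstream classifier (not shown) would label or reject each region.
    """
    begins = [i for i, tag in enumerate(boundary_tags) if tag == "B"]
    ends = [i for i, tag in enumerate(boundary_tags) if tag == "E"]
    return [(b, e) for b in begins for e in ends if e > b]
```

For the tags `["B", "B", "E", "O", "E"]` this yields four overlapping candidates, including the inner span (1, 2) nested inside the outer span (0, 4). Only regions delimited by detected boundaries are classified, rather than every possible span.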

Hierarchical Shot Detector

Title Hierarchical Shot Detector
Authors Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li
Abstract A single shot detector simultaneously predicts object categories and regression offsets of the default boxes. Despite its high efficiency, this structure has some inappropriate designs: (1) the classification result of the default box is improperly assigned to that of the regressed box during inference; (2) regressing only once is not good enough for accurate object detection. To solve the first problem, a novel reg-offset-cls (ROC) module is proposed. It contains three hierarchical steps: box regression, feature sampling location prediction, and regressed box classification with the features of offset locations. To further solve the second problem, a hierarchical shot detector (HSD) is proposed, which stacks two ROC modules and one feature enhanced module. The second ROC treats the regressed boxes and the feature sampling locations of features in the first ROC as the inputs. Meanwhile, the feature enhanced module injected between the two ROCs aims to extract the local and non-local context. Experiments on the MS COCO and PASCAL VOC datasets demonstrate the superiority of the proposed HSD. Without bells and whistles, HSD outperforms all one-stage methods at real-time speed.
Tasks Object Detection
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Cao_Hierarchical_Shot_Detector_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Cao_Hierarchical_Shot_Detector_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/hierarchical-shot-detector
Repo https://github.com/JialeCao001/HSD
Framework pytorch

Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network

Title Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network
Authors Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu, Wei Sun, Dan Pei
Abstract Industry devices (i.e., entities) such as server machines, spacecraft, engines, etc., are typically monitored with multivariate time series, whose anomaly detection is critical for an entity’s service quality management. However, due to the complex temporal dependence and stochasticity of multivariate time series, their anomaly detection remains a big challenge. This paper proposes OmniAnomaly, a stochastic recurrent neural network for multivariate time series anomaly detection that works robustly for various devices. Its core idea is to capture the normal patterns of multivariate time series by learning their robust representations with key techniques such as stochastic variable connection and planar normalizing flow, reconstruct input data by the representations, and use the reconstruction probabilities to determine anomalies. Moreover, for a detected entity anomaly, OmniAnomaly can provide interpretations based on the reconstruction probabilities of its constituent univariate time series. The evaluation experiments are conducted on two public datasets from aerospace and a new server machine dataset (collected and released by us) from an Internet company. OmniAnomaly achieves an overall F1-Score of 0.86 in three real-world datasets, significantly outperforming the best performing baseline method by 0.09. The interpretation accuracy for OmniAnomaly is up to 0.89.
Tasks Anomaly Detection, Time Series
Published 2019-07-25
URL https://dl.acm.org/citation.cfm?id=3330672
PDF https://netman.aiops.org/wp-content/uploads/2019/08/OmniAnomaly_camera-ready.pdf
PWC https://paperswithcode.com/paper/robust-anomaly-detection-for-multivariate
Repo https://github.com/smallcowbaby/OmniAnomaly
Framework tf
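
The detection and interpretation steps in the abstract can be sketched once per-dimension reconstruction log-probabilities are available. The snippet below assumes those values have already been produced by some model, sums them into a score, flags an anomaly when the score falls below a threshold, and ranks the univariate series by how poorly they were reconstructed. The function name, the fixed threshold, and `top_k` are hypothetical; how the threshold is chosen is not covered here.

```python
def detect_and_interpret(log_probs, threshold, top_k=2):
    """Score one time step from per-dimension reconstruction log-probabilities.

    Returns (is_anomaly, score, suspects), where suspects lists the indices of the
    top_k dimensions with the lowest reconstruction log-probability.
    """
    score = sum(log_probs)
    is_anomaly = score < threshold  # a low reconstruction probability suggests an anomaly
    ranked = sorted(range(len(log_probs)), key=lambda d: log_probs[d])
    return is_anomaly, score, ranked[:top_k]
```

For example, `detect_and_interpret([-0.5, -9.0, -1.0, -4.0], threshold=-10.0)` flags the step as anomalous and points at dimensions 1 and 3 as the most likely contributors.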

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

Title Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Authors Sawyer Birnbaum, Volodymyr Kuleshov, Zayd Enam, Pang Wei W. Koh, Stefano Ermon
Abstract Learning representations that accurately capture long-range dependencies in sequential inputs — including text, audio, and genomic data — is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) — a novel architectural component inspired by adaptive batch normalization and its extensions — that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.
Tasks Audio Super-Resolution, Super-Resolution, Text Classification
Published 2019-12-01
URL http://papers.nips.cc/paper/9217-temporal-film-capturing-long-range-sequence-dependencies-with-feature-wise-modulations
PDF http://papers.nips.cc/paper/9217-temporal-film-capturing-long-range-sequence-dependencies-with-feature-wise-modulations.pdf
PWC https://paperswithcode.com/paper/temporal-film-capturing-long-range-sequence-1
Repo https://github.com/kuleshov/audio-super-res
Framework tf

A Unifying Framework for Spectrum-Preserving Graph Sparsification and Coarsening

Title A Unifying Framework for Spectrum-Preserving Graph Sparsification and Coarsening
Authors Gecia Bravo Hermsdorff, Lee Gunderson
Abstract How might one "reduce" a graph? That is, generate a smaller graph that preserves the global structure at the expense of discarding local details? There has been extensive work on both graph sparsification (removing edges) and graph coarsening (merging nodes, often by edge contraction); however, these operations are currently treated separately. Interestingly, for a planar graph, edge deletion corresponds to edge contraction in its planar dual (and more generally, for a graphical matroid and its dual). Moreover, with respect to the dynamics induced by the graph Laplacian (e.g., diffusion), deletion and contraction are physical manifestations of two reciprocal limits: edge weights of $0$ and $\infty$, respectively. In this work, we provide a unifying framework that captures both of these operations, allowing one to simultaneously sparsify and coarsen a graph while preserving its large-scale structure. The limit of infinite edge weight is rarely considered, as many classical notions of graph similarity diverge. However, its algebraic, geometric, and physical interpretations are reflected in the Laplacian pseudoinverse $\mathbf{L}^\dagger$, which remains finite in this limit. Motivated by this insight, we provide a probabilistic algorithm that reduces graphs while preserving $\mathbf{L}^\dagger$, using an unbiased procedure that minimizes its variance. We compare our algorithm with several existing sparsification and coarsening algorithms using real-world datasets, and demonstrate that it more accurately preserves the large-scale structure.
Tasks Graph Similarity
Published 2019-12-01
URL http://papers.nips.cc/paper/8989-a-unifying-framework-for-spectrum-preserving-graph-sparsification-and-coarsening
PDF http://papers.nips.cc/paper/8989-a-unifying-framework-for-spectrum-preserving-graph-sparsification-and-coarsening.pdf
PWC https://paperswithcode.com/paper/a-unifying-framework-for-spectrum-preserving
Repo https://github.com/Gecia/A-Unifying-Framework-for-Spectrum-Preserving-Graph-Sparsification-and-Coarsening
Framework none

Efficient Convex Relaxations for Streaming PCA

Title Efficient Convex Relaxations for Streaming PCA
Authors Raman Arora, Teodor Vanislavov Marinov
Abstract We revisit two algorithms, matrix stochastic gradient (MSG) and $\ell_2$-regularized MSG (RMSG), that are instances of stochastic gradient descent (SGD) on a convex relaxation to principal component analysis (PCA). These algorithms have been shown to outperform Oja’s algorithm, empirically, in terms of the iteration complexity, and to have runtime comparable with Oja’s. However, these findings are not supported by existing theoretical results. While the iteration complexity bound for $\ell_2$-RMSG was recently shown to match that of Oja’s algorithm, its theoretical efficiency was left as an open problem. In this work, we give improved bounds on per iteration cost of mini-batched variants of both MSG and $\ell_2$-RMSG and arrive at an algorithm with total computational complexity matching that of Oja’s algorithm.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9236-efficient-convex-relaxations-for-streaming-pca
PDF http://papers.nips.cc/paper/9236-efficient-convex-relaxations-for-streaming-pca.pdf
PWC https://paperswithcode.com/paper/efficient-convex-relaxations-for-streaming
Repo https://github.com/tmarino2/Streaming_PCA
Framework none
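
For context, the baseline the abstract compares against, Oja's algorithm, can be sketched in a few lines for the top principal component. This is the classical Oja update with renormalization, not MSG or RMSG from the paper; the step size, the seed, and the function name are assumptions chosen for the example.

```python
import math
import random


def oja_top_component(samples, dim, step=0.01, seed=0):
    """Estimate the leading principal direction from a stream of samples using Oja's rule."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    for x in samples:
        proj = sum(xi * wi for xi, wi in zip(x, w))          # x^T w
        w = [wi + step * proj * xi for wi, xi in zip(w, x)]  # stochastic gradient step
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]                            # project back to the unit sphere
    return w
```

Each update touches one sample and costs time linear in the dimension, which is the per-iteration cost that the abstract's improved bounds for mini-batched MSG and RMSG are measured against.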

View-Consistent 4D Light Field Superpixel Segmentation

Title View-Consistent 4D Light Field Superpixel Segmentation
Authors Numair Khan, Qian Zhang, Lucas Kasser, Henry Stone, Min H. Kim, James Tompkin
Abstract Many 4D light field processing applications rely on superpixel segmentations, for which occlusion-aware view consistency is important. Yet, existing methods often enforce consistency by propagating clusters from a central view only, which can lead to inconsistent superpixels for non-central views. Our proposed approach combines an occlusion-aware angular segmentation in horizontal and vertical EPI spaces with an occlusion-aware clustering and propagation step across all views. Qualitative video demonstrations show that this helps to remove flickering and inconsistent boundary shapes versus the state-of-the-art approach, and quantitative metrics reflect these findings with improved boundary accuracy and view consistency scores.
Tasks
Published 2019-10-01
URL http://openaccess.thecvf.com/content_ICCV_2019/html/Khan_View-Consistent_4D_Light_Field_Superpixel_Segmentation_ICCV_2019_paper.html
PDF http://openaccess.thecvf.com/content_ICCV_2019/papers/Khan_View-Consistent_4D_Light_Field_Superpixel_Segmentation_ICCV_2019_paper.pdf
PWC https://paperswithcode.com/paper/view-consistent-4d-light-field-superpixel
Repo https://github.com/brownvc/lightfieldsuperpixels
Framework none

Constraint-based Causal Structure Learning with Consistent Separating Sets

Title Constraint-based Causal Structure Learning with Consistent Separating Sets
Authors Honghao Li, Vincent Cabeli, Nadir Sella, Herve Isambert
Abstract We consider constraint-based methods for causal structure learning, such as the PC algorithm or any PC-derived algorithms whose first step consists in pruning a complete graph to obtain an undirected graph skeleton, which is subsequently oriented. All constraint-based methods perform this first step of removing dispensable edges, iteratively, whenever a separating set and corresponding conditional independence can be found. Yet, constraint-based methods lack robustness over sampling noise and are prone to uncover spurious conditional independences in finite datasets. In particular, there is no guarantee that the separating sets identified during the iterative pruning step remain consistent with the final graph. In this paper, we propose a simple modification of PC and PC-derived algorithms so as to ensure that all separating sets identified to remove dispensable edges are consistent with the final graph, thus enhancing the explainability of constraint-based methods. It is achieved by repeating the constraint-based causal structure learning scheme, iteratively, while searching for separating sets that are consistent with the graph obtained at the previous iteration. Ensuring the consistency of separating sets can be done at a limited complexity cost, through the use of block-cut tree decomposition of graph skeletons, and is found to increase their validity in terms of actual d-separation. It also significantly improves the sensitivity of constraint-based methods while retaining good overall structure learning performance. Finally and foremost, ensuring sepset consistency improves the interpretability of constraint-based models for real-life applications.
Tasks
Published 2019-12-01
URL http://papers.nips.cc/paper/9573-constraint-based-causal-structure-learning-with-consistent-separating-sets
PDF http://papers.nips.cc/paper/9573-constraint-based-causal-structure-learning-with-consistent-separating-sets.pdf
PWC https://paperswithcode.com/paper/constraint-based-causal-structure-learning
Repo https://github.com/honghaoli42/consistent_pcalg
Framework none
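
One reading of "consistent with the final graph" is that every node in the separating set of a removed edge X-Y should lie on some path between X and Y in the learned skeleton. The brute-force check below illustrates that reading on small graphs. It is an assumption-laden sketch, not the paper's procedure: the abstract mentions a block-cut tree decomposition for efficiency, whereas this example simply enumerates simple paths, and the function names are hypothetical.

```python
def on_simple_path(adj, x, y, z):
    """Return True if z lies on some simple path from x to y in the undirected graph adj.

    adj maps each node to the set of its neighbours; z is assumed to differ from x and y.
    """
    def dfs(node, visited, seen_z):
        if node == y:
            return seen_z
        for nxt in adj.get(node, ()):
            if nxt not in visited:
                if dfs(nxt, visited | {nxt}, seen_z or nxt == z):
                    return True
        return False

    return dfs(x, {x}, False)


def sepset_is_consistent(adj, x, y, sepset):
    """Check that every node of the separating set sits on a path between x and y."""
    return all(on_simple_path(adj, x, y, z) for z in sepset)
```

In a skeleton X - A - Y with an extra node B attached only to A, the set {A} passes the check for the pair (X, Y), while {B} does not, since no simple path from X to Y goes through B.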