A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery

Cited by: 0
Authors
Huegle, Johannes [1]
Hagedorn, Christopher [1]
Schlosser, Rainer [1]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst, Potsdam, Germany
Keywords
Non-Parametric CI Testing; Causal Discovery; Mixed Data; Bayesian Networks; Variables; Discrete
DOI
10.1007/978-3-031-43412-9_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Testing for Conditional Independence (CI) is a fundamental task for causal discovery but is particularly challenging in mixed discrete-continuous data. In this context, inadequate assumptions or discretization of continuous variables reduce the CI test's statistical power, which leads to incorrectly learned causal structures. In this work, we present a non-parametric CI test leveraging k-nearest neighbor (kNN) methods that are adaptive to mixed discrete-continuous data. In particular, a kNN-based conditional mutual information estimator serves as the test statistic, and the p-value is calculated using a kNN-based local permutation scheme. We prove the CI test's statistical validity and power in mixed discrete-continuous data, which yields consistency when used in constraint-based causal discovery. An extensive evaluation on synthetic and real-world data shows that the proposed CI test outperforms state-of-the-art approaches in the accuracy of CI testing and causal discovery, particularly in settings with low sample sizes.
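
The two ingredients named in the abstract, a kNN-based conditional mutual information (CMI) statistic and a kNN-based local permutation p-value, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The estimator loosely follows the general pattern of published kNN CMI estimators for mixed data (tie-aware neighbor counts under the max-norm), and the permutation step is a simplified neighbor-swap scheme. All function names, parameters, and the toy data generator are hypothetical, chosen here for illustration only.

```python
import random

EULER_GAMMA = 0.5772156649015329

def digamma_int(n):
    """Digamma at a positive integer n: psi(n) = -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))

def max_dist(a, b):
    """Max-norm (Chebyshev) distance between two equal-length tuples."""
    return max(abs(u - v) for u, v in zip(a, b))

def cmi_knn(x, y, z, k=5):
    """Illustrative kNN estimate of I(X;Y|Z); x, y, z are lists of tuples."""
    n = len(x)
    total = 0.0
    for i in range(n):
        dx = [max_dist(x[i], x[j]) for j in range(n)]
        dy = [max_dist(y[i], y[j]) for j in range(n)]
        dz = [max_dist(z[i], z[j]) for j in range(n)]
        joint = [max(dx[j], dy[j], dz[j]) for j in range(n) if j != i]
        rho = sorted(joint)[k - 1]  # distance to the k-th nearest neighbor
        # With ties at distance zero (discrete data), count all tied points.
        k_i = sum(1 for d in joint if d <= rho) if rho == 0 else k
        n_xz = sum(1 for j in range(n) if j != i and max(dx[j], dz[j]) <= rho)
        n_yz = sum(1 for j in range(n) if j != i and max(dy[j], dz[j]) <= rho)
        n_z = sum(1 for j in range(n) if j != i and dz[j] <= rho)
        total += (digamma_int(k_i) - digamma_int(n_xz)
                  - digamma_int(n_yz) + digamma_int(n_z))
    return max(total / n, 0.0)  # CMI is non-negative, so clip at zero

def local_permutation(z, k_perm, rng):
    """For each sample i, pick the index of a not-yet-used near neighbor in Z."""
    n = len(z)
    order = list(range(n))
    rng.shuffle(order)
    used = set()
    perm = [0] * n
    for i in order:
        others = [j for j in range(n) if j != i]
        rng.shuffle(others)  # random tie-breaking, relevant for discrete Z
        others.sort(key=lambda j: max_dist(z[i], z[j]))  # stable sort
        neighbors = others[:k_perm]
        choice = neighbors[-1]  # fallback: reuse a neighbor if all are taken
        for j in neighbors:
            if j not in used:
                choice = j
                break
        perm[i] = choice
        used.add(choice)
    return perm

def ci_test(x, y, z, k=5, k_perm=5, n_perm=39, seed=0):
    """Return (statistic, p-value) for the null hypothesis X independent of Y given Z."""
    rng = random.Random(seed)
    observed = cmi_knn(x, y, z, k)
    exceed = 0
    for _ in range(n_perm):
        perm = local_permutation(z, k_perm, rng)
        x_perm = [x[j] for j in perm]  # X shuffled only among Z-neighbors
        if cmi_knn(x_perm, y, z, k) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)

def make_demo_data(n=80, seed=1):
    """Toy mixed data: discrete Z, continuous X and Y, with Y depending on X."""
    rng = random.Random(seed)
    z = [(float(rng.randrange(3)),) for _ in range(n)]
    x = [(zi[0] + rng.gauss(0, 1),) for zi in z]
    y = [(xi[0] + rng.gauss(0, 0.1),) for xi in x]
    return x, y, z
```

Calling `ci_test(*make_demo_data())` on this toy data, where Y depends directly on X, should give a small p-value. Permuting X only among samples with similar Z keeps the X-Z relationship roughly intact while breaking any remaining X-Y link, which is what makes the permuted statistics a plausible null distribution. The sketch runs in quadratic time per evaluation, so it is only suitable for small samples.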
Pages: 541-558
Page count: 18