MDL for Causal Inference on Discrete Data

Cited by: 19
Authors
Budhathoki, Kailash [1]
Vreeken, Jilles
Affiliations
[1] Max Planck Inst Informat, Saarland Informat Campus, Saarbrucken, Germany
Keywords
causal inference; MDL; discrete data; discovery
DOI
10.1109/ICDM.2017.87
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The algorithmic Markov condition states that the most likely causal direction between two random variables X and Y can be identified as the direction with the lowest Kolmogorov complexity. This notion is very powerful as it can detect any causal dependency that can be explained by a physical process. However, due to the halting problem, it is also not computable. In this paper we propose a computable instantiation that provably maintains the key aspects of the ideal. We propose to approximate Kolmogorov complexity via the Minimum Description Length (MDL) principle, using a score that is mini-max optimal with regard to the model class under consideration. This means that even in an adversarial setting, the score degrades gracefully, and we are still maximally able to detect dependencies between the marginal and the conditional distribution. As a proof of concept, we propose CISC, a linear-time algorithm for causal inference by stochastic complexity, for pairs of univariate discrete variables. Experiments show that CISC is highly accurate on synthetic, benchmark, as well as real-world data, outperforming the state of the art by a margin, and scales extremely well with regard to sample and domain sizes.
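The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the mini-max optimal score is the normalized maximum likelihood (NML) code length of a multinomial, that the domain size is the number of distinct observed values, and that the direction with the smaller total code length is preferred. The function names (`multinomial_regret`, `stochastic_complexity`, `conditional_complexity`, `infer_direction`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def multinomial_regret(n, k):
    """log2 of the NML normalizing sum for a k-valued multinomial, n samples."""
    if n == 0 or k <= 1:
        return 0.0
    # Two-category normalizing sum by direct summation (O(n)), with 0^0 = 1.
    c2 = 0.0
    for h in range(n + 1):
        log_term = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
        if h > 0:
            log_term += h * math.log(h / n)
        if h < n:
            log_term += (n - h) * math.log((n - h) / n)
        c2 += math.exp(log_term)
    # Known recurrence for multinomials: C(j+2) = C(j+1) + (n / j) * C(j).
    c_prev, c_cur = 1.0, c2
    for j in range(1, k - 1):
        c_prev, c_cur = c_cur, c_cur + (n / j) * c_prev
    return math.log2(c_cur)


def stochastic_complexity(values, k):
    """Maximum-likelihood code length in bits plus the NML regret."""
    n = len(values)
    counts = Counter(values)
    ml_bits = -sum(c * math.log2(c / n) for c in counts.values())
    return ml_bits + multinomial_regret(n, k)


def conditional_complexity(target, given, k_target):
    """Sum of stochastic complexities of target within each value of given."""
    groups = defaultdict(list)
    for g, t in zip(given, target):
        groups[g].append(t)
    return sum(stochastic_complexity(ts, k_target) for ts in groups.values())


def infer_direction(x, y):
    """Compare total code lengths of X->Y and Y->X (assumed decision rule)."""
    kx, ky = len(set(x)), len(set(y))  # assumption: domain = observed values
    x_to_y = stochastic_complexity(x, kx) + conditional_complexity(y, x, ky)
    y_to_x = stochastic_complexity(y, ky) + conditional_complexity(x, y, kx)
    if x_to_y < y_to_x:
        return "X->Y"
    if y_to_x < x_to_y:
        return "Y->X"
    return "undecided"
```

Calling `infer_direction(x, y)` on two equal-length sequences of discrete values returns one of three labels. The two-category sum takes time linear in the sample size and the recurrence takes time linear in the domain size, which is one plausible way to obtain the scaling behaviour the abstract mentions.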
Pages: 751-756
Number of pages: 6