Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation

Cited by: 0
Authors
Teshima, Takeshi [1, 2]
Sugiyama, Masashi [1, 2]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Tokyo, Japan
[2] RIKEN, Wako, Saitama, Japan
Keywords
DISCOVERY; NETWORKS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Causal graphs (CGs) are compact representations of knowledge about the data-generating processes behind data distributions. When a CG is available, e.g., from domain knowledge, we can infer the conditional independence (CI) relations that should hold in the data distribution. However, it is not obvious how to incorporate this knowledge into predictive modeling. In this work, we propose a model-agnostic data augmentation method that exploits the prior knowledge of the CI relations encoded in a CG for supervised machine learning. We theoretically justify the proposed method with an excess risk bound indicating that it suppresses overfitting by reducing the apparent complexity of the predictor hypothesis class. Using real-world data with CGs provided by domain experts, we show experimentally that the proposed method improves prediction accuracy, especially in the small-data regime.
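As an illustration of the general idea only, and not the procedure proposed in the paper, the sketch below shows one simple way a known CI relation could drive data augmentation. It assumes a single relation x ⊥ y | z with a discrete z, so that within each stratum of z the joint distribution of (x, y) factorizes and observed x and y values can be recombined freely. The function name `augment_by_conditional_independence` and the dictionary keys are hypothetical.

```python
from collections import defaultdict


def augment_by_conditional_independence(rows, x_key, y_key, z_key):
    """Recombine x and y values within each stratum of z.

    Assumption (not taken from the paper): x is conditionally independent
    of y given z, and z is discrete. Under that assumption,
    p(x, y | z) = p(x | z) * p(y | z), so every pairing of an observed x
    with an observed y from the same z-stratum is a plausible sample.
    Only the three named columns are kept in the output.
    """
    # Group the observed rows by the value of the conditioning variable.
    strata = defaultdict(list)
    for row in rows:
        strata[row[z_key]].append(row)

    augmented = []
    for z_value, group in strata.items():
        # Pair every x in the stratum with every y in the same stratum.
        # The original pairs are included (when both come from one row).
        for row_x in group:
            for row_y in group:
                augmented.append(
                    {x_key: row_x[x_key], y_key: row_y[y_key], z_key: z_value}
                )
    return augmented


rows = [
    {"x": 1, "y": 10, "z": "a"},
    {"x": 2, "y": 20, "z": "a"},
    {"x": 3, "y": 30, "z": "b"},
]
augmented = augment_by_conditional_independence(rows, "x", "y", "z")
```

A stratum with n rows yields n² recombined rows, so this naive version grows quickly. It also never mixes values across strata, which is what keeps the augmented data consistent with the assumed CI relation.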
Pages: 86-96
Number of pages: 11