Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation

Cited by: 0
Authors
Teshima, Takeshi [1, 2]
Sugiyama, Masashi [1, 2]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Tokyo, Japan
[2] RIKEN, Wako, Saitama, Japan
Keywords
DISCOVERY; NETWORKS
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Causal graphs (CGs) are compact representations of the knowledge of the data generating processes behind the data distributions. When a CG is available, e.g., from the domain knowledge, we can infer the conditional independence (CI) relations that should hold in the data distribution. However, it is not straightforward how to incorporate this knowledge into predictive modeling. In this work, we propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the CI encoded in a CG for supervised machine learning. We theoretically justify the proposed method by providing an excess risk bound indicating that the proposed method suppresses overfitting by reducing the apparent complexity of the predictor hypothesis class. Using real-world data with CGs provided by domain experts, we experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
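The general idea of using a conditional independence relation for data augmentation can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it assumes a single hypothetical relation "x is independent of y given z" with a discrete z, and the function name augment_by_ci is made up for this example. Under that assumption, any x value and any y value observed in the same z stratum form a plausible sample, so the sketch recombines them.

```python
from collections import defaultdict
from itertools import product


def augment_by_ci(rows):
    """Recombine (x, y, z) rows, assuming x is independent of y given z.

    Hypothetical helper for illustration only: rows are grouped by the
    discrete value z, and every x seen in a group is paired with every
    y seen in the same group.
    """
    strata = defaultdict(list)
    for x, y, z in rows:
        strata[z].append((x, y))

    augmented = []
    for z, pairs in strata.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        # Within one stratum, x and y are assumed exchangeable across samples.
        augmented.extend((x, y, z) for x, y in product(xs, ys))
    return augmented


rows = [(1, 10, "a"), (2, 20, "a"), (3, 30, "b")]
augmented = augment_by_ci(rows)
```

The augmented set keeps every original row and adds cross-combinations only inside a stratum, so values are never mixed across different z values. Any supervised learner could then be trained on the augmented rows, which is what makes this kind of approach model-agnostic.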
Pages: 86-96
Number of pages: 11