A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery

Cited by: 6
Authors
Srivastava, Ankit [1]
Chockalingam, Sriram P. [2]
Aluru, Srinivas [1]
Affiliations
[1] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Inst Data Engn & Sci, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA);
Keywords
Random variables; Scalability; Probability distribution; Markov processes; Machine learning algorithms; Bayes methods; Software algorithms; Bayesian networks; constraint-based learning; parallel machine learning; gene networks; reproducibility; ALGORITHM; INDUCTION; INFERENCE; MODELS;
DOI
10.1109/TPDS.2023.3244135
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Bayesian networks (BNs) are a widely used graphical model in machine learning. As learning the structure of BNs is NP-hard, high-performance computing methods are necessary for constructing large-scale networks. In this article, we present a parallel framework to scale BN structure learning algorithms to tens of thousands of variables. Our framework is applicable to learning algorithms that rely on the discovery of Markov blankets (MBs) as an intermediate step. We demonstrate the applicability of our framework by parallelizing three different algorithms: Grow-Shrink (GS), Incremental Association MB (IAMB), and Interleaved IAMB (Inter-IAMB). Our implementations are available as part of an open-source software called ramBLe, and are able to construct BNs from real data sets with tens of thousands of variables and thousands of observations in less than a minute on 1024 cores, with a speedup of up to 845X and 82.5% efficiency. Furthermore, we demonstrate using simulated data sets that our proposed parallel framework can scale to BNs of even higher dimensionality. Our implementations were selected for the reproducibility challenge component of the 2021 student cluster competition (SCC'21), which tasked undergraduate teams from around the world with reproducing the results that we obtained using the implementations. We discuss details of the challenge and the results of the experiments conducted by the top teams in the competition. The results of these experiments indicate that our key results are reproducible, despite the use of completely different data sets and experiment infrastructure, and validate the scalability of our implementations.
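The efficiency figure quoted in the abstract is consistent with the standard definition of parallel efficiency, speedup divided by the number of cores. A minimal Python check, assuming that definition is the one behind the reported numbers (the function name is illustrative and is not part of ramBLe):

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return parallel efficiency as speedup / cores (standard definition)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: up to 845X speedup on 1024 cores.
efficiency = parallel_efficiency(845, 1024)
print(f"{efficiency:.1%}")  # 82.5%
```

The result, 845 / 1024 ≈ 0.825, matches the 82.5% efficiency stated in the abstract.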
Pages: 1699 - 1715
Page count: 17
Related papers
50 in total
  • [1] A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery
    Srivastava, Ankit
    Chockalingam, Sriram P.
    Aluru, Srinivas
    [J]. PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20), 2020,
  • [2] Critique of "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery" by SCC Team From Peking University
    Si, Jiaqi
    Guo, Junyi
    Hao, Zhewen
    He, Wenyang
    Li, Ruihan
    Pan, Yueyang
    Fu, Zhenxin
    Fan, Chun
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (06) : 1720 - 1722
  • [3] Critique of "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery" by SCC Team From Tsinghua University
    Cao, Juncheng
    Rong, Kaiyuan
    Zhai, Mingshu
    Song, Zeyu
    Ren, Yanyu
    Zhu, Yuxi
    Han, Wentao
    Zhai, Jidong
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (06) : 1723 - 1726
  • [4] Critique of "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery" by SCC Team From ShanghaiTech University
    Li, Guancheng
    Cao, Songhui
    Zhao, Chuyi
    Zhang, Siyuan
    Ji, Yuchen
    Jing, Haotian
    Li, Zecheng
    Cheng, Jiajun
    Yang, Yiwei
    Yin, Shu
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (06) : 1716 - 1719
  • [5] Critique of: "A Parallel Framework for Constraint-Based Bayesian Network Learning via Markov Blanket Discovery" by SCC Team From UC San Diego
    Gupta, Arunav
    Ge, John
    Li, John
    Kong, Zihao
    He, Kaiwen
    Mikhailov, Matthew
    Chin, Bryan
    Li, Xiaochen
    Apodaca, Max
    Rodriguez, Paul
    Tatineni, Mahidar
    Thomas, Mary
    Bhatt, Santosh
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (06) : 1727 - 1730
  • [6] Bayesian Network Constraint-Based Structure Learning Algorithms: Parallel and Optimized Implementations in the bnlearn R Package
    Scutari, Marco
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2017, 77 (02) : 1 - 20
  • [7] Constraint-Based Querying for Bayesian Network Exploration
    Babaki, Behrouz
    Guns, Tias
    Nijssen, Siegfried
    De Raedt, Luc
    [J]. ADVANCES IN INTELLIGENT DATA ANALYSIS XIV, 2015, 9385 : 13 - 24
  • [8] Bayesian network parameter learning using constraint-based data extension method
    Xinxin Ru
    Xiaoguang Gao
    Yangyang Wang
    Xiaohan Liu
    [J]. Applied Intelligence, 2023, 53 : 9958 - 9977
  • [9] Bayesian network parameter learning using constraint-based data extension method
    Ru, Xinxin
    Gao, Xiaoguang
    Wang, Yangyang
    Liu, Xiaohan
    [J]. APPLIED INTELLIGENCE, 2023, 53 (09) : 9958 - 9977
  • [10] MCMC learning of Bayesian network models by Markov blanket decomposition
    Riggelsen, C
    [J]. MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 329 - 340