Learning Markov Blanket Bayesian Network for Big Data in MapReduce

Authors
Che, Yuxin [1]
Hong, Shaohui [1]
Zhang, Defu [1]
Zhang, Liming [2]
Affiliations
[1] Xiamen Univ, Dept Comp Sci, Xiamen 361005, Peoples R China
[2] Univ Macau, Dept Comp Informat Sci, Macau, Peoples R China
Keywords
Big Data; MapReduce; Bayesian Network; Markov Blanket; Data Mining; Classification
DOI
10.1109/ICTAI.2016.135
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Processing massive data is a challenging data mining task in the big data era, and MapReduce is an attractive model for meeting this challenge. This paper presents a new method to accelerate the learning of Markov blanket Bayesian networks (MBBN). For some complex datasets, the Markov blanket is a better type of Bayesian network model. The time and space cost of learning a Markov blanket is large and grows quickly as the number of variables increases. Its independence tests also require large amounts of data, which makes the problem harder. The statistical phase and the independence tests are parallelized in the MapReduce framework so that appropriate relations among the variables can be found. Computational results on four datasets show that MapReduce yields a speed-up. In particular, the Markov blanket learned in MapReduce achieves a higher accuracy rate than the naive Bayesian and tree-augmented naive Bayesian classifiers.
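The abstract does not specify how the statistical phase and the independence tests are split across mappers and reducers, so the following is only a minimal single-machine sketch of one plausible decomposition, not the authors' implementation. It assumes discrete variables, a Pearson chi-square statistic, and marginal (unconditional) pairwise tests only; the function names `map_pair_counts`, `reduce_pair_counts`, and `chi_square` are hypothetical. Each data partition is counted independently (the map step), the partial counts are summed (the reduce step), and the statistic is then computed from the merged counts.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_pair_counts(partition, pairs):
    """Map step (hypothetical): count joint values of each variable pair in one partition."""
    counts = Counter()
    for row in partition:
        for i, j in pairs:
            counts[(i, j, row[i], row[j])] += 1
    return counts


def reduce_pair_counts(left, right):
    """Reduce step (hypothetical): merge two partial count tables by summing per key."""
    return left + right


def chi_square(counts, i, j):
    """Pearson chi-square statistic for marginal independence of variables i and j.

    Assumption: the paper may use a different statistic or conditional tests.
    """
    table = {(x, y): n for (a, b, x, y), n in counts.items() if (a, b) == (i, j)}
    total = sum(table.values())
    if total == 0:
        return 0.0
    row_totals, col_totals = Counter(), Counter()
    for (x, y), n in table.items():
        row_totals[x] += n
        col_totals[y] += n
    stat = 0.0
    # Iterate over every cell, including cells with an observed count of zero.
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / total
            stat += (table.get((x, y), 0) - expected) ** 2 / expected
    return stat


# Toy data: variables 0 and 1 always agree; variable 2 varies independently of both.
partitions = [
    [(0, 0, 0), (0, 0, 1)],
    [(1, 1, 0), (1, 1, 1)],
]
pairs = list(combinations(range(3), 2))
partials = [map_pair_counts(p, pairs) for p in partitions]  # would run in parallel
totals = reduce(reduce_pair_counts, partials, Counter())
scores = {pair: chi_square(totals, *pair) for pair in pairs}
print(scores)
```

Because the counts are plain sums, partitions can be processed in any order and merged associatively, which is the property that makes the counting phase a natural fit for MapReduce. A larger statistic indicates stronger evidence of dependence between the two variables.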
Pages: 896-900
Number of pages: 5