Efficient parameter learning for Bayesian Network classifiers following the Apache Spark Dataframes paradigm

Authors
Akarepis, Ioannis [1]
Bompotas, Agorakis [1]
Makris, Christos [1]
Affiliations
[1] Univ Patras, Comp Engn & Informat Dept, Univ Campus, Patras 26504, Achaia, Greece
Keywords
Machine learning; Bayesian Network classifiers; Big data; Apache Spark; MapReduce
DOI
10.1007/s10115-024-02096-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The volume of information grows rapidly every year, so more modern approaches are required to process it efficiently. Distributed systems such as Apache Spark offer one such approach, and as a result more and more machine learning models are being adapted to use distributed logic. In this paper, we propose a classification model based on Bayesian Networks (BNs) that exploits the distributed environment of Apache Spark through the Dataframes paradigm. The model can take any user-provided directed acyclic graph (DAG) that describes the dependencies between the features of a dataset and estimate the parameters of the conditional probability distribution associated with each node in the graph, which it then uses to make accurate predictions. Moreover, in contrast to the majority of implementations, which can handle only discrete features, it also handles continuous features efficiently by calculating the Gaussian probability density function.
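As a rough illustration of the idea in the abstract, the following is a minimal single-machine sketch in plain Python. It is not the authors' Spark Dataframes implementation. It assumes a DAG given as a dictionary from each node to its parents, assumes every parent is discrete, uses Laplace smoothing for discrete nodes, and fits one Gaussian per parent configuration for continuous nodes. All function names, the `alpha` parameter, and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def gaussian_pdf(x, mean, var):
    """Gaussian probability density function."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit(rows, parents, continuous, alpha=1.0):
    """Estimate parameters for each node of a user-provided DAG.

    rows: list of dicts; parents: {node: [parent, ...]};
    continuous: set of continuous node names (assumption: parents are discrete).
    """
    params = {}
    for node, pa in parents.items():
        # Group the node's values by the configuration of its parents.
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[p] for p in pa)].append(row[node])
        params[node] = {}
        if node in continuous:
            for key, vals in groups.items():
                mean = sum(vals) / len(vals)
                var = sum((v - mean) ** 2 for v in vals) / len(vals)
                params[node][key] = (mean, max(var, 1e-9))  # floor avoids zero variance
        else:
            domain = sorted({row[node] for row in rows})
            for key, vals in groups.items():
                counts = Counter(vals)
                denom = len(vals) + alpha * len(domain)
                params[node][key] = {v: (counts[v] + alpha) / denom for v in domain}
    return params


def log_joint(row, params, parents, continuous):
    """Log of the joint probability of a fully specified row under the DAG."""
    total = 0.0
    for node, pa in parents.items():
        entry = params[node].get(tuple(row[p] for p in pa))
        if entry is None:
            return float("-inf")  # simplification: unseen parent configuration
        if node in continuous:
            mean, var = entry
            total += math.log(max(gaussian_pdf(row[node], mean, var), 1e-300))
        else:
            total += math.log(entry.get(row[node], 1e-12))
    return total


def predict(row, params, parents, continuous, target, classes):
    """Pick the class value that maximizes the joint probability."""
    return max(
        classes,
        key=lambda c: log_joint({**row, target: c}, params, parents, continuous),
    )


# Toy data (made up): a naive-Bayes-shaped DAG with one discrete
# and one continuous feature, both depending on the class label.
rows = [
    {"label": "a", "color": "red", "size": 1.0},
    {"label": "a", "color": "red", "size": 1.2},
    {"label": "a", "color": "blue", "size": 0.8},
    {"label": "b", "color": "blue", "size": 3.0},
    {"label": "b", "color": "blue", "size": 3.2},
    {"label": "b", "color": "red", "size": 2.8},
]
parents = {"label": [], "color": ["label"], "size": ["label"]}
continuous = {"size"}

params = fit(rows, parents, continuous)
print(predict({"color": "red", "size": 1.1}, params, parents, continuous, "label", ["a", "b"]))
```

In a distributed setting the grouping step would presumably be expressed as grouped aggregations over a Dataframe rather than Python loops; the sketch only shows which quantities are estimated per node.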
Pages: 4437-4461
Number of pages: 25