Handling numeric attributes when comparing Bayesian network classifiers: does the discretization method matter?

Cited by: 0
Authors
M. Julia Flores
José A. Gámez
Ana M. Martínez
José M. Puerta
Affiliation
[1] University of Castilla-La Mancha, Computer Systems Department, Intelligent Systems & Data Mining (SIMD), I3A
Source
Applied Intelligence | 2011 / Vol. 34
Keywords
Discretization; Bayesian classifiers; AODE; Naive Bayes
DOI
Not available
Abstract
Within the framework of Bayesian networks (BNs), most classifiers assume that the variables involved are of a discrete nature, but this assumption rarely holds in real problems. Despite the loss of information that discretization entails, it is a direct, easy-to-use mechanism that can offer some benefits: sometimes discretization improves the run time of certain algorithms; it reduces the value set and hence the noise that might be present in the data; and some Bayesian methods can only deal with discrete variables. Hence, even though there are many ways to deal with continuous variables other than discretization, it is still commonly used. This paper presents a study of the impact of using different discretization strategies on a set of representative BN classifiers, with a significant sample consisting of 26 datasets. For this comparison, we have chosen Naive Bayes (NB) together with several semi-Naive Bayes classifiers: Tree-Augmented Naive Bayes (TAN), k-Dependence Bayesian classifier (KDB), Aggregating One-Dependence Estimators (AODE) and Hybrid AODE (HAODE). We have also included an augmented Bayesian network created by using a hill climbing algorithm (BNHC). With this comparison we analyse to what extent the type of discretization method affects classifier performance in terms of accuracy and bias-variance decomposition. Our main conclusion is that even if a discretization method produces different results for a particular dataset, it does not really have an effect when classifiers are being compared. That is, given a set of datasets, accuracy values might vary but the classifier ranking is generally maintained. This is a very useful outcome: assuming that the type of discretization applied is not decisive, future experiments can be d times faster, d being the number of discretization methods considered.
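As a rough illustration of what "different discretization strategies" can mean in practice, the sketch below bins one numeric attribute in two common unsupervised ways: equal-width and equal-frequency intervals. The abstract does not say which methods the study used, so the choice of these two, the function names, and the toy values are all assumptions made for illustration only.

```python
from bisect import bisect_right


def equal_width_cuts(values, k):
    """Return k-1 cut points splitting [min, max] into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls in."""
    return [bisect_right(cuts, v) for v in values]


# Toy attribute: hypothetical values, not taken from the paper's datasets.
toy = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0, 50.0, 100.0]

ew = discretize(toy, equal_width_cuts(toy, 4))
ef = discretize(toy, equal_frequency_cuts(toy, 4))
print("equal width:    ", ew)
print("equal frequency:", ef)
```

On this skewed toy attribute, equal-width binning places most values in the first interval, while equal-frequency binning spreads them evenly. A classifier trained on either encoding therefore sees different discrete inputs, which is the kind of variation whose effect on classifier ranking the study examines.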
Pages: 372-385
Number of pages: 13
Related papers
38 in total
  • [1] Handling numeric attributes when comparing Bayesian network classifiers: does the discretization method matter?
    Flores, M. Julia
    Gamez, Jose A.
    Martinez, Ana M.
    Puerta, Jose M.
    APPLIED INTELLIGENCE, 2011, 34 (03) : 372 - 385
  • [2] Analyzing the Impact of the Discretization Method When Comparing Bayesian Classifiers
    Julia Flores, M.
    Gamez, Jose A.
    Martinez, Ana M.
    Puerta, Jose M.
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS, 2010, 6096 : 570 - 579
  • [3] Comparing Bayesian network classifiers
    Cheng, J
    Greiner, R
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 1999, : 101 - 108
  • [4] A hybrid discretization method for naive Bayesian classifiers
    Wong, Tzu-Tsung
    PATTERN RECOGNITION, 2012, 45 (06) : 2321 - 2325
  • [5] A Hellinger-based discretization method for numeric attributes in classification learning
    Lee, Chang-Hwan
    KNOWLEDGE-BASED SYSTEMS, 2007, 20 (04) : 419 - 425
  • [6] HANDLING MISSING FEATURES IN MAXIMUM MARGIN BAYESIAN NETWORK CLASSIFIERS
    Tschiatschek, Sebastian
    Mutsam, Nikolaus
    Pernkopf, Franz
    2012 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2012,
  • [7] A Method for Discretization of Continuous Attributes in Network Performance Measurement
    Ma, Lirong
    Li, Xuefeng
    Su, Zhuang
    2014 IEEE WORKSHOP ON ELECTRONICS, COMPUTER AND APPLICATIONS, 2014, : 88 - 91
  • [8] Comparing Single and Multiple Bayesian Classifiers Approaches for Network Intrusion Detection
    Khor, Kok-Chin
    Ting, Choo-Yee
    Amnuaisuk, Somnuk-Phon
    2010 SECOND INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATIONS: ICCEA 2010, PROCEEDINGS, VOL 2, 2010, : 325 - 329
  • [9] Comparing case-based Bayesian network and recursive Bayesian multi-net classifiers
    Santos, E
    Hussein, A
    IC-AI '04 & MLMTA'04 , VOL 1 AND 2, PROCEEDINGS, 2004, : 627 - 633
  • [10] A Bayesian method for comparing and combining binary classifiers in the absence of a gold standard
    Keith, Jonathan M.
    Davey, Christian M.
    Boyd, Sarah E.
    BMC BIOINFORMATICS, 2012, 13