Incorporating receiver operating characteristics into naive Bayes for unbalanced data classification

Cited by: 14
Authors
Kim, Taeheung [1 ]
Chung, Byung Do [2 ]
Lee, Jong-Seok [1 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Ind Engn, Suwon 16419, South Korea
[2] Yonsei Univ, Dept Informat & Ind Engn, 50 Yonsei Ro, Seoul 03722, South Korea
Keywords
Unbalanced classification; Weighted naive Bayes; Receiver operating characteristics; Area under ROC curve;
DOI
10.1007/s00607-016-0483-z
Chinese Library Classification
TP301 [Theory and Methods];
Subject classification code
081202;
Abstract
Naive Bayesian classification has been widely used in data mining because of its simplicity and its robustness to missing values and irrelevant attributes. However, naive Bayes classifiers sometimes perform poorly because of their unrealistic assumptions that all attributes are equally important and conditionally independent of each other. In this research, we dispense with the former assumption by proposing a new attribute weighting method. The proposed method treats each attribute as a single classifier and measures its discriminating ability using the area under an ROC curve (AUC). Each AUC value is then used to weight the corresponding attribute. In addition, we try to reduce the complexity of classification models by selecting high-AUC attributes. Using 20 real datasets from the machine learning repository at UC Irvine (UCI), we conduct a numerical experiment to show that the proposed method improves on standard naive Bayes classification and on existing weighting methods.
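The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary labels, integer-coded categorical attributes, Laplace smoothing, and a weighted log-posterior of the form log P(c) + sum_i w_i log P(x_i | c), where w_i is the training-set AUC of attribute i used alone. The function names (`auc_score`, `cond_prob`, `fit_auc_weighted_nb`, `predict_log_posterior`) and the toy data are hypothetical.

```python
import numpy as np

def auc_score(scores, labels):
    """AUC via the Mann-Whitney statistic; tied pairs count as 0.5."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def cond_prob(col, y, n_values, alpha=1.0):
    """Laplace-smoothed P(x = v | c), returned with shape (2, n_values)."""
    table = np.zeros((2, n_values))
    for c in (0, 1):
        counts = np.bincount(col[y == c], minlength=n_values)
        table[c] = (counts + alpha) / (counts.sum() + alpha * n_values)
    return table

def fit_auc_weighted_nb(X, y, alpha=1.0):
    """Fit per-attribute likelihood tables and one AUC weight per attribute."""
    prior = np.array([(y == 0).mean(), (y == 1).mean()])
    tables, weights = [], []
    for j in range(X.shape[1]):
        n_values = int(X[:, j].max()) + 1
        t = cond_prob(X[:, j], y, n_values, alpha)
        # Single-attribute classifier: P(c = 1 | x_j) from attribute j alone.
        joint = t * prior[:, None]
        score = (joint[1] / joint.sum(axis=0))[X[:, j]]
        weights.append(auc_score(score, y))  # assumption: the weight is the raw AUC
        tables.append(t)
    return prior, tables, np.array(weights)

def predict_log_posterior(model, X):
    """Unnormalized weighted log-posterior with shape (n_samples, 2).

    Assumes every attribute value in X was also seen during fitting.
    """
    prior, tables, weights = model
    logp = np.tile(np.log(prior), (X.shape[0], 1))
    for j, (t, w) in enumerate(zip(tables, weights)):
        logp += w * np.log(t[:, X[:, j]]).T
    return logp

# Toy unbalanced data (hypothetical): attribute 0 is informative, attribute 1 is not.
X = np.array([[0, 0], [0, 1], [0, 0], [0, 1], [0, 0], [1, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

model = fit_auc_weighted_nb(X, y)
weights = model[2]
logp = predict_log_posterior(model, np.array([[0, 0]]))
pred = int(logp.argmax(axis=1)[0])
```

In this sketch the uninformative attribute receives a weight of 0.5, which is the AUC of a random classifier, so it contributes less to the posterior than the informative one. The attribute selection step mentioned in the abstract could be approximated by keeping only attributes whose weight exceeds some cutoff; the abstract does not state how that cutoff is chosen.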
Pages: 203 - 218
Page count: 16