Incorporating receiver operating characteristics into naive Bayes for unbalanced data classification

Cited by: 14
Authors
Kim, Taeheung [1 ]
Chung, Byung Do [2 ]
Lee, Jong-Seok [1 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Ind Engn, Suwon 16419, South Korea
[2] Yonsei Univ, Dept Informat & Ind Engn, 50 Yonsei Ro, Seoul 03722, South Korea
Keywords
Unbalanced classification; Weighted naive Bayes; Receiver operating characteristics; Area under ROC curve
DOI
10.1007/s00607-016-0483-z
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Naive Bayesian classification is widely used in data mining because of its simplicity and its robustness to missing values and irrelevant attributes. However, naive Bayes classifiers sometimes perform poorly because of their unrealistic assumptions that all attributes are equally important and conditionally independent of one another. In this research, we relax the former assumption by proposing a new attribute weighting method. The proposed method treats each attribute as a single classifier and measures its discriminating ability using the area under the ROC curve (AUC). Each AUC value is then used to weight the corresponding attribute. In addition, we reduce the complexity of the classification model by selecting only attributes with high AUC. Using 20 real datasets from the UC Irvine (UCI) Machine Learning Repository, we conduct numerical experiments showing that the proposed method improves on standard naive Bayes classification and on existing weighting methods.
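
The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes binary labels (0/1), categorical attributes, Laplace smoothing, AUC measured on the training data, and AUC values applied as exponents on the conditional likelihoods (a common form of weighted naive Bayes). The function names `auc`, `fit`, and `predict` are hypothetical.

```python
import math
from collections import Counter


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        return 0.5
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(X, y, alpha=1.0):
    """Fit a categorical naive Bayes model with one AUC-based weight per attribute.

    Assumes both classes (0 and 1) occur in y.
    """
    n_attrs = len(X[0])
    class_counts = Counter(y)
    prior = {c: class_counts[c] / len(y) for c in (0, 1)}

    # cond[j][c][v] is the Laplace-smoothed estimate of P(x_j = v | class c).
    cond = []
    for j in range(n_attrs):
        values = sorted({row[j] for row in X})
        table = {}
        for c in (0, 1):
            counts = Counter(row[j] for row, t in zip(X, y) if t == c)
            denom = class_counts[c] + alpha * len(values)
            table[c] = {v: (counts[v] + alpha) / denom for v in values}
        cond.append(table)

    # Treat each attribute as a one-attribute classifier: score every training
    # row by P(class 1 | x_j) and use the resulting AUC as the attribute weight.
    weights = []
    for j in range(n_attrs):
        scores = []
        for row in X:
            p1 = prior[1] * cond[j][1][row[j]]
            p0 = prior[0] * cond[j][0][row[j]]
            scores.append(p1 / (p1 + p0))
        weights.append(auc(scores, y))
    return prior, cond, weights


def predict(model, row):
    """Return the class with the larger weighted log-posterior.

    Assumes every attribute value in row was seen during training.
    """
    prior, cond, weights = model
    log_post = {}
    for c in (0, 1):
        log_post[c] = math.log(prior[c]) + sum(
            w * math.log(cond[j][c][row[j]]) for j, w in enumerate(weights))
    return max(log_post, key=log_post.get)
```

Under the same assumptions, the attribute selection step mentioned in the abstract could be approximated by dropping attributes whose weight falls below a chosen threshold before calling `predict`; the threshold itself would be a tuning choice that the abstract does not specify.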
Pages: 203-218
Number of pages: 16