C4.5 or Naive Bayes: A Discriminative Model Selection Approach

Cited by: 3
Authors
Zhang, Lungan [1]
Jiang, Liangxiao [1]
Li, Chaoqun [2]
Affiliations
[1] China Univ Geosci, Dept Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] China Univ Geosci, Dept Math, Wuhan 430074, Hubei, Peoples R China
Keywords
Model selection; C4.5; naive Bayes; nearest neighbor
DOI
10.1007/978-3-319-44778-0_49
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
C4.5 and naive Bayes (NB) are two of the top 10 data mining algorithms, thanks to their simplicity, effectiveness, and efficiency. It is well known that NB performs very well on some domains but poorly on others that involve correlated features. C4.5, on the other hand, typically works better than NB on such domains. To combine their advantages and avoid their disadvantages, many approaches, such as model insertion and model combination, have been proposed. A model insertion approach such as NBTree inserts NB into each leaf of the built decision tree. A model combination approach such as C4.5-NB builds C4.5 and NB on a training dataset independently and then combines their predictions for an unseen instance. In this paper, we take a new view and propose a discriminative model selection approach. In detail, at training time, C4.5 and NB are built on a training dataset independently, and the more reliable of the two is recorded for each training instance. At test time, for each test instance, we first find its nearest neighbor and then use the model recorded as more reliable for that neighbor to predict the test instance's class label. We denote the proposed algorithm as C4.5‖NB. C4.5‖NB retains the interpretability of C4.5 and NB, but significantly outperforms C4.5, NB, NBTree, and C4.5-NB.
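The selection scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: a one-level decision stump stands in for C4.5, "most reliable" is taken to mean the model that assigns the higher estimated probability to the true class of a training instance, and the nearest neighbor is found with a simple attribute-mismatch (overlap) distance on nominal attributes. All function names (`train_nb`, `train_stump`, `fit_selector`, `predict`) are hypothetical.

```python
from collections import Counter, defaultdict

def train_nb(X, y):
    """Categorical naive Bayes with Laplace smoothing; returns a probability function."""
    classes, prior = sorted(set(y)), Counter(y)
    counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    values = defaultdict(set)       # attribute index -> values seen in training
    for xi, yi in zip(X, y):
        for j, v in enumerate(xi):
            counts[(j, yi)][v] += 1
            values[j].add(v)

    def proba(x):
        scores = {}
        for c in classes:
            p = (prior[c] + 1) / (len(y) + len(classes))
            for j, v in enumerate(x):
                p *= (counts[(j, c)][v] + 1) / (prior[c] + len(values[j]))
            scores[c] = p
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}
    return proba

def train_stump(X, y):
    """One-level decision tree, used here only as a stand-in for C4.5 (assumption)."""
    classes, overall = sorted(set(y)), Counter(y)

    def leaves_for(j):
        leaves = defaultdict(Counter)
        for xi, yi in zip(X, y):
            leaves[xi[j]][yi] += 1
        return leaves

    def errors(leaves):
        return sum(sum(c.values()) - max(c.values()) for c in leaves.values())

    # Split on the attribute whose leaves misclassify the fewest training instances.
    best = min(range(len(X[0])), key=lambda j: errors(leaves_for(j)))
    leaves = leaves_for(best)

    def proba(x):
        dist = leaves.get(x[best], overall)  # unseen value: fall back to class counts
        n = sum(dist.values())
        return {c: (dist[c] + 1) / (n + len(classes)) for c in classes}
    return proba

def fit_selector(X, y, trainers):
    """Train every model, then record per training instance the index of the model
    that gives its true class the highest probability (assumed reliability measure)."""
    models = [train(X, y) for train in trainers]
    chosen = [max(range(len(models)), key=lambda m: models[m](xi)[yi])
              for xi, yi in zip(X, y)]
    return models, chosen

def predict(x, X, models, chosen):
    """Find the nearest training instance and predict with the model recorded for it."""
    nearest = min(range(len(X)),
                  key=lambda i: sum(a != b for a, b in zip(X[i], x)))
    p = models[chosen[nearest]](x)
    return max(p, key=p.get)

# Toy data, invented for illustration only.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
     ("rainy", "mild"), ("sunny", "hot"), ("rainy", "mild")]
y = ["no", "no", "yes", "yes", "no", "yes"]
models, chosen = fit_selector(X, y, [train_stump, train_nb])
print(predict(("sunny", "mild"), X, models, chosen))
```

Keeping one recorded model index per training instance means each prediction is still produced by a single, inspectable model, which is consistent with the interpretability claim in the abstract. A full version would replace the stump with an actual C4.5 tree.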
Pages: 419-426
Number of pages: 8