Learning naive Bayes for probability estimation by feature selection

Cited by: 0
Authors
Jiang, Liangxiao [1]
Zhang, Harry [2]
Affiliations
[1] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] Univ New Brunswick, Fac Comp Sci, Fredericton, NB E3B 5A3, Canada
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Naive Bayes is a well-known, effective, and efficient classification algorithm, but its probability estimation is poor. In many applications, however, accurate probability estimation is required in order to make optimal decisions. Probability estimation is usually measured by conditional log likelihood (CLL). Several learning algorithms have recently been proposed to extend naive Bayes for high CLL, such as ERL [8,9] and BNC-2P [10]; unfortunately, their computational complexity is relatively high. Is there a simple but effective and efficient approach to improving the probability estimation of naive Bayes? In this paper, we propose to use feature selection for this purpose. More precisely, a search process is conducted to select a subset of attributes, and a naive Bayes classifier is then built on the selected attribute set. Feature selection has already been applied successfully to naive Bayes, achieving significant improvement in classification accuracy. Among the feature selection algorithms for naive Bayes, the selective Bayesian classifier (SBC) of Langley et al. [13] demonstrates good performance. In this paper, we first study the performance of SBC in terms of probability estimation, and then propose an improved algorithm, SBC-CLL, in which the CLL score is used directly for attribute selection instead of classification accuracy. Our experiments show that both SBC and SBC-CLL achieve significant improvement over naive Bayes in probability estimation measured by CLL, and that SBC-CLL substantially outperforms SBC. Our work provides an efficient and surprisingly effective approach to improving the probability estimation of naive Bayes.
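For reference, a sketch of the CLL measure in its conventional form: given a classifier B and a data set D of N labelled instances, where the i-th instance has attribute values a_{i1}, ..., a_{im} and true class label c_i,

```latex
\mathrm{CLL}(B \mid D) = \sum_{i=1}^{N} \log P_{B}\!\left(c_i \mid a_{i1}, a_{i2}, \ldots, a_{im}\right)
```

A higher (less negative) CLL means the classifier assigns more posterior probability mass to the true class labels, which is what accurate probability estimation for decision making requires.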
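To make the selection loop concrete, here is a minimal sketch, not the paper's implementation: it assumes a greedy forward search in which each candidate attribute is scored by the training-set CLL of a discrete naive Bayes with Laplace smoothing. All names (`nb_log_posterior`, `sbc_cll`), the stopping rule, and the use of the training set for scoring are illustrative assumptions.

```python
import numpy as np

def nb_log_posterior(X_tr, y_tr, X_ev, attrs, n_classes):
    """Log P(c | selected attributes) under a discrete naive Bayes.

    X_tr / X_ev hold integer-coded attribute values; `attrs` lists the
    column indices the model may use.  Laplace smoothing is applied to
    both the class prior and the conditional probability tables.
    """
    counts = np.bincount(y_tr, minlength=n_classes) + 1.0
    log_post = np.tile(np.log(counts / counts.sum()), (len(X_ev), 1))
    for j in attrs:
        n_vals = int(max(X_tr[:, j].max(), X_ev[:, j].max())) + 1
        cond = np.ones((n_classes, n_vals))        # Laplace pseudo-counts
        np.add.at(cond, (y_tr, X_tr[:, j]), 1.0)   # count (class, value) pairs
        log_cond = np.log(cond / cond.sum(axis=1, keepdims=True))
        log_post += log_cond[:, X_ev[:, j]].T      # add log P(a_j | c)
    # normalise each row so it holds log P(c | a) rather than log P(c, a)
    m = log_post.max(axis=1, keepdims=True)
    return log_post - m - np.log(np.exp(log_post - m).sum(axis=1, keepdims=True))

def cll(X_tr, y_tr, X_ev, y_ev, attrs, n_classes):
    """Conditional log likelihood: sum_i log P(c_i | a_i) on the eval set."""
    lp = nb_log_posterior(X_tr, y_tr, X_ev, attrs, n_classes)
    return lp[np.arange(len(y_ev)), y_ev].sum()

def sbc_cll(X, y, n_classes):
    """Greedy forward attribute selection scored directly by CLL."""
    selected, remaining = [], set(range(X.shape[1]))
    best = cll(X, y, X, y, selected, n_classes)    # CLL with no attributes
    while remaining:
        scores = {j: cll(X, y, X, y, selected + [j], n_classes)
                  for j in remaining}
        j_star = max(scores, key=scores.get)
        if scores[j_star] <= best:                 # no attribute improves CLL
            break
        best = scores[j_star]
        selected.append(j_star)
        remaining.remove(j_star)
    return selected
```

Swapping the CLL score for classification accuracy in `sbc_cll` recovers the flavour of the original SBC; the paper's actual search direction, stopping criterion, and evaluation protocol may differ from this sketch.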
Pages: 503 - 514
Page count: 12
Related Papers
50 records in total
  • [1] Learning naive Bayes Tree for conditional probability estimation
    Liang, Han
    Yan, Yuhong
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4013 : 455 - 466
  • [2] Naive Feature Selection: Sparsity in Naive Bayes
    Askari, Armin
    d'Aspremont, Alex
    El Ghaoui, Laurent
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1813 - 1821
  • [3] Feature selection for optimizing the Naive Bayes algorithm
    Winarti, Titin
    Vydia, Vensy
    [J]. ENGINEERING, INFORMATION AND AGRICULTURAL TECHNOLOGY IN THE GLOBAL DIGITAL REVOLUTION, 2020, : 47 - 51
  • [4] Feature selection for text classification with Naive Bayes
    Chen, Jingnian
    Huang, Houkuan
    Tian, Shengfeng
    Qu, Youli
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 5432 - 5435
  • [5] Naive Bayes classification given probability estimation trees
    Qin, Zengchang
    [J]. ICMLA 2006: 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2006, : 34 - 39
  • [6] Feature selection for unbalanced class distribution and Naive Bayes
    Mladenic, D
    Grobelnik, M
    [J]. MACHINE LEARNING, PROCEEDINGS, 1999, : 258 - 267
  • [7] Variable Selection for Naive Bayes Semisupervised Learning
    Choi, Byoung-Jeong
    Kim, Kwang-Rae
    Cho, Kyu-Dong
    Park, Changyi
    Koo, Ja-Yong
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2014, 43 (10) : 2702 - 2713
  • [8] Naive Feature Selection: A Nearly Tight Convex Relaxation for Sparse Naive Bayes
    Askari, Armin
    d'Aspremont, Alexandre
    El Ghaoui, Laurent
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2024, 49 (01)
  • [9] The mysterious optimality of Naive Bayes: Estimation of the probability in the system of "classifiers"
Kupervasser, O.
    [J]. Pattern Recognition and Image Analysis, 2014, 24 (1) : 1 - 10
  • [10] Self-Adaptive Probability Estimation for Naive Bayes Classification
    Wu, Jia
    Cai, Zhihua
    Zhu, Xingquan
[J]. 2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013