Learning naive Bayes for probability estimation by feature selection

Cited by: 0
Authors
Jiang, Liangxiao [1]
Zhang, Harry [2]
Affiliations
[1] China Univ Geosci, Fac Comp Sci, Wuhan 430074, Hubei, Peoples R China
[2] Univ New Brunswick, Fac Comp Sci, Fredericton, NB E3B 5A3, Canada
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Naive Bayes is a well-known, effective, and efficient classification algorithm, but its probability estimation is poor. In many applications, however, accurate probability estimation is required in order to make optimal decisions. Probability estimation is usually measured by conditional log likelihood (CLL). Several learning algorithms have recently been proposed to extend naive Bayes for high CLL, such as ERL [8,9] and BNC-2P [10]; unfortunately, their computational complexity is relatively high. Is there a simple but effective and efficient approach to improving the probability estimation of naive Bayes? In this paper, we propose to use feature selection for this purpose. More precisely, a search process is conducted to select a subset of attributes, and a naive Bayes classifier is then built on the selected attribute set. Feature selection has already been applied successfully to naive Bayes, achieving significant improvement in classification accuracy. Among the feature selection algorithms for naive Bayes, the selective Bayesian classifier (SBC) of Langley et al. [13] demonstrates good performance. In this paper, we first study the performance of SBC in terms of probability estimation, and then propose an improved algorithm, SBC-CLL, in which the CLL score is used directly for attribute selection instead of classification accuracy. Our experiments show that both SBC and SBC-CLL achieve significant improvement over naive Bayes in probability estimation measured by CLL, and that SBC-CLL substantially outperforms SBC. Our work provides an efficient and surprisingly effective approach to improving the probability estimation of naive Bayes.
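For reference, a sketch of the CLL measure in its conventional form: given a classifier B and a data set D of N labelled instances, where the i-th instance has attribute values a_{i1}, ..., a_{im} and true class label c_i,

```latex
\mathrm{CLL}(B \mid D) = \sum_{i=1}^{N} \log P_{B}\!\left(c_i \mid a_{i1}, a_{i2}, \ldots, a_{im}\right)
```

A higher (less negative) CLL means the classifier assigns more posterior probability mass to the true class labels, which is what accurate probability estimation for decision making requires.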
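To make the selection loop concrete, here is a minimal sketch, not the paper's implementation: it assumes a greedy forward search in which each candidate attribute is scored by the training-set CLL of a discrete naive Bayes with Laplace smoothing. All names (`nb_log_posterior`, `sbc_cll`), the stopping rule, and the use of the training set for scoring are illustrative assumptions.

```python
import numpy as np

def nb_log_posterior(X_tr, y_tr, X_ev, attrs, n_classes):
    """Log P(c | selected attributes) under a discrete naive Bayes.

    X_tr / X_ev hold integer-coded attribute values; `attrs` lists the
    column indices the model may use.  Laplace smoothing is applied to
    both the class prior and the conditional probability tables.
    """
    counts = np.bincount(y_tr, minlength=n_classes) + 1.0
    log_post = np.tile(np.log(counts / counts.sum()), (len(X_ev), 1))
    for j in attrs:
        n_vals = int(max(X_tr[:, j].max(), X_ev[:, j].max())) + 1
        cond = np.ones((n_classes, n_vals))        # Laplace pseudo-counts
        np.add.at(cond, (y_tr, X_tr[:, j]), 1.0)   # count (class, value) pairs
        log_cond = np.log(cond / cond.sum(axis=1, keepdims=True))
        log_post += log_cond[:, X_ev[:, j]].T      # add log P(a_j | c)
    # normalise each row so it holds log P(c | a) rather than log P(c, a)
    m = log_post.max(axis=1, keepdims=True)
    return log_post - m - np.log(np.exp(log_post - m).sum(axis=1, keepdims=True))

def cll(X_tr, y_tr, X_ev, y_ev, attrs, n_classes):
    """Conditional log likelihood: sum_i log P(c_i | a_i) on the eval set."""
    lp = nb_log_posterior(X_tr, y_tr, X_ev, attrs, n_classes)
    return lp[np.arange(len(y_ev)), y_ev].sum()

def sbc_cll(X, y, n_classes):
    """Greedy forward attribute selection scored directly by CLL."""
    selected, remaining = [], set(range(X.shape[1]))
    best = cll(X, y, X, y, selected, n_classes)    # CLL with no attributes
    while remaining:
        scores = {j: cll(X, y, X, y, selected + [j], n_classes)
                  for j in remaining}
        j_star = max(scores, key=scores.get)
        if scores[j_star] <= best:                 # no attribute improves CLL
            break
        best = scores[j_star]
        selected.append(j_star)
        remaining.remove(j_star)
    return selected
```

Swapping the CLL score for classification accuracy in `sbc_cll` recovers the flavour of the original SBC; the paper's actual search direction, stopping criterion, and evaluation protocol may differ from this sketch.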
Pages: 503 - 514
Page count: 12
Related Papers
50 records in total
  • [1] Learning naive Bayes Tree for conditional probability estimation
    Liang, Han
    Yan, Yuhong
    [J]. ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4013 : 455 - 466
  • [2] Naive Feature Selection: Sparsity in Naive Bayes
    Askari, Armin
    d'Aspremont, Alex
    El Ghaoui, Laurent
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1813 - 1821
  • [3] Feature selection for optimizing the Naive Bayes algorithm
    Winarti, Titin
    Vydia, Vensy
    [J]. ENGINEERING, INFORMATION AND AGRICULTURAL TECHNOLOGY IN THE GLOBAL DIGITAL REVOLUTION, 2020, : 47 - 51
  • [4] Feature selection for text classification with Naive Bayes
    Chen, Jingnian
    Huang, Houkuan
    Tian, Shengfeng
    Qu, Youli
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 5432 - 5435
  • [5] Naive Bayes classification given probability estimation trees
    Qin, Zengchang
    [J]. ICMLA 2006: 5TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2006, : 34 - 39
  • [6] Feature selection for unbalanced class distribution and Naive Bayes
    Mladenic, D
    Grobelnik, M
    [J]. MACHINE LEARNING, PROCEEDINGS, 1999, : 258 - 267
  • [7] Variable Selection for Naive Bayes Semisupervised Learning
    Choi, Byoung-Jeong
    Kim, Kwang-Rae
    Cho, Kyu-Dong
    Park, Changyi
    Koo, Ja-Yong
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2014, 43 (10) : 2702 - 2713
  • [8] Naive Feature Selection: A Nearly Tight Convex Relaxation for Sparse Naive Bayes
    Askari, Armin
    d'Aspremont, Alexandre
    El Ghaoui, Laurent
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2024, 49 (01)
  • [9] The mysterious optimality of Naive Bayes: Estimation of the probability in the system of "classifiers"
Kupervasser, O.
    [J]. Pattern Recognition and Image Analysis, 2014, 24 (1) : 1 - 10
  • [10] Self-Adaptive Probability Estimation for Naive Bayes Classification
    Wu, Jia
    Cai, Zhihua
    Zhu, Xingquan
[J]. 2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013