Structure extended multinomial naive Bayes

Cited by: 70

Authors:

Jiang, Liangxiao [1,2]

Wang, Shasha [1]

Li, Chaoqun [3]

Zhang, Lungan [1]

Affiliations:

[1] China Univ Geosci, Dept Comp Sci, Wuhan 430074, Peoples R China

[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Peoples R China

[3] China Univ Geosci, Dept Math, Wuhan 430074, Peoples R China

Source:

INFORMATION SCIENCES | 2016 / Vol. 329

Funding:

National Natural Science Foundation of China;

Keywords:

Text classification; Multinomial naive Bayes; Structure extension; TERM-WEIGHTING SCHEME; SOFTWARE TOOL; TEXT; CLASSIFIERS; ALGORITHMS; KEEL;

DOI:

10.1016/j.ins.2015.09.037

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Multinomial naive Bayes (MNB) assumes that all attributes (i.e., features) are independent of each other given the class, and it ignores all dependencies among attributes. However, in many real-world applications, the attribute independence assumption required by MNB is often violated, which harms its performance. One of the most direct ways to weaken this assumption is to extend the structure of MNB so that it explicitly represents attribute dependencies by adding arcs between attributes. On the other hand, although a Bayesian network can represent arbitrary attribute dependencies, learning an optimal Bayesian network from high-dimensional text data is almost impossible, mainly because learning the optimal structure is extremely time- and space-consuming. Thus, it would be desirable to have a multinomial Bayesian network model that avoids structure learning yet can still represent attribute dependencies to some extent. In this paper, we propose a novel model called structure extended multinomial naive Bayes (SEMNB). SEMNB alleviates the attribute independence assumption by averaging all of the weighted one-dependence multinomial estimators. To learn SEMNB, we propose a simple but effective learning algorithm without structure searching. The experimental results on a large suite of benchmark text datasets show that SEMNB significantly outperforms MNB and is even markedly better than three other state-of-the-art improved algorithms: TOM, DWMNB, and R_{w,c}MNB. © 2015 Elsevier Inc. All rights reserved.
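
The abstract does not give the model's formulas, so the following is only a toy sketch of the general idea it describes: averaging weighted one-dependence multinomial estimators, each of which lets one word act as the parent of all the others. The function names, the Laplace-smoothed estimate of P(w_t | w_i, c) from class-c documents that contain the parent word, and the default uniform weights are all assumptions made for illustration, not the authors' published algorithm. The dense (classes × words × words) table is only workable for tiny vocabularies.

```python
import numpy as np


def fit_mnb(X, y, n_classes, alpha=1.0):
    """Plain MNB: Laplace-smoothed log P(c) and log P(w | c)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_docs, n_words = X.shape
    log_prior = np.empty(n_classes)
    log_word = np.empty((n_classes, n_words))
    for c in range(n_classes):
        Xc = X[y == c]
        log_prior[c] = np.log((Xc.shape[0] + alpha) / (n_docs + alpha * n_classes))
        counts = Xc.sum(axis=0)
        log_word[c] = np.log((counts + alpha) / (counts.sum() + alpha * n_words))
    return log_prior, log_word


def fit_one_dependence(X, y, n_classes, alpha=1.0):
    """Assumed estimator: log_cond[c, i, t] = log P(w_t | w_i, c), computed from
    word counts in class-c documents that contain parent word i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_words = X.shape[1]
    log_cond = np.empty((n_classes, n_words, n_words))
    for c in range(n_classes):
        Xc = X[y == c]
        present = (Xc > 0).astype(float)
        co = present.T @ Xc  # co[i, t]: count of word t in docs containing word i
        totals = co.sum(axis=1, keepdims=True)
        log_cond[c] = np.log((co + alpha) / (totals + alpha * n_words))
    return log_cond


def predict_averaged(x, log_prior, log_word, log_cond, weights=None):
    """Average the one-dependence estimators whose parent word occurs in x.
    `weights` holds one positive weight per word (uniform if None)."""
    x = np.asarray(x, dtype=float)
    parents = np.flatnonzero(x > 0)
    if parents.size == 0:
        return int(np.argmax(log_prior))  # empty document: fall back to the prior
    w = np.ones(x.size) if weights is None else np.asarray(weights, dtype=float)
    w = w[parents] / w[parents].sum()
    # Parent word i is scored with P(w_i | c); every other word t with P(w_t | w_i, c).
    own = log_word[:, parents] * x[parents]
    children = log_cond[:, parents, :] @ x
    children -= log_cond[:, parents, parents] * x[parents]  # drop the t == i term
    scores = log_prior[:, None] + own + children  # shape (classes, parents)
    # Weighted average over parents, computed stably in log space.
    m = scores.max(axis=1, keepdims=True)
    log_avg = m[:, 0] + np.log((np.exp(scores - m) * w).sum(axis=1))
    return int(np.argmax(log_avg))
```

Passing a per-word weight vector (for example, a feature-relevance score computed on the training set) in place of the uniform default gives a weighted average; which weighting the paper uses is not stated in the abstract.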

Pages: 346-356

Page count: 11
