Weighted Naive Bayes Classifier with Forgetting for Drifting Data Streams

Cited by: 14
Authors
Krawczyk, Bartosz [1]
Wozniak, Michal [1]
Affiliations
[1] Wroclaw Univ Technol, Dept Syst & Comp Networks, PL-50370 Wroclaw, Poland
Keywords
machine learning; data stream; concept drift; big data; incremental learning; forgetting; DESIGN
DOI
10.1109/SMC.2015.375
CLC Number
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Mining massive data streams in real time is one of the contemporary challenges for machine learning systems. This domain encompasses many of the difficulties hidden beneath the term Big Data: we deal with massive incoming information that must be processed on the fly with the lowest possible response delay, and we are forced to take time, memory, and quality constraints into account. Our models must be able to quickly process large collections of data and swiftly adapt to changes (shifts and drifts) occurring in data streams. In this paper, we propose a novel version of the simple yet effective Naive Bayes classifier for mining streams. We add a weighting module that automatically assigns an importance factor to each object extracted from the stream; the higher the weight, the greater the influence a given object exerts on the classifier training procedure. We assume that our model works in a non-stationary environment in the presence of the concept drift phenomenon. To allow our classifier to quickly adapt to evolving data, we imbue it with a forgetting principle implemented as weight decay: with each passing iteration, the importance of previous objects is decreased until they are discarded from the data collection. We propose an efficient sigmoidal function for modeling the forgetting rate. Experimental analysis, carried out on a number of large data streams with concept drift, proves that our weighted Naive Bayes classifier displays highly satisfactory performance in comparison with state-of-the-art stream classifiers.
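The mechanism described in the abstract can be sketched as a Naive Bayes classifier whose sufficient statistics are weighted counts, with weights decayed by a sigmoidal function of object age and negligible-weight objects discarded. This is a minimal illustrative sketch only: the class name `WeightedNaiveBayes`, the helper `sigmoid_forget`, the decay parameters, and the pruning threshold are assumptions, since the paper's exact sigmoidal function and update rules are not given in the abstract.

```python
import math
from collections import defaultdict, deque

def sigmoid_forget(age, midpoint=50.0, steepness=0.1):
    """Hypothetical sigmoidal forgetting curve: weight is close to 1 for
    recent objects and decays smoothly toward 0 as the object's age grows.
    (The paper's exact function and parameters are not reproduced here.)"""
    return 1.0 / (1.0 + math.exp(steepness * (age - midpoint)))

class WeightedNaiveBayes:
    """Sketch of a weighted Naive Bayes over categorical features.
    Each stored object contributes its decayed weight to the class priors
    and per-feature likelihoods; objects whose weight falls below a
    threshold are dropped, realizing forgetting via weight decay."""

    def __init__(self, drop_below=1e-3):
        self.window = deque()   # stores (features, label, arrival_time)
        self.t = 0              # current stream iteration
        self.drop_below = drop_below

    def partial_fit(self, x, y):
        self.window.append((x, y, self.t))
        self.t += 1
        # discard the oldest objects once their decayed weight is negligible
        while self.window and sigmoid_forget(self.t - self.window[0][2]) < self.drop_below:
            self.window.popleft()

    def _stats(self):
        class_w = defaultdict(float)                      # weighted class counts
        feat_w = defaultdict(lambda: defaultdict(float))  # (class, idx) -> value -> weight
        for x, y, t0 in self.window:
            w = sigmoid_forget(self.t - t0)
            class_w[y] += w
            for i, v in enumerate(x):
                feat_w[(y, i)][v] += w
        return class_w, feat_w

    def predict(self, x):
        class_w, feat_w = self._stats()
        total = sum(class_w.values())
        best, best_lp = None, -math.inf
        for c, wc in class_w.items():
            lp = math.log(wc / total)  # weighted log-prior
            for i, v in enumerate(x):
                counts = feat_w[(c, i)]
                # Laplace smoothing over the values observed for feature i
                n_vals = len({val for (cc, ii), d in feat_w.items()
                              if ii == i for val in d}) or 1
                lp += math.log((counts.get(v, 0.0) + 1.0) / (wc + n_vals))
            if lp > best_lp:
                best, best_lp = c, lp
        return best
```

With this design the classifier needs no explicit drift detector: after a concept change, the decayed weights of pre-drift objects shrink until they are pruned, so recent objects dominate both priors and likelihoods.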
Pages: 2147-2152
Page count: 6