Weighted Naive Bayes Classifier with Forgetting for Drifting Data Streams

Cited by: 14
Authors
Krawczyk, Bartosz [1 ]
Wozniak, Michal [1 ]
Affiliations
[1] Wroclaw Univ Technol, Dept Syst & Comp Networks, PL-50370 Wroclaw, Poland
Keywords
machine learning; data stream; concept drift; big data; incremental learning; forgetting
DOI
10.1109/SMC.2015.375
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Mining massive data streams in real time is one of the contemporary challenges for machine learning systems. This domain encompasses many of the difficulties hidden beneath the term Big Data. We deal with massive incoming information that must be processed on-the-fly, with the lowest possible response delay, and we are forced to take time, memory, and quality constraints into account. Our models must be able to process large collections of data quickly and to adapt swiftly to changes (shifts and drifts) occurring in data streams. In this paper, we propose a novel version of the simple yet effective Naive Bayes classifier for mining streams. We add a weighting module that automatically assigns an importance factor to each object extracted from the stream; the higher the weight, the greater the influence a given object exerts on the classifier training procedure. We assume that our model works in a non-stationary environment in the presence of the concept drift phenomenon. To allow our classifier to quickly adapt its properties to evolving data, we imbue it with a forgetting principle implemented as weight decay. With each passing iteration, the importance of previous objects is decreased until they are discarded from the data collection. We propose an efficient sigmoidal function for modeling the forgetting rate. Experimental analysis, carried out on a number of large data streams with concept drift, proves that our weighted Naive Bayes classifier displays highly satisfactory performance in comparison with state-of-the-art stream classifiers.
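The abstract describes the method only at a high level. The minimal sketch below illustrates one way a weighted Gaussian Naive Bayes classifier with sigmoidal weight decay could be put together; the weight formula and every name and parameter here (sigmoid_forgetting_weight, half_life, steepness, window, drop_threshold) are illustrative assumptions, not the authors' actual design, which is defined in the full paper.

# Sketch: weighted Gaussian Naive Bayes over a stream, with sigmoidal forgetting.
# All constants below are assumed values for illustration only.
import numpy as np

def sigmoid_forgetting_weight(age, half_life=50.0, steepness=0.1):
    # Weight of an object that arrived `age` steps ago: near 1 when recent,
    # decaying smoothly toward 0 as the object grows older.
    return 1.0 / (1.0 + np.exp(steepness * (age - half_life)))

class WeightedNaiveBayes:
    # Gaussian Naive Bayes trained on a window of stored objects whose
    # influence decays with age; negligible-weight objects are discarded.
    def __init__(self, window=200, drop_threshold=1e-3):
        self.window = window
        self.drop_threshold = drop_threshold
        self.X, self.y, self.age = [], [], []

    def partial_fit(self, x, label):
        # Age every stored object by one step, then store the new one.
        self.age = [a + 1 for a in self.age]
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(label)
        self.age.append(0)
        # Forgetting: drop objects whose decayed weight is negligible,
        # and cap memory at the most recent `window` objects.
        keep = [i for i, a in enumerate(self.age)
                if sigmoid_forgetting_weight(a) > self.drop_threshold][-self.window:]
        self.X = [self.X[i] for i in keep]
        self.y = [self.y[i] for i in keep]
        self.age = [self.age[i] for i in keep]

    def predict(self, x):
        x = np.asarray(x, dtype=float)
        X = np.stack(self.X)
        w = np.array([sigmoid_forgetting_weight(a) for a in self.age])
        best_label, best_score = None, -np.inf
        for label in set(self.y):
            mask = np.array([yi == label for yi in self.y])
            wc = w[mask]
            # Weighted class prior and weighted per-feature Gaussian parameters.
            prior = wc.sum() / w.sum()
            mean = np.average(X[mask], axis=0, weights=wc)
            var = np.average((X[mask] - mean) ** 2, axis=0, weights=wc) + 1e-9
            log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
            score = np.log(prior) + log_lik
            if score > best_score:
                best_label, best_score = label, score
        return best_label

In a streaming loop, one would call partial_fit on each labelled object as it arrives and predict on incoming unlabelled ones; after a concept drift, old objects lose weight and are eventually discarded, so the class statistics track the new concept.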
Pages: 2147 - 2152
Page count: 6
Related Papers
50 records in total
  • [1] Attribute Weighted Naive Bayes Classifier
    Foo, Lee-Kien
    Chua, Sook-Ling
    Ibrahim, Neveen
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 71 (01): 1945 - 1957
  • [2] A method of cleaning RFID data streams based on Naive Bayes classifier
    Lin, Qiao-min
    Xiao, Yan
    Ye, Ning
    Wang, Ru-chuan
    [J]. INTERNATIONAL JOURNAL OF AD HOC AND UBIQUITOUS COMPUTING, 2016, 21 (04) : 237 - 244
  • [3] Weighted Naive Bayes Classifier on Categorical Features
    Omura, Kazuhiro
    Kudo, Mineichi
    Endo, Tomomi
    Murai, Tetsuya
    [J]. 2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA), 2012, : 865 - 870
  • [4] The naive Bayes classifier for functional data
    Zhang, Yi-Chen
    Sakhanenko, Lyudmila
    [J]. STATISTICS & PROBABILITY LETTERS, 2019, 152 : 137 - 146
  • [5] A double weighted fuzzy gamma naive bayes classifier
    de Moraes, Ronei Marcos
    de Melo Gomes Soares, Elaine Anita
    Machado, Liliane dos Santos
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (01) : 577 - 588
  • [6] Classifying Twitter Data with Naive Bayes Classifier
    Tseng, Chris
    Patel, Nishant
    Paranjape, Hrishikesh
    Lin, T. Y.
    Teoh, SooTee
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2012), 2012, : 294 - 299
  • [7] An Aggregated Fuzzy Naive Bayes Data Classifier
    Tutuncu, G. Yazgi
    Kayaalp, Necla
    [J]. JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2015, 286 : 17 - 27
  • [8] DECOMPOSABLE NAIVE BAYES CLASSIFIER FOR PARTITIONED DATA
    Khedr, Ahmed M.
    [J]. COMPUTING AND INFORMATICS, 2012, 31 (06) : 1511 - 1531
  • [9] Extended Naive Bayes classifier for mixed data
    Hsu, Chung-Chian
    Huang, Yan-Ping
    Chang, Keng-Wei
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2008, 35 (03) : 1080 - 1083
  • [10] Attribute weighted Naive Bayes classifier using a local optimization
    Taheri, Sona
    Yearwood, John
    Mammadov, Musa
    Seifollahi, Sattar
    [J]. NEURAL COMPUTING AND APPLICATIONS, 2014, 24: 995 - 1002