Using a classifier pool in accuracy based tracking of recurring concepts in data stream classification

Cited by: 24
Authors
Hosseini, Mohammad Javad [1]
Ahmadi, Zahra [1]
Beigy, Hamid [1]
Affiliation
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Keywords
Recurring concepts; Concept drift; Stream mining; Ensemble learning;
DOI
10.1007/s12530-012-9064-3
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data streams have unique properties that make them suitable for precisely modeling many real data mining applications. The most challenging of these properties is the occurrence of "concept drift". Recurring concepts are a type of concept drift found in most real-world problems. Detecting recurring concepts makes it possible to exploit knowledge obtained earlier in the learning process, so the learner adapts quickly whenever a concept reappears. In this paper, we propose a learning algorithm called Pool and Accuracy based Stream Classification, with some variations, which maintains a pool of classifiers to track recurring concepts. Each classifier describes an existing concept. Consecutive batches of instances are first classified by the pool of classifiers; two approaches are presented for this task: the active classifier and weighted classifiers methods. The true labels are then revealed and the pool is updated at the end of the batch, using one of the following methods: exact Bayesian, Bayesian, and Heuristic. Because the algorithm may assign multiple classifiers to a single concept, a classifier merging process is used to resolve this problem. Experimental results on real and artificial datasets show the effectiveness of the weighted classifiers method on datasets with sudden concept drift. In addition, the proposed updating methods outperform existing algorithms on datasets with arbitrary attributes. Finally, some experiments show the advantage of using the merging process on large datasets.
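The batch loop described in the abstract (classify a batch with the pool, reveal the labels, then update the pool) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update rules, so a plain accuracy threshold stands in for the exact Bayesian, Bayesian, and Heuristic updates, and the merging step is omitted. The names `CentroidClassifier`, `ClassifierPool`, and `threshold` are hypothetical, as is the choice of base learner.

```python
from collections import Counter, defaultdict


class CentroidClassifier:
    """Tiny nearest-centroid learner (assumed; the abstract names no base learner)."""

    def __init__(self):
        self.sums = {}            # label -> per-feature running sums
        self.counts = Counter()   # label -> number of instances seen

    def fit_batch(self, X, y):
        for x, label in zip(X, y):
            if label not in self.sums:
                self.sums[label] = [0.0] * len(x)
            self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
            self.counts[label] += 1

    def predict(self, x):
        def sq_dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))
        return min(self.sums, key=sq_dist)


def accuracy(clf, X, y):
    return sum(clf.predict(x) == t for x, t in zip(X, y)) / len(y)


class ClassifierPool:
    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cut-off for "this concept is known"
        self.pool = []
        self.weights = []

    def predict_batch(self, X):
        """Weighted vote over the pool (a stand-in for the weighted classifiers method)."""
        if not self.pool:
            return [None] * len(X)
        preds = []
        for x in X:
            votes = defaultdict(float)
            for clf, w in zip(self.pool, self.weights):
                votes[clf.predict(x)] += w
            preds.append(max(votes, key=votes.get))
        return preds

    def update(self, X, y):
        """After labels are revealed: reuse the best-matching classifier or add a new one."""
        accs = [accuracy(clf, X, y) for clf in self.pool]
        if accs and max(accs) >= self.threshold:
            # A pooled classifier already describes this concept: refine it.
            self.pool[accs.index(max(accs))].fit_batch(X, y)
        else:
            # No pooled classifier fits: treat the batch as a new concept.
            clf = CentroidClassifier()
            clf.fit_batch(X, y)
            self.pool.append(clf)
        # Weight every classifier by its accuracy on the latest labelled batch.
        self.weights = [accuracy(clf, X, y) for clf in self.pool]
```

In this sketch, a concept that reappears is matched to its existing classifier by accuracy on the newly labelled batch, so the pool does not grow and later batches from that concept are dominated by the matching classifier's votes.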
Pages: 43 - 60
Page count: 18