An overview and comprehensive comparison of ensembles for concept drift

Cited by: 48
Authors
Maior de Barros, Roberto Souto [1]
de Carvalho Santos, Silas Garrido T. [1]
Affiliation
[1] Univ Fed Pernambuco, Ctr Informat, BR-50740560 Recife, PE, Brazil
Keywords
Concept drift; Ensembles; Detectors; Large-scale comparison; Data stream; Online learning; WEIGHTED-MAJORITY; ONLINE; CLASSIFIERS
DOI
10.1016/j.inffus.2019.03.006
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Online learning is about extracting information from large data streams which may be affected by changes in the distribution of the data, events known as concept drift. Concept drift detectors are small programs that try to detect these changes and make it possible to replace the base classifier, improving the overall accuracy. Ensembles of classifiers are also common in this application area and some of them are configurable with a drift detector. This article summarizes a large-scale comparison of six ensemble algorithms, configured with 10 different drift detectors, for learning from fully labeled data streams, using a large number of artificial datasets and two popular base learners in the area: Naive Bayes and Hoeffding Tree. In addition, the code of one of the ensembles (Leveraging Bagging) was modified to permit its configuration with any drift detector; its original implementation uses only ADWIN. The goal is to assess how good the existing ensemble algorithms configurable with detectors really are and also to verify and challenge a common belief in the area. The results of the experiments suggest that, in most datasets, the choice of ensemble algorithm has much more impact on the final accuracy than the choice of drift detector used in its configuration. They also suggest that the best auxiliary detectors to configure the ensembles, i.e., those that maximize the accuracy of the ensembles, are only marginally different from the best detectors in the same datasets in terms of their accuracies (recently reported in another article).
Pages: 213-244
Page count: 32