Analyzing and repairing concept drift adaptation in data stream classification

Cited by: 18
Authors
Halstead, Ben [1]
Koh, Yun Sing [1]
Riddle, Patricia [1]
Pears, Russel [2]
Pechenizkiy, Mykola [3]
Bifet, Albert [4, 5]
Olivares, Gustavo [6]
Coulson, Guy [6]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
[2] Auckland Univ Technol, Auckland, New Zealand
[3] Eindhoven Univ Technol, Eindhoven, Netherlands
[4] Univ Waikato, Hamilton, New Zealand
[5] IP Paris, Telecom Paris, LTCI, Paris, France
[6] Natl Inst Water & Atmospher Res, Auckland, New Zealand
Keywords
Concept drift; Data stream classification; Recurring concepts; CLASSIFIERS; SELECTION;
DOI
10.1007/s10994-021-05993-w
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in factors relevant to the classification task, e.g. weather conditions. Incorporating all relevant factors into the model may be able to capture these changes; however, this is usually not practical. Data stream based methods, which instead explicitly detect concept drift, have been shown to retain performance under unknown changing conditions. These methods adapt to concept drift by training a model to classify each distinct data distribution. However, we hypothesize that existing methods do not robustly handle real-world tasks, leading to adaptation errors where context is misidentified. Adaptation errors may cause a system to use a model which does not fit the current data, reducing performance. We propose a novel repair algorithm to identify and correct errors in concept drift adaptation. Evaluation on synthetic data shows that our proposed AiRStream system achieves higher performance than baseline methods while also better capturing the dynamics of the stream. Evaluation on an air quality inference task shows AiRStream provides increased real-world performance compared to eight baseline methods. A case study shows that AiRStream is able to build a robust model of environmental conditions over this task, allowing the adaptations made to concept drift to be analysed and related to changes in weather. We discovered a strong predictive link between the adaptations made by AiRStream and changes in meteorological conditions.
Pages: 3489-3523
Page count: 35
Related papers
50 in total
  • [1] Analyzing and Repairing Concept Drift Adaptation in Data Stream Classification
    Halstead, Ben
    Koh, Yun Sing
    Riddle, Patricia
    Pears, Russel
    Pechenizkiy, Mykola
    Bifet, Albert
    Olivares, Gustavo
    Coulson, Guy
    2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2021,
  • [2] Analyzing and repairing concept drift adaptation in data stream classification
    Ben Halstead
    Yun Sing Koh
    Patricia Riddle
    Russel Pears
    Mykola Pechenizkiy
    Albert Bifet
    Gustavo Olivares
    Guy Coulson
    Machine Learning, 2022, 111 : 3489 - 3523
  • [3] Uncertain Data Stream Classification with Concept Drift
    Lv Yanxia
    Wang Cuirong
    Wang Cong
    Liu Bingyu
    2016 FOURTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD 2016), 2016, : 265 - +
  • [4] Scalable concept drift adaptation for stream data mining
    Hu, Lisha
    Li, Wenxiu
    Lu, Yaru
    Hu, Chunyu
    COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (05) : 6725 - 6743
  • [5] Adaptive Classification Algorithm for Concept Drift Data Stream
    Cai H.
    Lu K.
    Wu Q.
    Wu D.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2022, 59 (03) : 633 - 646
  • [6] An ensemble method for data stream classification in the presence of concept drift
    Department of Computer Engineering, University of Zanjan, Zanjan
    45371-38791, Iran
    Front. Inf. Technol. Electr. Eng., 12 (1059-1068):
  • [7] An ensemble method for data stream classification in the presence of concept drift
    Abbaszadeh, Omid
    Amiri, Ali
    Khanteymoori, Ali Reza
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (12) : 1059 - 1068
  • [8] Study on a classification model of data stream based on concept drift
    1600, Science and Engineering Research Support Society (09):
  • [9] An ensemble method for data stream classification in the presence of concept drift
    Omid ABBASZADEH
    Ali AMIRI
    Ali Reza KHANTEYMOORI
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 (12) : 1059 - 1068
  • [10] An ensemble method for data stream classification in the presence of concept drift
    Omid Abbaszadeh
    Ali Amiri
    Ali Reza Khanteymoori
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 : 1059 - 1068