SDDM: an interpretable statistical concept drift detection method for data streams

Authors
Simona Micevska
Ahmed Awad
Sherif Sakr
Affiliations
[1] University of Tartu
[2] Nile University
Keywords
Online machine learning; Concept drift detection; Data streams analytics
Abstract
Machine learning models assume that data is drawn from a stationary distribution. In practice, however, models must make sense of fast-evolving data streams whose content changes over time. A change between the distribution of the training data seen so far and the distribution of newly arriving data is called concept drift. Detecting concept drift is essential to maintaining the accuracy and reliability of online classifiers. Reactive drift detectors monitor the performance of the underlying machine learning model: to detect a drift, feedback on the classifier output must be given to the drift detector, as in prequential evaluation. In many real-life scenarios, immediate feedback on classifier output is not possible, so drift detection is delayed and loses its context. Moreover, the drift detector's output is a binary answer indicating whether or not a drift has occurred, although it is equally important to explain the source of the drift. In this paper, we present the Statistical Drift Detection Method (SDDM), which detects drifts by monitoring changes in the data distribution without requiring feedback on classifier output. In addition, the detected drift is quantified and its source is identified. We empirically evaluate our method against state-of-the-art approaches on both synthetic and real-life data sets. SDDM outperforms related approaches by producing fewer false positives and false negatives.
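To make the idea of label-free, explainable drift detection concrete, the following is a minimal illustrative sketch and not the authors' SDDM algorithm. It assumes numeric features, equal-width histograms, and Kullback-Leibler divergence as the distance between a reference window and a current window; the function names (`histogram`, `kl_divergence`, `drift_report`) and the threshold of 0.1 are hypothetical choices made for this example. The per-feature scores quantify the drift, and the features whose score exceeds the threshold point to its source.

```python
import math
import random
from typing import Dict, List, Sequence


def histogram(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Return smoothed bin probabilities for values over fixed bin edges."""
    counts = [1.0] * (len(edges) - 1)  # add-one smoothing avoids empty bins
    for v in values:
        idx = 0
        # Walk to the bin containing v; out-of-range values are clamped.
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions with no zero entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def drift_report(reference: Dict[str, List[float]],
                 current: Dict[str, List[float]],
                 bins: int = 10) -> Dict[str, float]:
    """Score each feature by the divergence between the two windows."""
    report = {}
    for name in reference:
        lo = min(min(reference[name]), min(current[name]))
        hi = max(max(reference[name]), max(current[name]))
        width = (hi - lo) / bins or 1.0  # guard against constant features
        edges = [lo + i * width for i in range(bins + 1)]
        p = histogram(current[name], edges)
        q = histogram(reference[name], edges)
        report[name] = kl_divergence(p, q)
    return report


# Synthetic example: feature x1 is stable, feature x2 shifts its mean.
rng = random.Random(0)
reference = {
    "x1": [rng.gauss(0.0, 1.0) for _ in range(1000)],
    "x2": [rng.gauss(0.0, 1.0) for _ in range(1000)],
}
current = {
    "x1": [rng.gauss(0.0, 1.0) for _ in range(1000)],
    "x2": [rng.gauss(3.0, 1.0) for _ in range(1000)],
}

THRESHOLD = 0.1  # hypothetical cut-off, chosen only for this example
report = drift_report(reference, current)
drifted = [name for name, score in report.items() if score > THRESHOLD]
print(report)
print("Drifted features:", drifted)
```

No class labels or classifier feedback appear anywhere in this sketch; only the feature values of the two windows are compared. A streaming detector would additionally need a windowing policy and a principled way to set the threshold, both of which are outside the scope of this example.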
Pages: 459-484
Page count: 25
Source
Journal of Intelligent Information Systems, 2021, 56 (03): 459-484