Modeling recurring concepts in data streams: a graph-based framework

Cited by: 0
Authors
Zahra Ahmadi
Stefan Kramer
Affiliation
[1] Johannes Gutenberg-Universität, Institut für Informatik
Source
Knowledge and Information Systems, 2018, 55(1)
Keywords
Pool management; Recurring concepts; Concept drift; Data stream classification
DOI
Not available
Abstract
Classifying a non-stationary data stream with recurrent drift is challenging and has attracted considerable interest in recent years. All existing approaches that handle recurrent concepts maintain a pool of concepts/classifiers and reuse it in later classifications to reduce the error on instances from a recurring concept. However, the number of classifiers in the pool usually grows very fast, because accurately detecting the underlying concept is itself difficult. As a result, the pool may contain many concepts that represent the same underlying concept. This paper proposes the GraphPool framework, which refines the pool by applying a merging mechanism whenever necessary. After receiving a new batch of data, we extract a concept representation from the batch that takes the correlation among features into account. We then compare this representation to the concept representations in the pool using a statistical multivariate likelihood test. If more than one concept is similar to the current batch, all the corresponding concepts are merged. GraphPool not only keeps the concepts but also maintains the transitions among them via a first-order Markov chain. The current state is maintained at all times, and new instances are predicted based on it. Keeping these transitions helps to recover quickly from drifts in some real-world problems with periodic behavior. Comprehensive experiments on synthetic and real-world data show the effectiveness of the framework in terms of performance and pool management.
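The bookkeeping described in the abstract (first-order transition counts between concepts, plus merging of concepts found to be equivalent) can be illustrated with a small sketch. This is not the authors' implementation: the class name `ConceptGraph`, its methods, and the concept labels are hypothetical, and the statistical multivariate likelihood test is left out entirely, so the sketch assumes that some external test has already assigned each batch to a concept and decided which concepts to merge.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy concept pool with first-order Markov transition counts (hypothetical)."""

    def __init__(self):
        # counts[a][b] = number of observed transitions from concept a to concept b
        self.counts = defaultdict(lambda: defaultdict(int))
        self.current = None  # concept assigned to the most recent batch

    def observe(self, concept_id):
        """Record that the newest batch was assigned to concept_id."""
        if self.current is not None:
            self.counts[self.current][concept_id] += 1
        self.current = concept_id

    def merge(self, keep, drop):
        """Fold concept `drop` into `keep`, combining their transition counts."""
        if keep == drop:
            return
        # Outgoing edges of `drop` become outgoing edges of `keep`.
        for target, n in self.counts.pop(drop, {}).items():
            self.counts[keep][target] += n
        # Incoming edges to `drop` are redirected to `keep`.
        for source in list(self.counts):
            n = self.counts[source].pop(drop, 0)
            if n:
                self.counts[source][keep] += n
        if self.current == drop:
            self.current = keep

    def most_likely_next(self):
        """Return the concept most often seen after the current one, or None."""
        successors = self.counts.get(self.current)
        if not successors:
            return None
        return max(successors, key=successors.get)


# Usage: "B2" is assumed to be a duplicate of "B" that was stored separately.
graph = ConceptGraph()
for concept_id in ["A", "B", "A", "B2", "A", "B"]:
    graph.observe(concept_id)
graph.merge("B", "B2")
next_concept = graph.most_likely_next()
```

After the merge, the transitions that had been split between the two duplicate concepts are pooled, which is what makes the successor estimate for the current concept more reliable in this toy setting.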
Pages: 15-44
Page count: 29
Related papers
50 in total
  • [1] Modeling recurring concepts in data streams: a graph-based framework
    Ahmadi, Zahra
    Kramer, Stefan
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 55 (01) : 15 - 44
  • [2] A Probabilistic Framework for Adapting to Changing and Recurring Concepts in Data Streams
    Halstead, Ben
    Koh, Yun Sing
    Riddle, Patricia
    Pechenizkiy, Mykola
    Bifet, Albert
    [J]. 2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022, : 407 - 416
  • [3] A Graph-based Hybrid Framework for Modeling Complex Heterogeneity
    Yang, Pei
    He, Jingrui
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2015, : 1081 - 1086
  • [4] A Graph-Based Framework for Multiscale Modeling of Physiological Transport
    Maheshvare, M. Deepa
    Raha, Soumyendu
    Pal, Debnath
    [J]. FRONTIERS IN NETWORK PHYSIOLOGY, 2022, 1
  • [5] Graph-based authentication of digital streams
    Miner, S
    Staddon, J
    [J]. 2001 IEEE SYMPOSIUM ON SECURITY AND PRIVACY, PROCEEDINGS, 2001, : 232 - 246
  • [6] Modeling semistructured data by using graph-based constraints
    Damiani, E
    Oliboni, B
    Quintarelli, E
    Tanca, L
    [J]. ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2003: OTM 2003 WORKSHOPS, 2003, 2889 : 20 - 21
  • [7] A graph-based approach for modeling and indexing video data
    Lee, Jeongkyu
    [J]. ISM 2006: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA, PROCEEDINGS, 2006, : 348 - 355
  • [8] A Data Quality Framework for Graph-Based Virtual Data Integration Systems
    Li, Yalei
    Nadal, Sergi
    Romero, Oscar
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, ADBIS 2022, 2022, 13389 : 104 - 117
  • [9] A knowledge graph-based data harmonization framework for secondary data reuse
    Abad-Navarro, Francisco
    Martinez-Costa, Catalina
    [J]. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 243
  • [10] A multiobjective evolutionary programming framework for graph-based data mining
    Shelokar, Prakash
    Quirin, Arnaud
    Cordon, Oscar
    [J]. INFORMATION SCIENCES, 2013, 237 : 118 - 136