Online Feature Screening for Data Streams With Concept Drift

Cited by: 3
Authors
Wang, Mingyuan [1]
Barbu, Adrian [1]
Affiliation
[1] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
Keywords
Feature extraction; Adaptation models; Fading channels; Data models; Computational modeling; Uncertainty; Indexes; Concept drift; data stream mining; feature screening; feature selection; model adaptation; DYNAMIC FEATURE-SELECTION; GENE-EXPRESSION; CLASSIFICATION;
DOI
10.1109/TKDE.2022.3232752
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Screening-based feature selection methods are often used as a preprocessing step to reduce the number of variables before training a model. Traditional screening methods deal only with complete high-dimensional datasets. Modern datasets, however, not only have higher dimensions and larger sample sizes but also exhibit properties such as streaming input, sparsity, and concept drift. Consequently, a considerable number of online feature selection methods have been introduced in recent years to handle these problems. Online screening methods are one category of online feature selection methods. The methods proposed in this paper can handle all three of the properties mentioned above in classification settings. Our experiments show that the proposed methods generate the same feature importance as their offline versions while running faster and requiring less storage. Furthermore, the results show that, on data streams exhibiting concept drift, online screening methods with integrated model adaptation have a higher true feature detection rate than those without model adaptation. On three large real datasets that potentially have concept drift, online screening methods with model adaptation show advantages in saving computation time and space, reducing model complexity, or improving prediction accuracy.
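The claim that an online screening method can reproduce the feature importance of its offline version can be illustrated with a minimal sketch. It assumes a two-class t-score as the screening statistic, which this record does not confirm as one of the paper's methods. The class name `OnlineTScore`, the helper `offline_t_score`, and the synthetic data are all hypothetical. The sketch keeps only per-class running sums, so no past samples are stored, and it omits model adaptation for concept drift entirely.

```python
import numpy as np


class OnlineTScore:
    """Illustrative two-class t-score screen built from running sums."""

    def __init__(self, n_features):
        self.n = np.zeros(2)                 # samples seen per class
        self.s = np.zeros((2, n_features))   # per-class sum of x
        self.ss = np.zeros((2, n_features))  # per-class sum of x**2

    def update(self, X, y):
        # Fold one mini-batch into the running sums; the batch is then discarded.
        for c in (0, 1):
            Xc = X[y == c]
            self.n[c] += Xc.shape[0]
            self.s[c] += Xc.sum(axis=0)
            self.ss[c] += (Xc ** 2).sum(axis=0)

    def scores(self):
        n = self.n[:, None]
        mean = self.s / n
        var = (self.ss - n * mean ** 2) / (n - 1)  # unbiased sample variance
        return np.abs(mean[0] - mean[1]) / np.sqrt(var[0] / n[0] + var[1] / n[1])


def offline_t_score(X, y):
    """Same statistic computed from the complete dataset at once."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se


# Synthetic stream (assumption): 1000 samples, 20 features, only feature 0 is informative.
rng = np.random.default_rng(0)
y = np.arange(1000) % 2
X = rng.normal(size=(1000, 20))
X[:, 0] += y  # shift the mean of feature 0 for class 1

screen = OnlineTScore(n_features=20)
for start in range(0, 1000, 50):  # process the stream in mini-batches of 50
    screen.update(X[start:start + 50], y[start:start + 50])

online = screen.scores()
offline = offline_t_score(X, y)
```

In this sketch the memory cost depends only on the number of features, not on the number of samples seen, which is the kind of storage saving the abstract refers to.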
Pages: 11693-11707
Page count: 15