Online Feature Screening for Data Streams With Concept Drift

Cited by: 3
Authors
Wang, Mingyuan [1]
Barbu, Adrian [1]
Affiliations
[1] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
Keywords
Feature extraction; Adaptation models; Fading channels; Data models; Computational modeling; Uncertainty; Indexes; Concept drift; data stream mining; feature screening; feature selection; model adaptation; DYNAMIC FEATURE-SELECTION; GENE-EXPRESSION; CLASSIFICATION
DOI
10.1109/TKDE.2022.3232752
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Screening methods for feature selection are often used as a preprocessing step to reduce the number of variables before training a model. Traditional screening methods focus only on complete high-dimensional datasets. However, modern datasets not only have higher dimensions and larger sample sizes, but also exhibit properties such as streaming input, sparsity, and concept drift. Consequently, a considerable number of online feature selection methods have been introduced in recent years to handle these problems. Online screening methods are one category of online feature selection methods. The methods proposed in this paper can handle all three of the properties mentioned above in classification settings. Our experiments show that the proposed methods produce the same feature importance as their offline versions while running faster and requiring less storage. Furthermore, the results show that, on data streams exhibiting concept drift, online screening methods with integrated model adaptation have a higher true feature detection rate than those without model adaptation. On the three large real datasets that potentially have concept drift, online screening methods with model adaptation show advantages in saving computation time and space, reducing model complexity, or improving prediction accuracy.
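The abstract's claim that an online screener can reproduce the feature importance of its offline counterpart can be illustrated with a minimal sketch. The abstract does not say which screening statistics the paper uses, so everything below is an assumption made for illustration: a two-class t-like score, the class name `OnlineTScoreScreener`, the helper `offline_scores`, and the `fade` parameter (a simple exponential forgetting factor standing in for model adaptation) are all hypothetical and are not taken from the paper.

```python
import math
import random


class OnlineTScoreScreener:
    """Hypothetical sketch: running per-class sums for a two-class t-like score."""

    def __init__(self, n_features, fade=1.0):
        self.p = n_features
        self.fade = fade  # assumption: 1.0 = no forgetting, <1.0 down-weights old samples
        self.n = [0.0, 0.0]                               # weighted sample counts per class
        self.s = [[0.0] * n_features for _ in range(2)]   # weighted sums per class/feature
        self.ss = [[0.0] * n_features for _ in range(2)]  # weighted sums of squares

    def update(self, x, y):
        """Absorb one labelled sample (x: list of floats, y: 0 or 1)."""
        if self.fade != 1.0:
            # Shrink all stored statistics so that older samples count less.
            for c in range(2):
                self.n[c] *= self.fade
                for j in range(self.p):
                    self.s[c][j] *= self.fade
                    self.ss[c][j] *= self.fade
        self.n[y] += 1.0
        for j, v in enumerate(x):
            self.s[y][j] += v
            self.ss[y][j] += v * v

    def scores(self):
        """Return one |t|-like score per feature (zeros until both classes are seen)."""
        if self.n[0] == 0 or self.n[1] == 0:
            return [0.0] * self.p
        out = []
        for j in range(self.p):
            mean, var = [], []
            for c in range(2):
                m = self.s[c][j] / self.n[c]
                mean.append(m)
                # Population variance from running sums, clipped at 0 for round-off.
                var.append(max(self.ss[c][j] / self.n[c] - m * m, 0.0))
            denom = math.sqrt(var[0] / self.n[0] + var[1] / self.n[1])
            out.append(abs(mean[0] - mean[1]) / denom if denom > 0 else 0.0)
        return out


def offline_scores(X, Y):
    """Same score computed in one pass over the full stored dataset."""
    out = []
    for j in range(len(X[0])):
        mean, var, cnt = [], [], []
        for c in range(2):
            col = [x[j] for x, y in zip(X, Y) if y == c]
            m = sum(col) / len(col)
            mean.append(m)
            var.append(sum((v - m) ** 2 for v in col) / len(col))
            cnt.append(len(col))
        denom = math.sqrt(var[0] / cnt[0] + var[1] / cnt[1])
        out.append(abs(mean[0] - mean[1]) / denom if denom > 0 else 0.0)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    screener = OnlineTScoreScreener(n_features=5)
    X, Y = [], []
    for i in range(200):
        y = i % 2
        x = [rng.gauss(0, 1) for _ in range(5)]
        x[0] += 2.0 * y          # only feature 0 carries class signal
        screener.update(x, y)    # streaming update, sample not stored by the screener
        X.append(x)
        Y.append(y)              # stored here only to run the offline comparison
    print(screener.scores())
    print(offline_scores(X, Y))
```

With `fade=1.0` the running sums carry exactly the information the offline computation needs, so the two score vectors agree up to floating-point round-off while the screener stores only a fixed number of values per feature and class. Setting `fade` below 1.0 makes the scores track recent data, which is one simple way to react to drift. The sum-of-squares form is chosen for brevity and can lose precision on badly scaled data.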
Pages: 11693-11707
Page count: 15