Online Feature Screening for Data Streams With Concept Drift

Cited by: 3
Authors
Wang, Mingyuan [1]
Barbu, Adrian [1]
Affiliations
[1] Florida State Univ, Dept Stat, Tallahassee, FL 32306 USA
Keywords
Feature extraction; Adaptation models; Fading channels; Data models; Computational modeling; Uncertainty; Indexes; Concept drift; data stream mining; feature screening; feature selection; model adaptation; DYNAMIC FEATURE-SELECTION; GENE-EXPRESSION; CLASSIFICATION;
DOI
10.1109/TKDE.2022.3232752
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Screening feature selection methods are often used as a preprocessing step to reduce the number of variables before training a model. Traditional screening methods focus only on complete high-dimensional datasets. However, modern datasets not only have higher dimensions and larger sample sizes, but also exhibit properties such as streaming input, sparsity, and concept drift. Therefore, a considerable number of online feature selection methods have been introduced in recent years to handle these problems. Online screening methods are one category of online feature selection methods. The methods that we propose in this paper are capable of handling all three properties mentioned above (streaming input, sparsity, and concept drift) in classification settings. Our experiments show that the proposed methods generate the same feature importance as their offline versions, with faster speed and lower storage requirements. Furthermore, the results show that online screening methods with integrated model adaptation have a higher true feature detection rate than those without model adaptation on data streams exhibiting concept drift. On three large real datasets that potentially have concept drift, online screening methods with model adaptation show advantages in saving computation time and space, reducing model complexity, or improving prediction accuracy.
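The claim that an online screening method can reproduce the feature importance of its offline version can be illustrated with a small sketch. The code below is not the authors' method; it assumes a simple two-class t-score screening statistic computed from running sums, and the names `OnlineTScoreScreener`, `offline_t_scores`, `make_stream`, and the `decay` forgetting factor are hypothetical choices made for illustration. With `decay=1.0` the running sums are exact sufficient statistics, so the online scores match the offline ones; a value below 1.0 would down-weight older batches, which is one simple way to react to drift.

```python
import math
import random


class OnlineTScoreScreener:
    """Illustrative two-class t-score screening from running sums."""

    def __init__(self, n_features, decay=1.0):
        self.decay = decay  # hypothetical forgetting factor; 1.0 means no forgetting
        self.n = [0.0, 0.0]  # (weighted) sample count per class
        self.s1 = [[0.0] * n_features for _ in range(2)]  # sums of x per class
        self.s2 = [[0.0] * n_features for _ in range(2)]  # sums of x^2 per class

    def partial_fit(self, X, y):
        # Age the stored statistics, then add the new batch.
        for c in (0, 1):
            self.n[c] *= self.decay
            self.s1[c] = [v * self.decay for v in self.s1[c]]
            self.s2[c] = [v * self.decay for v in self.s2[c]]
        for row, label in zip(X, y):
            self.n[label] += 1.0
            for j, v in enumerate(row):
                self.s1[label][j] += v
                self.s2[label][j] += v * v
        return self

    def scores(self):
        out = []
        for j in range(len(self.s1[0])):
            mean, var = [], []
            for c in (0, 1):
                m = self.s1[c][j] / self.n[c]
                mean.append(m)
                # Population variance; clipped at 0 to guard against round-off.
                var.append(max(self.s2[c][j] / self.n[c] - m * m, 0.0))
            se = math.sqrt(var[0] / self.n[0] + var[1] / self.n[1])
            out.append(abs(mean[0] - mean[1]) / (se + 1e-12))
        return out


def offline_t_scores(X, y):
    """Same statistic computed in one pass over the complete dataset."""
    groups = [[r for r, lab in zip(X, y) if lab == c] for c in (0, 1)]
    out = []
    for j in range(len(X[0])):
        mean, var = [], []
        for g in groups:
            col = [r[j] for r in g]
            m = sum(col) / len(col)
            mean.append(m)
            var.append(sum((v - m) ** 2 for v in col) / len(col))
        se = math.sqrt(var[0] / len(groups[0]) + var[1] / len(groups[1]))
        out.append(abs(mean[0] - mean[1]) / (se + 1e-12))
    return out


def make_stream(n=600, p=20, n_true=3, shift=1.5, seed=0):
    """Synthetic data: only the first n_true features depend on the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(0.0, 1.0) + (shift * label if j < n_true else 0.0)
                  for j in range(p)])
        y.append(label)
    return X, y


X, y = make_stream()
screener = OnlineTScoreScreener(n_features=20)
for start in range(0, len(X), 100):  # feed the stream in batches of 100
    screener.partial_fit(X[start:start + 100], y[start:start + 100])

online = screener.scores()
offline = offline_t_scores(X, y)
top3 = sorted(range(20), key=lambda j: -online[j])[:3]
```

The screener stores only per-class counts and two sums per feature, so memory does not grow with the length of the stream. This is the kind of saving in storage the abstract refers to, shown here for one assumed statistic only.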
Pages: 11693-11707
Page count: 15