Adaptive PCA-based feature drift detection using statistical measure

Cited by: 4
Authors
Agrahari, Supriya [1]
Singh, Anil Kumar [1]
Affiliations
[1] MNNIT Allahabad, Prayagraj, India
Keywords
Data stream; Principal component analysis (PCA); Feature drift; Concept drift; Prediction model
DOI
10.1007/s10586-022-03695-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many existing methods for streaming environments are sensitive to large, high-dimensional data. The distribution of streaming data may change over time, a phenomenon known as concept drift. Several drift detectors have been built to identify drift near its point of occurrence, but they pay little attention to changes in feature relevance over time, known as feature drift. Feature drift arises when the distribution of the relevant feature subset changes, or when the relevant feature subset itself changes. The paper proposes an adaptive principal component analysis based feature drift detection method (PCA-FDD) that uses a statistical measure to detect feature drift. The proposed work presents a framework for identifying the most important feature subset, detecting feature drift, and incrementally adapting the prediction model. The method finds the relevant feature subset using incremental PCA and detects feature drift by observing how the percentage similarity among the most important feature subsets changes over time. It also helps to forecast the prediction error of the base learning model. The proposed method is compared with state-of-the-art methods on synthetic and real-time datasets. The evaluation results show that the proposed work outperforms the compared methods in terms of classification accuracy.
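The detection idea in the abstract (rank features with PCA, then watch how much the top-ranked subset overlaps from one window to the next) can be illustrated with a minimal sketch. This is not the authors' PCA-FDD implementation: it assumes batch PCA per window via NumPy's SVD rather than incremental PCA, a variance-weighted loading score as the feature ranking, Jaccard overlap as the "percentage similarity", and a fixed threshold. The names `top_features`, `percent_similarity`, `detect_feature_drift`, and `make_window`, as well as the values of `k` and `threshold`, are hypothetical choices made for illustration.

```python
import numpy as np


def top_features(window, k):
    """Return the indices of the k features with the largest
    variance-weighted absolute PCA loadings (an assumed ranking rule)."""
    centered = window - window.mean(axis=0)
    # Rows of vt are the principal axes; s holds the singular values.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # explained-variance ratio per axis
    scores = (np.abs(vt) * explained[:, None]).sum(axis=0)
    return set(np.argsort(scores)[-k:].tolist())


def percent_similarity(a, b):
    """Jaccard overlap of two feature subsets, as a percentage."""
    return 100.0 * len(a & b) / len(a | b)


def detect_feature_drift(windows, k=2, threshold=50.0):
    """Flag window i when its top-k subset overlaps the previous
    window's subset by less than `threshold` percent."""
    drift_at = []
    previous = None
    for i, window in enumerate(windows):
        current = top_features(window, k)
        if previous is not None and percent_similarity(previous, current) < threshold:
            drift_at.append(i)
        previous = current
    return drift_at


# Synthetic stream (assumption): the high-variance features switch
# from {0, 1} to {3, 4} in the third window.
rng = np.random.default_rng(0)


def make_window(active, n=200, d=5):
    scale = np.full(d, 0.1)
    scale[list(active)] = 5.0
    return rng.normal(size=(n, d)) * scale


windows = [make_window((0, 1)), make_window((0, 1)), make_window((3, 4))]
drift_points = detect_feature_drift(windows, k=2, threshold=50.0)
print(drift_points)
```

In this toy stream the dominant features change only in the third window, so that is the only window the sketch should flag. A streaming version could, for instance, replace the per-window SVD with an incremental PCA update, and would also need the model-adaptation step that the abstract mentions but this sketch omits.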
Pages: 4481-4494
Number of pages: 14
Related papers
50 in total
  • [1] Adaptive PCA-based feature drift detection using statistical measure
    Supriya Agrahari
    Anil Kumar Singh
    [J]. Cluster Computing, 2022, 25 : 4481 - 4494
  • [2] Statistical fault detection using PCA-based GLR hypothesis testing
    Harrou, Fouzi
    Nounou, Mohamed N.
    Nounou, Hazem N.
    Madakyaru, Muddu
    [J]. JOURNAL OF LOSS PREVENTION IN THE PROCESS INDUSTRIES, 2013, 26 (01) : 129 - 139
  • [3] PCA-based feature extraction using class information
    Park, MS
    Na, JH
    Choi, JY
    [J]. INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOL 1-4, PROCEEDINGS, 2005, : 341 - 345
  • [4] Speaker recognition using PCA-based feature transformation
    Ahmed, Ahmed Isam
    Chiverton, John P.
    Ndzi, David L.
    Becerra, Victor M.
    [J]. SPEECH COMMUNICATION, 2019, 110 : 33 - 46
  • [5] PCA-based multivariate statistical network monitoring for anomaly detection
    Camacho, Jose
    Perez-Villegas, Alejandro
    Garcia-Teodoro, Pedro
    Macia-Fernandez, Gabriel
    [J]. COMPUTERS & SECURITY, 2016, 59 : 118 - 137
  • [6] Exploring Dataset Similarities using PCA-based Feature Selection
    Siegert, Ingo
    Boeck, Ronald
    Wendemuth, Andreas
    Vlasenko, Bogdan
    [J]. 2015 INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2015, : 387 - 393
  • [7] Hierarchical PCA-Based Multivariate Statistical Network Monitoring for Anomaly Detection
    Macia-Fernandez, Gabriel
    Camacho, Jose
    Garcia-Teodoro, Pedro
    Rodriguez-Gomez, Rafael A.
    [J]. 2016 8TH IEEE INTERNATIONAL WORKSHOP ON INFORMATION FORENSICS AND SECURITY (WIFS 2016), 2016,
  • [8] PCA-based Arabic Character feature extraction
    Zidouri, Abdelmalek
    [J]. 2007 9TH INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND ITS APPLICATIONS, VOLS 1-3, 2007, : 652 - 655
  • [9] Improving Performance of Network Scanning Detection Through PCA-Based Feature Selection
    Abdurrazaq, Muhammad N.
    Rahardjo, Budi
    Bambang, Riyanto T.
    [J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI), 2014, : 323 - 328
  • [10] Color Watermark Extraction Using Deep Neural Network in IWT Domain with PCA-Based Statistical Feature Reduction
    Jaiswal S.
    Pandey M.K.
    [J]. SN Computer Science, 4 (5)