Towards Unsupervised Sudden Data Drift Detection in Federated Learning with Fuzzy Clustering

Cited by: 0
Authors
Stallmann, Morris [1]
Wilbik, Anna [1]
Weiss, Gerhard [1]
Affiliations
[1] Maastricht Univ, Dept Adv Comp Sci, Maastricht, Netherlands
Keywords
federated learning; fuzzy clustering; unsupervised; drift; drift detection; federated drift detection; federated data drift detection; FCM
DOI
10.1109/FUZZ-IEEE60900.2024.10611883
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Federated learning (FL) is a machine learning (ML) discipline that allows ML models to be trained on distributed data without revealing raw data instances. It promises to enable ML in environments with data sharing constraints, e.g., due to data privacy concerns or other considerations. Data drift and concept drift commonly refer to unpredictable changes in data distributions over time. They are known to degrade an ML model's performance in many real-world scenarios. While drift detection and adaptation have been studied extensively in the non-federated setting, they are still less explored in the FL setting. The private and distributed nature of data in FL makes drift detection much harder, since no entity can oversee all data instances to estimate changes in the global data distribution. In this paper, we propose a novel unsupervised federated data drift detection method that is based on federated fuzzy c-means clustering and the federated fuzzy Davies-Bouldin index, a global cluster validation metric. First, using the federated fuzzy c-means clustering algorithm, an initial global data model is learned. Second, the federated fuzzy Davies-Bouldin index Delta is calculated, estimating how well the data fits the learned model. Third, whenever a new batch of data is available at time t, the fit between the initial data model and the new data is evaluated through the federated fuzzy Davies-Bouldin index Delta(t). Finally, Delta and Delta(t) are compared to detect drift. The method is unsupervised, as it does not require any labels, and it detects global data drift while keeping all data private. We evaluate our method carefully in a controlled environment by simulating multiple federated drift scenarios. We observe promising results: the method rarely raises false alarms and detects drift in multiple scenarios. We also observe shortcomings, such as sensitivity to parameter choices and a low detection rate when only a few data points in a new batch of data are affected by drift.
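The four steps in the abstract can be illustrated with a minimal, non-federated sketch. Everything below is an assumption made for illustration and is not the authors' implementation: it runs on a single machine with NumPy, it uses one common formulation of a fuzzy Davies-Bouldin index (membership-weighted scatter divided by center separation), and the names `memberships`, `fuzzy_c_means`, `fuzzy_davies_bouldin`, `drift_detected`, and the `tolerance` parameter are hypothetical. The abstract states only that Delta and Delta(t) are compared; the relative-threshold rule here is one possible way to do so.

```python
import numpy as np


def memberships(X, centers, m=2.0):
    """Fuzzy c-means memberships of each point to each fixed center."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # guard against division by zero
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Plain (centralized) fuzzy c-means; returns the c cluster centers."""
    rng = np.random.default_rng(seed)
    # Farthest-point initialization: a random first center, then the
    # point farthest from all centers chosen so far.
    centers = [X[rng.integers(len(X))]]
    for _ in range(c - 1):
        chosen = np.array(centers)
        d = np.linalg.norm(X[:, None, :] - chosen[None, :, :], axis=2)
        centers.append(X[np.argmax(d.min(axis=1))])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        w = memberships(X, centers, m) ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return centers


def fuzzy_davies_bouldin(X, centers, m=2.0):
    """Assumed fuzzy Davies-Bouldin index: lower means a better fit."""
    w = memberships(X, centers, m) ** m
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Membership-weighted mean distance of the data to each center.
    scatter = (w * d).sum(axis=0) / w.sum(axis=0)
    sep = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)  # exclude a cluster paired with itself
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return ratios.max(axis=1).mean()


def drift_detected(delta_ref, delta_t, tolerance=0.5):
    """Hypothetical rule: flag drift if Delta(t) exceeds Delta by > tolerance."""
    return delta_t > delta_ref * (1.0 + tolerance)
```

In use, the centers are fitted once on the initial data, `fuzzy_davies_bouldin` on that data gives the reference value Delta, and each new batch is scored against the same fixed centers to obtain Delta(t). In the federated setting described in the abstract, the sums inside these functions would have to be computed from per-client aggregates rather than from pooled data; that protocol is not shown here.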
Pages: 8