A Federated Learning Approach for Anomaly Detection in High Performance Computing

Cited by: 4
Authors
Farooq, Emmen [1]
Borghesi, Andrea [1]
Affiliations
[1] Univ Bologna, DISI, Bologna, Italy
Keywords
Federated Learning; High Performance Computing; Anomaly Detection; Machine Learning
DOI
10.1109/ICTAI59109.2023.00079
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
High Performance Computing (HPC) systems are complex machines that need to be operated at their maximum potential to recoup their investment cost and to mitigate their environmental impact. Anomalous conditions that hinder the correct usage of the supercomputing nodes are a significant problem. Hence, the development of automated anomaly detection techniques remains a vital area of research. Machine Learning (ML) models have proven effective at detecting anomalies on individual nodes. However, the potential of combining data from multiple computing nodes and their associated ML models has not yet been explored. Federated Learning (FL) can address this shortcoming by allowing individual models to learn from each other. This paper applies FL to improve the performance of anomaly detection models for HPC systems. The approach has been validated on data from an actual supercomputer, improving the average F-score from 0.31 to 0.84. We also show how FL can significantly shorten the data collection period needed to create a training set. While ML models need, on average, 4.5 months of training data, FL reduces the required data collection period to 1.2 weeks, a 15x reduction.
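The abstract does not say which aggregation rule the authors use. As an illustration only, the sketch below shows one common way per-node models can "learn from each other": sample-weighted averaging of model parameters, in the style of federated averaging. The function name `federated_average`, the flat parameter vectors, and the node sample counts are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-node parameter vectors into one global vector.

    Assumption (not from the paper): each node's model is represented as a
    flat list of floats, and nodes are weighted by their number of local
    training samples.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")
    # Weighted mean of each parameter across nodes.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two compute nodes, the second with 3x more data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full federated round, each node would then continue training from `global_weights` on its own local data, so raw monitoring data never needs to leave the node.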
Pages: 496-500
Page count: 5
Related papers
50 in total
  • [1] A Federated Learning Approach to Anomaly Detection in Smart Buildings
    Sater, Raed Abdel
    Ben Hamza, A.
    ACM TRANSACTIONS ON INTERNET OF THINGS, 2021, 2 (04):
  • [2] Enhancing IoT Anomaly Detection Performance for Federated Learning
    Weinger, Brett
    Kim, Jinoh
    Sim, Alex
    Nakashima, Makiya
    Moustafa, Nour
    Wu, K. John
    2020 16TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING (MSN 2020), 2020, : 206 - 213
  • [3] Enhancing IoT anomaly detection performance for federated learning
    Weinger, Brett
    Kim, Jinoh
    Sim, Alex
    Nakashima, Makiya
    Moustafa, Nour
    Wu, K. John
    DIGITAL COMMUNICATIONS AND NETWORKS, 2022, 8 (03) : 314 - 323
  • [4] Enhancing IoT anomaly detection performance for federated learning
    Brett Weinger
    Jinoh Kim
    Alex Sim
    Makiya Nakashima
    Nour Moustafa
    KJohn Wu
    Digital Communications and Networks, 2022, 8 (03) : 314 - 323
  • [5] FedGroup: A Federated Learning Approach for Anomaly Detection in IoT Environments
    Zhang, Yixuan
    Suleiman, Basem
    Alibasa, Muhammad Johan
    MOBILE AND UBIQUITOUS SYSTEMS: COMPUTING, NETWORKING AND SERVICES, MOBIQUITOUS 2022, 2023, 492 : 121 - 132
  • [6] FADngs: Federated Learning for Anomaly Detection
    Dong, Boyu
    Chen, Dong
    Wu, Yu
    Tang, Siliang
    Zhuang, Yueting
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (02) : 2578 - 2592
  • [7] Anomaly Detection and Anticipation in High Performance Computing Systems
    Borghesi, Andrea
    Molan, Martin
    Milano, Michela
    Bartolini, Andrea
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (04) : 739 - 750
  • [8] A semisupervised autoencoder-based approach for anomaly detection in high performance computing systems
    Borghesi, Andrea
    Bartolini, Andrea
    Lombardi, Michele
    Milano, Michela
    Benini, Luca
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2019, 85 : 634 - 644
  • [9] Anomaly Detection through Unsupervised Federated Learning
    Nardi, Mirko
    Valerio, Lorenzo
    Passarella, Andrea
    2022 18TH INTERNATIONAL CONFERENCE ON MOBILITY, SENSING AND NETWORKING, MSN, 2022, : 495 - 501
  • [10] Federated Learning for Anomaly Detection in Vehicular Networks
    Tham, Chen-Khong
    Yang, Lu
    Khanna, Akshit
    Gera, Bhavya
    2023 IEEE 97TH VEHICULAR TECHNOLOGY CONFERENCE, VTC2023-SPRING, 2023,