Logging requirement for continuous auditing of responsible machine learning-based applications

Cited: 0
Authors
Patrick Loic Foalem [1 ]
Leuson Da Silva [1 ]
Foutse Khomh [1 ]
Heng Li [1 ]
Ettore Merlo [1 ]
Affiliations
[1] Polytechnique Montreal, Department of Computer Engineering and Software Engineering
Keywords
Empirical; GitHub repository; Machine learning; Responsible ML; Logging; Auditing; Transparency; Fairness; Accountability;
DOI
10.1007/s10664-025-10656-8
Abstract
Machine learning (ML) is increasingly used across various industries to automate decision-making processes. However, concerns about the ethical and legal compliance of ML models have arisen due to their lack of transparency, fairness, and accountability. Monitoring, particularly through logging, is a widely used technique in traditional software systems that could be leveraged to assist in auditing ML-based applications. Logs provide a record of an application’s behavior, which can be used for continuous auditing, debugging, and analyzing both the behavior and performance of the application. In this study, we investigate the logging practices of ML practitioners to capture responsible ML-related information in ML applications. We analyzed 85 ML projects hosted on GitHub, leveraging 20 responsible ML libraries that span principles such as privacy, transparency & explainability, fairness, and security & safety. Our analysis revealed important differences in the implementation of responsible AI principles. For example, out of 5,733 function calls analyzed, privacy accounted for 89.3% (5,120 calls), while fairness represented only 2.1% (118 calls), highlighting the uneven emphasis on these principles across projects. Furthermore, our manual analysis of 44,877 issue discussions revealed that only 8.1% of the sampled issues addressed responsible AI principles, with transparency & explainability being the most frequently discussed principle (32.2% of all issues related to responsible AI principles). Additionally, a survey conducted with ML practitioners provided direct insights into their perspectives, informing our exploration of ways to enhance logging practices for more effective responsible-ML auditing. We discovered that while privacy, model interpretability & explainability, fairness, and security & safety are commonly considered, there is a gap in how metrics associated with these principles are logged. Specifically, crucial fairness metrics like group and individual fairness, privacy metrics such as epsilon and delta, and explainability metrics like SHAP values are not captured by current logging practices. The insights from this study highlight the need for ML practitioners and logging tool developers to adopt enhanced logging strategies that incorporate a broader range of responsible AI metrics. This adjustment will facilitate the development of auditable and ethically responsible ML applications, ensuring they meet emerging regulatory and societal expectations. These insights offer actionable guidance for improving the accountability and trustworthiness of ML systems.
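The gap the abstract describes points to a concrete practice: emitting fairness, privacy, and explainability metrics as structured log records alongside conventional operational logs. The sketch below is illustrative only and is not taken from the paper; it uses Python's standard logging module, and every metric name and value (demographic parity difference, epsilon/delta, top SHAP attributions) is a hypothetical placeholder assumed to be computed elsewhere with tools such as fairlearn, a differential-privacy accountant, or shap.

```python
import json
import logging

# Minimal sketch (not from the paper): a structured audit logger for the
# responsible-ML metrics the study found missing from current practice.
# All metric names and values are hypothetical placeholders assumed to be
# computed elsewhere (e.g., with fairlearn, a DP accountant, or shap).
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s %(message)s")
audit_log = logging.getLogger("responsible_ml_audit")


def log_responsible_ml_metrics(run_id: str, fairness: dict,
                               privacy: dict, explainability: dict) -> None:
    """Emit one structured audit record per model evaluation run."""
    record = {
        "run_id": run_id,
        "fairness": fairness,              # e.g., group/individual fairness scores
        "privacy": privacy,                # e.g., differential-privacy epsilon/delta
        "explainability": explainability,  # e.g., top SHAP feature attributions
    }
    audit_log.info(json.dumps(record))


# Hypothetical usage with placeholder values:
log_responsible_ml_metrics(
    run_id="eval-2024-06-01-42",
    fairness={"demographic_parity_difference": 0.04,
              "individual_fairness_consistency": 0.93},
    privacy={"epsilon": 1.5, "delta": 1e-5},
    explainability={"top_shap_features": {"age": 0.31, "income": 0.22}},
)
```

Serializing each record as JSON keeps the audit trail machine-parseable, which is what continuous auditing across many runs would require.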
Related Papers (50 in total)
  • [21] Detecting the impact of subject characteristics on machine learning-based diagnostic applications
    Neto, Elias Chaibub
    Pratap, Abhishek
    Perumal, Thanneer M.
    Tummalacherla, Meghasyam
    Snyder, Phil
    Bot, Brian M.
    Trister, Andrew D.
    Friend, Stephen H.
    Mangravite, Lara
    Omberg, Larsson
    NPJ DIGITAL MEDICINE, 2019, 2 (1)
  • [22] Conceptual Mappings of Conventional Software and Machine Learning-based Applications Development
    Angel, Shannon
    Namin, Akbar Siami
2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022 : 1223 - 1230
  • [23] A DT Machine Learning-Based Satellite Orbit Prediction for IoT Applications
    Xu X.
    Wen H.
    Song H.
    Zhao Y.
IEEE Internet of Things Magazine, 2023, 6 (02) : 96 - 100
  • [24] Efficient Encoding and Decoding of Voxelized Models for Machine Learning-Based Applications
    Strnad, Damjan
    Kohek, Stefan
    Zalik, Borut
    Vasa, Libor
    Nerat, Andrej
    IEEE ACCESS, 2025, 13 : 5551 - 5561
  • [25] Bayesian and machine learning-based fault detection and diagnostics for marine applications
    Cheliotis, Michail
    Lazakis, Iraklis
    Cheliotis, Angelos
    SHIPS AND OFFSHORE STRUCTURES, 2022, 17 (12) : 2686 - 2698
  • [26] Detecting the impact of subject characteristics on machine learning-based diagnostic applications
    Elias Chaibub Neto
    Abhishek Pratap
    Thanneer M. Perumal
    Meghasyam Tummalacherla
    Phil Snyder
    Brian M. Bot
    Andrew D. Trister
    Stephen H. Friend
    Lara Mangravite
    Larsson Omberg
    npj Digital Medicine, 2
  • [27] A review of machine learning-based human activity recognition for diverse applications
    Kulsoom, Farzana
    Narejo, Sanam
    Mehmood, Zahid
    Chaudhry, Hassan Nazeer
    Butt, Aisha
    Bashir, Ali Kashif
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (21) : 18289 - 18324
  • [28] A review of machine learning-based human activity recognition for diverse applications
    Farzana Kulsoom
    Sanam Narejo
    Zahid Mehmood
    Hassan Nazeer Chaudhry
    Ayesha Butt
    Ali Kashif Bashir
    Neural Computing and Applications, 2022, 34 : 18289 - 18324
  • [29] Evolution of Machine Learning in Tuberculosis Diagnosis: A Review of Deep Learning-Based Medical Applications
    Singh, Manisha
    Pujar, Gurubasavaraj Veeranna
    Kumar, Sethu Arun
    Bhagyalalitha, Meduri
    Akshatha, Handattu Shankaranarayana
    Abuhaija, Belal
    Alsoud, Anas Ratib
    Abualigah, Laith
    Beeraka, Narasimha M.
    Gandomi, Amir H.
    ELECTRONICS, 2022, 11 (17)
  • [30] Continuous Defect Prediction in CI/CD Pipelines: A Machine Learning-Based Framework
    Giorgio, Lazzarinetti
    Nicola, Massarenti
    Fabio, Sgro
    Andrea, Salafia
    AIXIA 2021 - ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, 13196 : 591 - 606