Silent bugs in deep learning frameworks: an empirical study of Keras and TensorFlow

Cited by: 0

Authors:

Tambon, Florian ^{[1]}

Nikanjam, Amin ^{[1]}

An, Le ^{[1]}

Khomh, Foutse ^{[1]}

Antoniol, Giuliano ^{[1]}

Affiliation:

[1] Polytech Montreal, Montreal, PQ H3C 3A7, Canada

Source:

EMPIRICAL SOFTWARE ENGINEERING | 2024 / Vol. 29 / No. 01

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Deep learning; Bug analysis; Empirical study; Keras; TensorFlow;

DOI:

10.1007/s10664-023-10389-6

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Subject classification code:

081202 ; 0835 ;

Abstract:

Deep Learning (DL) frameworks are now widely used, simplifying the creation of complex models as well as their integration into various applications, even for non-DL experts. However, like any other program, they are prone to bugs. This paper deals with a subcategory of bugs named silent bugs: they lead to wrong behavior but do not cause system crashes or hangs, nor show an error message to the user. Such bugs are even more dangerous in DL applications and frameworks due to the "black-box" and stochastic nature of DL systems (i.e., the end user cannot understand how the model makes decisions). This paper presents the first empirical study of silent bugs in TensorFlow, specifically its high-level API Keras, and their impact on users' programs. We extracted closed issues related to the Keras API from the TensorFlow GitHub repository. Out of the 1,168 issues that we gathered, 77 were reproducible silent bugs affecting users' programs. We categorized the bugs based on their effects on the users' programs and the components where the issues occurred, using information from the issue reports. We then derived a threat level for each of the issues, based on the impact they had on the users' programs. To assess the relevance of the identified categories and the impact scale, we conducted an online survey with 103 DL developers. The participants generally agreed that silent bugs in DL frameworks have a significant impact on users, and acknowledged our findings (i.e., the categories of silent bugs and the proposed impact scale).

Pages: 34
