The Prevalence of Errors in Machine Learning Experiments

Cited by: 6
Authors
Shepperd, Martin [1]
Guo, Yuchen [2]
Li, Ning [3]
Arzoky, Mahir [1]
Capiluppi, Andrea [1]
Counsell, Steve [1]
Destefanis, Giuseppe [1]
Swift, Stephen [1]
Tucker, Allan [1]
Yousefi, Leila [1]
Affiliations
[1] Brunel Univ London, London, England
[2] Xi An Jiao Tong Univ, Xian, Peoples R China
[3] Northwestern Polytech Univ, Xian, Peoples R China
Keywords
Classifier; Computational experiment; Reliability; Error
DOI
10.1007/978-3-030-33607-3_12
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Context: Conducting experiments is central to machine learning research, where they are used to benchmark, evaluate and compare learning algorithms. Consequently, it is important that we conduct reliable, trustworthy experiments. Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction. Our focus is on simple arithmetical and statistical errors. Method: We analyse 49 papers describing 2456 individual experimental results from a previously undertaken systematic review comparing supervised and unsupervised defect prediction classifiers. We extract the confusion matrices and test for relevant constraints, e.g., the marginal probabilities must sum to one. We also check for multiple statistical significance testing errors. Results: We find that 22 of the 49 papers contain demonstrable errors. Of these, 7 were statistical and 16 related to confusion matrix inconsistency (one paper contained both classes of error). Conclusions: Whilst some errors may be relatively trivial, e.g., transcription errors, their presence does not engender confidence. We strongly urge researchers to follow open science principles so that errors can be more easily detected and corrected, and so that we as a community can reduce this worryingly high error rate in our computational experiments.
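The kind of constraint check mentioned in the Method can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `check_confusion_matrix`, the use of cell proportions rather than counts, and the rounding tolerance `tol` are all assumptions made for illustration. The sketch tests that the four cells of a binary confusion matrix are valid proportions that sum to one, and that any reported precision or recall agrees with the values recomputed from the cells.

```python
from typing import List, Optional


def check_confusion_matrix(
    tp: float,
    fp: float,
    fn: float,
    tn: float,
    precision: Optional[float] = None,
    recall: Optional[float] = None,
    tol: float = 0.005,  # assumed allowance for rounding in published tables
) -> List[str]:
    """Return a list of inconsistencies found in one reported result.

    The cells are expected as proportions of all instances (hypothetical
    input convention), so they should each lie in [0, 1] and sum to one.
    """
    problems: List[str] = []

    # Each cell must be a valid proportion.
    for name, value in (("tp", tp), ("fp", fp), ("fn", fn), ("tn", tn)):
        if value < 0 or value > 1:
            problems.append(f"{name} is outside [0, 1]")

    # The four proportions must sum to one (within rounding).
    total = tp + fp + fn + tn
    if abs(total - 1.0) > tol:
        problems.append(f"cell proportions sum to {total:.3f}, not 1")

    # Reported metrics must match the values implied by the cells.
    if precision is not None and tp + fp > 0:
        if abs(tp / (tp + fp) - precision) > tol:
            problems.append("reported precision disagrees with the matrix")
    if recall is not None and tp + fn > 0:
        if abs(tp / (tp + fn) - recall) > tol:
            problems.append("reported recall disagrees with the matrix")

    return problems
```

An empty list means no inconsistency was detected at the chosen tolerance; it does not show that the result is correct. A tolerance is needed because published values are rounded, and the right value depends on how many decimal places a paper reports.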
Pages: 102-109
Number of pages: 8