The Prevalence of Errors in Machine Learning Experiments

Cited by: 6
Authors
Shepperd, Martin [1]
Guo, Yuchen [2]
Li, Ning [3]
Arzoky, Mahir [1]
Capiluppi, Andrea [1]
Counsell, Steve [1]
Destefanis, Giuseppe [1]
Swift, Stephen [1]
Tucker, Allan [1]
Yousefi, Leila [1]
Affiliations
[1] Brunel Univ London, London, England
[2] Xi An Jiao Tong Univ, Xian, Peoples R China
[3] Northwestern Polytech Univ, Xian, Peoples R China
Keywords
Classifier; Computational experiment; Reliability; Error;
DOI
10.1007/978-3-030-33607-3_12
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Context: Conducting experiments is central to machine learning research in order to benchmark, evaluate and compare learning algorithms. Consequently, it is important that we conduct reliable, trustworthy experiments. Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction. Our focus is on simple arithmetical and statistical errors. Method: We analyse 49 papers describing 2456 individual experimental results from a previously undertaken systematic review comparing supervised and unsupervised defect prediction classifiers. We extract the confusion matrices and test for relevant constraints, e.g., the marginal probabilities must sum to one. We also check for multiple statistical significance testing errors. Results: We find that a total of 22 out of 49 papers contain demonstrable errors. Of these, 7 were statistical and 16 related to confusion matrix inconsistency (one paper contained both classes of error). Conclusions: Whilst some errors may be of a relatively trivial nature, e.g., transcription errors, their presence does not engender confidence. We strongly urge researchers to follow open science principles so that errors can be more easily detected and corrected, and thus, as a community, reduce this worryingly high error rate in our computational experiments.
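The kind of constraint check the abstract mentions can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the abstract does not detail. It assumes the four confusion matrix cells are given as proportions, and the function name `check_confusion_matrix`, the `reported` dictionary and the rounding tolerance `tol` are all hypothetical choices made for illustration.

```python
from math import isclose


def check_confusion_matrix(tp, fp, fn, tn, reported=None, tol=0.005):
    """Return a list of inconsistencies found (empty if none).

    tp, fp, fn, tn are cell proportions; `reported` optionally maps
    "recall" and/or "precision" to the values stated in a paper.
    `tol` is an assumed allowance for rounding in published figures.
    """
    problems = []

    # Constraint 1: no cell may be negative.
    for name, value in (("tp", tp), ("fp", fp), ("fn", fn), ("tn", tn)):
        if value < 0:
            problems.append(f"{name} is negative ({value})")

    # Constraint 2: the proportions must sum to one.
    total = tp + fp + fn + tn
    if not isclose(total, 1.0, abs_tol=tol):
        problems.append(f"cells sum to {total:.3f}, not 1")

    # Constraint 3: reported metrics must match those derived from the cells.
    derived = {}
    if tp + fn > 0:
        derived["recall"] = tp / (tp + fn)
    if tp + fp > 0:
        derived["precision"] = tp / (tp + fp)
    for metric, value in (reported or {}).items():
        if metric in derived and abs(derived[metric] - value) > tol:
            problems.append(
                f"reported {metric} {value} differs from derived "
                f"{derived[metric]:.3f}"
            )

    return problems


if __name__ == "__main__":
    # Illustrative values only, not taken from any of the surveyed papers.
    print(check_confusion_matrix(0.2, 0.1, 0.1, 0.6, {"recall": 0.9}))
```

An empty list means only that none of these particular checks failed; it does not show that a result is correct.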
Pages: 102-109
Number of pages: 8