Do machine learning platforms provide out-of-the-box reproducibility?

Cited by: 28
Authors
Gundersen, Odd Erik [1]
Shamsaliei, Saeid [1]
Isdahl, Richard Juul [1]
Affiliations
[1] Norwegian University of Science and Technology, Department of Computer Science, Trondheim, Norway
Keywords
Reproducibility; Reproducible AI; Machine learning; Survey; Reproducibility experiment; SOFTWARE
DOI
10.1016/j.future.2021.06.014
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Science is experiencing an ongoing reproducibility crisis. In light of this crisis, our objective is to investigate whether machine learning platforms provide out-of-the-box reproducibility. Our method is twofold: First, we survey machine learning platforms for whether they provide features that simplify making experiments reproducible out-of-the-box. Second, we conduct the exact same experiment on four different machine learning platforms, thereby varying only the processing unit and ancillary software. The survey shows that no machine learning platform supports the feature set described by the proposed framework, while the experiment reveals statistically significant differences in results when the exact same experiment is conducted on different machine learning platforms. The surveyed machine learning platforms do not on their own enable users to achieve the full reproducibility potential of their research. Also, the machine learning platforms with the most users provide less functionality for achieving it. Furthermore, results differ when executing the same experiment on the different platforms. Wrong conclusions can be inferred at the 95% confidence level. Hence, we conclude that machine learning platforms do not provide reproducibility out-of-the-box and that results generated from one machine learning platform alone cannot be fully trusted. © 2021 The Author(s). Published by Elsevier B.V.
Pages: 34-47
Number of pages: 14