A large-scale study on research code quality and execution

Authors
Ana Trisovic
Matthew K. Lau
Thomas Pasquier
Mercè Crosas
Affiliations
[1] Harvard University,Institute for Quantitative Social Science
[2] Chinese Academy of Sciences,CAS Key Laboratory of Forest Ecology and Management, Institute of Applied Ecology
[3] University of British Columbia,Department of Computer Science
Abstract
This article presents a study on the quality and execution of research code from publicly available replication datasets at the Harvard Dataverse repository. Research code is typically created by a group of scientists and published together with academic papers to facilitate research transparency and reproducibility. For this study, we define ten questions to address aspects impacting research reproducibility and reuse. First, we retrieve and analyze more than 2000 replication datasets with over 9000 unique R files published from 2010 to 2020. Second, we execute the code in a clean runtime environment to assess its ease of reuse. We identify common coding errors and resolve some of them with automatic code cleaning to aid code execution. We find that 74% of R files failed to complete without error in the initial execution, while 56% failed when code cleaning was applied, showing that many errors can be prevented with good coding practices. We also analyze the replication datasets from journals’ collections and discuss the impact of the journal policy strictness on the code re-execution rate. Finally, based on our results, we propose a set of recommendations for code dissemination aimed at researchers, journals, and repositories.