SciDataFlow: a tool for improving the flow of data through science

Authors
Buffalo, Vince [1, 2]
Affiliations
[1] Univ Calif Berkeley, Dept Integrat Biol, Berkeley, CA USA
[2] Univ Calif Berkeley, Dept Integrat Biol, 4134 Valley Life Sci Bldg, Berkeley, CA 94720 USA
DOI
10.1093/bioinformatics/btad754
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Managing data and code in open scientific research is complicated by two key problems: large datasets often cannot be stored alongside code in repository platforms like GitHub, and iterative analysis can lead to unnoticed changes to data, increasing the risk that analyses are based on older versions of data.
Results: SciDataFlow is a fast, concurrent command-line tool paired with a simple Data Manifest specification that streamlines tracking data changes, uploading data to remote repositories, and pulling in all data necessary to reproduce a computational analysis.
Availability and implementation: SciDataFlow is available at https://github.com/vsbuffalo/scidataflow.
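The idea of catching unnoticed data changes with a manifest can be illustrated with a minimal Python sketch. This is not SciDataFlow's implementation or its Data Manifest format. The function names (`file_digest`, `build_manifest`, `changed_files`), the choice of SHA-256, and the in-memory dictionary layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record the current digest of every tracked file (hypothetical layout)."""
    return {str(p): file_digest(p) for p in paths}


def changed_files(manifest):
    """List tracked files that are missing or whose contents no longer match."""
    changed = []
    for name, recorded in manifest.items():
        path = Path(name)
        if not path.exists() or file_digest(path) != recorded:
            changed.append(name)
    return sorted(changed)
```

In this sketch, a manifest built once with `build_manifest` can be checked before each analysis run with `changed_files`, which returns an empty list when every tracked file still matches its recorded digest.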
Pages: 3