Reproducible biomedical benchmarking in the cloud: lessons from crowd-sourced data challenges

Cited by: 16
Authors
Ellrott, Kyle [1]
Buchanan, Alex [1]
Creason, Allison [1]
Mason, Michael [2]
Schaffter, Thomas [3]
Hoff, Bruce [2]
Eddy, James [2]
Chilton, John M. [4]
Yu, Thomas [2]
Stuart, Joshua M. [5]
Saez-Rodriguez, Julio [6, 7, 8]
Stolovitzky, Gustavo [3]
Boutros, Paul C. [9, 10, 11, 12, 13, 14, 15]
Guinney, Justin [2, 16]
Affiliations
[1] Oregon Hlth & Sci Univ, Biomed Engn, Portland, OR 97239 USA
[2] Sage Bionetworks, Seattle, WA 98121 USA
[3] IBM Res, Yorktown Hts, NY USA
[4] Penn State Univ, Dept Biochem & Mol Biol, State Coll, PA USA
[5] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[6] Heidelberg Univ, Fac Med, Inst Computat Biomed, Heidelberg, Germany
[7] Heidelberg Univ Hosp, Bioquant, Heidelberg, Germany
[8] Rhein Westfal TH Aachen, Fac Med, Joint Res Ctr Computat Biomed, Aachen, Germany
[9] Ontario Inst Canc Res, Toronto, ON, Canada
[10] Univ Toronto, Dept Med Biophys, Toronto, ON, Canada
[11] Univ Toronto, Dept Pharmacol & Toxicol, Toronto, ON, Canada
[12] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA USA
[13] Univ Calif Los Angeles, Dept Urol, Los Angeles, CA USA
[14] Univ Calif Los Angeles, Jonsson Comprehens Canc Ctr, Los Angeles, CA 90024 USA
[15] Univ Calif Los Angeles, Inst Precis Hlth, Los Angeles, CA USA
[16] Univ Washington, Biomed Informat & Med Educ, Seattle, WA 98195 USA
Keywords
EXPRESSION;
DOI
10.1186/s13059-019-1794-0
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Challenges are achieving broad acceptance for addressing many biomedical questions and enabling tool assessment. But ensuring that the methods evaluated are reproducible and reusable is complicated by the diversity of software architectures, input and output file formats, and computing environments. To mitigate these problems, some challenges have leveraged new virtualization and compute methods, requiring participants to submit cloud-ready software packages. We review recent data challenges with innovative approaches to model reproducibility and data sharing, and outline key lessons for improving quantitative biomedical data analysis through crowd-sourced benchmarking challenges.
Pages: 9
Related papers
50 in total
  • [1] Reproducible biomedical benchmarking in the cloud: lessons from crowd-sourced data challenges
    Kyle Ellrott
    Alex Buchanan
    Allison Creason
    Michael Mason
    Thomas Schaffter
    Bruce Hoff
    James Eddy
    John M. Chilton
    Thomas Yu
    Joshua M. Stuart
    Julio Saez-Rodriguez
    Gustavo Stolovitzky
    Paul C. Boutros
    Justin Guinney
    Genome Biology, 20
  • [2] Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data
    Benoit, Kenneth
    Conway, Drew
    Lauderdale, Benjamin E.
    Laver, Michael
    Mikhaylov, Slava
    AMERICAN POLITICAL SCIENCE REVIEW, 2016, 110 (02) : 278 - 295
  • [3] Using Crowd-Sourced Data to Study Public Services: Lessons from Urban India
    Alison E. Post
    Anustubh Agnihotri
    Christopher Hyun
    Studies in Comparative International Development, 2018, 53 : 324 - 342
  • [4] Crowd-sourced soil data for Europe
    Shelley, Wayne
    Lawley, Russell
    Robinson, David A.
    NATURE, 2013, 496 (7445) : 300 - 300
  • [5] Using Crowd-Sourced Data to Study Public Services: Lessons from Urban India
    Post, Alison E.
    Agnihotri, Anustubh
    Hyun, Christopher
    STUDIES IN COMPARATIVE INTERNATIONAL DEVELOPMENT, 2018, 53 (03) : 324 - 342
  • [6] Crowd-sourced soil data for Europe
    Wayne Shelley
    Russell Lawley
    David A. Robinson
    Nature, 2013, 496 : 300 - 300
  • [7] HETEROGENEOUS CROWD-SOURCED DATA ANALYTICS
    Barhamgi, Mahmoud
    Zhou, Zhangbing
    Chen, Chao
    Thill, Jean-Claude
    IEEE ACCESS, 2017, 5 : 27807 - 27809
  • [8] Lessons from Fraxinus, a crowd-sourced citizen science game in genomics
    Rallapalli, Ghanasyam
    Players, Fraxinus
    Saunders, Diane Go
    Yoshida, Kentaro
    Edwards, Anne
    Lugo, Carlos A.
    Collin, Steve
    Clavijo, Bernardo
    Corpas, Manuel
    Swarbreck, David
    Clark, Matthew
    Downie, J. Allan
    Kamoun, Sophien
    Cooper, Team
    MacLean, Dan
    ELIFE, 2015, 4
  • [9] CDME - Crowd-Sourced Data Mapping Engine System that Analyzes, Mapps & Publishes Crowd-Sourced Data on Enviorenment Facts
    Ruwanpathirana, S.
    Perera, I.
    2015 Moratuwa Engineering Research Conference (MERCon), 2015, : 271 - 276
  • [10] Processing of Crowd-sourced Data from an Internet of Floating Things
    Montella, Raffaele
    Di Luccio, Diana
    Marcellino, Livia
    Galletti, Ardelio
    Kosta, Sokol
    Brizius, Alison
    Foster, Ian
    PROCEEDINGS OF WORKS 2017: 12TH WORKSHOP ON WORKFLOWS IN SUPPORT OF LARGE-SCALE SCIENCE, 2017,