Hardware Performance Evaluation of De novo Transcriptome Assembly Software in Amazon Elastic Compute Cloud

Cited by: 2
Authors
Mora-Marquez, Fernando [1 ]
Luis Vazquez-Poletti, Jose [2 ]
Chano, Victor [1 ]
Collada, Carmen [1 ]
Soto, Alvaro [1 ]
Lopez de Heredia, Unai [1 ]
Affiliations
[1] Univ Politecn Madrid, ETSI Montes Forestal & Medio Nat, Dept Sistemas & Recursos Nat, GI Sistemas Nat & Hist Forestal, Ciudad Univ, Madrid 28040, Spain
[2] Univ Complutense Madrid, Fac Informat, Dept Arquitectura Comp & Automat, GI Arquitectura Sistemas Distribuidos, Ciudad Univ, Madrid 28040, Spain
Keywords
Cloud computing; cost-efficiency; quality; RNA-seq; transcriptome; magnitude; QUALITY ASSESSMENT-TOOL; RNA-SEQ DATA; GENERATION; RECONSTRUCTION; SELECTION;
DOI
10.2174/1574893615666191219095817
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Bioinformatics software for RNA-seq analysis has high computational requirements in terms of the number of CPUs, RAM size, and processor characteristics. Specifically, de novo transcriptome assembly demands a large computational infrastructure because of the massive data size and the complexity of the algorithms employed. Comparative studies on the quality of the transcriptomes yielded by de novo assemblers have been published previously; they lack, however, a hardware efficiency-oriented approach to help select the assembly hardware platform in a cost-efficient way.
Objective: We tested the performance of two popular de novo transcriptome assemblers, Trinity and SOAPdenovo-Trans (SDNT), in terms of cost-efficiency and quality to assess limitations, and provided troubleshooting and guidelines to run transcriptome assemblies efficiently.
Methods: We built virtual machines with different hardware characteristics (CPU number, RAM size) in the Amazon Elastic Compute Cloud of Amazon Web Services. Using simulated and real data sets, we measured the elapsed time, cost, CPU percentage, and output size of small and large data set assemblies.
Results: For small data sets, SDNT outperformed Trinity by an order of magnitude, significantly reducing the duration and cost of the assembly. For large data sets, Trinity performed better than SDNT. Both assemblers provided good-quality transcriptomes.
Conclusion: The selection of the optimal transcriptome assembler and the provision of computational resources depend on the combined effect of the size and complexity of RNA-seq experiments.
Pages: 420-430
Number of pages: 11