A comprehensive workflow for optimizing RNA-seq data analysis

Cited by: 1
|
Authors
Jiang, Gao [1]
Zheng, Juan-Yu [1]
Ren, Shu-Ning [2]
Yin, Weilun [2]
Xia, Xinli [2]
Li, Yun [1]
Wang, Hou-Ling [2]
Affiliations
[1] Beijing Forestry Univ, Sch Artificial Intelligence, Sch Informat Sci & Technol, Beijing 100083, Peoples R China
[2] Beijing Forestry Univ, Coll Biol Sci & Technol, Natl Engn Res Ctr Tree Breeding & Ecol Restorat, State Key Lab Tree Genet & Breeding, Beijing 100083, Peoples R China
Source
BMC GENOMICS | 2024 / Vol. 25 / Issue 01
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China;
Keywords
RNA-seq data; Differential gene analysis; Software comparison; DIFFERENTIAL EXPRESSION; ALIGNMENT; PROGRAM; HISAT;
DOI
10.1186/s12864-024-10414-y
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Current RNA-seq analysis software tends to apply similar parameters across species without accounting for species-specific differences. However, the suitability and accuracy of these tools may vary across data from different species, such as humans, animals, plants, fungi, and bacteria. For most laboratory researchers without a background in information science, constructing an analysis workflow that meets their specific needs from the array of complex analytical tools available poses a significant challenge.
Results: Using RNA-seq data from plants, animals, and fungi, we observed that analytical tools vary in performance when applied to different species. A comprehensive experiment was then conducted on plant pathogenic fungal data, with differential gene analysis as the ultimate goal. In this study, 288 pipelines built from different tools were applied to five fungal RNA-seq datasets, and their results were evaluated based on simulation. This led to a relatively universal and superior fungal RNA-seq analysis pipeline that can serve as a reference, along with certain standards for selecting analysis tools. Additionally, we compared various tools for alternative splicing analysis. The results on simulated data indicated that rMATS remained the optimal choice, although it could be supplemented with tools such as SpliceWiz.
Conclusion: The experimental results demonstrate that, compared with default software parameter configurations, the tuned analysis combinations can provide more accurate biological insights. Carefully selecting analysis software suited to the data, rather than choosing tools indiscriminately, helps achieve high-quality analysis results more efficiently.
Pages: 21
Related papers
50 in total
  • [1] A comprehensive review on RNA-seq data analysis
    Zhang, Li
    Liu, Xuejun
    Transactions of Nanjing University of Aeronautics and Astronautics, 2016, 33 (03) : 339 - 361
  • [2] A Comprehensive Review on RNA-seq Data Analysis
    Zhang Li
    Liu Xuejun
    Transactions of Nanjing University of Aeronautics and Astronautics, 2016, 33 (03) : 339 - 361
  • [3] RASflow: an RNA-Seq analysis workflow with Snakemake
    Zhang, Xiaokang
    Jonassen, Inge
    BMC BIOINFORMATICS, 2020, 21 (01)
  • [4] RASflow: an RNA-Seq analysis workflow with Snakemake
    Xiaokang Zhang
    Inge Jonassen
    BMC Bioinformatics, 21
  • [5] Improving an RNA-Seq Analysis Workflow in Blueberry
    Paya-Milans, Miriam
    Nunez, Gerardo H.
    Olmstead, James W.
    Rinehart, Timothy
    HORTSCIENCE, 2017, 52 (09) : S421 - S421
  • [6] SMART-RDA: A Galaxy Workflow for RNA-Seq Data Analysis
    Aditama, Redi
    Tanjung, Zulfikar Achmad
    Sudania, Widyartini Made
    Liwang, Tony
    4TH INTERNATIONAL CONFERENCE ON BIOLOGICAL SCIENCE (2015), 2017, : 186 - 193
  • [7] COMPSRA: a COMprehensive Platform for Small RNA-Seq data Analysis
    Jiang Li
    Alvin T. Kho
    Robert P. Chase
    Lorena Pantano
    Leanna Farnam
    Sami S. Amr
    Kelan G. Tantisira
    Scientific Reports, 10
  • [8] iSmaRT: a toolkit for a comprehensive analysis of small RNA-Seq data
    Panero, Riccardo
    Rinaldi, Antonio
    Memoli, Domenico
    Nassa, Giovanni
    Ravo, Maria
    Rizzo, Francesca
    Tarallo, Roberta
    Milanesi, Luciano
    Weisz, Alessandro
    Giurato, Giorgio
    BIOINFORMATICS, 2017, 33 (06) : 938 - 940
  • [9] COMPSRA: a COMprehensive Platform for Small RNA-Seq data Analysis
    Li, Jiang
    Kho, Alvin T.
    Chase, Robert P.
    Pantano, Lorena
    Farnam, Leanna
    Amr, Sami S.
    Tantisira, Kelan G.
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [10] VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis
    MacIntosh Cornwell
    Mahesh Vangala
    Len Taing
    Zachary Herbert
    Johannes Köster
    Bo Li
    Hanfei Sun
    Taiwen Li
    Jian Zhang
    Xintao Qiu
    Matthew Pun
    Rinath Jeselsohn
    Myles Brown
    X. Shirley Liu
    Henry W. Long
    BMC Bioinformatics, 19