A comprehensive workflow for optimizing RNA-seq data analysis

Cited by: 1
Authors
Jiang, Gao [1]
Zheng, Juan-Yu [1]
Ren, Shu-Ning [2]
Yin, Weilun [2]
Xia, Xinli [2]
Li, Yun [1]
Wang, Hou-Ling [2]
Affiliations
[1] Beijing Forestry Univ, Sch Artificial Intelligence, Sch Informat Sci & Technol, Beijing 100083, Peoples R China
[2] Beijing Forestry Univ, Coll Biol Sci & Technol, Natl Engn Res Ctr Tree Breeding & Ecol Restorat, State Key Lab Tree Genet & Breeding, Beijing 100083, Peoples R China
Source
BMC GENOMICS | 2024 / Volume 25 / Issue 01
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
RNA-seq data; Differential gene analysis; Software comparison; DIFFERENTIAL EXPRESSION; ALIGNMENT; PROGRAM; HISAT;
DOI
10.1186/s12864-024-10414-y
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Current RNA-seq analysis software tends to use similar parameters across different species without considering species-specific differences. However, the suitability and accuracy of these tools may vary when analyzing data from different species, such as humans, animals, plants, fungi, and bacteria. For most laboratory researchers lacking a background in information science, constructing an analysis workflow that meets their specific needs from the array of complex analytical tools available poses a significant challenge. Results: Using RNA-seq data from plants, animals, and fungi, we observed that analytical tools vary in performance when applied to different species. A comprehensive experiment was conducted specifically on plant pathogenic fungal data, with differential gene analysis as the ultimate goal. In this study, 288 pipelines using different tools were applied to five fungal RNA-seq datasets, and their performance was evaluated based on simulation. This led to a relatively universal and superior fungal RNA-seq analysis pipeline that can serve as a reference, along with certain standards for selecting analysis tools. Additionally, we compared various tools for alternative splicing analysis. The results based on simulated data indicated that rMATS remained the optimal choice, although supplementing it with tools such as SpliceWiz could be considered. Conclusion: The experimental results demonstrate that, compared with default software parameter configurations, the tuned analysis combinations can provide more accurate biological insights. Carefully selecting analysis software suited to the data, rather than choosing tools indiscriminately, helps achieve high-quality analysis results more efficiently.
Pages: 21
Related papers
50 in total
  • [11] VIPER: Visualization Pipeline for RNA-seq, a Snakemake workflow for efficient and complete RNA-seq analysis
    Cornwell, MacIntosh
    Vangala, Mahesh
    Taing, Len
    Herbert, Zachary
    Koester, Johannes
    Li, Bo
    Sun, Hanfei
    Li, Taiwen
    Zhang, Jian
    Qiu, Xintao
    Pun, Matthew
    Jeselsohn, Rinath
    Brown, Myles
    Liu, X. Shirley
    Long, Henry W.
    BMC BIOINFORMATICS, 2018, 19
  • [12] grandR: a comprehensive package for nucleotide conversion RNA-seq data analysis
    Rummel, Teresa
    Sakellaridi, Lygeri
    Erhard, Florian
    NATURE COMMUNICATIONS, 2023, 14 (01)
  • [13] grandR: a comprehensive package for nucleotide conversion RNA-seq data analysis
    Teresa Rummel
    Lygeri Sakellaridi
    Florian Erhard
    Nature Communications, 14
  • [14] Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome
    Peng, Zhiyu
    Cheng, Yanbing
    Tan, Bertrand Chin-Ming
    Kang, Lin
    Tian, Zhijian
    Zhu, Yuankun
    Zhang, Wenwei
    Liang, Yu
    Hu, Xueda
    Tan, Xuemei
    Guo, Jing
    Dong, Zirui
    Liang, Yan
    Bao, Li
    Wang, Jun
    NATURE BIOTECHNOLOGY, 2012, 30 (03) : 253 - +
  • [15] Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome
    Zhiyu Peng
    Yanbing Cheng
    Bertrand Chin-Ming Tan
    Lin Kang
    Zhijian Tian
    Yuankun Zhu
    Wenwei Zhang
    Yu Liang
    Xueda Hu
    Xuemei Tan
    Jing Guo
    Zirui Dong
    Yan Liang
    Li Bao
    Jun Wang
    Nature Biotechnology, 2012, 30 : 253 - 260
  • [16] Analysis of clustered RNA-seq data
    Park, Hyunjin
    Lee, Seungyeoun
    Kim, Ye Jin
    Choi, Myung-Sook
    Park, Taesung
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2017, 19 (01) : 19 - 31
  • [17] A comprehensive simulation study on classification of RNA-Seq data
    Zararsiz, Gokmen
    Goksuluk, Dincer
    Korkmaz, Selcuk
    Eldem, Vahap
    Zararsiz, Gozde Erturk
    Duru, Izzet Parug
    Ozturk, Ahmet
    PLOS ONE, 2017, 12 (08)
  • [18] ARMOR: An Automated Reproducible MOdular Workflow for Preprocessing and Differential Analysis of RNA-seq Data
    Orjuela, Stephany
    Huang, Ruizhu
    Hembach, Katharina M.
    Robinson, Mark D.
    Soneson, Charlotte
    G3-GENES GENOMES GENETICS, 2019, 9 (07) : 2089 - 2096
  • [19] Scaling up single-cell RNA-seq data analysis with CellBridge workflow
    Nouri, Nima
    Kurlovs, Andre H.
    Gaglia, Giorgio
    de Rinaldis, Emanuele
    Savova, Virginia
    BIOINFORMATICS, 2023, 39 (12)
  • [20] The Mechanisms of Maize Resistance to Fusarium verticillioides by Comprehensive Analysis of RNA-seq Data
    Wang, Yanping
    Zhou, Zijian
    Gao, Jingyang
    Wu, Yabin
    Xia, Zongliang
    Zhang, Huiyong
    Wu, Jianyu
    FRONTIERS IN PLANT SCIENCE, 2016, 7