How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis

Authors
Perelo, Louisa Wessels [1]
Gabernet, Gisela [1, 5]
Straub, Daniel [1]
Nahnsen, Sven [1, 2, 3, 4]
Affiliations
[1] Univ Tubingen, Quant Biol Ctr QBiC, Otfried Muller Str 37, D-72076 Tubingen, Germany
[2] Univ Tubingen, Fac Med, M3 Res Ctr, Otfried Muller Str 37, D-72076 Tubingen, Germany
[3] Univ Tubingen, Inst Bioinformat & Med Informat IBMI, Dept Comp Sci, Otfried Muller Str 37, D-72076 Tubingen, Germany
[4] Univ Tubingen, Image Guided & Functionally Instruct Tumor Therapi, Cluster Excellence iFIT EXC 2180, Otfried Muller Str 37, D-72076 Tubingen, Germany
[5] Yale Sch Med, Computat Immunol, New Haven, CT 06511 USA
Keywords
ALIGNMENT;
DOI
10.1093/nargab/lqae020
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Data analysis tools are continuously changed and improved over time. To test how these changes influence the comparability between analyses, the outputs of different workflow options of the nf-core/rnaseq pipeline were compared. Five pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, and the pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins from the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used to compare the different tools and version settings of the pipeline. Differential gene classification overlapped by 85% between pipelines. Genes interpreted with a bias were mostly those present at lower concentrations. The number of isoforms and exons per gene were also determinants. Previous pipeline versions using featureCounts showed a higher sensitivity for detecting one-isoform genes such as ERCC. To ensure data comparability in long-term analysis series, it is advisable either to stay with the pipeline version with which the series was initialized or to run both versions during a transition period so that the target genes are addressed in the same way.
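The abstract reports an overlap in differential gene classification between pipelines without giving the calculation. As a minimal sketch of how such an agreement figure could be computed, and not necessarily the authors' method, the Python below labels each gene as up, down, or not significant and returns the fraction of shared genes that get the same label from two pipelines. The thresholds (`lfc_cut`, `alpha`), the function names, the gene identifiers, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Dict


def classify(log2fc: float, padj: float,
             lfc_cut: float = 1.0, alpha: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant).

    lfc_cut and alpha are assumed cut-offs, not values from the paper.
    """
    if padj < alpha and log2fc >= lfc_cut:
        return "up"
    if padj < alpha and log2fc <= -lfc_cut:
        return "down"
    return "ns"


def classification_overlap(a: Dict[str, str], b: Dict[str, str]) -> float:
    """Fraction of genes present in both results that share the same label."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("the two results have no genes in common")
    agree = sum(1 for gene in shared if a[gene] == b[gene])
    return agree / len(shared)


# Made-up (log2 fold change, adjusted p-value) pairs for two pipeline settings.
pipeline_a = {
    "gene_1": classify(2.1, 0.001),
    "gene_2": classify(-1.5, 0.010),
    "gene_3": classify(0.2, 0.600),
    "spike_1": classify(1.2, 0.040),
}
pipeline_b = {
    "gene_1": classify(1.9, 0.002),
    "gene_2": classify(-1.4, 0.020),
    "gene_3": classify(0.1, 0.700),
    "spike_1": classify(0.8, 0.200),  # falls below both assumed cut-offs
}

overlap = classification_overlap(pipeline_a, pipeline_b)
print(f"classification overlap: {overlap:.0%}")
```

In this toy input only `spike_1` changes label between the two settings, so three of four genes agree. Restricting the comparison to genes reported by both pipelines is a design choice; genes missing from one result could instead be counted as disagreements.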
Pages: 9