How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis

Authors
Perelo, Louisa Wessels [1]
Gabernet, Gisela [1,5]
Straub, Daniel [1]
Nahnsen, Sven [1,2,3,4]
Affiliations
[1] Univ Tubingen, Quant Biol Ctr QBiC, Otfried Muller Str 37, D-72076 Tubingen, Germany
[2] Univ Tubingen, Fac Med, M3 Res Ctr, Otfried Muller Str 37, D-72076 Tubingen, Germany
[3] Univ Tubingen, Inst Bioinformat & Med Informat IBMI, Dept Comp Sci, Otfried Muller Str 37, D-72076 Tubingen, Germany
[4] Univ Tubingen, Image Guided & Functionally Instruct Tumor Therapi, Cluster Excellence iFIT EXC 2180, Otfried Muller Str 37, D-72076 Tubingen, Germany
[5] Yale Sch Med, Computat Immunol, New Haven, CT 06511 USA
Keywords
ALIGNMENT;
DOI
10.1093/nargab/lqae020
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Data analysis tools are continuously changed and improved over time. To test how these changes influence the comparability between analyses, the outputs of different workflow options of the nf-core/rnaseq pipeline were compared. Five pipeline settings (STAR+Salmon, STAR+RSEM, STAR+featureCounts, HISAT2+featureCounts, and the pseudoaligner Salmon) were run on three datasets (human, Arabidopsis, zebrafish) containing spike-ins of the External RNA Control Consortium (ERCC). Fold change ratios and differential expression of genes and spike-ins were used to compare the different tool and version settings of the pipeline. An overlap of 85% in differential gene classification between pipelines could be shown. Genes interpreted with a bias were mostly those present at lower concentrations. The number of isoforms and the number of exons per gene were also determinants. Previous pipeline versions using featureCounts showed a higher sensitivity in detecting one-isoform genes such as the ERCC spike-ins. To ensure data comparability in a long-term analysis series, it is advisable either to stay with the pipeline version the series was started with or to run both versions during a transition period, so that the target genes are handled the same way.
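As a rough illustration of what an overlap in differential gene classification between two pipeline settings could mean, the sketch below computes the fraction of shared genes that receive the same label from both. The abstract does not say how the 85% figure was calculated, so this agreement measure, the function name `classification_overlap`, the labels ("up", "down", "ns") and the toy gene calls are all assumptions made for illustration, not the authors' method or data.

```python
from typing import Dict


def classification_overlap(calls_a: Dict[str, str], calls_b: Dict[str, str]) -> float:
    """Return the fraction of shared genes given the same label by both pipelines.

    Assumed definition: agreement is counted only over genes reported by both.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two call sets have no genes in common")
    agree = sum(1 for gene in shared if calls_a[gene] == calls_b[gene])
    return agree / len(shared)


# Hypothetical toy calls, not results from the study.
star_salmon = {"geneA": "up", "geneB": "ns", "geneC": "down", "geneD": "ns"}
hisat2_featurecounts = {"geneA": "up", "geneB": "ns", "geneC": "ns", "geneD": "ns"}

overlap = classification_overlap(star_salmon, hisat2_featurecounts)
print(f"{overlap:.2f}")  # 3 of 4 shared genes agree -> 0.75
```

Running the same comparison on real differential expression tables from an old and a new pipeline version is one way to check, during a transition period, whether the genes of interest are classified consistently.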
Pages: 9