The importance of data transformation in RNA-Seq preprocessing for bladder cancer subtyping

Cited by: 0
Authors
Acedo-Terrades, Ariadna [1]
Perera-Bel, Julia [1]
Nonell, Lara [2]
Affiliations
[1] Hosp del Mar Res Inst HMRI, Barcelona, Spain
[2] Vall d'Hebron Inst Oncol, Bioinformat Unit, Barcelona, Spain
Keywords
Molecular subtypes; RNA sequencing; Preprocessing; Bladder cancer; Molecular taxonomy
DOI
10.1186/s13104-025-07138-x
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Objective: RNA-Seq provides accurate quantification of gene expression levels and is widely used for molecular subtype classification in cancer, where it is especially important for prognosis. However, the reliability and validity of these analyses can be significantly influenced by how the data are processed. In this study we evaluate how RNA-Seq preprocessing methods influence molecular subtype classification in bladder cancer. By benchmarking various aligners, quantifiers, and normalization and transformation methods, we stress the importance of preprocessing choices for accurate and consistent subtype classification.

Results: Our findings highlight that log-transformation plays a crucial role in centroid-based classifiers such as consensusMIBC and TCGAclas, while distribution-free algorithms like LundTax are robust to preprocessing variations. With consensusMIBC and TCGAclas, non-log-transformed data resulted in low classification rates and poor agreement with reference classifications. Additionally, LundTax consistently showed better separation among subtypes than consensusMIBC and TCGAclas, regardless of the preprocessing method. Nonetheless, the study is limited by the lack of a true reference for objectively assessing the accuracy of the assigned subtypes. Hence, future work will be necessary to determine the robustness and scalability of these results.
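The sensitivity of centroid-based classification to log-transformation can be illustrated with a minimal sketch. It assumes a toy nearest-centroid classifier that assigns a sample to the centroid with the highest Pearson correlation and leaves it unclassified below a minimum correlation. The function names, the two-subtype centroids, the five-gene sample and the 0.2 threshold are all hypothetical choices for illustration; this is not the implementation of consensusMIBC, TCGAclas or LundTax.

```python
import numpy as np


def log2_transform(counts, pseudocount=1.0):
    """Return log2(counts + pseudocount); the pseudocount avoids log2(0)."""
    return np.log2(np.asarray(counts, dtype=float) + pseudocount)


def classify_by_centroid(sample, centroids, min_cor=0.2):
    """Toy nearest-centroid rule (hypothetical, for illustration only).

    Returns (label, correlation). The label is None when the best Pearson
    correlation is below min_cor, i.e. the sample stays unclassified.
    """
    best_label, best_cor = None, -np.inf
    for label, centroid in centroids.items():
        cor = np.corrcoef(sample, centroid)[0, 1]
        if cor > best_cor:
            best_label, best_cor = label, cor
    return (best_label if best_cor >= min_cor else None), best_cor


# Hypothetical centroids on a log2 scale for five genes.
CENTROIDS = {
    "luminal": np.array([8.0, 7.0, 2.0, 1.0, 5.0]),
    "basal": np.array([2.0, 1.0, 8.0, 7.0, 6.0]),
}

# Hypothetical raw expression values; the last gene is very highly expressed
# and dominates the correlation when the data are left on the raw scale.
raw_sample = np.array([127.0, 63.0, 3.0, 1.0, 4095.0])
log_sample = log2_transform(raw_sample)

raw_label, raw_cor = classify_by_centroid(raw_sample, CENTROIDS)
log_label, log_cor = classify_by_centroid(log_sample, CENTROIDS)

# The ordering of genes within the sample is unchanged by the log transform,
# so a rule that only uses that ordering would see the same input either way.
same_order = np.array_equal(np.argsort(raw_sample), np.argsort(log_sample))

print("raw:", raw_label, round(raw_cor, 3))
print("log:", log_label, round(log_cor, 3))
print("gene order preserved:", same_order)
```

In this toy setup the log-transformed sample is assigned to the luminal centroid, while the same sample on the raw scale is not, because a single highly expressed gene dominates the correlation. The final check shows that a monotone transform such as the logarithm leaves the within-sample gene ordering intact, which is one plausible reason why order-based, distribution-free rules would be less sensitive to this preprocessing step.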
Pages: 8