VAE-Surv: A novel approach for genetic-based clustering and prognosis prediction in myelodysplastic syndromes

Authors
Rollo, Cesare [1]
Pancotti, Corrado [1]
Sartori, Flavio [1]
Caranzano, Isabella [1]
D'Amico, Saverio [2,3]
Carota, Luciana [4]
Casadei, Francesco [5]
Birolo, Giovanni [1]
Lanino, Luca [2]
Sauta, Elisabetta [2]
Asti, Gianluca [2]
Buizza, Alessandro [2]
Delleani, Mattia [2,3]
Zazzetti, Elena [2]
Bicchieri, Marilena [2]
Maggioni, Giulia [2]
Fenaux, Pierre [6]
Platzbecker, Uwe [7]
Diez-Campelo, Maria [8]
Haferlach, Torsten [9]
Castellani, Gastone [4,10]
Della Porta, Matteo Giovanni [2,11]
Fariselli, Piero [1]
Sanavia, Tiziana [1]
Affiliations
[1] Univ Torino, Computat Biomed Unit, Dept Med Sci, Via Santena 19, I-10126 Turin, Italy
[2] IRCCS Humanitas Res Hosp, Via Manzoni 56, I-20089 Rozzano Milan, Italy
[3] Train Srl, via Alessandro Manzoni 56, I-20089 Rozzano Milan, Italy
[4] Univ Bologna, Dept Med & Surg Sci DIMEC, I-40126 Bologna, Italy
[5] IRCCS Ist Sci Neurol Bologna, I-40138 Bologna, Italy
[6] Hop St Louis, Hematol Bone Marrow Transplantat, Paris, France
[7] Univ Hosp Leipzig, Med Clin & Policlin 1, Hematol & Cellular Therapy, Leipzig, Germany
[8] Hosp Univ Salamanca, Hematol Dept, Salamanca, Spain
[9] MLL Munich Leukemia Lab, Max Lebsche Pl 31, Munich, Germany
[10] IRCCS Azienda Osped Univ Bologna S Orsola, I-40138 Bologna, Italy
[11] Humanitas Univ, Dept Biomed Sci, Via Montalcini 4, I-20072 Pieve Emanuele Milan, Italy
Keywords
Survival analysis; Deep learning; Variational autoencoder; Myelodysplastic syndrome; Genetic-based clustering; Classification; Models
DOI
10.1016/j.cmpb.2025.108605
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background and Objectives: Several computational pipelines for biomedical data have been proposed to stratify patients and to predict their prognosis through survival analysis. However, these two analyses are usually performed independently, without integrating the information derived from each. Clustering of survival data is an underexplored problem, and current approaches are of limited use in biomedical applications, where data are typically heterogeneous and multimodal, and they scale poorly to high-dimensional inputs.

Methods: We introduce VAE-Surv, a multimodal computational framework for patient stratification and prognosis prediction. VAE-Surv integrates a Variational Autoencoder (VAE), which reduces the high-dimensional space characterizing the molecular data, with a deep survival model, which combines the embedded information with the clinical features. The VAE embedding step prioritizes local coherence within the feature space to detect potential nonlinear relationships among the molecular markers. The latent representation is then used to perform K-means clustering. To test the clinical robustness of the algorithm, VAE-Surv was applied to the GenoMed4All cohort of Myelodysplastic Syndromes (MDS), and the identified subtypes were compared with the World Health Organization (WHO) classification. Survival prediction was compared with the state-of-the-art Cox model and its penalized versions. Finally, to assess the generalizability of the results, the method was validated on an external MDS cohort.

Results: Tested on 2,043 patients in the GenoMed4All cohort, VAE-Surv achieved a median C-index of 0.78, outperforming classical approaches. In addition, clustering in the latent space performed better than a traditional approach that applies clustering directly to the input data. Compared with the WHO 2016 MDS subtypes, the identified clusters showed that the proposed framework can capture existing clinical categorizations while also suggesting novel, data-driven patient groups. When tested on an external MDS cohort of 2,384 patients, VAE-Surv still achieved good prediction performance (median C-index = 0.74), preserving the interpretability of the main clinical and genetic features.

Conclusions: VAE-Surv enables automatic identification of patient clusters while also outperforming the traditional CoxPH model in survival prediction. Applied to the MDS use case, the genetic-based clusters exhibit a clear survival stratification, and the inclusion of clinical information yielded high performance in prognosis prediction.
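The abstract reports performance as a C-index (concordance index) but does not spell out how it is computed. As a rough illustration only, the following is a minimal sketch of Harrell's standard definition in plain Python. The function name `concordance_index`, the toy data, and the tie handling (pairs with equal times are skipped, tied risk scores count as half) are assumptions made for this sketch, not details taken from the paper.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index (hypothetical helper, not the authors' code).

    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter follow-up time than subject j. The pair is
    concordant when the model gives i the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (invented for illustration): follow-up time, event flag, predicted risk.
toy_times = [5, 8, 12, 3, 9]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.2, 0.8, 0.5]
toy_c = concordance_index(toy_times, toy_events, toy_risk)  # 6 of 7 comparable pairs are concordant
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives some context for the reported medians of 0.78 and 0.74.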
Pages: 8