Unlocking the complete blood count as a risk stratification tool for breast cancer using machine learning: a large scale retrospective study

Cited by: 2
Authors
Araujo, Daniella Castro [1, 4]
Rocha, Bruno Aragao [2 ]
Gomes, Karina Braga [3 ]
da Silva, Daniel Noce [1 ]
Ribeiro, Vinicius Moura [1 ]
Kohara, Marco Aurelio [1 ]
Marana, Fernanda Tostes [1 ]
Bitar, Renata Andrade [1 ]
Veloso, Adriano Alonso [4 ]
Pintao, Maria Carolina [2 ]
da Silva, Flavia Helena [2 ]
Viana, Celso Ferraz [2 ]
de Souza, Pedro Henrique Araujo [1, 5]
da Silva, Ismael Dale Cotrim Guerreiro [2, 6]
Affiliations
[1] Huna, Sao Paulo, Brazil
[2] Grp Fleury, Sao Paulo, Brazil
[3] Univ Fed Minas Gerais UFMG, Fac Farm, Dept Anal Clin & Toxicol, Campus Belo Horizonte, Belo Horizonte, MG, Brazil
[4] Univ Fed Minas Gerais UFMG, Dept Ciencias Computacao, Inst Ciencias Exatas, Campus Belo Horizonte, Belo Horizonte, MG, Brazil
[5] Inst Nacl Canc INCA, Dept Oncol Clin Res, Rio De Janeiro, Brazil
[6] Univ Fed Sao Paulo, Dept Gynecol, Escola Paulista Med, Sao Paulo, Brazil
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
Sao Paulo Research Foundation (FAPESP), Brazil;
Keywords
Breast cancer; Screening; Machine learning; Risk stratification; Routine blood tests; CBC; NLR; RBC; CBC-ratios; Hemogram;
DOI
10.1038/s41598-024-61215-y
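Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract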
Optimizing early breast cancer (BC) detection requires effective risk assessment tools. This retrospective study from Brazil shows that machine learning can discern complex patterns in routine blood tests, offering a globally accessible and cost-effective approach to risk evaluation. We analyzed complete blood count (CBC) tests from 396,848 women aged 40-70 who underwent breast imaging or biopsy within six months after their CBC test. Of these, 2,861 (0.72%) were identified as cases: 1,882 with BC confirmed by anatomopathological tests and 979 with highly suspicious imaging (BI-RADS 5). The remaining 393,987 participants (99.28%), with BI-RADS 1 or 2 results, were classified as controls. The database was divided into modeling (training and validation) and testing sets based on diagnostic certainty. The testing set comprised cases confirmed by anatomopathology and controls who remained cancer-free for 4.5-6.5 years post-CBC. Our ridge regression model, incorporating neutrophil-lymphocyte ratio, red blood cells, and age, achieved an AUC of 0.64 (95% CI 0.64-0.65). These results are slightly better than those of a boosting machine learning model, LightGBM, with the added benefit that the ridge model is fully interpretable. Using the probabilistic output of this model, we divided the study population into four risk groups (high, moderate, average, and low), which had relative ratios of BC of 1.99, 1.32, 1.02, and 0.42, respectively. This stratification aims to streamline prioritization, potentially improving the early detection of breast cancer, particularly in resource-limited environments. As a risk stratification tool, the model offers the potential for personalized breast cancer screening by prioritizing women according to their individual risk, marking a shift away from a broad population strategy.
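The stratification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the model scores are already available, that a group's "relative ratio" is the case prevalence within that group divided by the overall prevalence, and that the group cutoffs are given; the function names, the cutoff values, and the toy scores are all hypothetical. Only the cohort counts are taken from the abstract.

```python
from collections import defaultdict

# Cohort counts quoted in the abstract.
confirmed, birads5, cohort = 1882, 979, 396_848
cases = confirmed + birads5
controls = cohort - cases
case_rate_pct = round(100 * cases / cohort, 2)


def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """NLR: neutrophil count divided by lymphocyte count (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def assign_risk_group(score, cutoffs):
    """Map a model score to one of four groups.

    cutoffs = (low_max, average_max, moderate_max); hypothetical values,
    the abstract does not state how the groups were delimited.
    """
    low_max, average_max, moderate_max = cutoffs
    if score <= low_max:
        return "low"
    if score <= average_max:
        return "average"
    if score <= moderate_max:
        return "moderate"
    return "high"


def relative_ratios(scores, labels, cutoffs):
    """Case prevalence in each group divided by the overall prevalence.

    This definition of "relative ratio" is an assumption of the sketch.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for score, label in zip(scores, labels):
        group = assign_risk_group(score, cutoffs)
        totals[group] += 1
        positives[group] += label
    overall = sum(labels) / len(labels)
    return {g: (positives[g] / totals[g]) / overall for g in totals}


# Toy data only; label 1 marks a case, 0 a control.
toy_scores = [0.1, 0.1, 0.3, 0.3, 0.6, 0.6, 0.9, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
toy_ratios = relative_ratios(toy_scores, toy_labels, cutoffs=(0.2, 0.5, 0.8))
```

Under this definition, a ratio above 1 means that cases are over-represented in the group relative to the whole population, which is how the reported values of 1.99 (high risk) and 0.42 (low risk) would be read.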
Pages: 10