Unlocking the complete blood count as a risk stratification tool for breast cancer using machine learning: a large scale retrospective study

Cited by: 2
Authors
Araujo, Daniella Castro [1 ,4 ]
Rocha, Bruno Aragao [2 ]
Gomes, Karina Braga [3 ]
da Silva, Daniel Noce [1 ]
Ribeiro, Vinicius Moura [1 ]
Kohara, Marco Aurelio [1 ]
Marana, Fernanda Tostes [1 ]
Bitar, Renata Andrade [1 ]
Veloso, Adriano Alonso [4 ]
Pintao, Maria Carolina [2 ]
da Silva, Flavia Helena [2 ]
Viana, Celso Ferraz [2 ]
de Souza, Pedro Henrique Araujo [1 ,5 ]
da Silva, Ismael Dale Cotrim Guerreiro [2 ,6 ]
Affiliations
[1] Huna, Sao Paulo, Brazil
[2] Grp Fleury, Sao Paulo, Brazil
[3] Univ Fed Minas Gerais UFMG, Fac Farm, Dept Anal Clin & Toxicol, Campus Belo Horizonte, Belo Horizonte, MG, Brazil
[4] Univ Fed Minas Gerais UFMG, Dept Ciencias Computacao, Inst Ciencias Exatas, Campus Belo Horizonte, Belo Horizonte, MG, Brazil
[5] Inst Nacl Canc INCA, Dept Oncol Clin Res, Rio De Janeiro, Brazil
[6] Univ Fed Sao Paulo, Dept Gynecol, Escola Paulista Med, Sao Paulo, Brazil
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Funding
Sao Paulo Research Foundation, Brazil;
Keywords
Breast cancer; Screening; Machine learning; Risk stratification; Routine blood tests; CBC; NLR; RBC; CBC-ratios; Hemogram;
DOI
10.1038/s41598-024-61215-y
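Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract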
Optimizing early breast cancer (BC) detection requires effective risk assessment tools. This retrospective study from Brazil shows that machine learning can discern complex patterns in routine blood tests, offering a globally accessible and cost-effective approach to risk evaluation. We analyzed complete blood count (CBC) tests from 396,848 women aged 40-70 who underwent breast imaging or biopsies within six months after their CBC test. Of these, 2,861 (0.72%) were identified as cases: 1,882 with BC confirmed by anatomopathological tests and 979 with highly suspicious imaging (BI-RADS 5). The remaining 393,987 participants (99.28%), with BI-RADS 1 or 2 results, were classified as controls. The database was divided into modeling (training and validation) and testing sets based on diagnostic certainty. The testing set comprised cases confirmed by anatomopathology and controls who remained cancer-free for 4.5-6.5 years after the CBC. Our ridge regression model, incorporating neutrophil-lymphocyte ratio (NLR), red blood cells (RBC), and age, achieved an AUC of 0.64 (95% CI 0.64-0.65). These results are slightly better than those of a boosting machine learning model, LightGBM, and the ridge model has the added benefit of being fully interpretable. Using the probabilistic output of this model, we divided the study population into four risk groups (high, moderate, average, and low), which had relative ratios of BC of 1.99, 1.32, 1.02, and 0.42, respectively. The aim of this stratification was to streamline prioritization, potentially improving the early detection of breast cancer, particularly in resource-limited environments. As a risk stratification tool, the model offers the potential for personalized breast cancer screening by prioritizing women according to their individual risk, a shift away from a broad population strategy.
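The cohort arithmetic and the risk-group idea in the abstract can be illustrated with a short sketch. This is not the authors' code. It assumes that a "relative ratio" is the BC prevalence within a risk group divided by the prevalence in the whole population, and that groups are formed by applying three score cutoffs to the model's probabilistic output; the abstract states neither. The function names `nlr` and `relative_ratios` and the `cutoffs` parameter are hypothetical.

```python
from typing import Dict, Sequence, Tuple

# Cohort counts as reported in the abstract.
cases = 1882 + 979            # anatomopathology-confirmed BC + BI-RADS 5
controls = 393_987            # BI-RADS 1 or 2
total = cases + controls
case_pct = round(100 * cases / total, 2)


def nlr(neutrophils: float, lymphocytes: float) -> float:
    """Neutrophil-lymphocyte ratio from two CBC counts in the same units."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def relative_ratios(
    scores: Sequence[float],
    labels: Sequence[int],
    cutoffs: Tuple[float, float, float],
) -> Dict[str, float]:
    """Per-group prevalence divided by overall prevalence.

    `cutoffs` are hypothetical ascending score thresholds separating
    low / average / moderate / high risk (an assumption of this sketch).
    Groups with no members are omitted from the result.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    overall = sum(labels) / len(labels)
    if overall == 0:
        raise ValueError("no cases in the population")
    names = ["low", "average", "moderate", "high"]
    group_cases = dict.fromkeys(names, 0)
    group_sizes = dict.fromkeys(names, 0)
    for score, label in zip(scores, labels):
        # Number of cutoffs reached gives the group index (0..3).
        name = names[sum(score >= c for c in cutoffs)]
        group_sizes[name] += 1
        group_cases[name] += label
    return {
        n: (group_cases[n] / group_sizes[n]) / overall
        for n in names
        if group_sizes[n] > 0
    }
```

Under this assumed definition, a ratio above 1 marks a group with more cases than the population average and a ratio below 1 marks a group with fewer, which is how the reported values 1.99, 1.32, 1.02, and 0.42 would be read.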
Pages: 10
Related papers
50 in total
  • [21] Multimodal integration using a machine learning approach facilitates risk stratification in HR+/HER2- breast cancer
    Zhang, Hang
    Yang, Fan
    Xu, Ying
    Zhao, Shen
    Jiang, Yi-Zhou
    Shao, Zhi-Ming
    Xiao, Yi
    CELL REPORTS MEDICINE, 2025, 6 (02)
  • [22] A machine learning approach to support triaging of primary versus secondary headache patients using complete blood count
    Yang, Fei
    Meng, Tong
    Torben-Nielsen, Ben
    Magnus, Carsten
    Liu, Chuang
    Dejean, Emilie
    PLOS ONE, 2023, 18 (03)
  • [23] Predicting early gastric cancer risk using machine learning: A population-based retrospective study
    Ke, Xing
    Cai, Xinyu
    Bian, Bingxian
    Shen, Yuanheng
    Zhou, Yunlan
    Liu, Wei
    Wang, Xu
    Shen, Lisong
    Yang, Junyao
    DIGITAL HEALTH, 2024, 10
  • [24] Association of complete blood count parameters with the risk of incident pulmonary heart disease in pneumoconiosis: a retrospective cohort study
    Liu, Lifang
    Peng, Shanshan
    Wei, Yuhao
    Yu, Wenao
    Liao, Jiaqiang
    Du, Wen
    Shi, Ying
    He, Qiurong
    Wu, Dongsheng
    Chen, Li
    Han, Su
    Zhang, Ling
    Shen, Jiang
    Jiang, Xia
    Li, Jiayuan
    Peng, Lijun
    Zhang, Ben
    Yao, Yuqin
    Zhang, Qin
    BMJ OPEN, 2024, 14 (07) : 1 - 7
  • [25] Impact of phthalate exposure and blood lipids on breast cancer risk: machine learning prediction
    Liu, Yanbin
    Li, Kunze
    Zhang, Yu
    Cai, Yifan
    Liu, Xuanyu
    Jia, Yiwei
    Yao, Peizhuo
    Wei, Xinyu
    Wu, Huizi
    Liu, Xuan
    Feng, Cong
    Li, Chaofan
    Wang, Weiwei
    Zhang, Shuqun
    Du, Chong
    ENVIRONMENTAL SCIENCES EUROPE, 2025, 37 (01)
  • [26] Prognostic prediction of breast cancer patients using machine learning models: a retrospective analysis
    Song, Xuchun
    Chu, Jiebin
    Guo, Zijie
    Wei, Qun
    Wang, Qingchuan
    Hu, Wenxian
    Wang, Linbo
    Zhao, Wenhe
    Zheng, Heming
    Lu, Xudong
    Zhou, Jichun
    GLAND SURGERY, 2024, 13 (09) : 1575 - 1587
  • [27] USING UNSUPERVISED MACHINE LEARNING FOR ASSESSMENT OF LEFT VENTRICULAR DIASTOLIC FUNCTION AND RISK STRATIFICATION IN A LARGE POPULATION
    Chao, Chieh Ju
    Kato, Nahoko
    Lopez-Jimenez, Francisco
    Lin, Grace
    Kane, Garvan C.
    Pellikka, Patricia A.
    JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY, 2022, 79 (09) : 3481 - 3481
  • [28] Machine learning for risk stratification of thyroid cancer patients: a 15-year cohort study
    Shiva Borzooei
    Giovanni Briganti
    Mitra Golparian
    Jerome R. Lechien
    Aidin Tarokhian
    European Archives of Oto-Rhino-Laryngology, 2024, 281 : 2095 - 2104
  • [29] Machine learning for risk stratification of thyroid cancer patients: a 15-year cohort study
    Borzooei, Shiva
    Briganti, Giovanni
    Golparian, Mitra
    Lechien, Jerome R.
    Tarokhian, Aidin
    EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY, 2024, 281 (04) : 2095 - 2104
  • [30] Complete blood count and C-reactive protein to predict positive blood culture among neonates using machine learning algorithms
    Matsushita, Felipe Yu
    Krebs, Vera Lucia Jornada
    de Carvalho, Werther Brunow
    CLINICS, 2023, 78