Revisiting the Risk Factors for Endometriosis: A Machine Learning Approach

Cited by: 13
Authors
Blass, Ido [1]
Sahar, Tali [2]
Shraibman, Adi [3]
Ofer, Dan [4]
Rappoport, Nadav [5]
Linial, Michal [4]
Affiliations
[1] Hebrew Univ Jerusalem, Rachel & Selim Benin Sch Comp Sci & Engn, IL-91904 Jerusalem, Israel
[2] McGill Univ, Hlth Ctr, Alan Edwards Pain Management Unit, Montreal, PQ H3G 1A4, Canada
[3] Acad Coll Tel Aviv Yaffo, Dept Comp Sci, IL-69978 Tel Aviv, Israel
[4] Ben Gurion Univ Negev, Fac Engn Sci, Dept Software & Informat Syst Engn, IL-84105 Beer Sheva, Israel
[5] Hebrew Univ Jerusalem, Inst Life Sci, Dept Biol Chem, IL-91904 Jerusalem, Israel
Source
JOURNAL OF PERSONALIZED MEDICINE | 2022 / Vol. 12 / Issue 07
Keywords
machine learning; UK-Biobank; pelvic pain; women's health; CatBoost; features engineering; QUALITY-OF-LIFE; UK BIOBANK; DIAGNOSIS; WOMEN; DELAY; EPIDEMIOLOGY; ASSOCIATION; PREVALENCE; INSIGHTS; GENETICS
DOI
10.3390/jpm12071114
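Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Subject classification code
Abstract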
Endometriosis is a condition characterized by implants of endometrial tissue at extrauterine sites, mostly within the pelvic peritoneum. Endometriosis is under-diagnosed, and its prevalence is estimated at 5-10% of all women of reproductive age. The goal of this study was to develop a model for endometriosis based on the UK-biobank (UKB) and re-assess the contribution of known risk factors to endometriosis. We partitioned the data into those diagnosed with endometriosis (5924; ICD-10: N80) and a control group (142,723). We included over 1000 variables from the UKB covering personal information about female health, lifestyle, self-reported data, genetic variants, and medical history prior to endometriosis diagnosis. We applied machine learning algorithms to train an endometriosis prediction model. The optimal prediction was achieved with the gradient boosting algorithms of CatBoost for the data-combined model with an area under the ROC curve (ROC-AUC) of 0.81. The same results were obtained for women from a mixed ethnicity population of the UKB (7112; ICD-10: N80). We discovered that, prior to being diagnosed with endometriosis, affected women had significantly more ICD-10 diagnoses than the average unaffected woman. We used SHAP, an explainable AI tool, to estimate the marginal impact of a feature, given all other features. The informative features ranked by SHAP values included irritable bowel syndrome (IBS) and the length of the menstrual cycle. We conclude that the rich population-based retrospective data from the UKB are valuable for developing unified machine learning endometriosis models despite the limitations of missing data, noisy medical input, and participant age. The informative features of the model may improve clinical utility for endometriosis diagnosis.
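The two quantities the abstract relies on, ROC-AUC and SHAP-style feature attributions, can be illustrated with a minimal sketch in standard-library Python. This is not the authors' CatBoost/SHAP pipeline: `roc_auc`, `shapley_values`, `toy_model`, the two binary features, and every number below are hypothetical and chosen only for illustration. ROC-AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, and the attributions are exact Shapley values, the quantity that SHAP approximates for larger models.

```python
from itertools import combinations
from math import factorial


def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    Features outside a coalition are held at their baseline value.
    """
    n = len(x)

    def value(coalition):
        return model([x[i] if i in coalition else baseline[i]
                      for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy risk score (assumption, not the paper's model):
# two made-up binary features with a small interaction term.
def toy_model(features):
    ibs, long_cycle = features
    return 0.1 + 0.3 * ibs + 0.2 * long_cycle + 0.1 * ibs * long_cycle


# Made-up labels (1 = case, 0 = control) and predicted scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)

# Attribution for one hypothetical individual with both features present,
# relative to a baseline with both features absent.
phi = shapley_values(toy_model, [1, 1], [0, 0])
```

In this toy setting the attributions sum to the difference between the model output for the individual and for the baseline, which is the additivity property that makes such values usable for ranking features.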
Pages: 18
Related papers
50 in total
  • [1] Risk Factors for Gout in Taiwan Biobank: A Machine Learning Approach
    Liu, Yu-Ruey
    Nfor, Oswald Ndi
    Zhong, Ji-Han
    Lin, Chun-Yuan
    Liaw, Yung-Po
    JOURNAL OF INFLAMMATION RESEARCH, 2024, 17 : 9847 - 9856
  • [2] A MACHINE LEARNING APPROACH FOR THE IDENTIFICATION OF RISK FACTORS FOR CARDIOVASCULAR DISEASE
    Coelho, J. R.
    Gaspar, I. M.
    Silva, A. M.
    Freitas, A. T.
    CARDIOLOGY, 2013, 126 : 272 - 272
  • [3] Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach
    Tore, Ulan
    Abilgazym, Aibek
    Asunsolo-del-Barco, Angel
    Terzic, Milan
    Yemenkhan, Yerden
    Zollanvari, Amin
    Sarria-Santamera, Antonio
    BIOMEDICINES, 2023, 11 (11)
  • [4] A machine learning approach to investigate potential risk factors for gastroschisis in California
    Weber, Kari A.
    Yang, Wei
    Carmichael, Suzan L.
    Padula, Amy M.
    Shaw, Gary M.
    BIRTH DEFECTS RESEARCH, 2019, 111 (04) : 212 - 221
  • [5] A machine learning approach to determine the risk factors for fall in multiple sclerosis
    Ozgur, Su
    Toran, Meryem Kocaslan
    Toygar, Ismail
    Yalcin, Gizem Yagmur
    Eraksoy, Mefkure
    MULTIPLE SCLEROSIS JOURNAL, 2024, 30 (03) : 1035 - 1035
  • [6] A machine learning approach to determine the risk factors for fall in multiple sclerosis
    Ozgur, Su
    Toran, Meryem Kocaslan
    Toygar, Ismail
    Yalcin, Gizem Yagmur
    Eraksoy, Mefkure
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2024, 24 (01)
  • [7] Classification of Software Project Risk Factors Using Machine Learning Approach
    Chaudhary, Prerna
    Singh, Deepali
    Sharma, Ashish
    INTELLIGENT SYSTEMS TECHNOLOGIES AND APPLICATIONS, VOL 2, 2016, 385 : 297 - 309
  • [8] Machine Learning Approach for Pre-Eclampsia Risk Factors Association
    Martinez-Velasco, Antonieta
    Martinez-Villasenor, Lourdes
    Miralles-Pechuan, Luis
    GOODTECHS '18: PROCEEDINGS OF THE 4TH EAI INTERNATIONAL CONFERENCE ON SMART OBJECTS AND TECHNOLOGIES FOR SOCIAL GOOD (GOODTECHS), 2018 : 232 - 237
  • [9] Machine learning algorithms as new screening approach for patients with endometriosis
    Sofiane Bendifallah
    Anne Puchar
    Stéphane Suisse
    Léa Delbos
    Mathieu Poilblanc
    Philippe Descamps
    Francois Golfier
    Cyril Touboul
    Yohann Dabi
    Emile Daraï
    Scientific Reports, 12
  • [10] Machine learning algorithms as new screening approach for patients with endometriosis
    Bendifallah, Sofiane
    Puchar, Anne
    Suisse, Stephane
    Delbos, Lea
    Poilblanc, Mathieu
    Descamps, Philippe
    Golfier, Francois
    Touboul, Cyril
    Dabi, Yohann
    Darai, Emile
    SCIENTIFIC REPORTS, 2022, 12 (01)