Revisiting the Risk Factors for Endometriosis: A Machine Learning Approach

Cited by: 13
Authors
Blass, Ido [1]
Sahar, Tali [2]
Shraibman, Adi [3]
Ofer, Dan [4]
Rappoport, Nadav [5]
Linial, Michal [4]
Affiliations
[1] Hebrew Univ Jerusalem, Rachel & Selim Benin Sch Comp Sci & Engn, IL-91904 Jerusalem, Israel
[2] McGill Univ, Hlth Ctr, Alan Edwards Pain Management Unit, Montreal, PQ H3G 1A4, Canada
[3] Acad Coll Tel Aviv Yaffo, Dept Comp Sci, IL-69978 Tel Aviv, Israel
[4] Ben Gurion Univ Negev, Fac Engn Sci, Dept Software & Informat Syst Engn, IL-84105 Beer Sheva, Israel
[5] Hebrew Univ Jerusalem, Inst Life Sci, Dept Biol Chem, IL-91904 Jerusalem, Israel
Source
JOURNAL OF PERSONALIZED MEDICINE | 2022 / Volume 12 / Issue 07
Keywords
machine learning; UK-Biobank; pelvic pain; women's health; CatBoost; features engineering; QUALITY-OF-LIFE; UK BIOBANK; DIAGNOSIS; WOMEN; DELAY; EPIDEMIOLOGY; ASSOCIATION; PREVALENCE; INSIGHTS; GENETICS
DOI
10.3390/jpm12071114
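Chinese Library Classification (CLC) number
R19 [Health organizations and services (health services administration)]
Subject classification code
Abstract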
Endometriosis is a condition characterized by implants of endometrial tissue at extrauterine sites, mostly within the pelvic peritoneum. Endometriosis is under-diagnosed, and its prevalence is estimated at 5-10% of all women of reproductive age. The goal of this study was to develop a prediction model for endometriosis based on the UK-Biobank (UKB) and to re-assess the contribution of known risk factors to endometriosis. We partitioned the data into women diagnosed with endometriosis (5924; ICD-10: N80) and a control group (142,723). We included over 1000 variables from the UKB covering personal information about female health, lifestyle, self-reported data, genetic variants, and medical history prior to endometriosis diagnosis. We applied machine learning algorithms to train an endometriosis prediction model. The best prediction was achieved with CatBoost, a gradient boosting algorithm, for the data-combined model, with an area under the ROC curve (ROC-AUC) of 0.81. The same results were obtained for women from a mixed ethnicity population of the UKB (7112; ICD-10: N80). We discovered that, prior to being diagnosed with endometriosis, affected women had significantly more ICD-10 diagnoses than the average unaffected woman. We used SHAP, an explainable AI tool, to estimate the marginal impact of each feature, given all other features. The informative features ranked by SHAP values included irritable bowel syndrome (IBS) and the length of the menstrual cycle. We conclude that the rich population-based retrospective data from the UKB are valuable for developing unified machine learning endometriosis models despite the limitations of missing data, noisy medical input, and participant age. The informative features of the model may improve clinical utility for endometriosis diagnosis.
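Note on the ROC-AUC figure: one common reading of an ROC-AUC value is the probability that the model scores a randomly chosen case higher than a randomly chosen control. The sketch below illustrates that pairwise definition in plain Python. The function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration; they are not the study's data, code, or results.

```python
from itertools import product


def roc_auc(labels, scores):
    """ROC-AUC as the share of (case, control) pairs ranked correctly.

    labels: 1 for a case, 0 for a control.
    scores: model scores, higher meaning "more likely a case".
    Tied pairs count as half a correct ranking.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    correct = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            correct += 1.0
        elif case_score == control_score:
            correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical toy data (not from the study): 3 cases, 4 controls.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(auc)  # 9 of the 12 case/control pairs are ranked correctly -> 0.75
```

Under this reading, a value of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in the number of participants, so for a cohort of the size described above a rank-based implementation from an established library would be the practical choice.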
Pages: 18
Related papers
50 in total
  • [21] Investigating the biopsychosocial risk factors of problematic gaming in youth using a machine learning approach
    Song, Suyeon
    Yoo, Cio
    Chang, Rose
    Ahn, Woo-Young
    JOURNAL OF BEHAVIORAL ADDICTIONS, 2023, 12 : 292 - 292
  • [22] Ranking Risk Factors in Financial Losses From Railroad Incidents: A Machine Learning Approach
    Dhingra, Neeraj
    Bridgelall, Raj
    Lu, Pan
    Szmerekovsky, Joseph
    Bhardwaj, Bhavana
    TRANSPORTATION RESEARCH RECORD, 2023, 2677 (02) : 299 - 309
  • [23] Machine Learning: An Approach in Identifying Risk Factors for Coercion Compared to Binary Logistic Regression
    Hotzy, Florian
    Theodoridou, Anastasia
    Hoff, Paul
    Schneeberger, Andres R.
    Seifritz, Erich
    Olbrich, Sebastian
    Jaeger, Matthias
    FRONTIERS IN PSYCHIATRY, 2018, 9
  • [24] PREDICTING THE RISK FACTORS OF HYPERTENSION AMONG INDIAN OLDER POPULATION: A MACHINE LEARNING APPROACH
    Das, Ayushi
    INNOVATION IN AGING, 2023, 7 : 450 - 450
  • [25] Importance and limits of cardiovascular risk factors on the prediction of SCD using machine learning approach
    Chocron, R.
    Laurenceau, T.
    Youssfi, Y.
    Bougouin, W.
    Empana, J. P.
    Chopin, N.
    Jouven, X.
    EUROPEAN HEART JOURNAL, 2024, 45
  • [26] A machine learning approach to risk disclosure reporting
    Resende, Max
    Ferreira, Alexandre
    ECONOMICS BULLETIN, 2021, 41 (02) : 234 - 251
  • [27] Revisiting thread configuration of SpMV kernels on GPU: A machine learning based approach
    Gao, Jianhua
    Ji, Weixing
    Liu, Jie
    Wang, Yizhuo
    Shi, Feng
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 185
  • [28] Revisiting CVD Risk Prediction Using Machine Learning Approaches: A Case Study
    Dashti, Hesam
    Liu, Yanyan
    Glynn, Robert J.
    Ridker, Paul M.
    Mora, Samia
    Demler, Olga
    CIRCULATION, 2020, 141
  • [29] The application of risk models based on machine learning to predict endometriosis-associated ovarian cancer in patients with endometriosis
    Chao, Xiaopei
    Wang, Shu
    Lang, Jinghe
    Leng, Jinhua
    Fan, Qingbo
    ACTA OBSTETRICIA ET GYNECOLOGICA SCANDINAVICA, 2022, 101 (12) : 1440 - 1449
  • [30] Revisiting the Geochemical Classification of Zircon Source Rocks Using a Machine Learning Approach
    Itano, Keita
    Sawada, Hikaru
    MATHEMATICAL GEOSCIENCES, 2024, 56 (06) : 1139 - 1160