Predicting Nurse Turnover for Highly Imbalanced Data Using the Synthetic Minority Over-Sampling Technique and Machine Learning Algorithms

Cited by: 3
Authors
Xu, Yuan [1]
Park, Yongshin [2]
Park, Ju Dong [3]
Sun, Bora [4]
Affiliations
[1] Dalian Maritime Univ, Collaborat Innovat Ctr Transport Studies, Sch Maritime Econ & Management, 1 Linghai Rd, Dalian 116026, Peoples R China
[2] St Edwards Univ, Bill Munday Sch Business, Dept Mkt Operat & Analyt, 3001 South Congress, Austin, TX 78704 USA
[3] Gyeongsang Natl Univ, Dept Maritime Police & Prod Syst, Tongyeong Si 53064, Gyeongsangnam D, South Korea
[4] Univ Texas Austin, Sch Nursing, 1710 Red River St, Austin, TX 78712 USA
Keywords
nurse turnover; machine learning; SMOTE; NSSRN; random forest; XGBoost; ASSOCIATION; BURNOUT
DOI
10.3390/healthcare11243173
Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Abstract
Predicting nurse turnover is a growing challenge within the healthcare sector, profoundly impacting healthcare quality and the nursing profession. This study employs the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance in the 2018 National Sample Survey of Registered Nurses (NSSRN) dataset and predicts nurse turnover using machine learning algorithms. Four machine learning algorithms, namely logistic regression, random forest, decision tree, and extreme gradient boosting (XGBoost), were applied to the SMOTE-enhanced dataset. The data were split into 80% training and 20% validation sets. Eighteen carefully selected variables from the database served as predictive features, and the machine learning models identified age, working hours, electronic health record/electronic medical record, individual income, and job type as important features concerning nurse turnover. The study includes a performance comparison based on accuracy, precision, recall (sensitivity), F1-score, and AUC. In summary, the results demonstrate that the SMOTE-enhanced random forest exhibits the most robust predictive power in both the classical approach (with all 18 predictive variables) and the optimized approach (utilizing eight key predictive variables). Extreme gradient boosting, decision tree, and logistic regression follow in performance. Notably, age emerges as the most influential factor in nurse turnover, with working hours, electronic health record/electronic medical record usability, individual income, and region also playing significant roles. This research offers valuable insights for healthcare researchers and stakeholders, aiding in selecting suitable machine learning algorithms for nurse turnover prediction.
Pages: 22
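
The abstract names SMOTE as the way the minority class is balanced before the classifiers are trained. The following is a minimal sketch of the core SMOTE idea only (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation. The function name `smote`, the toy two-feature data, and the 95/5 class ratio are all hypothetical, and the sketch assumes purely numeric features.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; assumes numeric feature vectors.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy data (made up): 95 majority rows and 5 minority rows, two features each.
data_rng = random.Random(42)
majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(95)]
minority = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(5)]

# Oversample the minority class until both classes have the same size.
balanced_minority = minority + smote(minority, len(majority) - len(minority))
print(len(majority), len(balanced_minority))
```

Each synthetic row lies on a line segment between two real minority rows, so no value falls outside the range already observed in the minority class. Categorical survey variables would need a variant that handles non-numeric features, which this sketch does not attempt.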