Hyperparameter Optimization and Combined Data Sampling Techniques in Machine Learning for Customer Churn Prediction: A Comparative Analysis

Cited by: 7
Authors
Imani, Mehdi [1]
Arabnia, Hamid Reza [2]
Affiliations
[1] Stockholm Univ, Dept Comp & Syst Sci, S-10691 Stockholm, Sweden
[2] Univ Georgia, Sch Comp, Athens, GA 30602 USA
Keywords
machine learning; churn prediction; imbalanced data; combined data sampling techniques; hyperparameter optimization; telecommunication industry; regression
DOI
10.3390/technologies11060167
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
This paper explores the application of various machine learning techniques for predicting customer churn in the telecommunications sector. We used a publicly accessible dataset and implemented several models, including Artificial Neural Networks, Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, and gradient boosting techniques (XGBoost, LightGBM, and CatBoost). To mitigate the challenges posed by imbalanced datasets, we adopted three data sampling strategies: SMOTE, SMOTE combined with Tomek Links, and SMOTE combined with Edited Nearest Neighbors. Hyperparameter tuning was also employed to improve model performance. We evaluated the models with standard metrics: Precision, Recall, F1-score, and the Area Under the Receiver Operating Characteristic Curve (ROC AUC). On F1-score, CatBoost outperformed the other models, reaching 93% after Optuna hyperparameter optimization. On ROC AUC, XGBoost and CatBoost both reached 91%: XGBoost after applying SMOTE combined with Tomek Links, and CatBoost after Optuna hyperparameter optimization.
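The abstract names SMOTE as the base of all three sampling strategies. As a rough illustration of the idea only, the sketch below generates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbors. It is not the authors' code: the function name `smote`, its parameters, and the toy data are hypothetical, and it uses only the Python standard library. Practical work would more likely rely on a maintained implementation such as the `SMOTE`, `SMOTETomek`, and `SMOTEENN` classes in the imbalanced-learn package.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the base point, excluding the point itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy data (assumed): 4 minority samples against a majority class of 10.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10

new_points = smote(minority, n_new=majority_count - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 10
```

The combined variants mentioned in the abstract add a cleaning step after this oversampling: Tomek Links removal drops pairs of opposite-class nearest neighbors, and Edited Nearest Neighbors drops samples whose class disagrees with most of their neighbors. Neither cleaning step is shown here.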
Pages: 26