Applying machine learning techniques to predict and explain subscriber churn of an online drug information platform

Authors
Georgios Theodoridis
Athanasios Tsadiras
Affiliation
[1] Aristotle University of Thessaloniki
Keywords
Customer churn; Online subscriber churn; Method comparison; Ensemble methods; Neural networks; Advanced preprocessing; Boruta algorithm; Isolation forest; Feature importance
Abstract
Most markets today are saturated, and competition between businesses is correspondingly intense. Retaining existing customers is therefore pivotal, which makes predicting customer loss crucial for identifying potential churners and attempting to retain them. This study provides an in-depth comparison of machine learning techniques and advanced preprocessing methods, as well as a general guide to handling churn prediction problems. Churn prediction is fundamentally a binary classification problem. To address it, methods from several machine learning categories (linear, nonlinear, ensemble, neural networks) are constructed, optimized and trained on the subscription data of a new real-world dataset from a popular online drug information platform, which provides information on drugs and drug substances as well as professional tools for pharmacotherapy decision making. In contrast to previous works that address traditional customer churn in the telecom, banking or insurance industries, this study addresses online subscriber churn, where users may churn at any moment. The study also focuses on properly preprocessing the data with advanced machine learning methods, and on evaluating the models under different conditions to measure their robustness. The results are presented, compared, analyzed and explained. Extensive feature importance analysis is performed not only to explain the models themselves but also to indicate the main factors that contribute to churning. The findings align with the notion that, provided the dataset is preprocessed using machine learning techniques in addition to statistical methods, all methods perform adequately and are generally viable options, but ensemble methods, namely Random Forests, are more flexible and more resistant to outliers. Feature importance analysis indicates that usage, not demographic data, is the prime indicator of churn.
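The abstract frames churn prediction as binary classification followed by feature importance analysis. The sketch below illustrates that general idea only and is not the authors' pipeline: it uses synthetic data, a plain logistic regression in place of the Random Forests and other models named above, and permutation importance as the importance measure. The feature names `usage` and `age`, the data-generating rule, and all helper functions are assumptions made for illustration.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic, hypothetical subscribers: by construction churn depends on usage only.
FEATURES = ["usage", "age"]
X, y = [], []
for _ in range(600):
    usage = rng.gauss(0.0, 1.0)      # standardized activity level (assumed feature)
    age = rng.gauss(0.0, 1.0)        # standardized demographic value (assumed feature)
    p_churn = sigmoid(-2.0 * usage)  # lower usage -> higher churn probability
    X.append([usage, age])
    y.append(1 if rng.random() < p_churn else 0)

X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression with batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, row):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= 0.5 else 0

def accuracy(w, b, X, y):
    return sum(predict(w, b, row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(w, b, X, y, column, repeats=20):
    """Mean drop in accuracy when one feature column is shuffled."""
    base = accuracy(w, b, X, y)
    drops = []
    for _ in range(repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
        drops.append(base - accuracy(w, b, X_perm, y))
    return sum(drops) / repeats

w, b = fit_logistic(X_train, y_train)
acc = accuracy(w, b, X_test, y_test)
importance = {
    name: permutation_importance(w, b, X_test, y_test, j)
    for j, name in enumerate(FEATURES)
}
print("held-out accuracy:", round(acc, 3))
print("permutation importance:", importance)
```

Because the synthetic labels are generated from `usage` alone, shuffling that column should hurt accuracy far more than shuffling `age`. This mirrors the kind of usage-versus-demographics comparison the abstract reports, without reproducing the study's data or results.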
Pages: 19501-19514
Number of pages: 13