Advancing Model Performance With ADASYN and Recurrent Feature Elimination and Cross-Validation in Machine Learning-Assisted Credit Card Fraud Detection: A Comparative Analysis

Citations: 0
Authors
Ileberi, Emmanuel [1]
Sun, Yanxia [1]
Affiliations
[1] Univ Johannesburg, Dept Elect & Elect Engn Sci, ZA-2094 Johannesburg, South Africa
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
National Research Foundation, Singapore;
Keywords
Credit cards; Fraud; Accuracy; Support vector machines; Europe; Boosting; Adaptation models; Machine learning; Predictive models; Classification algorithms; Credit card fraud detection; ADASYN; recursive feature elimination; machine learning; predictive modeling; classification; imbalanced classes; SMOTE;
DOI
10.1109/ACCESS.2024.3457922
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Online card transactions have become more frequent with the growth of e-commerce and financial technology apps. This growth also creates more opportunities for credit card fraud, which affects banks, retailers, and card issuers, so systems are needed that protect the security and integrity of credit card transactions. In this study, we use the Adaptive Synthetic Minority Oversampling Technique (ADASYN) to balance an imbalanced dataset, and we combine it with Recursive Feature Elimination with Cross Validation (RFECV) to enhance the performance of credit card fraud detection systems. We compare five models (Decision Tree, Random Forest, Extreme Gradient Boosting, Light Gradient Boosting Machine, and Linear Regression) on the original imbalanced dataset and on the dataset resampled with ADASYN, before finally applying RFECV. The aim is to find the best model and method for detecting credit card fraud. During training, k-fold cross-validation is applied to both sets of models to prevent overfitting and improve classification. Our results show that the dataset modified with ADASYN and RFECV reduced overall classification errors relative to the baseline dataset. Specifically, the best performing models were Extreme Gradient Boosting and Random Forest, with a Matthews Correlation Coefficient (MCC) of 0.8794 and 0.8622, respectively, on the baseline dataset, and an MCC of 0.9994 and 0.9991, respectively, on the ADASYN and RFECV dataset. Furthermore, the Light Gradient Boosting Machine model recorded the largest improvement in MCC, from 0.3394 on the baseline dataset to 0.9980 on the ADASYN-modified dataset, an increase of 194%.
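The abstract reports its results as Matthews Correlation Coefficient values and a relative increase. The following minimal sketch shows how such figures can be computed from confusion-matrix counts using the standard MCC definition. It is not the authors' code: the function names `mcc_from_counts` and `relative_increase_pct` are hypothetical, and the confusion-matrix counts are invented for illustration only. The two MCC values passed to `relative_increase_pct` are the ones quoted in the abstract.

```python
from math import sqrt


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Standard definition:
        (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


def relative_increase_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Hypothetical counts for an imbalanced test set (NOT taken from the paper).
example_mcc = mcc_from_counts(tp=80, tn=9900, fp=10, fn=10)

# MCC values quoted in the abstract for Light Gradient Boosting Machine.
lgbm_gain = relative_increase_pct(before=0.3394, after=0.9980)
print(f"example MCC: {example_mcc:.4f}")
print(f"LightGBM MCC increase: {lgbm_gain:.0f}%")
```

MCC uses all four cells of the confusion matrix, which makes it a common choice for heavily imbalanced problems such as fraud detection, where accuracy alone can look high even when most fraud cases are missed.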
Pages: 133315 - 133327
Page count: 13