An ensemble learning approach for anomaly detection in credit card data with imbalanced and overlapped classes

Cited by: 8
Authors
Islam, Md Amirul [1]
Uddin, Md Ashraf [2]
Aryal, Sunil [2]
Stea, Giovanni [1]
Affiliations
[1] Univ Pisa, Dept Informat Engn, Pisa, Italy
[2] Deakin Univ, Sch Informat Technol, Geelong, Australia
Keywords
Anomaly detection; Credit card; Ensemble; Meta-learning; Base learner; Classification; DECISION TREE APPROACH; FRAUD DETECTION; MACHINE; CLASSIFICATION; SUPPORT;
DOI
10.1016/j.jisa.2023.103618
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Electronic payment methods have become increasingly popular for business transactions, both online and in person, across the globe. Anomalies such as online fraud and default payments, which can result in substantial financial losses, have become more common as the use of credit cards for online purchases has increased. To address this issue, researchers have explored various machine learning models and their ensemble techniques for detecting anomalies in credit card transaction data. However, detecting anomalies in this data can be challenging due to overlapping class samples and an imbalanced class distribution. As a result, the detection rate of anomalies from minority class samples is relatively low, and general learning algorithms can be biased towards the majority class samples. In this paper, we propose a model called Credit Card Anomaly Detection (CCAD) that leverages the base learners paradigm and meta-learning ensemble techniques to improve the detection rate of credit card anomalies. We use four outlier detection algorithms as base learners and the XGBoost algorithm as the meta-learner in the proposed stacked ensemble approach to detect anomalies in credit card transactions. We apply a stratified sampling technique and a k-fold cross-validation process to address the issues of data imbalance and overfitting. In addition, the discordance rate is calculated to enhance the accuracy of the ensemble learning performance. The proposed model is trained and tested using two datasets: CCF (Credit Card Fraud) and CCDP (Credit Card Default Payment). Experimental results demonstrate that our approach outperforms existing approaches, particularly in detecting anomalies from the minority class instances of these datasets.
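The abstract mentions stratified sampling combined with k-fold cross-validation as the remedy for class imbalance. The following is a minimal, standard-library sketch of that general idea only, not the authors' CCAD code: the function name `stratified_kfold`, the fold count, the seed, and the toy labels are all assumptions made for illustration. It deals the indices of each class round-robin into k folds so that every test fold keeps roughly the same minority-class share as the full data.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full data.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into k folds
    # so each fold receives its proportional share of every class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Toy, made-up labels: 95 normal (0) and 5 anomalous (1) transactions.
labels = [0] * 95 + [1] * 5
splits = list(stratified_kfold(labels, k=5))
for train_idx, test_idx in splits:
    n_anomalies = sum(labels[i] for i in test_idx)
    print(len(test_idx), n_anomalies)  # prints "20 1" for every fold
```

With a plain (unstratified) shuffle, some folds of this toy set could easily contain no anomalies at all, which would make minority-class metrics on those folds meaningless; stratifying avoids that.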
Pages: 21
Related papers
50 in total
  • [31] A voting ensemble machine learning based credit card fraud detection using highly imbalance data
    Chhabra, Raunak
    Goswami, Shailza
    Ranjan, Ranjeet Kumar
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (18) : 54729 - 54753
  • [32] Machine Learning for Prediction of Imbalanced Data: Credit Fraud Detection
    Thanh Cong Tran
    Tran Khanh Dang
    PROCEEDINGS OF THE 2021 15TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM 2021), 2021,
  • [33] Ensemble Method for Credit Card Fraud Detection
    Wang, Rui
    Liu, Guanjun
    2021 4TH INTERNATIONAL CONFERENCE ON INTELLIGENT AUTONOMOUS SYSTEMS (ICOIAS 2021), 2021, : 246 - 252
  • [34] Credit Card Fraud Detection: Addressing Imbalanced Datasets with a Multi-phase Approach
    El Hlouli F.Z.
    Riffi J.
    Mahraz M.A.
    Yahyaouy A.
    El Fazazy K.
    Tairi H.
    SN Computer Science, 5 (1)
  • [35] Fraudulent Transaction Detection in Credit Card by Applying Ensemble Machine Learning techniques
    Prusti, Debachudamani
    Rath, Santanu Kumar
    2019 10TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2019,
  • [36] An Incremental Learning Ensemble Method for Imbalanced Credit Scoring
    Tian, Jin
    Liu, Xinye
    Li, Minqiang
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 754 - 759
  • [37] A Hybrid Machine Learning Approach for Credit Card Fraud Detection
    Gupta, Sonam
    Varshney, Tushtee
    Verma, Abhinav
    Goel, Lipika
    Yadav, Arun Kumar
    Singh, Arjun
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY PROJECT MANAGEMENT, 2022, 13 (03)
  • [38] Review of Machine Learning Approach on Credit Card Fraud Detection
    Rejwan Bin Sulaiman
    Vitaly Schetinin
    Paul Sant
    Human-Centric Intelligent Systems, 2022, 2 (1-2) : 55 - 68
  • [39] A Survey on GAN Techniques for Data Augmentation to Address the Imbalanced Data Issues in Credit Card Fraud Detection
    Strelcenia, Emilija
    Prakoonwit, Simant
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2023, 5 (01) : 304 - 329
  • [40] Threshold optimization and random undersampling for imbalanced credit card data
    Leevy, Joffrey L.
    Johnson, Justin M.
    Hancock, John
    Khoshgoftaar, Taghi M.
    JOURNAL OF BIG DATA, 2023, 10 (01)