Methods of Handling Unbalanced Datasets in Credit Card Fraud Detection

Cited by: 5
Authors
Minastireanu, Elena-Adriana [1]
Mesnita, Gabriela [2]
Affiliations
[1] Alexandru Ioan Cuza Univ, Doctoral Sch Econ & Business Adm, Iasi 700057, Romania
[2] Alexandru Ioan Cuza Univ, Fac Econ & Business Adm, Business Informat Syst Dept, Iasi 700057, Romania
Keywords
bank fraud; machine-learning algorithms; resampling; cost-sensitive training; unbalanced dataset; CLASSIFICATION; SMOTE
DOI
10.18662/brain/11.1/19
Chinese Library Classification number
Q189 [Neuroscience]
Subject classification code
071006
Abstract
Fraudulent transactions of every type are a major concern in the financial industry because of the total amount of money that is lost every year. Manually analyzing fraudulent transactions is infeasible given the huge amount of data and the complexity of bank fraud in the era of digitization. In this context, fraud detection can be performed by machine-learning algorithms, thanks to their ability to detect small anomalies in very large datasets. The problem that arises is that the datasets are highly unbalanced, meaning that non-fraudulent cases heavily outnumber fraudulent ones. In this paper, we present three ways of handling unbalanced datasets: resampling methods (undersampling and oversampling), cost-sensitive training, and tree algorithms (decision tree, random forest and Naive Bayes), and we emphasize why the Receiver Operating Characteristic (ROC) curve should not be used on this type of dataset when measuring the performance of an algorithm. The experimental test was applied to 890,977 banking transactions in order to observe the performance metrics of all three methods mentioned above.
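The abstract does not describe an implementation, so the following is only a minimal sketch of the ideas it names: random undersampling, random oversampling, class weights for cost-sensitive training, and a comparison of false positive rate with precision to illustrate why ROC-based metrics can look optimistic on unbalanced data. All function names, the toy class counts (10 fraud cases among 1,000 transactions) and the confusion-matrix figures are assumptions made for illustration, not data or code from the paper. The weighting formula is a common "balanced" heuristic, not necessarily the one the authors used.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into one list per class label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the rarest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        pairs.extend((row, label) for row in rng.sample(group, target))
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [l for _, l in pairs]


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate rows until every class is as large as the biggest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    pairs = []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        pairs.extend((row, label) for row in group + extra)
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [l for _, l in pairs]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {c: n_samples / (len(counts) * k) for c, k in counts.items()}


def fpr_and_precision(tp, fp, tn, fn):
    """Return (false positive rate, precision) from confusion-matrix counts."""
    return fp / (fp + tn), tp / (tp + fp)


# Toy data (assumed): label 1 = fraud, label 0 = legitimate.
labels = [1] * 10 + [0] * 990
rows = list(range(len(labels)))

_, under_y = random_undersample(rows, labels)
_, over_y = random_oversample(rows, labels)
weights = balanced_class_weights(labels)

# Hypothetical classifier result: 8 of 10 frauds caught, 50 false alarms.
fpr, precision = fpr_and_precision(tp=8, fp=50, tn=940, fn=2)
```

In this toy case the false positive rate is about 5%, which places the classifier near the favorable corner of ROC space, yet only about 14% of flagged transactions are actually fraudulent. Precision-based metrics expose that gap because they do not divide by the very large number of legitimate transactions.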
Pages: 131-143
Page count: 13