Explainable machine learning models for Medicare fraud detection

Cited by: 5
Authors
Hancock, John T. [1]
Bauder, Richard A. [1]
Wang, Huanjing [2]
Khoshgoftaar, Taghi M. [1]
Affiliations
[1] Florida Atlantic Univ, Coll Engn & Comp Sci, Boca Raton, FL 33004 USA
[2] Western Kentucky Univ, Ogden Coll Sci & Engn, Bowling Green, KY USA
Keywords
Big Data; Class imbalance; Explainable machine learning models; Ensemble supervised feature selection; Medicare fraud detection
DOI
10.1186/s40537-023-00821-5
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
To build explainable machine learning models for Big Data, we apply a novel ensemble supervised feature selection technique to publicly available insurance claims data from Medicare, the United States public health insurance program. We approach Medicare insurance fraud detection as a supervised machine learning task: anomaly detection through the classification of highly imbalanced Big Data. Our objectives for feature selection are to increase efficiency in model training and to develop more explainable machine learning models for fraud detection. Using two Big Data datasets derived from two different sources of insurance claims data, we demonstrate that our feature selection technique reduces the dimensionality of the datasets by approximately 87.5% without compromising performance. Moreover, the reduction in dimensionality yields machine learning models that are easier to explain and less prone to overfitting. Our primary contribution, the exposition of our novel feature selection technique, therefore leads to a further contribution to the application domain of automated Medicare insurance fraud detection. We use our feature selection technique to explain our fraud detection models in terms of the definitions of the selected features. The ensemble supervised feature selection technique we present is flexible in that any collection of machine learning algorithms that maintain a list of feature importance values may be used. Researchers may therefore easily employ variations of the technique we present.
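The abstract does not spell out how the importance lists from the individual learners are combined, so the following is only a minimal sketch of one plausible variation: each learner contributes a list of feature importance values, and a feature is kept when it appears in the top-k list of at least a chosen number of learners. The function names, the voting rule, the parameters `k` and `min_votes`, and the importance values are all assumptions made for illustration; none of them comes from the paper.

```python
from collections import Counter
from typing import Dict, List


def top_k_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the largest importance values."""
    ranked = sorted(importances, key=lambda name: importances[name], reverse=True)
    return ranked[:k]


def ensemble_select(rankings: List[Dict[str, float]], k: int, min_votes: int) -> List[str]:
    """Keep features that land in the top-k list of at least min_votes learners.

    This voting rule is an illustrative assumption, not the authors' rule.
    """
    votes = Counter()
    for importances in rankings:
        votes.update(top_k_features(importances, k))
    selected = [name for name, count in votes.items() if count >= min_votes]
    # Order by vote count (descending), then by name, for a deterministic result.
    return sorted(selected, key=lambda name: (-votes[name], name))


# Hypothetical importance values from three learners; not taken from the paper.
learner_a = {"f1": 0.50, "f2": 0.30, "f3": 0.15, "f4": 0.05}
learner_b = {"f1": 0.40, "f2": 0.10, "f3": 0.35, "f4": 0.15}
learner_c = {"f1": 0.45, "f2": 0.25, "f3": 0.20, "f4": 0.10}

selected = ensemble_select([learner_a, learner_b, learner_c], k=2, min_votes=2)

# Fraction of the original features that were dropped.
reduction = 1 - len(selected) / len(learner_a)
```

In this toy input, two of the four features survive, so `reduction` is 0.5. By the same arithmetic, the approximately 87.5% reduction reported in the abstract corresponds to keeping roughly one feature in eight. In practice the dictionaries would be filled from whatever importance values each trained learner exposes.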
Pages: 31