Improved Dirichlet mixture model clustering algorithm for medical data anomaly detection

Cited by: 0
Authors
Wu, Lili [1, 2]
Ali, Majid Khan Majahar [3 ]
Shan, Fam Pei [3 ]
Tian, Ying [4 ]
Tao, Li [3 ]
Affiliations
[1] Xinzhou Teachers Univ, Dept Comp Sci, Xinzhou 034000, Peoples R China
[2] Univ Sains Malaysia USM, Sch Math Sci, George Town 11800, Malaysia
[3] USM, Sch Math Sci, George Town 11800, Malaysia
[4] Taiyuan Univ Technol, Dept Math, Taiyuan 030024, Peoples R China
Keywords
over-diagnosis; anomaly expenses; anomaly detection; DPMM; CBLOF
DOI
10.1504/IJBIC.2024.10064803
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To identify over-diagnosis and anomalous expenses in the healthcare service process, a local outlier mining clustering algorithm (ILOF-DPMM) is proposed that combines the clustering-based local outlier factor (CBLOF) algorithm with the Dirichlet process mixture model (DPMM). Patients' hospitalisation records are extracted from the medical record homepage, the factors influencing hospitalisation costs are classified by disease type, and the random forest method is used to reduce the feature dimension for each disease type. With this feature extraction and dimensionality reduction, the algorithm clusters medical insurance expense data effectively. When the LOF value of each data point is calculated, a weighted method based on the similarity of discrete and continuous features detects abnormal data points more accurately. The algorithm can also evaluate new data in real time, which improves detection accuracy and efficiency.
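The abstract does not give the ILOF-DPMM formulas, so the following is only a minimal sketch of the general idea: a weighted dissimilarity over continuous and discrete features feeding a cluster-based outlier score. It is not the authors' implementation. The function names, the equal weights `w_cont` and `w_disc`, the `alpha` coverage threshold, and the toy records are all assumptions made for illustration. The cluster labels are taken as given (for example, from a DPMM fit that is not shown here), and the score is the plain distance to the nearest large-cluster centre, without cluster-size weighting.

```python
from collections import Counter, defaultdict


def mixed_distance(a, b, cont_idx, disc_idx, ranges, w_cont=0.5, w_disc=0.5):
    """Weighted dissimilarity for records with continuous and discrete fields.

    Assumes both cont_idx and disc_idx are non-empty.
    """
    # Continuous part: mean range-normalised absolute difference.
    cont = sum(abs(a[i] - b[i]) / ranges[i] for i in cont_idx) / len(cont_idx)
    # Discrete part: fraction of categorical fields that disagree.
    disc = sum(a[i] != b[i] for i in disc_idx) / len(disc_idx)
    return w_cont * cont + w_disc * disc


def cluster_centre(rows, cont_idx, disc_idx):
    """Centre of a cluster: mean for continuous fields, mode for discrete ones."""
    centre = list(rows[0])
    for i in cont_idx:
        centre[i] = sum(r[i] for r in rows) / len(rows)
    for i in disc_idx:
        centre[i] = Counter(r[i] for r in rows).most_common(1)[0][0]
    return centre


def cluster_outlier_scores(rows, labels, cont_idx, disc_idx, alpha=0.9):
    """Score each record by its distance to the nearest large-cluster centre."""
    # Range of each continuous field over the data set (1.0 if constant).
    ranges = {
        i: (max(r[i] for r in rows) - min(r[i] for r in rows)) or 1.0
        for i in cont_idx
    }
    members = defaultdict(list)
    for row, lab in zip(rows, labels):
        members[lab].append(row)
    centres = {
        lab: cluster_centre(m, cont_idx, disc_idx) for lab, m in members.items()
    }
    # "Large" clusters: the biggest clusters that together cover a fraction
    # alpha of the records; every other cluster is treated as small.
    large, covered = set(), 0
    for lab in sorted(members, key=lambda k: len(members[k]), reverse=True):
        if covered >= alpha * len(rows):
            break
        large.add(lab)
        covered += len(members[lab])
    scores = []
    for row, lab in zip(rows, labels):
        targets = [lab] if lab in large else large
        scores.append(min(
            mixed_distance(row, centres[t], cont_idx, disc_idx, ranges)
            for t in targets
        ))
    return scores


# Hypothetical records: (total cost, length of stay in days, department).
records = [
    (1000.0, 5.0, "cardio"),
    (1100.0, 6.0, "cardio"),
    (1050.0, 5.0, "cardio"),
    (980.0, 4.0, "cardio"),
    (9000.0, 30.0, "ortho"),
]
cluster_labels = [0, 0, 0, 0, 1]  # assumed output of a prior clustering step
scores = cluster_outlier_scores(
    records, cluster_labels, cont_idx=[0, 1], disc_idx=[2], alpha=0.75
)
```

In this toy data the last record sits alone in a small cluster and differs from the large cluster in cost, length of stay, and department, so it receives the highest score. A new record could be scored the same way against stored centres and ranges, which is one plausible reading of the real-time detection mentioned in the abstract.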
Pages: 11-21
Number of pages: 12