Data-driven cluster analysis method: a novel outliers detection method in multivariate data

Cited by: 0
Authors
Duarte, A. R. [1]
Barbosa, J. J. [1]
Martins, H. S. R. [1]
Oliveira, F. L. P. [1]
Affiliations
[1] Univ Fed Ouro Preto, Stat Dept, Ouro Preto, Brazil
Keywords
Data-driven; Multivariate outliers; Cluster analysis; Bayesian information criterion; Accuracy; Mahalanobis distance; Identification
DOI
10.1080/03610918.2024.2376872
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Detecting multivariate outliers is crucial in statistical studies: analyses carried out without identifying possible outliers may produce incorrect results. This study proposes a new technique for detecting multivariate outliers based on cluster analysis. The method relies on information inherent in the data itself. We compare it with three widely used detection methods. The comparison considers detection techniques based on the Mahalanobis distance. Sensitivity, specificity, and accuracy are used to assess the quality of the methods, together with the CPU time required to carry out the procedures. The new technique showed a marked superiority over the others.
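As a rough illustration of the kind of Mahalanobis-distance baseline and evaluation measures the abstract mentions, the sketch below flags 2-D points whose squared Mahalanobis distance exceeds a chi-square cutoff, then computes sensitivity, specificity, and accuracy against known labels. It is not the authors' cluster-based method. The classical (non-robust) mean and covariance, the 0.975 level, the simulated data, and all function names are assumptions made for this example.

```python
import math
import random


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    using the classical sample covariance (an assumption of this sketch)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the closed-form inverse of a 2x2 covariance matrix.
        dists.append((syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det)
    return dists


def chi2_cutoff_2df(level):
    """Chi-square quantile for 2 degrees of freedom, which has the closed
    form -2 * ln(1 - level)."""
    return -2.0 * math.log(1.0 - level)


def flag_outliers(points, level=0.975):
    """Flag points whose squared distance exceeds the chi-square cutoff."""
    cutoff = chi2_cutoff_2df(level)
    return [d > cutoff for d in mahalanobis_sq_2d(points)]


def confusion_metrics(truth, flagged):
    """Sensitivity, specificity, and accuracy; True marks an outlier.
    Assumes both classes are present in `truth`."""
    tp = sum(1 for t, f in zip(truth, flagged) if t and f)
    tn = sum(1 for t, f in zip(truth, flagged) if not t and not f)
    fp = sum(1 for t, f in zip(truth, flagged) if not t and f)
    fn = sum(1 for t, f in zip(truth, flagged) if t and not f)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


def demo(seed=0):
    """Simulated data (hypothetical): 200 standard-normal inliers plus
    5 planted outliers centred near (6, -6)."""
    rng = random.Random(seed)
    inliers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    planted = [(rng.gauss(6, 0.5), rng.gauss(-6, 0.5)) for _ in range(5)]
    points = inliers + planted
    truth = [False] * len(inliers) + [True] * len(planted)
    return confusion_metrics(truth, flag_outliers(points))


if __name__ == "__main__":
    print(demo())
```

Because the mean and covariance here are estimated from all points, including the outliers, a large contaminated fraction can mask outliers; robust estimators are commonly substituted in practice.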
Pages: 21
Related papers
50 in total
  • [41] A data-driven method for falsified vehicle trajectory identification by anomaly detection
    Huang, Shihong Ed
    Feng, Yiheng
    Liu, Henry X.
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2021, 128
  • [42] Probabilistic Data-Driven Method for Limb Movement Detection during Sleep
    Cesari, Matteo
    Christensen, Julie A. E.
    Jennum, Poul
    Sorensen, Helge B. D.
    2018 40TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2018, : 163 - 166
  • [43] An adaptive data-driven fault detection method for monitoring dynamic process
    Chen, Zhiwen
    Peng, Tao
    Yang, Chunhua
    Li, Fanbiao
    He, Zhangming
    IECON 2018 - 44TH ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2018, : 5353 - 5358
  • [44] Data-Driven Fault Detection and Isolation Inspired by Subspace Identification Method
    Chen Zhaoxu
    Fang Huajing
    2014 33RD CHINESE CONTROL CONFERENCE (CCC), 2014, : 3224 - 3229
  • [45] A radically data-driven method for fault detection and diagnosis in wind turbines
    Yu, D.
    Chen, Z. M.
    Xiahou, K. S.
    Li, M. S.
    Ji, T. Y.
    Wu, Q. H.
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2018, 99 : 577 - 584
  • [46] A Hybrid Data-Driven Method for Wire Rope Surface Defect Detection
    Zhou, Ping
    Zhou, Gongbo
    Li, Yingming
    He, Zhenzhi
    Liu, Yiwen
    IEEE SENSORS JOURNAL, 2020, 20 (15) : 8297 - 8306
  • [47] A Novel Hybrid Aeroengine Modeling Method for Combining Data-Driven Modules
    Cai, Wen
    Zhao, Yong-Ping
    Zhu, Ye
    Yin, Jun
    Xu, Zhan-Yan
    Liu, Wei-Min
    JOURNAL OF AEROSPACE ENGINEERING, 2024, 37 (05)
  • [48] A Novel Data-Driven Controller Tuning Method for Improving Convergence Performance
    Jiang, Yi
    Zhu, Yu
    Yang, Kaiming
    Hu, Chuxiong
    Mu, Haihua
    2015 AMERICAN CONTROL CONFERENCE (ACC), 2015, : 3230 - 3235
  • [49] A Novel Data-Driven Modeling and Control Design Method for Autonomous Vehicles
    Fenyes, Daniel
    Nemeth, Balazs
    Gaspar, Peter
    ENERGIES, 2021, 14 (02)
  • [50] Highcor: A novel data-driven regressor identification method for BOLD fMRI
    Curtis, A. T.
    Menon, R. S.
    NEUROIMAGE, 2014, 98 : 184 - 194