Density-Distance Outlier Detection Algorithm Based on Natural Neighborhood

Cited by: 2
Authors
Zhang, Jiaxuan [1]
Yang, Youlong [1]
Affiliations
[1] Xidian University, School of Mathematics and Statistics, Xi'an 710126, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
outlier detection; natural neighbors; adaptive kernel density estimation; local density; relative distance; EFFICIENT;
DOI
10.3390/axioms12050425
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
Outlier detection is of great significance in data mining. Its task is to find the points that do not conform to the mechanism that generated most of the objects. Existing algorithms are mainly divided into density-based and distance-based algorithms, and both approaches have drawbacks: the former struggles to handle low-density modes, while the latter cannot detect local outliers. Moreover, outlier detection algorithms are very sensitive to parameter settings. This paper proposes a new two-parameter outlier detection (TPOD) algorithm. The proposed method does not require the number of neighbors to be defined manually, and the introduction of relative distance addresses the low-density problem and allows outliers to be detected more accurately. This is a combinatorial optimization problem. First, the number of natural neighbors is calculated iteratively, and the local density of the target object is then calculated by adaptive kernel density estimation. Second, the relative distance of the target points is computed through natural neighbors. Finally, these two parameters are combined to obtain the outlier factor. This eliminates the need for users to specify the number of outliers themselves, namely, the top-n effect. Two synthetic datasets and 17 real datasets were used to test the effectiveness of the method, and a comparison with five other algorithms is also provided. The AUC value and F1 score on multiple datasets are higher than those of the other algorithms, indicating that the algorithm finds outliers accurately and is effective.
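The pipeline named in the abstract (iterative natural-neighbor search, a local density estimate, a relative distance, and a combined outlier factor) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption: the stopping rule follows the commonly used natural-neighbor search (stop when the number of points without reverse neighbors stops changing), and the density and distance terms in `toy_outlier_scores` are hypothetical stand-ins, not the TPOD definitions from the paper.

```python
import numpy as np

def natural_neighbor_search(X):
    """Return (k, nb): the neighborhood size found iteratively and the
    reverse-neighbor count of every point. Assumed stopping rule, not
    taken from the paper."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, -1.0)          # force each point to rank first in its own row
    order = np.argsort(d, axis=1)      # order[i, r] is the r-th nearest neighbor of i
    nb = np.zeros(n, dtype=int)
    prev_orphans = -1
    for r in range(1, n):
        # every point "votes" for its r-th nearest neighbor
        np.add.at(nb, order[:, r], 1)
        orphans = int(np.sum(nb == 0))  # points nobody has voted for yet
        if orphans == 0 or orphans == prev_orphans:
            return r, nb
        prev_orphans = orphans
    return n - 1, nb

def toy_outlier_scores(X):
    """Hypothetical stand-in score: mean neighbor distance divided by a
    Gaussian kernel density with a per-point bandwidth."""
    k, _ = natural_neighbor_search(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)         # exclude self from the neighbor list
    knn = np.sort(d, axis=1)[:, :k]     # distances to the k nearest neighbors
    h = knn[:, -1] + 1e-12              # bandwidth: distance to the k-th neighbor
    density = np.mean(np.exp(-0.5 * (knn / h[:, None]) ** 2), axis=1) / h ** X.shape[1]
    rel_dist = knn.mean(axis=1)
    return rel_dist / density           # larger means more outlying

# A small grid cluster plus one distant point (index 20).
X = np.array([[i * 0.25, j * 0.25] for i in range(5) for j in range(4)] + [[10.0, 10.0]])
k, nb = natural_neighbor_search(X)
scores = toy_outlier_scores(X)
```

Because the neighborhood size comes out of the search itself, no neighbor count is passed in by the caller, which is the property the abstract emphasizes. The distant point should receive the largest score in this toy setup.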
Pages: 17
Related papers
50 in total
  • [1] ERDOF: outlier detection algorithm based on entropy weight distance and relative density outlier factor
    Zhang, Zhongping
    Liu, Weixiong
    Zhang, Yuting
    Deng, Yu
    Wei, Mianxin
    [J]. Tongxin Xuebao/Journal on Communications, 2021, 42 (09): : 133 - 143
  • [2] A Comparative Study of Cluster Based Outlier Detection, Distance Based Outlier Detection and Density Based Outlier Detection Techniques
    Mandhare, Harshada C.
    Idate, S. R.
    [J]. 2017 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICICCS), 2017, : 931 - 935
  • [3] Detection of local and clustered outliers based on the density-distance decision graph
    Li, Kangsheng
    Gao, Xin
    Jia, Xin
    Xue, Bing
    Fu, Shiyuan
    Liu, Zhiyu
    Huang, Xu
    Huang, Zijian
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 110
  • [4] Multi-stage Mixed Attribute Outlier Detection Algorithm Based on Neighborhood Density Difference
    Du, Haizhou
    Fang, Wei
    Liu, Qing
    Yang, Zhenchen
    Wang, Xiaofeng
    [J]. 5TH INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING AND COMMUNICATIONS (BIGCOM 2019), 2019, : 160 - 168
  • [5] An Efficient Distance and Density Based Outlier Detection Approach
    Zhong, Xunbiao
    Huang, Xiaoxia
    [J]. MECHANICAL ENGINEERING AND GREEN MANUFACTURING II, PTS 1 AND 2, 2012, 155-156 : 342 - 347
  • [6] A novel temporal protein complexes identification framework based on density-distance and heuristic algorithm
    Xie, Dan
    Yi, Yang
    Zhou, Jin
    Li, Xiaodong
    Wu, Huikun
    [J]. NEURAL COMPUTING & APPLICATIONS, 2019, 31 (09): : 4693 - 4701
  • [7] HCDC: A novel hierarchical clustering algorithm based on density-distance cores for data sets with varying density
    Yang, Qi-Fen
    Gao, Wan-Yi
    Han, Gang
    Li, Zi-Yang
    Tian, Meng
    Zhu, Shu-Hua
    Deng, Yu-hui
    [J]. INFORMATION SYSTEMS, 2023, 114
  • [8] An automatic density peaks clustering based on a density-distance clustering index
    Xu, Xiao
    Liao, Hong
    Yang, Xu
    [J]. AIMS MATHEMATICS, 2023, 8 (12): : 28926 - 28950
  • [9] Neural Density-Distance Fields
    Ueda, Itsuki
    Fukuhara, Yoshihiro
    Kataoka, Hirokatsu
    Aizawa, Hiroaki
    Shishido, Hidehiko
    Kitahara, Itaru
    [J]. COMPUTER VISION - ECCV 2022, PT XXXII, 2022, 13692 : 53 - 68
  • [10] RDOF: An outlier detection algorithm based on relative density
    Wahid, Abdul
    Rao, Annavarapu Chandra Sekhara
    [J]. EXPERT SYSTEMS, 2022, 39 (02)