Outlier detection algorithm based on fluctuation of centroid projection

Authors
Zhang Z. [1, 2, 3]
Zhang Y. [1]
Liu W. [1]
Deng Y. [1]
Affiliations
[1] College of Information Science and Engineering, Yanshan University, Qinhuangdao
[2] The Key Laboratory for Computer Virtual Technology, System Integration of Hebei Province, Yanshan University, Qinhuangdao
[3] The Key Laboratory of Software Engineering of Hebei Province, Qinhuangdao
Keywords
centroid projection fluctuation; data mining; k-nearest neighbors; neighbor tree; outlier detection
DOI
10.13196/j.cims.2022.12.014
Abstract
Outlier detection is an important area of data mining research. Traditional nearest-neighbor-based outlier detection methods rely heavily on the k-nearest neighbor relationship. However, as data distributions become more diverse and dimensionality increases, detection based on the k-nearest neighbor relationship is easily affected by the presence of different clusters, and its results are unsatisfactory. To address these problems, a new neighborhood set is generated by using a nearest neighbor tree in place of the k-nearest neighbor relationship, and the concept of centroid projection is proposed to describe how a data object is distributed relative to its neighbors. As the number of neighbors of a data object gradually increases, the centroid projections of outliers and of internal points (inliers) behave differently, so the centroid projection fluctuation is proposed to measure the degree of outlierness of each data object. On this basis, an outlier detection algorithm based on the fluctuation of centroid projection is proposed. Experiments on artificial and real data sets show that the proposed algorithm detects outliers effectively and comprehensively. © 2022 CIMS. All rights reserved.
Pages: 3869-3878
Number of pages: 9
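
For context, the traditional k-nearest-neighbor approach that the abstract contrasts against can be sketched as follows. This is not the proposed centroid projection algorithm, whose definitions the abstract does not give; it is the classical k-th-nearest-neighbor distance score, shown only as an illustration. The function name `knn_distance_scores` and the sample points are assumptions made for this sketch.

```python
import math


def knn_distance_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor.

    Classical k-NN baseline (an illustration, not the paper's method):
    a larger score means the point lies farther from its neighbors.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Hypothetical data: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_distance_scores(points, k=2)
most_outlying = max(range(len(points)), key=scores.__getitem__)
print(most_outlying, scores)
```

Because a score of this kind depends on a single fixed k and on raw distances, clusters of different densities can distort it, which is the weakness the abstract attributes to methods built on the k-nearest neighbor relationship.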