On eigenfunction approach to data mining: outlier detection in high-dimensional data sets

Cited by: 0
Authors
Nagar, AK [1]
Muyeba, MK [1]
Affiliation
[1] Liverpool Hope Univ Coll, Liverpool L16 9JD, Merseyside, England
Keywords
eigenfunction; pseudo-SVD; spatial; orthogonal; data mining; optimisation
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present two methods for mining outliers in high-dimensional data sets: one based on eigenvalue analysis, and the other a modified version of singular value decomposition (SVD) called pseudo-SVD. The eigenvalue analysis approach examines the spatial relationship among the column vectors of the object-attribute matrix to gain insight into the degree of inconsistency in a cluster of data. The pseudo-SVD method, in which the singular values are allowed to have a sign, looks at the direction of the vectors in the object-attribute matrix and detects outliers based on their degree of orthogonality. The pseudo-SVD algorithm is formulated as an optimisation problem for clustering the data on the basis of their angular inclination. A framework for this approach is formulated and further research directions are discussed.
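As a rough illustration of the angular idea in the abstract, the sketch below scores each column vector by how far its direction departs from the dominant direction of the whole set. It is not the paper's eigenvalue method or its pseudo-SVD, whose details the abstract does not give. The function name `angular_outlier_scores`, the choice to treat columns as the vectors being compared, and the `spread` summary are all assumptions made for this example.

```python
import numpy as np


def angular_outlier_scores(X):
    """Score columns of X by angular deviation from the dominant direction.

    Assumption (not from the paper): each column of X is one vector to compare.
    Returns (scores, spread):
      scores[j] lies in [0, 1]; 0 = aligned with the dominant direction,
                1 = orthogonal to it.
      spread    is 1 - (largest eigenvalue / sum of eigenvalues) of the
                cosine Gram matrix; about 0 when all columns are collinear.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    if np.any(norms == 0):
        raise ValueError("zero columns have no direction")

    U = X / norms                 # unit-length columns
    G = U.T @ U                   # pairwise cosines between columns

    # G is symmetric, so eigh applies; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(G)

    # Map the top eigenvector of G back to the original space and normalise it.
    d = U @ eigvecs[:, -1]
    d /= np.linalg.norm(d)

    scores = 1.0 - np.abs(U.T @ d)
    spread = 1.0 - eigvals[-1] / eigvals.sum()
    return scores, spread


# Hypothetical data: three columns near the x-axis, one along the z-axis.
X = np.array([
    [1.0,  1.0, 2.0, 0.0],
    [0.1, -0.1, 0.0, 0.0],
    [0.0,  0.0, 0.0, 1.0],
])
scores, spread = angular_outlier_scores(X)
print(scores, spread)
```

The absolute value of the cosine is used so that a vector pointing in the opposite direction is treated as aligned. Whether that is appropriate depends on the data; the paper's signed singular values suggest it may handle direction differently.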
Pages: 251-256
Number of pages: 6