Estimation of Locally Relevant Subspace in High-dimensional Data

Cited by: 6
Authors
Thudumu, Srikanth [1]
Branch, Philip [1]
Jin, Jiong [1]
Singh, Jugdutt [2]
Affiliations
[1] Swinburne Univ Technol, Melbourne, Vic, Australia
[2] Sarawak State Govt, Kuching, Malaysia
Keywords
High-dimensionality problem; Subspace methods; Outlier Detection; Locally Relevant subspace; The curse of dimensionality problem
DOI
10.1145/3373017.3373032
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
High-dimensional data is becoming increasingly available with the advent of big data and IoT. Additional dimensions make data analysis cumbersome because they increase the sparsity of data points, a problem known as the "curse of dimensionality". Global dimensionality reduction techniques are commonly used to address this problem; however, they are ineffective at revealing outliers hidden in the high-dimensional space. This is because an outlier's anomalous behaviour is visible only in the subspace to which it belongs; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and its associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results against a benchmark data set. Our comparative analysis shows that the technique derives a locally relevant subspace consisting of the relevant dimensions present in the benchmark data set.
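The sparsity effect mentioned in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's method: it only measures how the relative contrast between the farthest and nearest neighbour of a query point shrinks as the number of dimensions grows, assuming uniformly random points in the unit hypercube. The function name `relative_contrast` and its parameters are hypothetical and chosen purely for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (max - min) / min over the Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^dim.
    (Illustrative helper only; not taken from the paper.)"""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


low = relative_contrast(2)      # few dimensions: neighbours differ a lot
high = relative_contrast(1000)  # many dimensions: distances concentrate
print(f"relative contrast, dim=2:    {low:.2f}")
print(f"relative contrast, dim=1000: {high:.2f}")
```

In low dimensions the nearest point is far closer than the farthest one, whereas in high dimensions all points sit at nearly the same distance from the query. That loss of contrast is one reason full-space distance measures struggle to separate outliers, and it motivates searching in lower-dimensional subspaces instead.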
Pages: 6
Related papers
50 in total
  • [41] EDSC: Efficient Document Subspace Clustering Technique for High-Dimensional Data
    Radhika, K. R.
    Pushpa, C. N.
    Thriveni, J.
    Venugopal, K. R.
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTATIONAL TECHNIQUES IN INFORMATION AND COMMUNICATION TECHNOLOGIES (ICCTICT), 2016
  • [42] Local-Density Subspace Distributed Clustering for High-Dimensional Data
    Geng, Yangli-ao
    Li, Qingyong
    Liang, Mingfei
    Chi, Chong-Yung
    Tan, Juan
    Huang, Heng
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (08) : 1799 - 1814
  • [43] Spectral Clustering by Subspace Randomization and Graph Fusion for High-Dimensional Data
    Cai, Xiaosha
    Huang, Dong
    Wang, Chang-Dong
    Kwoh, Chee-Keong
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT I, 2020, 12084 : 330 - 342
  • [44] Adaptive multi-view subspace clustering for high-dimensional data
    Yan, Fei
    Wang, Xiao-dong
    Zeng, Zhi-qiang
    Hong, Chao-qun
    [J]. PATTERN RECOGNITION LETTERS, 2020, 130 : 299 - 305
  • [45] Subspace Clustering in High-Dimensional Data Streams: A Systematic Literature Review
    Ghani, Nur Laila Ab
    Aziz, Izzatdin Abdul
    AbdulKadir, Said Jadid
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 75 (02) : 4649 - 4668
  • [46] High-dimensional data analysis with subspace comparison using matrix visualization
    Wang, Junpeng
    Liu, Xiaotong
    Shen, Han-Wei
    [J]. INFORMATION VISUALIZATION, 2019, 18 (01) : 94 - 109
  • [47] A novel algorithm for fast and scalable subspace clustering of high-dimensional data
    Kaur, A.
    Datta, A.
    [J]. Journal of Big Data, 2015, 2 (01)
  • [48] Synchronization-based scalable subspace clustering of high-dimensional data
    Shao, Junming
    Wang, Xinzuo
    Yang, Qinli
    Plant, Claudia
    Boehm, Christian
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 52 (01) : 83 - 111
  • [49] Exploring high-dimensional data through locally enhanced projections
    Lai, Chufan
    Zhao, Ying
    Yuan, Xiaoru
    [J]. JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2018, 48 : 144 - 156
  • [50] Publishing locally private high-dimensional synthetic data efficiently
    Zhang, Hua
    Li, Kaixuan
    Huang, Teng
    Zhang, Xin
    Li, Wenmin
    Jin, Zhengping
    Gao, Fei
    Gao, Minghui
    [J]. INFORMATION SCIENCES, 2023, 633 : 343 - 356