General Cost Models for Evaluating Dimensionality Reduction in High-Dimensional Spaces

Cited by: 8
Authors
Lian, Xiang [1]
Chen, Lei [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Keywords
High-dimensionality reduction; similarity search
DOI
10.1109/TKDE.2008.170
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Similarity search usually encounters a serious problem in high-dimensional spaces, known as the "curse of dimensionality." To improve retrieval efficiency, most previous approaches reduce the dimensionality of the entire data set to a fixed lower value before building indexes (referred to as global dimensionality reduction (GDR)). More recent works focus on locally reducing the dimensionality of data to different values (called local dimensionality reduction (LDR)). In addition, random projection has been proposed as an approximate dimensionality reduction (ADR) technique to answer approximate rather than exact similarity searches. However, so far little work has formally evaluated the effectiveness and efficiency of GDR, LDR, and ADR for the range query. Motivated by this, in this paper, we propose general cost models for evaluating the query performance over the data sets reduced by GDR, LDR, and ADR, in light of which we introduce a novel (A)LDR method, Partitioning based on RANdomized Search (PRANS). It can achieve high retrieval efficiency with the guarantee of optimality given by the formal models. Finally, a B+-tree index is constructed over the reduced partitions for fast similarity search. Extensive experiments validate the correctness of our cost models on both real and synthetic data sets and demonstrate the efficiency and effectiveness of the proposed PRANS method.
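As a rough illustration of the ADR idea mentioned in the abstract, the sketch below answers a range query approximately after a Gaussian random projection. It is not the paper's PRANS method or its cost models; the function names, the data sizes, and the radius are all assumptions chosen for the example, and the scan is linear rather than index-based.

```python
import math
import random

def random_projection_matrix(d, k, rng):
    """Build a k x d matrix with N(0, 1/k) entries (hypothetical helper).

    With this scaling, squared Euclidean distances are preserved in expectation.
    """
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(matrix, point):
    """Map a d-dimensional point to k dimensions."""
    return [sum(w * x for w, x in zip(row, point)) for row in matrix]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def approximate_range_query(reduced_data, reduced_query, radius):
    """Return indices whose reduced-space distance to the query is <= radius.

    Random projection can both lengthen and shorten distances, so the
    result may contain false hits and may miss true ones.
    """
    return [i for i, p in enumerate(reduced_data)
            if euclidean(p, reduced_query) <= radius]

# Assumed toy setup: 200 uniform random points, reduced from 64 to 16 dimensions.
rng = random.Random(0)
d, k, n = 64, 16, 200
data = [[rng.random() for _ in range(d)] for _ in range(n)]
matrix = random_projection_matrix(d, k, rng)
reduced = [project(matrix, p) for p in data]

query = data[0]
hits = approximate_range_query(reduced, project(matrix, query), radius=2.5)
```

Because the projection only preserves distances approximately, candidates returned this way would normally be verified against the original vectors when exact answers are required.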
Pages: 1447-1460
Page count: 14