An adaptive index structure for high-dimensional similarity search

Cited by: 0
Authors
Wu, P [1]
Manjunath, BS [1]
Chandrasekaran, S [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A practical method is presented for creating a high-dimensional index structure that adapts to the data distribution and scales well with the database size. Typical media descriptors are high-dimensional and are not uniformly distributed in the feature space. The performance of many existing methods degrades if the data is not uniformly distributed. The proposed method offers an efficient solution to this problem. First, the data's marginal distribution along each dimension is characterized using a Gaussian mixture model. The parameters of this model are estimated using the well-known Expectation-Maximization (EM) method. These model parameters can also be estimated sequentially for on-line updating. Using the marginal distribution information, each of the data dimensions can be partitioned such that each bin contains approximately an equal number of objects. Experimental results on a real image texture data set are presented. Comparisons with existing techniques, such as the VA-File, demonstrate a significant overall improvement.
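The two steps named in the abstract (a Gaussian mixture fit per dimension, then partitioning so that bins hold roughly equal numbers of objects) can be illustrated for a single dimension. The following is a minimal sketch, not the authors' implementation: it assumes a fixed number of mixture components, plain batch EM, and bin boundaries found by bisection on the fitted mixture's cumulative distribution. All function names (`fit_gmm_1d`, `equal_mass_boundaries`, `demo`) and the synthetic data are hypothetical.

```python
import bisect
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=50):
    """Batch EM for a 1-D Gaussian mixture; returns [(weight, mean, var), ...]."""
    n = len(xs)
    srt = sorted(xs)
    # Assumption: initialise means at evenly spaced quantiles, shared variance.
    means = [srt[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    var0 = max(sum((x - mu) ** 2 for x in xs) / n, 1e-6)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave a collapsed component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
    return list(zip(weights, means, variances))


def mixture_cdf(x, params):
    """Cumulative distribution of the fitted mixture at x."""
    return sum(w * 0.5 * (1.0 + math.erf((x - m) / math.sqrt(2.0 * v)))
               for w, m, v in params)


def equal_mass_boundaries(params, n_bins):
    """Boundaries splitting the mixture into n_bins equal-probability bins."""
    lo = min(m - 10.0 * math.sqrt(v) for _, m, v in params)
    hi = max(m + 10.0 * math.sqrt(v) for _, m, v in params)
    bounds = []
    for i in range(1, n_bins):
        target = i / n_bins
        a, b = lo, hi
        for _ in range(60):  # bisection on the monotone CDF
            mid = 0.5 * (a + b)
            if mixture_cdf(mid, params) < target:
                a = mid
            else:
                b = mid
        bounds.append(0.5 * (a + b))
    return bounds


def demo():
    """Synthetic, non-uniform 1-D data: two clusters of unequal size."""
    rng = random.Random(0)
    xs = [rng.gauss(0.0, 1.0) for _ in range(700)]
    xs += [rng.gauss(8.0, 0.5) for _ in range(300)]
    params = fit_gmm_1d(xs, k=2)
    bounds = equal_mass_boundaries(params, n_bins=4)
    counts = [0] * 4
    for x in xs:
        counts[bisect.bisect_right(bounds, x)] += 1
    return bounds, counts


if __name__ == "__main__":
    print(demo())
```

In this sketch the boundaries follow the fitted density rather than the value range, so the dense cluster receives narrow bins and the sparse region between clusters receives a wide one. A uniform grid over the same range would leave some bins nearly empty.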
Pages: 71-77
Page count: 7