An adaptive index structure for high-dimensional similarity search

Authors
Wu, P [1]
Manjunath, BS [1]
Chandrasekaran, S [1]
Affiliation
[1] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
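Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract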
A practical method is presented for creating a high-dimensional index structure that adapts to the data distribution and scales well with the database size. Typical media descriptors are high-dimensional and are not uniformly distributed in the feature space. The performance of many existing methods degrades if the data is not uniformly distributed. The proposed method offers an efficient solution to this problem. First, the data's marginal distribution along each dimension is characterized using a Gaussian mixture model. The parameters of this model are estimated using the well-known Expectation-Maximization (EM) method. These model parameters can also be estimated sequentially for on-line updating. Using the marginal distribution information, each of the data dimensions can be partitioned such that each bin contains approximately an equal number of objects. Experimental results on a real image texture data set are presented. Comparisons with existing techniques, such as the VA-File, demonstrate a significant overall improvement.
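The partitioning idea in the abstract can be illustrated for a single dimension. The following is a minimal sketch, not the authors' implementation: it fits a one-dimensional Gaussian mixture with batch EM, then places bin boundaries at equal-probability quantiles of the fitted mixture so that each bin holds roughly the same number of objects. The function names (`fit_gmm_1d`, `mixture_cdf`, `equal_mass_boundaries`, `bin_index`), the quantile-based initialization, the bisection search, and all parameter defaults are assumptions made for illustration. The sequential (on-line) update mentioned in the abstract is not shown.

```python
import bisect
import math
import random


def fit_gmm_1d(xs, k=2, iters=100):
    """Batch EM for a 1-D Gaussian mixture; returns [(weight, mean, var), ...]."""
    n = len(xs)
    ordered = sorted(xs)
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n or 1e-6
    # Assumed initialization: means at evenly spaced sample quantiles.
    comps = [(1.0 / k, ordered[(2 * j + 1) * n // (2 * k)], var_all)
             for j in range(k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in comps]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        new_comps = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            m = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            v = sum(r[j] * (x - m) ** 2 for r, x in zip(resp, xs)) / nj
            new_comps.append((nj / n, m, max(v, 1e-6)))
        comps = new_comps
    return comps


def mixture_cdf(x, comps):
    """Cumulative distribution function of the fitted mixture at x."""
    return sum(w * 0.5 * (1.0 + math.erf((x - m) / math.sqrt(2 * v)))
               for w, m, v in comps)


def equal_mass_boundaries(comps, bins, tol=1e-9):
    """Interior boundaries b_1 < ... < b_{bins-1} with CDF(b_i) ~= i / bins."""
    lo = min(m - 10 * math.sqrt(v) for _, m, v in comps)
    hi = max(m + 10 * math.sqrt(v) for _, m, v in comps)
    bounds = []
    for i in range(1, bins):
        target = i / bins
        a, b = lo, hi
        while b - a > tol:  # bisection on the monotone CDF
            mid = (a + b) / 2
            if mixture_cdf(mid, comps) < target:
                a = mid
            else:
                b = mid
        bounds.append((a + b) / 2)
    return bounds


def bin_index(x, bounds):
    """Index of the bin that x falls into, given the interior boundaries."""
    return bisect.bisect_right(bounds, x)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic, deliberately non-uniform data along one dimension.
    data = ([random.gauss(0.0, 1.0) for _ in range(600)]
            + [random.gauss(8.0, 0.5) for _ in range(200)])
    boundaries = equal_mass_boundaries(fit_gmm_1d(data, k=2), bins=4)
    counts = [0] * 4
    for x in data:
        counts[bin_index(x, boundaries)] += 1
    print(boundaries, counts)
```

In a multi-dimensional setting, the same procedure would be applied independently to each dimension, which matches the abstract's use of marginal distributions; how the resulting per-dimension bins are combined into an index is not specified in the abstract and is left out here.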
Pages: 71-77
Page count: 7