An adaptive index structure for high-dimensional similarity search

Cited by: 0

Authors:

Wu, P [1]

Manjunath, BS [1]

Chandrasekaran, S [1]

Affiliation:

[1] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA

Source:

ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2001, PROCEEDINGS | 2001 / Vol. 2195

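Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract: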

A practical method for creating a high-dimensional index structure that adapts to the data distribution and scales well with the database size is presented. Typical media descriptors are high-dimensional and are not uniformly distributed in the feature space. The performance of many existing methods degrades if the data is not uniformly distributed. The proposed method offers an efficient solution to this problem. First, the data's marginal distribution along each dimension is characterized using a Gaussian mixture model. The parameters of this model are estimated using the well-known Expectation-Maximization (EM) method. These model parameters can also be estimated sequentially for on-line updating. Using the marginal distribution information, each of the data dimensions can be partitioned such that each bin contains approximately an equal number of objects. Experimental results on a real image texture data set are presented. Comparisons with existing techniques, such as the VA-File, demonstrate a significant overall improvement.
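
The partitioning idea in the abstract can be illustrated for a single dimension with a minimal sketch. This is not the authors' implementation: the function names (`fit_gmm_1d`, `mixture_cdf`, `equal_mass_boundaries`), the number of mixture components, the initialization, the bisection search, and the synthetic data are all assumptions made for illustration. It fits a one-dimensional Gaussian mixture with batch EM, then places bin boundaries where the fitted mixture CDF crosses i/num_bins, so each bin should hold roughly the same number of objects.

```python
import bisect
import math
import random

def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with plain batch EM (illustrative)."""
    n = len(xs)
    s = sorted(xs)
    # Assumed initialization: means at evenly spaced quantiles, shared variance.
    means = [s[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    var0 = max(sum((x - mu) ** 2 for x in xs) / n, 1e-6)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, 1e-6)
    return weights, means, variances

def mixture_cdf(x, weights, means, variances):
    """CDF of the fitted mixture at x."""
    return sum(w * 0.5 * (1.0 + math.erf((x - m) / math.sqrt(2.0 * v)))
               for w, m, v in zip(weights, means, variances))

def equal_mass_boundaries(model, num_bins, tol=1e-9):
    """Interior boundaries where the mixture CDF equals i/num_bins (bisection)."""
    _, means, variances = model
    spread = 10.0 * math.sqrt(max(variances))
    lo, hi = min(means) - spread, max(means) + spread
    bounds = []
    for i in range(1, num_bins):
        target = i / num_bins
        a, b = lo, hi
        while b - a > tol:
            mid = 0.5 * (a + b)
            if mixture_cdf(mid, *model) < target:
                a = mid
            else:
                b = mid
        bounds.append(0.5 * (a + b))
    return bounds

# Synthetic, non-uniform data for one dimension (an assumption for the demo).
random.seed(0)
data = [random.gauss(0.0, 1.0) if random.random() < 0.7 else random.gauss(6.0, 0.5)
        for _ in range(2000)]

model = fit_gmm_1d(data, k=2)
bounds = equal_mass_boundaries(model, num_bins=8)

# Count how many objects fall into each bin.
counts = [0] * 8
for x in data:
    counts[bisect.bisect_right(bounds, x)] += 1
print(counts)
```

For a multi-dimensional descriptor, the same procedure would be run independently on each dimension, since the abstract describes modeling the marginal distribution along each dimension. The sequential (on-line) parameter update mentioned in the abstract is not covered by this sketch.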


Pages: 71-77

Page count: 7
