PROM: Efficient matching query processing on high-dimensional data

Cited by: 0
Authors
Ma, Chunyang [1]
Zhou, Yongluan [2]
Shou, Lidan [3]
Chen, Gang [3]
Affiliations
[1] IBM Res Corp, Beijing, Peoples R China
[2] Univ Southern Denmark, Dept Math & Comp Sci, Copenhagen, Denmark
[3] Zhejiang Univ, Dept Comp Sci, Hangzhou, Zhejiang, Peoples R China
Keywords
Index; High-dimensional; Matching; Closest-pair queries; Recommendation
DOI
10.1016/j.ins.2015.05.005
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many applications, such as online dating or job hunting websites, users often need to search for potential matches based on the requirements or preferences imposed by both sides. We refer to this type of query as a matching query. Despite their wide applicability, little attention has been devoted to improving their performance. Because matching queries often appear in various forms even within a single application, we propose in this paper a general framework that can efficiently process various forms of matching queries. We illustrate the applicability of this framework by detailing the processing algorithms for one particular matching query and its extensions to two other forms of matching queries. We conduct an extensive experimental study with both synthetic and real datasets. The results indicate that, for various matching queries, our techniques can substantially improve query performance, especially when the dimensionality is high. © 2015 Elsevier Inc. All rights reserved.
Pages: 1-19
Number of pages: 19
Related papers
50 in total
  • [41] A generic framework for efficient subspace clustering of high-dimensional data
    Kriegel, HP
    Kröger, P
    Renz, M
    Wurst, S
    [J]. Fifth IEEE International Conference on Data Mining, Proceedings, 2005 : 250 - 257
  • [42] An efficient nearest neighbor search in high-dimensional data spaces
    Lee, DH
    Kim, HJ
    [J]. INFORMATION PROCESSING LETTERS, 2002, 81 (05) : 239 - 246
  • [43] Efficient indexing of high-dimensional data through dimensionality reduction
    Goh, CH
    Lim, A
    Ooi, BC
    Tan, KL
    [J]. DATA & KNOWLEDGE ENGINEERING, 2000, 32 (02) : 115 - 130
  • [44] Linearization approach for efficient KNN search of high-dimensional data
    Al Aghbari, Z
    Makinouchi, A
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 229 - 238
  • [45] Integration of projected clusters and principal axis trees for high-dimensional data indexing and query
    Wang, B
    Gan, JQ
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING IDEAL 2004, PROCEEDINGS, 2004, 3177 : 191 - 196
  • [46] Scaling Up Subgraph Query Processing with Efficient Subgraph Matching
    Sun, Shixuan
    Luo, Qiong
    [J]. 2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019 : 220 - 231
  • [47] A Range Query Parallel Algorithm in High-dimensional Space
    Xu, Hongbo
    Yao, Nianmin
    [J]. INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY, PTS 1-4, 2013, 263-266 : 2308 - 2313
  • [48] Efficient nearest neighbor query based on extended B+-tree in high-dimensional space
    Cui, Jiangtao
    An, Zhiyong
    Guo, Yong
    Zhou, Shuisheng
    [J]. PATTERN RECOGNITION LETTERS, 2010, 31 (12) : 1740 - 1748
  • [49] Efficient Range Query Processing on Uncertain Data
    Knight, Andrew
    Yu, Qi
    Rege, Manjeet
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2011 : 263 - 268
  • [50] Binary Matching for High-dimensional Image Descriptors
    Wang, Hongjun
    Hu, Jiani
    Deng, Weihong
    [J]. PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015 : 401 - 405