Answering Top-k Representative Queries on Graph Databases

Cited by: 21
Authors
Ranu, Sayan [1]
Minh Hoang [2]
Singh, Ambuj [2]
Affiliations
[1] IIT Madras, Dept CSE, Madras, Tamil Nadu, India
[2] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Funding
U.S. National Science Foundation
Keywords
Graph Search; Representative power; top-k;
DOI
10.1145/2588555.2610524
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Given a function that classifies a data object as relevant or irrelevant, we consider the task of selecting k objects that best represent all relevant objects in the underlying database. This problem occurs naturally when analysts want to familiarize themselves with the relevant objects in a database using a small set of k exemplars. In this paper, we solve the problem of top-k representative queries on graph databases. While graph databases model a wide range of scientific data, solving the problem in the context of graphs presents us with unique challenges due to the inherent complexity of matching structures. Furthermore, top-k representative queries map to the classic Set Cover problem, making it NP-hard. To overcome these challenges, we develop a greedy approximation with theoretical guarantees on the quality of the answer set, noting that a better approximation is not feasible in polynomial time. To further optimize the quadratic computational cost of the greedy algorithm, we propose an index structure called NB-Index to index the theta-neighborhoods of the database graphs by employing a novel combination of Lipschitz embedding and agglomerative clustering. Extensive experiments on real graph datasets validate the efficiency and effectiveness of the proposed techniques that achieve up to two orders of magnitude speed-up over state-of-the-art algorithms.
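The greedy selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or its NB-Index; it is a plain greedy coverage heuristic under stated assumptions: objects are hashable identifiers, `distance` is a hypothetical user-supplied graph distance, and an object "represents" every relevant object within distance `theta` of it. Building the neighborhoods naively costs a quadratic number of distance calls, which is the cost an index would aim to reduce.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple


def greedy_representatives(
    relevant: Sequence[Hashable],
    distance: Callable[[Hashable, Hashable], float],
    theta: float,
    k: int,
) -> Tuple[List[Hashable], Set[Hashable]]:
    """Pick up to k objects whose theta-neighborhoods cover many relevant objects.

    All names here are illustrative assumptions, not taken from the paper.
    """
    # Naive theta-neighborhoods: O(n^2) distance computations.
    neighborhoods = {
        g: {h for h in relevant if distance(g, h) <= theta} for g in relevant
    }

    chosen: List[Hashable] = []
    covered: Set[Hashable] = set()

    for _ in range(min(k, len(relevant))):
        candidates = [g for g in relevant if g not in chosen]
        if not candidates:
            break
        # Marginal gain = number of not-yet-covered objects g would add.
        best = max(candidates, key=lambda g: len(neighborhoods[g] - covered))
        if not neighborhoods[best] - covered:
            break  # nothing left to gain
        chosen.append(best)
        covered |= neighborhoods[best]

    return chosen, covered


if __name__ == "__main__":
    # Toy stand-in: integers as "graphs", absolute difference as the distance.
    ids = [0, 1, 2, 10, 11, 20]
    reps, cov = greedy_representatives(ids, lambda a, b: abs(a - b), theta=1, k=2)
    print(reps, sorted(cov))
```

In the toy run, integers stand in for database graphs purely to keep the example self-contained; a real setting would plug in a graph distance such as an edit-distance approximation.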
Pages: 1163-1174
Page count: 12