Answering Top-k Representative Queries on Graph Databases

Cited by: 21
Authors
Ranu, Sayan [1]
Minh Hoang [2]
Singh, Ambuj [2]
Affiliations
[1] IIT Madras, Dept CSE, Madras, Tamil Nadu, India
[2] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Funding
National Science Foundation (USA)
Keywords
Graph Search; Representative power; top-k
DOI
10.1145/2588555.2610524
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Given a function that classifies a data object as relevant or irrelevant, we consider the task of selecting k objects that best represent all relevant objects in the underlying database. This problem occurs naturally when analysts want to familiarize themselves with the relevant objects in a database using a small set of k exemplars. In this paper, we solve the problem of top-k representative queries on graph databases. While graph databases model a wide range of scientific data, solving the problem in the context of graphs presents us with unique challenges due to the inherent complexity of matching structures. Furthermore, top-k representative queries map to the classic Set Cover problem, making it NP-hard. To overcome these challenges, we develop a greedy approximation with theoretical guarantees on the quality of the answer set, noting that a better approximation is not feasible in polynomial time. To further optimize the quadratic computational cost of the greedy algorithm, we propose an index structure called NB-Index to index the theta-neighborhoods of the database graphs by employing a novel combination of Lipschitz embedding and agglomerative clustering. Extensive experiments on real graph datasets validate the efficiency and effectiveness of the proposed techniques that achieve up to two orders of magnitude speed-up over state-of-the-art algorithms.
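The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not include NB-Index. It assumes a caller-supplied `distance` function and models representative power as the number of relevant objects that fall inside the theta-neighborhood of at least one chosen exemplar. The function name `greedy_representatives` and the toy data are hypothetical. Integers with absolute difference stand in for graphs and a graph distance.

```python
from typing import Callable, Hashable, Sequence


def greedy_representatives(
    relevant: Sequence[Hashable],
    distance: Callable[[Hashable, Hashable], float],
    theta: float,
    k: int,
) -> tuple[list, set]:
    """Pick up to k objects whose theta-neighborhoods cover the most relevant objects."""
    # theta-neighborhood of each object: the relevant objects within distance theta.
    # Building these sets takes a quadratic number of distance computations.
    neighborhoods = {
        g: {h for h in relevant if distance(g, h) <= theta} for g in relevant
    }
    chosen: list = []
    covered: set = set()
    for _ in range(min(k, len(relevant))):
        # Greedy step: take the object with the largest marginal gain in coverage.
        best = max(
            (g for g in relevant if g not in chosen),
            key=lambda g: len(neighborhoods[g] - covered),
        )
        if not neighborhoods[best] - covered:
            break  # nothing left to gain
        chosen.append(best)
        covered |= neighborhoods[best]
    return chosen, covered


# Toy stand-in data (assumption): integers instead of graphs,
# absolute difference instead of a graph distance.
relevant_objects = [1, 2, 3, 10, 11, 20]
reps, covered = greedy_representatives(
    relevant_objects, lambda a, b: abs(a - b), theta=1, k=2
)
```

With this toy input the two exemplars cover five of the six relevant objects. For coverage objectives of this kind, the standard greedy strategy is known to achieve a (1 - 1/e) approximation. Every neighborhood is computed up front, which is the quadratic cost the abstract says NB-Index is designed to reduce.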
Pages: 1163-1174
Number of pages: 12