Ranking the big sky: efficient top-k skyline computation on massive data

Cited by: 2
Authors
Han, Xixian [1]
Wang, Bailing [1]
Li, Jianzhong [1]
Gao, Hong [1]
Affiliation
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Massive data; Top-k skyline; RSTS algorithm; Table scan; Pruning operation; QUERIES; ALGORITHMS; DISTANCE;
DOI
10.1007/s10115-018-1256-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In many applications, the top-k skyline query is an important operation that returns the k skyline tuples with the highest domination scores in a potentially huge data space. Our analysis shows that existing algorithms cannot process top-k skyline queries on massive data efficiently. In this paper, we propose RSTS, a novel table-scan-based algorithm that computes top-k skyline results on massive data efficiently. RSTS first builds a presorted table whose tuples are arranged in the order of round-robin retrieval on the sorted column lists. RSTS then proceeds in two phases. In phase 1, the candidate tuples are acquired by a sequential scan of the presorted table. In phase 2, RSTS calculates the domination scores of the candidates and returns the query results by another sequential scan. We prove that RSTS has the early-termination property and give a theoretical analysis of its scan depths. We also devise a pruning rule for candidate tuples; the theoretical pruning effect shows that the majority of the skyline results can be discarded directly. Extensive experiments on synthetic and real-life data sets show that RSTS significantly outperforms the existing algorithms.
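To make the terms in the abstract concrete, the following Python sketch shows a brute-force top-k skyline computation and one possible reading of the round-robin presorted order. It is not the RSTS algorithm: it has no early termination and no pruning, and it only illustrates the definitions. The function names, the convention that smaller attribute values are better, and the tie-breaking by input order are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every attribute, strictly better in one.

    Assumption: smaller values are better.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def top_k_skyline(data: Sequence[Point], k: int) -> List[Tuple[int, Point]]:
    """Brute-force reference: (domination score, tuple) for the k best skyline tuples."""
    # Skyline tuples are those not dominated by any other tuple.
    skyline = [p for p in data if not any(dominates(q, p) for q in data)]
    # Domination score of p = number of tuples in the data set that p dominates.
    scored = [(sum(dominates(p, q) for q in data), p) for p in skyline]
    # Stable sort by descending score; ties keep input order (an assumption).
    scored.sort(key=lambda sp: -sp[0])
    return scored[:k]


def round_robin_order(data: Sequence[Point]) -> List[int]:
    """Row indices in the order of round-robin retrieval on sorted column lists.

    Assumption: each column list is sorted ascending, and a row is emitted
    the first time any column list reaches it.
    """
    m = len(data[0])
    sorted_lists = [
        sorted(range(len(data)), key=lambda i, j=j: data[i][j]) for j in range(m)
    ]
    seen = set()
    order = []
    for depth in range(len(data)):
        for j in range(m):
            i = sorted_lists[j][depth]
            if i not in seen:
                seen.add(i)
                order.append(i)
    return order


data = [(1, 9), (2, 2), (9, 1), (5, 5), (3, 8), (8, 3), (6, 6)]
print(top_k_skyline(data, 2))
print(round_robin_order(data))
```

The brute-force version compares every pair of tuples, which is the cost a table-scan approach with early termination aims to avoid on massive data.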
Pages: 415-446
Number of pages: 32