Ranking the big sky: efficient top-k skyline computation on massive data

Cited by: 2
Authors
Han, Xixian [1]
Wang, Bailing [1]
Li, Jianzhong [1]
Gao, Hong [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Massive data; Top-k skyline; RSTS algorithm; Table scan; Pruning operation; QUERIES; ALGORITHMS; DISTANCE;
DOI
10.1007/s10115-018-1256-0
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many applications, the top-k skyline query is an important operation that returns the k skyline tuples with the highest domination scores in a potentially huge data space. Our analysis shows that the existing algorithms cannot process top-k skyline queries on massive data efficiently. In this paper, we propose a novel table-scan-based algorithm, RSTS, to compute top-k skyline results on massive data efficiently. RSTS first builds a presorted table whose tuples are arranged in the order of round-robin retrieval on sorted column lists. RSTS then proceeds in two phases. In phase 1, the candidate tuples are acquired by a sequential scan of the presorted table. In phase 2, RSTS calculates the domination scores of the candidates and returns the query results by another sequential scan. We prove that RSTS terminates early and give a theoretical analysis of its scan depths. We also devise a pruning rule for candidate tuples; the theoretical pruning effect shows that the majority of the skyline results can be discarded directly. Extensive experiments on synthetic and real-life data sets show that RSTS significantly outperforms the existing algorithms.
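To make the query in the abstract concrete, the following is a minimal brute-force sketch of the top-k skyline definition, not the RSTS algorithm itself. It assumes that smaller attribute values are preferred and that a tuple's domination score is the number of tuples it dominates. The function names (`dominates`, `skyline`, `top_k_skyline`, `round_robin_order`) are hypothetical, and `round_robin_order` is only one plausible reading of "the order of round-robin retrieval on sorted column lists"; the paper's actual construction may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every attribute, strictly better in one.
    Assumption: smaller values are better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(data: Sequence[Point]) -> List[Point]:
    """Tuples that no other tuple dominates (quadratic brute force)."""
    return [p for p in data if not any(dominates(q, p) for q in data)]


def domination_score(p: Point, data: Sequence[Point]) -> int:
    """Number of tuples in data that p dominates."""
    return sum(1 for q in data if dominates(p, q))


def top_k_skyline(data: Sequence[Point], k: int) -> List[Tuple[int, Point]]:
    """The k skyline tuples with the highest domination scores.
    Ties are broken by tuple value so the output is deterministic."""
    scored = [(domination_score(p, data), p) for p in skyline(data)]
    scored.sort(key=lambda sp: (-sp[0], sp[1]))
    return scored[:k]


def round_robin_order(data: Sequence[Point]) -> List[int]:
    """Hypothetical presorted order: sort each column ascending, then read
    the sorted lists round-robin, emitting each tuple index when first seen."""
    m = len(data[0])
    lists = [
        sorted(range(len(data)), key=lambda i, j=j: (data[i][j], i))
        for j in range(m)
    ]
    seen, order = set(), []
    for depth in range(len(data)):
        for j in range(m):
            i = lists[j][depth]
            if i not in seen:
                seen.add(i)
                order.append(i)
    return order
```

For example, on `[(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (6, 6)]` the skyline is `(1, 5)`, `(2, 2)`, `(5, 1)`, and `(2, 2)` ranks first because it dominates three tuples. The brute-force approach compares every pair of tuples, which is the kind of cost a scan-based method with early termination and pruning aims to avoid on massive data.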
Pages: 415-446
Number of pages: 32
Related papers
50 in total
  • [1] Ranking the big sky: efficient top-k skyline computation on massive data
    Han, Xixian
    Wang, Bailing
    Li, Jianzhong
    Gao, Hong
    [J]. Knowledge and Information Systems, 2019, 60 : 415 - 446
  • [2] Efficient Top-k Skyline Computation in MapReduce
    Song, Baoyan
    Liu, Aili
    Ding, Linlin
    [J]. 2015 12TH WEB INFORMATION SYSTEM AND APPLICATION CONFERENCE (WISA), 2015, : 67 - 70
  • [3] Ranking uncertain sky: The probabilistic top-k skyline operator
    Zhang, Ying
    Zhang, Wenjie
    Lin, Xuemin
    Jiang, Bin
    Pei, Jian
    [J]. INFORMATION SYSTEMS, 2011, 36 (05) : 898 - 915
  • [4] Efficient Top-k Dominating Computation on Massive Data
    Han, Xixian
    Li, Jianzhong
    Gao, Hong
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (06) : 1199 - 1211
  • [5] Efficient Computation of Top-K Skyline Objects in Data Set With Uncertain Preferences
    Sukhwani, Nitesh
    Kagita, Venkateswara Rao
    Kumar, Vikas
    Panda, Sanjaya Kumar
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2021, 17 (03) : 68 - 80
  • [6] Efficient Top-k Dominating Computation on Massive Data (Extended abstract)
    Han, Xixian
    Li, Jianzhong
    Gao, Hong
    [J]. 2018 IEEE 34TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2018, : 1771 - 1772
  • [7] Efficient Top-k Retrieval on Massive Data
    Han, Xixian
    Li, Jianzhong
    Gao, Hong
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (10) : 2687 - 2699
  • [8] Efficient Top-k Retrieval on Massive Data
    Han, Xixian
    Li, Jianzhong
    Gao, Hong
    [J]. 2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 1496 - 1497
  • [9] Efficient skyline and top-k retrieval in subspaces
    Tao, Yufei
    Xiao, Xiaokui
    Pei, Jian
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (08) : 1072 - 1088
  • [10] Efficient evaluation of Top-k Skyline queries
    Goncalves, Marlene
    Vidal, Maria-Esther
    [J]. REVISTA TECNICA DE LA FACULTAD DE INGENIERIA UNIVERSIDAD DEL ZULIA, 2009, 32 (02) : 170 - 179