Top-k approximate selection for typicality query results over spatio-textual data

Cited by: 0
Authors
Meng, Xiangfu [1]
Zhang, Xiaoyan [1]
Huo, Hongjin [1]
Leng, Qiangkui [1]
Affiliations
[1] Liaoning Tech Univ, Sch Elect & Informat Engn, Huludao 125105, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Spatio-textual data; spatial keyword query; Probability density estimation; Typicality analysis; Top-k approximate selection; KEYWORD SEARCH;
DOI
10.1007/s10115-023-02013-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Spatial keyword query is a classical query processing mode for spatio-textual data that aims to provide users with the spatio-textual objects having the highest spatial proximity and textual similarity to a given query. However, the top-k result objects obtained in this mode are often similar to one another, whereas users want the system to pick the top-k typical results from the candidate query results so that they can understand the representative features of the candidate result set. To address typicality analysis and typical object selection for spatio-textual query results, a typicality evaluation and top-k approximate selection approach is proposed. First, the approach calculates the synthetic distances between all spatio-textual objects over the dimensions of geographic location, textual semantics, and numeric attributes. Then, a hybrid index structure that simultaneously supports location, text, and numeric multi-dimensional matching is presented to obtain the candidate query results quickly. Based on the synthetic distances between spatio-textual objects, a method built on Gaussian kernel probability density estimation is proposed for measuring the typicality of query results. To facilitate query result analysis and top-k typical object selection, two approximate selection algorithms for the top-k typical objects are presented: one based on the Tournament strategy and one based on local neighborhoods. The experimental results demonstrate that the text semantic relevancy measure for spatio-textual objects is accurate and reasonable, and that the local neighborhood-based approximate selection algorithm achieves both a low error rate and high execution efficiency. The source code and datasets used in this paper are available at https://github.com/JiaShengS/Typicality_analysis/.
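The two ideas named in the abstract, kernel-density typicality and tournament-style approximate selection, can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the repository linked above). It assumes a precomputed symmetric distance matrix `dist`, a hand-picked bandwidth `h`, and a generic randomized tournament in which each small group promotes its locally most typical member. The function names, the `group_size` parameter, and the remove-and-repeat loop for picking k objects are all hypothetical choices made for illustration.

```python
import math
import random


def typicality(i, members, dist, h):
    """Gaussian-kernel density estimate of object i relative to `members`.

    dist is a precomputed pairwise distance matrix (assumed input);
    h is the kernel bandwidth (assumed to be chosen by the caller).
    """
    norm = 1.0 / (len(members) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-dist[i][j] ** 2 / (2.0 * h * h)) for j in members)


def tournament_most_typical(candidates, dist, h, group_size, rng):
    """Approximate the most typical candidate with randomized group rounds.

    Each round shuffles the pool, splits it into groups, and keeps the
    member of each group that is most typical within that group only.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    pool = list(candidates)
    while len(pool) > 1:
        rng.shuffle(pool)
        winners = []
        for start in range(0, len(pool), group_size):
            group = pool[start:start + group_size]
            winners.append(max(group, key=lambda i: typicality(i, group, dist, h)))
        pool = winners
    return pool[0]


def top_k_typical(dist, k, h=1.0, group_size=4, seed=0):
    """Pick k objects by running the tournament k times (illustrative scheme)."""
    rng = random.Random(seed)
    remaining = list(range(len(dist)))
    chosen = []
    while remaining and len(chosen) < k:
        best = tournament_most_typical(remaining, dist, h, group_size, rng)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy usage: four points on a line, with absolute difference as the distance.
points = [0.0, 1.0, 2.0, 10.0]
dist = [[abs(a - b) for b in points] for a in points]
selected = top_k_typical(dist, k=2, h=1.0, group_size=2)
```

When `group_size` is at least the number of candidates, the tournament collapses to a single group and the selection is exact. Smaller groups trade accuracy for fewer kernel evaluations per round, which is the sense in which the selection is approximate.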
Pages: 1425-1468
Page count: 44
Related papers
50 in total
  • [31] Approximate spatio-temporal top-k publish/subscribe
    Chen, Lisi
    Shang, Shuo
    [J]. WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2019, 22 (05): : 2153 - 2175
  • [32] Personalizing the Top-k Spatial Keyword Preference Query with textual classifiers
    Dias de Almeida, Joao Paulo
    Durao, Frederico Araujo
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 162
  • [33] Effective and efficient top-k query processing over incomplete data streams
    Ren, Weilong
    Lian, Xiang
    Ghazinour, Kambiz
    [J]. INFORMATION SCIENCES, 2021, 544 : 343 - 371
  • [34] A lattice framework for reusing top-k query results
    Hill, B
    [J]. PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2005, : 38 - 43
  • [35] Bounded diversification methods for top-k query results
    [J]. Zhou, Yu (zyawf810@163.com), 1600, Chinese Academy of Sciences (25):
  • [36] What happened then and there: Top-k spatio-temporal keyword query
    Liu, Xiping
    Wan, Changxuan
    Xiong, Neal N.
    Liu, Dexi
    Liao, Guoqiong
    Deng, Song
    [J]. INFORMATION SCIENCES, 2018, 453 : 281 - 301
  • [37] Top-k closest pairs join query: An approximate algorithm for large high dimensional data
    Angiulli, F
    Pizzuti, C
    [J]. INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 2004, : 103 - 110
  • [38] Efficient top-k query evaluation on probabilistic data
    Re, Christopher
    Dalvi, Nilesh
    Suciu, Dan
    [J]. 2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2007, : 861 - +
  • [39] Top-k Correlated Subgraph Query for Data Streams
    Pan, Shirui
    Zhu, Xingquan
    Fang, Meng
    [J]. 2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 2906 - 2909
  • [40] A Top-k query algorithm on uncertain streaming data
    Wang, Ying
    Yu, Jianqiao
    [J]. Journal of Computational Information Systems, 2013, 9 (13): : 5273 - 5279