The String Similarity Query Processing in Cloud Computing System

Cited by: 0
Author
LiaoYuanLai [1]
Affiliation
[1] Heyuan Polytech, Heyuan 517000, Peoples R China
Keywords
Cloud computing; String similarity; Query processing; Aggregation;
DOI
10.14257/ijgdc.2015.8.2.04
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
This paper targets string similarity search in cloud systems. Existing work focuses on query processing within a single server, which overflows both main memory and external memory when dealing with big data. To address these problems, the paper proposes a distributed index to support string similarity search in cloud environments. To provide efficient search within a single node, an external-memory index is designed that adopts multiple filtering techniques and optimization strategies. The external-memory-resident index supports the length filter and the positional filter on disk. The paper also presents the index construction method. During query processing, asymmetric q-grams are used to reduce the number of inverted lists read from disk. An adaptive algorithm is given to choose inverted lists and to seek a tradeoff between two aspects of query cost. The global index partitions the entire string dataset according to the content of the strings, using a proposed char vector space partition method. In this method, similar strings are assigned to the same computing nodes, which reduces the number of computing nodes involved in a single query. The partition method is also used to determine the set of computing nodes a query needs to access. Simulation results validate the efficiency and effectiveness of the proposed index.
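As a rough illustration of the filtering ideas named in the abstract, the following is a minimal in-memory sketch of a q-gram inverted index combined with a length filter and a count filter for edit-distance search. It is not the paper's external-memory or distributed index: the asymmetric q-gram scheme, the positional filter, and the adaptive list selection are not reproduced, and all function names (`qgrams`, `build_index`, `search`) and the parameters `q` and `tau` are assumptions made for this example.

```python
from collections import Counter, defaultdict


def qgrams(s, q=2):
    """Return the overlapping q-grams of s (no padding characters)."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def build_index(strings, q=2):
    """Inverted index: q-gram -> {string id: occurrences in that string}."""
    index = defaultdict(dict)
    for sid, s in enumerate(strings):
        for gram, count in Counter(qgrams(s, q)).items():
            index[gram][sid] = count
    return index


def search(query, strings, index, tau, q=2):
    """Return the strings within edit distance tau of query, in input order."""
    # Count filter: a string within distance tau must share at least
    # this many q-grams (counted as a multiset) with the query.
    min_shared = len(query) - q + 1 - q * tau
    if min_shared <= 0:
        # The bound cannot prune anything, so consider every string.
        candidates = range(len(strings))
    else:
        shared = defaultdict(int)
        for gram, qcount in Counter(qgrams(query, q)).items():
            for sid, scount in index.get(gram, {}).items():
                shared[sid] += min(qcount, scount)
        candidates = sorted(sid for sid, n in shared.items() if n >= min_shared)
    results = []
    for sid in candidates:
        s = strings[sid]
        # Length filter: lengths may differ by at most tau.
        if abs(len(s) - len(query)) > tau:
            continue
        # Verification: the filters only prune, so confirm exactly.
        if edit_distance(query, s) <= tau:
            results.append(s)
    return results


# Hypothetical sample data for illustration only.
strings = ["cloud", "clouds", "could", "query", "queue", "string"]
index = build_index(strings)
print(search("cloud", strings, index, tau=1))
```

In this sketch the filters only discard strings that cannot match, and every surviving candidate is verified with the exact edit distance, so the answer does not depend on how aggressively the filters prune.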
Pages: 25-35
Page count: 11