LAF: a new XML encoding and indexing strategy for keyword-based XML search

Cited by: 2
Authors
Deng, Zhi-Hong [1]
Xiang, Yong-Qing [1]
Gao, Ning [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Minist Educ, Key Lab Machine Percept, Beijing 100871, Peoples R China
Source
Concurrency and Computation: Practice and Experience, 2012
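Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);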
Keywords
XML keyword search; LAF; two-layer index; ABS; SLCA;
DOI
10.1002/cpe.2906
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
As a large number of corpora are represented, stored, and published in XML format, finding useful information in XML databases has become an increasingly important issue. Keyword search lets web users access XML data easily without learning a structured query language or studying complex data schemas. Most existing indexing strategies for XML keyword search are based on Dewey encoding. In this paper, we propose a new encoding method for XML documents called Level Order and Father (LAF). Using LAF encoding, we devise a new index structure, the two-layer LAF inverted index, which greatly reduces space complexity compared with a Dewey encoding-based inverted index. Furthermore, on top of the two-layer LAF inverted index, we propose a new keyword query algorithm, Algorithm based on Binary Search (ABS), that quickly finds all Smallest Lowest Common Ancestors (SLCAs). We experimentally evaluate the two-layer LAF inverted index and the ABS algorithm on four real XML data sets selected from Wikipedia. The experimental results demonstrate the advantages of our indexing method and query algorithm: the two-layer LAF index consumes less than half the space of the Dewey inverted index, and ABS is about one to two orders of magnitude faster than the classic Stack algorithm. Concurrency and Computation: Practice and Experience, 2012. © 2012 Wiley Periodicals, Inc.
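The abstract does not spell out the LAF encoding or the ABS algorithm, so the following is only a rough sketch of the general idea, under the assumption (taken from the name alone) that each node is identified by its level-order number and stores the number of its father. The function names `laf_encode`, `lca`, and `slca` are hypothetical, the inverted index is a plain single-layer dictionary rather than the two-layer index, and the SLCA computation is a naive baseline, not ABS.

```python
from collections import deque
import xml.etree.ElementTree as ET


def laf_encode(root):
    """Number nodes in level (BFS) order.

    Returns (father, index): father[i] is the number of node i's parent
    (-1 for the root); index maps a keyword to the ascending list of
    node numbers whose tag or direct text contains it.
    Assumption: this is an illustrative reading of "Level Order and Father".
    """
    father = [-1]
    index = {}
    queue = deque([(root, 0)])
    next_id = 1
    while queue:
        elem, num = queue.popleft()
        words = set((elem.text or "").lower().split()) | {elem.tag.lower()}
        for word in words:
            index.setdefault(word, []).append(num)
        for child in elem:
            father.append(num)          # father of node next_id
            queue.append((child, next_id))
            next_id += 1
    return father, index


def lca(father, a, b):
    """Lowest common ancestor: in level order a parent always has a smaller
    number than its child, so we repeatedly lift the larger number."""
    while a != b:
        if a > b:
            a = father[a]
        else:
            b = father[b]
    return a


def slca(father, index, keywords):
    """Naive SLCA baseline (not ABS): nodes whose subtree contains every
    keyword and that have no child with the same property."""
    common = None
    for kw in keywords:
        covered = set()
        for n in index.get(kw.lower(), []):
            while n != -1 and n not in covered:   # mark n and its ancestors
                covered.add(n)
                n = father[n]
        common = covered if common is None else common & covered
    common = common or set()
    return sorted(common - {father[n] for n in common})


doc = ET.fromstring(
    "<bib><paper><title>xml search</title><author>deng</author></paper>"
    "<paper><title>xml index</title><author>gao</author></paper></bib>"
)
father, index = laf_encode(doc)
print(father)                                # [-1, 0, 0, 1, 1, 2, 2]
print(slca(father, index, ["xml", "deng"]))  # [1], the first <paper>
```

Under this reading each posting is a single integer, whereas a Dewey label grows with node depth, which is one plausible intuition for the space savings the abstract reports.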
Pages: 1604-1621
Number of pages: 18
Related papers
50 in total
  • [41] Nearest Keyword Search on Probabilistic XML Data
    Zhao, Yue
    Yuan, Ye
    Wang, Guoren
    WEB TECHNOLOGIES AND APPLICATIONS, APWEB 2014, 2014, 8709 : 485 - 493
  • [42] Research & application of XML keyword search with sorting
    Li, Jianguo
    Tang, Yong
    Ji, Gaofeng
    Ma, Hui
    PROCEEDINGS OF THE 2007 1ST INTERNATIONAL SYMPOSIUM ON INFORMATION TECHNOLOGIES AND APPLICATIONS IN EDUCATION (ISITAE 2007), 2007, : 45 - 48
  • [43] A query refinement framework for XML keyword search
    Bao, Zhifeng
    Yu, Yi
    Shen, Jian
    Fu, Zhangjie
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2017, 20 (06): : 1469 - 1505
  • [44] Keyword Search over Probabilistic XML Data
    Zhao, Yue
    Wang, Guoren
    Yuan, Ye
    Wang, Junxia
    Lin, Chungang
    Yu, Ying
    2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015, : 1230 - 1235
  • [45] Containment based XML indexing
    Xu, Hai-Yuan
    Wu, Quan-Yuan
    Wang, Huai-Min
    Jia, Yan
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2003, 31 (08): : 1155 - 1159
  • [46] Research and Implementation of XML Keyword Search Algorithm Based on Semantic Relatives
    Shen, Mingyan
    Li, Xin
    Meng, Xiangfu
    MANUFACTURING SYSTEMS AND INDUSTRY APPLICATIONS, 2011, 267 : 811 - 815
  • [47] Keyword Search over Probabilistic XML Documents Based on Node Classification
    Zhao, Yue
    Yuan, Ye
    Wang, Guoren
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2015, 2015
  • [48] An efficient SLCA-based keyword search algorithm on uncertain XML
    Zhang, Xiaolin
    Hao, Kun
    Liu, Lixin
    Zhang, Huanxiang
    Journal of Computational Information Systems, 2015, 11 (21): : 7721 - 7729
  • [49] An XSketch-based spelling suggestion approach for XML keyword search
    Li, Sheng
    Wang, Junhu
    INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2014, 10 (03) : 245 - +
  • [50] Keyword search with path-based filtering over XML streams
    Bou, Savong
    Amagasa, Toshiyuki
    Kitagawa, Hiroyuki
    2014 IEEE 33RD INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS), 2014, : 337 - 338