LAF: a new XML encoding and indexing strategy for keyword-based XML search

Times cited: 2
Authors
Deng, Zhi-Hong [1]
Xiang, Yong-Qing [1]
Gao, Ning [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Minist Educ, Key Lab Machine Percept, Beijing 100871, Peoples R China
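Source
Concurrency and Computation: Practice and Experience, 2012
Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program)
Keywords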
XML keyword search; LAF; two-layer index; ABS; SLCA;
DOI
10.1002/cpe.2906
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
As a large number of corpora are represented, stored, and published in XML format, finding useful information in XML databases has become an increasingly important problem. Keyword search lets web users access XML data without learning a structured query language or studying complex data schemas. Most existing indexing strategies for XML keyword search are based on Dewey encoding. In this paper, we propose a new encoding method for XML documents called Level Order and Father (LAF). Based on LAF encoding, we devise a new index structure, the two-layer LAF inverted index, which greatly reduces the space required compared with a Dewey encoding-based inverted index. Furthermore, using the two-layer LAF inverted index, we propose a new keyword query algorithm, Algorithm based on Binary Search (ABS), that quickly finds all Smallest Lowest Common Ancestors (SLCAs). We experimentally evaluate the two-layer LAF inverted index and the ABS algorithm on four real XML data sets selected from Wikipedia. The experimental results demonstrate the advantages of our index method and query algorithm: the space consumed by the two-layer LAF index is less than half of that consumed by the Dewey inverted index, and ABS is about one to two orders of magnitude faster than the classic Stack algorithm. Concurrency and Computation: Practice and Experience, 2012. © 2012 Wiley Periodicals, Inc.
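The abstract names the LAF encoding and the SLCA semantics but does not define them in detail, so the following is only an illustrative sketch under stated assumptions. It assumes that a "level order and father" style encoding gives each node its breadth-first number together with its father's number, and it computes SLCAs with a simple bottom-up pass, not with the paper's two-layer index or its ABS algorithm. The function names `laf_encode`, `lca`, and `slca`, and the toy document, are hypothetical.

```python
from collections import deque

def laf_encode(tree):
    """Number nodes in level (breadth-first) order.

    Assumed encoding, not taken from the paper: returns two lists indexed by
    node number, the node labels and each node's father number (-1 for root).
    """
    labels, father = [], []
    queue = deque([(tree, -1)])
    while queue:
        (label, children), parent = queue.popleft()
        number = len(labels)
        labels.append(label)
        father.append(parent)
        for child in children:
            queue.append((child, number))
    return labels, father

def lca(father, a, b):
    """Lowest common ancestor: a father always has a smaller number than its
    children, so repeatedly lifting the larger number is safe."""
    while a != b:
        if a > b:
            a = father[a]
        else:
            b = father[b]
    return a

def slca(labels, father, keywords):
    """Naive SLCA: nodes whose subtree contains every keyword while no child
    subtree does. Illustrative only; this is not the ABS algorithm."""
    keywords = list(keywords)
    full = (1 << len(keywords)) - 1
    mask = [0] * len(labels)
    for n, label in enumerate(labels):
        words = label.split()
        for i, kw in enumerate(keywords):
            if kw in words:
                mask[n] |= 1 << i
    has_full_child = [False] * len(labels)
    # Children have larger numbers than their father, so a reverse scan
    # finalizes every subtree mask before it is merged into the father.
    for n in range(len(labels) - 1, 0, -1):
        p = father[n]
        mask[p] |= mask[n]
        if mask[n] == full:
            has_full_child[p] = True
    return [n for n in range(len(labels))
            if mask[n] == full and not has_full_child[n]]

# Toy document: (label, children) tuples.
tree = ("bib", [
    ("book", [("title XML", []), ("author Deng", [])]),
    ("book", [("title search", []), ("author Gao", [])]),
])
labels, father = laf_encode(tree)
print(father)                                  # father number of each node
print(lca(father, 3, 4))                       # LCA of "title XML" and "author Deng"
print(slca(labels, father, ["XML", "Deng"]))   # SLCA node numbers
```

In this sketch every node is stored as a fixed-size pair of integers, whereas a Dewey label grows with the depth of the node; this is one plausible reason such an encoding could reduce index size, though the abstract itself does not spell out the mechanism.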
Pages: 1604-1621
Page count: 18