Towards an Effective XML Keyword Search

Cited by: 26
Authors
Bao, Zhifeng [1]
Lu, Jiaheng [2]
Ling, Tok Wang [1]
Chen, Bo [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[2] Renmin Univ China, Key Lab Data Engn & Knowledge Engn, MOE, Beijing 100872, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
XML; keyword search; ranking; PROXIMITY SEARCH;
DOI
10.1109/TKDE.2010.63
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has emerged recently. The differences between text databases and XML databases result in three new challenges: 1) Identifying the user's search intention, i.e., identifying the XML node types that the user wants to search for and search via. 2) Resolving keyword ambiguity problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text value of different XML node types and carry different meanings; a keyword can appear as the tag name of different XML node types with different meanings. 3) As the search results are subtrees of the XML document, a new scoring function is needed to estimate their relevance to a given query. However, existing methods cannot resolve these challenges and thus return results of low quality in terms of query relevance. In this paper, we propose an IR-style approach that utilizes the statistics of the underlying XML data to address these challenges. We first propose specific guidelines that a search engine should meet in both search intention identification and relevance-oriented ranking of search results. Then, based on these guidelines, we design novel formulae to identify the search-for nodes and search-via nodes of a query, and present a novel XML TF*IDF ranking strategy to rank the individual matches of all possible search intentions. To complement our result ranking framework, we also take popularity into consideration for results that have comparable relevance scores. Lastly, extensive experiments have been conducted to show the effectiveness of our approach.
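The keyword ambiguity described in the abstract (one keyword matching both tag names and text values, across several node types) can be illustrated with a small Python sketch. This is not the paper's formulae or its XML TF*IDF ranking; it only gathers the kind of per-node-type statistics such an approach could start from. The function name `keyword_stats`, the sample document, and the choice to identify a node type by its root-to-node tag path are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def keyword_stats(xml_text, keywords):
    """Hypothetical helper: for each keyword, count per node type
    (assumed here to be the root-to-node tag path) how many nodes match it
    as a tag name and how many contain it in their own text value."""
    root = ET.fromstring(xml_text)
    wanted = {k.lower() for k in keywords}
    stats = {k: {"tag": defaultdict(int), "text": defaultdict(int)} for k in wanted}

    def visit(node, path):
        node_type = path + "/" + node.tag
        # Case 1: the keyword matches the tag name of this node.
        if node.tag.lower() in wanted:
            stats[node.tag.lower()]["tag"][node_type] += 1
        # Case 2: the keyword occurs as a word in this node's own text.
        words = set((node.text or "").lower().split())
        for k in wanted & words:
            stats[k]["text"][node_type] += 1
        for child in node:
            visit(child, node_type)

    visit(root, "")
    return stats


# Illustrative document (an assumption, not data from the paper).
SAMPLE = """
<bib>
  <book><title>XML search</title><author>Title Smith</author></book>
  <article><title>Keyword search</title></article>
</bib>
"""

stats = keyword_stats(SAMPLE, ["title", "search"])
for keyword, counts in stats.items():
    print(keyword, dict(counts["tag"]), dict(counts["text"]))
```

In this sample, "title" matches as a tag name under two node types and as a text value under a third, which is the ambiguity a statistics-based ranking has to weigh.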
Pages: 1077-1092
Page count: 16
Related papers
50 in total
  • [1] MCN: A New Semantics Towards Effective XML Keyword Search
    Zhou, Junfeng
    Bao, Zhifeng
    Ling, Tok Wang
    Meng, Xiaofeng
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2009, 5463 : 511 - +
  • [2] Adaptive and Effective Keyword Search for XML
    Yang, Weidong
    Zhu, Hao
    Li, Nan
    Zhu, Guansheng
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT I: 15TH PACIFIC-ASIA CONFERENCE, PAKDD 2011, 2011, 6634 : 423 - 434
  • [3] ELCEA: An Entity-based Semantics towards Effective XML Keyword Search
    Ji, Qingling
    Zhou, Junfeng
    Guo, Jingfeng
    [J]. 2010 2ND INTERNATIONAL WORKSHOP ON DATABASE TECHNOLOGY AND APPLICATIONS PROCEEDINGS (DBTA), 2010,
  • [4] Exploiting structures in keyword queries for effective XML search
    Liu, Xiping
    Chen, Lei
    Wan, Changxuan
    Liu, Dexi
    Xiong, Naixue
    [J]. INFORMATION SCIENCES, 2013, 240 : 56 - 71
  • [5] An Effective Object-Level XML Keyword Search
    Bao, Zhifeng
    Lu, Jiaheng
    Ling, Tok Wang
    Xu, Liang
    Wu, Huayu
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT I, PROCEEDINGS, 2010, 5981 : 93 - +
  • [6] Effective keyword search in XML documents based on MIU
    Xu, Jianjun
    Lu, Jiaheng
    Wang, Wei
    Shi, Baile
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2006, 3882 : 702 - 716
  • [7] Effective XML keyword search basing on the path semantics
    [J]. Ji, B. (xianyinglou@gmail.com), 2013, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09):
  • [8] Effective Keyword Search for Candidate Fragments of XML Documents
    Wen, Yanlong
    Zhang, Haiwei
    Zhang, Ying
    Zhang, Lu
    Xu, Lei
    Yuan, Xiaojie
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2011, 2011, 6637 : 427 - 439
  • [9] Effective XML Keyword Search with Relevance Oriented Ranking
    Bao, Zhifeng
    Ling, Tok Wang
    Chen, Bo
    Lu, Jiaheng
    [J]. ICDE: 2009 IEEE 25TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2009, : 517 - +
  • [10] From Structure-Based to Semantics-Based: Towards Effective XML Keyword Search
    Thuy Ngoc Le
    Wu, Huayu
    Ling, Tok Wang
    Li, Luochen
    Lu, Jiaheng
    [J]. CONCEPTUAL MODELING, ER 2013, 2013, 8217 : 356 - +