BT+-tree: A New Index for Temporal Information in Web Pages

Cited by: 0
Authors
Chen, Hong [1]
Li, Qiang [1]
Jin, Peiquan [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Peoples R China
Keywords
B+-TREES
DOI
Not available
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
With the growth of Web information, traditional search engines built on text-based search technology are unable to meet users' demands for Web search. Because many queries are time-related and most Web pages contain time information, developing time-aware Web search engines has become an important issue. In this paper we therefore study how to index the temporal information in Web pages. Our work assumes that each Web page has only one primary time, which is used in time-based Web search. We present a new index structure, the BT+-tree, which is based on the MAP21-tree. Unlike the MAP21-tree's double-tree structure, the BT+-tree uses a single tree. Furthermore, the BT+-tree handles duplicated keys effectively, whereas the MAP21-tree gives them little consideration. After discussing the index structure and the manipulation algorithms of the BT+-tree, we design a test program to measure its performance. The experimental results show that the BT+-tree is effective for indexing temporal information in Web pages.
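As a rough illustration of two ideas in the abstract (one primary time per page, and duplicated keys kept in a single ordered structure), the sketch below indexes page identifiers by date. It is not the BT+-tree or the MAP21-tree: the class name `PrimaryTimeIndex`, its methods, and the sample page ids are all hypothetical, and a sorted list with per-key buckets stands in for the tree.

```python
import bisect
from datetime import date


class PrimaryTimeIndex:
    """Toy stand-in for a time index: one sorted key list, one bucket per key.

    Assumption: each page has exactly one primary time (as in the abstract).
    This is NOT the paper's BT+-tree; it only illustrates duplicate-key handling.
    """

    def __init__(self):
        self._keys = []      # sorted list of distinct primary times
        self._buckets = {}   # primary time -> list of page ids sharing that time

    def insert(self, primary_time, page_id):
        bucket = self._buckets.get(primary_time)
        if bucket is None:
            # First page with this time: add the key once, keeping keys sorted.
            bisect.insort(self._keys, primary_time)
            self._buckets[primary_time] = [page_id]
        else:
            # Duplicated key: append to the existing bucket, no new key entry.
            bucket.append(page_id)

    def range_query(self, start, end):
        """Return page ids whose primary time lies in [start, end]."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        result = []
        for key in self._keys[lo:hi]:
            result.extend(self._buckets[key])
        return result


index = PrimaryTimeIndex()
index.insert(date(2010, 5, 1), "page-a")
index.insert(date(2010, 5, 1), "page-b")   # same primary time as page-a
index.insert(date(2011, 1, 15), "page-c")
print(index.range_query(date(2010, 1, 1), date(2010, 12, 31)))
```

Storing pages that share a time in one bucket keeps the key list free of repeats, which is one simple way to treat duplicated keys; the paper's own mechanism may differ.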
Pages: 68-78
Page count: 11