An inverted file cache for fast information retrieval

Cited by: 0
Authors
Shieh, WY [1]
Shann, JJJ [1]
Chung, CP [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci & Informat Engn, Hsinchu 300, Taiwan
Keywords
information retrieval system; inverted file; cache; hashing; memory management
DOI
Not available
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The inverted file is the most popular indexing mechanism for document search in an information retrieval system (IRS). However, the disk I/O needed to access the inverted file becomes a bottleneck in an IRS. To reduce this disk I/O, we propose a caching mechanism for accessing the inverted file, called the inverted file cache (IF cache). The cache uses a proposed hashing scheme that resolves collisions in the hash table with a linked-list structure, which speeds up entry indexing. Furthermore, the replacement and storage mechanisms of the cache are designed specifically for the inverted file structure. We verify the design experimentally, using documents collected from TREC (Text REtrieval Conference) and search requests generated from a Zipf-like distribution. Simulation results show that the IF cache can improve the performance of a test IRS by about 60% in terms of average search response time.
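The hashing idea in the abstract can be illustrated with a minimal sketch: a hash table whose buckets are linked lists of cached posting lists, driven by Zipf-like requests. Everything named here is an assumption made for illustration and is not taken from the paper: the class `ChainedPostingCache`, the byte-sum hash, the tiny in-memory `index` standing in for the on-disk inverted file, and in particular the least-recently-used eviction, which is only a placeholder for the paper's inverted-file-specific replacement policy (the abstract does not describe it).

```python
import random


class Node:
    """One cached entry in a bucket's singly linked collision chain."""

    def __init__(self, term, postings, next_node=None):
        self.term = term
        self.postings = postings
        self.next = next_node
        self.last_used = 0


class ChainedPostingCache:
    """Hypothetical posting-list cache: hash table with linked-list chaining.

    Eviction is plain LRU, an assumption; capacity is assumed to be >= 1.
    """

    def __init__(self, num_buckets=8, capacity=4):
        self.buckets = [None] * num_buckets
        self.capacity = capacity
        self.size = 0
        self.clock = 0  # logical time used for LRU ordering
        self.hits = 0
        self.misses = 0

    def _index(self, term):
        # Simple deterministic hash (assumption): sum of UTF-8 bytes.
        return sum(term.encode("utf-8")) % len(self.buckets)

    def get(self, term):
        """Walk the bucket's chain; return postings on a hit, None on a miss."""
        self.clock += 1
        node = self.buckets[self._index(term)]
        while node is not None:
            if node.term == term:
                node.last_used = self.clock
                self.hits += 1
                return node.postings
            node = node.next
        self.misses += 1
        return None

    def put(self, term, postings):
        """Insert or update an entry, evicting the LRU entry when full."""
        self.clock += 1
        i = self._index(term)
        node = self.buckets[i]
        while node is not None:
            if node.term == term:
                node.postings = postings
                node.last_used = self.clock
                return
            node = node.next
        if self.size >= self.capacity:
            self._evict_lru()
        new_node = Node(term, postings, self.buckets[i])  # push at chain head
        new_node.last_used = self.clock
        self.buckets[i] = new_node
        self.size += 1

    def _evict_lru(self):
        # Find the least recently used node across all chains.
        victim_bucket, victim = None, None
        for b, head in enumerate(self.buckets):
            node = head
            while node is not None:
                if victim is None or node.last_used < victim.last_used:
                    victim_bucket, victim = b, node
                node = node.next
        # Unlink the victim from its chain.
        prev, node = None, self.buckets[victim_bucket]
        while node is not victim:
            prev, node = node, node.next
        if prev is None:
            self.buckets[victim_bucket] = victim.next
        else:
            prev.next = victim.next
        self.size -= 1


def zipf_queries(terms, n, s=1.0, seed=0):
    """Draw n terms with probability proportional to 1 / rank**s."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** s) for rank in range(1, len(terms) + 1)]
    return rng.choices(terms, weights=weights, k=n)


# Usage: a toy in-memory dict stands in for the on-disk inverted file.
index = {"cache": [1, 4], "disk": [2], "hash": [3, 5], "file": [1], "zipf": [6]}
cache = ChainedPostingCache(num_buckets=4, capacity=3)
for term in zipf_queries(list(index), 1000):
    if cache.get(term) is None:  # miss: "read from disk", then cache it
        cache.put(term, index[term])
print(f"hit ratio: {cache.hits / (cache.hits + cache.misses):.2f}")
```

Because Zipf-like requests concentrate on a few high-rank terms, even a small cache of this kind serves many lookups without touching the backing index, which is the effect the abstract measures as reduced response time.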
Pages: 681-695
Page count: 15
Related papers
50 in total
  • [1] An extended inverted file approach for information retrieval
    Ounis, I
    Pasca, M
    [J]. IDEAS '97 - INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 1997, : 397 - 402
  • [2] INVERTED FILE PROCESSOR FOR INFORMATION-RETRIEVAL
    STELLHORN, WH
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 1977, 26 (12) : 1258 - 1267
  • [3] BIGFile: Bayesian Information Gain for Fast File Retrieval
    Liu, Wanyu
    Rioul, Olivier
    Mcgrenere, Joanna
    Mackay, Wendy E.
    Beaudouin-Lafon, Michel
    [J]. PROCEEDINGS OF THE 2018 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2018), 2018,
  • [4] Fast query evaluation through document identifier assignment for inverted file-based information retrieval systems
    Cheng, CS
    Chung, CP
    Shann, JJJ
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2006, 42 (03) : 729 - 750
  • [5] Inverted file partitioning for distributed query processing in information retrieval systems
    Srisawat, J
    Alexandridis, N
    OConnell, M
    [J]. PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS - PROCEEDINGS OF THE ISCA 9TH INTERNATIONAL CONFERENCE, VOLS I AND II, 1996, : 738 - 743
  • [6] A tree-based inverted file for fast ranked-document retrieval
    Shieh, WY
    Chen, TF
    Chung, CP
    [J]. IKE'03: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE ENGINEERING, VOLS 1 AND 2, 2003, : 64 - 69
  • [8] INVERTED RETRIEVAL FILE OF CHEMICAL ANALYSIS DATA
    YAKUSHEV, VM
    [J]. NAUCHNO-TEKHNICHESKAYA INFORMATSIYA SERIYA 2-INFORMATSIONNYE PROTSESSY I SISTEMY, 1971, (06): : 23 - &
  • [9] THE INVERTED FILE - MARKETING INFORMATION PRODUCTS
    ARNOLD, S
    [J]. ONLINE, 1986, 10 (01): : 6 - 11
  • [10] RETRIEVAL TIMES FOR A PACKED DIRECT ACCESS INVERTED FILE
    BAYES, AJ
    [J]. COMMUNICATIONS OF THE ACM, 1969, 12 (10) : 582 - &