Efficient in-memory extensible inverted file

Cited by: 10
Authors
Luk, Robert W. P.
Lam, Wai
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
Keywords
information retrieval; indexing; optimization
DOI
10.1016/j.is.2006.06.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The growing amount of on-line data demands efficient parallel and distributed indexing mechanisms to manage large resource requirements and unpredictable system failures. Parallel and distributed indices built using commodity hardware like personal computers (PCs) can substantially save cost because PCs are produced in bulk, achieving economies of scale. However, PCs have a limited amount of random access memory (RAM), and the effective utilization of RAM for in-memory inversion is crucial. This paper presents an analytical investigation and an empirical evaluation of storage-efficient in-memory extensible inverted files, which are represented by fixed- or variable-sized linked list nodes. The size of these linked list nodes is determined by minimizing storage waste or maximizing storage utilization under different conditions, which leads to different storage allocation schemes. Minimizing storage waste also reduces the number of address indirections (i.e., chainings). We evaluated our storage allocation schemes using a number of reference collections. We found that the arrival rate scheme is the best in terms of both storage utilization and the mean number of chainings per term. The final storage utilization can be over 90% in our evaluation if a sufficient number of documents is indexed. The mean number of chainings is not large (less than 2.6 for all the reference collections). We have also shown that our best storage allocation scheme can be used for our extensible compressed inverted file. The final storage utilization of the extensible compressed inverted file can be over 90% in our evaluation, provided that a sufficient number of documents is indexed. The proposed storage allocation schemes can also be used by compressed extensible inverted files with word positions. (c) 2006 Elsevier B.V. All rights reserved.
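The linked-list representation described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class and method names (`Node`, `ExtensibleInvertedFile`, `next_capacity`) are hypothetical, pointer overhead is ignored when computing utilization, and the two node-sizing policies shown (a fixed size and a doubling size) are stand-ins chosen for illustration. The abstract does not give the details of the arrival rate scheme, so it is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One linked-list node: a block with room for `capacity` postings."""
    capacity: int
    postings: List[int] = field(default_factory=list)
    next: Optional["Node"] = None


class ExtensibleInvertedFile:
    """Hypothetical in-memory inverted file; each term owns a chain of nodes."""

    def __init__(self, next_capacity: Callable[[int], int]):
        # next_capacity(k) returns the capacity of the k-th node (k = 0, 1, ...)
        # in a term's chain; this is the storage allocation policy (an assumption).
        self.next_capacity = next_capacity
        self.heads: Dict[str, Node] = {}
        self.tails: Dict[str, Node] = {}
        self.node_counts: Dict[str, int] = {}

    def add(self, term: str, doc_id: int) -> None:
        tail = self.tails.get(term)
        if tail is None:
            # First posting for this term: allocate the head node.
            tail = Node(self.next_capacity(0))
            self.heads[term] = self.tails[term] = tail
            self.node_counts[term] = 1
        elif len(tail.postings) == tail.capacity:
            # Tail is full: chain a new node (one more address indirection).
            node = Node(self.next_capacity(self.node_counts[term]))
            tail.next = node
            self.tails[term] = node
            self.node_counts[term] += 1
            tail = node
        tail.postings.append(doc_id)

    def postings(self, term: str) -> List[int]:
        out: List[int] = []
        node = self.heads.get(term)
        while node is not None:
            out.extend(node.postings)
            node = node.next
        return out

    def utilization(self) -> float:
        # Fraction of allocated posting slots that are actually filled.
        used = allocated = 0
        for node in self.heads.values():
            while node is not None:
                used += len(node.postings)
                allocated += node.capacity
                node = node.next
        return used / allocated if allocated else 0.0

    def mean_chainings(self) -> float:
        # A chain of n nodes requires n - 1 chainings.
        if not self.node_counts:
            return 0.0
        return sum(n - 1 for n in self.node_counts.values()) / len(self.node_counts)


# Two illustrative policies: fixed-size nodes and doubling node sizes.
fixed = ExtensibleInvertedFile(lambda k: 4)
doubling = ExtensibleInvertedFile(lambda k: 2 ** (k + 1))
for index in (fixed, doubling):
    for doc_id in range(10):
        index.add("common", doc_id)
    index.add("rare", 3)
    print(index.utilization(), index.mean_chainings())
```

Passing the policy as a function keeps the chain logic independent of how node sizes are chosen, so a different allocation scheme can be compared by swapping in another `next_capacity` without touching the rest of the sketch.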
Pages: 733 - 754
Number of pages: 22
Related papers
50 in total
  • [1] Designing an Efficient Persistent In-Memory File System
    Sha, Edwin H. -M.
    Chen, Xianzhang
    Zhuge, Qingfeng
    Shi, Liang
    Jiang, Weiwen
    2015 IEEE NON-VOLATILE MEMORY SYSTEMS AND APPLICATIONS SYMPOSIUM (NVMSA), 2015,
  • [2] HydraFS: an efficient NUMA-aware in-memory file system
    Wu, Ting
    Chen, Xianzhang
    Liu, Kai
    Xiao, Chunhua
    Liu, Zhixiang
    Zhuge, Qingfeng
    Sha, Edwin H. -M.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2020, 23 (02) : 705 - 724
  • [3] An Efficient File System for Hybrid In-Memory NVM and Block Devices
    Zeng, Yuansong
    Sha, Edwin H. -M.
    Zhuge, Qingfeng
    Chen, Xianzhang
    Ma, Zhulin
    Wu, Lin
    2018 7TH IEEE NON-VOLATILE MEMORY SYSTEMS AND APPLICATIONS SYMPOSIUM (NVMSA 2018), 2018, : 43 - 48
  • [4] HydraFS: an efficient NUMA-aware in-memory file system
    Ting Wu
    Xianzhang Chen
    Kai Liu
    Chunhua Xiao
    Zhixiang Liu
    Qingfeng Zhuge
    Edwin H.-M. Sha
    Cluster Computing, 2020, 23 : 705 - 724
  • [5] The Design and Implementation of an Efficient Data Consistency Mechanism for In-Memory File Systems
    Chen, Xianzhang
    Sha, Edwin H. -M.
    Sun, Zhilong
    Zhuge, Qingfeng
    Jiang, Weiwen
    2016 13TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (ICESS) - PROCEEDINGS, 2016, : 170 - 175
  • [6] In-Memory File System with Efficient Swap Support for Mobile Smart Devices
    Choi, Jungsik
    Ahn, Joonwook
    Kim, Jiwon
    Ryu, Sungtae
    Han, Hwansoo
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2016, 62 (03) : 275 - 282
  • [7] The Design and Implementation of an Efficient User-Space In-memory File System
    Sha, Edwin H. -M.
    Jia, Yang
    Chen, Xianzhang
    Zhuge, Qingfeng
    Jiang, Weiwen
    Qin, Jiejie
    2016 5TH NON-VOLATILE MEMORY SYSTEMS AND APPLICATIONS SYMPOSIUM (NVMSA), 2016,
  • [8] PHOENIX - A SAFE IN-MEMORY FILE SYSTEM
    GAIT, J
    COMMUNICATIONS OF THE ACM, 1990, 33 (01) : 81 - 86
  • [9] Mobi-PMFS: An Efficient and Durable In-Memory File System for Mobile Devices
    Xiao, Chunhua
    Lin, Fangzhu
    Fu, Xiaoxiang
    Wu, Ting
    Zhu, Yuanjun
    Liu, Weichen
    2020 IEEE 44TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2020), 2020, : 73 - 78
  • [10] An Efficient Shared In-Memory File System for Co-Resident Virtual Machines
    Sha E.H.-M.
    Wu T.
    Zhuge Q.-F.
    Yang C.-S.
    Ma Z.-L.
    Chen X.-Z.
    Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42 (04) : 800 - 819