Propeller: A Scalable Real-Time File-Search Service in Distributed Systems

Cited by: 6
Authors
Xu, Lei [1]
Jiang, Hong [1]
Tian, Lei [1]
Huang, Ziling [1]
Affiliation
[1] Univ Nebraska Lincoln, Lincoln, NE 68588 USA
DOI
10.1109/ICDCS.2014.46
Chinese Library Classification (CLC) number
TP3 [Computing Technology, Computer Technology];
Discipline classification code
0812;
Abstract
A file-search service is a valuable facility for accelerating many analytics applications, because it can drastically reduce the scale of the input data. The main challenge in designing large-scale, accurate file-search services is how to support real-time indexing in an efficient and scalable way. To address this challenge, we propose a distributed file-search service, called Propeller, which uses a special file-access pattern, called access-causality, to partition file indices, exposing substantial access locality and parallelism that accelerate the file-indexing process. Extensive evaluations of Propeller show that it is real-time in file-indexing operations, accurate in file-search results, and scalable to large datasets. It achieves significantly better file-indexing and file-search performance (up to 250x) than a centralized solution (MySQL), and much higher accuracy and substantially lower query latency (up to 22x) than a state-of-the-art desktop search engine (Spotlight).
Pages: 378-388
Page count: 11