Automatic compiler-inserted I/O prefetching for out-of-core applications

Cited by: 0
Authors
Mowry, TC
Demke, AK
Krieger, O
DOI
10.1145/238721.238734
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Current operating systems offer poor performance when a numeric application's working set does not fit in main memory. As a result, programmers who wish to solve "out-of-core" problems efficiently are typically faced with the onerous task of rewriting an application to use explicit I/O operations (e.g., read/write). In this paper, we propose and evaluate a fully automatic technique that liberates the programmer from this task, provides high performance, and requires only minimal changes to current operating systems. In our scheme, the compiler provides the crucial information on future access patterns without burdening the programmer, the operating system supports non-binding prefetch and release hints for managing I/O, and the operating system cooperates with a run-time layer to accelerate performance by adapting to dynamic behavior and minimizing prefetch overhead. This approach maintains the abstraction of unlimited virtual memory for the programmer, gives the compiler the flexibility to aggressively move prefetches back ahead of references, and gives the operating system the flexibility to arbitrate between the competing resource demands of multiple applications. We have implemented our scheme using the SUIF compiler and the Hurricane operating system. Our experimental results demonstrate that our fully automatic scheme effectively hides the I/O latency in out-of-core versions of the entire NAS Parallel benchmark suite, resulting in speedups of roughly twofold for five of the eight applications, with two applications speeding up by threefold or more.
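The non-binding prefetch and release hints described in the abstract can be loosely illustrated at user level. The sketch below is not the authors' SUIF/Hurricane implementation: it assumes a memory-mapped file and uses `madvise` (`MADV_WILLNEED` / `MADV_DONTNEED`, where the platform provides them) as a stand-in for the hints a compiler would insert. The names `hint`, `sum_bytes_with_hints`, `BLOCK`, and `DISTANCE` are hypothetical and chosen only for illustration.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE
BLOCK = 4 * PAGE   # bytes processed per loop iteration (assumed value)
DISTANCE = 2       # how many blocks ahead to prefetch (assumed value)


def hint(mm, advice_name, start, length):
    """Issue a non-binding hint; do nothing if it is unsupported or fails."""
    advice = getattr(mmap, advice_name, None)
    if advice is None or not hasattr(mm, "madvise"):
        return False
    length = min(length, len(mm) - start)
    if length <= 0:
        return False  # the hinted range lies past the end of the mapping
    try:
        mm.madvise(advice, start, length)
    except (OSError, ValueError):
        return False  # hints are advisory, so a failure must not change results
    return True


def sum_bytes_with_hints(path):
    """Sum every byte of a file, hinting ahead of and behind the access point."""
    total = 0
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for start in range(0, len(mm), BLOCK):
            # Prefetch hint: the block DISTANCE iterations ahead will be needed.
            hint(mm, "MADV_WILLNEED", start + DISTANCE * BLOCK, BLOCK)
            total += sum(mm[start:start + BLOCK])
            # Release hint: this block will not be referenced again.
            hint(mm, "MADV_DONTNEED", start, BLOCK)
    return total


def demo():
    """Write a small temporary file and compare hinted and plain sums."""
    data = bytes(range(256)) * (BLOCK * 8 // 256)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        return sum_bytes_with_hints(path), sum(data)
    finally:
        os.unlink(path)
```

Because the hints are advisory, removing every `hint` call leaves the result unchanged; only the timing of I/O may differ. That is the property that lets a compiler place prefetches aggressively while the operating system remains free to ignore them.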
Pages: 3-17
Page count: 15