Automatic compiler-inserted I/O prefetching for out-of-core applications

Cited by: 0
Authors
Mowry, TC
Demke, AK
Krieger, O
DOI
10.1145/238721.238734
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Current operating systems offer poor performance when a numeric application's working set does not fit in main memory. As a result, programmers who wish to solve "out-of-core" problems efficiently are typically faced with the onerous task of rewriting an application to use explicit I/O operations (e.g., read/write). In this paper, we propose and evaluate a fully-automatic technique which liberates the programmer from this task, provides high performance, and requires only minimal changes to current operating systems. In our scheme, the compiler provides the crucial information on future access patterns without burdening the programmer, the operating system supports non-binding prefetch and release hints for managing I/O, and the operating system cooperates with a run-time layer to accelerate performance by adapting to dynamic behavior and minimizing prefetch overhead. This approach maintains the abstraction of unlimited virtual memory for the programmer, gives the compiler the flexibility to aggressively move prefetches back ahead of references, and gives the operating system the flexibility to arbitrate between the competing resource demands of multiple applications. We have implemented our scheme using the SUIF compiler and the Hurricane operating system. Our experimental results demonstrate that our fully-automatic scheme effectively hides the I/O latency in out-of-core versions of the entire NAS Parallel benchmark suite, thus resulting in speedups of roughly twofold for five of the eight applications, with two applications speeding up by threefold or more.
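The prefetch and release hints described in the abstract can be illustrated with a minimal sketch. It is not the paper's SUIF/Hurricane implementation: it assumes a Unix-like system and uses Python's `mmap.madvise` with `MADV_WILLNEED` and `MADV_DONTNEED` as stand-ins for the prefetch and release hints, written by hand rather than inserted by a compiler. The names `checksum_out_of_core`, `_hint`, `pages_per_block`, and `distance` are hypothetical and chosen only for illustration.

```python
import mmap

# Hint constants are platform-dependent; fall back to None where missing.
_WILLNEED = getattr(mmap, "MADV_WILLNEED", None)  # stand-in for "prefetch"
_DONTNEED = getattr(mmap, "MADV_DONTNEED", None)  # stand-in for "release"


def _hint(mapping, option, start, length):
    """Issue a non-binding hint; skip it if unsupported or out of range."""
    if option is None or not hasattr(mapping, "madvise"):
        return
    if start < 0 or start >= len(mapping):
        return
    try:
        mapping.madvise(option, start, length)
    except OSError:
        pass  # a dropped hint costs performance, never correctness


def checksum_out_of_core(path, pages_per_block=4, distance=2):
    """Sum all bytes of a file through a read-only memory mapping.

    The loop still reads the data with ordinary memory accesses, so the
    virtual-memory abstraction is kept. Hints are issued `distance`
    blocks ahead of the current reference (hypothetical tuning knob).
    """
    # Blocks are whole pages so that hint offsets stay page-aligned.
    block = pages_per_block * mmap.PAGESIZE
    total = 0
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        for start in range(0, len(m), block):
            # Prefetch hint for data needed `distance` iterations from now.
            _hint(m, _WILLNEED, start + distance * block, block)
            total += sum(m[start:start + block])
            # Release hint for the block that was just consumed.
            _hint(m, _DONTNEED, start, block)
    return total
```

Because the hints are advisory, the function returns the same result whether or not the operating system honors them; only the amount of I/O stall time is expected to change.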
Pages: 3-17
Number of pages: 15
Related papers
50 in total
  • [31] A Highly Efficient I/O-based Out-of-Core Stencil Algorithm with Globally Optimized Temporal Blocking
    Midorikawa, Hiroko
    Tan, Hideyuki
    2017 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING (PACRIM), 2017,
  • [32] SOWalker: An I/O-Optimized Out-of-Core Graph Processing System for Second-Order Random Walks
    Wu, Yutong
    Shi, Zhan
    Huang, Shicai
    Tian, Zhipeng
    Zuo, Pengwei
    Fang, Peng
    Wang, Fang
    Feng, Dan
    PROCEEDINGS OF THE 2023 USENIX ANNUAL TECHNICAL CONFERENCE, 2023, : 87 - 100
  • [33] A linux cluster-based parallel I/O system for high performance and out-of-core volume rendering
    Jeong, KJ
    Kim, JI
    Kim, NK
    Kim, JH
    Ryu, YJ
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000, : 2319 - 2324
  • [34] MLBS: Transparent Data Caching in Hierarchical Storage for Out-of-Core HPC Applications
    Alturkestani, Tariq
    Tonellot, Thierry
    Ltaief, Hatem
    Abdelkhalak, Rached
    Etienne, Vincent
    Keyes, David
    2019 IEEE 26TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS (HIPC), 2019, : 312 - 322
  • [35] An efficient page lock/release OS mechanism for out-of-core embedded applications
    Patil, Ameet
    Audsley, Neil
    13TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2007, : 81 - +
  • [36] Improving I/O performance through compiler-directed code restructuring and adaptive prefetching
    Son, Seung Woo
    Kandemir, Mahmut
    Karakoy, Mustafa
    2008 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-8, 2008, : 2485 - +
  • [37] Automatic ARIMA time series modeling for adaptive I/O prefetching
    Tran, N
    Reed, DA
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2004, 15 (04) : 362 - 377
  • [38] File I/O Cache Performance of Supercomputer Fugaku Using an Out-of-Core Direct Numerical Simulation Code of Turbulence
    Hatanaka, Yuto
    Yamane, Yuki
    Yamaguchi, Kenta
    Soga, Takashi
    Musa, Akihiro
    Ishihara, Takashi
    Uno, Atsuya
    Komatsu, Kazuhiko
    Kobayashi, Hiroaki
    Yokokawa, Mitsuo
    COMPUTATIONAL SCIENCE, ICCS 2024, PT VI, 2024, 14937 : 173 - 187
  • [39] Efficient Swap Protocol of Remote Memory Paging for Out-of-Core Multi-thread Applications
    Midorikawa, Hiroko
    Kitagawa, Kenji
    Ohura, Hikari
    2017 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2017, : 637 - 638
  • [40] Language, compiler and parallel database support for I/O intensive applications
    Brezany, P
    Mueck, TA
    Schikuta, E
    HIGH-PERFORMANCE COMPUTING AND NETWORKING, 1995, 919 : 14 - 20