Server-Based Data Push Architecture for Multi-Processor Environments

Cited by: 2
Authors
Xian-He Sun (孙贤和) [1]
Surendra Byna [2]
Yong Chen (陈勇) [2]
Affiliations
[1] Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois 60616, U.S.A.; Computing Division, Fermi National Accelerator Laboratory, Batavia, IL 60510-0500, U.S.A.
[2] Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois 60616, U.S.A.
Funding
U.S. National Science Foundation
Keywords
performance measurement; evaluation; modeling; simulation of multiple-processor system; cache memory;
DOI
Not available
Chinese Library Classification (CLC) number
TP332 [Arithmetic units and controllers (CPU)]
Discipline classification code
081201
Abstract
Data access delay is a major bottleneck in utilizing current high-end computing (HEC) machines. Prefetching, where data is fetched before the CPU demands it, has been considered an effective way to mask data access delay. However, current client-initiated prefetching strategies, where a computing processor issues the prefetching instructions, have many limitations. They do not work well for applications with complex, non-contiguous data access patterns. As technology advances continue to widen the gap between computing and data access performance, trading computing power for reduced data access delay has become a natural choice. In this paper, we present a server-based data-push approach and discuss its associated implementation mechanisms. In the server-push architecture, a dedicated server called the Data Push Server (DPS) initiates prefetching and proactively pushes data closer to the client in time. Issues such as what data to fetch, when to fetch, and how to push are studied. The SimpleScalar simulator is modified with a dedicated prefetching engine that pushes data for another processor to test DPS-based prefetching. Simulation results show that the L1 cache miss rate can be reduced by up to 97% (71% on average) over a superscalar processor for SPEC CPU2000 benchmarks that have high cache miss rates.
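The push idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's DPS design or its SimpleScalar setup; it assumes a small fully associative LRU cache and a hypothetical `StridePushServer` that watches the access stream and, once it sees the same stride twice in a row, pushes the next predicted blocks into the cache. All class names, the stride heuristic, the cache size, and the trace are assumptions made for illustration only.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache holding at most `capacity` block addresses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def access(self, block):
        """Demand access by the computing processor; returns True on a hit."""
        hit = block in self.blocks
        self.insert(block)
        return hit

    def insert(self, block):
        """Place a block in the cache (demand fill or server push), evicting LRU."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


class StridePushServer:
    """Hypothetical stand-in for a push server using a simple stride heuristic."""

    def __init__(self, cache, depth=2):
        self.cache = cache
        self.depth = depth      # how many blocks ahead to push (assumed value)
        self.last = None
        self.stride = None

    def observe(self, block):
        """Watch one access; push ahead when the stride repeats."""
        if self.last is not None:
            stride = block - self.last
            if stride != 0 and stride == self.stride:
                for k in range(1, self.depth + 1):
                    self.cache.insert(block + k * stride)
            self.stride = stride
        self.last = block


def miss_rate(trace, capacity=8, push=False):
    """Replay a trace of block addresses and return the demand miss rate."""
    cache = LRUCache(capacity)
    server = StridePushServer(cache) if push else None
    misses = 0
    for block in trace:
        if not cache.access(block):
            misses += 1
        if server is not None:
            server.observe(block)
    return misses / len(trace)


# Strided trace with no reuse: every demand access misses without help.
trace = [i * 4 for i in range(100)]
baseline = miss_rate(trace)
pushed = miss_rate(trace, push=True)
print(f"baseline miss rate: {baseline:.2f}, with push: {pushed:.2f}")
```

On this synthetic trace the baseline misses on every access, while the pushed run misses only on the first three accesses, before the stride is confirmed. The numbers say nothing about the SPEC CPU2000 results reported above; they only show why moving the prediction work off the computing processor can hide access latency when the pattern is predictable.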
Pages: 641-652
Page count: 12