Hierarchical Cache Directory for CMP

Cited by: 21
Authors
Guo, Song-Liu [1]
Wang, Hai-Xia [2]
Xue, Yi-Bo [2]
Li, Chong-Min [1]
Wang, Dong-Sheng [1, 2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
cache coherence protocol; hierarchical directory; chip multiprocessor; architecture
DOI
10.1007/s11390-010-9321-5
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
As more processing cores are integrated into one chip and feature size continues to shrink, the average access latency for remote nodes under a directory-based coherence protocol rises, which greatly impacts system performance. Previous techniques such as data replication and data migration optimize the performance of the requesting core but offer little improvement for neighboring nodes. Other techniques, such as in-transit optimization, try to reduce latency at the cost of increased storage. This paper introduces a hierarchical cache directory into the CMP (chip multiprocessor), which divides CMP tiles into multiple regions hierarchically, and combines it with data replication. A new directory organization is proposed to record the sharing status within a region and to assist the regional home in completing operations efficiently. Simulation results show that for a 16-core CMP, compared with a traditional directory, the hierarchical cache directory reduces average access latency by 9% and on-chip network traffic by 34% on average, with less storage. Theoretical analysis shows that for a $2^n \times 2^n$ tiled CMP, the average access latency in the hierarchical cache directory asymptotically approaches a function that is independent of $n$; hence the architecture is highly scalable.
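The abstract does not spell out how tiles are grouped into regions. As a rough illustration only, the sketch below assumes a quadtree-style split, in which each level halves the region side length on a 2^n x 2^n mesh. The function names `region_path` and `shared_region_level` are hypothetical and do not come from the paper.

```python
def region_path(x, y, n):
    """Region coordinates of tile (x, y) at each level of an ASSUMED
    quadtree partition of a 2**n x 2**n mesh, coarsest level first."""
    size = 2 ** n
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError("tile lies outside the mesh")
    path = []
    for level in range(1, n + 1):
        shift = n - level  # a region at this level is 2**shift tiles wide
        path.append((x >> shift, y >> shift))
    return path


def shared_region_level(a, b, n):
    """Finest level at which tiles a and b fall in the same region.
    Level 0 means they share only the whole chip."""
    level = 0
    pairs = zip(region_path(a[0], a[1], n), region_path(b[0], b[1], n))
    for i, (ra, rb) in enumerate(pairs, start=1):
        if ra != rb:
            break
        level = i
    return level


# 16-core example: a 4 x 4 mesh, so n = 2.
print(region_path(3, 1, 2))                     # [(1, 0), (3, 1)]
print(shared_region_level((0, 0), (1, 1), 2))   # 1: same 2 x 2 quadrant
print(shared_region_level((0, 0), (3, 3), 2))   # 0: opposite quadrants
```

Under this assumed layout, a request between two tiles that share a fine-grained region could in principle be served by that region's home rather than by a distant global home, which is the intuition behind the latency reduction the abstract reports.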
Pages: 246-256
Page count: 11