A Reuse-Degree Based Locality Classifier for Locality-Aware Data Replication

Cited by: 0
Authors
Wu, Qianqian [1]
Ji, Zhenzhou [1]
Affiliations
[1] Harbin Institute of Technology, School of Computer Science and Technology, Harbin 150001, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Chip multiprocessors (CMPs); last level cache (LLC); data replication; locality classifier; reuse-degree (RD)
DOI
10.1109/ACCESS.2019.2959840
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A shared last level cache (LLC) is widely used in tiled chip multiprocessors (CMPs); it reduces the off-chip miss rate but incurs long on-chip access latency. The state-of-the-art Locality-Aware Data Replication (LADR) scheme provides an effective tradeoff between capacity and latency through an in-hardware structure called the locality classifier. However, the best classifier in LADR, Limited(3), preserves locality information for 3 cores for every cache line indiscriminately. This is superfluous for lines reused by fewer than 3 cores and incomplete for lines reused by more than 3 cores, so it both wastes storage space and limits the performance improvement. In this paper, we propose the concept of Reuse-Degree (RD) for each LLC line: the number of cores that have reused the line since it was loaded into the LLC. We then divide cache lines into Not Reused Lines (NRL, RD = 0), Single Reused Lines (SRL, RD = 1) and Multiple Reused Lines (MRL, RD >= 2) based on their RDs, and find that a significant fraction of LLC lines are NRLs or SRLs at any time. Based on this observation, we design a Reuse-Degree based Locality Classifier (RD_LC) for LADR. Specifically, RD_LC decouples the locality classifier from the LLC tag array and introduces two kinds of locality information arrays: the single locality information array (SLIA) and the complete locality information array (CLIA). RD_LC allocates a locality information entry only for reused cache lines (SRLs or MRLs) rather than for all cache lines, assigning an SLIA entry to each SRL and a CLIA entry to each MRL. Our proposal avoids wasting storage space while maintaining enough locality information for accurate data replication decisions. Experimental results show that RD_LC for LADR reduces storage overhead by 51% relative to the baseline Limited(3) locality classifier, while improving performance by 7.56% and reducing network traffic by 3.33%.
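The RD-based classification and entry allocation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' hardware design: the names `LineClass`, `classify` and `entry_kind` are hypothetical, and the sketch assumes the caller supplies the IDs of the cores that have reused a line since it was loaded into the LLC (the abstract does not spell out how a reuse is detected).

```python
from enum import Enum


class LineClass(Enum):
    """Line categories named in the abstract, keyed by Reuse-Degree (RD)."""
    NRL = "Not Reused Line"       # RD = 0
    SRL = "Single Reused Line"    # RD = 1
    MRL = "Multiple Reused Line"  # RD >= 2


def classify(reusing_cores):
    """Classify a line by RD, taken here as the number of distinct reusing cores.

    Assumption: `reusing_cores` is an iterable of core IDs that have reused
    the line since it was loaded; duplicates count once.
    """
    rd = len(set(reusing_cores))
    if rd == 0:
        return LineClass.NRL
    if rd == 1:
        return LineClass.SRL
    return LineClass.MRL


def entry_kind(line_class):
    """Return which locality information array holds the line's entry.

    Per the abstract: no entry for NRLs, an SLIA entry for SRLs,
    a CLIA entry for MRLs.
    """
    return {
        LineClass.NRL: None,
        LineClass.SRL: "SLIA",
        LineClass.MRL: "CLIA",
    }[line_class]
```

For example, a line reused only by core 5 (even repeatedly) would be an SRL and receive an SLIA entry, while a line reused by cores 0, 2 and 7 would be an MRL and receive a CLIA entry.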
Pages: 182207-182216
Number of pages: 10