Cache Contention and Application Performance Prediction for Multi-Core Systems

Cited by: 48
Authors
Xu, Chi [1]
Chen, Xi [2]
Dick, Robert P. [2]
Mao, Zhuoqing Morley [2]
Affiliations
[1] Univ Minnesota, ECE Dept, Minneapolis, MN 55455 USA
[2] Univ Michigan, Dept EECS, Ann Arbor, MI 48109 USA
Source
2010 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2010), 2010
Funding
National Science Foundation (USA)
DOI
10.1109/ISPASS.2010.5452065
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The ongoing move to chip multiprocessors (CMPs) permits greater sharing of last-level cache by processor cores, but this sharing aggravates cache contention and can undermine performance improvements. Optimized process assignment requires accurately modeling the impact of inter-process cache contention on performance and power consumption. However, techniques based on exhaustive consideration of process-to-processor mappings and cycle-accurate simulation are inefficient or intractable for CMPs, which often permit a large number of potential assignments. This paper proposes CAMP, a fast and accurate shared-cache-aware performance model for multi-core processors. CAMP estimates the performance degradation that processes running on CMPs suffer from cache contention. It uses reuse distance histograms, cache access frequencies, and the relationship between the throughput and cache miss rate of each process to predict the effective cache size of that process when it runs concurrently with, and shares cache with, other processes. This prediction in turn allows instruction throughput to be estimated. We also provide an automated way to obtain process-dependent characteristics, such as reuse distance histograms, without offline simulation, operating system (OS) modification, or additional hardware. We tested the accuracy of CAMP using 55 different combinations of 10 SPEC CPU2000 benchmarks on a dual-core CMP machine. The average throughput prediction error was 1.57%.
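The abstract names reuse distance histograms, cache access frequencies, and effective cache size without defining them, so a small illustration may help. The Python sketch below is not CAMP's published model. It assumes a fully associative LRU cache, and all function names are hypothetical. The split rule in `split_shared_cache` (each process holds a share of the cache close to its share of miss traffic) is a simple stand-in heuristic chosen for illustration only.

```python
from collections import Counter


def reuse_distance_histogram(trace):
    """Count LRU stack distances; the key None marks first-time (cold) accesses."""
    stack = []  # most recently used address is last
    hist = Counter()
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Number of distinct addresses touched since the last use of addr.
            distance = len(stack) - 1 - pos
            stack.pop(pos)
        else:
            distance = None
        hist[distance] += 1
        stack.append(addr)
    return hist


def miss_rate(hist, cache_lines):
    """Miss rate of a fully associative LRU cache holding `cache_lines` lines."""
    total = sum(hist.values())
    # An access hits only if its reuse distance is smaller than the cache size.
    misses = sum(n for d, n in hist.items() if d is None or d >= cache_lines)
    return misses / total


def split_shared_cache(hist_a, hist_b, freq_a, freq_b, total_lines):
    """Illustrative heuristic (an assumption, not CAMP's equations): choose the
    split where process A's share of lines is closest to its share of miss traffic."""
    best = None
    for lines_a in range(1, total_lines):
        lines_b = total_lines - lines_a
        fill_a = freq_a * miss_rate(hist_a, lines_a)  # misses per unit time
        fill_b = freq_b * miss_rate(hist_b, lines_b)
        total_fill = fill_a + fill_b
        target = fill_a / total_fill if total_fill else 0.5
        gap = abs(lines_a / total_lines - target)
        if best is None or gap < best[0]:
            best = (gap, lines_a, lines_b)
    return best[1], best[2]
```

For example, `reuse_distance_histogram("abcabc")` records three cold accesses and three accesses at distance 2, so `miss_rate` is 0.5 with three cache lines and 1.0 with two. Two identical processes with equal access frequencies sharing four lines are each assigned two lines by `split_shared_cache`.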
Pages: 76-86
Page count: 11