Process Variation-Aware Nonuniform Cache Management in a 3D Die-Stacked Multicore Processor

Cited by: 11
Authors
Zhao, Bo [1 ]
Du, Yu [2 ]
Yang, Jun [1 ]
Zhang, Youtao [2 ]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15261 USA
[2] Univ Pittsburgh, Dept Comp Sci, Pittsburgh, PA 15260 USA
Funding
US National Science Foundation
Keywords
Process variation; 3D die stacking; DRAM; NUCA; Retention time distribution; Performance; Impact; Model
DOI
10.1109/TC.2012.129
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Process variations in integrated circuits significantly affect their performance, leakage, and stability. This is particularly evident in large, regular, and dense structures such as DRAMs, which are built using minimized transistors with presumably uniform speed in an organized array structure. Process variation can introduce latency disparity among different memory arrays. With the proliferation of 3D stacking technology, DRAM has become a favorable choice for stacking on top of a multicore processor as a last-level cache, offering large capacity, high bandwidth, and low power. Hence, variations in bank speed create a unique problem of nonuniform cache accesses in 3D space. In this paper, we investigate cache management techniques for tolerating process variation in a 3D DRAM stacked onto a multicore processor. We model the process variation in a four-layer DRAM memory, including the cell transistor, capacitor trench, and peripheral circuit, to characterize the latency and retention time variations among different banks. As a result, whether a bank is fast or slow from a core's standpoint is no longer determined by its physical distance from the core; it is determined by the bank's latency under process variation. We develop cache migration schemes that utilize fast banks while limiting the cost of migration. Our experiments show a substantial performance benefit from exploiting fast memory banks through migration. On average, variation-aware management improves the performance of a workload by 16.5 percent over the baseline, in which the speed of one of the slowest banks is assumed for all banks. It is also within 0.8 percent of the performance of an ideal memory with no process variation.
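The abstract does not spell out the migration schemes, so the following is only a minimal sketch of the general idea it describes: blocks that are accessed often are moved to a faster bank, and a hit-count threshold limits how often migration is paid for. The class names, the `threshold` parameter, the latency values, and the "move to the fastest bank with free space" policy are all assumptions made for illustration, not the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    latency: int                      # access latency in cycles (hypothetical value)
    capacity: int                     # number of blocks the bank can hold
    blocks: set = field(default_factory=set)

    def has_room(self) -> bool:
        return len(self.blocks) < self.capacity


class VariationAwareCache:
    """Toy model: bank speed comes from process variation, not distance."""

    def __init__(self, banks, threshold=4):
        self.banks = banks
        self.threshold = threshold    # accesses required before a migration is tried
        self.hits = {}                # block -> accesses since last placement attempt
        self.home = {}                # block -> index of the bank holding it

    def insert(self, block, bank_idx):
        self.banks[bank_idx].blocks.add(block)
        self.home[block] = bank_idx
        self.hits[block] = 0

    def access(self, block) -> int:
        """Return the latency seen by this access, then consider migrating."""
        idx = self.home[block]
        latency = self.banks[idx].latency
        self.hits[block] += 1
        if self.hits[block] >= self.threshold:
            self._migrate(block, idx)
        return latency

    def _migrate(self, block, src):
        # Reset the counter either way so migration attempts stay infrequent.
        self.hits[block] = 0
        faster = [
            i for i, b in enumerate(self.banks)
            if b.latency < self.banks[src].latency and b.has_room()
        ]
        if not faster:
            return
        dst = min(faster, key=lambda i: self.banks[i].latency)
        self.banks[src].blocks.discard(block)
        self.banks[dst].blocks.add(block)
        self.home[block] = dst


# Usage: block "A" starts in a slow bank and moves after two accesses.
banks = [Bank(latency=30, capacity=1), Bank(latency=20, capacity=1),
         Bank(latency=40, capacity=2)]
cache = VariationAwareCache(banks, threshold=2)
cache.insert("A", 2)
latencies = [cache.access("A") for _ in range(3)]
```

In this toy run the first two accesses pay the slow bank's latency and the third is served from the fastest bank that had free space. A real scheme would also have to account for the cost of the migration itself and for eviction when fast banks are full, which this sketch leaves out.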
Pages: 2252-2265
Page count: 14
Related papers
42 in total
  • [1] Process Variation-Aware Nonuniform Cache Management in a 3D Die-Stacked Multicore Processor (vol 62, pg 2252, 2013)
    Zhao, Bo
    Du, Yu
    Yang, Jun
    Zhang, Youtao
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2014, 63 (02) : 525 - 526
  • [2] Process Variation-Aware Adaptive Cache Architecture and Management
    Mutyam, Madhu
    Wang, Feng
    Krishnan, Ramakrishnan
    Narayanan, Vijaykrishnan
    Kandemir, Mahmut
    Xie, Yuan
    Irwin, Mary Jane
    [J]. IEEE TRANSACTIONS ON COMPUTERS, 2009, 58 (07) : 865 - 877
  • [3] VAWOM: Temperature and Process Variation Aware WearOut Management in 3D Multicore Architecture
    Tajik, Hossein
    Homayoun, Houman
    Dutt, Nikil
    [J]. 2013 50TH ACM / EDAC / IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2013,
  • [4] Process Variation-Aware Floorplanning for 3D Many-Core Processors
    Hong, Hyejeong
    Lim, Jaeil
    Kang, Sungho
    [J]. 2012 IEEE ELECTRICAL DESIGN OF ADVANCED PACKAGING AND SYSTEMS SYMPOSIUM (EDAPS), 2012, : 193 - 196
  • [5] 3D die-stacked DRAM thermal management via task allocation and core pipeline control
    Yoon, Changho
    Shim, Jae Hoon
    Moon, Byungin
    Kong, Joonho
    [J]. IEICE ELECTRONICS EXPRESS, 2018, 15 (03):
  • [6] Mechanical effects of copper through-vias in a 3D die-stacked module
    Tanaka, N
    Sato, T
    Yamaji, Y
    Morifuji, T
    Umemoto, M
    Takahashi, K
    [J]. 52ND ELECTRONIC COMPONENTS & TECHNOLOGY CONFERENCE, 2002 PROCEEDINGS, 2002, : 473 - +
  • [7] Implementing register files for high-performance microprocessors in a die-stacked (3D) technology
    Puttaswamy, Kiran
    Loh, Gabriel H.
    [J]. IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI, PROCEEDINGS: EMERGING VLSI TECHNOLOGIES AND ARCHITECTURES, 2006, : 384 - +
  • [8] Dynamic Cache Pooling for Improving Energy Efficiency in 3D Stacked Multicore Processors
    Meng, Jie
    Zhang, Tiansheng
    Coskun, Ayse K.
    [J]. 2013 IFIP/IEEE 21ST INTERNATIONAL CONFERENCE ON VERY LARGE SCALE INTEGRATION (VLSI-SOC), 2013, : 210 - 215
  • [9] Squeezing Maximizing Performance out of 3D Cache-Stacked Multicore Architectures
    Khan, Asim
    Kang, Kyungsu
    Kyung, Chong-Min
    [J]. 2011 IEEE 54TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2011,
  • [10] Yield-Enhancement Schemes for Multicore Processor and Memory Stacked 3D ICs
    Huang, Yu-Jen
    Li, Jin-Fu
    [J]. ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2014, 13