Balancing Reliability, Cost, and Performance Tradeoffs with FreeFault

Authors
Kim, Dong Wan [1]
Erez, Mattan [1]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
Source
2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) | 2015
Funding
U.S. National Science Foundation;
Keywords
MEMORY;
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Memory errors have been a major source of system failures, and fault rates may rise even further as memory continues to scale. This increasing fault rate, especially when combined with the advent of integrated on-package memories, may exceed the capabilities of traditional fault tolerance mechanisms or significantly increase their overhead. In this paper, we present FreeFault as a hardware-only, transparent, and nearly free resilience mechanism that is implemented entirely within a processor and can tolerate the majority of DRAM faults. FreeFault repurposes portions of the last-level cache for storing retired memory regions and augments a hardware memory scrubber to monitor memory health and aid retirement decisions. Because it relies on existing structures (cache associativity) for retirement/remapping type repair, FreeFault has essentially no hardware overhead. Because it requires a very modest portion of the cache (as small as 8KB) to cover a large fraction of DRAM faults, FreeFault has almost no impact on performance. We explain how FreeFault adds an attractive layer in an overall resilience scheme of highly reliable and highly available systems by delaying, and even entirely avoiding, calling upon software to make tradeoff decisions between memory capacity, performance, and reliability.
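The retirement idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: it assumes a small set-associative cache with LRU replacement, a 64-byte line size, and a per-set limit on pinned lines, and the names `ToyLLC`, `retire`, `access`, and `max_pinned_per_set` are hypothetical. It only shows how a line that covers a faulty DRAM address could be pinned in an existing cache way so that later accesses never reach the faulty memory, while the remaining ways keep serving normal traffic.

```python
from collections import OrderedDict

LINE_BYTES = 64  # assumed cache-line size, not taken from the paper


class ToyLLC:
    """Toy set-associative cache with LRU replacement and pinned (retired) lines."""

    def __init__(self, num_sets=4, ways=4, max_pinned_per_set=1):
        # At least one way per set must stay usable for normal traffic.
        assert 0 <= max_pinned_per_set < ways
        self.num_sets = num_sets
        self.ways = ways
        self.max_pinned_per_set = max_pinned_per_set
        # Per set: tag -> pinned flag, ordered from least to most recently used.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def _locate(self, addr):
        line = addr // LINE_BYTES
        return line % self.num_sets, line // self.num_sets

    def _evict_lru_unpinned(self, lines):
        # Pinned lines are never chosen as victims.
        victim = next(tag for tag, pinned in lines.items() if not pinned)
        del lines[victim]

    def retire(self, addr):
        """Pin the line covering a faulty address.

        Returns False when the set's pin budget is exhausted; a real system
        would need some other fallback at that point (assumption).
        """
        idx, tag = self._locate(addr)
        lines = self.sets[idx]
        if lines.get(tag):
            return True  # already pinned
        if sum(lines.values()) >= self.max_pinned_per_set:
            return False
        if tag not in lines and len(lines) >= self.ways:
            self._evict_lru_unpinned(lines)
        lines[tag] = True
        return True

    def access(self, addr):
        """Return "hit" or "miss"; a miss fills the line as unpinned."""
        idx, tag = self._locate(addr)
        lines = self.sets[idx]
        if tag in lines:
            lines.move_to_end(tag)  # mark as most recently used
            return "hit"
        if len(lines) >= self.ways:
            self._evict_lru_unpinned(lines)
        lines[tag] = False
        return "miss"


if __name__ == "__main__":
    cache = ToyLLC(num_sets=2, ways=2, max_pinned_per_set=1)
    cache.retire(0)              # address 0 is assumed to be faulty
    for addr in (128, 256, 384):  # conflicting lines in the same set
        cache.access(addr)
    print(cache.access(0))       # the retired line is still resident
```

Under the same assumed 64-byte line size, the 8KB figure quoted in the abstract would correspond to 128 pinned lines; how those lines are distributed across sets is a design choice the abstract does not specify.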
Pages: 439-450
Page count: 12