Balancing Reliability, Cost, and Performance Tradeoffs with FreeFault

Authors
Kim, Dong Wan [1]
Erez, Mattan [1]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
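Funding
U.S. National Science Foundation
Keywords
MEMORY
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract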
Memory errors have been a major source of system failures, and fault rates may rise even further as memory continues to scale. This increasing fault rate, especially when combined with the advent of integrated on-package memories, may exceed the capabilities of traditional fault tolerance mechanisms or significantly increase their overhead. In this paper, we present FreeFault as a hardware-only, transparent, and nearly free resilience mechanism that is implemented entirely within a processor and can tolerate the majority of DRAM faults. FreeFault repurposes portions of the last-level cache for storing retired memory regions and augments a hardware memory scrubber to monitor memory health and aid retirement decisions. Because it relies on existing structures (cache associativity) for retirement/remapping type repair, FreeFault has essentially no hardware overhead. Because it requires a very modest portion of the cache (as small as 8KB) to cover a large fraction of DRAM faults, FreeFault has almost no impact on performance. We explain how FreeFault adds an attractive layer in an overall resilience scheme of highly-reliable and highly-available systems by delaying, and even entirely avoiding, calling upon software to make tradeoff decisions between memory capacity, performance, and reliability.
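As a rough illustration of the retirement idea in the abstract, the Python sketch below models pinning faulty memory lines in a set-associative last-level cache so that later accesses are served from the cache instead of the faulty DRAM location. It is not the authors' implementation: the class name ToyLLC, the 64-byte line size, and the per-set lock limit are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

LINE_BYTES = 64  # assumed cache-line size; not stated in the abstract


@dataclass
class ToyLLC:
    """Hypothetical model of a set-associative last-level cache."""
    num_sets: int
    ways: int
    max_locked_per_set: int = 1  # assumed cap so most ways stay usable
    locked: dict = field(default_factory=dict)  # set index -> pinned line numbers

    def __post_init__(self):
        if not 0 < self.max_locked_per_set < self.ways:
            raise ValueError("must leave at least one unlocked way per set")

    def set_index(self, addr: int) -> int:
        return (addr // LINE_BYTES) % self.num_sets

    def retire(self, addr: int) -> bool:
        """Pin the line holding a faulty address; False if the set is full."""
        line = addr // LINE_BYTES
        pinned = self.locked.setdefault(self.set_index(addr), set())
        if line in pinned:
            return True
        if len(pinned) >= self.max_locked_per_set:
            return False  # a real system would fall back to another mechanism
        pinned.add(line)
        return True

    def is_retired(self, addr: int) -> bool:
        return addr // LINE_BYTES in self.locked.get(self.set_index(addr), set())

    def locked_bytes(self) -> int:
        return LINE_BYTES * sum(len(p) for p in self.locked.values())


llc = ToyLLC(num_sets=4, ways=8)
first = llc.retire(0x000)      # maps to set 0, accepted
conflict = llc.retire(0x100)   # also maps to set 0, rejected by the per-set cap
second = llc.retire(0x040)     # maps to set 1, accepted
lines_in_8kb = 8 * 1024 // LINE_BYTES  # 128 lines under the 64-byte assumption
```

Under the assumed 64-byte line size, the 8KB figure quoted in the abstract would correspond to 128 pinned lines. The per-set cap reflects the reliance on cache associativity: a set can give up only some of its ways before ordinary caching in that set suffers.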
Pages: 439-450
Page count: 12