Review of Memory RAS for Data Centers

Cited by: 0
Authors
Lee, Jiseong [1]
Kim, Min Joon [1]
Kim, Woo-Seop [1]
Kim, Yong Sin [1]
Affiliations
[1] Korea Univ, Sch Elect Engn, Seoul 02841, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Error correction codes; Data centers; Costs; Servers; Performance evaluation; Memory modules; Maintenance engineering; Reliability engineering; Correctable error (CE); error correction code (ECC); memory reliability, availability, and serviceability (RAS); uncorrectable error (UE); ERROR-CORRECTION; CODES; DRAM; RELIABILITY; ECC; RESILIENCE; STORAGE;
DOI
10.1109/ACCESS.2023.3329984
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Multi-bit errors and downtime caused by uncorrectable errors (UEs) in dual in-line memory modules (DIMMs) have received considerable attention in data centers because of their high repair and replacement costs. These problems can be alleviated with error correction code (ECC) technology, which enables prompt correction when errors first occur and prediction of future UEs from recurring error patterns. Technologies for addressing errors can be categorized under reliability, availability, and serviceability (RAS), and need to be optimized using various parameters such as accuracy, recall, F-measure, and cost reduction. This paper presents an overview of current RAS technologies and trends in memory for data centers, including an analysis of conventional ECC technologies and their recent developments. When UEs cannot be completely eliminated with ECCs, page offlining methods based on analysis of error patterns and characterization of UEs can be applied. Recent research on reducing the memory capacity wasted by UEs and page offlining has moved toward on-die ECC in high-bandwidth memory architectures.
Pages: 124782 - 124796
Page count: 15
Related papers
50 in total
  • [41] A Literature Review of Machine Learning Techniques for Cybersecurity in Data Centers
    Roponena, Evita
    Kampars, Janis
    Gailitis, Andris
    2021 62ND INTERNATIONAL SCIENTIFIC CONFERENCE ON INFORMATION TECHNOLOGY AND MANAGEMENT SCIENCE OF RIGA TECHNICAL UNIVERSITY (ITMS), 2021,
  • [42] Differential power processing for data centers applications: A comprehensive review
    Faiad, Azza A.
    Hamdan, Eman
    Hamad, Mostafa S.
    Abdel-Khalik, Ayman S.
    Hamdy, Ragi A.
    ALEXANDRIA ENGINEERING JOURNAL, 2020, 59 (03) : 1833 - 1846
  • [44] The Future of Optical Interconnects for Data Centers: A Review of Technology Trends
    Aleksic, Slavisa
    PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS CONTEL 2017, 2017, : 41 - 46
  • [45] Review on Cooling System Energy Consumption in Internet Data Centers
    Amoabeng, Kofi Owura
    Choi, Jong Min
    INTERNATIONAL JOURNAL OF AIR-CONDITIONING AND REFRIGERATION, 2016, 24 (04)
  • [46] Demonstrating Optically Interconnected Remote Serial and Parallel Memory in Disaggregated Data Centers
    Mishra, Vaibhawa
    Benjamin, Joshua L.
    Zervas, Georgios
    2020 OPTICAL FIBER COMMUNICATIONS CONFERENCE AND EXPOSITION (OFC), 2020,
  • [47] A Memory RAS System Design and Engineering Practice in High Temperature Ambient Data Center
    Yao, Aili
    Li, JinFeng
    Wang, Fengqian
    Zhao, Jie
    Liu, Hongmei
    Zhang, Jiajun
    Zhang, Jun
    Zhou, Alex
    Song, Youquan
    Xu, Jialiang
    Sun, Paul
    Zhu, Kunye
    Ahuja, Nishi
    Zhu, Dayi
    Kuo, Sean
    PROCEEDINGS OF THE NINETEENTH INTERSOCIETY CONFERENCE ON THERMAL AND THERMOMECHANICAL PHENOMENA IN ELECTRONIC SYSTEMS (ITHERM 2020), 2020, : 1379 - 1388
  • [48] Optimizing Interrupt Handling Performance for Memory Failures in Large Scale Data Centers
    Dixit, Harish Dattatraya
    Lin, Fan
    Holland, Bill
    Beadon, Matt
    Yang, Zhengyu
    Sankar, Sriram
    PROCEEDINGS OF THE ACM/SPEC INTERNATIONAL CONFERENCE ON PERFORMANCE ENGINEERING (ICPE'20), 2020, : 193 - 201
  • [49] Evaluation of a Rack-Scale Disaggregated Memory Prototype for Cloud Data Centers
    Quiroga, Josue, V
    Torrents, Marti
    Sonmez, Nehir
    Theodoropoulos, Dimitris
    Zyulkyarov, Ferad
    Nemirovsky, Mario
    PROCEEDINGS OF THE 30TH INTERNATIONAL WORKSHOP ON RAPID SYSTEM PROTOTYPING (RSP'19): SHORTENING THE PATH FROM SPECIFICATION TO PROTOTYPE, 2019, : 15 - 21
  • [50] Stratification of the Urban Space in Contemporary Paris: Modiano, Vasset, and the Data Centers of Memory
    Cadieu, Morgane
    CONTEMPORARY FRENCH AND FRANCOPHONE STUDIES, 2017, 21 (02) : 133 - 141