Impact of Soft Errors on Large-Scale FPGA Cloud Computing

Cited by: 9
Authors
Keller, Andrew M. [1]
Wirthlin, Michael J. [1]
Affiliations
[1] Brigham Young Univ, Dept Elect & Comp Engn, NSF Ctr Space High Performance & Resilient Comp S, Provo, UT 84602 USA
Funding
National Science Foundation (USA);
Keywords
FPGA cloud computing; FPGA data centers; soft error rate; SER; single event upset; SEU; architectural vulnerability factor; AVF; fault injection; critical bit; reliability; recovery; Intel FPGA; mission time;
DOI
10.1145/3289602.3293911
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
FPGAs are being used in large numbers within cloud computing to provide high-performance, low-power alternatives to more traditional computing structures. While FPGAs provide a number of important benefits to cloud computing environments, they are susceptible to radiation-induced soft errors, which can lead to silent data corruption or system instability. Although soft errors within a single FPGA occur infrequently, soft errors in large-scale FPGA systems can occur at a relatively high rate. This paper investigates the failure rate of several FPGA applications running within an FPGA cloud computing node by performing fault injection experiments to determine the susceptibility of these applications to soft errors. The results from these experiments suggest that silent data corruption will occur every few hours within a 100,000-node FPGA system and that such a system can only maintain high levels of reliability for short periods of operation. These results suggest that soft-error detection and mitigation techniques may be needed in large-scale FPGA systems.
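The scaling argument in the abstract (rare per-device errors, frequent fleet-wide errors, high reliability only for short mission times) can be illustrated with a minimal sketch. It assumes a constant failure rate and independent nodes, which is the standard exponential reliability model and not necessarily the model used in the paper. The per-node rate `ASSUMED_FIT` is a hypothetical placeholder, not a value from the paper, and the function names are illustrative only.

```python
import math


def system_mtbf_hours(per_node_fit: float, nodes: int) -> float:
    """Mean time between failures for the whole fleet, in hours.

    FIT is failures per 1e9 device-hours. Under the assumption of
    independent nodes, per-node failure rates simply add up.
    """
    failures_per_hour = per_node_fit * nodes / 1e9
    return 1.0 / failures_per_hour


def reliability(per_node_fit: float, nodes: int, mission_hours: float) -> float:
    """Probability of zero failures over the mission time.

    Assumes a constant failure rate (exponential model):
    R(t) = exp(-N * lambda * t).
    """
    failures_per_hour = per_node_fit * nodes / 1e9
    return math.exp(-failures_per_hour * mission_hours)


if __name__ == "__main__":
    ASSUMED_FIT = 2000.0  # hypothetical per-node SDC rate, NOT from the paper
    NODES = 100_000       # fleet size mentioned in the abstract

    print(f"fleet MTBF: {system_mtbf_hours(ASSUMED_FIT, NODES):.2f} h")
    for hours in (0.1, 1.0, 10.0):
        r = reliability(ASSUMED_FIT, NODES, hours)
        print(f"R({hours:>4} h) = {r:.4f}")
```

Under these assumptions, a single device with this placeholder rate would fail roughly once every 500,000 hours, while the fleet as a whole would see a failure every few hours, and the probability of an error-free run drops quickly as the mission time grows.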
Pages: 272-281
Number of pages: 10