Probabilistic cluster fault diagnosis for multiprocessor systems

Cited by: 0
Authors
Niu, Baohua [1]
Zhou, Shuming [1, 2]
Zhang, Hong [1]
Zhang, Qifan [1]
Affiliations
[1] Fujian Normal Univ, Coll Math & Stat, Fuzhou 350117, Fujian, Peoples R China
[2] Fujian Normal Univ, Ctr Appl Math Fujian Prov, Fuzhou 350117, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Probabilistic diagnostic model; Cluster fault; Reliability; CONDITIONAL DIAGNOSABILITY; RELIABILITY; (N;
DOI
10.1016/j.tcs.2024.114837
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
As high performance computing systems consisting of multiple processors play an important role in big data analytics, we are motivated to focus on the research of reliability, design-for-test, fault diagnosis and detection of large-scale multiprocessor interconnected systems. System-level diagnosis theory, which originates from the testing of VLSI and Wafer, aims to identify faulty processors in these systems by means of analyzing the test results among the processors, while diagnosability as well as diagnosis accuracy are two important indices. The probabilistic fault diagnostic strategy seeks to correctly diagnose processors with high probability under the assumption that each processor has a certain failing probability. In this work, based on the probabilistic diagnosis algorithm with consideration of fault clustering, we specialize in the local diagnostic capability to establish the probability that any processor in a discrete status is diagnosed correctly. Subsequently, we investigate the global performance evaluation of multiprocessor systems under various significant fault distributions including Poisson distribution, Exponential distribution and Binomial distribution. In addition, we directly apply our results to the data center network HSDC and (n, k)-star network. Numerical simulations are performed to verify the established results, which reveal the relationship between the accuracy of correct diagnosis and regulatory parameters.
Pages: 16