Probabilistic Diagnosis of Multiprocessor Systems

SUNGGU LEE
Department of Electrical Engineering, POSTECH, P.O. Box 125, Pohang 790-600, Korea

KANG GEUN SHIN
Real-Time Computing Laboratory, Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI 48109-2212

This paper critically surveys methods for the automated probabilistic diagnosis of large multiprocessor systems. In recent years, much of the work on system-level diagnosis has focused on probabilistic methods, which can diagnose intermittently faulty processing nodes and can be applied in general situations on general interconnection networks. The theory behind the probabilistic diagnosis methods is explained, and the various diagnosis algorithms are described in simple terms with the aid of a running example. The diagnosis methods are compared and analyzed, and a chart is produced, showing the comparative advantages of the various diagnosis algorithms on the basis of several factors important to probabilistic diagnosis.

Categories and Subject Descriptors: C.1.2 [Processor Architectures]: Multiple Data Stream Architectures-MIMD; parallel processors; D.4.5 [Operating Systems]: Reliability-fault tolerance; G.3 [Mathematics of Computing]: Probability and Statistics-probabilistic algorithms (including Monte Carlo)

General Terms: Algorithms, Performance

Additional Key Words and Phrases: Centralized and distributed self-diagnosis, comparison testing, fault-tolerant computing, probabilistic diagnosis, system-level diagnosis, system-level testing