Reliability and performance analysis for fault-tolerant programs consisting of versions with different characteristics
Publisher: Elsevier - Science Direct
Journal: Reliability Engineering & System Safety, Volume 86, Issue 1, October 2004, Pages 75–81
This paper presents a simple, straightforward algorithm for evaluating the reliability and expected execution time of software systems consisting of fault-tolerant components. The components are built from functionally equivalent but independently developed versions that differ in reliability and performance. Both N-version programming (with parallel and sequential execution of the versions) and the recovery block scheme are considered within a universal model.
Software failures are caused by errors made in various phases of program development. When software reliability is of critical importance, special programming techniques are used to achieve fault tolerance. Two of the best-known fault-tolerant software schemes are N-version programming (NVP) and the recovery block scheme (RBS). Both schemes rely on the redundancy of software modules (functionally equivalent but independently developed) and on the assumption that coincident failures of modules are rare.

NVP was proposed by Chen and Avizienis. This approach presumes the execution of N functionally equivalent software modules (called versions) that receive the same input and send their outputs to a voter, which determines the system output. The voter produces an output if at least M out of N outputs agree. Otherwise, the system fails. Usually majority voting is used, in which N is odd and M=(N+1)/2.

RBS was proposed by Randell. This approach presumes consecutive execution of different versions. After each version is executed, its output is tested by an acceptance test block (ATB). If the ATB accepts the version output, the process terminates and the version output is taken as the output of the entire system. If the ATB does not accept the output, the next version is executed. If none of the N versions produces an accepted output, the system fails.

Fault-tolerant programming based on computational redundancy usually requires additional resources and incurs performance penalties (particularly in computation time), which creates a tradeoff between software performance and reliability. Estimating the effect of fault-tolerant programming on system performance is especially important in safety-critical real-time computer applications. This effect has been studied by Tai et al. and by Goseva-Popstojanova and Grnarov. Tai et al. considered a basic realization of NVP (N=3, M=2) consisting of versions with identical fault probabilities and different execution times, whereas Goseva-Popstojanova and Grnarov studied NVP with arbitrary N in which both the times to failure and the execution times of different versions are identically distributed random variables.

In many cases, information about version reliability and execution time is available from separate testing and/or reliability prediction models. This information can be incorporated into a fault-tolerant program model to obtain a more precise evaluation of reliability and performance. A reliability model of NVP with versions of differing reliability has been considered in an earlier study. However, that study did not address the system performance evaluation problem and did not suggest a general algorithm for evaluating NVP reliability for arbitrary N and M.

This paper presents an algorithm for evaluating the reliability and the performance of NVP and RBS with arbitrary N and M, consisting of versions characterized by different reliability and execution time. The models of different fault-tolerant programs and the measures of their reliability and performance are presented in Section 2. A fast algorithm for evaluating the reliability and performance measures is presented in Section 3. Section 4 contains illustrative examples.
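
To make the M-out-of-N voting rule and the sequential acceptance-test rule described above more concrete, the following Python sketch computes simple reliability figures for both schemes. It is an illustration under strong simplifying assumptions and is not the algorithm of Section 3: versions are assumed to fail independently, an NVP system is counted as successful when at least M versions produce a correct output, and the acceptance test in RBS is assumed to be perfect. The function names (`nvp_reliability`, `rbs_reliability`, `rbs_expected_time`) and all numeric values are hypothetical.

```python
from math import prod


def nvp_reliability(r, m):
    """Probability that at least m versions produce a correct output.

    r[i] is the assumed success probability of version i; versions are
    assumed to be statistically independent.
    """
    # dist[k] = probability that exactly k of the versions processed so far
    # succeeded (distribution of a sum of non-identical Bernoulli trials).
    dist = [1.0]
    for p in r:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)  # this version fails
            nxt[k + 1] += q * p      # this version succeeds
        dist = nxt
    return sum(dist[m:])


def rbs_reliability(r):
    """Probability that at least one version is accepted.

    Assumes a perfect acceptance test: correct outputs are always accepted
    and incorrect outputs are always rejected.
    """
    return 1.0 - prod(1.0 - p for p in r)


def rbs_expected_time(r, t):
    """Expected time until the recovery block terminates (success or failure).

    t[i] is the assumed execution time of version i; versions run in the
    given order, and version i runs only if all earlier versions were rejected.
    """
    expected, reach = 0.0, 1.0  # reach = probability that version i is executed
    for p, tau in zip(r, t):
        expected += reach * tau
        reach *= 1.0 - p
    return expected


if __name__ == "__main__":
    r = [0.95, 0.90, 0.85]  # hypothetical version reliabilities
    t = [10.0, 14.0, 20.0]  # hypothetical execution times
    n = len(r)
    print(nvp_reliability(r, (n + 1) // 2))  # majority voting, M = (N+1)/2
    print(rbs_reliability(r))
    print(rbs_expected_time(r, t))
```

Because the versions differ in reliability, the count of correct outputs does not follow a binomial distribution, so the sketch builds the distribution one version at a time instead of using a closed-form binomial sum.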