Download English ISI article No. 5478
Persian translation of the article title

Determining fault tolerance in mission critical intelligent systems

English title
Specifying fault tolerance in mission critical intelligent systems
Article code: 5478
Year of publication: 2001
Length of English article: 12 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Knowledge-Based Systems, Volume 14, Issue 7, 1 November 2001, Pages 385–396

Keywords (translated from Persian)
Fault tolerance - Intelligent systems - Automata formalism - Mission critical systems - Real time expert systems
English keywords

Abstract

Real time intelligent systems are being increasingly used in mission critical applications in domains such as the military, aerospace, the process control industry and medicine. Despite this vast potential, the major concern about deploying mission critical intelligent systems is their dependability. Dependability encompasses such notions as reliability, safety, security, maintainability and portability. Of particular concern is the performance of mission critical intelligent systems in the presence of failures. Intelligent systems are often characterized by non-existent, imprecise or rapidly changing specifications. This makes the task of characterizing an intelligent system's performance in the presence of failures much more difficult. In this paper, we characterize the failures that are likely in a mission critical intelligent system. We propose an extended I/O automata model to capture these failure specifications. We further demonstrate how these specifications can be realized in a real time expert system by structuring the knowledge base. This formalism can also be used to specify the fault tolerant properties of the underlying hardware and software over which the intelligent system resides. Thus we have a unified formalism to specify fault tolerance properties in hardware, system software and the intelligent system. This will enable us to reason about the performance of the entire system, inclusive of all its components, in a uniform manner.

Introduction

Real time intelligent systems are being increasingly used for mission critical applications. The proliferation of mission critical applications leads to the challenge of making real time intelligent systems dependable. A dependable system is fault tolerant and provides high assurance to its users by maintaining high guaranteed levels of security, reliability, timing, availability, safety, and other attributes characterizing its operation. High assurance systems presume unambiguous specifications [4]. However, the development of intelligent systems is often characterized by non-existent, imprecise or rapidly changing specifications [30]. High assurance in intelligent systems is yet to gain the attention it deserves. Attempts at improving the dependability of intelligent systems include [1], [6], [7], [8] and [12]. The recent success of agent based systems and other AI systems in real world applications makes it necessary to focus on high assurance in intelligent systems. A first step towards developing high assurance intelligent systems is to have unambiguous specifications, which calls for a formalism for developing such specifications. Considerable work has been carried out on making software systems fault tolerant [2], [11], [13] and [28]. Formalisms developed to model dependable systems focus on design issues such as specifying the faults to be handled, detecting the existence of a fault, identifying the fault recovery methods, describing what happens during fault recovery, and stating the time constraints of the recovery process. However, this research is not easily extensible to the design of dependable intelligent systems. In this paper, we first characterize the failures that are likely to occur in a mission critical intelligent system and then present a formalism that enables us to specify these requirements in an unambiguous manner. To this end, we chose the I/O automata formalism [17].
I/O automata provide an appropriate and powerful model for discrete event systems consisting of concurrently operating components. The basic I/O automata model is extended to capture the notions of fault tolerance in mission critical intelligent systems. Once the specifications are realized in a design, it is necessary to prove that the design meets the requirements. Standard proof techniques reported in the I/O automata literature are adaptable to the extended formalism. The extended I/O automata formalism can also be used to specify the fault tolerant properties of the underlying hardware and software over which the intelligent system operates [24]. Thus we have a unified formalism to specify fault tolerance properties in hardware, system software and the intelligent system. This will enable us to reason about the performance of the entire system, inclusive of all its components, in a uniform manner. In Section 2, we characterize faults, failures and the associated recovery methods in high assurance intelligent systems. In Section 3, we present the I/O automata formalism, followed by an extension to specify fault tolerance. Section 4 shows how to specify failure detection and recovery in intelligent systems using the extended I/O automata formalism. In Section 5, we describe how the specifications of fault tolerant properties are realized by structuring the knowledge base of an intelligent system. Section 6 concludes with directions for future work.
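To make the idea of an I/O automaton with fault and recovery actions more concrete, here is a minimal Python sketch. It is not the extended formalism defined in the paper, whose details are not reproduced on this page. It only illustrates the standard split of actions into input, output and internal actions, the requirement that input actions be enabled in every state, and one plausible way to treat a fault as an input from the environment. The class `IOAutomaton`, the `monitor` example and all state and action names (`sensor_fault`, `start_recovery`, `recovered`, and so on) are hypothetical, and the transition relation is assumed to be deterministic for brevity.

```python
from typing import Dict, Iterable, Tuple

State = str
Action = str


class IOAutomaton:
    """Toy deterministic I/O automaton (illustrative, not the paper's model)."""

    def __init__(
        self,
        inputs: Iterable[Action],
        outputs: Iterable[Action],
        internals: Iterable[Action],
        start: State,
        steps: Dict[Tuple[State, Action], State],
    ) -> None:
        self.inputs = frozenset(inputs)
        self.outputs = frozenset(outputs)
        self.internals = frozenset(internals)
        self.start = start
        self.steps = dict(steps)
        # Every state mentioned as a source or target of a step.
        self.states = (
            {start} | {s for (s, _a) in self.steps} | set(self.steps.values())
        )

    def is_input_enabled(self) -> bool:
        """Check that every input action is enabled in every state."""
        return all(
            (s, a) in self.steps for s in self.states for a in self.inputs
        )

    def run(self, actions: Iterable[Action]) -> State:
        """Execute a sequence of actions and return the final state."""
        state = self.start
        for action in actions:
            if (state, action) not in self.steps:
                raise ValueError(
                    f"action {action!r} is not enabled in state {state!r}"
                )
            state = self.steps[(state, action)]
        return state


# Hypothetical monitoring component. Assumption: the fault is modeled as an
# input action, since the environment (not the component) decides when it occurs.
monitor = IOAutomaton(
    inputs={"reading", "sensor_fault"},
    outputs={"advise", "recovered"},
    internals={"start_recovery"},
    start="ok",
    steps={
        ("ok", "reading"): "ok",
        ("ok", "sensor_fault"): "failed",
        ("ok", "advise"): "ok",
        ("failed", "reading"): "failed",
        ("failed", "sensor_fault"): "failed",
        ("failed", "start_recovery"): "recovering",
        ("recovering", "reading"): "recovering",
        ("recovering", "sensor_fault"): "failed",
        ("recovering", "recovered"): "ok",
    },
)
```

In this sketch the output `advise` is enabled only in the `ok` state, so `monitor.run(["sensor_fault", "advise"])` raises `ValueError`: the component gives no advice until the internal `start_recovery` step and the `recovered` output have brought it back to `ok`.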

Conclusion

Mission critical applications of real time intelligent systems are increasing at a fast rate. Mission critical intelligent systems need to be dependable and should provide high assurance. High assurance implies precise specifications and designs. We have proposed an extended I/O automata formalism towards this end. This formalism captures the notions of failures and recovery in addition to notions of time, control and non-determinism. This enables us to capture both the problem solving aspects and the failure detection and recovery aspects of a mission critical intelligent system. It is possible to define both the specification and the design of a system in the extended I/O automata formalism. Once the specifications are realized in a design, it is necessary to prove that the design meets the requirements. Standard proof techniques reported in the I/O automata literature are adaptable to the extended formalism. This allows us to prove that a given intelligent system design meets its specifications. The extended I/O automata formalism can also be used to specify the fault tolerant properties of the underlying hardware and software over which the intelligent system operates [24]. Thus we have a unified formalism to specify fault tolerance properties in hardware, system software and the intelligent system. This will enable us to reason about the performance of the entire system, inclusive of all its components, in a uniform manner. Moreover, it is possible to derive analytical performance properties like schedulability and deadlines using the extended I/O automata formalism [24]. The design of an intelligent system defined by an extended I/O automaton is realized by structuring the rule base in REX [27], a real time expert system shell. The temporal properties associated with the actions are captured using appropriate rule structures such as Clock synchronized rules and Spanning rules [27].
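As a small illustration of a time constraint on recovery, the following Python sketch checks a timed trace of actions against a recovery deadline. It is neither one of the proof techniques nor one of the REX rule structures mentioned above. The function `recovery_violations`, the action names and the deadline value are all hypothetical. One assumption to note: a fault that is still unrecovered when the trace ends is reported as a violation, which is a deliberately conservative choice.

```python
from typing import Iterable, List, Optional, Tuple

TimedAction = Tuple[float, str]  # (timestamp, action name)


def recovery_violations(
    trace: Iterable[TimedAction],
    fault: str,
    recovered: str,
    deadline: float,
) -> List[float]:
    """Return the start times of faults not recovered within `deadline`.

    The trace is assumed to be sorted by timestamp. Repeated fault actions
    that arrive while a fault is already pending are attributed to the
    earliest pending fault.
    """
    violations: List[float] = []
    pending: Optional[float] = None  # time of the earliest unrecovered fault
    for time, action in trace:
        if action == fault and pending is None:
            pending = time
        elif action == recovered and pending is not None:
            if time - pending > deadline:
                violations.append(pending)
            pending = None
    if pending is not None:
        # Conservative assumption: no recovery observed counts as a violation.
        violations.append(pending)
    return violations


# Hypothetical timed trace: the first fault recovers in 0.4 time units,
# the second takes 2.5 time units.
example_trace = [
    (0.0, "reading"),
    (1.0, "sensor_fault"),
    (1.4, "recovered"),
    (5.0, "sensor_fault"),
    (7.5, "recovered"),
]
late_faults = recovery_violations(
    example_trace, fault="sensor_fault", recovered="recovered", deadline=2.0
)
```

With a deadline of 2.0 time units, only the fault at time 5.0 misses its deadline, so `late_faults` is `[5.0]`.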
We propose further work to address the development of visualization and simulation tools for the extended I/O automata. In the context of a high assurance intelligent system, a major concern is the performance overhead associated with making the system dependable. We propose to undertake further research to gain insights into this performance overhead and into which changes in inference strategies will help minimize the performance tradeoff when developing high assurance intelligent systems.