Validation of intelligent systems: a critical study and a tool
Article code | Publication year | Length of English article |
---|---|---|
5469 | 2000 | 16 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 18, Issue 1, January 2000, Pages 1–16
English abstract
One of the most important phases in the methodology for the development of intelligent systems is the evaluation of the performance of the implemented product. This process is popularly known as verification and validation (V&V). The majority of tools designed to support the V&V process favour verification to the detriment of validation, and are limited to an analysis of the internal structures of the system. The authors of this article propose a methodology for results-oriented validation and present a tool (SHIVA) that facilitates the tasks included in the methodology, covering quantitative as well as heuristic aspects. The result is an intelligent tool for the validation of intelligent systems.
English introduction
For many years now, software engineering has been concerned with the development of methods and techniques for the definition, construction and maintenance of quality software. Classical methodologies are not, however, entirely suitable for knowledge engineering. In the words of Morris (1985), the world of artificial intelligence does not quite fit into the grey areas of software engineering. One of the development methodologies that has had the greatest impact in the field of intelligent systems is the spiral methodology proposed by Boehm (1988). This methodology is recommended by several authors, among them Lee and O'Keefe (1994), Noblett and Jones (1991) and Cardeñosa et al. (1991), because it permits the inclusion of concepts such as incremental development and fast prototyping, which are fundamental in the development of an intelligent system. Fig. 1 illustrates an example of the spiral methodology, taken from Lee and O'Keefe (1994).
In the spiral methodology the final stages of each development cycle are given over to testing the quality of the developed product. These stages are commonly known as verification and validation, or simply V&V. Verification refers, according to Boehm (1981), to building the system right. For intelligent systems, this definition is expressed as ‘testing that the system has no errors and complies with its initial specifications’. Validation, on the other hand, and again according to Boehm (1981), refers to building the right system; for intelligent systems this implies testing that the output of the system is correct and complies with the needs and requirements of the user. As a follow-up to validation, many authors include one or several additional phases, commonly grouped together under the term evaluation, in which aspects that go beyond the validity of the final solutions are analysed.
Evaluation is thus an endeavour to analyse aspects such as utility, robustness, speed, efficiency, extension possibilities, ease of use, credibility, etc. The validation phase may be viewed from two different perspectives:
• Results-oriented validation (Lee and O'Keefe, 1994) compares the performance of the system with an expected performance (provided by a standard reference or by human experts) to ensure that the system reaches an acceptable performance level.
• Usage-oriented validation goes beyond the correctness of the results obtained by the system and concentrates on matters referring to the man–machine interaction. This type of validation is generally referred to in the literature as ‘assessment’ (O'Leary, 1987), and some authors, such as Liebowitz (1986), include it within the evaluation phase.
Results-oriented validation is normally a prerequisite for usage-oriented validation: if the system fails to render a satisfactory performance (or gives no indication that performance will improve with further development), then aspects concerning its utilisation are irrelevant. For this reason, the central issue of this article is results-oriented validation, which henceforth shall be referred to simply as ‘validation’. The complexity of the V&V processes has motivated endeavours to automate these phases, although success in this respect has been mixed.
Thus, considerable advances have been made in the matter of verification, with the construction of commercial shells (that partially eliminate the need to verify the inference engine) and of specific tools such as EVA (Stachowitz and Combs, 1987), CHECK (Nguyen et al., 1987) and COVER (Preece et al., 1992) that facilitate the verification of knowledge bases. The validation of intelligent systems, by contrast, continues to be a poorly structured field of research, in which many ad hoc approaches have been developed but an overall vision of developments is still lacking. In a recent study, Murrel and Plant (1997) analysed the principal verification and validation tools discussed in the literature between 1985 and 1995, and from this study it is clear that the majority of the tools described carry out verification or refinement tasks. Refinement is an intermediate phase between verification and validation, concerned with applying ‘white box’ tests to the system. Knowledge refinement tools deserving mention include KVAT (Mengshoel, 1993), SEEK (Politakis, 1985), SEEK2 (Ginsberg and Weiss, 1985) and DIVER (Zlatareva, 1998). In this article, we wish to move up a step on the pyramid of behaviour analysis by carrying out a results-oriented validation in which the intelligent system is treated as a ‘black box’. Automating the validation phase clearly brings many advantages, since it makes systems with a guarantee of quality easier to construct. Nonetheless, before proceeding to the automation of this phase it is necessary to identify and organise the tasks that must be carried out and to construct a validation methodology that indicates the steps to be followed in the process.
For this reason the objectives of this research are: (1) to identify and study the different processes related to validation; (2) to construct a methodology that characterises the validation process and that indicates the task to be carried out at each step of the way; and finally (3) to construct a computational tool that automates, as far as possible, the distinct phases of the methodology.
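The results-oriented comparison described in this introduction (system output set against a standard reference or a human expert) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the text does not say which statistics are applied, so Cohen's kappa is used here as one common chance-corrected agreement measure, and the function names `agreement_ratio` and `cohen_kappa` as well as the sample labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def agreement_ratio(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of cases on which two raters give the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must label the same non-empty case set")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = agreement_ratio(a, b)  # also validates the inputs
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently with their own label frequencies.
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


# Assumed sample data: six cases labelled by the system and by one expert.
system_labels = ["a", "a", "b", "b", "a", "b"]
expert_labels = ["a", "a", "b", "a", "a", "b"]

ratio = agreement_ratio(system_labels, expert_labels)
kappa = cohen_kappa(system_labels, expert_labels)
print(f"agreement ratio = {ratio:.3f}, kappa = {kappa:.3f}")
```

The raw agreement ratio alone can look favourable when one label dominates the case set, which is why a chance-corrected measure is shown next to it.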
English conclusion
In order to evaluate the applicability of the developed tool, it was used in the validation of the intelligent systems PATRICIA (Moret-Bonillo et al., 1993) and NST-EXPERT (Alonso-Betanzos et al., 1995), both developed in the Research and Development Laboratory for Artificial Intelligence (LIDIA) at the University of A Coruña, Spain. A detailed description of the results of these validations may be consulted in Moret-Bonillo et al. (1997) and Alonso-Betanzos et al. (1998) respectively. This field test shows that the application of SHIVA not only permits a comparison of the performance of the intelligent system with that of the human experts but also assists in the acquisition of new knowledge or the refinement of existing knowledge. For a more detailed description of SHIVA and its underlying methodology, Mosqueira-Rey (1998) may be consulted.
In the course of this work it has become clear that intelligent systems continue to be software, for which reason the experience acquired by the software engineer is also applicable to knowledge engineering. Nevertheless, the distinctive characteristics of these systems and their application domains mean that both the development and validation methodologies differ substantially from their counterparts within the field of software engineering. The development methodologies for intelligent systems are fundamentally based on the concepts of prototypes and incremental development, and the spiral methodologies are a prime example. For the verification and validation of the ‘conventional’ software parts of an intelligent system the techniques described in software engineering can be utilised. For the remaining part of the system, however, it is necessary to develop new V&V procedures.
With respect to verification, a great many tools have been developed that permit the consistency and completeness of knowledge bases to be tested with acceptable efficiency, although to guarantee their correct functioning they presuppose a simplified structure of those knowledge bases. Validation is less automated than verification for two reasons. First, it is intrinsically more complex, given the strong influence exerted by the subjectivity of the evaluators. Second, there is neither a clear, precise validation methodology nor an overall classification of validation problems, nor is there a clear relationship between these problems and the techniques intended to solve them. Furthermore, the majority of validation tools implemented perform refinement tasks by means of ‘white box’ techniques, whereby an attempt is made to discover which knowledge structure is the cause of an error in interpretation. This means that these tools are very dependent on the structures that they are endeavouring to validate, and that it is not possible to construct a tool of a general nature. The objective of our research is to develop a tool that treats the system as a ‘black box’, paying attention exclusively to its results and not to its internal functioning, which makes it applicable to any kind of intelligent system. The development of the validation tool has been approached from the perspective of methodology: first the common characteristics of a validation process are analysed, then the different development phases within the validation are identified, and finally a tool is developed that facilitates the application of the different phases. Evidently, the automation of a process such as validation brings many advantages.
The construction of an intelligent system is a cyclical process in which the validation phase is executed repeatedly (even though the agents used in each validation may be different) in an endeavour to assure the final quality of the system. There is no doubt that the more support tools a knowledge engineer has available for a validation, the easier the application and the more complete the results. The validation tool SHIVA is presented as an aid to the interpretation of the validation data, on the basis of the framework made up of the planning and application phases. The tool bases its efficacy on the combined use of distinct statistical measurements, so that the drawbacks inherent to some techniques are offset by the advantages of others. A final point is to outline the new lines of investigation and development currently in progress:
• Inclusion in the tool of support for usage-oriented validation.
• Study and implementation of mathematical methods for the construction of standards.
• Study and implementation of new agreement and association methods, as well as new techniques of hierarchical clustering.
• Support for the interpretation phase via an application of a heuristic nature.
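The ‘black box’ stance taken in this conclusion can be sketched in a few lines. This is not SHIVA's implementation, which the text does not detail; it is a minimal illustration that assumes the system under validation is available as a callable, that cases are simple dictionaries, and that a plain agreement ratio stands in for whichever agreement measures the tool combines. The names `run_black_box`, `pairwise_agreement`, `toy_system` and the sample data are all hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Case = Dict[str, float]                  # assumed case format
Classifier = Callable[[Case], Hashable]  # the system seen only via its output


def run_black_box(system: Classifier, cases: Sequence[Case]) -> List[Hashable]:
    """Collect outputs only; nothing inside `system` is inspected."""
    return [system(case) for case in cases]


def pairwise_agreement(
    ratings: Dict[str, Sequence[Hashable]]
) -> Dict[Tuple[str, str], float]:
    """Agreement ratio for every pair of raters, system and experts alike."""
    table: Dict[Tuple[str, str], float] = {}
    for (name_a, a), (name_b, b) in combinations(ratings.items(), 2):
        if len(a) != len(b) or not a:
            raise ValueError("all raters must label the same non-empty case set")
        table[(name_a, name_b)] = sum(x == y for x, y in zip(a, b)) / len(a)
    return table


def toy_system(case: Case) -> str:
    """Stand-in for an intelligent system: a single threshold rule."""
    return "high" if case["value"] > 0.5 else "low"


cases = [{"value": 0.9}, {"value": 0.2}, {"value": 0.7}, {"value": 0.4}]
ratings = {
    "system": run_black_box(toy_system, cases),
    "expert_1": ["high", "low", "high", "high"],  # assumed expert answers
    "expert_2": ["high", "low", "low", "low"],
}
table = pairwise_agreement(ratings)
for pair, value in table.items():
    print(pair, value)
```

Placing the system in the same table as the experts makes it possible to see whether it disagrees with them more than they disagree with each other, and a table of this kind is also the usual input to a hierarchical clustering of raters.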