An automated method for estimating the reliability of grid systems using Bayesian networks
Article code | Publication year | Number of pages (English article) |
---|---|---|
29177 | 2012 | 10-page PDF |
Publisher: Elsevier - Science Direct
Journal : Reliability Engineering & System Safety, Volume 104, August 2012, Pages 96–105
English abstract
Grid computing has become relevant due to its applications in large-scale resource sharing, wide-area information transfer, and multi-institutional collaboration. In general, a grid service requests the use of a set of resources available in the grid to complete certain tasks. Although analysis tools and techniques for these types of systems have been studied, grid reliability is generally computationally intensive to estimate because of the complexity of the system. Moreover, conventional reliability models share some common assumptions that cannot be applied to grid systems. Therefore, new analytical methods are needed for effective and accurate assessment of grid reliability. This study presents a new method for estimating grid service reliability which, unlike previous studies, does not require prior knowledge about the grid system structure. Moreover, the proposed method does not rely on any assumptions about the link and node failure rates. The approach uses a data-mining algorithm, K2, to discover the grid system structure from raw historical system data, which makes it possible to find the minimum resource spanning trees (MRST) within the grid. It then uses Bayesian networks (BN) to model the MRST and estimate grid service reliability.
English introduction
Grid computing has become relevant due to its applications in large-scale resource sharing, wide-area information transfer, and multi-institutional collaboration. In general, grid services request a set of resources available in the grid to complete certain tasks. Many experts believe that grid technologies will offer a chance to extend the benefits of the Internet [1]. However, grid reliability is difficult to analyze because grids are highly heterogeneous and distributed. Because grid systems involve cross-organizational sharing, they support existing distributed computing technologies. For example, enterprise-level distributed computing systems can use grid technologies to share resources across their different institutions. Although several development tools and techniques for grid systems have been studied, estimating grid reliability is not straightforward due to the size and complexity of the grid [2]. Therefore, new analytical methods are needed to evaluate grid reliability. Over the past several years, research and development efforts have focused on the challenges that arise when large grid organizations are built [1], [2], [3] and [4]. Grid system reliability estimation is a recent topic, and there are a few studies on it in the literature [5], [6], [7] and [8]. In these studies, grid system reliability is estimated by focusing on the reliabilities of the services provided in the grid system. For this purpose, the grid system components involved in a grid service are classified into spanning trees, and each tree is studied separately. However, these studies mainly focus on understanding grid system structures rather than on estimating the actual system reliability. Thus, for simplification, they make certain assumptions about component failure rates, such as requiring them to follow a given probability distribution [7].
For reliability estimation, Bayesian networks (BN) have been proposed as an efficient method [9], [10], [11] and [12]. BN provide significant advantages over traditional frameworks for systems engineers, mainly because they are easy to interpret and can be used in interaction with domain experts in the reliability field [13]. Using the BN structure and the associated probability values, the system reliability can be estimated with the help of Bayes' rule [12]. Several recent studies estimate reliability using BN [9], [11], [14], [15] and [16], but they require specialized networks designed for a specific system. That is, the BN used for analyzing system reliability must be known beforehand (i.e. the BN is built by an expert who has "adequate" knowledge about the system under consideration). However, human intervention is always open to unintentional mistakes that could cause discrepancies in the results [17]. To address these issues, this paper introduces a methodology for estimating grid system reliability that combines BN construction from raw component and system data, association rule mining, and evaluation of conditional probabilities. Based on an extensive literature review, this is the first study that incorporates these methods for estimating grid system reliability. With the increasing popularity of computer environments in systems engineering, grid systems have been widely used in various system-related applications. Understanding the grid system structure and the component relationships is essential for systems engineers who want to allocate resources optimally and improve system reliability. This study provides a methodology for automated discovery of component relationships and estimation of the reliability of grid services to help systems engineers.
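As a rough illustration of how a BN structure and its probability values yield a reliability figure, the sketch below sums over every joint state of a service node's parents. It is only a toy example under stated assumptions: the function name `service_reliability`, the component names `link` and `node`, and all probability values are hypothetical and do not come from the paper, and the parents are assumed to be root nodes of the network.

```python
from itertools import product

def service_reliability(priors, cpt):
    """P(service = up), obtained by summing over all joint parent states.

    priors: {component: P(component is up)}; parents are assumed to be
            root nodes of the network (an assumption of this sketch).
    cpt:    {tuple of parent states (1 = up, 0 = down), in sorted name
            order: P(service up | those states)}
    """
    names = sorted(priors)
    total = 0.0
    for states in product((0, 1), repeat=len(names)):
        # Probability of this particular combination of parent states.
        weight = 1.0
        for name, state in zip(names, states):
            weight *= priors[name] if state else 1.0 - priors[name]
        total += weight * cpt[states]
    return total

# Hypothetical numbers: the service needs both the link and the node.
priors = {"link": 0.95, "node": 0.90}
cpt = {(1, 1): 0.99, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}
print(service_reliability(priors, cpt))
```

Enumerating all parent states is exponential in the number of parents, so this form only suits small parent sets; it is meant to show the calculation, not to scale.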
The methodology suggested in this paper automates spanning tree discovery and BN construction by using the K2 algorithm (a commonly used association rule mining algorithm). K2 identifies the associations among the grid system components by using a predefined scoring function and a heuristic. According to the proposed method, once the BN is efficiently and accurately constructed, the reliabilities of grid services are estimated with the help of Bayes' rule. Unlike previous studies, the methodology proposed in this paper does not rely on any assumptions about the component failure rates in grid systems. Moreover, the proposed method does not require prior knowledge about the grid system structure.
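To make the "scoring function and heuristic" concrete, here is a minimal sketch of K2 as it is usually described (Cooper and Herskovits): nodes are visited in a fixed order, and each node greedily gains the preceding node that most improves its score until nothing helps. This is not the paper's implementation. The function names, the `max_parents` limit, the node ordering and the toy data set are all assumptions made for illustration.

```python
from collections import Counter
from math import lgamma

def k2_log_score(data, child, parents, arity=2):
    """Log of the Cooper-Herskovits (K2) score of `child` given `parents`."""
    counts = Counter()   # (parent configuration, child state) -> N_ijk
    totals = Counter()   # parent configuration -> N_ij
    for row in data:
        config = tuple(row[p] for p in parents)
        counts[(config, row[child])] += 1
        totals[config] += 1
    score = 0.0
    for config, n_ij in totals.items():
        # log[(r - 1)! / (N_ij + r - 1)!] with r = arity
        score += lgamma(arity) - lgamma(n_ij + arity)
        for state in range(arity):
            score += lgamma(counts[(config, state)] + 1)  # log(N_ijk!)
    return score

def k2(data, order, max_parents=2):
    """Greedy parent selection; `order` is an assumed node ordering."""
    parents = {node: [] for node in order}
    for idx, node in enumerate(order):
        best = k2_log_score(data, node, parents[node])
        candidates = set(order[:idx])  # only earlier nodes may be parents
        while candidates and len(parents[node]) < max_parents:
            scored = [(k2_log_score(data, node, parents[node] + [c]), c)
                      for c in sorted(candidates)]
            new_score, new_parent = max(scored)
            if new_score <= best:
                break  # no candidate improves the score
            best = new_score
            parents[node].append(new_parent)
            candidates.remove(new_parent)
    return parents

# Hypothetical availability log: columns 0 and 1 are components,
# column 2 is a service that is up only when both components are up.
data = [(a, b, a & b) for a in (0, 1) for b in (0, 1)] * 10
print(k2(data, order=[0, 1, 2]))
```

On this toy log the two component columns receive no parents, while the service column ends up with both components as parents, which is the dependency the data was built to contain.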
English conclusion
Grid systems are newly developed concepts for large-scale distributed systems. A grid system can contain various nodes that are logically and physically distributed, and large-scale sharing of resources between these nodes is essential. There are mainly two types of nodes in a grid system: RM nodes share resources, and RN nodes request service from them. Identifying the links and nodes between RN and RM is essential for estimating the reliability of the requested service. Due to the special and complex nature of grid systems, traditional reliability estimation methods cannot be used for them. As grid systems have become popular in the last decade, they have found new application areas in systems engineering; however, the question of how to estimate their reliability has remained wide open. Although there have been studies in the literature on estimating grid service reliability, these studies rely on certain assumptions about the link and component failure rates [7], [8] and [20] and/or assume that the grid system structure is completely known [7] and [28]. In real-life grid systems, these assumptions may not always hold. First, component and link failures in real-life systems may occur randomly, and making assumptions about their failure rates can produce incorrect results. Second, grid systems can be so large and dynamic that their structure may not be exactly known a priori. This study discusses an automated method for estimating grid service reliability without relying on any assumptions about the component and link failures. Moreover, the proposed method does not require prior knowledge about the grid system structure. Instead, the method is based on a popular data mining algorithm, K2, and finds the associations between the grid components automatically. To find the component associations, the proposed method works with a dataset that records the past availabilities of the grid components.
After discovering the grid structure from this dataset, the proposed method finds the MRSTs and also computes conditional probability tables (CPTs) for the components that it considers during the process. The MRSTs and CPTs are essential for estimating the grid service reliability, which is estimated with the help of BN and Bayes' theorem. Moreover, the proposed method does not need to consider all components in the grid system, which can be a very large set. Instead, it stops when it has found all possible MRSTs, which usually requires considering only a small subset of the components in the grid system. An experimental analysis of the performance and accuracy of the proposed method is also provided. It shows that the proposed method discovers the MRSTs in less time than Dai and Wang's method (which uses a genetic algorithm) and provides very accurate reliability values. Finally, the proposed method should be very useful for system and reliability engineers, since it is fully automated, does not rely on assumptions, and does not require prior knowledge of the grid system structure.
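The two ingredients named above, CPTs computed from historical availability data and a set of MRSTs, can be illustrated with a small sketch. It is not the paper's procedure: CPT entries are taken as simple relative frequencies, and the service check uses an empirical frequency ("at least one MRST fully up") as a stand-in for BN inference. The record layout, the component names `l1` and `l2`, and both function names are hypothetical.

```python
from collections import defaultdict

def estimate_cpt(records, child, parents):
    """Relative-frequency estimate of P(child = 1 | parent states).

    records: list of dicts mapping a component name to 1 (up) or 0 (down).
    Only parent configurations that occur in the records get an entry.
    """
    up = defaultdict(int)
    seen = defaultdict(int)
    for rec in records:
        key = tuple(rec[p] for p in parents)
        seen[key] += 1
        up[key] += rec[child]
    return {key: up[key] / seen[key] for key in seen}

def empirical_service_reliability(records, mrsts):
    """Fraction of records in which at least one MRST has every component up."""
    ok = sum(
        1 for rec in records
        if any(all(rec[c] for c in tree) for tree in mrsts)
    )
    return ok / len(records)

# Hypothetical availability history for two links and one service.
records = [
    {"l1": 1, "l2": 1, "svc": 1},
    {"l1": 1, "l2": 0, "svc": 1},
    {"l1": 0, "l2": 1, "svc": 1},
    {"l1": 0, "l2": 0, "svc": 0},
    {"l1": 1, "l2": 1, "svc": 1},
]
mrsts = [["l1"], ["l2"]]  # assumed: either link alone can carry the service

print(estimate_cpt(records, "svc", ["l1"]))
print(empirical_service_reliability(records, mrsts))
```

With so few records the frequencies are crude; a real history would need enough observations of each parent configuration for the CPT entries to be meaningful.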