Planning structural inspection and maintenance policies via dynamic programming and Markov processes. Part I: Theory
|Article code||Publication year||English article||Persian translation|
|26159||2014||12-page PDF||35-page WORD|
Publisher: Elsevier - Science Direct
Journal : Reliability Engineering & System Safety, Volume 130, October 2014, Pages 202–213
Fig. 1- A schematic POMDP policy for a structural inspection and maintenance problem.
2- Markov Decision Processes
2.1- Dynamic programming
3- State augmentation
4- Semi-Markov Decision Processes
4.1- Decision interval
5- Limitations of MDPs for infrastructure management
6- Partially Observable Markov Decision Processes
Fig. 2- Belief space simplex for |S|=3 and an example belief point, b=[0.1, 0.7, 0.2].
Fig. 3- Sample belief value function for |S|=2. Each α-vector defines a region over the belief simplex.
Fig. 4- Sample value function for |S|=3.
6.1- Bellman backups
7- Approximate POMDP planning
Fig. 5- Pruning example. Redundant linear vectors are shown in gray.
7.1- Approximations based on MDP and Q-functions
Fig. 6- QMDP approximation of the value function, which is an upper bound of the exact function.
7.2- Grid-based approximations
7.3- Point-based solvers
To address effectively the urgent societal need for safe structures and infrastructure systems under limited resources, science-based management of assets is needed. The overall objective of this two-part study is to highlight the advanced attributes, capabilities and use of stochastic control techniques, and especially Partially Observable Markov Decision Processes (POMDPs), which can address the conundrum of planning optimum inspection/monitoring and maintenance policies based on stochastic models and uncertain structural data in real time. Markov Decision Processes are in general controlled stochastic processes that move away from conventional optimization approaches in order to achieve minimum life-cycle costs, and they advise decision-makers to take optimum sequential decisions based on the actual results of the inspections or non-destructive tests they perform. In this first part of the study we describe exclusively, out of the vast and multipurpose stochastic control field, methods that are fitting for structural management, progressing from simpler to more sophisticated techniques and modern solvers. We present Markov Decision Processes (MDPs), semi-MDP and POMDP methods in an overview framework, we relate each of these to the others, and we describe POMDP solutions in many forms, including both the problematic grid-based approximations that are routinely used in structural maintenance problems and the advanced point-based solvers capable of solving large-scale, realistic problems. Our approach in this paper is helpful for understanding the shortcomings of the currently used methods, related complications, possible solutions and the significance that different solvers have not only for the solution but also for the modeling choices of the problem.
In the second part of the study we utilize almost all presented topics and notions in a very broad, infinite horizon, minimum life-cycle cost structural management example, and we focus on the implementation of point-based solvers and their comparison with simpler techniques, among others.
In this paper the framework of planning and making decisions under uncertainty is analyzed, with a focus on deciding optimum maintenance and inspection actions and intervals for civil engineering structures based on the structural conditions in real time. The problem of making optimum sequential decisions has a long history in a wide variety of scientific fields, such as operations research, management, econometrics, machine maintenance, control and game theory, artificial intelligence, robotics and many more. From this immense range of problems and methods we carefully chose to analyze techniques that can specifically address the engineering and mathematical problem of structural management, and we present them in the manner we think most appropriate for interested readers who are dealing with this particular problem and/or structural safety.
A large variety of different formulations can be found addressing the problem of maintenance and management of aging civil infrastructure. In an effort to present the most prevalent methodologies succinctly, we classify them in five general categories. The first category includes methods that rely on simulation of different predefined policies; indicative works can be found in Engelund and Sorensen and Alipour et al. Based on the simulation results, the solution that provides the best performance among these scenarios is chosen, which could be the one with the minimum cost or cost/benefit ratio, etc. An evident problem with this approach is that the chosen policy, although better than the provided alternatives, will hardly be the optimal one among all the policies that can actually be implemented.
In the second category we include methods that are usually associated with a pre-specified reliability or risk threshold, for which several different procedures have been suggested in the literature. In Deodatis et al. and Ito et al. the structure is maintained whenever the simulation model reaches the reliability threshold, while in Zhu and Frangopol the same logic is followed, with the exception that the maintenance actions to take at the designated times are suggested by an optimization procedure. Thoft-Christensen and Sorensen and Mori and Ellingwood pre-assume a given number of lifetime repairs, in order to avoid the discrete nature of this variable in their non-linear, gradient-based optimization process, and based on their modeling they identify optimum maintenance times so that the reliability remains above the specified threshold. Zhu and Frangopol also followed this approach but used a genetic algorithm, at a significant computational cost, in order to drop the assumption of a pre-determined number of lifetime repairs and to model the available maintenance actions more realistically. Overall, the available methods in this category provide very basic policies, and the simultaneous use of optimization algorithms in a probabilistic domain, in this context, usually compels the use of rudimentary models.
Unfortunately, this last statement, concerning a probabilistic domain, is also valid when the problem is cast in a generic optimization formulation, which we characterize as another category although the work in  would also fit in. Formulations in this class usually work well with deterministic models, the number of possible different actions is typically greater than before, and a multi-objective framework is enabled. The problem is frequently solved by genetic algorithms and a Pareto front is sought. The choice of genetic algorithms, or other heuristic search methods, for solving the problem is not accidental, since these methods can also tackle the discrete part of the problem, such as the number of lifetime actions and the chosen action type in each maintenance period.
Unavoidably, the computational cost is nonetheless significant, and probabilistic formats are problematic with these techniques. Representative works can be seen in ,  and , among others.
All methods presented so far rely exclusively on simulation results and in essence do not take actual data into account in order to adjust or determine the performed actions, with the works in  and  being some sort of exception. While this may be sufficient for a variety of purposes, it is clearly unsuitable for an applied, real-world structural management policy. To address the issue, a possible approach is suggested in the literature which is typically, but not exclusively, associated with condition-based thresholds. We classify these methods in a fourth category; a representative work can be seen in Castanier et al. The main idea behind these methods is to simulate deterioration based on a continuous-state stochastic model, with Gamma processes being a favored candidate, and to set, by optimization, certain condition thresholds between which a certain action takes place. Assuming perfect inspections, the related action is thus performed as soon as the structure exceeds a certain condition state during its lifetime. The main weakness of this formulation is the usually unrealistic assumption of perfect observations. Because of this, although the capabilities of the formulation are generally broad and versatile, including probabilistic outcome and duration of actions, the inspection part lacks important attributes and the sophistication of the other parts of the approach. A secondary concern is that the global optimum may be hard to find in non-convex spaces, although this is not a general limitation and depends on the specifics of the problem and the optimization algorithm used.
In the fifth category we include models that rely on stochastic control and optimum sequential decisions, and these are the models of further interest in this paper. These approaches usually work in a discrete state space and, like the ones in the previously described category, also take actual, real-time data into account in order to choose the best possible actions. In their most basic form of Markov Decision Processes (MDPs) these models share the limitation of perfect observations, although they can generally provide more versatile, non-stationary policies, and thanks to their particular structure the search for the global optimum is typically unproblematic. Indicative of the successful implementation of MDPs in practical problems, Golabi et al. and Thompson et al. describe their use with fixed biennial inspection periods in PONTIS, the predominant bridge management system used in the United States. Most importantly, however, as is also shown in detail in this paper, MDPs can be extended considerably to a large variety of models, and especially to Partially Observable Markov Decision Processes (POMDPs), which can take the notion of the cost of information into account and can even address the conundrum of planning optimum policies based on uncertain structural data and stochastic models. We believe that POMDP-based models are adroit methods with superior attributes for the structural maintenance problem, in comparison to all other methods.
They do not impose any unjustified constraints on the policy search space, such as periodic inspections, threshold performances, pre-determined number of lifetime repairs, etc., and can instead incorporate in their framework a diverse range of formulations, including condition-based, reliability and/or risk-based problems, periodic and aperiodic inspection intervals, perfect and imperfect inspections, deterministic and probabilistic choice and/or outcome of actions, perfect and partial repair, stationary and non-stationary environments, infinite and finite horizons, and many more. Representative works with a POMDP framework can be seen in Madanat and Ben-Akiva, Ellis et al. and Corotis et al., while further references about studies based on Markov Decision Processes are also given in the rest of this paper and in the second part of this work.
To illustrate schematically a POMDP policy, with a minimum life-cycle cost objective, in a general, characteristic structural inspection and maintenance problem, Fig. 1 is provided. In this figure, the actual path of the deterioration process (continuous blue line) has been simulated based on one realization of a non-stationary Gamma process and is overall unknown to the decision-maker except when he decides to take an observation action. The gray area in Fig. 1 marks the mean ± 2 standard deviations uncertainty region given by the stochastic model used. This probabilistic outcome of the simulation model is the only basis for maintenance planning when actual observation data cannot be taken into account. Even with an accurate stochastic model, the fact that the actual deterioration process is never observed will usually result in non-optimum actions for a particular structure, since the realized process can lie, for example, in percentiles far away from the mean.
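As a rough illustration of the uncertainty band described for Fig. 1, the sketch below simulates one realization of a non-stationary Gamma deterioration process and compares it with the model mean ± 2 standard deviations. The power-law shape function, the parameter values and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import math
import random

def shape(t, a=0.5, b=1.5):
    """Assumed cumulative shape function of the Gamma process: a * t**b."""
    return a * t ** b

def simulate_path(times, scale=1.0, rng=None):
    """One realization built from independent Gamma increments per time step."""
    rng = rng or random.Random(0)
    level, prev, path = 0.0, 0.0, []
    for t in times:
        d_shape = shape(t) - shape(prev)
        if d_shape > 0:
            # random.gammavariate(alpha, beta): alpha = shape, beta = scale
            level += rng.gammavariate(d_shape, scale)
        path.append(level)
        prev = t
    return path

def band(t, scale=1.0, k=2.0):
    """Model mean and mean +/- k standard deviations at time t."""
    mean = shape(t) * scale
    std = math.sqrt(shape(t)) * scale
    return mean - k * std, mean, mean + k * std

times = list(range(1, 31))
path = simulate_path(times)
for t, x in zip(times[::5], path[::5]):
    lo, mean, hi = band(t)
    print(f"t={t:2d} realized={x:6.2f} mean={mean:6.2f} "
          f"band=[{max(lo, 0.0):6.2f}, {hi:6.2f}]")
```

Changing the seed of `random.Random` gives other realizations, some of which sit well away from the mean. This is the situation in which planning from the model alone, without observations, can lead to poor actions for a specific structure.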
Taking observation data into account, the decision-maker can update his belief about the deterioration level of the structure according to his prior knowledge and the accuracy of the observations. In Fig. 1 the belief updating is shown based on the outcome of the first two, different observation actions (marked with + in the figure). As seen, the first observation method is more accurate (probably at a higher cost) than the second, and points more effectively to the true state of the system. Although rarely the case with structural inspection/monitoring methods, if a certain observation action can identify the state of the structure with certainty, the belief is then updated to this state with probability one. As is shown in detail in the rest of this paper, POMDPs plan their policy upon the belief state-space, and this key feature also enables them to suggest times for inspection/monitoring and types of observation actions, without any restrictions, unlike any other method. Concerning maintenance actions, POMDPs can again optimally suggest the type and time of actions without any modeling limitations. Two different maintenance actions are shown as an example in Fig. 1, marked by the red rectangles. The length of the rectangles indicates the duration of the actions. The first maintenance action is preceded by a quite precise observation action and substantially improves the condition of the structure. Since the outcome of maintenance actions is usually also probabilistic, the belief of the decision-maker over the structural deterioration level after the action is updated based on the observation and the performed maintenance. In the POMDP framework observation actions are not necessarily connected to maintenance actions, and hence a computed policy may suggest a maintenance action at certain points of the belief space without observing first. Such an occasion is depicted at the second maintenance action in Fig. 1, where the decision-maker does not want to pay the cost of information to update his belief and decides to maintain the structure regardless. Based on his prior belief and the probabilistic outcome of the performed action, which is of lower maintenance quality but less time-demanding and costly than the previous one, the belief of the decision-maker is updated accordingly. Mainly due to the absence of real-time data in this case, the actual deterioration level is at a somewhat extreme percentile and the remaining uncertainty after the action is still considerable.
Fig. 1. A schematic POMDP policy for a structural inspection and maintenance problem.
Despite the fact that the maintenance and inspection problem has received considerable attention over the years and that POMDPs provide such a powerful framework for its solution, this is still not widely recognized today. We believe that one possible reason is that, until recently, a very serious limitation of POMDP models was that the optimal policy could not be computed for anything but very small problems. Hence, significant available works in this area focused primarily, if not exclusively, on the modeling part, and the important solving part of POMDP models was neglected in these works. This discouraging outlook for solving POMDP models did not motivate researchers and engineers to enter the field, and unfortunately this is still the case, despite recent, significant advances in POMDP solvers, mainly in the field of robotics. Addressing this issue in this paper, we describe exclusively, out of the vast and multipurpose stochastic control field, methods that are fitting for structural management, progressing from simpler to more sophisticated techniques and modern solvers capable of solving large-scale, realistic problems.
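The belief updating discussed above for Fig. 1 follows Bayes' rule. A minimal sketch for a discrete-state model is given here. The three condition states, the transition matrix `T` and the two observation matrices are hypothetical numbers, chosen only to contrast a more accurate inspection method with a less accurate one.

```python
def predict(belief, T):
    """Propagate the belief one step through the transition matrix T[s][s2]."""
    n = len(belief)
    return [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]

def update(belief, O, obs):
    """Bayes update with likelihoods O[s][obs] = P(obs | true state s)."""
    unnorm = [belief[s] * O[s][obs] for s in range(len(belief))]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnorm]

# Hypothetical 3-state deterioration model (0 = good, 1 = fair, 2 = poor).
T = [[0.8, 0.2, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
O_accurate = [[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]]
O_coarse = [[0.6, 0.3, 0.1],
            [0.2, 0.6, 0.2],
            [0.1, 0.3, 0.6]]
O_perfect = [[1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]

# Two time steps without any inspection: the belief comes from the model only.
prior = predict(predict([1.0, 0.0, 0.0], T), T)

# Both inspection methods report "fair" (observation index 1).
post_accurate = update(prior, O_accurate, obs=1)
post_coarse = update(prior, O_coarse, obs=1)
post_perfect = update(prior, O_perfect, obs=2)

print("prior        ", [round(p, 3) for p in prior])
print("accurate obs ", [round(p, 3) for p in post_accurate])
print("coarse obs   ", [round(p, 3) for p in post_coarse])
print("perfect obs  ", [round(p, 3) for p in post_perfect])
```

With these assumed numbers the accurate method concentrates the belief on the observed state much more strongly than the coarse one, and the perfect observation collapses the belief onto a single state with probability one.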
More specifically, in Section 2 we briefly describe MDPs as a foundation for the rest of the paper, and in Section 3 we explain how state augmentation works, which is a valuable technique for forming non-stationary problems, among others. In Section 4 we describe semi-MDPs, which can model the duration of actions, and we relate them to the decision interval notion, which is important for structural management, while in Section 5 we explain why MDPs and semi-MDPs present intrinsic limitations for the problem considered. We then explain in detail in Section 6 how POMDPs can answer all these limitations, we explain belief updating and the belief space concept, and overall we present this demanding topic in a concise and clear way. In Section 7 we examine solving techniques, and we deliberately present simple approximate solvers for POMDPs that are directly based on MDPs, because structural management programs like PONTIS currently rely only on MDPs and could hence be straightforwardly enhanced by these methods. We also present grid-based solvers, which are almost exclusively used in the literature today for structural maintenance problems with POMDPs, we explain their inadequacies, and we finally analyze point-based solvers that have the capability to solve larger scale problems. We believe our approach in this Part I paper is helpful for understanding the shortcomings of the currently used methods, related complications and possible solutions, and we hope that it will also help interested readers understand the significance that different POMDP solvers have, not only for the solution but also for the modeling of the problem, and why different researchers often choose specific models based on solver availability.
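To make the link between MDP-based tools and POMDPs more concrete, the sketch below runs value iteration (dynamic programming) on a small cost-minimizing MDP and then applies a QMDP-style approximation, which scores a belief by weighting the MDP Q-values with the belief probabilities. The three-state model, the costs and the discount factor are hypothetical and serve only as an illustration; this is not the example used in the paper.

```python
def q_values(V, T, C, gamma):
    """Q[a][s] = C[a][s] + gamma * sum_s2 T[a][s][s2] * V[s2]."""
    n = len(V)
    return [[C[a][s] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n))
             for s in range(n)] for a in range(len(T))]

def value_iteration(T, C, gamma=0.95, tol=1e-9, max_iter=10_000):
    """Minimum expected discounted cost for every state of the MDP."""
    n = len(C[0])
    V = [0.0] * n
    for _ in range(max_iter):
        Q = q_values(V, T, C, gamma)
        V_new = [min(Q[a][s] for a in range(len(T))) for s in range(n)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new, q_values(V_new, T, C, gamma)
        V = V_new
    return V, q_values(V, T, C, gamma)

def qmdp_action(belief, Q):
    """QMDP: choose the action minimizing sum_s b(s) * Q(s, a)."""
    scores = [sum(b * q for b, q in zip(belief, Q[a])) for a in range(len(Q))]
    best = min(range(len(Q)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical model: action 0 = do nothing, action 1 = repair.
T = [
    [[0.8, 0.2, 0.0], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]],  # do nothing
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.8, 0.2, 0.0]],  # repair
]
C = [
    [0.0, 5.0, 50.0],    # cost of doing nothing in states 0, 1, 2
    [20.0, 25.0, 40.0],  # cost of repairing in states 0, 1, 2
]

V, Q = value_iteration(T, C)
policy = [min(range(len(T)), key=lambda a: Q[a][s]) for s in range(3)]
action, cost = qmdp_action([0.1, 0.7, 0.2], Q)
print("MDP policy per state:", policy)
print("QMDP action at belief [0.1, 0.7, 0.2]:", action, "score:", round(cost, 2))
```

Because QMDP reuses the fully observable Q-values, it implicitly assumes that all uncertainty disappears after one step, so it never assigns value to an action taken only to gather information. That is one reason such approximations are simple to add to MDP-based programs but remain limited for inspection planning.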
Based on this Part I paper, in our Part II companion paper we utilize almost all presented topics and notions in a very broad, realistic, infinite horizon, minimum life-cycle cost example, with hundreds of states, a choice of both inspection and maintenance actions, uncertain observations and non-stationarity, and we focus on the implementation of point-based solvers and their comparison with simpler techniques, among others. Closing this introductory part, it is important to mention that several other formulations of the structural maintenance problem exist, which could either be integrated in the provided categories as model variations or could perhaps form new categories. Representative examples can be found in renewal theory concepts, continuous time Markov Decision Processes, and in review papers with valuable references. Notwithstanding this plurality of methodologies, however, the sophistication and adeptness of POMDP formulations are exceptional.
Conclusions
In this paper, stochastic control approaches that are appropriate for infrastructure management and minimum life-cycle costs are analyzed. We briefly describe Markov Decision Processes (MDPs) as a foundation for the rest of the paper and we gradually advance to more sophisticated techniques and modern solvers capable of solving large-scale, realistic problems. In particular, we present the state-augmentation procedure, semi-MDPs and their broader association with important notions, as well as Partially Observable MDPs (POMDPs), which can efficiently address significant limitations of alternative methods. We also examine solving techniques, from simple approximate solvers for POMDPs that are directly based on MDPs and can be straightforwardly utilized by available structural management programs, to the inadequate grid-based solvers that are overused for maintenance problems with POMDPs, and finally to advanced, modern point-based solvers with enhanced attributes, capable of solving larger scale problems. Overall, the paper shows that POMDPs extend studies based on alternative concepts, such as the classic reliability/risk-based maintenance, and that they do not impose any unjustified constraints on the policy search space, such as periodic inspections, threshold performances, perfect inspections and many more. A clear disadvantage of POMDPs, however, is that they are difficult to solve, especially for large models with many states, and this paper helps explain the significance of different POMDP solvers for both the solution and the modeling of the problem. Based on this Part I paper, the companion Part II paper utilizes almost all presented topics and notions in a demanding minimum life-cycle cost application, where the optimum policy for a deteriorating structure consists of a complex combination of a variety of inspection/monitoring types and intervals, as well as maintenance actions and action times.