The main objective of this paper is to present a new method for fault detection and isolation based on a Bayesian network. To this end, two existing works are combined. The first is the work of Li et al. [1], who proposed a causal decomposition of the T2 statistic. The second is a previous work on fault detection with Bayesian networks [2], notably on the modeling of multivariate control charts in a Bayesian network. Thus, in the context of multivariate processes, we propose an original network structure that makes it possible to decide whether a fault has appeared in the process. This structure also permits the isolation of the variables implicated in the fault. A particular advantage of the method is that detection and isolation are both performed with a single tool: a Bayesian network.
Nowadays, the monitoring of complex manufacturing systems is becoming an essential task in order to ensure safe production (for humans and materials), reduce the variability of products, or reduce manufacturing costs. Classically, three approaches to process monitoring can be found in the literature [3] and [4]: the knowledge-based approach, the model-based approach and the data-driven approach. The knowledge-based category comprises methods based on qualitative models: Digraphs; Fault Trees [5]; Case Based Reasoning [6]. The model-based approach relies on analytical (physical) models able to simulate the system [7]. Thus, at each instant, the theoretical value of each sensor is known for the normal operating state of the system. As a consequence, it is relatively easy to check whether the real process values are similar to the theoretical values. However, the major drawback of this family of techniques is that a detailed model of the process is required in order to monitor it efficiently. An effective detailed model can be very difficult, time-consuming and expensive to obtain, particularly for large-scale systems with many variables. The data-driven approaches are a family of techniques based on the analysis of real data extracted from the process [8]. These methods rely on rigorous statistical treatment of the process data (e.g., control charts, methods based on Principal Component Analysis, Projection to Latent Structures or Discriminant Analysis) [3]. Since we are monitoring large multivariate processes, we work in the data-driven monitoring framework.
Some authors refer to this activity of data-driven monitoring as AEM (Abnormal Event Management) [4]. It is composed of three principal steps: first, the timely detection of an abnormal event; second, the diagnosis of its causal origins (or root causes); and finally, taking appropriate decisions and actions to return the process to a normal working state. As the third step is specific to each process, the literature generally focuses on the first two steps: fault detection and diagnosis, named FDD [9]. We will call “fault” an abnormal event (such as an excessive pressure in a reactor, or a low quality of a part of a product), usually defined as a departure from an acceptable range of an observed variable or a calculated parameter of the process [4]. Generally, a monitoring technique is dedicated to one specific step: detection or diagnosis. In the literature, one can find many data-driven techniques for fault detection: univariate statistical process control (Shewhart charts) [10] and [11], multivariate statistical process control (T2 and Q charts) [12] and [13], and some PCA (Principal Component Analysis) based techniques [14] such as Multiway PCA or Moving PCA [15]. Kano et al. [16] compare these different techniques.
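As a rough illustration of the detection step performed by a T2 chart, the following sketch computes the T2 statistic of one observation and compares it with a control limit. It is a simplified example and not the method of the cited works: it assumes only two monitored variables whose in-control mean and covariance are known exactly, so that T2 follows a chi-square distribution with 2 degrees of freedom and the limit at false alarm rate alpha is -2 ln(alpha). The function names `t2_statistic`, `control_limit_2d` and `is_fault` are hypothetical.

```python
import math


def t2_statistic(x, mean, cov):
    """T2 = (x - mean)^T cov^{-1} (x - mean) for a 2-variable observation.

    Assumption: `mean` and `cov` are the known in-control parameters.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = x[0] - mean[0]
    dy = x[1] - mean[1]
    # Inverse of a 2x2 matrix [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def control_limit_2d(alpha):
    """Upper control limit for 2 variables with known parameters.

    The chi-square distribution with 2 degrees of freedom has the
    closed-form (1 - alpha) quantile -2 * ln(alpha).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return -2.0 * math.log(alpha)


def is_fault(x, mean, cov, alpha=0.01):
    """Signal a fault when T2 exceeds the control limit."""
    return t2_statistic(x, mean, cov) > control_limit_2d(alpha)


if __name__ == "__main__":
    mean = (0.0, 0.0)
    cov = ((1.0, 0.5), (0.5, 1.0))
    for obs in [(0.3, -0.2), (3.0, -3.0)]:
        print(obs, t2_statistic(obs, mean, cov), is_fault(obs, mean, cov))
```

Such a chart only signals that a fault is present; it does not indicate which variables are involved, which is the role of the isolation step.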
An efficient fault detection and isolation tool should be able to isolate the variables implicated in the fault, in order to help the process operator identify the root cause (the physical cause) of the fault. Some methods exist to solve this problem (see Section 2.2), based on a decomposition of the T2 statistic. However, each of these methods uses different tools for fault detection and for fault isolation (identifying the variables implicated in the fault), such as control charts, statistical decompositions, Bayesian networks, etc. From a practical point of view, it would be preferable to combine the main advantages of these techniques and to exploit them jointly in a single tool. Recently, Bayesian networks have been applied with success to fault detection and diagnosis, in the data-driven context [17] and [18], but also in the model-based context [19]. The objective of this article is to propose an improvement of the decomposition method of Li et al. [1], in order to use a single Bayesian network able to detect a fault and to isolate the variables implicated in this fault.
The article is structured as follows: Section 2 presents the preliminaries needed to understand the article; Section 2.1 highlights some aspects of Bayesian networks; Section 2.2 presents the various T2 decompositions (causal and MYT); Section 3 shows how to construct multivariate control charts with a Bayesian network and how to exploit the network in order to isolate the detected faults; two examples of the approach are presented in Section 4; finally, the last section gives conclusions on the proposed approach.
In this paper, we have presented an approach for the fault detection and fault isolation of a multivariate process. This approach is based on a Bayesian network. We have combined previous work on modeling control charts in a Bayesian network with the recent work of Li et al. [1]. The proposed approach allows us to isolate the variables responsible for a fault in a multivariate process. The method has been tested on a 5-variable system (a hot forming process) and on the TEP benchmark problem, on which its performance has been demonstrated. This paper shows the impact of taking causality into account in the detection and isolation steps.