Article title (translated from Persian)

Data mining model-based control chart for multivariate and autocorrelated processes

English title

Data mining model-based control charts for multivariate and autocorrelated processes

Article code	Publication year	Pages in the English article
22249	2012	9 pages (PDF)

Source

Publisher : Elsevier - Science Direct

Journal : Expert Systems with Applications, Volume 39, Issue 2, 1 February 2012, Pages 2073–2081

Keywords (translated from Persian)

Correlated process - Multivariate process - Model-based control chart - Statistical process control - Data mining

English keywords

Autocorrelated process, Multivariate process, Model-based control chart, Statistical process control, Data mining

Article preview

English abstract

Process monitoring and diagnosis have been widely recognized as important and critical tools in system monitoring for detection of abnormal behavior and quality improvement. Although traditional statistical process control (SPC) tools are effective in simple manufacturing processes that generate a small volume of independent data, these tools are not capable of handling the large streams of multivariate and autocorrelated data found in modern systems. As the limitations of SPC methodology become increasingly obvious in the face of ever more complex processes, data mining algorithms, because of their proven capabilities to effectively analyze and manage large amounts of data, have the potential to resolve the challenging problems that are stretching SPC to its limits. In the present study we attempted to integrate state-of-the-art data mining algorithms with SPC techniques to achieve efficient monitoring in multivariate and autocorrelated processes. The data mining algorithms include artificial neural networks, support vector regression, and multivariate adaptive regression splines. The residuals of data mining models were utilized to construct multivariate cumulative sum control charts to monitor the process mean. Simulation results from various scenarios indicated that data mining model-based control charts perform better than traditional time-series model-based control charts.

English introduction

One of the key management systems in organizations is planning for quality. Organizations consider planning for quality as a part of their strategic planning. Without careful strategic planning for quality, organizations could lose large amounts of money, market share, time, and effort (Montgomery, 2001). Therefore, businesses and manufacturers should focus on planning for quality as a way to develop a competitive edge in the market. Quality control and improvement include a set of activities implemented to achieve product and service specifications. SPC methodologies have frequently been used to avoid poor quality. A control chart is an important tool used in SPC to monitor the performance of a process over time to keep the process within control limits. Control charts are based on solid statistical theory and provide a comprehensive graphical display that can be readily configured by the users with minimal assistance. A typical control chart comprises the monitoring statistics and the control limits. When the monitoring statistics exceed (or fall below) the control limits, an alarm is generated so that the process can be investigated before defective units are produced. Univariate control charts were devised to monitor the quality of a single process characteristic. However, modern processes often involve a large number of highly correlated process characteristics. Although univariate control charts can be applied to each individual characteristic, this technique may lead to unsatisfactory results when multivariate problems are involved. Moreover, high-throughput technologies in modern industries are capable of generating data at very short sampling intervals, and this brevity leads to autocorrelation among successive observations. Traditional multivariate control charts were developed and came into use to solve these problems. However, they have become less and less capable of handling the large streams of complex and auto/cross-correlated data found in modern manufacturing and service systems.
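The alarm rule described above (a monitoring statistic compared against control limits) can be sketched as follows. This is a minimal illustration only: the three-sigma limits, the function names, and the sample values are assumptions chosen for the example and are not taken from the article.

```python
import statistics

def control_limits(in_control, k=3.0):
    """Return (lower, center, upper) limits estimated from in-control reference data.

    The multiplier k=3.0 (three-sigma limits) is an assumed, conventional choice.
    """
    center = statistics.mean(in_control)
    sigma = statistics.stdev(in_control)
    return center - k * sigma, center, center + k * sigma

def alarms(observations, lower, upper):
    """Indices of observations that fall below the lower or above the upper limit."""
    return [i for i, x in enumerate(observations) if x < lower or x > upper]

# Hypothetical in-control measurements used to set the limits.
reference = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
lcl, cl, ucl = control_limits(reference)

# Hypothetical new measurements to be monitored.
new_data = [10.0, 10.1, 11.5, 9.9, 8.2]
signal_indices = alarms(new_data, lcl, ucl)
print(signal_indices)
```

Here the limits come from a reference sample assumed to be in control, and each new observation is checked against them independently, which is exactly the setting that breaks down when observations are autocorrelated.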
Hotelling’s T² chart is the most widely used multivariate control chart because it can simultaneously and efficiently monitor multiple correlated process characteristics. The main assumptions of T² control charts are the normality and independence of observed process data. That is, successive multivariate observations are assumed to be independent and identically normally distributed over time. Some other types of multivariate control charts include the multivariate cumulative sum (MCUSUM) control chart (Crosier, 1988; Healy, 1987; Pignatiello & Runger, 1990; Woodall & Ncube, 1985) and the multivariate exponentially weighted moving average (MEWMA) control chart (Lowry, Woodall, Champ, & Rigdon, 1992). Both were devised for increased sensitivity to detect small process shifts. Although the MCUSUM and MEWMA charts are known to be relatively robust, compared with Hotelling’s T² control chart, for nonnormal and autocorrelated data, failure to use multivariate control charts carefully with autocorrelated data may result in deterioration of monitoring performance (Alwan, 1992; Montgomery & Mastrangelo, 1991). Increased rates of false alarms are one possible result of such deterioration. Model-based control charts that yield the residuals (the difference between the actual values and the fitted values from the models used) have been the traditional way to address autocorrelation problems in process monitoring. Model-based control charts have been effectively used in monitoring multistage processes in which the output process variable(s) of interest are related to the input process variables from the previous and current stages (Loredo, Jearkpaporn, & Borror, 2002). A regression adjustment control chart, developed by Hawkins (1991), monitors the residuals from the process variable of interest when that variable is regressed on all the others.
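To make the MCUSUM idea concrete, the sketch below implements one common formulation, the recursion usually attributed to Crosier (1988). The text above does not say which MCUSUM variant the authors used, so this choice is an assumption, as are the reference value k, the decision limit h, the identity covariance, and the example data.

```python
import math

def quad_form(v, inv_cov):
    """Return v' * inv_cov * v for a vector and an inverse covariance matrix."""
    n = len(v)
    return sum(v[i] * inv_cov[i][j] * v[j] for i in range(n) for j in range(n))

def mcusum(observations, target, inv_cov, k, h):
    """Crosier-style MCUSUM.

    Returns the list of chart statistics and the index of the first
    observation whose statistic exceeds h (or None if there is no signal).
    """
    s = [0.0] * len(target)          # cumulative sum vector, starts at zero
    stats, first_signal = [], None
    for t, x in enumerate(observations):
        d = [s_i + x_i - m_i for s_i, x_i, m_i in zip(s, x, target)]
        c = math.sqrt(quad_form(d, inv_cov))
        if c <= k:
            s = [0.0] * len(target)  # reset when the accumulated deviation is small
        else:
            s = [d_i * (1.0 - k / c) for d_i in d]  # shrink toward zero by k
        y = math.sqrt(quad_form(s, inv_cov))
        stats.append(y)
        if first_signal is None and y > h:
            first_signal = t
    return stats, first_signal

# Hypothetical two-variable data: five on-target points, then a sustained shift.
data = [[0.0, 0.0]] * 5 + [[1.0, 1.0]] * 10
identity = [[1.0, 0.0], [0.0, 1.0]]   # assumed inverse covariance
stats, first_signal = mcusum(data, [0.0, 0.0], identity, k=0.5, h=5.0)
print(first_signal)
```

Because the statistic accumulates deviations over time, a sustained shift that is too small to cross a single-observation limit can still build up to a signal, which is the sensitivity to small shifts mentioned above.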
A regression adjustment control chart is especially useful when a process variable of interest exhibits autocorrelation because the residuals from the regression model are typically uncorrelated. However, its parametric assumption of an error term in linear regression analysis limits its applicability for handling nonnormal process data. A number of other model-based control charts are available (Alwan & Roberts, 1988; Jiang et al., 2000; Montgomery & Mastrangelo, 1991; Runger & Willemain, 1995; Zhang, 1998). Alwan and Roberts (1988) proposed a two-step approach containing two control charts, one called a common-cause chart and the other a special-cause chart. The approach works well in detecting large process mean shifts. Montgomery and Mastrangelo (1991) proposed the EWMA center line control chart. Their approach works well if the observations are positively autocorrelated and if the process mean does not drift too rapidly. Runger and Willemain (1995) proposed the unweighted batch means (UBM) chart. This approach monitors the average value of observations and does not use a residual-based control chart. Zhang (1998) proposed an exponentially weighted moving average for stationary process (EWMAST) chart to deal with a stationary autocorrelated process. The chart works well when the process has low positive autocorrelation and small mean shifts. Jiang et al. (2000) proposed a charting technique based on autoregressive moving average statistics, the ARMA chart. All of the methods discussed above, however, deal with the occurrence of autocorrelation in univariate processes. They do not address autocorrelation in multivariate processes.
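The reason residual-based charts help with autocorrelation can be illustrated with a small simulation. The sketch below is not one of the cited methods: it assumes a first-order autoregressive, AR(1), process with coefficient 0.8, fits the coefficient by least squares, and compares the lag-1 autocorrelation of the raw series with that of the residuals. All names and parameter values are illustrative assumptions.

```python
import random

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    mean = sum(series) / len(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series[:-1], series[1:]))
    den = sum((a - mean) ** 2 for a in series)
    return num / den

def ar1_residuals(series):
    """Fit x_t = c + phi * x_{t-1} + e_t by least squares; return (phi, residuals)."""
    prev, curr = series[:-1], series[1:]
    mp, mc = sum(prev) / len(prev), sum(curr) / len(curr)
    phi = (sum((p - mp) * (c - mc) for p, c in zip(prev, curr))
           / sum((p - mp) ** 2 for p in prev))
    intercept = mc - phi * mp
    return phi, [c - (intercept + phi * p) for p, c in zip(prev, curr)]

# Simulate an assumed AR(1) process with coefficient 0.8 and unit-variance noise.
rng = random.Random(0)
true_phi = 0.8
x = [0.0]
for _ in range(4999):
    x.append(true_phi * x[-1] + rng.gauss(0.0, 1.0))

phi_hat, resid = ar1_residuals(x)
raw_r1 = lag1_autocorr(x)        # strongly positive for the raw series
resid_r1 = lag1_autocorr(resid)  # close to zero once the model is removed
print(round(phi_hat, 2), round(raw_r1, 2), round(resid_r1, 2))
```

When the fitted model captures the serial dependence, the residuals behave approximately like independent observations, so a conventional chart applied to them is less prone to the inflated false-alarm rate described earlier.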
As the limitations of SPC methodology become increasingly obvious in the face of ever more complex manufacturing processes, data mining algorithms, because of their proven capabilities to effectively analyze and manage large amounts of data, have the potential to resolve the problems that are stretching SPC to its limits. Despite their great potential, however, few efforts have been made to integrate data mining algorithms with SPC. Arkat, Niaki, and Abbasi (2007) used artificial neural networks (ANNs) to build a model and construct an MCUSUM chart using the residuals for multivariate and autoregressive processes. They compared the average run length (ARL) performance of three methods (autocorrelated charts, time-series-based residuals charts, and ANN-based residuals charts) and concluded that ANN-based residuals charts outperformed the other two charts for small mean shifts in processes. ARL is the average number of observations required for the chart to detect a change (Woodall & Montgomery, 1999). In-control ARL (ARL0) and out-of-control ARL (ARL1) are calculated under in-control and out-of-control processes, respectively. Issam and Mohamad (2008) used support vector regression (SVR) to construct the residuals-based MCUSUM control chart. They calculated the residuals from one-step-ahead prediction. That is, current observations are used as input to forecast future observations. They concluded that SVR-based residuals charts performed better than time-series-based residuals charts and ANN-based residuals charts when small mean shifts were involved. This idea is interesting, but their main conclusion was derived from limited simulation scenarios. Their study did not investigate different degrees of autocorrelation. Thus, their methods need to be justified much more thoroughly via simulation under various scenarios. Our proposed approach differs from Issam and Mohamad (2008) in how it finds the residuals.
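ARL0 and ARL1 are typically estimated by Monte Carlo simulation. The sketch below shows the idea on the simplest possible case, a univariate chart with three-sigma limits on independent standard normal data, rather than on the MCUSUM charts studied in the article. The chart type, the shift size of two standard deviations, and the function names are assumptions made for illustration.

```python
import random

def run_length(rng, shift, limit=3.0, max_steps=100_000):
    """Number of observations until |x| exceeds the limit for N(shift, 1) data.

    max_steps is a safety cap so the loop always terminates.
    """
    for t in range(1, max_steps + 1):
        if abs(rng.gauss(shift, 1.0)) > limit:
            return t
    return max_steps

def average_run_length(shift, replications=1000, seed=0):
    """Average the run length over independent simulated replications."""
    rng = random.Random(seed)
    total = sum(run_length(rng, shift) for _ in range(replications))
    return total / replications

arl0 = average_run_length(shift=0.0)  # in-control: no mean shift
arl1 = average_run_length(shift=2.0)  # out-of-control: mean shifted by 2 sigma
print(arl0, arl1)
```

For this simple chart the theoretical values are roughly 370 for ARL0 and roughly 6.3 for ARL1, so the simulated estimates should land near those figures. A large ARL0 means few false alarms, and a small ARL1 means quick detection.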
To illustrate, for a process with three variables, x1, x2, and x3, we use x1 and x2 as inputs to create a model that predicts x3. The residuals of this model are obtained for monitoring x3. We apply the same procedure to the other variables until we get the residuals from all variables. The assumption behind our proposed approach to obtain the residuals is that the degrees of autocorrelation of individual process variables are not significantly different. This is a reasonable assumption because the process variables from a single piece of equipment may have similar degrees of autocorrelation. In the present study, we conducted a simulation study under various scenarios including multiple dimensions and different degrees of autocorrelation. The focus of the present study is the development of a new process monitoring methodology that can effectively deal with complex multivariate autocorrelated processes. Specifically, we use such state-of-the-art data mining models as multivariate adaptive regression splines (MARS), ANNs, and SVR. Multivariate control charts will then be used to monitor the residuals of the output variables from these data mining models. The rest of this paper is organized as follows. In Section 2, we briefly explain the data mining models used for the model-based control charts. Section 3 illustrates the simulation study and performance comparisons among various data mining model-based control charts based on ARL measures. Section 4 presents our concluding remarks.
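The residual construction described above (each variable predicted from all the others) can be sketched as follows. Ordinary least squares is used here purely as a stand-in for the MARS, ANN, and SVR models named in the text, and the function names and simulated data are hypothetical.

```python
import random

def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols_residuals(inputs, target):
    """Residuals of target regressed on inputs (with intercept) by least squares."""
    design = [[1.0] + list(row) for row in inputs]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(p)]
    beta = solve_linear(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(design, target)]

def residuals_from_other_variables(data):
    """For each column j, predict it from all other columns; return residual rows."""
    n_vars = len(data[0])
    columns = []
    for j in range(n_vars):
        inputs = [[row[k] for k in range(n_vars) if k != j] for row in data]
        target = [row[j] for row in data]
        columns.append(ols_residuals(inputs, target))
    return [list(r) for r in zip(*columns)]

# Hypothetical three-variable process in which x3 depends on x1 and x2.
rng = random.Random(1)
data = []
for _ in range(200):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    data.append([x1, x2, 0.5 * x1 - 0.3 * x2 + rng.gauss(0, 0.1)])

residuals = residuals_from_other_variables(data)  # one residual vector per time point
```

Each row of `residuals` is a residual vector for one time point, which is the kind of input a multivariate chart such as MCUSUM would then monitor. Replacing `ols_residuals` with a nonlinear learner would bring the sketch closer to the approach the authors describe.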

English conclusion

This study proposes model-based control charts based on data mining algorithms. The proposed charts address a growing need in process control for a way to deal with correlation among variables and autocorrelation within variables without introducing unreliability that would be marked by increasing rates of false alarms. Three data mining model-based techniques and three traditional techniques were compared in this study based on a measurement of ARL performance. Given similar ARL0, the preferred techniques are those that yield the smaller ARL1. The simulation results, based on 1000 replications, indicated that data mining model-based techniques, especially ANN and SVR, performed better than traditional model-based techniques and much better than direct monitoring of a cross/auto-correlated process. The difference in performance is obvious for smaller mean shifts. In addition, data mining model-based control charts also performed better in processes with higher positive autocorrelation and in high-dimensional processes. Therefore, these results show that data mining can provide a sound and promising solution for multivariate and autocorrelated process control.