Interactive workflow mining: requirements, concepts and implementation
Article code | Publication year | Length of English article |
---|---|---|
21786 | 2006 | 23 pages (PDF) |

Publisher: Elsevier - Science Direct
Journal: Data & Knowledge Engineering, Volume 56, Issue 1, January 2006, Pages 41–63
English abstract
Many information systems log event data about executed tasks. Workflow mining is concerned with deriving a graphical workflow model from this data. Experience from applying our workflow mining system InWoLvE in experiments and practical applications has shown that workflow mining is a highly interactive process: the mining expert iteratively approaches the result by varying the parameters of the mining tool and verifying the mined models. InWoLvE, however, was not designed for intensive interactive usage. In this paper, we report on a rigorous requirements analysis and on possible solutions for supporting such interactivity. Two selected solution concepts are explained in more detail. The first is a special layout algorithm that is stable against small changes of the model, allowing the workflow mining expert to maintain a mental map of the workflow. The second is a validation procedure that helps the expert check event sequences against the (preliminary) mined model. These and other important concepts have been implemented in a prototype called ProTo.
English introduction
1.1. The role of workflow mining in workflow design

Today, business process (re-)design is usually followed by an implementation of the redesigned process in an information system (IS). Such software systems are also called process-aware information systems (p-IS). The main focus of a traditional IS is to support the execution of single tasks, for example, transaction management, context-dependent information and decision support. In contrast, one of the main goals of introducing a p-IS is to help drive the business process, that is, to steer the process participants' way of working.

One of the most critical tasks in p-IS development is the design of the related workflow. The term workflow here refers to the part of the business process that is explicitly supported by the software system. Every single member of an organization has good knowledge of how to get his or her own work done. To design the business process and the workflow, however, a more global view is needed: the knowledge of the individual process participants has to be compiled into one global process. In the literature, the effort to acquire and adapt the business process and workflow is estimated at about 60% of total p-IS development time [15], [34] and [21]. Hence, support for business process and workflow design is needed.

Over the last decade, research related to workflow management was mainly concerned with workflow modelling, simulation and implementation (e.g., workflow management systems) [3]. More recent work concentrates on workflow diagnosis and performance analysis, that is, on supporting activities that monitor and analyze workflow enactment. Studying the workflow design process from an organization sciences perspective shows the importance of this workflow lifecycle phase [18] and [19]. With respect to Fig. 1, this is a shift in focus from "a priori" process/workflow modelling and design to "a posteriori" process/workflow enactment diagnosis.

• Design. Process knowledge is elicited and compiled into one process. A process definition can be textual as well as graphical. For simulation and validation purposes, a formal process model is sometimes constructed. In the design phase, a process definition comprises manual as well as workflow activities (MAs and WAs in Fig. 1) and is usually on a high level of abstraction.
• Configuration. A part of this high-level process is further refined into a workflow definition and implemented in a p-IS (TAs, technical activities, in Fig. 1). Within a workflow management system, the workflow is implemented in the form of an explicit, usually graphical, process model. With traditional implementation techniques, the workflow is reflected within the regular code.
• Enactment. The p-IS is then put into operational use. Research on the role of IT in organizations [22], [24], [25], [26], [18] and [19] emphasizes that the impact of a business process definition on work practice is that of a norm, and the impact of a p-IS is that of a tool. Work practice is influenced, but not determined, by this norm and tool. The new work practice emerges with the enactment of the new system and becomes visible in the actions that the process participants execute to achieve the business process goals (in Fig. 1, these activities are depicted as EAs, executed activities).
• Diagnosis. In process/workflow diagnosis, the resulting work practice is analyzed. This analysis may include, for example, determining the frequency of activity occurrences or applying workflow mining.

The introduction of the p-IS changes the existing organization in a way that is not completely foreseeable (cf. [29], [31] and [33]), and the enactment of the system in operational practice is also difficult to foresee (compare the discussion above). Knowledge of these effects was therefore not part of the initial process and workflow definition. Thus, work practice usually deviates from these prescriptions.
As a consequence, the appropriate process/workflow must be learned over time and redesigned several times. To be able to redesign the process/workflow, the primary information needed is how the process participants work with the new system, and where and why they deviate from the intended process.

Workflow mining mainly supports the diagnosis and design phases. It helps to understand how people work with the new system. Common techniques for understanding work practice are, for example, interviews and observations. The results of these techniques depend strongly on the persons involved, although observations suffer less from personal influences than interviews do. In addition, both techniques are costly and time-consuming.

Supporting a business process with an information system opens a new way to analyze work practice: diagnosis based on workflow logs. A workflow log contains information about the process participants' activities executed with the p-IS. Workflow logs can be seen as a materialization of a part of the work practice. It is only a part because, by definition, manual activities never have a corresponding workflow log entry. In Fig. 1, the logged activities of the work practice are depicted with solid lines (E2, E3, E5) and manual activities with dotted lines (E1, E6). Workflow mining uses different machine learning techniques to derive, from these workflow logs, a workflow model that represents the observed work practice. This yields the following advantages:

• Objectiveness. In contrast to information gained from interviews, workflow logs are an unbiased image of the executed workflow activities.
• Reduction of analysis time and costs. In most information system implementations, log data is readily available (cf. [1]) without additional costs, and the mining process is automated to a high degree. Both aspects contribute to reducing analysis time and cost.

But there are also limitations:

• Level of abstraction. Since the process model is acquired from workflow logs, the level of abstraction of this model corresponds to the level of abstraction of the log. This may be problematic in situations where workflow logs are very technical.
• Acquisition of an incomplete process model. As mentioned, workflow logs are a partial image of work practice. Manual activities are not logged and are therefore not part of the acquired model. Hence, in our terminology, we only derive a model of the implemented workflow and not of the whole process.

The severity of both problems depends strongly on the project and on the logged information. Since workflow mining is intended as a pre-analysis of process enactment anyway, and not as a full automation of work practice analysis, the problems mentioned above may narrow the benefits but do not invalidate them.

1.2. Why is workflow mining interactive?

At DaimlerChrysler Research we have realized a workflow mining algorithm called InWoLvE [14] and [16]. InWoLvE is one of the few workflow mining algorithms that is able to deal with the "duplicate tasks" problem (see [1]). This means that InWoLvE, in contrast to most other workflow mining algorithms, is able to generate workflow models in which multiple activity nodes share the same label or name. Other workflow mining algorithms require that every activity instance "A" contained in the workflow log is related to one unique activity node of the model with the label "A". This difference, small at first glance, makes it necessary for our mining algorithm to search a complex space of possible solutions. This search is guided by a set of parameters:

• The search space explored by InWoLvE (see Fig. 3) is exponential in the number of activity instances in the workflow log. Like any other practically applicable search algorithm, InWoLvE is only able to search a small fraction of this space. How large this fraction is, and where in the search space it is located, is determined by parameters.
• At some point InWoLvE needs to decide which of the analyzed nodes of the search space it considers the best. Choosing the best result involves finding the right tradeoff between how well the model fits the observed examples and the size of the model. This is not a trivial task. A model containing one separate path for each example certainly fits the observed examples perfectly, but it is rather useless because of its size. The tradeoff between fitness and size is again expressed through parameters.
• Workflow models mined from workflow logs typically contain rare activities or links. Activities or links with a low frequency may be of less interest to the workflow miner than those with high frequencies. Therefore InWoLvE offers a feature called "noise reduction" that allows the workflow miner to set a minimum frequency, which is used to filter rare links and activities out of the mined model.

The result provided by InWoLvE therefore depends heavily on these parameters. Workflow miners typically identify the "best" workflow model by iteratively trying different parameter settings and comparing the results. Furthermore, in practical situations the workflow miner observes a continuous stream of workflow instances. There is thus a need to decide when enough instances have been observed to make a reliable prediction about the model. This, again, is best tackled by iteratively trying larger example sets until the required level of reliability has been achieved.

1.3. Organisation of this paper

This paper is organized as follows: Section 2 gives a short overview of the InWoLvE workflow mining system. In Section 3 we explain how we systematically gathered requirements for an interactive workflow mining tool, before we outline the major concepts needed for interactive workflow mining in Section 4. These concepts have been implemented in a prototype called ProTo, which is presented in Section 5. Related work is discussed in Section 6, and Section 7 gives an outlook on our future work.
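
The "noise reduction" idea from Section 1.2 (a minimum frequency below which links and activities are dropped) can be illustrated with a minimal sketch. This is not InWoLvE's implementation. It assumes that a workflow log is a list of traces, that each trace is a list of activity names, and that a link is a pair of directly consecutive activities. The function names `count_links` and `reduce_noise` and the sample log are hypothetical.

```python
from collections import Counter


def count_links(traces):
    """Count activity occurrences and directly-follows links over all traces."""
    activity_counts = Counter()
    link_counts = Counter()
    for trace in traces:
        activity_counts.update(trace)
        # Pair each activity with its direct successor in the same trace.
        link_counts.update(zip(trace, trace[1:]))
    return activity_counts, link_counts


def reduce_noise(traces, min_frequency):
    """Keep only activities and links seen at least min_frequency times."""
    activity_counts, link_counts = count_links(traces)
    activities = {a for a, n in activity_counts.items() if n >= min_frequency}
    links = {
        (a, b)
        for (a, b), n in link_counts.items()
        # A link survives only if it is frequent and both endpoints survive.
        if n >= min_frequency and a in activities and b in activities
    }
    return activities, links


# Hypothetical sample log: four workflow instances.
log = [
    ["receive", "check", "approve", "archive"],
    ["receive", "check", "approve", "archive"],
    ["receive", "check", "reject", "archive"],
    ["receive", "approve", "archive"],
]

activities, links = reduce_noise(log, min_frequency=2)
print(sorted(activities))
print(sorted(links))
```

With `min_frequency=2`, the activity "reject" and the links that occur only once are filtered out. Raising or lowering the threshold and comparing the resulting graphs mirrors, in a very reduced form, the iterative parameter variation described above.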
English conclusion
This paper describes the first analysis of the interactive aspects of the workflow mining process and the first solutions to some of the problems that became obvious under this focus. To this end we systematically gathered requirements and then selectively developed solutions. Among others, we developed a special layout algorithm that provides a structured and change-resistant layout. Furthermore, we defined a measure for the reliability of mined models based on validation, and devised several methods of supporting the user in deciding on a final result. Most of the concepts were implemented in the ProTo tool in order to prove their feasibility. First working experiences with this tool have been very promising, surpassing by far the possibilities of a combination of a non-interactive workflow mining tool and a "normal" workflow tool. A real estimation of the value of the developed concepts can only be made after putting the system to work in a realistic scenario. Feedback from users who are not developers of the tool will also provide invaluable information about its deficiencies. Future work further includes improving the InWoLvE mining algorithms, in particular the mechanisms for dealing with loops and for detecting dependencies. In parallel to ProTo, InterPoL [18] has been developed. InterPoL supports the task of comparing actual work practice with the intended business process (also called Delta-Analysis). Both approaches complement each other, since ProTo contributes to a better understanding of the actual work practice. We therefore built them on the same code base to make a future integration easy.
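
To make the idea of checking event sequences against a mined model more concrete, here is a minimal sketch. It is not the validation procedure or the reliability measure of the paper, whose details are not given in this excerpt. It assumes a strongly simplified model: a directed graph with unique activity labels (no duplicate tasks, no concurrency), a set of start activities and a set of end activities. The names `replays` and `replay_rate`, the example model and the example traces are hypothetical.

```python
def replays(trace, start, end, links):
    """Return True if the trace is a start-to-end path in the model graph."""
    if not trace or trace[0] not in start or trace[-1] not in end:
        return False
    # Every pair of consecutive activities must be a link of the model.
    return all((a, b) in links for a, b in zip(trace, trace[1:]))


def replay_rate(traces, start, end, links):
    """Fraction of traces that the model can replay (0.0 for an empty set)."""
    if not traces:
        return 0.0
    accepted = sum(replays(t, start, end, links) for t in traces)
    return accepted / len(traces)


# Hypothetical preliminary model, given as directly-follows links.
model_links = {
    ("receive", "check"),
    ("check", "approve"),
    ("approve", "archive"),
}
model_start = {"receive"}
model_end = {"archive"}

# Hypothetical event sequences that were not used for mining.
validation_traces = [
    ["receive", "check", "approve", "archive"],
    ["receive", "check", "approve", "archive"],
    ["receive", "check", "approve", "archive"],
    ["receive", "approve", "archive"],
]

rate = replay_rate(validation_traces, model_start, model_end, model_links)
print(rate)
```

Here three of the four sequences follow a path through the model, so the replay rate is 0.75. The rejected sequence points the expert to a link (from "receive" directly to "approve") that the preliminary model does not contain.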