Synergies of Operations Research and Data Mining
Article code | Publication year | Length of English article |
---|---|---|
6940 | 2010 | 10 pages (PDF) |
Publisher: Elsevier - ScienceDirect
Journal : European Journal of Operational Research, Volume 206, Issue 1, 1 October 2010, Pages 1–10
English Abstract
In this contribution we identify the synergies of Operations Research and Data Mining. Synergies can be achieved by integration of optimization techniques into Data Mining and vice versa. In particular, we define three classes of synergies and illustrate each of them by examples. The classification is based on a generic description of aims, preconditions as well as process models of Operations Research and Data Mining. It serves as a framework for the assessment of approaches at the intersection of the two procedures.
English Introduction
Increasing interest in the integration of Operations Research (OR) and Data Mining (DM) can be observed. Recently, a number of publications describing successful approaches at the intersection of the two procedures have appeared. They underline the potential for benefits from integration. However, these approaches focus either on specific application domains or on specific methods. Criteria for the classification of existing approaches are needed for a better understanding of how far research has come. In addition, the essential aims of the integration of OR and DM must be known in order to understand its long-term impact. The identification of the basic synergies of OR and DM is necessary to answer both of these questions.
The purposes of the two procedures are different. Although the definition of OR is still debated today (see e.g. [69]), its goals and methods are well known. The purpose is to optimally solve decision problems appearing in real-world applications [38]. Optimal decisions require insights into the structure of the application system under consideration. The vast body of techniques within OR provides the means for capturing this structure in terms of models. Further, OR provides the algorithmic means for deriving a decision on how to modify the application system (cf. Fig. 1). For a detailed introduction to OR, we refer to e.g. Hillier and Lieberman [46].
In contrast to OR, DM is still quite a young discipline. Though the term “Data Mining” has been used earlier (see e.g. [64] and [31]), the field as it is known today has its origins in the mid-1990s. One of the latest introductions to the current DM methodology is given by Tan et al. [94]. A unique definition of DM has not been established yet. Nevertheless, the field’s subject is clear. DM is concerned with secondary analysis of large amounts of data. Both the aspect of secondary analysis and the sheer size of the data distinguish the field from common statistics [40], [41].
Data is not collected based on experiments designed to answer a certain set of a priori known questions. Instead, DM copes with data arising as a byproduct of either operating application systems or simulations of such systems. It aims to abstract information about the application system from data (cf. Fig. 1).
Thus, both OR and DM are application focused [102], [99]. Many Data Mining approaches are within traditional OR domains like logistics [105], manufacturing [74], health care [58] or finance [57]. Further, both OR and DM are multidisciplinary. Since its origins, OR has been relying on fields such as mathematics, statistics, economics and computer science [99]. In DM, most of the current textbooks show a strong bias towards one of its founding disciplines, like database management [39], machine learning [100] or statistics [45]. Being multidisciplinary and application focused, it should be a natural step for both paradigms to gain synergies from integration. Some authors even suggest that DM is a natural extension of the OR problem solving methodology [92]. The recent publication of pure DM algorithms in the OR community [7], [27], [47], [53], [51], [62] and [98] seems to strengthen this claim.
Fig. 1 summarizes OR and DM in an application context, including the benefits both can gain from each other. The remainder of this paper explains Fig. 1 in detail. We proceed as follows. In Section 2 we establish a common foundation of OR and DM from an application perspective. Process models of the two procedures are derived. In Section 3 we illustrate how OR techniques lead to increased efficiency of DM as well as how DM contributes to the effectiveness of OR. Three ways of achieving synergies of OR and DM are identified. A number of examples from the literature are classified according to these. Section 4 concludes the paper.
English Conclusion
This contribution illustrates the synergies of Operations Research and Data Mining. As a common starting point for the synergies, the notion of an application system is defined in terms of appearance and structure. Subsequently, the process models of Operations Research and Data Mining are derived from the elements of an application system. Data Mining requires a set of system appearances to derive information about system structure. Operations Research starts from hypotheses about system structure and modifies system appearance via decision attributes. Three types of synergies of Operations Research and Data Mining are distinguished. The synergies are described in terms of the process models of the procedures and are illustrated by examples from the literature. As a first synergy, increased efficiency of Data Mining is achieved by considering the procedure as an application domain of Operations Research. Examples from the literature show increased efficiency of both preprocessing and core Data Mining operations. The second type of synergy increases the effectiveness of decision making by replacing Operations Research with Data Mining. Examples tend to be application specific. The third type of synergy results in more effective decision making through refinement of decision models. Information gained from Data Mining can be used for refinement with respect to regular system attributes, decision attributes and measurement attributes. The long-term impact of integrating Operations Research and Data Mining will appear in terms of the three synergies. The large number of approaches in the literature using optimization methods to increase Data Mining efficiency suggests that Data Mining will become a major OR application domain. In contrast, synergy two suggests that some decision problems merely provide an application domain for Data Mining.
However, being quite application specific, replacement of Operations Research by Data Mining seems unlikely to result in a new generic methodology. Synergy three implies a quite balanced application of the methods of Data Mining and Operations Research, and existing approaches from the literature are promising. However, the field is young and further research is needed to understand the full potential of refining decision models with information gained from Data Mining.
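As a rough illustration of the third synergy (refining a decision model with information gained from data), the following minimal Python sketch may help. It is not taken from the paper: the pricing scenario, the linear demand model, the sample records and the function names `fit_linear_demand` and `revenue_maximizing_price` are all assumptions made for illustration. A least-squares fit stands in for the Data Mining step, and a closed-form revenue maximization stands in for the Operations Research step.

```python
from statistics import mean


def fit_linear_demand(prices, demands):
    """Data Mining step (illustrative): least-squares fit of demand = a - b * price."""
    p_bar, d_bar = mean(prices), mean(demands)
    sxx = sum((p - p_bar) ** 2 for p in prices)
    sxy = sum((p - p_bar) * (d - d_bar) for p, d in zip(prices, demands))
    slope = sxy / sxx                 # estimated change in demand per unit of price
    a = d_bar - slope * p_bar         # estimated demand at price zero
    b = -slope                        # price sensitivity, positive if demand falls with price
    return a, b


def revenue_maximizing_price(a, b):
    """OR step (illustrative): maximize p * (a - b * p); the optimum is p = a / (2 * b)."""
    if b <= 0:
        raise ValueError("the fitted demand must decrease with price")
    return a / (2 * b)


# Hypothetical operational records (price charged, units sold); not from the paper.
observed_prices = [10.0, 15.0, 20.0, 30.0]
observed_demands = [80.0, 70.0, 60.0, 40.0]

a, b = fit_linear_demand(observed_prices, observed_demands)
best_price = revenue_maximizing_price(a, b)
print(f"fitted demand: {a:.1f} - {b:.1f} * price; suggested price: {best_price:.1f}")
```

In this toy setting the price plays the role of a decision attribute, and the fitted coefficients are the structural information that the data contributes to the decision model. Real applications would need a more careful model and validation.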