Synergies between operations research and data mining: The emerging use of multi-objective approaches
Article code | Publication year | Number of pages (English article) |
---|---|---|
6959 | 2012 | 11-page PDF |
Publisher : Elsevier - Science Direct
Journal : European Journal of Operational Research, Volume 221, Issue 3, 16 September 2012, Pages 469–479
Abstract
Operations research and data mining already have a long-established common history. Indeed, with the growing size of databases and the amount of data available, data mining has become crucial in modern science and industry. Data mining problems raise interesting challenges for several research domains, and in particular for operations research, as very large search spaces of solutions need to be explored. Hence, many operations research methods have been proposed to deal with such challenging problems. But the relationships between these two domains are not limited to these natural applications of operations research approaches. The counterpart is also important to consider, since data mining approaches have also been applied to improve operations research techniques. The aim of this article is to highlight the interplay between these two research disciplines. A particular emphasis will be placed on the emerging theme of applying multi-objective approaches in this context.
Introduction
Data mining (DM) has recently seen an explosion of interest in many fields of application, owing to the increasing amount of data available and the growing understanding that deeper analyses are far more valuable than simple summary statistics. Data mining is an inductive (not deductive) process. Its aim is to infer knowledge that is generalized from the data in the database. This process is generally not supported by classical database management systems. Data mining problems raise interesting challenges for several research domains, such as statistics, information theory, databases, machine learning, data visualization, and also for operations research (OR), since very large search spaces of solutions need to be explored. Hence, for several years, numerous research efforts using operations research methods to solve data mining problems have been reported, and several reviews of such approaches have been published [68], [91] and [92]. However, the synergy between OR and DM is not a one-way street; as described by Meisel and Mattfeld, three kinds of synergies may be achieved [82]: (1) OR can contribute to the efficiency of DM techniques; (2) DM can increase the number of problems to which OR can be applied, by means of a less rigorous model building process; (3) increased system performance can result from complementary uses of these two research domains. In this article we will use a simpler categorization of the synergies between DM and OR, which emphasizes two types of interaction, in terms of how OR and DM can contribute to each other. Hence, the first point of view (similar to the first of Meisel and Mattfeld’s synergies) is to analyze how OR can contribute to the efficiency of DM techniques. The second point of view looks at how DM can contribute to OR methods.
In our view, the second synergy of Meisel and Mattfeld, concerned with using DM techniques to better capture the structure of the underlying system, may be merged into our second type of DM/OR interaction, since it yields the same overall result of enhancing OR via deployment of DM.
Our first point of interest is to analyze how OR can be useful in the challenges faced by applications of DM; in other words, how OR approaches can help solve difficult DM problems. We will see in this review that there are several answers, using several approaches, which all tend to center on using OR to deal with one or another NP-hard optimization problem that arises in a DM task. In particular, metaheuristics have been widely used in this context, and several books dedicated to metaheuristics and data mining have been published [26] and [35]. Meanwhile, multi-objective metaheuristic approaches are increasingly being proposed in this context [61] and [59]. Thus, this article will pay particular attention to this multi-objective aspect and to the methods that have been proposed for it. The notion of a quality criterion related to the objective function, for example, will therefore be discussed.
The second fundamental question, when synergies between OR and DM are under analysis, is to understand how DM techniques can help OR methods. Even though this thread of research is less studied, significant work of this kind is emerging [64]. The objectives of such a synergy may be, for example, either to improve the quality of results obtained by OR approaches or to speed up the execution of algorithms.
The aim of this review is to provide interesting pointers to how OR and DM can enrich each other. The remainder is organized as follows: the second section presents to the OR community a short introduction to ‘knowledge discovery’, in order to help define the scope of this very general term and to make this article self-contained. It will describe the main data mining tasks and the principal classical algorithms in this field. Section 3 will then deal with the first question: how operations research can help data mining. Section 4 is dedicated to the other side of the coin: how data mining may be useful for operations research techniques. In both Sections 3 and 4, a particular emphasis will be given to multi-objective models and methods. Section 5 will conclude the review and will suggest some interesting research perspectives for both communities.
Conclusion
As we have aimed to demonstrate in this review, data mining and operations research already share a common history. The success of the interaction between them also motivates the further exploration of other connected domains, such as statistical learning, to find other opportunities for fruitful combinations [11]. Indeed, as we have seen in this survey, OR can help the data mining process and DM can help OR. We expect that the synergy between these two domains will continue to blossom, especially in the light of the surge of interest in both communities for using multi-objective approaches.
Challenging questions still arise, and we point out two such questions here. First, how can we integrate more domain knowledge while solving a problem? In an optimization context, for example, DM can help provide insight about how the best way to solve the problem might depend on instances of the problem, or it may help us learn during optimization, so that we can speed up the process by learning rules that help us avoid areas of the solution space where the solution quality is unpromising. Such integration of knowledge can be extremely helpful on larger and more complex problems. In a similar manner, to solve a data mining task, it may be useful to integrate knowledge about the domain. A good opportunity for this is given by multi-objective approaches, which enable us to combine objective functions specific to the data mining task with separate criteria that exploit domain knowledge. In both OR and DM there are many ways knowledge can be integrated, but the research community as a whole lacks an overall theory or agreed set of guidelines that could make the process less ad hoc, and could help identify further possibilities and mechanisms for this integration.
The second question is: how can we integrate preferences from an expert? This question is very important in the context of multi-objective approaches, which have the advantage of producing good compromise solutions, but have the drawback of usually producing too many solutions. This is a classical question when dealing with multi-objective approaches (even in fields other than DM; see, for example, [12], [113] and [18] for the integration of user preferences in multi-objective optimization algorithms), and some works have been proposed to address it in the DM context, such as the dominance-based rough set approach [107].
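To make the point about "too many solutions" concrete, the following is a minimal Python sketch, not taken from the article. It assumes every objective is to be minimized, filters a set of candidate solutions down to the non-dominated (Pareto) set, and then applies an expert's preference to pick one compromise. The names `dominates`, `pareto_front`, and `pick_by_weights`, the sample data, and the weighted-sum preference model are all illustrative assumptions; in particular, the weighted sum is a deliberately simple stand-in and is not the dominance-based rough set approach cited above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pick_by_weights(front: List[Point], weights: Sequence[float]) -> Point:
    """Hypothetical preference model: the expert supplies one weight per
    objective and the point with the smallest weighted sum is chosen."""
    return min(front, key=lambda p: sum(w * x for w, x in zip(weights, p)))


# Hypothetical candidates, e.g. (error rate, model size), both minimized.
candidates = [(0.10, 9.0), (0.15, 4.0), (0.30, 2.0), (0.20, 6.0), (0.35, 2.0)]

front = pareto_front(candidates)
choice = pick_by_weights(front, weights=(10.0, 1.0))

print(front)   # the compromise solutions that survive the dominance filter
print(choice)  # the single solution selected by the expert's weights
```

Even in this toy case three of the five candidates remain after the dominance filter, which is why some form of expert preference is needed to reach a single decision. A weighted sum cannot reach solutions in non-convex regions of a Pareto front, so it should be read as an illustration of the idea, not as a recommended method.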