Supporting data quality management in decision-making
Article code | Publication year | Length of English article |
---|---|---|
4392 | 2006 | 16 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Decision Support Systems, Volume 42, Issue 1, October 2006, Pages 302–317
English abstract
In the complex decision-environments that characterize e-business settings, it is important to permit decision-makers to proactively manage data quality. In this paper we propose a decision-support framework that permits decision-makers to gauge quality both in an objective (context-independent) and in a context-dependent manner. The framework is based on the information product approach and uses the Information Product Map (IPMAP). We illustrate its application in evaluating data quality using completeness—a data quality dimension that is acknowledged as important. A decision-support tool (IPView) for managing data quality that incorporates the proposed framework is also described.
English introduction
Access to information in today's decision-environments is not restricted by business-unit or organizational boundaries. Decision-making in these environments involves large data volumes and includes a wide variety of decision-tasks. Decision-makers are forced to become more responsive as they have access to data anywhere and at any time. In such environments it is important to assure decision-makers of the quality of the data they use and to allow them to gauge that quality. Traditional methods for evaluating data quality dimensions do so objectively, without considering contextual factors such as the decision-task and the decision-maker's preferences. However, a classic definition of quality is fitness for use, or the extent to which a product successfully serves the purposes of customers [11]. Quality of the data, therefore, is dependent on the purpose (task). We believe that the perceived quality of the data is influenced by the decision-task and that the same data may be viewed through two or more different quality lenses depending on the decision-maker and the decision-task it is used for. For example, an instructor trying to place orders for course textbooks may find an approximate enrollment figure sufficiently accurate to decide the number of copies to order. The same instructor will not consider this enrollment figure an accurate-enough representation of his/her class-size when requesting a room (seating capacity) for class meetings. Decision-makers must have the ability to evaluate data quality based on the decision-task that the data is used for. It is therefore important to communicate data quality information to the decision-maker and offer the decision-maker the ability to gauge the quality of the data using task-dependent interpretations. The first objective of this paper is to propose a framework that communicates data quality information to the decision-maker and allows the decision-maker to gauge data quality by incorporating task-dependent factors.
The data quality management framework proposed here is based on the notion of managing information as a product—the information product (IP) approach [25]. The IP approach treats information as a product instead of a “by-product” of information systems [4]. Although research has focused on developing and implementing information systems that deliver the “right” data, the outputs often do not meet the consumer's expectations. One reason is the mismatch in specifications between the “output product” and the user's need. Another is the poor management of the raw materials and processing involved in creating this “output”. Total Quality Management (TQM) methods were implemented to address similar problems in conventional manufacturing. Research in data quality suggests that the focus should shift from information systems to the output of such systems, the IP [4] and [25]. An IP such as a business report (inventory volume report or sales report) is the deliverable that corresponds to specific requirements of the consumers. TQM and other methods successfully employed to address quality issues in conventional manufacturing can be used to manage the processes that create the information product and implement Total Data Quality Management (TDQM) in information systems. The information product map (IPMAP) is a representation scheme for representing the manufacture of an IP based on the IP approach [20]. The second objective of this paper is to propose the use of this visual representation for communicating quality-related metadata associated with an IP, informing decision-makers about the manufacturing processes used to create an IP, and for evaluating data quality of the IP at all manufacturing stages. A decision-support tool for data quality management (IPView) that incorporates the IPMAP is described. Methods for evaluating data quality, including the one proposed here for evaluating completeness, are implemented in IPView. 
Past research has illustrated that data quality may be evaluated along several different quality dimensions [6], [18] and [23]. The three important and commonly addressed data quality dimensions are accuracy, timeliness, and completeness [1], [8], [14] and [22]. Timeliness and accuracy have been addressed in depth [2] and [4]. However, completeness, acknowledged as an important data quality dimension, is addressed to a lesser extent [3]. Ballou and Pazer examine completeness by dividing it into structural completeness and content completeness and treating the two as independent [3]. In this paper, our context-dependent examination of completeness is based on the provision of data quality metadata (including measurement of structural or context-independent completeness). Kahn et al. identify that an important aspect of managing data quality is conformance to specification [12]. Without an appropriate measurement, it is difficult to determine the level of conformance to specification. The third objective of this paper is to provide an in-depth examination of completeness as a data quality dimension and propose a method for evaluating it. The paper also illustrates how completeness can be evaluated using the IPMAP. The next section presents an overview of the relevant literature on data quality to differentiate this research and to define its scope. Section 3 describes the IPMAP, the method for evaluating completeness, and how this evaluation is done using the IPMAP. Section 4 describes the extensions to the IPMAP necessary for evaluating completeness and for implementing it as a decision-support tool (IPView) for communicating and evaluating data quality. Concluding remarks and the research directions are presented in Section 5.
English conclusion
In this paper we have attempted to justify the need to permit decision-makers to incorporate contextual considerations in the process of evaluating data quality. This important issue has not been explicitly addressed by previous data quality research. Such proactive support for data quality management is essential in the dynamic decision-environments that exist today. The quality of the data is dependent on the decision-task, and the same data may be viewed through two or more different quality lenses according to the decision-task it is used for. We further propose a comprehensive framework for evaluating completeness as a data quality dimension, dealing with both context-independent and context-dependent evaluations. Using an example drawn from a supply chain decision-environment, we have shown how completeness may be evaluated using a context-independent approach and also illustrated how this can be modified to include contextual considerations. This framework is built on the IPMAP representation, which serves as a visual tool for communicating quality metadata to users. We further developed the IPView modeling tool to support the implementation of the data quality measurement framework. It can be argued that allowing decision-makers to assign weights may introduce a large bias in the evaluation, rendering it useless. Users could assign “false” weights for personal gain. However, the assignment of weights is based on how a decision-maker perceives the importance/relevance of the data in the context of the decision-task. The decision-maker is gauging the quality for his/her own individual decision-needs. Each decision-maker has the ability to evaluate the quality based on his/her needs. Hence, we submit that the bias comes into play only if the decision-makers involved share the data quality evaluation of a specific IP in addition to sharing the IP.
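The difference between a context-independent and a weighted, context-dependent completeness score can be illustrated with a minimal Python sketch. It is not the paper's method, which is not reproduced in this excerpt. The completeness ratio (share of non-missing values per field), the weighted average, the function names, the sample supply-chain records, and the two decision-makers' weights are all assumptions made for illustration.

```python
from typing import Mapping, Optional, Sequence

# A record maps field names to values; None marks a missing value (assumption).
Record = Mapping[str, Optional[object]]


def field_completeness(records: Sequence[Record], field: str) -> float:
    """Share of records that hold a non-missing value for `field`."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)


def unweighted_completeness(records: Sequence[Record], fields: Sequence[str]) -> float:
    """Context-independent score: plain mean of per-field completeness."""
    if not fields:
        return 0.0
    return sum(field_completeness(records, f) for f in fields) / len(fields)


def weighted_completeness(records: Sequence[Record], weights: Mapping[str, float]) -> float:
    """Context-dependent score: per-field completeness averaged with
    weights that a decision-maker assigns for a specific decision-task."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * field_completeness(records, f) for f, w in weights.items()) / total


# Hypothetical shipment records from a supply chain.
records = [
    {"item_id": "A1", "quantity": 10, "ship_date": None},
    {"item_id": "A2", "quantity": None, "ship_date": None},
    {"item_id": "A3", "quantity": 5, "ship_date": "2006-10-01"},
    {"item_id": "A4", "quantity": 7, "ship_date": None},
]

objective = unweighted_completeness(records, ["item_id", "quantity", "ship_date"])

# Two hypothetical decision-tasks weighting the same data differently.
inventory_view = weighted_completeness(
    records, {"item_id": 0.5, "quantity": 0.5, "ship_date": 0.0}
)
scheduling_view = weighted_completeness(
    records, {"item_id": 0.2, "quantity": 0.2, "ship_date": 0.6}
)

print(round(objective, 3), round(inventory_view, 3), round(scheduling_view, 3))
```

In this sketch the same records score higher for the inventory task, which ignores the sparsely populated `ship_date` field, than for the scheduling task, which depends on it. Each score is tied to one decision-maker's weights, which mirrors the argument above that bias matters only when such evaluations are shared.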
An extension of this research proposes the use of an IPMAP for providing metadata about the source and processing of primary data, in order to enhance its believability and fitness-for-use. We have defined a framework for assessing data quality-in-use from dual-process theories of human cognition [19]. By applying a dual-process approach to data quality assessment, the model enables simultaneous evaluation of both objective and contextual data quality attributes. In addition to assessing the role of metadata for enhancing believability, we use our framework to investigate the role of four quality dimensions: relevance, completeness, accuracy, and timeliness. The model is the first to offer a theoretical explanation for the role of metadata in enhancing data quality. In the context of B2B exchanges, it is important to assure organizations of the quality of the information they get from other organizations. However, no generally accepted data quality standard framework exists for the B2B networked environment to help manage the quality of data exchanged across organizational boundaries. We have taken the first step towards developing a data quality standard framework for B2B electronic commerce by introducing a three-layer solution: 1) the “DQ 9000” quality management standard for the information product manufacturing process; 2) standardized data quality specification metadata expressed in XML; and 3) an external data quality certification issuer [5]. This framework is the first step towards systematically addressing the data quality problem in the networked B2B environment.