Multidimensional analysis of data quality for credit risk management: insights and challenges
|Article code||Publication year||English article||Word count|
|791||2012||24-page PDF||19,370 words|
Publisher: Elsevier - ScienceDirect
Journal: Information & Management, available online 16 November 2012
Recent studies have indicated that companies are increasingly experiencing Data Quality (DQ) related problems as more complex data are being collected. To address such problems, the literature suggests implementing a Total Data Quality Management (TDQM) program consisting of the following phases: DQ definition, measurement, analysis and improvement. Accordingly, this paper performs an empirical study by means of a questionnaire distributed to financial institutions worldwide to identify the most important DQ dimensions, assess the DQ level of credit risk databases using the identified DQ dimensions, analyze DQ issues and suggest improvement actions in a credit risk assessment context. The questionnaire is structured according to the framework of Wang and Strong and incorporates three additional DQ dimensions that were found to be important in this context (i.e., actionable, alignment and traceable). In addition, this paper develops a scorecard index to assess the DQ level of credit risk databases using the DQ dimensions that were identified as most important. Finally, the paper explores the key DQ challenges and causes of DQ problems, and suggests improvement actions. A statistical analysis of the empirical study identifies the nine most important DQ dimensions, including accuracy and security, for assessing the DQ level.
The risk of poor Data Quality (DQ) increases as larger and more complex information resources are collected and maintained. Because most modern companies tend to collect increasing amounts of data, good data management is becoming increasingly important. In response, over the previous two decades, DQ has received a lot of attention both from organizations worldwide and in the academic literature. Several studies have explored DQ challenges and have focused on DQ measurement and improvement. Fig. 1 illustrates this focus by plotting the increasing number of DQ related publications over the past ten years as reported by ISI Web of Knowledge. In practice, decision makers differentiate information from data intuitively and describe information as data that has been processed. Unless otherwise specified, this paper uses data interchangeably with information. DQ is often defined by ‘fitness for use’, which implies the relative nature of the concept. Quality data for one use may not be appropriate for other uses. For instance, the extent to which data are required to be complete for accounting tasks may not be required for sales prediction tasks. Accounting tasks typically require the availability of all cash balances, e.g., when making up a balance sheet. Conversely, sales prediction tasks will always be possible irrespective of missing cash balances. In addition to the task type, the contextuality of DQ can also be explained by the trade-offs between DQ dimensions, where one dimension can be favored over other dimensions for a specific task. Data quality dimensions are not independent but are, in fact, correlated. Moreover, if one dimension is considered more important than other dimensions for a specific application, then the choice of favoring this dimension may negatively affect other dimensions.
For example, having accurate data may require checks that could negatively affect timeliness. Conversely, having timely data may result in less accuracy, completeness or consistency. A typical situation in which timeliness can be preferred to accuracy, completeness, or consistency is given by most web applications. As time constraints are often very stringent for web data, it is possible that such data are deficient with respect to other quality dimensions. For instance, a list of courses published on a university web site must be timely though there could be accuracy or consistency errors, and some fields specifying additional course details could be missing. Conversely, when considering administrative applications, accuracy, consistency and completeness requirements are more essential than timeliness, and therefore, delays are mostly permissible. Another example can be a trade-off between completeness and consistency. A statistical data analysis typically requires a significant and representative set of data, and in this case, the approach will be to favor completeness while tolerating inconsistencies or adopting techniques to address these inconsistencies. Conversely, when publishing a list of student scores on an exam, it is crucial to check the list for consistency, which may possibly defer the publication of the complete list. Accordingly, studying the DQ in the context of a specific task is a recognized method.
Conclusion
This paper explored the important DQ dimensions and assessed the DQ level using a scorecard index. Additionally, this study identified different DQ challenges and their possible causes. In general, this study demonstrated a TDQM effort in a financial setting. In the definition phase, the identification of various DQ dimensions relevant to credit risk assessment is considered. Similarly, in the measurement phase, the DQ level in credit risk databases is assessed, and DQ issues are analyzed. The results of the analysis help to identify the problem areas and to focus improvement actions, which completes the TDQM cycle. We began with a literature overview of the different DQ dimensions and focused on the framework of Wang and Strong. Based on the results of the pilot survey, this framework was extended with three additional DQ dimensions (i.e., ‘alignment’, ‘actionability’ and ‘traceability’), which resulted in seventeen total DQ dimensions. The importance of this extended framework has been assessed by credit risk managers. These decision makers rated the DQ dimensions on a scale from 0 to 10. The results were analyzed using a Friedman test, which indicated a significant difference between the scores of the DQ dimensions. The results of the post-hoc Bonferroni–Dunn test confirmed that accuracy is the most important DQ dimension. Additionally, security, relevancy, actionability, accessibility, objectivity, timeliness, value-added and representational-consistency are found to be important DQ dimensions. The Wilcoxon rank-sum tests confirmed that the most important DQ dimensions identified are valid irrespective of the size of the financial institution and the presence of DQ teams. A Bonferroni–Dunn test was also performed on data from other sectors. The results indicate that there is a difference between financial and other sectors in assessing the importance of DQ dimensions. This result also confirmed the contextual behavior of DQ.
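The Friedman test mentioned above compares how respondents rank several items against one another. The following is a minimal sketch of the textbook Friedman statistic, not the authors' actual analysis: the function names and the ratings are hypothetical, and the sketch omits the tie correction and the p-value lookup that a statistics package would provide.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(ratings: Sequence[Sequence[float]]) -> float:
    """Friedman chi-square statistic, without tie correction.

    ratings[r][d] is the score respondent r gave to DQ dimension d.
    """
    n = len(ratings)       # number of respondents (blocks)
    k = len(ratings[0])    # number of DQ dimensions (treatments)
    rank_sums = [0.0] * k
    for row in ratings:
        for d, rank in enumerate(average_ranks(row)):
            rank_sums[d] += rank
    sum_sq = sum(s * s for s in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical 0-10 ratings: 4 respondents x 3 dimensions
# (columns: accuracy, security, timeliness). Illustrative only.
example_ratings = [
    [9, 7, 5],
    [8, 6, 4],
    [10, 8, 3],
    [9, 5, 6],
]
q = friedman_statistic(example_ratings)
```

Under the null hypothesis that all dimensions are rated alike, the statistic approximately follows a chi-square distribution with k - 1 degrees of freedom, so a large value suggests that at least one dimension is ranked differently from the rest.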
The correlation between DQ dimensions has also been assessed, and the majority of DQ dimensions were found to be correlated, which implies that DQ, although intrinsically a multidimensional concept, is often perceived from a single perspective. Second, the DQ levels in the credit risk databases are assessed using the weighted average model. The distributions of the weighted average of each DQ category were used to benchmark the DQ level as very good, good, below average and worst. The scorecard index is used to assess the DQ level and to indicate problem areas. Finally, the paper identified different DQ challenges and their causes in financial institutions. The results indicated that inconsistency and diversity of data sources are among the most recurring challenges. Likewise, manual data entry processes are found to cause the majority of the DQ problems. Although DQ problems are endangering the effectiveness of the task, only a few DQ enhancement activities are currently in place. Moreover, these activities are mostly instigated by regulatory authorities rather than by internal considerations. Surprisingly, creating a competitive advantage was not found to be an important stimulus in any DQ improving activity. It is confirmed in this paper that the majority of financial institutions are unaware of the magnitude of their DQ problems, which stops them from taking holistic measures to address these issues. This is a clear indication of the need for comprehensive DQ metrics. Although DQ is contextual and should be addressed with respect to the task at hand, DQ also has intrinsic characteristics that can be valuable to other tasks. Because credit risk assessment involves primarily analytical tasks, DQ requirements and findings of this study can be extended towards different tasks and organizations of a similar nature. The empirical validation of this conjecture is considered to be an interesting topic for future research. 
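The weighted average model and the four benchmark levels described above can be sketched as follows. This is a sketch under stated assumptions rather than the paper's scorecard: the text here does not give the weights or the cut-offs, so the use of importance weights, the quartile cut-offs, and all names and numbers below are hypothetical.

```python
from statistics import quantiles
from typing import Dict, List


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of per-dimension DQ scores.

    Assumption: each dimension's weight reflects its rated importance.
    """
    total_weight = sum(weights[d] for d in scores)
    return sum(scores[d] * weights[d] for d in scores) / total_weight


def benchmark(value: float, reference: List[float]) -> str:
    """Label a score against the distribution of reference scores.

    Assumption: quartiles of the reference distribution are the cut-offs.
    """
    q1, q2, q3 = quantiles(reference, n=4)
    if value >= q3:
        return "very good"
    if value >= q2:
        return "good"
    if value >= q1:
        return "below average"
    return "worst"


# Hypothetical data: one institution's scores and importance weights.
scores = {"accuracy": 8.0, "security": 6.0}
weights = {"accuracy": 3.0, "security": 1.0}
institution_score = weighted_average(scores, weights)

# Hypothetical weighted averages of other institutions, used as the reference.
reference_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
label = benchmark(institution_score, reference_scores)
```

Benchmarking against the distribution of observed scores, rather than against fixed thresholds, keeps the labels relative to the surveyed institutions, which matches the contextual view of DQ taken throughout the paper.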
Finally, a sensitivity analysis of the parameters (PD, LGD, EaD and M) to understand the possible impact of DQ on risk concentration, as well as the relative importance of individual DQ dimensions for these parameters, are both considered to be interesting topics for future research.