A predictive model for cerebrovascular disease using data mining
Article code | Year of publication | Pages in English article |
---|---|---|
22232 | 2011 | 8 pages (PDF) |
Publisher : Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 38, Issue 7, July 2011, Pages 8970–8977
Abstract
Cerebrovascular disease has ranked second or third among the top 10 causes of death in Taiwan and has caused about 13,000 deaths every year since 1986. Once cerebrovascular disease occurs, it leads not only to a huge cost of medical care but can even cause death. All developed countries give cerebrovascular disease prevention and treatment high priority and have invested considerable budget and human resources in long-term studies in order to reduce this heavy burden. As the pathogenesis of cerebrovascular disease is complex and variable, it is hard to make an accurate diagnosis in advance. However, from the perspective of preventive medicine, it is necessary to build a predictive model to improve the accuracy of cerebrovascular disease diagnosis. Therefore, in conjunction with the 2007 cerebrovascular disease prevention and treatment program of a regional teaching hospital in Taiwan, this study aimed to apply classification technology to construct an optimum cerebrovascular disease predictive model. From this predictive model, cerebrovascular disease classification rules were extracted and used to improve the diagnosis and prediction of cerebrovascular disease. This study acquired 493 valid samples from the prevention and treatment program and adopted three classification algorithms (decision tree, Bayesian classifier and back propagation neural network) to construct one classification model each. After the classification efficiencies (sensitivity and accuracy) were analyzed and compared, the model constructed by the decision tree was chosen as the optimum predictive model for cerebrovascular disease. In this model, the sensitivity and accuracy were 99.48% and 99.59%, respectively, and eight important factors for predicting cerebrovascular disease and 16 diagnosis classification rules were extracted. Five experienced cerebrovascular doctors assessed these rules and confirmed that they are useful under current clinical conditions.
Introduction
Cerebrovascular disease seriously threatens human health and has "four-high" features: high prevalence, high fatality rate, high disability rate and high recurrence rate. In Taiwan, cerebrovascular disease has ranked second or third among the top 10 causes of death since 1986. For example, among the top 10 causes of death in 2007, published by the Department of Health in October 2008 (shown in Table 1), cerebrovascular disease ranked third; 12,875 people died from it in that year (DOH, 2007). Cerebrovascular disease leads not only to high medical care expenditure but also to a heavy burden of mid- to long-term care costs on families and communities. In light of this, all advanced countries have made cerebrovascular disease prevention and treatment a high priority in health care and have invested considerable budget and human resources in cerebrovascular disease research and education, so as to lower its morbidity rate, fatality rate and sequelae, as well as its burden on individuals, families, communities and countries (Pogue, Ellis, Michel, & Francis, 1996).
The elderly population is vulnerable to cerebrovascular disease. As early as 1993, Taiwan was classified by the World Health Organization (WHO) as an ageing society. Thus, how to detect and prevent cerebrovascular disease as early as possible has become a critical issue for Taiwan. As the pathogenesis of cerebrovascular disease is complex and variable, doctors must rely on profound medical knowledge and rich clinical experience to predict the probability of a patient developing cerebrovascular disease. Moreover, in clinical practice, the onset of cerebrovascular disease is so abrupt and severe that it is hard to make an early and accurate diagnosis or prediction.
Hence, from the perspective of preventive medicine, it is indeed necessary to build a predictive model that helps doctors diagnose cerebrovascular disease accurately, so as to improve treatment quality and contribute to cerebrovascular disease prevention and treatment. With the great progress of information technology, computers can search large amounts of data; the technology of discovering relations and knowledge in data is called data mining. Its main purpose is to find previously unknown, valid and actionable rules or knowledge in large amounts of data. In business, data mining has been applied to extract decision-making procedures or rules from historical data stored in information systems, helping companies improve decision quality and enhance competitiveness (Berry & Linoff, 1997; Cabena et al., 1997). As data mining technology has developed, it has been applied extensively not only for commercial purposes but also in many medical tasks, for example intensive care medicine analysis (Ganzert & Guttmann, 2002), mining of time dependency patterns in clinical pathways (Lin, Chou, & Chen, 2001), breast cancer screening (Ronco, 1999) and diagnosis of ischaemic heart disease (Kukar, Kononenko, & Groselj, 1999). In Taiwan, cerebrovascular disease prevention and treatment has been listed among the annual health care priorities, and the Department of Health allocates a large budget to cerebrovascular disease studies every year. In 2007, a regional teaching hospital in central and southern Taiwan implemented a cerebrovascular disease prevention and treatment program that targeted residents of central and southern Taiwan. The program aimed to obtain data on the patients, including their physical exam results, blood test results and diagnosis data. These data were then gathered, stored and analyzed to contribute to the prevention and treatment of cerebrovascular disease.
Therefore, the purpose of this study was to work with the 2007 cerebrovascular disease prevention and treatment program of this regional teaching hospital and construct an optimum cerebrovascular disease predictive model. This study used case data from the program and employed three classification techniques from data mining (decision tree, Bayesian classifier and back propagation neural network) to construct three classification models. After the classification efficiency was analyzed and compared, the model with the highest efficiency was chosen as the optimum predictive model, and diagnosis classification rules were extracted from it. The results were then evaluated by professional cerebrovascular doctors to confirm that they are effective and accurate in diagnosing and predicting cerebrovascular disease. This study acquired 493 valid samples from the prevention and treatment program database. The patient data (physical exam results, blood test results and diagnoses) were divided into three attribute input modes, T1, T2 and T3, in order to construct the classification models and to analyze and compare their classification efficiency. After 10-fold cross-validation, the decision tree in the T1 attribute input mode was found to produce a classification model with stable classification efficiency and was therefore chosen as the optimum classification algorithm of this study. The resulting optimum cerebrovascular disease predictive model achieves a sensitivity of 99.48% and an accuracy of 99.59%. From this predictive model, 8 important factors for predicting cerebrovascular disease were selected and 16 diagnosis classification rules were extracted. The results were confirmed by five cerebrovascular doctors to be consistent with current clinical conditions and to have reference value.
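The 10-fold cross-validation and stability comparison mentioned above can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the synthetic data and the trivial majority-class learner `fit_majority` are all assumptions made for illustration, and any of the three classifiers could be plugged in through the `fit` argument.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(fit, X, y, k=10, seed=0):
    """Return the accuracy on each held-out fold.

    `fit(train_X, train_y)` must return a function that maps one sample
    to a predicted label.
    """
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        predict = fit(train_X, train_y)
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


def fit_majority(train_X, train_y):
    """Hypothetical stand-in learner: always predict the most common label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda sample: majority


# Synthetic data for illustration only (not the study's 493 samples).
X = [[i] for i in range(50)]
y = [1] * 30 + [0] * 20

scores = cross_validate(fit_majority, X, y, k=10)
mean_accuracy = statistics.mean(scores)
std_accuracy = statistics.stdev(scores)  # lower spread = more stable model
print(f"accuracy: mean={mean_accuracy:.3f}, std={std_accuracy:.3f}")
```

Comparing the per-fold standard deviation alongside the mean is one way to express "stable classification efficiency": two models with similar mean scores can still differ in how much they vary from fold to fold.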
Conclusion
This study was carried out alongside the 2007 cerebrovascular disease prevention and treatment program of a regional teaching hospital in Taiwan and used data mining technology to construct an optimum cerebrovascular disease predictive model. A total of 493 valid sample patients were acquired from the program database; the data collected on these patients for the classification study included their physical exam results, blood test results and diagnoses. The data mining technologies adopted in this study were decision tree, Bayesian classifier and back propagation neural network (BPNN). To compare them, this study used sensitivity and accuracy as indicators of the classification efficiency of the different algorithms. Overall, in T1 mode, the decision tree's sensitivity and accuracy were 95.29% and 98.01%, respectively; the Bayesian classifier's were 87.10% and 91.30%; and the BPNN's were 94.82% and 97.87%. The decision tree had classification efficiency comparable to that of the BPNN. After the standard deviations were compared, the decision tree showed the more stable classification efficiency and was therefore the best classification algorithm in this study. This result is similar to those of Lim et al. (1997) and Liu et al. (2004). Among the T1, T2 and T3 attribute input modes, T1 was the best in this study; it contained the physical exam results, blood test results and diagnosis data on the patients, 29 major attributes in total. The optimum cerebrovascular disease predictive model obtained in this study uses the decision tree as the classification algorithm and T1 as the attribute input mode, and its classification efficiency is a sensitivity of 99.48% and an accuracy of 99.59%.
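For readers unfamiliar with the two indicators, the sketch below shows how sensitivity and accuracy are conventionally computed from a binary confusion matrix. The excerpt does not give the study's exact formulas, so the standard definitions are assumed here, and the labels are made up for illustration rather than taken from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false negatives, true negatives, false positives."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Share of actual positive cases that the model identifies."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(tp, fn, tn, fp):
    """Share of all cases that the model classifies correctly."""
    total = tp + fn + tn + fp
    return (tp + tn) / total if total else 0.0


# Made-up labels for illustration: 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"accuracy    = {accuracy(tp, fn, tn, fp):.2%}")
```

Sensitivity matters in a screening setting because a missed case (a false negative) is usually more costly than a false alarm, which is presumably why it is reported next to overall accuracy.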
Eight major factors (diabetes mellitus, hypertension, myocardial infarction, cardiogenic shock, hyperlipemia, arrhythmia, ischemic heart disease and body mass index) were identified as important for accurately predicting cerebrovascular disease. In addition, 16 diagnosis classification rules were extracted from this predictive model and confirmed by five cerebrovascular doctors to be consistent with current clinical conditions and to have reference value in the diagnosis and prediction of cerebrovascular disease.