The paper presents a pilot study on the application of clinical decision support systems to an atrophic gastritis screening task. Two DSS learning strategies were tested: a standalone classifier and a classifier ensemble. The classification algorithms C4.5, CART, JRip and Naive Bayes were used as base classifiers. The classifiers were evaluated on respondent medical data collected with an inquiry form, comprising 28 attributes and 840 records. The dataset was preprocessed using simple methods of initial data analysis as well as more complex data mining methods for feature selection. The obtained results are summarized and discussed in order to establish which learning strategies are more applicable to the present dataset and should be studied in more detail in the primary research.
Cancer is a worldwide public health problem and one of the leading causes of death. Nevertheless, it is known that most cancer types are treatable. According to World Health Organization data, at least 40% of all local cancer types are treatable and can be prevented by avoiding risk factors that are common not only to cancer but also to most chronic diseases. These risk factors are well known; the most important of them are smoking, alcohol and other harmful habits, lack of physical activity, obesity (excessive weight) and various infectious agents. New medical technologies, medications, vaccines and screening systems are continuously developed and introduced, all aimed at identifying and treating cancer at its initial stages and at improving the quality and length of life of patients with cancer.
Even though gastric cancer incidence is declining globally and in many Western countries the disease is no longer considered a major health issue, cancer of the stomach remains an important healthcare problem worldwide. Gastric cancer remains the second leading cause of mortality worldwide among malignant diseases, after lung cancer, and accounts for almost 10% of cancer-related deaths. Among men, gastric cancer is the second leading cause of cancer-related deaths (after lung cancer), and among women it is the third (after breast and lung cancer) (Su et al., 2007 and WHO, 2013).
Gastric cancer is a very challenging malignancy given that it presents late, has complex pathogenetic mechanisms with multiple carcinogenic processes implicated, and is only moderately sensitive to chemotherapy and radiation. Gastric cancer presents mostly in an advanced stage and is lethal unless diagnosed early (Crew and Neugut, 2006, Miranda et al., 2009 and Varadhachary and Ajani, 2005).
The present paper discusses the possibility of applying Clinical Decision Support Systems (CDSS) to give an expert additional information on a probable disease, atrophic gastritis in our case. Section 2 gives an overview of CDSS, defines the main objectives of the system and describes the methods used in the pilot research. Section 3 presents the system evaluation results, which are summarized and discussed in Section 4.
Comparing the two learning strategies makes it possible to conclude that, for the dataset used in the experiments, the classical ensemble building techniques (bagging and boosting) did not bring a significant increase in classifier efficiency compared to standalone classifiers. On average, the results are at the same level, so the standalone classifier may be considered preferable for the present dataset, as it is simpler and provides the same efficiency. However, a CDSS with a standalone classifier will report only the class value to the medical expert. If probabilistic evaluations of the forecasted target attribute values, based on classifier precision merits, are needed, the CDSS should use the other learning strategy, which is based on a classifier ensemble.
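As an illustration of how an ensemble could supply such a probabilistic evaluation, the following minimal Python sketch converts the votes of ensemble members into normalized per-class scores, weighting each vote by the precision of the member that cast it. This is only one possible reading of "classifier precision merits", not the scheme used in the pilot research; the function name `weighted_vote_shares`, the class labels and the precision values are all hypothetical.

```python
from collections import defaultdict


def weighted_vote_shares(predictions, weights):
    """Turn ensemble member votes into normalized per-class scores.

    predictions: class label predicted by each ensemble member.
    weights: non-negative merit of each member (assumed here to be its
             precision measured on held-out data).
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        if weight < 0:
            raise ValueError("weights must be non-negative")
        totals[label] += weight
    total_weight = sum(totals.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be positive")
    # Each class receives its share of the total weight, so shares sum to 1.
    return {label: score / total_weight for label, score in totals.items()}


# Hypothetical votes of four ensemble members for a single patient.
member_votes = ["atrophy", "no_atrophy", "atrophy", "atrophy"]
# Hypothetical precision of each member, used as its voting weight.
member_precisions = [0.5, 0.25, 0.75, 0.5]

shares = weighted_vote_shares(member_votes, member_precisions)
predicted_class = max(shares, key=shares.get)
```

With these illustrative inputs the expert would see not only the predicted class but also the share of weighted votes behind it, which a standalone classifier that returns a bare class value cannot provide.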
The classification accuracy achieved did not exceed 60%, and the false negative rate remained close to, but still above, 26%. This result is unsurprising, as the initial data analysis showed that the classes are balanced within most of the primary attributes. In other words, no strong relationships between the target and descriptive attributes were found in the present dataset.
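For reference, the two measures quoted above can be computed from true and predicted labels as in the following Python sketch. It assumes the usual definitions (accuracy as the share of correct predictions, false negative rate as FN / (TP + FN) for the positive class); the function name `accuracy_and_fnr` and the toy labels are hypothetical and are not taken from the study data.

```python
def accuracy_and_fnr(y_true, y_pred, positive):
    """Return (accuracy, false negative rate) for the given positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one record is required")
    correct = true_pos = false_neg = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            correct += 1
        if actual == positive:
            if predicted == positive:
                true_pos += 1
            else:
                false_neg += 1  # a positive case the classifier missed
    accuracy = correct / len(y_true)
    positives = true_pos + false_neg
    # The rate is undefined when the data contain no positive cases.
    fnr = false_neg / positives if positives else float("nan")
    return accuracy, fnr


# Toy labels (1 = disease present, 0 = absent); purely illustrative.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 1, 1]
toy_accuracy, toy_fnr = accuracy_and_fnr(toy_true, toy_pred, positive=1)
```

In a screening task the false negative rate deserves separate attention, because a missed positive case means a patient who is not referred for further examination.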
Patient age discretization did not bring a significant improvement in classifier efficiency. The only exception was the CART algorithm, whose efficiency increased significantly when the discretized age was used. This again suggests that reducing the number of attribute values may enhance classifier efficiency.
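Age discretization of this kind can be sketched in Python as a mapping from a numeric age to one of a few interval labels. The cut points and labels below are assumptions chosen for illustration only; the intervals actually used in the pilot research are not stated here, and the function name `discretize_age` is hypothetical.

```python
from bisect import bisect_right


def discretize_age(age, cut_points, labels):
    """Map a numeric age to the label of the interval it falls into.

    cut_points must be sorted in ascending order; each cut point is the
    inclusive lower bound of the next interval.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("expected exactly one more label than cut points")
    return labels[bisect_right(cut_points, age)]


# Hypothetical cut points and labels, not the ones used in the study.
AGE_CUT_POINTS = [40, 60]
AGE_LABELS = ["under_40", "40_to_59", "60_and_over"]

ages = [23, 40, 59, 60, 81]
binned = [discretize_age(a, AGE_CUT_POINTS, AGE_LABELS) for a in ages]
```

Replacing the continuous age with three values reduces the number of candidate split points a tree learner such as CART has to consider for this attribute.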
In setting the objectives for future research, it should be noted that the application of classifier ensembles in CDSS is preferable to standalone classifiers. The present research showed that the classical ensemble building techniques (bagging and boosting) can return results with suitable classification efficiency. However, to achieve more precise classification, custom and data-specific methods for building classifier ensembles should be developed.
The application of CDSS in healthcare contributes to service quality enhancement, but it should not be forgotten that the final decision in healthcare is always made by a medical expert.