Auto claim fraud detection using Bayesian learning neural networks
|Article code||Publication year||English article||Persian translation||Word count|
|17681||2005||14-page PDF||Available to order||10455 words|
Publisher: Elsevier - Science Direct
Journal: Expert Systems with Applications, Volume 29, Issue 3, October 2005, Pages 653–666
This article explores the explicative capabilities of neural network classifiers with automatic relevance determination weight regularization, and reports the findings from applying these networks to personal injury protection automobile insurance claim fraud detection. The automatic relevance determination objective function scheme provides a way to determine which inputs are most informative to the trained neural network model. An implementation of MacKay's (1992a,b) evidence framework approach to Bayesian learning is proposed as a practical way of training such networks. The empirical evaluation is based on a data set of closed claims from accidents that occurred in Massachusetts, USA during 1993.
In recent years, the detection of fraudulent claims has blossomed into a high-priority and technology-laden problem for insurers (Viaene, 2002). Several sources speak of the increasing prevalence of insurance fraud and the sizeable proportions it has taken on (see, for example, Canadian Coalition Against Insurance Fraud, 2002, Coalition Against Insurance Fraud, 2002, Comité Européen des Assurances, 1996 and Comité Européen des Assurances, 1997). In September 2002, a special issue of the Journal of Risk and Insurance (Derrig, 2002) was devoted to insurance fraud topics. It covers a significant part of previous and current technical research directions regarding insurance (claim) fraud prevention, detection and diagnosis. More systematic electronic collection and organization of coherent insurance data, together with company-wide access to it, have stimulated data-driven initiatives aimed at analyzing and modeling the formal relations between fraud indicator combinations and claim suspiciousness, in order to upgrade fraud detection with (semi-)automatic, intelligible, accountable tools. Machine learning and artificial intelligence solutions are increasingly explored for the purpose of fraud prediction and diagnosis in the insurance domain. Still, all in all, little work has been published on the latter. Most of the state-of-the-art practice and methodology on fraud detection remains well-protected behind the thick walls of insurance companies. The reasons are legion. Viaene et al. (2002) reported on the results of a predictive performance benchmarking study. The study involved the task of learning to predict expert suspicion of personal injury protection (PIP) (no-fault) automobile insurance claim fraud. The data used consisted of closed real-life PIP claims from accidents that occurred in Massachusetts, USA during 1993, and that were previously investigated for suspicion of fraud by domain experts.
The study contrasted several instantiations of a spectrum of state-of-the-art supervised classification techniques, that is, techniques aimed at algorithmically learning to allocate data objects (input or feature vectors) to a priori defined object classes, based on a training set of data objects with known class or target labels. Among the considered techniques were neural network classifiers trained according to MacKay's (1992a) evidence framework approach to Bayesian learning. These neural networks were shown to consistently score among the best for all evaluated scenarios. Statistical modeling techniques such as logistic regression and linear and quadratic discriminant analysis are widely used for modeling and prediction purposes. However, their predetermined functional form and restrictive (often unfounded) model assumptions limit their usefulness. The role of neural networks is to provide general and efficiently scalable parameterized nonlinear mappings between a set of input variables and a set of output variables (Bishop, 1995). Neural networks have been shown to be very promising alternatives for modeling complex nonlinear relationships (see, for example, Desai et al., 1996, Lacher et al., 1995, Lee et al., 1996, Mobley et al., 2000, Piramuthu, 1999, Salchenberger et al., 1997 and Sharda and Wilson, 1996). This is especially true in situations where a lack of domain knowledge prevents any valid argument from being made for an appropriate model selection bias. Even though the modeling flexibility of neural networks makes them a very attractive alternative for pattern learning purposes, many practical problems remain when implementing them, such as: What is the impact of the initial weight choice? How should the weight decay parameter be set? How can the neural network be prevented from fitting the noise in the training data?
These and other issues are often dealt with in ad hoc ways. Nevertheless, they are crucial to the success of any neural network implementation. Another major objection to the use of neural networks for practical purposes remains their widely proclaimed lack of explanatory power. Neural networks are black boxes, so the objection goes. In this article Bayesian learning (Bishop, 1995 and Neal, 1996) is suggested as a way to deal with these issues during neural network training in a principled, rather than an ad hoc, fashion. We set out to explore and demonstrate the explicative capabilities of neural network classifiers trained using an implementation of MacKay's (1992a) evidence framework approach to Bayesian learning for optimizing an automatic relevance determination (ARD) regularized objective function (MacKay, 1994 and Neal, 1998). The ARD objective function scheme allows us to determine the relative importance of inputs to the trained model. The empirical evaluation in this article is based on the modeling work performed in the context of the baseline benchmarking study of Viaene et al. (2002). The importance of input relevance assessment needs no underlining. It is not uncommon for domain experts to ask which inputs are relatively more important. Specifically, which inputs contribute most to the detection of insurance claim fraud? This is a very reasonable question. Methods for input selection are not only capable of improving the human understanding of the problem domain, in this case the diagnosis of insurance claim fraud, but also allow for more efficient and lower-cost solutions. In addition, penalization or elimination of (partially) redundant or irrelevant inputs may also effectively counter the curse of dimensionality (Bellman, 1961). In practice, adding inputs (even relevant ones) beyond a certain point can actually lead to a reduction in the performance of a predictive model.
This is because, with the limited data available in practice, increasing the dimensionality of the input space will eventually lead to a situation where this space is so sparsely populated that it very poorly represents the true model in the data. This phenomenon has been termed the curse of dimensionality. The ultimate objective of input selection is, therefore, to select the minimum number of inputs required to capture the structure in the data. This article is organized as follows. Section 2 revisits some basic theory on multilayer neural networks for classification. Section 3 elaborates on input relevance determination. The evidence framework approach to Bayesian learning for neural network classifiers is discussed in Section 4. The theoretical exposition in these three sections is followed by an empirical evaluation. Section 5 describes the characteristics of the 1993 Massachusetts, USA PIP closed claims data that were used. Section 6 describes the setup of the empirical evaluation and reports its results. Section 7 concludes this article.
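The sparsity argument behind the curse of dimensionality, made earlier in this introduction, can be illustrated numerically. The sketch below is not taken from the article: it assumes points drawn uniformly from the unit hypercube, and the function name `mean_nearest_neighbor_distance`, the sample size of 200 and the chosen dimensions are hypothetical choices made only for illustration. It measures how far, on average, each point lies from its nearest neighbor as the number of inputs grows while the number of data points stays fixed.

```python
import numpy as np


def mean_nearest_neighbor_distance(n_points, n_dims, seed=0):
    """Average distance from each point to its nearest neighbor,
    for points drawn uniformly from the unit hypercube (an assumption
    made for illustration, not the article's data)."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_points, n_dims))
    # Pairwise squared distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    np.fill_diagonal(d2, np.inf)  # ignore each point's distance to itself
    d2 = np.maximum(d2, 0.0)      # clip tiny negative values from rounding
    return float(np.sqrt(d2.min(axis=1)).mean())


# Same number of points, increasing number of inputs.
distances = {d: mean_nearest_neighbor_distance(200, d) for d in (2, 10, 50)}
for n_dims, dist in distances.items():
    print(f"{n_dims:>3} inputs: mean nearest-neighbor distance = {dist:.3f}")
```

With the sample size held fixed, the nearest neighbor of each point moves further away as inputs are added, which is one way to see why a fixed training set represents a high-dimensional input space poorly.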
Conclusion
Understanding the semantics that underlie the output of neural network models is an important aspect of their acceptance by domain experts for routine analysis and decision making purposes. Hence, we explored the explicative capabilities of neural network classifiers with automatic relevance determination weight regularization, and reported the findings of applying these networks to personal injury protection automobile insurance claim fraud detection. The regularization scheme was aimed at providing a way to determine the relative importance of each input to the trained neural network model. We proposed to train the neural network models using MacKay's (1992a,b) evidence framework for classification, a practical Bayesian learning approach that readily incorporates automatic relevance determination. The intelligible soft input selection capabilities of the presented method were demonstrated for a claim fraud detection case based on a data set of closed claims from accidents that occurred in Massachusetts, USA during 1993. The neural network findings were compared to the predictor importance evaluation from popular logistic regression and decision tree classifiers.
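The idea of an automatic relevance determination regularized objective can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a binary cross-entropy data term and one regularization hyperparameter per input that penalizes the squared weights leaving that input. Penalties on biases and hidden-to-output weights, and the evidence framework updates of the hyperparameters, are omitted. All function names and the input names `input_a`, `input_b`, `input_c` are hypothetical.

```python
import numpy as np


def cross_entropy(y_true, p_pred, eps=1e-12):
    """Binary cross-entropy data term, summed over the training cases."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)).sum())


def ard_penalty(w_input_hidden, alphas):
    """Sum over inputs k of alpha_k / 2 times the squared weights leaving input k.

    w_input_hidden has shape (n_inputs, n_hidden); alphas has shape (n_inputs,).
    """
    per_input_energy = 0.5 * (w_input_hidden ** 2).sum(axis=1)
    return float(alphas @ per_input_energy)


def ard_objective(y_true, p_pred, w_input_hidden, alphas):
    """Data misfit plus the per-input weight penalty (simplified sketch)."""
    return cross_entropy(y_true, p_pred) + ard_penalty(w_input_hidden, alphas)


def rank_inputs(alphas, names):
    """Order inputs from most to least relevant: a small alpha means weak
    shrinkage of that input's weights, hence a more relevant input."""
    return [names[i] for i in np.argsort(alphas)]


# Hypothetical values, for illustration only.
weights = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 0.0]])
alphas = np.array([2.0, 5.0, 1.0])
names = ["input_a", "input_b", "input_c"]
print(ard_penalty(weights, alphas))
print(rank_inputs(alphas, names))
```

A large hyperparameter pushes all weights of the corresponding input toward zero, which is what makes the selection "soft": inputs are down-weighted rather than removed outright, and the fitted hyperparameters can be read as a relevance ranking.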