An expert system for detecting automobile insurance fraud using social network analysis
Article code | Publication year | Length of English article |
---|---|---|
17725 | 2011 | 14 pages (PDF) |
Publisher: Elsevier - Science Direct
Journal : Expert Systems with Applications, Volume 38, Issue 1, January 2011, Pages 1039–1052
English abstract
The article proposes an expert system for detection, and subsequent investigation, of groups of collaborating automobile insurance fraudsters. The system is described and examined in great detail, and several technical difficulties in detecting fraud are considered so that it is applicable in practice. In contrast to many other approaches, the system represents data as networks. Networks are the most natural representation of such a relational domain, allowing formulation and analysis of complex relations between entities. Fraudulent entities are found by employing a novel assessment algorithm, the Iterative Assessment Algorithm (IAA), also presented in the article. Besides the intrinsic attributes of entities, the algorithm also explores the relations between them. The prototype was evaluated and rigorously analyzed on real-world data. Results show that automobile insurance fraud can be efficiently detected with the proposed system and that appropriate data representation is vital.
English introduction
Fraud is encountered in a variety of domains. It comes in all shapes and sizes, from traditional fraud, e.g. (simple) tax cheating, to more sophisticated schemes in which entire groups of individuals collaborate in order to commit fraud. Such groups can be found in the automobile insurance domain. Here fraudsters stage traffic accidents and issue fake insurance claims to gain (unjustified) funds from their general or vehicle insurance. There are also cases where an accident never occurred and the vehicles were only placed onto the road. Still, the majority of such fraud is not planned (opportunistic fraud): an individual merely seizes the opportunity arising from the accident and issues exaggerated insurance claims or claims for past damages.
Staged accidents have several common characteristics. They occur in late hours and in non-urban areas in order to reduce the probability of witnesses. Drivers are usually younger males, and there are many passengers in the vehicles, but never children or elders. The police are always called to the scene to make the subsequent acquisition of means easier. It is also not uncommon that all of the participants have multiple (serious) injuries when there is almost no damage to the vehicles. Many other suspicious characteristics exist that are not mentioned here.
The insurance companies place the most interest in organized groups of fraudsters consisting of drivers, chiropractors, garage mechanics, lawyers, police officers, insurance workers and others. Such groups represent the majority of revenue leakage. Most analyses agree that approximately 20% of all insurance claims are in some way fraudulent (various sources). But most of these claims go unnoticed, as fraud investigation is usually done by hand by the domain expert or investigator and is only rarely computer supported. Inappropriate representation of data is also common, making the detection of groups of fraudsters extremely difficult.
An expert system approach is thus needed. Jensen (1997) has observed several technical difficulties in detecting fraud (in various domains). Most hold for (automobile) insurance fraud as well. Firstly, only a small portion of accidents or participants is fraudulent (skewed class distribution), making them extremely difficult to detect. Next, there is a severe lack of labeled data sets, as labeling is expensive and time consuming. Besides, due to the sensitivity of the domain, there is even a lack of unlabeled data sets. Any approach for detecting such fraud should thus be founded on moderate resources (data sets) in order to be applicable in practice. Fraudsters are very innovative, and new types of fraud emerge constantly. Hence, the approach must also be highly adaptable, detecting new types of fraud as soon as they are noticed. Lastly, fully autonomous detection of automobile insurance fraud is not possible in practice. The final assessment of potential fraud can only be made by the domain expert or investigator, who also determines further actions in resolving it. The approach should also support this investigation process.
Due to everything mentioned above, the set of approaches for detecting such fraud is extremely limited. We propose a novel expert system approach for detection and subsequent investigation of automobile insurance fraud. The system is focused on detection of groups of collaborating fraudsters and their connecting accidents (non-opportunistic fraud), and not on isolated fraudulent entities. The latter should be done independently for each particular entity, while in our system the entities are assessed in a way that also considers the relations between them. This is done with an appropriate representation of the domain: networks.
Networks are the most natural representation of any relational domain, allowing formulation of complex relations between entities. They also present the main advantage of our system over other approaches that use a standard flat data form. As collaborating fraudsters are usually related to each other in various ways, detection of groups of fraudsters is only possible with appropriate representation of data. Networks also provide clear visualization of the assessment, crucial for the subsequent investigation process.
The system assesses the entities using a novel Iterative Assessment Algorithm (IAA algorithm), presented in this article. No learning from an initial labeled data set is done; the system rather allows simple incorporation of domain knowledge. This makes it applicable in practice and allows detection of new types of fraud as soon as they are encountered. The system can be used with poor data sets, which is often the case in practice. To simulate realistic conditions, the discussion in the article and the evaluation with the prototype system rely only on the data and entities found in the police record of the accident (the main entities are participant, vehicle, collision and police officer).
The article gives an in-depth description, evaluation and analysis of the proposed system. We pursue the hypothesis that automobile insurance fraud can be detected with such a system and that proper data representation is vital. The main contributions of our work are: (1) a novel expert system approach for the detection of automobile insurance fraud with networks; (2) a benchmarking study, as no expert system approach for detection of groups of automobile insurance fraudsters has yet been reported (to our knowledge); (3) an algorithm for assessment of entities in a relational domain, demanding no labeled data set (IAA algorithm); and (4) a framework for detection of groups of fraudsters with networks (applicable in other relational domains).
The rest of the article is organized as follows. In Section 2 we discuss related work and emphasize weaknesses of other proposed approaches. Section 3 presents formal grounds of (social) networks. Next, in Section 4, we introduce the proposed expert system for detecting automobile insurance fraud. The prototype system was evaluated and rigorously analyzed on real-world data; the description of the data set and the obtained results are given in Section 5. Discussion of the results is conducted in Section 6, followed by the conclusion in Section 7.
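
The network representation argued for in this introduction can be illustrated with a small sketch. It is not the article's implementation: the record layout, the identifiers (`c1`, `p1`, ...) and the function names are hypothetical, and only two of the entity types named above (collision and participant) are used. The sketch links each collision to its participants and then keeps the connected components that tie together more than one collision, which is one simple way a flat list of records might be turned into candidate groups for an investigator.

```python
from collections import defaultdict

# Hypothetical police-record rows: (collision id, participant id).
RECORDS = [
    ("c1", "p1"), ("c1", "p2"),
    ("c2", "p2"), ("c2", "p3"),
    ("c3", "p4"), ("c3", "p5"),
]

def build_network(records):
    """Return an undirected adjacency map linking collisions and participants."""
    adjacency = defaultdict(set)
    for collision, participant in records:
        adjacency[collision].add(participant)
        adjacency[participant].add(collision)
    return adjacency

def connected_components(adjacency):
    """Return the connected components of the network as a list of node sets."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first traversal from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

def multi_collision_components(records):
    """Keep only components that tie together more than one collision."""
    collisions = {collision for collision, _ in records}
    components = connected_components(build_network(records))
    return [c for c in components if len(c & collisions) > 1]

groups = multi_collision_components(RECORDS)
```

In this toy data, participant `p2` appears in both `c1` and `c2`, so those two collisions and their participants form one component, while `c3` stays isolated and is filtered out. A flat, row-per-claim view would not expose that link directly.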
English conclusion
The article proposes a novel expert system approach for detection of groups of automobile insurance fraudsters with networks. Empirical evaluation shows that such fraud can be efficiently detected using the proposed approach and, in particular, that proper representation of data is vital. For the system to be applicable in practice, no labeled data set is used. The system instead allows the incorporation of the domain expert's knowledge, and it can thus be adapted to new types of fraud as soon as they are noticed. The approach can help the domain investigator detect and investigate fraud much faster and more efficiently. Moreover, the employed framework is easy to implement and is also applicable to detection (of fraud) in other relational domains. Future research will focus on further analyses of different assessment models for the IAA algorithm, including nonlinear models. Moreover, the IAA will be altered into an unsupervised algorithm that learns the factors of the model during the actual assessment. The factors would thus not have to be specified by the domain expert. Applications of the system in other domains will also be investigated.
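
The idea of assessing an entity from both its intrinsic attributes and its relations can be sketched as follows. This is not the IAA, whose update rule is not given in this excerpt. It is a generic iterative scheme under stated assumptions: each entity has an intrinsic suspicion score in [0, 1], and a single hypothetical factor `weight`, standing in for an expert-specified model factor, linearly blends that score with the mean score of the entity's neighbours.

```python
def iterative_scores(adjacency, intrinsic, weight=0.5, rounds=20):
    """Blend each entity's intrinsic score with the mean score of its neighbours.

    `weight` is a hypothetical expert-chosen factor: 0 ignores relations,
    values closer to 1 let neighbouring entities dominate the assessment.
    """
    scores = dict(intrinsic)
    for _ in range(rounds):
        updated = {}
        for node, own in intrinsic.items():
            neighbours = adjacency.get(node, ())
            if neighbours:
                relational = sum(scores[n] for n in neighbours) / len(neighbours)
                updated[node] = (1 - weight) * own + weight * relational
            else:
                updated[node] = own  # isolated entity: intrinsic score only
        scores = updated
    return scores

# Toy network: p2 shares collisions with two suspicious participants,
# while p4 has the same intrinsic score but no suspicious relations.
ADJACENCY = {
    "p1": {"c1"}, "p2": {"c1", "c2"}, "p3": {"c2"},
    "c1": {"p1", "p2"}, "c2": {"p2", "p3"},
    "p4": {"c3"}, "c3": {"p4"},
}
INTRINSIC = {
    "p1": 0.9, "p2": 0.2, "p3": 0.9, "c1": 0.5, "c2": 0.5,
    "p4": 0.2, "c3": 0.5,
}
scores = iterative_scores(ADJACENCY, INTRINSIC)
```

Here `p2` and `p4` start with the same intrinsic score, yet `p2` ends up with the higher assessment because its collisions connect it to `p1` and `p3`. Replacing the linear blend with another function of the neighbours' scores would be one way to explore the nonlinear models mentioned as future work.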