Off-the-peg and bespoke classifiers for fraud detection
|Article code||Year published||English article||Persian translation||Word count|
|17702||2008||12-page PDF||Order||6770 words|
Publisher: Elsevier - Science Direct
Journal: Computational Statistics & Data Analysis, Volume 52, Issue 9, 15 May 2008, Pages 4521–4532
Detecting fraudulent plastic card transactions is an important and challenging problem. The challenges arise from a number of factors, including the sheer volume of transactions financial institutions have to process, the asynchronous and heterogeneous nature of transactions, and the adaptive behaviour of fraudsters. In this fraud detection problem, the performance of a supervised two-class classification approach is compared with the performance of an unsupervised one-class classification approach. Attention is focussed primarily on one-class classification approaches. Useful representations of transaction records, and ways of combining different one-class classifiers, are described. Assessment of performance for such problems is complicated by the need for timely decision making. Performance assessment measures are discussed, and the performance of a number of one- and two-class classification methods is assessed using two large, real-world personal banking data sets.
Retail banks have successfully deployed plastic cards to provide a broad range of products and banking opportunities to consumers. This provision has been accompanied by the very serious problem of plastic card fraud. Our interest in this paper is in detecting fraudulent transactions. Loosely, fraud implies unauthorised and illegal use of the credit facilities of a legitimate account. It is estimated that losses attributed to such fraud in the UK in 2004 amounted to £505 million. The most recent figures suggest a slight decrease for the first half of the year 2006, down by 5% from previous years, to £209 million (APACS, 2006). It is speculated that this drop can be attributed to the introduction of PIN authentication by the chip and PIN scheme, launched on 14th February 2006. The financial burden of fraud is borne by lenders, merchants and legitimate customers. Lenders and merchants expend significant resources to secure their systems and procedures in an attempt to limit their liability for such costs.
Tackling fraud in the context of plastic card finance is a daunting problem. The effort can be divided into fraud prevention, which attempts to block fraudulent transactions at source, and fraud detection, where successful fraudulent transactions are subsequently identified. For prevention purposes, financial institutions challenge all transactions with rule-based filters and methods based on neural networks, e.g. FALCON. For fraud detection, it is obviously desirable to detect fraud as rapidly as possible. In both cases, prevention and detection, the problem is magnified by a number of domain constraints and characteristics. First, care must be taken not to prevent, or incorrectly implicate, too many legitimate transactions, since customer irritation is to be avoided. Second, most banks process vast numbers of transactions, of which only a small fraction is fraudulent, often less than 0.1%. Many approaches to fraud problems have been considered.
Fawcett and Provost (2002) and Kou et al. (2004) provide a general discussion. Statistical views are explored by Bolton and Hand (2002), while data mining perspectives are discussed by Phua et al. (submitted for publication). In the context of plastic card fraud, various authors (e.g. Brause et al. (1999) and Maes et al. (2002)) have approached fraud detection as a classification problem. To use such approaches, a number of problems have to be solved. First, extensive processing of the irregularly timed transaction sequences is required to convert the data into a representation suitable for classification algorithms. Furthermore, fraudsters change tactics, so supervised approaches may only find existing tactics. The marked heterogeneity of transaction behaviour within and between accounts, along with the highly imbalanced classes, might indicate that supervised classification is not the most natural or appropriate tool for this problem.
In this paper, we consider plastic card fraud detection approaches based on one-class classification (e.g., Tax (2001) and Juszczak (2006)). The idea is to monitor each account separately and, using suitable descriptors, attempt to identify and flag transactions that are abnormal. Abnormality will be defined in comparison to a model related to the estimated probability density of the account’s legitimate transaction descriptors. We propose a two-stage process. In the estimation stage, we obtain a model of the distribution of the legitimate class. In the subsequent operational stage, transactions are designated legitimate or abnormal. In other words, we first estimate the distribution of the normal class and then assign each transaction either to this class or to some “other” class. One application of one-class classifiers is outlier detection (Barnett and Lewis, 1994; Hodge and Austin, 2004; Ferdousi and Maeda, 2006).
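The two-stage idea described above can be illustrated with a minimal sketch. It is not the classifier used in the paper: the density model (independent Gaussians per descriptor), the descriptors (amount and hour of day), the class name `AccountModel` and the sample history are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


class AccountModel:
    """Hypothetical one-class model for a single account.

    Assumption: each descriptor is modelled by an independent Gaussian.
    """

    def __init__(self, legit_descriptors):
        # Estimation stage: fit the mean and spread of every descriptor
        # using the account's (presumed) legitimate transactions only.
        columns = list(zip(*legit_descriptors))
        self.means = [mean(c) for c in columns]
        # A small floor avoids division by zero for constant descriptors.
        self.stds = [max(pstdev(c), 1e-6) for c in columns]

    def log_density(self, x):
        # Log of the estimated probability density at descriptor vector x.
        total = 0.0
        for value, mu, sigma in zip(x, self.means, self.stds):
            z = (value - mu) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
        return total

    def is_abnormal(self, x, threshold):
        # Operational stage: flag transactions in low-density regions.
        return self.log_density(x) < threshold


# Illustrative history of (amount, hour of day) for one account.
history = [(20.0, 12), (35.0, 13), (25.0, 18), (30.0, 12), (22.0, 19)]
model = AccountModel(history)

# Assumed rule: the lowest density seen in the history is the alert threshold.
threshold = min(model.log_density(t) for t in history)

typical_flag = model.is_abnormal((27.0, 14), threshold)
unusual_flag = model.is_abnormal((900.0, 3), threshold)
```

Because the model is fitted per account and only on legitimate behaviour, a transaction is judged against that account's own history rather than against known fraud patterns.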
Important steps in the approach include judicious selection of descriptors, estimation of the distribution of the normal class, and the specification of an alert threshold on the contours of estimated probability density, such that any transaction lying outside the chosen contour is regarded as abnormal. A particularly attractive feature of one-class classification methods is that they have the capacity to respond to new types of fraud, since no explicit model is constructed for fraudulent behaviour; the models are based on legitimate behaviour only. One potential problem with this approach is that not all fraudulent transactions are abnormalities. The model of normal behaviour will in fact be based on a mixture of legitimate transactions and fraudulent transactions that appear legitimate. However, since the prevalence of fraudulent transactions is generally very low, we expect this to have negligible impact. A second, complementary potential problem is that not all abnormalities will be fraudulent transactions. The proportion of flagged abnormalities which are in fact legitimate will be the false positive rate, and this will be a component of the performance measure.
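As a rough illustration of the alert threshold and of the proportion of flagged transactions that are in fact legitimate, the sketch below works on precomputed density scores. The function names, the quantile rule for choosing the threshold, and all scores and labels are hypothetical; the paper's own threshold choice and performance measure may differ.

```python
def alert_threshold(legit_scores, alert_fraction):
    """Return a score below which roughly `alert_fraction` of the
    legitimate training scores fall (assumed quantile rule)."""
    ordered = sorted(legit_scores)
    k = int(alert_fraction * len(ordered))
    return ordered[min(k, len(ordered) - 1)]


def flagged_legitimate_proportion(scores, is_fraud, threshold):
    """Proportion of flagged transactions that are in fact legitimate."""
    flagged = [fraud for s, fraud in zip(scores, is_fraud) if s < threshold]
    if not flagged:
        return 0.0
    return sum(1 for fraud in flagged if not fraud) / len(flagged)


# Made-up log-density scores of legitimate training transactions.
train_scores = [-2.0, -2.5, -3.0, -3.5, -4.0, -4.5, -5.0, -5.5, -6.0, -9.0]
threshold = alert_threshold(train_scores, alert_fraction=0.1)

# Made-up scores and fraud labels for later transactions.
new_scores = [-3.0, -7.0, -12.0, -4.0, -8.5]
new_labels = [False, False, True, False, True]
proportion = flagged_legitimate_proportion(new_scores, new_labels, threshold)
```

Lowering `alert_fraction` moves the threshold contour outwards, so fewer legitimate transactions are flagged at the cost of missing frauds that look only mildly unusual.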
Conclusion
In this paper we have explored the utility of unsupervised one-class classification methods for the challenging problem of plastic card fraud detection. Such methods are eminently well suited to the large-scale, imbalanced, asynchronous and evolving nature of plastic card transaction fraud detection. We found that this approach is to some extent capable of timely detection of fraudulent transactions. We compared the performance of one- and two-class classifiers in the fraud detection problem. Although the two-class classifiers perform well on frauds and legitimate transactions that are similar to those represented in a training set, they fail to identify new types of fraud. One consequence of this is that the performance of the two-class approach may deteriorate more rapidly than that of the one-class approach, and we found evidence supporting this suggestion.