Credit card fraud costs consumers and the financial industry billions of dollars annually. However, there is a dearth of published literature on credit card fraud detection. In this study we employ a transaction aggregation strategy to detect credit card fraud. We aggregate transactions to capture consumer buying behavior prior to each transaction and use these aggregations for model estimation to identify fraudulent transactions. The data for both transaction aggregation and model estimation are real-life credit card transactions from an international credit card operation.
Credit card fraud costs consumers and the financial industry billions of dollars annually (Chan et al., 1999 and Chen et al., 2006). The reported loss due to online fraud in 2008 was $4 billion, an increase of 11% over the 2007 loss of $3.6 billion (Leggatt, 2008). Credit card transactions, as a share of payment systems, have been growing worldwide, and credit card fraud has grown along with them. Moreover, credit card fraud funds other criminal activities, including terrorism, in ways that may be difficult to track and prevent (Everett, 2009). As fraud detection has evolved, perpetrators have become more sophisticated in tandem (Bolton & Hand, 2002). The audit of credit card fraud is therefore an ongoing ‘arms race’ that requires constant innovation on the part of card issuers.
However, there are various obstacles to this innovation. For example, academics have difficulty obtaining credit card transaction datasets, which limits academic research, and proprietary detection techniques are rarely discussed in public lest fraudsters learn of them and evade detection (Leonard, 1993). The resulting dearth of published literature on credit card fraud detection makes the exchange of ideas, and hence innovation in fraud detection, difficult (Bolton & Hand, 2002). A further difficulty in analyzing credit card fraud is that perpetrators do not usually carry out a single fraudulent transaction; they usually produce a group of them. Analyzing fraud one transaction at a time therefore misses the clustering that is inherent in credit card fraud. We argue that analyzing aggregated behavior is essential to improving credit card fraud detection rates.
In this study we employ a transaction aggregation strategy (Krivko, 2010 and Whitrow et al., 2009) to create variables for the estimation of a logistic regression model that attempts to detect (and thus help control and prosecute) credit card fraud. We demonstrate the efficacy of aggregating transactions to capture consumer buying behavior prior to each transaction. The underlying rationale is that the buying behavior behind fraudulent transactions differs from that behind legitimate ones. This difference is captured in the aggregated transactions and can be used to identify fraudulent transactions. We use real-life transaction data from an international credit card operation, first to aggregate transactions and then to estimate the model.
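To make the idea of transaction aggregation concrete, the following is a minimal sketch, not the procedure used in this study. It assumes each transaction carries a card identifier, a timestamp, and an amount, and it summarizes the same card's activity over a look-back window before each transaction. The field names (`card_id`, `timestamp`, `amount`), the derived attributes (`prior_count`, `prior_amount`), the one-day window, and the sample records are all hypothetical choices made for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def aggregate_prior_behavior(transactions, window=timedelta(days=1)):
    """Add per-card aggregates of the preceding `window` to each transaction.

    `transactions` is an iterable of dicts with the hypothetical keys
    'card_id', 'timestamp' (datetime), and 'amount' (float).
    Returns new dicts, in chronological order, with two derived attributes.
    """
    history = defaultdict(deque)  # card_id -> recent (timestamp, amount) pairs
    enriched = []
    for txn in sorted(transactions, key=lambda t: t["timestamp"]):
        recent = history[txn["card_id"]]
        # Drop earlier transactions that fall outside the look-back window.
        while recent and recent[0][0] < txn["timestamp"] - window:
            recent.popleft()
        enriched.append({
            **txn,
            "prior_count": len(recent),  # transactions seen in the window
            "prior_amount": sum(amount for _, amount in recent),  # amount spent
        })
        # Only now record the current transaction, so it never counts itself.
        recent.append((txn["timestamp"], txn["amount"]))
    return enriched


# Made-up records for illustration only.
sample = [
    {"card_id": "A", "timestamp": datetime(2009, 1, 1, 9, 0), "amount": 20.0},
    {"card_id": "A", "timestamp": datetime(2009, 1, 1, 9, 5), "amount": 300.0},
    {"card_id": "A", "timestamp": datetime(2009, 1, 1, 9, 7), "amount": 450.0},
    {"card_id": "B", "timestamp": datetime(2009, 1, 1, 9, 6), "amount": 15.0},
    {"card_id": "A", "timestamp": datetime(2009, 1, 3, 12, 0), "amount": 25.0},
]
features = aggregate_prior_behavior(sample)
```

In this toy data, the third transaction on card A arrives with two prior transactions in the window, which is the kind of burst that a one-by-one view would not expose. Derived attributes of this sort could then serve as explanatory variables in a logistic regression.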
A general definition of ‘fraud’ may be somewhat elusive, as new methods of fraud appear with regularity. For the purpose of this study, fraudulent transactions are specifically defined by the institutional auditors as those that caused an unlawful transfer of funds from the bank sponsoring the credit cards. These transactions were observed to be fraudulent ex post.
The remainder of this paper is organized as follows. In the next section we discuss credit card fraud and detection methods. In section 3, we discuss the dataset source, the primary attributes, and the creation of derived attributes from these primary attributes. In section 4, we discuss the estimation method and present a standard logit model. Thereafter, we present the results, discussion, and conclusions of our study.
This study demonstrated the usefulness of creating derived attributes and of judicious data partitioning for fraud detection. Practitioners and researchers may employ the idea of transaction aggregation to create suitable derived attributes. The key to creating effective derived attributes is the choice of primary attributes and the length of the periods over which transactions are aggregated. The choice of derived attributes may depend on changes in perpetrators’ fraudulent behavior over time [5], and future research may investigate this issue. Aggregation periods are another interesting area for future research; researchers may employ different aggregation periods for different attributes.
This study also highlights the importance of partitioning the dataset to focus on transactions where fraud is more likely to occur. This is critical because it is impossible to verify all transactions, given the constraints of cost and time. Credit card transaction datasets are not only large but also have lopsided class sizes, with legitimate transactions far outnumbering fraudulent transactions. Hence, merchants have to focus on the transaction types, product types, and/or merchant types where fraud is more likely to occur.
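One simple way to decide where to focus is to rank segments by their observed fraud rate. The following is a rough sketch under stated assumptions, not the partitioning used in this study: it assumes labeled historical transactions with a segment field and a fraud flag, and the names `merchant_type` and `is_fraud`, as well as the records themselves, are hypothetical.

```python
from collections import Counter


def fraud_rate_by_segment(transactions, key="merchant_type"):
    """Return (segment, fraud_rate) pairs, highest fraud rate first.

    `transactions` is an iterable of dicts holding the segment under `key`
    and a boolean 'is_fraud' label (both names are illustrative).
    """
    totals, frauds = Counter(), Counter()
    for txn in transactions:
        totals[txn[key]] += 1
        frauds[txn[key]] += int(txn["is_fraud"])
    rates = {segment: frauds[segment] / totals[segment] for segment in totals}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Made-up labeled records for illustration only.
labeled = [
    {"merchant_type": "electronics", "is_fraud": True},
    {"merchant_type": "electronics", "is_fraud": False},
    {"merchant_type": "grocery", "is_fraud": False},
    {"merchant_type": "grocery", "is_fraud": False},
    {"merchant_type": "grocery", "is_fraud": False},
    {"merchant_type": "travel", "is_fraud": True},
    {"merchant_type": "travel", "is_fraud": False},
    {"merchant_type": "travel", "is_fraud": False},
    {"merchant_type": "travel", "is_fraud": False},
]
ranking = fraud_rate_by_segment(labeled)
```

With real data, segments with very few transactions would give unstable rates, so a minimum segment size would likely be needed before acting on such a ranking.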