ISI article no. 22288

Title
Characterization and detection of taxpayers with false invoices using data mining techniques
Article code: 22288
Year of publication: 2013
Length of the English article: 10 pages (PDF)
Source

Publisher: Elsevier - ScienceDirect

Journal: Expert Systems with Applications, Volume 40, Issue 5, April 2013, Pages 1427–1436

Keywords
False invoices, Fraud detection, Data mining, Clustering, Prediction

Abstract

In this paper we give evidence that it is possible to characterize and detect potential users of false invoices in a given year, based on the information in their tax payments, their historical performance and their characteristics, using different types of data mining techniques. First, clustering algorithms such as SOM and neural gas are used to identify groups of similar behaviour in the universe of taxpayers. Then decision trees, neural networks and Bayesian networks are used to identify the variables related to fraud or no-fraud conduct, detect patterns of associated behaviour and establish to what extent cases of fraud and no fraud can be detected with the available information. This will help identify patterns of fraud and generate knowledge that can be used in the audit work performed by the Tax Administration of Chile (Servicio de Impuestos Internos, SII) to detect this type of tax crime.
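As an illustration of the clustering step mentioned in the abstract, the following is a minimal sketch of the standard neural gas update rule: every prototype moves toward each sample, with a step that shrinks exponentially with the prototype's distance rank. It is not the authors' implementation. The function names, the decay schedule and its start and end values are assumptions chosen for illustration, and the excerpt does not state which taxpayer variables were used as inputs.

```python
import math
import random


def neural_gas_step(prototypes, sample, eps, lam):
    """Apply one neural gas update in place and return the prototypes.

    Every prototype moves toward the sample; the step shrinks
    exponentially with the prototype's distance rank (0 = closest).
    """
    # Rank prototypes by squared Euclidean distance to the sample.
    order = sorted(
        range(len(prototypes)),
        key=lambda i: sum((p - s) ** 2 for p, s in zip(prototypes[i], sample)),
    )
    for rank, i in enumerate(order):
        step = eps * math.exp(-rank / lam)
        prototypes[i] = [p + step * (s - p) for p, s in zip(prototypes[i], sample)]
    return prototypes


def train_neural_gas(data, n_units, n_iter=2000, seed=0):
    """Fit prototypes with exponentially decaying eps and lam.

    The schedule values below are assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    # Initialise prototypes on randomly chosen data points.
    prototypes = [list(x) for x in rng.sample(data, n_units)]
    eps_start, eps_end = 0.5, 0.01
    lam_start, lam_end = max(n_units / 2.0, 1.0), 0.01
    for t in range(n_iter):
        frac = t / max(n_iter - 1, 1)
        eps = eps_start * (eps_end / eps_start) ** frac
        lam = lam_start * (lam_end / lam_start) ** frac
        neural_gas_step(prototypes, rng.choice(data), eps, lam)
    return prototypes
```

Each taxpayer would then be assigned to its nearest prototype, and the resulting groups inspected for shared behaviour.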

Introduction

Tax evasion and tax fraud have been a constant concern for tax administrations, especially in developing countries (Davia, Coggins, Wideman, & Kastantin, 2000). While it is true that taxes are not the only source of government funding, the fact is that they send a very important signal about the commitment and effectiveness with which the State can carry out its functions and restrict access to other sources of income. In particular, the value added tax (VAT), implemented in over 130 countries at different stages of economic development, has become a key component of tax revenues, raising about 25% of the world’s tax revenue (Harrison & Krelove, 2005). In the case of Chile, taxes provide about 75% of the resources from which the State pays its expenses and investments each year; in 2011 they yielded a total of USD 41.6 billion. VAT represents 45% of that total, amounting to USD 18.7 billion and generating over 400 million invoices a year, of which 56% are issued in paper format and 44% in electronic format (Bergman, 2010).

The phenomenon of false invoices in respect of VAT is explained by the mechanics of determining the tax payable. When a company receives a false invoice, it simulates a purchase that never existed, thus fraudulently increasing its tax credit and decreasing its VAT payment. Income tax payments also decrease because of the higher declared costs and expenditures. The falsity of the document may be material, if the physical elements that make up the invoice have been adulterated, or ideological, when the materiality of the document is not altered but the operations recorded in it are adulterated or nonexistent. The latter is more complex and difficult to detect because it involves fictitious transactions, which require an audit that examines the sales books and corrections or cross-references the information with suppliers.
Moreover, these cases are more expensive for the SII, as they require more time dedicated to collecting and testing evidence, which is harder to find. The best known cases of material falsification are the physical adulteration of the document; the use of hanging invoices, in which an invoice is counterfeited to impersonate a taxpayer of good behavior; and the use of a double set of tax invoices, in which two invoices share the same number but one of them is fictitious and for a higher amount. In ideological falsification, invoices are used to register a nonexistent operation or to adulterate the contents of an existing operation.

According to a method used by the SII to estimate VAT evasion (Schneider & Enste, 2000) resulting from false invoices and other credit enlargements, applied in the period 1996–2004, evasion by false invoices has historically represented between 15% and 25% of total VAT evasion, increasing significantly in years of economic crisis. Thus, in the crisis of 1998–1999 the participation rate increased to 38%, reaching an amount close to USD 1 billion. This is relevant because a global economic crisis recently hit Chile in late 2008 and mid-2009, raising the rate of VAT evasion to 23%, an evaded amount of USD 4 billion. This situation also requires that resources be invested in well-focused monitoring, detecting the taxpayers with greater compliance risk without bothering, or wasting time and resources on, those who do comply (Slemrod & Yitzhaki, 2002). For this, data mining techniques offer great potential, because they allow the extraction and generation of knowledge from large volumes of data to detect and characterize fraudulent behavior and failure to pay tax, ultimately improving the use of resources (Fayyad, Piatetsky-Shapiro, & Smyth, 1996).

This paper is organized as follows. Section 2 describes how artificial intelligence techniques have facilitated the detection of tax evasion in tax administrations. Section 3 describes the data mining techniques applied. Section 4 describes the type of information used and the main results obtained in the characterization and detection of fraud in the issuance of invoices, and Section 5 presents the main conclusions and future lines of research.
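The VAT mechanics described earlier in this introduction can be made concrete with a small worked example. The sketch below is a simplification for illustration only: it treats VAT payable as the VAT charged on sales minus the VAT credit on purchases, ignores credit carried over from prior periods and the income tax effect, and uses a 19% rate as an assumed parameter. The function names and the figures in the usage note are hypothetical.

```python
def vat_payable(net_sales, net_purchases, vat_rate=0.19):
    """VAT due: VAT charged on sales (debit) minus VAT paid on purchases (credit)."""
    debit = net_sales * vat_rate
    credit = net_purchases * vat_rate
    return debit - credit


def evaded_vat(net_sales, real_purchases, false_invoice_net, vat_rate=0.19):
    """VAT evaded by declaring a false purchase invoice of the given net amount."""
    honest = vat_payable(net_sales, real_purchases, vat_rate)
    # The false invoice inflates declared purchases and hence the tax credit.
    declared = vat_payable(net_sales, real_purchases + false_invoice_net, vat_rate)
    return honest - declared
```

For instance, with net sales of 1000, real purchases of 600 and a false invoice for 100, the honest VAT due is about 76 and the declared VAT due is about 57, so `evaded_vat(1000, 600, 100)` is about 19: the evaded amount equals the rate applied to the fictitious purchase.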

Conclusion

The clustering and classification methods used to characterize the taxpayers who have good or bad fiscal behavior associated with the use of false invoices show that it is possible to identify some distinguishing characteristics between one group and another, which accord with what happens in reality. In particular, the neural gas method identified some relevant variables that differentiate between good and bad behavior, not necessarily associated with the use and sale of false invoices. Kohonen’s method, however, did not provide any behavioral patterns associated with the use of false invoices; rather, it detected clusters related to taxation, in which the variables with the largest number of zeros and variance proved to have more impact in shaping the groups. The decision tree method, applied to cases in which the fraud or no-fraud outcome was known, was a good technique for detecting variables that could distinguish between fraud and no fraud. This is because, when analyzing the distribution of variables in each group, fraud cases tend to take more extreme values of the variables, so it was possible to distinguish ranges in which there is a chance of having or not having fraud. Moreover, the results were consistent with those observed in reality, according to the expert view. Thus, in the case of micro and small enterprises, the variables that allowed distinguishing between fraud and no fraud were mainly related to the percentage of tax credits generated by invoices with respect to total credit, and to previous audits with negative results. To the extent that the taxpayer was audited several times in the past and nothing was found, they are more likely to have no fraud in the future. On the other hand, where their credit is more associated with items other than invoices (fixed or other assets), they are less likely to use invoices to support their claims.
Other important variables were the number of invoices issued during the year and its relation to the invoices stamped in the past two years, the total amount of VAT declared during the year, the ratio of average tax credit balances, positive prior audits, and historical crimes and irregularities associated with invoices. In the medium and large companies, the most important variables were the amount of surplus credit accumulated in prior periods, the percentage of credit associated with invoices, the relationship between costs and assets, the level of informality in their accounting and the age of the company, as well as the number of irregularities associated with previous invoices, the amount of orders to pay and historical failures to answer notifications. Among the detection models, those that performed best were the multilayer perceptron neural network models, which for the purposes of the study had an input layer containing the explanatory variables, an intermediate processing layer and an output layer. In the case of micro and small businesses the percentage of correctly detected fraud cases was 92%, while in the case of medium and large enterprises this percentage was 89%. Given this result, and considering that in practice only a rather small group of companies can be monitored in a year, we recommend combining the results obtained with neural networks, decision trees and Bayesian networks, in order to select for audit those taxpayers that are labeled as fraud by the neural network and have the highest odds of committing fraud under the Bayesian network and decision tree. According to studies made by the SII, about 20% of taxpayers use false invoices to evade taxes.
No information disaggregated by type of taxpayer exists, but considering the percentage of cases classified as fraud and no fraud by the neural network models, it is estimated that the universe of potential users of false invoices is 116,000 micro and small enterprises and 4768 medium and large enterprises, generating a potential collection of USD 210 million. Finally, to test the detection model developed, and consistent with the previous point, its implementation in field activities is vital to determine the level of accuracy in the classification of the taxpayers selected in the sample. We recommend implementing a pilot program that targets the two economic sectors studied, which will be conclusive in terms of the real effectiveness of the model. For future work, we recommend generating new historical behavioral variables related to specific audits and their level of coverage, considering other methods for preprocessing and variable selection as well as cross-validation techniques, and exploring and implementing other data mining techniques to improve the detection of cases with and without fraud.
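The combined selection rule recommended in this conclusion (audit taxpayers flagged as fraud by the neural network, prioritising those with the highest fraud odds under the Bayesian network and decision tree) could be sketched as follows. This is a hypothetical illustration, not the procedure used by the SII: the `Scored` record, its field names and the choice to rank by the simple average of the two model probabilities are all assumptions, since the text does not say how the two scores are combined.

```python
from dataclasses import dataclass


@dataclass
class Scored:
    taxpayer_id: str
    nn_fraud: bool      # label from the neural network (True = fraud)
    bn_prob: float      # fraud probability from the Bayesian network
    tree_prob: float    # fraud probability from the decision tree


def select_for_audit(scored, capacity):
    """Return up to `capacity` taxpayer ids to audit.

    Keeps only taxpayers the neural network labels as fraud, then ranks
    them by the average of the other two models' probabilities (assumed rule).
    """
    flagged = [s for s in scored if s.nn_fraud]
    flagged.sort(key=lambda s: (s.bn_prob + s.tree_prob) / 2.0, reverse=True)
    return [s.taxpayer_id for s in flagged[:capacity]]
```

Here `capacity` stands for the small number of companies that can actually be monitored in a year, so the ranking decides which flagged taxpayers are audited first.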