Testing time series data compatibility for benchmarking
|Article code||Year of publication||English article|
|1346||2011||13-page PDF|
Publisher: Elsevier - ScienceDirect
Journal : International Journal of Forecasting, Available online 15 December 2011
Compatibility testing determines whether two series, say a sub-annual and an annual series, both of which are subject to sampling errors, can be considered suitable for benchmarking. We derive statistical tests and discuss the issues with their implementation. The results are illustrated using the artificial series from Denton (1971) and two empirical examples. A practical way of implementing the tests is also presented.
Benchmarking is done when two time series measuring the same variable at different frequencies are both subject to measurement errors. Benchmarking combines information from the two series to get a better estimate of the target variable. For example, monthly surveys of business revenue should add up to an equivalent annual survey. However, they will not typically add up, due to measurement errors. Benchmarking would produce a monthly series which will add up to the annual figures. For further details, see Cholette and Dagum (1994) for benchmarking time series using autocorrelated sampling errors, and Dagum and Cholette (2006) for a recent book on benchmarking and temporal distribution methods. Before benchmarking two time series, it might be advisable to test whether these series should be benchmarked in the first place. Two series are compatible if they are jointly likely to have been observed, considering the distributional assumptions made on the error process. Guerrero (1990) proposes a test for compatibility in the context of temporal disaggregation, which can be cast as a form of benchmarking. The test determines whether the observed aggregates are likely to have been observed under a given ARIMA model, where, in practice, the model is determined from observed sub-annual data. In turn, this series is used to temporally disaggregate the original series. The test is suitable when choosing one sub-annual series among many candidates, but this is not what we are considering in this paper. We are also looking for methods which can be implemented in a computer package, such as Statistics Canada’s in-house SAS® Proc Benchmarking (Latendresse, Djona, & Fortier, 2007), with as few inputs from the user as possible. Furthermore, our primary concern is the presence of measurement errors which cause discrepancies and therefore generate the need for benchmarking. The tests proposed here are designed for quality testing when we already know which series are involved. 
These tests are based on the observed discrepancies between the two series and determine whether the observed discrepancies are as expected. If the answer is no, then benchmarking should not be applied. A list of possible conceptual, operational and methodological differences to investigate is provided by Brisebois and Yung (2007). Section 2 of the paper states the model and hypotheses to be tested. Section 3 defines some statistics based on the observed discrepancies. Section 4 derives the test statistics assuming a full knowledge of the covariance matrices of the errors. It includes a discussion of benchmarking using signal extraction methods, which requires additional knowledge of the data generating process. In practice, the covariance matrices of the errors may not be fully known or available, in which case some additional assumptions are required. We discuss the issues facing the implementation of the tests when only the sub-annual series and the benchmarks are provided. Section 5 presents two simpler statistics which do not depend on the covariance structure, although their distributions do. The two simpler statistics can be used as indices of quality. Section 6 provides two real examples, one of which is believed likely to be compatible and the other not. These two examples illustrate how the compatibility tests can be used in practice. Section 7 presents a summary of the issues discussed and concludes the paper.
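The observed discrepancies on which the tests are based can be illustrated with a small sketch. It assumes the discrepancy for each year is the benchmark minus the sum of that year's sub-annual values; the sign convention, the function name `annual_discrepancies`, and the numbers are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def annual_discrepancies(s, a, periods_per_year=4):
    """Return a - J s, where J sums the sub-annual values within each year.

    s: sub-annual series (e.g. quarterly), covering whole years only.
    a: annual benchmarks, one per year.
    The sign convention (benchmark minus aggregated series) is an assumption.
    """
    s = np.asarray(s, dtype=float)
    a = np.asarray(a, dtype=float)
    n_years = a.size
    if s.size != n_years * periods_per_year:
        raise ValueError("s must contain periods_per_year values for each benchmark")
    # Aggregation matrix: row m has ones over the periods of year m, zeros elsewhere.
    J = np.kron(np.eye(n_years), np.ones((1, periods_per_year)))
    return a - J @ s

# Hypothetical quarterly values and annual benchmarks for two years.
s = [24.0, 26.0, 25.0, 27.0, 25.0, 27.0, 26.0, 28.0]
a = [104.0, 103.0]
d = annual_discrepancies(s, a)
```

A compatibility test would then ask whether discrepancies of this size are plausible given the assumed covariance of the measurement errors; that step needs the error model from the paper and is not reproduced here.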
Conclusion
In this paper, we have addressed the problem of testing for compatibility prior to benchmarking. From the model, we derived $T_0$ as the reference test, which requires a knowledge of the scale and structure of the measurement error. We then showed that simpler statistics can be used. The one based on the first difference, $T_2$, behaves better than $T_1$ in general, but if $\rho$ is small, $T_1$ is better. Their distributions, however, require a gamma approximation. We propose a test for any significant bias in order to correct for possible coverage errors. Assuming that the sub-annual series is free of coverage bias, we propose a method which consists of creating an interval on the expected scale of the error, outside of which a decision can be made at a given confidence level. This allows for practical decision making in a context where only an approximate knowledge of $\rho$ is available. This method can be applied to $T_0$, $T_1$ and $T_2$. In the absence of a knowledge of the parameter $\rho$, we provide a simple cut-off point of the size of the coefficient of variation in the sub-annual series, above which the series are likely to be compatible or under which they are unlikely to be. We have derived an alternative test based on testing whether the regression coefficient of the annual totals of the sub-annual series on the annual series is significantly different from 1. This test also provides another estimate of the variance of the measurement errors in the sub-annual series.
A few issues needed to be addressed in order to implement a test for data compatibility when only the series $s$ and the benchmarks $a$ are provided, and there is minimal information regarding the measurement errors.
1. Is it reasonable to assume $V(\epsilon) = 0$ for compatibility testing when $a$ are subject to measurement errors?
2. Is testing for significant bias, $H_2$, a valid alternative?
3. Is it more appropriate to assume a constant variance or a constant coefficient of variation (cv) for $e_t$?
4. Is it necessary to input a theoretical vector of seasonal proportions for testing under the constant cv assumption?
5. Is it sufficient to provide $\hat{\sigma}^2(\Omega, 0.95)$ or $\hat{c}^2(\Omega, 0.95)$ as a guideline for deciding whether the data should be benchmarked or not?
6. What kind of parametric models should be used for $\Omega$ and $V(e)$?
In addition, we have explored the idea of creating an index when more than one series is to be benchmarked. The issues with the proposed test statistics were discussed and illustrated. Finally, the use of model-based signal extraction methods in testing for compatibility was discussed briefly. The suggested tests are best used as indicators of potential problems, in order to ensure that end users are “protected from themselves”. Whenever possible, the decision of whether to benchmark should be made carefully as part of a program review, and the knowledge of the variance matrices should come into play if applicable; however, there are situations in which the proposed tests may be the best that can be done.
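The regression-based alternative mentioned in the conclusion can be sketched as follows. This is a minimal illustration, not the paper's derivation: it assumes an ordinary least-squares regression through the origin with independent, equal-variance errors, and the function name `slope_vs_one` and the data are hypothetical.

```python
import numpy as np

def slope_vs_one(annual_totals, benchmarks):
    """Regress annual totals of the sub-annual series on the annual series
    (no intercept, an assumption) and return (slope, t statistic, df)
    for the null hypothesis that the slope equals 1."""
    y = np.asarray(annual_totals, dtype=float)
    x = np.asarray(benchmarks, dtype=float)
    if y.shape != x.shape or y.size < 2:
        raise ValueError("need two equal-length series with at least 2 years")
    sxx = np.sum(x * x)
    b = np.sum(x * y) / sxx          # least-squares slope through the origin
    resid = y - b * x
    df = y.size - 1                  # one estimated parameter
    s2 = np.sum(resid ** 2) / df     # residual variance estimate
    se = np.sqrt(s2 / sxx)           # standard error of the slope
    t = (b - 1.0) / se               # undefined if the fit is exact (se == 0)
    return b, t, df

# Hypothetical annual benchmarks and annual totals of a sub-annual series.
benchmarks = [100.0, 110.0, 120.0, 130.0, 140.0]
totals = [101.0, 109.0, 121.0, 129.0, 141.0]
b, t, df = slope_vs_one(totals, benchmarks)
```

Under these assumptions, `t` would be compared with a Student t critical value on `df` degrees of freedom; a large absolute value suggests the slope differs from 1. The residual variance computed inside the function is one rough estimate of the error variance, in the spirit of the remark that the test provides another estimate of the measurement-error variance.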