Time Trade-Off and Ranking Exercises Are Sensitive to Different Dimensions of EQ-5D Health States
Article code | Publication year | Length of English article |
---|---|---|
25387 | 2012 | 6 pages (PDF) |
Publisher: Elsevier - ScienceDirect
Journal : Value in Health, Volume 15, Issue 5, July–August 2012, Pages 777–782
Abstract
Background: One method suggested for creating preference-based tariffs for the new five-level EuroQol five-dimensional (EQ-5D) questionnaire is combining time trade-off (TTO) and discrete choice exercises. Rank values from previous valuation studies can be used as proxies for discrete choice exercises. This study examined rank and TTO data to determine whether the methods differ in sensitivity to the EQ-5D questionnaire dimensions.
Methods: We used rank and TTO data for 42 EQ-5D questionnaire health states from the US and UK three-level EQ-5D questionnaire valuation studies, extracting overall ranks of mean TTO and mean rank values, ranging from 1 (best) to 42 (worst). We identified pairs of health states with reversed overall ranks between TTO and rank data and regressed overall rank differences (TTO rank minus ranking rank) on dummy variables representing impairments on EQ-5D questionnaire dimensions.
Results: Forty-three (US) and 41 (UK) health state pairs displayed reversed rank order. Both US and UK regression models on rank differences indicated that respondents rated impairments involving pain/discomfort and anxiety/depression as relatively worse in TTO than in the ranking task.
Discussion: Different dimension sensitivity between TTO and ranking methods suggests that combining them could lead to inconsistent tariffs. Differences could be caused by respondents focusing on the first presented dimensions when ranking states or could be related to the longest endurable time for health states involving pain/discomfort or anxiety/depression. The observed differences call into question which method best represents the preferences of the population.
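The rank-based comparison described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the four health states and their mean values are made up for the example, ties between means are not handled, and health states are written as five-digit strings (one digit per dimension, in the order mobility, self-care, usual activities, pain/discomfort, anxiety/depression).

```python
from itertools import combinations


def overall_ranks(mean_values, higher_is_better):
    """Rank states from 1 (best) to n (worst) by their mean value (ties not handled)."""
    ordered = sorted(mean_values, key=mean_values.get, reverse=higher_is_better)
    return {state: position for position, state in enumerate(ordered, start=1)}


def reversed_pairs(tto_rank, ranking_rank):
    """Return the state pairs whose order differs between the two methods."""
    pairs = []
    for a, b in combinations(sorted(tto_rank), 2):
        # Opposite signs mean the pair is ordered one way in TTO, the other way in ranking.
        if (tto_rank[a] - tto_rank[b]) * (ranking_rank[a] - ranking_rank[b]) < 0:
            pairs.append((a, b))
    return pairs


def rank_differences(tto_rank, ranking_rank):
    """Overall rank difference per state: TTO rank minus ranking rank."""
    return {state: tto_rank[state] - ranking_rank[state] for state in tto_rank}


# Made-up illustrative means, NOT values from the US or UK valuation studies.
mean_tto = {"21111": 0.85, "11121": 0.80, "22211": 0.60, "11122": 0.50}   # higher = better
mean_rank = {"21111": 2.0, "11121": 2.4, "11122": 5.0, "22211": 6.0}      # lower = better

tto_rank = overall_ranks(mean_tto, higher_is_better=True)
ranking_rank = overall_ranks(mean_rank, higher_is_better=False)

reversed_found = reversed_pairs(tto_rank, ranking_rank)
differences = rank_differences(tto_rank, ranking_rank)

print(reversed_found)  # [('11122', '22211')]
print(differences)     # positive = relatively worse in TTO than in ranking
```

In this toy example the state impaired on the last two dimensions ("11122") gets a positive rank difference, which is the direction of the effect reported in the Results; the size of the effect here carries no meaning.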
Introduction
The EuroQol five-dimensional (EQ-5D) questionnaire is a health-related quality-of-life instrument that is used extensively to estimate quality-adjusted life-years in health economic evaluations [1] and [2]. It uses five dimensions of health: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression. Until recently, these dimensions could be rated at three levels, corresponding to “no problems,” “some problems,” and “extreme problems.” The EuroQol group, however, has released official versions of the new five-level EQ-5D questionnaire, an expansion of the previous three-level EQ-5D questionnaire, in which each of the instrument's five dimensions can be rated at five levels. This expansion has increased the number of possible health states from 243 to 3125. Value sets for the three-level EQ-5D (EQ-5D-3L) questionnaire have typically been made by using mean preference values from the general population, elicited by using the time trade-off (TTO) method, in which health states are valued in relation to perfect health and death. As TTO interviews are costly and time-consuming, EQ-5D-3L questionnaire valuation studies have typically elicited TTO values for subsets (17–46) of the 243 possible health states, and values for all 243 states have been estimated by using regression modeling. Differences in the number of health states directly valued have been determined to contribute to observed differences between national EQ-5D questionnaire value sets [3], and two recent valuation studies directly valuing greater numbers of health states have revealed more complex interactions than those identified by previous valuation studies [4] and [5]. The increase in the number of possible health states that accompanies the new five-level EQ-5D questionnaire makes the conventional method economically unfeasible and has led to a renewed focus on alternative valuation methods.
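The state counts quoted above follow directly from the instrument's structure: five dimensions with three levels each give 3^5 = 243 combinations, and five levels each give 5^5 = 3125. A minimal sketch that enumerates them, assuming states are labeled as five-digit strings with one digit per dimension (the function name is hypothetical):

```python
from itertools import product


def health_states(levels_per_dimension, n_dimensions=5):
    """Enumerate every combination of levels across the dimensions as a digit string."""
    levels = [str(level) for level in range(1, levels_per_dimension + 1)]
    return ["".join(combo) for combo in product(levels, repeat=n_dimensions)]


three_level = health_states(3)
five_level = health_states(5)

print(len(three_level))  # 243
print(len(five_level))   # 3125
print(three_level[0], three_level[-1])  # 11111 33333
```

With 17 to 46 directly valued states, a three-level study covers roughly 7% to 19% of the 243 states; the same number of states would cover under 1.5% of the 3125 five-level states.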
One suggested method for creating value sets for the five-level EQ-5D questionnaire is combining TTO values for a limited set of health states with discrete choice exercise (DCE) data for a larger sample of health states [6]. In DCE, respondents are asked to state which of two alternative health states they think is best, a simpler and less costly method than TTO valuation. Combining TTO and DCE data in this manner requires that the two methods measure the same construct in similar manners. Preliminary analyses of results from a set of experimental valuation exercises performed in Norway, however, led us to wonder whether ranking and TTO exercises may make respondents sensitive to different EQ-5D questionnaire dimensions; we observed unexpected and stable mean rank transpositions between TTO and ranking of health state pairs involving impairments on different EQ-5D questionnaire dimensions. Both in our valuation experiments and in previous TTO-based EQ-5D-3L questionnaire valuation studies, respondents have been familiarized with health state valuation before TTO elicitation by having them rank the presented health states from subjective best to worst and then value the ranked states on a visual analogue scale (VAS). Lacking a gold standard for comparison, several researchers have proposed the use of ranking as a benchmark for comparison when considering the validity of other valuation methods such as TTO [7] and [8]. Furthermore, the ranking task can be considered as an ordered set of discrete choices. As such, existing rank data may be used as imperfect proxies for DCE data. Since ranking tasks were used in previous TTO-based EQ-5D questionnaire valuation studies, an abundance of data is available that enables comparison of ranking and TTO values. 
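The idea that a ranking is an ordered set of discrete choices can be made concrete: a complete ranking of n states implies n(n-1)/2 pairwise choices, in each of which the higher-ranked state is the one "chosen." The sketch below is an illustration only; the function name is hypothetical, ties are not handled, and whether respondents would actually make the same choices in separate DCE tasks is an empirical question the code cannot answer.

```python
from itertools import combinations


def implied_choices(ranking):
    """Expand a ranking (listed best to worst) into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first element of every pair
    is always the state that was ranked higher.
    """
    return list(combinations(ranking, 2))


# Hypothetical ranking of three health states, best first.
example_ranking = ["21111", "11121", "22211"]
choices = implied_choices(example_ranking)

print(choices)
# [('21111', '11121'), ('21111', '22211'), ('11121', '22211')]
```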
The aim of this study was to examine data from previous valuation studies to determine whether respondents were sensitive to different dimensions of health state impairment when ranking health states than when performing TTO valuation.
Conclusion
Respondents in the two valuation studies appear to have been more sensitive to impairments on the dimensions of mobility, self-care, and usual activities when ranking health states and more sensitive to impairments involving pain/discomfort and anxiety/depression in the TTO valuation. In both data sets, there were many examples of health state pairs in which respondents ranked one state as better than the other but were willing to trade away more life time to avoid the better-ranked state than the worse-ranked one. In nearly all these pairs, one of the states was predominantly impaired on the first three dimensions of the EQ-5D questionnaire, while the other state was dominated by impairments on the last two dimensions. The regression models indicate that the health state pairs in which the overall rank order was reversed between ranking and TTO represent extreme examples of a general trend.
This apparent inconsistency in how the two methods value the different dimensions of health constitutes a breach of procedural invariance [17] and [18] and casts doubt on the two methods' ability to capture the same underlying construct: the population's preferences for EQ-5D questionnaire health states. Because the analyses were performed on ranks of means, interpreting the magnitude of the observed differences is not a straightforward task. Spearman's rho between the US and UK data within each method, however, was higher than that between the two methods within the same country, indicating that the difference between the methods is greater than the differences in mean preferences between the two countries.
There is a large body of literature documenting how different valuation methods yield different results [7], [19], [20] and [21]. For instance, it has often been found that the standard gamble method yields higher values than TTO, which yields higher values than the VAS. Such comparisons, however, have typically focused on differences in absolute levels of values or on the functional form of values from different instruments. Our finding of dimension-specific inconsistencies between the ranking and TTO methods underscores the importance of investigating potential disagreements on the level of health dimensions when comparing valuation methods.
In addition to the analyses presented, we performed several tests that did not add any new information. Analyses on data from the Danish TTO-based valuation study replicated the findings from the UK and US data. Switching the transformation methods for health states considered worse than death in the TTO task resulted in slight changes to the magnitudes of the regression coefficients, but the overall picture remained unchanged. In the two valuation studies from which our data were acquired, respondents were asked to value the same set of health states by using a thermometer-like VAS. VAS valuation was performed right after the ranking task, with the health states still in their ranked order, meaning that the VAS values were highly dependent on the previous ranking. We performed analyses substituting the overall mean rankings with overall rankings of mean VAS scores, with nearly identical results. Because of the intertwined nature of the ranking and VAS valuations, this does not necessarily mean that VAS valuation without prior ranking would induce sensitivity to the same EQ-5D questionnaire dimensions that ranking apparently does. Details and results for these analyses are available from the corresponding author.
This study had four primary limitations. First, because ranking can be conceptualized as a set of discrete choices, we have used rank order as a proxy for DCE data. Empirical testing, however, would be required to determine whether respondents perform consecutive DCE tasks in the same manner as they perform ranking. As ranking involves simultaneous comparison of more items than does DCE, there may be differences in how the two tasks are processed by respondents. Second, we analyzed the rank order of mean values from TTO and mean rank data. This procedure is insensitive to the relative distance between health states. Third, the analyses were performed on data collected for the EQ-5D-3L questionnaire. The degree to which this is generalizable to the five-level version is unknown, though some studies have been performed indicating that there is considerable agreement between the three- and five-level versions [22], [23], [24] and [25]. Finally, multiple linear regressions on rank data are not ideal. Because our objective was to identify and illustrate differences between the two valuation methods in terms of how respondents value the five EQ-5D questionnaire dimensions, we considered multiple linear regressions to be the simplest and most accessible method sufficient for our purpose.
This study does not inform us as to why respondents weight the EQ-5D questionnaire dimensions in an apparently inconsistent manner when performing ranking and TTO valuation. We offer two hypotheses that are congruent with our findings but must warn that they are speculative at present. First, the respondents could be more influenced by the ordering of presentation for the EQ-5D questionnaire dimensions when ranking than when performing TTO; the ordering was fixed in both studies, and it is conceivable that respondents performing ranking of health states start by comparing the first dimension, then go on to the second, and so on, increasing the relative impact of impairments on the first dimensions. If this is the case, the observed differences should disappear if the ordering of the five dimensions were randomized. Alternatively, it could be that time framing is more salient in TTO and that respondents find the thought of long-term impairments involving pain/discomfort or anxiety/depression unbearable. This interpretation is compatible with previous findings about nonlinear time preferences and the concept of maximum endurable time in TTO [26], [27], [28] and [29]. In a recent cognitive debriefing study of EQ-5D questionnaire valuation by Bailey et al. [30], respondents frequently ignored the 10-year duration in the VAS and ranking tasks but were sensitive to time when performing TTO.
The observed inconsistency between TTO and ranking raises two important issues. First, which of the two valuation methods should be considered the best or most correct? Second, if these findings can be generalized to TTO and DCE for the five-level version of the EQ-5D questionnaire, combining the two methods for the purpose of tariff generation may prove troublesome. If we understand the methods required for such hybrid tariff generation correctly, and DCE behaves as ranking does in our study, combining data from a small set of health states valued with TTO with data from a large set of states valued with DCE could result in inconsistent tariffs: health states in proximity to the states selected for TTO valuation would be more influenced by pain/discomfort and anxiety/depression, while states further from the selected TTO states would be more influenced by mobility, self-care, and usual activities.
In conclusion, experimental studies on DCE and TTO need to be performed to determine whether the two methods can be combined for the purpose of tariff generation without creating inconsistent tariffs.
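The regression of rank differences on dimension dummies discussed above can be sketched as follows. This is a toy illustration under explicit assumptions, not the authors' model. The dummy coding (one indicator per dimension for any level above 1) is assumed, since the excerpt does not state the exact specification. The data are synthetic, generated from assumed coefficients so that the fit can be checked. Ordinary least squares is solved with a small pure-Python routine to keep the example self-contained.

```python
def impairment_dummies(state):
    """One 0/1 dummy per dimension: 1 if the level is above 1 (assumed coding)."""
    return [1.0 if level != "1" else 0.0 for level in state]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic example: states and coefficients are assumptions, NOT study results.
states = ["11111", "21111", "12111", "11211", "11121", "11112", "22211", "11122"]
# Order: intercept, mobility, self-care, usual activities, pain/discomfort, anxiety/depression
assumed = [0.0, -2.0, -1.0, -1.5, 3.0, 2.5]

X = [[1.0] + impairment_dummies(s) for s in states]
# Rank difference (TTO rank minus ranking rank) generated exactly from the assumed model.
y = [sum(b * x for b, x in zip(assumed, row)) for row in X]

coefficients = ols(X, y)
print([round(c, 6) for c in coefficients])
```

A positive coefficient means that impairment on that dimension pushes a state toward a worse overall rank in TTO than in ranking, which is the pattern the study reports for pain/discomfort and anxiety/depression. As the limitations paragraph notes, linear regression on rank data is a convenience rather than an ideal model.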