Background
Patient-reported outcome measures (PROMs) are increasingly used in empirical studies and clinical practice to assess, among other things, the effectiveness of healthcare interventions and to monitor patients’ quality of life (QOL) over time. However, longitudinal measurements of patient-reported outcomes can be affected by response shift. Schwartz & Sprangers [1, 2] defined response shift as a change in the meaning of one’s self-evaluation of a target construct as a result of a change in one’s internal standards of measurement (recalibration), a change in the importance of component domains constituting the target construct (reprioritization), or a redefinition of the target construct (reconceptualization). When response shift occurs, PROM results will not have the same meaning at different points in time. Consequently, the change in observed PROM scores will not accurately reflect change in the construct that the PROM intends to measure (i.e., change in the “target construct”). This difference between “observed” and “target” change has been operationalized as a response shift effect. Although response shift may invalidate comparisons of PROM results over time when it is not taken into account, it is also viewed as meaningful information that provides insight into how patients accommodate health changes [1, 2].
Over the past decades, many studies have been conducted to investigate occurrences and magnitudes of response shift effects. Two systematic reviews on the detection of response shift have been published, encompassing 101 [3] and 107 [4] studies, with 51 overlapping. Previous systematic reviews on the magnitudes of response shift effects include meta-analyses of: (a) studies published up to 2005 that examined response shift based on the then-test (one of the most commonly used response shift detection methods, in which respondents are asked at posttest to retrospectively re-evaluate their baseline functioning; comparing scores on the baseline measure and the then-test provides an indication of response shift and its magnitude) [5], (b) studies published up to 2016 on people with an orthopedic condition [6], and (c) studies published up to 2018 on people with cancer [7]. Hence, these reviews were restricted in their outcome (i.e., detection or magnitude of response shift), method, or target population. We previously conducted a descriptive systematic review of all quantitative studies published before 2021 that investigated response shift using PROMs, and described distributions of response shift detection and, where possible, effect sizes [8]. The results of this descriptive review provided insight into how the number and magnitude of response shift effects vary across diverse studies employing different response shift methods, populations, research designs, and PROMs.
The next important aim is to gain insight into why response shift results differ, by investigating relevant variables that are associated with, and explain variability in, response shift results. The current meta-regression analysis builds on the previous descriptive review and aims to identify response shift methods, population characteristics, design characteristics, PROMs, and study quality characteristics that explain variability in (1) the detection of response shift effects and (2) the magnitude of response shift effects (i.e., standardized mean differences). The latter objective was investigated only for studies using the then-test and/or structural equation modeling (SEM) methods, which enable the calculation of a standardized effect size for the difference between means (Cohen’s d). This work is part of the Response Shift – in Sync Working Group initiative that aims to synthesize the work on response shift to date [9–13].
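For concreteness, one common operationalization of such a then-test effect size can be written as follows (a sketch in our own notation, assuming a pooled standard deviation as the standardizer; individual studies may standardize differently):

$$
d = \frac{\bar{X}_{\mathrm{pre}} - \bar{X}_{\mathrm{then}}}{SD_{\mathrm{pooled}}},
\qquad
SD_{\mathrm{pooled}} = \sqrt{\frac{SD_{\mathrm{pre}}^{2} + SD_{\mathrm{then}}^{2}}{2}},
$$

where $\bar{X}_{\mathrm{pre}}$ is the mean baseline score and $\bar{X}_{\mathrm{then}}$ the mean retrospective then-test score; the standardized difference between them indicates the magnitude of (recalibration) response shift.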
Discussion
Our meta-regression analysis, involving 171 response shift studies, indicates that, on average, one in five longitudinal PROM effects investigated for response shift results in the detection of response shift, when adjusting for sampling dependencies and controlling for sample- and effect-level variables. This result is consistent with our previous systematic review, which was based on 150 overlapping studies [8]. The results of our current analysis further indicate that two-fifths of the effect-level variance in response shift detection was explained by the variables we extracted from the included studies. Of these, the type of response shift method accounted for almost half of the explained variance. Other notable variables influencing response shift detection included response shift type and the study quality control variables. Contrary to expectation, variation in response shift detection was not explained by population characteristics (sex, age, medical condition, and intervention).
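As an illustration of how such sampling dependencies can be accommodated, the sketch below fits a logistic model with a random intercept per study, so that effects from the same study are not treated as independent. The data are simulated and the specification is our own simplified illustration, not the model actually used in this analysis:

```python
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(0)

# Hypothetical data: one row per investigated effect, nested within studies.
n_effects = 400
df = pd.DataFrame({
    "study": rng.integers(0, 60, n_effects).astype(str),
    "method": rng.choice(["then_test", "sem", "regression"], n_effects),
    "detected": rng.binomial(1, 0.2, n_effects),  # binary detection outcome
})

# Logistic meta-regression with a study-level variance component (random
# intercept per study) to account for dependencies among effects that come
# from the same study.
model = BinomialBayesMixedGLM.from_formula(
    "detected ~ C(method)", {"study": "0 + C(study)"}, df
)
result = model.fit_vb()
print(result.summary())
```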
The results highlight that variation in response shift detection is predominantly attributable to methodological differences across studies. Most importantly, the question arises as to why different response shift methods have varying probabilities of detecting response shift, even after controlling for the other effect-level variables. Most model-based methods, such as latent variable methods based on Oort’s procedure [35], allow multiple effects (e.g., effects for different response shift types and domains) to be tested simultaneously within the same model, thereby protecting against false positives. For instance, an overall test for response shift is inherently included in SEMs based on Oort’s procedure, which can help prevent false positives even when multiple testing corrections are not performed after this overall test. Conversely, in design-based methods, each response shift effect (e.g., effects for different domains) is typically tested separately, in which case multiple testing can produce false positives (unless a method to control the familywise error rate, such as the Bonferroni method, is used). This difference may, in part, explain why the probability of detecting response shift is lower for latent variable methods than for the then-test. Of course, there is also a trade-off between controlling the familywise error rate and statistical power to detect response shift effects; it might be that latent variable method studies are generally underpowered compared to studies that use design-based methods. This argument seems less plausible, however, as application of latent variable methods usually requires larger sample sizes (with more power to detect effects). Another possible explanation is that model-based methods have been more commonly applied in secondary data analyses, leading to a relatively smaller probability of detecting response shift. Conversely, the then-test can only be used in studies where it was included in the original design (i.e., there is no room for secondary analyses, as response shift detection is most likely the focus of the primary analyses). Still, these arguments do not explain why the probability of detecting response shift is larger for regression methods and other methods. Therefore, we echo previous recommendations that response shift studies should evaluate and report on sample size requirements for the chosen statistical analyses [36, 37] (e.g., see Verdam [38] for a tutorial on power calculations for response shift investigations with SEM).
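To make the multiple-testing point concrete, the sketch below applies a Bonferroni adjustment across separate domain-level tests; the domain names and p-values are hypothetical and purely illustrative:

```python
# Minimal sketch: Bonferroni control of the familywise error rate when
# several PROM domains are each tested separately for response shift.
# All p-values below are hypothetical.
alpha = 0.05
p_values = {
    "physical_functioning": 0.012,
    "emotional_functioning": 0.034,
    "social_functioning": 0.200,
    "fatigue": 0.004,
    "pain": 0.047,
}

m = len(p_values)            # number of separate tests
adjusted_alpha = alpha / m   # Bonferroni-corrected threshold (0.01 here)

for domain, p in p_values.items():
    naive = p < alpha               # per-test decision; familywise error inflated
    corrected = p < adjusted_alpha  # familywise error controlled at alpha
    print(f"{domain:22s} p={p:.3f}  naive={naive}  bonferroni={corrected}")
```

With five domains, the per-test threshold drops to 0.01, so only one of the four naively flagged domains survives correction, illustrating how uncorrected separate testing can inflate apparent response shift detection.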
The magnitude of response shift was only available for effects investigated with the then-test and SEM. One-quarter of the variability of response shift effect sizes across samples was explained by population characteristics (sample-level variables), and only a tenth of the variability across response shift effects was explained by effect-level variables (including study design, the PROM, response shift method (then-test vs. SEM), and study quality control variables). However, effect sizes could be determined for only 132 (3.2%) of the 4176 effects investigated with SEM, whereas effect-size information was available for 90.0% (637 of 708) of the effects investigated with the then-test. Consequently, the effect size results predominantly represent studies using the then-test method and may not be representative of studies based on SEM or other methods for examining response shift. This points to a limitation of the response shift literature, as the preponderance of response shift studies do not report information about the magnitudes of effects, and some methods do not enable effect size computation.
The average effect size of the response shift effects based on the then-test or SEM methods (including only those for which an effect size could be determined) was 0.30 (95% CI: 0.26–0.34), and all but two marginal effect sizes were above 0.20, indicating that they were far from negligible and predominantly ranged from small to moderate in size (based on Cohen’s guidelines [34]). To contextualize, most effects in PROM research are also of small to moderate magnitude, as discussed previously [8], and in some instances small effect sizes may be clinically relevant [39]. The largest marginal effect size (0.45) was found for analyses with a time period > 12 months, and samples of mostly children/adolescents had the smallest marginal effect size (0.10). It is important to note that the differences in marginal effect sizes across the explanatory and control variables are relatively small and nearly all of the CIs overlap, precluding firm conclusions about their relative importance.
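For readers less familiar with how a pooled effect size and its confidence interval arise, the following minimal sketch applies DerSimonian–Laird random-effects pooling to a handful of hypothetical Cohen’s d values; the actual analysis here is a multilevel meta-regression, which this simple sketch does not reproduce:

```python
import numpy as np

# Hypothetical standardized mean differences and their within-study variances.
d = np.array([0.45, 0.22, 0.31, 0.10, 0.38])
v = np.array([0.02, 0.01, 0.03, 0.02, 0.01])

# Fixed-effect weights and heterogeneity statistic Q.
w = 1.0 / v
d_fe = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fe) ** 2)
k = len(d)

# DerSimonian-Laird estimate of the between-study variance tau^2.
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects weights, pooled estimate, and 95% confidence interval.
w_re = 1.0 / (v + tau2)
d_re = np.sum(w_re * d) / np.sum(w_re)
se_re = np.sqrt(1.0 / np.sum(w_re))
ci = (d_re - 1.96 * se_re, d_re + 1.96 * se_re)

print(f"pooled d = {d_re:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), tau^2 = {tau2:.3f}")
```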
The current meta-regression analysis adds to our previous descriptive systematic review by disentangling the variability induced by differences in population characteristics, study design, PROMs, response shift methods, and study quality. The results allow for direct comparisons of effects across different methods and other characteristics. A more specific strength is our extensive analysis of dependencies, which guards against finding spurious effects. Finally, the current meta-regression analysis of response shift effects and effect sizes in quantitative response shift studies is the most comprehensive to date, covering all investigated populations and methods.
Several limitations that were listed in our previously published descriptive systematic review [8] apply equally to the current meta-regression analysis, including: the omission of studies not reported in English; the inclusion of studies that adopted different operationalizations of response shift and/or reported study results and/or methodology incompletely; the consideration of all detected effects as response shift effects, although their substantiation may be questioned; and the inclusion of a limited number of explanatory variables, which in the current meta-regression analysis explained less than half of the calculated variances. Another limitation is the lack of a formal assessment of study quality, as a meaningful assessment of study quality is hindered by the heterogeneity of the included studies (see section on risk of bias). Although we included four control variables as indicators of study quality in all analyses, it is important to keep in mind that, in the absence of a direct assessment of study quality, its influence on response shift results remains unknown. Relatedly, the extent to which response shift results are affected by the psychometric quality of PROMs was not investigated. As information about the reliability and validity of PROMs is rarely consistently reported, we did not consider it feasible to include this information in the current analysis; its influence might be more easily investigated using simulation studies (as illustrated in the sketch below). Finally, potential interaction effects between the explanatory/control variables were not explored, as this was deemed too complex given the limited number of observations and the relatively large number of explanatory variables. Nevertheless, insight into the dynamics between the different explanatory variables might be clinically meaningful and is thus a relevant topic for future research.
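To illustrate what such a simulation study might look like, the following sketch assumes classical test theory and a hypothetical true standardized change of 0.30 that is attenuated by measurement error; all parameter values are our own illustrative choices, not estimates from the reviewed studies:

```python
import numpy as np

rng = np.random.default_rng(42)

n = 5_000       # respondents per measurement occasion (hypothetical)
true_d = 0.30   # true standardized change on the latent construct (hypothetical)

for reliability in (0.6, 0.7, 0.8, 0.9):
    # Classical test theory: observed = true + error, with the error variance
    # chosen so that var(true) / var(observed) equals the target reliability.
    error_sd = np.sqrt((1 - reliability) / reliability)
    pre = rng.normal(0.0, 1.0, n) + rng.normal(0.0, error_sd, n)
    post = rng.normal(true_d, 1.0, n) + rng.normal(0.0, error_sd, n)

    sd_pooled = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2)
    observed_d = (post.mean() - pre.mean()) / sd_pooled
    print(f"reliability={reliability:.1f}  observed d = {observed_d:.2f} "
          f"(theoretical {true_d * np.sqrt(reliability):.2f})")
```

Under these assumptions, the observed effect size shrinks toward zero as reliability decreases (by a factor of the square root of the reliability), showing how PROM measurement quality could systematically dampen observed response shift effect sizes.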
The current meta-regression analysis, combined with our previous descriptive systematic review [8], provides insight into the variability of response shift results, i.e., how the detection of response shift and response shift effect sizes vary across populations, study designs, PROMs, response shift methods, and study quality criteria. Rather than focusing only on overall response shift effects, future research should aim to identify and understand the conditions under which response shift is more or less likely to occur. This may include person- versus variable-centered quantitative methods [40, 41], qualitative research, and examination of theoretical and philosophical perspectives [42]. The marginal probabilities and effect sizes reported here may also be taken into account when interpreting the results of other comparable studies using PROMs, by considering the potential occurrence and impact of response shift effects. Additionally, future studies on response shift detection should be informed by the current insights into response shift effects and effect sizes, which may help in designing studies and interpreting their results [13]. Well-designed studies and contextualized interpretation of results are needed to improve our understanding of response shift.