25-10-2022 | Special Section: Methodologies for Meaningful Change

Measuring individual true change with PROMIS using IRT-based plausible values

Authors: Emily H. Ho, Jay Verkuilen, Felix Fischer

Published in: Quality of Life Research | Issue 5/2023

Abstract

Aims

A primary advantage of IRT-based patient-reported outcome measures such as PROMIS short forms and computer-adaptive tests is that each estimate of the latent trait comes with a standard error. Such measurement error needs to be acknowledged, in particular when monitoring individual patients over time. In this study, we use plausible values to account for measurement error and analyze the probability of true within-individual change.

Methods

We use data from a longitudinal, observational study of stable and exacerbated COPD patients (N = 185), who provided PROMIS Physical Function and Fatigue T-scores over 3 months. At each measurement, we imputed 1000 plausible values from the scores’ posterior distribution. These were then used to calculate the probability of true change relative to a pre-specified threshold, such as a minimally important difference supported by the literature, or \(\Delta T\text{-score} > 0\). We demonstrate assessment of change in individuals and in groups, across different measures (Short Forms and CATs), and at various levels of confidence.
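The calculation described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each score's posterior can be approximated by a normal distribution centered on the T-score with the reported standard error as its standard deviation, and that the two occasions are drawn independently. The function name, its parameters, and the example patient are hypothetical.

```python
import numpy as np

def probability_of_true_change(t1, se1, t2, se2, threshold=0.0,
                               n_draws=1000, seed=0):
    """Estimate P(true change > threshold) between two assessments.

    Assumption (not from the source): each posterior is approximated as
    Normal(T-score, SE^2), and the two occasions are sampled independently.
    """
    rng = np.random.default_rng(seed)
    pv1 = rng.normal(t1, se1, size=n_draws)  # plausible values, occasion 1
    pv2 = rng.normal(t2, se2, size=n_draws)  # plausible values, occasion 2
    change = pv2 - pv1                       # one change score per draw
    # Share of draws in which the change exceeds the chosen threshold.
    return float(np.mean(change > threshold))

# Hypothetical patient: T-score rises from 40 to 46, SE = 3 at both occasions.
p_any = probability_of_true_change(40.0, 3.0, 46.0, 3.0, threshold=0.0)
# Same patient, judged against a hypothetical threshold of 2 T-score points.
p_mid = probability_of_true_change(40.0, 3.0, 46.0, 3.0, threshold=2.0)
print(f"P(change > 0) = {p_any:.3f}")
print(f"P(change > 2) = {p_mid:.3f}")
```

Raising the threshold can only lower the estimated probability, because the draws exceeding the larger threshold are a subset of those exceeding zero.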

Results

Using plausible value imputation and with 95% certainty, 47.5% of participants in the exacerbated group reported less fatigue, compared with 26.5% of participants in the stable group. Comparison of Short Forms and CATs suggests that CATs detect change better than Short Forms. We also illustrate this method using an individual’s probability of change at different time points.
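A group-level summary of this kind could be sketched as follows: compute each participant's probability of change in a given direction, then report the share of the group whose probability reaches the chosen certainty level. This is an illustration under the same normal-approximation assumption, not the study's code. The function name and all scores below are hypothetical and are not data from the study.

```python
import numpy as np

def share_with_confident_change(t1, se1, t2, se2, certainty=0.95,
                                direction=-1, n_draws=1000, seed=0):
    """Share of a group whose probability of true change reaches `certainty`.

    direction=-1 counts decreases (e.g. less fatigue); +1 counts increases.
    Assumption (not from the source): normal approximation to each
    posterior, with independent draws per occasion and participant.
    """
    rng = np.random.default_rng(seed)
    t1, se1, t2, se2 = (np.asarray(a, dtype=float) for a in (t1, se1, t2, se2))
    # Arrays of shape (n_draws, n_participants): one column per participant.
    pv1 = rng.normal(t1, se1, size=(n_draws, t1.size))
    pv2 = rng.normal(t2, se2, size=(n_draws, t2.size))
    # Per-participant probability of change in the requested direction.
    prob = np.mean(direction * (pv2 - pv1) > 0, axis=0)
    return float(np.mean(prob >= certainty)), prob

# Hypothetical fatigue T-scores for four participants at two occasions.
baseline = [60.0, 55.0, 58.0, 62.0]
followup = [50.0, 54.0, 59.0, 52.0]
se = [2.5, 2.5, 2.5, 2.5]

share_95, prob = share_with_confident_change(baseline, se, followup, se,
                                             certainty=0.95)
share_50, _ = share_with_confident_change(baseline, se, followup, se,
                                          certainty=0.50)
print("per-participant P(less fatigue):", np.round(prob, 3))
print("share at 95% certainty:", share_95)
print("share at 50% certainty:", share_50)
```

Lowering the certainty level counts more participants as changed, which is why a reported proportion needs to be read together with the certainty level used.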

Conclusion

Plausible values offer a flexible way to include measurement error in the analysis of individuals and at the sample level. Assessing the probability of true change can complement existing distribution-based approaches and facilitates interpretation of improvement or decline.

Title: Measuring individual true change with PROMIS using IRT-based plausible values
Authors: Emily H. Ho, Jay Verkuilen, Felix Fischer
Publication date: 25-10-2022
Publisher: Springer International Publishing
Published in: Quality of Life Research, Issue 5/2023
Print ISSN: 0962-9343
Electronic ISSN: 1573-2649
DOI: https://doi.org/10.1007/s11136-022-03264-2