Bansal, Aasthaa; Heagerty, Patrick J.
doi: 10.1177/0272989X18801312 pmid: 30319014
Many medical decisions involve the use of dynamic information collected on individual patients toward predicting likely transitions in their future health status. If accurate predictions are developed, then a prognostic model can identify patients at greatest risk for future adverse events and may be used clinically to define populations appropriate for targeted intervention. In practice, a prognostic model is often used to guide decisions at multiple time points over the course of disease, and classification performance (i.e., sensitivity and specificity) for distinguishing high-risk v. low-risk individuals may vary over time as an individual’s disease status and prognostic information change. In this tutorial, we detail contemporary statistical methods that can characterize the time-varying accuracy of prognostic survival models when used for dynamic decision making. Although statistical methods for evaluating prognostic models with simple binary outcomes are well established, methods appropriate for survival outcomes are less well known and require time-dependent extensions of sensitivity and specificity to fully characterize longitudinal biomarkers or models. The methods we review are particularly important in that they allow for appropriate handling of censored outcomes commonly encountered with event time data. We highlight the importance of determining whether clinical interest is in predicting cumulative (or prevalent) cases over a fixed future time interval v. predicting incident cases over a range of follow-up times and whether patient information is static or updated over time. We discuss implementation of time-dependent receiver operating characteristic approaches using relevant R statistical software packages. The statistical summaries are illustrated using a liver prognostic model to guide transplantation in primary biliary cirrhosis.
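As a brief illustration of the cumulative versus incident distinction drawn in this abstract, the time-dependent accuracy measures are commonly written as below. The notation is an assumption introduced here for illustration and does not appear in the abstract: M is the marker or model score, c is a classification threshold, and T is the event time.

```latex
\begin{align*}
\text{cumulative sensitivity:} \quad & \mathrm{Se}^{\mathbb{C}}(c, t) = P(M > c \mid T \le t) \\
\text{dynamic specificity:}    \quad & \mathrm{Sp}^{\mathbb{D}}(c, t) = P(M \le c \mid T > t) \\
\text{incident sensitivity:}   \quad & \mathrm{Se}^{\mathbb{I}}(c, t) = P(M > c \mid T = t)
\end{align*}
```

Under these definitions, cumulative sensitivity counts everyone with an event by time t as a case, whereas incident sensitivity counts only those whose event occurs at time t; in both, dynamic specificity treats individuals still event-free at t as controls.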
Buskermolen, Maaike; Gini, Andrea; Naber, Steffie K.; Toes-Zoutendijk, Esther; de Koning, Harry J.; Lansdorp-Vogelaar, Iris
doi: 10.1177/0272989X18806497 pmid: 30343626
Background. Microsimulation models are increasingly being used to inform colorectal cancer (CRC) screening recommendations. MISCAN-Colon is an example of such a model, used to inform the Dutch CRC screening program and US Preventive Services Task Force guidelines. Assessing the validity of these models is essential to provide transparency regarding their performance. In this study, we tested the external and predictive validity of MISCAN-Colon. Methods. We validated MISCAN-Colon using the Norwegian Colorectal Cancer Prevention (NORCCAP) trial, a randomized controlled trial that examined the effectiveness of once-only flexible sigmoidoscopy (FS) screening. We simulated the study population and design of the NORCCAP trial in MISCAN-Colon and compared 10- to 12-year model-predicted hazard ratios (HRs) for overall and distal CRC incidence and mortality to those observed. In addition, we compared the numbers of screen-detected neoplasia. Finally, we predicted the trial’s future results to allow for the assessment of predictive validity. Results. MISCAN-Colon predicted an HR for overall CRC incidence (0.85), distal CRC incidence (0.82), overall CRC mortality (0.68), and distal CRC mortality (0.62). These were within the limits of the 95% confidence intervals of the NORCCAP trial results. Similar results were observed for the number of screen-detected cancers. The model significantly underestimated the number of screen-detected adenomas. Model-predicted HRs for CRC incidence and mortality up to 15- to 17-year follow-up were 0.84 and 0.72, respectively. Conclusion. Although the underestimation of screen-detected adenomas requires further investigation, MISCAN-Colon is able to make a valid replication of the CRC incidence and mortality reduction of an FS screening trial, which suggests that it can be considered a useful tool to support decision making on CRC screening.
Dodd, Peter J.; Pennington, Jeff J.; Bronner Murrison, Liza; Dowdy, David W.
doi: 10.1177/0272989X18807438 pmid: 30403578
Introduction. Cost-effectiveness models for infectious disease interventions often require transmission models that capture the indirect benefits from averted subsequent infections. Compartmental models based on ordinary differential equations are commonly used in this context. Decision trees are frequently used in cost-effectiveness modeling and are well suited to describing diagnostic algorithms. However, complex decision trees are laborious to specify as compartmental models and cumbersome to adapt, limiting the detail of algorithms typically included in transmission models. Methods. We consider an approximation replacing a decision tree with a single holding state for systems where the time scale of the diagnostic algorithm is shorter than time scales associated with disease progression or transmission. We describe recursive algorithms for calculating the outcomes and mean costs and delays associated with decision trees, as well as design strategies for computational implementation. We assess the performance of the approximation in a simple model of transmission/diagnosis and its role in simplifying a model of tuberculosis diagnostics. Results. When diagnostic delays were short relative to recovery rates, our approximation provided a good account of infection dynamics and the cumulative costs of diagnosis and treatment. Proportional errors were below 5% so long as the longest delay in our 2-step algorithm was under 20% of the recovery time scale. Specifying new diagnostic algorithms in our tuberculosis model was reduced from several tens to just a few lines of code. Discussion. For conditions characterized by a diagnostic process that is neither instantaneous nor protracted (relative to transmission dynamics), this novel approach retains the advantages of decision trees while embedding them in more complex models of disease transmission. Concise specification and code reuse increase transparency and reduce potential for error.
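The recursive calculation of decision tree outcomes, mean costs, and mean delays mentioned in this abstract might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the `summarize` function, and the example 2-step tree with its probabilities, costs, and delays are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical decision tree node: incurs a cost and a delay when visited."""
    name: str
    cost: float = 0.0
    delay: float = 0.0
    # Each child is a (branch probability, Node) pair; leaves have no children.
    children: list = field(default_factory=list)


def summarize(node):
    """Return (outcome probabilities, mean cost, mean delay) for a subtree."""
    if not node.children:
        # A leaf is a terminal outcome reached with probability 1 from itself.
        return {node.name: 1.0}, node.cost, node.delay
    outcomes = {}
    mean_cost, mean_delay = node.cost, node.delay
    for prob, child in node.children:
        child_outcomes, child_cost, child_delay = summarize(child)
        # Expected cost and delay accumulate along branches, weighted by probability.
        mean_cost += prob * child_cost
        mean_delay += prob * child_delay
        for label, q in child_outcomes.items():
            outcomes[label] = outcomes.get(label, 0.0) + prob * q
    return outcomes, mean_cost, mean_delay


# Illustrative 2-step diagnostic algorithm (all numbers are made up).
treated = Node("treated", cost=100.0)
untreated = Node("untreated")
confirm = Node("confirmatory test", cost=20.0, delay=3.0,
               children=[(0.8, treated), (0.2, untreated)])
tree = Node("screening test", cost=5.0, delay=1.0,
            children=[(0.3, confirm), (0.7, untreated)])

outcomes, mean_cost, mean_delay = summarize(tree)
print(outcomes, mean_cost, mean_delay)
```

In an approximation of the kind the abstract describes, summaries like these would parameterize a single holding state in the transmission model, so that a new diagnostic algorithm only requires specifying a new tree.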
Wan, Wen; Skandari, M. Reza; Minc, Alexa; Nathan, Aviva G.; Zarei, Parmida; Winn, Aaron N.; O’Grady, Michael; Huang, Elbert S.
doi: 10.1177/0272989X18803109 pmid: 30403576
Background. The economic impact of both continuous glucose monitoring (CGM) and insulin pumps (continuous subcutaneous insulin infusion [CSII]) in type 1 diabetes (T1D) has been evaluated separately. However, the cost-effectiveness of adding CSII for existing CGM users has not yet been assessed. Objective. The aim of this study was to evaluate the societal cost-effectiveness of CSII versus continuing multiple daily injections (MDI) in adults with T1D already using CGM. Methods. In the second phase of the DIAMOND trial, 75 adults using CGM were randomized to either CGM+CSII or CGM+MDI (control) and surveyed at baseline and 28 weeks. We performed within-trial and lifetime cost-effectiveness analyses (CEAs) and estimated lifetime costs and quality-adjusted life-years (QALYs) via a modified Sheffield T1D model. Results. Within the trial, the CGM+CSII group had a significant reduction in quality of life from baseline (−0.02 ± 0.05 difference in difference [DiD]) compared with controls. Total per-person 28-week costs were $8,272 (CGM+CSII) versus $5,623 (CGM+MDI); the difference in costs was primarily attributable to pump use ($2,644). Pump users reduced insulin intake (−12.8 units DiD) but increased the daily number of test strips used (+1.2 DiD). Pump users also increased time with glucose in range of 70 to 180 mg/dL but had a higher HbA1c (+0.13 DiD) and more nonsevere hypoglycemic events. In the lifetime CEA, CGM+CSII would increase total costs by $112,045 DiD, decrease QALYs by 0.71, and decrease life expectancy by 0.48 years. Conclusions. Based on this single trial, initiating an insulin pump in adults with T1D already using CGM was associated with higher costs and reduced quality of life. Additional evidence regarding the clinical effects of adopting combinations of new technologies from trials and real-world populations is needed to confirm these findings.
doi: 10.1177/0272989X18797588 pmid: 30226101
Objectives. To assess the external validity of mapping algorithms for predicting EQ-5D-3L utility values from EORTC QLQ-C30 responses not previously validated and to assess whether statistical models not previously applied are better suited for mapping the EORTC QLQ-C30 to the EQ-5D-3L. Methods. In total, 3866 observations for 1719 patients from a longitudinal study (Cancer 2015) were used to validate existing algorithms. Predictive accuracy was compared to previously validated algorithms using root mean squared error, mean absolute error across the EQ-5D-3L range, and for 10 tumor-type specific samples as well as using differences between estimated quality-adjusted life years. Thirteen new algorithms were estimated using a subset of the Cancer 2015 data (3203 observations for 1419 patients) applying various linear, response mapping, beta, and mixture models. Validation was performed using 2 data sets composed of patients with varying disease severity not used in the estimation and all available algorithms ranked on their performance. Results. None of the 5 existing algorithms offer an improvement in predictive accuracy over preferred algorithms from previous validation studies. Of the newly estimated algorithms, a 2-part beta model performed the best across the validation criteria and in data sets composed of patients with different levels of disease severity. Validation results did, however, vary widely between the 2 data sets, and the most accurate algorithm appears to depend on health state severity as the distribution of observed EQ-5D-3L values varies. Linear models performed better for patients in relatively good health, whereas beta, mixture, and response mapping models performed better for patients in worse health. Conclusion. The most appropriate mapping algorithm to apply in practice may depend on the disease severity of the patient sample whose utility values are being predicted.
Law, Ernest H.; Pickard, A. Simon; Xie, Feng; Walton, Surrey M.; Lee, Todd A.; Schwartz, Alan
doi: 10.1177/0272989X18802797 pmid: 30403577
Objective. To compare and contrast EQ-5D-5L (5L) and EQ-5D-3L (3L) health state values derived from a common sample. Methods. Data from the 2017 US EQ-5D valuation study were analyzed. Value sets were estimated with random-effects linear regression based on composite time trade-off (cTTO) valuations for 3L and 5L health states with 2 approaches to model specification: main effects only and additional N3/N45 terms. Properties of the descriptive system and value set characteristics were compared by examining distributions of predicted index scores, ceiling effects, and single-level transition values from adjacent corner health states. Mean transition values were calculated for all predicted 3L and 5L health states and plotted against baseline index scores. Results. A total of 1062 respondents were included in the analysis. The observed mean cTTO values for the worst possible 3L and 5L health states were −0.423 and −0.343, respectively. The range of scale was larger with the 3L, compared to the 5L, for both main effects and N term models. Values for the mildest 5L health states (range, 0.857−0.924) were similar to 11111 for the 3L. Parameter estimates for matched dimension levels differed by <|0.07| except for the most severe level of Mobility. For the main effects model, 3L mean transition values were greater for more severe baseline 3L index scores, whereas 5L mean transition values remained constant irrespective of the baseline index score. Conclusions. Compared to the 3L, the 5L exhibited a lower ceiling effect and improved measurement properties. There was a larger range of scale for the 3L compared to 5L; however, this difference was driven by differences in preference for the most severe level of problems in Mobility.
Jia, Haomiao; Lubetkin, Erica I.; DeMichele, Kimberly; Stark, Debra S.; Zack, Matthew M.; Thompson, William W.
doi: 10.1177/0272989X18808494 pmid: 30403580
Background. The Medicare Health Outcomes Survey (HOS), a nationwide annual survey of Medicare beneficiaries, includes the Centers for Disease Control and Prevention’s HRQOL-4 questionnaire and Veterans RAND 12-item Health Survey (VR-12). This study compared EQ-5D scores derived from the HRQOL-4 (dEQ-5D) to SF-6D scores derived from VR-12. Methods. Data were from Medicare HOS Cohort 15 (2012 baseline; 2014 follow-up). We included participants aged 65+ (n = 105,473). We compared score distributions, evaluated known-groups validity, assessed each index as a predictor for mortality, and estimated quality-adjusted life years (QALYs) using the dEQ-5D and SF-6D. Results. Compared to the SF-6D, the dEQ-5D had a higher mean score (0.787 v. 0.691) and larger standard deviation (0.310 v. 0.101). The decreases in estimated scores associated with chronic conditions were greater for the dEQ-5D than for the SF-6D. For example, dEQ-5D scores for persons with depression decreased 0.456 points compared to 0.141 points for the SF-6D. The dEQ-5D strongly predicted mortality, as adjusted hazard ratios for the first to fourth quintiles, relative to the fifth quintile, were 2.2, 1.7, 1.8, and 1.5, respectively, while the association between SF-6D and mortality was weaker or nonexistent (adjusted hazard ratios were 1.3, 1.1, 1.0, and 0.6, respectively). Compared to the SF-6D, QALYs estimated using the dEQ-5D were higher overall (5.6 v. 4.9 years), higher for persons with less debilitating conditions (e.g., hypertension, 5.0 v. 4.4 years), and lower for more debilitating conditions (e.g., depression, 2.5 v. 2.8 years). Conclusions. Compared to the SF-6D, the dEQ-5D was better able to measure individuals’ overall health; detect the differential impact of chronic conditions, particularly among persons in poorer health; and predict mortality. The HRQOL-4 questionnaire may be valuable for monitoring and improving health outcomes for the Medicare HOS data set.
Tolbert, Elliott; Brundage, Michael; Bantug, Elissa; Blackford, Amanda L.; Smith, Katherine; Snyder, Claire
doi: 10.1177/0272989X18791177 pmid: 30132393
Background. Patient-reported outcome (PRO) results from clinical trials and research studies can inform patient-clinician decision making. However, data presentation issues specific to PROs, such as scaling directionality (higher scores may represent better or worse outcomes) and scoring strategies (normed v. nonnormed scores), can make the interpretation of PRO scores uniquely challenging. Objective. To identify the association of PRO score directionality, score norming, and other factors with a) how accurately PRO scores are interpreted and b) how clearly they are rated by patients, clinicians, and PRO researchers. Methods. We electronically surveyed adult cancer patients/survivors, oncology clinicians, and PRO researchers and conducted one-on-one cognitive interviews with patients/survivors and clinicians. Participants were randomized to 1 of 3 line graph formats showing longitudinal change: higher scores indicating “better,” “more” (better for function, worse for symptoms), or “normed” to a population average. Quantitative data evaluated interpretation accuracy and clarity. Online survey comments and cognitive interviews were analyzed qualitatively. Results. The Internet sample included 629 patients, 139 clinicians, and 249 researchers; 10 patients and 5 clinicians completed cognitive interviews. “Normed” line graphs were less accurately interpreted than “more” (odds ratio [OR] = 0.76; P = 0.04). “Better” line graphs were more accurately interpreted than both “more” (OR = 1.43; P = 0.01) and “normed” (OR = 1.88; P = 0.04). “Better” line graphs were more likely to be rated clear than “more” (OR = 1.51; P = 0.05). Qualitative data informed interpretation of these findings. Limitations. The survey relied on the online platforms used for distribution and consequent snowball sampling. We do not have information regarding participants’ numeracy/graph literacy. Conclusions. For communicating PROs as line graphs in patient educational materials and decision aids, these results support using graphs in which higher scores consistently indicate better outcomes.
Orom, Heather; Schofield, Elizabeth; Kiviniemi, Marc T.; Waters, Erika A.; Biddle, Caitlin; Chen, Xuewei; Li, Yuelin; Kaphingst, Kimberly A.; Hay, Jennifer L.
doi: 10.1177/0272989X18799999 pmid: 30403579
Background. People who say they don’t know (DK) their disease risk are less likely to engage in protective behavior. Purpose. This study examined possible mechanisms underlying not knowing one’s risk for common diseases. Methods. Participants were a nationally representative sample of 1005 members of a standing probability-based survey panel who answered questions about their comparative and absolute perceived risk for diabetes and colon cancer, health literacy, risk factor knowledge and health information avoidance, and beliefs about illness unpredictability. Survey satisficing was a composite assessment of not following survey instructions, nondifferentiation of responses, haphazard responding, and speeding. The primary outcomes were whether a person selected DK when asked absolute and comparative risk perception questions about diabetes or colon cancer. Base structural equation modeling path models with pathways from information avoidance and health literacy/knowledge to DK responding for each DK outcome were compared to models that also included pathways from satisficing or unpredictability beliefs. Results. Base models contained significant indirect effects of health literacy (odds ratios [ORs] = 0.94 to 0.97, all P < 0.02) and avoidance (ORs = 1.05 to 1.15, all P < 0.01) on DK responding through risk factor knowledge and a direct effect of avoidance (ORs = 1.21 to 1.28, all P < 0.02). Adding the direct effect for satisficing to models resulted in poor fit (for all outcomes, residual mean square error estimates >0.17, all weighted root mean square residuals >3.2, all Comparative Fit Index <0.47, all Tucker-Lewis Index <0.49), indicating that satisficing was not associated with DK responding. Unpredictability was associated with not knowing one’s diabetes risk (OR = 1.01, P < 0.01). Limitations. The data were cross-sectional; therefore, directionality of the pathways cannot be assumed. Conclusions. DK responders may need more health information, but it needs to be delivered differently. Interventions might include targeting messages for lower health literacy audiences and disrupting defensive avoidance of threatening health information.