Use of Data-Driven Methods to Predict Long-term Patterns of Health Care Spending for Medicare Patients

Key Points

Question: What are the long-term spending patterns by Medicare beneficiaries, and do baseline patient factors that are potentially modifiable predict these patterns?

Findings: In this cohort study using a data-driven approach to classifying Medicare beneficiaries by their spending over 2 years, 5 patterns were identified and could be predicted, including those with consistent spending levels and others with spending that increased progressively. The most influential potentially modifiable factors were number of medications, number of office visits, and mean medication adherence.

Meaning: These findings suggest that spending by Medicare beneficiaries falls into 5 distinct groups and could be accurately predicted; this approach could be adapted by organizations to target interventions.

Author affiliations and article information are listed at the end of this article.

Abstract

IMPORTANCE Current approaches to predicting health care costs generally rely on a single composite value of spending and focus on short time horizons. By contrast, examining patients’ spending patterns using dynamic measures applied over longer periods may better identify patients with different spending and help target interventions to those with the greatest need.

OBJECTIVE To classify patients by their long-term, dynamic health care spending patterns using a data-driven approach and assess the ability to predict spending patterns, particularly using characteristics that are potentially modifiable through intervention.

DESIGN, SETTING, AND PARTICIPANTS This cohort study used a retrospective cohort design from a random nationwide sample of Medicare fee-for-service administrative claims data to identify beneficiaries aged 65 years or older with continuous eligibility from 2011 to 2013. Statistical analysis was performed from August 2018 to December 2019.

MAIN OUTCOMES AND MEASURES Group-based trajectory modeling was applied to the claims data to classify the Medicare beneficiaries by their total health care spending patterns over a 2-year period. The ability to predict membership in each trajectory spending group was assessed using generalized boosted regression, a data mining approach to model building and prediction, with split-sample validation. Models were estimated using (1) prior-year predictors and (2) prior-year predictors potentially modifiable through intervention measured in the claims data. These models were evaluated using validated C-statistics. The relative influence of individual predictors in the models was evaluated.

RESULTS Among the 329 476 beneficiaries, the mean (SD) age was 76.0 (7.2) years and 190 346 (57.8%) were female. This final 5-group model included a minimal-user group (group 1, 37 572 individuals [11.4%]), a low-cost group (group 2, 48 575 individuals [14.7%]), a rising-cost group (group 3, 24 736 individuals [7.5%]), a moderate-cost group (group 4, 83 338 individuals [25.3%]), and a high-cost group (group 5, 135 255 individuals [41.2%]). Potentially modifiable characteristics strongly predicted these patterns (C-statistics range: 0.68-0.94). For groups with progressively increasing spending in particular, the most influential factors were number of medications (relative influence: 29.2), number of office visits (relative influence: 30.3), and mean medication adherence (relative influence: 33.6).

CONCLUSIONS AND RELEVANCE Using a data-driven approach, distinct spending patterns were identified with high accuracy. The potentially modifiable predictors of membership in the rising-cost group represent important levers for early interventions that may prevent later spending increases.
This approach could be adapted by organizations to target quality improvement interventions, particularly because numerous health care organizations are increasingly using these routinely collected data.

JAMA Network Open. 2020;3(10):e2020291. doi:10.1001/jamanetworkopen.2020.20291

Open Access. This is an open access article distributed under the terms of the CC-BY License.

Introduction

With health care spending now accounting for almost 18% of the US gross domestic product, identifying individuals who may benefit from interventions to address potentially avoidable spending has become a central priority for health insurers and health care professionals. Current approaches generally focus on prediction or intervention for patients who may have escalating costs on the basis of a single composite value of total spending over short time periods.[2,3]

However, many patients experience substantial increases or decreases in spending not captured by these approaches.[4-9] For example, Tamang et al [10] identified a definable group of low-spending patients in 1 year whose costs bloomed (ie, they became high-spending individuals) in the subsequent year in Denmark. Similarly, Lauffenburger et al observed 7 distinct, dynamic patterns of spending over a 1-year period in commercially insured beneficiaries, including individuals whose costs increased rapidly toward the end of the year and another group of high-cost individuals for whom spending decreased.

These prior studies were conducted over a 1-year period, yet there may also be dynamic patterns of spending over longer periods that may have implications both for whom to outreach for intervention and when to do so.[1,12]
For example, patients with the same clinical conditions who are hospitalized early during a 12-month period may differ meaningfully from those hospitalized later, although both could be identified as having rising costs.[13,14] If these different spending patterns could be predicted using routinely collected data, the ability to proactively differentiate patients with increasing or decreasing spending patterns could better target interventions to those who are at greatest need of improved health or cost containment. The predictive accuracy of spending may also be higher when evaluating a long-term, compared with a short-term, time horizon, as seen for other outcomes. Accordingly, we sought to classify patients according to their spending patterns over a 2-year period and to evaluate the ability to predict these spending groups using patient characteristics that are potentially modifiable.

Methods

This cohort study was approved by the institutional review board of Brigham and Women’s Hospital and was granted a waiver of informed patient consent because the data are secondary routinely collected data. This study follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Setting and Study Design

This study used administrative claims data from a 1-million-member sample of Medicare fee-for-service beneficiaries; the original sample included approximately 20 000 beneficiaries in a nationwide quality improvement program and approximately 980 000 randomly selected patients nationally. We restricted the cohort to the randomly selected patients and used their paid Medicare Parts A, B, and D patient-level files containing all procedures, physician encounters, hospitalizations, and filled outpatient prescriptions, including amounts paid by the insurer and patient. These data were linked to eligibility data including age, race/ethnicity, gender, and geographic location of residence.
Aggregate zip code level data on median income and educational attainment were obtained by linking with 2010 US Census data.

To be included, patients had to be aged 65 years or older and maintain continuous eligibility from January 1, 2011, to December 31, 2013. The cohort entry date was defined as January 1, 2012, to provide 1 year of baseline data (year 0) and 2 years of follow-up data (year 1 and year 2) (eFigure 1 in the Supplement).

Costs

We measured total monthly health care spending over a 2-year period for each patient by summing the allowed amounts on all inpatient, outpatient, and prescription drug claims. Monthly costs were generated by summing the costs in each month and were standardized by dividing the summed costs by the number of days in that month and then multiplying the result by 30. Costs were then logarithmically transformed to normalize their distribution, after adding $0.01, as is frequently done.[9,18] Costs were inflated using the Medical Care Component of the Consumer Price Index to 2013 dollars when necessary.

Predictors

Using data from Medicare enrollment files and claims, we defined 37 clinically relevant baseline characteristics that were potential predictors of future spending (eTable 1 in the Supplement). These baseline variables were measured during the 12 months prior to the 2-year period during which cost outcomes were evaluated (eFigure 1 in the Supplement). These variables were based on characteristics used in cost modeling in claims data in the peer-reviewed literature and from the quality-cost theoretical framework.[6,10,11,15]
These sets of predictors have also been shown to predict 1-year spending as accurately as proprietary risk-adjustment methods. Sociodemographic characteristics included age, race/ethnicity, gender, and community-level variables based on the member’s zip code of residence, including median household income and educational attainment. Clinical comorbidities were measured using International Classification of Diseases, Ninth Revision codes (eAppendix and eTable 1 in the Supplement). Each patient’s number of unique prescriptions by generic name (ie, therapeutic complexity), physician office visits, emergency department visits, hospitalizations, unique physicians visited, unique pharmacies used, benefits’ generosity (copayments and deductibles or total net payments), and baseline year total costs were also measured.

Adherence to long-term medication classes (eg, β-blockers) was measured in the baseline year. For each class, we created a supply diary beginning with the first fill for each class in the baseline year. This diary linked all observed fills based on dispensing date and days’ supply; switching was allowed within each class (eg, β-blockers). From this, we calculated the proportion of days covered (PDC) for each class and averaged it across the classes that the patient filled to yield 1 mean PDC.[20,21]

We categorized each predictor by whether it was potentially modifiable, defined by whether it could theoretically be addressed in interventions and by classifications in prior literature.[22,23] For example, number of unique physicians could be potentially modifiable, while race/ethnicity is not. In total, we classified 10 predictors as potentially modifiable (Table 1).

Data-Driven Approach to Modeling Long-term Costs

We used trajectory modeling to empirically classify spending during follow-up. One advantage is that it allows the data to define the cost outcomes, rather than using arbitrarily selected thresholds.
It also considers changes in spending over time, rather than aggregating costs over a set time. To define spending patterns, we used the previously described SAS procedure Proc Traj, a free add-on.[24-26] In brief, group-based trajectory models are an application of finite mixture modeling that identify clusters of individuals with similar outcome patterns over time. This modeling approach analyzes longitudinal data by fitting a semiparametric (discrete) mixture model, estimating each individual’s probability of membership in each group, and assigning them to the group according to their highest probability. We modeled longitudinal cost trajectories using calendar month as the time variable, costs in each month, order equal to 4, and a censored-normal distribution (linear between minimum and maximum values).[11,24,26] The models were estimated using a forward classifying approach using 2 to 7 groups, each time investigating model fit using the bayesian information criterion (BIC), whereby a lower BIC indicates better model fit. The number of groups investigated was capped at 7 on the basis of groupings observed in prior work. In addition to considering BIC, other key considerations in selecting the best-fitting trajectory were the ability to visually interpret separate groups, minimum membership probabilities in each group, and having 5% or more of the sample in each group.[26-28]

Table 1. Patient Characteristics by Spending Trajectory (values are patients, No. [%], unless otherwise noted)

| Covariates | Group 1: minimal user (n = 37 572) | Group 2: low cost (n = 48 575) | Group 3: rising cost (n = 24 736) | Group 4: moderate cost (n = 83 338) | Group 5: high cost (n = 135 255) |
|---|---|---|---|---|---|
| Demographic characteristics | | | | | |
| Age, mean (SD), y | 73.8 (7.7) | 74.8 (6.8) | 75.1 (6.9) | 75.9 (7.0) | 77.1 (7.2) |
| Female | 16 394 (43.6) | 26 531 (54.6) | 13 500 (54.6) | 49 287 (59.1) | 84 634 (62.6) |
| Race/ethnicity | | | | | |
| Non-Hispanic White | 30 732 (81.8) | 42 723 (88.0) | 22 020 (89.0) | 74 184 (89.0) | 118 637 (87.7) |
| Black | 3610 (9.6) | 3255 (6.7) | 1169 (6.6) | 5332 (6.4) | 9646 (7.1) |
| Other | 1299 (3.5) | 1297 (2.7) | 539 (2.2) | 1588 (1.9) | 2293 (1.7) |
| Asian or Pacific Islander | 867 (2.3) | 792 (1.6) | 322 (1.3) | 1330 (1.6) | 2448 (1.8) |
| Hispanic | 1064 (2.8) | 508 (1.1) | 236 (1.0) | 904 (1.1) | 2231 (1.7) |
| Zip code median income, mean (SD), $ | 59 960 (24 347) | 56 572 (24 199) | 56 696 (23 765) | 56 683 (23 776) | 55 929 (23 808) |
| Zip code high school graduates, mean (SD), % | 80.8 (21.0) | 84.4 (16.6) | 84.5 (16.2) | 84.6 (15.8) | 83.9 (15.9) |
| Health care use | | | | | |
| Part D Plan switch | 163 (0.4) | 173 (0.4) | 82 (0.3) | 468 (0.6) | 1828 (1.4) |
| Low-income subsidy | 3584 (9.5) | 2735 (5.6) | 1379 (5.6) | 8063 (9.7) | 31 019 (22.9) |
| Office visits, mean (SD), No. | 1.2 (2.0) | 4.5 (3.5) | 4.7 (3.7) | 7.1 (5.0) | 11.3 (8.3) |
| Physicians, mean (SD), No. | 0.4 (0.7) | 1.0 (1.0) | 1.0 (0.9) | 1.3 (1.1) | 1.8 (1.3) |
| Pharmacies used, mean (SD), No. | 0.1 (0.4) | 0.4 (0.8) | 0.3 (0.7) | 0.8 (1.1) | 1.3 (1.3) |
| Hospitalizations, mean (SD), No. | 0.0 (0.2) | 0.1 (0.4) | 0.1 (0.3) | 0.2 (0.5) | 0.4 (0.8) |
| Emergency department visits, mean (SD), No. | 0.1 (0.4) | 0.2 (0.6) | 0.2 (0.6) | 0.3 (0.7) | 0.6 (1.3) |
| Unique drugs, mean (SD), No. | 0.2 (1.1) | 1.0 (2.2) | 0.9 (2.2) | 3.1 (3.9) | 8.0 (7.0) |
| Prescription generosity, mean (SD) | 0.1 (0.2) | 0.1 (0.3) | 0.1 (0.2) | 0.2 (0.3) | 0.2 (0.2) |
| Medical benefits’ generosity, mean (SD) | 0.2 (0.3) | 0.2 (0.2) | 0.2 (0.2) | 0.1 (0.1) | 0.1 (0.8) |
| Total baseline year costs, mean (SD), $ | 1629 (5948) | 4969 (10 296) | 4762 (8989) | 8314 (13 052) | 19 941 (26 331) |
| Long-term medication use | 1261 (3.4) | 7942 (16.4) | 3445 (13.9) | 35 142 (42.2) | 88 922 (65.7) |
| Medication adherence, mean (SD) | 0.55 (0.30) | 0.78 (0.24) | 0.76 (0.25) | 0.82 (0.19) | 0.82 (0.18) |
| Comorbidities | | | | | |
| Comorbidity score, mean (SD) | 0.1 (0.9) | 0.3 (1.4) | 0.3 (1.4) | 0.7 (1.8) | 2.1 (2.7) |
| Coronary artery disease | 312 (0.8) | 1065 (2.2) | 518 (2.1) | 3209 (3.9) | 13 664 (10.1) |
| Prior myocardial infarction | 55 (0.2) | 171 (0.4) | 66 (0.3) | 430 (0.5) | 1491 (1.1) |
| Asthma or chronic obstructive pulmonary disease | 1659 (4.4) | 5047 (10.4) | 2952 (11.9) | 12 795 (15.4) | 40 073 (29.6) |
| Hypertension | 8962 (23.9) | 30 172 (62.1) | 15 683 (63.4) | 63 097 (75.7) | 115 869 (85.7) |
| Diabetes | 577 (1.5) | 2508 (5.2) | 1360 (5.5) | 7857 (9.4) | 25 653 (19.0) |
| Acute kidney failure or end stage kidney disease | 197 (0.5) | 555 (1.1) | 275 (1.1) | 1591 (1.9) | 8604 (6.4) |
| Dementia | 210 (0.6) | 555 (1.4) | 362 (1.5) | 1162 (1.9) | 7805 (5.8) |
| Depression | 519 (1.4) | 2120 (4.4) | 1188 (4.8) | 5878 (7.1) | 20 787 (15.4) |
| Stroke | 93 (0.3) | 224 (0.5) | 102 (0.4) | 628 (0.8) | 2165 (1.6) |
| Liver disease | 28 (0.1) | 62 (0.1) | 15 (0.1) | 184 (0.2) | 702 (0.5) |
| Congestive heart failure | 107 (0.3) | 325 (0.7) | 168 (0.7) | 1083 (1.3) | 7235 (5.4) |
| Hyperlipidemia | 7821 (20.8) | 30 098 (62.0) | 15 376 (62.2) | 60 720 (72.9) | 105 003 (77.6) |
| Atrial fibrillation | 129 (0.3) | 420 (0.9) | 216 (0.9) | 1607 (1.9) | 8130 (6.0) |
| Osteoporosis | 1839 (4.9) | 8562 (17.6) | 4304 (17.4) | 19 080 (22.9) | 38 204 (28.3) |
| Obesity | 511 (1.3) | 1867 (3.8) | 971 (3.9) | 4572 (5.5) | 13 223 (9.8) |
| Acute stress | 245 (0.7) | 780 (1.6) | 385 (1.6) | 1973 (2.4) | 7427 (5.5) |
| Tobacco use | 1156 (3.1) | 2851 (5.9) | 1474 (6.0) | 6094 (7.3) | 16 499 (12.2) |

Denotes potentially modifiable predictors.

Statistical Analysis

After selecting the best-fitting number of trajectories, we assessed the ability to predict membership in each 2-year trajectory group using boosted logistic regression, a nonparametric machine learning method. The boosted algorithm is considered one of the best data-mining approaches for prediction problems.[16,29] Specifically, the algorithm creates a prediction model by building numerous small regression trees that together provide highly accurate classification. The boosting algorithm has several built-in protections from model overfitting, provides automatic variable selection, and describes the relative influence of predictors. It also considers all possible interaction terms between potential predictors. We used the gbm package in R with 5-fold cross-validation to identify the optimal number of trees and applied standard default values for tuning parameters to identify the optimal model.

For each trajectory group, we estimated 2 separate models. The first included all 37 baseline predictors (model 1) and the second included only the 10 baseline predictors that were considered a priori to be potentially modifiable (model 2). Because of the ability of boosted regression to handle missing data, an indicator of long-term medication use and mean PDC were both included as variables for model 1, and mean PDC was included alone as a variable for model 2.
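The trajectory-selection steps described earlier in the Methods (assigning each individual to the group with the highest membership probability, then checking that each group holds at least 5% of the sample) can be illustrated with a short Python sketch. The posterior probabilities below are made-up values and the function names are hypothetical; the study itself used the SAS procedure Proc Traj, not this code.

```python
from collections import Counter


def assign_groups(posteriors):
    """Assign each individual to the 1-indexed group with the highest posterior probability."""
    return [max(range(len(p)), key=lambda g: p[g]) + 1 for p in posteriors]


def group_shares(assignments, n_groups):
    """Fraction of the sample assigned to each group."""
    counts = Counter(assignments)
    total = len(assignments)
    return {g: counts.get(g, 0) / total for g in range(1, n_groups + 1)}


def mean_assigned_probability(posteriors, assignments, n_groups):
    """Average posterior probability of membership among those assigned to each group."""
    result = {}
    for g in range(1, n_groups + 1):
        probs = [p[g - 1] for p, a in zip(posteriors, assignments) if a == g]
        result[g] = sum(probs) / len(probs) if probs else float("nan")
    return result


def meets_minimum_share(shares, minimum=0.05):
    """True when every group holds at least `minimum` of the sample."""
    return all(share >= minimum for share in shares.values())


# Illustrative (made-up) posterior probabilities for 4 people and 3 groups.
posteriors = [
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.05, 0.15, 0.80],
    [0.70, 0.20, 0.10],
]
assignments = assign_groups(posteriors)
shares = group_shares(assignments, n_groups=3)
mean_probs = mean_assigned_probability(posteriors, assignments, n_groups=3)
print(assignments, shares, mean_probs, meets_minimum_share(shares))
```

In practice these checks would be repeated for each candidate number of groups (2 to 7 in this study) and weighed alongside the BIC and the visual interpretability of the trajectories.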
To avoid overoptimism bias, we used internal split-sample validation by randomly dividing the full cohort into 2 halves as an initial derivation sample and a validation sample for all models. We evaluated each model through discrimination measures. Discrimination, the model’s ability to distinguish between patients who do and do not experience the outcome, was measured by the C-statistic, which ranges from 0.5 (noninformative model) to 1.0 (perfect prediction).[34,35]

For clinical context, we explored the association between potentially modifiable baseline characteristics and membership in a rising-cost trajectory compared with other trajectory groups that had similar spending at baseline. Specifically, we used multivariable logistic regression, including each potentially modifiable variable, to compare membership in the rising-cost trajectory vs the other groups. This approach provides insight into baseline factors that may help distinguish patients who become costly later (ie, at least a year later) and potential levers for interventions. We also explored the relative influence of each potentially modifiable predictor from model 2.

We also evaluated the ability to predict patients who experience rising costs in year 2 defined using a decile-threshold approach (ie, those who were in the lower 90% of spending in year 1 and then were in the top 10% of spending in year 2) and patients who in trajectory modeling were estimated as belonging to a rising-cost trajectory. For this approach, we estimated each outcome with 2 additional models with boosted regression. Model 3 used all baseline predictors, and model 4 used the potentially modifiable predictors. This approach helps provide insight into whether these spending increases could be accurately predicted using baseline information less temporally proximate to the spending changes, which could ultimately inform intervention design and allow more time for interventions to be implemented.

We conducted several sensitivity analyses.
Although our primary analysis included zip code sociodemographic characteristics, we also included patients’ region of residence based on enrollment files as a predictor in model 1. Then, we included adherence to each class separately as predictors in models 1 and 2. Finally, we repeated measurements and analyses in a subsequent year (ie, 2012-2014) to confirm generalizability (eAppendix in the Supplement).

All analyses except for the boosted regression were performed using SAS version 9.4 (SAS Institute); the boosting algorithm was performed using R version 3.4.1 (The R Project for Statistical Computing). Statistical analysis was performed from August 2018 to December 2019.

Results

Study Population and Characteristics

Our cohort consisted of 329 476 patients (eTable 2 in the Supplement). Their mean (SD) age was 76.0 (7.2) years, and 190 346 (57.8%) were women. A 5-group trajectory model best described the 2-year spending patterns (Figure); the model on the log scale is shown in eFigure 2 in the Supplement. The probabilities of group membership are in eTable 3 in the Supplement. Trajectories with alternative numbers of groups and corresponding BICs are shown in eFigure 3 in the Supplement; models with more groups had marginal improvements and were less interpretable.

This final 5-group model included a minimal-user group (group 1, 37 572 individuals [11.4%]), a low-cost group (group 2, 48 575 individuals [14.7%]), a rising-cost group (group 3, 24 736 individuals [7.5%]), a moderate-cost group (group 4, 83 338 individuals [25.3%]), and a high-cost group (group 5, 135 255 individuals [41.2%]). Baseline characteristics for each group are shown in Table 1.
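As a reading aid for the C-statistics reported in the results that follow, the sketch below shows one way to carry out the two steps named in the Statistical Analysis section: a random split into derivation and validation halves, and a C-statistic computed as the share of member/nonmember pairs in which the member receives the higher predicted probability (ties count as half). This is a minimal Python illustration under stated assumptions: the scores and labels are invented, `split_sample` and `c_statistic` are hypothetical helper names, and the study's models were fit with the gbm package in R rather than with this code.

```python
import random


def split_sample(ids, seed=0):
    """Randomly divide ids into derivation and validation halves (seeded for repeatability)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def c_statistic(y_true, y_score):
    """Concordance of predicted scores with a binary outcome (equivalent to the ROC area)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("Need at least one member and one nonmember.")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Made-up validation data: 1 = member of the trajectory group, 0 = nonmember.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
derivation_ids, validation_ids = split_sample(range(10))
c_value = c_statistic(y_true, y_score)
print(round(c_value, 3))
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a cohort of several hundred thousand patients would call for a rank-based computation instead.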
Cost Prediction

Table 2 shows the results of the main prediction models in the validation sample. Four of the 5 2-year spending trajectory groups could be accurately predicted using all baseline predictors, namely the minimal-user (C-statistic: 0.951), low-cost (C-statistic: 0.810), rising-cost (C-statistic: 0.764), and high-cost groups (C-statistic: 0.899). Using potentially modifiable predictors alone, overall predictive ability remained moderate to strong, with the exception of the moderate-cost group (C-statistic: 0.684).

Figure. 2-Year Spending Patterns Using Trajectory Modeling. Monthly costs ($) are plotted against time (0 to 24 months) for 5 groups: minimal user (11.4%), low cost (14.7%), rising cost (7.5%), moderate cost (25.3%), and high cost (41.2%). The mean observed spending levels using 5-group trajectory modeling in the full sample are plotted. The percentages in the key refer to the number of patients who belong to each trajectory group out of the full cohort (bayesian information criterion for this model: 21704747).

Table 2. Ability of Models to Predict 2-Year Spending Trajectory Groups

| Group | Validation C-statistic |
|---|---|
| All baseline predictors, model 1 | |
| Group 1: minimal user | 0.951 |
| Group 2: low cost | 0.810 |
| Group 3: rising cost | 0.764 |
| Group 4: moderate cost | 0.728 |
| Group 5: high cost | 0.899 |
| Potentially modifiable predictors, model 2 | |
| Group 1: minimal user | 0.942 |
| Group 2: low cost | 0.783 |
| Group 3: rising cost | 0.753 |
| Group 4: moderate cost | 0.684 |
| Group 5: high cost | 0.873 |

Table 3 shows potentially modifiable prior-year predictors of being in a rising-cost trajectory compared with the other 3 groups with similar spending in the prior baseline year (mean, $1500-$8000 in year 0).
In particular, using more medications (odds ratio [OR]: 0.81; 95% CI, 0.79-0.84) and having more office visits (OR: 0.98; 95% CI, 0.97-0.99) were associated with lower odds of being in the rising-cost trajectory. Seeing more physicians (OR: 1.04; 95% CI, 1.02-1.06) and using tobacco (OR: 1.10; 95% CI, 1.02-1.20) were independently associated with higher odds of rising-cost membership. eFigure 4 in the Supplement shows the relative influence plots for each group incorporating only potentially modifiable characteristics (model 2). The plot for predicting the rising-cost group in particular indicates that the most predictive potentially modifiable factors were mean medication adherence (relative influence: 33.6), number of office visits (relative influence: 30.3), and number of medications (relative influence: 29.2).

The results from the models predicting rising costs using a decile-threshold–based method and the trajectory group method are shown in eTable 4 in the Supplement. Patients in the decile-threshold–based approach had higher total 2-year costs on average ($39 737), compared with the trajectory approach ($23 670). The ability to predict decile-threshold–based rising costs (model 4 C-statistic: 0.643) was lower than the trajectory-based approach (model 4 C-statistic: 0.753).

Sensitivity analyses incorporating region of residence and medication adherence by class are shown in eTables 5 and 6 in the Supplement. Notably, trajectory group membership was fairly similar across regions, and including these predictors did not meaningfully change C-statistics. Replication in a subsequent year of data resulted in similar patterns and sizes of group membership (eFigure 5 in the Supplement) as well as ability to predict those groups (eTable 7 in the Supplement).

Table 3. Association Between Potentially Modifiable Factors and Membership in the Rising-Cost Spending Trajectory (Group 3) vs Other Trajectory Groups

| Characteristics | OR (95% CI) for group 3: rising cost |
|---|---|
| Intercept (SE) | −1.86 (0.02) |
| Baseline covariate | |
| Unique medications, No. | 0.81 (0.79-0.84) |
| Office visits, No. | 0.98 (0.97-0.99) |
| Physicians, No. | 1.04 (1.02-1.06) |
| Pharmacies, No. | 0.99 (0.95-1.02) |
| Emergency department visits, No. | 0.98 (0.94-1.01) |
| Depression | 1.01 (0.92-1.10) |
| Tobacco use | 1.10 (1.02-1.20) |
| Obesity | 1.08 (0.98-1.19) |
| Acute stress | 0.87 (0.74-1.02) |

Abbreviation: OR, odds ratio.
Conducted within validation sample using logistic regression model with only potentially modifiable covariates compared with groups 1, 2, and 4.
Odds ratios are presented as a 1-unit increase for continuous variables.

Discussion

Using a data-driven approach to classify 2-year health spending for Medicare beneficiaries, we observed 5 distinct spending patterns. Membership in these groups could be accurately predicted, even when using a simple set of potentially modifiable characteristics from claims data. These results suggest that this approach could potentially help inform the design, application, and timing of interventions.

Prior efforts to predict health care spending have generally focused on a single composite value, such as total yearly costs or a threshold-based measure, such as being in the top 5% of spending, both of which collapse an entire year’s spending into a static variable. These approaches have had modest accuracy; C-statistics for threshold-based outcomes have generally ranged from 0.6 to 0.8.[2,5,36,37] Two recently published approaches offer other cluster-based solutions to elucidate subgroups of high-cost patients with some notable successes.[38,39] However, these were not applied to evaluate changes in spending, outcomes over more than 1 year, or to elucidate patients with rising costs.[38,39] They also focused on Medicare Advantage populations, which can differ from fee-for-service beneficiaries.[40,41]

Patients may have dynamic patterns of spending over longer periods of time that can be potentially meaningful, with implications on whom to outreach for intervention as well as when and perhaps how to do so.[1,12] For example, Tamang et al [10] identified low-spending patients in 1 year whose costs bloomed in the subsequent year using thresholds. When applied to our data, the ability to predict these patients using baseline data alone was modest. Using a data-driven approach, we observed a similarly sized group whose costs later increased that could be predicted somewhat better. One possible explanation could be that the 2-year time horizon itself as an outcome helped discriminate between groups. The ability to proactively differentiate between patients with rising or falling spending patterns using distally measured variables could better target interventions to those who are at greatest need. If successful, using these longer time horizons could allow more time for the implementation of potential interventions.

Focusing interventions on patients with rising costs has some theoretical advantages, even though predictive ability was modest. First, the size of the group identified in this study was modest (ie, 7.5%). Of course, it still may be infeasible to intervene upon a group this large, and not all costs may be preventable. Identifying additional segmentation may be necessary, and the use of this approach may be just a starting point. Regardless, the ability to predict better could target interventions to those at greater need, and targeting has been shown to result in better population-level outcomes.
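As a reading aid for the odds ratios in Table 3, which are reported per 1-unit increase for continuous variables, the short Python sketch below shows how a per-unit odds ratio scales over several units. It assumes the covariate enters the logistic model linearly on the log-odds scale, so that the odds ratio for k units is the per-unit odds ratio raised to the power k. The choice of 3 additional medications is an arbitrary illustration, and the function names are hypothetical.

```python
import math


def log_odds_coefficient(or_per_unit):
    """Logistic regression coefficient implied by a per-unit odds ratio."""
    return math.log(or_per_unit)


def odds_ratio_for_change(or_per_unit, units):
    """Odds ratio for a change of `units`, assuming a linear effect on the log-odds scale."""
    return math.exp(log_odds_coefficient(or_per_unit) * units)


# Per-unit OR for unique medications taken from Table 3; 3 units is illustrative only.
or_medications = 0.81
or_three_more = odds_ratio_for_change(or_medications, 3)
print(round(or_three_more, 2))  # about 0.53
```

Under that assumption, a patient filling 3 more unique medications at baseline would have roughly half the odds of rising-cost membership relative to an otherwise similar patient, which matches the direction of the association reported in the Results.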
When considering potential interventions, a prediction rule comprising the most influential potentially modifiable variables could be applied to better target patients. We observed several clinically actionable characteristics, such as therapeutic complexity (ie, number of medications or office visits), depression, medication adherence, and tobacco use that could be levers for interventions. Filling fewer medications and having fewer office visits were also predictors of the rising-cost trajectory, suggesting that patients may not be getting sufficient care to prevent future escalation of health problems. This information could also be used for intervention design to improve care.

Many health care organizations, insurers, researchers, and policy makers use claims data to identify patients for interventions. Therefore, the ability to better leverage these routinely collected data for cost predictions and interventions with a variety of more nuanced cost-modeling methods holds wide potential. Moreover, using data-driven approaches to classify longer-term spending may hold promise compared with threshold-based approaches alone.

Limitations

Several limitations warrant mention. First, we examined trajectories from January to December; patients with incomplete enrollment or other policy start and end dates may differ. Because of differences in how outcomes are categorized, model performance of predicting a cost trajectory (binary outcome) cannot be directly compared with predicting total costs (continuous outcome) or patients defined by the rising-cost decile-threshold approach. The variables included in prediction models may also not be exhaustive, and although we used validated algorithms, they may be insufficiently sensitive. Trajectory modeling also provides predicted group membership; individual members may be assigned to their closest trajectory, but there could be within-group heterogeneity.
The high-cost group was large, possibly because of how the model was specified (ie, log costs); one could potentially apply trajectories to identify subgroups within that group for further segmentation. Although group distribution did not differ on the basis of geographical region, the costs themselves were not adjusted for region; similarly, moving could have impacted relative changes in spending, but this was beyond the scope of this study. Furthermore, these results may not be generalizable to other payment systems, such as non–fee-for-service Medicare, Medicaid, or commercially insured beneficiaries. Although these other beneficiaries may have different spending levels, prior work has suggested similar patterns. Regardless, the same groups or predictive ability may not apply to other types of beneficiaries, and the results should be studied further to confirm reproducibility.

Conclusions

Using trajectory modeling to examine a 2-year time horizon improved the understanding of dynamic patterns, including the identification of a group of patients with progressively increasing costs and a group of patients with consistently high spending. This approach could be potentially adapted by health care organizations to improve cost-containment efforts.

ARTICLE INFORMATION

Accepted for Publication: August 4, 2020.

Published: October 19, 2020. doi:10.1001/jamanetworkopen.2020.20291

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Lauffenburger JC et al. JAMA Network Open.

Corresponding Author: Julie C. Lauffenburger, PharmD, PhD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont St, Ste 3030, Boston, MA 02120 (jlauffenburger@bwh.harvard.edu).

Author Affiliations: Center for Healthcare Delivery Sciences, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts (Lauffenburger, Choudhry); Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts (Lauffenburger, Mahesri, Choudhry).

Author Contributions: Dr Lauffenburger had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Lauffenburger, Choudhry.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Lauffenburger, Mahesri.
Critical revision of the manuscript for important intellectual content: Mahesri, Choudhry.
Statistical analysis: Lauffenburger, Mahesri.
Obtained funding: Lauffenburger.
Supervision: Lauffenburger, Choudhry.

Conflict of Interest Disclosures: Dr Choudhry reported receiving unrestricted research funding from Sanofi, AstraZeneca, and Medisafe Inc payable to Brigham and Women’s Hospital. No other disclosures were reported.

Funding/Support: This work was supported by an unrestricted investigator-initiated grant from the National Institute for Health Care Management to Brigham and Women’s Hospital. Dr Lauffenburger was also supported in part by a National Institutes of Health career development grant (K01 HL 141538). Dr Choudhry was also supported in part by a National Institutes of Health center grant (P30AG064199).
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

REFERENCES

1. Martin AB, Hartman M, Washington B, Catlin A; National Health Expenditure Accounts Team. National health spending: faster growth in 2015 as coverage expands and utilization increases. Health Aff (Millwood). 2017;36(1):166-176. doi:10.1377/hlthaff.2016.1330
2. Kuo RN, Dong YH, Liu JP, Chang CH, Shau WY, Lai MS. Predicting healthcare utilization using a pharmacy-based metric with the WHO’s Anatomic Therapeutic Chemical algorithm. Med Care. 2011;49(11):1031-1039. doi:10.1097/MLR.0b013e31822ebe11
3. Perkins AJ, Kroenke K, Unützer J, et al. Common comorbidity scales were similar in their ability to predict health care costs and mortality. J Clin Epidemiol. 2004;57(10):1040-1048. doi:10.1016/j.jclinepi.2004.03.002
4. Sales AE, Liu CF, Sloan KL, et al. Predicting costs of care using a pharmacy-based measure risk adjustment in a veteran population. Med Care. 2003;41(6):753-760. doi:10.1097/01.MLR.0000069502.75914.DD
5. Fishman PA, Goodman MJ, Hornbrook MC, Meenan RT, Bachman DJ, O’Keeffe Rosetti MC. Risk adjustment using automated ambulatory pharmacy data: the RxRisk model. Med Care. 2003;41(1):84-99. doi:10.1097/00005650-200301000-00011
6. Powers CA, Meyer CM, Roebuck MC, Vaziri B. Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care. 2005;43(11):1065-1072. doi:10.1097/01.mlr.0000182408.54390.00
7. Forrest CB, Lemke KW, Bodycombe DP, Weiner JP. Medication, diagnostic, and cost information as predictors of high-risk patients in need of care management. Am J Manag Care. 2009;15(1):41-48.
8. Yarger S, Rascati K, Lawson K, Barner J, Leslie R. Analysis of predictive value of four risk models in Medicaid recipients with chronic obstructive pulmonary disease in Texas. Clin Ther. 2008;30(Spec No):1051-1057. doi:10.1016/j.clinthera.2008.06.001
9. Mihaylova B, Briggs A, O’Hagan A, Thompson SG. Review of statistical methods for analysing healthcare resources and costs. Health Econ. 2011;20(8):897-916. doi:10.1002/hec.1653
10. Tamang S, Milstein A, Sørensen HT, et al. Predicting patient ‘cost blooms’ in Denmark: a longitudinal population-based study. BMJ Open. 2017;7(1):e011580. doi:10.1136/bmjopen-2016-011580
11. Lauffenburger JC, Franklin JM, Krumme AA, et al. Longitudinal patterns of spending enhance the ability to predict costly patients: a novel approach to identify patients for cost containment. Med Care. 2017;55(1):64-73. doi:10.1097/MLR.0000000000000623
12. Druss BG, Marcus SC, Olfson M, Tanielian T, Elinson L, Pincus HA. Comparing the national economic burden of five chronic conditions. Health Aff (Millwood). 2001;20(6):233-241. doi:10.1377/hlthaff.20.6.233
13. Ziaeian B, Fonarow GC. The prevention of hospital readmissions in heart failure. Prog Cardiovasc Dis. 2016;58(4):379-385. doi:10.1016/j.pcad.2015.09.004
14. Barnett ML, Hsu J, McWilliams JM. Patient characteristics and differences in hospital readmission rates. JAMA Intern Med. 2015;175(11):1803-1812. doi:10.1001/jamainternmed.2015.4660
15. Nuckols TK, Escarce JJ, Asch SM. The effects of quality of care on costs: a conceptual framework. Milbank Q. 2013;91(2):316-353. doi:10.1111/milq.12015
16. Franklin JM, Shrank WH, Lii J, et al. Observing versus predicting: initial patterns of filling predict long-term adherence more accurately than high-dimensional modeling techniques. Health Serv Res. 2016;51(1):220-239. doi:10.1111/1475-6773.12310
17. Krumme AA, Glynn RJ, Schneeweiss S, et al. Medication synchronization programs improve adherence to cardiovascular medications and health care use. Health Aff (Millwood). 2018;37(1):125-133. doi:10.1377/hlthaff.2017.0881
18. Austin PC, Ghali WA, Tu JV. A comparison of several regression models for analysing cost of CABG surgery. Stat Med. 2003;22(17):2799-2815. doi:10.1002/sim.1442
19. Artz MB, Hadsall RS, Schondelmeyer SW. Impact of generosity level of outpatient prescription drug coverage on prescription drug events and expenditure among older persons. Am J Public Health. 2002;92(8):1257-1263. doi:10.2105/AJPH.92.8.1257
20. Benner JS, Glynn RJ, Mogun H, Neumann PJ, Weinstein MC, Avorn J. Long-term persistence in use of statin therapy in elderly patients. JAMA. 2002;288(4):455-461. doi:10.1001/jama.288.4.455
21. Choudhry NK, Shrank WH, Levin RL, et al. Measuring concurrent adherence to multiple related medications. Am J Manag Care. 2009;15(7):457-464.
22. Goetzel RZ, Pei X, Tabrizi MJ, et al. Ten modifiable health risk factors are linked to more than one-fifth of employer-employee health care spending. Health Aff (Millwood). 2012;31(11):2474-2484. doi:10.1377/hlthaff.2011.0819
23. Yusuf S, Hawken S, Ounpuu S, et al; INTERHEART Study Investigators. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet. 2004;364(9438):937-952. doi:10.1016/S0140-6736(04)17018-9
24. Jones BL, Nagin DS. Advances in group-based trajectory modeling and a SAS procedure for estimating them. Sociol Methods Res. 2007;35(4):542-571. doi:10.1177/0049124106292364
25. Franklin JM, Shrank WH, Pakes J, et al. Group-based trajectory models: a new approach to classifying and predicting long-term medication adherence. Med Care. 2013;51(9):789-796. doi:10.1097/MLR.0b013e3182984c1f
26. Jones BL, Nagin DS, Roeder K. A SAS procedure based on mixture models for estimating developmental trajectories. Sociol Methods Res. 2001;29:374-393. doi:10.1177/0049124101029003005
27. Li Y, Zhou H, Cai B, et al. Group-based trajectory modeling to assess adherence to biologics among patients with psoriasis. Clinicoecon Outcomes Res. 2014;6:197-208. doi:10.2147/CEOR.S59339
28. Franklin JM, Krumme AA, Tong AY, et al. Association between trajectories of statin adherence and subsequent cardiovascular events. Pharmacoepidemiol Drug Saf. 2015;24(10):1105-1113. doi:10.1002/pds.3787
29. Koh HC, Tan G. Data mining applications in healthcare. J Healthc Inf Manag. 2005;19(2):64-72.
30. Robinson JW. Regression tree boosting to adjust health care cost predictions for diagnostic mix. Health Serv Res. 2008;43(2):755-772. doi:10.1111/j.1475-6773.2007.00761.x
31. Varian HR. Big data: new tricks for econometrics. J Econ Perspect. 2014;28(2):3-28. doi:10.1257/jep.28.2.3
32. Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001;54(8):774-781. doi:10.1016/S0895-4356(01)00341-9
33. Waljee AK, Higgins PD, Singal AG. A primer on predictive models. Clin Transl Gastroenterol. 2014;5:e44. doi:10.1038/ctg.2013.19
34. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128-138. doi:10.1097/EDE.0b013e3181c30fb2
35. Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007;115(7):928-935. doi:10.1161/CIRCULATIONAHA.106.672402
36. Liu CF, Sales AE, Sharp ND, et al. Case-mix adjusting performance measures in a veteran population: pharmacy- and diagnosis-based approaches. Health Serv Res. 2003;38(5):1319-1337. doi:10.1111/1475-6773.00179
37. Zhao Y, Ash AS, Ellis RP, et al. Predicting pharmacy costs and other medical costs using diagnoses and drug claims. Med Care. 2005;43(1):34-43.
38. Yan J, Linn KA, Powers BW, et al. Applying machine learning algorithms to segment high-cost patient populations. J Gen Intern Med. 2019;34(2):211-217. doi:10.1007/s11606-018-4760-8
39. Powers BW, Yan J, Zhu J, et al. Subgroups of high-cost Medicare Advantage patients: an observational study. J Gen Intern Med. 2019;34(2):218-225. doi:10.1007/s11606-018-4759-1
40. Powell SK. Choosing Medicare Advantage plans versus traditional fee-for-service: is this change the tipping point? Prof Case Manag. 2019;24(1):1-3. doi:10.1097/NCM.0000000000000338
41. Raetzman SO, Hines AL, Barrett ML, Karaca Z. Hospital stays in Medicare Advantage Plans versus the traditional Medicare fee-for-service program, 2013: statistical brief #198. Published December 2015. Accessed August 5, 2019. https://www.hcup-us.ahrq.gov/reports/statbriefs/sb198-Hospital-Stays-Medicare-Advantage-Versus-Traditional-Medicare.jsp
42. Stadhouders N, Kruse F, Tanke M, Koolman X, Jeurissen P. Effective healthcare cost-containment policies: a systematic review. Health Policy. 2019;123(1):71-79. doi:10.1016/j.healthpol.2018.10.015
43. Lauffenburger JC, Lewey J, Jan S, et al. Effectiveness of targeted insulin-adherence interventions for glycemic control using predictive analytics among patients with type 2 diabetes: a randomized clinical trial. JAMA Netw Open. 2019;2(3):e190657. doi:10.1001/jamanetworkopen.2019.0657

SUPPLEMENT.

eFigure 1. Study Design
eTable 1. Baseline Predictors of Spending Outcomes
eAppendix. Supplemental Methods
eTable 2. Patient Eligibility Criteria
eTable 3. Predicted Probabilities for Each Trajectory Group
eFigure 2. Two-Year Spending Patterns Using Trajectory Modeling: Original Log Scale
eFigure 3. Trajectory Modeling of Two-Year Healthcare Spending Using Other Numbers of Groups
eFigure 4. Relative Influence Plots From Boosted Regression Modeling for Predicting Trajectory Group Membership With Potentially-Modifiable Variables (Model 2)
eTable 4. Validation C-Statistics From Models Predicting Patients With Future Rising Spending
eTable 5. Geographic Region And Baseline Chronic Condition Medication Classes By Trajectory Group
eTable 6. Validation C-Statistics From Sensitivity Analyses
eFigure 5. Two-Year Spending Patterns Using Trajectory Modeling In 2013-2014 Data
eTable 7. Ability of Models to Predict Two-Year Spending Trajectory Groups In 2013-2014 Data

Supplemental Online Content

Lauffenburger JC, Mahesri M, Choudhry NK. Use of data-driven methods to predict long-term patterns of health care spending for Medicare patients. JAMA Netw Open. 2020;3(10):e2020291. doi:10.1001/jamanetworkopen.2020.20291

This supplemental material has been provided by the authors to give readers additional information about their work.

© 2020 Lauffenburger JC et al. JAMA Network Open.

eFigure 1. Study Design

[Figure: study timeline with marks at 1/1/2011, 1/1/2012 (cohort entry date), 1/1/2013, and 12/31/2013. Labels: Baseline (Year 0) predictors; Two-year spending outcomes.]

eTable 1. Baseline Predictors of Spending Outcomes

Type and Timing | Predictors and relevant International Classification of Diseases, 9th Edition codes (ICD-9) or Current Procedural Terminology (CPT) codes

Demographic
Age | Measured in enrollment files
Sex | Measured in enrollment files
Race/ethnicity | Measured in enrollment files
Median household income | [...] files), rounded to nearest US dollar
High school graduate | Percentage of persons graduating from high school in the [...] via 2010 US Census files)

Healthcare utilization
Part D plan switch | Patient switched their Part D plans in the baseline period, measured in enrollment files
Part D low income subsidy | Patient received a Part D low-income subsidy for at least one month in the baseline period
No. of office visits | Number of physician office visits, based on procedure codes
No. of physicians | Number of unique physicians associated with outpatient claims
No. of pharmacies used | Number of distinct pharmacies by Pharmacy ID that the patients filled medications at, defined using pharmacy claims
No. of hospitalizations | Number of all-cause hospitalizations
No. of Emergency Room visits | Number of all-cause emergency room visits
No. of unique drugs (i.e., therapeutic complexity) | Number of unique medications filled, by generic name
Prescription generosity | Out-of-pocket drug costs for filled medications divided by net total drug payments (Artz MB, et al. Am J Pub Health. 2002;92:1257-1263.)
[...] generosity | Out-of-pocket costs for all services and procedures divided by net total payments
Total baseline year costs | Total costs from all services and procedures related to inpatient and outpatient visits, procedures, durable medical equipment, home health, and medications
Chronic medication use | Defined by filling at least one medication [...] different chronic medication classes, including medications for the following disease states: anti-hypertensive, lipid-lowering, anti-diabetic, osteoporosis, or asthma/COPD
Average adherence | Measured by proportion of days covered in the baseline period in pharmacy claims, averaged across all eligible chronic medication classes

Comorbidities
Comorbidity score | Measured using the algorithm presented here: Gagne JJ et al. J Clin Epidemiol. 2011;64:749-759 (incorporates 37 possible comorbid conditions)
Coronary artery disease | 410.x-414.x, 429.2, V45.81 (hospital discharge, any position)
Prior Myocardial infarction | 410.x except 410.x2 AND length of stay >3 and <180 days (hospital discharge, any position)
Asthma or Chronic Obstructive Pulmonary Disorder | 493.x, 490.x, 491.x, 492.x, 496.x
Hypertension | 401.x-405.x, 437.2
Diabetes mellitus | 250.x (1 inpatient or two outpatient)
Acute renal failure or End Stage Renal Disease | 572.4, 580.xx, 584.xx, 580.0, 580.4, 580.89, 580.9, 582.4, 791.2, 791.3 OR 585.6
Dementia | 290.0, 290.1x, 290.2x, 290.3x, 290.4x, 294.20
Depression | 311, 296.2, 296.3, 300.4, 301.12, and 309.1
Stroke | 433.x1, 434 excluding 434.x0, 435.xx, 436.xx, 437.1x, 437.9x (hospital discharge, any position)
Liver disease | 570-573 (esophageal varices 456.xx) (hospital discharge, any position)
Congestive heart failure | 428.x, 398.91, 402.01, 402.11, 402.91, 404.01, 404.11, 404.91, 404.03, 404.13, 404.93 (hospital discharge, any position)
Hyperlipidemia | 272.xx
Atrial fibrillation | 427.31 (hospital discharge, any position)
Osteoporosis | 733.x
Obesity | 278
Acute Stress | 298, 308, 309
Tobacco use | 305.1, 649.0, 989.84, V15.82, CPT: 1034F
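The "Average adherence" predictor in eTable 1 rests on the proportion of days covered (PDC). A minimal sketch of one common way to compute it follows; it is not the authors' code. It assumes that the denominator is the full baseline window and that overlapping supply is counted once (some PDC variants instead shift early refills forward). The function names and the fill records are hypothetical.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, end):
    """PDC for one medication class over the window [start, end], inclusive.

    fills is a list of (fill_date, days_supply) tuples.
    A day covered by two overlapping fills is counted once.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    window_days = (end - start).days + 1
    return len(covered) / window_days


def average_adherence(fills_by_class, start, end):
    """Mean PDC across the medication classes with at least one fill."""
    pdcs = [
        proportion_of_days_covered(fills, start, end)
        for fills in fills_by_class.values()
        if fills
    ]
    return sum(pdcs) / len(pdcs) if pdcs else None


# Hypothetical 10-day window and fill history for one patient.
window_start, window_end = date(2011, 1, 1), date(2011, 1, 10)
fills_by_class = {
    "statin": [(date(2011, 1, 1), 5), (date(2011, 1, 4), 3)],  # covers Jan 1-6
    "beta_blocker": [(date(2011, 1, 9), 30)],  # only Jan 9-10 fall in the window
}
adherence = average_adherence(fills_by_class, window_start, window_end)
```

Here the statin class has 6 of 10 days covered and the beta-blocker class 2 of 10, so the averaged value is about 0.4. A real baseline period would span the full year rather than 10 days.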
eAppendix. Additional Detail on Predictors

Clinical comorbidities were measured using International Classification of Diseases 9th edition (ICD-9) codes in medical files including comorbidities such as coronary artery disease, prior myocardial infarction, COPD/Asthma, hypertension, hyperlipidemia, congestive heart failure, stroke, major depression, diabetes, liver disease, chronic kidney disease/dementia, osteoporosis, obesity, and tobacco use.

Replication in subsequent year

We used administrative claims data from the nationwide 1-million-member sample for year 2012-2014 as a separate replication dataset. As in the primary analysis, we restricted the cohort to the randomly-selected patients and used their patient-level files. To [...] January 1, 2012 to December 31, 2014. The cohort entry date was defined as January 1, 2013 [...] follow-up data [...]

As in the primary analysis, we remeasured total monthly healthcare spending over the 2013-2014 period and used the same transformation and adjustment as described in the Methods. We also remeasured the 37 predictors using data from Medicare enrollment files and claims but in the 2012 baseline year. We classified the same 10 predictors as potentially-modifiable (asterisks denoted in Table 1).

We also used trajectory modeling to empirically classify spending during the two-year follow-up in this replication sample. We also modeled longitudinal cost trajectories using calendar month as the time variable, costs in each month, order=4, group=5, and a censored-normal distribution (linear between minimum and maximum values). We evaluated other groupings, but the 5-group model still fit the data best.

After conducting the trajectories, as in the main paper, we also assessed the ability to predict membership in each two-year trajectory group using boosted logistic regression in this separate dataset, using the same approach described in the Methods. For each trajectory group, we estimated two separate models.
The first included all 37 baseline predictors (Model 1) and the second included only the 10 baseline predictors that were considered a priori to be potentially-modifiable (Model 2). As in the main manuscript, we used internal split-sample validation and evaluated each model through discrimination measures.

eTable 2. Patient Eligibility Criteria

Criterion | N
Study sample | 1,000,000
Was not in quality improvement study sample | 976,550
Enrolled on 1/1/12 in medical and pharmacy benefits | 550,215
Age ≥65 years on 1/1/11 | 433,561
Continuous enrollment from 1/1/11 to 12/31/13 | 329,476

eTable 3. Predicted Probabilities for Each Trajectory Group

Trajectory group | Mean (SD) predicted probability of trajectory group membership | % of patients with >0.90 membership probability
Group 1: Minimal-user | 0.97 (0.09) | 90.4%
Group 2: Low-cost | 0.91 (0.13) | 73.6%
Group 3: Rising costs | 0.88 (0.16) | 61.6%
Group 4: Moderate cost | 0.89 (0.14) | 63.9%
Group 5: High cost | 0.95 (0.10) | 84.4%

eFigure 2. Two-Year Spending Patterns Using Trajectory Modeling: Original Log Scale

[Figure: x-axis, Months (1-24); y-axis, Log monthly costs (in US$). Series: Minimal user (11.4%), Low-cost (14.7%), Rising-cost (7.5%), Moderate-user (25.3%), High-cost (41.2%).]

Note: The observed (solid lines) and predicted (dotted lines) probability of monthly costs on the logarithmic scale for each of 5 groups identified by the trajectory model.

eFigure 3. Trajectory Modeling of Two-Year Healthcare Spending Using Other Numbers of Groups

[Figure, 5 panels: x-axis, Months (1-24); y-axis, Monthly costs (in US $).]
A. Two-group model. BIC: 22202423. Group 1 (31.1%), Group 2 (68.9%)
B. Three-group model. BIC: 21858264. Group 1 (16.2%), Group 2 (30.1%), Group 3 (53.7%)
C. Four-group model. BIC: 21757980. Group 1 (11.4%), Group 2 (18.1%), Group 3 (27.4%), Group 4 (43.2%)
D. Six-group model. BIC: 21667412. Group 1 (9.1%), Group 2 (11.3%), Group 3 (8.7%), Group 4 (11.1%), Group 5 (25.6%), Group 6 (34.2%)
E. Seven-group model. BIC: 21646901. Group 1 (9.3%), Group 2 (7.6%), Group 3 (4.5%), Group 4 (9.5%), Group 5 (11.9%), Group 6 (25.0%), Group 7 (32.2%)

eFigure 4. Relative Influence Plots From Boosted Regression Modeling for Predicting Trajectory Group Membership With Potentially-Modifiable Variables (Model 2)

[Figure, 5 bar-chart panels: x-axis, Relative influence. Values per panel are listed below.]
A. Group 1: Minimal-user. No. of office visits, 71.815; No. of unique medications, 10.873; No. of physicians, 10.526; Average adherence, 4.705; No. of pharmacies, 1.84; Depression, 0.07; No. of ER visits, 0.06; Tobacco use, 0.044; Acute stress, 0.033; No. of hospitalizations, 0.019; Obesity, 0.017
B. Group 2: Low-cost. No. of office visits, 42.159; No. of unique medications, 36.138; Average adherence, 16.805; No. of physicians, 2.747; No. of ER visits, 0.694; Depression, 0.491; No. of pharmacies, 0.287; Obesity, 0.206; Tobacco use, 0.199; No. of hospitalizations, 0.164; Acute stress, 0.110
C. Group 3: Rising-cost. Average adherence, 33.605; No. of office visits, 30.334; No. of unique medications, 29.236; No. of physicians, 2.935; No. of pharmacies, 1.034; No. of ER visits, 0.714; No. of hospitalizations, 0.61; Tobacco use, 0.583; Obesity, 0.39; Depression, 0.306; Acute stress, 0.252
D. Group 4: Moderate-cost. No. of office visits, 36.438; Average adherence, 28.239; No. of unique medications, 24.773; No. of physicians, 3.527; No. of ER visits, 2.086; No. of pharmacies, 1.829; No. of hospitalizations, 1.055; Tobacco use, 0.675; Depression, 0.579; Obesity, 0.487; Acute stress, 0.312
E. Group 5: High-cost. No. of unique medications, 46.590; No. of office visits, 37.533; Average adherence, 6.090; No. of physicians, 4.518; No. of pharmacies, 2.211; No. of ER visits, 1.022; Depression, 0.972; No. of hospitalizations, 0.434; Acute stress, 0.347; Obesity, 0.208; Tobacco use, 0.076

eTable 4. Validation C-Statistics From Models Predicting Patients With Future Rising Spending

Model | Predictors | Cost-bloomers in Year 2 (n=20,470) | Group 3: Rising-cost (n=24,736)
3 | All Year 0 | 0.667 | 0.764
4 | Potentially-modifiable Year 0 | 0.643 | 0.753

Cost-bloomers: defined by being in lower 90% in Year 1 and top 10% in Year 2

eTable 5. Geographic Region and Baseline Chronic Condition Medication Classes By Trajectory Group

Characteristic | Group 1: Minimal-user | Group 2: Low-cost | Group 3: Rising-cost | Group 4: Moderate-cost | Group 5: High-cost

Geographic region of residence, N (%)
Midwest | 8,512 (22.7) | 11,890 (24.5) | 6,313 (25.5) | 20,612 (24.7) | 31,123 (23.0)
Northeast | 5,945 (15.8) | 7,828 (16.1) | 4,385 (17.7) | 15,166 (18.2) | 27,778 (20.5)
South | 13,707 (36.5) | 19,150 (39.4) | 9,620 (38.9) | 32,871 (39.4) | 53,942 (39.9)
West | 8,605 (22.9) | 9,418 (19.4) | 4,283 (17.3) | 14,280 (17.1) | 21,917 (16.2)

Medication class adherence (PDC), Mean (SD)
Alpha glucosidase inhibitors | 0.14 (0.20) | 0.69 (0.26) | 0.53 (0.21) | 0.73 (0.26) | 0.71 (0.29)
ACEIs/ARBs | 0.58 (0.31) | 0.81 (0.24) | 0.79 (0.27) | 0.85 (0.22) | 0.84 (0.23)
Anticholinergics, inhaled | 0.52 (0.36) | 0.47 (0.31) | 0.58 (0.33) | 0.60 (0.29) | 0.67 (0.30)
Beta-blockers | 0.59 (0.31) | 0.80 (0.24) | 0.79 (0.26) | 0.85 (0.22) | 0.85 (0.22)
Biguanide | 0.55 (0.29) | 0.72 (0.28) | 0.70 (0.28) | 0.81 (0.23) | 0.82 (0.23)
Bisphosphonates | 0.58 (0.28) | 0.69 (0.28) | 0.68 (0.28) | 0.71 (0.28) | 0.72 (0.28)
Calcium channel blockers | 0.58 (0.32) | 0.80 (0.26) | 0.81 (0.25) | 0.85 (0.22) | 0.85 (0.23)
Dipeptidyl peptidase-4 inhibitors | 0.55 (0.35) | 0.66 (0.32) | 0.68 (0.34) | 0.73 (0.28) | 0.79 (0.26)
Diuretics, thiazide | 0.56 (0.31) | 0.79 (0.26) | 0.79 (0.26) | 0.83 (0.23) | 0.81 (0.25)
Leukotriene modulators | 0.30 (0.25) | 0.48 (0.31) | 0.53 (0.29) | 0.62 (0.33) | 0.72 (0.30)
Long-acting beta-agonists | 0.59 (0.30) | 0.51 (0.39) | 0.75 (0.32) | 0.48 (0.28) | 0.57 (0.33)
Meglitinides | - | 0.79 (0.24) | 0.57 (0.39) | 0.71 (0.30) | 0.70 (0.29)
Other anti-hypertensives | 0.47 (0.30) | 0.73 (0.31) | 0.74 (0.30) | 0.79 (0.27) | 0.76 (0.30)
Selective estrogen receptor modulators | 0.66 (0.26) | 0.76 (0.28) | 0.70 (0.29) | 0.80 (0.24) | 0.81 (0.24)
Statins | 0.57 (0.30) | 0.78 (0.25) | 0.77 (0.26) | 0.82 (0.22) | 0.84 (0.22)
Sulfonylureas | 0.59 (0.30) | 0.73 (0.28) | 0.70 (0.28) | 0.83 (0.22) | 0.83 (0.23)
Thiazolidinediones | 0.48 (0.29) | 0.63 (0.27) | 0.59 (0.28) | 0.69 (0.28) | 0.73 (0.28)
Xanthines | 0.45 (0.29) | 0.73 (0.28) | 0.69 (0.44) | 0.71 (0.29) | 0.74 (0.30)

Abbreviations: PDC, Proportion of Days Covered; ACEI/ARB, Angiotensin-converting enzyme inhibitors/angiotensin receptor blockers

eTable 6. Validation C-Statistics From Sensitivity Analyses

Predictors | Group 1: Minimal-user | Group 2: Low-cost | Group 3: Rising-cost | Group 4: Moderate-cost | Group 5: High-cost
All baseline predictors (Model 1) + Region | 0.956 | 0.819 | 0.765 | 0.729 | 0.903
All baseline predictors (Model 1) with adherence by individual class | 0.955 | 0.816 | 0.766 | 0.729 | 0.900
Potentially-modifiable predictors (Model 2) with adherence by individual class | 0.946 | 0.785 | 0.756 | 0.689 | 0.874

eFigure 5. Two-Year Spending Patterns Using Trajectory Modeling In 2013-2014 Data

[Figure: x-axis, Months (1-24); y-axis, Monthly costs (in US$). Series: Minimal user (10.9%), Low-cost (13.9%), Rising costs (6.8%), Moderate-cost (25.6%), High-cost (42.7%).]

Note: The mean spending levels using 5-group trajectory modeling in the full 2013-2014 sample (n=297,150) are plotted. The percentages refer to the number of patients who belong to each trajectory group out of the full cohort.

eTable 7. Ability of Models to Predict Two-Year Spending Trajectory Groups In 2013-2014 Data

Group | Validation C-statistic

All baseline predictors (Model 1)
Group 1: Minimal-user | 0.983
Group 2: Low-cost | 0.864
Group 3: Rising-cost | 0.812
Group 4: Moderate-cost | 0.777
Group 5: High-cost | 0.941

Potentially-modifiable predictors (Model 2)
Group 1: Minimal-user | 0.952
Group 2: Low-cost | 0.787
Group 3: Rising-cost | 0.767
Group 4: Moderate-cost | 0.696
Group 5: High-cost | 0.887
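The validation C-statistics reported above come from split-sample validation. As an illustrative sketch only (not the authors' implementation, which used boosted regression), the following Python shows how a random split and a pairwise C-statistic can be computed. The 50/50 split, the function names, and the toy labels and scores are assumptions.

```python
import random


def split_sample(records, development_fraction=0.5, seed=0):
    """Randomly split records into development and validation sets.

    The 50/50 default is an assumption; the source does not state the split.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * development_fraction)
    return shuffled[:cut], shuffled[cut:]


def c_statistic(labels, scores):
    """Share of (member, non-member) pairs in which the member scores higher.

    Ties count as half a pair. This equals the area under the ROC curve.
    """
    members = [s for y, s in zip(labels, scores) if y == 1]
    others = [s for y, s in zip(labels, scores) if y == 0]
    if not members or not others:
        raise ValueError("need at least one member and one non-member")
    concordant = sum(
        1.0 if m > o else 0.5 if m == o else 0.0
        for m in members
        for o in others
    )
    return concordant / (len(members) * len(others))


# Hypothetical validation-set labels (1 = in the trajectory group) and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
validation_c = c_statistic(labels, scores)  # 3 of the 4 pairs are concordant

development, validation = split_sample(range(10))
```

The pairwise loop is quadratic in cohort size, so a rank-based formula would be preferable for a cohort of several hundred thousand patients. A value of 0.5 indicates no discrimination and 1.0 indicates perfect separation of group members from non-members.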

Use of Data-Driven Methods to Predict Long-term Patterns of Health Care Spending for Medicare Patients



Publisher
American Medical Association
Copyright
Copyright 2020 Lauffenburger JC et al. JAMA Network Open.
eISSN
2574-3805
DOI
10.1001/jamanetworkopen.2020.20291

Abstract

Key Points Question What are the long-term IMPORTANCE Current approaches to predicting health care costs generally rely on a single spending patterns by Medicare composite value of spending and focus on short time horizons. By contrast, examining patients’ beneficiaries, and do baseline patient spending patterns using dynamic measures applied over longer periods may better identify patients factors that are potentially modifiable with different spending and help target interventions to those with the greatest need. predict these patterns? Findings In this cohort study using a OBJECTIVE To classify patients by their long-term, dynamic health care spending patterns using a data-driven approach to classifying data-driven approach and assess the ability to predict spending patterns, particularly using Medicare beneficiaries by their spending characteristics that are potentially modifiable through intervention. over 2 years, 5 patterns were identified and could be predicted, including those DESIGN, SETTING, AND PARTICIPANTS This cohort study used a retrospective cohort design from with consistent spending levels and a random nationwide sample of Medicare fee-for-service administrative claims data to identify others with spending that increased beneficiaries aged 65 years or older with continuous eligibility from 2011 to 2013. Statistical analysis progressively. The most influential was performed from August 2018 to December 2019. potentially modifiable factors were number of medications, number of MAIN OUTCOMES AND MEASURES Group-based trajectory modeling was applied to the claims office visits, and mean medication data to classify the Medicare beneficiaries by their total health care spending patterns over a 2-year adherence. period. 
The ability to predict membership in each trajectory spending group was assessed using generalized boosted regression, a data mining approach to model building and prediction, with split- Meaning These findings suggest that sample validation. Models were estimated using (1) prior-year predictors and (2) prior-year predictors spending by Medicare beneficiaries falls potentially modifiable through intervention measured in the claims data. These models were into 5 distinct groups and could be evaluated using validated C-statistics. The relative influence of individual predictors in the models accurately predicted; this approach was evaluated. could be adapted by organizations to target interventions. RESULTS Among the 329 476 beneficiaries, the mean (SD) age was 76.0 (7.2) years and 190 346 (57.8%) were female. This final 5-group model included a minimal-user group (group 1, 37 572 Supplemental content individuals [11.4%]), a low-cost group (group 2, 48 575 individuals [14.7%]), a rising-cost group (group 3, 24 736 individuals [7.5%]), a moderate-cost group (group 4, 83 338 individuals [25.3%]), Author affiliations and article information are listed at the end of this article. and a high-cost group (group 5, 135 255 individuals [41.2%]). Potentially modifiable characteristics strongly predicted these patterns (C-statistics range: 0.68-0.94). For groups with progressively increasing spending in particular, the most influential factors were number of medications (relative influence: 29.2), number of office visits (relative influence: 30.3), and mean medication adherence (relative influence: 33.6). CONCLUSIONS AND RELEVANCE Using a data-driven approach, distinct spending patterns were identified with high accuracy. The potentially modifiable predictors of membership in the rising-cost group represent important levers for early interventions that may prevent later spending increases. 
This approach could be adapted by organizations to target quality improvement interventions, particularly because numerous health care organizations are increasingly using these routinely collected data.

JAMA Network Open. 2020;3(10):e2020291. doi:10.1001/jamanetworkopen.2020.20291

Open Access. This is an open access article distributed under the terms of the CC-BY License.

JAMA Network Open | Health Policy

Introduction

With health care spending now accounting for almost 18% of the US gross domestic product, identifying individuals who may benefit from interventions to address potentially avoidable spending has become a central priority for health insurers and health care professionals. Current approaches generally focus on prediction or intervention for patients who may have escalating costs on the basis of a single composite value of total spending over short time periods.[2,3]

However, many patients experience substantial increases or decreases in spending not captured by these approaches.[4-9] For example, Tamang et al[10] identified a definable group of low-spending patients in 1 year whose costs bloomed (ie, they became high-spending individuals) in the subsequent year in Denmark. Similarly, Lauffenburger et al observed 7 distinct, dynamic patterns of spending over a 1-year period in commercially insured beneficiaries, including individuals whose costs increased rapidly toward the end of the year and another group of high-cost individuals for whom spending decreased. These prior studies were conducted over a 1-year period, yet there may also be dynamic patterns of spending over longer periods that may have implications both for whom to reach out to for intervention and when to do so.[1,12]
For example, patients with the same clinical conditions who are hospitalized early during a 12-month period may differ meaningfully from those hospitalized later, although both could be identified as having rising costs.[13,14] If these different spending patterns could be predicted using routinely collected data, then the ability to proactively differentiate patients with increasing or decreasing spending patterns could better target interventions to those who are at greatest need of improved health or cost containment. The predictive accuracy of spending may also be higher when evaluating a long-term, compared with a short-term, time horizon, as seen for other outcomes.

Accordingly, we sought to classify patients according to their spending patterns over a 2-year period and to evaluate the ability to predict these spending groups using patient characteristics that are potentially modifiable.

Methods

This cohort study was approved by the institutional review board of Brigham and Women’s Hospital and was granted a waiver of informed patient consent because the data are secondary routinely collected data. This study follows the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Setting and Study Design

This study used administrative claims data from a 1-million-member sample of Medicare fee-for-service beneficiaries; the original sample included approximately 20 000 beneficiaries in a nationwide quality improvement program and approximately 980 000 randomly selected patients nationally. We restricted the cohort to the randomly selected patients and used their paid Medicare Parts A, B, and D patient-level files containing all procedures, physician encounters, hospitalizations, and filled outpatient prescriptions, including amounts paid by the insurer and patient. These data were linked to eligibility data including age, race/ethnicity, gender, and geographic location of residence.
Aggregate zip code level data on median income and educational attainment were obtained by linking with 2010 US Census data.

To be included, patients had to be aged 65 years or older and maintain continuous eligibility from January 1, 2011, to December 31, 2013. The cohort entry date was defined as January 1, 2012, to provide 1 year of baseline data (year 0) and 2 years of follow-up data (year 1 and year 2) (eFigure 1 in the Supplement).

Costs

We measured total monthly health care spending over a 2-year period for each patient by summing the allowed amounts on all inpatient, outpatient, and prescription drug claims. Monthly costs were generated by summing the costs in each month and were standardized by dividing the summed costs by the number of days in that month and then multiplying the result by 30. Costs were then logarithmically transformed to normalize their distribution, after adding $0.01, as frequently done.[9,18] Costs were inflated using the Medical Care Component of the Consumer Price Index to 2013 dollars when necessary.

Predictors

Using data from Medicare enrollment files and claims, we defined 37 clinically relevant baseline characteristics that were potential predictors of future spending (eTable 1 in the Supplement). These baseline variables were measured during the 12 months prior to the 2-year period during which cost outcomes were evaluated (eFigure 1 in the Supplement). These variables were based on characteristics used in cost modeling in claims data in the peer-reviewed literature and from the quality-cost theoretical framework.[6,10,11,15]
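The monthly cost standardization described in the Costs section (divide by the days in the month, multiply by 30, then log-transform after adding $0.01) can be illustrated with a minimal sketch. The function names and the example amounts are hypothetical, and the use of the natural logarithm is an assumption; the text says only that costs were "logarithmically transformed."

```python
import calendar
import math


def standardize_to_30_days(month_total, year, month):
    """Scale a calendar month's summed cost to a 30-day month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return month_total / days_in_month * 30


def log_cost(cost, offset=0.01):
    """Log-transform a cost after adding a small offset so $0 months are defined.

    Assumption: natural log; the source does not state the base.
    """
    return math.log(cost + offset)


# Hypothetical patient with $290 of allowed amounts in February 2012 (29 days).
february = standardize_to_30_days(290.0, 2012, 2)
print(round(february, 2))  # 300.0
print(round(log_cost(february), 4))
```

Standardizing before the transform keeps short months such as February from looking artificially cheap relative to 31-day months.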
These sets of predictors have also been shown to have accuracy equivalent to that of proprietary risk-adjustment methods in predicting 1-year spending.

Sociodemographic characteristics included age, race/ethnicity, gender, and community-level variables based on the member’s zip code of residence, including median household income and educational attainment. Clinical comorbidities were measured using International Classification of Diseases, Ninth Revision codes (eAppendix and eTable 1 in the Supplement). Each patient’s number of unique prescriptions by generic name (ie, therapeutic complexity), physician office visits, emergency department visits, hospitalizations, unique physicians visited, unique pharmacies used, benefits’ generosity (copayments and deductibles or total net payments), and baseline year total costs were also measured.

Adherence to long-term medication classes (eg, β-blockers) was measured in the baseline year. For each class, we created a supply diary beginning with the first fill for each class in the baseline year. This diary linked all observed fills based on dispensing date and days’ supply; switching was allowed within each class (eg, β-blockers). From this, we calculated the proportion of days covered (PDC) as a mean across classes that the patient filled to yield 1 mean PDC.[20,21]

We categorized each predictor by whether it was potentially modifiable, defined by whether it could theoretically be addressed in interventions and by classifications in prior literature.[22,23] For example, number of unique physicians could be potentially modifiable, while race/ethnicity is not. In total, we classified 10 predictors as potentially modifiable (Table 1).

Data-Driven Approach to Modeling Long-term Costs

We used trajectory modeling to empirically classify spending during follow-up. One advantage is that it allows the data to define the cost outcomes, rather than using arbitrarily selected thresholds.
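The supply diary and mean PDC described under Predictors can be sketched as follows. This is an illustration under stated assumptions, not the study's algorithm: the function names are hypothetical, early refills are assumed to be carried forward rather than overlapped, and the denominator is assumed to run from the first fill to the end of the baseline year.

```python
from datetime import date, timedelta


def pdc_for_class(fills, period_end):
    """Proportion of days covered for one medication class.

    fills: list of (dispensing_date, days_supply) tuples within the class.
    Assumption: supply from an early refill starts when the prior supply ends.
    """
    fills = sorted(fills)
    start = fills[0][0]
    window_days = (period_end - start).days + 1
    stop = period_end + timedelta(days=1)  # first day outside the window
    covered = 0
    next_uncovered = start
    for dispensed, days_supply in fills:
        begin = max(dispensed, next_uncovered)
        end = begin + timedelta(days=days_supply)  # exclusive end of supply
        if min(end, stop) > begin:
            covered += (min(end, stop) - begin).days
        next_uncovered = end
    return covered / window_days


def mean_pdc(fills_by_class, period_end):
    """Average the class-level PDCs over the classes the patient filled."""
    values = [pdc_for_class(f, period_end) for f in fills_by_class.values() if f]
    return sum(values) / len(values)


# Hypothetical fills observed through March 31, 2011 (a 90-day window).
example = {
    "beta_blocker": [(date(2011, 1, 1), 30), (date(2011, 2, 15), 30)],
    "statin": [(date(2011, 1, 1), 90)],
}
print(round(mean_pdc(example, date(2011, 3, 31)), 3))  # 0.833
```

Because fills are pooled within a class, switching between agents in the same class does not register as a gap, which matches the class-level definition in the text.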
It also considers changes in spending over time, rather than aggregating costs over a set time.

To define spending patterns, we used the previously described SAS procedure Proc Traj, a free add-on.[24-26] In brief, group-based trajectory models are an application of finite mixture modeling that identify clusters of individuals with similar outcome patterns over time. This modeling approach analyzes longitudinal data by fitting a semiparametric (discrete) mixture model, estimating each individual’s probability of membership in each group, and assigning them to the group according to their highest probability. We modeled longitudinal cost trajectories using calendar month as the time variable, costs in each month, order equal to 4, and a censored-normal distribution (linear between minimum and maximum values).[11,24,26] The models were estimated using a forward classifying approach using 2 to 7 groups, each time investigating model fit using the bayesian information criterion (BIC), whereby a lower BIC indicates better model fit. The number of groups investigated was capped at 7 on the basis of groupings observed in prior work. In addition to considering BIC, other key considerations in selecting the best-fitting trajectory were the ability to visually interpret separate groups, minimum membership probabilities in each group, and having 5% or more of the sample in each group.[26-28]

Table 1. Patient Characteristics by Spending Trajectory

Values are Patients, No. (%) unless otherwise indicated.

| Covariates | Group 1: minimal user (n = 37 572) | Group 2: low cost (n = 48 575) | Group 3: rising cost (n = 24 736) | Group 4: moderate cost (n = 83 338) | Group 5: high cost (n = 135 255) |
|---|---|---|---|---|---|
| Demographic characteristics | | | | | |
| Age, mean (SD), y | 73.8 (7.7) | 74.8 (6.8) | 75.1 (6.9) | 75.9 (7.0) | 77.1 (7.2) |
| Female | 16 394 (43.6) | 26 531 (54.6) | 13 500 (54.6) | 49 287 (59.1) | 84 634 (62.6) |
| Race/ethnicity | | | | | |
| Non-Hispanic White | 30 732 (81.8) | 42 723 (88.0) | 22 020 (89.0) | 74 184 (89.0) | 118 637 (87.7) |
| Black | 3610 (9.6) | 3255 (6.7) | 1169 (6.6) | 5332 (6.4) | 9646 (7.1) |
| Other | 1299 (3.5) | 1297 (2.7) | 539 (2.2) | 1588 (1.9) | 2293 (1.7) |
| Asian or Pacific Islander | 867 (2.3) | 792 (1.6) | 322 (1.3) | 1330 (1.6) | 2448 (1.8) |
| Hispanic | 1064 (2.8) | 508 (1.1) | 236 (1.0) | 904 (1.1) | 2231 (1.7) |
| Zip code median income, mean (SD), $ | 59 960 (24 347) | 56 572 (24 199) | 56 696 (23 765) | 56 683 (23 776) | 55 929 (23 808) |
| Zip code high school graduates, mean (SD), % | 80.8 (21.0) | 84.4 (16.6) | 84.5 (16.2) | 84.6 (15.8) | 83.9 (15.9) |
| Health care use | | | | | |
| Part D Plan switch | 163 (0.4) | 173 (0.4) | 82 (0.3) | 468 (0.6) | 1828 (1.4) |
| Low-income subsidy | 3584 (9.5) | 2735 (5.6) | 1379 (5.6) | 8063 (9.7) | 31 019 (22.9) |
| Office visits, mean (SD), No. | 1.2 (2.0) | 4.5 (3.5) | 4.7 (3.7) | 7.1 (5.0) | 11.3 (8.3) |
| Physicians, mean (SD), No. | 0.4 (0.7) | 1.0 (1.0) | 1.0 (0.9) | 1.3 (1.1) | 1.8 (1.3) |
| Pharmacies used, mean (SD), No. | 0.1 (0.4) | 0.4 (0.8) | 0.3 (0.7) | 0.8 (1.1) | 1.3 (1.3) |
| Hospitalizations, mean (SD), No. | 0.0 (0.2) | 0.1 (0.4) | 0.1 (0.3) | 0.2 (0.5) | 0.4 (0.8) |
| Emergency department visits, mean (SD), No. | 0.1 (0.4) | 0.2 (0.6) | 0.2 (0.6) | 0.3 (0.7) | 0.6 (1.3) |
| Unique drugs, mean (SD), No. | 0.2 (1.1) | 1.0 (2.2) | 0.9 (2.2) | 3.1 (3.9) | 8.0 (7.0) |
| Prescription generosity, mean (SD) | 0.1 (0.2) | 0.1 (0.3) | 0.1 (0.2) | 0.2 (0.3) | 0.2 (0.2) |
| Medical benefits’ generosity, mean (SD) | 0.2 (0.3) | 0.2 (0.2) | 0.2 (0.2) | 0.1 (0.1) | 0.1 (0.8) |
| Total baseline year costs, mean (SD), $ | 1629 (5948) | 4969 (10 296) | 4762 (8989) | 8314 (13 052) | 19 941 (26 331) |
| Long-term medication use | 1261 (3.4) | 7942 (16.4) | 3445 (13.9) | 35 142 (42.2) | 88 922 (65.7) |
| Medication adherence, mean (SD) | 0.55 (0.30) | 0.78 (0.24) | 0.76 (0.25) | 0.82 (0.19) | 0.82 (0.18) |
| Comorbidities | | | | | |
| Comorbidity score, mean (SD) | 0.1 (0.9) | 0.3 (1.4) | 0.3 (1.4) | 0.7 (1.8) | 2.1 (2.7) |
| Coronary artery disease | 312 (0.8) | 1065 (2.2) | 518 (2.1) | 3209 (3.9) | 13 664 (10.1) |
| Prior myocardial infarction | 55 (0.2) | 171 (0.4) | 66 (0.3) | 430 (0.5) | 1491 (1.1) |
| Asthma or chronic obstructive pulmonary disease | 1659 (4.4) | 5047 (10.4) | 2952 (11.9) | 12 795 (15.4) | 40 073 (29.6) |
| Hypertension | 8962 (23.9) | 30 172 (62.1) | 15 683 (63.4) | 63 097 (75.7) | 115 869 (85.7) |
| Diabetes | 577 (1.5) | 2508 (5.2) | 1360 (5.5) | 7857 (9.4) | 25 653 (19.0) |
| Acute kidney failure or end stage kidney disease | 197 (0.5) | 555 (1.1) | 275 (1.1) | 1591 (1.9) | 8604 (6.4) |
| Dementia | 210 (0.6) | 555 (1.4) | 362 (1.5) | 1162 (1.9) | 7805 (5.8) |
| Depression | 519 (1.4) | 2120 (4.4) | 1188 (4.8) | 5878 (7.1) | 20 787 (15.4) |
| Stroke | 93 (0.3) | 224 (0.5) | 102 (0.4) | 628 (0.8) | 2165 (1.6) |
| Liver disease | 28 (0.1) | 62 (0.1) | 15 (0.1) | 184 (0.2) | 702 (0.5) |
| Congestive heart failure | 107 (0.3) | 325 (0.7) | 168 (0.7) | 1083 (1.3) | 7235 (5.4) |
| Hyperlipidemia | 7821 (20.8) | 30 098 (62.0) | 15 376 (62.2) | 60 720 (72.9) | 105 003 (77.6) |
| Atrial fibrillation | 129 (0.3) | 420 (0.9) | 216 (0.9) | 1607 (1.9) | 8130 (6.0) |
| Osteoporosis | 1839 (4.9) | 8562 (17.6) | 4304 (17.4) | 19 080 (22.9) | 38 204 (28.3) |
| Obesity | 511 (1.3) | 1867 (3.8) | 971 (3.9) | 4572 (5.5) | 13 223 (9.8) |
| Acute stress | 245 (0.7) | 780 (1.6) | 385 (1.6) | 1973 (2.4) | 7427 (5.5) |
| Tobacco use | 1156 (3.1) | 2851 (5.9) | 1474 (6.0) | 6094 (7.3) | 16 499 (12.2) |

Denotes potentially modifiable predictors.

Statistical Analysis

After selecting the best-fitting number of trajectories, we assessed the ability to predict membership in each 2-year trajectory group using boosted logistic regression, a nonparametric machine learning method. The boosted algorithm is considered one of the best data-mining approaches for prediction problems.[16,29] Specifically, the algorithm creates a prediction model by building numerous small regression trees that together provide highly accurate classification. The boosting algorithm has several built-in protections from model overfitting, provides automatic variable selection, and describes the relative influence of predictors. It also considers all possible interaction terms between potential predictors. We used the gbm package in R with 5-fold cross-validation to identify the optimal number of trees and applied standard default values for tuning parameters to identify the optimal model.

For each trajectory group, we estimated 2 separate models. The first included all 37 baseline predictors (model 1) and the second included only the 10 baseline predictors that were considered a priori to be potentially modifiable (model 2). Because of the ability of boosted regression to handle missing data, an indicator of long-term medication use and mean PDC were both included as variables for model 1, and mean PDC was included alone as a variable for model 2.
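The assignment step of group-based trajectory modeling described earlier (each individual goes to the group with the highest estimated membership probability) and the rule of having 5% or more of the sample in each group can be sketched in a few lines. This is an illustrative sketch only: the posterior probabilities below are made up, the function names are hypothetical, and fitting the mixture model itself (done with Proc Traj in the study) is not shown.

```python
def assign_groups(posteriors):
    """Assign each individual to the group with the highest membership probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in posteriors]


def group_shares(assignments, n_groups):
    """Fraction of the sample assigned to each group."""
    total = len(assignments)
    return [assignments.count(g) / total for g in range(n_groups)]


def meets_size_rule(shares, min_share=0.05):
    """True when every group holds at least min_share of the sample."""
    return all(share >= min_share for share in shares)


# Hypothetical posterior membership probabilities for 4 people and 3 groups.
posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
groups = assign_groups(posteriors)
shares = group_shares(groups, 3)
print(groups)                   # [0, 1, 2, 0]
print(shares)                   # [0.5, 0.25, 0.25]
print(meets_size_rule(shares))  # True
```

A candidate model whose smallest group falls under the 5% share would fail this check even if its BIC were favorable, which is why the text lists group size alongside BIC and interpretability.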
To avoid overoptimism bias, we used internal split-sample validation by randomly dividing the full cohort into 2 halves as an initial derivation sample and a validation sample for all models. We evaluated each model through discrimination measures. Discrimination, the model’s ability to distinguish between patients who do and do not experience the outcome, was measured by the C-statistic, which ranges from 0.5 (noninformative model) to 1.0 (perfect prediction).[34,35]

For clinical context, we explored the association between potentially modifiable baseline characteristics and membership in a rising-cost trajectory compared with other trajectory groups that had similar spending at baseline. Specifically, we used multivariable logistic regression, including each potentially modifiable variable, to compare membership in the rising-cost trajectory vs other groups. This approach provides insight into baseline factors that may help distinguish patients who become costly later (ie, at least a year later) and potential levers for interventions. We also explored the relative influence of each potentially modifiable predictor from model 2.

We also evaluated the ability to predict patients who experience rising costs in year 2 defined using a decile-threshold approach (ie, those who were in the lower 90% of spending in year 1 and then in the top 10% of spending in year 2) and patients who in trajectory modeling were estimated as belonging to a rising-cost trajectory. For this approach, we estimated each outcome with 2 additional models with boosted regression. Model 3 used all baseline predictors, and model 4 used the potentially modifiable predictors. This approach helps provide insight into whether these spending increases could be accurately predicted using baseline information less temporal to the spending changes, which could ultimately inform intervention design and allow more time for them to be implemented.

We conducted several sensitivity analyses.
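The decile-threshold definition of rising costs given above (lower 90% of spending in year 1, top 10% in year 2) can be made concrete with a small sketch. The cutoff rule, tie handling, function names, and cost values are all assumptions for illustration; the source does not specify how the decile boundary was computed.

```python
def top_decile_cutoff(costs):
    """Smallest cost that falls in the top 10% of the sample.

    Assumption: the cutoff is the value at position floor(0.9 * n) in sorted order.
    """
    ordered = sorted(costs)
    return ordered[(9 * len(ordered)) // 10]


def rising_cost_patients(year1_costs, year2_costs):
    """Indices of patients below the year 1 cutoff and at or above the year 2 cutoff."""
    cutoff1 = top_decile_cutoff(year1_costs)
    cutoff2 = top_decile_cutoff(year2_costs)
    return [
        i
        for i, (c1, c2) in enumerate(zip(year1_costs, year2_costs))
        if c1 < cutoff1 and c2 >= cutoff2
    ]


# Hypothetical total costs for 10 patients in each year.
year1 = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]
year2 = [10, 20, 30, 40, 50, 60, 70, 80, 5000, 900]
print(rising_cost_patients(year1, year2))  # [8]
```

Unlike a trajectory group, this flag depends only on two annual totals, so it cannot distinguish a gradual rise from a single late spike.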
Although our primary analysis included zip code sociodemographic characteristics, we also included patients’ region of residence based on enrollment files as a predictor in model 1. Then, we included adherence to each class separately as predictors in models 1 and 2. Finally, we repeated measurements and analyses in a subsequent year (ie, 2012-2014) to confirm generalizability (eAppendix in the Supplement).

All analyses except for the boosted regression were performed using SAS version 9.4 (SAS Institute); the boosting algorithm was performed using R version 3.4.1 (The R Project for Statistical Computing). Statistical analysis was performed from August 2018 to December 2019.

Results

Study Population and Characteristics

Our cohort consisted of 329 476 patients (eTable 2 in the Supplement). Their mean (SD) age was 76.0 (7.2) years, and 190 346 (57.8%) were women. A 5-group trajectory model best described the 2-year spending patterns (Figure); the model on the log scale is shown in eFigure 2 in the Supplement. The probabilities of group membership are in eTable 3 in the Supplement. Trajectories with alternative numbers of groups and corresponding BICs are shown in eFigure 3 in the Supplement; models with more groups had marginal improvements and were less interpretable.

This final 5-group model included a minimal-user group (group 1, 37 572 individuals [11.4%]), a low-cost group (group 2, 48 575 individuals [14.7%]), a rising-cost group (group 3, 24 736 individuals [7.5%]), a moderate-cost group (group 4, 83 338 individuals [25.3%]), and a high-cost group (group 5, 135 255 individuals [41.2%]). Baseline characteristics for each group are shown in Table 1.
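The C-statistic used to evaluate each model in the Statistical Analysis section is the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it. A minimal pairwise sketch follows; the function name and the scores are hypothetical, and the study computed its values in the validation sample with its own software rather than with this code.

```python
def c_statistic(scores, outcomes):
    """Concordance over all (case, non-case) pairs; tied scores count as half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical predicted probabilities of belonging to one trajectory group.
predicted = [0.9, 0.4, 0.6, 0.2]
observed = [1, 1, 0, 0]
print(c_statistic(predicted, observed))  # 0.75
```

The pairwise loop is quadratic in sample size, so a cohort of this study's scale would call for a rank-based implementation; the sketch is meant only to show what the reported values measure.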
Cost Prediction

Table 2 shows the results of the main prediction models in the validation sample. Four of the five 2-year spending trajectory groups could be accurately predicted using all baseline predictors: the minimal-user (C-statistic: 0.951), low-cost (C-statistic: 0.810), rising-cost (C-statistic: 0.764), and high-cost groups (C-statistic: 0.899). Using potentially modifiable predictors alone, overall predictive ability remained moderate to strong, with the exception of the moderate-cost group (C-statistic: 0.684).

Figure. 2-Year Spending Patterns Using Trajectory Modeling
[Line plot not reproduced. x-axis: Time, mo (0 to 24); y-axis: Monthly costs, $. Series: Minimal user (11.4%), Low cost (14.7%), Rising cost (7.5%), Moderate cost (25.3%), High cost (41.2%).]
The mean observed spending levels using 5-group trajectory modeling in the full sample are plotted. The percentages in the key refer to the number of patients who belong to each trajectory group out of the full cohort (bayesian information criterion for this model: 21704747).

Table 2. Ability of Models to Predict 2-Year Spending Trajectory Groups

| Group | Validation C-statistic |
|---|---|
| All baseline predictors, model 1 | |
| Group 1: minimal user | 0.951 |
| Group 2: low cost | 0.810 |
| Group 3: rising cost | 0.764 |
| Group 4: moderate cost | 0.728 |
| Group 5: high cost | 0.899 |
| Potentially modifiable predictors, model 2 | |
| Group 1: minimal user | 0.942 |
| Group 2: low cost | 0.783 |
| Group 3: rising cost | 0.753 |
| Group 4: moderate cost | 0.684 |
| Group 5: high cost | 0.873 |

Table 3 shows potentially modifiable prior-year predictors of being in a rising-cost trajectory compared with the other 3 groups with similar spending in the prior baseline year (mean, $1500-$8000 in year 0).
In particular, using more medications (odds ratio [OR]: 0.81; 95% CI, 0.79-0.84) and having more office visits (OR: 0.98; 95% CI, 0.97-0.99) were associated with lower odds of being in the rising-cost trajectory. Seeing more physicians (OR: 1.04; 95% CI, 1.02-1.06) and using tobacco (OR: 1.10; 95% CI, 1.02-1.20) were also factors independently associated with rising-cost membership. eFigure 4 in the Supplement shows the relative influence plots for each group incorporating only potentially modifiable characteristics (model 2). The plot for predicting the rising-cost group in particular indicates that the most predictive potentially modifiable factors were mean medication adherence (relative influence: 33.6), number of office visits (relative influence: 30.3), and number of medications (relative influence: 29.2).

The results from the models predicting rising costs using a decile-threshold–based method and the trajectory group method are shown in eTable 4 in the Supplement. Patients in the decile-threshold–based approach had higher total 2-year costs on average ($39 737), compared with the trajectory approach ($23 670). The ability to predict decile-threshold–based rising costs (model 4 C-statistic: 0.643) was lower than the trajectory-based approach (model 4 C-statistic: 0.753).

Sensitivity analyses incorporating region of residence and medication adherence by class are shown in eTables 5 and 6 in the Supplement. Notably, trajectory group membership was fairly similar across regions, and including these predictors did not meaningfully change C-statistics. Replication in a subsequent year of data resulted in similar patterns and sizes of group membership (eFigure 5 in the Supplement) as well as ability to predict those groups (eTable 7 in the Supplement).

Discussion

Using a data-driven approach to classify 2-year health spending for Medicare beneficiaries, we observed 5 distinct spending patterns.
Membership in these groups could be accurately predicted, even when using a simple set of potentially modifiable characteristics from claims data. These results suggest that this approach could potentially help inform the design, application, and timing of interventions.

Prior efforts to predict health care spending have generally focused on a single composite value, such as total yearly costs or a threshold-based measure, such as being in the top 5% of spending, both of which collapse an entire year’s spending into a static variable. These approaches have had modest accuracy; C-statistics for threshold-based outcomes have generally ranged from 0.6 to 0.8.[2,5,36,37] Two recently published approaches offer other cluster-based solutions to elucidate subgroups of high-cost patients with some notable successes.[38,39] However, these were not applied to evaluate changes in spending, outcomes over more than 1 year, or to elucidate patients with rising costs.[38,39] They also focused on Medicare Advantage populations, which can differ from fee-for-service beneficiaries.[40,41]

Table 3. Association Between Potentially Modifiable Factors and Membership in the Rising-Cost Spending Trajectory (Group 3) vs Other Trajectory Groups

| Characteristics | OR (95% CI) for group 3: rising cost |
|---|---|
| Intercept (SE) | −1.86 (0.02) |
| Baseline covariate | |
| Unique medications, No. | 0.81 (0.79-0.84) |
| Office visits, No. | 0.98 (0.97-0.99) |
| Physicians, No. | 1.04 (1.02-1.06) |
| Pharmacies, No. | 0.99 (0.95-1.02) |
| Emergency department visits, No. | 0.98 (0.94-1.01) |
| Depression | 1.01 (0.92-1.10) |
| Tobacco use | 1.10 (1.02-1.20) |
| Obesity | 1.08 (0.98-1.19) |
| Acute stress | 0.87 (0.74-1.02) |

Abbreviation: OR, odds ratio.
Conducted within validation sample using logistic regression model with only potentially modifiable covariates compared with groups 1, 2, and 4.
Odds ratios are presented as a 1-unit increase for continuous variables.

Patients may have dynamic patterns of spending over longer periods of time that can be potentially meaningful, with implications on whom to reach out to for intervention as well as when and perhaps how to do so.[1,12] For example, Tamang et al[10] identified low-spending patients in 1 year whose costs bloomed in the subsequent year using thresholds. When applied to our data, the ability to predict these patients using baseline data alone was modest. Using a data-driven approach, we observed a similarly sized group whose costs later increased that could be predicted somewhat better. One possible explanation could be that the 2-year time horizon itself as an outcome helped discriminate between groups. The ability to proactively differentiate between patients with rising or falling spending patterns using distally measured variables could better target interventions to those who are at greatest need. If successful, using these longer time horizons could allow more time for the implementation of potential interventions.

Focusing interventions on patients with rising costs has some theoretical advantages, even though predictive ability was modest. First, the size of the group identified in this study was modest (ie, 7.5%). Of course, it still may be infeasible to intervene upon a group this large, and not all costs may be preventable. Identifying additional segmentation may be necessary, and the use of this approach may be just a starting point. Regardless, the ability to predict better could target interventions to those at greater need, and targeting has been shown to result in better population-level outcomes.
When considering potential interventions, a prediction rule comprising the most influential potentially modifiable variables could be applied to better target patients. We observed several clinically actionable characteristics, such as therapeutic complexity (ie, number of medications or office visits), depression, medication adherence, and tobacco use that could be levers for interventions. Filling fewer medications and having fewer office visits were also predictors of the rising-cost trajectory, suggesting that patients may not be getting sufficient care to prevent future escalation of health problems. This information could also be used for intervention design to improve care. Many health care organizations, insurers, researchers, and policy makers use claims data to identify patients for interventions. Therefore, the ability to better leverage these routinely collected data for cost predictions and interventions with a variety of more nuanced cost-modeling methods holds wide potential. Moreover, using data-driven approaches to classify longer-term spending may hold promise compared with threshold-based approaches alone. Limitations Several limitations warrant mention. First, we examined trajectories from January to December; patients with incomplete enrollment or other policy start and end dates may differ. Because of differences in how outcomes are categorized, model performance of predicting a cost trajectory (binary outcome) cannot be directly compared with predicting total costs (continuous outcome) or patients defined by the rising-cost decile-threshold approach. The variables included in prediction models may also not be exhaustive, and although we used validated algorithms, they may be insufficiently sensitive. Trajectory modeling also provides predicted group membership; individual members may be assigned to their closest trajectory, but there could be within-group heterogeneity. 
The high-cost group was large, possibly because of how the model was specified (ie, log costs); one could potentially apply trajectories to identify subgroups within that group for further segmentation. Although group distribution did not differ on the basis of geographical region, the costs themselves were not adjusted for region; similarly, moving could have impacted relative changes in spending, but this was beyond the scope of this study. Furthermore, these results may not be generalizable to other payment systems, such as non–fee-for-service Medicare, Medicaid, or commercially insured beneficiaries. Although these other beneficiaries may have different spending levels, prior work has suggested similar patterns. Regardless, the same groups or predictive ability may not apply to other types of beneficiaries, and the results should be studied further to confirm reproducibility.

Conclusions

Using trajectory modeling to examine a 2-year time horizon improved the understanding of dynamic patterns, including the identification of a group of patients with progressively increasing costs and a group of patients with consistently high spending. This approach could be potentially adapted by health care organizations to improve cost-containment efforts.

ARTICLE INFORMATION

Accepted for Publication: August 4, 2020.

Published: October 19, 2020. doi:10.1001/jamanetworkopen.2020.20291

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2020 Lauffenburger JC et al. JAMA Network Open.

Corresponding Author: Julie C.
Lauffenburger, PharmD, PhD, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, 1620 Tremont St, Ste 3030, Boston, MA 02120 (jlauffenburger@bwh.harvard.edu). Author Affiliations: Center for Healthcare Delivery Sciences, Department of Medicine, Brigham and Women’s Hospital, Boston, Massachusetts (Lauffenburger, Choudhry); Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts (Lauffenburger, Mahesri, Choudhry). Author Contributions: Dr Lauffenburger had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Concept and design: Lauffenburger, Choudhry. Acquisition, analysis, or interpretation of data: All authors. Drafting of the manuscript: Lauffenburger, Mahesri. Critical revision of the manuscript for important intellectual content: Mahesri, Choudhry. Statistical analysis: Lauffenburger, Mahesri. Obtained funding: Lauffenburger. Supervision: Lauffenburger, Choudhry. Conflict of Interest Disclosures: Dr Choudhry reported receiving unrestricted research funding from Sanofi, AstraZeneca, and Medisafe Inc payable to Brigham and Women’s Hospital. No other disclosures were reported. Funding/Support: This work was supported by an unrestricted investigator-initiated grant from the National Institute for Health Care Management to Brigham and Women’s Hospital. Dr Lauffenburger was also supported in part by a National Institutes of Health career development grant (K01 HL 141538). Dr Choudhry was also supported in part by a National Institutes of Health center grant (P30AG064199). 
Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.
REFERENCES
1. Martin AB, Hartman M, Washington B, Catlin A; National Health Expenditure Accounts Team. National health spending: faster growth in 2015 as coverage expands and utilization increases. Health Aff (Millwood). 2017;36(1):166-176. doi:10.1377/hlthaff.2016.1330
2. Kuo RN, Dong YH, Liu JP, Chang CH, Shau WY, Lai MS. Predicting healthcare utilization using a pharmacy-based metric with the WHO’s Anatomic Therapeutic Chemical algorithm. Med Care. 2011;49(11):1031-1039. doi:10.1097/MLR.0b013e31822ebe11
3. Perkins AJ, Kroenke K, Unützer J, et al. Common comorbidity scales were similar in their ability to predict health care costs and mortality. J Clin Epidemiol. 2004;57(10):1040-1048. doi:10.1016/j.jclinepi.2004.03.002
4. Sales AE, Liu CF, Sloan KL, et al. Predicting costs of care using a pharmacy-based measure risk adjustment in a veteran population. Med Care. 2003;41(6):753-760. doi:10.1097/01.MLR.0000069502.75914.DD
5. Fishman PA, Goodman MJ, Hornbrook MC, Meenan RT, Bachman DJ, O’Keeffe Rosetti MC. Risk adjustment using automated ambulatory pharmacy data: the RxRisk model. Med Care. 2003;41(1):84-99. doi:10.1097/00005650-200301000-00011
6. Powers CA, Meyer CM, Roebuck MC, Vaziri B. Predictive modeling of total healthcare costs using pharmacy claims data: a comparison of alternative econometric cost modeling techniques. Med Care. 2005;43(11):1065-1072. doi:10.1097/01.mlr.0000182408.54390.00
7. Forrest CB, Lemke KW, Bodycombe DP, Weiner JP. Medication, diagnostic, and cost information as predictors of high-risk patients in need of care management. Am J Manag Care. 2009;15(1):41-48.
8. Yarger S, Rascati K, Lawson K, Barner J, Leslie R. Analysis of predictive value of four risk models in Medicaid recipients with chronic obstructive pulmonary disease in Texas. Clin Ther. 2008;30(Spec No):1051-1057. doi:10.1016/j.clinthera.2008.06.001
9. Mihaylova B, Briggs A, O’Hagan A, Thompson SG. Review of statistical methods for analysing healthcare resources and costs. Health Econ. 2011;20(8):897-916. doi:10.1002/hec.1653
10. Tamang S, Milstein A, Sørensen HT, et al. Predicting patient ‘cost blooms’ in Denmark: a longitudinal population-based study. BMJ Open. 2017;7(1):e011580. doi:10.1136/bmjopen-2016-011580
11. Lauffenburger JC, Franklin JM, Krumme AA, et al. Longitudinal patterns of spending enhance the ability to predict costly patients: a novel approach to identify patients for cost containment. Med Care. 2017;55(1):64-73. doi:10.1097/MLR.0000000000000623
12. Druss BG, Marcus SC, Olfson M, Tanielian T, Elinson L, Pincus HA. Comparing the national economic burden of five chronic conditions. Health Aff (Millwood). 2001;20(6):233-241. doi:10.1377/hlthaff.20.6.233
13. Ziaeian B, Fonarow GC. The prevention of hospital readmissions in heart failure. Prog Cardiovasc Dis. 2016;58(4):379-385. doi:10.1016/j.pcad.2015.09.004
14. Barnett ML, Hsu J, McWilliams JM. Patient characteristics and differences in hospital readmission rates. JAMA Intern Med. 2015;175(11):1803-1812. doi:10.1001/jamainternmed.2015.4660
15. Nuckols TK, Escarce JJ, Asch SM. The effects of quality of care on costs: a conceptual framework. Milbank Q. 2013;91(2):316-353. doi:10.1111/milq.12015
16. Franklin JM, Shrank WH, Lii J, et al. Observing versus predicting: initial patterns of filling predict long-term adherence more accurately than high-dimensional modeling techniques. Health Serv Res. 2016;51(1):220-239. doi:10.1111/1475-6773.12310
17. Krumme AA, Glynn RJ, Schneeweiss S, et al. Medication synchronization programs improve adherence to cardiovascular medications and health care use. Health Aff (Millwood). 2018;37(1):125-133. doi:10.1377/hlthaff.2017.0881
18. Austin PC, Ghali WA, Tu JV. A comparison of several regression models for analysing cost of CABG surgery. Stat Med. 2003;22(17):2799-2815. doi:10.1002/sim.1442
19. Artz MB, Hadsall RS, Schondelmeyer SW. Impact of generosity level of outpatient prescription drug coverage on prescription drug events and expenditure among older persons. Am J Public Health. 2002;92(8):1257-1263. doi:10.2105/AJPH.92.8.1257
20. Benner JS, Glynn RJ, Mogun H, Neumann PJ, Weinstein MC, Avorn J. Long-term persistence in use of statin therapy in elderly patients. JAMA. 2002;288(4):455-461. doi:10.1001/jama.288.4.455
21. Choudhry NK, Shrank WH, Levin RL, et al. Measuring concurrent adherence to multiple related medications. Am J Manag Care. 2009;15(7):457-464.
22. Goetzel RZ, Pei X, Tabrizi MJ, et al. Ten modifiable health risk factors are linked to more than one-fifth of employer-employee health care spending. Health Aff (Millwood). 2012;31(11):2474-2484. doi:10.1377/hlthaff.2011.0819
23. Yusuf S, Hawken S, Ounpuu S, et al; INTERHEART Study Investigators. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet. 2004;364(9438):937-952. doi:10.1016/S0140-6736(04)17018-9
24. Jones BL, Nagin DS. Advances in group-based trajectory modeling and a SAS procedure for estimating them. Sociol Methods Res. 2007;35(4):542-571. doi:10.1177/0049124106292364
25. Franklin JM, Shrank WH, Pakes J, et al. Group-based trajectory models: a new approach to classifying and predicting long-term medication adherence. Med Care. 2013;51(9):789-796. doi:10.1097/MLR.0b013e3182984c1f
26. Jones BL, Nagin DS, Roeder K. A SAS procedure based on mixture models for estimating developmental trajectories. Sociol Methods Res. 2001;29:374-393. doi:10.1177/0049124101029003005
27. Li Y, Zhou H, Cai B, et al. Group-based trajectory modeling to assess adherence to biologics among patients with psoriasis. Clinicoecon Outcomes Res. 2014;6:197-208. doi:10.2147/CEOR.S59339
28. Franklin JM, Krumme AA, Tong AY, et al. Association between trajectories of statin adherence and subsequent cardiovascular events. Pharmacoepidemiol Drug Saf. 2015;24(10):1105-1113. doi:10.1002/pds.3787
29. Koh HC, Tan G. Data mining applications in healthcare. J Healthc Inf Manag. 2005;19(2):64-72.
30. Robinson JW. Regression tree boosting to adjust health care cost predictions for diagnostic mix. Health Serv Res. 2008;43(2):755-772. doi:10.1111/j.1475-6773.2007.00761.x
31. Varian HR. Big data: new tricks for econometrics. J Econ Perspect. 2014;28(2):3-28. doi:10.1257/jep.28.2.3
32. Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001;54(8):774-781. doi:10.1016/S0895-4356(01)00341-9
33. Waljee AK, Higgins PD, Singal AG. A primer on predictive models. Clin Transl Gastroenterol. 2014;5:e44. doi:10.1038/ctg.2013.19
34. Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology. 2010;21(1):128-138. doi:10.1097/EDE.0b013e3181c30fb2
35. Cook NR. Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007;115(7):928-935. doi:10.1161/CIRCULATIONAHA.106.672402
36. Liu CF, Sales AE, Sharp ND, et al. Case-mix adjusting performance measures in a veteran population: pharmacy- and diagnosis-based approaches. Health Serv Res. 2003;38(5):1319-1337. doi:10.1111/1475-6773.00179
37. Zhao Y, Ash AS, Ellis RP, et al. Predicting pharmacy costs and other medical costs using diagnoses and drug claims. Med Care. 2005;43(1):34-43.
38. Yan J, Linn KA, Powers BW, et al. Applying machine learning algorithms to segment high-cost patient populations. J Gen Intern Med. 2019;34(2):211-217. doi:10.1007/s11606-018-4760-8
39. Powers BW, Yan J, Zhu J, et al. Subgroups of high-cost Medicare Advantage patients: an observational study. J Gen Intern Med. 2019;34(2):218-225. doi:10.1007/s11606-018-4759-1
40. Powell SK. Choosing Medicare Advantage plans versus traditional fee-for-service: is this change the tipping point? Prof Case Manag. 2019;24(1):1-3. doi:10.1097/NCM.0000000000000338
41. Raetzman SO, Hines AL, Barrett ML, Karaca Z. Hospital stays in Medicare Advantage Plans versus the traditional Medicare fee-for-service program, 2013: statistical brief #198. Published December 2015. Accessed August 5, 2019. https://www.hcup-us.ahrq.gov/reports/statbriefs/sb198-Hospital-Stays-Medicare-Advantage-Versus-Traditional-Medicare.jsp
42. Stadhouders N, Kruse F, Tanke M, Koolman X, Jeurissen P. Effective healthcare cost-containment policies: a systematic review. Health Policy. 2019;123(1):71-79. doi:10.1016/j.healthpol.2018.10.015
43. Lauffenburger JC, Lewey J, Jan S, et al. Effectiveness of targeted insulin-adherence interventions for glycemic control using predictive analytics among patients with type 2 diabetes: a randomized clinical trial. JAMA Netw Open. 2019;2(3):e190657. doi:10.1001/jamanetworkopen.2019.0657
SUPPLEMENT.
eFigure 1. Study Design eTable 1. Baseline Predictors of Spending Outcomes eAppendix. Supplemental Methods eTable 2. Patient Eligibility Criteria eTable 3. Predicted Probabilities for Each Trajectory Group eFigure 2.
Two-Year Spending Patterns Using Trajectory Modeling: Original Log Scale eFigure 3. Trajectory Modeling of Two-Year Healthcare Spending Using Other Numbers of Groups eFigure 4. Relative Influence Plots From Boosted Regression Modeling for Predicting Trajectory Group Membership With Potentially-Modifiable Variables (Model 2) eTable 4. Validation C-Statistics From Models Predicting Patients With Future Rising Spending eTable 5. Geographic Region And Baseline Chronic Condition Medication Classes By Trajectory Group eTable 6. Validation C-Statistics From Sensitivity Analyses eFigure 5. Two-Year Spending Patterns Using Trajectory Modeling In 2013-2014 Data eTable 7. Ability of Models to Predict Two-Year Spending Trajectory Groups In 2013-2014 Data

Supplemental Online Content
Lauffenburger JC, Mahesri M, Choudhry NK. Use of data-driven methods to predict long-term patterns of health care spending for Medicare patients. JAMA Netw Open. 2020;3(10):e2020291. doi:10.1001/jamanetworkopen.2020.20291
This supplemental material has been provided by the authors to give readers additional information about their work.
© 2020 Lauffenburger JC et al. JAMA Network Open.

eFigure 1. Study Design
[Figure: study timeline. Labels: baseline (Year 0) predictors; cohort entry date; two-year spending outcomes. Dates marked: 1/1/2011, 1/1/2012, 1/1/2013, 12/31/2013.]

eTable 1. Baseline Predictors of Spending Outcomes
Type and timing | Predictors and relevant International Classification of Diseases, 9th Edition (ICD-9) codes or Current Procedural Terminology (CPT) codes
Demographic
Age | Measured in enrollment files
Sex | Measured in enrollment files
Race/ethnicity | Measured in enrollment files
Median household income | [text missing in source] files), rounded to nearest US dollar
High school graduate | Percentage of persons graduating from high school in the [text missing in source] via 2010 US Census files)
Healthcare utilization
Part D plan switch | Patient switched their Part D plans in the baseline period, measured in enrollment files
Part D low income subsidy | Patient received a Part D low-income subsidy for at least one month in the baseline period
No. of office visits | Number of physician office visits, based on procedure codes
No. of physicians | Number of unique physicians associated with outpatient claims
No. of pharmacies used | Number of distinct pharmacies, by pharmacy ID, at which the patient filled medications, defined using pharmacy claims
No. of hospitalizations | Number of all-cause hospitalizations
No. of emergency room visits | Number of all-cause emergency room visits
No. of unique drugs (i.e., therapeutic complexity) | Number of unique medications filled, by generic name
Prescription generosity | Out-of-pocket drug costs for filled medications divided by net total drug payments (Artz MB, et al. Am J Public Health. 2002;92:1257-1263.)
[label partly missing in source] generosity | Out-of-pocket costs for all services and procedures divided by net total payments
Total baseline year costs | Total costs from all services and procedures related to inpatient and outpatient visits, procedures, durable medical equipment, home health, and medications
Chronic medication use | Defined by filling at least one medication [text missing in source] different chronic medication classes, including medications for the following disease states: anti-hypertensive, lipid-lowering, anti-diabetic, osteoporosis, or asthma/COPD
Average adherence | Measured by proportion of days covered in the baseline period in pharmacy claims, averaged across all eligible chronic medication classes
Comorbidities
Comorbidity score | Measured using the algorithm presented in Gagne JJ et al. J Clin Epidemiol. 2011;64:749-759 (incorporates 37 possible comorbid conditions)
Coronary artery disease | 410.x-414.x, 429.2, V45.81 (hospital discharge, any position)
Prior myocardial infarction | 410.x except 410.x2 AND length of stay >3 and <180 days (hospital discharge, any position)
Asthma or chronic obstructive pulmonary disease | 493.x, 490.x, 491.x, 492.x, 496.x
Hypertension | 401.x-405.x, 437.2
Diabetes mellitus | 250.x (1 inpatient or two outpatient)
Acute renal failure or end-stage renal disease | 572.4, 580.xx, 584.xx, 580.0, 580.4, 580.89, 580.9, 582.4, 791.2, 791.3 OR 585.6
Dementia | 290.0, 290.1x, 290.2x, 290.3x, 290.4x, 294.20
Depression | 311, 296.2, 296.3, 300.4, 301.12, and 309.1
Stroke | 433.x1, 434 excluding 434.x0, 435.xx, 436.xx, 437.1x, 437.9x (hospital discharge, any position)
Liver disease | 570-573 (esophageal varices 456.xx) (hospital discharge, any position)
Congestive heart failure | 428.x, 398.91, 402.01, 402.11, 402.91, 404.01, 404.11, 404.91, 404.03, 404.13, 404.93 (hospital discharge, any position)
Hyperlipidemia | 272.xx
Atrial fibrillation | 427.31 (hospital discharge, any position)
Osteoporosis | 733.x
Obesity | 278
Acute stress | 298, 308, 309
Tobacco use | 305.1, 649.0, 989.84, V15.82, CPT: 1034F
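Two of the eTable 1 predictors are defined by simple arithmetic: prescription generosity (out-of-pocket drug costs divided by net total drug payments) and average adherence (proportion of days covered, averaged across eligible chronic medication classes). The following is a minimal Python sketch of those two definitions, not the authors' code. The function names, the input layout (a list of fill date and days-supply pairs per medication class), the choice to use the full baseline period as the denominator, and the choice to count overlapping supply only once are all assumptions made for illustration.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, end):
    """Fraction of days in [start, end] on which medication was on hand.

    fills: list of (fill_date, days_supply) tuples for ONE medication class.
    Assumption: overlapping supply is counted once and supply running past
    `end` is truncated; the source does not specify these details.
    """
    total_days = (end - start).days + 1
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    return len(covered) / total_days


def average_adherence(fills_by_class, start, end):
    """Mean PDC across medication classes with at least one fill.

    fills_by_class: dict mapping a class name (hypothetical labels such as
    "statin") to that class's list of fills. Returns None when the patient
    has no eligible class.
    """
    pdcs = [
        proportion_of_days_covered(fills, start, end)
        for fills in fills_by_class.values()
        if fills
    ]
    return sum(pdcs) / len(pdcs) if pdcs else None


def prescription_generosity(out_of_pocket, net_total):
    """Out-of-pocket drug costs divided by net total drug payments.

    Returns None when there were no drug payments, since the ratio is
    undefined in that case (an assumption; the source does not say how
    such patients were handled).
    """
    return out_of_pocket / net_total if net_total > 0 else None
```

For example, a patient with one 5-day fill in a 10-day window has a PDC of 0.5 for that class, and a patient who paid 25 of 100 US dollars in net drug payments has a prescription generosity of 0.25.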
eAppendix. Additional Detail on Predictors
Clinical comorbidities were measured using International Classification of Diseases, 9th edition (ICD-9) codes in medical files, including comorbidities such as coronary artery disease, prior myocardial infarction, COPD/asthma, hypertension, hyperlipidemia, congestive heart failure, stroke, major depression, diabetes, liver disease, chronic kidney disease, dementia, osteoporosis, obesity, and tobacco use.

Replication in subsequent year
We used administrative claims data from the nationwide 1-million-member sample for years 2012-2014 as a separate replication dataset. As in the primary analysis, we restricted the cohort to the randomly selected patients and used their patient-level files. To [text missing in source] January 1, 2012 to December 31, 2014. The cohort entry date was defined as January 1, 2013 [text missing in source] follow-up data.
As in the primary analysis, we remeasured total monthly healthcare spending over the 2013-2014 period and used the same transformation and adjustment as described in the Methods. We also remeasured the 37 predictors using data from Medicare enrollment files and claims, but in the 2012 baseline year. We classified the same 10 predictors as potentially modifiable (denoted by asterisks in Table 1).
We also used trajectory modeling to empirically classify spending during the two-year follow-up in this replication sample. We modeled longitudinal cost trajectories using calendar month as the time variable, costs in each month, order=4, group=5, and a censored-normal distribution (linear between minimum and maximum values). We evaluated other groupings, but the 5-group model still fit the data best.
After estimating the trajectories, as in the main paper, we assessed the ability to predict membership in each two-year trajectory group using boosted logistic regression in this separate dataset, using the same approach described in the Methods. For each trajectory group, we estimated two separate models.
The first included all 37 baseline predictors (Model 1) and the second included only the 10 baseline predictors that were considered a priori to be potentially modifiable (Model 2). As in the main manuscript, we used internal split-sample validation and evaluated each model through discrimination measures.

eTable 2. Patient Eligibility Criteria
Criterion | N
Study sample | 1,000,000
Was not in quality improvement study sample | 976,550
Enrolled on 1/1/12 in medical and pharmacy benefits | 550,215
Age ≥65 years on 1/1/11 | 433,561
Continuous enrollment from 1/1/11 to 12/31/13 | 329,476

eTable 3. Predicted Probabilities for Each Trajectory Group
Trajectory group | Mean (SD) predicted probability of trajectory group membership | % of patients with >0.90 membership probability
Group 1: Minimal-user | 0.97 (0.09) | 90.4%
Group 2: Low-cost | 0.91 (0.13) | 73.6%
Group 3: Rising costs | 0.88 (0.16) | 61.6%
Group 4: Moderate cost | 0.89 (0.14) | 63.9%
Group 5: High cost | 0.95 (0.10) | 84.4%

eFigure 2. Two-Year Spending Patterns Using Trajectory Modeling: Original Log Scale
[Figure: x-axis, months 1 to 24; y-axis, log monthly costs (in US$). Groups: Minimal user (11.4%), Low-cost (14.7%), Rising-cost (7.5%), Moderate-user (25.3%), High-cost (41.2%).]
Note: The observed (solid lines) and predicted (dotted lines) probability of monthly costs on the logarithmic scale for each of 5 groups identified by the trajectory model.

eFigure 3. Trajectory Modeling of Two-Year Healthcare Spending Using Other Numbers of Groups
[Figure panels: x-axis, months 1 to 24; y-axis, monthly costs (in US$).]
A. Two-group model. BIC: 22202423. Group 1 (31.1%), Group 2 (68.9%).
B. Three-group model. BIC: 21858264. Group 1 (16.2%), Group 2 (30.1%), Group 3 (53.7%).
C. Four-group model. BIC: 21757980. Group 1 (11.4%), Group 2 (18.1%), Group 3 (27.4%), Group 4 (43.2%).
D. Six-group model. BIC: 21667412. Group 1 (9.1%), Group 2 (11.3%), Group 3 (8.7%), Group 4 (11.1%), Group 5 (25.6%), Group 6 (34.2%).
E. Seven-group model. BIC: 21646901. Group 1 (9.3%), Group 2 (7.6%), Group 3 (4.5%), Group 4 (9.5%), Group 5 (11.9%), Group 6 (25.0%), Group 7 (32.2%).

eFigure 4. Relative Influence Plots From Boosted Regression Modeling for Predicting Trajectory Group Membership With Potentially-Modifiable Variables (Model 2)
[Figure panels: bar charts of relative influence by predictor. Values are listed below in descending order.]
A. Group 1: Minimal-user. No. of office visits, 71.815; No. of unique medications, 10.873; No. of physicians, 10.526; Average adherence, 4.705; No. of pharmacies, 1.84; Depression, 0.07; No. of ER visits, 0.06; Tobacco use, 0.044; Acute stress, 0.033; No. of hospitalizations, 0.019; Obesity, 0.017.
B. Group 2: Low-cost. No. of office visits, 42.159; No. of unique medications, 36.138; Average adherence, 16.805; No. of physicians, 2.747; No. of ER visits, 0.694; Depression, 0.491; No. of pharmacies, 0.287; Obesity, 0.206; Tobacco use, 0.199; No. of hospitalizations, 0.164; Acute stress, 0.110.
C. Group 3: Rising-cost. Average adherence, 33.605; No. of office visits, 30.334; No. of unique medications, 29.236; No. of physicians, 2.935; No. of pharmacies, 1.034; No. of ER visits, 0.714; No. of hospitalizations, 0.61; Tobacco use, 0.583; Obesity, 0.39; Depression, 0.306; Acute stress, 0.252.
D. Group 4: Moderate-cost. No. of office visits, 36.438; Average adherence, 28.239; No. of unique medications, 24.773; No. of physicians, 3.527; No. of ER visits, 2.086; No. of pharmacies, 1.829; No. of hospitalizations, 1.055; Tobacco use, 0.675; Depression, 0.579; Obesity, 0.487; Acute stress, 0.312.
E. Group 5: High-cost. No. of unique medications, 46.590; No. of office visits, 37.533; Average adherence, 6.090; No. of physicians, 4.518; No. of pharmacies, 2.211; No. of ER visits, 1.022; Depression, 0.972; No. of hospitalizations, 0.434; Acute stress, 0.347; Obesity, 0.208; Tobacco use, 0.076.

eTable 4. Validation C-Statistics From Models Predicting Patients With Future Rising Spending
Model | Predictors | Cost-bloomers in Year 2 (n=20,470) | Group 3: Rising-cost (n=24,736)
3 | All Year 0 | 0.667 | 0.764
4 | Potentially-modifiable Year 0 | 0.643 | 0.753
Cost-bloomers were defined by being in the lower 90% in Year 1 and the top 10% in Year 2.

eTable 5. Geographic Region and Baseline Chronic Condition Medication Classes By Trajectory Group
Characteristic | Group 1: Minimal-user | Group 2: Low-cost | Group 3: Rising-cost | Group 4: Moderate-cost | Group 5: High-cost
Geographic region of residence, N (%)
Midwest | 8,512 (22.7) | 11,890 (24.5) | 6,313 (25.5) | 20,612 (24.7) | 31,123 (23.0)
Northeast | 5,945 (15.8) | 7,828 (16.1) | 4,385 (17.7) | 15,166 (18.2) | 27,778 (20.5)
South | 13,707 (36.5) | 19,150 (39.4) | 9,620 (38.9) | 32,871 (39.4) | 53,942 (39.9)
West | 8,605 (22.9) | 9,418 (19.4) | 4,283 (17.3) | 14,280 (17.1) | 21,917 (16.2)
Medication class adherence (PDC), mean (SD)
Alpha glucosidase inhibitors | 0.14 (0.20) | 0.69 (0.26) | 0.53 (0.21) | 0.73 (0.26) | 0.71 (0.29)
ACEIs/ARBs | 0.58 (0.31) | 0.81 (0.24) | 0.79 (0.27) | 0.85 (0.22) | 0.84 (0.23)
Anticholinergics, inhaled | 0.52 (0.36) | 0.47 (0.31) | 0.58 (0.33) | 0.60 (0.29) | 0.67 (0.30)
Beta-blockers | 0.59 (0.31) | 0.80 (0.24) | 0.79 (0.26) | 0.85 (0.22) | 0.85 (0.22)
Biguanide | 0.55 (0.29) | 0.72 (0.28) | 0.70 (0.28) | 0.81 (0.23) | 0.82 (0.23)
Bisphosphonates | 0.58 (0.28) | 0.69 (0.28) | 0.68 (0.28) | 0.71 (0.28) | 0.72 (0.28)
Calcium channel blockers | 0.58 (0.32) | 0.80 (0.26) | 0.81 (0.25) | 0.85 (0.22) | 0.85 (0.23)
Dipeptidyl peptidase-4 inhibitors | 0.55 (0.35) | 0.66 (0.32) | 0.68 (0.34) | 0.73 (0.28) | 0.79 (0.26)
Diuretics, thiazide | 0.56 (0.31) | 0.79 (0.26) | 0.79 (0.26) | 0.83 (0.23) | 0.81 (0.25)
Leukotriene modulators | 0.30 (0.25) | 0.48 (0.31) | 0.53 (0.29) | 0.62 (0.33) | 0.72 (0.30)
Long-acting beta-agonists | 0.59 (0.30) | 0.51 (0.39) | 0.75 (0.32) | 0.48 (0.28) | 0.57 (0.33)
Meglitinides | - | 0.79 (0.24) | 0.57 (0.39) | 0.71 (0.30) | 0.70 (0.29)
Other anti-hypertensives | 0.47 (0.30) | 0.73 (0.31) | 0.74 (0.30) | 0.79 (0.27) | 0.76 (0.30)
Selective estrogen receptor modulators | 0.66 (0.26) | 0.76 (0.28) | 0.70 (0.29) | 0.80 (0.24) | 0.81 (0.24)
Statins | 0.57 (0.30) | 0.78 (0.25) | 0.77 (0.26) | 0.82 (0.22) | 0.84 (0.22)
Sulfonylureas | 0.59 (0.30) | 0.73 (0.28) | 0.70 (0.28) | 0.83 (0.22) | 0.83 (0.23)
Thiazolidinediones | 0.48 (0.29) | 0.63 (0.27) | 0.59 (0.28) | 0.69 (0.28) | 0.73 (0.28)
Xanthines | 0.45 (0.29) | 0.73 (0.28) | 0.69 (0.44) | 0.71 (0.29) | 0.74 (0.30)
Abbreviations: PDC, proportion of days covered; ACEI/ARB, angiotensin-converting enzyme inhibitors/angiotensin receptor blockers.

eTable 6. Validation C-Statistics From Sensitivity Analyses
Predictors | Group 1: Minimal-user | Group 2: Low-cost | Group 3: Rising-cost | Group 4: Moderate-cost | Group 5: High-cost
All baseline predictors (Model 1) + Region | 0.956 | 0.819 | 0.765 | 0.729 | 0.903
All baseline predictors (Model 1) with adherence by individual class | 0.955 | 0.816 | 0.766 | 0.729 | 0.900
Potentially-modifiable predictors (Model 2) with adherence by individual class | 0.946 | 0.785 | 0.756 | 0.689 | 0.874

eFigure 5. Two-Year Spending Patterns Using Trajectory Modeling In 2013-2014 Data
[Figure: x-axis, months 1 to 24; y-axis, monthly costs (in US$). Groups: Minimal user (10.9%), Low-cost (13.9%), Rising costs (6.8%), Moderate-cost (25.6%), High-cost (42.7%).]
Note: The mean spending levels using 5-group trajectory modeling in the full 2013-2014 sample (n=297,150) are plotted. The percentages refer to the proportion of patients in the full cohort who belong to each trajectory group.

eTable 7. Ability of Models to Predict Two-Year Spending Trajectory Groups In 2013-2014 Data
Group | Validation C-statistic
All baseline predictors (Model 1)
Group 1: Minimal-user | 0.983
Group 2: Low-cost | 0.864
Group 3: Rising-cost | 0.812
Group 4: Moderate-cost | 0.777
Group 5: High-cost | 0.941
Potentially-modifiable predictors (Model 2)
Group 1: Minimal-user | 0.952
Group 2: Low-cost | 0.787
Group 3: Rising-cost | 0.767
Group 4: Moderate-cost | 0.696
Group 5: High-cost | 0.887
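The validation C-statistics reported above summarize discrimination: the probability that a randomly chosen member of a trajectory group receives a higher predicted probability than a randomly chosen non-member. As a rough illustration of how such a value can be computed on a held-out validation split, here is a minimal Python sketch. It is not the authors' code; the function names, the 50/50 split fraction, and the seed are assumptions, and the pairwise loop is chosen for clarity rather than speed.

```python
import random


def split_sample(n, validation_fraction=0.5, seed=0):
    """Randomly split row indices 0..n-1 into development and validation sets.

    The split fraction and seed are illustrative assumptions; the supplement
    only states that internal split-sample validation was used.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * (1 - validation_fraction)))
    return indices[:cut], indices[cut:]


def c_statistic(labels, scores):
    """Concordance (C) statistic for a binary outcome.

    labels: 1 if the patient belongs to the trajectory group, else 0.
    scores: predicted probabilities of membership, in the same order.
    Every (member, non-member) pair is compared; a pair counts 1 when the
    member has the higher score and 0.5 when the scores are tied.
    """
    members = [s for y, s in zip(labels, scores) if y == 1]
    non_members = [s for y, s in zip(labels, scores) if y == 0]
    if not members or not non_members:
        raise ValueError("need at least one member and one non-member")
    concordant = 0.0
    for m in members:
        for n in non_members:
            if m > n:
                concordant += 1.0
            elif m == n:
                concordant += 0.5
    return concordant / (len(members) * len(non_members))
```

In practice a model would be fitted on the development indices and `c_statistic` applied only to predictions for the validation indices, so that the reported value is not inflated by overfitting. A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.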

Journal

JAMA Network Open, American Medical Association

Published: Oct 19, 2020
