In pursuit of quality and safety: an 8-year study of clinical peer review best practices in US hospitals

Abstract

Objectives: Gather normative data on the goals of clinical peer review; refine a best-practice model and related self-assessment inventory; identify the interval progress towards best-practice adoption.

Design: Online survey (2015–16) of a cohort of 457 programs first studied by volunteer sampling in either 2007 or 2009, using 40 items assessing the degree of conformance to a validated quality improvement (QI) model and addressing program goals, structure, process, governance, and impact on quality and safety.

Setting: Acute care hospitals of all sizes in the USA.

Study participants: Physicians and hospital leaders or hospital staff with intimate program knowledge.

Intervention: None.

Main outcome measures: Subjectively rated program impact on quality and safety; QI model score.

Results: Two hundred and seventy responses (59% response rate) showed that clinical peer review most commonly aims to improve quality and safety. From 2007 to 2015, the median [inter-quartile range, IQR] annual rate of major program change was 20% [11–24%]. Mean [confidence interval, CI] QI model scores increased 5.6 [2.9–8.3] points from 46.2 at study entry. Only 35% scored at least 60 of 80 possible points, representing 'C' level progress in adopting the QI model. The analysis supports expansion of the QI model and an associated self-assessment inventory to include 20 items on a 100-point scale, for which a 10-point increase predicts a one-level improvement in quality impact with an odds ratio [CI] of 2.5 [2.2–3.0].

Conclusions: Hospital and physician leaders could potentially accelerate progress in quality and safety by revisiting their clinical peer review practices in light of the evidence-based QI model.

Keywords: clinical peer review, best practices, quality improvement, hospitals

Background

Peer review is used in a wide variety of healthcare settings, so the term may be subject to confusion. Here, we examine the routine clinical peer review programs found in all US hospitals, which invariably include retrospective medical record review of the quality of care [1]. These programs have served as the primary vehicle by which medical staff contribute to quality and safety improvement [2]. Clinical peer review appears to be the dominant mode of adverse event analysis in the hospital setting [3].

Despite its importance, the clinical peer review process has received insufficient attention. For example, neither the goals of peer review nor the content of training offered to reviewers has been formally studied. Prevailing practice evolved as a consequence of 1980 Joint Commission standards that replaced a medical audit process which had failed to control escalating costs induced by the Medicare program [4]. About the same time, the first malpractice insurance crisis led hospitals to build risk management programs. The conjunction of these two forces led them to adopt 'generic screens' for sub-standard care as a means of peer review case identification, notwithstanding the lack of validation for that purpose [5, 6]. These practices have long been criticized as being out of touch with the evolution of systems thinking and quality improvement (QI) methods [7, 8]. Even so, a 2007 study showed that they persist [1]. The same study also identified a set of practices associated with a higher degree of perceived program impact on quality and safety.
The authors termed it the QI model because most of the practices would logically follow from the systematic application of QI principles to the process of clinical peer review. The QI model was subsequently validated in 2009 in a different population against both subjective and objective measures of program impact [2, 9]. A short-term follow-up study of 2007 and 2009 participants, conducted in 2011, refined the model but showed little progress towards its adoption despite a 20% annual rate of major program change [3]. Among other practices, the QI model includes standardization of the peer review process, a focus on identifying opportunities for improved performance (as opposed to casting blame for error), promotion of self-reporting of adverse events, near misses and hazardous conditions, high-quality case review, timely performance feedback, recognition of clinical excellence, a solid connection between the peer review program and the organization's QI process, and attentive program governance [3]. A program self-assessment tool was published in 2009 [10].

This study in healthcare operations improvement was undertaken to evaluate long-term trends in the evolution of clinical peer review practices, gather data on program goals, further characterize best practice and refine the program self-assessment inventory. Its purpose was to guide healthcare leaders in re-designing clinical peer review programs to have maximum impact on the quality and safety of care.

Methods

The sample frame consisted of 457 identifiable acute care general and pediatric hospitals still operating from among those first studied in 2007 (n = 152 of 337) or 2009 (n = 305 of 330). The 2007 study was sponsored by the University HealthSystem Consortium (now Vizient), Premier Inc., the American College of Physician Executives (ACPE, now the American Association for Physician Leadership) and six state hospital associations, each of which solicited its membership for an online survey. The 2009 study was sponsored by the ACPE alone. Both studies captured data from hospitals of all sizes, but somewhat over-represented teaching hospitals.

Prior survey instruments provided the framework for data collection, substituting new items addressing the goals of peer review, reviewer training and nursing involvement, and refining selected item wording to improve clarity. Respondents' self-rated peer review program impact on quality and safety was the primary outcome variable. The survey also captured their reported medical staff perceptions of the process, physician engagement in quality and safety improvement activity, and overall physician-hospital relations. The scale for rating quality impact was expanded from 6 to 8 levels to enhance reliability [11]. Forty items populated four secure web pages. Conformance to the QI model was assessed as a 'QI model score' by modifying the 13-item self-assessment tool derived from the 2007 study [10] and validated against the 2009 cohort [9] to exclude two items (the use of clinical performance measurement and reliable rating scales), which derived from measurement theory rather than from regression modeling.

In 2009, 27 paired ratings allowed estimation of the inter-rater reliability of the survey instrument via the intra-class correlation coefficient [12] at 0.61 [0.31–0.80], and 11 duplicate responses gave an intra-rater reliability of 0.88 [0.63–0.97] [9]. Since these values adequately support aggregate comparisons and the general questionnaire design was comparable, a re-validation study was not done.
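As an illustration of the reliability statistic used here, the sketch below computes a two-way random-effects, single-measure intraclass correlation (the Shrout and Fleiss ICC(2,1) [12]) in Python. The paired ratings are simulated stand-ins, not the study data.

```python
import numpy as np

def icc_2_1(ratings):
    """Two-way random-effects, absolute-agreement, single-rater ICC(2,1)
    (Shrout & Fleiss 1979). `ratings` is an (n_subjects x k_raters) array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)   # between subjects
    ms_cols = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)   # between raters
    ss_err = np.sum((x - grand) ** 2) - ms_rows * (n - 1) - ms_cols * (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical example: 27 hospitals, each scored independently by two respondents
rng = np.random.default_rng(0)
true_score = rng.normal(46, 12, size=27)          # assumed between-hospital spread
pairs = np.column_stack([true_score + rng.normal(0, 10, 27),
                         true_score + rng.normal(0, 10, 27)])
print(round(icc_2_1(pairs), 2))
```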
The author solicited prior participants via email and phoned non-responding facilities to verify receipt or identify other potential respondents. Data collection extended from 5 October 2015 through 7 March 2016. The analysis included both complete and partial (3-page) responses, with final disposition codes for the sample frame assigned at the hospital level according to 2015 AAPOR standards [13].

The longitudinal change in QI model scores was assessed using paired t-tests. Association of tabulated survey items with outcome variables was evaluated using Pearson chi-square. Items meeting a cutoff of P < 0.05 were accepted as univariate correlates. They were further analyzed in multivariate models using ordinal logistic regression to estimate their independent contribution to program impact on quality and safety, retaining outlier values and selectively collapsing response levels. A regression equation was accepted only if all the factor coefficients and intercepts were significant at P < 0.05 and if both Pearson and Deviance goodness-of-fit tests met P > 0.1. This analysis served as the basis for an empirical revision of the self-assessment inventory and associated QI model score, weighting items in rough relation to their odds ratios (OR) and their effects on other variables. Statistical analysis relied on Minitab version 15 (2007).
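For readers who want to see the ordinal regression step concretely, the following is a minimal sketch using statsmodels' OrderedModel on simulated data (the study itself used Minitab). The variable names and effect size are assumptions, chosen only to show how a fitted coefficient converts to an odds ratio per 10-point score increase.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Hypothetical stand-in data: a 0-100 QI model score and an 8-level ordinal
# quality-impact rating for each responding hospital (not the study data).
rng = np.random.default_rng(1)
n = 270
qi_score = rng.uniform(0, 100, n)
latent = 0.09 * qi_score + rng.logistic(size=n)   # assumed effect (~OR 2.5 per 10 points)
cuts = np.quantile(latent, np.linspace(1 / 8, 7 / 8, 7))
impact = np.digitize(latent, cuts)                # ordinal outcome coded 0..7

model = OrderedModel(impact, pd.DataFrame({"qi_score": qi_score}), distr="logit")
res = model.fit(method="bfgs", disp=False)
beta = res.params["qi_score"]
print("OR per 10-point increase in QI model score:", round(float(np.exp(10 * beta)), 2))
```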
Results

Table 1 summarizes sample frame hospital characteristics. The study yielded 268 complete responses, two partial responses, six breakoffs, 39 refusals and one non-contact, for an overall response rate of 59% (270/457). The informants were primarily senior leaders (44%) and mid-level managers (46%). About 62% (167/270) were physicians. Only 40% had participated in a prior study. There were no significant differences between respondent and non-respondent hospitals on the basis of prior-survey QI model scores, program quality impact and medical staff perceptions.

Table 1 Sample frame characteristics

                  | Respondents (N = 270) | Non-respondents (N = 187) | Sample total (N = 457) | US data(a) (N = 4936)
Census regions    |       |      |       |
  Northeast       | 60    | 36   | 96    | 680
  South           | 61    | 62   | 123   | 1919
  Midwest         | 95    | 48   | 143   | 1422
  West            | 54    | 41   | 95    | 915
Staffed beds      |       |      |       |
  >500            | 48    | 28   | 76    | 244
  200–499         | 104   | 67   | 171   | 1160
  50–199          | 90    | 78   | 168   | 2130
  <50             | 28    | 14   | 42    | 1402
COTH members(b)   | 62    | 25   | 77    | 400

(a) American Hospital Association 2009. (b) Council of Teaching Hospitals (https://www.aamc.org/members/coth/about/, accessed 20 August 2016).

Due to limitations of space, this report focuses on the most salient results of the study, excluding information about Just Culture, which is reported separately [14]. Tabulated responses to all survey items, with measures of association to outcome variables, are presented in the Supplementary material.

Participants rated the extent to which various objectives describe the primary purpose and aim of their peer review program on a six-level categorical scale ranging from 'Most Strongly' to 'Not at All'. The rank ordering by mean rating was highest for 'Improve the quality and safety of care' (5.5) followed by 'Identify and remediate sub-standard care' (5.1). Having the primary aim of improving quality and safety is a significant multivariate predictor of program impact. For 79% of respondents, the peer review program is independent of credentialing, even if the results of peer review are used in credentialing decisions. Such separation emerged as another new multivariate predictor of program impact.

Each year since 2007, a median [IQR] of 20% [11–24%] of hospitals have made major changes to peer review program structure, process and/or governance. Only 5% reported no changes. Table 2 displays the evolution of the scope of what medical staffs place under the umbrella of their peer review programs. Table 3 gives the rank ordering of the methods used to identify cases for peer review. Data review and risk management activities are still the dominant sources by which these case identification criteria are applied. Table 4 compares the process and outcome measures currently used to monitor program effectiveness with 2007 practice.

Table 2 Clinical peer review program scope 2007–15

Activities | 2007 % (n), N = 337 | 2011 % (n), N = 300 | 2015 % (n), N = 270 | Change 2015 vs. 2007 (%) | P-value(a)
Retrospective medical record review | 96 (324) | 97 (292) | 97 (261) | 1 | 0.83
Focused individual review of quality when serious concerns are raised | N/A(b) | 91 (272) | 90 (244) | |
Ongoing professional practice evaluation | N/A | 82 (247) | 90 (243) | |
Focused professional practice evaluation for new privileges | N/A | 77 (232) | 82 (222) | |
Comparative evaluation of performance measures | 74 (249) | 82 (245) | 82 (222) | 8 | 0.02
Root cause analysis | 66 (223) | 77 (231) | 76 (204) | 9 | 0.01
Disruptive behavior management | N/A | 77 (232) | 75 (203) | |
Case-specific, individually-targeted recommendations to improve performance | N/A | 78 (234) | 74 (199) | |
Morbidity and mortality case conferences | 58 (196) | 57 (170) | 67 (181) | 9 | 0.03
Proctoring for new privileges | 47 (160) | 63 (188) | 64 (174) | 17 | <0.001
Development and/or review of clinical policies, order sets, etc. | N/A | 62 (187) | 64 (173) | |
Comparative evaluation of aggregate data from peer review | 50 (167) | 62 (186) | 57 (154) | 7 | 0.07
Benchmarking to normative data | N/A | 57 (170) | 56 (150) | |
Physician health program administration | 15 (51) | 56 (166) | 54 (147) | 39 | <0.001
Conducting quality improvement studies and/or projects | 41 (139) | 54 (162) | 54 (145) | 13 | 0.002
Concurrent medical record review | 54 (181) | 50 (149) | 51 (137) | −3 | 0.51
Producing educational programs for groups of clinicians | 34 (116) | 43 (130) | 44 (119) | 10 | 0.02
Other forms of direct observation | 23 (79) | 25 (76) | 28 (75) | 5 | 0.22

(a) Fisher's exact test for two proportions. (b) Not applicable (N/A): response option not offered.
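As a worked check of the Fisher's exact test footnoted in Tables 2 and 4, the snippet below rebuilds one comparison from the published counts: proctoring for new privileges, 47% (160/337) in 2007 versus 64% (174/270) in 2015.

```python
from scipy.stats import fisher_exact

# 2x2 table: rows are survey years, columns are (reported, did not report)
table = [[160, 337 - 160],   # 2007 (Table 2)
         [174, 270 - 174]]   # 2015 (Table 2)
odds_ratio, p_value = fisher_exact(table)
print(f"P = {p_value:.2g}")  # expected to agree with the reported P < 0.001
```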
Table 3 Case identification criteria for peer review 2007–15

Criterion | 2007 % (n), N = 337 | 2011 relative use(a), N = 297 | 2015 relative use(a), N = 270
Generic screens for or 'triggers' suggestive of adverse events | 96 (324) | 79 | 67
Physician or hospital staff 'concerns' | 91 (308) | 45 | 48
Unexplained deviation from protocols, pathways or specified clinical standards | 68 (228) | 21 | 21
Patient complaints | 85 (287) | 18 | 20
Statistical monitoring of process and/or outcomes measures | 46 (154) | 10 | 7
Core measures variances | N/A(b) | 12 | 5
Review of new privileges (FPPE(c)) | 40 (134) | 6 | 5
Quality improvement studies | 56 (189) | 5 | 4
Clinically 'interesting' cases | 42 (142) | 2 | 2
Rapid Response Team activation | N/A | N/A | 2
Random selection | 17 (56) | 3 | 1

(a) Standardized to a scale of 100 points for an item ranked #1 by all respondents. (b) Not applicable (N/A): response option not offered. (c) Focused professional practice evaluation (FPPE) for new privileges, a Joint Commission requirement.

Table 4 Process and outcome measures of peer review activity 2007–15

Measure | 2007 % (n), N = 329 | 2015 % (n), N = 270 | P-value(a)
Trends in adverse event rates (either globally or by event type) | 54 (178) | 66 (177) | 0.006
Trends in targeted clinical performance measures | 51 (168) | 57 (154) | 0.16
Case review volume | 49 (162) | 44 (118) | 0.19
Trends in individual or group performance on specific elements of care evaluated through the peer review process | 44 (145) | N/A(b) | N/A
Counts/patterns of system or process of care improvement opportunities identified | 38 (126) | 33 (90) | 0.23
Counts/patterns of recommendations for improved performance of individual clinicians | 41 (136) | 33 (88) | 0.03
Counts of 'corrective actions' under the medical staff bylaws involving hospital privileges | N/A | 32 (87) | N/A
Measures of medical staff engagement in quality and safety | N/A | 27 (72) | N/A
Turn-around-time for case review | 18 (59) | 24 (64) | 0.09
Counts/topics of recommendations for group education | 17 (57) | 19 (52) | 0.60
Case review backlog | 14 (47) | 16 (43) | 0.65
Counts of clinicians recognized for excellent performance | 6 (21) | 14 (39) | 0.001
Measures of medical staff satisfaction with the peer review program | N/A | 13 (35) | N/A
Reviewer productivity (e.g. average number of cases reviewed per reviewer per meeting or per month) | N/A | 5 (13) | N/A
Counts of cases self-referred for peer review | N/A | 5 (13) | N/A
Other | 1 (4) | 1 (3) | 1.00
Not applicable: we do not track and review any process or outcome measures in relation to our peer review program | 18 (59) | 15 (41) | 0.38

(a) Fisher's exact test for two proportions. (b) Not applicable (N/A).

Regular solicitation of input from reviewed clinicians is associated with higher program quality impact, as is reviewer training, but not reviewer compensation. A third of facilities train most reviewers in chart review methods, legal and risk management issues, and interpersonal skills. A quarter train them in QI methods. In 69% of facilities, the majority of case reviews are presented and discussed in a committee prior to final decision-making. Nurses, especially nurse leaders, sit on medical staff review committees at 58% of study hospitals. Some programs routinely assess nursing care during the case review process. Although they typically refer nursing issues to nursing for resolution, a few directly address all improvement opportunities.

The 11-item QI model scores increased a mean of 5.6 [2.9–8.3] points from 46.2 at study entry. Only 35% (n = 94) scored at least 60 points (i.e. 75% of 80), representing 'C' level progress in adopting the QI model. These hospitals accounted for all of the improvement: their current scores increased a mean of 21.2 [16.9–25.6] points over their first scores. Higher-scoring organizations could not be differentiated from the others on the basis of their first scores (one-way ANOVA, P = 0.59). Among the remaining facilities (n = 176), the mean score change was −2.8 [−5.59 to 0.0].
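The paired comparison of entry and follow-up scores described in the Methods can be sketched as follows; the scores below are simulated stand-ins, not the study data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
entry = rng.normal(46.2, 15, size=270)             # hypothetical scores at cohort entry
followup = entry + rng.normal(5.6, 20, size=270)   # hypothetical 2015 scores

diff = followup - entry
t_stat, p_value = stats.ttest_rel(followup, entry)

# 95% CI for the mean paired difference
se = diff.std(ddof=1) / np.sqrt(diff.size)
half_width = stats.t.ppf(0.975, df=diff.size - 1) * se
print(f"mean change = {diff.mean():.1f} "
      f"[{diff.mean() - half_width:.1f}, {diff.mean() + half_width:.1f}], P = {p_value:.3g}")
```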
The revised QI model inventory totals to 100 points and includes the 20 items presented in Table 5. The rank order highlights the improvement opportunity in terms of the average number of points which could be gained on each item from adoption of best practice. The Supplementary material presents the complete self-assessment inventory. The revised model retains all previously identified factors except the use of adverse event rates as a measure of program effectiveness; this item, as re-written, was not correlated with any outcome measure.

Table 5 Revised QI model factors and improvement opportunity

Factor | Points | Average improvement opportunity(a)
1. Likelihood of self-reporting adverse events, near misses and/or hazardous conditions | 10 | 7.0
2. Quality of case review | 10 | 6.4
3. Degree of standardization of peer review process | 10 | 5.3
4. Diligence of program governance | 10 | 4.2
5. Connecting the peer review program to the hospital's quality/safety/performance improvement process | 6 | 3.8
6. Excellent reviewer participation | 6 | 3.8
7. Aiming the program to improve quality and safety above all else | 8 | 2.9
8. Using a dashboard of process and outcome indicators to manage and improve the program(b) | 3 | 2.6
9. Recognizing outstanding clinical performance | 3 | 2.4
10. Routinely soliciting input to case review from involved clinicians(b) | 5 | 2.2
11. Using reliable rating scales to make subjective measures of clinical performance(c) | 2 | 1.7
12. Reviewing an adequate volume of cases to generate meaningful improvements in care delivery | 3 | 1.4
13. Measuring clinical performance as part of the case review process | 2 | 1.2
14. Sharing information about program effectiveness with the board of trustees | 2 | 1.0
15. Providing timely case review and communication of opportunities for improved performance | 5 | 0.9
16. Allowing for group discussion of most case reviews | 2 | 0.9
17. Looking for process improvement opportunities in every case review | 7 | 0.8
18. Reviewing pertinent diagnostic studies that influenced critical decisions, not just the reports | 2 | 0.5
19. Separating the peer review program from credentialing, even if the results of peer review inform credentialing activities(b) | 2 | 0.4
20. Program scope includes either concurrent review or case-specific, individually-targeted recommendations to improve performance(b) | 2 | 0.3
Total | 100 | 49.8

(a) Average QI model score points available from adoption of best practices. (b) Provisional additions to the QI model based on 2015 data. (c) Included in the QI model based on measurement reliability theory.
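Because the Table 5 weights sum to 100, a program's revised QI model score can be viewed as a weighted sum over the 20 items. The sketch below assumes each item is answered as a 0–1 conformance fraction; the actual response anchors are defined in the Supplementary self-assessment inventory, so the item keys and scaling here are illustrative only.

```python
# Table 5 weights (points) for the 20 revised QI model factors (illustrative keys)
WEIGHTS = {
    "self_reporting": 10, "case_review_quality": 10, "standardization": 10,
    "governance": 10, "qi_connection": 6, "reviewer_participation": 6,
    "quality_safety_aim": 8, "program_dashboard": 3, "recognition": 3,
    "clinician_input": 5, "reliable_rating_scales": 2, "adequate_volume": 3,
    "performance_measurement": 2, "board_reporting": 2, "timeliness": 5,
    "group_discussion": 2, "process_improvement_focus": 7, "diagnostics_review": 2,
    "credentialing_separation": 2, "concurrent_or_targeted_scope": 2,
}
assert sum(WEIGHTS.values()) == 100

def qi_model_score(conformance):
    """Weighted sum of per-item conformance fractions (0 = none, 1 = full); assumed scaling."""
    return sum(WEIGHTS[item] * min(max(frac, 0.0), 1.0)
               for item, frac in conformance.items())

# Hypothetical program that fully meets half the items and partially meets the rest
example = {item: (1.0 if i % 2 == 0 else 0.4) for i, item in enumerate(WEIGHTS)}
print(round(qi_model_score(example), 1))
```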
Revised QI model scores range from 0 to 96 with a median of 50 [32–68]. Only 13% score at least 75. A 10-point increase is associated with an OR of 2.5 [2.2–3.0] for a one-level increase in quality impact, 1.8 [1.6–2.1] for medical staff engagement in quality, 2.0 [1.8–2.3] for medical staff perceptions of the peer review program, and 1.5 [1.3–1.7] for physician-hospital relations. The revised score better predicts program impact than the original when comparing log likelihoods for the regressions (−267 vs. −304). Other things being equal, a higher revised score is itself best predicted by the relative importance of the peer review program compared with other QI activities, the quality of organizational leadership, openness to change, aiming at improved quality, and scoring at least 60 points on the original scale at cohort entry (adjusted R2 = 54%).

Discussion

This study affirms that the primary goal of clinical peer review in the USA is improved quality and safety of care and adds new dimensions to a best-practice model for achieving that aim. Most study hospitals have re-tooled their clinical peer review programs since 2007, but nearly two-thirds have failed to incorporate additional best practices.

The separation of the peer review program from credentialing activity is correlated with greater quality impact. Credentialing focuses on the competence required to perform requested privileges. Competence is an enduring quality, which is resilient to change. Questions of competence may be perceived as threatening because they put privileges, licensure and livelihood at risk. In contrast, case review deals with specific episodes of care to evaluate clinical performance, which may vary under the influence of organizational and human factors. Even the best methods are insufficiently reliable to justify a negative judgment of competence from a single case [15, 16].

The 2007 observation that hospitals use unreliable methods to 'score' the findings from case review still holds true. The three-level approach adopted by the Veterans Administration, which categorizes how others might have managed the case, is typical [17]. Such ratings do not lend themselves to aggregation and reflect a confusion of agreement with reliability [18]. The reliability of a measure is directly related to the observed variability. For example, if all physicians are rated above average, there is no variability in the ratings, so the ratings are unreliable even though the raters are in perfect agreement.

Medical staffs still rely on generic screens suggestive of adverse events to identify cases worthy of review. A well-supported alternative, namely Rapid Response Team activations, is rarely used [19, 20]. Self-reporting is also little used, even though it is the most efficient and effective method [21]. It surfaced in 2011 as one of the strongest independent predictors of overall program effectiveness. The willingness to self-report problems is a hallmark of safety culture and was critical to the transformation of aviation [22].
It could do the same for healthcare if given high priority: it is visible, easily measurable and requires the organization to consciously shift to a non-punitive peer review process typified by the QI model.

Managers in all industries commonly follow a 'dashboard' of key performance measures relevant to their area of responsibility. The use of adverse event rates as a measure of program effectiveness may have appeared in the original QI model as an artifact of item wording, since many organizations track these rates for other purposes. Adverse event rates are a less specific measure than the count of improvement opportunities identified by the program, the turn-around-time for case review or the count of clinicians whose excellent performance was worthy of recognition, all of which are strongly correlated with program effectiveness and, in combination, have been added to the QI model. Nevertheless, the four studies integrated in this longitudinal analysis showed that 25% of respondents cannot recall basic metrics such as case review volume. Moreover, given the primary aim to improve quality and safety, it is surprising that only a third track the process of care improvement opportunities identified. Thus, it seems clinical peer review is not generally treated as a core business process.

This study is limited by convenience sampling, so confidence intervals for the national population of hospitals cannot be projected. Teaching hospital practice is over-represented. The data are self-reported and un-audited. The survey items provide a relatively coarse-grained view of the process. Objective measures of program activity, including the proportion of reviewed cases with actions to improve clinical performance, tend to be closely protected and could not be accessed. There is also potential for non-response bias, although this is mitigated by the high response rate and congruence across four studies.

This study reveals a substantial gap between prevailing clinical peer review process and best practices in the pursuit of quality and safety. Successful translation of evidence into practice invariably requires consideration of the local environment [23]. It will take the cumulative work of many organizations consciously aiming at the QI model to further refine it and demonstrate winning strategies for its implementation. That work could be expedited through the improvement collaborative process, which has been successful in promoting adoption of specific clinical practice bundles [24, 25]. Studies of high-performing programs could also add value. Hospital and physician leaders could potentially accelerate progress in quality and safety by revisiting their clinical peer review practices in light of the evidence-based QI model.

Acknowledgements

The author appreciates the contributions of those who critiqued drafts of this manuscript, including George Helmrich, MD, Angelo P. Giardino, MD, PhD, Kymberlee A. Morrissey, BA, CPHQ, CNMT, RT(N), Deborah Ritchie, RN, BA and Larry Tonberg Edwards, PsyD.

Funding

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

References

1 Edwards MT, Benjamin EM. The process of peer review in US hospitals. J Clin Outcomes Manag 2009;16:461–7.
2 Edwards MT. The objective impact of clinical peer review on quality of care. Am J Med Qual 2011;26:110–9.
3 Edwards MT. A longitudinal study of clinical peer review's impact on quality and safety in U.S. hospitals. J Healthc Manag 2013;58:369–84.
4 Dershewitz RA, Gross RJ. Why medical audits are in disfavor. Arch Int Med 1980;140:168–9.
5 Sanazaro PJ, Mills DH. A critique of the use of generic screening in quality assessment. JAMA 1991;265:1977–81.
6 Hayward RA, Bernard AM, Rosevear JS et al. An evaluation of generic screens for poor quality of hospital care on a general medicine service. Med Care 1993;31:394–402.
7 Berwick DM. Peer review and quality management: are they compatible? Qual Rev Bull 1990;16:246–51.
8 Dans PE. Clinical peer review: burnishing a tarnished image. Ann Intern Med 1993;118:566–8.
9 Edwards MT. Clinical peer review program self-evaluation for US hospitals. Am J Med Qual 2010;25:474–80.
10 Edwards MT. Peer review: a new tool for quality improvement. Physician Exec 2009;35:54–9.
11 Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to their Development and Use, 3rd edn. New York, NY: Oxford University Press, 2003:38.
12 Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
13 The American Association for Public Opinion Research. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 8th edn. AAPOR, 2015. http://www.websm.org/uploadi/editor/doc/1447505066AAPOR_2015_Standard_Definitions.pdf (9 January 2018, date last accessed).
14 Edwards MT. An assessment of the impact of just culture on quality and safety in U.S. hospitals. Am J Med Qual 2018 [in press]. doi:10.1177/1062860618768057.
15 Hayward RA, McMahon LF, Bernard AM. Evaluating the care of general medicine inpatients: how good is implicit review? Ann Intern Med 1993;118:551–7.
16 Goldman RL, Ciesco E. Improving peer review: alternatives to unstructured judgments by a single reviewer. Jt Comm J Qual Improv 1996;22:762–9.
17 Meeks DW, Meyer AN, Rose B et al. Exploring new avenues to assess the sharp end of patient safety: an analysis of nationally aggregated peer review data. BMJ Qual Saf 2014;23:1023–30. doi:10.1136/bmjqs-2014-003239.
18 Edwards MT. Measuring clinical performance. Physician Exec 2009;35:40–3.
19 Braithwaite RS, DeVita MA, Mahidhara R et al. Use of medical emergency team (MET) responses to detect medical errors. Qual Saf Health Care 2004;13:255–9.
20 Kaplan LJ, Maerz LL, Schuster K et al. Uncovering system errors using a rapid response team: cross-coverage caught in the crossfire. J Trauma 2009;67:173–8.
21 Katz RI, Lagasse RS. Factors influencing the reporting of adverse perioperative outcomes to a quality management program. Anesth Analg 2000;90:344–50.
22 Edwards MT. Engaging physicians in patient safety through self-reporting of adverse events. Physician Exec 2012;38:46–52.
23 Pronovost PJ, Berenholtz SM, Needham DM. Translating evidence into practice: a model for large-scale knowledge translation. BMJ 2008;337:963–5.
24 Berenholtz SM, Pham JC, Thompson DA et al. Collaborative cohort study of an intervention to reduce ventilator-associated pneumonia in the intensive care unit. Infect Control Hosp Epidemiol 2011;32:305–14.
25 Pronovost PJ, Watson SR, Goeschel CA et al. Sustaining reductions in central line-associated bloodstream infections in Michigan intensive care units: a 10-year analysis. Am J Med Qual 2016;31:197–202. doi:10.1177/1062860614568647.

© The Author(s) 2018. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).

In pursuit of quality and safety: an 8-year study of clinical peer review best practices in US hospitals

Loading next page...
 
/lp/ou_press/in-pursuit-of-quality-and-safety-an-8-year-study-of-clinical-peer-4pOw0lJcPb
Publisher
Oxford University Press
Copyright
© The Author(s) 2018. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN
1353-4505
eISSN
1464-3677
D.O.I.
10.1093/intqhc/mzy069
Publisher site
See Article on Publisher Site

Abstract

Abstract Objectives Gather normative data on the goals of clinical peer review; refine a best-practice model and related self-assessment inventory; identify the interval progress towards best-practice adoption. Design Online survey (2015–16) of a cohort of 457 programs first studied by volunteer sampling in either 2007 or 2009 on 40 items assessing the degree of conformance to a validated quality improvement (QI) model and addressing program goals, structure, process, governance, and impact on quality and safety. Setting Acute care hospitals of all sizes in the USA. Study Participants Physicians and hospital leaders or hospital staff with intimate program knowledge. Intervention None. Main Outcome Measures Subjectively-rated program impact on quality and safety; QI model score. Results Two hundred and seventy responses (59% response rate) showed that clinical peer review most commonly aims to improve quality and safety. From 2007 to 2015, the median [inter-quartile range, IQR] annual rate of major program change was 20% [11–24%]. Mean [confidence interval, CI] QI model scores increased 5.6 [2.9–8.3] points from 46.2 at study entry. Only 35% scored at least 60 of 80 possible points—‘C’ level progress in adopting the QI model. The analysis supports expansion of the QI model and an associated self-assessment inventory to include 20 items on a 100-point scale for which a 10-point increase predicts a one level improvement in quality impact with an odds ratio [CI] of 2.5 [2.2–3.0]. Conclusions Hospital and physician leaders could potentially accelerate progress in quality and safety by revisiting their clinical peer review practices in light of the evidence-based QI model. clinical peer review, best practices, quality improvement, hospitals Background Peer review is used in a wide variety of healthcare settings, therefore the term may be subject to confusion. Here, we examine the routine clinical peer review programs found in all US hospitals, which invariably include retrospective medical record review of the quality of care [1]. These programs have served as the primary vehicle by which medical staff contribute to quality and safety improvement [2]. Clinical peer review appears to be the dominant mode of adverse event analysis in the hospital setting [3]. Despite its importance, clinical peer review process has received insufficient attention. For example, neither the goals of peer review nor the content of training offered to reviewers has been formally studied. Prevailing practice evolved as a consequence of 1980 Joint Commission standards that replaced a medical audit process which had failed to control escalating costs induced by the Medicare program [4]. About the same time, the first malpractice insurance crisis led hospitals to build risk management programs. The conjunction of these two forces led them to adopt ‘generic screens’ for sub-standard care as a means of peer review case identification, notwithstanding the lack of validation for that purpose [5, 6]. These practices have long been criticized as being out of touch with the evolution of systems thinking and quality improvement (QI) methods [7, 8]. Even so, a 2007-study showed that they persist [1]. The same study also identified a set of practices associated with a higher degree of perceived program impact on quality and safety. The authors termed it the QI model because most of the practices would logically follow from the systematic application of QI principles to the process of clinical peer review. 
In 2009, the QI model was subsequently validated in a different population against both subjective and objective measures of program impact [2, 9]. A short-term follow-up study of 2007 and 2009 participants conducted in 2011, refined the model, but showed little progress towards its adoption despite a 20% annual rate of major program change [3]. Among other practices, the QI model includes the standardization of peer review process, a focus on identifying opportunities for improved performance (as opposed to casting blame for error), promotion of self-reporting of adverse events, near misses and hazardous conditions, the quality of case review, timely performance feedback, recognition of clinical excellence, a solid connection between the peer review program and the organization’s QI process and attentive program governance [3]. A program self-assessment tool was published in 2009 [10]. This study in healthcare operations improvement was undertaken to evaluate long-term trends in the evolution of clinical peer review practices, gather data on program goals, further characterize best practice and refine the program self-assessment inventory. It's purpose was to guide healthcare leaders in re-designing clinical peer review programs to have maximum impact on the quality and safety of care. Methods The sample frame consisted of 457 identifiable acute care general and pediatric hospitals still operating from among those first studied in 2007 (n = 152 of 337) or 2009 (n = 305 of 330). The 2007-study was sponsored by the University HealthSystem Consortium (now Vizient), Premier Inc., the American College of Physician Executives (ACPE—now the American Association for Physician Leadership) and six state hospital associations each of whom solicited their membership for an online survey. The 2009 study was sponsored by the ACPE alone. Both studies captured data from hospitals of all sizes, but somewhat over-represented teaching hospitals. Prior survey instruments provided the framework for data collection—substituting new items addressing the goals of peer review, reviewer training and nursing involvement; and refining selected item wording to improve clarity. Respondents’ self-rated peer review program impact on quality and safety was the primary outcome variable. The survey also captured their reported medical staff perceptions of the process, physician engagement in quality and safety improvement activity, and overall physician-hospital relations. The scale for rating quality impact was expanded from 6 to 8 levels to enhance reliability [11]. Forty items populated four secure web pages. Conformance to the QI model was assessed as a ‘QI model score’ by modifying the 13-item self-assessment tool derived from the 2007-study [10] and validated against the 2009 cohort [9] to exclude two items (the use of clinical performance measurement and reliable rating scales), which derived from measurement theory rather than the results of regression modeling. In 2009, 27 paired ratings allowed estimation of inter-rater reliability of the survey instrument via the intra-class correlation coefficient [12] at 0.61 [0.31–0.80] and 11 duplicate responses gave an intra-rater reliability of 0.88 [0.63–0.97] [9]. Since these values adequately support aggregate comparisons and the general questionnaire design was comparable, a re-validation study was not done. The author solicited prior participants via email and phoned non-responding facilities to verify receipt or identify other potential respondents. 
Data collection extended from 5 October 2015 through 7 March 2016. The analysis included both complete and partial (3-page) responses with final disposition codes for the sample frame assigned at the hospital level according to 2015 AAPOR standards [13]. The longitudinal change in QI model scores was assessed using paired t-tests. Association of tabulated survey items with outcomes variables was evaluated using Pearson chi-square. Items meeting a cutoff of P < 0.05 were accepted as univariate correlates. They were further analyzed in multivariate models using ordinal logistical regression to estimate their independent contribution to program impact on quality and safety, retaining outlier values and selectively collapsing response levels. A regression equation was accepted only if all the factor coefficients and intercepts were significant at P < 0.05 and if both Pearson and Deviance goodness-of-fit tests met P > 0.1. This analysis served as the basis for an empirical revision of the self-assessment inventory and associated QI model score, weighting items in rough relation to their odds ratios (OR) and effects on other variables. Statistical analysis relied on Minitab version 15 (2007). Results Table 1 summarizes sample frame hospital characteristics. The study yielded 268 complete responses, two partial responses, six breakoffs, 39 refusals and one non-contact for an overall response rate of 59% (270/457). The informants were primarily senior leaders (44%) and mid-level managers (46%). About 62% (167/270) were physicians. Only 40% had participated in a prior study. There were no significant differences between respondent and non-respondent hospitals on the basis of prior survey QI model scores, program quality impact and medical staff perceptions. Table 1 Sample frame characteristics Respondents Non-respondents Sample total US dataa N = 270 N = 187 N = 457 N = 4936 Census regions  Northeast 60 36 96 680  South 61 62 123 1919  Midwest 95 48 143 1422  West 54 41 95 915 Staffed beds  >500 48 28 76 244  200–499 104 67 171 1160  50–199 90 78 168 2130  <50 28 14 42 1402 COTH Mmembersb 62 25 77 400 Respondents Non-respondents Sample total US dataa N = 270 N = 187 N = 457 N = 4936 Census regions  Northeast 60 36 96 680  South 61 62 123 1919  Midwest 95 48 143 1422  West 54 41 95 915 Staffed beds  >500 48 28 76 244  200–499 104 67 171 1160  50–199 90 78 168 2130  <50 28 14 42 1402 COTH Mmembersb 62 25 77 400 aAmerican Hospital Association 2009. bCouncil of Teaching Hospitals (https://www.aamc.org/members/coth/about/ accessed 20 August 2016). Table 1 Sample frame characteristics Respondents Non-respondents Sample total US dataa N = 270 N = 187 N = 457 N = 4936 Census regions  Northeast 60 36 96 680  South 61 62 123 1919  Midwest 95 48 143 1422  West 54 41 95 915 Staffed beds  >500 48 28 76 244  200–499 104 67 171 1160  50–199 90 78 168 2130  <50 28 14 42 1402 COTH Mmembersb 62 25 77 400 Respondents Non-respondents Sample total US dataa N = 270 N = 187 N = 457 N = 4936 Census regions  Northeast 60 36 96 680  South 61 62 123 1919  Midwest 95 48 143 1422  West 54 41 95 915 Staffed beds  >500 48 28 76 244  200–499 104 67 171 1160  50–199 90 78 168 2130  <50 28 14 42 1402 COTH Mmembersb 62 25 77 400 aAmerican Hospital Association 2009. bCouncil of Teaching Hospitals (https://www.aamc.org/members/coth/about/ accessed 20 August 2016). 
Due to limitations of space, this report focuses on the most salient results of the study, excluding information about Just Culture, which is reported separately [14]. Tabulated responses to all survey items, with measures of association to outcome variables, are presented in the Supplementary material.

Participants rated the extent to which various objectives describe the primary purpose and aim of their peer review program on a six-level categorical scale ranging from 'Most Strongly' to 'Not at All'. The rank ordering by mean rating was highest for 'Improve the quality and safety of care' (5.5), followed by 'Identify and remediate sub-standard care' (5.1). Having the primary aim of improving quality and safety is a significant multivariate predictor of program impact. For 79% of respondents, the peer review program is independent of credentialing, even if the results of peer review are used in credentialing decisions. Such separation emerged as another new multivariate predictor of program impact.

Each year since 2007, a median [IQR] of 20% [11–24%] of hospitals have made major changes to peer review program structure, process and/or governance. Only 5% reported no changes at all. Table 2 displays the evolution of the scope of activities that medical staffs place under the umbrella of their peer review programs. Table 3 gives the rank ordering of the methods used to identify cases for peer review. Data review and risk management activities remain the dominant channels through which these case identification criteria are applied. Table 4 compares the process and outcome measures currently used to monitor program effectiveness with 2007 practice.

Table 2 Clinical peer review program scope 2007–15

Activities | 2007 % (n), N = 337 | 2011 % (n), N = 300 | 2015 % (n), N = 270 | Change (%), 2015 vs. 2007 | P-valuea
Retrospective medical record review | 96 (324) | 97 (292) | 97 (261) | 1 | 0.83
Focused individual review of quality when serious concerns are raised | N/Ab | 91 (272) | 90 (244)
Ongoing professional practice evaluation | N/A | 82 (247) | 90 (243)
Focused professional practice evaluation for new privileges | N/A | 77 (232) | 82 (222)
Comparative evaluation of performance measures | 74 (249) | 82 (245) | 82 (222) | 8 | 0.02
Root cause analysis | 66 (223) | 77 (231) | 76 (204) | 9 | 0.01
Disruptive behavior management | N/A | 77 (232) | 75 (203)
Case-specific, individually-targeted recommendations to improve performance | N/A | 78 (234) | 74 (199)
Morbidity and mortality case conferences | 58 (196) | 57 (170) | 67 (181) | 9 | 0.03
Proctoring for new privileges | 47 (160) | 63 (188) | 64 (174) | 17 | <0.001
Development and/or review of clinical policies, order sets, etc. | N/A | 62 (187) | 64 (173)
Comparative evaluation of aggregate data from peer review | 50 (167) | 62 (186) | 57 (154) | 7 | 0.07
Benchmarking to normative data | N/A | 57 (170) | 56 (150)
Physician health program administration | 15 (51) | 56 (166) | 54 (147) | 39 | <0.001
Conducting quality improvement studies and/or projects | 41 (139) | 54 (162) | 54 (145) | 13 | 0.002
Concurrent medical record review | 54 (181) | 50 (149) | 51 (137) | −3 | 0.51
Producing educational programs for groups of clinicians | 34 (116) | 43 (130) | 44 (119) | 10 | 0.02
Other forms of direct observation | 23 (79) | 25 (76) | 28 (75) | 5 | 0.22

aFisher's exact test for two proportions, 2015 vs. 2007.
bNot applicable (N/A): response option not offered.

Table 3 Case identification criteria for peer review 2007–15

Criterion | 2007 % (n), N = 337 | 2011 relative usea, N = 297 | 2015 relative usea, N = 270
Generic screens for or 'triggers' suggestive of adverse events | 96 (324) | 79 | 67
Physician or hospital staff 'concerns' | 91 (308) | 45 | 48
Unexplained deviation from protocols, pathways or specified clinical standards | 68 (228) | 21 | 21
Patient complaints | 85 (287) | 18 | 20
Statistical monitoring of process and/or outcomes measures | 46 (154) | 10 | 7
Core measures variances | N/Ab | 12 | 5
Review of new privileges (FPPEc) | 40 (134) | 6 | 5
Quality improvement studies | 56 (189) | 5 | 4
Clinically 'interesting' cases | 42 (142) | 2 | 2
Rapid Response Team activation | N/A | N/A | 2
Random selection | 17 (56) | 3 | 1

aStandardized to a scale of 100 points for an item ranked #1 by all respondents.
bNot applicable (N/A): response option not offered.
cFocused professional practice evaluation (FPPE) for new privileges, a Joint Commission requirement.
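The P-values in Tables 2 and 4 compare 2007 and 2015 adoption proportions using Fisher's exact test. A minimal sketch of one such comparison, using the root cause analysis row of Table 2 (223 of 337 programs in 2007 vs. 204 of 270 in 2015), is shown below; the result may differ slightly from the published figure depending on the software's two-proportion procedure.

```python
from scipy.stats import fisher_exact

# Root cause analysis within peer review program scope (Table 2)
adopted_2007, n_2007 = 223, 337
adopted_2015, n_2015 = 204, 270

table = [[adopted_2007, n_2007 - adopted_2007],
         [adopted_2015, n_2015 - adopted_2015]]
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"OR = {odds_ratio:.2f}, P = {p_value:.3f}")  # P should be near the 0.01 reported
```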
Table 4 Process and outcome measures of peer review activity 2007–15

Measure | 2007 % (n), N = 329 | 2015 % (n), N = 270 | P-valuea
Trends in adverse event rates (either globally or by event type) | 54 (178) | 66 (177) | 0.006
Trends in targeted clinical performance measures | 51 (168) | 57 (154) | 0.16
Case review volume | 49 (162) | 44 (118) | 0.19
Trends in individual or group performance on specific elements of care evaluated through the peer review process | 44 (145) | N/Ab
Counts/patterns of system or process of care improvement opportunities identified | 38 (126) | 33 (90) | 0.23
Counts/patterns of recommendations for improved performance of individual clinicians | 41 (136) | 33 (88) | 0.03
Counts of 'corrective actions' under the medical staff bylaws involving hospital privileges | N/A | 32 (87)
Measures of medical staff engagement in quality and safety | N/A | 27 (72)
Turn-around-time for case review | 18 (59) | 24 (64) | 0.09
Counts/topics of recommendations for group education | 17 (57) | 19 (52) | 0.60
Case review backlog | 14 (47) | 16 (43) | 0.65
Counts of clinicians recognized for excellent performance | 6 (21) | 14 (39) | 0.001
Measures of medical staff satisfaction with the peer review program | N/A | 13 (35)
Reviewer productivity (e.g. average number of cases reviewed per reviewer per meeting or per month) | N/A | 5 (13)
Counts of cases self-referred for peer review | N/A | 5 (13)
Other | 1 (4) | 1 (3) | 1.00
Not applicable: we do not track and review any process or outcome measures in relation to our peer review program | 18 (59) | 15 (41) | 0.38

aFisher's exact test for two proportions.
bNot applicable (N/A).

Regular solicitation of input from reviewed clinicians is associated with higher program quality impact, as is reviewer training, but not reviewer compensation. A third of facilities train most reviewers in chart review methods, legal and risk management issues, and interpersonal skills. A quarter train them in QI methods. In 69% of facilities, the majority of case reviews are presented and discussed in a committee prior to final decision-making. Nurses, especially leaders, sit on medical staff review committees at 58% of study hospitals. Some programs routinely assess nursing care during the case review process. Although they typically refer nursing issues to nursing for resolution, a few directly address all improvement opportunities.

The 11-item QI model score increased by a mean [CI] of 5.6 [2.9–8.3] points from 46.2 at study entry. Only 35% (n = 94) scored at least 60 of the 80 possible points (75%), representing 'C'-level progress in adopting the QI model. These hospitals accounted for all of the improvement: their current scores exceeded their first scores by a mean of 21.2 [16.9–25.6] points. Higher-scoring organizations could not be differentiated from the others on the basis of their first scores (one-way ANOVA, P = 0.59). Among the remaining facilities (n = 176), the mean score change was −2.8 [−5.6 to 0.0].

The revised QI model inventory totals 100 points and includes the 20 items presented in Table 5. The rank order highlights the improvement opportunity in terms of the average number of points that could be gained on each item from adoption of best practice. Supplementary material presents the complete self-assessment inventory.
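As a rough illustration of how the revised 100-point inventory could be tallied, the sketch below weights each of the 20 items by the point values listed in Table 5 and multiplies each weight by a per-item adoption fraction. The item keys, and the idea of expressing each response as a 0–1 fraction of full best-practice adoption, are assumptions made purely for illustration; the actual response anchors and scoring rules are defined in the published self-assessment inventory (see Supplementary material).

```python
# Hypothetical tallying of the revised 100-point QI model score.
# Weights follow Table 5; per-item ratings (0.0-1.0 fraction of full
# best-practice adoption) are illustrative only.
ITEM_WEIGHTS = {
    "self_reporting_likelihood": 10,
    "case_review_quality": 10,
    "process_standardization": 10,
    "governance_diligence": 10,
    "aim_quality_safety": 8,
    "process_improvement_focus": 7,
    "qi_process_connection": 6,
    "reviewer_participation": 6,
    "clinician_input": 5,
    "timely_feedback": 5,
    "program_dashboard": 3,
    "recognition_of_excellence": 3,
    "adequate_case_volume": 3,
    "reliable_rating_scales": 2,
    "performance_measurement": 2,
    "board_reporting": 2,
    "group_discussion": 2,
    "diagnostic_study_review": 2,
    "separation_from_credentialing": 2,
    "concurrent_or_targeted_scope": 2,
}  # point values sum to 100, as in Table 5

def qi_model_score(ratings: dict) -> float:
    """Weighted sum of per-item adoption fractions on a 0-100 scale."""
    return sum(ITEM_WEIGHTS[k] * ratings.get(k, 0.0) for k in ITEM_WEIGHTS)

example = {k: 0.5 for k in ITEM_WEIGHTS}   # mid-level adoption on every item
print(qi_model_score(example))             # -> 50.0 under this illustrative scheme
```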
The revised model retains all previously identified factors except the use of adverse event rates as a measure of program effectiveness; as re-written, this item was not correlated with any outcome measure.

Table 5 Revised QI model factors and improvement opportunity

Factor | Points | Average improvement opportunitya
1. Likelihood of self-reporting adverse events, near misses and/or hazardous conditions | 10 | 7.0
2. Quality of case review | 10 | 6.4
3. Degree of standardization of peer review process | 10 | 5.3
4. Diligence of program governance | 10 | 4.2
5. Connecting the peer review program to the hospital's quality/safety/performance improvement process | 6 | 3.8
6. Excellent reviewer participation | 6 | 3.8
7. Aiming the program to improve quality and safety above all else | 8 | 2.9
8. Using a dashboard of process and outcome indicators to manage and improve the programb | 3 | 2.6
9. Recognizing outstanding clinical performance | 3 | 2.4
10. Routinely soliciting input to case review from involved cliniciansb | 5 | 2.2
11. Using reliable rating scales to make subjective measures of clinical performancec | 2 | 1.7
12. Reviewing an adequate volume of cases to generate meaningful improvements in care delivery | 3 | 1.4
13. Measuring clinical performance as part of the case review process | 2 | 1.2
14. Sharing information about program effectiveness with the board of trustees | 2 | 1.0
15. Providing timely case review and communication of opportunities for improved performance | 5 | 0.9
16. Allowing for group discussion of most case reviews | 2 | 0.9
17. Looking for process improvement opportunities in every case review | 7 | 0.8
18. Reviewing pertinent diagnostic studies that influenced critical decisions, not just the reports | 2 | 0.5
19. Separating the peer review program from credentialing, even if the results of peer review inform credentialing activitiesb | 2 | 0.4
20. Program scope includes either concurrent review or case-specific, individually-targeted recommendations to improve performanceb | 2 | 0.3
Total | 100 | 49.8

aAverage QI model score points available from adoption of best practices.
bProvisional additions to the QI model based on 2015 data.
cIncluded in the QI model based on measurement reliability theory.

Revised QI model scores range from 0 to 96, with a median [IQR] of 50 [32–68]. Only 13% score at least 75. A 10-point increase is associated with an OR [CI] of 2.5 [2.2–3.0] for a one-level increase in quality impact, 1.8 [1.6–2.1] for medical staff engagement in quality, 2.0 [1.8–2.3] for medical staff perceptions of the peer review program and 1.5 [1.3–1.7] for physician-hospital relations. The revised score better predicts program impact than the original when comparing log likelihoods for the regressions (−267 vs. −304). Other things being equal, a higher revised score is itself best predicted by the relative importance of the peer review program compared with other QI activities, the quality of organizational leadership, openness to change, aiming at improved quality and scoring at least 60 points on the original scale at cohort entry (adjusted R² = 54%).

Discussion

This study affirms that the primary goal of clinical peer review in the USA is improved quality and safety of care, and it adds new dimensions to a best-practice model for achieving that aim. Most study hospitals have re-tooled their clinical peer review programs since 2007, but nearly two-thirds have failed to incorporate additional best practices.

The separation of the peer review program from credentialing activity is correlated with greater quality impact. Credentialing focuses on the competence required to perform requested privileges. Competence is an enduring quality, which is resilient to change. Questions of competence may be perceived as threatening because they put privileges, licensure and livelihood at risk. In contrast, case review deals with specific episodes of care to evaluate clinical performance, which may vary under the influence of organizational and human factors. Even the best methods are insufficiently reliable to justify a negative judgment of competence from a single case [15, 16].

The 2007 observation that hospitals use unreliable methods to 'score' the findings from case review still holds true. The three-level approach adopted by the Veterans Administration, which categorizes how others might have managed the case, is typical [17]. Such ratings do not lend themselves to aggregation and reflect a confusion of agreement with reliability [18]. The reliability of a measure is directly related to the observed variability: if all physicians are rated above average, there is no variability in the ratings and they are therefore unreliable, even though the raters are in perfect agreement.

Medical staffs still rely on generic screens suggestive of adverse events to identify cases worthy of review. A well-supported alternative, namely Rapid Response Team activations, is rarely used [19, 20]. Self-reporting is also little used, even though it is the most efficient and effective method [21]. It surfaced in 2011 as one of the strongest independent predictors of overall program effectiveness. The willingness to self-report problems is a hallmark of safety culture and was critical to the transformation of aviation [22].
It could do the same for healthcare if given high priority: it is visible, easily measurable and requires the organization to consciously shift to the non-punitive peer review process typified by the QI model.

Managers in all industries commonly follow a 'dashboard' of key performance measures relevant to their area of responsibility. The use of adverse event rates as a measure of program effectiveness may have appeared in the original QI model as an artifact of item wording, since many organizations track these rates for other purposes. Adverse event rates are a less specific measure than the count of improvement opportunities identified by the program, the turn-around-time for case review or the count of clinicians whose excellent performance was worthy of recognition; all three are strongly correlated with program effectiveness and, in combination, have been added to the QI model. Nevertheless, the four studies integrated in this longitudinal analysis showed that 25% of respondents cannot recall basic metrics such as case review volume. Moreover, given the primary aim of improving quality and safety, it is surprising that only a third track the process-of-care improvement opportunities identified. Thus, clinical peer review does not appear to be generally treated as a core business process.

This study is limited by convenience sampling, so confidence intervals for the national population of hospitals cannot be projected. Teaching hospital practice is over-represented. The data are self-reported and un-audited. The survey items provide a relatively coarse-grained view of the process. Objective measures of program activity, including the proportion of reviewed cases with actions to improve clinical performance, tend to be closely protected and could not be accessed. There is also potential for non-response bias, although this is mitigated by the high response rate and by congruence across the four studies.

This study reveals a substantial gap between prevailing clinical peer review process and best practices in the pursuit of quality and safety. Successful translation of evidence into practice invariably requires consideration of the local environment [23]. It will take the cumulative work of many organizations consciously aiming at the QI model to further refine it and to demonstrate winning strategies for its implementation. That work could be expedited through the improvement collaborative process, which has been successful in promoting adoption of specific clinical practice bundles [24, 25]. Studies of high-performing programs could also add value. Hospital and physician leaders could potentially accelerate progress in quality and safety by revisiting their clinical peer review practices in light of the evidence-based QI model.

Acknowledgements

The author appreciates the contributions of those who critiqued drafts of this manuscript, including George Helmrich, MD, Angelo P. Giardino, MD, PhD, Kymberlee A. Morrissey, BA, CPHQ, CNMT, RT(N), Deborah Ritchie, RN, BA and Larry Tonberg Edwards, PsyD.

Funding

This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.

References

1. Edwards MT, Benjamin EM. The process of peer review in US hospitals. J Clin Outcomes Manag 2009;16:461–7.
2. Edwards MT. The objective impact of clinical peer review on quality of care. Am J Med Qual 2011;26:110–9.
3. Edwards MT. A longitudinal study of clinical peer review's impact on quality and safety in U.S. hospitals. J Healthc Manag 2013;58:369–84.
4. Dershewitz RA, Gross RJ. Why medical audits are in disfavor. Arch Intern Med 1980;140:168–9.
5. Sanazaro PJ, Mills DH. A critique of the use of generic screening in quality assessment. JAMA 1991;265:1977–81.
6. Hayward RA, Bernard AM, Rosevear JS et al. An evaluation of generic screens for poor quality of hospital care on a general medicine service. Med Care 1993;31:394–402.
7. Berwick DM. Peer review and quality management: are they compatible? Qual Rev Bull 1990;16:246–51.
8. Dans PE. Clinical peer review: burnishing a tarnished image. Ann Intern Med 1993;118:566–8.
9. Edwards MT. Clinical peer review program self-evaluation for US hospitals. Am J Med Qual 2010;25:474–80.
10. Edwards MT. Peer review: a new tool for quality improvement. Physician Exec 2009;35:54–9.
11. Streiner DL, Norman GR. Health Measurement Scales: A Practical Guide to Their Development and Use, 3rd edn. New York, NY: Oxford University Press, 2003:38.
12. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
13. The American Association for Public Opinion Research. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys, 8th edn. AAPOR, 2015. http://www.websm.org/uploadi/editor/doc/1447505066AAPOR_2015_Standard_Definitions.pdf (9 January 2018, date last accessed).
14. Edwards MT. An assessment of the impact of just culture on quality and safety in U.S. hospitals. Am J Med Qual 2018 [in press]. doi:10.1177/1062860618768057.
15. Hayward RA, McMahon LF, Bernard AM. Evaluating the care of general medicine inpatients: how good is implicit review? Ann Intern Med 1993;118:551–7.
16. Goldman RL, Ciesco E. Improving peer review: alternatives to unstructured judgments by a single reviewer. Jt Comm J Qual Improv 1996;22:762–9.
17. Meeks DW, Meyer AN, Rose B et al. Exploring new avenues to assess the sharp end of patient safety: an analysis of nationally aggregated peer review data. BMJ Qual Saf 2014;23:1023–30. doi:10.1136/bmjqs-2014-003239.
18. Edwards MT. Measuring clinical performance. Physician Exec 2009;35:40–3.
19. Braithwaite RS, DeVita MA, Mahidhara R et al. Use of medical emergency team (MET) responses to detect medical errors. Qual Saf Health Care 2004;13:255–9.
20. Kaplan LJ, Maerz LL, Schuster K et al. Uncovering system errors using a rapid response team: cross-coverage caught in the crossfire. J Trauma 2009;67:173–8.
21. Katz RI, Lagasse RS. Factors influencing the reporting of adverse perioperative outcomes to a quality management program. Anesth Analg 2000;90:344–50.
22. Edwards MT. Engaging physicians in patient safety through self-reporting of adverse events. Physician Exec 2012;38:46–52.
23. Pronovost PJ, Berenholtz SM, Needham DM. Translating evidence into practice: a model for large-scale knowledge translation. BMJ 2008;337:963–5.
24. Berenholtz SM, Pham JC, Thompson DA et al. Collaborative cohort study of an intervention to reduce ventilator-associated pneumonia in the intensive care unit. Infect Control Hosp Epidemiol 2011;32:305–14.
25. Pronovost PJ, Watson SR, Goeschel CA et al. Sustaining reductions in central line-associated bloodstream infections in Michigan intensive care units: a 10-year analysis. Am J Med Qual 2016;31:197–202. doi:10.1177/1062860614568647.

© The Author(s) 2018. Published by Oxford University Press in association with the International Society for Quality in Health Care. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).