Holistic rubric vs. analytic rubric for measuring clinical performance levels in medical students

Yune et al. BMC Medical Education (2018) 18:124. https://doi.org/10.1186/s12909-018-1228-9

Background: Task-specific checklists, holistic rubrics, and analytic rubrics are often used for performance assessments. We examined what factors evaluators consider important in holistic scoring of a clinical performance assessment, and compared the usefulness of applying holistic and analytic rubrics respectively, as well as applying these rubrics in addition to task-specific checklists based on traditional standards.

Methods: We compared the usefulness of a holistic rubric versus an analytic rubric in effectively measuring the clinical skill performance of 126 third-year medical students who participated in a clinical performance assessment conducted by Pusan National University School of Medicine. We conducted a questionnaire survey of 37 evaluators who used all three evaluation methods (holistic rubric, analytic rubric, and task-specific checklist) for each student. The relationships among the scores from the three evaluation methods were analyzed using Pearson's correlation. Inter-rater agreement was analyzed using the Kappa index. The effect of holistic and analytic rubric scores on the task-specific checklist score was analyzed using multiple regression analysis.

Results: Evaluators perceived accuracy and proficiency to be the major factors in objective structured clinical examination evaluation, and history taking and physical examination to be the major factors in clinical performance examination evaluation. Holistic rubric scores were highly related to the scores of the task-specific checklist and the analytic rubric. Relatively low agreement was found in clinical performance examinations compared to objective structured clinical examinations. Meanwhile, the holistic and analytic rubric scores explained 59.1% of the task-specific checklist score in objective structured clinical examinations and 51.6% in clinical performance examinations.

Conclusion: The results show the usefulness of holistic and analytic rubrics in clinical performance assessment; these rubrics can be used in conjunction with task-specific checklists for more efficient evaluation.

Keywords: Clinical assessment, Objective structured clinical examination, Feedback

Background

In medical education, a clinical performance assessment (CPA) is a criterion-referenced test that assesses competencies in the care of patients. The main issue is whether the standardization and objectivity of evaluations are reliably maintained in a complex, simulated, clinically relevant, and contextually appropriate setting [1]. In particular, the role of the evaluator as a major factor in the reliability of the CPA has often been discussed. Factors such as evaluator expertise, experience, and hawkishness may affect the CPA more than the evaluation items, because no single method of setting standards is perfect [2–4].

Holistic rubrics emphasize the use of experts to judge performance assessment. They comprise a comprehensive assessment of the complex, multi-faceted characteristics of the tasks undertaken and are based on the overall impression of the experts who implement them.
Since performance is not a sum of simple factors, the use of expert holistic rubrics is recognized as a useful evaluation method [5]. However, when assessment times are longer, evaluators generally tend to be more lenient and are more likely to overlook students' failure factors due to evaluator fatigue [6]. At the beginning of an evaluation, evaluators are reported to exaggerate due to inexperience, while in the latter stage they exaggerate due to fatigue and comparisons with applicants who have already been assessed. Evaluation of clinical performance through an evaluation matrix has been recommended to avoid evaluator errors [7]. However, it has been pointed out that evaluations using task-specific checklists covering many criteria have, to some extent, difficulty evaluating competency, and that there is a limit to the effect of the evaluator's expertise in such evaluations [8].

Due to the limitations of task-specific checklist evaluation, it has been proposed that evaluators use a global rating scale [9, 10]. A global rating scale is a holistic rubric that provides a single score based on an overall impression of a student's clinical performance and can be used in tandem with the task-specific checklist [11]. Global rating scale assessments are easy to develop and use, but errors are likely to occur. For example, factors related to applicants may cause a "halo effect"; additionally, the "central tendency," whereby the evaluator tends to select the middle of the range, may also cause errors. Holistic rubrics can use behaviorally anchored scales or Likert scales. A global rating scale can be readily used to set standards for performance [12]. However, to use a global rating scale, it is necessary to clearly present pre-shared criteria, i.e., rubrics, against which learners' achievements are assessed.

In this respect, analytic rubrics have the advantage of reducing evaluator errors on a global rating scale. Analytic rubrics are more reliable than holistic rubrics in that they check the key content rather than providing a holistic evaluation [13]. Analytic rubrics provide specific feedback according to several sections or dimensions, allow students to identify which factors are missing from the holistic rubric, and enable continuous monitoring [14]. Analytic scoring refers to the process of assigning points to individual traits present in the performance and adding the points to derive single or multiple dimension scores. For example, students with low scores in aseptic manipulation can be educated separately, and their progress can be monitored to confirm the degree of improvement in aseptic manipulation ability in the next CPA.

However, little research has been conducted to determine whether holistic rubrics or analytic rubrics are more useful, or to examine how such a rubric-based evaluation approach can be used as a more effective tool than task-specific checklist evaluation. Therefore, this study examined what factors evaluators consider important in holistic scoring in a CPA and what factors evaluators recognize as useful in holistic grading. The usefulness of the two rubrics was also compared by applying holistic and analytic rubrics in addition to task-specific checklists based on traditional standards. The four overarching research questions guiding the study were as follows: (1) What are the evaluators' perceptions regarding key factors in determining OSCE and CPX holistic scoring? (2) Are there correlations among the scores of the task-specific checklist, holistic rubric, and analytic rubric in OSCEs and CPXs? (3) Is there agreement between pass and fail frequencies based on the task-specific checklist, holistic rubric, and analytic rubric in OSCEs and CPXs? (4) What is the effect of the holistic rubric and analytic rubric on the task-specific checklist score in OSCEs and CPXs?

Methods

Participants and data

This study utilized the data of 126 third-year students who participated in a CPA held over two days in 2016, led by Pusan National University School of Medicine. The study was reviewed by the Institutional Review Board of Pusan National University Yangsan Hospital and given exempt status (IRB No. 05-2017-102). Because we analyzed data retrospectively and anonymously by assigning each subject a distinct number, the institutional review board did not require informed consent from participants. The CPA was operated with 12 stations per day, including six clinical performance examination (CPX) stations and six objective structured clinical examination (OSCE) stations. For each of the CPX and OSCE, 24 professors participated as evaluators, with each professor evaluating 31–32 students. The total number of evaluations was 750. Evaluations were used for statistical analysis only if they included analytic rubric and global rating scale results in addition to the task-specific checklist, which was the mandatory assessment method. A total of 292 CPX evaluation cases (38.9%) and 488 OSCE evaluation cases (65.1%) were used in the final analysis. In addition, 37 evaluators (77.1%) responded to a questionnaire.
Materials

A questionnaire was administered to the evaluators who participated in the CPA. The item on the most important factor recognized in the global rating scale read as follows: "If you evaluate clinical performance with one item, what factors do you think are most important? ... In which case do you think that he/she is a high-performing student? (For example, accuracy, proficiency, sterility, success rate, etc.)." Evaluators assessed whether the holistic rubric for the CPA, which assigned a score from 0 to 4 and was developed according to a score-based criterion, could measure students' ability to perform clinically. The analytic rubrics were developed based on the results of a questionnaire administered to a faculty focus group. The CPX rubrics allocated 0–3 points each to several fields: history taking (contents, systematic), physical examination (contents, accuracy), patient education, and attitude. For the OSCE, 0–3 points were allocated to each of the four rubric items of proficiency, accuracy, asepticity, and success. For example, in the CPX we rated students on a 4-point scale from 3 to 0 points for history taking as follows: the student asked the standardized patient every single question (3 points); the student missed some questions (2 points); the student missed a lot of questions (1 point); the student did not do anything (0 points). In the OSCE, we rated students on a 4-point scale from 3 to 0 for asepticity as follows: the whole process was aseptic (3 points); patient safety was ensured but contamination occurred (2 points); contamination threatened patient safety (1 point); the student did not do anything (0 points). The total analytic rubric score was calculated by weighting each item by its importance: the CPX total was history taking × 40 + physical examination × 30 + patient education × 20 + attitude × 10, while the OSCE total was proficiency × 45 + accuracy × 30 + asepticity × 15 + success × 10. For the task-specific checklist, a score from 0 to 1 or 0 to 2 was allocated for each item, and the final score was the sum over all items. The CPX and OSCE task-specific checklists consisted of 19 to 28 items and 14 to 18 items, respectively.
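To make the weighting arithmetic concrete, the sketch below computes an analytic rubric total as described above. It is a minimal illustration rather than the scoring software used in the study: the function name, the field keys, and the example ratings are our own, and it assumes each field score is the 0–3 rating described above (with fields such as history taking, which have two sub-aspects, summed before weighting).

```python
# Weighted analytic rubric totals as described in the Methods. A minimal
# illustrative sketch: field names, the helper, and the example ratings
# are ours, not the study's scoring software.

CPX_WEIGHTS = {"history_taking": 40, "physical_exam": 30,
               "patient_education": 20, "attitude": 10}
OSCE_WEIGHTS = {"proficiency": 45, "accuracy": 30,
                "asepticity": 15, "success": 10}

def analytic_total(scores, weights):
    """Sum each field's rating weighted by the field's importance."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the rubric fields")
    return sum(scores[f] * weights[f] for f in weights)

# A hypothetical OSCE candidate rated 0-3 on each of the four items:
osce = {"proficiency": 2, "accuracy": 3, "asepticity": 2, "success": 3}
print(analytic_total(osce, OSCE_WEIGHTS))  # 2*45 + 3*30 + 2*15 + 3*10 = 240
```

With four 0–3 items and weights summing to 100, OSCE totals range from 0 to 300, which is consistent with the OSCE analytic rubric mean of 212.0 ± 52.7 reported in Table 2.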
Statistical analysis

The contents of the questionnaire responses regarding important factors in the global rating scale of the CPA were analyzed qualitatively and by frequency. The relationships among the global rating scale, analytic rubric scores, and task-specific checklist total scores were examined using Pearson correlation analysis. For each of the task-specific checklist scores, holistic score, and analytic rubric scores, students who scored more than 1 SD below the average were regarded as having failed the assessment. The pass/fail agreement between the three evaluations was then examined using the Kappa coefficient. Theoretically, a Kappa value greater than 0.8 indicates almost perfect agreement, a value above 0.6 denotes substantial agreement, 0.4 to 0.6 denotes moderate agreement, and 0.2 to 0.4 denotes only fair agreement [15]. To ascertain which factors had the greatest effect on the task-specific checklist total scores, multiple regression analysis was performed using the holistic rubric score and analytic rubric scores as independent variables and the task-specific checklist total score as the dependent variable. Statistical analyses were performed using SPSS version 21.0 for Windows (SPSS Inc., Chicago, USA).
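As an outline of this pipeline, the sketch below derives pass/fail labels from the 1 SD cutoff and computes a Pearson correlation and a Kappa coefficient. It is a hedged illustration, not the study's analysis: the data are synthetic, the helper pass_fail is our own, and it uses scipy and scikit-learn in place of SPSS.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

def pass_fail(scores):
    """Fail = score more than 1 SD below the group mean, per the Methods."""
    return np.where(scores < scores.mean() - scores.std(), "fail", "pass")

rng = np.random.default_rng(0)            # synthetic stand-in data
checklist = rng.normal(12.5, 2.9, 488)    # task-specific checklist totals
holistic = np.clip(np.round(checklist / 5 + rng.normal(0, 0.5, 488)), 0, 4)

r, p = pearsonr(checklist, holistic)      # relationship between methods
kappa = cohen_kappa_score(pass_fail(checklist), pass_fail(holistic))
print(f"r = {r:.3f} (P = {p:.3g}), Kappa = {kappa:.3f}")
```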
Results

Evaluators' perceptions of key factors in determining CPA holistic scoring

In the OSCE, accuracy was the most important factor for evaluators with fewer than six previous evaluation experiences, while evaluators who had participated six or more times recognized proficiency as the most important factor. Evaluators who were professors with less than 10 years of experience recognized accuracy as the most important factor, while professors with more than 10 years of experience considered both accuracy and proficiency to be most important. Overall, evaluators recognized accuracy as the most important factor, followed by proficiency. In the CPX, evaluators recognized history taking as the most important factor, followed by physical examination, regardless of the number of evaluation experiences and the duration of the working period (Table 1).

Table 1 Evaluators' perceptions of key factors in determining the OSCE and CPX holistic rubric scoring

OSCE                      n    Asepticity  Accuracy  Proficiency  Success
Evaluation experience
  < 6 times              12    2           7         3            2
  ≥ 6 times               7    1           3         4            –
  Subtotal               19    3           10        7            2
Faculty career
  < 10 years              7    1           4         1            2
  ≥ 10 years             12    2           6         6            –
  Subtotal               19    3           10        7            2

CPX                       n    History taking  Physical examination  Patient education  Attitude
Evaluation experience
  < 6 times              16    15              11                    3                  1
  ≥ 6 times               2    2               –                     –                  –
  Subtotal               18    17              11                    3                  1
Faculty career
  < 10 years             12    9               6                     2                  –
  ≥ 10 years              6    8               5                     1                  1
  Subtotal               18    17              11                    3                  1

Multiple responses were allowed. OSCE: objective structured clinical examination; CPX: clinical performance examination.

Relationship among holistic score, analytic rubric scores, and task-specific checklist scores

In the OSCE, the task-specific checklist scores showed a strong positive correlation with the holistic score and the analytic rubric scores (r = 0.751, P < 0.001 and r = 0.697, P < 0.001, respectively). The holistic score also had a strong positive correlation with the analytic rubric scores (r = 0.791, P < 0.001). In the CPX, the task-specific checklist scores showed a strong positive correlation with the holistic score and the analytic rubric scores (r = 0.689, P < 0.001 and r = 0.613, P < 0.001, respectively). The holistic score also had a strong positive correlation with the analytic rubric scores (r = 0.655, P < 0.001) (Table 2).

Table 2 Correlations among task-specific checklist, holistic rubric, and analytic rubrics scores in the OSCE and CPX

OSCE (n = 488)               Mean ± SD      1       2
1. Task-specific checklist   12.5 ± 2.9     –
2. Holistic rubric           2.4 ± 0.8      0.751*  –
3. Analytic rubrics          212.0 ± 52.7   0.697*  0.791*

CPX (n = 291)                Mean ± SD      1       2
1. Task-specific checklist   28.1 ± 4.8     –
2. Holistic rubric           2.4 ± 0.7      0.689*  –
3. Analytic rubrics          400.4 ± 62.1   0.613*  0.655*

*P < 0.001. OSCE: objective structured clinical examination; CPX: clinical performance examination.
Inter-rater agreement among holistic score, analytic rubric scores, and task-specific checklist scores

In the OSCE, the task-specific checklist scores showed moderate agreement with the holistic score and the analytic rubric scores (Kappa = 0.441, P < 0.001 and Kappa = 0.429, P < 0.001, respectively). The holistic score also had moderate agreement with the analytic rubric scores (Kappa = 0.512, P < 0.001). Of the students who passed the task-specific checklist, 96.6% passed the holistic rubric and 87.3% passed the analytic rubrics, while of the students who failed the task-specific checklist, 40.0% failed the holistic rubric and 60.0% failed the analytic rubrics (Tables 3, 4).

In the CPX, the task-specific checklist scores showed fair agreement with the holistic score (Kappa = 0.351, P < 0.001) and moderate agreement with the analytic rubric scores (Kappa = 0.420, P < 0.001). The holistic score had only fair agreement with the analytic rubric scores (Kappa = 0.255, P < 0.001). Of the students who passed the task-specific checklist, 98.4% passed the holistic rubric and 92.6% passed the analytic rubrics, while of the students who failed the task-specific checklist, 27.7% failed the holistic rubric and 46.8% failed the analytic rubrics (Tables 3, 4).

Table 3 Pass and fail frequencies based on task-specific checklist, holistic rubric, and analytic rubrics scores in the OSCE and CPX

                                   Task-specific checklist [n (%)]
                                   Pass          Fail         Total
OSCE (n = 488)
  Holistic rubric    Pass          394 (96.6)    48 (60.0)    442 (90.6)
                     Fail          14 (3.4)      32 (40.0)    46 (9.4)
  Analytic rubrics   Pass          356 (87.3)    32 (40.0)    388 (79.5)
                     Fail          52 (12.7)     48 (60.0)    100 (20.5)
  Total                            408 (100.0)   80 (100.0)   488 (100.0)
CPX (n = 291)
  Holistic rubric    Pass          240 (98.4)    34 (72.3)    274 (94.2)
                     Fail          4 (1.6)       13 (27.7)    17 (5.8)
  Analytic rubrics   Pass          226 (92.6)    25 (53.2)    251 (86.3)
                     Fail          18 (7.4)      22 (46.8)    40 (13.7)
  Total                            244 (100.0)   47 (100.0)   291 (100.0)

OSCE: objective structured clinical examination; CPX: clinical performance examination. Students who scored more than 1 SD below the average were regarded as having failed the assessment.

Table 4 Kappa coefficients among task-specific checklist, holistic rubric, and analytic rubrics scores in the OSCE and CPX

                            Holistic rubric   Analytic rubrics   Task-specific checklist
OSCE (n = 488)
  Holistic rubric           –                 0.512*             0.441*
  Analytic rubrics          0.512*            –                  0.429*
  Task-specific checklist   0.441*            0.429*             –
CPX (n = 291)
  Holistic rubric           –                 0.255*             0.351*
  Analytic rubrics          0.255*            –                  0.420*
  Task-specific checklist   0.351*            0.420*             –

*P < 0.001. OSCE: objective structured clinical examination; CPX: clinical performance examination. Students who scored more than 1 SD below the average were regarded as having failed the assessment; the pass/fail agreement between the three evaluations was then examined using the Kappa coefficient.
Explanatory power of holistic rubric and analytic rubrics for the task-specific checklist

In the OSCE, multiple regression analysis showed that both the holistic score and the analytic rubric scores were statistically significant in predicting task-specific checklist scores, with an explanatory power of 59.1% (F = 352.37, P < 0.001); the holistic score was the most influential variable (β = 0.534, P < 0.001). All variables had variance inflation factors of less than 10 and tolerances of greater than 0.1, which shows that multicollinearity was not present. In the CPX, multiple regression analysis showed that both the holistic score and the analytic rubric scores were statistically significant in predicting task-specific checklist scores, with an explanatory power of 51.6% (F = 155.896, P < 0.001), and the holistic score (β = 0.503, P < 0.001) showed greater explanatory power than the analytic rubric scores (β = 0.283, P < 0.001) (Table 5).

Table 5 Effect of holistic rubric and analytic rubrics on task-specific checklist score by multiple regression in the OSCE and CPX

Independent variable    B       S.E.    β       t         R       Adj R²   F
OSCE (n = 488)
  Holistic rubric       1.972   0.346   0.534   11.279*   0.770   0.591    352.37*
  Analytic rubrics      0.015   0.003   0.274   5.793
CPX (n = 291)
  Holistic rubric       3.643   0.391   0.503   9.312*    0.721   0.516    155.896*
  Analytic rubrics      0.022   0.004   0.283   5.234

*P < 0.001. OSCE: objective structured clinical examination; CPX: clinical performance examination.
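For readers who want to reproduce this type of analysis outside SPSS, the sketch below fits the same two-predictor regression on placeholder data; z-scoring the variables first makes the fitted slopes standardized β values of the kind reported in Table 5. The function name and the simulated data are our own assumptions, not the study's.

```python
import numpy as np
import statsmodels.api as sm

def standardized_fit(y, X):
    """OLS on z-scored variables, so the slopes are standardized betas."""
    z = lambda a: (a - a.mean(axis=0)) / a.std(axis=0)
    return sm.OLS(z(y), sm.add_constant(z(X))).fit()

rng = np.random.default_rng(1)            # placeholder data, not the study's
holistic = rng.normal(2.4, 0.8, 488)
analytic = 150 + 25 * holistic + rng.normal(0, 30, 488)
checklist = 2 + 2.0 * holistic + 0.015 * analytic + rng.normal(0, 1.5, 488)

fit = standardized_fit(checklist, np.column_stack([holistic, analytic]))
print(fit.params[1:])                 # standardized betas (holistic, analytic)
print(fit.rsquared_adj, fit.fvalue)   # explanatory power and overall F
```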
Discussion

Evaluators recognized accuracy as the most important factor in the OSCE, followed by proficiency. In the CPX, history taking was the major factor, followed by physical examination. Based on these results, we developed the analytic rubrics and examined the relationship and agreement among the task-specific checklist, holistic rubric, and analytic rubrics.

In the correlation analysis, both the OSCE and the CPX showed strong positive correlations among the holistic score, analytic rubric scores, and task-specific checklist scores. In the Kappa analysis of evaluation agreement, the OSCE showed moderate agreement among the task-specific checklist, holistic rubric, and analytic rubrics. In the CPX, however, there was fair agreement between the holistic rubric and the task-specific checklist or analytic rubrics, and moderate agreement between the task-specific checklist and the analytic rubrics. Therefore, although the task-specific checklists had a strong relationship with the holistic rubric and analytic rubrics, there were some discrepancies between the three evaluation tools in the CPX compared to the OSCE. The lower inter-rater agreement in the CPX compared to the OSCE was more influenced by the evaluator, because the evaluation factors of the CPX include subjective items such as attitude and patient education, unlike the OSCE. In addition, in the task-specific checklist evaluation of the CPX, the doctor-patient relationship is evaluated by standardized patients, as opposed to evaluation by a faculty evaluator with clinical experience as a doctor. In a previous study [16], the correlation between the evaluation scores of the faculty evaluator and the standardized patient in the physical examination area was 0.91, while the correlation in the doctor-patient relationship was as low as 0.54; this means there may be differences in evaluation areas where objective verification is difficult. That study also pointed out that faculty evaluators and standardized patients may not perceive the doctor-patient relationship in the same way. Another previous study [17] on inter-rater agreement between faculty evaluators and standardized patients reported that Kappa values were lower in items related to history taking but higher in the physical findings, diagnosis, and management items. This evaluation difference between faculty evaluators and standardized patients can be explained in part by the ambiguous scoring criteria of checklist items, a lack of training to improve consistency between evaluators, and evaluators' fatigue [16].

To evaluate inter-rater agreement, the Kappa coefficient and percent agreement are considered together. In the present study, students who failed the task-specific checklist evaluation often passed the holistic evaluation or the analytic rubrics evaluation in the case of the CPX. These findings mean that it is more difficult for students to pass when they are evaluated with a large number of evaluation items. However, from the results of this study, it is difficult to determine whether the task-specific checklist, the holistic evaluation, or the analytic rubrics evaluation was the most reliable. Previous studies have argued that task-specific checklist evaluation of the OSCE cannot evaluate competency and that, because it is very specific and hierarchical, it is difficult using the checklist scores alone to distinguish beginners from experts in terms of the problem-solving ability to form accurate hypotheses with minimal information [7, 18]. Therefore, there is a growing emphasis on holistic assessment. Compared to task-specific checklists, holistic grading is superior in reliability and validity, as well as in sensitivity to expert skill level [9, 19], and shows consistently higher internal consistency reliability and higher inter-evaluator reliability [20]. However, further research is needed to generalize our findings to other academic environments.

Regression analysis showed that the holistic rubric and analytic rubrics accounted for 59.1% of the OSCE task-specific checklist score and 51.6% of the CPX task-specific checklist score. The most influential variable in predicting the task-specific checklist score in both the OSCE and the CPX was the holistic rubric score. In other words, evaluating a large number of checklist items in a CPA may be one way to increase reliability, but a holistic rubric can be a useful tool in terms of efficiency. An evaluator who is a clinical physician can quickly assess the degree of clinical performance and knows the determinants of overall clinical performance and how well the student is functioning. However, these evaluator determinations cannot be made properly by relying on task-specific checklists, and although objective checklists are often used, they are not the best way to assess clinical performance. Likewise, specific information on student performance can be difficult to obtain using a holistic rubric alone. Therefore, the concurrent use of analytic rubrics evaluation should also be considered when applying evaluation results to real practical situations.
Conclusion

In summary, this study demonstrates that the holistic rubric and analytic rubrics are efficient tools for explaining task-specific checklist scores. The holistic rubric explained task-specific checklist scores better than the analytic rubrics did. Further validation, however, is required to confirm these findings. Our findings will contribute to the development of evaluation tools that ensure the reliability and efficiency of the CPA widely used in medical education, while providing implications for the use of holistic evaluation of professional skills in the CPA.

Abbreviations
CPA: Clinical performance assessment; CPX: Clinical performance examination; OSCE: Objective structured clinical examination

Acknowledgements
The authors wish to thank the participants in the study.

Funding
This study was supported by a Biomedical Research Institute Grant (2015-27), Pusan National University Hospital.

Availability of data and materials
The datasets from this study are available from the corresponding author on request.

Authors' contributions
SJY and SYL conceptualized the study, developed the proposal, coordinated the project, completed the initial data entry and analysis, and wrote the report. SJY, BSK, SJI, and SYB assisted in writing and editing the final report. SJY and SYL participated in the overall supervision of the project and revision of the report. All authors read and approved the final manuscript.

Authors' information
So Jung Yune is associate professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea.
Sang Yeoup Lee is professor in the Department of Medical Education, Pusan National University School of Medicine, and the Department of Family Medicine, Pusan National University Yangsan Hospital, South Korea.
Sun Ju Im is associate professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea.
Bee Sung Kam is assistant professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea.
Sun Yong Baek is professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea.

Correspondence: saylee@pnu.edu. Department of Medical Education, Pusan National University School of Medicine, 49 Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do 50612, Republic of Korea; Family Medicine Clinic and Research Institute of Convergence of Biomedical Science and Technology, Pusan National University Yangsan Hospital, 49 Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do 50612, Republic of Korea.

Ethics approval and consent to participate
This study was reviewed and given exempt status by the Institutional Review Board of Pusan National University Yangsan Hospital (IRB No. 05-2017-102). Because we analyzed data retrospectively and anonymously by assigning each subject a distinct number, the institutional review board did not require informed consent from participants.

Competing interests
The authors declare that they have no competing interests.

Open Access
© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided appropriate credit is given to the original author(s) and the source, a link to the Creative Commons license is provided, and any changes are indicated. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 27 November 2017; Accepted: 17 May 2018

References
1. David N. Techniques for measuring clinical competence: objective structured clinical examinations. Med Educ. 2004;38(2):199–203.
2. Chesser A, Cameron H, Evans P, Cleland J, Boursicot K, Mires G. Sources of variation in performance on a shared OSCE station across four UK medical schools. Med Educ. 2009;43(6):526–32.
3. Harasym PH, Woloschuk W, Cunning L. Undesired variance due to examiner stringency/leniency effect in communication skill scores assessed in OSCEs. Adv Health Sci Educ Theory Pract. 2008;13(5):617–32.
4. Stroud L, Herold J, Tomlinson G, Cavalcanti RB. Who you know or what you know? Effect of examiner familiarity with residents on OSCE scores. Acad Med. 2011;86(10):S8–S11.
5. Slater SC, Boulet JR. Predicting holistic ratings of written performance assessments from analytic scoring. Adv Health Sci Educ. 2001;6(2):103–19.
6. McLaughlin K, Ainslie M, Coderre S, Wright B, Violato C. The effect of differential rater function over time (DRIFT) on objective structured clinical examination ratings. Med Educ. 2009;43(10):989–92.
7. Godfrey P, Fuller R, Homer M, Roberts T. How to measure the quality of the OSCE: a review of metrics - AMEE guide no. 49. Med Teach. 2010;32(10):802–11.
8. Winckel CP, Reznick RK, Cohen R, Taylor B. Reliability and construct validity of a structured technical skills assessment form. Am J Surg. 1994;167(4):423–7.
9. Cohen DS, Colliver JA, Robbs RS, Swartz MH. A large-scale study of the reliabilities of checklist scores and ratings of interpersonal and communication skills evaluated on a standardized-patient examination. Adv Health Sci Educ Theory Pract. 1996;1(3):209–13.
10. Regehr G, MacRae H, Reznick RK, Szalay DL. Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Acad Med. 1998;73(9):993–7.
11. Keynan A, Friedman M, Benbassat J. Reliability of global rating scales in the assessment of clinical competence of medical students. Med Educ. 1987;21(6):477–81.
12. Dauphinee W, Blackmore D, Smee S, Rothman A, Reznick R. Using the judgments of physician examiners in setting the standards for a national multi-center high stakes OSCE. Adv Health Sci Educ Theory Pract. 1997;2(3):201–11.
13. Goulden NR. Relationship of analytic and holistic methods to raters' scores for speeches. J Res Dev Educ. 1994;27(2):73–82.
14. Bharuthram S, Patel M. Co-constructing a rubric checklist with first year university students: a self-assessment tool. J Appl Lang Stud. 2017;11(4):35–55.
15. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
16. Park HK, Lee JK, Hwang HS, Lee JU, Choi YY, Kim H, Ahn DH. Periodical clinical competence, observer variation, educational measurement, patient simulation. Korean J Med Educ. 2003;15(2):141–50.
17. Kim JJ, Lee KJ, Choi KY, Lee DW. Analysis of the evaluation for clinical performance examination using standardized patients in one medical school. Korean J Med Educ. 2004;16(1):51–61.
18. Stillman P, Swanson D, Regan MB, Philbin MM, Nelson V, Ebert T, Ley B, Parrino T, Shorey J, Stillman A, Alpert E, Caslowitz J, Clive D, Florek J, Hamolsky M, Hatem C, Kizirian J, Kopelman R, Levenson D, Levinson G, McCue J, Pohl H, Schiffman F, Schwartz J, Thane M, Wolf M. Assessment of clinical skills of residents utilizing standardized patients: a follow-up study and recommendations for application. Ann Intern Med. 1991;114(5):393–401.
19. Hodges B, McIlroy JH. Analytic global OSCE ratings are sensitive to level of training. Med Educ. 2003;37(11):1012–6.
20. Walzak A, Bacchus M, Schaefer JP, Zarnke K, Glow J, Brass C, McLaughlin K, Irene WY. Diagnosing technical competence in six bedside procedures: comparing checklists and a global rating scale in the assessment of resident performance. Acad Med. 2015;90:1100–8.

Holistic rubric vs. analytic rubric for measuring clinical performance levels in medical students

Free
6 pages
Loading next page...
 
/lp/springer_journal/holistic-rubric-vs-analytic-rubric-for-measuring-clinical-performance-mpXuwVT5a4
Publisher
BioMed Central
Copyright
Copyright © 2018 by The Author(s).
Subject
Education; Medical Education; Theory of Medicine/Bioethics
eISSN
1472-6920
D.O.I.
10.1186/s12909-018-1228-9
Publisher site
See Article on Publisher Site

Abstract

Background: Task-specific checklists, holistic rubrics, and analytic rubrics are often used for performance assessments. We examined what factors evaluators consider important in holistic scoring of clinical performance assessment, and compared the usefulness of applying holistic and analytic rubrics respectively, and analytic rubrics in addition to task- specific checklists based on traditional standards. Methods: We compared the usefulness of a holistic rubric versus an analytic rubric in effectively measuring the clinical skill performances of 126 third-year medical students who participated in a clinical performance assessment conducted by Pusan National University School of Medicine. We conducted a questionnaire survey of 37 evaluators who used all three evaluation methods—holistic rubric, analytic rubric, and task-specific checklist—for each student. The relationship between the scores on the three evaluation methods was analyzed using Pearson’s correlation. Inter-rater agreement was analyzed by Kappa index. The effect of holistic and analytic rubric scores on the task-specific checklist score was analyzed using multiple regression analysis. Results: Evaluators perceived accuracy and proficiency to be major factors in objective structured clinical examinations evaluation, and history taking and physical examination to be major factors in clinical performance examinations evaluation. Holistic rubric scores were highly related to the scores of the task-specific checklist and analytic rubric. Relatively low agreement was found in clinical performance examinations compared to objective structured clinical examinations. Meanwhile, the holistic and analytic rubric scores explained 59.1% of the task-specific checklist score in objective structured clinical examinations and 51.6% in clinical performance examinations. Conclusion: The results show the usefulness of holistic and analytic rubrics in clinical performance assessment, which can be used in conjunction with task-specific checklists for more efficient evaluation. Keywords: Clinical assessment, Objective structured clinical examination, Feedback Background relevant, and contextually appropriate setting [1]. In par- In medical education, a clinical performance assessment ticular, the role of the evaluator as a major factor in the (CPA) is a criterion-referenced test that assesses compe- reliability of CPA has often been discussed. Such factors tencies in the care of patients. The main issue is whether as evaluator expertise, experience, and hawkishness may the standardization and objectivity of evaluations are re- affect CPA more than the evaluation items, because no liably maintained in a complex and simulated, clinically single method of setting standards is perfect [2–4]. Holistic rubrics emphasize the use of experts to judge performance assessment. They comprise a comprehen- * Correspondence: saylee@pnu.edu sive assessment of the complex multi-faceted character- Department of Medical Education, Pusan National University School of istics of the tasks undertaken and are based on the Medicine, 49, Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do 50612, Republic of Korea overall impression of the experts who implement them. 
Family Medicine Clinic and Research Institute of Convergence of Biomedical Since performance is not a sum of simple factors, the Science and Technology, Pusan National University Yangsan Hospital, 49, use of expert holistic rubrics is recognized as a useful Busandaehak-ro, Mulgeum-eup, Yangsan-si, Gyeongsangnam-do 50612, Republic of Korea © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Yune et al. BMC Medical Education (2018) 18:124 Page 2 of 6 evaluation method [5]. However, when assessment times holistic scoring in CPA, and what factors evaluators are longer, evaluators generally tend to be more lenient recognize as useful in holistic grading. The usefulness of and are more likely to overlook students’ failure factors these two rubrics was also compared by applying holistic due to evaluator fatigue [6]. At the beginning of an and analytic rubrics in addition to task-specific checklists evaluation, evaluators are reported to exaggerate due to based on traditional standards. The four overarching re- inexperience while, in the latter stage, they exaggerate search questions guiding the study were as follows: (1) due to fatigue and comparisons with applicants who What are the evaluators’ perceptions regarding key factors have already been assessed. Evaluation of clinical per- in determining OSCE and CPX holistic scoring? (2) Are formance through an evaluation matrix has been recom- there correlations among the scores of the task-specific mended to avoid evaluator errors [7]. However, it has checklist, holistic rubric, and analytic rubric in OSCEs and been pointed out that that to some extent, evaluations CPXs? (3) Is there agreement between pass and fail fre- using task-specific checklists covering many criteria have quencies based on the task-specific checklist, holistic ru- difficulties in evaluating competency, and that there is a bric, and analytic rubric in OSCEs and CPXs? (4) What is limit to the effects of the evaluator’s expertise in evalu- the effect of the holistic rubric and analytic rubric on the ation [8]. task-specific checklist score in OSCEs and CPXs? Due to the limitations of task-specific checklist evalu- ation, it has been proposed that evaluators use a global Methods rating scale [9, 10]. A global rating scale is a holistic ru- Participants and data bric that provides a single score based on an overall im- This study utilized the data of 126 third-year students pression of a student’s clinical performance and can be who participated in a CPA during two days in 2016, led used in tandem with the task-specific checklist [11]. Glo- by Pusan National University School of Medicine. This bal rating scale assessments are easy to develop and use, study was reviewed by the Institutional Review Board of but there is a high possibility that errors will occur. For Pusan National University Yangsan Hospital, and given example, factors related to applicants may cause a “halo exempt status (IRB No. 05–2017-102). 
Because we ana- effect”; additionally, the “central tendency,” whereby the lyzed data retrospectively and anonymously by assigning evaluator tends to select the middle range, may also each subject a distinct number, the institutional review cause errors. Holistic rubrics can use behaviorally an- board did not require informed consent from partici- chored scales or Likert scales. A global rating scale can pants. The CPA was operated with 12 stations per day, be readily used to set standards for performance [12]. including six clinical performance examination (CPX) However, to use the global rating scale, it is necessary to stations and six objective structured clinical examination clearly present pre-shared criteria, i.e., rubrics, to assess (OSCE) stations. For each of the CPX and OSCE, 24 learners’ achievements. professors participated as evaluators with each professor In this respect, analytic rubrics have the advantage of evaluating 31–32 students. The total number of evalua- reducing evaluator errors on a global rating scale. Ana- tions was 750. Evaluations were used for statistical ana- lytic rubrics are more reliable than holistic rubrics in lysis only if there were data including analytic rubric and that they check the key content, rather than providing a global rating scale results, in addition to task-specific holistic evaluation [13]. Analytic rubrics provide specific checklists as a mandatory assessment method. A total of feedback according to several sections or dimensions, 292 CPX evaluation cases (38.9%) and 488 OSCE allow students to identify which factors are missing from (65.1%) evaluation cases were used as data in the final the holistic rubric, and enable continuous monitoring analysis. In addition, 37 evaluators (77.1%) responded to [14]. Analytic scoring refers to a process of assigning a questionnaire. points to individual traits present in the performance and adding the points to derive single or multiple di- Materials mension scores. For example, students with low scores A questionnaire was administered to evaluators who par- in aseptic manipulation can be educated separately and ticipated in the CPA. The most important factor recog- their progress can be monitored to confirm the degree nized in the global rating scale was as follows: “If you of improvement in aseptic manipulation ability in the evaluate clinical performance with one item, what factors next CPA. do you think are most important?. .. In which case, do you However, little research has been conducted to deter- think that he/she is a high-performing student? (For ex- mine whether holistic rubrics or analytic rubrics are more ample, accuracy, proficiency, sterility, success rate, etc.).” useful, and to examine how such a rubric-based evaluation Evaluators assessed whether the holistic rubric for CPA, approach can be used as a more effective tool than a assigned a score from 0 to 4 and developed according to a task-specific checklist evaluation. Therefore, this study ex- score-based criterion, could measure students’ clinical amined what factors evaluators consider important in ability to perform. The analytic rubrics were developed Yune et al. BMC Medical Education (2018) 18:124 Page 3 of 6 based on the results of a questionnaire administered to from the average were regarded as having failed the as- the faculty focus group. The CPX rubrics allocated 0–3 sessment. 
The pass/fail agreement between the three points each to various fields, including history taking (con- evaluations was then examined using the Kappa coeffi- tents, systematic), physical examination (contents, accur- cient. Theoretically, if the Kappa value is greater than acy), patient education, and attitude. In the case of the 0.8, it is a perfect agreement, 0.6 denotes substantial OSCE, 0–3 points were allocated for each of the four ru- agreement, 0.4 to 0.6 denotes moderate agreement, and bric items of proficiency, accuracy, asepticity, and success. 0.2 to 0.4 denotes only a fair agreement [15]. To ascer- For example, in CPX, we rated students on a 4-point scale tain which factors had the greatest effect on the from 3 to 0 points in the context of history taking as fol- task-specific checklist total scores, multiple regression lows: the student asked the standard patient every single analysis was performed using the holistic rubric score question (3 point), the student missed some questions (2 and analytic rubric scores as independent variables and point), the student missed a lot of questions (1 point), and the task-specific checklist total scores as the dependent the student did not do anything (0 point). In the OSCE, variable. Statistical analysis of the data was performed we rated students on a 4-point scale from 3 to 0 in asepti- using SPSS version 21.0 for Windows (SPSS Inc., Chi- city as follows: the whole process was aseptic (3 point), pa- cago, USA). tient safety was ensured but contamination occurred (2 point), contamination threatened patient safety (1 point), Results and the student did not do anything (0 point). The sum of Evaluators’ perceptions of key factors in determining CPA the analytic rubric items was calculated by weighting the holistic scoring importance of each item. The sum of the CPX analytic ru- In OSCE, accuracy was the most important factor for bric was history taking × 40 + physical examination × 30 + evaluators who had less than six experiences of evalu- patient education × 20 + attitude × 10, while the sum of ation, while evaluators who had participated more than the OSCE analytic rubric was proficiency × 45 + accuracy six times recognized proficiency as the most important × 30 + asepticity × 15 + success × 10. In the case of the factor. Evaluators who were professors with less than task-specific checklist, a score from 0 to 1 or 0 to 2 was al- 10 years of experience recognized accuracy as the most located for each item, and the sum of the final scores was important factor, while professors with more than then obtained. The CPX and OSCE task-specific checklists 10 years of experience considered both accuracy and consisted of 19 to 28 items and 14 to 18 items, proficiency to be most important. Overall, evaluators respectively. recognized accuracy as the most important factor, followed by proficiency. In CPX, evaluators recognized Statistical analysis history taking as the most important factor, followed by The contents of the questionnaire responses to import- physical examination, regardless of the number of evalu- ant factors in the global rating scale of the CPA were an- ation experiences and duration of the working period alyzed qualitatively and for frequency. The relationship (Table 1). between the global rating scale, analytic rubric scores, and task-specific checklist total scores was examined Relationship among holistic score, analytic rubric scores, using Pearson correlation analysis. 
Taking into account and task-specific checklist scores the task-specific checklist scores, holistic score, and ana- In the OSCE, the task-specific checklist scores showed a lytic rubric scores, students who were less than 1SD strong positive correlation with holistic score and Table 1 Evaluators’ perceptions of key factors in determining the OSCE and CPX holistic rubric scoring Factors OSCE CPX n Asepticity Accuracy Proficiency Success n History taking Physical examination Patient education Attitude Evaluation experience < 6 times 12 2 7 3 2 16 15 11 3 1 ≥6 times 7 1 3 4 – 22 –– – Subtotal 19 3 10 7 2 18 17 11 3 1 Faculty career < 10 years 7 1 4 1 2 12 9 6 2 – ≥10 years 12 2 6 6 – 68 5 1 1 Subtotal 19 3 10 7 2 18 17 11 3 1 Multiple-response Yune et al. BMC Medical Education (2018) 18:124 Page 4 of 6 analytic rubric scores (r = 0.751, P < 0.001 and r = 0.697, Table 3 Pass and fail frequencies based on task-specific checklist, holistic rubric, and analytic rubrics scores in the OSCE and CPX P < 0.001, respectively). Holistic score also had a strong positive correlation with analytic rubric scores (r = 0.791, Task-specific checklist [n (%)] P < 0.001). In the case of CPX, the task-specific checklist Pass Fail Total scores showed a strong positive correlation with holistic OSCE (n = 488) score and analytic rubric scores (r = 0.689, P < 0.001 and Holistic rubric Pass 394 (96.6) 48 (60.0) 442 (90.6) r = 0.613, P < 0.001, respectively). Holistic score also had Fail 14 (3.4) 32 (40.0) 46 (9.4) a strong positive correlation with analytic rubric scores Analytic rubrics Pass 356 (87.3) 32 (40.0) 388 (79.5) (r = 0.655, P < 0.001) (Table 2). Fail 52 (12.7) 48 (60.0) 100 (20.5) Inter-rater agreement among holistic score, analytic Total 408 (100.0) 80 (100.0) 488 (100.0) rubric scores, and task-specific checklist scores CPX (n = 291) In the OSCE, the task-specific checklist scores showed a Holistic rubric Pass 240 (98.4) 34 (72.3) 274 (94.2) moderate agreement with holistic score and analytic ru- Fail 4 (1.6) 13 (27.7) 17 (5.8) bric scores (Kappa = 0.441, P < 0.001 and Kappa = 0.429, Analytic rubrics Pass 226 (92.6) 25 (53.2) 251 (86.3) P < 0.001, respectively). Holistic score also had a moder- Fail 18 (7.4) 22 (46.8) 40 (13.7) ate agreement with analytic rubric scores (r = 0.512, P < 0.001). Of the students who passed the task-specific Total 244 (100.0) 47 (100.0) 291 (100.0) checklist, 96.6% passed the holistic rubric and 87.3% OSCE; objective structured clinical examination, CPX; clinical performance examination passed the analytics rubrics, while of the students who Students who were less than 1SD from the average were regarded as having failed the task-specific checklist, 40.0% failed the holistic failed the assessment rubric, and 60% failed the analytic rubrics (Tables 3, 4). In CPX, the task-specific checklist scores showed a fair statistically significant in predicting task-specific checklist agreement with holistic score and analytic rubric scores scores, with an explanatory power of 51.6% (F = 155.896, (Kappa = 0.351, P < 0.001 and Kappa = 0.420, P < 0.001, P < 0.001), and holistic score (β =0.503, P < 0.001) showed respectively). Holistic score also had a moderate agree- greater explanatory power than analytic rubric scores (β ment with analytic rubric scores (Kappa = 0.255, P < =0.283, P < 0.001) (Table 5). 0.001). 
Of the students who passed the task-specific checklist, 98.4% passed the holistic rubric and 92.6% Discussion passed the analytic rubrics, while of the students who Evaluators recognized accuracy as the most important failed the task-specific checklist, 27.7% failed the holistic factor in OSCE, and then proficiency. In CPX, history rubric, and 46.8% failed the analytic rubrics (Tables 3, 4). taking was the major factor, followed by physical exam- ination. Based on these results, we developed an analytic Explanatory power of holistic rubric and analytic rubrics rubrics and examined the relationship and agreement for task-specific checklist among the task-specific checklist, holistic rubric, and In the OSCE, multiple regression analyses showed that analytic rubrics. both holistic score and analytic rubric scores were statisti- In the correlation analysis, both the OCSE and CPX cally significant in predicting task-specific checklist scores, showed a strong positive correlation among holistic score, with an explanatory power of 59.1% (F = 352.37, P < analytic rubric scores, and task-specific checklist scores. 0.001), while although holistic score was the most influen- In the Kappa coefficient for the evaluation agreement, the tial variable (β =0.534, P < 0.001). All variables had vari- OSCE showed a moderate agreement among task-specific ance inflation factors of less than 10 or tolerances of checklist, holistic rubric, and analytic rubrics. In the CPX, greater than 0.1, which shows that multicollinearity does however, there was fair agreement between holistic rubric not exist. In the CPX, multiple regression analyses showed and task-specific checklists or analytic rubrics, and moder- that both holistic score and analytic rubric scores were ate agreement between task-specific checklist and analytic Table 2 Correlations among task-specific checklist, holistic rubric, and analytic rubrics scores in the OSCE and CPX Factor OSCE (n = 488) CPX (n = 291) Mean ± SD 1 2 3 Mean ± SD 1 2 3 1. Task-specific checklist 12.5 ± 2.9 – 28.1 ± 4.8 – 2. Holistic rubric 2.4 ± 0.8 0.751* – 2.4 ± 0.7 0.689* – 3. Analytic rubrics 212.0 ± 52.7 0.697* 0.791* – 400.4 ± 62.1 0.613* 0.655* – *P < 0.001, OSCE; objective structured clinical examination, CPX; clinical performance examination Yune et al. BMC Medical Education (2018) 18:124 Page 5 of 6 Table 4 Kappa coefficient among task-specific checklist, holistic standardized patients can be explained in part by the am- rubric, and analytic rubrics scores in the OSCE and CPX biguous scoring criteria of checklist items, lack of training Holistic Analytic Task-specific to improve consistency between evaluators, and evalua- rubric rubrics checklist tors’ fatigue [16]. OSCE (n = 488) In order to evaluate inter-rater agreement, Kappa coeffi- Holistic rubric – 0.512* 0.441* cient and percent agreement are considered together. In the present study, students who failed the task-specific Analytic rubrics 0.512* – 0.429* checklist evaluation often passed the holistic evaluation or Task-specific checklist 0.441* 0.429* – the analytical rubrics evaluation in the case of the CPX. CPX (n = 291) These findings mean that it is more difficult for students Holistic rubric – 0.255* 0.351* to pass when evaluated with a large number of evaluation Analytic rubrics 0.255* – 0.420* items. 
However, in the results of this study, it is difficult to Task-specific checklist 0.351* 0.420* – determine whether the task-specific checklist, the holistic evaluation, or the analytical rubrics evaluation was more *P < 0.001, OSCE; objective structured clinical examination, CPX; clinical performance examination. Students who were less than 1SD from the average reliable. Previous studies have argued that task-specific were regarded as having failed the assessment. Then, the pass/fail agreement checklist evaluation of OSCE cannot evaluate competency between the three evaluations was examined using the Kappa coefficient and that it is very specific and hierarchical, so it is difficult, rubrics. Therefore, although task-specific checklists had a using the checklist scores alone to distinguish beginners strong relationship with holistic rubric and analytic ru- and experts in terms of problem-solving ability to form ac- brics, there are some discrepancies in the CPX between curate hypotheses with minimum information [7, 18]. the three evaluation tools compared to the OSCE. The Therefore, there is a growing emphasis on holistic assess- lower inter-rater agreement in the CPX as compare to ment. Compared to task-specific checklists, holistic grad- OSCE was more influenced by the evaluator, because the ing is superior in reliability and validity, as well as evaluation factor of the CPX includes certain subjective sensitivity to expertized skill level [9, 19], and shows con- items such as attitude or patient education, unlike the sistently higher internal consistency reliability and higher OCSE. In addition, in the task-specific checklist evaluation inter-evaluator reliability [20]. However, further research of the CPX, the doctor-patient relationship is evaluated by is needed to generalize our findings to other academic standardized patients, as opposed to evaluation by a fac- environments. ulty evaluator with clinical experience as a doctor. In a Regression analysis showed that the holistic rubric and previous study [16], the correlation between the evalu- analytic rubrics accounted for 59.1% of the OSCE ation scores of the faculty evaluator and standardized pa- task-specific checklist score and 51.6% of the CPX tient in the physical examination area was 0.91, while the task-specific checklist score. The most influential vari- correlation in the doctor-patient relationship was as low able in predicting the task-specific checklist score in as 0.54; this means there may be differences in evaluation both the OSCE and CPX was the holistic rubric score. areas where objective verification is difficult. It also In other words, evaluating a large number of checklist pointed out that the perception of doctor- patient rela- items for CPA may be one way to increase reliability, tionships may not be the same between faculty evaluators but a holistic rubric can be a useful tool in terms of effi- and standardized patients. Another previous study [17]on ciency. The evaluator as a clinical physician can quickly inter-rater agreement between faculty evaluators and stan- assess the degree of clinical performance and know the dardized patients reported that Kappa values were lower determinants of overall clinical performance and how in items related to history taking, but higher in the phys- well the student is functioning. However, these evaluator ical findings, diagnosis, and management items. 
This determinations cannot be conducted properly by relying evaluation difference between faculty evaluators and on task-specific checklists, and although objective Table 5 Effect of holistic rubric and analytic rubrics on task-specific checklist score by multiple regression in the OSCE and CPX Independent Variable B S.E. β t R Adj R F OSCE (n = 488) Holistic rubric 1.972 0.346 0.534 11.279* 0.770 0.591 352.37* Analytic rubrics 0.015 .003 0.274 5.793 CPX (n = 291)0 Holistic rubric 3.643 0.391 0.503 9.312* 0.721 0.516 155.896* Analytic rubrics 0.022 0.004 0.283 5.234 *P < 0.001, OSCE; objective structured clinical examination, CPX; clinical performance examination Yune et al. BMC Medical Education (2018) 18:124 Page 6 of 6 checklists are often used they are not the best way to as- Publisher’sNote Springer Nature remains neutral with regard to jurisdictional claims in sess clinical performance. Likewise, specific information published maps and institutional affiliations. on student performance can be difficult to obtain using holistic rubric alone. Therefore, the concurrent use of Received: 27 November 2017 Accepted: 17 May 2018 analytic rubrics evaluation should also be considered for applying evaluation results to real practical situations. References 1. David N. Techniques for measuring clinical competence: objective structured clinical examinations. Med Educ. 2004;38(2):199–203. Conclusion 2. Chesser A, Cameron H, Evans P, Cleland J, Boursicot K, Mires G. Sources of In summary, this study demonstrates that holistic rubric variation in performance on a shared OSCE station across four UK medical schools. Med Educ. 2009;43(6):526–32. and analytic rubrics are efficient tools for explaining 3. Harasym PH, Woloschuk W, Cunning L. Undesired variance due to examiner task-specific checklist scores. Holistic rubric can better stringency/leniency effect in communication skill scores assessed in OSCEs. explain task-specific checklist scores compared to ana- Adv Health Sci Educ Theory Pract. 2008;13(5):617–32. 4. Stroud L, Herold J, Tomlinson G, Cavalcanti RB. Who you know or what you lytic rubrics. Further validation, however, is required to know? Effect of examiner familiarity with residents on OSCE scores. Acad confirm these findings. Our findings will contribute to Med. 2011;86(10):S8–S11. the development of evaluation tools to ensure the reli- 5. Slater SC, Boulet JR. Predicting holistic ratings of written performance assessments from analytic scoring. Adv in Health Sci Educ. 2001;6(2):103–19. ability and efficiency of CPA widely used in medical edu- 6. McLaughlin K, Ainslie M, Coderre S, Wright B, Violato C. The effect of cation, while providing implications for the use of differential rater function over time (DRIFT) on objective structured clinical holistic evaluation of professional skills in CPA. examination ratings. Med Educ. 2009;43(10):989–92. 7. Godfrey P, Fuller R, Homer M, Roberts T. How to measure the quality of the OSCE: a review of metrics-AMEE guide no. 49. Med Teach. 2010;32(10):802–11. Abbreviations 8. Winckel CP, Reznick RK, Cohen R, Taylor B. Reliability and construct validity CPA: Clinical performance assessment; CPX: Clinical performance examination; of a structured technical skills assessment form. Am J Surg. 1994;167(4):423–7. OSCE: Objective structured clinical examination 9. Cohen DS, Colliver JA, Robbs RS. Swartz MH. 
Conclusion
In summary, this study demonstrates that the holistic rubric and analytic rubrics are efficient tools for explaining task-specific checklist scores, and that the holistic rubric explains those scores better than the analytic rubrics. Further validation, however, is required to confirm these findings. Our findings should contribute to the development of evaluation tools that ensure the reliability and efficiency of the CPAs widely used in medical education, while offering implications for the holistic evaluation of professional skills in CPA.

Abbreviations
CPA: Clinical performance assessment; CPX: Clinical performance examination; OSCE: Objective structured clinical examination

Acknowledgements
The authors wish to thank the participants in the study.

Funding
This study was supported by a Biomedical Research Institute Grant (2015–27) from Pusan National University Hospital.

Availability of data and materials
The datasets from this study are available from the corresponding author on request.

Authors' contributions
SJY and SYL conceptualized the study, developed the proposal, coordinated the project, completed the initial data entry and analysis, and wrote the report. SJY, BSK, SJI, and SYB assisted in writing and editing the final report. SJY and SYL supervised the project and revised the report. All authors read and approved the final manuscript.

Authors' information
So Jung Yune is associate professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea. Sang Yeoup Lee is professor in the Department of Medical Education, Pusan National University School of Medicine, and in the Department of Family Medicine, Pusan National University Yangsan Hospital, South Korea. Sun Ju Im is associate professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea. Bee Sung Kam is assistant professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea. Sun Yong Baek is professor in the Department of Medical Education, Pusan National University School of Medicine, South Korea.

Ethics approval and consent to participate
This study was reviewed and given exempt status by the Institutional Review Board of Pusan National University Yangsan Hospital (IRB No. 05–2017-102). Because the data were analyzed retrospectively and anonymously, with each subject assigned a distinct number, the institutional review board did not require informed consent from participants.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Received: 27 November 2017; Accepted: 17 May 2018; Published: 5 June 2018
References
1. Newble D. Techniques for measuring clinical competence: objective structured clinical examinations. Med Educ. 2004;38(2):199–203.
2. Chesser A, Cameron H, Evans P, Cleland J, Boursicot K, Mires G. Sources of variation in performance on a shared OSCE station across four UK medical schools. Med Educ. 2009;43(6):526–32.
3. Harasym PH, Woloschuk W, Cunning L. Undesired variance due to examiner stringency/leniency effect in communication skill scores assessed in OSCEs. Adv Health Sci Educ Theory Pract. 2008;13(5):617–32.
4. Stroud L, Herold J, Tomlinson G, Cavalcanti RB. Who you know or what you know? Effect of examiner familiarity with residents on OSCE scores. Acad Med. 2011;86(10):S8–S11.
5. Slater SC, Boulet JR. Predicting holistic ratings of written performance assessments from analytic scoring. Adv Health Sci Educ. 2001;6(2):103–19.
6. McLaughlin K, Ainslie M, Coderre S, Wright B, Violato C. The effect of differential rater function over time (DRIFT) on objective structured clinical examination ratings. Med Educ. 2009;43(10):989–92.
7. Pell G, Fuller R, Homer M, Roberts T. How to measure the quality of the OSCE: a review of metrics - AMEE guide no. 49. Med Teach. 2010;32(10):802–11.
8. Winckel CP, Reznick RK, Cohen R, Taylor B. Reliability and construct validity of a structured technical skills assessment form. Am J Surg. 1994;167(4):423–7.
9. Cohen DS, Colliver JA, Robbs RS, Swartz MH. A large-scale study of the reliabilities of checklist scores and ratings of interpersonal and communication skills evaluated on a standardized-patient examination. Adv Health Sci Educ Theory Pract. 1996;1(3):209–13.
10. Regehr G, MacRae H, Reznick RK, Szalay DL. Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Acad Med. 1998;73(9):993–7.
11. Keynan A, Friedman M, Benbassat J. Reliability of global rating scales in the assessment of clinical competence of medical students. Med Educ. 1987;21(6):477–81.
12. Dauphinee W, Blackmore D, Smee S, Rothman A, Reznick R. Using the judgments of physician examiners in setting the standards for a national multi-center high stakes OSCE. Adv Health Sci Educ Theory Pract. 1997;2(3):201–11.
13. Goulden NR. Relationship of analytic and holistic methods to raters' scores for speeches. J Res Dev Educ. 1994;27(2):73–82.
14. Bharuthram S, Patel M. Co-constructing a rubric checklist with first year university students: a self-assessment tool. J Appl Lang Stud. 2017;11(4):35–55.
15. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–74.
16. Park HK, Lee JK, Hwang HS, Lee JU, Choi YY, Kim H, Ahn DH. Periodical clinical competence, observer variation, educational measurement, patient simulation. Korean J Med Educ. 2003;15(2):141–50.
17. Kim JJ, Lee KJ, Choi KY, Lee DW. Analysis of the evaluation for clinical performance examination using standardized patients in one medical school. Korean J Med Educ. 2004;16(1):51–61.
18. Stillman P, Swanson D, Regan MB, Philbin MM, Nelson V, Ebert T, Ley B, Parrino T, Shorey J, Stillman A, Alpert E, Caslowitz J, Clive D, Florek J, Hamolsky M, Hatem C, Kizirian J, Kopelman R, Levenson D, Levinson G, McCue J, Pohl H, Schiffman F, Schwartz J, Thane M, Wolf M. Assessment of clinical skills of residents utilizing standardized patients: a follow-up study and recommendations for application. Ann Intern Med. 1991;114(5):393–401.
19. Hodges B, McIlroy JH. Analytic global OSCE ratings are sensitive to level of training. Med Educ. 2003;37(11):1012–6.
20. Walzak A, Bacchus M, Schaefer JP, Zarnke K, Glow J, Brass C, McLaughlin K, Ma IWY. Diagnosing technical competence in six bedside procedures: comparing checklists and a global rating scale in the assessment of resident performance. Acad Med. 2015;90:1100–8.
