How to Use a Randomized Clinical Trial Addressing a Surgical Procedure: Users’ Guide to the Medical Literature

Abstract

Because surgical procedures require clinicians to develop and maintain procedural expertise and because blinding in randomized clinical trials of such therapies is often challenging, their critical appraisal raises unique issues. Risk of bias of trials of surgical procedures increases if investigators fail to rigorously conceal allocation and, where possible, to ensure blinding of those involved in the trial. Variability in surgeons’ expertise can also increase bias and lead to important limitations in applicability. To address these issues, this Users’ Guide to the Medical Literature reviews the use of remote randomization systems, blinding, sham-controlled trials, split-body trials, expertise-based trials, and mechanistic vs practical trials. Consideration of risk of bias and applicability issues will allow clinicians to make optimal use of trials addressing surgical procedures.

Clinical Scenario

You are an orthopedic surgeon seeing a healthy 65-year-old woman with a displaced fracture of her right proximal humerus that occurred when she fell onto an outstretched hand. Her fracture involves the surgical neck and greater tuberosity of her dominant proximal humerus, and each fragment is displaced approximately 1 cm. You have treated many patients like this nonsurgically, but the patient tells you she has a friend who had a similar fracture and did well after an operation. The patient’s question brings to mind a notification you recently received on your mobile device for the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial; you let the patient know you will get back to her very shortly after reviewing the latest evidence on the matter.1 As you pull up the report, you wonder if there are any special issues to which you should attend when reviewing a randomized clinical trial (RCT) addressing a surgical procedure.
Challenges of Evaluating Surgical Procedures

Although research to evaluate surgical procedures has historically been dominated by nonrandomized observational studies, increasing awareness that randomization reduces bias by ensuring similar distributions of prognostic factors in the intervention and control groups has led to a marked increase in the conduct of surgical RCTs.2-4 This positive development raises new issues: trials of surgical procedures present special challenges in understanding and applying their results to patient management.5 Many of these challenges arise because, in contrast to pharmacological treatments, surgical procedures rely on procedural expertise. Variability in procedural expertise between the intervention and control groups can therefore influence outcomes and result in spurious inferences.6 Critical appraisal of RCTs of surgical procedures also requires addressing the frequent lack of blinding, which may be unavoidable but even if possible is typically much more challenging than in RCTs of drug therapies.7

This Users’ Guide to the Medical Literature presents a practical approach to assessing RCTs of surgical procedures using the 3-step approach of our Users’ Guides: assessment of risk of bias, results, and application to patient care (Box 1). This article highlights those issues that are specific to RCTs of surgical procedures.

Box 1. Users’ Guides Approach to a Randomized Clinical Trial Addressing a Surgical Procedure

How Serious Is the Risk of Bias?
  Did the intervention and control groups start with the same prognosis?
    Was allocation concealed? [a]
    Were patients in the treatment and control groups similar with respect to known prognostic factors?
  Was prognostic balance maintained as the study progressed?
    If possible, were participants, health care professionals, data collectors, outcomes assessors, data analysts, and/or those interpreting results blinded? [a]
  Were the groups prognostically balanced at the study’s completion?
    Was follow-up complete?
    Were patients analyzed in the groups to which they were randomized?
    Was stopping the trial early for benefit avoided?
  Were the interventions administered with similar expertise? [a]
What Are the Results?
  How large was the treatment effect?
  How precise was the estimate of the treatment effect?
How Can I Apply the Results to Patient Care?
  Were the study patients similar to my patient?
  Were the study interventions similar to interventions in my setting? [a]
  Were all patient-important outcomes considered?
  Are the likely treatment benefits worth the potential harms and costs?
[a] Includes issues specific to trials of surgical procedures.

How Serious Is the Risk of Bias?

Previous Users’ Guides have considered whether allocation was concealed, patients were similar with respect to known prognostic factors, blinding was implemented, follow-up was complete, patients were analyzed in the groups to which they were randomized, and investigators avoided early stopping for benefit.4,8-11 For RCTs of surgical procedures, allocation concealment and blinding each warrant further discussion, and the expertise with which the study interventions were administered presents a unique issue requiring consideration.

Did the Intervention and Control Groups Start With the Same Prognosis?

Randomization, if successful, creates groups with a similar likelihood of experiencing the outcomes of interest (ie, a similar prognosis).
Allocation concealment describes the extent to which individuals responsible for enrolling patients were unaware of and could not influence the study arm to which the randomization schedule assigned patients.8,12 Concealment refers not to the process of creating the randomization schedule or to the methods of blinding used to maintain prognostic balance as a study progresses, but rather to safeguarding the implementation of the randomization process. Trials with inadequate methods of allocation concealment may overestimate treatment effects, and trials of surgical procedures frequently implement methods that are less secure.8,13,14

Consider an RCT in which patients with appendicitis were randomly allocated to receive open or laparoscopic appendectomies, and the laparoscopic procedure but not the open procedure required the attending surgeons’ presence in the operating room.15 Resident physicians, responsible for recruiting and enrolling patients, obtained the treatment assignments from sealed envelopes. The residents were typically eager to perform procedures independently and were more often familiar and confident with open rather than laparoscopic appendectomy. When patients required surgery during the night, residents who were reluctant to call in their attending staff held up the envelopes to the light until they found one that contained an open procedure (Daryl R. Wall, MBBS, FRACS, written communication, June 9, 2000). If the patients who presented overnight were sicker or the care they received without the attending surgeons’ presence was inferior, the lack of concealment would have biased the results in favor of the laparoscopic procedure.4,16

Contrast this example with a typical blinded pharmacological trial in which study drugs are packaged and labeled before they are sent to each participating center, independent of the research coordinators who enroll patients.
This arrangement prevents the coordinators from knowing which medication the next patient will receive and thus secures the randomization sequence. To corrupt the sequence in a pharmacological trial, the coordinators would have to obtain the central randomization sequence and unblind the packaged medications. Circumventing randomization is far easier in surgical trials that implement envelopes (or other even less secure methods) for concealment.

Investigators of surgical RCTs can ensure allocation concealment by using remote web-based and 24-hour central telephone randomization services that require individuals who are enrolling patients to contact an independent source.12 Each contact is logged, and treatment assignments for each patient are provided only after eligibility has been confirmed. Although sealed, opaque, and sequentially numbered envelopes are preferable to envelopes that are unsealed, translucent, and not numbered, they remain vulnerable to tampering and are therefore less secure than remote randomization systems.

Randomization may fail to do its job of creating prognostic balance through chance or through failure to conceal allocation. Either way, one can appraise the success of randomization by examining the distributions of the baseline characteristics in the intervention and control groups, usually presented in the first table of results. Clinicians can be reassured when the known prognostic factors are similar, and be legitimately concerned if they are not.

Was Prognostic Balance Maintained as the Study Progressed?

Blinding is the process of withholding information about treatment assignments from groups of individuals who could introduce bias if they gained this information after patients were randomized. Investigators can blind pharmacological therapies easily using placebo medications; however, placebos are often not possible for surgical procedures.
Lack of blinding is an even more serious concern when, as is often the case in surgical RCTs, investigators focus on outcomes that are subjective (such as pain, function, quality of life, and satisfaction).3,7

There are 6 groups of individuals who should ideally be blinded in RCTs: participants, health care professionals, data collectors, outcomes assessors, data analysts, and the investigators responsible for interpreting the results (eTable in the Supplement).8,17 In trials of surgical procedures, it is sometimes possible to blind the participants and some health care personnel, but it is almost never possible to blind the surgeons. With careful planning it may be possible to blind the data collectors, is usually possible to blind outcomes assessors, and is always possible to blind data analysts and those interpreting the results.18

For example, consider the ReCharge trial in which 239 participants were randomly allocated to undergo surgical implantation of an active vagal nerve block device or a sham procedure to determine the effect of reversible intermittent intra-abdominal vagal nerve blockade on morbid obesity.19 The surgeons could not be blinded to the procedures that they were performing and the patients might have become unblinded by asking their surgeons which operation they had, so interaction between the surgeons and participants was limited postoperatively and blinded staff conducted patient follow-up.

Participants in RCTs of surgical procedures can sometimes be blinded using standardized wound coverings, digitally altered radiographs, or split-body designs (eAppendix in the Supplement).
However, this is much easier when the comparisons are between alternative but similar interventions (such as one surgical procedure vs another) than when the control is a nonsurgical intervention or standard of care.20-22 For outcomes that are subjective, placebo effects from surgical procedures are often substantial.23 Moreover, rituals, stressors, and environmental cues associated with admission, preparation, anesthesia, and recovery can heighten placebo effects associated with surgery because they may lead patients to have greater expectations for benefit.24 Sham-controlled RCTs can control for the placebo effect of surgery by ensuring that neither intervention nor control patients know whether they have undergone the active surgical procedure.24 Sham surgical procedures might seem to expose participants to unreasonable harms without promise of a direct benefit, but participants who receive sham treatments may not only experience substantial placebo effects but also receive additional monitoring, imaging, or clinic visits beyond standard practice.22 Sham-controlled RCTs are most effectively implemented by ensuring the relevant clinical community feels that there is equipoise about the relative effects of the intervention and control, minimizing risk, obtaining informed consent, and avoiding ongoing active deception.23,24 In the ReCharge trial, participants assigned to the sham procedure experienced operations similar to those in the active treatment group because both required similar laparoscopic surgical techniques under general anesthesia.19 Participants in the active treatment group experienced significantly greater weight loss than participants in the sham group, but the sham group nonetheless experienced 3-fold greater weight loss than was predicted, which suggests the influence of substantial placebo effects. When blinding is not possible or not undertaken, excluding placebo effects as an explanation of apparently positive results may not be possible. 
The extent to which concern about placebo effects undermines trial credibility is a matter of judgment and will differ according to circumstances. Placebo effects are much less of a concern if mortality is the outcome than if subjective symptoms are the outcome.

Were the Interventions Administered With Similar Expertise?

Randomized clinical trials of surgical procedures are at unique risk for additional bias due to potential differences in the expertise with which the interventions were administered. Differential expertise bias occurs when expertise is systematically greater in one treatment than in the other.6,25,26 Not only may surgeons be more experienced and skilled in performing an experimental or control procedure, but their conscious or subconscious investment in the superiority of one procedure may lead to differential administration of effective cointerventions (ie, administration of antibiotics, wound care, or early mobilization).

For example, consider an RCT in which 206 participants with ventral incisional hernias were randomly allocated to laparoscopic or open repair.27 If the surgeons in this trial had greater skill or investment in performing open vs laparoscopic repair (perhaps as a result of their training and experience), we would have expected differential expertise bias to favor the open procedure. In this trial, the results showed longer operative times, higher rates of perioperative complications, and higher rates of recurrence in the laparoscopic group. It remains uncertain whether the results would have been similar if potential bias due to differential expertise had been addressed (for instance, by randomizing patients to surgeons who did only open hernia repairs or those who did only laparoscopic hernia repairs, called an expertise-based RCT design).
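The allocation logic of an expertise-based design can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure of any trial cited here: the surgeon roster, the surgeon names, the function name `allocate_expertise_based`, and the use of simple (unblocked) randomization are all assumptions made for the example. A real trial would use a concealed, centrally administered sequence.

```python
import random

# Hypothetical roster: each surgeon performs only the procedure in which
# they are experienced. Names and group sizes are illustrative assumptions.
SURGEONS = {
    "open": ["surgeon_A", "surgeon_B"],
    "laparoscopic": ["surgeon_C", "surgeon_D"],
}


def allocate_expertise_based(patient_ids, rng):
    """Randomize each patient to an expertise group, then to a surgeon in it.

    Simple randomization is used here for brevity (an assumption); it does
    not guarantee equal group sizes.
    """
    allocations = {}
    for pid in patient_ids:
        # The unit of randomization is the expertise group, so the
        # procedure and the operator's expertise are assigned together.
        procedure = rng.choice(sorted(SURGEONS))
        surgeon = rng.choice(SURGEONS[procedure])
        allocations[pid] = (procedure, surgeon)
    return allocations


allocations = allocate_expertise_based(range(1, 9), random.Random(2016))
for pid, (procedure, surgeon) in allocations.items():
    print(pid, procedure, surgeon)
```

The point of the sketch is that no patient can be assigned a procedure performed by a surgeon outside that procedure's expertise group, which is what distinguishes this design from one in which every surgeon performs both procedures.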
Indeed, use of an expertise-based design in which patients are randomized to surgeons experienced and invested in the experimental treatment or to surgeons experienced and invested in the control treatment is the best way to guard against differential expertise bias. For example, in the Coronary Artery Bypass Grafting Off or On Pump Revascularization Study (CORONARY), 4752 participants were randomly allocated according to an expertise-based design between a group of surgeons who preferred a novel beating-heart technique and a group of surgeons who preferred a conventional cardiopulmonary bypass technique.28,29 To minimize possible confounding due to a learning curve, the trial included only surgeons who had completed at least 100 cases with their preferred procedure. Randomized clinical trials that compare surgical interventions with nonsurgical procedural interventions, such as trials of surgery vs physiotherapy, are expertise based by definition because the surgeons do not administer the physiotherapy and the physiotherapists do not administer the surgery.

Box 2 presents the conclusions of the risk of bias assessment for the PROFHER trial.1

Box 2. Using the Guide: How Serious Is the Risk of Bias?

In the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial, 250 participants with displaced proximal humerus fractures were randomly allocated to either surgical or nonsurgical treatment.1 A computer program generated the randomization sequence, and allocation concealment was achieved using independent remote randomization with randomly varied blocking. The participants in the intervention and control groups were similar with respect to their known prognostic factors except that there were more smokers in the nonsurgical group than in the surgical group (32% vs 19%, respectively).
The participants, health care professionals, data collectors, outcomes assessors, and data analysts were not blinded in this trial, but complete follow-up data were available for 92% of the participants, the participants were analyzed in the groups to which they were randomized, and the trial was not stopped early for benefit. Those allocated to surgery received internal fixation or humeral head replacement according to the preferences and familiarity of the participating surgeons and then received supervised postoperative physiotherapy in inpatient, outpatient, or community settings. Those allocated to nonsurgical care received a sling or hanging bandage for as long as necessary followed by supervised physiotherapy. You conclude that this trial is at low risk for bias with 2 exceptions: the results could be biased if either surgery or physiotherapy has a strong placebo effect on subjective outcomes, or if those who evaluated the outcomes were biased toward one treatment or the other. Still, the trial is sufficiently credible that you continue on to the results.

What Are the Results?

In discussing interpretation of results, previous Users’ Guides have discussed composite end points and noninferiority trials.8,30,31 These issues are equally relevant to RCTs of surgical procedures, and we will not address them in this article. We will, however, review the most basic considerations: measures of effect and measures of precision.

When considering the magnitude of a dichotomous (yes or no; dead or alive) treatment effect, relative measures of association can be misleading. For instance, a relative risk reduction of 50% sounds impressive. Indeed, it would be impressive if it reflected a reduction in death or surgical complications from 40% to 20% (a risk difference or absolute risk reduction of 20%). However, it might represent a reduction from 2% to 1% (a 1% risk difference) that may, in context, be trivial.
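The arithmetic behind these two scenarios can be made explicit. The short Python sketch below uses a hypothetical helper, `effect_measures` (a name introduced here for illustration), to compute the relative risk reduction, the risk difference (absolute risk reduction), and the reciprocal of the risk difference, the number needed to treat, for the two scenarios in the text. It assumes risks are given as proportions and that the treatment lowers risk.

```python
def effect_measures(control_risk, treatment_risk):
    """Return (relative risk reduction, absolute risk reduction, NNT).

    Risks are proportions between 0 and 1. The helper assumes the control
    risk is positive and the treatment lowers risk; otherwise the number
    needed to treat is undefined and an error is raised.
    """
    if control_risk <= 0 or treatment_risk >= control_risk:
        raise ValueError("expected 0 < treatment_risk < control_risk")
    arr = control_risk - treatment_risk  # risk difference
    rrr = arr / control_risk             # relative risk reduction
    nnt = 1 / arr                        # number needed to treat
    return rrr, arr, nnt


# The two scenarios from the text: 40% -> 20% and 2% -> 1%.
for control, treated in [(0.40, 0.20), (0.02, 0.01)]:
    rrr, arr, nnt = effect_measures(control, treated)
    print(f"{control:.0%} -> {treated:.0%}: "
          f"RRR {rrr:.0%}, ARR {arr:.0%}, NNT {nnt:.0f}")
```

Both scenarios share the same relative risk reduction, but their risk differences differ 20-fold, which is why the relative measure alone can mislead.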
Therefore, clinicians should also look for the risk difference or its inverse, the number needed to treat (100/20 or 5 when the risk difference is 20%; 100/1 or 100 when the risk difference is 1%) to understand the magnitude of effect.8,32

For continuous outcomes, clinicians must decide whether observed treatment effects are likely to be sufficiently large to justify changes in management given any foreseeable harms or costs.33 When outcomes are measured using instruments that are unfamiliar to clinicians (eg, score on a quality-of-life instrument), such decisions may be extremely challenging. Knowing the minimally important difference (the smallest difference that patients would consider important) is likely to be very helpful in interpreting the results.

When considering precision, clinicians should look for confidence intervals; 95% confidence intervals are typically reported. Clinicians should look to the upper and lower boundaries of confidence intervals to discern the largest and smallest possible treatment effects that, given the results, remain plausible.8,32 This is an important issue in surgical RCTs because they are often limited by imprecision due to small sample sizes.34-36 As a result, clinicians must consider whether the confidence intervals of an apparently negative trial exclude patient-important benefit (or harm).

Box 3 presents considerations in understanding results from the PROFHER trial.1

Box 3. Using the Guide: What Are the Results?

The primary analysis of the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial compared Oxford Shoulder Scores between the surgical and nonsurgical groups over 2 years of follow-up. The Oxford Shoulder Score is a continuous outcome that measures shoulder-related function, and 5 points was considered to be a minimally important difference. Secondary outcomes included scores from the 12-Item Short Form Health Survey, complications, and late interventions.
Investigators conducted sensitivity analyses to evaluate the impact of missing data, to control for potential confounding due to the differing proportion of smokers in each group, and to explore possible clustering across the participating centers. There were no significant differences between the groups for Oxford Shoulder Scores, and the 95% confidence intervals excluded a 5-point difference over 2 years (mean difference of 0.75 point in favor of the surgical group; 95% CI, −1.33 to 2.84; P = .48). There were also no significant differences in scores on the 12-Item Short Form Health Survey (mean difference of 1.77 points in favor of the surgical group; 95% CI, −0.84 to 4.30; P = .18) or rates of shoulder-related complications (30 in the surgical group vs 23 in the nonsurgical group; risk difference, 6%; 95% CI, −5% to 15%; P = .28). The rates of secondary surgery (11 vs 11) and serious adverse events (28 vs 28) were identical. Unadjusted and adjusted analyses yielded similar results.

How Can I Apply the Results to Patient Care?

To apply the results from an RCT to patient care, one must consider the extent to which a trial is applicable. The setting of the trial, the methods of selecting the trial participants, the characteristics of the participants, differences between the trial protocol and routine practice, the chosen outcome measures, the duration of follow-up, and the observed adverse effects can all influence applicability.8,32,37-40 A given trial could be perfectly applicable for some clinicians and their particular patient encounter, yet of limited applicability for others.

Were the Study Interventions Similar to Interventions in My Setting?

For RCTs of surgical procedures, applicability also depends on the extent to which the study interventions are similar to interventions in one’s own setting. Variability in individual surgeons’ expertise can affect their ability to achieve results from trials in their own practice.
For instance, consider a trial demonstrating that a novel surgical procedure conducted by surgeons already experienced in the new procedure appears superior to the standard. For a surgeon who has never used anything but the standard, training (potentially extensive) would be required before we could be confident that surgeon could achieve results demonstrated in the trial. Consider the Asymptomatic Carotid Atherosclerosis Study in which 1662 participants with asymptomatic carotid artery stenosis were randomly allocated to carotid endarterectomy surgery or optimal medical risk factor management.41 This trial selected a group of expert surgeons who had established low perioperative complication rates and found that carotid endarterectomy significantly reduced the overall risk of stroke or death from 11.0% to 5.1%. For surgeons in routine clinical practice, this trial demonstrated that carotid endarterectomy was beneficial as long as the surgeons maintained similar low perioperative complication rates.37,42 This example highlights how a trial can be useful for addressing one question but not at all useful for addressing an apparently similar but actually fundamentally different question. One may ask: what is the effect of a procedure, delivered by those most expert, in comparison with the competing management strategy? Alternatively, one may ask: what is the effect of the procedure when undertaken by the level of expertise one might expect in most communities? An RCT addressing the first question might be considered a mechanistic or explanatory trial that addresses the effect of an intervention administered under ideal testing circumstances.43 An RCT addressing the second question might be considered a practical or pragmatic trial that bears directly on health care decisions in practice. 
Whether a trial is mechanistic or practical may also depend on your perspective: if you are in a community of surgeons with exceptional expertise, the first and not the second trial may be practical from your point of view.

The Appendicitis Acuta trial illustrates the mechanistic vs practical perspective.44 In this trial, 530 participants with uncomplicated acute appendicitis were randomly allocated to receive either early appendectomy or a standardized regimen of antibiotic treatment and were followed up for 1 year. Of 257 participants in the antibiotic treatment group, 15 underwent appendectomy during their initial hospitalization because the surgeons suspected progressive infection, perforation, or peritonitis. From a mechanistic perspective that seeks the effect of surgery vs medical management, this trial might be considered problematic because some patients allocated to antibiotics underwent appendectomy. If, however, the real-world options are immediate surgery vs a wait-and-see approach in which one holds off surgery indefinitely if patients improve but only temporarily if they do not, this trial is eminently practical (and thus extremely helpful). Clinicians considering the applicability of the Appendicitis Acuta trial must also consider whether the nonoperative antibiotic regimen (3 days of intravenous ertapenem followed by 7 days of oral levofloxacin and metronidazole) was similar to the regimen that they would use and whether all patient-important outcomes were considered, including long-term complications of surgery.

Box 4 presents issues of applicability and the resolution of our clinical scenario.

Box 4. Using the Guide: How Can I Apply the Results to Patient Care?

You review the tables presenting baseline characteristics and note that the patients in the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial were very similar to your patient with respect to age, sex, injury mechanism, and fracture pattern.
You therefore find no compelling reason to doubt applicability to your patient. The report states that most of the surgical procedures (83%) involved open reduction and internal fixation with locking plates and that the operations were performed by attending surgeons (82%), supervised senior residents (12%), or independent fellows and senior residents who had immediate access to an attending surgeon if required (6%). Based on your training and experience, you are confident that you would offer a similar procedure and perform it with similar competence. On the other hand, the report also states that the physiotherapists provided a mean of 10 one-on-one sessions to each group, with a focus on restoring function and encouragement for additional home exercises. The proportions of participants who received education, exercises, stretching, soft-tissue techniques, and other modalities were similar between groups, as were the numbers of patients who performed additional home exercises (109 in the surgical group vs 103 in the nonsurgical group). You are familiar with the practice of the physiotherapists in your community and you are aware that they typically provide similar regimens to those in the trial for the surgical and nonsurgical groups, so you believe that their care is unlikely to differ in an important way. You conclude that this trial provides practical information about the effect of surgical vs nonsurgical treatment applicable to your patient encounter. You return to your patient and discuss the results. The patient asks about anesthetic and surgical complications, and you acknowledge that, while rare, these are a possibility. After considering the lack of benefit with surgery and the potential risks for morbidity or serious harms, your patient—despite the experience of her friend—decides to proceed with nonsurgical management. You provide her with a referral to an experienced physiotherapist and you make an appointment to see her for follow-up. 
Conclusions

Surgical procedures require clinicians to develop and maintain procedural expertise. Randomized clinical trials of surgical procedures present unique methodological concerns in part related to the possibility of differential expertise and the relative expertise of trial practitioners vs those in one’s own practice setting. Further, failure to apply rigorous methods to achieve allocation concealment and innovative approaches to implement blinding may introduce serious risk of bias. Consideration of these issues will allow clinicians to make optimal use of trials addressing surgical procedures.

Article Information

Corresponding Author: Nathan Evaniew, MD, Department of Surgery, McMaster University, 293 Wellington St N, Ste 110, Hamilton, ON L8L 8E7, Canada (nathan.evaniew@medportal.ca).

Accepted for Publication: December 27, 2015.

Published Online: March 30, 2016. doi:10.1001/jamasurg.2016.0072.

Conflict of Interest Disclosures: None reported.

Funding/Support: Dr Evaniew was supported in part by a Doctoral Research Award from the Canadian Institutes of Health Research. Dr Tikkinen was supported by grant 276046 from the Academy of Finland and by the Hospital District of Helsinki and Uusimaa, the Jane and Aatos Erkko Foundation, and the Sigrid Jusélius Foundation. Dr Bhandari was supported by a Canada Research Chair from the Canadian Institutes of Health Research.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References

1. Rangan A, Handoll H, Brealey S, et al; PROFHER Trial Collaborators. Surgical vs nonsurgical treatment of adults with displaced fractures of the proximal humerus: the PROFHER randomized clinical trial. JAMA. 2015;313(10):1037-1047.
2. Boutron I, Tubach F, Giraudeau B, Ravaud P. Methodological differences in clinical trials evaluating nonpharmacological and pharmacological treatments of hip and knee osteoarthritis. JAMA. 2003;290(8):1062-1070.
3. Farrokhyar F, Karanicolas PJ, Thoma A, et al. Randomized controlled trials of surgical interventions. Ann Surg. 2010;251(3):409-416.
4. Bhandari M, Guyatt GH, Swiontkowski MF. User’s guide to the orthopaedic literature: how to use an article about a surgical therapy. J Bone Joint Surg Am. 2001;83-A(6):916-926.
5. Boutron I, Moher D, Altman DG, Schulz KF, Ravaud P; CONSORT Group. Extending the CONSORT statement to randomized trials of nonpharmacologic treatment: explanation and elaboration. Ann Intern Med. 2008;148(4):295-309.
6. Devereaux PJ, Bhandari M, Clarke M, et al. Need for expertise based randomised controlled trials. BMJ. 2005;330(7482):88.
7. Karanicolas PJ, Bhandari M, Taromi B, et al. Blinding of outcomes in trials of orthopaedic trauma: an opportunity to enhance the validity of clinical trials. J Bone Joint Surg Am. 2008;90(5):1026-1033.
8. Guyatt G, Rennie D, Meade MO, Cook DJ. Users’ Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. 3rd ed. New York, NY: McGraw-Hill Professional; 2014. http://jamaevidence.com.
9. Guyatt GH, Sackett DL, Cook DJ; Evidence-Based Medicine Working Group. Users’ guides to the medical literature, II: how to use an article about therapy or prevention, A: are the results of the study valid? JAMA. 1993;270(21):2598-2601.
10. Montori VM, Devereaux PJ, Adhikari NKJ, et al. Randomized trials stopped early for benefit: a systematic review. JAMA. 2005;294(17):2203-2209.
11. Bassler D, Briel M, Montori VM, et al; STOPIT-2 Study Group. Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA. 2010;303(12):1180-1187.
12. Schulz KF, Grimes DA. Allocation concealment in randomised trials: defending against deciphering. Lancet. 2002;359(9306):614-618.
13. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273(5):408-412.
14. Adie S, Harris IA, Naylor JM, Mittal R. CONSORT compliance in surgical randomized trials: are we there yet? a systematic review. Ann Surg. 2013;258(6):872-878.
15. Hansen JB, Smithers BM, Schache D, Wall DR, Miller BJ, Menzies BL. Laparoscopic versus open appendectomy: prospective randomized trial. World J Surg. 1996;20(1):17-20.
16. Phatak UR, Chan WM, Lew DF, et al. Is nighttime the right time? risk of complications after laparoscopic cholecystectomy at night. J Am Coll Surg. 2014;219(4):718-724.
17. Devereaux PJ, Bhandari M, Montori VM, Manns BJ, Ghali WA, Guyatt GH. Double blind, you are the weakest link—good-bye! ACP J Club. 2002;136(1):A11.
18. Järvinen TLN, Sihvonen R, Bhandari M, et al. Blinded interpretation of study results can feasibly and effectively diminish interpretation bias. J Clin Epidemiol. 2014;67(7):769-772.
19. Ikramuddin S, Blackstone RP, Brancatisano A, et al. Effect of reversible intermittent intra-abdominal vagal nerve blockade on morbid obesity: the ReCharge randomized clinical trial. JAMA. 2014;312(9):915-922.
20. Karanicolas PJ, Bhandari M, Walter SD, Heels-Ansdell D, Guyatt GH; Collaboration for Outcomes Assessment in Surgical Trials (COAST) Musculoskeletal Group. Radiographs of hip fractures were digitally altered to mask surgeons to the type of implant without compromising the reliability of quality ratings or making the rating process more difficult. J Clin Epidemiol. 2009;62(2):214-223.e1.
21. Mantovani E, Arduino PG, Schierano G, et al. A split-mouth randomized clinical trial to evaluate the performance of piezosurgery compared with traditional technique in lower wisdom tooth removal. J Oral Maxillofac Surg. 2014;72(10):1890-1897.
22. Flum DR. Interpreting surgical trials with subjective outcomes: avoiding UnSPORTsmanlike conduct. JAMA. 2006;296(20):2483-2485.
23. Wartolowska K, Judge A, Hopewell S, et al. Use of placebo controls in the evaluation of surgery: systematic review. BMJ. 2014;348:g3253.
24. Dowrick AS, Bhandari M. Ethical issues in the design of randomized trials: to sham or not to sham. J Bone Joint Surg Am. 2012;94(suppl 1):7-10.
25. Scholtes VA, Nijman TH, van Beers L, Devereaux PJ, Poolman RW. Emerging designs in orthopaedics: expertise-based randomized controlled trials. J Bone Joint Surg Am. 2012;94(suppl 1):24-28.
26. Maruthappu M, Gilbert BJ, El-Harasis MA, et al. The influence of volume and experience on individual surgical performance: a systematic review. Ann Surg. 2015;261(4):642-647.
27. Eker HH, Hansson BM, Buunen M, et al. Laparoscopic vs open incisional hernia repair: a randomized clinical trial. JAMA Surg. 2013;148(3):259-263.
28. Lamy A, Devereaux PJ, Prabhakaran D, et al; CORONARY Investigators. Effects of off-pump and on-pump coronary-artery bypass grafting at 1 year. N Engl J Med. 2013;368(13):1179-1188.
29. Garg AX, Devereaux PJ, Yusuf S, et al; CORONARY Investigators.
Kidney function after off-pump or on-pump coronary artery bypass graft surgery: a randomized clinical trial. JAMA. 2014;311(21):2191-2198.PubMedGoogle ScholarCrossref 30. Mulla SM, Scott IA, Jackevicius CA, You JJ, Guyatt GH. How to use a noninferiority trial: users’ guides to the medical literature. JAMA. 2012;308(24):2605-2611.PubMedGoogle ScholarCrossref 31. Montori VM, Permanyer-Miralda G, Ferreira-González I, et al. Validity of composite end points in clinical trials. BMJ. 2005;330(7491):594-596.PubMedGoogle ScholarCrossref 32. Guyatt GH, Sackett DL, Cook DJ; Evidence-Based Medicine Working Group. Users’ guides to the medical literature, II: how to use an article about therapy or prevention, B: what were the results and will they help me in caring for my patients? JAMA. 1994;271(1):59-63.PubMedGoogle ScholarCrossref 33. Schünemann HJ, Guyatt GH. Commentary—goodbye M(C)ID! hello MID, where do you come from? Health Serv Res. 2005;40(2):593-597.PubMedGoogle ScholarCrossref 34. Brody BA, Ashton CM, Liu D, Xiong Y, Yao X, Wray NP. Are surgical trials with negative results being interpreted correctly? J Am Coll Surg. 2013;216(1):158-166.PubMedGoogle ScholarCrossref 35. Chapman SJ, Shelton B, Mahmood H, Fitzgerald JE, Harrison EM, Bhangu A. Discontinuation and non-publication of surgical randomised controlled trials: observational study. BMJ. 2014;349:g6870.PubMedGoogle ScholarCrossref 36. Dimick JB, Diener-West M, Lipsett PA. Negative results of randomized clinical trials published in the surgical literature: equivalency or error? Arch Surg. 2001;136(7):796-800.PubMedGoogle ScholarCrossref 37. Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?” Lancet. 2005;365(9453):82-93.PubMedGoogle ScholarCrossref 38. Rothwell PM. Factors that can affect the external validity of randomised controlled trials. PLoS Clin Trials. 2006;1(1):e9.PubMedGoogle ScholarCrossref 39. Furukawa TA, Guyatt GH, Griffith LE. 
Can we individualize the “number needed to treat”? an empirical study of summary effect measures in meta-analyses. Int J Epidemiol. 2002;31(1):72-76.PubMedGoogle ScholarCrossref 40. Sun X, Ioannidis JP, Agoritsas T, Alba AC, Guyatt G. How to use a subgroup analysis: users’ guide to the medical literature. JAMA. 2014;311(4):405-411.PubMedGoogle ScholarCrossref 41. Executive Committee for the Asymptomatic Carotid Atherosclerosis Study. Endarterectomy for asymptomatic carotid artery stenosis. JAMA. 1995;273(18):1421-1428.PubMedGoogle ScholarCrossref 42. Moore WS, Young B, Baker WH, et al; ACAS Investigators. Surgical results: a justification of the surgeon selection process for the ACAS trial. J Vasc Surg. 1996;23(2):323-328.PubMedGoogle ScholarCrossref 43. Karanicolas PJ, Montori VM, Devereaux PJ, Schünemann H, Guyatt GH. A new “mechanistic-practical” framework for designing and interpreting randomized trials. J Clin Epidemiol. 2009;62(5):479-484.PubMedGoogle ScholarCrossref 44. Salminen P, Paajanen H, Rautio T, et al. Antibiotic therapy vs appendectomy for treatment of uncomplicated acute appendicitis: the APPAC randomized clinical trial. JAMA. 2015;313(23):2340-2348.PubMedGoogle ScholarCrossref http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png JAMA Surgery American Medical Association

How to Use a Randomized Clinical Trial Addressing a Surgical Procedure: Users’ Guide to the Medical Literature

Publisher
American Medical Association
Copyright
Copyright © 2016 American Medical Association. All Rights Reserved.
ISSN
2168-6254
eISSN
2168-6262
DOI
10.1001/jamasurg.2016.0072

Challenges of Evaluating Surgical Procedures

Although research to evaluate surgical procedures has historically been dominated by nonrandomized observational studies, increasing awareness that randomization reduces bias by ensuring similar distributions of prognostic factors in the intervention and control groups has led to a marked increase in the conduct of surgical RCTs.2-4 This positive development raises new issues: trials of surgical procedures present special challenges in understanding and applying their results to patient management.5

Many of these challenges arise because, in contrast to pharmacological treatments, surgical procedures rely on procedural expertise. Variability in procedural expertise between the intervention and control groups can therefore influence outcomes and result in spurious inferences.6 Critical appraisal of RCTs of surgical procedures also requires addressing the frequent lack of blinding, which may be unavoidable but even if possible is typically much more challenging than in RCTs of drug therapies.7

This Users’ Guide to the Medical Literature presents a practical approach to assessing RCTs of surgical procedures using the 3-step approach of our Users’ Guides: assessment of risk of bias, results, and application to patient care (Box 1). This article highlights those issues that are specific to RCTs of surgical procedures.

Box 1. Users’ Guides Approach to a Randomized Clinical Trial Addressing a Surgical Procedure

How Serious Is the Risk of Bias?
  Did the intervention and control groups start with the same prognosis?
    Was allocation concealed? [a]
    Were patients in the treatment and control groups similar with respect to known prognostic factors?
  Was prognostic balance maintained as the study progressed?
    If possible, were participants, health care professionals, data collectors, outcomes assessors, data analysts, and/or those interpreting results blinded? [a]
  Were the groups prognostically balanced at the study’s completion?
    Was follow-up complete?
    Were patients analyzed in the groups to which they were randomized?
    Was stopping the trial early for benefit avoided?
    Were the interventions administered with similar expertise? [a]
What Are the Results?
  How large was the treatment effect?
  How precise was the estimate of the treatment effect?
How Can I Apply the Results to Patient Care?
  Were the study patients similar to my patient?
  Were the study interventions similar to interventions in my setting? [a]
  Were all patient-important outcomes considered?
  Are the likely treatment benefits worth the potential harms and costs?

[a] Includes issues specific to trials of surgical procedures.

How Serious Is the Risk of Bias?

Previous Users’ Guides have considered whether allocation was concealed, patients were similar with respect to known prognostic factors, blinding was implemented, follow-up was complete, patients were analyzed in the groups to which they were randomized, and investigators avoided early stopping for benefit.4,8-11 For RCTs of surgical procedures, allocation concealment and blinding each warrant further discussion, and the expertise with which the study interventions were administered presents a unique issue requiring consideration.

Did the Intervention and Control Groups Start With the Same Prognosis?

Randomization, if successful, creates groups with a similar likelihood of experiencing the outcomes of interest (ie, a similar prognosis).
Allocation concealment describes the extent to which individuals responsible for enrolling patients were unaware of and could not influence the study arm to which the randomization schedule assigned patients.8,12 Concealment refers not to the process of creating the randomization schedule or to the methods of blinding used to maintain prognostic balance as a study progresses, but rather to safeguarding the implementation of the randomization process. Trials with inadequate methods of allocation concealment may overestimate treatment effects, and trials of surgical procedures frequently implement methods that are less secure.8,13,14

Consider an RCT in which patients with appendicitis were randomly allocated to receive open or laparoscopic appendectomies, and the laparoscopic procedure but not the open procedure required the attending surgeons’ presence in the operating room.15 Resident physicians, responsible for recruiting and enrolling patients, obtained the treatment assignments from sealed envelopes. The residents were typically eager to perform procedures independently and were more often familiar and confident with open rather than laparoscopic appendectomy. When patients required surgery during the night, residents who were reluctant to call in their attending staff held up the envelopes to the light until they found one that contained an open procedure (Daryl R. Wall, MBBS, FRACS, written communication, June 9, 2000). If the patients who presented overnight were sicker or the care they received without the attending surgeons’ presence was inferior, the lack of concealment would have biased the results in favor of the laparoscopic procedure.4,16

Contrast this example with a typical blinded pharmacological trial in which study drugs are packaged and labeled before they are sent to each participating center, independent of the research coordinators who enroll patients.
This arrangement prevents the coordinators from knowing which medication the next patient will receive and thus secures the randomization sequence. To corrupt the sequence in a pharmacological trial, the coordinators would have to obtain the central randomization sequence and unblind the packaged medications. Circumventing randomization is far easier in surgical trials that rely on envelopes (or other even less secure methods) for concealment.

Investigators of surgical RCTs can ensure allocation concealment by using remote web-based and 24-hour central telephone randomization services that require individuals who are enrolling patients to contact an independent source.12 Each contact is logged, and treatment assignments for each patient are provided only after eligibility has been confirmed. Although sealed, opaque, and sequentially numbered envelopes are preferable to envelopes that are unsealed, translucent, and not numbered, they remain vulnerable to tampering and are therefore less secure than remote randomization systems.

Randomization may fail to do its job of creating prognostic balance through chance or through failure to conceal allocation. Either way, one can appraise the success of randomization by examining the distributions of the baseline characteristics in the intervention and control groups, usually presented in the first table of results. Clinicians can be reassured when the known prognostic factors are similar, and legitimately concerned if they are not.

Was Prognostic Balance Maintained as the Study Progressed?

Blinding is the process of withholding information about treatment assignments from groups of individuals who could introduce bias if they gained this information after patients were randomized. Investigators can blind pharmacological therapies easily using placebo medications; however, placebos are often not possible for surgical procedures.
Lack of blinding is an even more serious concern when, as is often the case in surgical RCTs, investigators focus on outcomes that are subjective (such as pain, function, quality of life, and satisfaction).3,7

There are 6 groups of individuals who should ideally be blinded in RCTs: participants, health care professionals, data collectors, outcomes assessors, data analysts, and the investigators responsible for interpreting the results (eTable in the Supplement).8,17 In trials of surgical procedures, it is sometimes possible to blind the participants and some health care personnel, but it is almost never possible to blind the surgeons. With careful planning, it may be possible to blind the data collectors, it is usually possible to blind outcomes assessors, and it is always possible to blind data analysts and those interpreting the results.18

For example, consider the ReCharge trial in which 239 participants were randomly allocated to undergo surgical implantation of an active vagal nerve block device or a sham procedure to determine the effect of reversible intermittent intra-abdominal vagal nerve blockade on morbid obesity.19 The surgeons could not be blinded to the procedures that they were performing and the patients might have become unblinded by asking their surgeons which operation they had, so interaction between the surgeons and participants was limited postoperatively and blinded staff conducted patient follow-up.

Participants in RCTs of surgical procedures can sometimes be blinded using standardized wound coverings, digitally altered radiographs, or split-body designs (eAppendix in the Supplement).
However, this is much easier when the comparisons are between alternative but similar interventions (such as one surgical procedure vs another) than when the control is a nonsurgical intervention or standard of care.20-22

For outcomes that are subjective, placebo effects from surgical procedures are often substantial.23 Moreover, rituals, stressors, and environmental cues associated with admission, preparation, anesthesia, and recovery can heighten placebo effects associated with surgery because they may lead patients to have greater expectations for benefit.24 Sham-controlled RCTs can control for the placebo effect of surgery by ensuring that neither intervention nor control patients know whether they have undergone the active surgical procedure.24 Sham surgical procedures might seem to expose participants to unreasonable harms without promise of a direct benefit, but participants who receive sham treatments may not only experience substantial placebo effects but also receive additional monitoring, imaging, or clinic visits beyond standard practice.22 Sham-controlled RCTs are most effectively implemented by ensuring the relevant clinical community feels that there is equipoise about the relative effects of the intervention and control, minimizing risk, obtaining informed consent, and avoiding ongoing active deception.23,24

In the ReCharge trial, participants assigned to the sham procedure experienced operations similar to those in the active treatment group because both required similar laparoscopic surgical techniques under general anesthesia.19 Participants in the active treatment group experienced significantly greater weight loss than participants in the sham group, but the sham group nonetheless experienced 3-fold greater weight loss than was predicted, which suggests the influence of substantial placebo effects.

When blinding is not possible or not undertaken, excluding placebo effects as an explanation of apparently positive results may not be possible.
The extent to which concern about placebo effects undermines trial credibility is a matter of judgment and will differ according to circumstances. Placebo effects are much less of a concern if mortality is the outcome than if subjective symptoms are the outcome.

Were the Interventions Administered With Similar Expertise?

Randomized clinical trials of surgical procedures are at unique risk for additional bias due to potential differences in the expertise with which the interventions were administered. Differential expertise bias occurs when expertise is systematically greater in one treatment than in the other.6,25,26 Not only may surgeons be more experienced and skilled in performing an experimental or control procedure, but their conscious or subconscious investment in the superiority of one procedure may lead to differential administration of effective cointerventions (ie, administration of antibiotics, wound care, or early mobilization).

For example, consider an RCT in which 206 participants with ventral incisional hernias were randomly allocated to laparoscopic or open repair.27 If the surgeons in this trial had greater skill or investment in performing open vs laparoscopic repair (perhaps as a result of their training and experience), we would have expected differential expertise bias to favor the open procedure. In this trial, the results showed longer operative times, higher rates of perioperative complications, and higher rates of recurrence in the laparoscopic group. It remains uncertain whether the results would have been similar if potential bias due to differential expertise had been addressed (for instance, by randomizing patients to surgeons who did only open hernia repairs or those who did only laparoscopic hernia repairs, called an expertise-based RCT design).
Indeed, use of an expertise-based design in which patients are randomized to surgeons experienced and invested in the experimental treatment or to surgeons experienced and invested in the control treatment is the best way to guard against differential expertise bias. For example, in the Coronary Artery Bypass Grafting Off or On Pump Revascularization Study (CORONARY), 4752 participants were randomly allocated according to an expertise-based design between a group of surgeons who preferred a novel beating-heart technique and a group of surgeons who preferred a conventional cardiopulmonary bypass technique.28,29 To minimize possible confounding due to a learning curve, the trial included only surgeons who had completed at least 100 cases with their preferred procedure. Randomized clinical trials that compare surgical interventions with nonsurgical procedural interventions, such as trials of surgery vs physiotherapy, are expertise based by definition because the surgeons do not administer the physiotherapy and the physiotherapists do not administer the surgery.

Box 2 presents the conclusions of the risk of bias assessment for the PROFHER trial.1

Box 2. Using the Guide: How Serious Is the Risk of Bias?

In the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial, 250 participants with displaced proximal humerus fractures were randomly allocated to either surgical or nonsurgical treatment.1 A computer program generated the randomization sequence, and allocation concealment was achieved using independent remote randomization with randomly varied blocking. The participants in the intervention and control groups were similar with respect to their known prognostic factors except that there were more smokers in the nonsurgical group than in the surgical group (32% vs 19%, respectively).
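The combination described in this box, a computer-generated sequence in blocks of randomly varied size that is revealed one assignment at a time by an independent source, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the PROFHER system: the class name, arm labels, block sizes, and log format are all hypothetical.

```python
import random


class CentralRandomizer:
    """Hypothetical stand-in for a remote randomization service.

    Assignments come from permuted blocks whose size is chosen at random,
    so an enrolling clinician cannot deduce the next assignment by counting.
    """

    def __init__(self, arms=("surgical", "nonsurgical"), block_sizes=(2, 4, 6), seed=None):
        if any(size % len(arms) for size in block_sizes):
            raise ValueError("each block size must be a multiple of the number of arms")
        self._arms = tuple(arms)
        self._block_sizes = tuple(block_sizes)
        self._rng = random.Random(seed)
        self._pending = []  # unrevealed assignments left in the current block
        self.log = []       # audit trail: one (patient_id, arm) entry per contact

    def _next_assignment(self):
        if not self._pending:
            # Start a new block: equal numbers per arm, shuffled in place.
            size = self._rng.choice(self._block_sizes)
            block = list(self._arms) * (size // len(self._arms))
            self._rng.shuffle(block)
            self._pending = block
        return self._pending.pop()

    def enroll(self, patient_id, eligible):
        """Log the contact; reveal an arm only if eligibility was confirmed."""
        arm = self._next_assignment() if eligible else None
        self.log.append((patient_id, arm))
        return arm


if __name__ == "__main__":
    service = CentralRandomizer(seed=2016)
    for pid, ok in [("P001", True), ("P002", False), ("P003", True)]:
        print(pid, service.enroll(pid, eligible=ok))
```

The design choice that matters for concealment is that the pending block lives only inside the service: the person enrolling a patient sees a single assignment, after eligibility is recorded, and every contact leaves a log entry.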
The participants, health care professionals, data collectors, outcomes assessors, and data analysts were not blinded in this trial, but complete follow-up data were available for 92% of the participants, the participants were analyzed in the groups to which they were randomized, and the trial was not stopped early for benefit. Those allocated to surgery received internal fixation or humeral head replacement according to the preferences and familiarity of the participating surgeons and then received supervised postoperative physiotherapy in inpatient, outpatient, or community settings. Those allocated to nonsurgical care received a sling or hanging bandage for as long as necessary followed by supervised physiotherapy.

You conclude that this trial is at low risk for bias with 2 exceptions: the results could be biased if either surgery or physiotherapy has a strong placebo effect on subjective outcomes, or if those who evaluated the outcomes were biased toward one treatment or the other. Still, the trial is sufficiently credible that you continue on to the results.

What Are the Results?

In discussing interpretation of results, previous Users’ Guides have discussed composite end points and noninferiority trials.8,30,31 These issues are equally relevant to RCTs of surgical procedures, and we will not address them in this article. We will, however, review the most basic considerations: measures of effect and measures of precision.

When considering the magnitude of a dichotomous (yes or no; dead or alive) treatment effect, relative measures of association can be misleading. For instance, a relative risk reduction of 50% sounds impressive. Indeed, it would be impressive if it reflected a reduction in death or surgical complications from 40% to 20% (a risk difference or absolute risk reduction of 20%). However, it might represent a reduction from 2% to 1% (a 1% risk difference) that may, in context, be trivial.
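The two scenarios above can be checked with a short calculation. The sketch below is illustrative only (the function name and layout are assumptions); it computes the relative risk reduction, the absolute risk reduction (risk difference), and the reciprocal of the risk difference, which is the number needed to treat.

```python
def effect_measures(control_risk, treated_risk):
    """Return (relative risk reduction, absolute risk reduction, NNT).

    Risks are proportions between 0 and 1. The name and signature are
    illustrative, not taken from any statistics package.
    """
    if not (0 < control_risk <= 1 and 0 <= treated_risk <= 1):
        raise ValueError("risks must be proportions, with control_risk > 0")
    arr = control_risk - treated_risk  # absolute risk reduction (risk difference)
    rrr = arr / control_risk           # relative risk reduction
    # NNT is the reciprocal of the risk difference; undefined when there is no difference.
    nnt = float("inf") if arr == 0 else 1 / arr
    return rrr, arr, nnt


# The two scenarios from the text: 40% -> 20% and 2% -> 1%.
for control, treated in [(0.40, 0.20), (0.02, 0.01)]:
    rrr, arr, nnt = effect_measures(control, treated)
    print(f"{control:.0%} -> {treated:.0%}: RRR {rrr:.0%}, ARR {arr:.0%}, NNT {nnt:.0f}")
```

Both scenarios share a relative risk reduction of 50%, yet the number needed to treat is 5 in the first and 100 in the second, which is why the absolute measure carries the clinical message. A negative risk difference (more events with treatment) yields a negative value here, conventionally read as a number needed to harm.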
Therefore, clinicians should also look for the risk difference or its inverse, the number needed to treat (100/20 or 5 when the risk difference is 20%; 100/1 or 100 when the risk difference is 1%) to understand the magnitude of effect.8,32

For continuous outcomes, clinicians must decide whether observed treatment effects are likely to be sufficiently large to justify changes in management given any foreseeable harms or costs.33 When outcomes are measured using instruments that are unfamiliar to clinicians (eg, score on a quality-of-life instrument), such decisions may be extremely challenging. Knowing the minimally important difference (the smallest difference that patients would consider important) is likely to be very helpful in interpreting the results.

When considering precision, clinicians should look for confidence intervals; 95% confidence intervals are typically reported. Clinicians should look to the upper and lower boundaries of confidence intervals to discern the largest and smallest possible treatment effects that, given the results, remain plausible.8,32 This is an important issue in surgical RCTs because they are often limited by imprecision due to small sample sizes.34-36 As a result, clinicians must consider whether the confidence intervals of an apparently negative trial exclude patient-important benefit (or harm).

Box 3 presents considerations in understanding results from the PROFHER trial.1

Box 3. Using the Guide: What Are the Results?

The primary analysis of the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial compared Oxford Shoulder Scores between the surgical and nonsurgical groups over 2 years of follow-up. The Oxford Shoulder Score is a continuous outcome that measures shoulder-related function, and 5 points was considered to be a minimally important difference. Secondary outcomes included scores from the 12-Item Short Form Health Survey, complications, and late interventions.
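The precision check described above, whether a 95% confidence interval excludes a patient-important effect, can be written as a simple comparison of the interval's boundaries with the minimally important difference. The sketch below is illustrative: the function name and category labels are assumptions, and it assumes that positive differences favor the intervention and that the same threshold applies to harm. The example values are the Oxford Shoulder Score estimate reported in the remainder of this box.

```python
def interpret_ci(ci_lower, ci_upper, mid):
    """Compare a 95% CI for a mean difference with a minimally important difference.

    Assumes positive values favor the intervention and that the same
    threshold (mid) applies to harm. Labels are illustrative.
    """
    if mid <= 0 or ci_lower > ci_upper:
        raise ValueError("mid must be positive and ci_lower must not exceed ci_upper")
    if ci_lower >= mid:
        return "important benefit established"
    if ci_upper <= -mid:
        return "important harm established"
    if -mid < ci_lower and ci_upper < mid:
        return "important difference excluded"
    return "inconclusive"


# PROFHER primary outcome: 95% CI, -1.33 to 2.84 points; MID, 5 points.
print(interpret_ci(-1.33, 2.84, mid=5))
```

An interval such as −6 to 3 points would instead be labeled inconclusive by this rule, because a patient-important harm would remain plausible; that is the situation in which an apparently negative trial cannot be read as showing equivalence.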
Investigators conducted sensitivity analyses to evaluate the impact of missing data, to control for potential confounding due to the differing proportion of smokers in each group, and to explore possible clustering across the participating centers.

There were no significant differences between the groups for Oxford Shoulder Scores, and the 95% confidence intervals excluded a 5-point difference over 2 years (mean difference of 0.75 point in favor of the surgical group; 95% CI, −1.33 to 2.84; P = .48). There were also no significant differences in scores on the 12-Item Short Form Health Survey (mean difference of 1.77 points in favor of the surgical group; 95% CI, −0.84 to 4.30; P = .18) or rates of shoulder-related complications (30 in the surgical group vs 23 in the nonsurgical group; risk difference, 6%; 95% CI, −5% to 15%; P = .28). The rates of secondary surgery (11 vs 11) and serious adverse events (28 vs 28) were identical. Unadjusted and adjusted analyses yielded similar results.

How Can I Apply the Results to Patient Care?

To apply the results from an RCT to patient care, one must consider the extent to which a trial is applicable. The setting of the trial, the methods of selecting the trial participants, the characteristics of the participants, differences between the trial protocol and routine practice, the chosen outcome measures, the duration of follow-up, and the observed adverse effects can all influence applicability.8,32,37-40 A given trial could be perfectly applicable for some clinicians and their particular patient encounter, yet of limited applicability for others.

Were the Study Interventions Similar to Interventions in My Setting?

For RCTs of surgical procedures, applicability also depends on the extent to which the study interventions are similar to interventions in one’s own setting. Variability in individual surgeons’ expertise can affect their ability to achieve results from trials in their own practice.
For instance, consider a trial demonstrating that a novel surgical procedure conducted by surgeons already experienced in the new procedure appears superior to the standard. For a surgeon who has never used anything but the standard, training (potentially extensive) would be required before we could be confident that the surgeon could achieve the results demonstrated in the trial.

Consider the Asymptomatic Carotid Atherosclerosis Study in which 1662 participants with asymptomatic carotid artery stenosis were randomly allocated to carotid endarterectomy surgery or optimal medical risk factor management.41 This trial selected a group of expert surgeons who had established low perioperative complication rates and found that carotid endarterectomy significantly reduced the overall risk of stroke or death from 11.0% to 5.1%. For surgeons in routine clinical practice, this trial demonstrated that carotid endarterectomy was beneficial as long as the surgeons maintained similar low perioperative complication rates.37,42

This example highlights how a trial can be useful for addressing one question but not at all useful for addressing an apparently similar but actually fundamentally different question. One may ask: what is the effect of a procedure, delivered by those most expert, in comparison with the competing management strategy? Alternatively, one may ask: what is the effect of the procedure when undertaken by the level of expertise one might expect in most communities? An RCT addressing the first question might be considered a mechanistic or explanatory trial that addresses the effect of an intervention administered under ideal testing circumstances.43 An RCT addressing the second question might be considered a practical or pragmatic trial that bears directly on health care decisions in practice.
Whether a trial is mechanistic or practical may also depend on your perspective: if you are in a community of surgeons with exceptional expertise, the first and not the second trial may be practical from your point of view.

The Appendicitis Acuta trial illustrates the mechanistic vs practical perspective.44 In this trial, 530 participants with uncomplicated acute appendicitis were randomly allocated to receive either early appendectomy or a standardized regimen of antibiotic treatment and were followed up for 1 year. Of 257 participants in the antibiotic treatment group, 15 underwent appendectomy during their initial hospitalization because the surgeons suspected progressive infection, perforation, or peritonitis. From a mechanistic perspective that seeks the effect of surgery vs medical management, this trial might be considered problematic because some patients allocated to antibiotics underwent appendectomy. If, however, the real-world options are immediate surgery vs a wait-and-see approach in which one holds off surgery indefinitely if patients improve but only temporarily if they do not, this trial is eminently practical (and thus extremely helpful).

Clinicians considering the applicability of the Appendicitis Acuta trial must also consider whether the nonoperative antibiotic regimen (3 days of intravenous ertapenem followed by 7 days of oral levofloxacin and metronidazole) was similar to the regimen that they would use and whether all patient-important outcomes were considered, including long-term complications of surgery.

Box 4 presents issues of applicability and the resolution of our clinical scenario.

Box 4. Using the Guide: How Can I Apply the Results to Patient Care?

You review the tables presenting baseline characteristics and note that the patients in the Proximal Fracture of the Humerus Evaluation by Randomization (PROFHER) trial were very similar to your patient with respect to age, sex, injury mechanism, and fracture pattern.
You therefore find no compelling reason to doubt applicability to your patient. The report states that most of the surgical procedures (83%) involved open reduction and internal fixation with locking plates and that the operations were performed by attending surgeons (82%), supervised senior residents (12%), or independent fellows and senior residents who had immediate access to an attending surgeon if required (6%). Based on your training and experience, you are confident that you would offer a similar procedure and perform it with similar competence. On the other hand, the report also states that the physiotherapists provided a mean of 10 one-on-one sessions to each group, with a focus on restoring function and encouragement for additional home exercises. The proportions of participants who received education, exercises, stretching, soft-tissue techniques, and other modalities were similar between groups, as were the numbers of patients who performed additional home exercises (109 in the surgical group vs 103 in the nonsurgical group). You are familiar with the practice of the physiotherapists in your community and you are aware that they typically provide similar regimens to those in the trial for the surgical and nonsurgical groups, so you believe that their care is unlikely to differ in an important way. You conclude that this trial provides practical information about the effect of surgical vs nonsurgical treatment applicable to your patient encounter. You return to your patient and discuss the results. The patient asks about anesthetic and surgical complications, and you acknowledge that, while rare, these are a possibility. After considering the lack of benefit with surgery and the potential risks for morbidity or serious harms, your patient—despite the experience of her friend—decides to proceed with nonsurgical management. You provide her with a referral to an experienced physiotherapist and you make an appointment to see her for follow-up. 

Conclusions

Surgical procedures require clinicians to develop and maintain procedural expertise. Randomized clinical trials of surgical procedures present unique methodological concerns in part related to the possibility of differential expertise and the relative expertise of trial practitioners vs those in one’s own practice setting. Further, failure to apply rigorous methods to achieve allocation concealment and innovative approaches to implement blinding may introduce serious risk of bias. Consideration of these issues will allow clinicians to make optimal use of trials addressing surgical procedures.

Article Information

Corresponding Author: Nathan Evaniew, MD, Department of Surgery, McMaster University, 293 Wellington St N, Ste 110, Hamilton, ON L8L 8E7, Canada (nathan.evaniew@medportal.ca).

Accepted for Publication: December 27, 2015.

Published Online: March 30, 2016. doi:10.1001/jamasurg.2016.0072.

Conflict of Interest Disclosures: None reported.

Funding/Support: Dr Evaniew was supported in part by a Doctoral Research Award from the Canadian Institutes of Health Research. Dr Tikkinen was supported by grant 276046 from the Academy of Finland and by the Hospital District of Helsinki and Uusimaa, the Jane and Aatos Erkko Foundation, and the Sigrid Jusélius Foundation. Dr Bhandari was supported by a Canada Research Chair from the Canadian Institutes of Health Research.

Role of the Funder/Sponsor: The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

References

1. Rangan A, Handoll H, Brealey S, et al; PROFHER Trial Collaborators. Surgical vs nonsurgical treatment of adults with displaced fractures of the proximal humerus: the PROFHER randomized clinical trial. JAMA. 2015;313(10):1037-1047.
2. Boutron I, Tubach F, Giraudeau B, Ravaud P. Methodological differences in clinical trials evaluating nonpharmacological and pharmacological treatments of hip and knee osteoarthritis. JAMA. 2003;290(8):1062-1070.
3. Farrokhyar F, Karanicolas PJ, Thoma A, et al. Randomized controlled trials of surgical interventions. Ann Surg. 2010;251(3):409-416.
4. Bhandari M, Guyatt GH, Swiontkowski MF. User’s guide to the orthopaedic literature: how to use an article about a surgical therapy. J Bone Joint Surg Am. 2001;83-A(6):916-926.
5. Boutron I, Moher D, Altman DG, Schulz KF, Ravaud P; CONSORT Group. Extending the CONSORT statement to randomized trials of nonpharmacologic treatment: explanation and elaboration. Ann Intern Med. 2008;148(4):295-309.
6. Devereaux PJ, Bhandari M, Clarke M, et al. Need for expertise based randomised controlled trials. BMJ. 2005;330(7482):88.
7. Karanicolas PJ, Bhandari M, Taromi B, et al. Blinding of outcomes in trials of orthopaedic trauma: an opportunity to enhance the validity of clinical trials. J Bone Joint Surg Am. 2008;90(5):1026-1033.
8. Guyatt G, Rennie D, Meade MO, Cook DJ. Users’ Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice. 3rd ed. New York, NY: McGraw-Hill Professional; 2014. http://jamaevidence.com.
9. Guyatt GH, Sackett DL, Cook DJ; Evidence-Based Medicine Working Group. Users’ guides to the medical literature, II: how to use an article about therapy or prevention, A: are the results of the study valid? JAMA. 1993;270(21):2598-2601.
10. Montori VM, Devereaux PJ, Adhikari NKJ, et al. Randomized trials stopped early for benefit: a systematic review. JAMA. 2005;294(17):2203-2209.
11. Bassler D, Briel M, Montori VM, et al; STOPIT-2 Study Group. Stopping randomized trials early for benefit and estimation of treatment effects: systematic review and meta-regression analysis. JAMA. 2010;303(12):1180-1187.
12. Schulz KF, Grimes DA. Allocation concealment in randomised trials: defending against deciphering. Lancet. 2002;359(9306):614-618.
13. Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical evidence of bias: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA. 1995;273(5):408-412.
14. Adie S, Harris IA, Naylor JM, Mittal R. CONSORT compliance in surgical randomized trials: are we there yet? a systematic review. Ann Surg. 2013;258(6):872-878.
15. Hansen JB, Smithers BM, Schache D, Wall DR, Miller BJ, Menzies BL. Laparoscopic versus open appendectomy: prospective randomized trial. World J Surg. 1996;20(1):17-20.
16. Phatak UR, Chan WM, Lew DF, et al. Is nighttime the right time? risk of complications after laparoscopic cholecystectomy at night. J Am Coll Surg. 2014;219(4):718-724.
17. Devereaux PJ, Bhandari M, Montori VM, Manns BJ, Ghali WA, Guyatt GH. Double blind, you are the weakest link—good-bye! ACP J Club. 2002;136(1):A11.
18. Järvinen TLN, Sihvonen R, Bhandari M, et al. Blinded interpretation of study results can feasibly and effectively diminish interpretation bias. J Clin Epidemiol. 2014;67(7):769-772.
19. Ikramuddin S, Blackstone RP, Brancatisano A, et al. Effect of reversible intermittent intra-abdominal vagal nerve blockade on morbid obesity: the ReCharge randomized clinical trial. JAMA. 2014;312(9):915-922.
20. Karanicolas PJ, Bhandari M, Walter SD, Heels-Ansdell D, Guyatt GH; Collaboration for Outcomes Assessment in Surgical Trials (COAST) Musculoskeletal Group. Radiographs of hip fractures were digitally altered to mask surgeons to the type of implant without compromising the reliability of quality ratings or making the rating process more difficult. J Clin Epidemiol. 2009;62(2):214-223.e1.
21. Mantovani E, Arduino PG, Schierano G, et al. A split-mouth randomized clinical trial to evaluate the performance of piezosurgery compared with traditional technique in lower wisdom tooth removal. J Oral Maxillofac Surg. 2014;72(10):1890-1897.
22. Flum DR. Interpreting surgical trials with subjective outcomes: avoiding UnSPORTsmanlike conduct. JAMA. 2006;296(20):2483-2485.
23. Wartolowska K, Judge A, Hopewell S, et al. Use of placebo controls in the evaluation of surgery: systematic review. BMJ. 2014;348:g3253.
24. Dowrick AS, Bhandari M. Ethical issues in the design of randomized trials: to sham or not to sham. J Bone Joint Surg Am. 2012;94(suppl 1):7-10.
25. Scholtes VA, Nijman TH, van Beers L, Devereaux PJ, Poolman RW. Emerging designs in orthopaedics: expertise-based randomized controlled trials. J Bone Joint Surg Am. 2012;94(suppl 1):24-28.
26. Maruthappu M, Gilbert BJ, El-Harasis MA, et al. The influence of volume and experience on individual surgical performance: a systematic review. Ann Surg. 2015;261(4):642-647.
27. Eker HH, Hansson BM, Buunen M, et al. Laparoscopic vs open incisional hernia repair: a randomized clinical trial. JAMA Surg. 2013;148(3):259-263.
28. Lamy A, Devereaux PJ, Prabhakaran D, et al; CORONARY Investigators. Effects of off-pump and on-pump coronary-artery bypass grafting at 1 year. N Engl J Med. 2013;368(13):1179-1188.
29. Garg AX, Devereaux PJ, Yusuf S, et al; CORONARY Investigators. Kidney function after off-pump or on-pump coronary artery bypass graft surgery: a randomized clinical trial. JAMA. 2014;311(21):2191-2198.
30. Mulla SM, Scott IA, Jackevicius CA, You JJ, Guyatt GH. How to use a noninferiority trial: users’ guides to the medical literature. JAMA. 2012;308(24):2605-2611.
31. Montori VM, Permanyer-Miralda G, Ferreira-González I, et al. Validity of composite end points in clinical trials. BMJ. 2005;330(7491):594-596.
32. Guyatt GH, Sackett DL, Cook DJ; Evidence-Based Medicine Working Group. Users’ guides to the medical literature, II: how to use an article about therapy or prevention, B: what were the results and will they help me in caring for my patients? JAMA. 1994;271(1):59-63.
33. Schünemann HJ, Guyatt GH. Commentary—goodbye M(C)ID! hello MID, where do you come from? Health Serv Res. 2005;40(2):593-597.
34. Brody BA, Ashton CM, Liu D, Xiong Y, Yao X, Wray NP. Are surgical trials with negative results being interpreted correctly? J Am Coll Surg. 2013;216(1):158-166.
35. Chapman SJ, Shelton B, Mahmood H, Fitzgerald JE, Harrison EM, Bhangu A. Discontinuation and non-publication of surgical randomised controlled trials: observational study. BMJ. 2014;349:g6870.
36. Dimick JB, Diener-West M, Lipsett PA. Negative results of randomized clinical trials published in the surgical literature: equivalency or error? Arch Surg. 2001;136(7):796-800.
37. Rothwell PM. External validity of randomised controlled trials: “to whom do the results of this trial apply?” Lancet. 2005;365(9453):82-93.
38. Rothwell PM. Factors that can affect the external validity of randomised controlled trials. PLoS Clin Trials. 2006;1(1):e9.
39. Furukawa TA, Guyatt GH, Griffith LE. Can we individualize the “number needed to treat”? an empirical study of summary effect measures in meta-analyses. Int J Epidemiol. 2002;31(1):72-76.
40. Sun X, Ioannidis JP, Agoritsas T, Alba AC, Guyatt G. How to use a subgroup analysis: users’ guide to the medical literature. JAMA. 2014;311(4):405-411.
41. Executive Committee for the Asymptomatic Carotid Atherosclerosis Study. Endarterectomy for asymptomatic carotid artery stenosis. JAMA. 1995;273(18):1421-1428.
42. Moore WS, Young B, Baker WH, et al; ACAS Investigators. Surgical results: a justification of the surgeon selection process for the ACAS trial. J Vasc Surg. 1996;23(2):323-328.
43. Karanicolas PJ, Montori VM, Devereaux PJ, Schünemann H, Guyatt GH. A new “mechanistic-practical” framework for designing and interpreting randomized trials. J Clin Epidemiol. 2009;62(5):479-484.
44. Salminen P, Paajanen H, Rautio T, et al. Antibiotic therapy vs appendectomy for treatment of uncomplicated acute appendicitis: the APPAC randomized clinical trial. JAMA. 2015;313(23):2340-2348.

Journal

JAMA Surgery, American Medical Association

Published: Jul 1, 2016

Keywords: bias (epidemiology); double-blind method; orthopedic procedures; outcome and process assessment (health care); randomization; randomized controlled trial; single-blind method; surgical procedures, operative; patient prognosis; research outcome; interpretation of findings; prognostic study; experience level; medical literature
