Systematic Reviews and Meta-Analysis of Preclinical Studies: Why Perform Them and How to Appraise Them Critically

Journal of Cerebral Blood Flow & Metabolism (2014) 34, 737–742; doi:10.1038/jcbfm.2014.28; published online 19 February 2014. © 2014 ISCBFM. All rights reserved. 0271-678X/14. www.jcbfm.com. Open access. REVIEW ARTICLE.

Emily S Sena (1,2), Gillian L Currie (1), Sarah K McCann (2), Malcolm R Macleod (1) and David W Howells (2)

(1) Department of Clinical Neurosciences, University of Edinburgh, Edinburgh, UK; (2) Stroke Division, Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia. Correspondence: Dr ES Sena, Department of Clinical Neurosciences, University of Edinburgh, Chancellor's Building, 49 Little France Crescent, Edinburgh EH16 4SB, UK. E-mail: [email protected]
Received 11 December 2013; accepted 21 January 2014; published online 19 February 2014

The use of systematic review and meta-analysis of preclinical studies has become more common, including those of studies describing the modeling of cerebrovascular diseases. Empirical evidence suggests that too many preclinical experiments lack methodological rigor, and this leads to inflated treatment effects. The aim of this review is to describe the concepts of systematic review and meta-analysis and to consider how these tools may be used to provide empirical evidence to spur the field to improve the rigor of the conduct and reporting of preclinical research, akin to their use in improving the conduct and reporting of randomized controlled trials in clinical research. As with other research domains, systematic reviews are themselves subject to bias. Therefore, we have also suggested guidance for their conduct, reporting, and critical appraisal.

Keywords: acute stroke; animal models; basic science; biostatistics; experimental

INTRODUCTION

Animal models are invaluable tools for enriching our understanding of the mechanisms and etiology of human diseases. The number of preclinical experiments performed each year continues to increase and our understanding of disease mechanisms is improving, but the number of novel interventions reaching the clinic to treat cerebrovascular diseases continues to fall. It is clear that there are limitations to the translational paradigm as it currently exists. It is also clear from the sheer volume of preclinical research that structured methods are required to make objective sense of the available data. Systematic review and meta-analysis are useful tools which can address some, but not all, of the challenges of translational stroke research. They provide a less biased summary of research findings and allow judgement of both the range of available evidence (and hence the external validity) and the likelihood that conclusions are at risk of bias (the internal validity).

Systematic review sets out to use a structured process to identify all data relevant to a specific research question. This may be followed by meta-analysis, a statistical process that provides a summary estimate of the outcomes from a group of studies and allows these outcomes from different groups of studies to be compared. Although the first meta-analysis was performed in 1904 by Karl Pearson, it was only in 1976 that Gene Glass coined the term 'meta-analysis' to refer to this statistical pooling to allow the integration of findings. He suggests that meta-analysis was created out of the need to extract useful information from the cryptic records of inferential data analyses in the abbreviated reports of research in journals and other printed sources.(1) Meta-analysis is now used in many fields of research, including psychology, criminology, and education. Its use in clinical medicine is routine, and the Cochrane Collaboration has been instrumental in establishing the framework for evidence-based healthcare to guide clinical practice and healthcare policy. However, in preclinical research, the use of systematic review and meta-analysis is relatively novel.

Glass considers that necessity was the mother of invention where meta-analysis is concerned; if it had not happened in the early 1970s, it was sure to happen soon after.(1) We suggest that the same holds for meta-analysis in preclinical stroke research. In the early years of the millennium, the dogma had developed that 'everything works in animals, but nothing works in humans'. In 2006, O'Collins et al(2) published a review reporting that of more than 500 interventions reported to be efficacious in animal models of stroke, only thrombolysis with rtPA had been shown to be effective in stroke patients. A search for in vivo animal stroke studies published in the last 10 years yields more than 5,700 articles, but still no new therapies to treat acute stroke have been developed. If the purpose of preclinical cerebrovascular research was to develop new treatments for human stroke, then clearly there were substantial problems. The complexities of translational research led our group, and others, to adapt the techniques of meta-analysis, at that time largely restricted to clinical research, to the preclinical domain, in an attempt to provide empirical evidence for weaknesses in the prevailing translational paradigm, evidence which might guide improvements in the translational process.

In this narrative review, we aim to explain the concepts of systematic review and meta-analysis; to describe how their use changed clinical medicine; and to explore the contribution they might make to preclinical research. We also describe some of the elements to consider in the critical appraisal of systematic reviews and meta-analyses of preclinical research.

THE CONCEPT

In any type of review, there are two fundamental steps: identify the studies that are relevant to your research question, and then synthesize the identified data to reach conclusions. The benefit of narrative reviews is that they include a broad overview of relevant information, perhaps interpreted by an experienced author tempered by years of practical knowledge of the field; in many cases, they are highly useful.
Unfortunately, they also have limitations. The selection methods used to identify studies that contribute to the review are often not transparent. The credence given to individual studies is inherently subjective and often unclear. Often, the reviewers themselves may be unable to articulate the processes through which they reached the conclusions presented. Systematic reviews are not bias free, but the transparency of the methods used is designed to reduce bias. In a systematic review, the researcher is required to outline aims, objectives, and methodology; the principle is that an independent researcher could perform the same identification process and yield the same data set. As is often the case in the interpretation of primary research studies, the conclusions drawn from a meta-analysis may differ from one reviewer to the next. However, the transparency and objectivity of the techniques used provide a framework for these discussions.

The data synthesis process in meta-analysis has a number of steps once the individual studies have been identified. First, the effect sizes of individual studies are determined. This is often a treatment effect: a measure of the difference between control and treatment groups. Effect sizes are not limited to effects of drugs, but may represent a relationship between any two variables. Second, the precision of each effect size is determined by its standard error. The broad aim is to calculate an average effect size across the studies, termed the summary estimate of effect. Third, because some effect sizes are more precise than others, this averaging is usually weighted so that more precise studies are given more weight in the meta-analysis than less precise ones. Finally, the differences between the component effect sizes (the heterogeneity) are assessed. We expect variation of effect sizes to occur due both to random error and to real differences in experimental design. In our view, it is this exploration of sources of heterogeneity, the identification of those aspects of experimental design that cause exaggerations or underestimations in treatment effects, or of those aspects of drug delivery that give maximum efficacy, which is the true strength of meta-analysis. However, because we are simply observing a 'cohort' of experiments, rather than testing the impact of these influences experimentally, findings from meta-analysis should be considered hypothesis generating rather than confirmatory.

IN CLINICAL RESEARCH

Although systematic reviews and meta-analyses are now routinely used by medical researchers to inform practice and policy, they also had a pivotal role in providing empirical evidence of the impact of bias in the conduct of controlled clinical trials. Chalmers and others provided evidence that studies that do not adequately mask treatment allocation are associated with bias and inflated treatment effects.(3,4) In the hierarchy of evidence, randomized controlled trials are now considered the gold standard in clinical trial design. Conceptually, the ability of randomization to account for systematic differences in factors, known or unknown, between groups that may affect outcome is apparent, as is ensuring that preconceived views of patients and clinicians do not bias the assessment of outcomes. But it required empirical evidence to revolutionize clinical research and to convince trialists of the importance of methodological rigor in both the conduct and reporting of their studies.

WHAT COULD META-ANALYSIS DO FOR PRECLINICAL RESEARCH?

Across the modeling of a number of cerebrovascular diseases, there is a considerable volume of often conflicting data. Systematic review and meta-analysis can be used to describe which interventions have been tested in models of disease, to provide an indication of the attrition rate of interventions (i.e., the number of interventions not progressing to clinical trial), and to describe the range of conditions under which efficacy has been tested. Furthermore, pooling data using meta-analysis can be used to assess both the overall efficacy of an intervention and the impact of factors relating to internal and external validity, giving valuable insights into the causes of translational successes and failures.

In recent years we, and others, have presented empirical evidence suggesting that the usefulness of data from experiments testing drug efficacy in animal models of various neurologic diseases may be substantially impaired by limited methodological quality, limited generalizability, and significant publication bias.(7–10)

Methodological Quality

In an experiment, the credibility of the inferred causal relationship between treatment and outcome depends on the statistical power and internal validity. Preclinical animal research is confounded by pressures to reduce the number of animals used, arising from concerns about cost, time, ethics, and the practicalities of disease modeling, which might lead to studies being either underpowered or of unknown power. Determining the required sample size to answer a research question is crucial: too small, and the results are imprecise and lack statistical power; too large, and unnecessary costs are incurred. A priori sample size calculations also provide assurance that animals are not added to a study incrementally in response to (unreported) interim analyses.

The internal validity of an experiment ensures that the changes observed in outcomes are due to an induced change in one or more of the independent variables rather than to some other confounding factor. The internal validity of an experiment may be threatened by a range of biases. These include, but are not limited to, selection bias, performance bias, and detection bias. Selection bias occurs when there are systematic differences between study groups at the start of an experiment. Performance bias occurs when systematic differences arise in how the groups are handled during a study, and detection bias occurs when systematic differences arise between groups in how outcomes are ascertained, diagnosed, or verified. Measures to reduce the impact of these biases include randomization, allocation concealment, and masked assessment of outcome.

Systematic reviews and meta-analyses of experimental studies covering a range of neurologic disorders have provided evidence that few studies take measures to reduce bias or perform formal power calculations to determine sample size (Table 1). Meta-analysis can be used to assess the impact of methodological quality on reported outcomes. Unfortunately, sample size calculations are so seldom reported that it has not been possible to assess whether performing a sample size calculation influences outcome. However, in animal models of experimental autoimmune encephalomyelitis, we showed that reported efficacy was largest in the smallest studies (Figure 1).(11)

Fortunately, we have been able to generate empirical evidence describing the impact of reporting of measures to reduce bias on outcome. In a meta-analysis of therapeutic hypothermia in experimental stroke, we observed treatment effects that were 10% larger in non-randomized studies and 8% larger in unmasked studies than in those that did take these measures to reduce bias.(12) Similarly, in a meta-analysis of NXY-059 in experimental stroke, NXY-059 was reported to be 30% more effective in studies that were not randomized or masked than in studies that reported randomization and blinding.(13)

Table 1. Number and percentages of studies across the modeling of different neurologic diseases reporting measures to reduce the risk of bias

Disease/intervention        Publications   Masked assessment   Random allocation   Allocation         Sample size
                                           of outcome (%)      to group (%)        concealment (%)    calculation (%)
Alzheimer's disease         428            95 (22)             67 (16)             NA                 0 (0)
Multiple sclerosis          1,117          178 (16)            106 (9)             NA                 2 (<1)
Parkinson's disease         252            38 (15)             40 (16)             NA                 1 (<1)
Intracerebral hemorrhage    88             43 (49)             27 (31)             7 (8)              0 (0)
Focal ischemia:
  NXY-059                   9              4 (44)              3 (33)              5 (56)             2 (22)
  Hypothermia               101            38 (38)             36 (36)             4 (4)              0 (0)
  Erythropoietin            19             8 (42)              7 (37)              4 (21)             0 (0)
  Tirilazad                 18             13 (72)             12 (67)             1 (6)              0 (0)
  tPA                       113            24 (21)             42 (37)             23 (20)            8 (7)

NA, not applicable; tPA, tissue plasminogen activator.
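The synthesis steps described under THE CONCEPT (determine each study's effect size, measure its precision by its standard error, then compute a precision-weighted summary estimate with a confidence interval) can be sketched in a few lines of Python. This is an illustrative sketch with invented effect sizes, not the authors' code or data:

```python
import math

def summary_estimate(effects, ses):
    """Inverse-variance weighted average of study effect sizes.

    effects: per-study treatment effects (e.g., % improvement vs. control)
    ses:     their standard errors; a smaller SE means a more precise study
    """
    weights = [1.0 / se ** 2 for se in ses]        # precision becomes weight
    total_w = sum(weights)
    summary = sum(w * e for w, e in zip(weights, effects)) / total_w
    se_summary = math.sqrt(1.0 / total_w)
    ci95 = (summary - 1.96 * se_summary, summary + 1.96 * se_summary)
    return summary, ci95

# Five hypothetical animal experiments (invented numbers)
effects = [32.0, 25.0, 41.0, 18.0, 29.0]
ses = [6.0, 4.0, 12.0, 5.0, 8.0]
summary, ci = summary_estimate(effects, ses)
```

Note how the imprecise study reporting a 41% effect (SE 12) is down-weighted: the summary estimate sits closer to the precise studies than the plain arithmetic mean would.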
External and Construct Validity

If preclinical models of disease are to inform human health, experiments require external validity and the models used require construct validity; both types of validity relate to the generalizability of a study. External validity refers to the ability to generalize the findings to different measures, settings, and times. Construct validity refers to the adequate representation of theoretical constructs; in disease modeling, this may be threatened where only specific characteristics of a complex disease are modeled. Aspects of both these validity types are clearly disease specific. For example, in the modeling of ischemic stroke, associated co-morbidities, the age of the animals, and the time to treatment are important factors. Using systematic review, we have identified that the majority of preclinical stroke studies use young, male, normotensive rats. Furthermore, meta-analysis has also provided evidence of no detectable effect of either tissue plasminogen activator or NXY-059 in hypertensive animals,(13,14) which would appear to be concordant with results in humans.

The most common recommendations from preclinical research guidelines to improve external validity are that experiments should be replicated in different models of the same disease and in different species, and that findings should be replicated independently. Others recommend that the time at which treatment is started after disease/injury induction should be realistic in terms of what is possible in the clinic. The most common recommendations to improve construct validity include characterization of the disease phenotype in the animal model before experimentation, matching the model to the human disease, and matching outcome measures to the clinical setting.

Figure 1. The effect of the mean sample size on the estimate of effect size for neurobehavioural score in models of encephalomyelitis. The horizontal gray bar represents the 95% confidence limits for the summary estimate of effect. The vertical error bars represent the 95% confidence intervals for the individual estimates. The widths of the bars represent the log of the number of animals contributing to each comparison.

Reporting Bias

The validity of a systematic review may be limited by reporting biases of the component studies. It has long been recognized that neutral studies often remain unpublished, or take longer to be published than those reporting statistically significant results. They are also more likely to be published in journals of low impact or in languages other than English. Such work is less likely to be identified in narrative and even in systematic review, and such publication bias can lead to the overstatement of summary effects in meta-analysis.(17) Published meta-analyses now routinely report the presence or absence of publication bias in their reviews.

Data from meta-analyses of 525 unique publications and 16 interventions tested in models of experimental stroke were combined; imprecise study effects consistent with publication bias were seen as funnel plot asymmetry and confirmed with Egger regression (Figures 2A and 2B). Using a meta-analytical technique known as trim-and-fill, which imputes theoretical missing studies, the overall efficacy was significantly reduced from 30.1% (28.7 to 31.6%) to 23.3% (21.7 to 24.9%), a relative overstatement in efficacy of 31%. Two hundred studies were deemed to be missing and were 'filled' into the data set (Figure 2C). Furthermore, only 2% of publications reported no significant treatment effects.

Figure 2. Publication bias. Plots describing (A) funnel plot, (B) Egger regression, and (C) trim-and-fill.

Other reporting biases that are less commonly assessed in systematic reviews include selective outcome reporting and selective analysis reporting. These biases may occur where many outcome measures are assessed or many statistical analyses are performed but only the 'best' results are presented. This in turn leads to a body of evidence with an inflated proportion of statistically significant results.(18) In a review of 160 meta-analyses including 4,445 experiments from the modeling of six neurologic disorders (Alzheimer's disease, encephalomyelitis, focal ischemia, intracerebral hemorrhage, Parkinson's disease, and spinal cord injury), we assessed, using the Excess Significance Test, whether too many of the individual studies in the meta-analyses reported statistically significant results.(10) We expected 21% of the results to be significant but observed 39%, suggesting the presence of bias in this data set. These issues may, in part, be addressed by the use of published protocols.

HOW TO APPRAISE CRITICALLY A SYSTEMATIC REVIEW AND META-ANALYSIS OF PRECLINICAL STUDIES

In a systematic review of systematic reviews of preclinical studies, Mignini and Khan(19) found that 30% specified a testable hypothesis, 27% performed a literature search without language restrictions, 17% assessed for the presence of publication bias, half assessed study validity, and 2% investigated sources of heterogeneity. As with any type of research, systematic reviews and meta-analyses are susceptible to bias, and it is only through clear reporting of what was done that it is possible to assess this risk of bias.

As systematic reviews and meta-analyses of preclinical research become more common, a number of different approaches have been used, and readers and reviewers need to be able to assess whether the methodologies used are sound and the interpretation is valid. However, we are not aware of any study that provides empirical evidence of the presence or magnitude of the risk of bias associated with different aspects of the conduct or reporting of systematic reviews and meta-analyses of data from in vivo experiments. Without such data, guidelines have less validity; but in a developing field it is, we believe, reasonable to make some recommendations, based in part on our experience in conducting such reviews and in part on guidelines in other, related fields. In Table 2, we make some recommendations for reporting systematic reviews and meta-analyses of animal studies using key elements of the guidelines proposed by Peters et al,(20) which are akin to the PRISMA guidelines(21) for the reporting of systematic reviews and meta-analyses of healthcare interventions in human clinical studies. We suggest the following steps should be considered in the critical appraisal of a systematic review and meta-analysis (Table 3; adapted from Garg et al(22)).

Does the Study Follow a Pre-Specified Protocol?

If a protocol exists, the manuscript should provide a reference to where it might be found.

A pre-specified written protocol is likely to improve the standard of a systematic review and meta-analysis by reducing the risk of 'data dredging' and post hoc revisions of study aims. Making the protocol publicly available to the research community also allows reviewers to obtain feedback on drafts through peer review. It is often the case that protocols evolve during the course of the review in light of a clearer understanding of the research field and the data available; if such iterations occur, the changes should be justified and the date of these changes given in a revised protocol. The protocol should clearly define the research question, objectives, inclusion criteria, search strategy, data collection processes, and data analysis plan. A protocol should allow readers to judge whether the final study did indeed follow a pre-specified plan.

Was the Research Question Focused and Clearly Defined?

The manuscript should provide a clear statement of the research question that is addressed.

As with any study, the research question of a systematic review and meta-analysis needs to be focused and clearly defined. It also needs to be appropriate to the type of study being performed, and its answer meaningful to the field.

Are the Inclusion Criteria Appropriate?

The manuscript should define clear inclusion and exclusion criteria relating to the identification of relevant publications.

The reader should be confident that the eligibility criteria for inclusion in a systematic review are not biased and are appropriate to the research questions being asked. The scope of the inclusion criteria determines the validity of the conclusions drawn. There has been debate on how broad or narrow the selection process should be; some argue that only studies that meet a high standard of methodological quality should be included, while others argue that meta-analyses should deal with the good, bad, and indifferent of included studies. There is no correct approach, but the degree of caution with which the results are interpreted and conclusions drawn should take this into account.

Some reviews are restricted to studies published in English, we suspect because of the ease of data abstraction. Language bias occurs if studies performed in non-English-speaking countries are more likely to publish their statistically significant results in English-language journals and non-significant results in non-English-language journals.(25) The impact of language bias on systematic reviews of preclinical data is yet to be determined. It may not always be practical to translate all non-English studies, but reviewers should at least identify and report how many studies were excluded for reason of language. This is usually possible, as titles and abstracts are often translated even if the full papers are not.
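The Egger regression used in the Reporting Bias section to confirm funnel-plot asymmetry can be sketched as an ordinary least-squares fit of each study's standardized effect (effect/SE) against its precision (1/SE); an intercept far from zero signals that small, imprecise studies report systematically larger effects. A toy sketch with invented numbers, not the 525-publication data set analyzed in this review:

```python
def egger_regression(effects, ses):
    """Simple OLS form of Egger's asymmetry test.

    Regresses standardized effect (effect / SE) on precision (1 / SE).
    In a symmetric funnel plot the intercept is near zero; a large
    intercept suggests small-study effects consistent with publication bias.
    """
    z = [e / s for e, s in zip(effects, ses)]   # standardized effects
    p = [1.0 / s for s in ses]                  # precisions
    n = len(z)
    mp, mz = sum(p) / n, sum(z) / n
    slope = (sum((pi - mp) * (zi - mz) for pi, zi in zip(p, z))
             / sum((pi - mp) ** 2 for pi in p))
    intercept = mz - slope * mp
    return intercept, slope

# Invented data with a built-in small-study effect: effect = 10 + 3 * SE,
# so the least precise studies report the largest effects.
ses = [1.0, 2.0, 4.0, 5.0]
effects = [10.0 + 3.0 * s for s in ses]
intercept, slope = egger_regression(effects, ses)
```

Because this constructed relationship is exactly linear, the intercept recovers the built-in asymmetry term (3) exactly; in real data the fit is noisy and the intercept is judged against its standard error.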
Table 2. Guidelines for reporting systematic reviews and meta-analyses of animal studies

Title: Identify the report as a systematic review and/or meta-analysis of animal experiments.
Abstract: Provide a structured abstract covering the following: objectives, data sources, review methods, results, and conclusion.
Introduction: State a clearly defined and focused research question.
Methods
  Protocol: Indicate whether a protocol exists and where it can be found (i.e., a web address).
  Searching: Describe the information sources in detail, including keywords, search strategy, any restrictions, and special efforts to include all available data.
  Selection: Describe the inclusion and exclusion criteria.
  Validity and quality assessment: Describe the criteria and process used to assess validity.
  Data abstraction: Describe the process or processes used (e.g., completed independently, in duplicate). Describe whether aggregate data or individual animal data are abstracted.
  Study characteristics: Describe the study characteristics relevant to the research question.
  Quantitative data synthesis: Describe the principal measures of effect, the method of combining results, the handling of missing data, how statistical heterogeneity was assessed, and any assessment of publication bias, all in enough detail to allow replication.
Results
  Flow chart: Provide a meta-analysis profile summarizing study flow, giving the total number of experiments in the meta-analysis.
  Study characteristics: Provide descriptive data for each experiment.
  Quantitative data synthesis: Present simple summary results (e.g., a forest plot); identify sources of heterogeneity, the impact of study quality, and publication bias.
Discussion: Summarize the main findings; discuss limitations; provide a general interpretation of the results in the context of other findings, and implications for future research.
Funding: Describe sources of funding for the review and other support; the role of funders should be presented.
Conflict of interest: Any potential conflicts of interest should be reported.

Table 3. Points to consider in the critical appraisal of systematic reviews and meta-analyses of animal studies

1. Does the study follow a pre-specified protocol?
2. Was the research question focused and clearly defined?
3. Are the inclusion criteria appropriate?
4. How comprehensive was the search strategy?
5. Was the data abstraction from each study appropriate?
6. Were the data pooled appropriately?

How Comprehensive was the Search Strategy?

The manuscript should define the search strategy, including the search terms used, the databases searched, and the date(s) of the search(es). The number of investigators screening publications for inclusion, and the method for dealing with inconsistencies, should be described.

Identifying relevant studies for a systematic review can be arduous. Biomedical journals are the most common source of relevant data, and these are identified via the searching of bibliographic databases. A range of bibliographic databases should be searched, as no database has complete coverage of all health-related literature. The core databases used are Medline and Embase; Medline is often searched via PubMed, but other search services may be used. Embase indexes more European and Asian journals and has roughly a 60% overlap with Medline. There are other sources of data that may be appropriate, including other specialized databases, conference proceedings, personal communications, and books.

The search strategy needs to be comprehensive enough to identify most relevant studies. The screening of identified studies for inclusion is susceptible to random error and may be subjective. For this reason, we advise that two independent reviewers screen studies for inclusion, and that the number of reviewers be reported in the manuscript. A comprehensive search reduces the possibility of publication bias in a review. Publication bias may be present because of an incomplete search of the literature, or because the studies themselves are not in the public domain to be identified in the search process. There are various techniques for assessing the presence and impact of publication bias in a meta-analysis that reviewers should consider.(17)

Was the Data Abstraction from Each Study Appropriate?

The choice of data (times, outcome measures) to be extracted from each publication should be defined. The number of investigators extracting data from publications, and the method for dealing with inconsistencies, should be described.

The reviewers should be rigorous, and their methods for extracting data from included studies reproducible. Ideally, this process should also be performed by two independent reviewers. The assessment of the methodological quality of included studies should be reported. Understanding the rigor and validity of included studies is important in the interpretation of the conclusions drawn.

Were the Data Pooled Appropriately?

The manuscript should identify a primary outcome variable, or the statistical limits to any subdivision of the data. The manuscript should describe the method of pooling the data and provide summary estimates, estimates of uncertainty (e.g., 95% confidence intervals), and a measure of heterogeneity (e.g., Q, I²).

The statistical pooling of aggregate data in a meta-analysis gives greater weight to more precise studies. This pooling may be performed under a 'fixed' or a 'random' effects model. More detailed discussion of the merits of each model in the pooling of preclinical data can be found elsewhere,(26) but on the whole, given the heterogeneity often observed in preclinical studies, random effects meta-analysis is more appropriate. Formal assessment of heterogeneity should be performed. Attempts to stratify data to account for, or explain, some of the heterogeneity are useful.
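The random effects pooling and heterogeneity statistics recommended above (Cochran's Q and I²) can be sketched with the DerSimonian-Laird estimator of the between-study variance. This is again an illustrative sketch with invented numbers, not the implementation used in any of the reviews discussed here:

```python
import math

def random_effects(effects, ses):
    """DerSimonian-Laird random effects pooling with Q and I^2 statistics."""
    w = [1.0 / s ** 2 for s in ses]                       # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                         # between-study variance
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0  # % heterogeneity
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]           # widened weights
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, se_pooled, q, i2

# Hypothetical heterogeneous studies (invented numbers)
effects = [12.0, 30.0, 45.0, 22.0, 38.0]
ses = [4.0, 5.0, 6.0, 3.0, 9.0]
pooled, se_pooled, q, i2 = random_effects(effects, ses)
```

When tau² is greater than zero, the random effects weights are flattened relative to the fixed-effect weights, so imprecise studies count relatively more and the confidence interval widens; this is why the model suits the heterogeneous data typical of preclinical research.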
SUMMARY

Recent years have seen improvements in the conduct and reporting of clinical trials after the publication of the CONSORT statement.(27) Analogous to this, the ARRIVE guidelines(28) and the Landis paper(29) hope to promote the same improvements in animal experiments.

Systematic review and meta-analysis have provided empirical evidence that too many preclinical experiments lack methodological rigor, and that this leads to inflated treatment effects. There is of course no guarantee that improvements in the validity of preclinical animal studies and reduced publication bias will improve the translational hit of interventions from bench to bedside. However, we hope that the mounting empirical evidence drives the preclinical research community toward improved rigor. Despite some very reasonable concerns about the novelty of the methodological approach and the difficulty of confirming these hypotheses experimentally, findings from meta-analyses cover such a wide range of disease models, and have been reported by so many different research groups, that it is highly unlikely that these conclusions do not reflect a real and present problem with the use of animal models. As consumers of systematic reviews and meta-analyses of preclinical research, it is important that we are able to discern, as we do for primary research studies, the rigor with which the meta-analyses are performed and reported.

We hope that this empirical evidence of bias, akin to that reported in clinical research more than 30 years ago, spurs a similar change in the way we conduct and report preclinical research. Whether this leads to improved translation we are yet to see. What we will observe, however, is an improvement in the conduct and reporting of preclinical research, which can reasonably only be of benefit to this troubled field of research.

DISCLOSURE/CONFLICT OF INTEREST

The authors declare no conflict of interest.

REFERENCES

1 Glass G. Meta-analysis at 25. http://www.gvglass.info/papers/meta25.html; 2013.
2 O'Collins VE, Macleod MR, Donnan GA, Horky LL, van der Worp BH, Howells DW. 1,026 experimental treatments in acute stroke. Ann Neurol 2006; 59: 467–477.
10 Tsilidis KK, Panagiotou OA, Sena ES, Aretouli E, Evangelou E, Howells DW et al. Evaluation of excess significance bias in animal studies of neurological diseases. PLoS Biol 2013; 11: e1001609.
11 Vesterinen HM, Sena ES, Ffrench-Constant C, Williams A, Chandran S, Macleod MR. Improving the translational hit of experimental treatments in multiple sclerosis. Mult Scler 2010; 16: 1044–1055.
12 van der Worp HB, Sena ES, Donnan GA, Howells DW, Macleod MR. Hypothermia in animal models of acute ischaemic stroke: a systematic review and meta-analysis. Brain 2007; 130: 3063–3074.
13 Macleod MR, van der Worp HB, Sena ES, Howells DW, Dirnagl U, Donnan GA. Evidence for the efficacy of NXY-059 in experimental focal cerebral ischaemia is confounded by study quality. Stroke 2008; 39: 2824–2829.
14 Sena ES, Briscoe CL, Howells DW, Donnan GA, Sandercock PA, Macleod MR. Factors affecting the apparent efficacy and safety of tissue plasminogen activator in thrombotic occlusion models of stroke: systematic review and meta-analysis. J Cereb Blood Flow Metab 2010; 30: 1905–1913.
15 Henderson VC, Kimmelman J, Fergusson D, Grimshaw JM, Hackam DG. Threats to validity in the design and conduct of preclinical efficacy studies: a systematic review of guidelines for in vivo animal experiments. PLoS Med 2013; 10: e1001489.
16 van der Worp HB, Howells DW, Sena ES, Porritt MJ, Rewell S, O'Collins V et al. Can animal models of disease reliably inform human studies? PLoS Med 2010; 7: e1000245.
17 Sutton AJ, Song F, Gilbody SM, Abrams KR. Modelling publication bias in meta-analysis: a review. Stat Methods Med Res 2000; 9: 421–445.
18 Ioannidis JPA. Why most discovered true associations are inflated. Epidemiology 2008; 19: 640–648.
19 Mignini LE, Khan KS. Methodological quality of systematic reviews of animal studies: a survey of reviews of basic research. BMC Med Res Methodol 2006; 6: 10.
20 Peters JL, Sutton AJ, Jones DR, Rushton L, Abrams KR. A systematic review of systematic reviews and meta-analyses of animal experiments with guidelines for reporting. J Environ Sci Health B 2006; 41: 1245–1258.
21 Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 2009; 339: b2535.
22 Garg AX, Hackam D, Tonelli M. Systematic review and meta-analysis: when one study is just not enough. Clin J Am Soc Nephrol 2008; 3: 253–260.
23 Meline T. Selecting studies for systematic review: inclusion and exclusion criteria. Contemp Issues Commun Sci Disord 2006; 33: 21–27.
24 Slavin RE. Best-evidence synthesis: why less is more. Educ Res 1987; 16: 15–16.
25 Egger M, Zellweger-Zähner T, Schneider M, Junker C, Lengeler C, Antes G. Language bias in randomised controlled trials published in English and German. Lancet 1997; 350: 326–329.
26 Vesterinen HM, Sena ES, Egan KJ, Hirst TC, Churolov L, Currie GL et al. Meta-analysis of data from animal studies: a practical guide. J Neurosci Methods 2014; 221: 92–102.
27 Moher D, Schulz K, Altman D. The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomized trials. BMC Med Res Methodol 2001; 1: 2.
28 Kilkenny C, Browne WJ, Cuthill IC, Emerson M, Altman DG. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol 2010; 8: e1000412.
29 Landis SC, Amara SG, Asadullah K, Austin CP, Blumenstein R, Bradley EW et al. A call for transparent reporting to optimize the predictive value of preclinical research. Nature 2012; 490: 187–191.
30 Egan KJ, Sena ES, Vesterinen HM, Macleod MR. Developing our understanding
of the pathogenesis of Alzheimer’s disease using a systematically identified 3 Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical-evidence of bias— dataset of interventions tested in transgenic mouse models. Eur J Neurol 2012; 19: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995; 273: 408–412. 31 Rooke ED, Vesterinen HM, Sena ES, Egan KJ, Macleod MR. Dopamine agonists in 4 Chalmers TC, Celano P, Sacks HS, Smith H. Bias in treatment assignment in con- animal models of Parkinson’s disease: a systematic review and meta-analysis. trolled clinical trials. N Engl J Med 1983; 309: 1358–1361. Parkinsonism Relat Disord 2011; 17: 313–320. 5 Bonnie S, Martin R. Understanding controlled trials: why are randomised con- 32 Frantzias J, Sena ES, Macleod MR, Salman RA-S.. Treatment of intracerebral trolled trials important? BMJ 1998; 316: 201. hemorrhage in animal models: meta-analysis. Ann Neurol 2011; 69: 389–399. 6 Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I et al. Improving the quality 33 Jerndal M, Forsberg K, Sena ES, Macleod MR, O’Collins VE, Linden T et al. of reporting of randomized controlled trials—The CONSORT statement. JAMA A systematic review and meta-analysis of erythropoietin in experimental stroke. 1996; 276: 637–639. J Cereb Blood Flow Metab 2010; 30: 961–968. 7 Crossley NA, Sena E, Goehler J, Horn J, van der Worp B, Bath PMW et al. Empirical 34 Sena E, Wheble P, Sandercock P, Macleod M. Systematic review and meta-analysis evidence of bias in the design of experimental stroke studies - a metaepide- of the efficacy of tirilazad in experimental stroke. Stroke 2007; 38: 388–394. miologic approach. Stroke 2008; 39: 929–934. 8 Sena ES, Macleod MR. Concordance between laboratory and clinical drug efficacy: lessons from systematic review and meta-analysis. Stroke 2007; 38:502. 9 Sena ES, van der Worp HB, Bath PMW, Howells DW, Macleod MR. 
Publication bias This work is licensed under a Creative Commons Attribution 3.0 in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Unported License. To view a copy of this license, visit http:// Biol 2010; 8: e1000344. creativecommons.org/licenses/by/3.0/ Journal of Cerebral Blood Flow & Metabolism (2014), 737 – 742 & 2014 ISCBFM http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Journal of Cerebral Blood Flow & Metabolism SAGE

Journal of Cerebral Blood Flow & Metabolism (2014) 34, 737–742
© 2014 ISCBFM. All rights reserved 0271-678X/14
www.jcbfm.com

REVIEW ARTICLE

Systematic reviews and meta-analysis of preclinical studies: why perform them and how to appraise them critically

Emily S Sena, Gillian L Currie, Sarah K McCann, Malcolm R Macleod and David W Howells

Department of Clinical Neurosciences, University of Edinburgh, Edinburgh, UK; and Stroke Division, Florey Institute of Neuroscience and Mental Health, Melbourne, Victoria, Australia. Correspondence: Dr ES Sena, Department of Clinical Neurosciences, University of Edinburgh, Chancellor's Building, 49 Little France Crescent, Edinburgh EH16 4SB, UK. E-mail: [email protected]
Received 11 December 2013; accepted 21 January 2014; published online 19 February 2014

The use of systematic review and meta-analysis of preclinical studies has become more common, including those of studies describing the modeling of cerebrovascular diseases. Empirical evidence suggests that too many preclinical experiments lack methodological rigor, and this leads to inflated treatment effects. The aim of this review is to describe the concepts of systematic review and meta-analysis and consider how these tools may be used to provide empirical evidence to spur the field to improve the rigor of the conduct and reporting of preclinical research, akin to their use in improving the conduct and reporting of randomized controlled trials in clinical research. As with other research domains, systematic reviews are subject to bias. Therefore, we have also suggested guidance for their conduct, reporting, and critical appraisal.

Journal of Cerebral Blood Flow & Metabolism (2014) 34, 737–742; doi:10.1038/jcbfm.2014.28; published online 19 February 2014

Keywords: acute stroke; animal models; basic science; biostatistics; experimental

INTRODUCTION
Animal models are invaluable tools for enriching our understanding of the mechanisms and etiology of human diseases. The number of preclinical experiments performed each year continues to increase and our understanding of disease mechanisms is improving, but the number of novel interventions reaching the clinic to treat cerebrovascular diseases continues to fall. It is clear that there are limitations to the translational paradigm as it currently exists. It is also clear from the sheer volume of preclinical research that structured methods are required to make objective sense of the available data. Systematic review and meta-analysis are useful tools which can address some, but not all, of the challenges of translational stroke research. They provide a less biased summary of research findings and allow judgement of both the range of available evidence (and hence the external validity) and the likelihood that conclusions are at risk of bias (the internal validity).

Systematic review sets out to use a structured process to identify all data relevant to a specific research question. This may be followed by meta-analysis, a statistical process that provides a summary estimate of the outcomes from a group of studies and allows these outcomes from different groups of studies to be compared. Although the first meta-analysis was performed in 1904 by Karl Pearson, it was only in 1976 that Gene Glass coined the term 'meta-analysis' to refer to this statistical pooling to allow the integration of findings. He suggests that meta-analysis was created out of the need to extract useful information from the cryptic records of inferential data analyses in the abbreviated reports of research in journals and other printed sources. Meta-analysis is now used in many fields of research including psychology, criminology, and education. Its use in clinical medicine is routine, and the Cochrane Collaboration has been instrumental in establishing the framework for evidence-based healthcare to guide clinical practice and healthcare policy. However, in preclinical research, the use of systematic review and meta-analysis is relatively novel.

Glass considers that necessity was the mother of invention where meta-analysis is concerned; if it had not happened in the early 1970s, it was sure to happen soon after. We suggest that the same holds for meta-analysis in preclinical stroke research. In the early years of the millennium, the dogma had developed that 'everything works in animals, but nothing works in humans'. In 2006, O'Collins et al published a review reporting that of more than 500 interventions that were reported to be efficacious in animal models of stroke, only thrombolysis with rtPA had been shown to be effective in stroke patients. A search for in vivo animal stroke studies published in the last 10 years yields more than 5,700 articles, but still no new therapies to treat acute stroke have been developed. If the purpose of preclinical cerebrovascular research was to develop new treatments for human stroke, then clearly there were substantial problems. The complexities of translational research led our group, and others, to adapt the techniques of meta-analysis, at that time largely restricted to clinical research, to the preclinical domain, in an attempt to provide empirical evidence for weaknesses in the prevailing translational paradigm, evidence which might guide improvements in the translational process.

In this narrative review, we aim to explain the concepts of systematic review and meta-analysis; to describe how their use changed clinical medicine; and to explore the contribution they might make to preclinical research. We also describe some of the elements to consider in the critical appraisal of systematic reviews and meta-analyses of preclinical research.

THE CONCEPT
In any type of review, there are two fundamental steps: identify the studies that are relevant to your research question, and then synthesize these identified data to reach conclusions. The benefit of narrative reviews is that they include a broad overview of relevant information, perhaps interpreted by an experienced author tempered by years of practical knowledge of the field; in many cases, they are highly useful. Unfortunately, they also have limitations. The selection methods used to identify studies that contribute to the review are often not transparent. The credence given to individual studies is inherently subjective and often unclear. Often, the reviewers themselves may be unable to articulate the processes through which they reached the conclusions presented. Systematic reviews are not bias free, but the transparency of the methods used is designed to reduce bias. In a systematic review, the researcher is required to outline aims, objectives, and methodology. The principle is that an independent researcher could perform the same identification process and yield the same data set. As is often the case in the interpretation of primary research studies, the conclusions drawn from a meta-analysis may differ from one reviewer to the next. However, the transparency and objectivity of the techniques used provide a framework for these discussions.

The data synthesis process in meta-analysis has a number of steps once the individual studies have been identified. First, the effect sizes of individual studies are determined. This is often a treatment effect that is a measure of the difference between control and treatment groups. Effect sizes are not limited to effects of drugs, but may represent a relationship between any two variables. Second, the precision of an effect size is determined by its standard error. The broad aim is to calculate an average effect size across the studies, termed the summary estimate of effect. Third, because some effect sizes are more precise than others, this averaging process is often weighted so that more precise studies are given more weight in a meta-analysis than less precise ones. Finally, the differences between the component effect sizes (the heterogeneity) are assessed. We expect variation of effect sizes to occur due both to random error and to real differences in experimental design. In our view, it is this exploration of sources of heterogeneity (the identification of those aspects of experimental design that cause exaggerations or underestimations in treatment effects, or those aspects of drug delivery that give maximum efficacy) which is the true strength of meta-analysis. However, because we are simply observing a 'cohort' of experiments, rather than testing the impact of these influences experimentally, findings from meta-analysis should be considered hypothesis generating rather than confirmatory.

Furthermore, pooling data using meta-analysis can be used to assess both the overall efficacy of an intervention and the impact of factors relating to internal and external validity, giving valuable insights into the causes of translational successes and failures. In recent years we, and others, have presented empirical evidence that suggests that the usefulness of data from experiments testing drug efficacy in animal models of various neurologic diseases may be substantially impaired by limited methodological quality, limited generalizability, and by significant publication bias.7–10

Methodological Quality
In an experiment, the credibility of the inferred causal relationship between treatment and outcome is dependent upon the statistical power and internal validity.

Preclinical animal research is confounded by pressures to reduce the number of animals used because of concerns about cost, time, ethics, and practicalities of disease modeling that might lead to studies either being underpowered or of unknown power. Determining the required sample size to answer a research question is crucial. Too small and the results are imprecise and lack statistical power. Too large and unnecessary costs are incurred. A priori sample size calculations also provide assurance that animals are not added to a study incrementally in response to (unreported) interim analyses.

The internal validity of an experiment ensures that the changes observed in outcomes are due to an induced change in one or more of the independent variables rather than some other confounding factor. The internal validity of an experiment may be threatened by a range of biases. These include, but are not limited to, selection bias, performance bias, and detection bias. Selection bias occurs when there are systematic differences between study groups at the start of an experiment. Performance bias occurs when systematic differences occur in how the groups are handled during a study, and detection bias occurs when systematic differences occur between groups in how outcomes are ascertained, diagnosed, or verified. Measures to reduce the impact of these biases include randomization, allocation concealment, and masked assessment of outcome.
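The a priori sample size calculation discussed in this section can be made concrete with a short sketch. This is not taken from the review itself; it is the standard normal-approximation formula for a two-sided, two-group comparison of means, with the effect size expressed as a standardized difference (Cohen's d), an assumption made here purely for illustration.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate animals needed per group for a two-sided,
    two-group comparison of means (normal approximation).
    d is the standardized effect size (Cohen's d)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z(power)           # quantile giving the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

Under this approximation, detecting a one standard deviation difference (d = 1.0) at 80% power needs about 16 animals per group, and a half standard deviation difference needs about 63; the exact t-based calculation gives slightly larger numbers, so this sketch is a lower bound rather than a recommendation.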
IN CLINICAL RESEARCH
Although systematic reviews and meta-analyses are now routinely used by medical researchers to inform practice and policy, they also had a pivotal role in providing empirical evidence of the impact of bias in the conduct of controlled clinical trials. Chalmers and others provided evidence that studies that do not adequately mask treatment allocation are associated with bias and inflated treatment effects.3,4 In the hierarchy of evidence, randomized controlled trials are now considered the gold standard in clinical trial design. Conceptually, the ability of randomization to account for systematic differences in factors, known or unknown, between groups that may affect outcome is apparent; as is ensuring that preconceived views of patients and clinicians do not bias the assessment of outcomes. But it required empirical evidence to revolutionize clinical research and to convince trialists of the importance of methodological rigor in both the conduct and reporting of their studies.

WHAT COULD META-ANALYSIS DO FOR PRECLINICAL RESEARCH?
Across the modeling of a number of cerebrovascular diseases, there is a considerable volume of often conflicting data. Systematic review and meta-analysis can be used to describe which interventions have been tested in models of disease, to provide an indication of the attrition rate of interventions (i.e., the number of interventions not progressing to clinical trial), and to describe the range of conditions under which efficacy has been tested.

Systematic review and meta-analysis of experimental studies covering a range of neurologic disorders have provided evidence that few studies take measures to reduce bias or perform formal power calculations to determine sample size (Table 1).

Table 1. Number and percentages of studies across the modeling of different neurologic diseases reporting measures to reduce the risk of bias

                            Number of      Masked assessment   Random allocation   Allocation        Sample size
                            publications   of outcome (%)      to group (%)        concealment (%)   calculation (%)
Alzheimer's disease         428            95 (22)             67 (16)             NA                0 (0)
Multiple sclerosis          1,117          178 (16)            106 (9)             NA                2 (<1)
Parkinson's disease         252            38 (15)             40 (16)             NA                1 (<1)
Intracerebral hemorrhage    88             43 (49)             27 (31)             7 (8)             0 (0)
Focal ischemia
  NXY-059                   9              4 (44)              3 (33)              5 (56)            2 (22)
  Hypothermia               101            38 (38)             36 (36)             4 (4)             0 (0)
  Erythropoietin            19             8 (42)              7 (37)              4 (21)            0 (0)
  Tirilazad                 18             13 (72)             12 (67)             1 (6)             0 (0)
  tPA                       113            24 (21)             42 (37)             23 (20)           8 (7)

NA, not applicable; tPA, tissue plasminogen activator.

Meta-analysis can be used to assess the impact of methodological quality on reported outcomes. Unfortunately, sample size calculations are so seldom reported that it has not been possible to assess whether performing a sample size calculation influences outcome. However, in animal models of experimental autoimmune encephalomyelitis, we showed that reported efficacy was largest in the smallest studies (Figure 1).

Figure 1. The effect of the mean sample size on the estimate of effect size for neurobehavioural score in models of encephalomyelitis. The horizontal gray bar represents the 95% confidence limits for the summary estimate of effect. The vertical error bars represent the 95% confidence intervals for the individual estimates. The widths of the bars represent the log of the number of animals contributing to each comparison.

Fortunately, we have been able to generate empirical evidence describing the impact of reporting of measures to reduce bias on outcome. In a meta-analysis of therapeutic hypothermia in experimental stroke, we observed treatment effects were 10% larger in non-randomized studies and 8% larger in unmasked studies than in those that did take these measures to reduce bias. Similarly, in a meta-analysis of NXY-059 in experimental stroke, NXY-059 was reported to be 30% more effective in studies that were not randomized or masked than in studies that reported randomization and blinding.

External and Construct Validity
If preclinical models of disease are to inform human health, experiments require external validity and the models used require construct validity; both types of validity relate to the generalizability of a study. External validity refers to the ability to generalize the findings to different measures, settings, and times. Construct validity refers to adequate representation of theoretical constructs; in disease modeling, this may be threatened where only specific characteristics of a complex disease are modeled. Aspects of both these validity types are clearly disease specific. For example, in the modeling of ischemic stroke, associated co-morbidities, age of animals, and the time to treatment are important factors. Using systematic review, we have identified that the majority of preclinical stroke studies use young male normotensive rats. Furthermore, meta-analysis has also provided evidence of no detectable effect of either tissue plasminogen activator or NXY-059 in hypertensive animals, which would appear to be concordant with results in humans.

The most common recommendations from preclinical research guidelines to improve external validity are that experiments should be replicated in different models of the same disease and in different species, and that findings should be replicated independently. Others recommend that the time that treatment is started after disease/injury induction should be realistic in terms of what is possible in the clinic. The most common recommendations to improve construct validity include characterization of disease phenotype in the animal model before experimentation, matching the model to the human disease, and matching outcome measures to the clinical setting.

Reporting Bias
The validity of a systematic review may be limited by reporting biases of the component studies. It has long been recognized that neutral studies often remain unpublished or take longer to get published than those reporting statistically significant results. They are also more likely to be published in journals of low impact or in languages other than English. Such work is less likely to be identified in narrative and even in systematic review, and such publication bias can lead to the overstatement of summary effects in meta-analysis. Published meta-analyses now routinely report the presence or lack of publication bias in their reviews.

Data from meta-analyses of 525 unique publications and 16 interventions tested in models of experimental stroke were combined, and imprecise study effects consistent with publication bias were seen in funnel plot asymmetry and confirmed with Egger regression (Figures 2A and 2B). Using a meta-analytical technique known as trim-and-fill that imputes theoretical missing studies, the overall efficacy was significantly reduced from 30.1% (28.7 to 31.6%) to 23.3% (21.7 to 24.9%), a relative overstatement in efficacy of 31%. Two hundred studies were deemed to be missing and were 'filled' into the data set (Figure 2C). Furthermore, only 2% of publications reported no significant treatment effects.

Other reporting biases that are less commonly assessed in systematic reviews include selective outcome reporting and selective analysis reporting. These biases may occur where many outcome measures are assessed or many statistical analyses are performed but only the 'best' results are presented. This in turn leads to a body of evidence with an inflated proportion of statistically significant results. In a review of 160 meta-analyses including 4,445 experiments from the modeling of six neurologic disorders (Alzheimer's disease, encephalomyelitis, focal ischemia, intracerebral hemorrhage, Parkinson's disease, and spinal cord injury), we assessed, using the Excess Significance Test, whether too many of the individual studies in the meta-analyses reported statistically significant results. We expected 21% of the results to be significant but observed 39% significant results, suggesting the presence of bias in this data set. These issues may, in part, be addressed by the use of published protocols.
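Funnel-plot asymmetry of the kind just described is commonly tested with Egger regression: the standardized effect (effect divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept far from zero signals small-study effects. The sketch below uses made-up numbers, not the stroke dataset analyzed in this review, and omits the formal significance test on the intercept.

```python
import numpy as np

def egger(effects, ses):
    """Egger regression: regress standardized effect on precision.
    Returns (intercept, slope); an intercept far from zero suggests
    funnel-plot asymmetry consistent with publication bias."""
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    precision = 1.0 / ses
    standardized = effects / ses
    # np.polyfit returns coefficients from highest degree down
    slope, intercept = np.polyfit(precision, standardized, 1)
    return intercept, slope

# Toy data in which small (high-SE) studies report larger effects,
# i.e. an asymmetric funnel built in by construction
ses = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
effects = 0.2 + 1.5 * ses
intercept, slope = egger(effects, ses)
```

Here the intercept recovers the built-in bias term (1.5), whereas an unbiased funnel would give an intercept near zero. A full analysis would also weight the regression and test the intercept formally, as implemented, for example, in the R metafor package.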
HOW TO APPRAISE CRITICALLY A SYSTEMATIC REVIEW AND META-ANALYSIS OF PRECLINICAL STUDIES
In a systematic review of systematic reviews of preclinical studies, Mignini and Khan found that 30% specified a testable hypothesis, 27% performed a literature search without language restrictions, 17% assessed for the presence of publication bias, half assessed study validity, and 2% investigated sources of heterogeneity. As with any type of research, systematic reviews and meta-analyses are susceptible to bias, and it is only through clear reporting of what was done that it is possible to assess this risk of bias.

As systematic reviews and meta-analyses of preclinical research become more common, a number of different approaches have been used, and readers (and reviewers) need to be able to assess whether the methodologies used are sound and the interpretation is valid. However, we are not aware of any study that provides empirical evidence of the presence or magnitude of the risk of bias associated with different aspects of the conduct or reporting of systematic reviews and meta-analyses of data from in vivo experiments. Without such data, guidelines have less validity, but in a developing field it is, we believe, reasonable to make some recommendations, based in part on our experience in conducting such reviews and in part on guidelines in other, related fields. In Table 2, we make some recommendations for reporting systematic reviews and meta-analyses of animal studies using key elements of the guidelines proposed by Peters et al, which are akin to the PRISMA guidelines for the reporting of systematic reviews and meta-analyses of healthcare interventions in human clinical studies. We suggest the following steps should be considered in the critical appraisal of a systematic review and meta-analysis (Table 3; adapted from Garg et al).

Table 2. Guidelines for reporting systematic reviews and meta-analyses of animal studies

Title: Identify the report as a systematic review and/or meta-analysis of animal experiments.
Abstract: Provide a structured abstract covering the following: objectives, data sources, review methods, results, and conclusion.
Introduction: Clearly defined and focussed research question.
Methods
  Protocol: Indicate if a protocol exists and where it can be found (i.e., web address).
  Searching: Describe the information sources in detail, including keywords, search strategy, any restrictions, and special efforts to include all available data.
  Selection: Describe the inclusion and exclusion criteria.
  Validity and quality assessment: Describe the criteria and process used to assess validity.
  Data abstraction: Describe the process or processes used (e.g., completed independently, in duplicate). Describe whether aggregate data or individual animal data are abstracted.
  Study characteristics: Describe the study characteristics relevant to your research question.
  Quantitative data synthesis: Describe the principal measures of effect, method of combining results, and handling of missing data; how statistical heterogeneity was assessed; and any assessment of publication bias, all in enough detail to allow replication.
Results
  Flow chart: A meta-analysis profile summarizing study flow, giving the total number of experiments in the meta-analysis.
  Study characteristics: Descriptive data for each experiment.
  Quantitative data synthesis: Present simple summary results (e.g., forest plot); identify sources of heterogeneity, impact of study quality, and publication bias.
Discussion: Summarize the main findings; discuss limitations; provide a general interpretation of the results in the context of other findings, and implications for future research.
Funding: Describe sources of funding for the review and other support. The role of funders should be presented.
Conflict of interest: Any potential conflict of interests should be reported.

Table 3. Points to consider in the critical appraisal of systematic reviews and meta-analyses of animal studies

1. Does the study follow a pre-specified protocol?
2. Was the research question focused and clearly defined?
3. Are the inclusion criteria appropriate?
4. How comprehensive was the search strategy?
5. Was the data abstraction from each study appropriate?
6. Were the data pooled appropriately?

Figure 2. Publication bias. Plots describing (A) funnel plot, (B) Egger regression, and (C) trim-and-fill.

Does the Study Follow a Pre-Specified Protocol?
If a protocol exists, the manuscript should provide a reference to where it might be found.
A pre-specified written protocol is likely to improve the standard of a systematic review and meta-analysis by reducing the risk of 'data dredging' and post hoc revisions of study aims. Making the protocol publicly available to the research community also allows reviewers to obtain feedback on drafts through peer review. It is often the case that protocols evolve during the course of the review in light of a clearer understanding of the research field and the data available; if such iterations occur, the changes should be justified and the date of these changes should be given in a revised protocol. The protocol should clearly define the research question, objectives, inclusion criteria, search strategy, data collection processes, and data analysis plan. A protocol should allow readers to judge whether the final study did indeed follow a pre-specified plan.

Was the Research Question Focused and Clearly Defined?
The manuscript should provide a clear statement of the research question that is addressed.
As with any study, the research question of a systematic review and meta-analysis needs to be focused and clearly defined. It also needs to be appropriate to the type of study being performed, and its answer meaningful to the field.

Are the Inclusion Criteria Appropriate?
The manuscript should define clear inclusion and exclusion criteria relating to the identification of relevant publications.
The reader should be confident that the eligibility criteria for inclusion in a systematic review are not biased and are appropriate to the research questions being asked. The scope of the inclusion criteria determines the validity of the conclusions drawn. There has been debate on how broad or narrow the selection process should be; some argue that only studies that meet a high standard of methodological quality should be included, while others argue that meta-analyses should deal with the good, bad, and indifferent of included studies. There is no correct approach, but the degree of caution with which the results are interpreted and conclusions drawn should take this into account. Some reviews are restricted to studies published in English, we suspect because of the ease of data abstraction. Language bias occurs if studies performed in non-English speaking countries are more likely to publish their statistically significant results in English-language journals and non-significant results in non-English language journals. The impact of language bias on systematic reviews of preclinical data is yet to be determined. It may not always be practical to translate all non-English studies, but the reviewers should at least identify and report how many studies were excluded for reason of language. This is usually possible, as titles and abstracts are often translated even if the full papers are not.

How Comprehensive was the Search Strategy?
The manuscript should define the search strategy, including the search terms used, the databases searched, and the date(s) of the search(es). The number of investigators screening publications for inclusion, and the method for dealing with inconsistencies, should be described.
Identifying relevant studies for a systematic review can be arduous. Biomedical journals are the most common source of relevant data, and these are identified via the searching of bibliographic databases. A range of bibliographic databases should be searched, as no database has complete coverage of all health-related literature. The core databases used are Medline and Embase; Medline is often searched via PubMed, but other search services may be used. Embase indexes more European and Asian journals and has roughly a 60% overlap with Medline. There are other sources of data that may be appropriate, including other specialized databases, conference proceedings, personal communications, and books.
The search strategy used needs to be comprehensive enough to identify most relevant studies. The screening of identified studies for inclusion is susceptible to random error and may be subjective. For this reason, we advise that two independent reviewers screen studies for inclusion, and the number of reviewers should be reported in the manuscript.
A comprehensive search reduces the possibility of publication bias in a review. Publication bias may be present because of an incomplete search of the literature or because the studies themselves are not in the public domain to be identified in the search process. There are various techniques for assessing the presence and impact of publication bias in a meta-analysis that reviewers should consider.

Was the Data Abstraction from Each Study Appropriate?
The choice of data (times, outcome measures) to be extracted from each publication should be defined. The number of investigators extracting data from publications, and the method for dealing with inconsistencies, should be described.
The reviewers should be rigorous and their methods reproducible in extracting data for included studies. Ideally, this process should also be performed by two independent reviewers. The assessment of methodological quality of included studies should be reported. Understanding the rigor and validity of included studies is important in the interpretation of the conclusions drawn.

Were the Data Pooled Appropriately?
The manuscript should identify a primary outcome variable or the statistical limits to any subdivision of the data. The manuscript should describe the method of pooling of data and provide summary estimates, estimates of uncertainty (e.g., 95% confidence intervals), and a measure of heterogeneity (e.g., Q, I²).
The statistical pooling of aggregate data in a meta-analysis gives greater weight to more precise studies. This pooling may be performed under a 'fixed' or 'random' effects model. More detailed discussion of the merits of each model in the pooling of preclinical data can be found elsewhere.
the whole, given the heterogeneity often observed in preclinical 11 Vesterinen HM, Sena ES, Ffrench-Constant C, Williams A, Chandran S, Macleod MR. studies, random effects meta-analysis is more appropriate. Formal Improving the translational hit of experimental treatments in multiple sclerosis. assessment of heterogeneity should be performed. Attempts to Mult Scler 2010; 16: 1044–1055. stratify data to account for, or explain, some of the heterogeneity 12 van der Worp HB, Sena ES, Donnan GA, Howells DW, Macleod MR. Hypothermia in are useful. animal models of acute ischaemic stroke: a systematic review and meta-analysis. Brain 2007; 130: 3063–3074. 13 Macleod MR, van der Worp HB, Sena ES, Howells DW, Dirnagl U, Donnan GA. SUMMARY Evidence for the efficacy of NXY-059 in experimental focal cerebral ischaemia is confounded by study quality. Stroke 2008; 39: 2824–2829. Recent years have seen improvements in the conduct and 14 Sena ES, Briscoe CL, Howells DW, Donnan GA, Sandercock PA, Macleod MR. reporting of clinical trial design after the publication of the 27 28 Factors affecting the apparent efficacy and safety of tissue plasminogen activator CONSORT statement. Analogous to this, the ARRIVE guidelines 29 in thrombotic occlusion models of stroke: systematic review and meta-analysis. and Landis paper hope to promote the same improvements in J Cereb Blood Flow Metab 2010; 30: 1905–1913. animal experiments. 15 Henderson VC, Kimmelman J, Fergusson D, Grimshaw JM, Hackam DG. Threats to Systematic review and meta-analysis have provided empirical validity in the design and conduct of preclinical efficacy studies: a systematic evidence that too many preclinical experiments lack methodolo- review of guidelines for in vivo animal experiments. PLoS Med 2013; 10: e1001489. gical rigor, and this leads to inflated treatment effects. There is 16 van der Worp HB, Howells DW, Sena ES, Porritt MJ, Rewell S, O’Collins V et al. Can animal models of disease reliably inform human studies? 
PLoS Med 2010; 7: of course no guarantee that improvements in the validity of e1000245. preclinical animal studies and reduced publication bias will 17 Sutton AJ, Song F, Gilbody SM, Abrams KR. Modelling publication bias in meta- improve the translational hit of interventions from bench to analysis: a review. Stat Methods Med Res 2000; 9: 421–445. bedside. However, we hope that the mounting empirical evidence 18 Ioannidis JPA. Why most discovered true associations are inflated. Epidemiology drives the preclinical research community toward improved rigor. 2008; 19: 640–648. Despite some very reasonable concerns about the novelty of the 19 Mignini LE, Khan KS. Methodological quality of systematic reviews of animal methodological approach and the difficulty of confirming these studies: a survey of reviews of basic research. BMC Med Res Methodol 2006; 6:10. hypotheses experimentally, findings from meta-analyses cover 20 Peters JL, Sutton AJ, Jones DR, Rushton L, Abrams KR. A systematic review of such a wide range of disease models and have been reported by systematic reviews and meta-analyses of animal experiments with guidelines for so many different research groups that it is highly unlikely that reporting. J Environ Sci Health B 2006; 41: 1245–1258. 21 Moher D, Liberati A, Tetzlaff J, Altman DG. PRISMA Group. Preferred reporting these conclusions do not reflect a real and present problem with items for systematic reviews and meta-analyses: the PRISMA statement. BMJ 2009; the use of animal models. As consumers of systematic reviews and 8: 336–41. meta-analyses of preclinical research, it is important that we are 22 Garg AX, Hackam D, Tonelli M. Systematic review and meta-analysis: when one able to discern, as we do for primary research studies, the rigor study is just not enough. Clin J Am Soc Nephrol 2008; 3: 253–260. with which the meta-analyses are performed and reported. 23 Meline T. Selecting studies for systematic review: inclusion and exclusion criteria. 
We hope that this empirical evidence of bias, akin to that Contemp Issues in Commun Sci Disord 2006; 33: 21–27. reported in clinical research more than 30 years ago, spurs a 24 Slavin RE. Best-evidence synthesis: why less is more. Educ Res 1987; 16: 15–16. similar change in the way we conduct and report preclinical 25 Egger M, Zellweger-Zþ n˜ hner T, Schneider M, Junker C, Lengeler C, Antes G. research. Whether this leads to improved translation we are yet to Language bias in randomised controlled trials published in English and German. Lancet 1997; 350: 326–329. see. However, what we will observe is an improvement in the 26 Vesterinen HM, Sena ES, Egan KJ, Hirst TC, Churolov L, Currie GL et al. Meta- conduct and reporting of preclinical research that reasonably can analysis of data from animal studies: a practical guide. J Neurosci Methods 2014; only be of benefit to this troubled field of research. 221: 92–102. 27 Moher D, Schulz K, Altman D. The CONSORT statement: revised recommendations for improving the quality of reports of parallel group randomized trials. BMC Med DISCLOSURE/CONFLICT OF INTEREST Res Methodol 2001; 1:2. The authors declare no conflict of interest. 28 Kilkenny C, Browne WJ, Cuthill IC, Emerson M, Altman DG. Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol 2010; 8: e1000412. REFERENCES 29 Landis SC, Amara SG, Asadullah K, Austin CP, Blumenstein R, Bradley EW et al. A call for transparent reporting to optimize the predictive value of preclinical 1 Glass G. Meta-analysis at 25 http://www.gvglass info/papers/meta25.html 2013. research. Nature 2012; 490: 187–191. 2 O’Collins VE, Macleod MR, Donnan GA, Horky LL, van der Worp BH, Howells DW. 30 Egan KJ, Sena ES, Vesterinen HM, Macleod MR. Developing our understanding 1,026 experimental treatments in acute stroke. Ann Neurol 2006; 59: 467–477. 
of the pathogenesis of Alzheimer’s disease using a systematically identified 3 Schulz KF, Chalmers I, Hayes RJ, Altman DG. Empirical-evidence of bias— dataset of interventions tested in transgenic mouse models. Eur J Neurol 2012; 19: dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA 1995; 273: 408–412. 31 Rooke ED, Vesterinen HM, Sena ES, Egan KJ, Macleod MR. Dopamine agonists in 4 Chalmers TC, Celano P, Sacks HS, Smith H. Bias in treatment assignment in con- animal models of Parkinson’s disease: a systematic review and meta-analysis. trolled clinical trials. N Engl J Med 1983; 309: 1358–1361. Parkinsonism Relat Disord 2011; 17: 313–320. 5 Bonnie S, Martin R. Understanding controlled trials: why are randomised con- 32 Frantzias J, Sena ES, Macleod MR, Salman RA-S.. Treatment of intracerebral trolled trials important? BMJ 1998; 316: 201. hemorrhage in animal models: meta-analysis. Ann Neurol 2011; 69: 389–399. 6 Begg C, Cho M, Eastwood S, Horton R, Moher D, Olkin I et al. Improving the quality 33 Jerndal M, Forsberg K, Sena ES, Macleod MR, O’Collins VE, Linden T et al. of reporting of randomized controlled trials—The CONSORT statement. JAMA A systematic review and meta-analysis of erythropoietin in experimental stroke. 1996; 276: 637–639. J Cereb Blood Flow Metab 2010; 30: 961–968. 7 Crossley NA, Sena E, Goehler J, Horn J, van der Worp B, Bath PMW et al. Empirical 34 Sena E, Wheble P, Sandercock P, Macleod M. Systematic review and meta-analysis evidence of bias in the design of experimental stroke studies - a metaepide- of the efficacy of tirilazad in experimental stroke. Stroke 2007; 38: 388–394. miologic approach. Stroke 2008; 39: 929–934. 8 Sena ES, Macleod MR. Concordance between laboratory and clinical drug efficacy: lessons from systematic review and meta-analysis. Stroke 2007; 38:502. 9 Sena ES, van der Worp HB, Bath PMW, Howells DW, Macleod MR. 
Publication bias This work is licensed under a Creative Commons Attribution 3.0 in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Unported License. To view a copy of this license, visit http:// Biol 2010; 8: e1000344. creativecommons.org/licenses/by/3.0/ Journal of Cerebral Blood Flow & Metabolism (2014), 737 – 742 & 2014 ISCBFM
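For readers who want to see the arithmetic behind the pooling discussed under "Were the Data Pooled Appropriately?", the following sketch implements inverse-variance (fixed-effect) weighting, the DerSimonian-Laird random-effects estimator, and the Q and I² heterogeneity statistics. The paper itself presents no code; the study effect sizes and variances below are invented for illustration, and the function names are ours, not from any library.

```python
import math

# Hypothetical per-study effect sizes (e.g., standardized mean differences)
# and their variances; smaller variance = more precise study.
effects = [0.9, 0.4, 1.2, 0.3, 0.7]
variances = [0.10, 0.05, 0.20, 0.08, 0.12]

def fixed_effect(effects, variances):
    """Inverse-variance weighted (fixed-effect) pooled estimate and its SE."""
    w = [1.0 / v for v in variances]
    est = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return est, se

def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator."""
    w = [1.0 / v for v in variances]
    fe, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (ei - fe) ** 2 for wi, ei in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    # I^2: percentage of total variation attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * ei for wi, ei in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    ci = (est - 1.96 * se, est + 1.96 * se)  # 95% confidence interval
    return est, ci, q, i2

if __name__ == "__main__":
    est, ci, q, i2 = dersimonian_laird(effects, variances)
    print(f"pooled estimate {est:.2f}, "
          f"95% CI {ci[0]:.2f} to {ci[1]:.2f}, Q={q:.1f}, I2={i2:.0f}%")
```

In practice, dedicated tools (e.g., the metafor package in R) should be preferred; the sketch only shows why random-effects confidence intervals widen relative to fixed-effect ones whenever the between-study variance tau² is greater than zero.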
