CHSL LibGuides: Systematic Reviews: GRADE

GRADE Guidelines

The "Grades of Recommendation, Assessment, Development, and Evaluation" (GRADE) approach provides guidance for rating quality of evidence and grading strength of recommendations in health care. [Guyatt, 380]

To put it more simply, GRADE is a framework for grading the evidence from the studies you are considering for your systematic review. Used properly, the evidence for each outcome is graded across studies, rather than each study being graded as a single unit, to assess the quality of the evidence presented.

GRADE provides a systematic and transparent framework for clarifying questions, determining the outcomes of interest, summarizing the evidence that addresses a question, and moving from the evidence to a recommendation or decision. [Guyatt, 380]

The work of developing the GRADE framework began in 2000 as a means of developing a clear, transparent, and agreed-upon system for "rating quality of evidence and determining strength of recommendations for clinical practice guidelines." GRADE is recommended for "use in systematic reviews, guidelines, and health technology assessment." [Guyatt, 380] The product of a GRADE assessment is an evidence profile and a summary of findings table, which should accompany any recommendations or guidelines and should be provided for each outcome related to a clinical question.

This LibGuide is based on a series of articles which appeared in the Journal of Clinical Epidemiology, starting in volume 64 (2011) 380-382. The stated aim of this series was to provide a "how to" guide for those interested in learning how to apply the GRADE framework to systematic reviews or health guidelines. Links to the various articles are provided below.

GRADE

GRADE offers a system for rating quality of evidence in systematic reviews and guidelines and grading strength of recommendations in guidelines. [Guyatt, 383]

GRADE is much more than a rating system. It offers a transparent and structured process for developing and presenting evidence summaries for systematic reviews and guidelines in health care and for carrying out the steps involved in developing recommendations. [Guyatt, 383]

To rate the quality of evidence for each outcome, and to support a recommendation about how to use that evidence, GRADE assigns the evidence to one of four categories: high, moderate, low, or very low. Evidence from randomized controlled trials, considered the strongest design for producing quality evidence, starts as high quality, and its final category depends on how the studies were carried out. GRADE also identifies five factors that can lead to rating down the quality of evidence: Risk of Bias, Inconsistency, Indirectness, Imprecision, and Publication Bias. How each of these factors affects quality is addressed directly in the series of articles. In this LibGuide, these elements will be summarized; however, if you are undertaking a systematic review, you are strongly encouraged to read the entirety of the article series.

GRADE is a transparent framework that allows a reviewer to examine and rate the quality of evidence demonstrated by studies, be they Randomized Controlled Trials (RCTs) or Observational Studies. "In the GRADE approach, randomized controlled trials (RCTs) start as high-quality evidence and observational studies as low-quality evidence supporting estimates of intervention effects. Five factors may lead to rating down the quality of evidence and three factors may lead to rating up. Ultimately, the quality of evidence for each outcome falls into one of four categories from high to very low." [Guyatt, 385]

Studies are not rated as a whole using GRADE; rather, each outcome reported by the study is graded, so a rating is given for each outcome the study sought to measure, whether explicitly or incidentally. Thus, a study that reports multiple outcomes could have several different ratings, one for each outcome. What is most important is that the quality of the evidence for each outcome is examined and rated.

As discussed at the outset of this LibGuide, fundamental to a systematic review is the development of a clinical question that can, ideally, be answered by the evidence across a number of studies. The clinical question takes its basic form from PICO, the patient/intervention/comparator/outcome framework. GRADE uses PICO to establish which outcomes are the most critical and which are less important. This latter point matters because GRADE rates the quality of evidence for EACH outcome, not just for the study or systematic review as a whole. "Systematic review and guideline authors use [GRADE] to rate the quality of evidence for each outcome across studies (i.e., for a body of evidence). This does not mean rating each study as a single unit. Rather, GRADE is 'outcome centric': rating is made for each outcome, and quality may differ--indeed, is likely to differ--from one outcome to another within a single study and across a body of evidence." [Guyatt, 385] The ultimate goal of GRADE is to provide, for each outcome, a summary of the evidence, a rating of its quality, and an estimate of the effect of the intervention studied. To report these, GRADE uses an Evidence Profile (EP) and a Summary of Findings (SoF) table.

Evidence Profile (EP)

Also referred to as a GRADE EP, the Evidence Profile "includes a detailed quality assessment in addition to a SoFs. That is, the EP includes an explicit judgment of each factor that determines the quality of evidence for each outcome, in addition to a SoFs for each outcome. The SoF table includes an assessment of the quality of evidence for each outcome but not the detailed judgments." [Guyatt, 386] According to the series of articles, a Summary of Findings is intended for a broader audience, including end users of systematic reviews; whereas an Evidence Profile is intended for those creating guidelines.

[Figure: GRADE quality-of-evidence grid]

Above is the grid created to show the process of arriving at a judgment about the quality of evidence for an outcome using GRADE. As the grid shows, Randomized Controlled Trials begin with the assumption that the evidence is High quality, whereas Observational Studies begin with the assumption that the evidence is Low quality. From that starting point the rating can move to any of the four categories (High, Moderate, Low, or Very Low). One column in the grid lists the five criteria, mentioned before, under which the quality of evidence can be rated down: Risk of Bias, Inconsistency, Indirectness, Imprecision, and Publication Bias. The final column lists the three criteria under which the quality of evidence can be rated up: Large Effect, Dose Response, and All Plausible Confounding.
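The rating logic described by the grid can be sketched in a few lines of Python. This is an illustrative simplification, not part of GRADE itself: the function name rate_outcome, the dictionaries, and the idea of moving a fixed number of levels per factor are all assumptions made for this sketch. In practice each factor is a judgment made by the review team, not an arithmetic step.

```python
# Illustrative sketch only: GRADE ratings are judgments, not arithmetic.
# All names here (rate_outcome, LEVELS, ...) are hypothetical.

LEVELS = ["very low", "low", "moderate", "high"]

# Starting assumption by study design, as described in the guide.
START = {"rct": "high", "observational": "low"}

RATE_DOWN = {"risk of bias", "inconsistency", "indirectness",
             "imprecision", "publication bias"}
RATE_UP = {"large effect", "dose response", "all plausible confounding"}


def rate_outcome(design, down=None, up=None):
    """Return a quality category for ONE outcome.

    down / up map a factor name to the number of levels to move
    (an assumption of this sketch, e.g. 1 or 2).
    """
    down = down or {}
    up = up or {}
    if design not in START:
        raise ValueError(f"unknown study design: {design}")
    unknown = (set(down) - RATE_DOWN) | (set(up) - RATE_UP)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    score = LEVELS.index(START[design])
    score -= sum(down.values())   # rate down
    score += sum(up.values())     # rate up
    # The result must stay within the four categories.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


# One rating per outcome, not per study.
print(rate_outcome("rct"))
print(rate_outcome("rct", down={"imprecision": 1}))
print(rate_outcome("observational", up={"large effect": 1}))
```

Because the function is called once per outcome, a single study with several outcomes naturally ends up with several ratings, which mirrors the outcome-centric approach described above.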

An Evidence Profile table is shown in the article, which can be viewed here. It relies on the same study mentioned at the outset of this LibGuide: Venekamp's study (https://doi.org/10.1002/14651858.CD000219.pub4) examining the effectiveness of antibiotic use versus no intervention in children with Acute Otitis Media.

Table 2 [Guyatt, 388] shows the Summary of Findings (SoF) with Outcomes as the first column, followed by the Control & Intervention Risk, Relative Risk (Confidence Interval), Number of Participants, Quality of Evidence (ranked by GRADE), and any comments. The outcomes are, per the study: Pain at 24 hours; Pain at 2-7 days; Tympanometry at one month; Tympanometry at three months; and Vomiting, diarrhea, or rash. The two Tympanometry measures are surrogate outcomes rather than actual pain measurements. Based on a GRADE assessment of the quality of the evidence, the first two outcomes were rated as High quality evidence; the remaining three were rated as Moderate.
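To make the layout of a Summary of Findings table concrete, the columns described above can be modelled as a small record type. This is a hypothetical sketch: the class SoFRow, its field names, and the helper by_quality are invented for illustration. Only the outcome names and quality ratings are taken from the description above; the risk, relative risk, and participant columns are left empty because their values are not quoted in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SoFRow:
    """One row of a Summary of Findings table (hypothetical layout)."""
    outcome: str
    quality: str                             # GRADE quality of evidence
    control_risk: Optional[str] = None       # not quoted in this guide
    intervention_risk: Optional[str] = None  # not quoted in this guide
    relative_risk_ci: Optional[str] = None   # not quoted in this guide
    participants: Optional[int] = None       # not quoted in this guide
    comments: str = ""


# Outcomes and ratings as summarized in the guide's description of Table 2.
rows = [
    SoFRow("Pain at 24 hours", "High"),
    SoFRow("Pain at 2-7 days", "High"),
    SoFRow("Tympanometry at one month", "Moderate"),
    SoFRow("Tympanometry at three months", "Moderate"),
    SoFRow("Vomiting, diarrhea, or rash", "Moderate"),
]


def by_quality(table):
    """Group outcome names by their quality rating."""
    grouped = {}
    for row in table:
        grouped.setdefault(row.quality, []).append(row.outcome)
    return grouped


for quality, outcomes in by_quality(rows).items():
    print(f"{quality}: {', '.join(outcomes)}")
```

An Evidence Profile could reuse the same rows and add one field per rating factor, since the EP records the detailed judgments that the SoF table leaves out.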

Any comments or reasoning for the ratings are discussed, as in Table 3 of the Guyatt article, p. 390, which discusses the results of a different study. It is important to note that the SoF is another manner of presenting the information from the EP. The EP, again, is more concerned with the details of the studies and their outcomes than the SoF, which simply presents the findings. For instance, the EP for the Venekamp study discusses at length the outcome related to Vomiting, Diarrhea, and Rash, because the study was concerned with antibiotic use but not necessarily with the type of antibiotic used, and some antibiotics are more prone to cause adverse effects than others. This concern caused the evidence for that outcome to be rated down.