Clinical Practice Guideline Development and the GRADE Framework

Overview of the Clinical Practice Guideline Development Process

The development of clinical practice guidelines involves two distinct but interconnected components: the systematic review and the guideline development itself.
This work is typically divided between two specialized bodies:
- Evidence Review Team: Composed of experts in conducting systematic reviews and evaluating the quality of evidence.
- Guideline Group (Panel): Responsible for the actual development of guidelines, ideally working side-by-side with the evidence review team.
The process is iterative and follows the GRADE (Grading of Recommendations Assessment, Development, and Evaluation) framework, which is the recognized international standard for guideline development.

Systematic Review and Evidence Evaluation Steps

Question Formulation: Guidelines use the PICO (Population, Intervention, Comparison, and Outcome) format to direct systematic reviewers.
Outcome Selection and Rating: The guideline panel (not the evidence review team) identifies relevant outcomes and rates their importance (e.g., mortality vs. physiological markers).
Literature Search: Guided by PICO, a comprehensive search identifies all relevant studies.
Evidence Profiling: Across the identified studies, teams create evidence profiles and "summary of findings" tables, estimating the effect for each specific outcome.
Rating Quality Down: The quality of evidence for an outcome may be downgraded from "high" based on several factors:
- Risk of bias in individual studies.
- Inconsistency of results across studies.
- Indirectness (e.g., study population differs from the target population).
- Imprecision (e.g., small sample sizes or wide confidence intervals).
- Evidence of publication bias.
Rating Quality Up: Quality can be upgraded, particularly in high-quality observational studies, if:
- There is a very large magnitude of effect.
- There is a clear dose-response relationship.
- All plausible residual confounding would be expected to reduce the observed effect (or to suggest a spurious effect where none was observed).
Overall Quality Grade: The overall quality of evidence is typically determined by the "lowest quality of critical outcomes." For example, if mortality has "low" quality evidence but blood pressure reduction has "high" quality evidence, the overall quality for that recommendation is considered "low."
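The rating steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of GRADE itself: the class `OutcomeRating`, the function `overall_quality`, and the idea of counting downgrades and upgrades as whole levels are assumptions made for the example. Real panels apply these factors by judgment, not by arithmetic.

```python
from dataclasses import dataclass

# Quality levels ordered from worst to best.
LEVELS = ["very low", "low", "moderate", "high"]


@dataclass
class OutcomeRating:
    """Hypothetical record of one outcome's evidence rating."""
    name: str
    critical: bool        # importance as rated by the guideline panel
    start: str = "high"   # starting level before any adjustment
    downgrades: int = 0   # levels removed (risk of bias, inconsistency, ...)
    upgrades: int = 0     # levels added (large effect, dose-response, ...)

    def quality(self) -> str:
        # Move down and up the scale, then clamp to the valid range.
        idx = LEVELS.index(self.start) - self.downgrades + self.upgrades
        idx = max(0, min(idx, len(LEVELS) - 1))
        return LEVELS[idx]


def overall_quality(outcomes: list[OutcomeRating]) -> str:
    """Return the lowest quality level among the critical outcomes."""
    critical = [o for o in outcomes if o.critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min((o.quality() for o in critical), key=LEVELS.index)


# The example from the text: mortality is "low", blood pressure is "high".
outcomes = [
    OutcomeRating("mortality", critical=True, downgrades=2),
    OutcomeRating("blood pressure reduction", critical=True),
]
print(overall_quality(outcomes))  # low
```

Only outcomes flagged as critical feed into the overall grade, so a well-studied but unimportant outcome cannot raise the grade of a recommendation.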

Multidisciplinary and Consumer Involvement

Multidisciplinary Representation: A trustworthy guideline must include a diverse group of experts and stakeholders.
Consumer/Patient Perspectives: Meaningful involvement of patients and carers is essential. Tokenistic approaches are considered worse than no involvement at all.
Models for Involvement: An example of good practice is a parallel process rather than simply placing one patient on a working group:
- Patients and carers meet in workshops or focus groups.
- They are presented with the draft scope and content.
- They receive education on the guidelines and identify their own priorities, needs, and perspectives.
- These priorities are compared against the guideline scope to identify gaps, and the guideline working group then revises the scope accordingly.
Complex Target Groups: Specific populations require tailored engagement. For example, consumer engagement for the development of chronic kidney disease guidelines for Aboriginal and Torres Strait Islander people involves a major, multifaceted process to ensure the results address the specific needs of that community.

Formulating Recommendations (The GRADE Framework)

Recommendations should be clearly worded to indicate for or against an intervention and the strength of that recommendation.
Factors Influencing Recommendations:
- Overall quality of evidence (reflecting risk of bias, inconsistency, indirectness, imprecision, and publication bias).
- Balance between benefits and harms.
- Values and preferences of patients.
- Costs and resource implications.
Strength of Recommendation: GRADE assigns two levels: "Strong" or "Weak/Conditional."
- Strong Recommendations: Often use language like "we recommend."
- Weak/Conditional Recommendations: Often use language like "we suggest."
Quality Levels Underpinning Recommendations:
- High quality ($A$)
- Moderate quality ($B$)
- Low quality ($C$)
- Very low quality ($D$)
Categorization Examples (The 1 and 2 Numeric System):
- $1A$, $1B$, $1C$, $1D$: Strong recommendations with varying evidence quality. Benefits clearly outweigh harms (or vice versa). Applicable to most patients in most circumstances.
- $2A$, $2B$, $2C$, $2D$: Weak/Conditional recommendations with varying evidence quality. Benefits and harms are closely balanced. The best course of action may differ based on patient values or clinical circumstances.
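The two-character codes combine a strength digit with a quality letter, so they can be decoded mechanically. The sketch below assumes only the mappings given in the lists above. The function name `describe_grade` and the returned dictionary layout are hypothetical, chosen for illustration.

```python
# Mappings taken from the lists above.
STRENGTH = {"1": "strong", "2": "weak/conditional"}
QUALITY = {"A": "high", "B": "moderate", "C": "low", "D": "very low"}
WORDING = {"1": "we recommend", "2": "we suggest"}


def describe_grade(code: str) -> dict:
    """Decode a grade such as '1A' or '2C' into its two components."""
    code = code.strip().upper()
    if len(code) != 2 or code[0] not in STRENGTH or code[1] not in QUALITY:
        raise ValueError(f"not a valid grade code: {code!r}")
    return {
        "strength": STRENGTH[code[0]],
        "quality": QUALITY[code[1]],
        "wording": WORDING[code[0]],
    }


print(describe_grade("2A"))
# {'strength': 'weak/conditional', 'quality': 'high', 'wording': 'we suggest'}
```

Keeping strength and quality as separate fields mirrors the point that the two are independent: a $2A$ recommendation is weak despite high-quality evidence, and a $1C$ recommendation is strong despite low-quality evidence.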

Implications of Strong vs. Weak Recommendations

Strong Recommendations ($1$):
- Patients: Most people in that situation would want the recommended action; only a small proportion would not.
- Clinicians: Most patients should receive the action.
- Policymakers: The recommendation can be adopted as policy in most situations.
Weak Recommendations ($2$):
- Patients: Most would want the action, but many would not, depending on individual values.
- Clinicians: Different choices are appropriate for different patients (e.g., due to comorbidities). Decision-making must be consistent with patient values.
- Policymakers: Policy-making will require substantial debate and the involvement of various stakeholders.

Case Study: Treating Anemia in Hemodialysis Patients

Balancing Benefits: Reduced fatigue, improved quality of life, fewer blood transfusions, and lower hospitalization rates.
Balancing Harms: Increased risk of death/mortality, cardiovascular events, cost, and vascular thrombosis.
KHA-CARI Guidelines Group Recommendations:
- Strong Recommendation ($1A$): "We recommend against haemoglobin targets above $130\,\mathrm{g/L}$ due to the strong association with increased morbidity and mortality."
- Weak Recommendation ($2A$): "For many anaemic patients with CKD, we suggest a haemoglobin target of between $100$ and $115\,\mathrm{g/L}$." Even though the evidence quality is high ($A$), the recommendation is weak because the right target for an individual depends on weighing blood transfusion risks against quality of life and fatigue.

Strong Recommendations with Low Quality Evidence

It is possible to make a strong recommendation ($1C$ or $1D$) despite low-quality evidence, provided the reasons are stated clearly. Scenarios include:
- Life-threatening Situations: Uncertain benefit but very high mortality risk if no action is taken.
- Certain Harm: Potential for catastrophic harm from an intervention even if the benefit evidence is poor.
Examples:
- World Health Organization (WHO): Strong recommendation to treat suspected avian influenza infections as soon as possible despite weak evidence, placing a high value on preventing death in a disease with a high fatality rate and a lower value on adverse drug reactions.
- American Thoracic Society (ATS): Strong recommendation against using Cyclosporine A for Idiopathic Pulmonary Fibrosis (IPF) despite very low quality evidence. The panel placed a high value on avoiding side effects and costs and a low value on the discordant evidence.

Key Considerations and The Future of Guidelines

Flexibility: Guidelines should be adaptable to local conditions (e.g., metropolitan vs. rural hospital). Developers cannot predict all scenarios, but users should be able to create local protocols from recommendations.
Resources: Resource considerations are often poorly addressed, especially in international guidelines, which tend to be biased toward high-income settings.
Life-Span and Updating: Guidelines often become out of date quickly. The "Living Guideline" model, which relies on continuous updating, is seen as the way forward.
MAGICapp: A web-based tool (associated with Cochrane and the GRADE Working Group) that improves transparency and facilitates living guidelines. The Australian Stroke Foundation uses this platform for its guidelines.

Nine Underpinning Principles of a Trustworthy Guideline

Transparency: Clear documentation of how the guideline was developed.
Conflict of Interest Management: Confidence in how experts' potential biases were handled.
Multidisciplinary Group: Inclusion of a wide range of experts and consumer/patient perspectives.
Rigorous Systematic Review: Clear identification and evaluation of all evidence.
Evidence-Recommendation Link: A clear process for moving from evidence to specific recommendations.
Clarity of Evidence Base: Transparent identification of when a recommendation reflects high-quality evidence versus opinion.
Clear Wording: Recommendations must be actionable "action statements" rather than just "evidence statements."
Review Process: Use of external peer review to ensure findings are not limited to the working group's views.
Updating and Implementation: A clear process for keeping the guideline current and ensuring it is practical to implement in real-world settings.