With the support of its membership, the ACR publishes clinical practice guidelines in multiple disease areas based on the best available clinical and scientific data. These aim to support health professionals treating rheumatology patients to give the best possible care. Like any set of medical guidelines, ACR guidelines are based on evidence of several different levels, and guideline writers must often work with incomplete information. Clinicians often desire guidance for particular questions that do not have a clear-cut answer in the medical literature, and they look to the ACR guidelines to help fill this void.
Here, we discuss the rigorous process by which writers develop the guidelines, the limits of the clinical data available and possible directions for greater research and future guideline development.
Background: Evolution & Role of Clinical Guidelines
Although the push toward creating standardized medical protocols for clinicians began decades earlier, the modern age of guidelines began with a 1990 Institute of Medicine report. This report defined clinical practice guidelines as “systematically developed statements to assist practitioner and patient decisions about appropriate healthcare for specific clinical circumstances.”1 Since that time, a proliferation of clinical guidelines across all medical specialties, including rheumatology, has occurred. The guidelines are intended to help clinicians stay abreast of the ever-expanding evidence base.2
ACR guideline authors make clear that clinical practice guidelines are not prescriptive and should be used by clinicians and patients only as a guide for discussion. ACR guidelines attempt to acknowledge this limited role in providing clinicians with relevant information to help guide (but not mandate) treatment. Optimal treatment requires a clinician’s assessment and collaborative decision making on the part of patients and providers, and consideration of clinical practice guidelines and supporting evidence. Clinicians must understand the limits of any set of clinical practice guidelines and how they may best be used to improve clinical care. Guidelines are never meant to replace clinical judgment or establish a rigid protocol for addressing all individuals with a specific rheumatologic condition.3
With the expansion of clinical practice guidelines across specialties, increasing scrutiny has been placed on the evidence base and the processes used to create guidelines. Efforts have been made to improve guidelines by increasing transparency, standardizing guideline development methods and managing conflicts of interest.4 One important remaining challenge is the evidence base. In many cases, the highest grade evidence (from randomized controlled trials) is simply not available to support the majority of recommendations made.5
Clinicians must realize the inherent limitations of any clinical practice guidelines, including those produced by the ACR. In the absence of a perfect evidence base, guidelines can still provide helpful, evidence-based guidance for those making clinical decisions.
How Are the ACR Guidelines Generated?
The ACR strives to produce guidelines that provide the best possible evidence-based guidance to clinicians, while being transparent about any limitations in the evidence base for particular recommendations. The ACR guideline development process complies with standards from the National Academy of Medicine (formerly the Institute of Medicine) and the Council of Medical Specialty Societies. ACR guidelines are developed using a rigorous, multi-step process involving several coordinated groups of experts. Before the process begins, all members of the guideline group undergo orientation to prepare them for their specific roles.
The first step is assembling a group of experts who will work to develop the guideline, starting by identifying the project’s scope. Tuhina Neogi, MD, PhD, FRCPC, is a practicing rheumatologist and professor of medicine and epidemiology at the Boston University School of Medicine. She participated in developing the 2012 gout treatment guidelines, and she is one of the core team leaders for the osteoarthritis guideline, currently in process.
Dr. Neogi explains, “Usually the people who are leading the effort [help the ACR] identify appropriate candidates to populate the broader team that will be working on the treatment guidelines.”
Liron Caplan, MD, PhD, is an associate professor of medicine and rheumatology at the University of Colorado School of Medicine in Aurora, Colo. He was the literature review leader for the ACR’s axial spondyloarthritis (SpA) guideline and currently serves as the ACR Guideline Subcommittee chair. Dr. Caplan says, “In the case of the ACR, they require that more than half of the guideline development team have no conflicts of interest with regard to industry support in the topic area of the guideline.” The team includes members with content expertise and expertise in methodology and guideline development.
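The conflict-of-interest requirement Dr. Caplan describes amounts to a simple majority check. The Python sketch below is only an illustration of that rule as quoted; the `TeamMember` type, its field names and the function name are hypothetical, and the ACR's actual disclosure review is not reduced to a single flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamMember:
    """Hypothetical record for one member of a guideline development team."""
    name: str
    has_industry_conflict: bool  # conflict in the guideline's topic area


def meets_conflict_threshold(team: list[TeamMember]) -> bool:
    """Return True when strictly more than half of the team is conflict-free."""
    if not team:
        return False  # an empty team cannot satisfy the requirement
    conflict_free = sum(1 for member in team if not member.has_industry_conflict)
    # Multiply instead of dividing so an exact 50/50 split is rejected.
    return conflict_free * 2 > len(team)
```

Under this sketch, a team split evenly between conflicted and conflict-free members does not pass, because the quoted rule calls for more than half.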
The next stage is identifying the most important clinical questions and outcomes the guideline needs to address. An expert panel works with the broader guideline development team to provide guidance and advice on the most clinically appropriate questions to include. “There are many more questions than are feasible to be addressed in a given treatment guideline,” says Dr. Neogi. “To have the treatment guidelines be relevant and also doable and digestible, there has to be some curation involved.” Before proceeding further, members of the larger community have an opportunity to shape these questions via a public comment period.
The next step is assembling a complete and systematic review of the medical literature that directly or indirectly addresses the chosen scenarios. This review is performed by a literature review team working in consultation with the core leadership team and the broader guideline development group. This team compiles the evidence to address the questions the expert panel helped develop. The evidence addressing all the chosen questions is presented in a systematic report.
Next, a separate voting panel examines the evidence report. Dr. Neogi explains, “The voting panel then reviews the evidence for those questions and determines what recommendations to make.”
This voting panel comprises rheumatologists, other relevant specialists (including primary care, where appropriate), stakeholders with clinical expertise in the target treatment areas and patient participants. After voting, the findings are compiled into a guideline draft that then undergoes a process of peer review by the ACR and its journals before final publication. Each guideline must be reviewed and endorsed by the ACR Guideline Subcommittee, Quality of Care Committee and Board of Directors.
Specifics of the GRADE Methodology
The ACR currently uses a specific system to develop its guidelines: the GRADE methodology (Grading of Recommendations Assessment, Development and Evaluation), an internationally accepted systematic approach to guideline development.
GRADE is not the only such rating system available. Many earlier ACR guidelines (such as the 2012 gout guidelines) used the RAND/UCLA method, a somewhat different method of compiling and evaluating evidence.6
GRADE was used in the following ACR guidelines: osteoarthritis (2012), axial spondyloarthritis (2015), polymyalgia rheumatica (2015), rheumatoid arthritis (2015), glucocorticoid-induced osteoporosis (2017) and perioperative management of antirheumatic medication (2017).3,7-11 Comparing it with the RAND/UCLA method, Dr. Caplan notes, “GRADE is more explicit and systematic about how a level of evidence is translated into a recommendation and more transparent about the limitations of the evidence.”
GRADE has certain advantages over other rating systems. Unlike some other methods, it clearly distinguishes the quality of evidence from the strength of recommendations. It provides clear and pragmatic interpretations for strong vs. conditional recommendations. GRADE gives explicit and comprehensive criteria for downgrading or upgrading quality-of-evidence ratings. It focuses on evaluating two or more discrete alternative management strategies rather than general management principles. Another advantage is that GRADE explicitly acknowledges that values and preferences should play a role in recommendations.12
Dr. Caplan explains that the GRADE format helps the guideline developers avoid vague concepts while shaping specific recommendations around choices that clinicians address. “The PICO questions articulate medical scenarios in a very structured format, and then evidence is gathered around that scenario.” PICO questions are one evidence-based model used to help frame clear questions. PICO questions must specify the Patient population addressed, the Intervention, the Comparison and the Outcome(s) being examined in a specific clinical situation.
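For readers who think in terms of data structures, the four PICO elements can be pictured as a small record. The Python sketch below is an illustration only: the class name, field names and sentence template are assumptions, and the sample question is hypothetical wording, not text from any ACR guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """Hypothetical container for the four elements of a PICO question."""
    population: str
    intervention: str
    comparison: str
    outcomes: tuple[str, ...]

    def __post_init__(self) -> None:
        # A PICO question is only useful when every element is specified.
        elements = (self.population, self.intervention, self.comparison)
        if not all(text.strip() for text in elements) or not self.outcomes:
            raise ValueError("every PICO element must be specified")

    def as_text(self) -> str:
        """Render the question as one structured sentence."""
        outcome_text = ", ".join(self.outcomes)
        return (
            f"In {self.population}, does {self.intervention} "
            f"compared with {self.comparison} improve {outcome_text}?"
        )


if __name__ == "__main__":
    # Illustrative wording only; the outcome chosen here is an assumption.
    question = PicoQuestion(
        population="adults with established rheumatoid arthritis",
        intervention="methotrexate",
        comparison="a TNF inhibitor",
        outcomes=("disease activity",),
    )
    print(question.as_text())
```

Requiring every field up front mirrors the point of the format: a question missing its comparison or outcome cannot anchor a systematic evidence search.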
Each PICO-based recommendation is rated as strong or conditional, for or against a specific intervention, as determined by the voting panel. Each recommendation also receives a level of evidence rating: very low, low, moderate or high.
Four factors influence the strength of a recommendation under the GRADE method. These are the quality of the evidence, the balance between intervention benefits and risks/harms, cost and variability in patient values and preferences. A weakness in one of these areas can still result in a strong recommendation if the other criteria are convincingly robust. Conversely, an intervention that is particularly impressive in one area may receive only a conditional recommendation if other factors make it less desirable.
Example: For patients with established rheumatoid arthritis and low disease activity who have never taken a disease-modifying anti-rheumatic drug (DMARD), the 2015 GRADE-based rheumatoid arthritis guidelines give a strong recommendation for DMARD therapy (preferably methotrexate) over a tumor necrosis factor (TNF) inhibitor. However, the evidence is of relatively low quality. The justification for this strong recommendation is that DMARD therapy is less expensive and has an extensive safety record with well-documented efficacy in clinical experience.3
Dr. Neogi provides a different type of example, “If something is astronomically expensive, and the benefits are only incremental above other more cost-effective treatments, you wouldn’t give that one a strong recommendation as first-line therapy, even if the randomized controlled trial showed that it was superior to placebo.”
Each recommendation also receives an evaluation in terms of the evidence quality, based on the available data. Using specific software tailored to GRADE assessment, reviewers use the GRADE criteria to assess the quality of the available evidence. This grading incorporates a number of different characteristics, including the number and specific types (randomized controlled trials, observational, etc.) of studies available, risk of bias in included studies, inconsistency of study results, indirectness of evidence and imprecision.
In other words, GRADE assessment not only considers the trial type but also the level of quality of those studies in relation to the clinical question being asked in the guideline. As Dr. Caplan explains, “In GRADE, a really well done observational study could be increased in terms of its value. In contrast, a poorly developed randomized controlled trial with major potential for bias could be downgraded in its quality.” In addition, a high-quality study might be downgraded slightly in the guideline evidence report because the questions it addresses aren’t exactly the same as those in the guideline.
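The upgrading and downgrading described here can be sketched as movement along a four-step scale. The Python example below is a deliberately simplified illustration, not the GRADE software reviewers use: it assumes the usual GRADE convention that randomized trials start at high quality and observational studies start at low, uses GRADE's usual labels (very low, low, moderate, high), and treats each concern or strength as exactly one step. The function and parameter names are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Assumed starting points: randomized trials begin high, observational studies low.
STARTING_INDEX = {"randomized": 3, "observational": 1}


def rate_evidence(study_design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return a simplified quality-of-evidence label.

    downgrades: number of serious concerns (for example risk of bias,
        inconsistency, indirectness or imprecision), one step each.
    upgrades: number of strengths that raise confidence, one step each.
    """
    if study_design not in STARTING_INDEX:
        raise ValueError(f"unknown study design: {study_design!r}")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("downgrades and upgrades must be non-negative")
    index = STARTING_INDEX[study_design] - downgrades + upgrades
    # Clamp to the scale so the result never falls outside the four labels.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]
```

In this sketch, a randomized trial with two serious concerns lands at the same level as an unadjusted observational study, which is the kind of outcome Dr. Caplan describes.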
The guidelines themselves explain in considerable detail why the voting panel decided as it did on each recommendation, giving readers a more in-depth understanding of these questions. The supplementary appendices to recent guidelines provide further information about the evidence and the specific studies used to make recommendations for each PICO question.
Awareness of the Evidence Base for ACR Guidelines
In recent years, there has been a growing push to develop standards for more rigorously developed clinical practice guidelines.4 With this has come a greater scrutiny of the evidence base for guidelines and a greater awareness that many recommendations across medical specialties are not based on evidence from randomized controlled trials, which is sometimes characterized as level A evidence.5,13-15 In general, there seems to be a push toward level A (the highest level) evidence from many areas of the medical community. Just what the evidence base means for clinical medicine on a practical level is not completely clear, although it seems obvious that greater scientific certainty is desirable.
The traditional evidence-based pyramid became well known in the 1990s as a tool to evaluate the level of evidence. Certain types of studies are given greater weight in this hierarchy, increasing from case series, to case control studies, to cohort studies. Randomized controlled trials (RCTs) are given even greater weight, with systematic reviews or meta-analyses traditionally pictured at the top of the evidence pyramid.16
Various organizations have adopted frameworks to quantify this pyramid of evidence. A hierarchy adopted by the American College of Cardiology/American Heart Association (ACC/AHA) categorizes the evidence of a recommendation into three levels. Level A grading is assigned to recommendations supported by more than one randomized clinical trial or one or more meta-analyses; level B grading is assigned to recommendations based on a single randomized trial or nonrandomized trials; level C grading is given to recommendations based on expert opinion, case studies or standard of care.17 The ACR also reported these evidence levels as described by the ACC/AHA in pre-GRADE guidelines, such as the 2012 gout guideline.
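The three ACC/AHA levels, as summarized above, work like a decision rule based on the kind and number of supporting studies. The Python sketch below encodes only that summary; the function and parameter names are hypothetical, and real grading involves judgment that a count of studies cannot capture.

```python
def acc_aha_level(randomized_trials: int, meta_analyses: int,
                  nonrandomized_studies: int) -> str:
    """Return "A", "B" or "C" following the three-level summary in the text."""
    counts = (randomized_trials, meta_analyses, nonrandomized_studies)
    if any(count < 0 for count in counts):
        raise ValueError("study counts must be non-negative")
    # Level A: more than one randomized trial, or one or more meta-analyses.
    if randomized_trials > 1 or meta_analyses >= 1:
        return "A"
    # Level B: a single randomized trial, or nonrandomized studies.
    if randomized_trials == 1 or nonrandomized_studies >= 1:
        return "B"
    # Level C: expert opinion, case studies or standard of care.
    return "C"
```

Note that this rule looks only at study type and count, never at how well a study was conducted.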
Some researchers have pointed out that limitations in the evidence base cross medical specialties and have urged improvement in the evidence used to support guideline recommendations. For example, a recent research letter in JAMA Internal Medicine critiqued the level of evidence used in compiling the ACR’s clinical practice guidelines. The authors recategorized the ACR recommendations based on ACC/AHA levels of evidence (A, B and C). In their analysis, they reported that the ACR guidelines “remain mostly expert-based but are comparable with guidelines in other subspecialties.”18
Dr. Caplan critiques the authors’ approach, explaining that they appear to have taken the GRADE evidence ratings and converted them back into the older ACC/AHA scheme. “But the two are not exactly the same, so they had to make a lot of assumptions, and it’s not exactly clear how they made those assumptions. That’s one issue.” However, he agrees with the general assessment that the guidelines are not predominantly supported by level A evidence (meaning RCTs). Dr. Caplan is coauthoring a response letter to the paper in JAMA Internal Medicine in collaboration with the current chair of the ACR Quality of Care Committee, John Fitzgerald, MD, PhD.
“These findings aren’t terribly surprising, and I don’t think they are unique to rheumatology, given a small discipline that is underfunded,” says Dr. Caplan. He points out that to conduct RCTs to address many of the questions included in guidelines would require immense amounts of funding that is often not available. He also notes it would be ethically challenging to pursue certain questions because existing effective medications might preclude a comparison with placebo, and thus only head-to-head data between agents are available.
Dr. Caplan also makes it clear that it is important to distinguish limits to the evidence base due to the state of the science from actual flaws in the guidelines themselves. “It’s not that the guidelines are flawed per se; it’s simply that they are the best available product for addressing the issue based on the limited evidence that is available. I would call a guideline flawed if it doesn’t adhere to a process or purports to make decision making in one way but really allows undue influence from pharma or allows votes in an inequitable manner. There are ways that guidelines can be flawed, but to call them flawed because the evidence base on which they’re established is limited is misleading.”
Unlike the ACC/AHA categorization of evidence used in the JAMA analysis, GRADE takes into account study quality when analyzing recommendations. Dr. Neogi points out that although RCTs provide the gold standard of evidence, they may have bias or other issues that can downgrade evidence quality. “On the other hand, a very well done observational study, a cohort study done very rigorously could be considered a slightly higher level of evidence because it dealt with all the different biases and nuances to make it more robust evidence. And the strength of the recommendation takes all that into consideration.” In other words, a trial that might technically qualify as level A evidence might not actually be as reliable as data from a study ranked as level B evidence under the ACC/AHA scheme.
Dr. Caplan thinks it is a mistake to conclude that lack of level A evidence on a given question should imply that guidelines are not desirable and should not be attempted. “That’s philosophically at odds with feedback we’ve gotten from the ACR membership. Rheumatologists and healthcare practitioners in rheumatology want guidance from experts, specifically in areas in which there isn’t clear evidence. That is the mandate of the membership and, therefore, ACR leadership. It’s the decision of the ACR to offer that kind of guidance, but to be clear and transparent about the evidence base for that.”
He notes that the ACR offers that guidance with the explicit statement that the recommendations are not as supported as would be ideal, but they are the best possible guidelines given the available evidence.
Dr. Caplan points out that potential guidelines based only on level A evidence would likely address only issues that are agreed upon and obvious to everyone. “The benefit of articulating those common sense and generally universally regarded principles may not be as useful to clinicians.” He adds, “What is difficult is getting guidance for areas for which there isn’t clear evidence and having a logic at least underlying your actions. To apply a system that requires level A evidence to every decision is not an approach that recognizes the reality of medical practice or the differences in certain subspecialties vs. general care.”
Dr. Neogi also defends the use of imperfect non-level A data in the creation of clinical guidelines.
“We have to help our rheumatology colleagues and the broader community taking care of millions of patients with rheumatic diseases; we need to be able to help them optimally manage patients they are seeing in their practices now.” In her view, in the absence of RCT data, the ACR still needs to be able to provide useful information based on the best available data to date, while being transparent about the fact that particular recommendations are not based on RCT data. She adds, “Otherwise, we’re doing millions of patients and providers a disservice by not addressing clinically important questions.”
Although RCTs can be considered the gold standard, both Dr. Caplan and Dr. Neogi believe it is possible to overstate their importance. Dr. Neogi points out that with current improved epidemiological methods, data from well-done observational studies can be more reliable than in the past. In response to the claim that the ACR’s recommendations are “mostly expert based,” Dr. Caplan responds that to some extent, all guidelines are expert based. “There is always extrapolation, and there are always experts contributing to this process.”
The relative scarcity of level A evidence is not limited to rheumatology; it affects other specialties as well. For example, the levels of evidence available to rheumatologists are roughly equivalent to the levels of evidence available for infectious disease and kidney guidelines.14-15 In other specialties, such as cardiovascular disease, this may be less of a problem, because these fields often receive more research funding.
“It’s not shocking that in a medical discipline that addresses rare diseases and has limited public funding that you’re not going to have predominantly level A evidence—and it’s not unique either,” explains Dr. Caplan. “The same could be said of all medical care—particularly subspecialty care—that you never have the evidence base that you’d like.”
Improving the Evidence Base
The broader question remains as to how the evidence base for the guidelines might be improved through greater research. Although RCTs remain unrealistic in many scenarios in rheumatology, an improvement in the number of high-quality observational studies could also help improve the evidence base. In many cases, registry data may be studied, such as were used in analyses of ankylosing spondylitis patients switching TNF-α inhibitor therapy as part of routine clinical care.19
Ideally, the process of guideline formation itself could be used to help shape the research agenda, and guidelines do usually contain recommendations for future research. But Dr. Neogi notes that companies are usually not interested in funding studies that will not influence whether or not their product will be used. She also notes that commercial companies are not typically interested in funding needed studies that may determine the optimal use of older, existing generic drugs.
She adds, “This has to be a priority for NIH and other funding sources. Yes, our hands are tied, because this is the evidence that is out there. It’s a call to action to see if we can convince funders to help us address these important questions.” She believes that the state of treatment guidelines in rheumatology regarding the level of evidence is simply reflective of what investigators, funders and industry have deemed important to study in RCTs. She notes, “Rheumatologic diseases have a big impact on people’s lives, and yet the funding by a variety of different organizations does not adequately reflect the public health burden of these diseases.”
Future Guidelines
Guidelines need to be updated as new data become available. Currently, the ACR employs a system of evaluating the medical literature every 12 months to see if a guideline update is warranted.
“When there are substantial changes in the field, that triggers a process of revising guidelines, or the components of guidelines that require revision,” explains Dr. Caplan. “For example, the axial spondyloarthritis guidelines were published in 2015, and they are already undergoing revision because there have been major changes.”
There is inevitably a lag time between when new data become available and when the guidelines can reflect those important changes. But that is partly why the guidelines are just that—guidelines—to be used in conjunction with physician expertise.
In terms of the ACR, the process of guideline development is ongoing. New guidelines for psoriatic arthritis, juvenile idiopathic arthritis and reproductive health should be available in 2018, as well as an update for axial spondyloarthritis. Updated osteoarthritis guidelines are expected to be published in spring 2019, with RA and gout updates to follow in late 2019.
Ruth Jessen Hickman, MD, is a graduate of the Indiana University School of Medicine. She is a freelance medical and science writer living in Bloomington, Ind.
References
- Woolf S, Schünemann HJ, Eccles MP, et al. Developing clinical practice guidelines: Types of evidence and outcomes; values and economics, synthesis, grading, and presentation and deriving recommendations. Implement Sci. 2012 Jul 4;7:61.
- Weisz G, Cambrosio A, Keating P, et al. The emergence of clinical practice guidelines. Milbank Q. 2007 Dec;85(4):691–727.
- Singh JA, Saag KG, Bridges SL, et al. 2015 American College of Rheumatology guideline for the treatment of rheumatoid arthritis. Arthritis Rheumatol. 2016 Jan;68(1):1–26.
- Institute of Medicine (US) Committee on Standards for Developing Trustworthy Clinical Practice Guidelines, et al. Clinical Practice Guidelines We Can Trust. Washington (DC): National Academies Press (US); 2011.
- Greenfield S. Clinical practice guidelines: Expanded use and misuse. JAMA. 2017;317(6):594–595.
- Khanna D, FitzGerald JD, Khanna PP, et al. 2012 American College of Rheumatology guidelines for management of gout part I: Systematic non-pharmacologic and pharmacologic therapeutic approaches to hyperuricemia. Arthritis Care Res (Hoboken). 2012 Oct;64(10):1431–1446.
- Hochberg MC, Altman RD, April KT, et al. American College of Rheumatology 2012 recommendations for the use of nonpharmacologic and pharmacologic therapies in osteoarthritis of the hand, hip, and knee. Arthritis Care Res (Hoboken). 2012 Apr;64(4):465–474.
- Ward MM, Deodhar A, Akl EA, et al. American College of Rheumatology/Spondylitis Association of America/Spondyloarthritis Research and Treatment Network 2015 recommendations for the treatment of ankylosing spondylitis and nonradiographic axial spondyloarthritis. Arthritis Rheumatol. 2016 Feb;68(2):282–298.
- Dejaco C, Singh YP, Perel P, et al; European League Against Rheumatism; American College of Rheumatology. 2015 Recommendations for the management of polymyalgia rheumatica: A European League Against Rheumatism/American College of Rheumatology collaborative initiative. Ann Rheum Dis. 2015 Oct;74(10):1799–1807.
- Buckley L, Guyatt G, Fink HA, et al. 2017 American College of Rheumatology guideline for the prevention and treatment of glucocorticoid-induced osteoporosis. Arthritis Rheumatol. 2017 Aug;69(8):1521–1537.
- Goodman SM, Springer B, Guyatt G, et al. 2017 American College of Rheumatology/American Association of Hip and Knee Surgeons guideline for the perioperative management of antirheumatic medication in patients with rheumatic diseases undergoing elective total hip or total knee arthroplasty. Arthritis Rheumatol. 2017 Aug;69(8):1538–1551.
- Guyatt GH, Oxman AD, Vist GE, et al. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ. 2008 Apr 26;336(7650):924–926.
- Tricoci P, Allen JM, Kramer JM, et al. Scientific evidence underlying the ACC/AHA clinical practice guidelines. JAMA. 2009 Feb 25;301(8):831–841.
- Khan AR, Khan S, Zimmerman V, et al. Quality and strength of evidence of the Infectious Diseases Society of America clinical practice guidelines. Clin Infect Dis. 2010 Nov 15;51(10):1147–1156.
- Alseiari M, Meyer KB, Wong JB. Evidence underlying KDIGO (Kidney Disease: Improving Global Outcomes) Guideline Recommendations: A systematic review. Am J Kidney Dis. 2016 Mar;67(3):417–422.
- Murad MH, Asi N, Alsawas M, Alahdab F. New evidence pyramid. Evid Based Med. 2016 Aug;21(4):125–127.
- Hunt SA, Abraham WT, Chin MH, et al. ACC/AHA 2005 guideline update for the diagnosis and management of chronic heart failure in the adult: A report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation. 2005 Sep 20;112(12):e154–e235.
- Duarte-García A, Zamore R, Wong JB. The evidence basis for the American College of Rheumatology practice guidelines. JAMA Intern Med. 2018 Jan 1;178(1):146–148.
- Glintborg B, Østergaard M, Krogh NS, et al. Clinical response, drug survival and predictors thereof in 432 ankylosing spondylitis patients after switching tumour necrosis factor α inhibitor therapy: Results from the Danish nationwide DANBIO registry. Ann Rheum Dis. 2013 Jul;72(7):1149–1155.