Effective Health Care Program

Use of Clinical Algorithms That Have the Potential To Introduce Racial/Ethnic Bias Into Healthcare Delivery

Overview

The Agency for Healthcare Research and Quality (AHRQ) is seeking information from the public on clinical algorithms that are used or recommended in medical practice and any evidence on clinical algorithms that may introduce bias into clinical decision making and/or influence access to care, quality of care, or health outcomes for racial and ethnic minorities and people who are socioeconomically disadvantaged. This request also will be posted in the Federal Register.

Background

AHRQ received a request from Congress to review the evidence on the use of race/ethnicity in clinical algorithms and the potential of algorithms to contribute to disparities in healthcare for Black, Indigenous, and other people of color.1 This effort supports AHRQ's vision and mission to produce evidence that makes healthcare equitable and to ensure the best available evidence is put into use. AHRQ's role also includes identifying and documenting healthcare disparities in the annual National Healthcare Quality and Disparities Report,2 which aligns with the current effort. AHRQ therefore intends to commission an evidence review that will critically appraise the evidence on commonly used algorithms, including whether race/ethnicity is included as an explicit variable and how algorithms have been developed and validated. The review will examine how race/ethnicity and related variables included in clinical algorithms affect healthcare use, patient outcomes, and healthcare disparities. It will also identify and assess other variables with the potential to introduce bias, such as prior utilization, and will evaluate approaches to clinical algorithm development that avoid introducing racial and ethnic bias into clinical decision making and resulting outcomes.

The Evidence-based Practice Center (EPC) Program, established in 1997, has the mission of creating evidence reviews that improve healthcare by supporting evidence-based decision making by patients, providers, and policymakers. Evidence reviews summarize and synthesize the existing literature and evidence using rigorous methods. Information gathered through this request will be used to inform an AHRQ EPC evidence review and may inform other activities commissioned by or conducted in collaboration with AHRQ.

For the purposes of this evidence review, clinical algorithms are defined as sets of steps that clinicians use to guide decision making in preventive services (such as screening), diagnosis, clinical management, or other assessments intended to improve a patient's health. Algorithms are informed by data and research evidence and may incorporate patient-specific factors or characteristics: sociodemographic factors, such as race/ethnicity; physiologic factors, such as blood sugar level; or other factors, such as patterns of healthcare utilization.

When used appropriately, algorithms can improve disease management and patient health by creating efficiencies: they spare decision makers from having to weigh multiple complex factors from scratch at each clinical judgment. As a result, the use of clinical algorithms has become widespread in healthcare and encompasses a heterogeneous set of tools, including clinical pathways/guidelines; norms and standards that may vary according to patient-specific factors; clinical decision support embedded in electronic health records (EHRs) or within medical devices; pattern recognition software used for diagnosis; and apps and calculators that predict patient risk and prognosis. A minimal sketch of such a step-based tool follows.
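To make the idea of a "set of steps" concrete, the sketch below shows a minimal rule-based follow-up pathway in Python. The thresholds, inputs, and recommendations are invented for illustration only; they are not drawn from any published guideline or from this request.

```python
# Minimal sketch of a rule-based clinical pathway. All thresholds and
# recommendations here are hypothetical, not from a published guideline.
from dataclasses import dataclass

@dataclass
class Patient:
    age: int
    systolic_bp: float  # mm Hg
    diabetic: bool

def followup_recommendation(p: Patient) -> str:
    """Walk an illustrative stepwise pathway and return a recommendation."""
    if p.systolic_bp >= 180:
        return "urgent evaluation"
    if p.systolic_bp >= 140:
        # In this sketch, a comorbidity tightens the follow-up interval.
        return "follow up in 1 month" if p.diabetic else "follow up in 3 months"
    if p.age >= 65:
        return "recheck at next annual visit"
    return "routine screening interval"

print(followup_recommendation(Patient(age=58, systolic_bp=152.0, diabetic=True)))
```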

Some clinical algorithms include information about a patient's race or ethnicity among their inputs and thus lead clinicians to decisions that vary by race/ethnicity, including decisions about how best to diagnose and manage individual patients. Several recent publications have highlighted how race- or ethnicity-based clinical algorithms used in practice may contribute to disparities in care. An article by Vyas et al.3 in the New England Journal of Medicine in August 2020 presented a partial list of algorithms in use in which a race/ethnicity adjustment or "correction" is included in calculations, spanning diverse clinical areas including cardiology, nephrology, obstetrics, and urology. For several of the algorithms described, the authors pointed out how the inclusion of race/ethnicity could direct more resources toward White patients and thus exacerbate inequities in care.

A prominent example is in kidney care,4 where an important metric of kidney function is the glomerular filtration rate (GFR). The GFR is difficult to measure directly, so it is often estimated from creatinine levels, which can be obtained from a simple blood draw. For a given creatinine level, prior studies have found that Black study participants have, on average, a higher GFR (that is, better kidney function) than White participants with the same creatinine level.5 Therefore, an algorithm widely used in kidney care adjusts the estimated GFR (eGFR) for race. Given the markedly higher rates of kidney disease and poor outcomes among Black patients compared with other groups, the concern is that this "adjustment" could be one factor contributing to delays in or lack of access to care for Black patients. Additionally, race is a social rather than a biological construct, and the heterogeneity of the Black population means that including race as a variable will introduce error into the GFR estimate for many individuals. As a result, some institutions are eliminating the adjustment for race/ethnicity when using the eGFR.6 Removing race could increase the estimated prevalence of chronic kidney disease (CKD) among Black adults from 14.9 percent to 18.4 percent and reclassify some of those with CKD as having more severe disease.6 This reclassification could have benefits and harms: increased access to care and to effective treatments that prevent disease progression, but also the potential for overdiagnosis in some patients, which could, for example, create barriers to obtaining life insurance. One solution would be to develop better biomarkers of kidney function.
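To make the role of the race coefficient concrete, the following Python sketch implements the widely used 2009 CKD-EPI creatinine equation, which multiplies the estimate by 1.159 when a patient is identified as Black. The example inputs are illustrative assumptions, not data from any cited study.

```python
# 2009 CKD-EPI creatinine equation, with and without the race coefficient.
# Example inputs below are illustrative, not drawn from the cited studies.

def ckd_epi_2009(scr_mg_dl: float, age: int, female: bool, black: bool) -> float:
    """Estimated GFR (mL/min/1.73 m^2) from serum creatinine."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    egfr = (141
            * min(scr_mg_dl / kappa, 1.0) ** alpha
            * max(scr_mg_dl / kappa, 1.0) ** -1.209
            * 0.993 ** age)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159  # the race "adjustment" at issue
    return egfr

# Same blood test, different result: the coefficient alone can move a patient
# across the eGFR-60 threshold used in staging chronic kidney disease.
print(f"with race coefficient:    {ckd_epi_2009(1.1, 55, True, True):.1f}")   # ~65.5
print(f"without race coefficient: {ckd_epi_2009(1.1, 55, True, False):.1f}")  # ~56.5
```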

In another example, Obermeyer et al.7 found that a widely used commercial prediction algorithm assigns lower health risk scores to sicker Black patients because it predicts healthcare utilization, which was lower among Black patients in the originating dataset, rather than disease severity or other measures of clinical need (a mechanism illustrated in the sketch below). Furthermore, many datasets from which algorithms are developed may not report demographic data or include adequate representation of Black, Indigenous, and other people of color from which to estimate true between-group differences or extrapolate inferences to other populations.8 These examples and others have led Hernandez-Boussard et al.9 and others10,11 to call for reporting standards and "algorithmic stewardship," in which algorithm developers and users such as health systems are held accountable for ensuring the safety, effectiveness, and fairness (absence of bias) of clinical algorithms before their implementation in clinical care. These authors recommend standards for completeness, transparency, and reporting for all data used to develop algorithms ("ground truth"). Further, analogous to determining the safety and effectiveness of other medical products, the authors note the need for rigorous clinical evaluation to measure the clinical outcomes associated with use of the algorithms.
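The label-choice mechanism that Obermeyer et al.7 describe can be illustrated with a small simulation. The sketch below assumes a synthetic population in which the same level of illness generates less healthcare cost for one group because of access barriers; every parameter is invented for illustration and none is taken from the study.

```python
# Toy simulation of label-choice bias: training a risk score on cost rather
# than clinical need penalizes a group whose access barriers suppress cost.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
group_b = rng.random(n) < 0.5                      # group facing access barriers
illness = rng.gamma(shape=2.0, scale=1.0, size=n)  # latent clinical need

# Cost reflects need *and* access: equal illness yields less utilization,
# and therefore less cost, for group B in this synthetic population.
access = np.where(group_b, 0.7, 1.0)
cost = illness * access * rng.lognormal(0.0, 0.3, size=n)

# Use cost itself as the "risk score" to isolate the label problem, then
# refer the top 3% of scores to a care-management program.
referred = cost >= np.quantile(cost, 0.97)

print("mean illness among referred, group A:", illness[referred & ~group_b].mean())
print("mean illness among referred, group B:", illness[referred & group_b].mean())
print("share of referral slots going to group B:",
      (referred & group_b).sum() / referred.sum())
```

Because the score is calibrated to cost rather than need, group B patients must be sicker to reach the referral cutoff and receive fewer referral slots, which is the pattern the authors documented.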

Although these authors9,11 focused particularly on machine learning and artificial intelligence algorithms rather than the full scope of how algorithms are developed and used in healthcare, they too noted the number and impact of clinical algorithms in use and the need to identify them, since many may not have been assessed for bias. For example, a systematic review12 examining prediction of cardiovascular disease risk in patients with type 2 diabetes found 45 prediction models, of which 12 were specifically developed for patients with type 2 diabetes.

Request for Information

The purpose of this request for information is to inform the scope of a future evidence review by helping AHRQ understand: which algorithms are currently used in different clinical settings; the type and extent of their validation; their potential for bias, with impacts on access to care, quality of care, and outcomes of care; clinicians' awareness of these issues; and strategies for developing and testing clinical algorithms to ensure that they are free of bias. We are interested in understanding which algorithms are currently in use in clinical practice, including those related to the use of clinical preventive services. How many include race/ethnicity or other factors that could introduce bias into the algorithm? We are interested in all algorithms, including clinical pathways/guidelines; norms and standards (including laboratory values) that vary according to patient-specific factors such as race/ethnicity and related variables; clinical decision support embedded in EHRs; pattern recognition software; and apps and calculators for patient risk and prognosis. We are interested in algorithms developed both through traditional methods and through newer methods, including machine learning and artificial intelligence.

AHRQ seeks information from:

  • Healthcare providers who use clinical algorithms to screen, diagnose, triage, treat, or otherwise care for patients
  • Laboratorians or technicians who use algorithms to interpret laboratory or radiology data
  • Researchers and clinical decision support developers who develop algorithms used in patient care
  • Clinical professional societies or other groups that develop clinical algorithms for healthcare
  • Payers who use clinical algorithms to guide payment decisions for patient care
  • Healthcare delivery organizations that use clinical algorithms to determine healthcare practices and policies for patients
  • Device developers who incorporate algorithms into device software to interpret data and set standards
  • Patients whose healthcare and healthcare decisions may be informed by clinical algorithms

Respondents may address one or more of the questions below. AHRQ is interested in all of the questions, but respondents are welcome to address as many or as few as they choose and to address additional areas of interest not listed.

  1. What clinical algorithms are used in clinical practice, hospitals, health systems, payment systems, or other settings? What is the estimated impact of these algorithms in terms of the size and characteristics of the populations affected, quality of care, clinical outcomes, quality of life, and health disparities?
  2. Do the algorithms in question 1 include race/ethnicity as a variable and, if so, how were race and ethnicity defined (including from whose perspective and whether there is a designation for mixed-race or multiracial individuals)?
  3. Do the algorithms in question 1 include measures of social determinants of health (SDOH) and, if so, how were these defined? Are these independently or collectively examined for their potential contribution to healthcare disparities and biases in care?
  4. For the algorithms in question 1, what evidence, data quality and types (such as claims/utilization data, clinical data, social determinants of health), and data sources were used in their development and validation? What is the sample size of the datasets used for development and validation? What is the representation of Black, Indigenous, and other people of color and what is the power to detect between-group differences? What methods were used to validate the algorithms and measure health outcomes associated with the use of the algorithms?
  5. For the algorithms in question 1, what approaches are used in updating these algorithms?
  6. Which clinical algorithms have evidence that they contribute to healthcare disparities, including decreasing access to care, decreasing quality of care, or worsening health outcomes for Black, Indigenous, and other people of color? What are the priority populations or conditions for assessing whether algorithms increase racial/ethnic disparities? What are the mechanisms by which use of algorithms contributes to poor care for Black, Indigenous, and other people of color?
  7. To what extent are users of algorithms, including clinicians, health systems, and health plans, aware of the inclusion of race/ethnicity or other variables that could introduce bias into these algorithms, and of the implications for clinical decision making? What evidence is available about the degree to which the use of clinical algorithms contributes to bias in care delivery and resulting disparities in health outcomes? To what extent are patients aware of the inclusion of race/ethnicity or other variables that can result in bias in algorithms that influence their care? Do providers or health systems communicate this information to patients in ways that can be understood?
  8. What approaches exist for identifying sources of bias and for correcting existing algorithms or developing new algorithms that may be free of bias? What evidence, data quality and types (such as claims/utilization data, clinical data, information on social determinants of health), data sources, and sample sizes are used in their development and validation? What is the impact of these new approaches and algorithms on outcomes?
  9. What challenges have arisen or can arise when algorithms are developed using traditional biomedical or physiologic factors (such as blood glucose) yet include race/ethnicity as a proxy for other factors, such as specific biomarkers or genetic information? What strategies can be used to address these challenges?
  10. What existing and developing standards (national and international) address how clinical algorithms should be developed, validated, and updated to avoid bias? Are you aware of guidance on the inclusion of race/ethnicity, related variables such as SDOH, prior utilization, or other variables to minimize the risk of bias?
  11. To what extent are users of clinical algorithms educated about how algorithms are developed or may influence their decision making? What educational curricula and training are available for clinicians that address bias in clinical algorithms?


References

  1. Letter from U.S. Senators Elizabeth Warren, Ron Wyden, and Cory A. Booker and Rep. Barbara Lee to AHRQ Director Gopal Khanna. Sept. 22, 2020.
  2. National Healthcare Quality and Disparities Report. Rockville, MD: Agency for Healthcare Research and Quality; 2019.
  3. Vyas DA, Eisenstein LG, Jones DS. Hidden in plain sight—reconsidering the use of race correction in clinical algorithms. N Engl J Med. 2020 Aug 27;383(9):874-82. doi: 10.1056/NEJMms2004740. PMID: 32853499.
  4. Powe NR. Black kidney function matters: use or misuse of race? JAMA. 2020 Aug 25;324(8):737-8. doi: 10.1001/jama.2020.13378. PMID: 32761164.
  5. Levey AS, Titan SM, Powe NR, et al. Kidney disease, race, and GFR estimation. Clin J Am Soc Nephrol. 2020 Aug 7;15(8):1203-12. doi: 10.2215/CJN.12791019. PMID: 32393465.
  6. Diao JA, Wu GJ, Taylor HA, et al. Clinical implications of removing race from estimates of kidney function. JAMA. 2021 Jan 12;325(2):184-6. doi: 10.1001/jama.2020.22124. PMID: 33263721.
  7. Obermeyer Z, Powers B, Vogeli C, et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019 Oct 25;366(6464):447-53. doi: 10.1126/science.aax2342. PMID: 31649194.
  8. Bozkurt S, Cahan EM, Seneviratne MG, et al. Reporting of demographic data and representativeness in machine learning models using electronic health records. J Am Med Inform Assoc. 2020 Dec 9;27(12):1878-84. doi: 10.1093/jamia/ocaa164. PMID: 32935131.
  9. Hernandez-Boussard T, Bozkurt S, Ioannidis JPA, et al. MINIMAR (MINimum Information for Medical AI Reporting): Developing reporting standards for artificial intelligence in health care. J Am Med Inform Assoc. 2020 Dec 9;27(12):2011-5. doi: 10.1093/jamia/ocaa088. PMID: 32594179.
  10. Eaneff S, Obermeyer Z, Butte AJ. The case for algorithmic stewardship for artificial intelligence and machine learning technologies. JAMA. 2020 Sep 14. doi: 10.1001/jama.2020.9371. PMID: 32926087.
  11. Park Y, Jackson GP, Foreman MA, et al. Evaluating artificial intelligence in medicine: phases of clinical research. JAMIA Open. 2020 Oct;3(3):326-31. doi: 10.1093/jamiaopen/ooaa033. PMID: 33215066.
  12. van Dieren S, Beulens JW, Kengne AP, et al. Prediction models for the risk of cardiovascular disease in patients with type 2 diabetes: a systematic review. Heart. 2012 Mar;98(5):360-9. doi: 10.1136/heartjnl-2011-300734. PMID: 22184101.