Hospitals Lag in Evaluating AI and Predictive Models for Accuracy and Bias, Study Finds

A new study published in Health Affairs reveals that many U.S. hospitals are falling short in their evaluation of artificial intelligence (AI) and predictive models, potentially putting patient safety and equitable treatment at risk. The research, based on data from the 2023 American Hospital Association Annual Survey Information Technology Supplement, highlights a growing digital divide between well-resourced hospitals and those serving marginalized populations.
Widespread Adoption, Limited Evaluation
According to the study, approximately 65% of surveyed U.S. hospitals reported using AI or predictive models integrated with their electronic health record (EHR) systems. Among those hospitals, the tools are employed for a range of purposes: predicting inpatient health trajectories (92%), identifying high-risk outpatients (79%), and scheduling (51%).
However, the research uncovered concerning gaps in the evaluation of these technologies:
- Only 61% of hospitals using AI or predictive models reported evaluating them for accuracy using local data.
- A mere 44% of hospitals assessed these tools for potential bias.
Paige Nong, assistant professor at the University of Minnesota's School of Public Health and the study's lead author, emphasized the implications: "By focusing on the differences in evaluation capacity among hospitals, this research highlights the risks of a growing digital divide between hospital types, which threatens equitable treatment and patient safety."
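What does a "local evaluation" involve in practice? As a minimal, hypothetical sketch (not drawn from the study), a hospital analytics team with its own outcome data might check a model's discrimination overall and across demographic subgroups; the column names and data file below are illustrative assumptions:

```python
# Hypothetical sketch: checking a risk model's accuracy and subgroup
# performance on a hospital's own data. File and column names are
# illustrative assumptions, not details from the study.
import pandas as pd
from sklearn.metrics import roc_auc_score

# Local validation set: one row per patient, with the model's predicted
# risk score, the observed outcome, and a demographic attribute.
df = pd.read_csv("local_validation_cohort.csv")
# expected columns: risk_score, outcome, race_ethnicity

# Overall discrimination on local data (the accuracy check).
overall_auc = roc_auc_score(df["outcome"], df["risk_score"])
print(f"Overall AUROC: {overall_auc:.3f}")

# Subgroup discrimination (one simple bias check): a large AUROC gap
# between groups can signal that the model performs unevenly.
# Assumes each subgroup contains both outcome classes.
for group, subset in df.groupby("race_ethnicity"):
    auc = roc_auc_score(subset["outcome"], subset["risk_score"])
    print(f"{group}: AUROC = {auc:.3f} (n={len(subset)})")
```

A real evaluation would go further (calibration, decision thresholds, choice of outcome label), but even this level of checking requires local data pipelines and statistical staff, which is precisely the capacity the study finds unevenly distributed.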
Resource Disparities and Model Sources
The study revealed significant disparities in AI adoption and evaluation based on hospital resources and location:
- Critical access hospitals, rural hospitals, and organizations serving high-Social Deprivation Index areas were less likely to use predictive models.
- Hospitals with higher profit margins and those belonging to health systems were more likely to conduct local evaluations of their AI tools.
The source of predictive models also played a role in evaluation practices:
- 79% of hospitals using predictive models obtained them from their EHR developer.
- 59% utilized tools from third-party vendors.
- 54% reported using self-developed models.
Notably, hospitals that developed their own models were more likely to evaluate them locally for accuracy and bias. This trend is "likely explained by the similarities in technical expertise required to develop models and locally evaluate them," the researchers noted.
Implications and Future Directions
The findings raise concerns about the potential for inaccurate or biased models to harm patients, particularly in hospitals with fewer resources. The researchers warn of a "rich-get-richer" effect and suggest that policies to increase information provided by model developers or targeted interventions to bolster hospitals' evaluation capacity may be necessary.
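One concrete form such developer-provided information could take, offered here as an illustrative assumption rather than a proposal from the study, is a structured "model card" shipped with each predictive tool, which a hospital could review before deployment. All field names and values in this sketch are hypothetical:

```python
# Hypothetical sketch of developer-supplied model documentation that a
# low-resource hospital could inspect without running a full local
# evaluation. Fields follow the general "model card" idea; values are
# invented for illustration.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    intended_use: str
    training_population: str                  # who the model was developed on
    reported_auroc: float                     # developer's headline accuracy
    subgroup_auroc: dict = field(default_factory=dict)   # performance by group
    known_limitations: list = field(default_factory=list)

card = ModelCard(
    name="Inpatient deterioration risk (v2.1)",
    intended_use="Flag adult inpatients for early clinical review",
    training_population="Academic medical centers, 2018-2022",
    reported_auroc=0.81,
    subgroup_auroc={"Group A": 0.83, "Group B": 0.74},
    known_limitations=["Not validated in critical access hospitals"],
)

# A simple screening rule a hospital might apply before go-live:
gap = max(card.subgroup_auroc.values()) - min(card.subgroup_auroc.values())
print(f"Largest subgroup AUROC gap: {gap:.2f}")
```

Standardized documentation like this would not replace local validation, but it could give under-resourced hospitals a low-cost first check on whether a vendor's model was built for a population resembling their own.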
As regulators develop new policies to address transparency and bias in healthcare AI, the study's authors conclude that independent hospitals with fewer resources need support to ensure the use of accurate and unbiased AI. They also note that "the growth and broad impact of providers' self-developed models that are currently outside the scope of federal regulation could warrant additional consideration."
Explore Further
What specific measures can be implemented to improve AI evaluation capabilities in hospitals with fewer resources?
How do disparities in AI adoption and evaluation between rural and urban hospitals impact patient outcomes?
What role do policymakers have in standardizing AI and predictive model evaluations to ensure equitable healthcare access?
How might the reliance on EHR developers and third-party vendors for predictive models affect the transparency and accountability of AI tools in healthcare?
What strategies can be employed to ensure that self-developed predictive models meet accuracy and bias standards without federal oversight?