TY - JOUR
T1 - Analysis of responses from artificial intelligence programs to medication-related questions derived from critical care guidelines
AU - Williams, Blake
AU - Erstad, Brian L.
N1 - Publisher Copyright:
© American Society of Health-System Pharmacists 2025. All rights reserved.
PY - 2025/10/1
Y1 - 2025/10/1
N2 - Purpose: To evaluate the recommendations given by 4 publicly available artificial intelligence (AI) programs in comparison to recommendations in current clinical practice guidelines (CPGs) focused on critically ill adults. Methods: This study evaluated 4 publicly available large language models (LLMs): ChatGPT 4.0, Microsoft Copilot, Google Gemini Version 1.5, and Meta AI. Each AI chatbot was prompted with medication-related questions related to 6 CPGs published by the Society of Critical Care Medicine (SCCM) and also asked to provide references to support its recommendations. Responses were categorized as correct, partially correct, not correct, or “other” (eg, the LLM answered a question not asked). Results: In total, 43 responses were recorded for each AI program, with a significant difference (P = 0.007) in response types by AI program. Microsoft Copilot had the highest proportion of correct recommendations, followed by Meta AI, ChatGPT 4.0, and Google Gemini. All 4 LLMs gave some incorrect recommendations, with Gemini having the most incorrect responses, followed closely by ChatGPT. Copilot had the most responses in the “other” category (n = 5, 11.63%). On average, ChatGPT provided the greatest number of references per question (n = 4.54), followed by Google Gemini (n = 3.43), Meta AI (n = 3.06), and Microsoft Copilot (n = 2.04). Conclusion: Although they showed potential for future utility to pharmacists with further development and refinement, the evaluated AI programs did not consistently give accurate medication-related recommendations for the purpose of answering clinical questions such as those pertaining to critical care CPGs.
AB - Purpose: To evaluate the recommendations given by 4 publicly available artificial intelligence (AI) programs in comparison to recommendations in current clinical practice guidelines (CPGs) focused on critically ill adults. Methods: This study evaluated 4 publicly available large language models (LLMs): ChatGPT 4.0, Microsoft Copilot, Google Gemini Version 1.5, and Meta AI. Each AI chatbot was prompted with medication-related questions related to 6 CPGs published by the Society of Critical Care Medicine (SCCM) and also asked to provide references to support its recommendations. Responses were categorized as correct, partially correct, not correct, or “other” (eg, the LLM answered a question not asked). Results: In total, 43 responses were recorded for each AI program, with a significant difference (P = 0.007) in response types by AI program. Microsoft Copilot had the highest proportion of correct recommendations, followed by Meta AI, ChatGPT 4.0, and Google Gemini. All 4 LLMs gave some incorrect recommendations, with Gemini having the most incorrect responses, followed closely by ChatGPT. Copilot had the most responses in the “other” category (n = 5, 11.63%). On average, ChatGPT provided the greatest number of references per question (n = 4.54), followed by Google Gemini (n = 3.43), Meta AI (n = 3.06), and Microsoft Copilot (n = 2.04). Conclusion: Although they showed potential for future utility to pharmacists with further development and refinement, the evaluated AI programs did not consistently give accurate medication-related recommendations for the purpose of answering clinical questions such as those pertaining to critical care CPGs.
KW - ChatGPT
KW - artificial intelligence
KW - clinical practice guidelines
KW - critical care medicine
KW - large language models
KW - medication-related questions
UR - https://www.scopus.com/pages/publications/105016832580
U2 - 10.1093/ajhp/zxaf075
DO - 10.1093/ajhp/zxaf075
M3 - Review article
C2 - 40119714
AN - SCOPUS:105016832580
SN - 1079-2082
VL - 82
SP - e842
EP - e847
JO - American Journal of Health-System Pharmacy
JF - American Journal of Health-System Pharmacy
IS - 19
ER -