Evaluation of a Generative Language Model Tool for Writing Examination Questions

Christopher J. Edwards, Brian L. Erstad

Research output: Contribution to journal › Article › peer-review


Objective: To describe an evaluation of a generative language model tool for writing examination questions for a new elective course on the interpretation of common clinical laboratory results, being developed for students in a Bachelor of Science in Pharmaceutical Sciences program. Methods: A total of 100 multiple-choice questions were generated using a publicly available large language model for a course dealing with common laboratory values. Two independent evaluators with extensive training and experience in writing multiple-choice questions evaluated each question for appropriate formatting, clarity, correctness, relevancy, and difficulty. For each question, each reviewer assigned a final dichotomous judgment: usable as written or not usable as written. Results: The major finding of this study was that a generative language model (ChatGPT 3.5) could generate multiple-choice questions for assessing common laboratory value information, but only about half of the questions (50% and 57% for the 2 evaluators) were deemed usable without modification. General agreement between evaluator comments was common (62% of comments), with more than 1 correct answer being the most common reason for commenting on the lack of usability (N = 27). Conclusion: The generally positive findings of this study suggest that the use of a generative language model tool for developing examination questions deserves further investigation.

Original language: English (US)
Article number: 100684
Journal: American journal of pharmaceutical education
Issue number: 4
State: Published - Apr 2024


  • Artificial intelligence
  • Chatbot
  • Examination questions
  • Laboratory tests
  • Pharmacy

ASJC Scopus subject areas

  • Education
  • Pharmacy
  • General Pharmacology, Toxicology and Pharmaceutics

