In this work, we leverage LLMs to produce medical qualification exam questions and their corresponding answers through few-shot prompting, and we investigate in depth how well the generated items meet requirements such as coherence, evidential support of statements, factual consistency, and professionalism. Using a multicenter, bidirectional, anonymized database of comorbid chronic diseases, the Elderly Comorbidity Medical Database (CECMed), we tasked LLMs with generating open-ended questions and answers from a subset of sampled admission reports. In CECMed, the retrospective cohort comprises patients enrolled from January 2010 to January 2022, while the prospective cohort covers January 2023 to November 2023, with participants drawn from selected tertiary and community hospitals across the southern, northern, and central regions of China. A total of 8 widely used LLMs were evaluated, including ERNIE 4, ChatGLM 4, Doubao, Hunyuan, Spark 4, and Qwen. Conventional medical education requires experienced clinicians to formulate questions and answers from prototypes in EHRs, a process that is heuristic and time-consuming. We found that mainstream LLMs can generate questions and answers from real-world EHRs at a quality close to that of clinicians. Although current LLMs perform unsatisfactorily in some aspects, medical students, interns, and residents can reasonably use them to facilitate understanding.
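As a rough illustration of the few-shot setup described above (not the exact prompts, models, or client used in the study), the sketch below assembles clinician-style example question–answer pairs and a new, de-identified admission report into a chat prompt and sends it to an OpenAI-compatible chat endpoint. The endpoint URL, model name, exemplar content, and the `build_fewshot_messages` helper are illustrative assumptions.

```python
# Minimal sketch of few-shot exam-question generation from an admission report.
# Assumptions (not from the paper): an OpenAI-compatible chat endpoint and the
# example prompts/field names shown here; swap in the actual LLM client in use.
from openai import OpenAI

# Hypothetical clinician-written exemplars used as few-shot demonstrations.
FEWSHOT_EXAMPLES = [
    {
        "report": "82-year-old male admitted with acute exacerbation of COPD; "
                  "history of type 2 diabetes and hypertension...",
        "question": "Which elements of this patient's history most likely "
                    "precipitated the current exacerbation, and how should "
                    "comorbid diabetes influence corticosteroid use?",
        "answer": "Likely precipitants include ...; systemic corticosteroids "
                  "warrant closer glucose monitoring because ...",
    },
]

SYSTEM_PROMPT = (
    "You are a senior clinician preparing medical qualification exam material. "
    "Given a de-identified admission report, write one open-ended exam question "
    "and a model answer that are coherent, evidence-based, factually consistent "
    "with the report, and professionally worded."
)

def build_fewshot_messages(new_report: str) -> list[dict]:
    """Assemble the system prompt, few-shot demonstrations, and the new report."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for ex in FEWSHOT_EXAMPLES:
        messages.append({"role": "user",
                         "content": f"Admission report:\n{ex['report']}"})
        messages.append({"role": "assistant",
                         "content": f"Question: {ex['question']}\nAnswer: {ex['answer']}"})
    messages.append({"role": "user", "content": f"Admission report:\n{new_report}"})
    return messages

if __name__ == "__main__":
    # Placeholder endpoint and model name; many of the listed LLMs expose
    # OpenAI-compatible APIs, but this is an assumption, not the study's setup.
    client = OpenAI(base_url="https://example-llm-endpoint/v1", api_key="YOUR_KEY")
    report = "79-year-old female admitted with heart failure and CKD stage 3, ..."
    resp = client.chat.completions.create(
        model="example-model-name",
        messages=build_fewshot_messages(report),
        temperature=0.7,
    )
    print(resp.choices[0].message.content)
```

The generated question–answer pairs would then be rated against criteria such as coherence, evidential support, factual consistency with the source report, and professionalism, as described above.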