Automatic differential diagnosis (DDx) is an essential medical task that generates a list of potential diseases as differentials based on patient symptom descriptions. In practice, interpreting these differential diagnoses is highly valuable but remains under-explored. Given the powerful capabilities of large language models (LLMs), we investigate using LLMs for interpretable DDx. Specifically, we curate the first DDx dataset with expert-derived interpretations for 570 clinical notes. We also propose Dual-Inf, a novel framework that enables LLMs to conduct bidirectional inference (i.e., from symptoms to diagnoses and vice versa) for DDx interpretation. Both human and automated evaluations validate its efficacy in predicting and explaining differentials across four base LLMs. In addition, Dual-Inf reduces interpretation errors and shows promise for explaining rare diseases. To the best of our knowledge, this is the first work that customizes LLMs for DDx explanation and comprehensively evaluates their interpretation performance. Overall, our study bridges a critical gap in DDx interpretation and can support clinical decision-making.