Electronic health records (EHRs) store an extensive array of patient information, encompassing medical histories, diagnoses, treatments, and test outcomes. These records are crucial for enabling healthcare providers to make well-informed decisions regarding patient care. Summarizing clinical notes further assists healthcare professionals in pinpointing potential health risks and making better-informed decisions. This process contributes to reducing errors and enhancing patient outcomes by ensuring providers have access to the most pertinent and current patient data. Recent research has shown that incorporating prompts with large language models (LLMs) substantially boosts the efficacy of summarization tasks. However, we show that this approach also leads to increased output variance, resulting in notably divergent outputs even when prompts share similar meanings. To tackle this challenge, we introduce a model-agnostic Soft Prompt-Based Calibration (SPeC) pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization. Experimental findings on multiple clinical note tasks and LLMs indicate that our method not only bolsters performance but also effectively curbs variance for various LLMs, providing a more uniform and dependable solution for summarizing vital medical information.