AI-driven medical history-taking is an important component of symptom checking, automated patient intake, triage, and other AI virtual-care applications. Because history-taking is highly varied, machine learning models require large amounts of training data. To overcome this challenge, existing systems are developed using indirect data or expert knowledge. This creates a training-inference gap, since models are trained on different kinds of data than they observe at inference time. In this work, we present a two-stage re-ranking approach that helps close the training-inference gap by re-ranking the first-stage question candidates with a dialogue-contextualized model. To this end, we propose a new model, the global re-ranker, which cross-encodes the dialogue with all candidate questions simultaneously, and compare it against several existing neural baselines. We test both transformer- and S4-based language-model backbones. We find that, relative to the expert system, the best performance is achieved by our proposed global re-ranker with a transformer backbone, yielding a 30% higher normalized discounted cumulative gain (nDCG) and a 77% higher mean average precision (mAP).
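To make the global re-ranking idea concrete, the sketch below shows how a single forward pass can score a dialogue against all first-stage candidate questions jointly, so each question's score is conditioned on the dialogue and on the other candidates. This is a minimal illustration under stated assumptions: the backbone (`bert-base-uncased`), the SEP-token pooling, and the untrained linear score head are illustrative choices, not the paper's exact configuration; in the full system the head would be trained with a ranking objective over candidates produced by the expert system.

```python
# Minimal sketch of a "global re-ranker": the dialogue and ALL first-stage
# question candidates are cross-encoded in one forward pass. Backbone,
# pooling, and the (untrained) score head are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
score_head = torch.nn.Linear(encoder.config.hidden_size, 1)  # one score per question

def global_rerank(dialogue: str, candidates: list[str]) -> list[float]:
    # Build one sequence: [CLS] dialogue [SEP] q1 [SEP] q2 [SEP] ... qN [SEP]
    text = dialogue + tokenizer.sep_token + tokenizer.sep_token.join(candidates)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state[0]  # (seq_len, hidden)

    # Pool each candidate from the [SEP] token that precedes it (assumed pooling):
    # the first N SEPs are exactly the boundaries in front of the N questions.
    sep_positions = (inputs["input_ids"][0] == tokenizer.sep_token_id).nonzero(as_tuple=True)[0]
    q_states = hidden[sep_positions[: len(candidates)]]  # (N, hidden)
    return score_head(q_states).squeeze(-1).tolist()

scores = global_rerank(
    "Patient: I have had a dry cough and mild fever for three days.",
    ["Do you have shortness of breath?", "Any recent travel?", "Do you smoke?"],
)
print(sorted(zip(scores, ["q1", "q2", "q3"]), reverse=True))
```

Because all candidates share one context window, the model can trade candidates off against each other, which a per-question cross-encoder cannot do; the cost is that the dialogue plus the full candidate list must fit in the backbone's context length.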