With the rise of telemedicine, the task of developing Dialogue Systems for Medical Diagnosis (DSMD) has received much attention in recent years. Different from early studies that relied on extra human resources and expertise to help construct the system, recent studies focus on how to build DSMD in a purely data-driven manner. However, previous data-driven DSMD methods largely overlooked system interpretability, which is critical for a medical application, and they also suffered from the data sparsity issue. In this paper, we explore how to bring interpretability to data-driven DSMD. Specifically, we propose a more interpretable decision process to implement the dialogue manager of DSMD by reasonably mimicking real doctors' inquiry logic, and we devise a model with highly transparent components to conduct the inference. Moreover, we collect a new DSMD dataset, which has a much larger scale, more diverse patterns, and is of higher quality than the existing ones. Experiments show that our method obtains absolute improvements of 7.7%, 10.0%, and 3.0% in diagnosis accuracy on three datasets, respectively, demonstrating the effectiveness of its rational decision process and model design. Our code and the GMD-12 dataset are available at https://github.com/lwgkzl/BR-Agent.