Question Answering on Electronic Health Records (EHR-QA) has a significant impact on the healthcare domain and is being actively studied. Previous research on structured EHR-QA focuses on converting natural language questions into a query language such as SQL or SPARQL (NLQ2Query), so the problem scope is limited to the data types pre-defined by the specific query language. To expand the EHR-QA task beyond this limitation, so that it can handle multi-modal medical data and solve complex inference in the future, a more primitive, systematic language is needed. In this paper, we design a program-based model (NLQ2Program) for EHR-QA as a first step in this direction. We tackle MIMICSPARQL*, a graph-based EHR-QA dataset, with a program-based approach trained in a semi-supervised manner to overcome the absence of gold programs. Without gold programs, our proposed model performs comparably to the previous state-of-the-art model, which is an NLQ2Query model (0.9% gain). In addition, to make the EHR-QA model more reliable, we apply an uncertainty decomposition method to measure the ambiguity in the input question. We empirically confirm that data uncertainty is the most indicative of ambiguity in the input question.
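To illustrate what an uncertainty decomposition can look like, the sketch below splits the total predictive uncertainty of an ensemble into a data part and a model part using entropies. This is a common ensemble-based formulation and only an assumption here: the abstract does not specify which decomposition the paper uses, and the names `entropy` and `decompose_uncertainty` are hypothetical.

```python
import math
from typing import List, Tuple


def entropy(p: List[float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def decompose_uncertainty(
    member_probs: List[List[float]],
) -> Tuple[float, float, float]:
    """Return (total, data, model) uncertainty for one input.

    member_probs[m][k] is the probability that ensemble member m
    (or stochastic forward pass m) assigns to output class k.
    Assumed decomposition, not necessarily the paper's:
      total = entropy of the averaged distribution
      data  = average entropy of each member's distribution
      model = total - data (the members' disagreement)
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean = [sum(m[i] for m in member_probs) / n for i in range(k)]
    total = entropy(mean)
    data = sum(entropy(m) for m in member_probs) / n
    model = total - data
    return total, data, model


# Members agree that the output is a coin flip: uncertainty is all "data".
print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
# Members are each confident but disagree: uncertainty is all "model".
print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Under this reading, an ambiguous question would show up as the first case: every member is individually unsure, so the data term dominates.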