Biomedical question answering aims to answer a given question from the biomedical domain. Because the task demands extensive biomedical domain knowledge, it is difficult for a model to learn that knowledge from limited training data. We propose a contextual embedding method that combines the open-domain QA model \aoa with the \biobert model pre-trained on biomedical domain data. We adopt unsupervised pre-training on a large biomedical corpus and supervised fine-tuning on a biomedical question answering dataset. Additionally, we adopt an MLP-based model weighting layer that automatically exploits the advantages of the two models to provide the correct answer. We evaluate our method on the public dataset \biomrc, which is constructed from the PubMed corpus. Experimental results show that our model outperforms the state-of-the-art system by a large margin.
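To illustrate the idea of an MLP-based weighting layer, the following is a minimal sketch and not the implementation evaluated here. It assumes that each of the two models emits one raw score per candidate answer, and that a small MLP maps the pair of per-candidate probabilities to a mixing weight. The names `WeightingMLP` and `combine`, the hidden size, and the random, untrained parameters are all hypothetical; in practice the parameters would be learned during supervised fine-tuning.

```python
import math
import random


def softmax(scores):
    """Turn raw candidate scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class WeightingMLP:
    """Hypothetical one-hidden-layer MLP producing a mixing weight in (0, 1)."""

    def __init__(self, hidden=4, seed=0):
        rng = random.Random(seed)
        # Untrained random parameters, used only to make the sketch runnable.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b2 = 0.0

    def weight(self, p_a, p_b):
        # Hidden layer over the two per-candidate probabilities.
        hidden = [
            math.tanh(row[0] * p_a + row[1] * p_b + bias)
            for row, bias in zip(self.w1, self.b1)
        ]
        # Sigmoid output: how much to trust model A for this candidate.
        return sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)


def combine(scores_a, scores_b, mlp):
    """Mix two models' candidate distributions with per-candidate MLP weights."""
    probs_a = softmax(scores_a)
    probs_b = softmax(scores_b)
    mixed = []
    for p_a, p_b in zip(probs_a, probs_b):
        w = mlp.weight(p_a, p_b)
        mixed.append(w * p_a + (1.0 - w) * p_b)
    total = sum(mixed)
    return [m / total for m in mixed]  # renormalize to sum to 1


if __name__ == "__main__":
    # Assumed scores for three candidate answers from the two models.
    mixed = combine([2.0, 0.5, -1.0], [0.1, 1.8, -0.3], WeightingMLP())
    print(mixed, "-> predicted candidate index:", mixed.index(max(mixed)))
```

Under these assumptions, the combined score of each candidate is a convex combination of the two models' probabilities, so when both models produce identical distributions the output equals that shared distribution regardless of the MLP parameters.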