Biomedical Question Answering (BQA) has attracted increasing attention in recent years because of its promising applications. It is a challenging task because biomedical questions are highly specialized and vary widely. Existing question answering methods answer all questions with a single homogeneous model, so different types of questions compete for the shared parameters, which can confuse the model's decisions for each individual question type. In this paper, to alleviate this parameter competition problem, we propose a Mixture-of-Experts (MoE) based question answering method called MoEBQA that decouples the computation for different types of questions through sparse routing. Specifically, we split a pretrained Transformer model into bottom and top blocks. The bottom blocks are shared by all examples and aim to capture general features. The top blocks are extended to an MoE version that consists of a series of independent experts, where each example is assigned to a few experts according to its underlying question type. MoEBQA learns the routing strategy automatically in an end-to-end manner, so that each expert tends to handle the question types it specializes in. We evaluate MoEBQA on three BQA datasets constructed from real examinations. The results show that our MoE extension significantly boosts the performance of question answering models and achieves new state-of-the-art performance. In addition, we analyze our MoE modules in detail to reveal how MoEBQA works, and find that it can automatically group the questions into human-readable clusters.
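The sparse routing idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the router, so the dot-product scoring, the softmax over the selected experts, the toy linear experts, and every name here (`route_top_k`, `moe_layer`, `expert_keys`, `make_linear_expert`) are assumptions made for illustration only. The sketch takes one example-level representation (standing in for the output of the shared bottom blocks), scores it against one key vector per expert, keeps the `k` best experts, and mixes their outputs.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(example_repr, expert_keys, k):
    """Pick the k experts whose (hypothetical) key vectors score highest.

    Returns a list of (expert_index, gate_weight) pairs; the gate weights
    are a softmax over the selected experts only, so they sum to 1.
    """
    scores = [dot(example_repr, key) for key in expert_keys]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    gates = softmax([scores[i] for i in top])
    return list(zip(top, gates))


def moe_layer(example_repr, expert_keys, experts, k=2):
    """Sparse MoE step: only the k routed experts are evaluated."""
    routes = route_top_k(example_repr, expert_keys, k)
    output = [0.0] * len(example_repr)
    for index, gate in routes:
        expert_out = experts[index](example_repr)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, routes


def make_linear_expert(dim, rng):
    """Toy expert: a random dim x dim linear map (assumption, for illustration)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    return lambda x: [dot(row, x) for row in weights]


# Usage: one example representation routed to 2 of 4 experts.
rng = random.Random(0)
dim, num_experts = 4, 4
expert_keys = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_experts)]
experts = [make_linear_expert(dim, rng) for _ in range(num_experts)]
example_repr = [rng.gauss(0.0, 1.0) for _ in range(dim)]
output, routes = moe_layer(example_repr, expert_keys, experts, k=2)
print("selected experts and gates:", routes)
```

In a trained model the expert keys and expert weights would be learned end-to-end together with the rest of the network; here they are random, so the sketch only shows the control flow of routing each example to a few experts.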