The recent explosion of question answering (QA) datasets and models has increased interest in models that generalize across multiple domains and formats, either by training on multiple datasets or by combining multiple models. Despite the promising results of multi-dataset models, some domains or QA formats may require specific architectures, and thus the adaptability of these models might be limited. In addition, current approaches for combining models disregard cues such as question-answer compatibility. In this work, we propose to combine expert agents with a novel, flexible, and training-efficient architecture that considers questions, answer predictions, and answer-prediction confidence scores to select the best answer among a list of answer candidates. Through quantitative and qualitative experiments, we show that our model (i) creates a collaboration between agents that outperforms previous multi-agent and multi-dataset approaches in both in-domain and out-of-domain scenarios, (ii) is highly data-efficient to train, and (iii) can be adapted to any QA format. We release our code and a dataset of answer predictions from expert agents for 16 QA datasets at https://github.com/UKPLab/MetaQA to foster future development of multi-agent systems.
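To make the selection step in the abstract concrete, the following is a minimal sketch of choosing one answer from several expert agents using the question, each predicted answer, and each agent's confidence score. It is not the released MetaQA model: the abstract describes a trained architecture, whereas the `compatibility` heuristic, the `weight` parameter, and the names `Candidate` and `select_answer` here are hypothetical stand-ins chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    agent: str         # name of the expert agent (hypothetical labels)
    answer: str        # the agent's predicted answer
    confidence: float  # the agent's own confidence score, assumed in [0, 1]


def compatibility(question: str, answer: str) -> float:
    """Toy question-answer compatibility score.

    Assumption: a hand-written rule standing in for a learned component.
    Questions starting with "how many" are taken to expect a number.
    """
    if question.lower().startswith("how many"):
        return 1.0 if any(ch.isdigit() for ch in answer) else 0.0
    return 0.5  # no opinion for other question types


def select_answer(question: str, candidates: list[Candidate],
                  weight: float = 0.5) -> Candidate:
    """Pick the candidate with the best mix of confidence and compatibility.

    `weight` (an illustrative parameter) balances the two signals.
    """
    def score(c: Candidate) -> float:
        return (weight * c.confidence
                + (1 - weight) * compatibility(question, c.answer))

    return max(candidates, key=score)


if __name__ == "__main__":
    question = "How many QA datasets are covered?"
    candidates = [
        Candidate("agent_a", "the expert agents", 0.9),
        Candidate("agent_b", "16", 0.7),
    ]
    print(select_answer(question, candidates).answer)
```

In this toy case, ranking by confidence alone would return the answer from `agent_a`, while adding the compatibility signal favors the numeric answer from `agent_b`. This is the kind of cue the abstract says existing model-combination approaches disregard.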