We propose a simple method for generating multilingual question-answer pairs at scale using a single generative model. These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target languages. Our proposed multi-task training of the generative model requires labeled training samples only in English, removing the need for such samples in the target languages and making the method applicable to far more languages than those with labeled data. Human evaluations indicate that the majority of the generated samples are grammatically correct and sensible. Experimental results show that our approach achieves large gains on the XQuAD dataset, reducing the gap between the zero-shot and supervised performance of smaller QA models across various languages.
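To make the multi-task setup concrete, below is a minimal sketch of how a single text-to-text generative model could be trained on both answer extraction and question generation from English data alone. The task prefixes and field layout are hypothetical illustrations, not the paper's actual implementation.

```python
# Hypothetical multi-task, text-to-text training format for a single
# generative model (e.g. a T5-style encoder-decoder). The "extract answer:"
# and "generate question:" prefixes are illustrative assumptions.

def make_training_pairs(context, question, answer):
    """Turn one English (context, question, answer) triple into two
    text-to-text examples: answer extraction and question generation."""
    return [
        # Task 1: given a passage, propose an answer span.
        (f"extract answer: {context}", answer),
        # Task 2: given the passage and a chosen answer, generate the question.
        (f"generate question: answer: {answer} context: {context}", question),
    ]

pairs = make_training_pairs(
    context="The Eiffel Tower is located in Paris.",
    question="Where is the Eiffel Tower located?",
    answer="Paris",
)
for source, target in pairs:
    print(source, "->", target)
```

At inference time, the same two tasks can be chained on unlabeled target-language passages: first sample an answer from a passage, then condition on that answer to generate the matching question, yielding synthetic QA pairs without any labeled target-language data.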