This paper explores a potential architectural breakthrough for multilingual learning and asks: can different tasks in different languages be modeled within a single monolithic framework (without any task- or language-specific modules)? Achieving this would not only allow systems trained in low-resource scenarios to benefit from other languages and tasks, but would also open new doors for future multilingual research. We approach this goal by developing a learning framework, Polyglot Prompt, in which prompting methods are used to learn a unified semantic space across languages and tasks after suitable multilingual prompt engineering. Experimentally, we perform a comprehensive evaluation on 6 tasks (topic classification, sentiment classification, named entity recognition, question answering, natural language inference, and summarization), 24 datasets, and 49 languages, which demonstrates the efficacy of multilingual multitask prompt training and yields several interesting observations; e.g., English prompts are polyglots: applying them directly to task samples in other languages can yield larger improvements. We also present an interpretable multilingual evaluation methodology and show how the proposed framework, multilingual multitask prompt training, works. We release all datasets prompted in the best-performing setting and will release our code soon.
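To make the "English prompts are polyglots" observation concrete, here is a minimal sketch of applying one fixed English instruction template to task samples in several languages. The template wording, function name, and sample sentences are illustrative assumptions, not the paper's actual prompts or data.

```python
# Hypothetical illustration: one English prompt template reused verbatim
# for inputs in any language (the "polyglot prompt" idea).
def english_prompt(text: str) -> str:
    # English task instruction applied unchanged to the input text,
    # regardless of the input's language.
    return f"Review: {text} Question: Is this review positive or negative?"

# Illustrative sentiment samples in three languages.
samples = {
    "en": "The movie was fantastic.",
    "fr": "Le film était fantastique.",
    "zh": "这部电影太棒了。",
}

for lang, text in samples.items():
    # The model would receive the same English instruction around each input.
    print(lang, "->", english_prompt(text))
```

The point of the sketch is only that the instruction text stays in English while the task content varies by language; the paper's finding is that such cross-lingual prompt reuse can outperform translating the prompts themselves.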