Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers (MKQA), an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). Answers are based on a heavily curated, language-independent data representation, making results comparable across languages and independent of language-specific passages. With 26 languages, this dataset supplies the widest range of languages to date for evaluating question answering. We benchmark a variety of state-of-the-art methods and baselines for generative and extractive question answering, trained on Natural Questions, in zero-shot and translation settings. Results indicate this dataset is challenging even in English, but especially in low-resource languages.