We describe our two-stage system for the Multilingual Information Access (MIA) 2022 Shared Task on Cross-Lingual Open-Retrieval Question Answering. The first stage performs multilingual passage retrieval with a hybrid dense and sparse retrieval strategy. The second stage is a reader that outputs the answer from the top passages returned by the first stage. We show the efficacy of three components: a multilingual language model pretrained with entity representations, sparse retrieval signals that aid dense retrieval, and Fusion-in-Decoder. On the development set, we obtain 43.46 F1 on XOR-TyDi QA and 21.99 F1 on MKQA, for an average F1 score of 32.73. On the test set, we obtain 40.93 F1 on XOR-TyDi QA and 22.29 F1 on MKQA, for an average F1 score of 31.61. We outperform the official baseline by more than 4 F1 points on both the development and test sets.
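As a quick arithmetic check, the reported averages are consistent with an unweighted mean of the two dataset-level F1 scores. The sketch below assumes exactly that, plus half-up rounding to two decimals; the helper name `average_f1` is hypothetical and is not part of the shared-task tooling.

```python
from decimal import Decimal, ROUND_HALF_UP


def average_f1(xor_tydi_f1: str, mkqa_f1: str) -> Decimal:
    """Unweighted mean of two dataset-level F1 scores.

    Assumption: the average is a plain mean of the two reported scores,
    rounded half-up to two decimals. Scores are passed as strings so that
    Decimal keeps them exact instead of going through binary floats.
    """
    total = Decimal(xor_tydi_f1) + Decimal(mkqa_f1)
    return (total / 2).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Scores as reported for the development and test sets.
dev_avg = average_f1("43.46", "21.99")
test_avg = average_f1("40.93", "22.29")
print(dev_avg, test_avg)
```

Decimal arithmetic is used here because the development-set mean falls exactly on a rounding boundary (32.725), where binary floating point could round either way.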