This paper presents Macquarie University's participation in the BioASQ Synergy Task and BioASQ9b Phase B. In each of these tasks, our participation focused on the use of query-focused extractive summarisation to obtain the ideal answers to medical questions. The Synergy Task is an end-to-end question answering task on COVID-19 where systems are required to return relevant documents, snippets, and answers to a given question. Given the absence of training data, we used a query-focused summarisation system that was trained with the BioASQ8b training data set, and we experimented with methods to retrieve the documents and snippets. Despite the poor quality of the documents and snippets retrieved by our system, we observed reasonably good quality in the answers returned. For Phase B of the BioASQ9b task, the relevant documents and snippets were already included in the test data. Our system split the snippets into candidate sentences and used BERT variants under a sentence classification setup. The system took the question and a candidate sentence as input and was trained to predict the likelihood of the candidate sentence being part of the ideal answer. The runs obtained either the best or second-best ROUGE-F1 results among all participants in all batches of BioASQ9b. This shows that using BERT in a classification setup is a very strong baseline for the identification of ideal answers.
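The Phase B pipeline described above (split snippets into candidate sentences, score each question and sentence pair, keep the highest-scoring sentences) can be illustrated with a minimal sketch. This is not the system used in the paper: the function names, the naive sentence splitter, the number of sentences kept (`n`), and in particular the word-overlap scorer are all assumptions made for illustration. In the actual setup the scorer would be a BERT variant trained as a sentence classifier; here a simple stand-in is used so that the example runs without any model.

```python
import re
from typing import Callable, List


def split_sentences(snippet: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", snippet.strip())
    return [p for p in parts if p]


def overlap_score(question: str, sentence: str) -> float:
    """Stand-in scorer (assumption): fraction of question words found in the sentence.

    A trained BERT classifier returning the likelihood that the sentence
    belongs to the ideal answer would take this function's place.
    """
    q_words = set(re.findall(r"\w+", question.lower()))
    s_words = set(re.findall(r"\w+", sentence.lower()))
    return len(q_words & s_words) / len(q_words) if q_words else 0.0


def extract_answer(
    question: str,
    snippets: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
    n: int = 1,
) -> str:
    """Return the n highest-scoring candidate sentences, in their original order."""
    # Every sentence of every snippet is a candidate for the answer.
    candidates = [s for snip in snippets for s in split_sentences(snip)]
    # Score each (question, candidate) pair; ties are broken by position.
    scored = [(scorer(question, c), i) for i, c in enumerate(candidates)]
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:n]
    # Restore document order so the extracted summary reads naturally.
    keep = sorted(i for _, i in top)
    return " ".join(candidates[i] for i in keep)


question = "What causes COVID-19?"
snippets = [
    "COVID-19 is caused by the SARS-CoV-2 virus. It was first reported in 2019.",
    "Vaccines reduce severe disease.",
]
print(extract_answer(question, snippets, n=1))
```

Passing the scorer as a parameter keeps the extraction logic independent of the model, so a classifier-based scorer with the same `(question, sentence) -> float` signature could be swapped in without changing the rest of the sketch.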