With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of scientific literature on the virus. Clinicians, researchers, and policy-makers need to be able to search these articles effectively. In this work, we present a zero-shot ranking algorithm that adapts to COVID-related scientific literature. Our approach filters training data from another collection down to medical-related queries, uses a neural re-ranking model pre-trained on scientific text (SciBERT), and filters the target document collection. This approach ranks top among zero-shot methods on the TREC COVID Round 1 leaderboard, and achieves a P@5 of 0.80 and an nDCG@10 of 0.68 when evaluated on both Round 1 and Round 2 judgments. Despite not relying on TREC-COVID data, our method outperforms models that do. As this is one of the first search methods to be thoroughly evaluated on COVID-19 search, we hope that it serves as a strong baseline and helps in the global crisis.
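For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how P@5 and nDCG@10 can be computed for a single query. It is an illustration only, not the evaluation code behind the reported numbers. The function names and the toy data are hypothetical, and the nDCG variant shown (linear gain, log2 rank discount, ideal ranking built from all judged documents, non-negative relevance grades) is an assumption.

```python
import math
from typing import Dict, List


def precision_at_k(ranking: List[str], qrels: Dict[str, int], k: int = 5) -> float:
    """Fraction of the top-k ranked documents judged relevant (grade > 0)."""
    top = ranking[:k]
    return sum(1 for doc_id in top if qrels.get(doc_id, 0) > 0) / k


def dcg(gains: List[int]) -> float:
    """Discounted cumulative gain with a log2 discount (rank 1 is undiscounted)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranking: List[str], qrels: Dict[str, int], k: int = 10) -> float:
    """DCG of the top-k ranking divided by the DCG of an ideal top-k ranking."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranking[:k]]  # unjudged -> 0
    ideal = sorted(qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy example (hypothetical document ids and relevance grades 0-2).
toy_qrels = {"d1": 2, "d2": 0, "d3": 1, "d4": 1, "d5": 1}
toy_ranking = ["d1", "d2", "d3", "d4", "d5"]
toy_p5 = precision_at_k(toy_ranking, toy_qrels, k=5)  # 4 of the top 5 are relevant
toy_ndcg = ndcg_at_k(toy_ranking, toy_qrels, k=10)
```

Scores for a full run are the mean of these per-query values over all topics.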