Relation extraction is a core task in natural language processing for the biomedical domain. Recent work on relation extraction has shown that prompt-based learning improves performance in both full-training-set fine-tuning and few-shot settings. However, less attention has been paid to domain-specific tasks, where good prompt design can be even harder. In this paper, we investigate prompting for biomedical relation extraction, with experiments on the ChemProt dataset. We present a simple yet effective method for systematically generating comprehensive prompts that reformulate relation extraction as a cloze-test task under a simple prompt formulation. In particular, we experiment with different ranking scores for prompt selection. With BioMed-RoBERTa-base, our results show that prompt-based fine-tuning gains 14.21 F1 over its regular fine-tuning baseline and 1.14 F1 over SciFive-Large, the current state of the art on ChemProt. We also find that prompt-based learning requires fewer training examples to make reasonable predictions. These results demonstrate the potential of our method for such domain-specific relation extraction tasks.
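The cloze-test reformulation described above can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the template wording, the verbalizer words, and the abstract `score_fn` (standing in for a masked language model's score at the `[MASK]` position) are all assumptions made for the example; only the ChemProt class IDs (CPR:3, CPR:4, CPR:9) are real label names.

```python
# Sketch of prompt-based relation extraction as a cloze task.
# Template and verbalizer words are illustrative, not the paper's choices.

TEMPLATE = "The relation between {chem} and {prot} is [MASK]."

# Verbalizer: map each candidate ChemProt label to a filler word.
VERBALIZER = {
    "CPR:3": "activation",   # upregulator/activator
    "CPR:4": "inhibition",   # downregulator/inhibitor
    "CPR:9": "substrate",    # substrate
}

def build_prompt(sentence: str, chem: str, prot: str) -> str:
    """Append the cloze template to the input sentence."""
    return sentence + " " + TEMPLATE.format(chem=chem, prot=prot)

def predict(prompt: str, score_fn) -> str:
    """Pick the label whose verbalizer word the LM scores highest.

    score_fn(prompt, word) should return the masked LM's score for
    `word` at the [MASK] position; it is left abstract here."""
    return max(VERBALIZER, key=lambda lbl: score_fn(prompt, VERBALIZER[lbl]))
```

In practice, `score_fn` would query a masked language model such as BioMed-RoBERTa-base, and prompt selection would rank candidate templates by a score of this kind, as the abstract describes.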