Simile interpretation (SI) and simile generation (SG) are challenging NLP tasks because models require adequate world knowledge to produce predictions. Previous works have relied on many hand-crafted resources to inject knowledge into models, which is time-consuming and labor-intensive. In recent years, approaches based on pre-trained language models (PLMs) have become the de facto standard in NLP, since PLMs learn generic knowledge from large corpora. The knowledge embedded in PLMs may be useful for the SI and SG tasks, yet few works have explored it. In this paper, we probe simile knowledge from PLMs to solve the SI and SG tasks within a unified framework of simile triple completion for the first time. The backbone of our framework is to construct masked sentences with manual patterns and then predict candidate words at the masked positions. Within this framework, we adopt a secondary training process (Adjective-Noun mask Training) with the masked language model (MLM) loss to enhance the diversity of the candidate words predicted at the masked positions. Moreover, pattern ensemble (PE) and pattern search (PS) are applied to improve the quality of the predicted words. Finally, automatic and human evaluations demonstrate the effectiveness of our framework on both the SI and SG tasks.
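To make the backbone concrete, here is a minimal sketch of MLM-based simile triple completion. It is not the paper's exact implementation: the choice of `bert-base-uncased`, the HuggingFace `fill-mask` pipeline, and the two example patterns are illustrative assumptions. Masking the property slot corresponds to SI; masking the vehicle slot corresponds to SG.

```python
# Hedged sketch: probe a generic PLM with a hand-written pattern and
# read off the top candidates for the masked slot. The model choice
# and patterns are assumptions, not the paper's specification.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Simile interpretation (SI): mask the shared property.
si_pattern = "The girl is as [MASK] as a rose."
for candidate in fill_mask(si_pattern, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 4))

# Simile generation (SG): mask the vehicle instead.
sg_pattern = "The girl is as beautiful as a [MASK]."
for candidate in fill_mask(sg_pattern, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 4))
```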
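The abstract also mentions pattern ensemble (PE). Below is a hedged sketch of one plausible reading: score each candidate under several hand-written patterns and average the probabilities, so that words favored by every phrasing rise to the top. The pattern wording and the uniform averaging scheme are illustrative assumptions, not the paper's specification.

```python
# Hedged sketch of pattern ensemble (PE): average each candidate's
# MLM probability over multiple patterns expressing the same simile.
from collections import defaultdict
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

patterns = [
    "The girl is as beautiful as a [MASK].",
    "The girl is beautiful like a [MASK].",
    "As beautiful as a [MASK], the girl walked in.",
]

scores = defaultdict(float)
for pattern in patterns:
    for candidate in fill_mask(pattern, top_k=50):
        scores[candidate["token_str"]] += candidate["score"] / len(patterns)

# Keep the candidates with the highest averaged probability.
for word, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(word, round(score, 4))
```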