Figurative and metaphorical language are commonplace in discourse, and figurative expressions play an important role in communication and cognition. However, figurative language has been a relatively under-studied area in NLP, and it remains an open question to what extent modern language models can interpret nonliteral phrases. To address this question, we introduce Fig-QA, a Winograd-style nonliteral language understanding task consisting of correctly interpreting paired figurative phrases with divergent meanings. We evaluate the performance of several state-of-the-art language models on this task, and find that although language models achieve performance significantly above chance, they still fall short of human performance, particularly in zero- or few-shot settings. This suggests that further work is needed to improve the nonliteral reasoning capabilities of language models.