Recent advances in language modeling have enabled new conversational systems. In particular, users of such systems often need to choose among specified options. We address the problem of reference resolution when people use natural expressions to choose between real-world entities. For example, given the choice "Should we make a Simnel cake or a Pandan cake?", a natural response from a non-expert may be indirect: "let's make the green one". Reference resolution has been little studied with natural expressions, so robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of entity pairs and utterances, and develop models for the disambiguation problem. The dataset consists of 42K indirect referring expressions across three domains and enables, for the first time, the study of how large language models can be adapted to this task. We find that they achieve 82%-87% accuracy in realistic settings, which, while reasonable, also invites further advances.