An engaging and provocative question can open up a great conversation. In this work, we explore a novel scenario: a conversational agent views a set of the user's photos (for example, from social media platforms) and asks an engaging question to initiate a conversation with the user. Existing vision-to-question models mostly generate tedious and obvious questions, which may not be ideal conversation starters. This paper introduces a two-phase framework that first generates a visual story for the photo set and then uses the story to produce an interesting question. Human evaluation shows that our framework generates more response-provoking questions for starting conversations than other vision-to-question baselines.