We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach. Leveraging recent advances in generative dialogue systems powered by large language models, Viola fetches a batch of response candidates from various neural dialogue models trained on different datasets and knowledge-grounding inputs. Additional responses originating from template-based generators are also considered, depending on the user's input and detected entities. The hand-crafted generators build on a dynamic knowledge graph injected with rich content that is crawled from the web and automatically processed on a daily basis. Viola's response ranker is a fine-tuned polyencoder that chooses the best response given the dialogue history. While dedicated annotations for the polyencoder alone can indirectly steer it away from choosing problematic responses, we add rule-based safety nets to detect neural degeneration and a dedicated classifier to filter out offensive content. We analyze conversations that Viola took part in for the Alexa Prize Socialbot Grand Challenge 4 and discuss the strengths and weaknesses of our approach. Lastly, we suggest future work with a focus on curating conversation data specifically for socialbots, which will contribute towards a more robust data-driven socialbot.
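The generate-and-rank loop described above can be illustrated with a minimal Python sketch. It is not Viola's implementation: every name in it (`generate_and_rank`, `is_degenerate`, `is_offensive`, `overlap_score`) is hypothetical, the token-overlap scorer merely stands in for the fine-tuned polyencoder, the blocklist stands in for the offensive-content classifier, and the repeated-token rule is just one plausible form a degeneration check might take.

```python
from typing import Callable, FrozenSet, List, Optional

# A generator maps the dialogue history to a candidate response (or None).
Generator = Callable[[List[str]], Optional[str]]
# A scorer rates a candidate given the dialogue history; higher is better.
Scorer = Callable[[List[str], str], float]


def is_degenerate(response: str, max_repeats: int = 3) -> bool:
    """Assumed rule: flag a response if one token repeats more than
    max_repeats times in a row."""
    tokens = response.lower().split()
    run = 1
    for prev, cur in zip(tokens, tokens[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_repeats:
            return True
    return False


def is_offensive(response: str,
                 blocklist: FrozenSet[str] = frozenset({"idiot"})) -> bool:
    """Toy stand-in for a trained offensive-content classifier."""
    return any(tok.strip(".,!?") in blocklist
               for tok in response.lower().split())


def overlap_score(history: List[str], response: str) -> float:
    """Toy stand-in for the ranker: fraction of response tokens that also
    occur in the most recent turn of the history."""
    last_turn = set(history[-1].lower().split()) if history else set()
    tokens = response.lower().split()
    if not tokens:
        return 0.0
    return sum(tok in last_turn for tok in tokens) / len(tokens)


def generate_and_rank(history: List[str],
                      generators: List[Generator],
                      scorer: Scorer = overlap_score,
                      fallback: str = "Tell me more.") -> str:
    """Collect candidates, drop unsafe ones, return the top-scored one."""
    candidates = [gen(history) for gen in generators]
    safe = [c for c in candidates
            if c and not is_degenerate(c) and not is_offensive(c)]
    if not safe:
        return fallback
    return max(safe, key=lambda c: scorer(history, c))


if __name__ == "__main__":
    demo_generators: List[Generator] = [
        lambda h: "I love jazz music too",   # plausible neural candidate
        lambda h: "no no no no no",          # degenerate repetition
        lambda h: None,                      # generator that abstains
    ]
    print(generate_and_rank(["do you like jazz music"], demo_generators))
```

Filtering before ranking keeps the safety checks independent of the ranker, so a candidate the scorer happens to favor can still be rejected. In a real system the scorer would be replaced by a learned model that sees the full dialogue history.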