Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms. Covering diverse unethical, problematic, biased, and toxic situations, ProsocialDialog contains responses that encourage prosocial behavior, grounded in commonsense social rules (i.e., rules-of-thumb, RoTs). Created via a human-AI collaborative framework, ProsocialDialog consists of 58K dialogues, with 331K utterances, 160K RoTs, and 497K dialogue safety labels accompanied by free-form rationales. With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating RoTs given conversational context, and a socially-informed dialogue agent, Prost. Empirical results show that Prost generates more socially acceptable dialogues compared to other state-of-the-art language and dialogue models in both in-domain and out-of-domain settings. Additionally, Canary effectively guides conversational agents and off-the-shelf language models to generate significantly more prosocial responses. Our work highlights the promise and importance of creating and steering conversational AI to be socially responsible.
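To make the dataset's ingredients concrete, below is a minimal, purely illustrative sketch of what a ProsocialDialog-style instance could look like. The field names and the example text are assumptions for illustration only, not the dataset's actual schema or contents; the abstract only states that instances pair dialogue turns with RoTs, safety labels, and free-form rationales.

```python
# Hypothetical sketch of one ProsocialDialog-style instance.
# Field names and content are illustrative assumptions, not the real schema.
example = {
    "context": [
        "I skipped my friend's birthday party on purpose.",  # potentially problematic user utterance
    ],
    "rots": [
        "It's hurtful to intentionally skip a friend's celebration.",  # rule-of-thumb grounding the reply
    ],
    "safety_label": "needs_caution",  # one of the dialogue safety labels
    "rationale": "Deliberately avoiding a friend's party can damage the friendship.",
    "response": "That might hurt your friend's feelings. Is there a reason you didn't want to go?",
}

# Conceptually, a safety module like Canary would read the context and produce
# the safety label and RoTs, and a prosocial agent like Prost would condition
# on the context plus RoTs to generate the response.
print(example["response"])
```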