Research on open-domain dialog systems has been greatly advanced by neural models trained on large-scale corpora; however, such corpora often introduce various safety problems (e.g., offensive language, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice. Among these safety issues, social bias is particularly complex to address, as its negative impact on marginalized populations is usually expressed implicitly and thus requires normative reasoning and rigorous analysis. In this paper, we focus our investigation on social bias detection as a dialog safety problem. We first propose a novel Dial-Bias Frame for analyzing social bias in conversations pragmatically, which supports more comprehensive bias-related analyses than simple dichotomous annotations. Based on the proposed framework, we further introduce the CDail-Bias Dataset, which is, to our knowledge, the first well-annotated Chinese social bias dialog dataset. In addition, we establish several dialog bias detection benchmarks at different label granularities and input types (utterance-level and context-level). We show that the in-depth analyses in our Dial-Bias Frame, together with these benchmarks, are essential to bias detection tasks and can benefit the construction of safe dialog systems in practice.