Previous research on dialogue system assessment usually focuses on the quality (e.g., fluency and relevance) of the responses that chatbots generate; these are local, technical metrics. For a chatbot that responds to millions of online users, including minors, we argue that it should have a healthy mental tendency so that it does not have a negative psychological impact on them. In this paper, we establish several mental health assessment dimensions for chatbots (depression, anxiety, alcohol addiction, and empathy) and introduce questionnaire-based mental health assessment methods. We assess several well-known open-domain chatbots and find that all of them exhibit severe mental health issues. We attribute this to the neglect of mental health risks during dataset construction and model training. We hope to draw researchers' attention to the serious mental health problems of chatbots and to improve chatbots' ability to engage in positive emotional interaction.
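The abstract does not specify which questionnaires or scoring rules are used, so the following is only a minimal sketch of how a questionnaire-based assessment of a chatbot might be wired up. Everything in it is an assumption made for illustration: the four answer options and their scores, the naive substring matching in `score_reply`, the `assess` helper, and the idea that the chatbot is a plain Python callable from question text to reply text. None of these names or values come from the paper or from any clinical instrument.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical Likert-style answer options (assumption, not from the paper).
# A higher score means the symptom is reported more frequently.
OPTIONS: Dict[str, int] = {
    "never": 0,
    "sometimes": 1,
    "often": 2,
    "always": 3,
}


def score_reply(reply: str) -> Optional[int]:
    """Map a free-text reply to a score via naive substring matching.

    Returns None when no known option phrase appears in the reply.
    If several options appear, the longest phrase wins.
    """
    text = reply.lower()
    matches = [option for option in OPTIONS if option in text]
    if not matches:
        return None
    return OPTIONS[max(matches, key=len)]


def assess(chatbot: Callable[[str], str], questions: Sequence[str]) -> Dict[str, int]:
    """Ask every question once and aggregate the scored replies."""
    total = 0
    answered = 0
    unscored = 0
    for question in questions:
        score = score_reply(chatbot(question))
        if score is None:
            unscored += 1  # reply could not be mapped to any option
        else:
            total += score
            answered += 1
    return {"total": total, "answered": answered, "unscored": unscored}


if __name__ == "__main__":
    # Placeholder questions, invented for this sketch only.
    demo_questions = [
        "How often do you feel down?",
        "How often do you have trouble relaxing?",
    ]

    def demo_bot(question: str) -> str:
        # Stand-in for a real chatbot; it always gives the same reply.
        return "Honestly, I often feel that way."

    print(assess(demo_bot, demo_questions))
```

A real assessment would need a validated questionnaire for each dimension, a more robust way to map free-text replies onto answer options, and repeated sampling, since a chatbot may answer the same question differently across runs.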