Online platforms and communities establish their own norms that govern what behavior is acceptable within the community. Substantial effort in NLP has focused on identifying unacceptable behaviors and, more recently, on forecasting them before they occur. However, these efforts have largely treated toxicity as the sole form of community norm violation, overlooking the much larger set of rules that moderators enforce. Here, we introduce a new dataset covering a more complete spectrum of community norms and their violations, situated in both the local conversational and the global community context. We introduce a series of models that use this data to develop context- and community-sensitive norm violation detection, showing that incorporating these contexts yields high performance.