Abuse on the Internet is a pressing societal problem of our time. Millions of Internet users face harassment, racism, personal attacks, and other types of abuse across various platforms, and the psychological effects of such abuse on individuals can be profound and lasting. Consequently, over the past few years, there has been a substantial research effort towards automated abusive language detection in the field of NLP. In this position paper, we discuss the role that modeling of users and online communities plays in abuse detection. Specifically, we review and analyze state-of-the-art methods that leverage user or community information to enhance the understanding and detection of abusive language. We then explore the ethical challenges of incorporating user and community information, laying out considerations to guide future research. Finally, we address the topic of explainability in abusive language detection, proposing properties that an explainable method should aim to exhibit. We describe how user and community information can facilitate the realization of these properties and discuss the effective operationalization of explainability in view of the properties.