The rapid growth of user-generated content on social media has driven a significant rise in demand for automated content moderation. Various methods and frameworks have been proposed for the tasks of hate speech detection and toxic comment classification. In this work, we combine common datasets to extend these tasks to brand safety. Brand safety aims to protect commercial branding by identifying contexts in which advertisements should not appear, and it covers not only toxicity but also other potentially harmful content. As these datasets contain different label sets, we approach the overall problem as a binary classification task. We demonstrate the need for brand-safety-specific datasets by applying common toxicity detection datasets to a subset of brand safety, and we empirically analyze the effects of weighted sampling strategies in text classification.
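One family of weighted sampling strategies for imbalanced binary classification is inverse class-frequency weighting, where each example's sampling weight is inversely proportional to how common its label is. The sketch below illustrates this idea; it is a generic, hypothetical example and not necessarily the scheme evaluated in this work:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Assign each example a weight inversely proportional to its
    class frequency, so rare classes are sampled as often as common
    ones. This is a generic illustration of weighted sampling, not
    the specific strategy from the paper."""
    counts = Counter(labels)
    n_examples = len(labels)
    n_classes = len(counts)
    # Each class contributes equal total weight: n / (n_classes * count).
    return [n_examples / (n_classes * counts[y]) for y in labels]

# Toy imbalanced dataset: 8 "safe" examples (0), 2 "unsafe" examples (1).
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)
# Minority-class examples receive 4x the weight of majority-class ones,
# so both classes carry equal total sampling mass (5.0 each).
```

Such per-example weights can be passed to a sampler (e.g. PyTorch's `WeightedRandomSampler`) so minibatches are drawn roughly class-balanced.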