Text moderation for user-generated content, which helps promote healthy interaction among users, has been widely studied, and many machine learning models have been proposed. In this work, we explore an alternative perspective by augmenting reactive review with proactive forecasting. Specifically, we propose a new concept, {\it text toxicity propensity}, to characterize the extent to which a text tends to attract toxic comments. We then introduce Beta regression to model this propensity probabilistically, and comprehensive experiments demonstrate that it performs well. We also propose an explanation method that communicates the model's decisions clearly. Both propensity scoring and its interpretation benefit text moderation in a novel manner. Finally, the proposed scaling mechanism for the linear model offers useful insights beyond this work.
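Beta regression is a natural choice here because a propensity score lives in the open interval (0, 1). As a rough illustration of the idea (not the paper's actual model), the sketch below fits a Beta regression by maximum likelihood on synthetic data, using a logit link for the mean and a jointly estimated precision parameter; the feature matrix `X`, coefficients `true_beta`, and precision `phi` are all made-up assumptions for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

# Synthetic data (hypothetical): features X, propensity scores y in (0, 1).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
true_beta = np.array([1.0, -0.5, 0.25])   # assumed ground-truth coefficients
phi_true = 10.0                           # assumed precision parameter
mu = expit(X @ true_beta)                 # logit link: mean in (0, 1)
y = rng.beta(mu * phi_true, (1 - mu) * phi_true)
y = np.clip(y, 1e-6, 1 - 1e-6)            # keep strictly inside (0, 1)

def neg_loglik(params):
    """Negative Beta log-likelihood, mean-precision parameterization."""
    beta, log_phi = params[:-1], params[-1]
    phi = np.exp(log_phi)                 # log transform keeps phi > 0
    mu = expit(X @ beta)
    a, b = mu * phi, (1 - mu) * phi       # shape parameters of Beta(a, b)
    ll = (gammaln(phi) - gammaln(a) - gammaln(b)
          + (a - 1) * np.log(y) + (b - 1) * np.log(1 - y))
    return -ll.sum()

res = minimize(neg_loglik, np.zeros(d + 1), method="BFGS")
beta_hat = res.x[:-1]                     # should roughly recover true_beta
```

A fitted model of this form yields, for a new text's feature vector `x`, a full Beta distribution over its toxicity propensity rather than a point estimate, which is what makes the probabilistic treatment useful for triaging moderation effort.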