The ability to quantify incivility online, in news, and in congressional debates is of great interest to political scientists. Computational tools for detecting online incivility in English are now fairly accessible and could potentially be applied more broadly. We test the Jigsaw Perspective API for its ability to detect the degree of incivility on a corpus that we developed, consisting of manual annotations of civility in American news. We demonstrate that toxicity models, as exemplified by Perspective, are inadequate for the analysis of incivility in news. We carry out an error analysis that points to the need for methods to remove spurious correlations between incivility and words often mentioned in the news, especially identity descriptors. Without such improvements, applying Perspective or similar models to news is likely to lead to wrong conclusions that are not aligned with human perception of incivility.
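For readers unfamiliar with the tool, the sketch below shows one way to obtain a Perspective TOXICITY score for a piece of text via the public `comments:analyze` REST endpoint. It is a minimal illustration, not the paper's exact scoring pipeline; the API key placeholder and the example sentence are hypothetical, and production use would add batching, rate limiting, and error handling.

```python
import requests

# Hypothetical placeholder: a real key is obtained from the Perspective API console.
API_KEY = "YOUR_API_KEY"
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

# Request a TOXICITY score for a single English sentence.
payload = {
    "comment": {"text": "Example sentence drawn from a news article."},
    "requestedAttributes": {"TOXICITY": {}},
    "languages": ["en"],
}

response = requests.post(URL, json=payload, timeout=30)
response.raise_for_status()

# The summary score is a probability-like value in [0, 1].
score = response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
print(f"Perspective TOXICITY score: {score:.3f}")
```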