Research shows that many like-minded people use popular microblogging websites to post hate speech targeting various religions and races. Automatically identifying racist and hate-promoting posts is necessary for building solutions based on social media intelligence and security informatics. However, techniques based on keyword spotting alone cannot accurately identify the intent of a post. In this paper, we address the ambiguity present in such posts by identifying the author's intent. We conduct our study on the Tumblr microblogging website and develop a cascaded ensemble learning classifier to identify posts with racist or radicalized intent. We train our model on various semantic, sentiment, and linguistic features extracted from free-form text. Our experimental results show that the proposed approach is effective and that the emotional tone, social tendencies, language cues, and personality traits of a narrative are discriminative features for identifying the racist intent behind a post.