The concept of fairness is gaining popularity in academia and industry. Social media is especially vulnerable to media biases and toxic language and comments. We propose a fair ML pipeline that takes a text as input and determines whether it contains biases and toxic content. Then, based on pre-trained word embeddings, it suggests a set of new words to substitute for the biased ones; the idea is to lessen the effects of those biases by replacing them with alternative words. We compare our approach to existing fairness models to determine its effectiveness. The results show that our proposed pipeline can detect, identify, and mitigate biases in social media data.
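The substitution step described above can be sketched as a nearest-neighbour lookup in embedding space: for each word flagged as biased, suggest the closest non-flagged vocabulary words by cosine similarity. The snippet below is a minimal illustration, not the paper's implementation; the toy embedding vectors and the flagged-word set are hypothetical placeholders standing in for pre-trained embeddings and the pipeline's bias detector.

```python
import numpy as np

# Toy "pre-trained" embeddings (hypothetical values, for illustration only).
EMBEDDINGS = {
    "terrible": np.array([0.9, 0.1, 0.0]),
    "bad":      np.array([0.8, 0.2, 0.1]),
    "poor":     np.array([0.7, 0.3, 0.1]),
    "great":    np.array([0.1, 0.9, 0.2]),
}

# Words flagged by the (hypothetical) bias/toxicity detector.
BIASED = {"terrible"}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def suggest_substitutes(word, k=2):
    """Return the k nearest non-flagged vocabulary words by cosine similarity."""
    vec = EMBEDDINGS[word]
    candidates = [(w, cosine(vec, v)) for w, v in EMBEDDINGS.items()
                  if w != word and w not in BIASED]
    candidates.sort(key=lambda wv: wv[1], reverse=True)
    return [w for w, _ in candidates[:k]]

def mitigate(text):
    """Replace each flagged word with its nearest embedding neighbour."""
    return " ".join(suggest_substitutes(w, k=1)[0] if w in BIASED else w
                    for w in text.split())

print(mitigate("terrible service"))  # the flagged word is swapped for its neighbour
```

In practice the lookup would run over real pre-trained vectors (e.g. word2vec or GloVe) with a much larger vocabulary, and the candidate list would be filtered by the detector rather than a hard-coded set.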