Recommender systems, information retrieval, and other information access systems present unique challenges for examining and applying concepts of fairness and bias mitigation in unstructured text. This paper introduces Dbias, a Python package for promoting fairness in news articles. Dbias is a trained Machine Learning (ML) pipeline that takes a text (e.g., a paragraph or news story) and detects whether it is biased. It then identifies the biased words in the text, masks them, and recommends a set of sentences in which the masked words are replaced by alternatives that are bias-free or at least less biased. We follow data science best practices to ensure that the pipeline is reproducible and usable. Our experiments show that the pipeline can be effective at mitigating bias and that it outperforms common neural network architectures in ensuring fairness in news articles.
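The four stages described above (classify, recognize biased words, mask, recommend) can be illustrated with a minimal sketch. This is a toy stand-in and not the Dbias implementation or its API: Dbias uses trained models at each stage, whereas the sketch below assumes a small hand-written lexicon. The names `NEUTRAL_ALTERNATIVES`, `DebiasResult`, `find_biased_words`, `mask_words`, `suggest`, and `debias` are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon (assumption): biased word -> less biased alternatives.
# A real system would use trained models instead of a lookup table.
NEUTRAL_ALTERNATIVES = {
    "radical": ["new", "far-reaching"],
    "disastrous": ["unsuccessful", "costly"],
    "regime": ["government", "administration"],
}

MASK = "[MASK]"


@dataclass
class DebiasResult:
    is_biased: bool
    biased_words: list
    masked_text: str
    suggestions: list


def find_biased_words(text):
    """Recognition stage: return lexicon hits in order of appearance."""
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if t.lower() in NEUTRAL_ALTERNATIVES]


def mask_words(text, words):
    """Masking stage: replace one occurrence per detected word with MASK."""
    for word in words:
        text = re.sub(rf"\b{re.escape(word)}\b", MASK, text, count=1)
    return text


def suggest(masked_text, words, max_suggestions=2):
    """Recommendation stage: fill the masks left to right with alternatives."""
    suggestions = []
    for i in range(max_suggestions):
        text = masked_text
        for word in words:
            options = NEUTRAL_ALTERNATIVES[word.lower()]
            text = text.replace(MASK, options[min(i, len(options) - 1)], 1)
        suggestions.append(text)
    return suggestions


def debias(text):
    """Run all stages; text with no lexicon hits counts as unbiased here."""
    words = find_biased_words(text)
    if not words:
        return DebiasResult(False, [], text, [])
    masked = mask_words(text, words)
    return DebiasResult(True, words, masked, suggest(masked, words))


result = debias("The regime announced a radical plan.")
print(result.masked_text)
for sentence in result.suggestions:
    print(sentence)
```

In this sketch the classification stage is reduced to "any lexicon hit means biased", which is far cruder than a trained classifier; it is meant only to make the data flow between the stages concrete.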