Software often produces biased outputs. In particular, machine learning (ML) based software is known to produce erroneous predictions when processing discriminatory inputs. Such unfair program behavior can be caused by societal bias. In the last few years, Amazon, Microsoft, and Google have provided software services that produce unfair outputs, mostly due to societal bias (e.g., gender or race). In such events, developers are saddled with the task of conducting fairness testing. Fairness testing is challenging; developers are tasked with generating discriminatory inputs that reveal and explain biases. We propose a grammar-based fairness testing approach (called ASTRAEA) that leverages context-free grammars to generate discriminatory inputs that reveal fairness violations in software systems. Using probabilistic grammars, ASTRAEA also provides fault diagnosis by isolating the cause of the observed software bias. ASTRAEA's diagnoses facilitate the improvement of ML fairness. ASTRAEA was evaluated on 18 software systems that provide three major natural language processing (NLP) services. In our evaluation, ASTRAEA generated fairness violations at a rate of ~18%: it generated over 573K discriminatory test cases and found over 102K fairness violations. Furthermore, ASTRAEA improves software fairness by ~76% via model retraining.
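To make the idea of grammar-based fairness testing concrete, the following Python sketch shows how a probabilistic context-free grammar could generate paired inputs that differ only in a protected attribute, with a metamorphic oracle flagging divergent outputs as fairness violations. The grammar rules, attribute values, and the toy model below are illustrative assumptions for this sketch, not ASTRAEA's actual grammars or implementation.

    # Minimal sketch of grammar-based generation of discriminatory input pairs.
    # All grammar rules, attribute values, and the toy model are illustrative.
    import random

    # Probabilistic context-free grammar: each non-terminal maps to a list of
    # (production, probability) pairs. <ATTR> marks the protected-attribute slot.
    GRAMMAR = {
        "<SENT>": [(["My", "<ATTR>", "friend", "<VP>", "."], 0.6),
                   (["The", "<ATTR>", "applicant", "<VP>", "."], 0.4)],
        "<VP>":   [(["wrote", "this", "review"], 0.5),
                   (["filed", "the", "complaint"], 0.5)],
    }
    PROTECTED_VALUES = ["male", "female"]  # e.g., gender as the sensitive attribute

    def expand(symbol):
        """Recursively expand a symbol, sampling productions by their probabilities."""
        if symbol not in GRAMMAR:
            return [symbol]
        productions, weights = zip(*GRAMMAR[symbol])
        chosen = random.choices(productions, weights=weights, k=1)[0]
        return [token for sym in chosen for token in expand(sym)]

    def generate_pair():
        """Derive one sentence, then instantiate it once per protected value."""
        sentence = " ".join(expand("<SENT>"))
        return [sentence.replace("<ATTR>", value) for value in PROTECTED_VALUES]

    def is_violation(model, pair):
        """Metamorphic oracle: outputs should not differ across the paired inputs."""
        return model(pair[0]) != model(pair[1])

    if __name__ == "__main__":
        # Hypothetical biased model stub, used only to demonstrate the oracle.
        toy_model = lambda text: "negative" if "female" in text else "positive"
        pair = generate_pair()
        print(pair, "-> violation" if is_violation(toy_model, pair) else "-> consistent")

Under this sketch, the production probabilities are the knobs a fault-diagnosis step could adjust: grammar tokens that appear disproportionately often in violating derivations point to the likely cause of the observed bias.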