As with other forms of AI, speech recognition has recently been examined for performance disparities across user cohorts. One approach to achieving fairness in speech recognition is to (1) identify speaker cohorts that suffer from subpar performance and (2) apply fairness mitigation measures targeting the cohorts discovered. In this paper, we report initial findings on both the discovery and the mitigation of performance disparities, using data from a product-scale AI assistant speech recognition system. We compare cohort discovery based on geographic and demographic information to a more scalable method that groups speakers without human labels, using speaker embedding technology. For fairness mitigation, we find that oversampling underrepresented cohorts, as well as modeling speaker cohort membership through additional input variables, reduces the gap between top- and bottom-performing cohorts without degrading overall recognition accuracy.
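To make the two quantities in the abstract concrete, the following is a minimal sketch of how a per-cohort error rate, the gap between the best- and worst-performing cohorts, and a simple oversampling step could be computed. The abstract does not specify the metric, the gap definition, or the oversampling scheme, so everything here is an assumption made for illustration: word error rate as the metric, an absolute difference as the gap, balancing every cohort up to the size of the largest one, and all function names (`word_errors`, `cohort_wer`, `performance_gap`, `oversample`).

```python
import random
from collections import Counter, defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1], len(ref)


def cohort_wer(utterances):
    """Aggregate WER per cohort from (cohort, reference, hypothesis) tuples."""
    errors, words = defaultdict(int), defaultdict(int)
    for cohort, reference, hypothesis in utterances:
        e, n = word_errors(reference, hypothesis)
        errors[cohort] += e
        words[cohort] += n
    return {c: errors[c] / words[c] for c in errors if words[c] > 0}


def performance_gap(wer_by_cohort):
    """Assumed gap definition: worst cohort WER minus best cohort WER."""
    return max(wer_by_cohort.values()) - min(wer_by_cohort.values())


def oversample(utterances, seed=0):
    """Assumed scheme: resample with replacement until every cohort
    matches the size of the largest cohort."""
    rng = random.Random(seed)
    by_cohort = defaultdict(list)
    for utt in utterances:
        by_cohort[utt[0]].append(utt)
    target = max(len(items) for items in by_cohort.values())
    balanced = []
    for items in by_cohort.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Toy data (invented for illustration): cohort "B" is underrepresented.
data = [
    ("A", "turn on the light", "turn on the light"),
    ("A", "play some music", "play some music"),
    ("A", "what time is it", "what time is it"),
    ("B", "set a timer", "set the timer"),
]
wer = cohort_wer(data)
gap = performance_gap(wer)
balanced = oversample(data)
counts = Counter(utt[0] for utt in balanced)
```

In practice the oversampling would be applied to training data while the per-cohort error rates and the gap are measured on held-out evaluation data; the toy example reuses one list only to keep the sketch short.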