We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and is shown to improve results across subgroups of gender and nationality without requiring subgroup annotations in the training data. An adversarial network learns a weight for each training sample in the batch, forcing the main learner to focus on poorly performing instances. Through a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method achieves 1.08% overall EER, 1.25% for male and 0.67% for female speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively. For nationality subgroups, the proposed algorithm yields 1.04% EER for US speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap between gender groups is reduced from 0.70% to 0.58%, while the standard deviation over nationality groups decreases from 0.21 to 0.19.
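To make the min-max dynamic concrete, below is a minimal toy sketch of adversarial reweighting on a simple regression task, not the paper's metric-learning setup. The adversary maintains one logit per training sample and performs gradient ascent on the weighted loss (with a small regularizer toward uniform weights, an assumption added here for stability), while the main learner performs gradient descent on the same weighted loss; all variable names and hyperparameters are illustrative.

```python
import numpy as np

# Toy adversarial reweighting (ARW) sketch: a linear learner vs. a
# per-sample weighting adversary, optimized in a min-max loop.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([1.0, -2.0])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(2)       # main learner parameters (minimizes weighted loss)
phi = np.zeros(100)   # adversary logits, one per sample (maximizes it)
n = len(y)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(500):
    resid = X @ w - y
    losses = resid ** 2                 # per-sample loss
    s = softmax(phi)
    a = s * n                           # sample weights, mean weight = 1

    # Learner: gradient descent on the weighted mean loss.
    grad_w = 2 * (a * resid) @ X / n
    w -= 0.1 * grad_w

    # Adversary: gradient ascent on the same objective w.r.t. phi,
    # with an L2 pull toward uniform weights (illustrative regularizer).
    grad_phi = n * s * (losses - (s * losses).sum()) - 0.1 * phi
    phi += 0.001 * grad_phi
```

Because the adversary upweights high-loss samples, the learner is pushed to reduce error on the hardest instances; in the well-specified toy problem above, it still recovers parameters close to `true_w` while the weights remain a valid (rescaled) distribution.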