We introduce leave-one-out unfairness, which characterizes how likely it is that a model's prediction for an individual will change due to the inclusion or removal of a single other person in the model's training data. Leave-one-out unfairness appeals to the idea that fair decisions are not arbitrary: they should not hinge on the chance event of any one person's inclusion in the training data. Leave-one-out unfairness is closely related to algorithmic stability, but it focuses on the consistency of an individual point's prediction outcome under unit changes to the training data, rather than on the model's error in aggregate. Beyond formalizing leave-one-out unfairness, we characterize the extent to which deep models exhibit it on real data, including in cases where generalization error is small. Further, we demonstrate that adversarial training and randomized smoothing techniques have opposite effects on leave-one-out fairness, which sheds light on the relationships between robustness, memorization, individual fairness, and leave-one-out fairness in deep models. Finally, we discuss salient practical applications that may be negatively affected by leave-one-out unfairness.
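To make the notion concrete, the following is a minimal sketch, assuming a scikit-learn-style workflow, of how leave-one-out instability could be estimated empirically for a single test point: retrain with one training example removed and check whether the prediction flips. The function name `loo_instability`, the logistic-regression stand-in for a deep model, and the sampling parameters are illustrative assumptions, not the paper's formal definition or implementation.

```python
# Hedged sketch: estimate how often removing a single training point changes
# the prediction for one test individual. Names and model choice are
# illustrative; the paper's definition and experiments may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression

def loo_instability(X_train, y_train, x_test, n_trials=50, seed=0):
    """Return the fraction of sampled single-point removals that change
    the model's prediction on x_test."""
    rng = np.random.default_rng(seed)
    base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    base_pred = base.predict(x_test.reshape(1, -1))[0]
    n_trials = min(n_trials, len(X_train))
    flips = 0
    for i in rng.choice(len(X_train), size=n_trials, replace=False):
        mask = np.ones(len(X_train), dtype=bool)
        mask[i] = False  # leave out training example i
        model = LogisticRegression(max_iter=1000).fit(X_train[mask], y_train[mask])
        if model.predict(x_test.reshape(1, -1))[0] != base_pred:
            flips += 1
    return flips / n_trials
```

A value near zero indicates that the prediction for this individual is stable under unit changes to the training data; larger values indicate leave-one-out unfair behavior for that point under this illustrative measure.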