The standard approach to personalization in machine learning consists of training a model with group attributes such as sex, age group, and blood type. In this work, we show that this approach to personalization fails to improve performance for all groups who provide personal data. We discuss how this effect inflicts harm in applications where models assign predictions on the basis of group membership. We propose collective preference guarantees to ensure the fair use of group attributes in prediction. We characterize how common approaches to personalization violate fair use due to failures in model development and deployment. We conduct a comprehensive empirical study of personalization in clinical prediction models. Our results highlight the prevalence of fair use violations, demonstrate actionable interventions to mitigate harm, and underscore the need to measure the gains of personalization for all groups who provide personal data.
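The check the abstract calls for, measuring the gains of personalization for each group that provides personal data, can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: the synthetic dataset, the logistic regression model, and the AUC metric are all placeholder assumptions. It trains a "personalized" model that uses a group attribute and a "generic" model that does not, then reports the per-group gain on held-out data; a negative gain for some group would be the kind of fair use violation the paper studies.

```python
# Minimal sketch (assumed setup, not the paper's method): compare a model
# trained with a group attribute against one trained without it, per group.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data: two clinical features plus a binary group attribute (e.g. sex).
n = 4000
X = rng.normal(size=(n, 2))
group = rng.integers(0, 2, size=n)
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * group
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_pers = np.column_stack([X, group])  # features + group attribute
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.3, random_state=0)

generic = LogisticRegression().fit(X[idx_train], y[idx_train])
personalized = LogisticRegression().fit(X_pers[idx_train], y[idx_train])

# Gain of personalization, reported separately for each group that provided data.
for g in (0, 1):
    mask = idx_test[group[idx_test] == g]
    auc_generic = roc_auc_score(y[mask], generic.predict_proba(X[mask])[:, 1])
    auc_pers = roc_auc_score(y[mask], personalized.predict_proba(X_pers[mask])[:, 1])
    print(f"group {g}: generic AUC={auc_generic:.3f}, "
          f"personalized AUC={auc_pers:.3f}, gain={auc_pers - auc_generic:+.3f}")
```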