Deployed supervised machine learning models make predictions that interact with and influence the world. Perdomo et al. (ICML 2020) call this phenomenon performative prediction. Understanding the influence of such predictions, and designing tools to control it, remains an ongoing challenge. We propose a theoretical framework in which the response of a target population to the deployed classifier is modeled as a function of both the classifier and the current state (distribution) of the population. We give necessary and sufficient conditions under which two retraining algorithms, repeated risk minimization and a lazier variant, converge to an equilibrium. Furthermore, the point of convergence is near an optimal classifier. We thus generalize the results of Perdomo et al., whose performativity framework does not assume any dependence on the state of the target population. A particular phenomenon captured by our model is that of distinct groups that acquire information and resources at different rates, and so differ in how quickly they can respond to the latest deployed classifier. We study this phenomenon theoretically and empirically.
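To make the stateful setting concrete, the following is a minimal toy sketch of repeated risk minimization when the population's response depends on both the deployed model and the population's previous state. It is not the paper's construction: it assumes a one-dimensional mean-estimation task with squared loss instead of a classifier, and the function names and the parameters `base`, `eps`, and `delta` are all hypothetical. Here `delta` plays the role of how quickly the population can adapt, and `eps` is the strength of the performative effect.

```python
def transition(mu_prev, theta, base=1.0, eps=0.5, delta=0.3):
    """Hypothetical stateful response map (an assumption, not from the paper).

    The population mean moves a fraction `delta` of the way from its previous
    value toward the mean it would have under a full response to `theta`.
    """
    full_response_mean = base + eps * theta
    return (1.0 - delta) * mu_prev + delta * full_response_mean


def repeated_risk_minimization(mu0=0.0, theta0=0.0, steps=200, **response_params):
    """Alternate between the population reacting and the model being retrained."""
    mu, theta = mu0, theta0
    history = []
    for _ in range(steps):
        # The population reacts to the currently deployed model and its own state.
        mu = transition(mu, theta, **response_params)
        # Retrain: under squared loss, the risk minimizer is the distribution's mean.
        theta = mu
        history.append((mu, theta))
    return history


history = repeated_risk_minimization()
mu_final, theta_final = history[-1]
print(f"population mean: {mu_final:.6f}, deployed model: {theta_final:.6f}")
```

In this toy model the iteration contracts whenever `eps < 1` and `0 < delta <= 1`, and the fixed point is `base / (1 - eps)`, where the deployed model and the population state no longer change. A smaller `delta` models a group that responds more slowly, which lowers the convergence rate without moving the equilibrium.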