Performative distribution shift describes the setting in which the choice of deployed ML model changes the data distribution. For example, a bank that uses the number of open credit lines to estimate a customer's risk of defaulting on a loan may induce customers to open more credit lines to improve their chances of approval. Because the model and the data distribution interact, finding the optimal model parameters is challenging. Prior work in this area has focused on finding stable points, which can be far from optimal. Here we introduce performative gradient descent (PerfGD), the first algorithm that provably converges to the performatively optimal point. PerfGD explicitly captures how changes in the model affect the data distribution and is simple to use. We support our findings with theory and experiments.
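To make the gap between stable and performatively optimal points concrete, the following is a minimal sketch on a toy problem. Everything in it is an assumption chosen for illustration and is not taken from the abstract: deploying theta induces z ~ N(A0 + A1 * theta, 1), the loss is l(z; theta) = theta * z, and the slope of the induced mean is estimated by a least-squares fit over past deployments. The function names and constants are hypothetical, and the estimator is a simplified stand-in rather than the authors' implementation.

```python
import numpy as np

# Hypothetical toy setup (assumed for illustration only):
# deploying theta induces z ~ N(A0 + A1 * theta, SIGMA^2), loss l(z; theta) = theta * z.
# Performative risk: PR(theta) = theta * (A0 + A1 * theta), minimized at -A0 / (2 * A1).
A0, A1, SIGMA = 1.0, 0.5, 1.0


def sample_mean(theta, rng, n=20_000):
    """Deploy theta, draw n points from the induced distribution, return their mean."""
    return rng.normal(A0 + A1 * theta, SIGMA, size=n).mean()


def repeated_gradient_descent(steps=60, lr=0.5, seed=0):
    """Ignore the shift: follow only E[dl/dtheta] = E[z]. Settles at a stable point."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    for _ in range(steps):
        theta -= lr * sample_mean(theta, rng)
    return theta


def performative_gradient_descent(steps=60, lr=0.5, seed=0):
    """Also account for how the induced mean moves with theta."""
    rng = np.random.default_rng(seed)
    thetas = [0.0, 0.2]  # two initial deployments so a slope can be fitted
    means = [sample_mean(t, rng) for t in thetas]
    for _ in range(steps):
        theta = thetas[-1]
        # Least-squares estimate of d(mean)/d(theta) from all past deployments.
        slope = np.polyfit(thetas, means, 1)[0]
        # For a Gaussian mean shift with this loss, dPR/dtheta = E[z] + theta * slope.
        grad = means[-1] + slope * theta
        thetas.append(theta - lr * grad)
        means.append(sample_mean(thetas[-1], rng))
    return thetas[-1]


theta_stable = repeated_gradient_descent()
theta_perf = performative_gradient_descent()
print(f"stable point estimate:    {theta_stable:.3f}")
print(f"performative optimum est: {theta_perf:.3f}")
```

Under these assumed constants the stable point is -A0 / A1 = -2 while the performative optimum is -A0 / (2 * A1) = -1, so the two procedures should settle near different parameters even though they observe the same kind of data.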