Algorithmic decision-making is increasingly prevalent, but often vulnerable to strategic manipulation by agents seeking a favorable outcome. Prior research has shown that classifier abstention (allowing a classifier to decline making a decision due to insufficient confidence) can significantly increase classifier accuracy. This paper studies abstention within a strategic classification context, exploring how its introduction affects strategic agents' responses and how principals should optimally leverage it. We model this interaction as a Stackelberg game in which a principal, acting as the classifier, first announces its decision policy, and strategic agents, acting as followers, then manipulate their features to receive a desired outcome. We focus on binary classifiers where agents manipulate their observable features rather than their true features, and show that optimal abstention ensures the principal's utility (or loss) is no worse than in a non-abstention setting, even in the presence of strategic agents. Beyond improving accuracy, we show that abstention can also serve as a deterrent to manipulation: it raises the cost of securing a positive outcome through manipulation, especially for less-qualified agents, whenever manipulation costs are large enough to affect agent behavior. These results highlight abstention as a valuable tool for mitigating the negative effects of strategic behavior in algorithmic decision-making systems.
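The deterrence effect described above can be illustrated with a minimal sketch. This is our own toy model, not the paper's formal setup: agents have a true score in [0, 1], gain a payoff of 1 from acceptance, and pay a linear per-unit cost to inflate their observable score. Without abstention the principal accepts at a single threshold; with abstention, acceptance requires clearing a higher bar, with scores below it falling into an abstention band.

```python
# Toy sketch of abstention as a manipulation deterrent (assumed model,
# not the paper's exact formulation). Agents with true score x gain 1
# from acceptance and pay c * (target - x) to raise their observable
# feature; they manipulate only when the net gain is positive.

def best_response(x, accept_at, cost_per_unit):
    """Inflate the observable score to the accept threshold iff it pays off."""
    if x >= accept_at:
        return x                              # already accepted; no move needed
    cost = cost_per_unit * (accept_at - x)
    return accept_at if cost < 1.0 else x     # payoff from acceptance is 1

def accepted(agents, accept_at, cost_per_unit):
    """True scores of agents whose (possibly manipulated) score is accepted."""
    return [x for x in agents
            if best_response(x, accept_at, cost_per_unit) >= accept_at]

agents = [0.30, 0.45, 0.55, 0.70]   # true scores; the last two are qualified
c = 3.0                             # per-unit manipulation cost

# No abstention: accept at 0.50. Even the 0.30 agent manipulates,
# since the cost 3.0 * 0.20 = 0.60 is below the payoff of 1.
no_abst = accepted(agents, 0.50, c)

# With abstention: acceptance requires 0.70 (scores below fall into an
# abstention band). Reaching it would cost the 0.30 agent
# 3.0 * 0.40 = 1.20 > 1, so the least-qualified agent is deterred.
with_abst = accepted(agents, 0.70, c)
```

Here `no_abst` contains all four agents, while `with_abst` excludes the least-qualified one: raising the effective acceptance bar prices the weakest manipulator out, matching the intuition that deterrence binds hardest on less-qualified agents.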