This study examines algorithmic fairness in the systems that inform tax audit selection by the United States Internal Revenue Service (IRS). While the field of algorithmic fairness has developed primarily around notions of treating like individuals alike, we instead explore the concept of vertical equity -- appropriately accounting for relevant differences across individuals -- which is a central component of fairness in many public policy settings. Applied to the design of the U.S. individual income tax system, vertical equity relates to the fair allocation of tax and enforcement burdens across taxpayers of different income levels. Through a unique collaboration with the Treasury Department and IRS, we use access to anonymized individual taxpayer microdata, risk-selected audits, and random audits from 2010-14 to study vertical equity in tax administration. In particular, we assess how the use of modern machine learning methods for selecting audits may affect vertical equity. First, we show that the use of more flexible machine learning (classification) methods -- as opposed to simpler models -- shifts audit burdens from high-income to middle-income taxpayers. Second, we show that while existing algorithmic fairness techniques can mitigate some disparities across income, they can incur a steep cost to performance. Third, we show that the choice of whether to treat risk of underreporting as a classification or regression problem is highly consequential. Moving from classification to regression models to predict underreporting shifts audit burden substantially toward high-income individuals, while increasing revenue. Last, we explore the role of differential audit cost in shaping the audit distribution. We show that a narrow focus on return-on-investment can undermine vertical equity. Our results have implications for the design of algorithmic tools across the public sector.