This paper presents a data-driven framework to improve the trustworthiness of US tax preparation software systems. Given the legal implications that bugs in such software have for its users, ensuring the compliance and trustworthiness of tax preparation software is of paramount importance. The key barriers to developing debugging aids for tax preparation systems are the unavailability of explicit specifications and the difficulty of obtaining oracles. We posit that, since US tax law adheres to the legal doctrine of precedent, the specification of the outcome of tax preparation software for an individual taxpayer must be viewed in comparison with the outcomes for individuals who are deemed similar. Consequently, these specifications are naturally available as properties of the software requiring that similar inputs produce similar outputs. Inspired by the metamorphic testing paradigm, we dub these relations metamorphic relations. In collaboration with legal and tax experts, we explicated metamorphic relations for a set of challenging properties from various US Internal Revenue Service (IRS) publications, including Publication 596 (Earned Income Tax Credit), Schedule 8812 (Qualifying Children/Other Dependents), and Form 8863 (Education Credits). We focus on an open-source tax preparation system for our case study and develop a randomized test-case generation strategy, guided by metamorphic relations, to systematically validate the correctness of tax preparation software. We further aid this test-case generation by visually explaining the behavior of the software on suspicious instances using easy-to-interpret decision-tree models. Our tool uncovered several accountability bugs of varying severity, ranging from non-robust behavior in corner cases (unreliable behavior when tax returns are close to zero) to missing eligibility conditions in the updated versions of the software.
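To make the idea of a metamorphic relation concrete, the following is a minimal Python sketch of randomized testing without an oracle. It is not the paper's tool or its actual relations: `toy_credit`, `buggy_credit`, and `find_violations` are hypothetical names, the credit formula is invented and does not reflect real tax law, and the relation checked ("a taxpayer who differs only by having one more qualifying child must not receive a smaller credit") is an illustrative assumption.

```python
import random


def toy_credit(income: int, children: int) -> int:
    """Hypothetical stand-in for the software under test (not real tax law)."""
    base = 1000 * min(children, 3)
    # Invented phase-out: lose 1 dollar of credit per 10 dollars above 20,000.
    reduction = max(0, income - 20_000) // 10
    return max(0, base - reduction)


def buggy_credit(income: int, children: int) -> int:
    """Same toy rule with a seeded bug: more than three children yields no credit."""
    if children > 3:
        return 0
    return toy_credit(income, children)


def find_violations(software, trials: int = 1000, seed: int = 0):
    """Randomly generate taxpayer pairs and check the metamorphic relation.

    No oracle for the correct credit is needed: only the outputs for two
    similar inputs are compared against each other.
    """
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        income = rng.randrange(0, 60_000)
        children = rng.randrange(0, 5)
        base_out = software(income, children)
        follow_out = software(income, children + 1)
        # Assumed relation: one more child must not lower the credit.
        if follow_out < base_out:
            violations.append((income, children, base_out, follow_out))
    return violations


if __name__ == "__main__":
    print("toy_credit violations:", len(find_violations(toy_credit)))
    print("buggy_credit violations:", len(find_violations(buggy_credit)))
```

In this sketch, every reported tuple is a pair of similar taxpayers treated inconsistently, which is the kind of suspicious instance that a follow-up explanation step (for example, a decision tree fitted to the failing inputs) could then summarize.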