As machine learning (ML) models are increasingly deployed in high-stakes applications, policymakers have proposed tighter data protection regulations (e.g., GDPR, CCPA). One key principle is the "right to be forgotten," which gives users the right to have their data deleted. Another is the right to an actionable explanation, also known as algorithmic recourse, which allows users to reverse unfavorable decisions. To date, it is unknown whether these two principles can be operationalized simultaneously. We therefore introduce and study the problem of recourse invalidation in the context of data deletion requests. Specifically, we theoretically and empirically analyze the behavior of popular state-of-the-art algorithms and demonstrate that the recourses they generate are likely to be invalidated if even a small number of data deletion requests (e.g., 1 or 2) warrant updates of the predictive model. For linear models and overparameterized neural networks, the latter studied through the lens of neural tangent kernels (NTKs), we propose a framework to identify a minimal subset of critical training points whose removal maximizes the fraction of invalidated recourses. Using this framework, we empirically show that removing as few as 2 data instances from the training set can invalidate up to 95 percent of all recourses output by popular state-of-the-art algorithms. Our work thus raises fundamental questions about the compatibility of the "right to an actionable explanation" with the "right to be forgotten," while also providing constructive insights into the factors that determine recourse robustness.
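To make the notion of recourse invalidation under data deletion concrete, the following is a minimal sketch and not the framework from the paper. It assumes a ridge-regression classifier on synthetic two-dimensional data, minimum-L2-cost recourses that move each denied point just across the decision boundary, and a brute-force greedy search for the training points whose deletion invalidates the most recourses. All function names (`fit_ridge`, `min_cost_recourse`, `greedy_critical_points`) and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge regression with a bias column; labels are in {-1, +1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def score(theta, X):
    """Linear score; a non-negative score counts as the favorable outcome."""
    return X @ theta[:-1] + theta[-1]

def min_cost_recourse(theta, X, margin=1e-3):
    """Smallest L2 move that lifts each row's score to `margin` (assumed recourse rule)."""
    w = theta[:-1]
    gap = margin - score(theta, X)
    return X + np.outer(gap, w) / (w @ w)

def invalidated_fraction(theta, recourses):
    """Share of recourses that the given model no longer classifies favorably."""
    return float(np.mean(score(theta, recourses) < 0))

def greedy_critical_points(X, y, recourses, k=2):
    """Greedily pick k training points whose deletion (followed by a refit)
    invalidates the largest fraction of recourses. Illustrative heuristic only."""
    keep = np.ones(len(X), dtype=bool)
    removed, best_frac = [], 0.0
    for _ in range(k):
        best_i, best_frac = None, -1.0
        for i in np.flatnonzero(keep):
            keep[i] = False  # tentatively delete point i
            frac = invalidated_fraction(fit_ridge(X[keep], y[keep]), recourses)
            keep[i] = True   # restore it before trying the next candidate
            if frac > best_frac:
                best_i, best_frac = i, frac
        keep[best_i] = False  # commit the most damaging deletion
        removed.append(int(best_i))
    return removed, best_frac

# Synthetic data: two Gaussian blobs (an assumption, not data from the paper).
rng = np.random.default_rng(0)
n = 60
X = np.vstack([rng.normal(-1.0, 1.0, (n // 2, 2)),
               rng.normal(1.0, 1.0, (n // 2, 2))])
y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])

theta = fit_ridge(X, y)
denied = X[score(theta, X) < 0]               # points with an unfavorable decision
recourses = min_cost_recourse(theta, denied)  # valid under the original model
removed, frac = greedy_critical_points(X, y, recourses, k=2)
print(f"deleted training indices {removed}; invalidated fraction {frac:.2f}")
```

Because minimum-cost recourses sit very close to the decision boundary, even a slight shift of the boundary after retraining can flip many of them at once. The fraction this toy setup reports depends on the synthetic data and the margin, and should not be read as a reproduction of the paper's results.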