Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms and their functioning, often interpreted as obligations to "explain". Many researchers suggest using post-hoc explanation algorithms for this purpose. In this paper, we combine legal, philosophical, and technical arguments to show that post-hoc explanation algorithms are unsuitable for achieving the law's objectives. Indeed, most situations where explanations are requested are adversarial, meaning that the explanation provider and receiver have opposing interests and incentives, so that the provider might manipulate the explanation for her own ends. We show that this fundamental conflict cannot be resolved because of the high degree of ambiguity of post-hoc explanations in realistic application scenarios. As a consequence, post-hoc explanation algorithms are unsuitable for achieving the transparency objectives inherent to the legal norms. Instead, there is a need to discuss more explicitly the objectives underlying "explainability" obligations, as these can often be better achieved through other mechanisms. There is an urgent need for a more open and honest discussion regarding the potential and limitations of post-hoc explanations in adversarial contexts, in particular in light of the current negotiations of the European Union's draft Artificial Intelligence Act.