This paper raises empirical concerns about post-hoc explanations of black-box ML models, one of the major trends in AI explainability (XAI), by showing their limited interpretability and their societal consequences. Using a representative consumer panel to test our assumptions, we report three main findings. First, we show that post-hoc explanations of black-box models tend to give partial and biased information on the underlying mechanism of the algorithm and can lend themselves to manipulation or information withholding by diverting users' attention. Second, we show the importance of tested behavioral indicators, in addition to self-reported perceived indicators, in providing a more comprehensive view of the dimensions of interpretability. This paper contributes to shedding new light on the ongoing theoretical debate between intrinsically transparent AI models and post-hoc explanations of black-box complex models, a debate which is likely to play a highly influential role in the future development and operationalization of AI systems.