It is often argued that one goal of explaining automated decision systems (ADS) is to foster positive user perceptions (e.g., of fairness or trustworthiness) of such systems. This viewpoint, however, implicitly assumes that a given ADS is fair and trustworthy to begin with. If the ADS issues unfair outcomes, one might instead expect that explanations of the system's workings will reveal its shortcomings and, hence, lead to a decrease in fairness perceptions. Consequently, we suggest that it is more meaningful to evaluate explanations by their effectiveness in enabling people to appropriately assess the quality (e.g., fairness) of the associated ADS. We argue that, for an effective explanation, perceptions of fairness should increase if and only if the underlying ADS is fair. In this work in progress, we introduce the desideratum of appropriate fairness perceptions, propose a novel study design for evaluating it, and outline next steps towards a comprehensive experiment.