Social Explainable AI (SAI) is a new direction in artificial intelligence that emphasises decentralisation, transparency, social context, and a focus on human users. SAI research is still at an early stage. Consequently, it concentrates on delivering the intended functionalities, but largely ignores the possibility of unwelcome behaviour caused by malicious or erroneous activity. We propose that, in order to capture the breadth of relevant aspects, one can use models and logics of strategic ability that have been developed for multi-agent systems. Using the STV model checker, we take the first step towards the formal modelling and verification of SAI environments, in particular of their resistance to various types of attacks by compromised AI modules.
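As a minimal illustration of what logics of strategic ability can express (the coalition name and atomic proposition below are hypothetical, not taken from the paper), a property in alternating-time temporal logic (ATL) might state that a compromised AI module has no strategy to eventually cause a data leak:

\[
\neg \langle\langle \mathit{Attacker} \rangle\rangle\, \mathrm{F}\, \mathit{leak}
\]

Here \(\langle\langle A \rangle\rangle\) reads "coalition \(A\) has a strategy to ensure that", and \(\mathrm{F}\) is the temporal operator "eventually"; model checkers such as STV verify formulas of this kind against models of the system.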