Misalignment in Multi-Agent Systems (MAS) is frequently treated as a technical failure. Yet misalignment can also originate in the conceptual design phase, where semantic ambiguity and normative projection take hold. The Rabbit-Duck illusion illustrates how perspective-dependent readings of agent behavior, such as the conflation of cooperation and coordination, can create epistemic instability; for example, coordinated agents in cooperative Multi-Agent Reinforcement Learning (MARL) benchmarks may be read as morally aligned even though they are optimized solely for shared-utility maximization. Motivated by three drivers of meaning-level misalignment in MAS (coordination-cooperation ambiguity, conceptual fluctuation, and semantic instability), we introduce the Misalignment Mosaic: a framework for diagnosing how misalignment emerges through language, framing, and design assumptions. The Mosaic comprises four components: (1) Terminological Inconsistency, (2) Interpretive Ambiguity, (3) Concept-to-Code Decay, and (4) Morality as Cooperation. Building on insights from Morality-as-Cooperation Theory, we call for consistent meaning-level grounding in MAS to ensure that systems function as intended, both technically and ethically. This need is particularly urgent as MAS principles influence broader Artificial Intelligence (AI) workflows, amplifying risks to trust, interpretability, and governance. While this work focuses on the coordination-cooperation ambiguity, the Mosaic generalizes to other overloaded terms, such as alignment, autonomy, and trust.