Pragmatics studies how context contributes to the meaning of language. In human communication, language is never interpreted out of context, and sentences usually convey more information than their literal meanings. However, this mechanism is missing in most multi-agent systems, which restricts both communication efficiency and the capability for human-agent interaction. In this paper, we propose an algorithm with which agents can spontaneously learn to "read between the lines" without any explicit hand-designed rules. We integrate theory of mind (ToM) into a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol. ToM is a profound concept from cognitive science, which holds that people regularly reason about others' mental states, including beliefs, goals, and intentions, to gain a performance advantage in competition, cooperation, or coalition. With this ability, agents treat language not only as messages but also as rational acts that reflect others' hidden states. Our experiments demonstrate the advantage of pragmatic protocols over non-pragmatic protocols. We also show that the teaching complexity under the pragmatic protocol empirically approximates the recursive teaching dimension (RTD).
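The idea that a message carries more than its literal meaning once the listener treats it as a rational act can be illustrated with a toy reference game. The sketch below is not the learned protocol proposed in this paper; it swaps in a hand-written recursion in the style of the Rational Speech Acts model, and the objects, messages, and function names are all assumptions made for illustration.

```python
# Toy reference game. Objects, messages, and the fixed reasoning recursion
# are illustrative assumptions, not the paper's learned RL protocol.
OBJECTS = ["blue_square", "blue_circle", "green_square"]
MESSAGES = ["blue", "green", "square", "circle"]


def is_true(message, obj):
    """Literal semantics: a message is true of an object if it names one of its features."""
    return message in obj.split("_")


def normalize(weights):
    """Scale non-negative weights so they sum to 1 (all zeros if the total is 0)."""
    total = sum(weights.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in weights.items()}


def literal_listener(message):
    """Non-pragmatic listener: uniform over the objects the message is literally true of."""
    return normalize({o: 1.0 if is_true(message, o) else 0.0 for o in OBJECTS})


def speaker(obj):
    """Speaker who prefers messages that lead a literal listener to the intended object."""
    return normalize({m: literal_listener(m)[obj] for m in MESSAGES})


def pragmatic_listener(message):
    """Listener who treats the message as a rational act and inverts the speaker (uniform prior)."""
    return normalize({o: speaker(o)[message] for o in OBJECTS})


if __name__ == "__main__":
    print("literal:  ", literal_listener("blue"))
    print("pragmatic:", pragmatic_listener("blue"))
```

Hearing "blue", the literal listener splits its belief evenly between the blue square and the blue circle. The pragmatic listener assigns about 0.6 to the blue square, because a speaker who meant the blue circle would more likely have said "circle", which identifies it uniquely.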