Teamwork is a set of interrelated reasoning, actions, and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their policies for smooth coordination. Most literature in human-agent teaming builds agents that reference a learned human model. Though these agents are guaranteed to perform well with the learned model, they place strong assumptions on human policy, such as optimality and consistency, that are unlikely to hold in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in a human-model-free setting for a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research has shown complementary policies in the TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy. Therefore, rather than learning human models from human data, we use an adaptation strategy over a pre-trained library of exemplar policies, composed of RL algorithms or rule-based methods, with minimal assumptions about human behavior. The adaptation strategy relies on a novel similarity metric to infer the human's policy and then selects the most complementary policy in our library to maximize team performance. The adaptive agent architecture can be deployed in real time and generalizes to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.
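As a rough illustration of the adaptation strategy described above (infer the human's policy with a similarity metric, then select its most complementary teammate from the library), here is a minimal Python sketch. It is not the paper's implementation: the novel similarity metric is unspecified here, so the sketch substitutes an average action log-likelihood as a stand-in, and all names (`ExemplarPolicy`, `infer_human_policy`, `complements`) are hypothetical.

```python
import numpy as np


class ExemplarPolicy:
    """A pre-trained exemplar policy (RL or rule-based) over discrete actions."""

    def __init__(self, name, action_probs_fn):
        self.name = name
        # action_probs_fn: maps a state to a probability vector over actions.
        self.action_probs_fn = action_probs_fn

    def similarity(self, trajectory):
        """Stand-in similarity metric: mean log-probability this policy
        assigns to the human's observed (state, action) pairs."""
        logps = [np.log(self.action_probs_fn(s)[a] + 1e-8) for s, a in trajectory]
        return float(np.mean(logps))


def infer_human_policy(trajectory, library):
    """Return the exemplar whose behavior best matches the observed human."""
    return max(library, key=lambda p: p.similarity(trajectory))


def select_teammate(trajectory, library, complements):
    """One adaptation step: infer the human's policy from recent behavior,
    then deploy the exemplar pre-identified as its best complement."""
    inferred = infer_human_policy(trajectory, library)
    return complements[inferred.name]


# Hypothetical usage with two toy exemplars over three discrete actions.
aggressive = ExemplarPolicy("aggressive", lambda s: np.array([0.7, 0.2, 0.1]))
defensive = ExemplarPolicy("defensive", lambda s: np.array([0.1, 0.2, 0.7]))
library = [aggressive, defensive]
# Complementarity pairings would come from offline agent-agent evaluation.
complements = {"aggressive": defensive, "defensive": aggressive}

# A short window of observed human (state, action) pairs; states are unused
# by the toy policies above but would condition a real policy.
human_window = [(None, 0), (None, 0), (None, 1)]
teammate = select_teammate(human_window, library, complements)
print(teammate.name)  # -> "defensive"
```

Because the selection step only queries static, pre-trained policies, this kind of loop can run online at interaction time, consistent with the real-time deployment claim above.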