Existing approaches to coalition formation often assume that the requirements associated with tasks are precisely specified by the human operator. However, prior work has demonstrated that humans, while extremely adept at solving complex problems, struggle to explicitly state their solution strategies. In this work, we propose a framework to learn implicit task requirements directly from expert demonstrations of coalition formation. We also account for the fact that demonstrators may use different, equally valid solutions to the same task. Specifically, we contribute a framework to model and infer such heterogeneous strategies for coalition formation. Next, we develop a resource-aware approach to generalize the inferred strategies to new teams without requiring additional training. To this end, we formulate and solve a constrained optimization problem that simultaneously selects the most appropriate strategy for a given target team and optimizes the composition of its coalitions accordingly. We evaluate our approach against several baselines, including some that resemble existing approaches, using detailed numerical simulations, StarCraft II battles, and a multi-robot emergency response scenario. Our results indicate that our framework consistently outperforms all baselines in terms of requirement satisfaction, resource utilization, and task success rates.