Learning Coordination Policies over Heterogeneous Graphs for Human-Robot Teams via Recurrent Neural Schedule Propagation

As human-robot collaboration increases in the workforce, it becomes essential for human-robot teams to coordinate efficiently and intuitively. Traditional approaches to human-robot scheduling rely either on exact methods, which are intractable for large-scale problems and struggle to account for stochastic, time-varying human task performance, or on application-specific heuristics, which require expert domain knowledge to develop. We propose a deep learning-based framework, called HybridNet, that combines a heterogeneous graph-based encoder with a recurrent schedule propagator for scheduling stochastic human-robot teams under upper- and lower-bound temporal constraints. HybridNet's encoder leverages Heterogeneous Graph Attention Networks to model the initial environment and team dynamics while accounting for the constraints. By formulating task scheduling as a sequential decision-making process, HybridNet's recurrent neural schedule propagator uses Long Short-Term Memory (LSTM) models to propagate the consequences of actions forward, enabling fast schedule generation and removing the need to interact with the environment between every task-agent pair selection. The resulting scheduling policy network is a computationally lightweight yet highly expressive model that is end-to-end trainable via Reinforcement Learning algorithms. We develop a virtual task scheduling environment for mixed human-robot teams in a multi-round setting, capable of modeling the stochastic learning behaviors of human workers. Experimental results showed that HybridNet outperformed other human-robot scheduling solutions across problem sizes for both deterministic and stochastic human performance, with faster runtime compared to pure-GNN-based schedulers.
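
The encode-once, propagate-recurrently idea can be illustrated with a small sketch. This is not the authors' implementation: the embeddings below are random stand-ins for the output of the heterogeneous graph encoder, the LSTM weights are untrained, the dot-product scoring and greedy selection are assumptions made for illustration, and temporal constraints are ignored. All names (TinyLSTMCell, generate_schedule) are hypothetical. The point is only that, after a single encoding step, every task-agent pair is chosen from the recurrent state without querying the environment.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """A single LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.w = {name: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                         for _ in range(hidden_size)]
                  for name in ("i", "f", "g", "o")}
        self.b = {name: [0.0] * hidden_size for name in ("i", "f", "g", "o")}

    def _gate(self, name, act, z):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.w[name], self.b[name])]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = self._gate("i", sigmoid, z)
        f = self._gate("f", sigmoid, z)
        o = self._gate("o", sigmoid, z)
        cand = self._gate("g", math.tanh, z)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def generate_schedule(task_emb, agent_emb, cell):
    """Greedily pick one (task, agent) pair per step with no environment calls."""
    dim = len(next(iter(task_emb.values())))
    all_emb = list(task_emb.values()) + list(agent_emb.values())
    # Assumption: the initial state is the mean of all encoder embeddings.
    h = [sum(e[k] for e in all_emb) / len(all_emb) for k in range(dim)]
    c = [0.0] * dim
    remaining = sorted(task_emb)
    schedule = []
    while remaining:
        def score(pair):
            t, a = pair
            return sum(hk * (tk + ak)
                       for hk, tk, ak in zip(h, task_emb[t], agent_emb[a]))
        task, agent = max(((t, a) for t in remaining for a in sorted(agent_emb)),
                          key=score)
        schedule.append((task, agent))
        remaining.remove(task)
        # Propagate the consequence of this choice through the LSTM state.
        h, c = cell.step(task_emb[task] + agent_emb[agent], h, c)
    return schedule


rng = random.Random(0)
DIM = 4
# Random stand-ins for the embeddings a graph encoder would produce.
tasks = {f"task{i}": [rng.uniform(-1, 1) for _ in range(DIM)] for i in range(5)}
agents = {name: [rng.uniform(-1, 1) for _ in range(DIM)]
          for name in ("human0", "robot0")}
cell = TinyLSTMCell(input_size=2 * DIM, hidden_size=DIM, rng=rng)
schedule = generate_schedule(tasks, agents, cell)
print(schedule)
```

In a trained policy the pair would more plausibly be sampled from a softmax over the scores and the weights learned with reinforcement learning; the greedy argmax here only keeps the sketch deterministic.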
