The ability to learn new tasks and to adapt quickly to different task variations or dimensions is an important attribute in agile robotics. In our previous work, we explored Behavior Trees and Motion Generators (BTMGs) as a robot arm policy representation to facilitate the learning and execution of assembly tasks. However, a BTMG implemented for a specific task may not be robust to changes in the environment and may not generalize well to variations of that task. We propose to extend the BTMG policy representation with a module that predicts BTMG parameters for a new task variation. To achieve this, we propose a model that combines a Gaussian process and a weighted support vector machine classifier. Taking BTMG parameters and task variations as inputs, this model predicts the performance measure and the feasibility of the resulting policy. Using the outputs of the model, we then construct a surrogate reward function that is used within an optimizer to maximize task performance over the BTMG parameters for a fixed task variation. To demonstrate the effectiveness of the proposed approach, we conducted experimental evaluations on push and obstacle avoidance tasks in simulation and with a real KUKA iiwa robot. Furthermore, we compared the performance of our approach with four baseline methods.
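
The pipeline outlined in the abstract (a performance model, a feasibility model, a surrogate reward, and an optimizer over the policy parameters) can be illustrated with a small self-contained sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the one-dimensional toy task, the reward and feasibility rules, the kernel length scales, the class weights, the fixed penalty for predicted-infeasible parameters, and the grid-search optimizer. The Gaussian process keeps only the posterior mean, and the weighted SVM is a bias-free kernel variant trained by dual coordinate ascent so that the example needs no external libraries.

```python
import math

def rbf(a, b, length):
    """Squared-exponential kernel between two input tuples."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))

def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_gp(X, y, length=0.3, noise=1e-4):
    """GP regression (posterior mean only) with a constant prior mean."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, [v - mean for v in y])
    return lambda x: mean + sum(w * rbf(x, xi, length) for w, xi in zip(alpha, X))

def fit_weighted_svm(X, labels, weights, C=10.0, length=0.1, sweeps=100):
    """Bias-free kernel SVM trained by dual coordinate ascent.
    Sample i has the box constraint 0 <= alpha_i <= C * weights[i]."""
    n = len(X)
    K = [[rbf(a, b, length) for b in X] for a in X]
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            f_i = sum(alpha[j] * labels[j] * K[i][j] for j in range(n))
            step = (1.0 - labels[i] * f_i) / K[i][i]  # Newton step on the dual
            alpha[i] = min(max(alpha[i] + step, 0.0), C * weights[i])
    return lambda x: sum(a * lab * rbf(x, xi, length)
                         for a, lab, xi in zip(alpha, labels, X))

# Toy data (hypothetical): inputs are (policy parameter p, task variation v).
grid = [i / 5 for i in range(6)]
X = [(p, v) for p in grid for v in grid]
reward = [-(p - v) ** 2 for p, v in X]                     # assumed performance measure
feasible = [1 if p <= 0.85 else -1 for p, v in X]          # assumed feasibility rule
weights = [1.0 if lab == 1 else 5.0 for lab in feasible]   # up-weight the rarer class

predict_reward = fit_gp(X, reward)
feasibility_score = fit_weighted_svm(X, feasible, weights)

INFEASIBLE_PENALTY = -1.0e6  # assumed value, far below any attainable reward

def surrogate_reward(p, v):
    """Predicted performance if predicted feasible, otherwise a fixed penalty."""
    x = (p, v)
    return predict_reward(x) if feasibility_score(x) > 0.0 else INFEASIBLE_PENALTY

def best_param(v, steps=100):
    """Grid search over p in [0, 1] for a fixed task variation v."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda p: surrogate_reward(p, v))

print(best_param(0.5), best_param(1.0))
```

With this toy reward, the selected parameter should track the task variation while it stays in the region predicted feasible, and should be held below the infeasible region otherwise. In this separable toy data the sample weights only set the upper bounds on the dual variables and may not change the solution; they start to matter once feasible and infeasible samples overlap.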