This paper proposes a neural network-based user simulator that provides a multimodal interactive environment for training Reinforcement Learning (RL) agents on collaborative tasks. The simulator is trained on the existing ELDERLY-AT-HOME corpus and handles several modalities: language, pointing gestures, and haptic-ostensive actions. The paper also presents a novel multimodal data augmentation approach to address the limited size of the dataset, a consequence of how expensive and time-consuming human demonstrations are to collect. Overall, the study highlights the potential of RL and multimodal user simulators for developing and improving domestic assistive robots.