Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation learning approach. We formulate the control problem as a search problem over a dataset of expert demonstrations, in which the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL dataset in the latent representation of a Video PreTraining (VPT) model. The agent copies the actions of the selected expert trajectory for as long as its own state representation and that of the expert trajectory do not diverge. When they do diverge, the proximity search is repeated. Our approach effectively recovers meaningful demonstration trajectories and produces human-like agent behavior in the Minecraft environment.
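The search-and-copy loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that latent vectors have already been computed by some encoder (the VPT model is replaced here by plain NumPy arrays), that proximity is measured with the Euclidean distance, and that divergence is detected with a fixed threshold. The names `SearchPolicy`, `nearest_expert_index`, and `divergence_threshold` are hypothetical.

```python
import numpy as np


def nearest_expert_index(agent_latent, expert_latents):
    """Return (index, distance) of the expert frame closest to the agent.

    Assumption: proximity is the Euclidean (L2) distance in latent space.
    """
    dists = np.linalg.norm(expert_latents - agent_latent, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])


class SearchPolicy:
    """Copies expert actions and re-searches when the latents diverge."""

    def __init__(self, expert_latents, expert_actions, episode_ids,
                 divergence_threshold):
        self.expert_latents = np.asarray(expert_latents, dtype=float)
        self.expert_actions = list(expert_actions)
        self.episode_ids = list(episode_ids)
        # Hypothetical hyperparameter; the abstract gives no value.
        self.divergence_threshold = divergence_threshold
        self.cursor = None       # index of the expert frame being followed
        self.num_searches = 0    # how many proximity searches were run

    def act(self, agent_latent):
        agent_latent = np.asarray(agent_latent, dtype=float)
        need_search = self.cursor is None
        if not need_search:
            # Compare the agent's state with the expert frame it follows.
            gap = np.linalg.norm(self.expert_latents[self.cursor] - agent_latent)
            need_search = gap > self.divergence_threshold
        if need_search:
            self.cursor, _ = nearest_expert_index(agent_latent,
                                                  self.expert_latents)
            self.num_searches += 1
        action = self.expert_actions[self.cursor]
        # Advance within the same demonstration; at its end, force a search.
        nxt = self.cursor + 1
        same_episode = (nxt < len(self.expert_actions)
                        and self.episode_ids[nxt] == self.episode_ids[self.cursor])
        self.cursor = nxt if same_episode else None
        return action


# Toy data: two demonstrations in a 2-D latent space (purely illustrative).
latents = [[0, 0], [1, 0], [2, 0], [10, 10], [11, 10]]
actions = ["forward", "forward", "jump", "left", "attack"]
episodes = [0, 0, 0, 1, 1]

policy = SearchPolicy(latents, actions, episodes, divergence_threshold=1.5)
chosen = [policy.act(z) for z in ([0.1, 0.0], [1.2, 0.0], [10.2, 9.9])]
```

In this toy run the first call triggers a search, the second follows the same demonstration without searching, and the third triggers a new search because the agent's latent has moved far from the frame being followed. A brute-force scan is used for clarity; with a dataset of realistic size an approximate nearest-neighbor index would likely be needed, which is outside the scope of this sketch.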