This paper proposes a novel framework for autonomous drone navigation through cluttered environments. Control policies are learnt in a simple environment during training and applied in a more complex environment at inference time. The controller learnt in the training environment is tricked into believing that the robot is still in that environment when it is actually navigating a more complex one. The framework can be adapted to reuse simple policies in more complex tasks. We also show that it can be used as an interpretation tool for reinforcement learning algorithms.