Learning is usually performed by observing real robot executions. Physics-based simulators are a good alternative, providing highly valuable information while avoiding costly and potentially destructive robot executions. We present a novel approach for learning the probabilities of symbolic robot action outcomes, leveraging different environments, such as physics-based simulators, at execution time. To this end, we propose MENID (Multiple Environment Noise Indeterministic Deictic) rules, a novel representation able to cope with the inherent uncertainties present in robotic tasks. MENID rules explicitly represent each possible outcome of an action, keep a memory of the source of the experience, and maintain the probability of success of each outcome. We also introduce an algorithm to distribute actions among environments, based on previous experiences and expected gain. Before using physics-based simulations, we propose a methodology for evaluating different simulation settings and determining the least time-consuming model that still produces coherent results. We demonstrate the validity of the approach in a dismantling use case, using a reduced-quality simulation as the simulated system, and a full-resolution simulation, with noise added to the trajectories and some physical parameters, as a representation of the real system.