Coarse-to-fine Q-attention enables sample-efficient robot manipulation by discretizing the translation space in a coarse-to-fine manner, where the resolution gradually increases at each layer in the hierarchy. Although effective, Q-attention suffers from "coarse ambiguity": when the voxelization is significantly coarse, it is not possible to distinguish similar-looking objects without first inspecting them at a finer resolution. To combat this, we propose to envision Q-attention as a tree that can be expanded and used to accumulate value estimates across the top-k voxels at each Q-attention depth. When our extension, Q-attention with Tree Expansion (QTE), replaces standard Q-attention in the Attention-driven Robot Manipulation (ARM) system, we are able to accomplish a larger set of tasks, especially those that suffer from "coarse ambiguity". In addition to evaluating our approach across 12 RLBench tasks, we show that the improved performance also carries over to a real-world task involving small objects.
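To make the tree-expansion idea concrete, below is a minimal sketch of how a top-k expansion over a coarse-to-fine voxel hierarchy might look. It assumes a per-depth Q-function `q_fn(depth, center, extent)` that returns a dense grid of Q-values over the region around `center`; the grid size, branching factor `TOP_K`, and the choice to sum Q-values along each path are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch of Q-attention with Tree Expansion (QTE).
# All names and hyperparameters here are assumptions for exposition.
import numpy as np

GRID = 4    # voxels per side at every depth (illustrative)
TOP_K = 2   # branches expanded per node at each depth
DEPTHS = 3  # levels in the coarse-to-fine hierarchy

def fake_q_fn(depth, center, extent):
    """Stand-in for a learned per-depth Q-network: random Q per voxel."""
    return np.random.rand(GRID, GRID, GRID)

def voxel_centers(center, extent):
    """World-space centers of all voxels in the grid around `center`,
    where `extent` is the half-size of the region."""
    side = 2 * extent / GRID
    offsets = (np.arange(GRID) + 0.5) * side - extent
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    return np.stack([gx, gy, gz], axis=-1) + center

def qte(q_fn, root_center, root_extent):
    """Expand the Q-attention tree, accumulating value estimates along
    each path, and return the leaf coordinate with the highest total."""
    # Each frontier entry: (accumulated value, voxel center, half-size).
    frontier = [(0.0, np.asarray(root_center, float), root_extent)]
    for depth in range(DEPTHS):
        next_frontier = []
        for acc, center, extent in frontier:
            q = q_fn(depth, center, extent)
            centers = voxel_centers(center, extent)
            # Keep the top-k voxels at this depth, not only the argmax.
            for idx in np.argsort(q.ravel())[-TOP_K:]:
                i, j, k = np.unravel_index(idx, q.shape)
                next_frontier.append(
                    (acc + q[i, j, k], centers[i, j, k], extent / GRID))
        frontier = next_frontier
    # Best leaf by value accumulated across the whole path.
    best = max(frontier, key=lambda n: n[0])
    return best[1]

print(qte(fake_q_fn, root_center=[0.0, 0.0, 0.0], root_extent=0.5))
```

Keeping the top-k branches, rather than committing to a single argmax voxel per depth, lets evidence gathered at a finer resolution overturn a coarse-level tie between similar-looking objects, which is precisely the "coarse ambiguity" failure mode described above.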