Long-horizon robotic tasks are hard due to continuous state-action spaces and sparse feedback. Symbolic world models help by decomposing tasks into discrete predicates that capture object properties and relations. Existing methods learn predicates either top-down, by prompting foundation models without grounding in data, or bottom-up, from demonstrations without high-level priors. We introduce UniPred, a bilevel learning framework that unifies both. UniPred uses large language models (LLMs) to propose predicate effect distributions that supervise neural predicate learning from low-level data, while learned feedback iteratively refines the LLM hypotheses. Leveraging strong visual foundation model features, UniPred learns robust predicate classifiers in cluttered scenes. We further propose a predicate evaluation method that supports symbolic models beyond STRIPS assumptions. Across five simulated domains and one real-robot domain, UniPred achieves 2-4 times higher success rates than top-down methods and learns 3-4 times faster than bottom-up approaches, advancing scalable and flexible symbolic world modeling for robotics.
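
The bilevel structure described in the abstract can be pictured as an outer loop in which an LLM proposes candidate predicate effect distributions and an inner loop that fits neural predicate classifiers on low-level data against those proposals, with the resulting fit scores fed back to refine the LLM hypotheses. The sketch below is purely illustrative of that loop under assumed interfaces; every name in it (Hypothesis, bilevel_predicate_learning, llm_propose, fit_classifier, evaluate) is a hypothetical placeholder, not the paper's actual API.

```python
# Illustrative sketch of a bilevel predicate-learning loop in the spirit of
# UniPred. All names below are hypothetical placeholders, not the paper's API.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Hypothesis:
    """An LLM-proposed predicate with its expected effect distribution per action."""
    name: str                  # e.g. "On(?a, ?b)"
    effect_distribution: dict  # action name -> probability the predicate changes


def bilevel_predicate_learning(
    llm_propose: Callable[[list], list],                    # feedback -> refined hypotheses
    fit_classifier: Callable[[Hypothesis, list], object],   # trains a neural classifier on demos
    evaluate: Callable[[object, Hypothesis, list], float],  # agreement between classifier and hypothesis
    demos: list,
    num_rounds: int = 3,
) -> dict:
    """Alternate between top-down LLM proposals and bottom-up grounded fitting."""
    feedback: list = []  # (hypothesis, score) pairs returned to the LLM
    learned: dict = {}
    for _ in range(num_rounds):
        # Outer level: the LLM refines its predicate hypotheses using prior feedback.
        hypotheses = llm_propose(feedback)
        feedback = []
        for hyp in hypotheses:
            # Inner level: train a neural predicate classifier on low-level
            # transitions, supervised by the LLM-proposed effect distribution.
            clf = fit_classifier(hyp, demos)
            # Score how well the learned classifier's behavior matches the
            # proposed effects; this grounds the hypothesis in data.
            score = evaluate(clf, hyp, demos)
            feedback.append((hyp, score))
            learned[hyp.name] = clf
    return learned
```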


