Learning dynamics is at the heart of many important applications of machine learning (ML), such as robotics and autonomous driving. In these settings, ML algorithms typically need to reason about a physical system using high-dimensional observations, such as images, without access to the underlying state. Recently, several methods have been proposed that integrate priors from classical mechanics into ML models to address the challenge of physical reasoning from images. In this work, we take a sober look at the current capabilities of these models. To this end, we introduce a suite consisting of 17 datasets with visual observations based on physical systems exhibiting a wide range of dynamics. We conduct a thorough and detailed comparison of the major classes of physically inspired methods alongside several strong baselines. While models that incorporate physical priors can often learn latent spaces with desirable properties, our results demonstrate that these methods fail to significantly improve upon standard techniques. Nonetheless, we find that the use of continuous and time-reversible dynamics benefits models of all classes.