Learning behavioral patterns from observational data has been the de facto approach to motion forecasting. Yet the current paradigm suffers from two shortcomings: it is brittle under covariate shift and inefficient for knowledge transfer. In this work, we propose to address these challenges from a causal representation perspective. We first introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables, namely invariant mechanisms, style confounders, and spurious features. We then introduce a learning framework that treats each group separately: (i) unlike the common practice of merging datasets collected from different locations, we exploit their subtle distinctions by means of an invariance loss that encourages the model to suppress spurious correlations; (ii) we devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph; (iii) we introduce a style consistency loss that not only enforces the structure of style representations but also serves as a self-supervisory signal for test-time refinement on the fly. Experimental results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations, outperforming prior state-of-the-art motion forecasting models in out-of-distribution generalization and low-shot transfer.
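To make component (i) concrete, the following is a minimal sketch of one way an invariance loss over per-location datasets could look. It is not taken from this work: the abstract does not specify the form of the loss, so the sketch assumes a risk-variance penalty across environments, and the names `env_risk`, `invariance_objective`, and `penalty_weight` are hypothetical.

```python
from statistics import mean, pvariance


def env_risk(preds, targets):
    """Mean squared displacement error over the 2-D points of one environment."""
    return mean(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(preds, targets)
    )


def invariance_objective(envs, penalty_weight=1.0):
    """Average risk plus a penalty on how much the risk differs across environments.

    envs: list of (preds, targets) pairs, one pair per data-collection location.
    The variance penalty is an assumed stand-in for the invariance loss; it is
    zero only when every environment is fit equally well.
    """
    risks = [env_risk(preds, targets) for preds, targets in envs]
    return mean(risks) + penalty_weight * pvariance(risks), risks


# Two toy environments: the first is predicted perfectly, the second is not.
env_a = ([(0, 0), (1, 1)], [(0, 0), (1, 1)])
env_b = ([(0, 0)], [(1, 1)])
total, risks = invariance_objective([env_a, env_b], penalty_weight=1.0)
```

Under this assumption, setting `penalty_weight=0` reduces the objective to the average risk alone, so differences between locations are no longer penalized. A positive weight penalizes solutions that fit some locations much better than others.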