This paper proposes a probabilistic motion prediction method for long motions. The motion is predicted so that it accomplishes a task starting from the initial state observed in a given image. Our method evaluates task achievability with an Energy-Based Model (EBM), but previous EBMs are not designed to evaluate the consistency between different domains (i.e., image and motion in our method). Our method seamlessly integrates image and motion data into the image feature domain through spatially-aligned temporal encoding, in which features are extracted along the motion trajectory projected onto the image. Furthermore, this paper proposes a data-driven motion optimization method, the Deep Motion Optimizer (DMO), that works with the EBM for motion prediction. Unlike previous gradient-based optimizers, our self-supervised DMO alleviates the difficulty of tuning hyper-parameters to avoid local minima. The effectiveness of the proposed method is demonstrated through a variety of experiments comparing it with similar state-of-the-art (SOTA) methods.