The variational autoencoder (VAE) has been widely used for modeling data distributions because it is theoretically elegant, easy to train, and offers good manifold representations. However, when applied to image reconstruction and synthesis tasks, the VAE tends to generate blurry samples. We observe that a similar problem, in which the generated trajectory lies between adjacent lanes, often arises in VAE-based trajectory forecasting models. To mitigate this problem, we introduce a hierarchical latent structure into the VAE-based forecasting model. Based on the assumption that the trajectory distribution can be approximated as a mixture of simple distributions (or modes), the low-level latent variable is employed to model each mode of the mixture, and the high-level latent variable is employed to represent the weights of the modes. To model each mode accurately, we condition the low-level latent variable on two lane-level context vectors computed in novel ways: one corresponds to vehicle-lane interaction and the other to vehicle-vehicle interaction. The context vectors are also used to model the weights via the proposed mode selection network. To evaluate our forecasting model, we use two large-scale real-world datasets. Experimental results show that our model not only generates clear multi-modal trajectory distributions but also outperforms state-of-the-art (SOTA) models in prediction accuracy. Our code is available at https://github.com/d1024choi/HLSTrajForecast.
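The mixture assumption stated in the abstract can be illustrated with a small toy sketch. This is not the released implementation: the function names, the one-dimensional "lateral offset" latent, the lane positions, and the fixed mode weights are all hypothetical choices made for illustration, and the actual model computes these quantities with neural networks conditioned on lane-level context vectors. The sketch only shows why two-stage sampling (first a mode, then a sample within that mode) avoids placing samples between lanes, whereas a single Gaussian fitted to the same data would be centered between them.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def sample_hierarchical(mode_logits, mode_means, mode_stds, num_samples, rng):
    """Toy two-stage sampler (hypothetical, for illustration only).

    High level: draw a mode index from the mode weights.
    Low level: draw a Gaussian latent from the selected mode.
    """
    weights = softmax(mode_logits)
    modes = rng.choice(len(weights), size=num_samples, p=weights)
    noise = rng.standard_normal((num_samples, mode_means.shape[1]))
    latents = mode_means[modes] + mode_stds[modes] * noise
    return modes, latents, weights


# Assumed toy setup: two lanes whose centers sit at -1.75 m and +1.75 m.
rng = np.random.default_rng(0)
mode_logits = np.log(np.array([0.7, 0.3]))  # assumed mode weights
mode_means = np.array([[-1.75], [1.75]])    # one mode per lane
mode_stds = np.array([[0.2], [0.2]])

modes, latents, weights = sample_hierarchical(
    mode_logits, mode_means, mode_stds, num_samples=1000, rng=rng
)

# A single Gaussian fitted to the same samples is centered near the
# weighted average of the lane centers, i.e. between the two lanes.
single_gaussian_mean = latents.mean()
```

In this toy setup every hierarchical sample stays close to one of the two lane centers, while `single_gaussian_mean` falls in the gap between them, which is the trajectory analogue of the blurry-sample problem the abstract describes.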