The design of a safe and reliable Autonomous Driving Stack (ADS) is one of the most challenging tasks of our era. An ADS is expected to operate in highly dynamic environments with full autonomy and with reliability exceeding that of human drivers. To navigate efficiently and safely through arbitrarily complex traffic scenarios, an ADS must be able to forecast the future trajectories of surrounding actors. Current state-of-the-art models are typically based on recurrent, graph and convolutional networks, and achieve notable results in vehicle prediction. In this paper we explore the influence of attention in generative models for motion prediction, considering both physical and social context to compute the most plausible trajectories. We first encode the past trajectories using an LSTM network, whose output serves as input to a Multi-Head Self-Attention module that computes the social context. In addition, we formulate a weighted interpolation to calculate the velocity and orientation in the last observation frame, which we use to compute acceptable target points extracted from the driveable area of the HD map; these target points represent our physical context. Finally, the input of our generator is a white noise vector sampled from a multivariate normal distribution, with the social and physical context as its conditions, in order to predict plausible trajectories. We validate our method on the Argoverse Motion Forecasting Benchmark 1.1, achieving competitive unimodal results.
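As an illustration of the weighted-interpolation step, the sketch below estimates velocity and orientation at the last observation frame from a sequence of past positions. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function name `weighted_velocity_and_heading`, the geometric weighting controlled by `decay`, and the default frame interval `dt`.

```python
import math


def weighted_velocity_and_heading(positions, dt=0.1, decay=0.5):
    """Estimate (vx, vy, speed, heading) at the last observed frame.

    positions: list of (x, y) tuples ordered in time, at least two entries.
    dt: time between consecutive frames in seconds (assumed value).
    decay: weight ratio between consecutive steps, so that older
           displacements count less (assumed weighting scheme).
    """
    if len(positions) < 2:
        raise ValueError("need at least two observed positions")

    n_steps = len(positions) - 1
    # The most recent displacement gets weight 1, the previous one `decay`, etc.
    weights = [decay ** (n_steps - 1 - i) for i in range(n_steps)]
    total = sum(weights)

    vx = vy = 0.0
    for i, w in enumerate(weights):
        (x0, y0), (x1, y1) = positions[i], positions[i + 1]
        vx += w * (x1 - x0) / dt
        vy += w * (y1 - y0) / dt
    vx /= total
    vy /= total

    speed = math.hypot(vx, vy)
    heading = math.atan2(vy, vx)  # radians, measured from the +x axis
    return vx, vy, speed, heading
```

For an actor moving at constant velocity the weighting has no effect and the estimate equals the true velocity; the weights only matter when the actor accelerates or turns, in which case recent frames dominate. The resulting speed and heading could then be used to bound which map points are reachable within the prediction horizon, although how the paper selects its target points is not specified in the abstract.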