Pedestrian crossing is one of the most typical behaviors that conflict with the natural driving behavior of vehicles. Consequently, pedestrian crossing prediction is one of the primary tasks that influence vehicle planning for safe driving. However, current methods that rely on data collected in real driving scenes cannot depict and cover all the scene conditions found in real-world traffic. To this end, we formulate a deep virtual-to-real distillation framework that introduces synthetic data, which can be generated conveniently, and borrows the abundant information on pedestrian movement in synthetic videos for pedestrian crossing prediction in real data, with a simple and lightweight implementation. To verify this framework, we construct a benchmark of 4667 virtual videos comprising about 745k frames (called Virtual-PedCross-4667), and evaluate the proposed method on two challenging datasets collected in real driving situations, i.e., the JAAD and PIE datasets. Extensive experimental analysis demonstrates the state-of-the-art performance of this framework. The dataset and code can be downloaded from \url{http://www.lotvs.net/code_data/}.