This article presents a model for traffic incident prediction. Specifically, we address the fundamental problem of data scarcity in road traffic accident prediction by training our model on emergency braking events instead of accidents. Starting from the relevant risk factors for traffic accidents and the corresponding data categories, we evaluate different options for preprocessing sparse data and compare different Machine Learning models. Furthermore, we present a prototype that implements a traffic incident prediction model for Germany, based on emergency braking data from Mercedes-Benz vehicles together with weather, traffic and road data. After model evaluation and optimisation, we found that a Random Forest model trained on artificially balanced (under-sampled) data achieved the highest classification accuracy, 85%, when evaluated on the original imbalanced data. Finally, we present our conclusions and discuss further work, which ranges from gathering more data over a longer period of time in order to build stronger classification systems to adding internal factors such as the driver's visual and cognitive attention.
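The balancing step mentioned above can be illustrated with a minimal sketch of random under-sampling, assuming a binary label where emergency braking events are the rare class. The function name `undersample`, the seed, and the synthetic data are hypothetical and are not taken from the article; the article does not state which under-sampling variant or library was used.

```python
import random
from collections import defaultdict


def undersample(rows, labels, seed=0):
    """Randomly drop rows so that every class keeps as many rows as the rarest class.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Every class is reduced to the size of the smallest class.
    n_keep = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_keep))
    kept.sort()  # preserve the original row order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Synthetic, made-up data: 5 positive rows (label 1) and 95 negative rows (label 0).
labels = [1] * 5 + [0] * 95
rows = [[float(i)] for i in range(100)]

balanced_rows, balanced_labels = undersample(rows, labels)
```

In a setup like the one described, a classifier would be fitted on `balanced_rows` and `balanced_labels` only, while evaluation would still use held-out data with the original class proportions.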