Causal Analysis and Classification of Traffic Crash Injury Severity Using Machine Learning Algorithms

Causal analysis and classification of traffic crash injury severity using non-parametric methods have received limited attention. This study presents a methodological framework for causal inference, using Granger causality analysis, and for injury severity classification of traffic crashes occurring on interstates, using several machine learning techniques: decision trees (DT), random forest (RF), extreme gradient boosting (XGBoost), and deep neural networks (DNN). The data cover traffic crashes on all interstates across the state of Texas over a six-year period, from 2014 to 2019. The proposed severity classification approach outputs three classes: fatal and severe injury (KA) crashes, non-severe and possible injury (BC) crashes, and property damage only (PDO) crashes. Granger causality analysis identified the most influential factors affecting crash severity, while the learning-based models predicted the severity classes with varying performance. The factors identified as most important include the speed limit, surface and weather conditions, traffic volume, the presence of work zones, workers in work zones, and high occupancy vehicle (HOV) lanes, among others. Classifier performance varied across the classes. Decision tree and random forest classifiers performed best for the PDO and BC severities, respectively. For the KA class, the rarest class in the data, the deep neural network classifier outperformed all other algorithms, most likely because of its capability of approximating nonlinear models. This study contributes to the limited body of knowledge on causal analysis and classification of traffic crash injury severity using non-parametric approaches.
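Because the abstract reports that the best classifier differs by severity class, and that KA is the rarest class, per-class metrics are more informative than overall accuracy. The following is a minimal sketch of such a per-class evaluation in plain Python. The function name `per_class_metrics` and the toy labels are assumptions made for illustration; they are not the study's code, data, or results.

```python
from collections import Counter

# Severity classes as named in the abstract.
CLASSES = ("KA", "BC", "PDO")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return precision, recall, and F1 for each class (hypothetical helper)."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy, imbalanced labels (illustrative only, NOT from the Texas crash data):
# 2 KA, 4 BC, and 14 PDO crashes.
y_true = ["KA"] * 2 + ["BC"] * 4 + ["PDO"] * 14
y_pred = (
    ["KA", "BC"]                  # one KA crash misclassified as BC
    + ["BC", "BC", "BC", "PDO"]   # one BC crash misclassified as PDO
    + ["PDO"] * 13 + ["BC"]       # one PDO crash misclassified as BC
)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
results = per_class_metrics(y_true, y_pred)

print(f"overall accuracy: {accuracy:.2f}")
for name, m in results.items():
    print(f"{name}: precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

In this toy example the overall accuracy is dominated by the PDO class, while the recall for KA shows that half of the rare severe crashes were missed. This is the kind of gap that a per-class comparison of classifiers is meant to expose.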
