Incident duration prediction using a bi-level machine learning framework with outlier removal and intra-extra joint optimisation

Predicting the duration of traffic incidents is challenging because of the stochastic nature of the events. Accurate predictions of how long an incident will last can provide significant benefits both to end-users in their route choice and to traffic operation managers in the handling of non-recurrent traffic congestion. This paper presents a novel bi-level machine learning framework, enhanced with outlier removal and intra-extra joint optimisation, for predicting incident duration on three heterogeneous data sets collected for both arterial roads and motorways in Sydney, Australia and San Francisco, U.S.A. Firstly, we use incident data logs to develop a binary classification approach that classifies traffic incidents as short-term or long-term. We find the optimal threshold between short-term and long-term incident duration, targeting both class balance and prediction performance, and we also compare the binary and multi-class classification approaches. Secondly, to refine the incident duration prediction to the minute level, we propose a new Intra-Extra Joint Optimisation algorithm (IEO-ML), which extends multiple baseline ML models and is tested against several regression scenarios across the data sets. The final results indicate that: a) 40-45 min is the best split threshold for separating short-term from long-term incidents, and these two groups should be modelled separately; b) the proposed IEO-ML approach significantly outperforms baseline ML models in $66\%$ of all cases, showing its potential for accurate incident duration prediction. Lastly, we evaluate feature importance and show that time, location, incident type, incident reporting source and weather are among the top 10 critical factors that influence how long incidents will last.
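The class-balance half of the threshold search described above can be illustrated with a minimal sketch. It is not the paper's procedure: the function names (`class_balance`, `scan_thresholds`, `most_balanced`) and the toy durations are hypothetical, and the prediction-performance criterion, which would require training a classifier at each candidate threshold, is deliberately left out.

```python
from typing import Iterable, List, Sequence, Tuple


def class_balance(durations: Iterable[float], threshold: float) -> float:
    """Minority-class share (0.0 to 0.5) when incidents are split at `threshold` minutes.

    Incidents with duration <= threshold count as short-term, the rest as long-term.
    A value of 0.5 means the two classes are perfectly balanced.
    """
    values = list(durations)
    if not values:
        raise ValueError("durations must not be empty")
    short_share = sum(1 for d in values if d <= threshold) / len(values)
    return min(short_share, 1.0 - short_share)


def scan_thresholds(
    durations: Sequence[float], candidates: Sequence[float]
) -> List[Tuple[float, float]]:
    """Pair every candidate threshold with the class balance it produces."""
    return [(t, class_balance(durations, t)) for t in candidates]


def most_balanced(durations: Sequence[float], candidates: Sequence[float]) -> float:
    """Return the candidate threshold with the largest minority-class share."""
    return max(scan_thresholds(durations, candidates), key=lambda pair: pair[1])[0]


# Invented toy durations in minutes, for illustration only (not from the paper's data sets).
toy_durations = [5, 10, 15, 20, 30, 35, 50, 60, 90, 120]
print(most_balanced(toy_durations, [20, 30, 40, 60]))  # prints 30
```

In a fuller version, each candidate threshold would also be scored by the performance of a classifier trained on the resulting labels, and the two criteria would be traded off against each other.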
