Advances in Data Science permeate every field of Transportation Science and Engineering, resulting in data-driven developments across the transportation sector. Nowadays, Intelligent Transportation Systems (ITS) can arguably be approached as a ``story'' that intensively produces and consumes large amounts of data. A~diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers' personal devices act as sources of data flows that are eventually fed into software running on automatic devices, actuators or control systems, producing, in~turn, complex information flows among users, traffic managers, data analysts, traffic modeling scientists, etc. These~information flows provide enormous opportunities to improve model development and decision-making. This work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in~other words, for data-based models to fully become \emph{actionable}. Grounded in this data modeling pipeline for ITS, we~define the characteristics, engineering requisites and challenges intrinsic to its three constituent stages, namely, data fusion, adaptive learning and model evaluation. We~deliberately generalize model learning to be adaptive, since at~the core of our paper is the firm conviction that most learners will have to adapt to the ever-changing phenomena underlying the majority of ITS applications. Finally, we~provide an outlook on current research lines within Data Science that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.