Recent statistical methods fitted on large-scale GPS data are getting close to answering the proverbial "When are we there?" question. Unfortunately, current methods often provide only point predictions for travel time. Understanding the travel time distribution is key for decision-making and downstream applications (e.g., ride share pricing decisions). Empirically, single road-segment travel time is well studied, but understanding how to aggregate such information over many segments to arrive at the distribution of travel time over a route remains challenging. We develop a novel statistical approach to this problem, in which we show that, under general conditions and without assuming a distribution of speed, travel time normalized by distance follows a Gaussian distribution with route-invariant population mean and variance. We develop efficient inference methods for these parameters, with which we propose population prediction intervals for travel time. Our population intervals are asymptotically tight and require only two parameter estimates. Using road-level information (e.g., traffic density), we further develop a tailored trip-specific Gaussian-based predictive distribution, resulting in tight prediction intervals for short and long trips. Our methods, implemented in an R package, are illustrated in a real-world case study using mobile GPS data, showing that our trip-specific and population intervals both achieve the 95% theoretical coverage level. Compared to alternative approaches, our trip-specific predictive distribution achieves (a) the theoretical coverage at every level of significance, (b) tighter prediction intervals, (c) less predictive bias, and (d) more efficient estimation and prediction procedures that rely only on the first and second moment estimates of speed on edges of the network. This makes our approach promising for low-latency, large-scale transportation applications.
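To make the idea of a two-parameter population interval concrete, the following is a minimal Python sketch, not the method of the R package. It assumes a deliberately simplified model in which travel time per unit distance is treated as Gaussian with a single mean and standard deviation estimated from past trips; the function name `population_interval`, its arguments, and the plain sample estimators are all hypothetical, and the normalization and estimators actually used by the authors may differ.

```python
from statistics import NormalDist, mean, stdev


def population_interval(times_s, dists_m, new_dist_m, level=0.95):
    """Gaussian prediction interval (in seconds) for a new trip.

    Assumption (illustrative only): time per metre on any route is
    N(mu, sigma^2), with mu and sigma shared across routes.
    """
    # Normalize each observed travel time by its trip distance.
    paces = [t / d for t, d in zip(times_s, dists_m)]
    # The only two parameters needed: sample mean and sample std. dev.
    mu, sigma = mean(paces), stdev(paces)
    # Two-sided standard normal quantile for the requested coverage.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Scale the interval for pace back to the new trip's distance.
    return new_dist_m * (mu - z * sigma), new_dist_m * (mu + z * sigma)


# Hypothetical data: four past trips of 1 km each, times in seconds.
lo, hi = population_interval([100, 120, 110, 130], [1000] * 4, 2000)
```

The interval is symmetric around the distance times the estimated mean pace, and its width depends only on the two estimated parameters, which is what keeps prediction cheap in this simplified setting.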