Traditional network models encapsulate travel behavior among all origin-destination pairs using a simplified, generic utility function. Typically, the utility function consists solely of travel time, and its coefficients are set equal to estimates obtained from stated preference data. While this modeling strategy is reasonable, the sampling bias inherent in individual-level data may be amplified when flows are aggregated over the network, leading to inaccurate flow estimates. Moreover, such data must be collected from surveys or travel diaries, which can be labor intensive, costly, and limited to a short time period. To address these limitations, this study extends classical bi-level formulations to estimate travelers' utility functions with multiple attributes using system-level data. We formulate a methodology grounded in non-linear least squares to statistically infer travelers' utility function in the network context using traffic counts, traffic speeds, traffic incidents, and sociodemographic information, among other attributes. The analysis of the mathematical properties of the optimization problem and of its pseudo-convexity motivates the use of normalized gradient descent. We also develop a hypothesis testing framework to examine the statistical properties of the utility function coefficients and to perform attribute selection. Experiments on synthetic data show that the coefficients are consistently recovered and that the hypothesis tests reliably identify which attributes are determinants of travelers' route choices. In addition, a series of Monte Carlo experiments suggests that statistical inference is robust to noise in the origin-destination matrix and in the traffic counts, and to various levels of sensor coverage. The methodology is also deployed at large scale using real-world multi-source data from Fresno, CA, collected before and during the COVID-19 outbreak.
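To make the estimation idea concrete, the following is a minimal sketch of non-linear least squares solved with normalized gradient descent on a toy problem. It is not the bi-level formulation of the study: it assumes two routes per origin-destination pair, a binary logit route choice, fixed attribute values (no congestion feedback), and noise-free counts. All names (`route_flows`, `loss_and_grad`, `normalized_gradient_descent`), the step-size schedule, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def route_flows(theta, x_diff, demand):
    """Flow on route 1 of each OD pair under a binary logit model.

    x_diff[k] holds the attributes of route 1 minus those of route 2
    for OD pair k (e.g. travel time and cost differences).
    """
    p1 = 1.0 / (1.0 + np.exp(-(x_diff @ theta)))
    return demand * p1


def loss_and_grad(theta, x_diff, demand, counts):
    """Sum of squared count residuals and its gradient w.r.t. theta."""
    p1 = 1.0 / (1.0 + np.exp(-(x_diff @ theta)))
    resid = demand * p1 - counts
    loss = np.sum(resid ** 2)
    # d(flow_k)/d(theta) = demand_k * p1_k * (1 - p1_k) * x_diff[k]
    grad = 2.0 * x_diff.T @ (resid * demand * p1 * (1.0 - p1))
    return loss, grad


def normalized_gradient_descent(x_diff, demand, counts, theta0,
                                step=0.1, iters=20000):
    """Move along the unit-length gradient direction with a decaying step.

    Only the direction of the gradient is used, so the update size does
    not depend on the scale of the loss. The best iterate is returned.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    best_theta, best_loss = theta.copy(), np.inf
    for t in range(iters):
        loss, grad = loss_and_grad(theta, x_diff, demand, counts)
        if loss < best_loss:
            best_loss, best_theta = loss, theta.copy()
        norm = np.linalg.norm(grad)
        if norm < 1e-12:  # stationary point reached
            break
        theta -= (step / np.sqrt(t + 1)) * grad / norm
    return best_theta, best_loss


# Synthetic example (assumed values): 20 OD pairs, 2 attributes.
rng = np.random.default_rng(0)
x_diff = rng.normal(size=(20, 2))
demand = rng.uniform(50.0, 150.0, size=20)
theta_true = np.array([-1.0, -0.5])
counts = route_flows(theta_true, x_diff, demand)

theta_hat, final_loss = normalized_gradient_descent(
    x_diff, demand, counts, theta0=np.zeros(2)
)
print("estimated coefficients:", theta_hat)
```

In this toy setting the observed counts are generated from known coefficients, so the estimate can be compared directly against `theta_true`. With congestion, the flows would instead come from an equilibrium solved in an inner problem, which is where the bi-level structure described above enters.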