Driven in part by the COVID-19 pandemic, the pace of online purchases for at-home delivery has accelerated significantly. However, responding to this development has been challenging given the lack of public data. Existing data may be collected infrequently, and a significant portion may be missing because of survey non-response. This data paucity renders conventional predictive models unreliable. We address this shortcoming by developing algorithms for data imputation and for synthetic demand estimation in future years for which ground-truth data are unavailable. We use 2017 Puget Sound Regional Council (PSRC) and National Household Travel Survey (NHTS) data and impute from the NHTS for the Seattle-Tacoma-Bellevue MSA, where delivery data are relatively more frequent. Our imputation achieves a mean-squared error $\mathsf{MSE} \approx 0.65$ relative to NHTS (mean $\approx 1$, standard deviation $\approx 3.5$) and provides a similarity matching between the samples of the two data sources. Given the unavailability of NHTS data for 2021, we use the temporal fidelity of the PSRC data sources (2017 and 2021) to project the resolution onto the NHTS, providing a synthetic estimate of NHTS deliveries. Beyond the improved reliability of the estimates, we report the explanatory variables that were relevant in determining the volume of deliveries. This work furthers existing methods in demand estimation for goods deliveries by making the most of sparse available data to generate reasonable estimates that could facilitate policy decisions.
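To make the idea of imputation through similarity matching concrete, the following is a minimal sketch and not the algorithm used in this work. It assumes a simple nearest-neighbor match on standardized covariates: each record in a "recipient" sample borrows the delivery count of its most similar record in a "donor" sample, and the result is scored with the mean-squared error. The function names, the choice of Euclidean distance, and the toy covariates are all hypothetical; which survey acts as donor or recipient is likewise left as an assumption.

```python
import math


def column_stats(rows):
    """Per-column mean and population standard deviation (0 is replaced by 1)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = []
    for c, m in zip(cols, means):
        sd = math.sqrt(sum((v - m) ** 2 for v in c) / n)
        stds.append(sd if sd > 0 else 1.0)
    return means, stds


def standardize(rows, means, stds):
    """Rescale every column with the supplied means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def impute_by_nearest_match(donor_X, donor_y, recipient_X):
    """Assign each recipient the outcome of its nearest donor (assumed method).

    Returns the imputed outcomes and the index of the matched donor record.
    """
    means, stds = column_stats(donor_X)
    donor_z = standardize(donor_X, means, stds)
    recipient_z = standardize(recipient_X, means, stds)
    matches = []
    for r in recipient_z:
        dists = [math.dist(r, d) for d in donor_z]
        matches.append(dists.index(min(dists)))
    imputed = [donor_y[i] for i in matches]
    return imputed, matches


def mse(predicted, actual):
    """Mean-squared error between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Toy data: columns are hypothetical covariates (household size, vehicles).
donor_X = [[1, 0], [2, 1], [4, 2], [5, 3]]
donor_y = [0, 1, 2, 6]              # hypothetical delivery counts
recipient_X = [[1, 1], [5, 2]]

imputed, matches = impute_by_nearest_match(donor_X, donor_y, recipient_X)
print(imputed, matches)
```

The matched indices double as a record-level link between the two samples, which is one simple way to read "similarity matching"; a real application would likely need a richer distance measure and survey weights.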