An ideal synthetic population, a key input to activity-based models, mimics the distribution of individual- and household-level attributes in the actual population. Since the attributes of the entire population are generally unavailable, household travel survey (HTS) samples are used for population synthesis. Synthesizing a population by sampling directly from the HTS ignores the attribute combinations that exist in the population but are unobserved in the HTS samples, called 'sampling zeros'. A deep generative model (DGM) can potentially synthesize the sampling zeros, but at the expense of generating 'structural zeros' (i.e., infeasible attribute combinations that do not exist in the population). This study proposes a novel method to minimize structural zeros while preserving sampling zeros. Two regularizations are devised to customize the training of the DGM and are applied to a generative adversarial network (GAN) and a variational autoencoder (VAE). The metrics adopted for the feasibility and diversity of the synthetic population capture how many structural zeros and sampling zeros a model generates: fewer generated structural zeros indicate higher feasibility, and fewer generated sampling zeros indicate lower diversity. Results show that the proposed regularizations considerably improve the feasibility and diversity of the synthesized population over traditional models. The proposed VAE additionally generated 23.5% of the population ignored by the sample (the sampling zeros) with 79.2% precision (i.e., a structural zero rate of 20.8%), while the proposed GAN generated 18.3% of the ignored population with 89.0% precision. The improved DGMs generate a more feasible and diverse synthetic population, which is critical for the accuracy of an activity-based model.
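To make the distinction between sampling zeros and structural zeros concrete, the following minimal Python sketch computes both quantities for a toy example. It is an illustration only, not the study's evaluation code: the function name `zero_metrics`, the attribute tuples, and the choice to count unique combinations without population weights are all assumptions made here. Precision is taken as the share of newly generated combinations (those absent from the sample) that exist in the population, so the structural zero rate is its complement.

```python
def zero_metrics(population, sample, synthetic):
    """Unweighted sampling-zero and structural-zero metrics (hypothetical helper).

    Each argument is an iterable of attribute combinations (hashable tuples).
    The sample is assumed to be a subset of the population.
    """
    population, sample, synthetic = set(population), set(sample), set(synthetic)

    # Sampling zeros: combinations that exist in the population
    # but were never observed in the survey sample.
    sampling_zeros = population - sample

    # Novel combinations: generated by the model but absent from the sample.
    novel = synthetic - sample

    # Novel combinations that really exist are recovered sampling zeros;
    # the remainder are structural zeros (infeasible combinations).
    recovered = novel & sampling_zeros
    structural_zeros = novel - population

    recall = len(recovered) / len(sampling_zeros) if sampling_zeros else 0.0
    precision = len(recovered) / len(novel) if novel else 0.0
    return {
        "recovered_sampling_zeros": recovered,
        "structural_zeros": structural_zeros,
        "recall": recall,                      # share of sampling zeros generated
        "precision": precision,                # share of novel output that is feasible
        "structural_zero_rate": 1.0 - precision if novel else 0.0,
    }


# Toy data: (age group, employment status) pairs, invented for illustration.
population = {
    ("adult", "employed"), ("adult", "unemployed"),
    ("senior", "retired"), ("senior", "employed"),
    ("child", "student"), ("adult", "student"),
}
sample = {("adult", "employed"), ("adult", "unemployed")}
synthetic = {
    ("adult", "employed"),   # already in the sample
    ("senior", "retired"),   # recovered sampling zero
    ("child", "student"),    # recovered sampling zero
    ("child", "retired"),    # structural zero: not in the population
}

metrics = zero_metrics(population, sample, synthetic)
print(metrics["recall"], metrics["precision"], metrics["structural_zero_rate"])
```

In this toy case the model recovers two of the four sampling zeros, and one of its three novel combinations is infeasible. A weighted variant, which counts the share of individuals rather than unique combinations, would follow the same set logic.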