We introduce a method for inferring the sizes of subgroups in a heterogeneous population from survey data, even when only a single list is available. To this end, we use Fisher's noncentral hypergeometric distribution, which allows us to account for the possibility that capture heterogeneity is related to key survey variables. We propose a Bayesian approach to estimating the posterior distributions of the population sizes, exploiting both extra-experimental information (e.g., from administrative data) and the computational efficiency of MCMC and ABC methods. The motivating case study concerns estimating, by gender and degree program, the number of young Italians who are not employed one year after graduating. We account for the possibility that survey response rates differ by employment status, which implies a missing-not-at-random data scenario. We find that employed persons are generally more inclined to answer the questionnaire; this behavior may lead to overestimation of the employment rate.
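To make the role of Fisher's noncentral hypergeometric distribution concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a known population size `M`, a survey with `n` respondents of whom `x` are not employed, a uniform prior on the unknown number `K` of non-employed people, and a fixed, known response odds ratio `odds` (non-employed versus employed). All numbers and the helper `posterior_group_size` are hypothetical; the full approach would instead place a prior on the odds and rely on MCMC or ABC.

```python
import numpy as np
from scipy.stats import nchypergeom_fisher


def posterior_group_size(M, n, x, odds):
    """Grid posterior for K under a uniform prior (illustrative assumption).

    M    : total population size (assumed known, e.g. from a register)
    n    : number of survey respondents
    x    : respondents observed in the subgroup of interest
    odds : response odds of subgroup members relative to everyone else
    """
    # K must allow x subgroup respondents and n - x respondents from the rest.
    K = np.arange(x, M - (n - x) + 1)
    # SciPy argument order: pmf(k, M, n_type1, N_draws, odds).
    likelihood = nchypergeom_fisher.pmf(x, M, K, n, odds)
    posterior = likelihood / likelihood.sum()
    return K, posterior


# Hypothetical data: 200 graduates, 80 respondents, 20 of them not employed.
M, n, x = 200, 80, 20
posterior_means = {}
for odds in (1.0, 0.5):
    K, posterior = posterior_group_size(M, n, x, odds)
    posterior_means[odds] = float((K * posterior).sum())
    print(f"odds={odds}: posterior mean of K = {posterior_means[odds]:.1f}")
```

With `odds = 1.0` the model reduces to the ordinary hypergeometric case, in which response is unrelated to employment status. Setting `odds` below 1 encodes the assumption that non-employed people respond less often, which shifts the posterior for `K` toward larger values; this is the mechanism by which ignoring differential response would overstate the employment rate.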