Nonprobability (convenience) samples are increasingly sought to supplement a randomized survey (reference) sample: the larger effective sample size reduces the variance of estimates for one or more population variables of interest. An estimate of a population quantity derived from a convenience sample will typically be biased, however, because the distribution of the variables of interest in the convenience sample differs from the population distribution. A recent set of approaches estimates inclusion probabilities for convenience sample units by specifying reference-sample-weighted pseudo likelihoods. This paper introduces a novel approach whose main result derives the propensity score for the observed sample as a function of the inclusion probabilities for the reference and convenience samples. Our approach allows a likelihood to be specified directly for the observed sample, rather than an approximate or pseudo likelihood. We construct a Bayesian hierarchical formulation that simultaneously estimates the sample propensity scores and the convenience sample inclusion probabilities. We use a Monte Carlo simulation study to compare our likelihood-based results with the pseudo-likelihood-based approaches considered in the literature.
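The abstract does not state the functional form of the propensity score, so the following is only an illustrative sketch under explicit assumptions, not the paper's derivation. Assume the reference and convenience samples are drawn independently, each unit with inclusion probabilities `pi_r` and `pi_c`, and that the two samples are stacked so that a unit selected into both appears twice. Under those assumptions, the probability that a stacked record belongs to the convenience sample is `pi_c / (pi_c + pi_r)`, and this relation can be inverted to recover `pi_c` from a propensity score and a known `pi_r`. The function names below are hypothetical.

```python
import numpy as np


def stacked_propensity(pi_c, pi_r):
    """Probability that a record in the stacked sample is a convenience record.

    Assumption (not taken from the abstract): independent selection into the
    two samples, with overlapping units counted once per sample.
    """
    pi_c = np.asarray(pi_c, dtype=float)
    pi_r = np.asarray(pi_r, dtype=float)
    return pi_c / (pi_c + pi_r)


def convenience_inclusion_prob(psi, pi_r):
    """Invert the relation above: recover pi_c from psi and a known pi_r."""
    psi = np.asarray(psi, dtype=float)
    pi_r = np.asarray(pi_r, dtype=float)
    return pi_r * psi / (1.0 - psi)


def simulate_convenience_fraction(pi_c, pi_r, n_units=200_000, seed=0):
    """Monte Carlo check for a homogeneous population of n_units units."""
    rng = np.random.default_rng(seed)
    in_c = rng.random(n_units) < pi_c  # selected into the convenience sample
    in_r = rng.random(n_units) < pi_r  # selected into the reference sample
    n_c = int(in_c.sum())
    n_r = int(in_r.sum())
    # Fraction of stacked records that come from the convenience sample.
    return n_c / (n_c + n_r)


if __name__ == "__main__":
    pi_c, pi_r = 0.3, 0.1
    print("theoretical propensity:", stacked_propensity(pi_c, pi_r))
    print("simulated fraction:    ", simulate_convenience_fraction(pi_c, pi_r))
    print("recovered pi_c:        ", convenience_inclusion_prob(0.75, pi_r))
```

In practice the propensity score would be modeled as a function of covariates and estimated rather than known; the sketch only checks the algebraic link between the three quantities for a single homogeneous group.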