With the precipitous decline in response rates, researchers and pollsters have been left with highly non-representative samples, relying on constructed weights to make these samples representative of the desired target population. Though practitioners employ valuable expert knowledge to choose which variables, $X$, must be adjusted for, they rarely defend particular functional forms relating these variables to the response process or the outcome. Unfortunately, commonly used calibration weights, which make the weighted mean of $X$ in the sample equal to that of the population, ensure correct adjustment only when the portions of the outcome and of the response process left unexplained by linear functions of $X$ are independent. To alleviate this functional form dependency, we describe kernel balancing for population weighting (kpop). This approach replaces the design matrix $\mathbf{X}$ with a kernel matrix, $\mathbf{K}$, encoding high-order information about $\mathbf{X}$. Weights are then found to make the weighted average row of $\mathbf{K}$ among sampled units approximately equal to that of the target population. This produces good calibration on a wide range of smooth functions of $X$, without relying on the user to explicitly specify those functions. We describe the method and illustrate it by application to polling data from the 2016 U.S. presidential election.
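The balancing idea can be illustrated with a minimal Python sketch. It is not the kpop implementation: the abstract does not say how the weights are computed, so the sketch assumes a Gaussian kernel, uses the sampled units as kernel centers, and fits exponential-tilting (entropy-balancing-style) weights by plain gradient descent. The function names, the toy data, the bandwidth, and the iteration count are all hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two scalar covariate values."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def kernel_features(sample_x, population_x, bandwidth):
    """Centered kernel features for the sampled units.

    Entry (i, j) is k(x_i, c_j) minus the population mean of k(., c_j),
    where the centers c_j are the sampled units (an assumption of this sketch).
    """
    centers = sample_x
    target = [
        sum(gaussian_kernel(p, c, bandwidth) for p in population_x) / len(population_x)
        for c in centers
    ]
    return [
        [gaussian_kernel(x, c, bandwidth) - t for c, t in zip(centers, target)]
        for x in sample_x
    ]


def tilted_weights(features, coefs):
    """Weights proportional to exp(row . coefs), normalized to sum to 1."""
    scores = [sum(f * c for f, c in zip(row, coefs)) for row in features]
    top = max(scores)  # subtract the max for numerical stability
    raw = [math.exp(s - top) for s in scores]
    total = sum(raw)
    return [r / total for r in raw]


def imbalance(features, weights):
    """Weighted mean of each centered kernel feature (all zeros = exact balance)."""
    n_features = len(features[0])
    return [sum(w * row[j] for w, row in zip(weights, features)) for j in range(n_features)]


def kernel_balance(sample_x, population_x, bandwidth=1.0, n_iter=1000):
    """Gradient descent on the dual objective log(sum_i exp(z_i . coefs))."""
    features = kernel_features(sample_x, population_x, bandwidth)
    # The largest squared row norm bounds the curvature of the dual objective,
    # so its reciprocal is a conservative step size.
    step = 1.0 / max(sum(v * v for v in row) for row in features)
    coefs = [0.0] * len(features[0])
    for _ in range(n_iter):
        weights = tilted_weights(features, coefs)
        grad = imbalance(features, weights)  # gradient of the dual objective
        coefs = [c - step * g for c, g in zip(coefs, grad)]
    return tilted_weights(features, coefs), features


# Toy data (hypothetical): an evenly spread population and a sample that
# over-represents large values of the covariate.
population_x = [i / 10.0 for i in range(-20, 21)]
sample_x = [-2.0, -1.0, 0.0, 0.5, 1.0, 1.25, 1.5, 1.75, 2.0]

weights, features = kernel_balance(sample_x, population_x)
uniform = [1.0 / len(sample_x)] * len(sample_x)
before = math.sqrt(sum(g * g for g in imbalance(features, uniform)))
after = math.sqrt(sum(g * g for g in imbalance(features, weights)))
print(f"kernel imbalance (L2 norm): unweighted={before:.4f}, weighted={after:.4f}")
```

Because the weights are tied to kernel evaluations rather than to $X$ itself, reducing this imbalance also tends to improve balance on smooth nonlinear functions of $X$, which is the property the abstract emphasizes. A practical implementation would also need to choose the bandwidth and decide how closely to balance $\mathbf{K}$, neither of which this sketch addresses.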