In this paper we propose a convolution estimator for the density of a response variable. The estimator uses an underlying multiple regression framework to improve the accuracy of density estimates by incorporating auxiliary information. Suppose we have a sample of $N$ complete-case observations of a response variable and an associated set of covariates, together with an additional sample of $M$ observations of the covariates only. We show that the mean square error of the multiple regression-enhanced convolution estimator converges to zero at rate $O(N^{-1})$ and, moreover, that for a large fixed $N$ the mean square error converges at rate $O(M^{-4/5})$ to a constant of order $O(N^{-1})$. This is the first time that the convergence of a convolution estimator with respect to the amount of additional covariate information has been established. In contrast to convolution estimators based on the Nadaraya-Watson estimator for a nonlinear regression model, the multiple regression-enhanced convolution estimator proposed here does not suffer from the curse of dimensionality. It is particularly useful when one wants to estimate the density of a response variable that is challenging to measure but a large amount of additional covariate information is available. An application of this type from the field of ophthalmology motivated this work.
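To make the idea concrete, the following is a minimal sketch of one generic way such an estimator could be computed; the abstract does not give the exact formula, so this is an assumption rather than the authors' definition. It fits a least-squares multiple regression on the $N$ complete cases, evaluates the fitted values on all $N + M$ covariate observations, and smooths every sum of a fitted value and a residual with a Gaussian kernel. The function name `convolution_density`, its arguments, and the fixed `bandwidth` are hypothetical choices made for illustration.

```python
import numpy as np


def convolution_density(y_grid, X_complete, y_complete, X_extra, bandwidth):
    """Illustrative regression-enhanced convolution density estimate.

    Assumed form (not taken from the paper): a Gaussian-kernel average
    over all sums of a fitted value and a regression residual.
    """
    # Fit a multiple linear regression with intercept on the N complete cases.
    design = np.column_stack([np.ones(len(X_complete)), X_complete])
    beta, *_ = np.linalg.lstsq(design, y_complete, rcond=None)
    residuals = y_complete - design @ beta  # N residuals

    # Fitted values for all N + M covariate observations.
    X_all = np.vstack([X_complete, X_extra])
    fitted = np.column_stack([np.ones(len(X_all)), X_all]) @ beta

    # Every pairwise sum fitted_i + residual_j, flattened to one vector.
    pseudo = (fitted[:, None] + residuals[None, :]).ravel()

    # Gaussian kernel average evaluated at each grid point.
    u = (np.asarray(y_grid, dtype=float)[:, None] - pseudo[None, :]) / bandwidth
    kernel_sum = np.exp(-0.5 * u**2).sum(axis=1)
    return kernel_sum / (pseudo.size * bandwidth * np.sqrt(2.0 * np.pi))
```

In this sketch the extra $M$ covariate observations enter only through the fitted values, which is where the additional information can sharpen the estimate. The pairwise sums make memory grow with $(N + M) \cdot N$ times the grid size, so a practical implementation would likely need chunking or a data-driven bandwidth.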