Theoretically, the conditional expectation of a square-integrable random variable $Y$ given a $d$-dimensional random vector $X$ can be obtained by minimizing the mean squared distance between $Y$ and $f(X)$ over all Borel measurable functions $f \colon \mathbb{R}^d \to \mathbb{R}$. However, in many applications this minimization problem cannot be solved exactly, and a numerical method that computes an approximate minimum over a suitable subfamily of Borel functions has to be used instead. The quality of the result depends on the adequacy of the subfamily and the performance of the numerical method. In this paper, we derive an expected value representation of the minimal mean squared distance which in many applications can be approximated efficiently with a standard Monte Carlo average. This enables us to provide guarantees for the accuracy of any numerical approximation of a given conditional expectation. We illustrate the method by assessing the quality of approximate conditional expectations obtained by linear, polynomial, and neural network regression in different concrete examples.
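To make the idea concrete, the following is a minimal sketch of how such a Monte Carlo accuracy check might look. It rests on assumptions that are not stated in the abstract: that $Y = h(X, V)$ for a known function $h$ and a random variable $V$ independent of $X$, and that an independent copy $\tilde{V}$ of $V$ can be simulated. With $Z = h(X, \tilde{V})$, the variables $Y$ and $Z$ are conditionally independent given $X$ with the same conditional law, so $\mathbb{E}[Y(Y - Z)] = \mathbb{E}[(Y - \mathbb{E}[Y \mid X])^2]$ and $\mathbb{E}[(Y - f(X))(Z - f(X))] = \mathbb{E}[(f(X) - \mathbb{E}[Y \mid X])^2]$. The toy model `h`, the sample sizes, and all variable names are hypothetical choices for illustration and need not match the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 3, 50_000, 200_000  # illustrative sizes (assumptions)

def h(x, v):
    # Hypothetical toy model: Y = X_1^2 + V, so E[Y | X] = X_1^2
    # and the minimal mean squared distance equals Var(V) = 1.
    return x[:, 0] ** 2 + v

def sample(n):
    # Draw X, then Y = h(X, V) and Z = h(X, V~) with V~ an independent copy of V.
    x = rng.standard_normal((n, d))
    v = rng.standard_normal(n)
    v_copy = rng.standard_normal(n)
    return x, h(x, v), h(x, v_copy)

# Fit a linear regression f(x) = a + b.x by least squares on a training sample.
x_tr, y_tr, _ = sample(n_train)
design = np.column_stack([np.ones(n_train), x_tr])
coef, *_ = np.linalg.lstsq(design, y_tr, rcond=None)

def f(x):
    return coef[0] + x @ coef[1:]

# Evaluate on a fresh, independent test sample.
x_te, y_te, z_te = sample(n_test)
fx = f(x_te)

# Monte Carlo estimate of the minimal mean squared distance E[(Y - E[Y|X])^2].
min_mse = np.mean(y_te * (y_te - z_te))

# Monte Carlo estimate of the squared L2 error E[(f(X) - E[Y|X])^2] of the fit.
l2_error = np.mean((y_te - fx) * (z_te - fx))

print(f"estimated minimal MSE: {min_mse:.3f}")
print(f"estimated squared L2 error of linear fit: {l2_error:.3f}")
```

In this toy model the best linear predictor is the constant $1$, so the squared $L^2$ error of the linear fit should be close to $\operatorname{Var}(X_1^2) = 2$, while the minimal mean squared distance should be close to $1$. Neither estimate requires knowing $\mathbb{E}[Y \mid X]$ in closed form.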