Inference from limited data requires a notion of measure on parameter space, most explicit in the Bayesian framework as a prior. Here we demonstrate that Jeffreys prior, the best-known uninformative choice, introduces enormous bias when applied to typical scientific models. Such models have a relevant effective dimensionality much smaller than the number of microscopic parameters. Because Jeffreys prior treats all microscopic parameters equally, it is far from uniform when projected onto the subspace of relevant parameters, due to variations in the local co-volume of irrelevant directions. We present results on a principled choice of measure which avoids this issue, leading to unbiased inference in complex models. This optimal prior depends on the quantity of data to be gathered, and approaches Jeffreys prior in the asymptotic limit. However, this limit cannot be justified without an impossibly large amount of data, exponential in the number of microscopic parameters.
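As background for the abstract's central object: Jeffreys prior is defined as proportional to the square root of the determinant of the Fisher information matrix. A minimal one-parameter sketch (a Bernoulli model, chosen here purely as an illustration and not drawn from the paper) shows the construction:

```python
import math

# Jeffreys prior is proportional to sqrt(det I(theta)), where I is the
# Fisher information. For a single Bernoulli observation with success
# probability theta, I(theta) = 1 / (theta * (1 - theta)), so the prior
# density is proportional to theta^(-1/2) * (1 - theta)^(-1/2),
# i.e. a Beta(1/2, 1/2) distribution.
# Toy one-dimensional example only; the paper concerns multi-parameter
# models where projecting this prior onto relevant directions biases it.

def fisher_information_bernoulli(theta: float) -> float:
    """Fisher information of one Bernoulli trial with parameter theta."""
    return 1.0 / (theta * (1.0 - theta))

def jeffreys_density(theta: float) -> float:
    """Unnormalized Jeffreys prior density: sqrt(det I(theta))."""
    return math.sqrt(fisher_information_bernoulli(theta))

# The density is minimal at theta = 0.5 and diverges toward the
# boundaries, placing more mass near theta = 0 and 1 than a flat prior.
print(jeffreys_density(0.5))                            # → 2.0
print(jeffreys_density(0.01) > jeffreys_density(0.5))   # → True
```

In higher dimensions the same formula uses the determinant of the full Fisher matrix, which is where the co-volume effects described in the abstract arise.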
Far from Asymptopia