In statistical inference, uncertainty is unknown and all models are wrong. A person who constructs a statistical model and a prior distribution is aware that both are fictional, provisional candidates. To study such cases, several statistical measures have been constructed, such as cross validation, information criteria, and marginal likelihood; however, their mathematical properties have not yet been completely clarified when statistical models are under- or over-parametrized. In this paper, we introduce a mathematical framework for Bayesian statistics under unknown uncertainty, within which we show general properties of cross validation, information criteria, and marginal likelihood. The derived theory holds even if the unknown uncertainty is unrealizable by the statistical model, and even if the posterior distribution cannot be approximated by any normal distribution; hence it gives a helpful standpoint for a person who cannot believe in any specific model and prior. The results are as follows. (1) There exists a more precise statistical measure of the generalization loss than leave-one-out cross validation and information criteria, derived from their mathematical properties. (2) There exists a more efficient approximation method for the free energy, which is the minus log marginal likelihood, even if the posterior distribution cannot be approximated by any normal distribution. (3) The prior distributions optimized by cross validation and by the widely applicable information criterion are asymptotically equivalent to each other, and differ from the prior optimized by the marginal likelihood.
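To make the three quantities named above concrete, the following minimal sketch computes the importance-sampling form of leave-one-out cross validation, the widely applicable information criterion (WAIC), and the free energy for a toy model. Everything specific here is an assumption made for illustration and is not taken from the paper: the model (a normal likelihood with unit variance and a conjugate normal prior on the mean), the function names, the sample sizes, and the seed. It shows only the standard definitions of these measures, not the improved estimators the abstract announces.

```python
import math
import random


def log_normal_pdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def posterior_draws(xs, prior_var, num_draws, rng):
    """Draw from the exact posterior of the mean w for x_i ~ N(w, 1), w ~ N(0, prior_var)."""
    post_var = 1.0 / (len(xs) + 1.0 / prior_var)
    post_mean = post_var * sum(xs)
    return [rng.gauss(post_mean, math.sqrt(post_var)) for _ in range(num_draws)]


def loo_cv_and_waic(xs, draws):
    """Return (importance-sampling LOO-CV loss, WAIC), both per observation."""
    cv_total = 0.0
    waic_total = 0.0
    for x in xs:
        logp = [log_normal_pdf(x, w, 1.0) for w in draws]
        # LOO-CV term: log of the posterior mean of 1 / p(x | w)
        cv_total += log_mean_exp([-v for v in logp])
        # Training-loss term: minus log of the posterior mean of p(x | w)
        train = -log_mean_exp(logp)
        # Functional-variance term: posterior variance of log p(x | w)
        mean_lp = sum(logp) / len(logp)
        var_lp = sum((v - mean_lp) ** 2 for v in logp) / len(logp)
        waic_total += train + var_lp
    return cv_total / len(xs), waic_total / len(xs)


def free_energy(xs, prior_var):
    """Exact minus log marginal likelihood, accumulated via one-step-ahead predictives."""
    total = 0.0
    precision = 1.0 / prior_var  # posterior precision of w before seeing the next x
    weighted_sum = 0.0           # sum of the observations seen so far
    for x in xs:
        post_var = 1.0 / precision
        post_mean = post_var * weighted_sum
        total -= log_normal_pdf(x, post_mean, 1.0 + post_var)
        precision += 1.0
        weighted_sum += x
    return total


rng = random.Random(0)
data = [rng.gauss(0.5, 1.0) for _ in range(50)]  # hypothetical data source
draws = posterior_draws(data, prior_var=1.0, num_draws=4000, rng=rng)
cv, waic = loo_cv_and_waic(data, draws)
print("LOO-CV loss:", cv)
print("WAIC:       ", waic)
print("Free energy:", free_energy(data, prior_var=1.0))
```

In this regular, conjugate setting the two generalization-loss estimates should nearly coincide, while the free energy is a different kind of quantity (a total over the sample rather than a per-observation loss). The closed-form posterior is used only to keep the sketch short; in the singular or unrealizable cases the abstract targets, posterior draws would typically come from MCMC and no closed form for the free energy would be available.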