Marginal Bayesian predictive classifiers (mBpc), as opposed to simultaneous Bayesian predictive classifiers (sBpc), handle each data point separately and hence tacitly assume that the observations are independent. However, because the learning of the generative model parameters saturates, the adverse effect of this false assumption on the accuracy of mBpc tends to wear off as the amount of training data increases, guaranteeing the convergence of these two classifiers under the de Finetti type of exchangeability. This result, however, is far from trivial for sequences generated under partition exchangeability (PE), where even an abundance of training data does not rule out the possibility of an unobserved outcome (Wonderland!). We provide a computational scheme for generating sequences under PE. Based on it, and with a controlled increase of the training data, we show the convergence of sBpc and mBpc. This underpins the use of the simpler yet computationally more efficient marginal classifiers in place of the simultaneous ones. We also provide an estimator of the parameter of the generative model giving rise to the partition-exchangeable sequence, as well as a testing paradigm for the equality of this parameter across different samples. The package for Bayesian predictive supervised classification, parameter estimation, and hypothesis testing under the Ewens Sampling Formula generative model is deposited on CRAN as the PEkit package and is freely available from https://github.com/AmiryousefiLab/PEkit.
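The abstract does not spell out the generation scheme itself; as a minimal sketch (not the PEkit implementation), partition-exchangeable sequences governed by the Ewens Sampling Formula can be simulated with the Hoppe urn / Chinese restaurant construction. The function name sample_pe_sequence and the dispersion argument psi below are illustrative choices, not part of the package's API.

```python
import random

def sample_pe_sequence(n, psi, seed=None):
    """Sample a sequence of n species labels under partition exchangeability
    (Ewens Sampling Formula with dispersion parameter psi) via the Hoppe urn."""
    rng = random.Random(seed)
    labels, counts = [], []          # counts[j] = observations of species j so far
    for i in range(n):
        # With probability psi / (i + psi) the i-th draw is a previously unseen species...
        if rng.random() < psi / (i + psi):
            counts.append(1)
            labels.append(len(counts) - 1)
        else:
            # ...otherwise an already observed species j is drawn with probability counts[j] / (i + psi).
            j = rng.choices(range(len(counts)), weights=counts)[0]
            counts[j] += 1
            labels.append(j)
    return labels

# Example: even after 10,000 training draws, the next draw can still be a new species.
seq = sample_pe_sequence(10_000, psi=1.5, seed=1)
print(len(set(seq)), "distinct species observed")
```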