Principal component analysis (PCA) is perhaps the most widely used method for dimensionality reduction. A key question in a PCA decomposition of data is how many components to retain. This manuscript describes a new approach to automatically selecting the number of principal components, based on the Bayesian minimum message length method of inductive inference. We also derive a new estimate of the isotropic residual variance and demonstrate, via numerical experiments, that it improves on the usual maximum likelihood estimate.
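For orientation, the maximum likelihood baseline mentioned above can be sketched as follows. Under the standard probabilistic PCA model with k retained components, the maximum likelihood estimate of the isotropic residual variance is the average of the discarded eigenvalues of the sample covariance matrix. This is only a minimal sketch of that baseline: the function name `ml_residual_variance` and the use of NumPy are illustrative assumptions, and the minimum message length estimate proposed in the manuscript is not reproduced here.

```python
import numpy as np


def ml_residual_variance(X, k):
    """ML estimate of the isotropic residual variance with k components kept.

    Hypothetical helper for illustration. X is an (n, p) data matrix with
    samples in rows; k is the number of retained principal components.
    """
    n, p = X.shape
    if not 0 <= k < p:
        raise ValueError("k must satisfy 0 <= k < number of features")
    # Centre the data, then form the sample covariance with 1/n
    # normalisation (the maximum likelihood convention).
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.linalg.eigvalsh(S)[::-1]
    # Average the p - k smallest (discarded) eigenvalues.
    return eigvals[k:].mean()
```

Calling `ml_residual_variance(X, k)` over a range of `k` shows how the baseline residual variance shrinks as more components are retained. The estimate itself gives no criterion for choosing `k`, which is the selection problem the manuscript addresses.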