Principal component analysis has long been used to reduce the dimensionality of datasets. In this paper, we demonstrate that in mode detection the components of smallest variance, the pettiest components, matter more than the principal ones. We prove that when the data follow a multivariate normal distribution, "pettiest component analysis" yields boxes of optimal size, in the sense that their size is minimal among all boxes with the same number of dimensions and the same probability. We illustrate our result with a simulation showing that pettiest component analysis outperforms its competitors.
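The optimality claim can be illustrated numerically. The sketch below is one possible reading of the abstract, not the paper's own construction: it assumes a centered box aligned with k eigenvectors of the covariance matrix, with every side scaled by the same number of standard deviations so that the box carries the required probability. The function name `box_volume`, the example covariance, and the equal-scaling rule are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def box_volume(cov, k, prob, smallest=True):
    """Volume of a centered k-dimensional box holding `prob` of a normal
    distribution, aligned with k eigenvectors of `cov`.

    Assumption (not from the paper): every side spans the same number of
    standard deviations along its eigen-direction.
    """
    eigvals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    chosen = eigvals[:k] if smallest else eigvals[-k:]
    # Eigen-coordinates of a normal vector are independent, so each axis
    # must carry probability prob**(1/k) for the box to carry `prob`.
    per_axis = prob ** (1.0 / k)
    half_width = NormalDist().inv_cdf((1.0 + per_axis) / 2.0)
    sides = 2.0 * half_width * np.sqrt(chosen)
    return float(np.prod(sides))


# Hypothetical covariance: variances 9, 4, 1, 0.25 in a rotated basis.
rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
cov = q @ np.diag([9.0, 4.0, 1.0, 0.25]) @ q.T

v_petty = box_volume(cov, k=2, prob=0.5, smallest=True)
v_principal = box_volume(cov, k=2, prob=0.5, smallest=False)
print(v_petty, v_principal)
```

Under these assumptions the two boxes differ only through the product of the selected standard deviations, so the box built on the two smallest-variance directions is smaller than the one built on the two largest by the ratio of those products.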