Principal Component Analysis (PCA) is a popular method for dimension reduction and has attracted sustained interest for decades. More recently, kernel PCA has emerged as an extension of PCA but, despite its use in practice, a sound theoretical understanding of kernel PCA is still missing. In this paper, we contribute lower and upper bounds on the efficiency of kernel PCA, expressed in terms of the empirical eigenvalues of the kernel Gram matrix. Two of the bounds hold for fixed estimators, and two hold for randomized estimators through PAC-Bayes theory. We control how much information kernel PCA captures on average, and we dissect the bounds to highlight the strengths and limitations of the kernel PCA algorithm. We thereby contribute to a better understanding of kernel PCA. Our bounds are briefly illustrated on a toy numerical example.
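As a rough illustration of the empirical quantity the bounds are stated in, the sketch below computes the eigenvalues of a kernel Gram matrix and the share of the total spectrum carried by the top k of them. It does not reproduce the bounds of this paper. The Gaussian (RBF) kernel, the bandwidth `gamma`, the uncentered and 1/m-normalized Gram matrix, the synthetic data, and the helper names `rbf_gram` and `captured_fraction` are all assumptions made for the example.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clamp tiny negative squared distances caused by rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def captured_fraction(K, k):
    """Share of the Gram spectrum carried by the k largest eigenvalues."""
    m = K.shape[0]
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eig = np.linalg.eigvalsh(K / m)[::-1]
    # A PSD kernel has non-negative eigenvalues; clip rounding noise.
    eig = np.clip(eig, 0.0, None)
    return eig[:k].sum() / eig.sum()

# Synthetic toy data (hypothetical, not the example used in the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
K = rbf_gram(X, gamma=0.5)

# Fraction captured as the number of retained components grows.
fractions = [captured_fraction(K, k) for k in (1, 5, 20, 200)]
```

The fractions increase with k and reach 1 when all components are kept; how fast they grow depends on the decay of the empirical eigenvalues, which is the ingredient the bounds are built on.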