We study kernel methods in machine learning from the perspective of feature subspaces. We establish a one-to-one correspondence between feature subspaces and kernels and propose an information-theoretic measure for kernels. In particular, we construct a kernel from Hirschfeld--Gebelein--R\'{e}nyi maximal correlation functions, which we coin the maximal correlation kernel, and demonstrate its information-theoretic optimality. We use the support vector machine (SVM) as an example to illustrate the connection between kernel methods and feature extraction approaches. We show that the kernel SVM with the maximal correlation kernel achieves minimum prediction error. Finally, we interpret the Fisher kernel as a special case of the maximal correlation kernel and establish its optimality.
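As a hedged illustration of how a kernel can be assembled from maximal correlation functions (the precise construction and normalization used in the paper may differ), let $f_1, f_2, \dots$ denote the Hirschfeld--Gebelein--R\'{e}nyi maximal correlation functions of the input variable $X$ with respect to the label $Y$, i.e., solutions of
\[
(f_i, g_i) \in \operatorname*{arg\,max}_{f,\, g}\; \mathbb{E}\!\left[f(X)\, g(Y)\right]
\quad \text{s.t.} \quad \mathbb{E}[f(X)] = \mathbb{E}[g(Y)] = 0,\ \ \mathbb{E}[f^2(X)] = \mathbb{E}[g^2(Y)] = 1,
\]
where each successive pair is constrained to be uncorrelated with the preceding ones. One natural kernel built from these features is the Gram form of the feature map $x \mapsto (1, f_1(x), f_2(x), \dots)$,
\[
k(x, x') \;=\; 1 + \sum_{i} f_i(x)\, f_i(x').
\]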