Priming PCA with EigenGame

We introduce primed-PCA (pPCA), an extension of the recently proposed EigenGame algorithm for computing principal components in a large-scale setup. Our algorithm first runs EigenGame to get an approximation of the principal components, and then applies an exact PCA in the subspace they span. Since this subspace is of small dimension in any practical use of EigenGame, this second step is extremely cheap computationally. Nonetheless, it improves accuracy significantly for a given computational budget across datasets. In this setup, the purpose of EigenGame is to narrow down the search space, and prepare the data for the second step, an exact calculation. We show formally that pPCA improves upon EigenGame under very mild conditions, and we provide experimental validation on both synthetic and real large-scale datasets showing that it systematically translates to improved performance. In our experiments we achieve improvements in convergence speed by factors of 5-25 on the datasets of the original EigenGame paper.
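The second step described above, an exact PCA restricted to the subspace spanned by the approximate components, can be sketched roughly as follows. This is a minimal illustration based only on the abstract, not the authors' implementation: the function name `primed_pca` is hypothetical, the data are assumed to be centered, and the EigenGame output is replaced by a stand-in (the true top components mixed by a fixed matrix `M`), since EigenGame itself is not reproduced here.

```python
import numpy as np


def primed_pca(X, V_approx):
    """Refine approximate principal components by exact PCA in their span.

    X        : (n, d) data matrix, assumed already centered.
    V_approx : (d, k) approximate components (e.g. an EigenGame output).
    Returns (components, variances), sorted by decreasing variance.
    """
    n = X.shape[0]
    # Orthonormal basis Q of the subspace spanned by the approximate components.
    Q, _ = np.linalg.qr(V_approx)
    # Project the data into the k-dimensional subspace: shape (n, k).
    Z = X @ Q
    # Exact PCA on the small k x k covariance matrix.
    C_small = Z.T @ Z / n
    eigvals, eigvecs = np.linalg.eigh(C_small)
    order = np.argsort(eigvals)[::-1]
    # Map the k-dimensional eigenvectors back to the original d-dimensional space.
    return Q @ eigvecs[:, order], eigvals[order]


# Synthetic demo (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, k = 2000, 20, 3
X = rng.standard_normal((n, d)) * np.linspace(3.0, 0.5, d)
X = X - X.mean(axis=0)

# Reference answer: exact top-k eigenvectors of the full covariance.
full_vals, full_vecs = np.linalg.eigh(X.T @ X / n)
top = np.argsort(full_vals)[::-1][:k]
W = full_vecs[:, top]

# Stand-in for EigenGame: vectors that span the right subspace,
# but are individually mixed up by a fixed invertible matrix M.
M = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.5],
              [0.2, 0.4, 1.0]])
V_approx = W @ M

components, variances = primed_pca(X, V_approx)
```

In this demo the stand-in vectors span the correct subspace exactly, so the refinement recovers the reference components up to sign. With a real approximate solver the span is only approximately correct, and the refinement returns the best components available within that span. The only expensive operation here is the projection `X @ Q`; the eigendecomposition itself acts on a `k x k` matrix.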


Related content

PCA


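In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.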
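As a small illustration of this definition, the sketch below computes principal components from the eigendecomposition of a sample covariance matrix and checks that the projected coordinates are uncorrelated. The toy data and the mixing matrix `A` are assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2-D toy data (hypothetical, for illustration only).
A = np.array([[2.0, 0.0],
              [1.2, 0.5]])
X = rng.standard_normal((1000, 2)) @ A.T
X = X - X.mean(axis=0)  # PCA assumes centered data

# Eigenvectors of the covariance matrix are the principal components.
cov = X.T @ X / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]  # sort by decreasing variance
components = eigvecs[:, order]

# Coordinates of the data in the principal-component basis.
Z = X @ components
cov_Z = Z.T @ Z / Z.shape[0]  # diagonal up to rounding error
```

The off-diagonal entries of `cov_Z` vanish up to floating-point error, which is the "uncorrelated dimensions" property stated above, and its diagonal holds the variance captured by each component.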
