AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow

Principal component analysis (PCA) is widely used as an effective technique for feature extraction and dimension reduction. In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components with penalized loadings, with the penalty chosen automatically by model selection among candidate models with varying penalties. Earlier work [1, 2] proposed penalized PCA and indicated that model selection in $L_2$-penalized PCA is feasible through the solution path of Ridge regression; however, that approach is extremely time-consuming because it requires intensive computation of matrix inverses. In this paper, we propose a fast model selection method for penalized PCA, named Approximated Gradient Flow (AgFlow), which lowers the computational complexity by incorporating the implicit regularization effect introduced by (stochastic) gradient flow [3, 4] and obtains the complete solution path of $L_2$-penalized PCA under varying $L_2$-regularization. We perform extensive experiments on real-world datasets. AgFlow outperforms existing methods (Oja [5], Power [6], Shamir [7], and the vanilla Ridge estimators) in terms of computation costs.
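The contrast the abstract draws, between a Ridge solution path that needs one linear solve per penalty value and a gradient-based path that needs only matrix-vector products, can be illustrated on a plain least-squares problem. The sketch below is not the AgFlow algorithm itself. It assumes an unnormalized squared loss, full-batch gradient descent started from zero, and the heuristic pairing $\lambda \approx 1/(\eta t)$ between the iteration count $t$ and the Ridge penalty. The function names `ridge_path` and `gradient_descent_path` are hypothetical.

```python
import numpy as np


def ridge_path(X, y, lambdas):
    """Ridge solutions for each penalty; one d x d linear solve per lambda."""
    d = X.shape[1]
    gram = X.T @ X
    rhs = X.T @ y
    return [np.linalg.solve(gram + lam * np.eye(d), rhs) for lam in lambdas]


def gradient_descent_path(X, y, n_steps, step_size):
    """Iterates of gradient descent on 0.5 * ||X w - y||^2, started at zero.

    Each step costs only two matrix-vector products, and every iterate is
    kept, so a single run yields a whole path of estimates.
    """
    w = np.zeros(X.shape[1])
    path = []
    for _ in range(n_steps):
        w = w - step_size * (X.T @ (X @ w - y))
        path.append(w.copy())
    return path


def cosine(a, b):
    """Cosine similarity between two nonzero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic HDLSS-style data (assumption: 20 samples, 100 features).
rng = np.random.default_rng(0)
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Step size 1 / (largest eigenvalue of X^T X) keeps the iteration stable.
step_size = 1.0 / np.linalg.norm(X, 2) ** 2
n_steps = 50

gd_path = gradient_descent_path(X, y, n_steps, step_size)

# Assumed pairing: iterate t (1-based) is compared with lambda = 1 / (eta * t).
lambdas = [1.0 / (step_size * t) for t in range(1, n_steps + 1)]
ridge_solutions = ridge_path(X, y, lambdas)

similarities = [cosine(g, r) for g, r in zip(gd_path, ridge_solutions)]
gd_norms = [np.linalg.norm(w) for w in gd_path]
```

In this setup each gradient iterate points in nearly the same direction as its paired Ridge solution, and the norm of the iterates grows with $t$ just as the Ridge norm grows when $\lambda$ shrinks. Early iterations therefore behave like heavily penalized estimates and later ones like lightly penalized estimates.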



PCA


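In statistics, principal component analysis (PCA) is a method that projects data from a higher-dimensional space into a lower-dimensional one by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called principal components.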
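A minimal sketch of this construction, assuming the principal components are computed from the singular value decomposition of the centered data matrix (one standard route among several). The helper name `pca` and the synthetic data are illustrative assumptions.

```python
import numpy as np


def pca(X, n_components):
    """Return principal directions, their variances, and the projected data."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, ordered by decreasing variance.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    scores = X_centered @ components.T
    return components, explained_variance, scores


# Synthetic correlated 3-D data (assumption: 200 samples).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.standard_normal((200, 3)) @ mixing

components, explained_variance, scores = pca(X, n_components=2)

# Covariance of the projected data: off-diagonal entries vanish, which is
# the "uncorrelated dimensions" property described above.
score_cov = np.cov(scores, rowvar=False)
```

The rows of `components` form an orthonormal set, and the diagonal of `score_cov` matches `explained_variance`, so each retained direction carries the variance the decomposition assigns to it.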
