Streaming Kernel PCA Algorithm With Small Space

Principal Component Analysis (PCA) is a widely used technique in machine learning, data analysis, and signal processing. As datasets grow in size and complexity, it has become important to develop PCA algorithms with low space usage. Streaming PCA has gained significant attention in recent years because it can handle large datasets efficiently. The kernel method, which is commonly used in learning algorithms such as Support Vector Machines (SVMs), has also been applied to PCA. We propose a streaming algorithm for Kernel PCA problems based on the traditional scheme by Oja. Our algorithm addresses the challenge of reducing the memory usage of PCA while maintaining its accuracy. We analyze the performance of our algorithm by studying the conditions under which it succeeds. Specifically, we show that when the spectral ratio $R := \lambda_1/\lambda_2$ of the target covariance matrix is lower bounded by $C \cdot \log n \cdot \log d$, streaming PCA can be solved with $O(d)$ space cost. Our proposed algorithm has several advantages over existing methods. First, it is a streaming algorithm that can handle large datasets efficiently. Second, it employs the kernel method, which allows it to capture complex nonlinear relationships among data points. Third, it has low space usage, making it suitable for applications where memory is limited.
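The abstract does not spell out the update rule, so the following is only a minimal sketch of the classical Oja iteration that the proposed algorithm is said to build on, not the paper's method. It keeps a single unit vector `w` of length `dim`, which is where the $O(d)$ space figure comes from. The names `oja_top_component`, `synthetic_stream`, `learning_rate`, and the optional `feature_map` hook (a stand-in for an explicit kernel feature map) are all assumptions made for illustration.

```python
import math
import random


def oja_top_component(stream, dim, learning_rate=0.01, feature_map=None, seed=0):
    """Estimate the top principal direction of a stream of vectors.

    Only the current estimate w (dim floats) is stored, so memory is O(dim).
    feature_map is a hypothetical hook: if given, each sample is mapped
    through it first, standing in for an explicit kernel feature map.
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    for x in stream:
        if feature_map is not None:
            x = feature_map(x)
        # Oja update: w <- w + eta * x * (x . w), followed by renormalization.
        proj = sum(xi * wi for xi, wi in zip(x, w))
        w = [wi + learning_rate * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


def synthetic_stream(n, dim, seed=1):
    """Yield n zero-mean samples whose variance is largest along coordinate 0."""
    rng = random.Random(seed)
    for _ in range(n):
        # Standard deviation 3.0 on coordinate 0 and 0.5 elsewhere, so the
        # spectral ratio lambda_1 / lambda_2 is 9 / 0.25 = 36.
        yield [rng.gauss(0.0, 3.0) if i == 0 else rng.gauss(0.0, 0.5)
               for i in range(dim)]


if __name__ == "__main__":
    w = oja_top_component(synthetic_stream(5000, 5), dim=5)
    print([round(v, 3) for v in w])
```

On the synthetic stream the estimate should align, up to sign, with the first coordinate axis. The large gap between the top two eigenvalues in this toy data mirrors the role the spectral ratio $R$ plays in the stated guarantee, although the sketch makes no claim about the constant $C$ or the exact condition.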


PCA

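In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional one by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.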
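For two-dimensional points, the best-fit construction described above can be worked out in closed form. The sketch below is illustrative only: the function name `best_fit_directions_2d` is an assumption, and it relies on the standard fact that the principal axis of a 2×2 covariance matrix lies at angle $\tfrac{1}{2}\operatorname{atan2}(2 s_{xy},\, s_{xx} - s_{yy})$.

```python
import math


def best_fit_directions_2d(points):
    """Return (mean, first, second) for a list of (x, y) points.

    first is the unit direction of the best-fit line through the mean;
    second is the unit direction perpendicular to it.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Entries of the 2x2 covariance matrix of the centered points.
    s_xx = sum((p[0] - mean_x) ** 2 for p in points) / n
    s_yy = sum((p[1] - mean_y) ** 2 for p in points) / n
    s_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n

    # Angle of the eigenvector with the largest eigenvalue.
    theta = 0.5 * math.atan2(2.0 * s_xy, s_xx - s_yy)
    first = (math.cos(theta), math.sin(theta))
    second = (-math.sin(theta), math.cos(theta))
    return (mean_x, mean_y), first, second


if __name__ == "__main__":
    mean, first, second = best_fit_directions_2d([(0, 0), (1, 1), (2, 2), (3, 3)])
    print(mean, first, second)
```

For points lying exactly on the line $y = x$, the first direction points along that line and the second is perpendicular to it, matching the description of successive best-fit lines.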
