Asymptotic Theory of Principal Component Analysis for Time Series Data with Cautionary Comments

Principal component analysis (PCA) is among the most frequently used statistical tools in almost all branches of data science. However, like many other statistical tools, it carries a risk of misuse or even abuse. In this paper, we highlight possible pitfalls in applying theoretical results for PCA that assume independent data when the data are in fact time series. For time series data, we state with proof a central limit theorem for the eigenvalues and eigenvectors (loadings), give direct and bootstrap estimators of their asymptotic covariances, and assess the efficacy of these estimators via simulation. Specifically, we pay attention to the proportion of variation, which decides the number of principal components (PCs), and the loadings, which help interpret the meaning of the PCs. Our findings are that while the proportion of variation is quite robust to different dependence assumptions, inference on the PC loadings requires careful attention. We begin and conclude our investigation with an empirical example on portfolio management, in which the PC loadings play a prominent role. It is given as a paradigm of correct usage of PCA for time series data.
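To make the contrast between independence-based and dependence-aware inference concrete, the following is a minimal sketch, not the paper's estimator. The abstract does not say which bootstrap scheme the authors use, so the moving block bootstrap here is an assumption, as are the simulated AR(1)-type series, the block length of 20, and all function names. The sketch estimates the standard error of the first PC's proportion of variation twice: once resampling rows independently, once resampling contiguous blocks so that serial dependence is preserved.

```python
import numpy as np

def proportion_of_variation(X):
    """Eigenvalues of the sample covariance, largest first, as shares of the total."""
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    return eigvals / eigvals.sum()

def moving_block_bootstrap(X, block_len, rng):
    """Resample rows of X in contiguous blocks (hypothetical helper, assumed scheme)."""
    n = X.shape[0]
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_len)[None, :]).ravel()[:n]
    return X[idx]

rng = np.random.default_rng(1)
n, p, phi = 400, 3, 0.6  # illustrative sizes and autoregressive coefficient

# Simulated series (an assumption): each coordinate follows an AR(1)-type recursion
# driven by cross-correlated noise.
mix = np.array([[1.0, 0.5, 0.2],
                [0.0, 1.0, 0.3],
                [0.0, 0.0, 1.0]])
noise = rng.standard_normal((n, p)) @ mix
X = np.zeros((n, p))
for t in range(1, n):
    X[t] = phi * X[t - 1] + noise[t]

B = 200  # number of bootstrap replicates
iid_draws = np.array([
    proportion_of_variation(X[rng.integers(0, n, size=n)])[0] for _ in range(B)
])
block_draws = np.array([
    proportion_of_variation(moving_block_bootstrap(X, 20, rng))[0] for _ in range(B)
])

se_iid = iid_draws.std(ddof=1)      # ignores serial dependence
se_block = block_draws.std(ddof=1)  # keeps dependence within blocks
print(f"SE (i.i.d. resampling):  {se_iid:.4f}")
print(f"SE (block resampling):   {se_block:.4f}")
```

The same loop could collect the loading vectors instead of the first proportion, after fixing the sign of each eigenvector, since eigenvectors are only defined up to sign.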


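PCA

In statistics, principal component analysis (PCA) is a method that projects data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a set of points in two, three, or more dimensions, a "best-fit" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fit line can be chosen in the same way from the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.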
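The construction described above can be sketched with an eigendecomposition of the sample covariance matrix. This is a minimal illustration rather than a reference implementation: the function name `pca` and the toy data are assumptions introduced here, and only standard NumPy calls are used.

```python
import numpy as np

def pca(X):
    """Return eigenvalues (largest first), loadings (as columns), and scores for X (n x p)."""
    Xc = X - X.mean(axis=0)                 # center each column
    cov = np.cov(Xc, rowvar=False)          # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder so the largest variance comes first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # coordinates of the data in the new basis
    return eigvals, eigvecs, scores

rng = np.random.default_rng(0)
# Toy data (an assumption for illustration): 500 points in 3-D with correlated coordinates.
A = np.array([[2.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.5, 0.3, 0.2]])
X = rng.standard_normal((500, 3)) @ A.T

eigvals, loadings, scores = pca(X)
proportion = eigvals / eigvals.sum()        # proportion of variation per component

print("proportion of variation:", np.round(proportion, 3))
print("loadings orthonormal:", np.allclose(loadings.T @ loadings, np.eye(3)))
```

The columns of `loadings` are the successive best-fit directions, and the covariance matrix of `scores` is diagonal, which is the sense in which the new dimensions are uncorrelated.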
