Pearson's correlation is among the most widely reported measures of association. The strength of the statistical evidence for linear association is quantified by the p-value of a hypothesis test. If the true distribution of a dataset is bivariate normal, then under specific data transformations a t-statistic returns the exact p-value; otherwise it is an approximation. Alternatively, the p-value can be estimated by analyzing the distribution of the sample correlation under permutations of the data. Moment approximations of this distribution are less widely used because estimating the moments themselves is numerically intensive and carries greater uncertainty. In this paper we derive an inductive formula that allows the sample moments of the sample correlation under permutations of the data to be expressed analytically in terms of the central moments of the data. Placed in a proper statistical framework, these formulas could open up the possibility of new methods for estimating the p-value.
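The quantities discussed above can be illustrated with a minimal sketch, which is not the inductive formula derived in the paper. It computes the sample correlation, the usual t-statistic with n - 2 degrees of freedom, a Monte Carlo permutation p-value, and, for very small samples, the moments of the permutation distribution by brute-force enumeration. All function names, the number of permutations, and the seed are assumptions chosen for illustration.

```python
import itertools
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)); undefined when |r| = 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value (illustrative defaults)."""
    rng = random.Random(seed)
    r_obs = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(pearson_r(x, y_perm)) >= r_obs - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def exact_permutation_moments(x, y, max_order=2):
    """Raw moments of r over all n! permutations of y (small n only)."""
    sums = [0.0] * max_order
    count = 0
    for y_perm in itertools.permutations(y):
        r = pearson_r(x, y_perm)
        for k in range(max_order):
            sums[k] += r ** (k + 1)
        count += 1
    return [s / count for s in sums]
```

Exhaustive enumeration costs n! evaluations, so it is practical only for a handful of points, which is the numerical burden that analytic moment formulas would avoid. For any data with non-constant x and y, the first two permutation moments are 0 and 1/(n - 1); higher moments depend on the central moments of the data.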