Efficient and accurate estimation of multivariate empirical probability distributions is fundamental to the calculation of information-theoretic measures such as mutual information and transfer entropy. Common techniques include variations on histogram estimation which, whilst computationally efficient, often fail to closely approximate the underlying probability density functions, particularly for distributions with fat tails or fine substructure, or when sample sizes are small. This paper demonstrates that applying rotation operations can improve entropy estimates by aligning the geometry of the partition to the sample distribution. A method for generating equiprobable multivariate histograms by recursive binary partitioning is presented, for which optimal rotations are found. Such optimal partitions were observed to be more accurate than existing techniques in estimating the entropies of correlated bivariate Gaussian distributions with known theoretical values, across varying sample sizes (99% CI).
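The idea of an equiprobable histogram built by recursive binary partitioning, evaluated in a rotated frame, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes median cuts along alternating axes, a bounding box taken from the sample extremes, and a plug-in estimate $\hat{H} = \sum_i p_i \ln(A_i / p_i)$, where $p_i$ is the fraction of samples in cell $i$ and $A_i$ is the cell area. The names `partition` and `entropy_estimate`, the partition depth, and the grid of trial angles are illustrative choices. The abstract does not state the criterion used to select the optimal rotation, so the sketch only compares each angle against the known theoretical entropy of a correlated bivariate Gaussian, $H = \tfrac{1}{2}\ln\left((2\pi e)^2 \det\Sigma\right)$.

```python
import numpy as np


def partition(points, lo, hi, depth, axis=0):
    """Split the box [lo, hi] at the sample median, alternating axes.

    Returns a list of (count, area) pairs, one per leaf cell.
    Assumes continuous data, i.e. no tied coordinates.
    """
    if depth == 0 or len(points) < 2:
        return [(len(points), float(np.prod(hi - lo)))]
    cut = np.median(points[:, axis])
    left = points[points[:, axis] <= cut]
    right = points[points[:, axis] > cut]
    hi_left = hi.copy()
    hi_left[axis] = cut
    lo_right = lo.copy()
    lo_right[axis] = cut
    nxt = (axis + 1) % points.shape[1]
    return (partition(left, lo, hi_left, depth - 1, nxt)
            + partition(right, lo_right, hi, depth - 1, nxt))


def entropy_estimate(x, angle=0.0, depth=6):
    """Plug-in entropy (nats) of the partition built in a frame rotated by `angle`."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    y = x @ rot.T  # rotation preserves area, hence differential entropy
    cells = partition(y, y.min(axis=0), y.max(axis=0), depth)
    n = len(y)
    return sum((k / n) * np.log(area * n / k) for k, area in cells if k > 0)


# Correlated bivariate Gaussian with a known theoretical entropy.
rng = np.random.default_rng(0)
rho = 0.8  # illustrative correlation, not a value taken from the paper
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(np.zeros(2), cov, size=4096)
h_true = 0.5 * np.log((2 * np.pi * np.e) ** 2 * np.linalg.det(cov))

# Compare the estimate against the theoretical value over a grid of angles.
for angle in np.linspace(0.0, np.pi / 2, 7):
    h_hat = entropy_estimate(x, angle)
    print(f"angle={np.degrees(angle):5.1f} deg  "
          f"estimate={h_hat:.4f}  error={h_hat - h_true:+.4f}")
```

With 4096 samples and a depth of 6, every leaf cell holds the same number of points, so the partition is equiprobable by construction and only the cell geometry changes with the rotation angle.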