Entropy estimation is of practical importance in information theory and statistical science. Many existing entropy estimators suffer from an estimation bias that grows quickly with dimensionality, which renders them unsuitable for high-dimensional problems. In this work, we propose a transform-based method for high-dimensional entropy estimation, which consists of two main ingredients. First, by modifying the k-NN-based entropy estimator, we propose a new estimator that enjoys small estimation bias for samples that are close to a uniform distribution. Second, we design a normalizing-flow-based mapping that pushes samples toward a uniform distribution, and we derive the relation between the entropy of the original samples and that of the transformed ones. As a result, the entropy of a given set of samples is estimated by first transforming them toward a uniform distribution and then applying the proposed estimator to the transformed samples. The performance of the proposed method is compared against several existing entropy estimators, using both mathematical examples and real-world applications.
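The transform-then-estimate idea can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. The estimator below is the standard Kozachenko-Leonenko k-NN estimator, not the modified estimator proposed here, and a simple isotropic scaling z = x / a stands in for the normalizing-flow mapping. For any invertible differentiable map f with Z = f(X), the change-of-variables formula gives H(X) = H(Z) - E[log |det J_f(X)|], which is the kind of relation the abstract refers to. All function names (`digamma_int`, `log_unit_ball_volume`, `knn_entropy`, `entropy_via_scaling`) are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def log_unit_ball_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1.0)


def knn_entropy(samples, k=3):
    """Standard Kozachenko-Leonenko estimate (in nats), brute-force neighbours.

    Assumes no duplicate points, since a zero k-NN distance has no logarithm.
    """
    n = len(samples)
    d = len(samples[0])
    log_r_sum = 0.0
    for i, x in enumerate(samples):
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th neighbour
    return (digamma_int(n) - digamma_int(k)
            + log_unit_ball_volume(d) + d * log_r_sum / n)


def entropy_via_scaling(samples, a, k=3):
    """Estimate H(X) by mapping z = x / a, then correcting with the Jacobian.

    The scaling is an illustrative stand-in for a learned flow; its
    log |det J| is the constant -d * log(a).
    """
    d = len(samples[0])
    z = [[xi / a for xi in x] for x in samples]
    log_abs_det_jacobian = -d * math.log(a)
    return knn_entropy(z, k) - log_abs_det_jacobian


random.seed(0)
d, a, n = 2, 3.0, 400
x = [[a * random.random() for _ in range(d)] for _ in range(n)]  # uniform on [0, a]^d
estimate = entropy_via_scaling(x, a)
true_entropy = d * math.log(a)
print(f"estimate = {estimate:.3f}, true = {true_entropy:.3f}")
```

In this toy case the scaling maps the samples exactly onto the unit cube, so the only remaining error is that of the k-NN estimator on uniform samples. With a learned flow the Jacobian term would vary per sample and would be averaged over the data.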