Entropy is a measure of uncertainty that plays a central role in information theory. When the distribution of the data is unknown, an estimate of the entropy must be obtained from the data sample itself. We propose a semi-parametric estimate based on a mixture model approximation of the distribution of interest. The estimate can rely on any type of mixture, but we focus on Gaussian mixture models to demonstrate its accuracy and versatility. The performance of the proposed approach is assessed through a series of simulation studies. We also illustrate its use on two real-life data examples.
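To make the idea of a mixture-based entropy estimate concrete, the following is a minimal sketch and not necessarily the estimator proposed in the paper, whose exact form is not given in the abstract. It assumes one-dimensional data, fits a Gaussian mixture with a small hand-written EM loop, and then uses a resubstitution (plug-in) estimate, $\hat{H} = -\frac{1}{n}\sum_{i=1}^{n} \log \hat{f}(x_i)$, where $\hat{f}$ is the fitted mixture density. The function names `fit_gmm_1d` and `mixture_entropy`, the number of components, and the iteration count are illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_components=2, n_iter=100, seed=0):
    """Fit a 1D Gaussian mixture by plain EM (illustrative, not optimized)."""
    rng = random.Random(seed)
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)

    # Initialization: equal weights, means at random data points,
    # every component starts with the overall sample variance.
    weights = [1.0 / n_components] * n_components
    means = rng.sample(data, n_components)
    variances = [overall_var] * n_components

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against underflow
            resp.append([j / total for j in joint])

        # M-step: update weights, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-12
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid degenerate components

    return weights, means, variances


def mixture_entropy(data, weights, means, variances):
    """Resubstitution entropy estimate (in nats) under the fitted mixture."""
    total_log_density = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v)
                      for w, m, v in zip(weights, means, variances))
        total_log_density += math.log(max(density, 1e-300))
    return -total_log_density / len(data)


# Usage: standard normal sample, whose true entropy is 0.5 * log(2 * pi * e).
sample_rng = random.Random(1)
sample = [sample_rng.gauss(0.0, 1.0) for _ in range(1000)]
weights, means, variances = fit_gmm_1d(sample, n_components=2)
estimate = mixture_entropy(sample, weights, means, variances)
true_entropy = 0.5 * math.log(2.0 * math.pi * math.e)
print(f"estimated entropy: {estimate:.3f} nats, true value: {true_entropy:.3f} nats")
```

Because the same sample is used both to fit the mixture and to evaluate the log-density, this plug-in estimate tends to be slightly optimistic (biased downward); evaluating on held-out points or sampling from the fitted mixture are common alternatives. In practice a tested library implementation of Gaussian mixtures would usually replace the hand-written EM loop.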