As a fundamental concept in information theory, mutual information (MI) is commonly used to quantify the association between random variables. Most existing MI estimators have unstable statistical performance because they require parameter tuning. We develop a consistent and powerful estimator, called fastMI, that requires no parameter tuning. Using a copula formulation, fastMI estimates MI through fast Fourier transform (FFT)-based estimation of the underlying density. Extensive simulation studies show that fastMI outperforms state-of-the-art estimators, with improved estimation accuracy and reduced run time on large data sets. fastMI not only provides a powerful independence test that controls the type I error rate but can also be used for further inference. We establish the asymptotic normality of fastMI for dependent random variables using a new data-splitting analytic argument. Anticipating that fastMI will be a powerful tool for estimating mutual information in a broad range of data, we provide an R package, fastMI, for broader dissemination.
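To make the copula-plus-FFT idea concrete, the sketch below is a minimal illustration in Python/NumPy, not the authors' fastMI implementation: it rank-transforms the data to pseudo-observations on the unit square, bins them, smooths the binned counts with a Gaussian kernel via a 2-D FFT convolution, and evaluates MI as the negative copula entropy (the KL divergence of the estimated copula density from the uniform density). The grid size and bandwidth below are illustrative choices, whereas fastMI itself is tuning-free.

```python
import numpy as np

def copula_fft_mi(x, y, grid=64, bw=0.05):
    """Illustrative copula/FFT MI estimate (a sketch, not the fastMI estimator).

    MI(X, Y) = integral of c(u, v) * log c(u, v) over [0, 1]^2,
    where c is the copula density of (X, Y).
    """
    n = len(x)
    # Rank (copula) transform to pseudo-observations in (0, 1).
    u = (np.argsort(np.argsort(x)) + 0.5) / n
    v = (np.argsort(np.argsort(y)) + 0.5) / n
    # Bin the pseudo-observations on a regular grid over the unit square.
    counts, _, _ = np.histogram2d(u, v, bins=grid, range=[[0, 1], [0, 1]])
    # Build a separable Gaussian kernel on the same grid (illustrative bandwidth).
    t = (np.arange(grid) + 0.5) / grid
    g = np.exp(-0.5 * ((t - 0.5) / bw) ** 2)
    kernel = np.outer(g, g)
    kernel /= kernel.sum()
    # Smooth the histogram by circular convolution computed with the 2-D FFT.
    dens = np.real(np.fft.ifft2(np.fft.fft2(counts) *
                                np.fft.fft2(np.fft.ifftshift(kernel))))
    dens = np.maximum(dens, 0.0)
    # Normalize to a density on [0, 1]^2 (each cell has area 1 / grid^2).
    c = dens / dens.sum() * grid * grid
    mask = c > 0
    # Riemann sum of c * log(c); nonnegative, zero iff c is uniform.
    return float(np.sum(c[mask] * np.log(c[mask])) / (grid * grid))
```

For independent samples the estimate is close to zero, while dependent samples yield a clearly larger value; the FFT convolution is what keeps the density-estimation step fast on fine grids.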