Mutual information (MI) is a fundamental measure of statistical dependence, with a myriad of applications to information theory, statistics, and machine learning. While it possesses many desirable structural properties, the estimation of high-dimensional MI from samples suffers from the curse of dimensionality. Motivated by statistical scalability to high dimensions, this paper proposes \emph{sliced} MI (SMI) as a surrogate measure of dependence. SMI is defined as an average of MI terms between one-dimensional random projections. We show that it preserves many of the structural properties of classic MI, while gaining scalable computation and efficient estimation from samples. Furthermore, and in contrast to classic MI, SMI can grow as a result of deterministic transformations. This enables leveraging SMI for feature extraction by optimizing it over processing functions of raw data to identify useful representations thereof. Our theory is supported by numerical studies of independence testing and feature extraction, which demonstrate the potential gains SMI offers over classic MI for high-dimensional inference.
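For concreteness, the averaging over one-dimensional projections referred to above can be written as follows (a sketch of the standard formulation; the notation here, with $\sigma_d$ denoting the uniform distribution on the unit sphere $\mathbb{S}^{d-1}$, is introduced for illustration):
\[
\mathrm{SI}(X;Y) \;=\; \int_{\mathbb{S}^{d_x-1}} \int_{\mathbb{S}^{d_y-1}} I\big(\theta^\top X;\, \phi^\top Y\big)\, d\sigma_{d_y}(\phi)\, d\sigma_{d_x}(\theta),
\]
where $X \in \mathbb{R}^{d_x}$ and $Y \in \mathbb{R}^{d_y}$, and each inner term is a classic MI between scalar projections. The claim of scalable estimation can likewise be illustrated by a minimal Monte Carlo sketch: sample projection directions uniformly from the spheres and average any off-the-shelf one-dimensional MI estimate over them. The sketch below uses scikit-learn's $k$-nearest-neighbor based \texttt{mutual\_info\_regression} as the scalar estimator; the function name \texttt{sliced\_mi} and the default of 128 projections are illustrative assumptions, not the paper's estimator.
\begin{verbatim}
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def sliced_mi(x, y, n_projections=128, seed=None):
    """Monte Carlo estimate of SMI: average 1D MI estimates over
    random projection directions drawn uniformly from unit spheres."""
    rng = np.random.default_rng(seed)
    dx, dy = x.shape[1], y.shape[1]
    vals = []
    for _ in range(n_projections):
        # Normalized Gaussians are uniform on S^{dx-1} and S^{dy-1}.
        theta = rng.standard_normal(dx)
        theta /= np.linalg.norm(theta)
        phi = rng.standard_normal(dy)
        phi /= np.linalg.norm(phi)
        # 1D MI between the projected samples (k-NN estimator).
        mi = mutual_info_regression((x @ theta).reshape(-1, 1), y @ phi)
        vals.append(mi[0])
    return float(np.mean(vals))

# Example: linearly dependent X and Y yield a clearly positive estimate.
rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 10))
y = x @ rng.standard_normal((10, 5)) + 0.5 * rng.standard_normal((1000, 5))
print(sliced_mi(x, y, n_projections=64, seed=1))
\end{verbatim}
Each inner estimate is a one-dimensional problem, which is what sidesteps the curse of dimensionality that plagues direct high-dimensional MI estimation.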