We introduce an information-theoretic quantity with properties similar to those of mutual information that can be estimated from data without explicit assumptions about the underlying distribution. This quantity builds on a recently proposed matrix-based entropy that uses the eigenvalues of a normalized Gram matrix to estimate the eigenvalues of an uncentered covariance operator in a reproducing kernel Hilbert space. We show that a difference of matrix-based entropies (DiME) is well suited to problems involving the maximization of mutual information between random variables. Whereas many methods for such tasks admit trivial solutions, DiME naturally penalizes such outcomes. We provide several use cases for the proposed quantity, including a multi-view representation learning problem in which DiME is used to encourage learning a shared representation among views with high mutual information. We also demonstrate the versatility of DiME by using it as the objective function for a variety of tasks.
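To make the construction concrete, the following is a minimal sketch in NumPy. The matrix-based Rényi α-entropy of a positive semi-definite Gram matrix K is computed as S_α(A) = (1 − α)^{-1} log₂ Σᵢ λᵢ(A)^α, where A = K / tr(K) is the trace-normalized Gram matrix, matching the description above. The specific form of DiME shown here, the gap between the joint entropy under random re-pairings of the two views (a proxy for independence) and the actual joint entropy, along with the RBF kernel choice and the helper names `rbf_gram` and `dime`, are illustrative assumptions, not the paper's verbatim specification.

```python
import numpy as np

def matrix_based_entropy(K, alpha=1.01):
    """Matrix-based Renyi alpha-entropy: normalize the PSD Gram matrix K
    to unit trace and apply S_alpha(A) = log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative eigenvalues
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def rbf_gram(X, sigma=1.0):
    """Gaussian (RBF) Gram matrix; the kernel choice is an assumption here."""
    sq = np.sum(X ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma ** 2))

def dime(X, Y, alpha=1.01, n_perms=10, rng=None):
    """Hedged sketch of DiME: expected joint entropy under shuffled pairings
    of the views minus the actual joint entropy. The exact estimator in the
    paper may differ; this illustrates the difference-of-entropies idea."""
    rng = np.random.default_rng(rng)
    A, B = rbf_gram(X), rbf_gram(Y)
    # Hadamard product of the two Gram matrices plays the role of a joint Gram matrix.
    joint = matrix_based_entropy(A * B, alpha)
    shuffled = 0.0
    for _ in range(n_perms):
        p = rng.permutation(len(Y))
        shuffled += matrix_based_entropy(A * B[np.ix_(p, p)], alpha)
    return shuffled / n_perms - joint
```

Under this sketch, strongly dependent paired samples push the actual joint entropy below the shuffled baseline, yielding a positive score, while a collapsed (constant) representation makes the shuffled and actual terms coincide and drives the score to zero, which mirrors the claim that trivial solutions are penalized.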