We argue that the estimation of mutual information between high-dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-prop, and strongly consistent. We present a handful of applications in which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement the Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings.
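As context for the first claim, a minimal PyTorch sketch of the estimator follows. It uses the Donsker-Varadhan representation of the KL divergence that MINE builds on, I(X;Z) ≥ E_P[T_θ] − log E_{P_X⊗P_Z}[exp(T_θ)], maximized by gradient ascent over a "statistics network" T_θ. The network architecture, dimensions, learning rate, and toy Gaussian data are illustrative assumptions, not the paper's experimental setup.

```python
import math
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """Small MLP T_theta(x, z) whose outputs parameterize the DV bound."""
    def __init__(self, x_dim, z_dim, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1)).squeeze(1)

def dv_lower_bound(T, x, z):
    """Donsker-Varadhan lower bound on I(X; Z).

    Paired (x, z) rows are joint samples; shuffling z within the batch
    approximates samples from the product of the marginals.
    """
    z_shuffled = z[torch.randperm(z.size(0))]
    joint_term = T(x, z).mean()
    marginal_term = torch.logsumexp(T(x, z_shuffled), dim=0) - math.log(z.size(0))
    return joint_term - marginal_term

# Toy usage: correlated Gaussians, where z is a noisy copy of x.
x_dim = z_dim = 10
T = StatisticsNetwork(x_dim, z_dim)
opt = torch.optim.Adam(T.parameters(), lr=1e-4)
for step in range(5000):
    x = torch.randn(256, x_dim)
    z = x + 0.5 * torch.randn(256, z_dim)
    loss = -dv_lower_bound(T, x, z)   # ascending the bound = descending its negation
    opt.zero_grad()
    loss.backward()
    opt.step()
# After training, dv_lower_bound(T, x, z) estimates the mutual information (in nats).
```

The sketch above uses the plain minibatch gradient; the paper additionally proposes a bias-corrected gradient that replaces the minibatch estimate in the second term's denominator with an exponential moving average.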