Mutual information (MI) is a useful measure in information theory. It can be interpreted as the amount of information one random variable contains about another, or equivalently as the reduction in uncertainty about one random variable that results from knowing the other.
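For discrete variables this definition has a direct formula, I(X;Y) = Σ p(x,y) log₂ [p(x,y) / (p(x)p(y))], which a few lines of NumPy can compute from a joint probability table. The example below is a minimal sketch (the `mutual_information` helper and the toy joint distribution are illustrative, not from the text above):

```python
import numpy as np

def mutual_information(joint: np.ndarray) -> float:
    """I(X;Y) in bits from a joint probability table p(x, y)."""
    px = joint.sum(axis=1, keepdims=True)   # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)   # marginal p(y)
    nz = joint > 0                          # skip zero cells to avoid log(0)
    return float((joint[nz] * np.log2(joint[nz] / (px * py)[nz])).sum())

# Two perfectly correlated fair bits: knowing Y removes all uncertainty
# about X, so I(X;Y) = H(X) = 1 bit.
joint = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
print(mutual_information(joint))  # → 1.0
```

For independent variables the joint factorizes, every log ratio is zero, and the same function returns 0 — matching the "reduction in uncertainty" reading: knowing one variable tells you nothing about the other.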


In recent years, mutual information (MI) has attracted broad attention as a tool for bounding the generalization error of deep neural networks (DNNs). However, because it is difficult to estimate entropy in neural networks accurately, most previous studies have had to relax the entropy bounds, which weakens the information-theoretic explanation of generalization. To address this limitation, this paper introduces a probabilistic representation of DNNs for accurate MI estimation. Using the proposed MI estimator, the authors validate the information-theoretic explanation of generalization and derive a generalization bound tighter than the state-of-the-art relaxations.


Latest Content

Language models are known to produce vague and generic outputs. We propose two unsupervised decoding strategies based on either word-frequency or point-wise mutual information to increase the specificity of any model that outputs a probability distribution over its vocabulary at generation time. We test the strategies in a prompt completion task; with human evaluations, we find that both strategies increase the specificity of outputs with only modest decreases in sensibility. We also briefly present a summarization use case, where these strategies can produce more specific summaries.
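The point-wise mutual information (PMI) strategy mentioned above can be illustrated with a toy rescoring step. The abstract does not give the exact decoding rule, so the following is a hedged sketch under one common formulation: pmi(w; context) = log p(w | context) − log p(w), which favors tokens that are probable *because of* the context rather than generically frequent. The vocabulary and all probability values here are made up for illustration:

```python
import numpy as np

# Toy vocabulary with LM next-token probabilities and corpus unigram
# frequencies (all values are invented for this example).
vocab = ["thing", "stuff", "telescope", "galaxy"]
p_model = np.array([0.40, 0.30, 0.20, 0.10])      # p(w | context) from the LM
p_unigram = np.array([0.20, 0.15, 0.01, 0.008])   # p(w) from a corpus

# Plain greedy decoding picks the most probable, typically generic, token.
greedy = vocab[int(np.argmax(p_model))]

# PMI decoding rescores each candidate by how much the context raises
# its probability over its base rate.
pmi = np.log(p_model) - np.log(p_unigram)
pmi_choice = vocab[int(np.argmax(pmi))]

print(greedy, pmi_choice)  # → thing telescope
```

The generic token "thing" wins under raw probability, but "telescope" has the largest probability lift over its unigram base rate, so PMI rescoring selects the more specific word — the effect the abstract reports from its human evaluations.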

