Pimentel et al. (2020) recently analysed probing from an information-theoretic perspective. They argue that probing should be seen as approximating mutual information. This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences. Mutual information, however, assumes that the true probability distribution of a pair of random variables is known, which leads to unintuitive results in settings where it is not. This paper proposes a new framework to measure what we term Bayesian mutual information, which analyses information from the perspective of Bayesian agents and allows for more intuitive findings in scenarios with finite data. For instance, under Bayesian MI, data can add information, processing can help, and information can hurt, which makes it more intuitive for machine learning applications. Finally, we apply our framework to probing, where we believe Bayesian mutual information naturally operationalises ease of extraction by explicitly limiting the available background knowledge to solve a task.
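To make the claim that data can add information concrete, the following is a minimal sketch and not the framework's actual definition. It assumes a single binary target, a hypothetical true probability `P_TRUE`, and an agent with a Beta(1, 1) prior. It measures the agent's uncertainty as the cross-entropy between the true distribution and the agent's posterior predictive belief. All names and values are illustrative assumptions.

```python
import math

P_TRUE = 0.8  # hypothetical true P(T = 1); an assumption for illustration


def cross_entropy(p_true: float, q_belief: float) -> float:
    """Cross-entropy (in nats) of a Bernoulli belief q against a true Bernoulli p."""
    return -(p_true * math.log(q_belief) + (1 - p_true) * math.log(1 - q_belief))


def posterior_predictive(successes: int, trials: int,
                         a: float = 1.0, b: float = 1.0) -> float:
    """Predictive P(T = 1) of a Beta(a, b)-Bernoulli agent after seeing the data."""
    return (successes + a) / (trials + a + b)


def agent_uncertainty(trials: int) -> float:
    """Uncertainty of the agent after `trials` observations.

    To keep the example deterministic, the sample is idealised: the observed
    success rate matches P_TRUE exactly instead of being drawn at random.
    """
    successes = round(P_TRUE * trials)
    return cross_entropy(P_TRUE, posterior_predictive(successes, trials))


# Entropy under the (normally unknown) true distribution: the lower bound.
true_entropy = cross_entropy(P_TRUE, P_TRUE)

uncertainties = {n: agent_uncertainty(n) for n in (0, 10, 100, 1000)}
for n, h in uncertainties.items():
    print(f"n = {n:4d}  agent uncertainty = {h:.4f} nats")
print(f"true entropy = {true_entropy:.4f} nats")
```

In this toy setting the agent's uncertainty shrinks towards the true entropy as it observes more data. A quantity computed from the true distribution alone stays fixed regardless of how much data is available.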