The mutual information between predictions and model parameters, also referred to as expected information gain or BALD in machine learning, measures informativeness. It is a popular acquisition function in Bayesian active learning and Bayesian optimal experiment design. In data subset selection, i.e. active learning and active sampling, several recent works use the Fisher information, Hessians, similarity matrices based on gradients, or simply gradient norms to compute the acquisition scores that guide sample selection. Are these different approaches connected, and if so, how? In this paper, we revisit the Fisher information and use it to show how several otherwise disparate methods are connected as approximations of information-theoretic quantities.
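The BALD score mentioned above is the mutual information between a prediction and the model parameters, I[y; θ | x] = H[E_θ p(y|x, θ)] − E_θ H[p(y|x, θ)]. A minimal sketch of how it is typically estimated from Monte Carlo posterior samples (e.g. MC dropout); the function name and array shapes are illustrative assumptions, not from this paper:

```python
import numpy as np

def bald_scores(probs: np.ndarray) -> np.ndarray:
    """Estimate BALD scores from Monte Carlo predictive samples.

    probs: shape (K, N, C) -- K posterior parameter samples,
           N candidate points, C classes; each row sums to 1.
    Returns shape (N,): entropy of the mean prediction minus
    the mean entropy of individual predictions.
    """
    eps = 1e-12  # numerical guard for log(0)
    mean_p = probs.mean(axis=0)  # predictive distribution, shape (N, C)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)
    mean_of_entropy = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    return entropy_of_mean - mean_of_entropy
```

Points on which the posterior samples agree score near zero (the model is confidently uninformative or confidently right), while points where samples disagree score high, which is what makes the quantity an attractive acquisition function.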