Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To address this we propose the expected predictive information gain (EPIG), an acquisition function that measures information gain in the space of predictions rather than parameters. We find that using EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models, and thus provides an appealing drop-in replacement.
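To make the contrast concrete, here is a minimal NumPy sketch of how both scores can be estimated from Monte Carlo samples of the predictive distribution. The function names and array shapes (`K` posterior samples, `C` classes, `M` target inputs sampled from the input distribution) are illustrative, not the authors' implementation: BALD depends only on the candidate input, whereas EPIG measures mutual information between the candidate's label and predictions on inputs drawn from the input distribution.

```python
import numpy as np

def bald(probs: np.ndarray) -> float:
    """BALD score for one candidate input.

    probs: [K, C] class probabilities under K posterior
    samples of the model parameters.
    """
    mean = probs.mean(axis=0)  # [C], marginal predictive
    entropy_of_mean = -np.sum(mean * np.log(mean))
    mean_of_entropies = -np.sum(probs * np.log(probs)) / probs.shape[0]
    # Mutual information between the label and the parameters
    return entropy_of_mean - mean_of_entropies

def epig(probs_pool: np.ndarray, probs_targ: np.ndarray) -> float:
    """Monte Carlo EPIG estimate for one candidate input.

    probs_pool: [K, C] probabilities for the candidate input.
    probs_targ: [K, M, C] probabilities for M target inputs
    sampled from the input distribution.
    """
    K = probs_pool.shape[0]
    # Joint predictive over (y, y*) for each target input: [M, C, C]
    joint = np.einsum("kc,kmd->mcd", probs_pool, probs_targ) / K
    marg_pool = probs_pool.mean(axis=0)  # [C]
    marg_targ = probs_targ.mean(axis=0)  # [M, C]
    indep = marg_pool[None, :, None] * marg_targ[:, None, :]
    # Mutual information between y and y*, averaged over targets
    mi = np.sum(joint * np.log(joint / indep), axis=(1, 2))
    return float(mi.mean())
```

Under this sketch, an input whose label is highly informative about the parameters but uncorrelated with predictions on the target inputs scores high under BALD yet near zero under EPIG, which is the failure mode the abstract describes.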