Ensembles of neural networks have been shown to give better performance than single networks, both in terms of predictions and uncertainty estimation. Additionally, ensembles allow the uncertainty to be decomposed into aleatoric (data) and epistemic (model) components, giving a more complete picture of the predictive uncertainty. Ensemble distillation is the process of compressing an ensemble into a single model, often resulting in a leaner model that still outperforms the individual ensemble members. Unfortunately, standard distillation erases the natural uncertainty decomposition of the ensemble. We present a general framework for distilling both regression and classification ensembles in a way that preserves the decomposition. We demonstrate the desired behaviour of our framework and show that its predictive performance is on par with standard distillation.
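The abstract does not define the decomposition itself; for reference, the formulation commonly used in the ensemble-uncertainty literature (a sketch of the standard quantities, not necessarily the exact ones used by this framework) splits the total uncertainty of an $M$-member ensemble as

\[
\underbrace{\mathcal{H}\!\Big[\tfrac{1}{M}\textstyle\sum_{m=1}^{M} p(y \mid x, \theta_m)\Big]}_{\text{total}}
= \underbrace{\tfrac{1}{M}\textstyle\sum_{m=1}^{M}\mathcal{H}\big[p(y \mid x, \theta_m)\big]}_{\text{aleatoric}}
+ \underbrace{\mathcal{I}\big[y;\theta \mid x\big]}_{\text{epistemic}}
\]

for classification, and, by the law of total variance,

\[
\operatorname{Var}[y \mid x]
= \underbrace{\tfrac{1}{M}\textstyle\sum_{m=1}^{M}\sigma_m^2(x)}_{\text{aleatoric}}
+ \underbrace{\tfrac{1}{M}\textstyle\sum_{m=1}^{M}\big(\mu_m(x)-\bar{\mu}(x)\big)^2}_{\text{epistemic}}
\]

for regression, assuming per-member Gaussian predictions $\mathcal{N}\big(\mu_m(x), \sigma_m^2(x)\big)$ with ensemble mean $\bar{\mu}(x) = \tfrac{1}{M}\sum_{m} \mu_m(x)$.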