In this work we use variational inference to quantify the degree of uncertainty in deep learning model predictions of radio galaxy classification. We show that the level of model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore the model performance and uncertainty calibration for different weight priors and suggest that a sparse prior produces better-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we demonstrate that we can prune 30% of the fully-connected layer weights without significant loss of performance by removing the weights with the lowest signal-to-noise ratio. A larger degree of pruning can be achieved using a Fisher-information-based ranking, but the two pruning methods affect the uncertainty calibration for Fanaroff-Riley type I and type II radio galaxies differently. Like other work in this field, we experience a cold posterior effect, whereby the posterior must be down-weighted to achieve good predictive performance. We examine whether adapting the cost function to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that this improves upon the baseline but also does not compensate for the observed effect. We attribute the cold posterior effect to the overly effective curation of our training sample, which leads to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.
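The signal-to-noise pruning mentioned in the abstract can be illustrated with a minimal sketch. It assumes a mean-field Gaussian posterior in which each weight has a mean `mu` and a standard deviation `sigma`, and it assumes the signal-to-noise ratio is defined as |mu| / sigma; the abstract does not state either detail. The function name `snr_prune_mask` and the random arrays are hypothetical and stand in for a trained layer.

```python
import numpy as np


def snr_prune_mask(mu, sigma, prune_fraction=0.3):
    """Return a boolean mask that is False for the lowest-SNR weights.

    Assumption: SNR = |mu| / sigma for a mean-field Gaussian posterior.
    """
    snr = np.abs(mu) / sigma
    n_prune = int(np.floor(prune_fraction * snr.size))
    mask = np.ones(snr.size, dtype=bool)
    if n_prune > 0:
        # Indices (into the flattened array) of the n_prune smallest SNR values.
        lowest = np.argsort(snr, axis=None, kind="stable")[:n_prune]
        mask[lowest] = False
    return mask.reshape(snr.shape)


# Hypothetical posterior parameters for a 10 x 20 fully-connected layer.
rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=(10, 20))
sigma = rng.uniform(0.05, 1.0, size=(10, 20))

mask = snr_prune_mask(mu, sigma, prune_fraction=0.3)
pruned_mu = np.where(mask, mu, 0.0)  # pruned weights are set to zero
```

Here 30% of the 200 weights are zeroed. In practice the mask would be applied to the posterior means of a trained layer, and performance and calibration re-evaluated afterwards.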