In this work we use variational inference to quantify the uncertainty in deep learning model predictions for radio galaxy classification. We show that the model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore model performance and uncertainty calibration for a variety of weight priors and suggest that a sparse prior produces better-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we show that we can prune 30% of the fully-connected layer weights without significant loss of performance by removing the weights with the lowest signal-to-noise ratio (SNR). We demonstrate that a larger degree of pruning can be achieved using a Fisher-information-based ranking, but we note that both pruning methods affect the uncertainty calibration for Fanaroff-Riley type I and type II radio galaxies differently. Finally, we show that, as in other work in this field, we observe a cold posterior effect, whereby the posterior must be down-weighted to achieve good predictive performance. We examine whether adapting the cost function to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that this improves upon the baseline but likewise does not compensate for the observed effect. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample leading to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.
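As a rough illustration of the SNR-based pruning described above, the following minimal sketch assumes a mean-field Gaussian variational posterior, so that each weight has a posterior mean `mu` and standard deviation `sigma`, and takes the SNR to be `|mu| / sigma`. That definition, the function name `snr_prune_mask`, and the randomly generated stand-in weights are all assumptions made for this example; they are not taken from the work itself.

```python
import numpy as np


def snr_prune_mask(mu, sigma, fraction):
    """Boolean mask that is False for the `fraction` of weights with lowest SNR.

    Assumes SNR = |mu| / sigma for a Gaussian posterior over each weight.
    """
    snr = np.abs(mu) / sigma
    n_prune = int(round(fraction * snr.size))
    # Flattened indices sorted from lowest to highest SNR.
    order = np.argsort(snr, axis=None, kind="stable")
    mask = np.ones(snr.size, dtype=bool)
    mask[order[:n_prune]] = False
    return mask.reshape(snr.shape)


# Stand-in posterior parameters for one small fully-connected layer
# (hypothetical values, not a trained model).
rng = np.random.default_rng(0)
mu = rng.normal(size=(4, 5))
sigma = rng.uniform(0.1, 1.0, size=(4, 5))

# Prune 30% of the weights, matching the fraction quoted above.
mask = snr_prune_mask(mu, sigma, 0.3)
pruned_mu = np.where(mask, mu, 0.0)
```

In practice the mask would be applied to the posterior means of a trained layer and the pruned network re-evaluated on a test set; a Fisher-information-based ranking would replace only the score used for sorting.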