Bayesian Neural Networks with Latent Variables (BNN+LVs) provide uncertainty estimates for predictions by explicitly modeling model uncertainty (via priors on network weights) and environmental stochasticity (via a latent input noise variable). In this work, we first show that BNN+LVs suffer from a serious form of non-identifiability: explanatory power can be transferred between the model parameters and the input noise while fitting the data equally well. We demonstrate that, as a result, the posterior mode over the network weights and latent variables is asymptotically biased away from the ground truth, so traditional inference methods may yield parameters that generalize poorly and misestimate uncertainty. Next, we develop a novel inference procedure that explicitly mitigates the effects of likelihood non-identifiability during training and yields high-quality predictions as well as uncertainty estimates. We demonstrate that our inference method improves upon benchmark methods across a range of synthetic and real data sets.
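To make the non-identifiability concrete, consider a minimal sketch in which the latent noise enters through an additive channel (the notation $g$, $W$, $w_z$, $z$, $\gamma$, $\sigma$ is illustrative and not necessarily the model form used in the paper):

$$ y = g(x; W) + w_z z + \epsilon, \qquad z \sim \mathcal{N}(0, \gamma^2), \qquad \epsilon \sim \mathcal{N}(0, \sigma^2). $$

Marginalizing over $z$ gives $y \mid x \sim \mathcal{N}\!\big(g(x; W),\; w_z^2 \gamma^2 + \sigma^2\big)$, so for any $c > 0$ the rescaling $(w_z, \gamma) \mapsto (c\,w_z, \gamma/c)$ leaves the likelihood unchanged: the data cannot distinguish explanatory power carried by the weight $w_z$ from that carried by the latent noise scale $\gamma$.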