Bayesian Neural Networks with Latent Variables (BNN+LVs) capture predictive uncertainty by explicitly modeling model uncertainty (via priors on network weights) and environmental stochasticity (via a latent input noise variable). In this work, we first show that BNN+LVs suffer from a serious form of non-identifiability: explanatory power can be transferred between the model parameters and the latent variables while fitting the data equally well. We demonstrate that, as a result, in the limit of infinite data, the posterior mode over the network weights and latent variables is asymptotically biased away from the ground truth. Due to this asymptotic bias, traditional inference methods may in practice yield parameters that generalize poorly and misestimate uncertainty. Next, we develop a novel inference procedure that explicitly mitigates the effects of likelihood non-identifiability during training and yields high-quality predictions as well as uncertainty estimates. We demonstrate that our inference method improves upon benchmark methods across a range of synthetic and real datasets.
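To make the non-identifiability concrete, below is a minimal sketch in a simplified linear analogue of the BNN+LV setup. It assumes the standard BNN+LV formulation, in which an observation is generated as y = f(x, z; W) + ε with latent input noise z; the toy function, the parameter values, and the `marginal_loglik` helper are illustrative choices, not taken from the paper. The sketch exhibits two distinct parameterizations that assign identical likelihood to the same data, which is the sense in which explanatory power can be transferred between components of the model without affecting fit.

```python
import numpy as np

# Hypothetical toy analogue of BNN+LV non-identifiability (not from the paper).
# Model: y = w * (x + z) + eps, with latent input noise z ~ N(0, gamma^2)
# and observation noise eps ~ N(0, sigma^2). Marginalizing out z gives
#   y | x ~ N(w * x, w^2 * gamma^2 + sigma^2),
# so any (gamma, sigma) pair with the same value of w^2*gamma^2 + sigma^2
# induces the same likelihood: the data cannot identify how much
# stochasticity to attribute to the latent variable versus the noise term.

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

def marginal_loglik(y, x, w, gamma, sigma):
    """Exact log-likelihood of the z-marginalized linear toy model."""
    var = w**2 * gamma**2 + sigma**2
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (y - w * x) ** 2 / var)

# Ground truth: w=2, gamma=0.5, sigma=0.1 -> marginal variance = 1.01.
y = 2 * (x + rng.normal(scale=0.5, size=x.shape)) + rng.normal(scale=0.1, size=x.shape)

# An alternative parameterization that shifts explanatory power away from
# the latent variable fits the data exactly as well (same marginal variance):
print(marginal_loglik(y, x, w=2.0, gamma=0.5, sigma=0.1))
print(marginal_loglik(y, x, w=2.0, gamma=0.1, sigma=np.sqrt(1.01 - 0.04)))
```

In the full BNN+LV model the flexible network f(x, z; W) can likewise absorb structure that belongs to the latent variables into the weights (and vice versa), which is why the posterior mode can concentrate away from the ground truth even with unlimited data.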