Uncertainty estimation in deep models is essential in many real-world applications and has benefited from developments over the last several years. Recent evidence suggests that existing solutions dependent on simple Gaussian formulations may not be sufficient. However, moving to other distributions necessitates Monte Carlo (MC) sampling to estimate quantities such as the KL divergence: this can be expensive and scales poorly as the dimensions of both the input data and the model grow. The cost is directly related to the structure of the computation graph, which can grow linearly as a function of the number of MC samples needed. Here, we construct a framework to describe these computation graphs, and identify probability families where the graph size can be independent of, or only weakly dependent on, the number of MC samples. These families correspond directly to large classes of distributions. Empirically, we can run a much larger number of MC iterations for larger architectures used in computer vision, with gains in performance measured in confident accuracy, stability of training, memory, and training time.
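To make the contrast concrete, the following is a minimal sketch (not the paper's method) of why MC estimation of the KL divergence scales with the sample count while a closed-form KL does not. For univariate Gaussians the KL has an analytic expression of fixed size, whereas the MC estimate averages log-density ratios over n samples; in an autodiff framework, each of those n terms would add nodes to the computation graph.

```python
import numpy as np

def kl_mc(mu1, s1, mu2, s2, n=200_000, seed=0):
    # Monte Carlo estimate of KL(p || q) = E_{x~p}[log p(x) - log q(x)].
    # Cost (and, under autodiff, the computation graph) grows linearly in n.
    rng = np.random.default_rng(seed)
    x = rng.normal(mu1, s1, size=n)
    log_p = -0.5 * ((x - mu1) / s1) ** 2 - np.log(s1)
    log_q = -0.5 * ((x - mu2) / s2) ** 2 - np.log(s2)
    return float(np.mean(log_p - log_q))

def kl_closed_form(mu1, s1, mu2, s2):
    # Analytic KL between two univariate Gaussians: a single fixed-size
    # expression, independent of any sample count.
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2 * s2**2) - 0.5
```

With enough samples the two agree (e.g. for p = N(0,1), q = N(1,2) the analytic value is about 0.443), but only the closed form keeps its cost fixed as the estimate is refined, which is the property the families identified in this work aim to preserve for richer distributions.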