The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data. This work investigates whether this distribution also correlates with a model's epistemic uncertainty and thus indicates its ability to generalise to novel inputs. We first verify empirically that epistemic uncertainty can be identified with the surprise, i.e. the negative log-likelihood, of observing a particular latent representation. Moreover, we demonstrate that the output-conditional distribution of hidden representations also allows quantifying aleatoric uncertainty via the entropy of the predictive distribution. We analyse epistemic and aleatoric uncertainty inferred from the representations of different layers and conclude that deeper layers yield uncertainty estimates that behave similarly to established, but computationally more expensive, methods (e.g. deep ensembles). While our approach does not require modifying the training process, we follow prior work and experiment with an additional regularising loss that increases the information in the latent representations. We find that this improves OOD detection based on epistemic uncertainty, at the cost of ambiguous calibration close to the data distribution. We verify our findings on both classification and regression models.
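A minimal sketch of the two quantities described above, assuming a simple Gaussian density fitted to training-set latent features; the density model, layer choice, and all names here are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: epistemic uncertainty as the surprise -log p(z) of a latent representation
# under a density fitted to training latents; aleatoric uncertainty as the entropy
# of the predictive distribution H[p(y|x)]. Gaussian density is an assumption.
import numpy as np
from scipy.stats import multivariate_normal


def fit_latent_density(train_latents):
    """Fit a simple Gaussian to training latent representations (one possible density model)."""
    mean = train_latents.mean(axis=0)
    cov = np.cov(train_latents, rowvar=False) + 1e-6 * np.eye(train_latents.shape[1])
    return multivariate_normal(mean=mean, cov=cov)


def epistemic_uncertainty(density, latents):
    """Surprise of observing these latent representations: -log p(z)."""
    return -density.logpdf(latents)


def aleatoric_uncertainty(class_probs):
    """Entropy of the predictive distribution over classes."""
    p = np.clip(class_probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=-1)


# Toy usage with random stand-ins for latent features and softmax outputs.
rng = np.random.default_rng(0)
train_z = rng.normal(size=(1000, 16))          # latent features of training data
test_z = rng.normal(loc=3.0, size=(5, 16))     # shifted, "novel" inputs
density = fit_latent_density(train_z)
print(epistemic_uncertainty(density, test_z))  # higher surprise for OOD-like latents
print(aleatoric_uncertainty(rng.dirichlet(np.ones(10), size=5)))
```

In this sketch the epistemic score depends only on where the latent representation falls relative to the training distribution, while the aleatoric score depends only on how peaked the predictive distribution is, which mirrors the separation of the two uncertainty types described in the abstract.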