Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level. Current approaches, however, do not propagate the uncertainty in observations due to sampling variability in the groups. This effectively assumes that small and large groups are estimated equally well, and should have equal weight in the final regression. We account for this uncertainty with a Bayesian distribution regression formalism, improving the robustness and performance of the model when group sizes vary. We frame our models in a neural network style, allowing for simple MAP inference using backpropagation to learn the parameters, as well as MCMC-based inference which can fully propagate uncertainty. We demonstrate our approach on illustrative toy datasets, as well as on a challenging problem of predicting age from images.
翻译:最近,分布回归作为监督学习问题的通用解决办法引起了人们很大的兴趣,因为在集团一级,而不是个人一级,有标签可以使用,但目前的做法并不传播由于群体抽样的可变性而导致的观察不确定性。这实际上假定,小群体和大群体得到同样好的估计,在最后回归中应当具有同等的份量。我们用巴耶斯分布回归形式主义来解释这种不确定性,在群体大小不同时,改进模型的坚固性和性。我们用神经网络风格来设计模型,允许利用简单的MAP推论来学习参数,以及以MCMCMC为基础的推论来充分传播不确定性。我们展示了我们关于说明玩具数据集的方法,以及从图像中预测年龄这一具有挑战性的问题。