Learning the distribution of a continuous or categorical response variable $\boldsymbol y$ given its covariates $\boldsymbol x$ is a fundamental problem in statistics and machine learning. Deep neural network-based supervised learning algorithms have made great progress in predicting the mean of $\boldsymbol y$ given $\boldsymbol x$, but they are often criticized for their inability to accurately capture the uncertainty of their predictions. In this paper, we introduce classification and regression diffusion (CARD) models, which combine a denoising diffusion-based conditional generative model with a pre-trained conditional mean estimator to accurately predict the distribution of $\boldsymbol y$ given $\boldsymbol x$. We demonstrate the outstanding ability of CARD in conditional distribution prediction on both toy examples and real-world datasets. The experimental results show that CARD in general outperforms state-of-the-art methods, including Bayesian neural network-based ones designed for uncertainty estimation, especially when the conditional distribution of $\boldsymbol y$ given $\boldsymbol x$ is multi-modal. In addition, we utilize the stochastic nature of the generative model outputs to obtain a finer granularity in model confidence assessment at the instance level for classification tasks.