Score-based diffusion models have achieved remarkable progress in various domains with the ability to generate new data samples that do not exist in the training set. In this work, we examine the hypothesis that their generalization ability arises from an interpolation effect caused by a smoothing of the empirical score function. Focusing on settings where the training set lies uniformly in a one-dimensional linear subspace, we study the interplay between score smoothing and the denoising dynamics with mathematically solvable models. In particular, we demonstrate how a smoothed score function can lead to the generation of samples that interpolate among the training data within their subspace while avoiding full memorization. We also present evidence that learning score functions with regularized neural networks can have a similar effect on the denoising dynamics as score smoothing.
翻译:暂无翻译