We present a neural network structure, ControlNet, that controls pretrained large diffusion models so that they support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k). Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal device. Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data. We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc. This may enrich the methods to control large diffusion models and further facilitate related applications.