During image editing, existing deep generative models tend to re-synthesize the entire output from scratch, including the unedited regions. This wastes a significant amount of computation, especially for minor edits. In this work, we present Spatially Sparse Inference (SSI), a general-purpose technique that selectively performs computation for edited regions and accelerates various generative models, including both conditional GANs and diffusion models. Our key observation is that users tend to make gradual changes to the input image. This motivates us to cache and reuse the feature maps of the original image. Given an edited image, we sparsely apply the convolutional filters to the edited regions while reusing the cached features for the unedited regions. Based on our algorithm, we further propose the Sparse Incremental Generative Engine (SIGE) to convert the computation reduction into latency reduction on off-the-shelf hardware. When the edited regions cover 1.2% of the image area, our method reduces the computation of DDIM by 7.5$\times$ and GauGAN by 18$\times$ while preserving visual fidelity. With SIGE, we speed up DDIM by 3.0$\times$ on RTX 3090 and 6.6$\times$ on Apple M1 Pro CPU, and GauGAN by 4.2$\times$ on RTX 3090 and 14$\times$ on Apple M1 Pro CPU.
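The cache-and-reuse idea can be illustrated with a minimal NumPy sketch for a single 3$\times$3 convolution on a single-channel image. This is an illustration under simplifying assumptions, not the SIGE implementation: the function names (`conv3x3_valid`, `conv3x3_same`, `sparse_update`), the block size of 8, and the exact-difference mask are all hypothetical choices made for this example. The sketch marks the blocks whose outputs depend on an edited pixel, recomputes only those blocks, and copies every other output from the cache.

```python
import numpy as np

def conv3x3_valid(x, w):
    """3x3 cross-correlation without padding; output is 2 smaller per axis."""
    h, wd = x.shape
    out = np.zeros((h - 2, wd - 2))
    for dy in range(3):
        for dx in range(3):
            out += w[dy, dx] * x[dy:dy + h - 2, dx:dx + wd - 2]
    return out

def conv3x3_same(x, w):
    """Dense baseline: zero-pad by one pixel so the output matches the input size."""
    return conv3x3_valid(np.pad(x, 1), w)

def sparse_update(original, edited, cached_out, w, block=8):
    """Recompute the convolution only for blocks affected by the edit.

    cached_out is the dense output for `original` (the cached feature map).
    Returns the updated output and the number of blocks recomputed.
    """
    h, wd = original.shape
    diff = original != edited
    # An output pixel depends on a 3x3 input neighborhood, so dilate the
    # difference mask by one pixel to find every affected output location.
    padded_diff = np.pad(diff, 1)
    affected = np.zeros_like(diff)
    for dy in range(3):
        for dx in range(3):
            affected |= padded_diff[dy:dy + h, dx:dx + wd]

    padded = np.pad(edited, 1)
    out = cached_out.copy()  # unedited regions reuse the cached features
    recomputed = 0
    for y in range(0, h, block):
        for x in range(0, wd, block):
            y1, x1 = min(y + block, h), min(x + block, wd)
            if affected[y:y1, x:x1].any():
                # Gather the block plus a one-pixel halo, convolve, scatter back.
                patch = padded[y:y1 + 2, x:x1 + 2]
                out[y:y1, x:x1] = conv3x3_valid(patch, w)
                recomputed += 1
    return out, recomputed

rng = np.random.default_rng(0)
original = rng.standard_normal((64, 64))
w = rng.standard_normal((3, 3))
cached = conv3x3_same(original, w)   # computed once for the original image

edited = original.copy()
edited[20:27, 30:37] += 1.0          # a 7x7 edit, roughly 1.2% of the pixels

sparse_out, recomputed = sparse_update(original, edited, cached, w)
dense_out = conv3x3_same(edited, w)
print("outputs match:", np.allclose(sparse_out, dense_out))
print("blocks recomputed:", recomputed, "of", (64 // 8) ** 2)
```

In this toy setting the sparse result matches the dense recomputation while touching only a small fraction of the blocks. A real network stacks many layers, so the affected region grows with the receptive field, and turning the saved operations into lower latency depends on efficient gather and scatter kernels, which is the role the abstract assigns to SIGE.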