We present a novel algorithm that reduces the tensor compute required by a conditional image-generation autoencoder without sacrificing the quality of photo-realistic image generation. Our method is device agnostic and can optimize an autoencoder for a given CPU-only or GPU compute device in roughly the time it normally takes to train the autoencoder on a generic workstation. We achieve this via a novel two-stage strategy: first, we condense the channel weights so that as few channels as possible are used; then we prune the nearly zeroed-out weight activations and fine-tune the autoencoder. To maintain image quality, fine-tuning is done via student-teacher training, where we reuse the condensed autoencoder as the teacher. We show performance gains for various conditional image-generation tasks, including segmentation mask to face images, face-image cartoonization, and a CycleGAN-based model, across multiple compute devices. We perform various ablation studies to justify our claims and design choices, and obtain real-time versions of several autoencoders on CPU-only devices while maintaining image quality, thus enabling at-scale deployment of such autoencoders.
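The sketch below is a minimal, hypothetical illustration of the two-stage idea summarized above, not the paper's implementation. It assumes a network-slimming-style mechanism (an L1 penalty on per-channel BatchNorm scales) for the channel-condensation stage, since the abstract does not specify the exact mechanism, and it only counts prunable channels rather than rebuilding the pruned architecture. All module names, thresholds, and hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch (NOT the paper's code) of the two-stage strategy:
# (1) condense channel weights with an L1 penalty on per-channel scales,
# (2) identify near-zero channels for pruning, then fine-tune the pruned
#     student against the condensed model used as teacher.
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    """Stand-in conditional image autoencoder (toy architecture, assumed)."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

def channel_sparsity_penalty(model, weight=1e-4):
    """Stage 1: L1 penalty on BatchNorm scales pushes unneeded channels toward zero."""
    return weight * sum(m.weight.abs().sum()
                        for m in model.modules() if isinstance(m, nn.BatchNorm2d))

def count_prunable_channels(model, threshold=1e-2):
    """Stage 2 (selection): channels whose scale is ~0 are candidates for removal."""
    mask = torch.cat([(m.weight.abs() < threshold)
                      for m in model.modules() if isinstance(m, nn.BatchNorm2d)])
    return int(mask.sum())

def distillation_step(student, teacher, x, opt):
    """Stage 2 (fine-tuning): student mimics the condensed teacher's outputs."""
    with torch.no_grad():
        target = teacher(x)
    loss = nn.functional.l1_loss(student(x), target)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

if __name__ == "__main__":
    teacher = TinyAutoencoder()
    x = torch.randn(2, 3, 64, 64)
    # Stage 1: reconstruction loss plus channel-sparsity penalty (one step shown).
    opt_t = torch.optim.Adam(teacher.parameters(), lr=1e-3)
    recon = nn.functional.l1_loss(teacher(x), x)
    (recon + channel_sparsity_penalty(teacher)).backward(); opt_t.step()
    print("prunable channels:", count_prunable_channels(teacher))
    # Stage 2: prune (architecture rebuild omitted), then distill a smaller student
    # from the condensed teacher to preserve image quality.
    student = TinyAutoencoder(ch=16)
    opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
    print("distill loss:", distillation_step(student, teacher, x, opt_s))
```

In this reading, reusing the condensed model as the teacher avoids training a separate full-capacity reference network, which is consistent with the abstract's claim that the whole optimization fits in roughly one normal training run.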