We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. Using a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from the previous layers through a vector-quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at the previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images that outperform those of state-of-the-art generative models, providing further verification of the efficiency of the learned representations. The hierarchical nature of HR-VQVAE i) reduces the decoding search time, making the method particularly suitable for high-load tasks, and ii) allows the codebook size to be increased without incurring the codebook collapse problem.
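To make the idea of hierarchically linked residual quantization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical layout in which the first layer holds `m` codewords and every deeper layer holds one child codebook of `m` codewords per path of parent codes; the function name `hierarchical_residual_quantize`, the random codebooks, and the sizes `m`, `d`, and `n_layers` are all illustrative assumptions. The training objective and the encoder and decoder networks are not shown.

```python
import numpy as np


def hierarchical_residual_quantize(x, codebooks):
    """Quantize vector x with a stack of hierarchically linked codebooks.

    Assumed layout (hypothetical, for illustration only):
      codebooks[0] has shape (m, d);
      codebooks[i] for i >= 1 has shape (m**i, m, d), i.e. one child
      codebook of m codewords for each path of codes chosen so far.
    """
    residual = np.asarray(x, dtype=float).copy()
    recon = np.zeros_like(residual)
    indices = []
    parent = 0  # flat index of the code path selected in earlier layers

    for level, cb in enumerate(codebooks):
        # Only the child codebook linked to the parent path is searched.
        candidates = cb if level == 0 else cb[parent]
        dists = np.sum((candidates - residual) ** 2, axis=1)
        k = int(np.argmin(dists))

        indices.append(k)
        recon += candidates[k]
        # The next layer quantizes what this layer failed to explain.
        residual = residual - candidates[k]
        parent = parent * candidates.shape[0] + k

    return indices, recon


# Toy usage with random stand-ins for learned codebooks (assumption).
rng = np.random.default_rng(0)
m, d, n_layers = 4, 8, 3
codebooks = [rng.normal(size=(m, d))] + [
    rng.normal(scale=0.5 ** i, size=(m ** i, m, d)) for i in range(1, n_layers)
]
x = rng.normal(size=d)
indices, recon = hierarchical_residual_quantize(x, codebooks)

searched = n_layers * m      # codewords compared during the search
reachable = m ** n_layers    # distinct code combinations representable
```

Under these assumptions, the search compares only `n_layers * m` codewords while the stacked layers can express `m ** n_layers` distinct combinations, which illustrates how a hierarchical link between layers can shorten the search relative to a single flat codebook of the same effective size.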