Image-to-image translation models transfer images from an input domain to an output domain while attempting to preserve the original content of the image. Contrastive Unpaired Translation (CUT) is one of the existing methods for solving such problems. A significant advantage of this method over its competitors is its ability to train and perform well even when both the input and output domains consist of only a single image. Another key feature that differentiates it from its predecessors is the use of image patches rather than whole images. It also turns out that sampling negatives (the patches required to compute the loss) from the same image achieves better results than sampling them from other images in the dataset. This approach encourages corresponding patches to map to the same location relative to the other patches (the negatives), while at the same time improving output image quality and significantly reducing both memory usage and training time compared to the CycleGAN method used as a baseline. Through a series of experiments, we show that using focal loss in place of cross-entropy loss within the PatchNCE loss can improve the model's performance and even surpass the current state-of-the-art model for image-to-image translation.
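The proposed modification, replacing the cross-entropy term inside the PatchNCE objective with a focal-loss term, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the plain-Python feature vectors, and the default `tau` (temperature) and `gamma` (focusing) values are assumptions for illustration. With `gamma=0` the focal weight is 1 and the expression reduces to the standard cross-entropy form of PatchNCE; a larger `gamma` down-weights patches the model already classifies confidently.

```python
import math

def softmax_row(row):
    """Numerically stable softmax over a single list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def patchnce_focal_loss(queries, positives, negatives, tau=0.07, gamma=2.0):
    """Sketch of a PatchNCE-style loss with focal weighting (names assumed).

    queries:   output-image patch features, one vector per patch
    positives: corresponding input-image patch features (same order)
    negatives: for each query, a list of other-patch features sampled
               from the SAME input image, as the abstract describes
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    total = 0.0
    for q, pos, negs in zip(queries, positives, negatives):
        # One (1 + M)-way classification: positive patch vs. M negatives.
        logits = [dot(q, pos) / tau] + [dot(q, n) / tau for n in negs]
        p = softmax_row(logits)[0]  # probability assigned to the positive
        # Focal term: (1 - p)^gamma scales down easy, high-confidence patches.
        total += -((1 - p) ** gamma) * math.log(max(p, 1e-8))
    return total / len(queries)
```

Because the focal weight `(1 - p)**gamma` never exceeds 1, the focal variant of the loss is bounded above by the cross-entropy variant on the same patches; the training signal is simply concentrated on the harder correspondences.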