Most image-to-image translation methods require a large number of training images, which restricts their applicability. We instead propose ManiFest: a framework for few-shot image translation that learns a context-aware representation of a target domain from a few images only. To enforce feature consistency, our framework learns a style manifold between source and proxy anchor domains (assumed to be composed of large numbers of images). The learned manifold is interpolated and deformed towards the few-shot target domain via patch-based adversarial and feature-statistics alignment losses. All of these components are trained simultaneously in a single end-to-end loop. In addition to the general few-shot translation task, our approach can alternatively be conditioned on a single exemplar image to reproduce its specific style. Extensive experiments demonstrate the efficacy of ManiFest on multiple tasks, outperforming the state of the art on all metrics in both the general and exemplar-based scenarios. Our code is available at https://github.com/cv-rits/Manifest.
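The abstract mentions two mechanisms that a small sketch can make concrete: channel-wise feature-statistics alignment and interpolation/deformation of a style manifold between anchor domains. The PyTorch snippet below is a minimal illustration under assumed names and shapes (`feature_stats_loss`, `manifold_style`, a 256-dim style code are all hypothetical), not the authors' implementation; see the repository linked above for the actual code.

```python
import torch
import torch.nn.functional as F


def feature_stats_loss(feat_out: torch.Tensor, feat_ref: torch.Tensor) -> torch.Tensor:
    """Align channel-wise mean/std of feature maps (AdaIN-style statistics).
    A generic stand-in for a feature-statistics alignment loss; not the paper's exact loss."""
    mu_o, mu_r = feat_out.mean(dim=(2, 3)), feat_ref.mean(dim=(2, 3))
    sd_o, sd_r = feat_out.std(dim=(2, 3)), feat_ref.std(dim=(2, 3))
    return F.l1_loss(mu_o, mu_r) + F.l1_loss(sd_o, sd_r)


def manifold_style(style_a: torch.Tensor, style_b: torch.Tensor,
                   alpha: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    """Interpolate between two anchor-domain style codes, then deform the
    result by a learned residual toward the few-shot target domain."""
    return (1.0 - alpha) * style_a + alpha * style_b + delta


if __name__ == "__main__":
    # Hypothetical shapes: style codes of dim 256, feature maps (N, C, H, W).
    s_a, s_b = torch.randn(1, 256), torch.randn(1, 256)
    alpha = torch.sigmoid(torch.zeros(1))   # mixing weight (learnable in practice)
    delta = 0.1 * torch.randn(1, 256)       # deformation (learnable in practice)
    style = manifold_style(s_a, s_b, alpha, delta)
    f_out, f_ref = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
    print(style.shape, feature_stats_loss(f_out, f_ref).item())
```

In this reading, the interpolation weight and the deformation residual would both be optimized jointly with the patch-based adversarial objective during the single end-to-end training loop described above.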