Isolating and controlling specific features in the outputs of generative models in a user-friendly way is a difficult, open-ended problem. We develop techniques that allow an oracle user to generate an image they envision by answering a sequence of relative queries of the form \textit{"do you prefer image $a$ or image $b$?"} Our framework consists of a Conditional VAE that uses the collected relative queries to partition the latent space into preference-relevant and non-preference-relevant features. We then use the user's responses to relative queries to determine the preference-relevant features that correspond to their envisioned output image. Additionally, we develop techniques for modeling the uncertainty in images' predicted preference-relevant features, allowing our framework to generalize to scenarios in which the relative query training set contains noise.
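To make the role of relative queries concrete, the following is a minimal sketch of how a target point in a preference-relevant feature space might be recovered from answers to "image $a$ or image $b$?" questions. It is not the method of this work: the response model (the user prefers the image whose features lie closer to an unknown target $w$, with logistic noise), the plain gradient-ascent fit, and the names `preference_logits`, `estimate_target`, and `simulate_responses` are all assumptions made for illustration, and the Conditional VAE that would produce the features is omitted.

```python
import numpy as np


def preference_logits(z_a, z_b, w):
    """Assumed response model: logit of 'user prefers a over b'.

    Positive when z_a is closer to the target w than z_b is
    (squared Euclidean distance in the preference-relevant features).
    """
    return np.sum((z_b - w) ** 2, axis=1) - np.sum((z_a - w) ** 2, axis=1)


def estimate_target(z_a, z_b, prefers_a, steps=500, lr=0.1):
    """Fit the hypothetical target w by gradient ascent on the
    mean logistic log-likelihood of the observed answers."""
    w = np.zeros(z_a.shape[1])
    y = np.asarray(prefers_a, dtype=float)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-preference_logits(z_a, z_b, w)))
        # d(logit)/dw = 2 * (z_a - z_b), so the gradient is linear in the residual.
        grad = ((y - p)[:, None] * 2.0 * (z_a - z_b)).mean(axis=0)
        w = w + lr * grad
    return w


def simulate_responses(z_a, z_b, w_true, rng):
    """Draw noisy answers from the same assumed model (for testing only)."""
    p = 1.0 / (1.0 + np.exp(-preference_logits(z_a, z_b, w_true)))
    return rng.random(len(p)) < p
```

In this sketch, `z_a` and `z_b` are arrays of shape `(n_queries, dim)` holding the preference-relevant features of the two images shown in each query, and `prefers_a` is a boolean array of the user's answers. Because the logit is linear in `w`, the log-likelihood is concave, which is why simple gradient ascent suffices here.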