Generative models excel at creating images that closely mimic real scenes, suggesting they inherently encode scene representations. We introduce Intrinsic LoRA (I-LoRA), a general approach that uses Low-Rank Adaptation (LoRA) to discover scene intrinsics such as normals, depth, albedo, and shading from a wide array of generative models. I-LoRA is lightweight, adding minimally to the model's parameters and requiring very small datasets for this knowledge discovery. Our approach, applicable to Diffusion models, GANs, and Autoregressive models alike, generates intrinsics using the same output head as the original images. Through control experiments, we establish a correlation between the generative model's quality and the extracted intrinsics' accuracy. Finally, scene intrinsics obtained by our method with just hundreds to thousands of labeled images, perform on par with those from supervised methods trained on millions of labeled examples.
翻译:暂无翻译