Controllable data generation aims to synthesize data by specifying values for target concepts. Achieving this reliably requires modeling the underlying generative factors and their relationships. In real-world scenarios, these factors exhibit both causal and correlational dependencies, yet most existing methods model only part of this structure. We propose the Causal-Correlation Variational Autoencoder (C2VAE), a unified framework that jointly captures causal and correlational relationships among latent factors. C2VAE organizes the latent space into a structured graph, identifying a set of root causes that govern the generative processes. By optimizing only the root factors relevant to target concepts, the model enables efficient and faithful control. Experiments on synthetic and real-world datasets demonstrate that C2VAE improves generation quality, disentanglement, and intervention fidelity over existing baselines.