In trying to capture the relationship between samples and labels, conditional generative models often inherit spurious correlations from the training dataset, yielding label-conditional distributions that are severely imbalanced with respect to another latent attribute. To mitigate such undesirable correlations engraved into generative models, which we call spurious causality, we propose a general two-step strategy. (a) Fairness Intervention (FI): emphasize the minority samples that are hard to generate because of the spurious correlation in the training dataset. (b) Corrective Sampling (CS): explicitly filter the generated samples so that they follow the desired label-conditional latent attribute distribution. We design the fairness intervention for various degrees of supervision on the spurious attribute, including unsupervised, weakly supervised, and semi-supervised scenarios. Our experimental results show that the proposed method, FICS, can successfully resolve the spurious correlation in generated samples on various datasets.
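As a rough illustration of the filtering idea behind Corrective Sampling, the sketch below applies plain rejection sampling to a batch of generated samples for a single label. It is not the procedure from the paper, whose details the abstract does not give. It assumes that each generated sample already carries a discrete latent-attribute index (for example from some attribute classifier, which is not shown), and the names `corrective_sampling`, `attrs`, and `target_probs` are hypothetical.

```python
import numpy as np


def corrective_sampling(attrs, target_probs, rng=None):
    """Return indices of generated samples to keep (illustrative sketch).

    attrs        : integer attribute index per generated sample (assumed to be
                   available, e.g. predicted by a separate classifier).
    target_probs : desired attribute distribution for this label.
    """
    rng = np.random.default_rng() if rng is None else rng
    attrs = np.asarray(attrs, dtype=int)
    target = np.asarray(target_probs, dtype=float)

    # Empirical attribute distribution among the generated samples.
    counts = np.bincount(attrs, minlength=len(target))
    empirical = counts / counts.sum()

    # Importance ratio target / empirical, defined only where samples exist.
    ratio = np.zeros_like(target)
    present = empirical > 0
    ratio[present] = target[present] / empirical[present]
    if ratio.max() == 0:
        raise ValueError("no generated sample has an attribute with target mass")

    # Rejection sampling: the most under-represented attribute is always kept,
    # the others are kept with proportionally smaller probability.
    accept_prob = ratio / ratio.max()
    keep = rng.random(len(attrs)) < accept_prob[attrs]
    return np.flatnonzero(keep)


if __name__ == "__main__":
    # Toy batch: 90% of generated samples have attribute 0, 10% attribute 1.
    toy_attrs = np.array([0] * 9000 + [1] * 1000)
    kept = corrective_sampling(toy_attrs, [0.5, 0.5], rng=np.random.default_rng(0))
    print("kept:", len(kept), "minority share:", (toy_attrs[kept] == 1).mean())
```

The cost of filtering this way is that many majority-attribute samples are discarded, so the more imbalanced the generator's output, the more samples must be drawn. This is one reason a training-time intervention such as FI would complement the filter.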