Using generated data to improve the performance of downstream discriminative models has recently gained popularity due to the rapid development of pre-trained language models. In most previous studies, generative models and discriminative models are trained separately and thus cannot adapt to changes in each other. As a result, the generated samples easily deviate from the real data distribution, while the improvement of the discriminative model quickly reaches saturation. Generative adversarial networks (GANs) train generative models via an adversarial process with discriminative models to achieve joint training. However, the training of standard GANs is notoriously unstable and often fails to converge. In this paper, to address these issues, we propose a $\textit{self-consistent learning}$ framework, in which a discriminator and a generator are cooperatively trained in a closed-loop form. The discriminator and the generator enhance each other during multiple rounds of alternating training until a scoring consensus is reached. This framework proves to be easy to train and free from instabilities such as mode collapse and non-convergence. Extensive experiments on sentence semantic matching demonstrate the effectiveness of the proposed framework: the discriminator achieves an improvement of more than 10 AP in the zero-shot setting and new state-of-the-art performance in the full-data setting.
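For illustration only, the following is a minimal sketch of the alternating closed-loop training described above, assuming hypothetical generator/discriminator interfaces (`generate`, `train`, `score`) and a simple score-stability criterion as the "scoring consensus"; it is not the paper's actual implementation.

```python
# Sketch of self-consistent, closed-loop alternating training.
# All names (generator, discriminator, their methods, the acceptance
# threshold, and the consensus criterion) are illustrative assumptions.

def self_consistent_training(generator, discriminator, real_data,
                             rounds=10, consensus_eps=1e-3):
    """Alternate generator and discriminator updates until scores stabilize."""
    prev_score = None
    for _ in range(rounds):
        # 1) Generator proposes synthetic samples from the real data.
        synthetic = generator.generate(real_data)

        # 2) Discriminator is updated on real plus synthetic samples,
        #    then scores the synthetic pool.
        discriminator.train(real_data + synthetic)
        scores = [discriminator.score(s) for s in synthetic]

        # 3) Generator is updated only on samples the discriminator accepts,
        #    pulling generation back toward the real data distribution.
        accepted = [s for s, sc in zip(synthetic, scores) if sc > 0.5]
        generator.train(accepted)

        # 4) Stop once the mean score stabilizes (a scoring consensus).
        mean_score = sum(scores) / max(len(scores), 1)
        if prev_score is not None and abs(mean_score - prev_score) < consensus_eps:
            break
        prev_score = mean_score
    return generator, discriminator
```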