Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabelled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabelled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. Representations may be further improved by introducing regularisation during training to shape the distribution of the encoded data in the latent space. We propose denoising adversarial autoencoders, which combine denoising and regularisation, using adversarial training to shape the distribution of the latent space. We introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of adversarial autoencoders. Experiments are performed to assess the contribution that denoising makes to the learning of representations for classification and sample synthesis. Our results suggest that autoencoders trained using a denoising criterion achieve higher classification performance, and synthesise samples that are more consistent with the input data, than autoencoders trained without a corruption process.
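To make the training procedure described above concrete, the following is a minimal sketch of a denoising adversarial autoencoder in PyTorch. It assumes additive Gaussian input corruption and a unit-Gaussian prior on the latent space; the network sizes, noise level, losses, and optimiser settings are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal denoising adversarial autoencoder sketch (assumptions: Gaussian
# corruption, unit-Gaussian latent prior, MSE reconstruction loss).
import torch
import torch.nn as nn

latent_dim, input_dim = 8, 784  # e.g. flattened 28x28 images (assumption)

encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU(),
                        nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                        nn.Linear(256, input_dim), nn.Sigmoid())
# Discriminator judges whether a latent code came from the prior or the encoder.
discriminator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                              nn.Linear(64, 1))

opt_ae = torch.optim.Adam(list(encoder.parameters()) +
                          list(decoder.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x, noise_std=0.3):
    # 1) Denoising criterion: recover the clean x from a corrupted copy.
    x_tilde = x + noise_std * torch.randn_like(x)   # corruption process
    z = encoder(x_tilde)
    recon_loss = nn.functional.mse_loss(decoder(z), x)

    # 2) Train the discriminator to tell prior samples from encoded data.
    z_prior = torch.randn_like(z)
    d_loss = (bce(discriminator(z_prior), torch.ones(len(x), 1)) +
              bce(discriminator(z.detach()), torch.zeros(len(x), 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 3) Adversarial regularisation: the encoder is updated so its codes
    #    are indistinguishable from prior samples, shaping the latent space.
    g_loss = bce(discriminator(z), torch.ones(len(x), 1))
    opt_ae.zero_grad(); (recon_loss + g_loss).backward(); opt_ae.step()
    return recon_loss.item(), d_loss.item(), g_loss.item()

x = torch.rand(32, input_dim)  # stand-in batch; replace with real data
print(train_step(x))
```

In this sketch the denoising and adversarial objectives are combined in a single encoder/decoder update; sampling new data then amounts to drawing z from the prior and decoding it, since adversarial training encourages the encoded distribution to match that prior.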