Recent advances in deep generative models have led to impressive results in a variety of application domains. Motivated by the possibility that deep learning models might memorize part of the input data, there have been increased efforts to understand how memorization can occur. In this work, we extend a recently proposed measure of memorization for supervised learning (Feldman, 2019) to the unsupervised density estimation problem and simplify the accompanying estimator. Next, we present an exploratory study that demonstrates how memorization can arise in probabilistic deep generative models, such as variational autoencoders. This reveals that the form of memorization to which these models are susceptible differs fundamentally from mode collapse and overfitting. Finally, we discuss several strategies that can be used to limit memorization in practice.