The strength of machine learning models stems from their ability to learn complex function approximations from data; however, this strength also makes training deep neural networks challenging. Notably, complex models tend to memorize the training data, which results in poor generalization performance on test data. Regularization techniques such as L1, L2, and dropout have been proposed to reduce this overfitting effect; however, they introduce additional hyperparameter-tuning complexity. These methods also fall short when the underlying data distribution produces high inter-class similarity, leading to less accurate models. In this paper, we present a novel approach to regularizing models by leveraging information-rich latent embeddings and their high intra-class correlation. We create phantom embeddings from a subset of homogeneous samples and use these phantom embeddings to decrease the inter-class similarity of instances in the latent embedding space. The resulting models generalize better, as the phantom embeddings (a combination of the individual embeddings) regularize them without requiring an expensive hyperparameter search. We evaluate our method on two popular and challenging image classification datasets (CIFAR and FashionMNIST) and show that our approach outperforms the standard baselines while exhibiting better training behavior.
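
As a rough, hedged sketch of the idea described above (not the authors' exact formulation), the snippet below forms a phantom embedding by averaging the latent embeddings of a small subset of same-class (homogeneous) samples and adds a penalty that lowers the cosine similarity between phantom embeddings of different classes. The function names, the averaging choice, and the parameter num_phantom_samples are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def phantom_embeddings(embeddings, labels, num_phantom_samples=4):
    """For each class present in the batch, average the embeddings of a random
    subset of same-class samples to form one phantom embedding per class.
    Hypothetical sketch; the paper's exact construction may differ."""
    phantoms, phantom_labels = [], []
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        subset = idx[torch.randperm(idx.numel())[:num_phantom_samples]]
        phantoms.append(embeddings[subset].mean(dim=0))
        phantom_labels.append(c)
    return torch.stack(phantoms), torch.stack(phantom_labels)

def phantom_regularizer(embeddings, labels, num_phantom_samples=4):
    """Penalize inter-class similarity: the cosine similarity between phantom
    embeddings belonging to different classes should be small."""
    phantoms, plabels = phantom_embeddings(embeddings, labels, num_phantom_samples)
    # Pairwise cosine similarities between all phantom embeddings.
    sims = F.cosine_similarity(phantoms.unsqueeze(1), phantoms.unsqueeze(0), dim=-1)
    diff_class = plabels.unsqueeze(1) != plabels.unsqueeze(0)
    if diff_class.any():
        # Only penalize positive similarity between different classes.
        return sims[diff_class].clamp(min=0).mean()
    return embeddings.new_zeros(())

# Hypothetical usage inside a training step, where z are latent embeddings,
# y are labels, and lambda_reg is an illustrative weighting term:
#   loss = F.cross_entropy(logits, y) + lambda_reg * phantom_regularizer(z, y)
```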