Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but high-quality synthetic datasets from large ones. Existing DD methods based on gradient matching achieve leading performance; however, they are extremely computationally intensive, as they continuously optimize a synthetic dataset across thousands of randomly initialized models. In this paper, we assume that training the synthetic data with diverse models leads to better generalization performance. Thus, we propose two \textbf{model augmentation} techniques, \ie, using \textbf{early-stage models} and \textbf{weight perturbation}, to learn an informative synthetic set at significantly reduced training cost. Extensive experiments demonstrate that our method achieves up to a 20$\times$ speedup while delivering performance on par with state-of-the-art baseline methods.
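To make the two model augmentation ideas concrete, the following is a minimal, illustrative PyTorch-style sketch, not the paper's actual implementation: \texttt{perturb\_weights} adds small Gaussian noise to a model's parameters (a generic form of weight perturbation), and \texttt{build\_early\_stage\_pool} trains each model for only a few epochs rather than to convergence. All function names and hyperparameters (\texttt{sigma}, \texttt{num\_models}, \texttt{epochs}) are assumptions made for illustration.
\begin{verbatim}
import copy
import torch
import torch.nn as nn

def perturb_weights(model: nn.Module, sigma: float = 0.01) -> nn.Module:
    """Return a copy of `model` whose parameters are perturbed with small
    Gaussian noise -- a generic form of weight perturbation (illustrative)."""
    augmented = copy.deepcopy(model)
    with torch.no_grad():
        for p in augmented.parameters():
            p.add_(sigma * torch.randn_like(p))
    return augmented

def build_early_stage_pool(model_fn, train_one_epoch, loader,
                           num_models=5, epochs=2):
    """Build a pool of lightly trained ('early-stage') models; each model is
    trained for only a few epochs instead of being fully converged."""
    pool = []
    for _ in range(num_models):
        model = model_fn()          # fresh randomly initialized network
        for _ in range(epochs):     # stop training early on purpose
            train_one_epoch(model, loader)
        pool.append(model)
    return pool
\end{verbatim}
During distillation, one could draw a model from such a pool and apply \texttt{perturb\_weights} before each matching step, so that the synthetic set is optimized against diverse models without training thousands of networks from scratch.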