Dataset distillation aims to synthesize small datasets from large-scale originals with minimal information loss, reducing storage and training costs. Recent state-of-the-art methods mainly constrain the sample synthesis process by matching synthetic images against original ones in terms of gradients, embedding distributions, or training trajectories. Although the matching objectives vary, the strategy for selecting original images has so far been limited to naive random sampling. We argue that random sampling overlooks the evenness of the selected sample distribution, which may result in noisy or biased matching targets. Moreover, random sampling places no constraint on sample diversity. Together, these factors cause optimization instability during distillation and degrade training efficiency. Accordingly, we propose a novel matching strategy named \textbf{D}ataset distillation by \textbf{RE}present\textbf{A}tive \textbf{M}atching (DREAM), in which only representative original images are selected for matching. DREAM can be easily plugged into popular dataset distillation frameworks and reduces the number of distillation iterations by more than 8 times without a performance drop. Given sufficient training time, DREAM further provides significant improvements and achieves state-of-the-art performance.
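To make the selection idea concrete, below is a minimal Python sketch of how representative matching targets could be picked in place of naive random sampling. The abstract does not fix the selection mechanism, so the clustering-based `select_representative` helper, its parameters, and the toy data are illustrative assumptions rather than the paper's actual implementation.

```python
# Hypothetical sketch: pick "representative" matching targets for one class
# by k-means clustering, instead of sampling the class uniformly at random.
# The clustering choice and all names here are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans

def select_representative(images: np.ndarray, n_select: int, seed: int = 0) -> np.ndarray:
    """Cluster one class's images (flattened) into `n_select` groups and
    keep the sample nearest each centroid, so the selected targets cover
    the class distribution evenly rather than by chance."""
    flat = images.reshape(len(images), -1).astype(np.float64)
    km = KMeans(n_clusters=n_select, n_init=10, random_state=seed).fit(flat)
    picked = []
    for c in range(n_select):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(flat[members] - km.cluster_centers_[c], axis=1)
        picked.append(members[np.argmin(dists)])  # medoid-like representative
    return images[np.asarray(picked)]

# Toy usage: contrast the baseline random batch with a representative batch.
rng = np.random.default_rng(0)
class_images = rng.normal(size=(500, 3, 32, 32))                  # stand-in for one class
random_batch = class_images[rng.choice(500, 64, replace=False)]   # naive random sampling
representative_batch = select_representative(class_images, 64)    # DREAM-style selection
```

Either batch would then serve as the matching target (e.g., for a gradient- or distribution-matching loss); the point of the sketch is only that the representative batch is spread evenly across the class, which is the property the abstract argues random sampling lacks.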