Dataset distillation compresses large datasets into smaller synthetic coresets that retain performance, with the aim of reducing the storage and computational burden of processing the entire dataset. Today's best-performing algorithm, \textit{Kernel Inducing Points} (KIP), which makes use of the correspondence between infinite-width neural networks and kernel-ridge regression, is prohibitively slow due to the exact computation of the neural tangent kernel matrix, scaling as $O(|S|^2)$, with $|S|$ being the coreset size. To improve this, we propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel, which reduces the kernel matrix computation to $O(|S|)$. Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU. Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets, both in kernel regression and finite-width network training. We demonstrate the effectiveness of our approach on tasks involving model interpretability and privacy preservation.
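To make the random-feature idea concrete, the sketch below illustrates, in plain NumPy, how explicit random features can approximate the NNGP (arc-cosine) kernel of a one-hidden-layer ReLU network, so that kernel blocks involving the coreset require only $O(|S|)$ feature evaluations rather than $O(|S|^2)$ exact kernel entries. This is a minimal illustrative sketch under simplifying assumptions, not the RFAD implementation itself: the function name \texttt{relu\_random\_features}, the dimensions, and the fully connected architecture are hypothetical placeholders, and the scaling conventions are one common choice among several.
\begin{verbatim}
import numpy as np

def relu_random_features(X, W):
    """Random ReLU features phi(x) = sqrt(2/m) * relu(x @ W).

    With the columns of W drawn i.i.d. Gaussian (variance 1/d per entry
    here), the inner product phi(x) . phi(x') is a Monte Carlo estimate
    of the one-hidden-layer ReLU NNGP (arc-cosine) kernel, up to the
    usual scaling conventions.
    """
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.maximum(X @ W, 0.0)

rng = np.random.default_rng(0)
d, m = 784, 8192                         # input dim, number of random features
W = rng.standard_normal((d, m)) / np.sqrt(d)

X_batch = rng.standard_normal((256, d))  # hypothetical batch of real data
S = rng.standard_normal((10, d))         # hypothetical distilled coreset

phi_X = relu_random_features(X_batch, W)
phi_S = relu_random_features(S, W)       # only |S| feature evaluations

K_XS = phi_X @ phi_S.T   # approximates the cross-kernel block
K_SS = phi_S @ phi_S.T   # |S| x |S| block used in kernel-ridge regression
\end{verbatim}
Because the features are explicit, the $|S| \times |S|$ block is obtained from $|S|$ forward passes through randomly initialized networks, which is the source of the linear scaling in the coreset size; the architecture actually used by RFAD is specified in the paper rather than assumed here.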