Sparse shrunk additive models and sparse random feature models have been developed separately as methods to learn low-order functions, in which there are few interactions between variables, but neither offers computational efficiency. On the other hand, $\ell_2$-based shrunk additive models are efficient but do not offer feature selection, as the resulting coefficient vectors are dense. Inspired by the success of the iterative magnitude pruning technique in finding lottery tickets of neural networks, we propose a new method -- Sparser Random Feature Models via IMP (SHRIMP) -- to efficiently fit high-dimensional data with inherent low-dimensional structure in the form of sparse variable dependencies. Our method can be viewed as a combined process to construct and find sparse lottery tickets for two-layer dense networks. We explain the observed benefit of SHRIMP through a refined analysis of the generalization error for thresholded basis pursuit and resulting bounds on eigenvalues. From function approximation experiments on both synthetic data and real-world benchmark datasets, we show that SHRIMP obtains test accuracy better than or competitive with state-of-the-art sparse feature and additive methods such as SRFE-S, SSAM, and SALSA. Meanwhile, SHRIMP performs feature selection with low computational complexity and is robust to the pruning rate, indicating robustness in the structure of the obtained subnetworks. We gain insight into the lottery ticket hypothesis through SHRIMP by noting a correspondence between our model and weight/neuron subnetworks.
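To make the described procedure concrete, the following is a minimal Python sketch of a SHRIMP-style fit as summarized above: random features that each depend on only a few input variables (the sparse variable dependencies), a dense least-squares refit, and iterative magnitude pruning of the coefficients. The function name `shrimp_sketch`, the cosine feature form, the pruning fraction, and the ridge regularization are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

def shrimp_sketch(X, y, n_features=500, order=2, prune_frac=0.5, n_rounds=5, lam=1e-3):
    """Sketch of a SHRIMP-style fit (assumed details, not the exact algorithm):
    random features on small variable subsets, ridge refits, and iterative
    magnitude pruning of the learned coefficients."""
    n, d = X.shape
    rng = np.random.default_rng(0)
    # Each random feature depends on at most `order` input variables,
    # reflecting the low-order / sparse-dependency structure.
    subsets = [rng.choice(d, size=order, replace=False) for _ in range(n_features)]
    weights = rng.standard_normal((n_features, order))
    phases = rng.uniform(0.0, 2 * np.pi, n_features)

    def featurize(X, active):
        cols = [np.cos(X[:, subsets[j]] @ weights[j] + phases[j]) for j in active]
        return np.column_stack(cols)

    active = np.arange(n_features)
    for _ in range(n_rounds):
        Phi = featurize(X, active)
        # Dense ridge refit on the surviving features.
        c = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(active)), Phi.T @ y)
        # Iterative magnitude pruning: keep the largest-magnitude coefficients.
        keep = np.argsort(np.abs(c))[-max(1, int(len(active) * prune_frac)):]
        active = active[np.sort(keep)]

    Phi = featurize(X, active)
    c = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(active)), Phi.T @ y)
    return active, c, subsets
```

The surviving feature indices in `active` play the role of the pruned subnetwork: features whose variable subsets do not match the target's true dependencies tend to receive small coefficients and are removed, which is the mechanism by which magnitude pruning performs feature selection in this sketch.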