We present the backbone method, a generic framework that enables sparse and interpretable supervised machine learning methods to scale to ultra-high dimensional problems. We solve sparse regression problems with $10^7$ features in minutes and $10^8$ features in hours, as well as decision tree problems with $10^5$ features in minutes. The proposed method operates in two phases: first, we determine the backbone set, which consists of potentially relevant features, by solving a number of tractable subproblems; then, we solve a reduced problem that considers only the backbone features. For the sparse regression problem, we show that, under certain assumptions and with high probability, the backbone set consists of the true relevant features. Numerical experiments on both synthetic and real-world datasets demonstrate that our method outperforms or competes with state-of-the-art methods in ultra-high dimensional problems, and competes with optimal solutions in problems where exact methods scale, both in terms of recovering the true relevant features and in terms of out-of-sample predictive performance.
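The two-phase idea described above can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes a simple correlation-screening rule as a stand-in for the tractable sparse subproblem solver, random feature subsets as the subproblems, and ordinary least squares as the reduced problem; all function names and parameters (`backbone_select`, `n_subproblems`, `subset_frac`, `k_per_sub`) are hypothetical.

```python
import numpy as np

def backbone_select(X, y, n_subproblems=20, subset_frac=0.5, k_per_sub=5, seed=None):
    """Phase 1 (sketch): build the backbone set as the union of features
    chosen by cheap subproblems, each restricted to a random feature subset.
    Correlation screening stands in for the sparse subproblem solver."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(1, int(subset_frac * p))
    backbone = set()
    for _ in range(n_subproblems):
        idx = rng.choice(p, size=m, replace=False)
        # score each candidate feature by |correlation| with the response
        scores = np.abs(X[:, idx].T @ y)
        # keep the k_per_sub highest-scoring features of this subproblem
        top = idx[np.argsort(scores)[-k_per_sub:]]
        backbone.update(top.tolist())
    return sorted(backbone)

def backbone_fit(X, y, backbone):
    """Phase 2: solve the reduced problem over the backbone features only
    (here, ordinary least squares on the reduced design matrix)."""
    beta = np.zeros(X.shape[1])
    beta[backbone] = np.linalg.lstsq(X[:, backbone], y, rcond=None)[0]
    return beta

# toy example: 5 true features out of p = 200
rng = np.random.default_rng(0)
n, p, s = 300, 200, 5
X = rng.standard_normal((n, p))
true_support = np.arange(s)
y = X[:, true_support] @ np.ones(s) + 0.1 * rng.standard_normal(n)

backbone = backbone_select(X, y, seed=1)
beta = backbone_fit(X, y, backbone)
print(len(backbone), set(true_support.tolist()).issubset(backbone))
```

The reduced problem sees at most `n_subproblems * k_per_sub` features rather than all `p`, which is what makes the second phase tractable; in the paper's setting, the subproblem solver is the same sparse method being scaled, not the screening heuristic used here.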