We propose a new regression algorithm that learns from a set of input-output pairs. Our algorithm is designed for populations where the relation between the input variables and the output variable exhibits heterogeneous behavior across the predictor space. The algorithm begins by generating subsets that are concentrated around random points in the input space, and then trains a local predictor on each subset. These predictors are combined in a novel way to yield an overall predictor. We call this algorithm ``LEarning with Subset Stacking,'' or LESS, due to its resemblance to the method of stacking regressors. We compare the test performance of LESS with state-of-the-art methods on several datasets, and our comparison shows that LESS is a competitive supervised learning method. Moreover, we observe that LESS is efficient in terms of computation time and allows a straightforward parallel implementation.
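The pipeline described above can be sketched in a few dozen lines. This is a minimal illustration of the subset-stacking idea, not the authors' exact method: the RBF distance weights around the random anchor points, the weighted linear local models, and the least-squares stacking layer are all illustrative assumptions, as are the function names `fit_less` and `predict_less`.

```python
import numpy as np

def fit_less(X, y, n_subsets=5, seed=0):
    """Sketch of subset stacking: (1) form subsets concentrated around
    random anchor points via RBF sample weights, (2) fit a local linear
    model per subset, (3) combine the local predictions with a
    least-squares stacking layer. All modeling choices here are
    assumptions for illustration."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random anchor points drawn from the training inputs.
    anchors = X[rng.choice(n, size=n_subsets, replace=False)]
    # Squared distances of every sample to every anchor, (n, n_subsets).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    h2 = np.median(d2) + 1e-12          # bandwidth heuristic (assumption)
    W = np.exp(-d2 / (2 * h2))          # soft subset memberships
    Xb = np.hstack([X, np.ones((n, 1))])  # inputs with a bias column
    betas = []
    for k in range(n_subsets):
        # Weighted least squares: scale rows by sqrt of the weights.
        sw = np.sqrt(W[:, k])[:, None]
        beta, *_ = np.linalg.lstsq(Xb * sw, y * sw.ravel(), rcond=None)
        betas.append(beta)
    B = np.stack(betas, axis=1)          # (d + 1, n_subsets)
    # Local predictions, gated by normalized membership weights.
    Z = (Xb @ B) * (W / W.sum(axis=1, keepdims=True))
    Zb = np.hstack([Z, np.ones((n, 1))])
    gamma, *_ = np.linalg.lstsq(Zb, y, rcond=None)  # stacking weights
    return anchors, h2, B, gamma

def predict_less(model, X):
    """Apply the fitted local models and the stacking layer to new X."""
    anchors, h2, B, gamma = model
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * h2))
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    Z = (Xb @ B) * (W / W.sum(axis=1, keepdims=True))
    Zb = np.hstack([Z, np.ones((X.shape[0], 1))])
    return Zb @ gamma
```

Because each local model is fit independently, the loop over subsets is the part that parallelizes trivially, which matches the parallelism claim in the abstract.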