We propose a new algorithm that learns from a set of input-output pairs. Our algorithm is designed for populations where the relation between the input variables and the output variable exhibits heterogeneous behavior across the predictor space. The algorithm starts by generating subsets that are concentrated around random points in the input space. This is followed by training a local predictor for each subset. These predictors are then combined in a novel way to yield an overall predictor. We call this algorithm "LEarning with Subset Stacking," or LESS, due to its resemblance to the method of stacking regressors. We compare the test performance of LESS with state-of-the-art methods on several datasets. Our comparison shows that LESS is a competitive supervised learning method. Moreover, we observe that LESS is also efficient in terms of computation time and allows a straightforward parallel implementation.
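The three steps above (random subsets, local predictors, combination) can be sketched as follows. This is a minimal illustration, not the paper's actual method: the subset size `k`, the number of anchors, the choice of linear local models, and the distance-based soft weighting used to combine them are all assumptions made here for brevity, whereas LESS combines its local predictors through a learned stacking step.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic 1-D data whose input-output relation changes across the input space.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.where(X[:, 0] < 0, np.sin(X[:, 0]), X[:, 0] ** 2) + 0.1 * rng.normal(size=200)

n_subsets = 8   # assumed number of random anchor points
k = 40          # assumed subset size (nearest neighbors of each anchor)

# Step 1: pick random anchor points and form subsets concentrated around them.
anchors = X[rng.choice(len(X), size=n_subsets, replace=False)]

# Step 2: train a local predictor on each subset.
models = []
for a in anchors:
    idx = np.argsort(np.linalg.norm(X - a, axis=1))[:k]
    models.append(LinearRegression().fit(X[idx], y[idx]))

def predict(X_new):
    # Step 3: combine the local predictors. Here we use a simple
    # distance-based soft weighting toward each anchor (an assumption);
    # the paper instead learns the combination, stacking-style.
    d = np.linalg.norm(X_new[:, None, :] - anchors[None, :, :], axis=2)
    w = np.exp(-d)
    w /= w.sum(axis=1, keepdims=True)
    preds = np.stack([m.predict(X_new) for m in models], axis=1)
    return (w * preds).sum(axis=1)

y_hat = predict(X)
mse = np.mean((y_hat - y) ** 2)
```

Note that the local fits in Step 2 are independent of one another, which is what makes the parallel implementation mentioned above straightforward.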