Projection pursuit regression (PPR) has played an important role in the development of statistics and machine learning. In terms of the two cultures of Breiman (2001), PPR is an algorithmic model that can approximate a general regression function. Although PPR can achieve a nearly optimal consistency rate asymptotically, as shown in this paper, it rarely delivers effective predictions in practice. To improve prediction, we propose an ensemble procedure, hereafter referred to as ePPR, that adopts the "feature bagging" of the Random Forest (RF). ePPR has several advantages over RF, and its theoretical consistency can be proved under more general settings than that of RF. Extensive comparisons on real data sets show that ePPR is significantly more efficient than RF and other competitors in both regression and classification.
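To make the idea of combining ridge-function learners with feature bagging concrete, the following is a minimal sketch and not the ePPR procedure of the paper. It assumes a deliberately simplified base learner: a single ridge term g(w^T x), with the direction w taken from ordinary least squares and g fitted as a cubic polynomial. The function names (`fit_ridge_term`, `fit_feature_bagged_ensemble`, `predict`), the subset size `p // 3`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_ridge_term(X, y, degree=3):
    """Fit one ridge function g(w^T x). Simplifying assumption: w comes from
    least squares and g is a polynomial (real PPR optimizes both jointly)."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    w, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
    norm = np.linalg.norm(w)
    # Fall back to an equal-weight direction if least squares finds no signal.
    w = w / norm if norm > 0 else np.ones(X.shape[1]) / np.sqrt(X.shape[1])
    z = Xc @ w                              # one-dimensional projection
    coefs = np.polyfit(z, y, degree)        # polynomial ridge function g
    return x_mean, w, coefs

def fit_feature_bagged_ensemble(X, y, n_members=50, n_features=None, degree=3, seed=0):
    """Fit each member on a random subset of the columns (feature bagging)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    m = n_features or max(1, p // 3)        # subset size: an arbitrary choice here
    members = []
    for _ in range(n_members):
        cols = rng.choice(p, size=m, replace=False)
        members.append((cols, fit_ridge_term(X[:, cols], y, degree)))
    return members

def predict(members, X):
    """Average the predictions of all ensemble members."""
    preds = []
    for cols, (x_mean, w, coefs) in members:
        z = (X[:, cols] - x_mean) @ w
        preds.append(np.polyval(coefs, z))
    return np.mean(preds, axis=0)

# Synthetic example (hypothetical data, not from the paper).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))
y = np.tanh(X[:, 0] + 0.5 * X[:, 1]) + 0.1 * rng.normal(size=300)

members = fit_feature_bagged_ensemble(X, y)
pred = predict(members, X)
mse = np.mean((y - pred) ** 2)              # in-sample error of the ensemble
```

Each member sees only a few features, so individual fits are weak; averaging over many random subsets is what the sketch borrows from RF. A faithful PPR base learner would use several ridge terms and a smoother for g.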