Bootstrap aggregating (bagging) and boosting are two popular ensemble learning approaches, which combine multiple base learners into a composite model that is more accurate and more reliable. They have been widely applied in biology, engineering, healthcare, and other domains. This paper proposes BoostForest, an ensemble learning approach that uses BoostTrees as its base learners and can be applied to both classification and regression. BoostTree constructs a tree model by gradient boosting, and increases randomness (diversity) by drawing cut-points randomly at node splitting. BoostForest further increases randomness by bootstrapping the training data when constructing different BoostTrees. BoostForest generally outperformed four classical ensemble learning approaches (Random Forest, Extra-Trees, XGBoost, and LightGBM) on 35 classification and regression datasets. Remarkably, BoostForest tunes its parameters by simply sampling them randomly from a parameter pool, which can be easily specified, and its ensemble learning framework can also be used to combine many other base learners.
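The three sources of randomness described above (random cut-points inside each boosted tree, bootstrapped training data across trees, and parameters sampled from a pool) can be sketched in a few lines. The following is a minimal, hypothetical illustration on 1-D regression data, not the authors' BoostForest implementation: each "tree" is reduced to a sum of gradient-boosted depth-1 stumps under squared loss, and all function names and parameter pools are invented for the example.

```python
import random

def fit_boost_tree(X, y, n_rounds, lr, rng):
    """Gradient-boost a sum of depth-1 stumps under squared loss.

    Toy stand-in for a BoostTree: the cut-point at each split is drawn
    at random, which is the source of within-tree diversity.
    """
    pred = [0.0] * len(y)
    stumps = []  # each stump: (cut, left_value, right_value)
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]  # negative gradients
        cut = rng.choice(X)  # cut-point drawn at random
        left = [r for x, r in zip(X, resid) if x <= cut]
        right = [r for x, r in zip(X, resid) if x > cut]
        lv = lr * sum(left) / len(left) if left else 0.0
        rv = lr * sum(right) / len(right) if right else 0.0
        stumps.append((cut, lv, rv))
        pred = [p + (lv if x <= cut else rv) for x, p in zip(X, pred)]
    return stumps

def predict_tree(stumps, x):
    return sum(lv if x <= cut else rv for cut, lv, rv in stumps)

def fit_boost_forest(X, y, n_trees=20, seed=0):
    """Bootstrap the data and draw each tree's parameters from a pool."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # parameters sampled randomly from an easily specified pool
        lr = rng.choice([0.1, 0.3, 0.5])
        n_rounds = rng.choice([20, 40])
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(fit_boost_tree([X[i] for i in idx],
                                     [y[i] for i in idx],
                                     n_rounds, lr, rng))
    return forest

def predict_forest(forest, x):
    # the ensemble prediction averages over all trees
    return sum(predict_tree(t, x) for t in forest) / len(forest)

# Usage: fit a noiseless step function and query both sides of the step.
X = [i / 20 for i in range(21)]
y = [0.0 if x < 0.5 else 1.0 for x in X]
forest = fit_boost_forest(X, y)
low, high = predict_forest(forest, 0.1), predict_forest(forest, 0.9)
```

Averaging many randomized, individually boosted models is what distinguishes this scheme from plain bagging (no boosting inside the trees) and from plain gradient boosting (no bootstrapping or averaging across trees).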