In this article, we strengthen the consistency proofs of several random forest (RF) variants that were previously shown to be only weakly consistent, establishing strong consistency, and we improve the data utilization of these variants to obtain better theoretical properties and experimental performance. In addition, building on the multinomial random forest (MRF) and the Bernoulli random forest (BRF), we propose a data-driven multinomial random forest (DMRF) algorithm, which satisfies strong consistency and has lower complexity than MRF but higher complexity than BRF. On classification and regression problems, DMRF outperforms previous RF variants that satisfy only weak consistency, and in most cases it even surpasses the standard random forest. To the best of our knowledge, DMRF is currently the best-performing strongly consistent RF variant with low algorithmic complexity.