A random forest is an ensemble of decision trees. As Breiman suggested, the core strengths of an ensemble model are the strength of its unstable base learners and the diversity among them. In this paper, we propose two approaches: rotation-based and oblique double random forests. In the first approach, the rotation-based double random forest, a transformation (rotation) of the feature space is generated at each node. Because a different random feature subspace is chosen for evaluation at each node, the transformation differs from node to node. These differing transformations increase the diversity among the base learners and hence improve generalization performance. With the double random forest as the base learner, the data at each node are transformed via two different transformations, namely principal component analysis and linear discriminant analysis. In the second approach, we propose the oblique double random forest. Decision trees in the random forest and the double random forest are univariate, so they generate axis-parallel splits, which fail to capture the geometric structure of the data. In addition, the standard random forest may not grow sufficiently large decision trees, resulting in suboptimal performance. To capture the geometric properties of the data and to grow decision trees of sufficient depth, we propose the oblique double random forest, whose base learners are multivariate decision trees. At each non-leaf node, a multisurface proximal support vector machine generates the optimal plane for better generalization performance. Different regularization techniques (Tikhonov regularization and axis-parallel split regularization) are also employed to tackle the small-sample-size problem in the decision trees of the oblique double random forest.
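The node-level rotation described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gini impurity criterion, an exhaustive threshold search, and PCA as the only transformation (the linear discriminant analysis variant is omitted). The function names `gini`, `best_threshold`, and `rotated_node_split` are hypothetical.

```python
import numpy as np


def gini(y):
    """Gini impurity of a label vector (assumed split criterion)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def best_threshold(Z, y):
    """Exhaustively search for the best axis-parallel split of the columns of Z.

    Returns (weighted impurity, column index, threshold); the column index and
    threshold are None if no column has two distinct values.
    """
    n, d = Z.shape
    best = (np.inf, None, None)
    for j in range(d):
        values = np.unique(Z[:, j])
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both children are always non-empty.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = Z[:, j] <= t
            score = (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / n
            if score < best[0]:
                best = (score, j, t)
    return best


def rotated_node_split(X, y, n_sub, rng):
    """One node of a rotation-based tree (illustrative sketch only).

    1. Draw a random feature subspace of size n_sub for this node.
    2. Rotate the node's data in that subspace with PCA.
    3. Pick the best axis-parallel split in the rotated coordinates, which is
       an oblique split in the original coordinates.
    """
    feats = rng.choice(X.shape[1], size=n_sub, replace=False)
    Xs = X[:, feats]
    mean = Xs.mean(axis=0)
    # PCA via SVD of the centred data: the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Xs - mean, full_matrices=False)
    Z = (Xs - mean) @ Vt.T
    score, j, t = best_threshold(Z, y)
    return {"features": feats, "mean": mean, "rotation": Vt,
            "component": j, "threshold": t, "impurity": score}
```

Because the feature subset is redrawn at every node, each node sees a different rotation, which is the source of the additional diversity the abstract refers to. A full tree would call `rotated_node_split` recursively on the two children and stop on a purity or size criterion.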