Tree-based ensembles such as the Random Forest are modern classics among statistical learning methods and are used in particular for predicting univariate responses. With multiple outputs, the question arises whether to fit separate univariate models or to follow a multivariate approach directly. For the latter, several options exist, based for example on modified splitting or stopping rules for multi-output regression. In this work, we compare these methods in extensive simulations to help answer the primary question of when to use multivariate ensemble techniques.
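To make the distinction between the two strategies concrete, the following minimal sketch contrasts a univariate split search with one possible modified splitting rule for multiple outputs: choosing the threshold that minimises the squared error summed over all outputs. This criterion is an assumption chosen for illustration, since the abstract does not state which splitting rules are compared, and the function names (`sse`, `multi_output_sse`, `best_split`) and the toy data are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty sequence)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def multi_output_sse(rows):
    """Squared error summed over every output column of `rows`.

    `rows` is a list of equal-length tuples, one tuple of outputs per sample.
    Illustrative criterion only; other multivariate impurities exist.
    """
    if not rows:
        return 0.0
    return sum(sse(col) for col in zip(*rows))


def best_split(x, Y):
    """Return (threshold, impurity) minimising the summed SSE of both children.

    x: list of values of a single feature; Y: list of output tuples.
    Passing 1-tuples in Y reduces this to an ordinary univariate split search.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = (None, float("inf"))
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no threshold separates identical feature values
        left = [Y[i] for i in order[:k]]
        right = [Y[i] for i in order[k:]]
        impurity = multi_output_sse(left) + multi_output_sse(right)
        if impurity < best[1]:
            best = ((lo + hi) / 2, impurity)
    return best


# Hypothetical toy data: output 1 jumps after x = 2, output 2 jumps after x = 1.
x = [1, 2, 3, 4]
Y = [(0, 0), (0, 10), (10, 10), (10, 10)]

joint = best_split(x, Y)                        # one split shared by both outputs
separate = [best_split(x, [(row[j],) for row in Y]) for j in range(2)]
```

On this toy data, each separate univariate search finds a threshold that fits its own output perfectly (2.5 for the first output, 1.5 for the second), whereas the joint search must settle on a single compromise threshold (2.5) with nonzero summed error. Which of the two behaviours predicts better in practice is the kind of question the simulations address.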