New models of random forests that jointly use attention and self-attention mechanisms are proposed for solving the regression problem. The models can be regarded as extensions of the attention-based random forest, whose idea stems from applying a combination of Nadaraya-Watson kernel regression and Huber's contamination model to random forests. The self-attention aims to capture dependencies between the tree predictions and to remove noisy or anomalous predictions in the random forest. The self-attention module is trained jointly with the attention module that computes the weights. It is shown that training the attention weights reduces to solving a single quadratic or linear optimization problem. Three modifications of the general approach are proposed and compared. A specific multi-head self-attention for the random forest is also considered. Heads of the self-attention are obtained by changing its tuning parameters, including the kernel parameters and the contamination parameter of the models. Numerical experiments with various datasets illustrate the proposed models and show that adding self-attention improves the model performance on many datasets.
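To make the core construction concrete, the following is a minimal sketch of how Nadaraya-Watson attention and Huber's contamination model can be combined over tree predictions; the notation here (the leaf summary $A_k(x)$, temperature $\tau$, contamination parameter $\varepsilon$, and trainable weights $w$) is assumed for illustration and may differ from the paper's exact definitions:

\[
  \tilde{y}(x) = \sum_{k=1}^{M} \alpha\bigl(x, A_k(x), w\bigr)\,\tilde{y}_k(x),
  \qquad
  \alpha\bigl(x, A_k(x), w\bigr)
  = (1-\varepsilon)\,\operatorname{softmax}\!\Bigl(-\frac{\lVert x - A_k(x)\rVert^{2}}{\tau}\Bigr)_{k}
  + \varepsilon\, w_k,
\]

where $\tilde{y}_k(x)$ is the prediction of the $k$-th of $M$ trees, $A_k(x)$ is a summary (e.g., the mean) of the training vectors falling into the same leaf of tree $k$ as $x$, and $w = (w_1, \dots, w_M)$ lies on the unit simplex. Since the prediction is linear in the trainable weights $w$, fitting $w$ by minimizing a squared (respectively, absolute) loss over the training set yields a single quadratic (respectively, linear) optimization problem, consistent with the claim above.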