A new random-forest-based model for solving the Multiple Instance Learning (MIL) problem on small tabular datasets, called Soft Tree Ensemble MIL (STE-MIL), is proposed. The model uses a new type of soft decision tree that is similar to the well-known soft oblique trees but has fewer trainable parameters. To train the trees, they are converted into neural networks of a specific form that approximate the tree functions. The instance and bag embeddings (output vectors) are aggregated by means of an attention mechanism. The whole STE-MIL model, including the soft decision trees, neural networks, attention mechanism, and classifier, is trained end to end. Numerical experiments on tabular datasets illustrate STE-MIL. Code implementing the model is publicly available.
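To make the attention-based aggregation step concrete, the following is a minimal sketch of pooling instance embeddings into a single bag embedding. It assumes a generic tanh-attention form; the function name `attention_pool`, the parameters `V` and `w`, and the toy dimensions are illustrative assumptions and are not taken from the STE-MIL code, whose exact attention form may differ.

```python
import numpy as np

def attention_pool(instances, V, w):
    """Aggregate instance embeddings (n, d) into one bag embedding (d,).

    V (d, h) and w (h,) are hypothetical trainable attention parameters;
    in an end-to-end model they would be learned jointly with the trees.
    """
    scores = np.tanh(instances @ V) @ w           # (n,) unnormalised scores
    scores = scores - scores.max()                # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights           # weighted mean, weights

rng = np.random.default_rng(0)
bag = rng.normal(size=(5, 4))   # toy bag: 5 instances, 4-dim embeddings
V = rng.normal(size=(4, 3))     # assumed hidden attention size h = 3
w = rng.normal(size=3)
bag_embedding, weights = attention_pool(bag, V, w)
```

The attention weights are non-negative and sum to one, so the bag embedding is a convex combination of the instance embeddings. With `w` set to zero, the weights become uniform and the pooling reduces to a plain mean over instances.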