This paper presents a model-agnostic ensemble approach for supervised learning. The proposed approach is based on a parametric version of Random Subspace, in which each base model is learned from a feature subset sampled according to a Bernoulli distribution. Parameter optimization is performed using gradient descent and is rendered tractable by an importance sampling approach that circumvents frequent re-training of the base models after each gradient step. The degree of randomization in our parametric Random Subspace is thus automatically tuned through the optimization of the feature selection probabilities. This is an advantage over the standard Random Subspace approach, where the degree of randomization is controlled by a hyper-parameter. Furthermore, the optimized feature selection probabilities can be interpreted as feature importance scores. Our algorithm can also easily incorporate any differentiable regularization term to impose constraints on these importance scores.
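To make the mechanism concrete, the following is a minimal sketch of the parametric Random Subspace idea, assuming squared-error loss, scikit-learn decision trees as base models, and a self-normalized score-function (REINFORCE-style) gradient estimate with importance weights; the helper names (sample_masks, fit_ensemble, etc.), the toy data, and all hyper-parameter values are illustrative and not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def sample_masks(p, n_models):
    # Draw one Bernoulli feature-selection mask per base model.
    return (rng.random((n_models, p.size)) < p).astype(float)

def fit_ensemble(X, y, masks):
    # Train one base model per sampled feature subset.
    models = []
    for s in masks:
        idx = np.flatnonzero(s)
        if idx.size == 0:            # guard against an empty subset
            idx = np.array([0])
        m = DecisionTreeRegressor(max_depth=3).fit(X[:, idx], y)
        models.append((m, idx))
    return models

def losses(models, X, y):
    # Per-model squared-error loss.
    return np.array([np.mean((m.predict(X[:, idx]) - y) ** 2)
                     for m, idx in models])

def log_prob(masks, p):
    # log P(s; p) under independent Bernoulli feature selections.
    eps = 1e-9
    return (masks * np.log(p + eps)
            + (1 - masks) * np.log(1 - p + eps)).sum(axis=1)

def grad_log_prob(masks, p):
    # d/dp log P(s; p): the Bernoulli score function.
    eps = 1e-9
    return masks / (p + eps) - (1 - masks) / (1 - p + eps)

# Toy data: only the first 3 of 10 features carry signal.
X = rng.normal(size=(200, 10))
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=200)

n_models, lr = 50, 0.05
p = np.full(10, 0.5)     # feature-selection probabilities (the parameters)
q = p.copy()             # fixed proposal under which the models are trained
masks = sample_masks(q, n_models)
models = fit_ensemble(X, y, masks)
L = losses(models, X, y)
log_q = log_prob(masks, q)

for step in range(200):
    # Importance weights reuse the models trained under q at the current p,
    # so no re-training is needed after each gradient step (a full
    # implementation would refresh q and re-train once the weights degenerate).
    w = np.exp(log_prob(masks, p) - log_q)
    w /= w.sum()
    grad = (w * L) @ grad_log_prob(masks, p)   # score-function estimate
    p = np.clip(p - lr * grad, 0.01, 0.99)

print(np.round(p, 2))   # probabilities of the informative features should rise
```

The optimized vector p doubles as a set of feature importance scores, and a differentiable regularizer, e.g. a sparsity-inducing penalty lam * p.sum(), could be accommodated in this sketch by simply adding its gradient to grad before the update.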