Ensemble pruning, which selects a subset of individual learners from an original ensemble, alleviates the time and space costs of ensemble learning. Accuracy and diversity are the two crucial factors in pruning, yet they usually conflict with each other. To balance them, we formalize ensemble pruning as an objection maximization problem based on information entropy. We then propose an ensemble pruning method with a centralized version and a distributed version, where the latter speeds up the execution of the former. Finally, we extract a general distributed framework for ensemble pruning that is applicable to most existing ensemble pruning methods and reduces running time without much loss of accuracy. Experimental results validate the efficiency of our framework and methods, showing a remarkable improvement in execution speed together with satisfactory accuracy.
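The abstract does not state the entropy-based criterion or the distributed scheme in detail, so the following is only a minimal sketch of the general idea under stated assumptions. It uses mutual information between a learner's predictions and the labels as an accuracy proxy, variation of information between learners as a diversity proxy, and a hypothetical weight `lam` to trade them off. The two-round `distributed_prune` (prune each partition, then prune the merged survivors) is likewise an assumed illustration, not the framework from the paper. All function and parameter names are invented for this example.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def variation_of_info(xs, ys):
    """VI(X,Y) = H(X,Y) - I(X;Y); zero when each sequence determines the other."""
    return entropy(list(zip(xs, ys))) - mutual_info(xs, ys)


def gain(cand, chosen, preds, labels, lam):
    """Assumed marginal score of adding learner `cand` to the chosen set."""
    relevance = mutual_info(preds[cand], labels)  # accuracy proxy
    diversity = 0.0
    if chosen:
        # Average dissimilarity to the learners selected so far.
        diversity = sum(
            variation_of_info(preds[cand], preds[j]) for j in chosen
        ) / len(chosen)
    return (1 - lam) * relevance + lam * diversity


def prune(candidates, preds, labels, k, lam=0.5):
    """Centralized greedy selection of at most k learners from `candidates`."""
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda i: gain(i, chosen, preds, labels, lam))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def distributed_prune(preds, labels, k, n_parts, lam=0.5):
    """Assumed two-round variant: prune each partition, then prune the union.

    The per-partition calls are independent, so they could run on separate
    workers; here they run sequentially to keep the sketch self-contained.
    """
    ids = list(range(len(preds)))
    parts = [ids[p::n_parts] for p in range(n_parts)]
    survivors = [i for part in parts for i in prune(part, preds, labels, k, lam)]
    return prune(survivors, preds, labels, k, lam)
```

Here `preds[i]` is the list of labels that learner `i` predicts on a held-out validation set and `labels` holds the true labels. Because the diversity term alone would also reward an inaccurate learner, `lam` has to be tuned so that diversity does not swamp accuracy, which mirrors the conflict between the two factors noted in the abstract.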