Stacking (or stacked generalization) is an ensemble learning method that differs from other ensemble methods in one main respect: although several base models are trained on the original data set, their predictions are then used as input data for one or more metamodels arranged in at least one extra layer. Composing a stack of models can yield high predictive performance, but it usually involves a trial-and-error process. Our previously developed visual analytics system, StackGenVis, was therefore designed mainly to assist users in choosing a set of top-performing and diverse models by measuring their predictive performance. However, it employs only a single logistic regression metamodel. In this paper, we investigate the impact of alternative metamodels on the performance of stacking ensembles using a novel visualization tool called MetaStackVis. Our interactive tool helps users visually explore individual metamodels and pairs of metamodels according to their predictive probabilities and multiple validation metrics, as well as their ability to predict specific problematic data instances. MetaStackVis was evaluated with a usage scenario based on a medical data set and via expert interviews.
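To make the two-layer idea concrete, the following is a minimal sketch of stacking with two interchangeable metamodels. It is not the implementation behind StackGenVis or MetaStackVis: the toy data, the `Stump` base models, the `LogisticMeta` and `AverageMeta` metamodels, and the helpers `out_of_fold` and `run_demo` are all hypothetical names chosen for illustration, and the sketch uses only the Python standard library. Base-model predictions for the training rows are produced out of fold, so the metamodel is not fitted on predictions that the base models made for rows they had already seen.

```python
import math
import random


def make_data(n, seed):
    """Toy binary data (an assumption): label follows the sign of a noisy sum."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append((a, b))
        y.append(1 if a + b + rng.gauss(0, 0.3) > 0 else 0)
    return X, y


class Stump:
    """Base model: smoothed class-1 rate on each side of zero for one feature."""

    def __init__(self, feature):
        self.feature = feature

    def fit(self, X, y):
        right = [t for x, t in zip(X, y) if x[self.feature] > 0]
        left = [t for x, t in zip(X, y) if x[self.feature] <= 0]
        self.p_right = (sum(right) + 1) / (len(right) + 2)  # Laplace smoothing
        self.p_left = (sum(left) + 1) / (len(left) + 2)
        return self

    def predict_proba(self, X):
        return [self.p_right if x[self.feature] > 0 else self.p_left for x in X]


class LogisticMeta:
    """Metamodel: logistic regression fitted by batch gradient descent."""

    def __init__(self, lr=0.5, epochs=1000):
        self.lr, self.epochs = lr, epochs

    def _prob(self, z):
        s = self.b + sum(w * v for w, v in zip(self.w, z))
        return 1.0 / (1.0 + math.exp(-s))

    def fit(self, Z, y):
        self.w, self.b = [0.0] * len(Z[0]), 0.0
        for _ in range(self.epochs):
            errs = [self._prob(z) - t for z, t in zip(Z, y)]
            for j in range(len(self.w)):
                grad = sum(e * z[j] for e, z in zip(errs, Z)) / len(Z)
                self.w[j] -= self.lr * grad
            self.b -= self.lr * sum(errs) / len(Z)
        return self

    def predict(self, Z):
        return [1 if self._prob(z) > 0.5 else 0 for z in Z]


class AverageMeta:
    """Alternative metamodel: thresholds the mean base-model probability."""

    def fit(self, Z, y):
        return self  # nothing to learn

    def predict(self, Z):
        return [1 if sum(z) / len(z) > 0.5 else 0 for z in Z]


def out_of_fold(factories, X, y, k=5):
    """Meta-features for the training rows: each row is predicted by base
    models that were fitted without that row."""
    Z = [[0.0] * len(factories) for _ in X]
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        held = [i for i in range(len(X)) if i % k == fold]
        for j, make in enumerate(factories):
            model = make().fit([X[i] for i in train], [y[i] for i in train])
            for i, p in zip(held, model.predict_proba([X[i] for i in held])):
                Z[i][j] = p
    return Z


def run_demo():
    X_train, y_train = make_data(200, seed=0)
    X_test, y_test = make_data(100, seed=1)
    factories = [lambda: Stump(0), lambda: Stump(1)]

    # Layer 1: base-model predictions become the metamodel's input features.
    Z_train = out_of_fold(factories, X_train, y_train)
    base = [make().fit(X_train, y_train) for make in factories]
    Z_test = [list(row) for row in zip(*(m.predict_proba(X_test) for m in base))]

    # Layer 2: swap metamodels and compare test accuracy.
    scores = {}
    for name, meta in [("logistic", LogisticMeta()), ("average", AverageMeta())]:
        pred = meta.fit(Z_train, y_train).predict(Z_test)
        scores[name] = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
    return scores


print(run_demo())
```

Calling `run_demo()` returns a dictionary of test accuracies keyed by metamodel name. Because the metamodel only sees the matrix of base-model predictions, replacing it or adding another candidate requires nothing more than an object with `fit` and `predict`, which is what makes comparing alternative metamodels inexpensive in a setup like this one.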