Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule. Among generative models, Bayesian networks and naive Bayes classifiers are the most commonly used, and they provide a clear graphical representation of the relationships among all variables. However, they severely restrict the types of relationships that can be represented, because they do not allow for context-specific independences. Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence. They are constructed by partitioning the vertices of an event tree, from which conditional independence statements can be formally read. We also define the naive staged tree classifier, which extends the classic naive Bayes classifier whilst retaining the same complexity. An extensive simulation study shows that the classification accuracy of staged tree classifiers is competitive with that of state-of-the-art classifiers, and an example showcases their use in practice.
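As a rough illustration of the generative decision rule mentioned above, the following Python sketch fits a classic naive Bayes classifier to categorical data and predicts the class that maximizes the estimated joint probability of class and features. It is not an implementation of staged tree classifiers. The function names (`fit_naive_bayes`, `predict`), the Laplace smoothing parameter `alpha`, and the toy data are all assumptions made for illustration and do not come from the paper.

```python
import math
from collections import Counter


def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate P(c) and P(x_j | c) from categorical rows X and labels y.

    alpha is a Laplace smoothing constant (an illustrative choice).
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    # Observed levels of each feature, needed for the smoothing denominator.
    levels = [sorted({row[j] for row in X}) for j in range(n_features)]
    class_counts = Counter(y)
    # counts[c][j][v] = number of rows of class c whose feature j equals v
    counts = {c: [Counter() for _ in range(n_features)] for c in classes}
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            counts[c][j][v] += 1
    prior = {c: class_counts[c] / len(y) for c in classes}
    cond = {
        c: [
            {
                v: (counts[c][j][v] + alpha)
                / (class_counts[c] + alpha * len(levels[j]))
                for v in levels[j]
            }
            for j in range(n_features)
        ]
        for c in classes
    }
    return prior, cond


def predict(model, x):
    """Return argmax_c P(c) * prod_j P(x_j | c), computed in log space.

    Feature values not seen during fitting raise a KeyError in this sketch.
    """
    prior, cond = model

    def log_joint(c):
        return math.log(prior[c]) + sum(
            math.log(cond[c][j][v]) for j, v in enumerate(x)
        )

    return max(prior, key=log_joint)


# Hypothetical toy data: two categorical features, binary class.
X_toy = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y_toy = ["no", "no", "yes", "yes"]
model = fit_naive_bayes(X_toy, y_toy)
print(predict(model, ("sunny", "hot")))
```

In this sketch every feature is assumed conditionally independent of the others given the class, for all values of the class. A model that accounts for context-specific independence would let such statements hold for some class or feature values and not for others.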