Categorical responses arise naturally in many scientific disciplines. In many cases there is no predetermined order for the response categories, and the response has to be modeled as nominal. In this paper we regard the order of the response categories as part of the statistical model and show that the true order, when it exists, can be selected using likelihood-based model selection criteria. For predictive purposes, a statistical model with a chosen order may perform better than models based on nominal responses, even if a true order does not exist. For multinomial logistic models, which are widely used for categorical responses, we show the existence of theoretically equivalent orders that are indistinguishable under likelihood criteria, and we identify the connections between their maximum likelihood estimators. We use simulation studies and real data analysis to confirm the need for, and the benefits of, choosing the most appropriate order for categorical responses.