Firth's logistic regression with rare events: accurate effect estimates AND predictions?

Firth-type logistic regression has become a standard approach for the analysis of binary outcomes with small samples. Although it reduces the bias in maximum likelihood estimates of coefficients, it introduces bias towards 1/2 in the predicted probabilities. The stronger the imbalance of the outcome, the more severe the bias in the predicted probabilities. We propose two simple modifications of Firth-type logistic regression resulting in unbiased predicted probabilities. The first corrects the predicted probabilities by a post-hoc adjustment of the intercept. The second is based on an alternative formulation of Firth-type estimation as an iterative data augmentation procedure. Our suggested modification consists of introducing an indicator variable which distinguishes between original and pseudo observations in the augmented data. In a comprehensive simulation study these approaches are compared to other attempts to improve predictions based on Firth-type penalization and to other published penalization strategies intended for routine use. For instance, we consider a recently suggested compromise between maximum likelihood and Firth-type logistic regression. Simulation results are scrutinized with regard to both predictions and regression coefficients. Finally, the methods considered are illustrated and compared in a study on arterial closure devices in minimally invasive cardiac surgery.
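To make the first modification concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It fits a Firth-type logistic regression using the standard modified score, then adjusts only the intercept. It assumes that the post-hoc adjustment re-estimates the intercept by maximum likelihood while holding the Firth slopes fixed as an offset. The function names `firth_logit` and `correct_intercept` and the toy data (3 events among 40 observations) are hypothetical and serve only as illustration.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def firth_logit(X, y, max_iter=100, tol=1e-10):
    """Firth-type logistic regression via Newton steps on the modified score.

    X must already contain an intercept column. Hypothetical helper.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = expit(X @ beta)
        w = p * (1.0 - p)
        info_inv = np.linalg.inv(X.T @ (X * w[:, None]))  # (X'WX)^{-1}
        # Diagonal of the hat matrix W^{1/2} X (X'WX)^{-1} X' W^{1/2}
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Firth-modified score: X'(y - p + h (1/2 - p))
        step = info_inv @ (X.T @ (y - p + h * (0.5 - p)))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def correct_intercept(X, y, beta, max_iter=100, tol=1e-12):
    """Keep the Firth slopes and re-estimate the intercept by ML.

    Assumption: column 0 of X is the intercept; the slope part of the
    linear predictor is treated as a fixed offset.
    """
    offset = X[:, 1:] @ beta[1:]
    b0 = beta[0]
    for _ in range(max_iter):
        p = expit(b0 + offset)
        step = np.sum(y - p) / np.sum(p * (1.0 - p))  # 1-D Newton step
        b0 += step
        if abs(step) < tol:
            break
    corrected = beta.copy()
    corrected[0] = b0
    return corrected


# Toy data with a rare outcome (illustrative only, not from the study)
x = np.linspace(-2.0, 2.0, 40)
X = np.column_stack([np.ones_like(x), x])
y = np.zeros(40)
y[[10, 28, 36]] = 1.0

beta_firth = firth_logit(X, y)
beta_corr = correct_intercept(X, y, beta_firth)
p_firth = expit(X @ beta_firth)
p_corr = expit(X @ beta_corr)

print("observed event rate:       ", y.mean())
print("mean Firth prediction:     ", p_firth.mean())
print("mean corrected prediction: ", p_corr.mean())
```

In this sketch the mean Firth prediction lies above the observed event rate, which is the pull towards 1/2 described above. After the intercept is re-estimated, the mean predicted probability matches the observed event rate and the slope estimate is unchanged.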
