In this paper we extend the work of Owen (2007) by deriving a second-order expansion for the slope parameter in logistic regression when the size of the majority class is unbounded and the minority class is finite. More precisely, we show that the second-order term converges to a normal distribution and explicitly compute its variance, which, surprisingly, once again depends only on the mean of the minority class points and not on their arrangement, under mild regularity assumptions. When the majority class is normally distributed, we show that the variance of the limiting slope depends exponentially on the z-score of the mean of the minority class points with respect to the majority class distribution. We confirm our results with Monte Carlo simulations.
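The first-order behavior that this expansion refines can be checked numerically. The following is a minimal sketch, not the paper's simulation code: it assumes a one-dimensional standard normal majority class, four arbitrarily chosen minority points, and a plain Newton solver written with NumPy (the names `fit_logistic`, `x_major`, and `x_minor` are illustrative). Under these assumptions, Owen's (2007) limit equation gives a slope equal to the minority mean, so the fitted slope should be close to it when the majority sample is large.

```python
import numpy as np


def fit_logistic(x, y, n_iter=50, tol=1e-10):
    """Fit the intercept and slope of a 1-D logistic regression by Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    # Start the intercept at the log odds of the minority class, slope at zero.
    theta = np.array([np.log(y.sum() / (len(y) - y.sum())), 0.0])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))        # fitted probabilities
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta


rng = np.random.default_rng(0)

N = 200_000                                  # majority sample size (assumed)
x_major = rng.standard_normal(N)             # majority class ~ N(0, 1) (assumed)
x_minor = np.array([0.5, 1.0, 1.5, 2.0])     # hypothetical minority points

x = np.concatenate([x_major, x_minor])
y = np.concatenate([np.zeros(N), np.ones(len(x_minor))])

intercept, slope = fit_logistic(x, y)
xbar = x_minor.mean()

print("minority mean:", xbar)
print("fitted slope: ", slope)
print("intercept:    ", intercept)
```

Repeating the fit over many seeds, and for minority means further from the majority mean, gives an empirical picture of the fluctuation of the slope around its limit, which is the quantity the second-order term describes.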