Robust estimation in beta regression via maximum Lq-likelihood

From arXiv. This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in Statistical Papers and is available online at https://doi.org/10.1007/s00362-022-01320-0

Beta regression models are widely used for modeling continuous data limited to the unit interval, such as proportions, fractions, and rates. Inference for the parameters of beta regression models is commonly based on maximum likelihood estimation. However, maximum likelihood estimation is known to be sensitive to discrepant observations. In some cases, a single atypical data point can lead to severe bias and erroneous conclusions about the features of interest. In this work, we develop a robust estimation procedure for beta regression models based on the maximization of a reparameterized Lq-likelihood. The new estimator offers a trade-off between robustness and efficiency through a tuning constant. To select the optimal value of the tuning constant, we propose a data-driven method which ensures full efficiency in the absence of outliers. We also improve on an alternative robust estimator by applying our data-driven method to select its optimal tuning constant. Monte Carlo simulations suggest marked robustness of the two robust estimators with little loss of efficiency. Applications to three datasets are presented and discussed. As a by-product of the proposed methodology, residual diagnostic plots based on robust fits highlight outliers that would be masked under maximum likelihood estimation.
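To make the idea of maximizing an Lq-likelihood concrete, here is a minimal sketch in Python. It uses the standard Lq function, L_q(u) = (u^(1-q) - 1)/(1 - q) for q ≠ 1 and L_q(u) = log(u) for q = 1, applied to a beta density with mean mu and precision phi. The logit link, the constant precision, and the function names `beta_logpdf`, `neg_lq_likelihood`, and `fit_mlq` are assumptions made for illustration. The sketch uses the plain Lq-likelihood. It does not reproduce the reparameterization or the data-driven choice of the tuning constant described in the abstract, and treating q as that tuning constant is itself an assumption here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def beta_logpdf(y, mu, phi):
    """Log-density of a beta distribution with mean mu and precision phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (gammaln(phi) - gammaln(a) - gammaln(b)
            + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y))


def neg_lq_likelihood(theta, X, y, q):
    """Negative Lq-likelihood of a beta regression (illustrative only).

    theta holds the regression coefficients followed by log(phi).
    Assumes a logit link for the mean and a constant precision.
    """
    coef, log_phi = theta[:-1], theta[-1]
    # Keep the fitted means strictly inside (0, 1) for numerical safety.
    mu = np.clip(expit(X @ coef), 1e-10, 1.0 - 1e-10)
    logf = beta_logpdf(y, mu, np.exp(log_phi))
    if q == 1.0:
        # q = 1 recovers the ordinary log-likelihood.
        return -np.sum(logf)
    # L_q(f) = (f**(1 - q) - 1) / (1 - q), computed stably from log f.
    return -np.sum(np.expm1((1.0 - q) * logf) / (1.0 - q))


def fit_mlq(X, y, q):
    """Maximize the Lq-likelihood numerically, starting from zeros."""
    start = np.zeros(X.shape[1] + 1)
    result = minimize(neg_lq_likelihood, start, args=(X, y, q), method="BFGS")
    return result.x
```

Calling `fit_mlq(X, y, 1.0)` corresponds to ordinary maximum likelihood. With q below 1, each observation's contribution to the estimating equations is weighted by f^(1-q), so points with low density under the fitted model, such as outliers, have less influence.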
