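In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement, or it may indicate experimental error; the latter are sometimes excluded from the data set. Outliers can cause serious problems in statistical analyses. They can occur by chance in any distribution, but they often indicate either measurement error or a population with a heavy-tailed distribution. In the former case, one wishes to discard them or to use statistics that are robust to outliers; in the latter case, they indicate that the distribution has high skewness, and one should be very cautious about using tools or intuitions that assume a normal distribution. A frequent cause of outliers is a mixture of two distributions, which may be two distinct sub-populations or may represent "correct trial" versus "measurement error"; this is modeled by a mixture model.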
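The mixture explanation above can be illustrated with a small simulation. The following is a minimal sketch, not a reference implementation: the function names (`contaminated_sample`, `mad`, `summarize`), the contamination fraction of 5%, and the two component distributions N(0, 1) and N(10, 1) are all assumptions chosen for illustration. It compares the mean and standard deviation with two robust counterparts, the median and the scaled median absolute deviation (MAD).

```python
import random
import statistics


def contaminated_sample(n, eps, seed=0):
    """Draw n points from a two-component mixture (illustrative assumption):
    with probability 1 - eps from N(0, 1) ("correct trial"),
    with probability eps from N(10, 1) ("measurement error")."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        if rng.random() < eps:
            sample.append(rng.gauss(10.0, 1.0))
        else:
            sample.append(rng.gauss(0.0, 1.0))
    return sample


def mad(xs):
    """Median absolute deviation from the median."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def summarize(xs):
    """Return non-robust and robust estimates of location and scale."""
    return {
        "mean": statistics.mean(xs),
        "stdev": statistics.stdev(xs),
        "median": statistics.median(xs),
        # 1.4826 makes the MAD consistent with the standard deviation
        # when the data are normally distributed.
        "scaled_mad": 1.4826 * mad(xs),
    }


if __name__ == "__main__":
    for name, value in summarize(contaminated_sample(5000, 0.05)).items():
        print(f"{name}: {value:.3f}")
```

With this setup the mean and standard deviation are pulled toward the contaminating component, while the median and scaled MAD stay much closer to the location and scale of the main N(0, 1) component.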

Latest Papers

The density power divergence (DPD) and related measures have produced many useful statistical procedures which provide a good balance between model efficiency on one hand, and outlier stability or robustness on the other. The large number of citations received by the original DPD paper (Basu et al., 1998) and its many demonstrated applications indicate the popularity of these divergences and the related methods of inference. The estimators that are derived from this family of divergences are all M-estimators where the defining $\psi$ function is based explicitly on the form of the model density. The success of the minimum divergence estimators based on the density power divergence makes it imperative and meaningful to look for other, similar divergences in the same spirit. The logarithmic density power divergence (Jones et al., 2001), a logarithmic transform of the density power divergence, has also been very successful in producing inference procedures with a high degree of efficiency simultaneously with a high degree of robustness. This further strengthens the motivation to look for statistical divergences that are transforms of the density power divergence, or, alternatively, members of the functional density power divergence class. This note characterizes the functional density power divergence class, and thus identifies the available divergence measures within this construct that may possibly be explored for robust and efficient statistical inference.
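For orientation, the two divergences named in the abstract are usually written as follows. The notation is an assumption, since the abstract fixes none: $g$ denotes the true data density, $f$ the model density, and $\alpha > 0$ the tuning parameter that trades efficiency against robustness.

$$
d_\alpha(g, f) = \int f^{1+\alpha} - \left(1 + \frac{1}{\alpha}\right) \int g\, f^{\alpha} + \frac{1}{\alpha} \int g^{1+\alpha}
$$

$$
d_\alpha^{\log}(g, f) = \log \int f^{1+\alpha} - \left(1 + \frac{1}{\alpha}\right) \log \int g\, f^{\alpha} + \frac{1}{\alpha} \log \int g^{1+\alpha}
$$

The first is the density power divergence of Basu et al. (1998) and the second is the logarithmic density power divergence of Jones et al. (2001). The two expressions differ only in the function applied to each of the three integrals (the identity in the first, the logarithm in the second), which is the sense in which the second is described as a transform of the first.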
