In large samples, a small number of atypical individuals can distort basic statistical indicators such as the mean. Because these individuals are difficult to detect automatically, an alternative strategy is to use robust approaches. This paper focuses on estimating the geometric median of a random variable, which is a robust indicator of central tendency. To handle large samples of data arriving sequentially, we introduce online stochastic Newton algorithms for estimating the geometric median and give their rates of convergence. Since the estimates of the median and of the Hessian matrix can be updated recursively, we also determine confidence intervals for the median in any designated direction and perform online statistical tests.
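To make the idea of a recursively updated Newton-type estimate concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the algorithm analysed in the paper. It assumes the usual geometric median objective G(m) = E||X - m||, whose gradient term is the unit vector u = (X - m)/||X - m|| and whose Hessian term is (I - u uᵀ)/||X - m||. The names `online_newton_median` and `direction_interval`, the identity initialisation of the running Hessian sum, the direct linear solve at each step, and the sandwich variance H⁻¹ Σ H⁻¹ / n used for the interval are all choices made for this example.

```python
import numpy as np


def online_newton_median(stream, dim, eps=1e-12):
    """Single-pass stochastic Newton sketch for the geometric median.

    Hypothetical helper: returns the estimate m, an averaged Hessian
    estimate H, the averaged outer product Sigma of the unit directions,
    and the number n of observations actually used.
    """
    m = np.zeros(dim)          # current median estimate (assumed start at 0)
    S = np.eye(dim)            # identity (assumed regulariser) + sum of Hessian terms
    C = np.zeros((dim, dim))   # running sum of u u^T
    n = 0
    for x in stream:
        diff = np.asarray(x, dtype=float) - m
        r = np.linalg.norm(diff)
        if r < eps:            # observation coincides with the estimate: skip it
            continue
        n += 1
        u = diff / r           # unit direction towards the new observation
        uuT = np.outer(u, u)
        S += (np.eye(dim) - uuT) / r
        C += uuT
        # Newton-type step: (1/(n+1)) * (S/(n+1))^{-1} u simplifies to S^{-1} u.
        m = m + np.linalg.solve(S, u)
    return m, S / (n + 1), C / max(n, 1), n


def direction_interval(m, H, Sigma, n, v, z=1.96):
    """Approximate interval for v^T m, assuming variance v^T H^-1 Sigma H^-1 v / n."""
    v = np.asarray(v, dtype=float)
    w = np.linalg.solve(H, v)              # H^{-1} v (H is symmetric)
    half = z * np.sqrt(w @ Sigma @ w / n)  # half-width of the interval
    center = v @ m
    return center - half, center + half


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(size=(20000, 2)) + np.array([1.0, -2.0])
    m, H, Sigma, n = online_newton_median(data, dim=2)
    print("estimate:", m)
    print("interval along first axis:", direction_interval(m, H, Sigma, n, [1.0, 0.0]))
```

Each observation is read once and then discarded, so memory stays at O(d²) for the two running matrices. The direct solve costs O(d³) per step; a recursive update of the inverse would be cheaper in high dimension but is left out here to keep the sketch short.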