Analytical and statistical properties of local depth functions motivated by clustering applications

Local general depth ($LGD$) functions are used to describe the local geometric features and mode(s) of multivariate distributions. In this paper, we undertake a rigorous, systematic study of $LGD$ and establish several analytical and statistical properties. First, we show that, when the underlying probability distribution is absolutely continuous with density $f(\cdot)$, the scaled version of $LGD$ (referred to as the $\tau$-approximation) converges to $f(\cdot)$, both uniformly and in $L^d(\mathbb{R}^p)$, as $\tau$ converges to zero. Second, we establish that, as the sample size diverges to infinity, the centered and scaled sample $LGD$ converges in distribution to a centered Gaussian process, uniformly in the space of bounded functions on $\mathcal{H}_G$, a class of functions yielding $LGD$. Third, using the sample version of the $\tau$-approximation ($S \tau A$) and gradient system analysis, we develop a new clustering algorithm. The validity of this algorithm requires several results concerning the uniform finite difference approximation of the gradient system associated with $S \tau A$. For this reason, we establish a \emph{Bernstein}-type inequality for deviations between the centered and scaled sample $LGD$, which is also of independent interest. Finally, invoking the above results, we establish consistency of the clustering algorithm. Applications of the proposed methods to mode estimation and upper level set estimation are also provided. The finite sample performance of the methodology is evaluated using numerical experiments and data analysis.
