Determinantal point processes (DPPs) have recently become popular tools for modeling negative dependence, or repulsion, in data. However, an analogue of classical parametric statistical theory remains poorly understood for this class of models. In this work, we investigate a parametric family of Gaussian DPPs in which parametric modulation has a clearly interpretable effect on the observed points. We show that parameter modulation introduces directionality into the repulsion structure of the points, and that the principal directions correspond to the directions of maximal (i.e., the most long-ranged) dependency. This model readily yields a novel and viable alternative to Principal Component Analysis (PCA) as a dimension reduction tool that favors directions along which the data are most spread out. This methodological contribution is complemented by a statistical analysis of a spiked model, similar to the one employed for covariance matrices as a framework to study PCA. These theoretical investigations raise intriguing questions for further examination in random matrix theory, stochastic geometry, and related topics.
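To make the idea of directional repulsion concrete, the following is a minimal sketch and not the construction used in this work. It assumes a Gaussian kernel of the form exp(-(x - y)^T A (x - y) / 2), where the matrix `A`, the helper names, and the specific values are all hypothetical choices made for illustration. In a determinantal model, the weight of a configuration is a determinant of the kernel evaluated on its points, so a pair of points separated along a direction in which the kernel decays slowly (long-ranged dependency) receives a smaller weight than a pair at the same distance along a fast-decaying direction.

```python
import numpy as np


def gaussian_kernel(points, scale_matrix):
    """Kernel matrix with entries exp(-0.5 * (x_i - x_j)^T A (x_i - x_j)).

    `scale_matrix` (A) is an assumed positive semi-definite matrix that
    controls how quickly dependency decays in each direction.
    """
    diffs = points[:, None, :] - points[None, :, :]
    quad = np.einsum("ijk,kl,ijl->ij", diffs, scale_matrix, diffs)
    return np.exp(-0.5 * quad)


def configuration_weight(points, scale_matrix):
    """Determinant of the kernel on the given points (unnormalized weight)."""
    return float(np.linalg.det(gaussian_kernel(points, scale_matrix)))


# Hypothetical anisotropy: slow decay along the first axis (long-ranged
# dependency), fast decay along the second axis (short-ranged dependency).
A = np.diag([0.25, 4.0])

# Two pairs of points at unit distance, separated along different axes.
pair_along_x = np.array([[0.0, 0.0], [1.0, 0.0]])
pair_along_y = np.array([[0.0, 0.0], [0.0, 1.0]])

w_x = configuration_weight(pair_along_x, A)
w_y = configuration_weight(pair_along_y, A)

# The pair aligned with the long-ranged direction is repelled more strongly,
# so its weight is smaller.
print(f"weight along long-ranged axis:  {w_x:.4f}")
print(f"weight along short-ranged axis: {w_y:.4f}")
```

Under these assumptions, the two weights reduce to 1 - exp(-0.25) and 1 - exp(-4), respectively, so points are pushed apart more strongly along the first axis and the sample tends to be more spread out in that direction.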