Gaussian Determinantal Processes: a new model for directionality in data

Determinantal point processes (DPPs) have recently become popular tools for modeling the phenomenon of negative dependence, or repulsion, in data. However, our understanding of an analogue of classical parametric statistical theory is rather limited for this class of models. In this work, we investigate a parametric family of Gaussian DPPs with a clearly interpretable effect of parametric modulation on the observed points. We show that parameter modulation impacts the observed points by introducing directionality in their repulsion structure, and that the principal directions correspond to the directions of maximal (i.e., the longest-ranged) dependency. This model readily yields a novel and viable alternative to Principal Component Analysis (PCA) as a dimension reduction tool that favors directions along which the data are most spread out. This methodological contribution is complemented by a statistical analysis of a spiked model similar to that employed for covariance matrices as a framework to study PCA. These theoretical investigations unveil intriguing questions for further examination in random matrix theory, stochastic geometry and related topics.
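To make the idea of directional repulsion concrete, here is a minimal sketch. It assumes a finite L-ensemble whose kernel is an anisotropic Gaussian, L[i, j] = exp(-0.5 (x_i - x_j)^T Sigma^{-1} (x_i - x_j)). This parametrization, the matrix `sigma`, and the helper names are illustrative assumptions and are not taken from the paper. In an L-ensemble, the probability of observing a configuration is proportional to the determinant of its kernel matrix, so a smaller determinant means stronger repulsion.

```python
import numpy as np

def gaussian_kernel_matrix(points, sigma):
    """Hypothetical anisotropic Gaussian kernel (an assumption, not the paper's definition).

    L[i, j] = exp(-0.5 * (x_i - x_j)^T sigma^{-1} (x_i - x_j))
    """
    diffs = points[:, None, :] - points[None, :, :]   # pairwise differences, shape (n, n, d)
    sigma_inv = np.linalg.inv(sigma)
    quad = np.einsum("ijk,kl,ijl->ij", diffs, sigma_inv, diffs)
    return np.exp(-0.5 * quad)

def unnormalized_dpp_weight(points, sigma):
    """Determinant of the kernel matrix: proportional to the L-ensemble probability."""
    return float(np.linalg.det(gaussian_kernel_matrix(points, sigma)))

# Long-range dependence along the x-axis, short-range along the y-axis.
sigma = np.diag([4.0, 0.25])

# Two pairs of points at the same Euclidean distance, separated in different directions.
pair_along_x = np.array([[0.0, 0.0], [1.0, 0.0]])
pair_along_y = np.array([[0.0, 0.0], [0.0, 1.0]])

w_x = unnormalized_dpp_weight(pair_along_x, sigma)
w_y = unnormalized_dpp_weight(pair_along_y, sigma)
print(f"weight along x: {w_x:.4f}, weight along y: {w_y:.4f}")
```

Under these assumptions, the pair separated along the long-range direction receives the smaller weight, so points repel each other more strongly along that axis. This is the kind of directionality the abstract describes.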


Related content

PCA


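In statistics, principal component analysis (PCA) is a method for projecting data from a higher-dimensional space into a lower-dimensional one by maximizing the variance captured along each retained dimension. Given a collection of points in two, three, or more dimensions, the "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line is chosen in the same way from among the directions perpendicular to the first. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called the principal components.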
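The PCA procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration using the eigendecomposition of the sample covariance matrix. The function name `pca` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def pca(data, n_components):
    """Return principal components, their variances, and the projected data.

    data: array of shape (n_samples, n_features)
    """
    centered = data - data.mean(axis=0)               # PCA works on mean-centered data
    cov = np.cov(centered, rowvar=False)              # sample covariance, (n_features, n_features)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest variances first
    components = eigvecs[:, order].T                  # rows are orthonormal principal directions
    projected = centered @ components.T               # coordinates in the principal basis
    return components, eigvals[order], projected

# Synthetic 2-D data stretched along the first axis (illustrative only).
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])

components, variances, projected = pca(data, n_components=2)
print("principal directions:\n", components)
print("variance along each direction:", variances)
```

The projected coordinates are uncorrelated because the principal directions diagonalize the sample covariance matrix. Keeping only the first few rows of `components` gives the lower-dimensional projection.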
