Directional data require specialized probability models because their domain is non-Euclidean and periodic. When a directional variable is observed jointly with linear variables, modeling their dependence adds a further layer of complexity. This paper introduces a novel Bayesian nonparametric approach for directional-linear data based on the Dirichlet process. We first extend the projected normal distribution to model the joint distribution of linear variables and a directional variable of arbitrary dimension as a projection of a higher-dimensional augmented multivariate normal distribution (MVN). We call the new distribution the semi-projected normal distribution (SPN); it possesses properties similar to those of the MVN. The SPN is then used as the mixture distribution in a Dirichlet process model to obtain a more flexible class of models for directional-linear data. We propose a normal conditional inverse-Wishart distribution as part of the prior distribution to address an identifiability issue inherited from the projected normal and to preserve conjugacy with the SPN distribution. A Gibbs sampling algorithm is provided for posterior inference. Experiments on synthetic data and the Berkeley image database show that the Dirichlet process SPN mixture model (DPSPN) outperforms other directional-linear models in clustering. We also build a hierarchical Dirichlet process model with the SPN to develop a likelihood ratio approach to bloodstain pattern analysis, in which the DPSPN model is used for density estimation to estimate the likelihood of a given pattern from a set of training data.
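To make the SPN construction concrete, the following is a minimal sketch and not the paper's implementation. It assumes the first p coordinates of the augmented MVN are projected onto the unit sphere and the remaining coordinates are kept as the linear variables. The function name `sample_spn`, the parameter layout, and the numerical values are hypothetical. The second half illustrates the scale invariance that is commonly cited as the identifiability issue of the projected normal: rescaling the latent directional block by any c > 0 leaves the observed sample unchanged.

```python
import numpy as np


def sample_spn(mu, sigma, p, n, seed=0):
    """Draw n (direction, linear) pairs from an illustrative SPN.

    Assumption: the first p latent coordinates form the directional block.
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    # Cholesky factor so that z = mu + L @ eps has covariance sigma.
    chol = np.linalg.cholesky(np.asarray(sigma, dtype=float))
    eps = rng.standard_normal((n, mu.size))
    z = mu + eps @ chol.T
    x, y = z[:, :p], z[:, p:]
    # Project the directional block onto the unit sphere; keep y as is.
    u = x / np.linalg.norm(x, axis=1, keepdims=True)
    return u, y


# Hypothetical parameters: a circular variable (p = 2) and one linear variable.
mu = np.array([1.0, 0.5, 2.0])
sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.5, 0.4],
                  [0.2, 0.4, 2.0]])
u, y = sample_spn(mu, sigma, p=2, n=1000)

# Rescale only the latent directional block by c > 0.
c = 3.0
d = np.diag([c, c, 1.0])
u_scaled, y_scaled = sample_spn(d @ mu, d @ sigma @ d, p=2, n=1000)

# Same observed data, different parameters: (mu, sigma) is not identified.
same_sample = np.allclose(u, u_scaled) and np.allclose(y, y_scaled)
```

Because two distinct parameter sets generate the same observed data, some constraint or prior structure is needed to pin down the scale; the abstract states that the normal conditional inverse-Wishart prior serves this purpose.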